Re: Database design question - Isolated, unrelated tables

From: Brian Selzer <brian_at_selzer-software.com>
Date: Tue, 26 Jun 2007 13:59:42 GMT
Message-ID: <iZ8gi.2471$K44.260_at_newssvr13.news.prodigy.net>


<nyathancha_at_hotmail.com> wrote in message news:1182830820.680395.9910_at_e16g2000pri.googlegroups.com...

> On Jun 26, 11:08 am, nyathan..._at_hotmail.com wrote:

>> On Jun 26, 10:29 am, "Tom Cooper"
>>
>>
>>
>> <tomcoo..._at_comcast.no.spam.please.net> wrote:
>> > In addition to the other replies, I would add that foreign key
>> > constraints
>> > are just one of many tools thedatabasedesigner can use to help ensure
>> > that
>> > bad data does not get placed in yourdatabase. Other tools include
>> > check
>> > constraints, using the right datatypes (eg, store dates in a datetime
>> > column, not a varchar column), sometimes triggers, etc.
>>
>> > So an important question is what the consequences will be if (when!, my
>> > experience is if bad data can be put into adatabase, sooner or later,
>> > it
>> > will be) invalid data is put into your audit table(s). That might
>> > range
>> > from nobody really cares, to it's going to be a lot of work to fix it,
>> > to
>> > somebody (you?) gets fired, to your company would be subject to a
>> > significant fine, to somebody might go to prison (if, for example, your
>> > audit trail is being used to prove compliance with SOX). So ask
>> > yourself
>> > questions like what will happen if your boss comes to you and says the
>> > audit
>> > trail says that user x created project y at time z, but there is no
>> > project
>> > y in the system.
>>
>> > I certainly have tables in databases I have designed that do not have
>> > any
>> > foreign key relationships to other tables, but before implementing one,
>> > I
>> > would always think carefully about it.
>>
>> > Tom
>>
>> > <nyathan..._at_hotmail.com> wrote in message
>>
>> >news:1182755325.216207.318140_at_g37g2000prf.googlegroups.com...
>>
>> > > Hi,
>>
>> > > I have a question regarding best practices indatabasedesign. In a
>> > > relationaldatabase, is it wise/necessary to sometimes create tables
>> > > that are not related to other tables through a foreign Key
>> > > relationship or does this always indicate some sort of underlying
>> > > design flaw. Something that requires a re evaluation of the problem
>> > > domain?
>>
>> > > The reason I ask is because in our application, the user can perform
>> > > x
>> > > number of high level operations (creating/updating projects,
>> > > creating/
>> > > answering surveys etc. etc.). Different users can perform different
>> > > operations and each operation can manipulate one or more table. This
>> > > part of the system is done and working. Now there is a requirement to
>> > > have some sort of audit logging inside thedatabase(separate from the
>> > > text based log file that the application generates anyway). This
>> > > "audit logging" table will contain high level events that occur
>> > > inside
>> > > the application (which may or may not relate to a particular
>> > > operation). This table is in some sense related to every other table
>> > > in thedatabase, as well as data that is not in thedatabaseitself
>> > > (exceptions, external events etc.). For example : it might have
>> > > entries that specify that at time x user created project y, at time A
>> > > user filled out survey B, at time C LDAP server was down, At time D
>> > > an
>> > > unauthorized login attempt occurred etc.
>>
>> > > As I said, these seems to suggest a stand alone, floating table with
>> > > a
>> > > few fields that store entries regarding whats going on the system
>> > > without any direct relationship to other tables in thedatabase. But I
>> > > just feel uneasy about creating such an isolated table. Another
>> > > option
>> > > is to store the "logging" information in another schema/database, but
>> > > that doubles the maintainance work load. Not really looking forward
>> > > to
>> > > maintaining/designing two different schemas.
>>
>> > > I had a look at the microsoft adventureworksdatabaseschema diagram
>> > > and they seem to have 3 standalong tables - AWBuildVersion, ErrorLog
>> > > and DatabaseLog (unless i am reading it wrong!)
>>
>> > > Any advice, Information or resources are much appreciated.
>>
>> Thanks for the prompt replies everyone.
>>
>> >From what I am hearing, the consensus seems to be use it if you
>>
>> absolutely must, but try to avoid it if you can.
>>
>> One good point everyone seems to raise is "what is it used for?" ...
>> To be perfectly honest I am not entirely sure myself. Its one of those
>> requirements that filtered down from the management cloud. I think
>> the view is to use it mainly for "reporting" kind of functionality and
>> maybe only on some rare occasion for some sort of postmortem
>> debugging. Although in the latter situation, the application logs and
>> the sql server logs will probably end up being more helpful. I think
>> there is a system table somewhere in sql server that logs all the
>> transactions and changes that happen in the table right?
>>
>> Crystal reports were being considered at some stage for more
>> sophisticated reports, but for now they want some sort of entries in
>> there to see whats happening (not necessarily at thedatabaselevel,
>> but at the application level). The resolution of the reporting and
>> entries hasn't been decided yet ... as in, do we want to know
>> everytime someone retrieves a list of customers or only when someone
>> adds/removes customers. I have a feeling that if I chase this up, the
>> answer is going to be "both", "we may not want to start logging very
>> detailed stuff into thedatabaseright away, but if at some stage we
>> want to do it, the design should allow for it."
>>
>> So just thinking in terms of some sort of "reporting" solution, in
>> abstract a sort of condensed data for easier consumption, does it make
>> sense to store an isolated table(s)/schemas along with the actual
>> data?
>>
>> As to the consequences of a bad audit trail/log entry, I don't think
>> it would be catastrophic (fines, people going to prison etc.). Its an
>> internal application used to streamline inhouse processes. But of
>> course, we still don't want bad, inconsistent data in there and it
>> would lead to a lot of headaches, finger pointings, late nights etc.
>
> Actually, another best practices question now that I am here. Does it
> make sense for a table to have two (or more different foreign keys)
> both (or all) of which can be nullable and then tie them to different
> tables for different records? For example I have a survey table. It
> has all the fields are relations for describing various survey data
> (survey questions, participants, start, finish dates etc. ). Now a
> survey can be related to a project or a supplier. Of course, the same
> thing can be done with two different junction tables. Which is the
> better method? Add the junction tables and increasing the number of
> tables, complexity of the system (and the number of joins required for
> a query) or just adding extra nullable foreign key field(s) to the
> table? Is there a rule of thumb I should be following here?
>

Yes. The presence of a null should only ever indicate that a value for an *applicable* attribute is absent. If an attribute is not universally applicable, then it should appear in a different relation schema. A functional dependency A --> B requires that whenever two different tuples have the same set of values for A, they have the same set of values for B. Just because one of the values in B has not been supplied doesn't alter that requirement: whenever a value is supplied, it must necessarily be the same in all tuples with the same set of values for A. Furthermore, if an attribute is not universally applicable, then the dependency between A and B is no longer functional, since even in a world in which all missing values were supplied there can be some values for A that do not determine a value for each element of B. Since a key implies a set of functional dependencies, including attributes that do not universally apply is an indication that a relation is not fully normalized because the relationship between the key and the attribute is definitely not a functional dependency. On the other hand, decomposing a relation where all attributes universally apply but some may not be supplied introduces ambiguity: the sense that an attribute universally applies is lost in translation. It cannot be determined from the schema whether an attribute applies only some of the time or if the attribute universally applies but a value hasn't yet been supplied. So a null should only be used as a placeholder for a value that has yet to be supplied. Received on Tue Jun 26 2007 - 15:59:42 CEST

Original text of this message