Re: Database schema for univesal usage

From: Kenneth Downs <knode.wants.this_at_see.sigblock>
Date: Wed, 11 May 2005 21:22:09 -0400
Message-Id: <7qnal2-k1g.ln1_at_pluto.downsfam.net>


lauri.pietarinen_at_atbusiness.com wrote:

>
> Kenneth Downs wrote:

>> lauri.pietarinen_at_atbusiness.com wrote:
>>
>> >
>> > My reasoning that DB2 is more general than DB1 rests on the fact

> that
>> > DB2 can contain any information that DB1 can contain but not vice
>> > versa.
>> > Additionally, by adding constraints in the database or in the
>> > application program DB2 can be made to contain exactly the same
>> > information as DB1.
>> > It does not work the other way around.
>> >
>> > Well, thats my reasoning.
>> >
>>
>> What you say is correct, but if you were putting together an app to

> track
>> cell phones, then for one EIN record there would be exactly one and

> only
>> one phone #.  Putting in a child table requires a constraint plus a

> join to
>> retrieve the column, when there will always only be one value.

> Occam's
>> razor cuts off the child table and puts the column into the first

> table.
>
> I agree. Which approach you choose depends on what you want out of it.
> I think the reasoning behind the "super schema" that can hold
> "anything" is that we don't know what kinds of requirements we will
> have in the future so we should build some flexibility into the schema.

I would suggest that doing so, while it may be a wise approach to reality on the ground, actually reveals weaknesses in the ability to adjust table structures to their appropriate form over time.

> Taken to the extreme you get the OneBigTable approach which gives you
> everything and nothing, since you can put anything into it, but you
> have to code the "meaning" of the database into the application.

Well, again the OneBigTable approach is not an extreme, it is a completely different format, but I think we've gone around once already on that.

>
> What I am saying is that there are many degrees of flexibility that can
> be built into the schema. Maybe we currently need only two different
> kinds of phone nubmers, but we suspect that the situation might change
> in the future. Should we go for the flexible schema now or just add
> new columns into the table when we happen to need them? How will the
> application programs be affected? How will the queries be affected?
> Will generalising now unnecessarily complicate queries?

As I said above, if you know that your toolkit does not handle schema changes well, then you are in trouble no matter what. If you provide extra stuff you have to constrain its use, and then you work harder making a UI for a structure that may not even be necessary. Balance this against the expense of adding things after the fact and they usually come out even in practice.

What solves all of these cases is the ability to modify structure and corresponding code at low cost as the system evolves. Then you can have everything fit all of the time.

>
> This phone number example is trivial. A typical question might be:
> should one put ones suppliers and customers in the same table, hence
> generalising these concepts into the concept of the "party"?
>

no.

-- 
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth_at_(Sec)ure(Dat)a(.com)
Received on Thu May 12 2005 - 03:22:09 CEST

Original text of this message