Re: When does one put data into a separate table?

From: Kenneth Downs <knode.wants.this_at_see.sigblock>
Date: Mon, 01 Aug 2005 20:23:20 -0400
Message-Id: <0uq2s2-hah.ln1_at_pluto.downsfam.net>


Paul wrote:

>
>
> This is a sort of vague question.

Not really, this is faced every day.

>
>
> Let's say that you have a db with a "personnel" table.
>
> Personnel_ID INTEGER PRIMARY KEY,
> First_Name VARCHAR(30) NOT NULL,
> Last_Name VARCHAR(50) NOT NULL,
> add1..
> add2..
> add3...
> tel..
> mobile...
> Language1 VARCHAR(25) NOT NULL,
> Language2 VARCHAR(25),
> Language3 VARCHAR(25),
>
>
>
> Now, my question here is when does it become necessary to hive off
> another table
>
> Person_Language
> ID INTEGER NOT NULL
> Language VARCHAR(25) NOT NULL
>
>
> The vast *_vast_* majority of people will only speak one or two, a few
> 3, a couple 4 and very, very few will speak 5 or more. So, let's say
> we allow 5 fields in the Language part of the table, or even 6 or 7
> (just to cover the Einsteins).
>
> I'm wondering when people would both in theory and/or in practice
> start to move an attribute like this off into another table? Is there
> a "Golden Number" above which it automatically gets put into another
> table, or how do IT professionals decide on such things?
>

Google the "0-1-infinity rule", which states that you must support something not at all, or 1 of it, or an infinite number. All other numbers are wrong. The nifty thing about tabular data is that it supports the 0-1-infinity rule rather well.

This supports Marshall's statement that the golden number is 1.

The worst thing that can happen to you if you ignore the rule is that you will become successful and very busy and prosperous, and perhaps suicidal as you wear the curse of "Language2" and "Language3" like Captain Barbosa of the Black Pearl. Success and widespread usage will prevent you from changing the schema because you will be scrambling to add new features and to fix bugs that are revealed by intense usage, but you will find again and again that "Language2" and "Language3" are either directly causing problems or are slowing down development of new features.

Another temptation not often mentioned is what happens when you are supporting 1 of something (such as Language) and then somebody comes along and announces they need multiple support. There is some significant change management here, as you must take existing values and break them out to a child table, while still making it easy for users who just need a single entry. It is at this point that many programmers shrug and put in "LanguageX". But if your tools support it, you'll be much happier breaking it out.

-- 
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth_at_(Sec)ure(Dat)a(.com)
Received on Tue Aug 02 2005 - 02:23:20 CEST

Original text of this message