Re: Table design - reducing number of entities

From: David Cressey <david.cressey_at_earthlink.net>
Date: Thu, 01 Dec 2005 07:02:02 GMT
Message-ID: <Kdxjf.8007$wf.3136_at_newsread3.news.atl.earthlink.net>


"Roy Hann" <specially_at_processed.almost.meat> wrote in message news:oJqdnSkDlYtc7RDeRVnyrQ_at_pipex.net...
> "Daniel J Watkins" <danielwatkinslearn_at_hotmail.com> wrote in message
> news:438d5f10$1_1_at_glkas0286.greenlnk.net...
> > I need some advise on this subject...
> >
> > Previously in my ERD there have been separate entities for 'Home
Address'
> &
> > 'Semester Address' which the 'Student' has foreign keys for. I was
> thinking
> > of just having an 'Address' entity which that 'Student' has two foreign
> keys
> > for - one for each address type. Is this ok to do? Will I lose any marks
> for
> > modelling like this? I suppose the correct way to do this is to use a
many
> > to many between 'Student' and 'Address' now and use a juntion entity,
> rather
> > than use two foreign keys?
> >
> > This is the same for the 'Supervisor' and 'Training Offficer' entities
> which
> > the student has - why not just use an 'Employee' entity but is it
correct
> to
> > use a foreign key for supervisor and training officer which both
reference
> > employee - that removes the need to use a juntion entity!
>
> Before offering my opinion, I am curious to know why practically everyone
in
> the real world seems to have an instinct to minimize the number entity
types
> in a logical design by forcing them to be ever more generic? Why does
> everyone seem to think there is a need to minimize the number of tables in
a
> physical database design? I ask because the more generic you make them,
the
> more code you are going to have to design, write, test, document and
> maintain to *impose* the business model at run time. It just makes no
sense
> to me. Coding it very slow and expensive. I bet half the code I have to
> look at (pretty crappy code too BTW) is there just to make the generic
> specific again. (End of rant.)

Good rant. My rant follows.

The best summary, I think, is to paraphrase a comment made by Paul C. in this newsgroup a few months ago:
"It's easier to understand 600 tables than 100,000 lines of code."

In my day, I've looked at a lot of crappy code written against databases. A lot of that code is crappy, because the database design itself was crappy. The database design was crappy, because it was the first attempt on the part of some programmer to build a database, and the programmer made some design errors. Unfortunately, people tend to look at the crappy code, and think they see the problem, when that's just the symptom.

People really ought to learn this stuff backwards:

  1. Learn interactive queries against good data in a good database.
  2. Learn to embed queries in programs, extraction scripts, and report generators, using good data in a good database.
  3. Learn how to program updates to a good database, with transactions, concurrency, ACID and all that.
  4. Learn how to distinguish between a good database design and a crappy database design.
  5. Learn how to design a good database, given good requirements.
  6. Learn how to derive good database requirements from a project statement.

Unfortunately, a lot of people learn this stuff starting with step 6, and working their way down to step 1.
(end of my rant)

>
> My rule of thumb in the real world (where I earn my miserable crusts) is
to
> look at all the constraints on the putative entities--not forgetting the
> transition constraints. If two entities have the same constraints, they
are
> the same entity type and they belong in the same table. And crucially if
> they *don't* have the same constraints they are not the same entity type
and
> they absolutely do NOT belong in the same table. My very strong intuition
> is that "home address" and "semester address" have very different meanings
> and hence different constraints. Likewise (perhaps even more so) for
> "supervisor" and "training officer". However in both cases there might be
a
> significant number of attributes that the pairs of entity types have in
> common, and one could (and I say should) use a super-entity to represent
> *just* the common attributes that have the same constraints. And there I
> think perhaps we meet in some kind of middle ground.
>
> Roy
>
>
Received on Thu Dec 01 2005 - 08:02:02 CET

Original text of this message