Re: Proposal: 6NF

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 01 Oct 2006 00:21:35 GMT
Message-ID: <jSDTg.10095$6S3.4633_at_newssvr25.news.prodigy.net>


"Roy Hann" <specially_at_processed.almost.meat> wrote in message news:zLmdnQxwyMHjTIPYnZ2dnUVZ8qudnZ2d_at_pipex.net...
> "Brian Selzer" <brian_at_selzer-software.com> wrote in message
> news:WvxTg.11808$7I1.3639_at_newssvr27.news.prodigy.net...
>
>> Recording propositions is only about 5% of the typical activity for a
>> database. With today's technology, querying multiple tables is almost
>> always more expensive than querying one,
>
> Simply wrong.
>
> Consider a table SHIPMENTS with, say, a million rows. One nullable
> attribute might be CAUSE_OF_LOSS. Since shipments are almost never lost
> this attribute is almost invariably null. Supposing that the DBA is even
> slightly skilled and chooses to physically key both tables on
> SHIPMENT_NUMBER, which query for the value of shipments lost to pirates
> will be quicker? The one that scans the million rows? Or the one that
> scans the CAUSE_OF_LOSSES table then joins the (possibly empty)
> restriction with the SHIPMENTS table on SHIPMENT_NUMBER?
>

An even slightly skilled DBA would use an index on such a column. For that matter, if several nullable attributes are often queried together, the DBA could use a compound index on those columns. Try that with 6NF.

> I would argue that the limitations of today's technology are precisely why
> this approach is so useful. Not only does it make it easier to reason
> about what is going on, but it also helps the real system go like blazes.
> I regularly apply such decompositions to solve performance problems in
> systems with tables with hundreds of millions of rows.
>

Your argument is based on query bias and depends on the physical ordering of the tables. I'm sure I could come up with some exceptions, too--I believe I wrote "almost always."

> [snip]
>> For example, if you have a table with 30 columns where 15 are NULLable,
>> you would need as many as 16 tables to eliminate NULLs.
>
> That is a far-fetched possibility. Maybe you could construct a
> hypothetical table with nullable attributes that would have to be
> decomposed that way, but in practice it doesn't happen. One rarely needs
> more than three or four additional tables, and often less. And as noted
> above, they often actually improve performance.
>
> Roy
>
Received on Sun Oct 01 2006 - 02:21:35 CEST

Original text of this message