Re: Proposal: 6NF

From: Roy Hann <specially_at_processed.almost.meat>
Date: Sun, 1 Oct 2006 09:45:15 +0100
Message-ID: <IbCdnbe4T4S-HYLYnZ2dnUVZ8sudnZ2d_at_pipex.net>


"Brian Selzer" <brian_at_selzer-software.com> wrote in message news:jSDTg.10095$6S3.4633_at_newssvr25.news.prodigy.net...
>> Consider a table SHIPMENTS with, say, a million rows. One nullable
>> attribute might be CAUSE_OF_LOSS. Since shipments are almost never lost
>> this attribute is almost invariably null. Supposing that the DBA is even
>> slightly skilled and chooses to physically key both tables on
>> SHIPMENT_NUMBER, which query for the value of shipments lost to pirates
>> will be quicker? The one that scans the million rows? Or the one that
>> scans the CAUSE_OF_LOSSES table then joins the (possibly empty)
>> restriction with the SHIPMENTS table on SHIPMENT_NUMBER?
>
> An even slightly skilled DBA would use an index on such a column.

If the attribute in question appeared in a WHERE clause sufficiently often that might be useful. But the DBA can't just index every column hoping it will be useful one day. Indices are not free. Indices have to be maintained at run time, they have to be considered by the query optimimizer, and they take time to back up. Properly designed tables that correctly model the business are very nearly free, or better.

> Your argument is based on query bias and depends on the physical ordering
> of the tables.

In the first place I wasn't making an argument, I was telling you what I do in the real world, and it works. You will have a tough time persuading me I don't see what I do plainly see.

Secondly, there is no such thing as "physical ordering of tables". The operation is performed in whatever order the DBMS optimizer decides is optimal. What I suggested is a very reasonable strategy that my particular optimizer often does choose, unless it chooses a still better one.

Roy Received on Sun Oct 01 2006 - 10:45:15 CEST

Original text of this message