Re: Joins with nulls

From: Bob Badour <bbadour_at_golden.net>
Date: Sun, 24 Nov 2002 17:27:08 -0500
Message-ID: <1qcE9.1212$F63.240625700_at_radon.golden.net>


"Finarfin" <finarfin_at_sympatico.ca> wrote in message news:NV4E9.410$IQ5.26532_at_news20.bellglobal.com...
> Bob Badour wrote:
> >
> >
> > I disagree. In more involved examples where one allows the user to
> > enter partial information, one requires a more involved design. NULL
> > causes far more problems than it claims to solve. If the database
> > knows of no correct answers to my question, I am happy to receive an
> > empty set of answers. Again, no NULL required.
> >
> >
> Bob:
>
> Your entire argument can be summarized as: That can't happen so you
> are wrong.

Please do not put words in my mouth: your summary is incorrect.

> I'll try one more time to show my point. I'll use a real situation.
>
> The data consists of the performance characteristics of a piece of
> industrial equipment called a flotation cell. The operator (let's
> call him Homer) must record at set intervals:
>
> Date sample was taken, Time sample was taken, Froth depth at the time
> sample was taken.
>
> There are clear functional dependancies with each of these and they
> should all go into a single table. The DDL for such a table is
> relatively trivial.

I disagree that they belong in the same relation. From the information you give subsequently, the user may record the time independent of the froth depth and froth depth independent of the time.

> Other tables would hold related information, such as sample results,
> product destinations, PLC data etc.
>
> For any particular set of data, Homer may:
>
> Forget the time the sample was taken (or neglect to note it) and / or
> Forget to measure (or neglect to note) the froth depth.

Which means that the functional dependencies do not place these measurements into the same relation.

> So in a situation with three bits of data, two of which are required
> for the candidate key

If you suggest that the date and time form any part of any candidate key for froth depth measurements, I suggest you are mistaken. One would have to always know these values for them to qualify as identifiers. Any design that uses them as identifiers when they may be unknown is observably a bad design.

>, I am faced with up to two pieces of missing
> information, one of which results in a candidate key that is
> incomplete. Please show me this "more involved design" that will
> allow me to have a table(s) without a blank.

First, we have to recognize that the time of measurement cannot identify the data. We must then devise some other identifier. Presumably, this identifier will be built into the form on which the operator records the measurements. For our purposes, we can call it the measurement identifier; it need not be numeric.

We can clearly see that we need one relation to represent the time of the measurement whenever the operator does record the time of the measurement. We need another relation to represent the froth depth whenever the operator does record froth depth. If the operator sometimes records the date of a measurement without recording the time of the measurement, we will need separate relations to record date and time of day. The only candidate key for any of these relations is the measurement identifier. The relative cardinality between any pair of these relations is (zero or one):(zero or one).

Other relations would represent the other related information such as sample results, product destinations, PLC data etc.

> Note that 0 is a valid
> froth depth and entering a 0 when it isn't would be wrong and could
> lead to actions detrimental to the process.

One would simply not record a depth. One need not record zero or any other arbitrary depth--or NULL either.

> Also note that other
> information, such as analytical results from the sample, must be
> recorded, whether or not this particular table is complete.

One need not record a date, a time of day or a froth depth to record the other information according to the design. One only needs a measurement identifier.

> Also note
> that mathematical work is performed on the froth depth data. If there
> is missing information, averages etc. cannot be impacted.

Define "impacted". If a particular average or calculation absolutely requires a missing measurement, one cannot avoid impact with any design. One can calculate any meaningful average possible without using NULL. NULL--especially as specified in SQL--frequently leads people to believe they have meaningful averages when they do not.

>
> JE
Received on Sun Nov 24 2002 - 23:27:08 CET

Original text of this message