Re: Does Codd's view of a relational database differ from that ofDate&Darwin?[M.Gittens]

From: erk <eric.kaun_at_gmail.com>
Date: 6 Jun 2005 14:13:19 -0700
Message-ID: <1118092399.501135.237640_at_g14g2000cwa.googlegroups.com>


Alexandr Savinov wrote:
> You describe the semantics of UNKNOWN which is also a special "value".
> And I completely aggree with your interpretation until you call it NULL.
>
> Generally, in order to effectively manipulate data semantics we might
> need two special "values" with concrete semantics defined at the system
> level:
>
> NULL - absence (of value, record or any other entity).
> UNKNOWN - presence but having unknown identifier.

FYI, this sort of thing has been rehashed many times, most of them somewhat unsatisfactorily. The fact that data manipulation can be done without nulls makes their value primarily that of placebos, while their costs can be great, as they're hard to make consistent.

> Here is one example of using nulls:
> Attribute/property/variable does not make sense for an object. requently
> for the sake of schema simplicity we have wide tables where some columns
> do not makes sense for some objects.

How is that simple? Your table (it's not a relationa) now has many possible meanings. Keep in mind that a relation is a predicate, not an "entity." Otherwise how would you characterize a "linking" / "cross-reference" relation?

> For example, we have a list of
> products (mostly cars) and define a column Power. However, sometimes we
> also sell products which do not have a motor so what should we write in
> the field Power? This is precisely the case where the object does not
> exist in this dimension and we can formally correctly write NULL.

A bad example. "Mostly cars"? Why not define the car and (other product) relations properly?

> Unknown values are complex in processing and can be used only in
> sophisticaed databases like deductive databases. The unknown "value"
> actually means that this field has one normal value but it is unknown,
> i.e., everything is possible. In this case it is very natural when the
> result of aggregation is also unknown. For example, if we have a new
> employee then he is guaranteed to have some concrete salary which is
> however currently unknown for some reason (no paper arrived the
> accounting department or the salary is still negotiated). In this case
> we could formally corrected write UNKNOWN. This would inhject a great
> deal of uncertainty into the whole database so that many related queries
> will return also UNKNOWN (in the case the database is able to process
> such values at all).

And that's precisely the reason such things are better dealt with through normalized relations. Then you can get consistent and correct answers, by specifying exactly what you want: average salaries for those who have them, count of those without salaries. The joins, far from being a burden, allow you to be specific without have to work around however the DBMS has "defined nulls."

  • Eric
Received on Mon Jun 06 2005 - 23:13:19 CEST

Original text of this message