Re: Does Codd's view of a relational database differ from that ofDate&Darwin? [M.Gittens]

From: Alexandr Savinov <savinov_at_host.com>
Date: Mon, 06 Jun 2005 17:47:59 +0200
Message-ID: <42a47041$1_at_news.fhg.de>


Paul schrieb:

> Alfredo Novoa wrote:
> 

>>>I guess the other claim against NULLs is that they make things more
>>>complex and less intuitive, which is a more subjective point. You could
>>>argue that Darwen's method of avoiding NULLs is more complex that using
>>>NULLs in the first place.
>>
>>Nulls undermine the conceptual integrity of the Relational Model which
>>is based in predicate logic.
>>
>>A tuple with nulls does not match to a logical proposition.
>>
>>This causes many problems. Some of them were pointed by Eric.
> 
> 
> OK so basically the problem is with aggregates?

Assume that we have a set of 3 values S = {1, 3, 10}. We want to aggreage them and apply some function func: A = func(S). Do we have a problem? No. Now remove some item from the set so that we have S = {1, 3} and then apply again the aggregation function. Do we have a problem? No.

Having null values is actually a way of removing data items from consideration. In this example we apply the aggregation function to the set {1, 3} which is equivalent to applying it to the set {1, 3, null}.

Some difficulties may appear in multidimensional case (in the case of many columns). What if a row has null in field F1? This means that this object does not exist along the dimension F1. If we project all rows onto this dimension then we will not be able to find it there - it is absent. In particular, aggregation functions and other procedures will not see it at all (if it does not exist then it is not visible).

> Why not say then that all aggregates that involve a NULL return NULL?

It is possible but I do not find it very natural because we need the properties of NULLs and aggregations to be consistent with other properties of the model being developed. We cannot say "let's do it so" - but need to have a kind of global consistency. For example, take a row <1, 3> and then consider this point in 3-dimensional space by adding one new dimension. How it will look like (represented)? I find it very natural to write it as follows: <1, 3, null>. This actually says that this object does not exist in this dimension, it is not visible, it cannot be counted or aggregated. We might add some other properties of nulls and then derive their consequences. And finally we will develop yet another data model.

Formally, objects exist in all dimensions but in most of them they have null values. In order to optimize such a property (a limited number of dimensions for some objects) we use multidimensional hierarchical system   which formally describes what is the data semantics, its dimensionality, its projections and many other issues unsolved in other models.

-- 
alex
http://conceptoriented.com
Received on Mon Jun 06 2005 - 17:47:59 CEST

Original text of this message