Re: Does Codd's view of a relational database differ from that of Date & Darwin? [M.Gittens]

From: erk <eric.kaun_at_gmail.com>
Date: 6 Jun 2005 13:57:41 -0700
Message-ID: <1118091461.936418.318220_at_g49g2000cwa.googlegroups.com>


Alexandr Savinov wrote:
> Assume that we have a set of 3 values S = {1, 3, 10}. We want to
> aggregate them and apply some function func: A = func(S). Do we have a
> problem? No. Now remove some item from the set so that we have S = {1,
> 3} and then apply again the aggregation function. Do we have a problem? No.

Incorrect: you may have a problem. You're treating S as a variable, whose "contents" can vary from time to time. A domain, however, isn't a variable. It's a set whose definition (whether intensional or extensional) is fixed. A function over a varying domain, such as you describe, is exactly the sort of situation relational constraints are meant for.

Adding elements seems to be a greater problem than removing them, but removing them is still a problem.

I think the above point is addressed in any of the conversations on dbdebunk about variables vs. values vs. types vs. objects vs. ...

> Having null values is actually a way of removing data items from
> consideration. In this example we apply the aggregation function to the
> set {1, 3} which is equivalent to applying it to the set {1, 3, null}.

Not really. Is null to be counted, for example?
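
A minimal sketch of why the two sets are not interchangeable, assuming SQL-style null semantics as implemented by SQLite. The table name s and column name v are made up for illustration; other models of "null" could of course answer differently, which is rather the point.

```python
import sqlite3

# Hypothetical table "s" with one column "v", holding {1, 3, null}.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (v INTEGER)")
conn.executemany("INSERT INTO s (v) VALUES (?)", [(1,), (3,), (None,)])

# COUNT(*) counts rows; COUNT(v), SUM(v) and AVG(v) skip NULLs.
count_rows, count_vals, total, mean = conn.execute(
    "SELECT COUNT(*), COUNT(v), SUM(v), AVG(v) FROM s"
).fetchone()

print(count_rows)   # 3: the row holding NULL is counted
print(count_vals)   # 2: the NULL itself is not
print(total, mean)  # 4 2.0
```

So "the null is invisible to aggregation" holds for some aggregates and not for others, even within one product.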

> Some difficulties may appear in the multidimensional case (in the case of
> many columns). What if a row has null in field F1? This means that this
> object does not exist along the dimension F1. If we project all rows
> onto this dimension then we will not be able to find it there - it is
> absent. In particular, aggregation functions and other procedures will
> not see it at all (if it does not exist then it is not visible).

How about conditional tests on those attributes?
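
To make the question concrete, here is a rough sketch, again assuming SQLite's SQL-style three-valued logic and a hypothetical table t(id, f1). A row with null in F1 is "not visible" to a test and to its negation alike:

```python
import sqlite3

# Hypothetical table "t"; row 3 has NULL in f1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, f1 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 20), (3, None)])

def ids(where):
    """Return the ids of rows satisfying the given WHERE condition."""
    query = f"SELECT id FROM t WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(query)]

eq = ids("f1 = 10")                   # [1]
ne = ids("f1 <> 10")                  # [2]
either = ids("f1 = 10 OR f1 <> 10")   # [1, 2]: row 3 satisfies neither
is_null = ids("f1 IS NULL")           # [3]: needs a special predicate

print(eq, ne, either, is_null)
```

A condition and its negation no longer partition the rows, so "absent along the dimension" has consequences well beyond aggregation.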

> It is possible but I do not find it very natural because we need the
> properties of NULLs and aggregations to be consistent with other
> properties of the model being developed. We cannot say "let's do it so"
> - but need to have a kind of global consistency.

So all "objects" need to be addressable by all predicates? I think that's nonsense. What's the point, when a simple clause like is2D(x) can properly "distinguish"?

> For example, take a row
> <1, 3> and then consider this point in 3-dimensional space by adding one
> new dimension. How will it look (be represented)? I find it very
> natural to write it as follows: <1, 3, null>. This actually says that
> this object does not exist in this dimension, it is not visible, it
> cannot be counted or aggregated.

It's not a 3-D point, so why even consider it? If it doesn't exist in the "dimension" of 3-D points, why even mention it? Are "objects" like "Love" and "hate" both written <null, null, null> because they have no "projection" into 3-D space?
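
As a rough illustration of the alternative (not anyone's actual model), 2-D and 3-D points can simply be distinct types, with a predicate like is2D(x) to tell them apart. The class and function names below are hypothetical:

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Point2D:
    x: float
    y: float

@dataclass(frozen=True)
class Point3D:
    x: float
    y: float
    z: float

Point = Union[Point2D, Point3D]

def is_2d(p: Point) -> bool:
    """Plays the role of the is2D(x) predicate mentioned above."""
    return isinstance(p, Point2D)

points = [Point2D(1, 3), Point3D(1, 3, 10)]

# Aggregate z over the values that actually have a z; no null is needed
# to keep <1, 3> out of the computation.
three_d = [p for p in points if not is_2d(p)]
mean_z = sum(p.z for p in three_d) / len(three_d)

print(mean_z)  # 10.0
```

Nothing here requires <1, 3> to pretend it has a third coordinate.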

> We might add some other properties of
> nulls and then derive their consequences. And finally we will develop
> yet another data model.
>
> Formally, objects exist in all dimensions but in most of them they have
> null values.

Null==absent? Why? Or rather, why bother?

> In order to optimize such a property (a limited number of
> dimensions for some objects) we use a multidimensional hierarchical system
> which formally describes what is the data semantics, its
> dimensionality, its projections and many other issues unsolved in other
> models.

- Eric
Received on Mon Jun 06 2005 - 22:57:41 CEST
