Re: So what's null then if it's not nothing?

From: Alexandr Savinov <spam_at_conceptoriented.com>
Date: Thu, 17 Nov 2005 17:00:23 +0100
Message-ID: <437ca927$1_at_news.fhg.de>


David Portas schrieb:

> Alexandr Savinov wrote:
> 

>>David Portas schrieb:
>>
>>>Alexandr Savinov wrote:
>>>
>>>
>>>>David Portas schrieb:
>>>>
>>>>
>>>>>Alexandr Savinov wrote:
>>>>>
>>>>>
>>>>>
>>>>>>JOG schrieb:
>>>>>
>>>>>
>>>>>>But what if some
>>>>>>other object we need to add is known in advance to have no colour at
>>>>>>all? A formal criteria for "does not have a colour" is that we do not
>>>>>>want to see it when we select objects along dimension COLOUR.
>>>>>>
>>>>>>Theoretically, we might define such a structure of tables that this
>>>>>>situation is excluded. Practially, it is almost impossible so we need to
>>>>>>permit null value as an indication that some column is inappropriate
>>>>>>for this object (and this object will be invisible from the side of this
>>>>>>column).
>>>>>>
>>>>>
>>>>>
>>>>>If an "object" lacks the attribute of colour then surely it ought to be
>>>>>modelled by a relvar that doesn't have a colour attribute. Can you
>>>>>explain why this would be "almost impossible" given that you also said
>>>>>this fact is known in advance?
>>>>
>>>>I meant that it is practically impossible taking into account complexity
>>>>of real-world systems. The final design where null values will be made
>>>>unnecessary may well be too granular (many tables with a few or even
>>>>only one column).
>>>>
>>>
>>>
>>>This is the heart of the problem you have in mind I think: That a model
>>>without nulls is more expensive to implement than one with. In many
>>>current database products that is true, but it isn't a fundamental
>>>objection. It says more about those database products than about the
>>>theory.
>>>
>>>
>>>
>>>>Another reason is that we may well lack information about our problem
>>>>domain so that we cannot guarantee that objects will have all attributes
>>>>assinged to them in our schema. For example, assume that we have table
>>>>Persons with the company employees. But today we had to hire a Marsian.
>>>>Are you sure that it will have such a characteristics as sex? Maybe if
>>>>we ask him then it will answer that he does not understand what we are
>>>>talking about. Should we then take Marsians into account when we count
>>>>the number of women and men in the company?
>>>>
>>>
>>>
>>>If your reality changes then your logical model of it does too. This is
>>>entirely to be expected. Nulls don't change the picture.
>>
>>One point is that it is always to predict especially about future. So
>>that any model allows for some degree of flexibility. We cannot hard
>>code into its structure all possible situations.
>>
>>But even if it would be possible there is another (fundamental) problem.
>>We need nulls. It is a very important mechanism of data modeling. We
>>need to somehow designate the fact of absence of an object. So nulls
>>cannot be avoided -- they can be only ignored, hidden or discredited.
>>Because NULL IS A FACT OF ABSENCE OR NON-EXISTENCE.
>>
>>
>>>>And the third reason is theoretical. Independent what happens in
>>>>practice we would like our theory (data model) be general enough to
>>>>explain very exotic cases. This inclues such extremes as one wide table
>>>>with all possible columns and many narrow tables with a few columns.
>>>>
>>>
>>>
>>>That is a pretty bizarre argument. The relational model is general
>>>enough to represent entities with any non-negative number of
>>>attributes. It doesn't need nulls to do this. The question of
>>>"explaining" tables with all possible columns or with few columns
>>>doesn't arise. We model the real world facts - we don't model tables.
>>
>>The problem is that there exist different equivalent design for each
>>real world problem domain (a set of facts). So the problem is what
>>concrete structure of tables to choose. One schema might be very simple
>>but needs sparse values (a lot of null values). Another schema has many
>>tables. But note again that all of them are equivalent because have the
>>same number of dimensions by means of which we position our objects. The
>>difference is the way how we pack our facts, in what kind of container
>>structure.
>>
>>What people cannot understand is that we cannot simply disable nulls. It
>>is too simplistic point of view. It is not possible to say that we will
>>not use nulls and that is all. Why? Because the notion of absence exists
>>in almost any data model. We need to know if an object exists or not. If
>>yes, then we get some value. If not then we get null.
>>
>>Currently the null construct is only a primitive mechanism for dealing
>>with absence. Namely, we use nulls when we want to store the fact of
>>absence of *primitive* values. However, in general case all types of
>>objects---not only primitive values---can be absent. So in a good model
>>the notion of absence has to be formalized by means of a first level
>>construct. It is an illusion (simplification) that we can build a model
>>without a mechanism for expressing the fact of absence.
>>
>>--
>>http://conceptoriented.com
> 
> 
> You are making some abrupt changes of terminology and reference - or
> maybe we were always talking about different things! I might accept
> that there may be a need for nulls in some "object oriented data model"
> - a model which you haven't yet named or described. Not so in the
> relational model. That was my point. 


> Nothing stops you representing

> *facts* about "absence" or "non-existence" in the relational model > (without nulls).

Yes, but in this case the model (and dbms) will be unaware about that and I will need to implement many complex mechanisms (grouping, aggregation etc.) manually.

> An RDBMS as a set of know facts. If I have dogs with
> and without colour then I have at least two relvars one with and one
> without a colour attribute. If I need to distinguish between colourless
> dogs and dogs of which I don't know the colour then I add an attribute
> to indicate that fact. I don't need to add a null.

Sometimes you need. For example, if you are not allowed to create more tables. Such design can be more or less attractive but nothing changes semantically: removing an attribute from a table (into the second table)   and storing nulls is one and the same thing. For example, assume you create a separate table T2 for colourless dogs because you do not want to write nulls into T1 (where normal dogs are stored). What if you now ask: what colour has a dog from T2 (a colourless dog)? RM says that this question does make sense so it is illigal (effectively it prohibits such questions from being asked). But I think it is more natural to answer: the colour is null. Thus adding new table does change anything except that it imposes addition constraints on objects in T2 (they physically cannot be assigned a colour which is guaranteed to be null). Right?

Another example, we have two tables Apples and Oranges with different attributes. Ask a question: What value takes an Orange attribute for an object from Apples? Again my answer is null. Having two tables is only a structural constraint and nothing else. We can get an equivalent design with only one tables and combined set of attributes.

-- 
http://conceptoriented.com
Received on Thu Nov 17 2005 - 17:00:23 CET

Original text of this message