Re: So what's null then if it's not nothing?

From: David Portas <REMOVE_BEFORE_REPLYING_dportas_at_acm.org>
Date: 21 Nov 2005 09:32:43 -0800
Message-ID: <1132594363.693296.237340_at_o13g2000cwo.googlegroups.com>


Alexandr Savinov wrote:
> David Portas schrieb:
> > Alexandr Savinov wrote:
> >
> >>The first question: Is this design (4 tables) right price for having no
> >>nulls? Do nulls deserve such an urgly design?
> >>
> >
> >
> > I'll ignore this one because it's entirely subjective as well as
> > irrelevant.
> >
> >
> >
> >>The second question:
> >
> > [snip]
> >
> >>- determining object type (i.e., to what table it has to be added in
> >>4-table design) is equivalent to determining if it may have nulls in the
> >>1-table design.
> >>
> >
> >
> > Yes, and constraints to check this are I assume part of the data model
> > in either case. How we handle constraint / type violations is
> > implementation-defined. It's not a model issue.
> >
> >
> >
> >>The third question: Has the 4-table design less information? Obviously,
> >>no. It has 4 times more tables that are expensive resource.
> >
> >
> > You are confusing logical and physical models again. The point of the
> > data model is to separate the two. Relvars (NOT the same thing as
> > tables) are not an expensive resource because cost is
> > implementation-defined. Nothing in RM says these 4 relvars must be
> > implemented as separate data structures. Possibly it is desirable to
> > implement them as a single data structure but that does not change the
> > data model at all - they are exposed in the data model as separate
> > relations. This is the heart of what RM is about.
> >
> >
> >>These tables
> >>have to be linked somehow because these objects are not isolated and an
> >>object from one table represet an entity from another table. In other
> >>words, each entity is distributed among many tables and we need to keep
> >>track of them via links. So we add additional columns. But why we then
> >>tried to remove null columns if then we add even more auxiliary columns?
> >>
> >
> >
> > Nothing in your hypothetical problem suggests it is necessary to "link
> > the tables" whatever that means (there are no "links" in RM)
>
> If we do not link them then we get even worse design. The thing is that
> all those objects with 2 properties (colur and weight) are thought of as
> one type entities. (Problem formulation: there is a set of objects with
> 2 attributes that can be absent.) In the case of no links all those
> entities will be distributed among 4 tables. If we want to produce a
> report then we need somehow to combine the results from those 4 tables
> manually. The system is unaware that they contain objects of one type
> (in the sense of the problem domain).
>
> The design with links could help as follows. We store all objects from
> the problem domain in one table (independent of whether they have nulls
> or not). This table stores objects that have both attributes nulls.
> Actually it stores only identities with no properties (in our case). If
> some object has a non-null attribute then it is also placed in the
> corresponding extension table. So we have an entity distributed among
> many tables (a common practice).
>
> This design may help in some situations and can be not very good in
> other situations. But what I meant is that having more tables (in order
> to reduce the number of nulls) is an overhead of having more links. So
> you win in one place (nulls) and loose in another place (links between
> tables and the necessity to maintain distributed entites). Note that you
> have also the problem of processing the distributed entiteis.
>

I can't make any sense of this because I don't know what you mean by "links". There is no such thing in RM but since you have repeatedly failed to define any other context for your discussion about "tables" I am assuming you mean "relations" or "relvars" when you say "tables". I repeat the question implied in my previous post: What is a link? What needs linking? Do you mean a foreign key constraint? If so, you haven't explained what attribute(s) such a constraint would reference. It isn't obvious from your original description of the problem that any such thing would be required - or even possible for that matter. Of course you could create a derived relation (view) from the common attributes in each case if that is useful in any way.

Now as I have said throughout, I am assuming RM in the absence of any other context for this discussion. But you seem a little confused on relational terms. In particular I think that if you appreciated the difference between a table and a relation and between logical and physical models then I think you would understand some of my arguments better and would have responded to them differently. Of course, given your fondness for nulls it may be that you don't think the relational model is important or correct - but I think millions of people would disagree with you there (that doesn't make them right of course - but the logic does :-)). I recommend almost anything by Chris Date for further reading. This is a very old debate.

-- 
David Portas 
SQL Server MVP 
--
Received on Mon Nov 21 2005 - 18:32:43 CET

Original text of this message