Re: Does Codd's view of a relational database differ from that ofDate&Darwin?[M.Gittens]

From: erk <eric.kaun_at_gmail.com>
Date: 7 Jun 2005 09:26:24 -0700
Message-ID: <1118161584.488438.282850_at_g47g2000cwa.googlegroups.com>


Alexandr Savinov wrote:
> By schema simplicity I meant the number of tables. Sometimes it is
> really simpler to have one wide table than numerous specific tables.
> There is alway trade off between these two extremes and do not say that
> having one wide table is good practice - it is only an example for
> demostrating some properties of nulls.

Not every "choice" represents an arbitrary continuum. There are repercussions to those choices, and I have yet to encounter a situation, even for simple databases, where the wide table with lots o' nullable columns made sense.

> Tables are not entities in relational model in other conventional models
> (and it is one of fundamental drawbacks).

In what sense is it a drawback? Relational deals with predicates, not tables or objects.

> However, they should be
> treated as entities at least formally (in a "good" data model).

Applying relational to the system catalogs for the database renders the sort of meta-manipulation that I think you're leaning toward.

> I want to have everything treated as only one sort of things.

Why is that desirable? Occam's Razor says entities should not be multiplied beyond necessity - and I think necessity dictates that there is more than one type of thing.

> Thus theoretically we can
> express our semantics without rows by using only tables.

Have you seen Neo's XDb? I think you'd like it.

> One simple
> conclusion is that "data semantics can be expressed by tables rather
> than only by rows (in tables)".

How do you express any commonality between things, and how is that any "better" (e.g. simpler) than relations which store facts that have common sets of attributes?

> It is very strong and somewhat surprising result.

It's not actually very surprising, since it's been done before.

> In such an approach we do not need rows because
> anything can be modelled by means of tables.

I'd suggest bits - you can "model" anything with them.

> Such a model is ugly from
> relational point of view (although we simply followed an advice to
> decompose tables) however it is interesting from theoretical point of
> view (with big consequences). Manipulating data in this model means
> adding/remove tables rather than records and one table bears some
> semantics which is encoded in its static name.

Semantics encoded in the name alone? Can you give an example? I suspect this is a meta-nonsense.

> Yet, again, you touch an issue which has nothing to do with null values
> and their semantics.

I think the above has much to do with nulls, even though I didn't mention the word. Nulls are all "about" what we know about each fact X, and how X relates to other facts Y and Z. The existence or non-existence of nulls is at the heart of the matter.

> > A bad example. "Mostly cars"? Why not define the car and (other
> > product) relations properly?
>
> I deliberatly gave such an example!
> However, in real world complex applications we (unfortuanately) have a
> lot of such situations which cannot be avoided.

Yes, they can.

> In other words, it may
> well happen that for some record some attribute simply does not make
> sense (including bad design).

Coping with a pre-existing bad design is another matter - I thought we were discussing doing good design?

> Theoretically it is not an issue of good or bad design.

Oh, then ignore my question. We'll proceed with judgement-free design. :-)

> The problem is
> if we want to keep meaningless columns in one table or optimize
> representation. I mean that independent of our desire or schema any
> object in the database will formally have all dimensions. The question
> is only how we *represent* the database semantics. In one case null is
> written explicitly. In other cases it still exists as a value of an
> object along meaningless dimensions but is not explicitly written
> because of good schema. For database and for the model this object still
> has null as a value along this dimensions independent of how it is
> represented.

So null is there even if it's not? Then I much prefer my nulls not-there.

> That is a (serious) problem in relational model because depending on
> chosen schema you may get different results.

Well, that's a good thing. The schema reflects the logical properties of what you're "modeling." So naturally different choices will give you different "results" - unless you think that a query should always return the same results regardless of the underlying data and structures?

> However, in good model
> normalization does not change the data semantics - it is only an issue
> of where and how it will be stored. You can store everything in one wide
> table or you can create specialized tables but all queries must return
> one and the same result. This is how concept-oriented model works where
> normalization is not an issue. The model has some dimensionality and
> data in it has some semantics and the question is what structure for
> dimensions we prefer - it is a design issue which has nothing to do what
> results will be returned by our queries.

Does that mean it's a performance consideration only? Surely semantics has an impact on query results?

> If data semantics depends on
> the structure of its relations, how you use joins and how you use nulls
> then it is bad model.

The alternative, then, is to have 'data semantics' independent of the relations (which are what one designs) and their constraints? So the semantics arise magically from the data themselves?

Seriously, write to Neo.

  • Tired in Pittsburgh
Received on Tue Jun 07 2005 - 18:26:24 CEST

Original text of this message