Re: Does Codd's view of a relational database differ from that ofDate&Darwin?[M.Gittens]
Date: Fri, 17 Jun 2005 16:00:20 +0200
Message-ID: <MPG.1d1cf5e23a15e3149896a3_at_news.ntnu.no>
In article <42b2c1ba$1_at_news.fhg.de>, savinov_at_host.com says...
> The semantics of my 4-column table is that employees and departments are
> related, and departments are related,
Departments and manager are related, you mean?
> but not more. In particular, ED and DM are unrelated.
Even if the same department occurs in both ED and DM? That is denying the obvious.
> the problem is only which of them describes our problem domain. I think
> that your model has more information than there is in the problem
> description. (But I do not argue - you may see the problem description
> differently, may be I miss something.)
If employee Jon is in department D1, and department D1 is managed by
manager Sally, then it is obvious that Jon is in D1 which is managed by
Sally---which is all my table is saying. It really is just a restatement
of what we have already told the database---a deduction. That is all a
query is (in the RM).
Now, you could perhaps argue that my table means that Jon is managed by
Sally, and that this is information that is invalid---it does not
"describe our problem domain", it is based on a possibly wrong
assumption.
The flaw in that argument is that I do *not* say that my table means
this. My table has a very specific meaning in the RM, which is perfectly
valid in all circumstances (given the validity of ED and DM),
*regardless* of how employees and managers may otherwise be related. And
this meaning is GIVEN BY THE RM AUTOMATICALLY, through the semantics of
join (logical and). The user ignores this meaning at their peril. This
is, by the way, more or less what I was talking about when I first
criticised Ullman's UR treatment.
> Here is my schema:
>
> Top
> / | \
> E D M
> \/ \/
> ED DM
> \ /
> Bottom
>
> It is 4 dimensional (4 primitive dimensions - 4 paths from the bottom to
> the top).
>
> Here is your schema:
> > If I understand you correctly, you have to introduce an additional
> > construct (a common subtable) to establish the connection between
> > employee and manager. In the RM, this is not needed. Simpler. Better.
>
> I see no difference here between COM and RM. In both cases we need to
> have some additional table to represent a relationship.
No we don't! In the RM, we just join the existing tables!
> If ORM (object-role modeling)
> stores data in tables then it does not mean that it is based on the RM.
> In most cases RM is used as a low-level storage model. In is very
> convenient to have a complete freedom at low level and then constrain it
> at higher level.
Yes. But why d'you have to go and make things so complicated?
> >>As we found out this table can be also viewed as the universal relation.
> >
> > On the contrary (if I understand UR correctly---my understanding is just
> > based on Ullman's slides, though): It is *my* relation that is the UR in
> > this case.
>
> Your representation makes an additional assumption about equality of two
> department variables. May it is precisely what Ullman meant...
> I agree, different paths express different relationships in data so we
> need to specify explicitly what we need and the database does not help
> here. For example, if we want to get a set of houses related to some
> person then we need to choose via what kind of relationships: house
> ensurance or house sales records. But in any case this mechanism has to
> be present in the database and its query language. I still do not want
> to write joins. You propose to implelent it at the user interface level
> but I would prefer to have more support from the database. For example,
> the database should know about alternative paths, which are represented
> in a way different from explicit joins. One approach consists in
> specifying an intermediate table in the path. In this case the qurey
> might look as follows:
>
> get all houses related to 'Smith' via HouseEnsurance
>
> Nice format, is not it? Would not you like to have such a facility?
> > It is certainly possible to derive useless facts, such as "The customer
> > id of the customer named 'Jon' looks the same as the product id of the
> > product named 'Monkey wrench'" (a join of hypothetical customer and
> > product relvars over id attributes). But the fact is nevertheless
> > *true*, and who am I (or you) to say that such a query should be
> > disallowed? After all, we have already put that information into the
> > database (though in a different form, and probably piecemeal); we *must*
> > be able to retrieve it.
>
> So the question is do we really need to have such a freedom which allows
> us (makes it easy) to produce meaningless results? Once we can evaluate
> what query will produce meaningfull result and what will not then why
> not to have this knowledge as a part of our database?
I object to the terms "meaningless" and "meaningful". All results have meaning in the sense of clearly defined semantics. They may be more or less useful/useless for the purposes at hand, but that is completely different.
And what is a useless query will obviously depend on the circumstances; it cannot be determined once and for all.
> But what I mean is that it
> is better to constrain our freedom of producing meaningless results by
> changing the model, i.e., the axiomatizaiont and inference enigne.
This may be another crux of our disagreement, then. Since what is useless is a value judgement I expect to change over time (and will be different for different users!), I will not cripple my database in this manner for so little gain. I will instead hide complexity by constructing appropriate user interfaces.
> > With proper domain support, the database can even warn us that comparing
> > customer ids and product ids is a suspect undertaking. If you want to
> > restrict the end users' options, just don't give them direct access to
> > the database.
>
> Yes, it is possible. But if we can warn a user about some meaningless
> situation then why not to bring this knowledge into the database as a
> part of the model?
It *is* part of the model, that was my point! I am saying that if users really want to compare customer ids and product ids, then I should respect their wishes, not maim my model in order to make it impossible because *I* think they are wrong!
-- JonReceived on Fri Jun 17 2005 - 16:00:20 CEST