Re: Does Codd's view of a relational database differ from that ofDate&Darwin?[M.Gittens]

From: Jon Heggland <heggland_at_idi.ntnu.no>
Date: Fri, 17 Jun 2005 10:59:09 +0200
Message-ID: <MPG.1d1caf43ee7c488b98969a_at_news.ntnu.no>


In article <Vvjse.122620$Vb7.6870078_at_phobos.telenet-ops.be>, jan.hidders_at_REMOVETHIS.pandora.be says...
> > Note that the relations do not say that *employees* have managers, just
> > that *departments* do. Is it the case that an employee in a given
> > department is always managed by the manager of that department? It is
> > possible, even probable, but the database (as presented) does not say!
>
> IMO that's not really a fair criticism. Whether you use the UR
> assumption or not, you always have to know what the columns of the
> relation(s) mean in order to understand what a query means. So in this
> case the querier can be assumed to know that the 'manager' column
> contains the manager of the department of the employee.

Exactly---it contains the manager of the *department*. My point is that the user is referring to managers of *employees*. The assumption that department managers also manage employees is significantly different from the assumption that the relation DM means that department D is managed by manager M.

Thus, I still say the formulation "find the offices of employees managed by Sally" is ill-formed. It should be "find the offices of employees in departments managed by Sally"---in which case something like "O from EO and ED and DM where M='Sally'" is IMO a very natural query formulation.

Perhaps I don't understand the problem the UR is meant to solve.

Is it that the query with the joins is difficult/complicated to formulate? That is mostly a matter of syntax/UI. The good thing about the join query is that it explicitly states how the relationship between employees and managers is inferred---this has to be stated somewhere/sometime, either by the querier or by the database designer.

Or is it that the view of all tables joined does not necessarily contain all the information in the database? That is pretty obvious from the semantics of the view, so why use it in this manner?

I realise that the computer surely can guess what the user means in this case, but I think it is important that the user is aware what it really is s/he is asking (cf. queries involving NULLs). This is especially important if there are multiple possible paths (I.e. if the "hypergraph is cyclic", if I understand Ullman correctly)---I think the DBMS should ask the user to clarify in the cases where the query is underspecified and/or ambiguous:

Me: Find the offices of employees managed by Sally! DB: You mean the offices of employees who work in departments managed by Sally?
Me: Yes.
DB: Ok, here they are....

The confirmation could be skipped if there is only one alternative (though I think the DB should present the clarified query along with the result anyway). Introduce an additional relation OD (office in department, independent of which employee(s) use(s) the office), and you would have three alternative interpretations of the original query.

Anyway, I consider this mostly a user interface issue (though of course, algorithms are needed for finding possible meanings).

-- 
Jon
Received on Fri Jun 17 2005 - 10:59:09 CEST

Original text of this message