Re: Does Codd's view of a relational database differ from that of Date & Darwin? [M.Gittens]
Date: Fri, 17 Jun 2005 12:39:18 +0200
Jon Heggland wrote:
> In article <42b185ff$1_at_news.fhg.de>, email@example.com says...
>>For example, consider the use of joins. We have several types of them
>>including manual joining by means of WHERE. Then in each individual
>>query you need to specify all the details of joins. Again, that is
>>needed because our database is unable to derive the necessary information. And
>>it is unable to do it because it does not know the semantics of data -
> Not at all. It knows constraints (of which foreign keys are an important
> special case), it knows domains. It can suggest "join paths", but if
> multiple paths are possible, some path (or combination of paths) must be
> selected! This can be done in multiple ways; *how* is not an issue of
> data model.
> What makes your model "more semantic"? How does your model specify which
> path(s) to use? In your model, how does the EO, ED, EP, DM, OD database
> look, and how is the "find the offices of employees managed
> by Sally" handled?
>>it simply can retrieve the specified data according to processing
>>instructions given in the query. And the semantics is absent because
>>there is no acceptable data model. So in very simplified form you can
>>consider the goal to be getting rid of joins. Each model then will include as a
>>necessary part all relationships, and the user needs only specify *what*
>>he wants to retrieve but not *how* he has to produce the result set.
>>
>>As I mentioned somewhere, a motivating example from UR model might be
>>very appropriate for COM as well: we need to compute queries like
>>
>>SELECT Employees.name FROM Employees WHERE Managers.name='Jones' AND
>>Products.type='cars'
>>
>>In terms of MS WinFS it would sound like "I want to get all employees
>>with Jones as a manager and related to product with type 'cars'". Note
>>that we use 3 tables here (2 for constraints which are propagated and 1
>>as a target).
> Thank you. Now I think I have discovered still more about where our
> disagreement lies. I dislike the query above. It is vague, ambiguous and
> like a box of chocolates---you never know what you're gonna get, because
> you don't specify what kind of relationships between employees, managers
> and products you are asking about. Someone might have added the
> information that a certain manager's mother-in-law likes a certain kind
> of product, and that influences your result whether you think that
> relevant or not.
> In the RM, queries are theorem proofs. Very exact, but you have to be
> specific when you ask them. So if you want information retrieval-style
> queries---simple formulation, possibly ranked results, but you have to
> manually check their validity---your approach has merit, and you feel
> the RM is cumbersome. Fair enough. I think it is possible, and very
> useful, to use the RM for this purpose also, cf. my recent posts about
> the UR. You disagree, and I don't think any of us will convince the
> other in this manner.
You explained your point very clearly, and I understand what you mean, although I do not agree with you (just as you do not agree with my point of view). Since this is a rare case (most disputes in this forum have a religious/theological/political character), I would like to summarize where we have got to. There are two points of view:
- The conventional approach, where we have complete freedom to write heavy queries containing all the instructions the database needs to build the result set. They can be written manually (with numerous errors and inconsistencies) or generated automatically via some sophisticated user interface. The issue, however, is that the database gets these instructions only when they need to be executed and then forgets everything again. It is a very flexible approach, like all methods with complete freedom (for example, assembly language, where we can do whatever we like). For each new result set we create a new query, and most queries will actually have one and the same form and one and the same joins.
- An alternative approach, where we try to move most of the result-set-building "instructions" into the database itself. The idea, however, is not to trivially hard-code them in tables but to change the way we view our data. In other words, these instructions (joins, constraints etc.), which are normally part of our queries (as in the first approach), now become an integral part of our data semantics. Moreover, this part of the data semantics is more important than the data as understood in the first approach. So the role of the database consists in managing all those join specifications and constraints. In this sense they are not joins and constraints anymore (that is only how they are used when building the result set); they have a more complex interpretation. In any case, the benefit is that we can now build simpler queries. More important, however, is that more information is under the protection of our database (reliability and consistency). And even more important is that we now have a different approach to modeling and database design.
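To make the contrast between the two approaches a little more concrete, here is a minimal sketch in Python with SQLite. It is only an illustration under stated assumptions, not a description of COM or of the UR model: the schema, the sample rows, the `join_paths` catalog table and the helper `query_without_joins` are all hypothetical names invented for this example. The first query carries all of its join instructions itself; the second names only the target and the constraints, and the join conditions are looked up from data stored once in the database. If more than one path is stored between two tables, the helper refuses to guess, which is the ambiguity objection raised in the quoted text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Managers  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Products  (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE Employees (id INTEGER PRIMARY KEY, name TEXT,
                        manager_id INTEGER REFERENCES Managers(id),
                        product_id INTEGER REFERENCES Products(id));
-- Hypothetical catalog: join conditions stored once, as data.
CREATE TABLE join_paths (src TEXT, dst TEXT, condition TEXT);

INSERT INTO Managers  VALUES (1, 'Jones'), (2, 'Smith');
INSERT INTO Products  VALUES (1, 'cars'), (2, 'boats');
INSERT INTO Employees VALUES (1, 'Ann', 1, 1), (2, 'Bob', 1, 2), (3, 'Cid', 2, 1);
INSERT INTO join_paths VALUES
  ('Employees', 'Managers', 'Employees.manager_id = Managers.id'),
  ('Employees', 'Products', 'Employees.product_id = Products.id');
""")

# Approach 1: the query itself spells out every join.
explicit = conn.execute("""
    SELECT Employees.name
    FROM Employees
    JOIN Managers ON Employees.manager_id = Managers.id
    JOIN Products ON Employees.product_id = Products.id
    WHERE Managers.name = 'Jones' AND Products.type = 'cars'
""").fetchall()


def query_without_joins(conn, target, column, filters):
    """Approach 2 (sketch): the caller says *what*, the catalog supplies *how*.

    filters maps (table, column) -> required value. Table and column names
    are interpolated unchecked, so this is for illustration only.
    """
    joins, where, params, joined = [], [], [], set()
    for (table, col), value in filters.items():
        if table != target and table not in joined:
            paths = conn.execute(
                "SELECT condition FROM join_paths WHERE src = ? AND dst = ?",
                (target, table),
            ).fetchall()
            if len(paths) != 1:
                # Zero or several stored paths: someone has to choose.
                raise ValueError(f"{len(paths)} join paths from {target} to {table}")
            joins.append(f"JOIN {table} ON {paths[0][0]}")
            joined.add(table)
        where.append(f"{table}.{col} = ?")
        params.append(value)
    sql = (f"SELECT {target}.{column} FROM {target} "
           + " ".join(joins) + " WHERE " + " AND ".join(where))
    return conn.execute(sql, params).fetchall()


derived = query_without_joins(
    conn, "Employees", "name",
    {("Managers", "name"): "Jones", ("Products", "type"): "cars"},
)
print(explicit, derived)
```

With the sample rows assumed here, both queries return the same single employee. The difference is only in where the join knowledge lives: in every query text, or once in the database.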
Which approach is better, more useful, more efficient, or whatever other criterion you like? I do not know. I prefer the second one. But it does not actually exist yet, so it is a future direction. In the RM this direction was actually killed (as the UR model).
Received on Fri Jun 17 2005 - 12:39:18 CEST