Re: Does Codd's view of a relational database differ from that of Date & Darwin? [M.Gittens]

From: Alexandr Savinov <savinov_at_host.com>
Date: Thu, 16 Jun 2005 11:51:30 +0200
Message-ID: <42b14ba3$1_at_news.fhg.de>


Jon Heggland wrote:

>>As I 
>>mentioned, we define very naturally a model dimensionality, model "more 
>>specific" relation, model consequence just because we are able to 
>>produce canonical semantics. As a consequence we can define grouping and 
>>aggregation, inference and other mechanisms.

>
>
> And otherwise, you can't? I feel we are getting nowhere. This is just
> talk.

Yes, definitely. That was clear from the very beginning. First of all, this is a forum. Second, you hold a defensive position that is unbreakable, something like "why do I need OOP if I can implement everything in a procedural language?" One can understand things only if one wants to, i.e., only if one's position is not defensive.

>>I understand your question: So what is the problem, we simply write 
>>queries and database engine executes them. Correct. But what if I do not 
>>want to write numerous queries? I want my database do it for me. I 
>>define only my data and then ask questions and that is all.

>
>
> You have to specify unambiguously what you want in any case. What is the
> difference between writing a query and asking a question?
>
>
>>See above. Because we need to build a database that would know our data 
>>semantics for carrying out inference, aggregation, constraints 
>>propagation etc. Otherwise I need to write complex queries myself.

>
>
> How can it be otherwise? The computer is not a mind-reader.

Yes, we need to put more information into our model so that the database knows what to do when a query does not carry enough information. In other words, the model holds more information and the queries become simpler.

>>>>A few more definitions and we get the concept-oriented data model 
>>>>with which we can play. In particular, such questions as "What is power 
>>>>in kW of some house or what is area of some car" are absolutely legal. 
>>>
>>>Legal, perhaps, but useful?

It depends. If you love the RM, or if it is your religion, then you will accept nothing else as more useful. If you need to earn money, then the new model will be more useful.

>>Microsoft follows (approximately) this way in its WinFS file system. 
>>Although their approach is based on Object-Role Modeling the general 
>>direction is clear - we need to relate our data items in such a way that 
>>the underlying storage is able to do some tasks for us automatically.

>
>
> What tasks? What need is served? Again, how is it useful?

You might want to read more about MS WinFS. They explain its usefulness at the user level (since the topic is difficult). Surprisingly, Microsoft is a leader here.

>>>>- How many dimensions (degrees of freedom) does it have?
>>>
>>>How do you determine this in your model? What inhibits this in the RM?
>>
>>RM is too low-level a model and in this sense we are able to implement 
>>almost everything. The main problem is that the database itself (the 
>>model itself) is unaware of what we are doing, what we are implementing, 
>>what our data means, what is the purpose of some query.

>
>
> Constraints are the mechanism for specifying semantics in the RM. They
> are of course just an approximation---but do you claim that with your
> model, a computer can actually understand the real-world meaning of the
> information you enter into it? How?

Constraints are a mechanism for specifying what is impossible (what cannot exist, normally in intensional form). Data is a mechanism for specifying what is necessary (what really exists, normally in extensional form). Semantics can be defined either by constraints together with data or by data alone. There are, of course, numerous variations of these terms, frequently even incompatible ones.

A model is a model: it is a formalism, so we do not know what its *real* meaning is. By real meaning I mean its canonical (explicitly expressed) semantics, which does not allow ambiguous interpretations. What I claim is that the database should know much more about our data and its relationships in order to be able to perform useful tasks for us.

>>In particular, in RM nothing prevents you from introducing all the 
>>necessary additional features (like dimensionality, hierarchy etc.) but 
>>it simply will know nothing about that.

>
>
> It will know what you tell it, and the inferences it is able to draw
> from that.

There is only one mechanism for telling my database what to do: writing a query. A query is a program, normally written as source code. Each time, it is compiled and then executed as a sequence of operations by the database engine. Once a query has been executed, the database forgets everything and continues its life as simple storage. Most queries have one and the same form, which actually reflects what our data means. Without queries, the database is not able to interpret the data or to maintain it. In simplified form, you can view this problem as one of moving some part of all queries into the database (into the model), where they have a persistent form. After that, queries are simpler but the database is more complex.
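
To make the idea of "moving part of the queries into the model" a bit more tangible, here is a minimal sketch in Python. It is only an illustration: the class `TinyModel`, its methods and the sample tables are hypothetical names invented for this example, not the API of any real database and not a definition of the concept-oriented model. The only point is that the reference between the two tables is declared once and stored, so the later question does not have to repeat the join.

```python
# Illustrative sketch only: TinyModel and its methods are hypothetical,
# not the API of any real database or of the concept-oriented model.

class TinyModel:
    def __init__(self):
        self.tables = {}      # table name -> list of row dicts
        self.references = {}  # (table, field) -> referenced table

    def add_table(self, name, rows):
        self.tables[name] = rows

    def declare_reference(self, table, field, target):
        # Persistent knowledge: 'field' of 'table' holds a row id of 'target'.
        self.references[(table, field)] = target

    def related(self, table, target, target_id):
        """Rows of 'table' that reference row 'target_id' of 'target'.

        The caller does not name the field to join on; the model knows it.
        """
        fields = [f for (t, f), tgt in self.references.items()
                  if t == table and tgt == target]
        if len(fields) != 1:
            raise ValueError("no unique declared path from %s to %s" % (table, target))
        field = fields[0]
        return [row for row in self.tables[table] if row[field] == target_id]


model = TinyModel()
model.add_table("customers", [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}])
model.add_table("orders", [
    {"id": 10, "customer": 1, "total": 5},
    {"id": 11, "customer": 2, "total": 7},
    {"id": 12, "customer": 1, "total": 3},
])
# Said once, kept by the model: orders.customer points at customers.
model.declare_reference("orders", "customer", "customers")

# The "question" no longer spells out how the tables are connected.
anns_orders = model.related("orders", "customers", 1)
print([o["id"] for o in anns_orders])
```

The query side shrinks to naming the two collections and the element of interest, while the model side grows by one stored declaration. That is the trade-off I mean.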

>>>>- If I have two states of this model or system then can I say that one 
>>>>of them is more general than the other (and equivalence as a particular 
>>>>case)?
>>>
>>>How do you define more general? As representing a superset of the 
>>>information? This sounds trivial in the RM, though I haven't thought 
>>>about it all that much.
>>
>>I do not think it is technically difficult. The main problem is that it 
>>contradicts the spirit of the RM, i.e., this question is not 
>>considered relevant, meaningful or even legal.

>
>
> How so? In its simplest form, the question is just "is this relation a
> subset of that?"---unless I mistake your meaning. Please explain.

We are not talking about relations. We are talking about the whole model and its semantics. Again, assume you have two models, each with several thousand tables. The question is whether the first model is more specific than the second one. You follow the RM tradition, where a database is a set of tables and we manipulate these relations by producing new relations, grouping, aggregating, etc. It is a kind of programming in which we have to specify the concrete way our result set is to be produced, and then the database simply executes this program. But you can look at it differently. What if I do not want to write (complex) queries? I want to explain everything about my data at the very beginning, so that this information is part of the model and is maintained by the database.
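
As a rough illustration of asking the question about whole models rather than about single relations, here is a small Python sketch. It rests on two assumptions that are not stated anywhere in this thread and are chosen only to keep the example short: the canonical semantics of a model is flattened into one set of (table, row) facts, and "more specific" is read as plain set inclusion between those sets. The function names are hypothetical.

```python
# Assumptions (illustrative only, not a definition from the thread):
# 1. the canonical semantics of a model is one flat set of (table, row) facts;
# 2. "more specific" means set inclusion between those sets.

def canonical_semantics(model):
    """Flatten {table name: [row dicts]} into a frozenset of hashable facts."""
    return frozenset(
        (table, tuple(sorted(row.items())))
        for table, rows in model.items()
        for row in rows
    )

def is_more_specific(a, b):
    """True if every fact of model a is also a fact of model b."""
    return canonical_semantics(a) <= canonical_semantics(b)

general = {
    "houses": [{"id": 1, "power_kw": 10}, {"id": 2, "power_kw": 15}],
    "cars": [{"id": 1, "area_m2": 8}],
}
specific = {
    "houses": [{"id": 1, "power_kw": 10}],
    "cars": [{"id": 1, "area_m2": 8}],
}

print(is_more_specific(specific, general))  # whole-model comparison
print(is_more_specific(general, specific))
print(is_more_specific(general, general))   # equivalence as a particular case
```

The comparison is made once over the canonical form of the whole model, not table by table in hand-written queries.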

>>>>- How can I look at it at different levels of detail? In other words, how 
>>>>can I produce something that is qualified as an abstract/general 
>>>>representation of this very initial model or system?
>>>
>>>I don't understand this one.

You have a model/system. I want to look at it at the highest level, ignoring all the details. This is not only a legal question, it is of crucial importance. It is no surprise that you do not understand it, and it is not your fault. There are traditions, there are prejudices, there is a rigid coordinate system of contemporary knowledge. For example, we have a data model for a corporation. It includes production departments, a sales department, personnel, etc. But it is still one organisation as such, i.e., one point, possibly with some properties. When I start creating this model I want to define my organisation as one element of the model. Tomorrow I come back, see that this corporation has something inside, and add this something to the model as additional elements. Eventually I will add some very specific elements at the lowest level. But in any case I am able to look at it at different levels and to query it at different levels. This model is not purely hierarchical (it is both hierarchical and multidimensional), and this way of modelling is described in the concept-oriented approach.
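
The corporation example can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the class `Element`, its methods and the headcount figures are hypothetical, and the sketch covers only the hierarchical side, not the multidimensional one.

```python
# Hypothetical sketch: Element and its methods are invented for illustration.

class Element:
    def __init__(self, name, headcount=0):
        self.name = name
        self.headcount = headcount  # own value, used while there are no parts
        self.parts = []

    def add(self, part):
        self.parts.append(part)
        return part

    def total(self):
        # An element with parts is described by its parts, otherwise by itself.
        if self.parts:
            return sum(p.total() for p in self.parts)
        return self.headcount

    def view(self, depth):
        """Return {name: total} for the elements visible at the given depth."""
        if depth == 0 or not self.parts:
            return {self.name: self.total()}
        result = {}
        for p in self.parts:
            result.update(p.view(depth - 1))
        return result


# Day one: the organisation is a single point with one property.
corp = Element("Corporation", headcount=100)
first_view = corp.view(0)

# Later: we discover that it has something inside and refine the model.
production = corp.add(Element("Production", 60))
corp.add(Element("Sales", 30))
corp.add(Element("Personnel", 10))
production.add(Element("Plant A", 40))
production.add(Element("Plant B", 20))

print(corp.view(0))  # still one element at the highest level
print(corp.view(1))  # departments
print(corp.view(2))  # the most detailed level added so far
```

The same model answers at every level; adding detail later does not invalidate the coarse view.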

>>This is why OLAP was developed. We need different levels of detail. 
>>This in turn requires other mechanisms like constraint propagation.

>
>
> We need aggregate functions, you mean?

In order to aggregate and even store data we do not need a database; we can do it in Pascal. The problem is that we need a more productive and more natural model. In the concept-oriented model, aggregation is not a function applied to a set. It is more general: strictly speaking, we can aggregate (project) everything and deproject everything.
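
A very reduced Python sketch of what I mean by projecting and deprojecting follows. The function names `project`, `deproject` and `aggregate`, the data layout and the sample orders are assumptions made for this example only. Here projection goes from a set of items up to the elements they reference, deprojection goes back down to the items that reference given elements, and the usual aggregate function appears as just one thing you can do after a deprojection.

```python
# Illustrative only: the function names and the data layout are assumptions.

orders = [
    {"id": 10, "customer": "ann", "total": 5},
    {"id": 11, "customer": "bob", "total": 7},
    {"id": 12, "customer": "ann", "total": 3},
]

def project(items, dimension):
    """Go up: the set of elements that the items reference along 'dimension'."""
    return {item[dimension] for item in items}

def deproject(targets, collection, dimension):
    """Go down: the items of 'collection' that reference any of 'targets'."""
    return [item for item in collection if item[dimension] in targets]

def aggregate(target, collection, dimension, measure, func=sum):
    """Classic aggregation expressed as a deprojection followed by a fold."""
    return func(item[measure]
                for item in deproject({target}, collection, dimension))

customers = project(orders, "customer")
bobs_orders = deproject({"bob"}, orders, "customer")
ann_total = aggregate("ann", orders, "customer", "total")

print(sorted(customers))
print([o["id"] for o in bobs_orders])
print(ann_total)
```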

-- 
alex
http://conceptoriented.com
Received on Thu Jun 16 2005 - 11:51:30 CEST
