Re: domain questionnaire

From: Jan Hidders <hidders_at_REMOVE.THIS.win.tue.nl>
Date: 26 Feb 2001 09:58:21 GMT
Message-ID: <97d9bt$dfm$1_at_news.tue.nl>


Vadim Tropashko wrote:
> In article <97bek2$sn2$1_at_news.tue.nl>, Jan Hidders says...
>
> >As I said elsewhere, when you are making a data model you are making
> >a model of what the people you are modeling for think that their
> >reality is like. So if they don't know then you cannot model it, by
> >definition. That's why ordinary data modelling is usually simpler
> >than modelling in the work-of-god arena; it's harder to ask God if
> >your model is right. :-)
>
> Can I ask priests instead?

I dunno, in my religion they tend to be more interested in the words than in the works of God. :-)

> Seriously, I see your point: I don't necessarily have to expect that
> relational model must be successful in science topics simply because
> of the fact that it is successful in business world.

Hmmm, that's not really what I meant. Your goal should not be to model reality but to model the model that the physicists that are going to use your database use to describe their reality. If they use SI then you have to use SI. If they use something else then you have to use that. It's that simple.

> The situation looks somewhat parallel to OOP, where any attempt of
> deducing inheritance via LSP principle fails for rectangle and
> square, for example. Somebody even summarized it like "OOP doesn't
> work for Geometry".

I *vehemently* disagree with that; OOP *does* work for geometry. The problem is that there are many people who do not really understand how OO modelling works. (The main problem is that people think it models statics where in reality it models dynamics.) That is IMHO not their fault but the fault of the OO people that have been claiming all the time that OO modelling is so straightforward and intuitive (everything is an object et cetera) that you don't have to think very hard about what your model means and what it exactly models. As database modellers have already known for decades, that is very very wrong.

> Still, since science has much more rigorous vocabulary, many people
> are tempted to test their ideas in this area first...

And it does work there too. As I hope we will see. :-)

> This is more like relational metaphysics. Let's write a person as a vector:
>
> <person|

If your users write persons as vectors then this is Ok. If they don't then you shouldn't either.

> Same for the weight domain:
>
> |weight>
>
> Then, some "inner" product looks like:
>
> <Joe|weightInKg> = 60
>
> If we want to translate weight, then
>
> |weightInKg> = 0.4 |weightInLbs>
>
> Here, while weight metrics clearly belong to some linear vector
> space, can't I make person superpositions as well? (Schrödinger cat
> societies:-)
>
> This QM parallel, however, looks less naive if we notice that a fact like
>
> <JoesCar|plateNumber>
>
> is a QM measurement, where measuring device is, say, a police
> officer.

So, what is the problem? The relational model models facts. What are the facts in QM? The facts are the measurements. Then you ask what atomic values / printables are involved in these facts. So at that point you have to ask the user what it is exactly that he or she wants to know/remember about these facts and how this should be represented. About the weight measurement you would probably want to store a string that identifies the person, the units of the weight, and a number that represents the weight that was measured. For the plate-number registration you would probably want to store some identifier of the car plus the plate number.
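As a rough sketch of what that could look like, here are the two kinds of facts written down as plain tuples of printables. The column names and the plate number "ABC-123" are made up for illustration; only Joe, the 60 kg and JoesCar come from the example above.

```python
from typing import NamedTuple


class WeightMeasurement(NamedTuple):
    person_id: str  # string that identifies the person
    unit: str       # unit of the weight, e.g. "kg"
    value: float    # the number that was measured


class PlateRegistration(NamedTuple):
    car_id: str        # some identifier of the car
    plate_number: str  # the plate number as it is written down


# Each relation is a set of facts; each fact is one measurement.
weight_facts = {WeightMeasurement("Joe", "kg", 60.0)}
plate_facts = {PlateRegistration("JoesCar", "ABC-123")}  # hypothetical plate
```

Whether the unit should be a column of its own or fixed per column is exactly the kind of thing you would have to ask the user.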

Note that these are all printables with some kind of representation. You have to ask the user what the representation exactly is. This can be used to decide if certain columns have the same domain:

  rule #1: If two columns have the same domain then if the same
           denotation is found in both columns then it denotes the
           same value.

The next question is what operations you want to have on these values inside the database. Again you have to ask the user which operators he or she needs to ask the queries that they want to ask, and what the semantics of these operators are in terms of the notation. The most important operator is the equality operator. This can be used to decide which columns have different domains:

  rule #2: If two columns have the same domain then the values in these
           columns can be meaningfully compared.

(Actually this rule implies rule #1 because any meaningful comparison would respect the rule that X = X.)

For all the operators you have to decide on which columns they can operate and in which columns their results are allowed. This also can be used to decide which columns have different domains:

  rule #3: If two columns have the same domain then the values of one
           column can be used as a certain argument of a certain
           operator if the values of the other column can also be
           used there.

  rule #4: If two columns have the same domain then if one column can
           contain the result of the operator then the other column can
           also contain the result of that operator.

After applying all these rules you will have found which columns may belong to the same domain. You can then proceed with determining precisely which values are allowed in every domain.
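Rules #2 to #4 can be sketched roughly as follows. This is only an illustration under a few assumptions: the user's answers are recorded per column as a "profile", meaningful comparability is simplified to a group label, and all column names and profiles are hypothetical. Columns whose profiles differ must have different domains by the rules; columns with identical profiles are merely candidates for sharing one.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, NamedTuple, Tuple


class ColumnProfile(NamedTuple):
    comparable_group: str                   # rule #2: what it can be compared with
    argument_of: FrozenSet[Tuple[str, int]] # rule #3: (operator, argument position)
    result_of: FrozenSet[str]               # rule #4: operators whose result fits


def candidate_domains(profiles: Dict[str, ColumnProfile]) -> List[List[str]]:
    """Group the columns that no rule forces into different domains."""
    groups = defaultdict(list)
    for column, profile in profiles.items():
        groups[profile].append(column)
    return sorted(sorted(columns) for columns in groups.values())


# Hypothetical answers obtained from the user.
plus_args = frozenset({("+", 0), ("+", 1)})
profiles = {
    "weight.person_id": ColumnProfile("persons", frozenset(), frozenset()),
    "car_owner.person_id": ColumnProfile("persons", frozenset(), frozenset()),
    "weight.value": ColumnProfile("weights", plus_args, frozenset({"+"})),
    "plate.plate_number": ColumnProfile("plates", frozenset(), frozenset()),
}

for group in candidate_domains(profiles):
    print(group)
```

Here the two person_id columns end up in one group, and the weight values and the plate numbers each get a group of their own.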

Applying all this to your example is left to the reader as an exercise. :-)  

-- 
  Jan Hidders
Received on Mon Feb 26 2001 - 10:58:21 CET
