Re: The Practical Benefits of the Relational Model

From: Leandro Guimarães Faria Corsetti Dutra <lgcdutra_at_terra.com.br>
Date: Wed, 16 Oct 2002 16:01:52 +0200
Message-ID: <aojrgi$mmaj4$1_at_ID-148886.news.dfncis.de>


Peter Koch Larsen wrote:

>>
>>Isn't it the case in any language that a rogue type designer can mess
>>things up pretty badly (oops, forgot to call base::XXX).  Before you
>>bring up member hiding, I would point out that it is mostly syntactic
>>sugar and not real protection.

>
> Yes - this is indeed the case. But the operator overloading in D may affect
> otherwise perfectly working code - eg. code working on ELLIPSE-values
> specifically may break when some other user implements the CIRCLE type.

        I fail to see how. Ellipses are a superset of circles: any circle is an ellipse too. Any operation that can be done on an ellipse should still work on a circle, but operations intended for a circle will not necessarily work on an ellipse if they rely on the particularities of a circle.

        Can you give an example, or just explain why you think ellipse operators would fail on circles?

>>>And as soon as you turn to concrete implementations
>>>such as C++, you can find very concise definitions of a model.
>>
>>I think this is apples and oranges territory again.  Surely you could
>>derive some conceptual model from any specific implementation (this
>>would be a VERY complex model in the case of C++).  Would everyone
>>agree that the extracted model is _the_ OO model?!  Probably not.

>
> I do believe that part of the C++ standard is describing an OO model.
> This is done very concisely in the C++ ISO standard. Again, we're off-topic.

        Even if it is off-topic, I think you missed Nathan's point. Sure, there is a C++ model, but the point is that there is also a Smalltalk one, an Objective-C one, and so forth… and there is little common ground between them, to the point that the definitions given to terms in each of these models are either incompatible or fuzzy.

        More specifically, there is no such thing as an OO _data model_.

>>Your assertion that TTM is a "very high-level description" is correct
>>and desirable.  Its purpose is to lay out a general blueprint
>>without restricting implementation possibilities.

>
> This depends on the purpose with the TTM book. If it was to form the base
> for a common group of languages, it fails by not providing a "feel" of what
> e.g. relational assignment should be.

        But you fail to see that the book is not a complete treatise on the RM. It only covers ground that was controversial or new. For well-agreed-upon issues like assignment it just relies on, and references, previous work. And it raises some issues that have yet to be refined, like the implementation and physical ones.

        <flamebait>It is like the Bible. It is not a Systematic Theology work, nor does it provide definitions for each word and concept it uses. It is useful for its purposes, but needs previous information, further elaboration and quite a lot of illumination from the Holy Ghost.</flamebait>

> If it is a motivation for a new
> approach to database management systems (and this I believe to be the case)

        No, it is just a clarification and elaboration on various implications of an old model, the relational one.

> the book is in my opinion far too detailed in the description of its type
> system, and with too little emphasis of what should be its core: relational
> stuff such as view updateability, relational assignment, why nulls should be
> forbidden, why there should not be tuple-level access and lots of other
> stuff in that ballpark.

        As Nathan said, some or all of this stuff is in the "Relational Database Writings" series, and in other books both by Date and by others.

> There I do
> remember reading about view-updateability, but it definitely did not propose
> that ALL views were in theory updateable. The brief mention of the subject
> in TTM claims that (with a back door wrt constraint violations (pg 151))

        Integrity constraint violations are not a "back door"! You simply cannot update any relation if the update would violate integrity constraints, be it a base or a derived relation.

> and
> follows up with an example of an update of a union, where both underlying
> relations are updated. While this does work wrt the view, there is no
> argumentation that this should be so - you could e.g. just update one of the
> relations. I am sceptical of such an approach, especially when the choice
> seems so arbitrary and justification is not present.

> Anyway it is a
> disappointment on such a central point not to see references to books that
> are not by the present authors.

        That is one thing that annoys me about most books by the "relational writers", namely Date, Darwen, Pascal, McGoveran & Codd, and it has been getting worse. What is more, when I try to find books or articles by other people, I cannot find much.

        Now I will propose a daring interpretation of the lack of "enough" "external" references: perhaps there aren't many works worthy of mention. Perhaps, looking at the sad state of the field, this interpretation isn't daring at all, just sad…

>>The closest system type to a rational in our implementation (D4) is
>>Decimal, which is a supertype of several different flavors of integer
>>types.  So loosely the answer is yes.

>
> I had hoped for some rational, that would be suitable for complex numbers
> ;-(

        You could do it yourself… granted, Dataphor is not free software, but it is still very flexible.

>>In our implementation also, they could be different physical
>>representations.  I will qualify that our implementation does have a
>>few gotchas when a descendant changes the physical representation.

>
> I do not believe you do need to have gotchas - just a more complex
> implementation.

        Implementations change. If you read Nathan's posts carefully, you will see that Alphora plans lots of changes and improvements for Dataphor.

>>>and that is not to my liking. For one reason because of the ugliness
>>>of the resulting expression. You might end up writing k :=
>>>INTEGER_DIVIDE(i,j) rather than k := i/j for integers and z :=
>>>REAL_DIVIDE(x,y) for rational numbers.
>>
>>You might, but not in D4.  ;-)
>>
>>k := i div j
>>k := i / j

>
> Okay. But this only requires my argument to change. I could create my
> (mathematically unsound) decimal_with_infinity (or even simpler a bounded
> integer). This type then would be unable to use the "/" or the "div"
> operator - at least not in a D-implementation with S by C implemented. At
> least mathematicians would be very sorry if they would have to change
> operators for each type used.

        I don't see that. If they are different operators that happen to be historically called differently, some Mathematicians at least could even welcome that they have different names.

        Anyway, type is everything. Domains include the definition of applicable operators. Overloading would detract from this cleanliness.

>>This is stolen from Pascal.  C's overload of the "/" operator has
>>doubtless caused innumerable bugs because it is poorly defined.

>
> On the contrary (being off topic again), the "/" operator is very clearly
> defined in C and C++.

        Yet it causes confusion to programmers who don't always realise it means different things in different contexts.
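
        To illustrate the Pascal-style split Nathan describes, here is a minimal sketch in Python, which happens to spell the two divisions differently as well. It is only an analogy to the quoted D4 lines, not D4 itself, and the variable names are made up for the example.

```python
# Two operators, two meanings: the reader never has to inspect the operand
# types to know which division is meant.
i, j = 7, 2

quotient = i // j   # integer division, roughly analogous to "k := i div j"
ratio = i / j       # real division, roughly analogous to "k := i / j"

print(quotient, ratio)

# Caveat: Python's // floors toward negative infinity, so for negative
# operands it need not agree with another language's integer division.
negative_quotient = -7 // 2
```

        With a single overloaded "/", the same expression i / j would yield either result depending on the declared types of i and j, which is the confusion discussed above.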

>>I think you misunderstood.  The principle I am discussing is the
>>inference of "metadata" for derived tables.  Such inference has no
>>impact on the semantics of the relational operators.  Let me give a
>>more concrete example:  We have a Customer table and a Zipcode table.
>>There is a reference (FK) from the Customer to the Zipcode table.  Now
>>let's say we have a view, ActiveCustomer, defined as "Customer join
>>Sale".  We would expect for any SQL system to know the columns (with
>>associated names and types) for the ActiveCustomer view.  This gives
>>us a degree of logical data independence in this respect.  But what I
>>am saying is possible (and is done by Dataphor), is the inference of
>>other information.  In our example, the system can tell us that the
>>ActiveCustomer view references the Zipcode table.  This knowledge can
>>be used, for example, to provide a "lookup" from the ActiveCustomer
>>user interface to the Zipcode table.  This is an extremely powerful
>>concept that has been previously neglected.  I would also mention,
>>though it should be obvious, that there are inference semantics in all
>>relational operators (not just joins) for all metadata (not just
>>references).

>
> I fail to see how this is related to TTM or Dataphor. This inference is
> available for any SQL system with FK-support as well.

        Yes, but it has not been used. And note Nathan mentioned other inferences than strictly the join and referential integrity one.
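
        A rough sketch of what such inference could look like, in Python rather than D4. Everything here (the TableVar and Reference classes, the column names, the rule that a reference survives when all its source columns survive) is an assumption made for illustration; it is not how Dataphor actually does it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reference:
    columns: tuple   # source column names
    target: str      # name of the referenced table


@dataclass
class TableVar:
    name: str
    columns: dict                                   # column name -> type name
    references: list = field(default_factory=list)


def surviving(references, columns):
    """Keep the references whose source columns all exist in the result."""
    return [r for r in references if all(c in columns for c in r.columns)]


def join(name, left, right):
    """Natural join: infer the heading AND the references of the result."""
    columns = {**left.columns, **right.columns}
    refs = surviving(left.references + right.references, columns)
    return TableVar(name, columns, refs)


def project(name, table, keep):
    """Projection: a reference is lost once one of its columns is dropped."""
    columns = {c: t for c, t in table.columns.items() if c in keep}
    return TableVar(name, columns, surviving(table.references, columns))


# Hypothetical schema modelled on Nathan's example.
customer = TableVar(
    "Customer",
    {"CustomerID": "Integer", "Name": "String", "Zip": "String"},
    [Reference(("Zip",), "Zipcode")],
)
sale = TableVar("Sale", {"SaleID": "Integer", "CustomerID": "Integer"})

active = join("ActiveCustomer", customer, sale)
names_only = project("CustomerNames", active, {"CustomerID", "Name"})
```

        Here active still carries the reference to Zipcode, so a user interface could offer the lookup, while names_only has lost it together with the Zip column. That is inference on an operator other than join, in the spirit of Nathan's remark.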

>>>I am still confused. This should be possible even if another
>>>inheritance model is used, should it not? You would just have to
>>>declare (in that hypothetical language) that (e.g.) CIRCLE is a
>>>subtype of ELLIPSE.
>>
>>Right, but then enters the work of specifying the specific semantics
>>of a CIRCLE.  Using a constraint-based inheritance model such as the
>>one provided by TTM, we can easily create types (e.g. LARGECIRCLE)
>>merely by specifying a declarative constraint (i.e. radius > 1000).

>
> Yes. And?

        And this is an example of how easy and powerful S by C is, because of its inherent elegance.
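
        A minimal sketch of the idea in Python, under loud assumptions: the class names, the CONSTRAINTS table and the ellipse() selector are all invented for the example, and a real D would do this inside the type system rather than in a helper function.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    a: float  # one semiaxis
    b: float  # the other semiaxis


class Circle(Ellipse):
    @property
    def radius(self):
        return self.a


class LargeCircle(Circle):
    pass


# Each subtype is defined by nothing more than a declarative constraint
# on ellipse values, listed from most specific to least specific.
CONSTRAINTS = [
    (LargeCircle, lambda e: e.a == e.b and e.a > 1000),
    (Circle, lambda e: e.a == e.b),
]


def ellipse(a, b):
    """Selector: return the value with the most specific type it satisfies."""
    probe = Ellipse(a, b)
    for subtype, constraint in CONSTRAINTS:
        if constraint(probe):
            return subtype(a, b)
    return probe


def area(e):
    """An operator defined for ellipses; it works unchanged on any subtype."""
    return math.pi * e.a * e.b


big = ellipse(2000, 2000)
small = ellipse(5, 5)
plain = ellipse(5, 3)
```

        Adding LargeCircle took one line in CONSTRAINTS, and area() keeps working on every value, whatever its most specific type turns out to be.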

>>Our implementation doesn't do this, but my point
>>is that a logical model is a logical model precisely to give
>>implementors total leeway.

>
> Ahhh - a logical model is there to be broken? I surely misunderstood you!
> ;-)

        Yes, you did. Implementors get total leeway in the physical implementation, inside the bounds set by the logical model; they do not get leeway to break it.

>>There are good reasons for this.  A relation constitutes a single
>>value.  Any attempt to update the relation at a more granular level
>>undermines the definition of the relation.  This is a non issue if
>>more granular update operators are defined as short-hands for
>>relational assignment to avoid update anomalies.  The explicit
>>mentions to SQL are to make it clear that an implementation of D is
>>not to fall into some of the same pitfalls that SQL did in this
>>regard.

>
> I do not follow you here. A string is a single value, but you can still (I
> believe) ask for the character value at a given position, and probably
> update that value as well. This does not undermine the definition of the
> string, does it?

        No, but it makes it much harder to find and check all the associated integrity constraints. What it undermines is the meaning of the specific relation, not the definition of the domain itself.

        Now if tuple-level operators are defined as shorthands for relation-level operators, this whole problem simply goes away. Much gain and no pain.
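
        A sketch of that shorthand idea in Python, assuming a hypothetical RelVar class and a made-up uniqueness constraint: insert and delete are defined purely in terms of relational assignment, so the constraint is always checked against the whole new relation value.

```python
class RelVar:
    """A variable holding one relation value (a frozenset of tuples)."""

    def __init__(self, constraint, value=frozenset()):
        self._constraint = constraint
        self._value = frozenset()
        self.assign(value)

    @property
    def value(self):
        return self._value

    def assign(self, new_value):
        """Relational assignment: the only primitive update."""
        new_value = frozenset(new_value)
        if not self._constraint(new_value):
            raise ValueError("integrity constraint violated")
        self._value = new_value

    def insert(self, row):
        """Shorthand for R := R union {row}."""
        self.assign(self._value | {row})

    def delete(self, predicate):
        """Shorthand for R := R where not predicate."""
        self.assign(frozenset(r for r in self._value if not predicate(r)))


def unique_ids(relation):
    """Example constraint: the first attribute is a key."""
    ids = [row[0] for row in relation]
    return len(ids) == len(set(ids))


customers = RelVar(unique_ids)
customers.insert((1, "Ana"))
customers.insert((2, "Rui"))

try:
    customers.insert((1, "Eva"))   # would duplicate key 1
    rejected = False
except ValueError:
    rejected = True
```

        Because there is no update path that bypasses assign(), there is no way to leave the relation in a state that was never checked as a whole.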

>>CIRCLE to ELLIPSE are not logically coercions.  See the same page 286
>>as before.  The reason (to me) is that the language is both simpler
>>and clearer without coercions.  No funny business.  WYSIWYG.

>
> This must be a question of definition. In my mind the assignment ELLIPSE :=
> CIRCLE is a coercion as ELLIPSE and CIRCLE are different types, that might
> not even share the same physical representation.

        Physical representations are irrelevant. We are talking domains here, not their physical representations.

        And ELLIPSE and CIRCLE are not really unrelated types; one is a subtype of the other.

> I did look at it, but never thought that it would be nonimplementable. My
> first thought was "why? this is clearly the case for type substitution, not
> type inheritance"

        Then *I* ask you why?

> and my second was: "this model has some negative
> implications for performance".

        Again, why?

>>Very much enjoying this conversation.

>
> So do I.

        Then why did you both stop it, or at least take it offline? I very much want to see it carried to its conclusion…

-- 
  _
/ \ Leandro Guimarães Faria Corsetti Dutra        +41 (21) 216 15 93
\ / http://homepage.mac.com./leandrod/        fax +41 (21) 216 19 04
  X  http://tutoriald.sourceforge.net./      Orange Communications CH
/ \ Campanha fita ASCII, contra correio HTML      +41 (21) 644 23 01
Received on Wed Oct 16 2002 - 16:01:52 CEST
