Re: XQuery question

From: Sebastian Schaffert <schaffer_at_informatik.uni-muenchen.de>
Date: Fri, 16 May 2003 10:24:49 +0200
Message-ID: <ba27if$lq9$1_at_minotaurus.cip.informatik.uni-muenchen.de>


Bob Badour wrote:

> "Sebastian Schaffert" <schaffer_at_informatik.uni-muenchen.de> wrote in
> message news:b9vvpd$hjc$1_at_minotaurus.cip.informatik.uni-muenchen.de...

>> Costin Cozianu wrote:
>>
>> > Sebastian Schaffert wrote:
>> >
>> >> Costin Cozianu wrote:
>> >>
>> >> [transitive relations]
>> >>
>> >>>
>> >>> SQL '99 it's called WITH RECURSIVE. IBM db2 decided that recursive
>> >>> is superfluous and didn't implement the full options (which are
>> >>> probably more powerfuil than you need for transitive closure)
>> >>>
>> >>> Here's a DB2 Example:
>> >>>
>> >>> CREATE TABLE Rel (ID integer, Parent integer );
>> >>>
>> >>> WITH MyView ( ID Integer, Ancestor integer)
>> >>> AS ( SELECT  ID , Parent as Ancestor FROM Rel
>> >>>     UNION ALL
>> >>>      SELECT r1.ID, r2.Parent AS Ancestor
>> >>>      FROM Myview r1, Rel r2
>> >>>             WHERE r1.Parent= r2.ID )
>> >>> SELECT * from MyView
>> >>>
>> >>> As you can see MyView is joined recursively with otsel;f in its
>> >>> definition so that we get the transitive closure.
>> >>
>> >>
>> >> Now, just tell me in which way this is *declarative*, and why SQL is
>> >> less complex than XML query languages...<g>
>> >>
>> > The fact that is declarative is obvious, and if you want me to explain
>> > that yo you, I'm at a loss. Either get over it, or else stop the
>> > discussion.
>>
>>
>> The point is, the expression you provide above is not obvious without
>> digging deeper into it's semantics.

>
> No language is obvious without knowledge of its syntax and semantics.

True. But I'd for example argue that Pascal is more "obvious" than Assembler which again is more "obvious" than a sequence of 0's and 1's, isn't it?

>> 
>> The "either get over it, or else stop discussion" is the argument of the
>> weak who is not able to support his point differently.

>
> Sebastian, you lack the intellectual honesty to cede valid points. That
> does not weaken Costin's argument; it just wastes his time.

I'd prefer not to comment on that or otherwise we end in a flame war without fair discussion.

>
>

>> >>>>> But in order to do a fair and unbiased comparison I'd like to know

> if
>> >>>>> XQuery is relationally complete (including the transitive closure
>> >>>>> of a
>> >>>>> binary relation).
>> >>>>
>> >>>>
>> >>>>
>> >>
>> >> Whether XQuery is relationally complete or not is of no relevance,
>> >> since it is not intended to be so, as the underlying data is not
>> >> relational, it is tree-shaped.
>> >>
>> >
>> > That's non-sense. Data is data. It comes from the plural of the latin
>> > "datum" which means a "given", therefore data is a collection of
>> > (given) facts . "Given facts" are proposition that are true. Relational
>> > databases hold a collection of facts and those have no "relational"
>> > shape. That's just non-sense good only for marketing brochures.
>>
>> What you call a proposition is really considered a n-ary predicate over
>> atomic terms in logics.

>
> What Costin calls a proposition is a proposition. It is an instantiated
> predicate known or assumed to be true, and a relational database comprises
> a set of such propositions.

Maybe we have a different terminology here. A proposition in my understanding is not necessarily true.

> The propositions do not necessarily all instantiate the same predicate.

A single table instantiates a single predicate.

>> > If you're willing to show me a useful "tree-shaped" logic, I'd be very
>> > curious to find out more about that.
>>
>>
>> Ok, take first order logic. It's data is tree shaped (terms made up of
>> functors and constant symbols). In contrast, "datalog" (or the
>> relational theory) is merely a subset of FOL with no functors, only
>> constants.
>>
>> FOL:
>> forall X. person(X) => person(father(X))
>>
>> Obviously, father('John') is a (simple) tree shaped data item.

>
> And how, exactly, does this differ from a relational 'person' domain with
> a 'father' operation returning 'person' ?

Where is the father-Operation? How do you write it? How do you have transitivity over it?

>> You cannot express such terms conveniently in datalog.

>
> I am not convinced by your assertion.

Then, please give the above statement in datalog.

I hope you accept that relational "database logic" is a strict subset of (i.e. strictly less expressive than) first order logic (if you don't, please have a look into a good book on mathematical logic). Syntactically, the difference is:
- missing function constructs (i.e. only flat data, no terms) - missing existential quantification

As for semantics, what you cannot express in "database logic" are infinite/recursive data items (i.e. X = f(X)). This is certainly of some advantage for relational databases, since testing entailment thus becomes completely decideable. However, this is at the cost of lower expressivity.

>
>

>> >> I prefer encoding my data model as a tree in my data model than
>> >> encoding it in completely ridiculous SQL queries (see above).   If
>> >> the data is
>> >> inherently graph-structured it is much more intuitive to represent it
>> >> graph structured and provide appropriate query languages for it
>> >> (note: I don't think XQuery is appropriate, but this is not the fault
>> >> of the XML data model).
>> >
>> > You are confused. Graphs and relations are almost indistigusihable from
>> > a logical point of view. A tree is a binary relation with certain
>> > properties.
>>
>> I think that rather you are confusing the different levels of
>> abstraction. Obviously, I can represent a graph structure as a binary
>> relation. On the other hand, this is merely an encoding of the "picture"
>> (or "shape") of the graph, where the latter can immediately be grasped
>> while it is almost impossible to even partially imagine the shape of the
>> graph when encoded as a binary relation. That's what I mean with
>> "declarativity".

>
> I suggest to you that the "picture" is merely an encoding of the binary
> relation. Google on EWD696
>
Granted. But if you'd explain a graph to students, you would almost certainly use the "picture" as illustration. Dijkstra probably did that as well (didn't have a look at the cited articles). This at least indicates that the "picture" view of a graph is much easier to comprehend (or "grasp") than the binary relation or adjacence matrix view.

> See also: EWD1280a or EWD1268 or EWD992 or EWD991 or EWD872
>
> (I've been perusing Dijkstra's EWD series recently. Could you tell?)
>
>

>> Relational databases are very declarative for *certain* kinds of data,

>
> They are declarative for all kinds of data.

No. Basically they are not declarative for any kind of data that is not in inherently in at least first normal form.

>
>

>> i.e. such data that is inherently relational (addressbooks, booking
>> system).

>
> Facts are inherently data, and data are inherently facts. A relational

But a fact is not necessarily relational. Also, the fact in a relational database is not data of the domain in discussion, it is a statement about the truth of some relation between the data.

> database comprises a set of facts. The suggestion that data is "inherently
> relational" or "inherently trees" or inherently any other structure is
> absurd.

Basically what you are telling me invalidates the complete idea of object oriented programming. Moreover, it invalidates the necessity for first order logic.

> Data are data, and structure is structure.

Unfortunately, your precious relational facts lack a certain amount of expressivity found in other approaches, since they are only flat.

>
> The question becomes: How does one represent data? As data or as
> structure? The relational model suggests we represent data as data.

No, it doesn't. The relational model enforces a different kind of structure on your data.

>> >> XML is merely a form of tree structured data representation which is
>> >> IMO rather intuitive for the user. How the data is stored is a
>> >> completely different matter. It could be an XML view on a relational
>> >> database. It could also be an XML view on an XML document. And it can
>> >> be a mixture of both (i.e. combined XML view onto multiple,
>> >> heterogeneous databases).
>> >>
>> >
>> > Your opinion does not count as long as it is not backed by arguments.
>>
>>
>> One cannot back subjectivity by arguments, and when one talks about
>> "intuitivity", it is inherently subjective.

>
> If one associates "intuitivity" with usability, one can measure it
> empirically.

Obviously. Unfortunately such empirical investigations are only rarely conducted in computer sciences.

>> Your answer again shows
>> that you try to be insulting when you don't have arguments left.

>
> Costin is efficient. I am insulting. Your opinion does not count and
> neither do your insinuations.

Every opinion does count. But I see that it is senseless to broaden your horizon.

>> >>> This leads in my opinion to way unnecessary complexity.
>> >>
>> >> As for unnecessary complexity, I'd like again to refer to your SQL
>> >> example above.
>> >
>> > For which you haven't provide a counter example. Show me a general
>> > transitive closure (not just recursive descent down the bracnches of a
>> > tree) over a binary relation and let's talk after that.
>>
>> Recursive descent is per definition the way to do transitive closure
>> over a graph, isn't it (take your favorite book on graph theory and
>> you'll see)? It's just a matter of how I encode the graph. Your example
>> XML document is not the appropriate representation of graph structures
>> in a graph language, it is trying to "port" the insufficiencies of
>> relational databases to XML. In XML, I don't need to encode the graph as
>> binary relation, I can encode it as graph (under the assumption of a
>> reasonable link mechanism).

>
> In other words, XML lacks both logical and physical independence. If one
> encodes the graph in the data and not in the structure, one cannot
> evaluate the transitive closure even though the data encode a graph. How
> does one encode a DAG in the hierarchic structure of an XML document? How
> does the hierarchic structure of an XML document make this encoding any
> simpler, any more natural or any more intuitive?
>
>
>> I repeat it, I am NOT claiming that XQuery is good. I am claiming that
>> an inherently tree or graph-shaped data representation language is often
>> superior to flat relational tuples.

>
> If an n-ary tuple describes a point in an n-dimensional space, and if a
> relation comprises a locus of such points through the n-dimensional space,
> how exactly is a relation "flat" ?

Not a "relation", but a "relation in the relational database model". The data contained in the latter relation is atomic, i.e. it may not consist of complex statements.

Sebastian Received on Fri May 16 2003 - 10:24:49 CEST

Original text of this message