Re: database design method

From: Lauri Pietarinen <lauri.pietarinen_at_atbusiness.com>
Date: 11 Nov 2002 23:44:58 -0800
Message-ID: <e9d83568.0211112344.1d0c5ccf_at_posting.google.com>


> >They are - in your approach - in fact identified by their parents
> >values.
>
> Have you been reading my PhD thesis?

Sorry to say - no I haven't. Can you give me a pointer?

> But seriously, they are not. Two trees
> are the same if they have the same attributes and set of children, just like
> two tupels are the same if they have the same attributes. Period. The parent
> doesn't come into that.

But we are looking at the tree from two (or more?) viewpoints:

  • the nodes themselves
  • the nodes in relation to other nodes

Say I have I file structure with files and folders (or just folders to simplify the picture).

Now the folders them selves are interesting on their own right, but what is also interesting is their relation to one another. But these two aspects are distinct.

In the "nested relation" approach we would emphasize the structural aspect only.

And what if these folders have files in them? Do we just nest the files inside the folders?

And if the files have access rights - we just nest them inside the files?

And the nodes _are_ identified via their parents. If you dont know their parents, how do you get a hold on them?   

>
> By using recursion:
>
> Then you can do (I hope you will forgive me if I use the conventional nested
> relational algebra):

Sure.

>
> R1 := Trees UNION (UNNEST[Subtrees](PROJECT[Subtrees](R1)))
> R2 := PROJECT[Name,Attr](R1)
> R3 := SELECT[Attr="Lauri"](R2)
>

OK - no do-while loops. But how on earth does the optimiser handle this query besides materialising the whole tree? Can it use a traditional B-tree index to find Attr="Lauri"?

I think we are loosing so much for so little gain.

> >Do you need a do-while loop for that??
>
> No.
>
> >What's wrong with the 'traditional' way, say [...]
>
> You are inventing a nodeID that was not in the original data model of the
> user. Don't forget that we are talking about a *logical* data model here.
> There's nothing wrong with using surrogate identifiers but that is an
> implementation issue and the user shouldn't have to be aware of that.

Well, speaking of XML specifically - the nodes are identified by their parents and their position relative to the other siblings, and that causes lot's of problems. So you either identify the nodes by position or by some surrogate. My vote goes for surrogates.

>
> >This requires recursive support (available
> >in DB2, and partly Oracle)
> >
> >(Needless to say: Dataphor has it)
>
> Good. So everything we need is already there.

Sorry.
Not the recursive definitions, but an explode-operator.

>
> >Or, you can take a look at
> >
> >"On Accelerating XPath Evaluation in Any RDBMS"
> >
> >http://www.google.com/url?sa=U&start=1&q=http://wwwhome.cs.utwente.nl/~keulen/onderzoek/XPathAccel/tods2002_xpath-accel.pdf&e=747
>
> I know that research very well. (It's from the university where I studied.)
> In fact, part of my current research is about showing how simple relational
> algebra is enough to implement almost the whole of XPath, axes, filters and
> all. That shows that even something as complicated and messy as XML (=
> eXtremely Messy Language) can be effectively tackled by the simple relational
> model.
>
> >But I want to use my RELATIONAL OPERATORS!!
>
> You can. But I don't think you should be using algebra operators for that
> but a more declarative type of language. I assume Dataphor has that?

What on earth could be more declarative than relational operators???

> >They are hard to write and the optimizer won't understand them anyway.
>
> It should be able to deal with recursion. A lot of research has been done in
> that area, but at the moment very little of that has found its way into real
> DBMSs. Btw., while-loops can often be simulated with recursion (depends upon
> the type of recursion that is allowed and what its semantics is) so a good
> query optimizer should also be able to deal with that kind of construct.

Sorry, I am not convinced.

Maybe you could find an example where the recursive definition is _really_ needed and we can't do without it?

regards,
Lauri Pietarinen Received on Tue Nov 12 2002 - 08:44:58 CET

Original text of this message