Re: double linked list

From: --CELKO-- <>
Date: 5 Feb 2003 16:38:53 -0800
Message-ID: <>

>> Connect-By uses indexed nested loops. If both id and parent id
columns are indexed, it is reasonably fast. <<

And it is reasonable to assume that they will have two column index to ensure uniqueness of the pairs.

>> Recursive Aggregate Functions aka Hierarchical Totals are prefectly
possible with connect-by. Just use subqueries in the "select" list (Date's idea of expressing group-by with subqueries in the "select" list) ... I'm skeptical, however, about optimization of hierarchical total. For one thing, I'm not aware of any product that would be able to unnest subquery in the select list into "group by" query. Corrections are welcome. <<

I don't know of any such optimization either. The levels of correlation between the scalar subquery expresion and a containing query with mutliple table references in its FROM clause would be be pretty complicated.

>> Not true anymore for Oracle 9. <<

I am been corrected on that one, and changed it in the text.

>> What do you mean by "direct use"? This is SQL, a user is allowed to
inline and nest subqueries. <<

But you want to avoid them in favor of simple, "flat" joins whenever possible. They hurt performance and maintainability.

>> Where in the above examples did you see a cursor? <<

It's there, but it is hidden in the syntax. The fact that I have to proceed from one row to another in a particular order means I am doing sequential processing. Do you know of a parallel algorithm for ue with the adjacency list model of a tree?

>> The appealing part of the transitive closure syntax -- being it
Recursive With, or Connect By -- is that the hierarchical query doesn't use a particular labeling scheme. <<

Granted, but you also lose the ability to order the children, to separate the tree structure from the nodes (organizational chart versus personnel)

>> Certainly, you might insist that only one labeling scheme exists --
nested sets -- but we know otherwise. [snip list] <<

I know there are quite a few and I cover them. I just brought out nested sets
in this post because I have a "cut & paste" file on them. Nested sets is the easiest method to explain and it has the simplest math of any method that uses numeric pairs to locate a node.

>> The hierarchical query that is locked into any of these scheme is
implementation dependent! <<

Unh? All of the methods on your list use Standard SQL, and have no implementation dependent features (well, other than precision and scale of numbers) or vendor extensions. There are trade-offs with each; the amount of math needed for a query in the ones that depend on numeric pairs, the storage for a path enumeration string and the seartch time, etc.

The big advantage of the adjacency list model is that it is fast and easy to insert new rows. But summary data is much harder to obtain, and constraints to preserve the tree structure are more complex (most programmers do not bother to write the needed constraints!! Arrgh!). Simple deletion of a row splits the tree apart, so you need code to reconstruct it. Protection from cycles requires either procedural code or table level check clauses (not yet found in many products). Following a path requires cursors - hidden or explicit.

Done right, the adjacency list is harder than people think. Received on Thu Feb 06 2003 - 01:38:53 CET

Original text of this message