Re: double linked list

From: Ed Prochak <>
Date: Thu, 06 Feb 2003 05:10:04 GMT
Message-ID: <>

--CELKO-- wrote:

>>>When you are ready to demonstrate a superior implementation in

> another SQL
> RDBMS please let us know. And I don't mean superior in terms of some
> theoretical concept like building a database in fifth normal form. I
> mean superior in terms of performance and scalability in the real
> world. <<
> Funny you should ask :)!! I have a book coming out later this year on
> TREES & HIERARCHIES IN SQL (Morgan-Kaufmann), which gives several
> different models in Standard SQL-92 for tree structures which are
> superior in terms of performance and scalability compared to the
> A few years back, we ran a test on a huge tree with the following
> method versus cursors and CONNECT BY; it was 20 to 100 times faster
> for 75,000 nodes and 12 levels deep. It could also do things like
> compare the structures of sub-trees in one SELECT statement.
> The usual example of a tree structure in SQL books is called an
> adjacency list model and it looks like this:
> boss CHAR(10) DEFAULT NULL REFERENCES OrgChart(emp),
> salary DECIMAL(6,2) NOT NULL DEFAULT 100.00);
> OrgChart
> emp boss salary
> ===========================
> 'Albert' 'NULL' 1000.00
> 'Bert' 'Albert' 900.00
> 'Chuck' 'Albert' 900.00
> 'Donna' 'Chuck' 800.00
> 'Eddie' 'Chuck' 700.00
> 'Fred' 'Chuck' 600.00
> Another way of representing trees is to show them as nested sets.
> Since SQL is a set oriented language, this is a better model than the
> usual adjacency list approach you see in most text books. Let us
> define a simple OrgChart table like this, ignoring the left (lft) and
> right (rgt) columns for now. This problem is always given with a
> column for the employee and one for his boss in the textbooks. This
> table without the lft and rgt columns is called the adjacency list
> model, after the graph theory technique of the same name; the pairs of
> emps are adjacent to each other.
> CONSTRAINT order_okay CHECK (lft < rgt) );
> OrgChart
> emp lft rgt
> ======================
> 'Albert' 1 12
> 'Bert' 2 3
> 'Chuck' 4 11
> 'Donna' 5 6
> 'Eddie' 7 8
> 'Fred' 9 10
> The organizational chart would look like this as a directed graph:
> Albert (1,12)
> / \
> / \
> Bert (2,3) Chuck (4,11)
> / | \
> / | \
> / | \
> / | \
> Donna (5,6) Eddie (7,8) Fred (9,10)
> The first table is denormalized in several ways. We are modeling both
> the OrgChart and the organizational chart in one table. But for the
> sake of saving space, pretend that the names are job titles and that
> we have another table which describes the OrgChart that hold those
> positions.
> Another problem with the adjacency list model is that the boss and
> employee columns are the same kind of thing (i.e. names of OrgChart),
> and therefore should be shown in only one column in a normalized
> table. To prove that this is not normalized, assume that "Chuck"
> changes his name to "Charles"; you have to change his name in both
> columns and several places. The defining characteristic of a
> normalized table is that you have one fact, one place, one time.
> The final problem is that the adjacency list model does not model
> subordination. Authority flows downhill in a hierarchy, but If I fire
> Chuck, I disconnect all of his subordinates from Albert. There are
> situations (i.e. water pipes) where this is true, but that is not the
> expected situation in this case.

> 5. The nested set model has an implied ordering of siblings which
> theadjacency list model does not. To insert a new node, G1, under part
> G. We can insert one node at a time like this:
> DECLARE rightmost_spread INTEGER;
> SET rightmost_spread
> = (SELECT rgt
> FROM Frammis
> WHERE part = 'G');
> UPDATE Frammis
> SET lft = CASE WHEN lft > rightmost_spread
> THEN lft + 2
> ELSE lft END,
> rgt = CASE WHEN rgt >= rightmost_spread
> THEN rgt + 2
> ELSE rgt END
> WHERE rgt >= rightmost_spread;
> INSERT INTO Frammis (part, lft, rgt)
> VALUES ('G1', rightmost_spread, (rightmost_spread + 1));
> END;
> The idea is to spread the lft and rgt numbers after the youngest child
> of the parent, G in this case, over by two to make room for the new
> addition, G1. This procedure will add the new node to the rightmost
> child position, which helps to preserve the idea of an age order among
> the siblings.

And this update doesn't blow performance out of the water? Doesn't it potentionally update large percentages of the entire table?

Since you only go by two, wouldn't the next case of large updates happen after another couple inserts at the "leaf" nodes. Maybe I do not follow this correctly. In your orgchart example, what happens when Donna hires an assistant? It seems to me all the current employee records get touched except Bert. Is that right? (Strikes me like a network model DB in that you have to rebuild links as part of inserts.) Am I wrong?

> I have a book on TREES & HIERARCHIES IN SQL coming out in 2003 with
> more code and details.

Are we rediscovering the past?

(Network model databases have their place and can out perform a RDBMS in those cases. It's just that those cases are few and far between. In my experience, it's when the data is static, ie, load data once and read many times very fast. Dynamic data just bogs done on link updates.)

Disclamer: I'm no DB theoretician. I'm reading this from

(Which makes me curious who put oracle in the mix of groups when no other DB product is included?)

Ed Prochak
"Two roads diverged in a wood and I
I took the one less travelled by
and that has made all the difference."
robert frost
Received on Thu Feb 06 2003 - 06:10:04 CET

Original text of this message