Re: pointers on representing tree in db?

From: Jan Hidders <hidders_at_REMOVE.THIS.win.tue.nl>
Date: 18 Apr 2001 08:32:54 GMT
Message-ID: <9bjjfm$3ti$1_at_news.tue.nl>


Lennart Jonsson wrote:
>
> Quite neet feature. Oracle seems to have a lot of things that DB2
> doesnt. I am however more interested in "classical" work using only
> standard SQL. The solution I came up with is to use two tables
>
> Table Data (id integer not null, parent_id references...) id is p.k
> Table Ancestor (id, ancestor_id) both id and ancestor_id references id in
> Data
>
> [...]
>
> This seems to work with reasonable performance, but I'm eager to see how
> others have solved this type of issue.

FWIW your solution is a well known standard solution and has been studied quite well in the field of database theory. There is for instance the following article:   

       Maintaining the transitive closure of graphs in SQL. 
       Guozhu Dong, Leonid Libkin, Jianwen Su and Limsoon Wong. 
       Int. Journal of Information Technology, 5 (1999), 46-78. 

You can find an electronic version on the page

  http://cm.bell-labs.com/who/libkin/publ.html

Perhaps it may look a bit too theoretical for you but it also contains some concrete SQL you might want to compare to yours. I suggest you try to read it anyway, and if you have any questions, just let me know.

> I'm not very experienced with databases, and I feel that perhaps I'm
> violating normal forms etc by storing some values tvice (both as id,
> parent_id and as id, ancestor_id). Any hints or pointers to
> information would be greatly apreciated.

Perhaps a bit surprising but, no, strictly speaking it does not violate any of the standard normal forms. (The standard normal forms only prevent a specific kind of redundancy, i.e., redundancy that can be computed with a fixed number of natural joins.) But as you already know it contains redundant information and that forms a risc. But you can minimize this risc by somehow enforcing that all access to the tree is done via some functions/methods/procedures that keep the redundant data consistent, and making sure that these functions are correct.

Kind regards,

-- 
  Jan Hidders
Received on Wed Apr 18 2001 - 10:32:54 CEST

Original text of this message