Re: Tree Query question

From: Keith Boulton <boulke_at_globalnet.co.uk>
Date: 1997/11/12
Message-ID: <346a1c85.44624807_at_read.news.global.net.uk>


On Mon, 10 Nov 1997 22:11:33 -0500, rsenn <rsenn_at_capaccess.org> wrote:

>I'm trying to work through a tree query in PL*SQL and getting nowhere.
>Can anybody give me a hand with this? E-mail replies are encouraged,
>even if you also reply to the newsgroup.
>
>Thanks.
>
>Sincerely,
>Randall
>
>
>Data look something like this.
>
>Job Parent Spent
>10 NULL 100 -- top level has NULL parent
>20 10 500 -- child
>27 20 700 -- grandchild
>32 27 50 -- great grandchild
>21 20 200 -- a 2nd branch from job 20
>
>There is no limit to the breadth or depth of the tree, although as a
>practical matter I suspect about 10 or 15 levels will be the depth limit
>reached and 20 or 30 will be the practical limit for breadth.
>
>I want to construct a report similar to this, but can't see the coding
>logic to doing it. Can you point me in the right direction?
>
>Job Spent_by_Job_&_Progeny
>10 1550
>20 1450
>21 200
>27 750
>32 50
>

You need to calculate the total of child spent, including all the childrens' children for each item.

I created a table which included a column for this value, although a reporting table distinct from the main table could be used.

create table j (
  job number not null,
  parent number,
  amount_spent number,
  child_spent number)
storage ( initial 2m next 2m pctincrease 0 );

alter table j add constraint j_pk primary key ( job ) using index storage ( initial 2m next 2m pctincrease 0 );

  • create on index on the parent column alter table j add constraint j_fk foreign key ( parent ) references j;

create index j_1 on j( parent );  

The value for each row could be calculated by a statement like: select sum(amount_spent)
from j
connect by prior job = parent
start with parent = <job for current row>

This would work ok but repeatedly calculates the sum of each branch of the structure and would become very very slow for a large number of items.

Another approach is to calculate the sum of only the next level down. If this is repeated max(level) - 2 times, the impact of all the lowest items in the tree will bubble up to the top. This approach works ok for a five level heirachy containing 11,111 records, but becomes too slow when working with ten times as many rows.

Example code with times for 11,111 records: -- establish max level 3 seconds
select max(level)
from j
connect by prior job = parent
start with job = 1;

  • simple sql statements
  • 3 seconds update j set child_spent = null;
  • 5s update j set child_spent
    • ( select sum( nvl(j2.amount_spent,0) + nvl(j2.child_spent,0) ) from j j2 where j2.parent = j.job );
  • 3s begin
    • three here would be replaced by the max level-2 found above for i in 1..3 loop update j set child_spent
      • (select sum( nvl(j2.amount_spent,0) + nvl(j2.child_spent,0) ) from j j2 where j2.parent = j.job ) where j.child_spent is not null; end loop; end; /

When working with a larger number of records, the trick is to use the figure calcuated at the lowest levels and use it to calculate the levels above.

Example code for 111,111 records. Executes in about 1 minute create or replace procedure calc_cost as

  • order by column position to avoid internal error in pl/sql cursor c is select job, parent, level from j connect by prior job = parent start with parent is null order by 3 desc, 2; curparent j.parent%type; curtotal number; begin curtotal := 0;
    • we are going to update many rows in the table lock table j in exclusive mode;
    • reset existing totals update jobs set child_spent = null; for c_rec in c loop if curparent is null then curparent := c_rec.parent; end if;
      • on change of parent, update the totals of the parent record
      • only the top level has a null parent if curparent <> c_rec.parent or c_rec.parent is null then update j set child_spent = curtotal where job = curparent; curtotal := 0; curparent := c_rec.parent; if c_rec.parent is null then exit; end if; end if;
      • because we are using the query at the start which doesn't
      • reflect changes made within this procedure,
      • we need to fetch the spent values again select curtotal + nvl(amount_spent,0) + nvl( child_spent, 0 ) into curtotal from j where job = c_rec.job;

  end loop;  

end;
/

This procedure fetches the amount spent for all children of a given parent and updates the parent row. The value in the parent row is then used to calculate the totals for the level above.

Alternatively, use a trigger to maintain the child spent values in the parent row as rows are inserted/updated/deleted remembering to lock rows to prevent update conflicts with other users. This is probably the most useful approach, as it makes it very straightforward to see the current spend for any item. Care is required to make sure the right results are produced.

To avoid the use of connect by which is quite slow and may be limited by available memory, you could use a generated code for each item, giving the chain of parents e.g. job 32 would have a code assigned 10-20-27. The total spend for e.g. item 20 would be calculated by select sum( amount_spent ) from j where code like '10-20-%' Received on Wed Nov 12 1997 - 00:00:00 CET

Original text of this message