Re: A CTE Question

From: Erland Sommarskog <esquel_at_sommarskog.se>
Date: Sat, 28 Aug 2010 11:36:59 +0200
Message-ID: <Xns9DE2762B261AYazorman_at_127.0.0.1>



Thomas Kellerer (OTPXDAJCSJVU_at_spammotel.com) writes:
> I also would like to know if you are really using Oracle.

Actually, my assumption was that he was on Oracle!

That is, I saw the post in comp.databases.ms-sqlserver, and all the syntax parsed on SQL 2008 except for the row constructor in the join. The row constructor for the INSERT are ANSI as far as I know.

(And, no, I do normally hang out here. I had a feeling that Michel may have posted to more newsgroups, so I searched on Google groups, and sure enough.)

>> with t_recurs (dep_id, parent, iter) as
>> (
>> select dep_id, parent, 0 from DEPARTMENT
>> union all
>> select r.dep_id, d.parent, iter + 1
>> from t_recurs r, DEPARTMENT d
>> where r.parent = d.dep_id and iter< 1000
>> )
>
> That query is wrong, and that's why you need this "where iter < 1000"
> workaround.

The recursion as works: you start from the bottom and work your way up. You will stop recurse when you reach the top level. However, he would need a WHERE NOT EXISTS to only start at the leaves. Now he starts on all nodes.

And at first glance, it seems that this is the way to go: you want to count how many there are in each department, so why not start at the bottom? The problem is that if a department has several subdepartments,  you will count the staff in that department everytime. So I think that is a non-starter nevertheless. (And this test case was missing frmo Michel's test data.)

> this should do it as far as I can tell.
>
> with recursive t_recurs (dep_id, parent, iter) as
> (
> select dep_id, parent, 0 as iter
> from department
> where parent is null
>
> union all
>
> select d.dep_id, d.parent, iter + 1
> from department d
> join t_recurs r on d.parent = r.dep_id
> )
> select r.dep_id,
> (select count(*) from t_recurs r2 where r2.iter > r.iter)
> as department_count,
> (select count(*) from person where dep_id =
> any (select dep_id from t_recurs r3 where r3.iter >= r.iter))
> as person_count from t_recurs r
> order by 1;
 

I don't think this works. This assumes that there is a single line of departments with no branches, and Michel's data had two roots. (And uses -2 rather than NULL to signify a root.)

Here is the query that I suggested in the SQL Server newsgroup:

WITH depcnt (DEP_ID, cnt) AS (

   SELECT DEP_ID, COUNT(*)
   FROM PERSON
   GROUP BY DEP_ID
), recurs (DEP_ID, iter, path) AS (

   SELECT DEP_ID, 0, cast(str(DEP_ID, 4) as varchar(8000))    FROM DEPARTMENT
   WHERE PARENT = -2
   UNION ALL
   SELECT D.DEP_ID, r.iter+1, r.path + str(D.DEP_ID, 4)    FROM recurs r
   JOIN DEPARTMENT D ON r.DEP_ID = D.PARENT )
SELECT R1.DEP_ID, SUM(dc.cnt)
FROM recurs R1
JOIN recurs R2 ON R2.path LIKE R1.path + '%' LEFT JOIN depcnt dc ON R2.DEP_ID = dc.DEP_ID GROUP BY R1.DEP_ID
ORDER BY R1.DEP_ID To get this working on Oracle, you probably need to replace the str() function, and change + in the LIKE expression to ||.

As for performance, on SQL Server it is certainly recommendable to to first save the result of the CTE into a temp table and work from there.

Here is some more sample data (still with row constructors, though):

create table DEPARTMENT
(
DEP_ID integer not null primary key,
NAME VARCHAR(32) not null,
PARENT integer not null
) ;

insert into DEPARTMENT

VALUES (1,'A',-2),
       (2,'B',1),
       (3,'C',2),
       (4,'D',-2),
       (5,'E',4),
       (6,'F',5),
       (7,'G',6),
       (8,'D1', 4),
       (9,'D2', 4),
       (10,'E2', 5),
       (11,'E3', 5)


create table PERSON (
PERSON_ID integer not null primary key,
NAME VARCHAR(128) not null,
DEP_ID integer,
constraint FK_01 foreign key (DEP_ID) references DEPARTMENT(DEP_ID) );

-- 
Erland Sommarskog, Stockholm, esquel_at_sommarskog.se
Received on Sat Aug 28 2010 - 04:36:59 CDT

Original text of this message