Re: difficult select- question
Date: Mon, 16 Nov 2009 22:35:20 +0100
Message-ID: <1258407320.92_at_user.newsoffice.de>
Andreas Mosmann schrieb am 02.11.2009 in <1257181408.57_at_user.newsoffice.de>:
> Maxim Demenko schrieb am 27.10.2009 in <4AE6A3AF.4020506_at_gmail.com>:
>> Andreas Mosmann wrote:
>>> Hi ng, >>> >>> I hope, there is someone with a good idea concerning the following >>> question: >>> >>> There is a sorted Table (or view) like the following: >>> F1|F2|F3|F4 >>> AA|BB|10|20 >>> AA|BB|20|30 >>> AA|BB|40|50 >>> AA|BB|50|60 >>> AA|BB|60|70 >>> AA|CC|70|80 >>> AA|CC|80|90 >>> >>> As you can see there are some rows, that are connected by F3=prior F4, >>> but there are leaks too. I need a result as the following: >>> >>> AA|BB|10|30|2 parts >>> AA|BB|40|70|3 parts >>> AA|CC|70|90|2 parts >>> >>> I know it is easy to solve by programming. >>> I think it is possible to do this with inline view and hierarchical query. >>> I can imagine a way by select min(F3), max(F4) ... where exists (enough >>> connecting data in the middle). >>> >>> But what is the best (SQL-)way to solve this? (There actually are about >>> 15000 lines, the estimated result probably should not have more than 100 >>> lines) >>> I could do it by PL/SQL, but I am interested in SQL, if exists >>> >>> Andreas Mosmann >>> >>> >>> >>> >>>
>> Another alternative could be use of analytical functions,
>> 1) the classical interval calculation with *start of group* method
>> SQL> with t as (
>> 2 select 'AA' F1,'BB' F2,10 F3,20 F4 from dual union all
>> 3 select 'AA','BB',20,30 from dual union all
>> 4 select 'AA','BB',40,50 from dual union all
>> 5 select 'AA','BB',50,60 from dual union all
>> 6 select 'AA','BB',60,70 from dual union all
>> 7 select 'AA','BB',90,100 from dual union all
>> 8 select 'AA','BB',100,110 from dual union all
>> 9 select 'AA','BB',110,120 from dual union all
>> 10 select 'AA','BB',120,130 from dual union all
>> 11 select 'AA','CC',10,20 from dual union all
>> 12 select 'AA','CC',20,30 from dual union all
>> 13 select 'AA','CC',30,40 from dual union all
>> 14 select 'AA','CC',50,60 from dual union all
>> 15 select 'AA','CC',60,70 from dual union all
>> 16 select 'AA','CC',70,80 from dual union all
>> 17 select 'AA','CC',80,90 from dual union all
>> 18 select 'AA','CC',90,100 from dual union all
>> 19 select 'AA','CC',100,110 from dual union all
>> 20 select 'AA','CC',170,180 from dual union all
>> 21 select 'AA','CC',180,190 from dual
>> 22 )
>> 23 -- End test data
>> 24 select f1,f2,min(f3) f3,max(f4) f4,count(*) cnt
>> 25 from (
>> 26 select f1,
>> 27 f2,
>> 28 f3,
>> 29 f4,
>> 30 sum(start_of_group) over(partition by f1, f2 order by f3)
>> group_no
>> 31 from (select f1,
>> 32 f2,
>> 33 f3,
>> 34 f4,
>> 35 decode(lag(f4) over(partition by f1, f2 order by f3),
>> 36 f3,
>> 37 0,
>> 38 1) start_of_group
>> 39 from t)
>> 40 )
>> 41 group by f1,f2,group_no
>> 42 order by f1,f2,f3
>> 43 ;
>> F1 F2 F3 F4 CNT
>> -- -- ---------- ---------- ----------
>> AA BB 10 30 2
>> AA BB 40 70 3
>> AA BB 90 130 4
>> AA CC 10 40 3
>> AA CC 50 110 6
>> AA CC 170 190 2
>> 2) Slightly different approach with analytics
>> SQL> with t as (
>> 2 select 'AA' F1,'BB' F2,10 F3,20 F4 from dual union all
>> 3 select 'AA','BB',20,30 from dual union all
>> 4 select 'AA','BB',40,50 from dual union all
>> 5 select 'AA','BB',50,60 from dual union all
>> 6 select 'AA','BB',60,70 from dual union all
>> 7 select 'AA','BB',90,100 from dual union all
>> 8 select 'AA','BB',100,110 from dual union all
>> 9 select 'AA','BB',110,120 from dual union all
>> 10 select 'AA','BB',120,130 from dual union all
>> 11 select 'AA','CC',10,20 from dual union all
>> 12 select 'AA','CC',20,30 from dual union all
>> 13 select 'AA','CC',30,40 from dual union all
>> 14 select 'AA','CC',50,60 from dual union all
>> 15 select 'AA','CC',60,70 from dual union all
>> 16 select 'AA','CC',70,80 from dual union all
>> 17 select 'AA','CC',80,90 from dual union all
>> 18 select 'AA','CC',90,100 from dual union all
>> 19 select 'AA','CC',100,110 from dual union all
>> 20 select 'AA','CC',170,180 from dual union all
>> 21 select 'AA','CC',180,190 from dual
>> 22 )
>> 23 -- End test data
>> 24 select f1,
>> 25 f2,
>> 26 f3,
>> 27 nvl(lead(prev_f4) over(partition by f1, f2 order by f3),
>> max_f4) f4,
>> 28 nvl(lead(prev_cnt_f4)
>> 29 over(partition by f1, f2 order by f3) - prev_cnt_f4,
>> 30 cnt_f4 - prev_cnt_f4) cnt
>> 31 from (select t.*,
>> 32 count(*) over(partition by f1, f2) cnt_f4,
>> 33 count(*) over(partition by f1, f2 order by f3
>> 34 rows between unbounded preceding and 1 preceding)
>> prev_cnt_f4,
>> 35 max(f4) over(partition by f1, f2) max_f4,
>> 36 max(f4) over(partition by f1, f2 order by f3
>> 37 rows between unbounded preceding and 1 preceding)
>> prev_f4,
>> 38 decode(lag(f4) over(partition by f1, f2 order by f3),
>> 39 f3,
>> 40 0,
>> 41 1) g
>> 42 from t) t1
>> 43 where g = 1
>> 44 order by f1,f2,f3
>> 45 ;
>> F1 F2 F3 F4 CNT
>> -- -- ---------- ---------- ----------
>> AA BB 10 30 2
>> AA BB 40 70 3
>> AA BB 90 130 4
>> AA CC 10 40 3
>> AA CC 50 110 6
>> AA CC 170 190 2
>> Best regards
>> Maxim
> I did not understand everything by now, but it works fine. > - Innermost select seems to give first row of a sub-group defined by > connection between F3->F4 the number 1, all the others get number 0, > - second step is a numbering by sub-groups, each sub-group gets the same > number > - third stage is to find out the min/max and count of each sub-group
> I never heard about this functions, but they work very well and super > fast.
> Many thanks
> Andreas
Similar question, similar answer?
the situation before is
F1|F2|F3 AA|BB|10 AA|BB|20 AA|BB|40 AA|BB|50 AA|BB|60 AA|BB|70 AA|CC|70 AA|CC|80 AA|CC|90
and the result should be
F1|F2|F3|F4 AA|BB|10|20 AA|BB|20|30 AA|BB|40|50 AA|BB|50|60 AA|BB|60|70 AA|CC|70|80 AA|CC|80|90
Is there a similar way to get this? Your way is very fast.
Many thanks in advance
Andreas Mosmann
-- wenn email, dann AndreasMosmann <bei> web <punkt> deReceived on Mon Nov 16 2009 - 15:35:20 CST