Re: difficult select- question

From: Maxim Demenko <mdemenko_at_gmail.com>
Date: Tue, 17 Nov 2009 08:19:50 +0100
Message-ID: <4B024E96.2010503_at_gmail.com>



Andreas Mosmann wrote:
> Andreas Mosmann schrieb am 02.11.2009 in
> <1257181408.57_at_user.newsoffice.de>:
>
>> Maxim Demenko schrieb am 27.10.2009 in <4AE6A3AF.4020506_at_gmail.com>:

>
>>> Andreas Mosmann wrote:
>>>> Hi ng,
>>>>
>>>> I hope, there is someone with a good idea concerning the following 
>>>> question:
>>>>
>>>> There is a sorted Table (or view) like the following:
>>>> F1|F2|F3|F4
>>>> AA|BB|10|20
>>>> AA|BB|20|30
>>>> AA|BB|40|50
>>>> AA|BB|50|60
>>>> AA|BB|60|70
>>>> AA|CC|70|80
>>>> AA|CC|80|90
>>>>
>>>> As you can see there are some rows, that are connected by F3=prior 
>>>> F4, but there are leaks too. I need a result as the following:
>>>>
>>>> AA|BB|10|30|2 parts
>>>> AA|BB|40|70|3 parts
>>>> AA|CC|70|90|2 parts
>>>>
>>>> I know it is easy to solve by programming.
>>>> I think it is possible to do this with inline view and hierarchical 
>>>> query.
>>>> I can imagine a way by select min(F3), max(F4) ... where exists 
>>>> (enough connecting data in the middle).
>>>>
>>>> But what is the best (SQL-)way to solve this? (There actually are 
>>>> about 15000 lines, the estimated result probably should not have 
>>>> more than 100 lines)
>>>> I could do it by PL/SQL, but I am interested in SQL, if exists
>>>>
>>>> Andreas Mosmann
>>>>
>>>>
>>>>
>>>>
>>>>

>
>>> Another alternative could be use of analytical functions,
>>> 1) the classical interval calculation with *start of group* method
>>> SQL> with t as (
>>>    2   select 'AA' F1,'BB' F2,10 F3,20 F4 from dual  union all
>>>    3   select 'AA','BB',20,30 from dual  union all
>>>    4   select 'AA','BB',40,50 from dual  union all
>>>    5   select 'AA','BB',50,60 from dual  union all
>>>    6   select 'AA','BB',60,70 from dual  union all
>>>    7   select 'AA','BB',90,100 from dual  union all
>>>    8   select 'AA','BB',100,110 from dual  union all
>>>    9   select 'AA','BB',110,120 from dual  union all
>>>   10   select 'AA','BB',120,130 from dual  union all
>>>   11   select 'AA','CC',10,20 from dual  union all
>>>   12   select 'AA','CC',20,30 from dual  union all
>>>   13   select 'AA','CC',30,40 from dual  union all
>>>   14   select 'AA','CC',50,60 from dual  union all
>>>   15   select 'AA','CC',60,70 from dual  union all
>>>   16   select 'AA','CC',70,80 from dual  union all
>>>   17   select 'AA','CC',80,90 from dual  union all
>>>   18   select 'AA','CC',90,100 from dual  union all
>>>   19   select 'AA','CC',100,110 from dual  union all
>>>   20   select 'AA','CC',170,180 from dual  union all
>>>   21   select 'AA','CC',180,190 from dual
>>>   22  )
>>>   23  -- End test data
>>>   24  select f1,f2,min(f3) f3,max(f4) f4,count(*) cnt
>>>   25  from (
>>>   26  select f1,
>>>   27         f2,
>>>   28         f3,
>>>   29         f4,
>>>   30         sum(start_of_group) over(partition by f1, f2 order by 
>>> f3) group_no
>>>   31    from (select f1,
>>>   32                 f2,
>>>   33                 f3,
>>>   34                 f4,
>>>   35                 decode(lag(f4) over(partition by f1, f2 order by 
>>> f3),
>>>   36                        f3,
>>>   37                        0,
>>>   38                        1) start_of_group
>>>   39            from t)
>>>   40  )
>>>   41  group by f1,f2,group_no
>>>   42  order by f1,f2,f3
>>>   43  ;

>
>>> F1 F2 F3 F4 CNT >>> -- -- ---------- ---------- ---------- >>> AA BB 10 30 2 >>> AA BB 40 70 3 >>> AA BB 90 130 4 >>> AA CC 10 40 3 >>> AA CC 50 110 6 >>> AA CC 170 190 2

>
>
>>> 2) Slightly different approach with analytics

>
>>> SQL> with t as (
>>>    2   select 'AA' F1,'BB' F2,10 F3,20 F4 from dual  union all
>>>    3   select 'AA','BB',20,30 from dual  union all
>>>    4   select 'AA','BB',40,50 from dual  union all
>>>    5   select 'AA','BB',50,60 from dual  union all
>>>    6   select 'AA','BB',60,70 from dual  union all
>>>    7   select 'AA','BB',90,100 from dual  union all
>>>    8   select 'AA','BB',100,110 from dual  union all
>>>    9   select 'AA','BB',110,120 from dual  union all
>>>   10   select 'AA','BB',120,130 from dual  union all
>>>   11   select 'AA','CC',10,20 from dual  union all
>>>   12   select 'AA','CC',20,30 from dual  union all
>>>   13   select 'AA','CC',30,40 from dual  union all
>>>   14   select 'AA','CC',50,60 from dual  union all
>>>   15   select 'AA','CC',60,70 from dual  union all
>>>   16   select 'AA','CC',70,80 from dual  union all
>>>   17   select 'AA','CC',80,90 from dual  union all
>>>   18   select 'AA','CC',90,100 from dual  union all
>>>   19   select 'AA','CC',100,110 from dual  union all
>>>   20   select 'AA','CC',170,180 from dual  union all
>>>   21   select 'AA','CC',180,190 from dual
>>>   22  )
>>>   23  -- End test data
>>>   24  select f1,
>>>   25         f2,
>>>   26         f3,
>>>   27         nvl(lead(prev_f4) over(partition by f1, f2 order by f3), 
>>> max_f4) f4,
>>>   28         nvl(lead(prev_cnt_f4)
>>>   29             over(partition by f1, f2 order by f3) - prev_cnt_f4,
>>>   30             cnt_f4 - prev_cnt_f4) cnt
>>>   31    from (select t.*,
>>>   32                 count(*) over(partition by f1, f2) cnt_f4,
>>>   33                 count(*) over(partition by f1, f2 order by f3
>>>   34                 rows between unbounded preceding and 1 
>>> preceding) prev_cnt_f4,
>>>   35                 max(f4) over(partition by f1, f2) max_f4,
>>>   36                 max(f4) over(partition by f1, f2 order by f3
>>>   37                 rows between unbounded preceding and 1 
>>> preceding) prev_f4,
>>>   38                 decode(lag(f4) over(partition by f1, f2 order by 
>>> f3),
>>>   39                        f3,
>>>   40                        0,
>>>   41                        1) g
>>>   42            from t) t1
>>>   43   where g = 1
>>>   44   order by f1,f2,f3
>>>   45  ;

>
>>> F1 F2 F3 F4 CNT >>> -- -- ---------- ---------- ---------- >>> AA BB 10 30 2 >>> AA BB 40 70 3 >>> AA BB 90 130 4 >>> AA CC 10 40 3 >>> AA CC 50 110 6 >>> AA CC 170 190 2

>
>
>>> Best regards

>
>>> Maxim
>> I did not understand everything by now, but it works fine.
>> - Innermost select seems to give first row of a sub-group defined by 
>> connection between F3->F4 the number 1, all the others get number 0,
>> - second step is a numbering by sub-groups, each sub-group gets the 
>> same number
>> - third stage is to find out the min/max and count of each sub-group

>
>> I never heard about this functions, but they work very well and super 
>> fast.

>
>> Many thanks
>> Andreas

>
> Similar question, similar answer?
>
> the situation before is
>
> F1|F2|F3
> AA|BB|10
> AA|BB|20
> AA|BB|40
> AA|BB|50
> AA|BB|60
> AA|BB|70
> AA|CC|70
> AA|CC|80
> AA|CC|90
>
> and the result should be
>
> F1|F2|F3|F4
> AA|BB|10|20
> AA|BB|20|30
> AA|BB|40|50
> AA|BB|50|60
> AA|BB|60|70
> AA|CC|70|80
> AA|CC|80|90
>
> Is there a similar way to get this? Your way is very fast.
>
> Many thanks in advance
> Andreas Mosmann
>

Could you explain a bit the logic behind your desired results? In particular, it's unclear for me, based on what should the line

AA|BB|20|30

be generated

Best regards

Maxim Received on Tue Nov 17 2009 - 01:19:50 CST

Original text of this message