Re: Analytic function Help
Date: Tue, 01 Feb 2011 21:44:00 +0100
Message-ID: <ii9rah$538$1_at_news.eternal-september.org>
On 2011-02-01 19:31, Andreas Mosmann wrote:
> Andreas Mosmann schrieb am 31.01.2011 in
> <1296500556.65_at_user.newsoffice.de>:
>
>> Hi, there is a view like
>
>
>> ID SORT FK >> 01 -5 N >> 01 -4 N >> 01 -3 Y >> 01 -2 Y >> 01 -1 N >> 01 0 Y >> 01 1 N >> 01 2 N >> 02 -2 N >> 02 -1 Y >> 02 0 Y >> 02 1 Y >> 03 -5 Y >> 03 -4 N >> 03 -3 Y >> 03 -2 N >> 03 -1 Y >> 03 0 N >> 03 1 Y
>
>> I need the number of seperate groups of 'Y' >> that means in this case:
>
>> ID COUNT >> 01 2 >> 02 1 >> 03 4
>
>> ID 01: SORT (-2,-3) and (0) are a group ->2 >> ID 02: SORT (-1 .. +1) is a group ->1 >> ID 03: SORT (-5), (-3), (-1), (1) are groups ->4
>
>> I sometimes use lead and lag, but in this case I do not have an idea, >> but I feel it should be possible, isn't it?
>
>> Thank you >> Andreas
>
> I solved the question above with your help. Now I have another question
> to the same subject.
>
> Imagine the result a little more complex. Instead of
>
>> ID COUNT >> 01 2 >> 02 1 >> 03 4
>
> there is maybe:
> ID START STOP COUNT
> 01 0 100 2
> 01 100 150 2
> 01 150 375 4
> 01 375 500 1
> 01 500 700 1
> 01 700 900 2
> 01 900 1015 2
>
> I now want to melt records with the same count, but only that, that
> follow each other.
>
> wished result
> ID START STOP COUNT
> 01 0 150 2
> 01 150 375 4
> 01 375 700 1
> 01 700 1015 2
>
> I former got an answer to a similar question by Mr. Maxim Demenko and I
> tried to change some hours, but I could not resolve my new problem. It
> is, that f.e. COUNT=2 is twice in the result. So I can not use MAX-
> Function to find out end of the block.
>
> Maxims idea (that fitted for the other question very fine) was
> Inner query:
> select
> ID, START,
> --to find out the first
> nvl(decode(lag(START) over (partition by ID, COUNT order by
> VST),NULL,1,0) FIRST_LINE,
> --to find out "Stop"- Value
> max(STOP) over (partition by ID, COUNT),
> COUNT
> from MyResult
>
> Complete query (simplified):
> select
> ID,START,MAX_STOP,COUNT
> from (
> select
> ID, START,
> --to find out the first
> nvl(decode(lag(START) over (partition by ID, COUNT order by
> VST),NULL,1,0) IS_FIRST_LINE,
> --to find out "Stop"- Value
> max(STOP) over (partition by ID, COUNT) MAX_STOP,
> COUNT
> from MyResult
> ) WHERE IS_FIRST_LINE=1
>
> It works for COUNT 1 and 4, but of course not for COUNT 2.
>
> Any ideas?
Perhaps best to start without using any analytic functions etc. Here's a sketch (at the moment I don't have an Oracle dbms to try with, so you might have to do some adjustments):
create table T (
id char(2) not null,
start int not null,
stop int not null,
cnt int not null,
primary key (id, start)
);
insert into T (id, start, stop, cnt)
values
('01',0,100,2), ('01',100,150,2), ('01',150,375,4), ('01',375,500,1), ('01',500,700,1), ('01',700,900,2), ('01',900,1000,2);
For every tuple we would like to find the max stop such that no other cnt lies between our start and that stop:
select id, start, cnt, (select max(stop) as stop from T T2 where (T1.id, T1.cnt) = (T2.id, T2.cnt) and not exists ( select 1 from T T3 where T2.id = T3.id and T2.cnt <> T3.cnt and T3.start between T1.stop and T2.start ) )
from T T1
ID START CNT STOP
-- ----------- ----------- -----------
01 0 2 150 01 100 2 150 01 150 4 375 01 375 1 700 01 500 1 700 01 700 2 1000 01 900 2 1000
What remain is to find the smallest start per id, cnt and stop:
select id, min(start) as start, stop, cnt from (
select id, start, cnt, (select max(stop) as stop from T T2 where (T1.id, T1.cnt) = (T2.id, T2.cnt) and not exists ( select 1 from T T3 where T2.id = T3.id and T2.cnt <> T3.cnt and T3.start between T1.stop and T2.start ) ) from T T1
) group by id, stop, cnt order by stop
ID START STOP CNT
-- ----------- ----------- -----------
01 0 150 2 01 150 375 4 01 375 700 1 01 700 1000 2
Using a similar technique as in my previous example:
select id, start, stop, cnt,
row_number() over (partition by id order by start) - row_number() over (partition by id, cnt order by start) as difffrom T
order by start
ID START STOP CNT DIFF
-- ----------- ----------- ----------- -------------------- 01 0 100 2 0 01 100 150 2 0 01 150 375 4 2 01 375 500 1 3 01 500 700 1 3 01 700 900 2 3 01 900 1000 2 3
min(start) and max(stop) per id, diff gives:
select id, min(start) as start, max(stop) as stop, cnt, diff from (
select id, start, stop, cnt, row_number() over (partition by id order by start) - row_number() over (partition by id, cnt order by start) as diff from T
)
group by id, cnt, diff
order by start
ID START STOP CNT DIFF
-- ----------- ----------- ----------- -------------------- 01 0 150 2 0 01 150 375 4 2 01 375 700 1 3 01 700 1000 2 3
I'll have to stop here, if I come up with some other way I might post that later.
/Lennart Received on Tue Feb 01 2011 - 14:44:00 CST