Re: How make data transfomation faster?

From: <grrey_at_mail.ru>
Date: 9 Feb 2006 04:22:26 -0800
Message-ID: <1139487746.605560.295600@g43g2000cwa.googlegroups.com>

Ed Prochak wrote:

> Consider this query:
>
> select A.user_id,A.date_time from log A
> where A.kind=1 and
> A.id NOT IN (select B.id from log B
> where B.kind=2 );
>
> So fi the log table has
> id date kind
> 1 1/1 1
> 1 1/5 2
> 2 1/2 1
>
> then logically the pseduo table A has rows
> id date kind
> 1 1/1 1
> 2 1/2 1
>
> and pseudo table B has rows:
> id date kind
> 1 1/5 2
>
> so the result of A.id not in B.id gives the final result
> id date kind
> 2 1/2 1
>
> That can be slow. (at least it used to be slow) But it can be faster
> rewritten using the NOT EXISTS clause, or using the OUTER JOIN as I was
> trying to do. (Sorry I messed it up in my other posts.)

That not just could be slow. That's wrong again. If log table will be like this:
id date kind

 1    1/1     1
 1    1/5     2
 1    1/6     1
 2    1/2     1

Query should return :
1 1/6
2 1/2

But it doesn't.

>
> select A.user_id,A.date_time from log A, log B
> where A.kind=1 and
> A.id = B.id (+) and
> B.id IS NULL and B.kind(+) = 2 ;
>
> Here the process is different, first is the join of A and B (speaking
> logically not necessarily how ORACLE might physically do it). The outer
> join is a table with 6 columns and 2 rows, (before applying the NULL
> condition:
>
> A.id A.date A.kind B.id B.date B.kind
> 1 1/1 1 1 1/5 2
> 2 1/2 1 null null null
>
> Now the final where clause "B.id IS NULL" says keep only the nulls

B.id couldn't be null. Why are you keeping check null values?

>
> If we add another start event for id 1 to the table:
> id date kind
> 1 1/1 1
> 1 1/5 2
> 2 1/2 1
> 1 1/6 1
>
> and we add this condition to the where clause:
> AND a.date < B.date.
>
> Darn. we are close, but this isn't enough to handle multiples.
>

I wrote a query, I was trying to, but there was too much subqueries, so deleted it and forget about it. It was slower than first loop I showed in my question. Actually the answer by Michel Cadot is the best! This is greate becouse of one full data scan (I need them all so full scan is best), and one sort. By the way I didn't find any description about window sort in documintation. If any one have this description, please show me.

> >
> > I was tring do make any valuble result thinking that way, but coudn't
> > find it (.
>
> Well since the 1's and 2's process together, why not just put them in
> the PLSQL LOOP with a cursor query like
> select * from log where kind in (1,2)
>
> and then when processing the cancels, have a second loop with
> select * from log where kind in (3,4)
>
> Basically that moves some of the IF's from your original posted code
> into the SELECT statement.

For every row in second loop I should search last row in stat and then try to change it. It again will be slower. Received on Thu Feb 09 2006 - 06:22:26 CST