Re: Straight SQL always put perform PL/SQL?

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Thu, 13 Dec 2007 18:41:09 -0000
Message-ID: <0o2dnUfH57lf4fzanZ2dnUVZ8vKdnZ2d@bt.com>

<dshproperty_at_gmail.com> wrote in message news:1ad08016-0850-4030-acc6-98f08516c993_at_d21g2000prf.googlegroups.com...
> http://tkyte.blogspot.com/2006/10/slow-by-slow.html
> In there Tom Kytes says:
> * You should do it in a single SQL statement if at all possible.
> * If you cannot do it in a single SQL Statement, then do it in PL/SQL.
>
> I was a true beliver of that until this case. We have a huge table,
> and we need to update it with
> values from another table.
>
> Let me give you an example
> create table test (id number,col1 varchar2(20)) pctfree 99;
> (this is only for creating a huge table with lots of columns)
> declare
> v_number number :=1;
> begin
> loop
> insert into test values (v_number, 'a');
> v_number := v_number+1;
> exit when v_number > 1000000;
> end loop;
> commit;
> end;
>
> create table test1 as select * from test where id < 400000;
> update test1 set col1='b';
> commit;
>
> create index test_ind on test(id);
> create index test1_ind on test1(id);
> analyze table test compute statistics;
> analyze table test1 compute statistics;
> SQL> update test
> set col1=(select col1 from test1
> where test.id=test1.id )
> where exists
> (select 1 from test1
> where test.id=test1.id);
> 399999 rows updated.
> Elapsed: 00:04:15.01
> SQL> declare
> 2 v_count number;
> 3 cursor test_curs is
> 4 select test.id,test.col1 from test,test1
> 5 where test.id=test1.id;
> 6 begin
> 7 for test_cur in test_curs
> 8 loop
> 9 update test
> 10 set col1=test_cur.col1
> 11 where id=test_cur.id;
> 12 end loop;
> 13 commit;
> 14 end;
> 15 /
> PL/SQL procedure successfully completed.
> Elapsed: 00:04:30.82
>
> Here the difference is not huge yet, but in our case, single SQL will
> take an hour to finish, while
> PL/SQL only take 5 min. The bigger the table test, the huger the
> difference.
>
> And the reason is simple, as there will be 40% of rows needs to be
> updated, Oracle will pick full
> table scan of table test(a huge table). While in PL/SQL, "UPDATE TEST
> SET COL1=:B2 WHERE ID=:B1"
> will be able to use index.
>
> Any comments will be highly appreciated

A couple of problems with your comparison:

Your index on test1.id is not unique, so the scalar subquery used in the update could (in theory) return multiple rows. In the SQL case this would crash the query. The pl/sql code would "multiply" rows from test, and update them multiple times, which means you don't get a deterministic result.

Assume you have the appropriate unique index in place on test1.id, then your pl/sql is the coded equivalent of an update to a join view:

update (

        select test.id,test.col1 from test,test1
        where test.id=test1.id

)
set

id = col1
;

However, whilst this update is running, conflicting updates could get in ahead of the N million row update, do a quick change and commit.

In the case of the SQL, the statement would have to rollback if it hit a conflict, and start again with a 'write consistent' update - effectively doing a 'select for update / update' the second time around. (See Ask Tom, search for 'write consistency').

In the case of the PL/SQL, the code will simply be doing a read-consistent query, and updating rows identified by id. This allows you to get a 'lost update' from the conflict (a change made by the interfering update gets over-written by the pl/sql loop).

In your case of the 1 hour vs. 5 minutes, one of the causes of extra time may have been an implicit rollback and restart.

-- 
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Received on Thu Dec 13 2007 - 12:41:09 CST