Yasin Baskan

Subscribe to Yasin Baskan feed
Stuff about the Oracle database.
Updated: 38 min 9 sec ago

Collect in 10G

Fri, 2008-04-11 04:29
Collections can be a great help in speeding up the PL/SQL programs. By using bulk collect operations it is possible to get great performance improvements.

In 9.2 we needed to use the bulk collect clause to fetch rows into a collection. 10G brings a new function called COLLECT, which takes a column as a parameter and returns a nested table containing the column values. Using this we can get the data into a collection without using bulk collect.

Here is a very simple demo of this new function.

YAS@10G>create table t as select * from all_objects;

Table created.

YAS@10G>create or replace type name_type as table of varchar2(30);
2 /

Type created.

YAS@10G>set serveroutput on

YAS@10G>r
1 declare
2 v_names name_type;
3 begin
4 select cast(collect(object_name) as name_type) into v_names from t;
5 dbms_output.put_line(v_names.count);
6 dbms_output.put_line(v_names(1));
7* end;
42268
ICOL$

PL/SQL procedure successfully completed.


One difference between this and bulk collect is, since this a sql function we need a sql type for this, it cannot be used with local PL/SQL types.

YAS@10G>r
1 declare
2 type name_type is table of varchar2(30);
3 v_names name_type;
4 begin
5 select collect(object_name) into v_names from t;
6* end;
select collect(object_name) into v_names from t;
*
ERROR at line 5:
ORA-06550: line 5, column 35:
PLS-00642: local collection types not allowed in SQL statements
ORA-06550: line 5, column 9:
PL/SQL: ORA-00932: inconsistent datatypes: expected CHAR got -
ORA-06550: line 5, column 2:
PL/SQL: SQL Statement ignored


Another difference, and a more important one, is the performance difference between these two constructs. Using Tom Kyte's runstats package, I did a test to compare the two. I ran the test several times, the results were similar.

YAS@10G>r
1 declare
2 v_names name_type;
3 begin
4 runStats_pkg.rs_start;
5
6 for i in 1..1000 loop
7 select object_name bulk collect into v_names from t;
8 end loop;
9
10 runStats_pkg.rs_middle;
11
12 for i in 1..1000 loop
13 select cast(collect(object_name) as name_type) into v_names from t;
14 end loop;
15
16 runStats_pkg.rs_stop;
17
18* end;
Run1 ran in 1329 hsecs
Run2 ran in 4060 hsecs
run 1 ran in 32.73% of the time

Name Run1 Run2 Diff
LATCH.qmn state object latch 0 1 1
STAT...redo entries 9 10 1
LATCH.JS slv state obj latch 1 0 -1
LATCH.qmn task queue latch 5 6 1
LATCH.transaction branch alloc 0 1 1
LATCH.sort extent pool 0 1 1
LATCH.resmgr:actses change gro 1 0 -1
LATCH.ncodef allocation latch 0 1 1
LATCH.slave class 0 1 1
LATCH.archive control 0 1 1
LATCH.FAL subheap alocation 0 1 1
LATCH.FAL request queue 0 1 1
STAT...heap block compress 6 5 -1
LATCH.session switching 0 1 1
LATCH.ksuosstats global area 1 2 1
LATCH.event group latch 1 0 -1
LATCH.threshold alerts latch 0 1 1
LATCH.list of block allocation 2 0 -2
LATCH.transaction allocation 2 0 -2
LATCH.dummy allocation 3 1 -2
LATCH.user lock 2 0 -2
LATCH.Consistent RBA 4 2 -2
STAT...calls to kcmgcs 4 6 2
STAT...active txn count during 4 6 2
STAT...cleanout - number of kt 4 6 2
STAT...consistent gets 587,009 587,011 2
STAT...consistent gets from ca 587,009 587,011 2
STAT...consistent gets - exami 4 6 2
LATCH.resmgr:free threads list 3 0 -3
LATCH.PL/SQL warning settings 3 0 -3
LATCH.OS process 6 3 -3
LATCH.compile environment latc 3 0 -3
LATCH.slave class create 0 3 3
LATCH.resmgr:actses active lis 3 0 -3
LATCH.cache buffers lru chain 0 3 3
LATCH.OS process allocation 10 14 4
LATCH.session state list latch 4 0 -4
LATCH.redo allocation 50 46 -4
STAT...consistent changes 17 22 5
LATCH.resmgr group change latc 5 0 -5
STAT...db block gets 17 22 5
STAT...db block gets from cach 17 22 5
LATCH.library cache pin alloca 6 0 -6
STAT...db block changes 26 32 6
STAT...session logical reads 587,026 587,033 7
LATCH.In memory undo latch 23 16 -7
LATCH.session idle bit 10 3 -7
LATCH.mostly latch-free SCN 6 14 8
LATCH.KMG MMAN ready and start 5 13 8
LATCH.lgwr LWN SCN 6 14 8
LATCH.simulator hash latch 34,010 34,001 -9
LATCH.simulator lru latch 34,010 34,001 -9
LATCH.session timer 5 14 9
LATCH.dml lock allocation 10 1 -9
LATCH.undo global data 20 10 -10
LATCH.post/wait queue 10 0 -10
LATCH.active checkpoint queue 4 15 11
LATCH.session allocation 13 2 -11
LATCH.archive process latch 4 15 11
LATCH.library cache lock alloc 16 0 -16
LATCH.object queue header oper 8 32 24
LATCH.client/application info 25 0 -25
LATCH.redo writing 24 51 27
LATCH.active service list 37 81 44
STAT...undo change vector size 2,124 2,200 76
LATCH.channel operations paren 60 197 137
LATCH.cache buffers chains 1,174,359 1,174,189 -170
LATCH.JS queue state obj latch 108 288 180
LATCH.messages 108 291 183
STAT...redo size 2,772 2,964 192
LATCH.checkpoint queue latch 80 282 202
LATCH.enqueue hash chains 269 652 383
LATCH.enqueues 248 645 397
LATCH.SQL memory manager worka 276 944 668
LATCH.shared pool 42 1,029 987
LATCH.library cache lock 170 2,018 1,848
LATCH.library cache pin 2,130 4,064 1,934
STAT...Elapsed Time 1,330 4,061 2,731
STAT...CPU used by this sessio 1,334 4,083 2,749
STAT...recursive cpu usage 1,268 4,064 2,796
LATCH.library cache 2,264 5,074 2,810
LATCH.row cache objects 49 9,015 8,966
STAT...session pga memory 2,097,152 1,441,792 -655,360

Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
1,248,536 1,267,073 18,537 98.54%

PL/SQL procedure successfully completed.


As you see the bulk collect method completes in 1/3rd of the time of the collect function method. Using the collect function hits the library cache harder and uses more latch operations.

The COLLECT function can be used in sql statements to compare collections of columns. In Laurent Schneider's comment on my previous post you can find an example of it.

Relational algebra: division in sql

Wed, 2008-04-09 04:42
There was a question in one of the Turkish Oracle mailing lists which got me interested. The question simply was:

I have a table holding the parts of a product and another table holding the suppliers of these parts. How can I find the suppliers which supply all the parts?

Someone suggested using the division operation of relational algebra but did not provide how to do it in Oracle. So I started with the Wikipedia link he provided to solve the problem in sql.

Here are the tables:

SQL> create table parts (pid number);

Table created.

SQL> create table catalog (sid number,pid number);

Table created.

SQL> insert into parts select rownum from all_objects where rownum<=5;

5 rows created.

SQL> insert into catalog values (10,1);

1 row created.

SQL> insert into catalog select 1,pid from parts;

5 rows created.

SQL> select * from catalog;

SID PID
---------- ----------
10 1
1 1
1 2
1 3
1 4
1 5


So, the supplier which has all the parts is 1. How do we find that?

The division in relational algebra is done by some steps which are explained in the link. Let's follow those:

The simulation of the division with the basic operations is as follows. We assume that a1,...,an are the attribute names unique to R and b1,...,bm are the attribute names of S. In the first step we project R on its unique attribute names and construct all combinations with tuples in S:

T := πa1,...,an(R) × S

In our case we have the table CATALOG as R, the table PARTS as S. So if we write a sql to find out the above relation we need a cartesian join.

select sid,pid
from (select sid from catalog) ,parts
;


This give us all suppliers combined with all parts.

SID PID
---------- ----------
10 1
1 1
1 1
1 1
1 1
1 1
10 2
1 2
1 2
1 2
1 2

SID PID
---------- ----------
1 2
10 3
1 3
1 3
1 3
1 3
1 3
10 4
1 4
1 4
1 4

SID PID
---------- ----------
1 4
1 4
10 5
1 5
1 5
1 5
1 5
1 5

We now have all the possibilities for the supplier-part relation.

The second step is:

In the next step we subtract R from this relation:

U := T - R

To subtract the table CATALOG we need the MINUS operator.

select sid,pid
from (select sid from catalog) ,parts
minus
select sid,pid from catalog;

SID PID
---------- ----------
10 2
10 3
10 4
10 5





After we had all the possibilities we subtracted the ones which are already in the CATALOG table and we got the ones which are not present in the table.

On to the next step:

Note that in U we have the possible combinations that "could have" been in R, but weren't. So if we now take the projection on the attribute names unique to R then we have the restrictions of the tuples in R for which not all combinations with tuples in S were present in R:

V := πa1,...,an(U)

This step just gets the supplier id's from the previous query.

select sid from (
select sid,pid
from (select sid from catalog) ,parts
minus
select sid,pid from catalog
);

SID
----------
10
10
10
10



Now we have the supplier id which does not supply all of the parts. The next step is obvious, to find the other suppliers (the remaining ones supply all the parts).

So what remains to be done is take the projection of R on its unique attribute names and subtract those in V:

W := πa1,...,an(R) - V

To do this we need to subtract the previous query from the table CATALOG.

select sid from catalog
minus
select sid from (
select sid,pid
from (select sid from catalog) ,parts
minus
select sid,pid from catalog
);

SID
----------
1


In some pages (like this one) the same operation is done using a different sql (obtained by using "not exists" instead of "minus").

select distinct sid from catalog c1
where not exists (
select null from parts p
where not exists (select null from catalog where pid=p.pid and c1.sid=sid));


This one is harder for me to understand in the first look, I am more comfortable following the steps above.

It is great to follow the steps of relational algebra to solve a problem in sql. It helps very much in understanding the solution. Relational algebra rocks!

Local vs. remote connection performance

Tue, 2008-02-05 04:13

It has been asked several times in several places; is there a performance difference between running a query locally in a client on the server and running the same query in a remote client?

The obvious answer given by the respondents including myself is: "if you do not return thousands of rows through the network, there must not be any difference". This type of response is opposed to what I believe; even if the answer seems obvious test it before you make any suggestions.

Tanel Poder got the same question and did what is needed to be done, he tested it and showed that there was a difference. In this great post of his.

His tests use a database on Solaris, sqlplus clients on Windows and Linux. I have tested the same using a database on Linux and the same behavior is observed there too.

Lesson learned again and again: test your suggestion even if the answer seems obvious.

Database version control

Mon, 2008-02-04 07:33

Coding horror is one of the software development blogs I keep a close eye on.

Jeff Atwood posted a nice piece about database version control recently. Database version control is maybe one of the most important and unfortunately most overlooked things in software development. The post is a good read including the links he provides.

The blog tagging thing

Fri, 2008-01-11 06:50
During the last few days lots of Oracle bloggers have been busy tagging each other and posting eight unknown things about themselves. I was also tagged by some friends and was asked to post eight things about myself. I have never forwarded any chain e-mails or messages to anyone and in parallel to that I have not written anything about myself after this either.

What I think about this blog tagging thing is very similar to what Howard Rogers thought about it. He shut his site down for some time and you can read what he thinks when you go to his blog. My thoughts on this are here in the comments to an Eddie Awad post. Howard has also posted a comment there to explain further.

Pages