Function in WHERE performance issue

From: Rich Jesse <rjoralist_at_society.servebeer.com>
Date: Fri, 10 Jul 2009 16:45:28 -0500 (CDT)
Message-ID: <a57e2326a6106c5ddaf729217b8ed587.squirrel_at_society.servebeer.com>



Hey all,

In 10.1.0.5.0, I have two tables of IP and lat/long info of the CSV data from http://www.maxmind.com/app/geolitecity. PK on the location table is locid. PK on the block table is startipnum and endipnum, with a non-unique index on locid. There is an FK in the blocks table on location.locid.

I created the IP-to-number function from here:

http://cbmc64.blogspot.com/2005/05/oracle-plsql-function-to-convert_30.html

...and called it "ip2number" and made it DETERMINISTIC. I then tried this
simple statement:

select /*+ first_rows(1) */ aa.*
from geo_location aa,

        geo_blocks b
where aa.locid = b.locid
and ip2number('192.168.1.1') between b.startipnum and b.endipnum;

(the hint is needed because the DB default is all_rows) It takes 32 seconds. If I run just:

SELECT ip2number('192.168.1.1') from dual;

...it returns zero rows in <1s. Likewise, substituting the returned value
in place of the function returns <1s. The difference in the explain plan is that the cardinality for the original is orders of magnitude higher. Given that, I also tried using a bind variable in place of the function. The explain plan is exactly the same as the original, but again it returns <1s. To rule out caching, rerunning the original again takes >30s. I've run these scenarios with the above 192.168.1.1 address which returns zero rows, as well as a "real" IP address that returns exactly 1 row.

The main difference execution-wise is that the original has 20 recursive calls, while the other speedy variants only have 1. Even a 10046 trace shows only:

select /*+ first_rows(1) */ aa.*
from geo_location aa,

        geo_blocks b
where aa.locid = b.locid
and ip2number('192.168.1.1') between b.startipnum and b.endipnum END OF STMT
PARSE #4:c=10000,e=15099,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=2,tim=3431319029024 BINDS #4:
EXEC #4:c=0,e=119,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=2,tim=3431319029214 WAIT #4: nam='SQL*Net message to client' ela= 1 p1=1413697536 p2=1 p3=0 *** 2009-07-10 15:30:20.599
FETCH
#4:c=33040000,e=33698563,p=0,cr=9135,cu=0,mis=0,r=1,dep=0,og=2,tim=3431352727835 WAIT #4: nam='SQL*Net message from client' ela= 3682 p1=1413697536 p2=1 p3=0 FETCH #4:c=0,e=14,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=2,tim=3431352731701 WAIT #4: nam='SQL*Net message to client' ela= 1 p1=1413697536 p2=1 p3=0

Even though this doesn't appear to me to be a stats issue since the explain plans are the same, I've tried it without stats, with compute and size 255 buckets, both with the same results.

I haven't checked a 10053 trace yet -- this doesn't appear to me to be a problem with an "incorrect" plan -- but I haven't really spent much time with 10053 yet and wanted to get some thoughts on this mess first.

Sorry for the long post! Hopefully I haven't left too much out in this "condensed" version. :)

TIA!
Rich

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Jul 10 2009 - 16:45:28 CDT

Original text of this message