Home » SQL & PL/SQL » SQL & PL/SQL » Bad optimisation of left outer join
Bad optimisation of left outer join [message #194437] Fri, 22 September 2006 04:22 Go to next message
chimpychimpy
Messages: 7
Registered: September 2006
Junior Member
I'm very new to Oracle, but used SQL before. I've got the following situation

Table Person (Theres about 9,000,000 different people in my db)
- id (int)
- name (char(255))
- job_id (int)
----
id is indexed,
( id, job_id ) is indexed

Table Job (Theres only about 50 differnt jobs)
- job_id (int)
- job_name (char(255))
------
job_id is indexed

all fields are non-null (ints are actually bigint's)

When I execute the query:
select p.Name,
       j.name

from 
       Person p
         left outer join
       Job j
         on p.job_id = j.job_id

where
       rownum < 100

This query takes over 2 minutes!...

if I use a normal inner join instead of left outer join (which is fair, since job is non-null) - it is a matter of miliseconds.

I would have expected oracle to notice that on a 'left', the person p must always be present, and so pull back the first 100 records of Person, and join them to Job if applicable

unfortunately, the time its taking, it must be joining the entirity of Person (9 million records) to Job, *then* getting the first 100 results. The explain plan backs this up :-

explain plan:
SELECT STATEMENT
COUNT | STOPKEY
HASH JOIN | OUTER
TABLE ACCESS | FULL | Person
TABLE ACCESS | FULL | Job

Unfortunately I do *not* have the choice to not use left outer join, since it is generated from third party software (hibernate).

SQL server copes with the same query with no problem, (1sec response or something). (changing db is not an option either!!)

Is there anything I can do short of changing the script? (this is not an available option). I would have expected the optimiser to get this - is there some hint / index / something else I can do to this thing? I feel certain that oracle can do this, but I'm very new to it and don't know why it isn';t picking up this optimisation

Tim
Re: Bad optimisation of left outer join [message #194447 is a reply to message #194437] Fri, 22 September 2006 05:01 Go to previous messageGo to next message
ThomasG
Messages: 3189
Registered: April 2005
Location: Heilbronn, Germany
Senior Member
When the optimzer makes bad decicions it's usually due to the table statistics not being out of date.

A thing to try is analze the tables :

analyze table Person compute statistics for all indexed columns for all indexes for table;
analyze table Job compute statistics for all indexed columns for all indexes for table;


And see if that speeds things up.
Re: Bad optimisation of left outer join [message #194459 is a reply to message #194447] Fri, 22 September 2006 05:30 Go to previous messageGo to next message
chimpychimpy
Messages: 7
Registered: September 2006
Junior Member
I should have mentioned, it's been analyzed. Was last analyzed 5 days ago, I'll analyze those indexes but I suspect it will not change the result.

the usage hasn't really changed in those 5 days, its packing data from 3 months.
Re: Bad optimisation of left outer join [message #194542 is a reply to message #194437] Fri, 22 September 2006 10:29 Go to previous messageGo to next message
srinivnp
Messages: 136
Registered: January 2006
Location: stlouis MO USA
Senior Member

An outer join and an inner join are two different animals .
That comparison is not at all valid . Would like to see your inner join query and its where clause.

What makes you think that the above plan is bad for your outerjoin and the number of rows that you have ?

By the way do you have a terribly high high water mark ?
Do both the above tables undergo lots of deletes updates ?

Couple of suggestions.
1)Truncate and Reload the table such that rows in both the tables are ordered by jobid. That is both the parent and child have excellent clustering factor on the FK key columns.

2)If possible , Make the two tables btree clustered . Implementing this depends on your data and its growth.



Srini



Re: Bad optimisation of left outer join [message #194548 is a reply to message #194542] Fri, 22 September 2006 10:46 Go to previous messageGo to next message
chimpychimpy
Messages: 7
Registered: September 2006
Junior Member
srinivnp wrote on Fri, 22 September 2006 10:29



What makes you think that the above plan is bad for your outerjoin and the number of rows that you have ?



the fact that it runs in 1 min 40 when in sql-server it runs under a second gives me a hint.

logically it should just look at the top 100 entries in the ( id, job_id ) index, skim them off and flesh them out. its not a particularly difficult operation. I could probably perform it manually faster than this query returns, which tells me, its not optimised correctly.

lastly, theres 9 million rows... why do I think a full outer join would be bad for it?... call it a hunch...

Quote:


By the way do you have a terribly high high water mark ?



what do you mean by a water-mark?


Quote:


Do both the above tables undergo lots of deletes updates ?



average about 20,000 new records a day. no deletes. no updates.

Quote:


Couple of suggestions.
1)Truncate and Reload the table such that rows in both the tables are ordered by jobid. That is both the parent and child have excellent clustering factor on the FK key columns.

2)If possible , Make the two tables btree clustered . Implementing this depends on your data and its growth.



Clustering isn't the issue here. It has to come up with the first 100 records that consist of a person and corresponding job or no job. A person has at most a single job. Why should it need to join 9 million records?

[Updated on: Fri, 22 September 2006 10:52]

Report message to a moderator

Re: Bad optimisation of left outer join [message #194566 is a reply to message #194548] Fri, 22 September 2006 12:35 Go to previous messageGo to next message
ThomasG
Messages: 3189
Registered: April 2005
Location: Heilbronn, Germany
Senior Member
Can you try the query with an optimizer hint ?

select /*+FIRST_ROWS*/ 
       p.Name,
       j.name
from 
       Person p
         left outer join
       Job j
         on p.job_id = j.job_id
where
       rownum < 100


If that resolves the perfomance issue you could set the oracle initialization parameter "OPTIMIZER_MODE" from "CHOOSE" to "FIRST_ROWS". Then the optimizer will try to get the first rows quickly, instead of trying to get all rows.
Re: Bad optimisation of left outer join [message #194679 is a reply to message #194566] Sat, 23 September 2006 23:37 Go to previous messageGo to next message
Frank
Messages: 7880
Registered: March 2000
Senior Member
Word of warning here:
Don't "just" change initialization parameters bases on the behaviour of one query!!
Changing init-parameters should only be done after thorough testing of the complete application.
Re: Bad optimisation of left outer join [message #194926 is a reply to message #194679] Tue, 26 September 2006 03:37 Go to previous message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
Definitely don't change the init.ora parameters to fix a single problem.

Instead, you can create a stored outline for this query. Basically, this allows you to store a plan for a specific exact query, and whenever that exact query is executed, the stored plan will be used instead of the optimizer generated one.

Note:
Quote:

To create a public or private outline, you must have the CREATE ANY OUTLINE system privilege.




ORIGINAL PLAN:

explain plan for
select p.Name,
       j.job_name
from   Person p
         left outer join
       Job j
       on p.job_id = j.job_id
where rownum < 100;

select * from table(dbms_xplan.display);

"PLAN_TABLE_OUTPUT"
" "
"--------------------------------------------------------------------"
"| Id  | Operation            |  Name       | Rows  | Bytes | Cost  |"
"--------------------------------------------------------------------"
"|   0 | SELECT STATEMENT     |             |    99 |  2376 |  7298 |"
"|   1 |  COUNT STOPKEY       |             |       |       |       |"
"|   2 |   HASH JOIN OUTER    |             |  9000K|   205M|  7298 |"
"|   3 |    TABLE ACCESS FULL | PERSON      |  9000K|   137M|  5420 |"
"|   4 |    TABLE ACCESS FULL | JOB         |    50 |   400 |     2 |"
"--------------------------------------------------------------------"
" "
"Note: cpu costing is off, PLAN_TABLE' is old version"


alter session set optimizer_mode = 'FIRST_ROWS';

create or replace public outline query_1_fr on select p.Name,
       j.job_name
from   Person p
         left outer join
       Job j
       on p.job_id = j.job_id
where rownum < 100;

alter session set optimizer_mode = 'CHOOSE';

alter system set use_stored_outlines = true;

NEW PLAN
explain plan for
select p.Name,
       j.job_name
from   Person p
         left outer join
       Job j
       on p.job_id = j.job_id
where rownum < 100;

select * from table(dbms_xplan.display);

"PLAN_TABLE_OUTPUT"
" "
"-----------------------------------------------------------------------------"
"| Id  | Operation                     |  Name       | Rows  | Bytes | Cost  |"
"-----------------------------------------------------------------------------"
"|   0 | SELECT STATEMENT              |             |    99 |  2376 |  9005K|"
"|   1 |  COUNT STOPKEY                |             |       |       |       |"
"|   2 |   NESTED LOOPS OUTER          |             |  9000K|   205M|  9005K|"
"|   3 |    TABLE ACCESS FULL          | PERSON      |  9000K|   137M|  5420 |"
"|   4 |    TABLE ACCESS BY INDEX ROWID| JOB         |     1 |     8 |     1 |"
"|   5 |     INDEX RANGE SCAN          | JOB_IDX     |     1 |       |       |"
"-----------------------------------------------------------------------------"
" "
"Note: cpu costing is off, PLAN_TABLE' is old version"
Previous Topic: populate table
Next Topic: Importing only packages
Goto Forum:
  


Current Time: Wed Dec 07 18:14:00 CST 2016

Total time taken to generate the page: 0.08867 seconds