Query is extremely slow [message #426015]
Tue, 13 October 2009 13:55
varshasantosh2004
Messages: 7 Registered: October 2009
Junior Member
Hi, we have a table which has about 35M records. This table has AS_OF_DATE as one of its columns, for storing history data.
But at any time, the application only uses the data for the latest AS_OF_DATE.
We are joining this table with another table, which also has about 35M records, using an outer join, and creating a Materialized View of the result for further operations. The Materialized View creation takes at least 8-10 hrs. Please help me write a more effective query.
Here are the queries that I am using currently:
-- From the first table. Even this takes a lot of time to return the data.
SELECT * FROM TABLE_A pa WHERE pa.AS_OF_DATE =(
SELECT MAX(AS_OF_DATE) FROM TABLE_A pa2 where pa.id = pa2.id)
-- Second query:
SELECT * FROM TABLE_B pb
LEFT OUTER JOIN
( SELECT * FROM TABLE_A pa WHERE pa.AS_OF_DATE =(
SELECT MAX(AS_OF_DATE) FROM TABLE_A pa2 where pa.id = pa2.id)) pb2
ON pb.ID = pb2.ID
The second query takes forever to come back. Is there any way I can improve the performance?
I already have the following indexes on both tables:
ID
AS_OF_DATE
ID, AS_OF_DATE (Unique Index)
Please help. Thanks
Re: Query is extremely slow [message #426018 is a reply to message #426015]
Tue, 13 October 2009 14:23
varshasantosh2004
Messages: 7 Registered: October 2009
Junior Member
Hi, Thanks for the quick reply. I will try to follow the guidelines going forward. Thanks.
Are statistics current on all tables & indexes?
Yes
How many rows get returned by first query?
About 20M rows
How many rows get returned by second query?
About 35M rows, as it's an outer join.
Also, here is the explain plan output for the first query:
SELECT STATEMENT () (null)
HASH JOIN () (null)
VIEW () VW_SQ_1
HASH (GROUP BY) (null)
INDEX (FULL SCAN) X_ID_AND_ASOFDATE
TABLE ACCESS (FULL) TABLE_A
Thanks
Re: Query is extremely slow [message #426019 is a reply to message #426015]
Tue, 13 October 2009 14:28
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
Senior Member
In my experience, and this is just a gross comment with plenty of holes, you should be able to join about 1M rows/minute. So I would be happy with a straight-up join of 35M rows in half an hour to an hour, not 8 or 9 hours.
I mention this because it gives you a basic goal to shoot for.
If I understand your query, you are trying to join table B to the most recent counterpart row in table A. Is that so?
Also, I am not sure this is your actual query. You are selecting * from two tables that share common columns. Is this the actual query, or did you cheat a little to give us a simpler version?
You might wish to break your process into steps. Here is an example:
I believe you already have this index.
create index temp_i1 on tablea (id, as_of_date) nologging
/
This table should create pretty fast because its query plan should only reference the index above. The goal is to create a rowset to join to that contains only the rows you actually need, and thus remove any requirement for a BOB BARKER join.
create table temp1 nologging
as
select max(as_of_date) as_of_date, id
from tablea
group by id
/
This table create should avoid any complexities related to your BOB BARKER subqueries. This was the goal of creating the temp table, to allow you to rewrite your query into a simple set of joins.
create table temp2 nologging
as
select *
from tableb
, tablea
, temp1
where tableb.id = temp1.id(+)
and tableb.as_of_date = temp1.as_of_date(+)
and temp1.id = tablea.id(+)
and temp1.as_of_date = tablea.as_of_date(+)
/
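From there, the Materialized View the original post needs could be defined over temp2, or directly over the same joins against temp1. A minimal sketch, assuming this structure (the MV name and the col_a1/col_b1 columns below are made up, since the real column lists of TABLE_A and TABLE_B are not given in the thread; ANSI join syntax is used just for brevity, and a plain select * would fail in the create because both tables contain an ID column):
create materialized view mv_latest_a_b
build immediate
as
select b.id,
       b.col_b1,          -- hypothetical TABLE_B column: replace with the real list
       a.as_of_date,
       a.col_a1           -- hypothetical TABLE_A column: replace with the real list
from   tableb b
left join temp1  t on t.id = b.id
left join tablea a on a.id = t.id
                  and a.as_of_date = t.as_of_date
/
The same duplicate-column caveat applies to the select * in the temp2 create above.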
Ideally you should not have to do any of this, but sometimes Oracle needs a little help, and I have found that in every release, Oracle always prefers a simple join to other constructs.
Good luck, Kevin
[Updated on: Tue, 13 October 2009 14:30]
Re: Query is extremely slow [message #426020 is a reply to message #426016]
Tue, 13 October 2009 14:30
andrew again
Messages: 2577 Registered: March 2000
Senior Member
This is a correlated subquery:
SELECT * FROM TABLE_A pa WHERE pa.AS_OF_DATE =(
SELECT MAX(AS_OF_DATE) FROM TABLE_A pa2 where pa.id = pa2.id)
So that means the inner select runs once for every row in the outer query, which is very inefficient.
One alternative is to rank the dates for each id. Example:
select object_type, created from user_objects where object_name like 'G%' order by 1, 2 desc
OBJECT_TYPE CREATED
DATABASE LINK 9/24/2009 9:49:19 PM
DATABASE LINK 7/10/2009 2:55:53 PM
DATABASE LINK 6/10/2009 1:41:14 PM
DATABASE LINK 6/8/2009 4:09:43 PM
DATABASE LINK 6/8/2009 4:09:34 PM
FUNCTION 6/1/2009 3:54:33 PM
FUNCTION 8/3/2007 1:23:09 PM
SYNONYM 5/2/2008 9:22:02 AM
SYNONYM 4/29/2008 2:17:12 PM
--Rank them by type by date desc...
select object_type, created, rank() over (partition by object_type order by created desc) as rnk
from user_objects where object_name like 'G%' order by 1, 3;
OBJECT_TYPE CREATED RNK
DATABASE LINK 9/24/2009 9:49:19 PM 1
DATABASE LINK 7/10/2009 2:55:53 PM 2
DATABASE LINK 6/10/2009 1:41:14 PM 3
DATABASE LINK 6/8/2009 4:09:43 PM 4
DATABASE LINK 6/8/2009 4:09:34 PM 5
FUNCTION 6/1/2009 3:54:33 PM 1
FUNCTION 8/3/2007 1:23:09 PM 2
SYNONYM 5/2/2008 9:22:02 AM 1
SYNONYM 4/29/2008 2:17:12 PM 2
--Then just pick the recs ranked 1st...
select object_type, created from(
select object_type, created, rank() over (partition by object_type order by created desc) as rnk
from user_objects where object_name like 'G%')
where rnk=1
order by 1
OBJECT_TYPE CREATED
DATABASE LINK 9/24/2009 9:49:19 PM
FUNCTION 6/1/2009 3:54:33 PM
SYNONYM 5/2/2008 9:22:02 AM
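Applied to the poster's tables, the same idea might look roughly like the sketch below. It assumes only what the thread states (an ID column in both tables and AS_OF_DATE in TABLE_A); if the result feeds a materialized view, list the columns explicitly instead of *, since duplicate column names are not allowed there.
-- Keep only the latest AS_OF_DATE row per ID from TABLE_A, then outer join
-- it to TABLE_B in a single pass, with no correlated subquery.
select pb.*, latest_a.*
from   table_b pb
left outer join (
       select *
       from  (select pa.*,
                     rank() over (partition by pa.id
                                  order by pa.as_of_date desc) as rnk
              from   table_a pa)
       where  rnk = 1
     ) latest_a
on     latest_a.id = pb.id;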
Re: Query is extremely slow [message #426021 is a reply to message #426015]
Tue, 13 October 2009 15:52
varshasantosh2004
Messages: 7 Registered: October 2009
Junior Member
All, thanks a lot for your kind replies. This forum is awesome.
I followed the suggestion to use rank(), and it seems to be more effective, as I can use the query directly in the Materialized View rather than creating the temp table and using it further. The temp table solution is also quite good, and we can use it for many other queries.
I have another small issue, where I am trying to aggregate (as comma-separated values) the column values for a particular id.
For example:
Table_A
id blocks
1 b1
1 b2
1 b3
1 b4
2 c1
2 c2
3 d1
The output needs to be in the following format:
id data
1 b1,b2,b3,b4
2 c1,c2
3 d1
etc
Is there a simple and effective solution for this that performs well?
Currently I am doing this using cursors, looping through the data and concatenating the values to get the result, and it is very slow.
Please help. Thanks a lot in advance.
Thanks,
Re: Query is extremely slow [message #426028 is a reply to message #426021]
Tue, 13 October 2009 16:52
andrew again
Messages: 2577 Registered: March 2000
Senior Member
http://tkyte.blogspot.com/2006/08/evolution.html
ops$tkyte%ORA10GR2> with data
2 as
3 (
4 select job,
5 ename,
6 row_number() over (partition by job order by ename) rn,
7 count(*) over (partition by job) cnt
8 from emp
9 )
10 select job, ltrim(sys_connect_by_path(ename,','),',') scbp
11 from data
12 where rn = cnt
13 start with rn = 1
14 connect by prior job = job and prior rn = rn-1
15 order by job
16 /
JOB SCBP
--------- ----------------------------------------
ANALYST FORD,SCOTT
CLERK ADAMS,JAMES,MILLER,SMITH
MANAGER BLAKE,CLARK,JONES
PRESIDENT KING
SALESMAN ALLEN,MARTIN,TURNER,WARD
--10g
http://www.oracle-developer.net/display.php?id=306
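Carried over to the Table_A (id, blocks) example earlier in this thread, the same pattern might look roughly like this (just a sketch; note that on 11g Release 2 and later, LISTAGG(blocks, ',') WITHIN GROUP (ORDER BY blocks) does the same thing in one step):
-- Number the block values per id, then walk them with CONNECT BY and keep
-- only the row holding the full comma-separated path.
with data as
(
  select id,
         blocks,
         row_number() over (partition by id order by blocks) rn,
         count(*) over (partition by id) cnt
  from   table_a
)
select id, ltrim(sys_connect_by_path(blocks, ','), ',') as data
from   data
where  rn = cnt
start with rn = 1
connect by prior id = id and prior rn = rn - 1
order by id
/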