Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: [SOLVED] Re: performance tuning questions: replace IN (values) by JOIN

Re: [SOLVED] Re: performance tuning questions: replace IN (values) by JOIN

From: Telemachus <telemachus_at_ulysseswillreturn.net>
Date: Mon, 1 Sep 2003 13:21:09 +0100
Message-ID: <VQG4b.29996$pK2.48199@news.indigo.ie>


Is that really the hint you are using ?

you may wish to consider why oracle does not wish to use the index by itself.
"Jérôme Lacoste - CoffeeBreaks" <lacostej_at_coffebreaks.org> wrote in message news:9si4b.17454$BD3.2774497_at_juliett.dax.net...
>
> For those interested, the complete solution was to:
>
> - add an index on BAG_LOG DTM_TIMESTAMP
> - replace the Time Range test by a BETWEEN
> - remove the sub-query and add the test on the time range directly to
> the WHERE clause
> - force the use of the HINT to the Optimizer
>
> select /*+ HINT BAG_LOG_IDX_DTM_TIMESTAMP */ BAGS.ID as BAGS__ID, [...],
> BAG_LOG.* FROM BAGS, BAG_LOG
> WHERE (BAGS.ID = BAG_LOG.BAG_ID AND
> BAG_LOG.DTM_TIMESTAMP BETWEEN (TO_DATE('2003-19-08
> 00:00:00', 'yyyy-dd-mm HH24:MI:SS'))
> AND (TO_DATE('2003-19-08
> 12:00:00', 'yyyy-dd-mm HH24:MI:SS'))
> ) ORDER BY BAGS.ID DESC, BAG_LOG.DTM_TIMESTAMP DESC;
>
>
> query went from 18 sec. to 0.02 seconds.
>
> Thanks all, I owe you one
>
> Jerome
>
> PS: for those who might wonder I will rename this index as its name his
> very bad...
>
> Jérôme Lacoste - CoffeeBreaks wrote:
> > Hi,
> >
> > I am trying to improve an SQL query design while still keeping its
> > performance.
> >
> >
> > DB Model
> > --------
> > I have 2 tables, one called BAGS the other called BAG_LOG. Each has a
> > primary key called ID. The BAG_LOG table has a foreign key called BAG_ID
> > pointing to BAGS.ID.
> >
> > The relation is: a BAG has 1 or more BAG_LOG(s).
> >
> > The BAG_LOG table has a DTM_TIMESTAMP column.
> >
> > There are 40000 rows in BAGS and 170000 in BAG_LOG.
> >
> > There's no specific INDEX appart the ones generated automatically on the
> > primary and foreign key.
> >
> >
> >
> > Query to improve
> > ----------------
> >
> > Let's say I want to find all BAGS who have had a TIMESTAMP within a
> > specific period. Additional criteria will add later on.
> >
> >
> > One solution is to use 3 queries:
> >
> > select DISTINCT(BAG_ID) FROM BAG_LOG
> > WHERE DTM_TIMESTAMP >= (TO_DATE('2003-19-08 0:00:00',
> > 'yyyy-dd-mm HH24:MI:SS'))
> > AND DTM_TIMESTAMP <= (TO_DATE('2003-19-08 2:00:00',
> > 'yyyy-dd-mm HH24:MI:SS'));
> >
> > SELECT * FROM BAGS where ID IN
> > (346564,346565,346566,346567,346568,346569);
> >
> > SELECT * FROM BAG_LOG where BAG_ID IN
> > (346564,346565,346566,346567,346568,346569);
> >
> > The first query find the BAG IDs who satisfy the condition, the second
> > and third queries find the rows of the BAG and BAG_LOG table that match
> > the result of the first query. The second and third query are built
> > dynamically by the calling program, based on the results returned by the
> > first query.
> >
> >
> > This has two drawbacks:
> > - it doesn't exploit SQL capabilities
> > - by using IN (list of values) one can hit a limitation if the first
> > query returns more than a certain number of rows. (Oracle 9 has a
> > default limit of 1000 possible values).
> >
> > It has one advantage
> > - it is very simple SQL and will probably work on all DB I know.
> >
> >
> >
> > Possible improvement?
> > ---------------------
> >
> > The 3 queries are joined into one. The columns are renamed so to be able
> > to retrieve the values by column name (otherwise identical BAGS and
> > BAG_LOG column names conflict).
> >
> > select BAGS.ID as BAGS__ID, [...],
> > BAG_LOG.* FROM BAGS, BAG_LOG
> > WHERE (BAGS.ID = BAG_LOG.BAG_ID AND
> > BAGS.ID IN (select DISTINCT(BAG_ID) FROM BAG_LOG
> > WHERE DTM_TIMESTAMP >= (TO_DATE('2003-19-08
> > 00:00:00', 'yyyy-dd-mm HH24:MI:SS'))
> > AND DTM_TIMESTAMP <= (TO_DATE('2003-19-08
> > 12:00:00', 'yyyy-dd-mm HH24:MI:SS'))
> > )
> > ) ORDER BY BAGS.ID DESC, BAG_LOG.DTM_TIMESTAMP DESC;
> >
> > replace [...] by the full list of fields from the BAGS table one wants
> > to retrieve. In my case all.
> >
> > [Note that I also added sorting in that query, but it doesn't seem to
> > affect the performance in my case].
> >
> >
> >
> > Performance results & analysis:
> > -------------------------------
> >
> > - (1) set of queries takes about 14 sec to perform
> > - (2) takes 18 secs to perform, a 25% performance loss
> >
> >
> > Here are some info using the autotrace function:
> >
> >
> > ------ (1) --------
> > 6 rows selected.
> >
> > Elapsed: 00:00:07.01
> >
> > Execution Plan
> > ----------------------------------------------------------
> > 0 SELECT STATEMENT Optimizer=CHOOSE
> > 1 0 SORT (UNIQUE)
> > 2 1 TABLE ACCESS (FULL) OF 'BAG_LOG'
> >
> >
> >
> >
> > Statistics
> > ----------------------------------------------------------
> > 0 recursive calls
> > 0 db block gets
> > 34199 consistent gets
> > 34184 physical reads
> > 0 redo size
> > 456 bytes sent via SQL*Net to client
> > 499 bytes received via SQL*Net from client
> > 2 SQL*Net roundtrips to/from client
> > 1 sorts (memory)
> > 0 sorts (disk)
> > 6 rows processed
> >
> >
> > 6 rows selected.
> >
> > Elapsed: 00:00:00.03
> >
> > Execution Plan
> > ----------------------------------------------------------
> > 0 SELECT STATEMENT Optimizer=CHOOSE
> > 1 0 CONCATENATION
> > 2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> > 3 2 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> > 4 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> > 5 4 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> > 6 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> > 7 6 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> > 8 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> > 9 8 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> > 10 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> > 11 10 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> > 12 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> > 13 12 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> >
> >
> > Statistics
> > ----------------------------------------------------------
> > 0 recursive calls
> > 0 db block gets
> > 18 consistent gets
> > 0 physical reads
> > 0 redo size
> > 915 bytes sent via SQL*Net to client
> > 499 bytes received via SQL*Net from client
> > 2 SQL*Net roundtrips to/from client
> > 0 sorts (memory)
> > 0 sorts (disk)
> > 6 rows processed
> >
> >
> > 12 rows selected.
> >
> > Elapsed: 00:00:07.07
> >
> > Execution Plan
> > ----------------------------------------------------------
> > 0 SELECT STATEMENT Optimizer=CHOOSE
> > 1 0 TABLE ACCESS (FULL) OF 'BAG_LOG'
> >
> >
> >
> >
> > Statistics
> > ----------------------------------------------------------
> > 0 recursive calls
> > 0 db block gets
> > 34200 consistent gets
> > 34185 physical reads
> > 0 redo size
> > 1774 bytes sent via SQL*Net to client
> > 499 bytes received via SQL*Net from client
> > 2 SQL*Net roundtrips to/from client
> > 0 sorts (memory)
> > 0 sorts (disk)
> > 12 rows processed
> >
> >
> >
> > ------ (2) ------
> >
> > 12 rows selected.
> >
> > Elapsed: 00:00:18.06
> >
> > Execution Plan
> > ----------------------------------------------------------
> > 0 SELECT STATEMENT Optimizer=CHOOSE
> > 1 0 SORT (ORDER BY)
> > 2 1 MERGE JOIN
> > 3 2 SORT (JOIN)
> > 4 3 NESTED LOOPS
> > 5 4 TABLE ACCESS (FULL) OF 'BAG_LOG'
> > 6 4 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> > 7 6 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> > 8 2 SORT (JOIN)
> > 9 8 VIEW OF 'VW_NSO_1'
> > 10 9 SORT (UNIQUE)
> > 11 10 TABLE ACCESS (FULL) OF 'BAG_LOG'
> >
> >
> >
> >
> > Statistics
> > ----------------------------------------------------------
> > 0 recursive calls
> > 26 db block gets
> > 345418 consistent gets
> > 71578 physical reads
> > 0 redo size
> > 2422 bytes sent via SQL*Net to client
> > 499 bytes received via SQL*Net from client
> > 2 SQL*Net roundtrips to/from client
> > 3 sorts (memory)
> > 1 sorts (disk)
> > 12 rows processed
> >
> >
> >
> >
> >
> > I clearly see that my second solution uses much more memory (consistent
> > gets is about 3 times the total used by the first 3 queries), and also
> > do 2 more sorts in memory and one in the disk.
> >
> > But I do not see any obvious design flaw in my query....
> >
> >
> >
> > -- Questions --
> > ---------------
> >
> > Q: Is the performance loss due to the NESTED LOOPS? or is it due to the
> > JOIN in itself ?
> >
> > Q: I was wondering what kind of query should I use in order to manage to
> > improve the design of my query (1) especially removing its limitations
> > without introducing the performance loss as with my second query. Should
> > I use a temporary table?
> >
> > Q: [slightly unrelated] How one can enable the autotrace to also display
> > (Cost, Card and Byte) as shown in most of Oracle examples?
> >
> > Any help appreciated...
> >
> >
> > Cheers,
> >
> > Jerome
> >
>
Received on Mon Sep 01 2003 - 07:21:09 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US