Home » RDBMS Server » Performance Tuning » A performance test of Oracle Analtyicals vs Group By (Oracle 9.2.0.3 / Unix)
A performance test of Oracle Analtyicals vs Group By [message #346462] Mon, 08 September 2008 11:17 Go to next message
harrysmall3
Messages: 109
Registered: April 2008
Location: Massachusetts
Senior Member
As I am happy as a lark to have found out from posts here about the Oracle Analytical functions - something previously blind and deaf to - thought I'd immediately do some performance testing.

This test compares doing multiple Select/Group By's vs a Partition/Over variation to get counts on a table rolled up by different data elements.

Here is a description of the table:


CREATE TABLE EFPREF_PROVIDER
(
  PROVIDER_ID             NUMBER(9)             NOT NULL,
  PROVIDER_CLASS          VARCHAR2(10 BYTE),
  PROVIDER_NAME           VARCHAR2(60 BYTE),
  PROVIDER_FNAME          VARCHAR2(25 BYTE),
  PROVIDER_LNAME          VARCHAR2(60 BYTE),
  PROVIDER_L7F2           VARCHAR2(9 BYTE),
  PROVIDER_L7             VARCHAR2(7 BYTE),
  LOCID                   NUMBER(9)             NOT NULL,
  ADDRESS                 VARCHAR2(60 BYTE),
  CITY                    VARCHAR2(60 BYTE),
  STATE                   VARCHAR2(2 BYTE),
  ZIP                     VARCHAR2(5 BYTE),
  PARRID                  NUMBER(9)             NOT NULL,
  TIN                     VARCHAR2(10 BYTE),
  SPARE_COL               VARCHAR2(10 BYTE),
  PRVNAME_32              VARCHAR2(60 BYTE),
  TINSPECEFFECTIVEDATE    DATE,
  TINSPECTERMINATIONDATE  DATE,
  PRODUCTGROUPS           VARCHAR2(50 BYTE)
)
TABLESPACE MPIEFP_TRANS_DATA
PCTUSED    0
PCTFREE    10
INITRANS   1
MAXTRANS   255
STORAGE    (
            INITIAL          512K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
LOGGING 
NOCOMPRESS 
NOCACHE
NOPARALLEL
NOMONITORING
/


CREATE INDEX IDX1_EFPREFPROVIDER ON EFPREF_PROVIDER
(PROVIDER_L7, TIN)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX4_EFPREFPROVIDER ON EFPREF_PROVIDER
(PROVIDER_NAME, CITY, STATE)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX7_EFPREFPROVIDER ON EFPREF_PROVIDER
(PROVIDER_CLASS, TIN, STATE)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE UNIQUE INDEX EFREF$PARRID ON EFPREF_PROVIDER
(PARRID)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFRFPRV$ID ON EFPREF_PROVIDER
(PROVIDER_ID)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFREF$ADDR ON EFPREF_PROVIDER
(ADDRESS)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFREF$CITY ON EFPREF_PROVIDER
(CITY)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFREF$CLASS ON EFPREF_PROVIDER
(PROVIDER_CLASS)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFREF$L7 ON EFPREF_PROVIDER
(PROVIDER_L7)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFREF$L7F2 ON EFPREF_PROVIDER
(PROVIDER_L7F2)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFREF$ST ON EFPREF_PROVIDER
(STATE)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFREF$ZIP ON EFPREF_PROVIDER
(ZIP)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFRFPRV$NM ON EFPREF_PROVIDER
(PROVIDER_NAME)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFRFPRV$SPCOL ON EFPREF_PROVIDER
(SPARE_COL)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX EFRFPRV$TIN ON EFPREF_PROVIDER
(TIN)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX_EFPREFPRVNM32 ON EFPREF_PROVIDER
(PRVNAME_32)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX2_EFPREFPROVIDER ON EFPREF_PROVIDER
(PROVIDER_NAME, TIN)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX3_EFPREFPROVIDER ON EFPREF_PROVIDER
(PROVIDER_L7, CITY, STATE)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX5_EFPREFPROVIDER ON EFPREF_PROVIDER
(PROVIDER_L7, ZIP)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX6_EFPREFPROVIDER ON EFPREF_PROVIDER
(PROVIDER_NAME, ZIP)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX8_EFPREFPROVIDER ON EFPREF_PROVIDER
(TIN, STATE)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


CREATE INDEX IDX9_EFPREFPROVIDER ON EFPREF_PROVIDER
(STATE, CITY)
NOLOGGING
TABLESPACE MPIEFP_TRANS_DATA
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       2147483645
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
/


Size information:

642 MB
Num Rows - 4046742
Avg Len - 139
Num Blocks - 81495.

Quote:

OK, test is to accomplish getting counts of records rolled up by:




Select  tin,count(*)  NT from efpref_provider where rownum < 100001

 group by tin,provider_Lname;

 

Select tin,count(*) tcs  from efpref_provider where rownum < 100001

group by tin,city,state;

 

Select tin,count(*)  tz  from efpref_provider where rownum < 100001

group by tin,zip;

 

Select tin,count(*)  ta  from efpref_provider where rownum < 100001

group by tin,address;



VS.

SELECT TIN,

       COUNT(*) over (PARTITION BY TIN,PROVIDER_LNAME) nt,

       COUNT(*) over (PARTITION BY TIN,CITY,STATE) tcs,

       COUNT(*) over (PARTITION BY TIN,ZIP) tz,

       COUNT(*) over (PARTITION BY TIN,ADDRESS) ta 

from efpref_provider where rownum < 100001;



The execution plans:

For all of the individual Select/Group By's we have
Quote:

Operation Object Name Rows Bytes Cost

SELECT STATEMENT Optimizer Mode =CHOOSE 100 K 23507
SORT GROUP BY 100 K 1 M 23507
COUNT STOPKEY
TABLE ACCESS FULL MPIEFP_DEV.EFPREF_PROVIDER 4 M 69 M 7839



With just variation in digits for the bytes/cost


For the analytical variation:

Quote:

Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop

SELECT STATEMENT Optimizer Mode=CHOOSE 100 K 160335
WINDOW SORT 100 K 5 M 160335
WINDOW SORT 100 K 5 M 160335
WINDOW SORT 100 K 5 M 160335
WINDOW SORT 100 K 5 M 160335
COUNT STOPKEY
TABLE ACCESS FULL MPIEFP_DEV.EFPREF_PROVIDER 4 M 219 M 7839




Window Sort is something for me to read up on...

And now the performance results. In both scripts I took
the sysdate before and after execution.

I executed each 3 times and took the last one, to account
for caching and what not

The Sort/Group By's:
29 minutes and 4 seconds.

The Analytic version:
11 Minutes 31 seconds.

Final Analysis : I am happy.

I will be taking this a step further to make it somewhat 'tangible'. Instead of individual statements for
the Sort/Group By's I will try a one statement UNION ALL,
and vary that against a Materialized view using WITH.

Then - try to gage the rate increase proportionality.
By that - How does the performance time increase relevent
to a data volume increase in both - linear, logarithmic, exponentional?

Any suggestions on adding factors to this test to make results
more meaningful pls contribute!

Oh - the analytical variation was the slow poke vs a single
group by, the rate proportionality test should give me
a good benchmark for break even points.

Regards
Harry
Re: A performance test of Oracle Analtyicals vs Group By [message #346474 is a reply to message #346462] Mon, 08 September 2008 12:27 Go to previous messageGo to next message
Michel Cadot
Messages: 68728
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Performances topic are for "Performances Tuning" forum.
Not formatted exexuction plans are unreadable.
Just my opinion

Regards
Michel
Re: A performance test of Oracle Analtyicals vs Group By [message #346477 is a reply to message #346462] Mon, 08 September 2008 12:33 Go to previous messageGo to next message
harrysmall3
Messages: 109
Registered: April 2008
Location: Massachusetts
Senior Member
Yikes! - sorry Michel, this definitely belongs moved to the performance section. I will post follow up there.
I let my enthusiasm get in front of my planning & communicating - as well with the unformatted explain plans.

Best Regards,
Harry
Re: A performance test of Oracle Analtyicals vs Group By [message #346608 is a reply to message #346462] Tue, 09 September 2008 02:21 Go to previous messageGo to next message
Olivia
Messages: 519
Registered: June 2008
Senior Member
Harry, thanks once again for the test cases you provided for Oracle Analtyicals vs Group By.


A concern, does creation of more indexes have adverse affect?
say we have an index on a timestamp column. And later again, I created another index combining some columns which contains this
timestamp column to optimize some queries. Wont it have adverse effect on other existing queries which were giving better performance?

Regards,
Oli

[Updated on: Tue, 09 September 2008 09:19]

Report message to a moderator

Re: A performance test of Oracle Analtyicals vs Group By [message #346761 is a reply to message #346608] Tue, 09 September 2008 12:01 Go to previous messageGo to next message
harrysmall3
Messages: 109
Registered: April 2008
Location: Massachusetts
Senior Member
Olivia - I had just typed out a long response to this and somehow lost it after previewing reply. Sad

In short I think the question becomes "Why or When would Oracle decide to choose an index for a sub-optimal access path?" ie,
If you create a new index on a field used by a previous query that was functioning optimally, such that it now degrades because Oracle chose a new access path with the new index,
we should examine when and why that can occur (vs viewing this as "dangers of new indexes" on existing query performance, as such risk lies in the light of other elements, not the index itself).

I am in that situation right now in my DEV environment - a word I have come to use after many an enlightening chat with rleishman of this site on the topic of "Oracle Optimization Plan Cost" is "Goals".

When Oracle's goals differ from your own, sub-optimal access path's can be chosen as the least costly plan for execution.
One of those goals is determined by initialization parameters on how to weight the cost of using an index and how much index data Oracle should expect as being cached in memory for use.

Have to run but I will give you real world examples of the abov e and my experience with exactly what you describe.

Are you experiencing that right now with a situation where you have created a new table index and previous queries against that table now function sub-optimally because they incorporate that index? And if so, are table statistics gathered fresh?

I will be back in a couple of hours! hope this helps.

Best regards
Harry
Re: A performance test of Oracle Analtyicals vs Group By [message #346932 is a reply to message #346761] Wed, 10 September 2008 02:25 Go to previous messageGo to next message
Olivia
Messages: 519
Registered: June 2008
Senior Member
Thanks for the update Harry...

Quote:

Are you experiencing that right now with a situation where you have created a new table index and previous queries against that table now function sub-optimally because they incorporate that index? And if so, are table statistics gathered fresh?




Yes, I have found such situation. I created an index after which I got better performance for so many queries but also noticed that some of the queries got affected by it.An index on a timestamp column was being created. And later again, I created another index combining some columns which contains this timestamp column to optimize some queries.Those SQL's were giving better performance after that but I also noticed that some got affected. I don't want that.

Yeah, statistics gathered fresh.


Regards,
Oli
Re: A performance test of Oracle Analtyicals vs Group By [message #347067 is a reply to message #346462] Wed, 10 September 2008 10:05 Go to previous message
harrysmall3
Messages: 109
Registered: April 2008
Location: Massachusetts
Senior Member
Ok, fully understand. Olivia- any chance you have an explain plan from before and after the performance drop/index creation point?

Even if not, If you have a sec to provide query, the plans, what the indexes (point out the new one) simply number of rows in relevent tables of query I think we can put our heads together and get things back the way they were pretty quick.

My thoughts -
(1) if you dont have previous explain plan before new index creation, we start with forcing no use on new index on the affected queries.

(2) Even if new index is the whole prob (why Oracle switched plans), this may not remedy as stats across the board have changed. (ideal, we want old stats back and run with block on new index).

(3) Regardless we need to take a peak at what Oracle's "goal" is here to see why it chose a sub-optimal plan, period. ie, Why the new index could cause this (and we havent fully isolated that directly it was the index).

So I think with the prob queries syntax - or equivalent if its productoi and rec counts per table and new access plans alone we can nail this. I want to change a couple things at session level but not first pass.

Look forward to the challenge- see if I learned the correct lessons from when I had to deal with the similar scenario in my system. Smile

Best Regards,
Harry
Previous Topic: Cursor versus select into for simple select
Next Topic: how to unpin a table from cache
Goto Forum:
  


Current Time: Sat Dec 14 13:45:07 CST 2024