unexplained performance degradation of PL/SQL procedures (Oracle 10gR2, Solaris 10)
unexplained performance degradation of PL/SQL procedures [message #553106] Tue, 01 May 2012 23:47
apaju
Messages: 6
Registered: May 2012
Junior Member
Hi,
Decided to join and post here because managed to solve many Oracle db related problems with this FAQ.
My application is kind of a FIFO-based data collection engine doing a lot of MIN/AVG/MAX calculation within PL/SQL procedures, using "insert into table select ... group by x,y,z" type SQL. New data comes in at (for example) ~1000 rows/sec with around 50 bytes per row. Old data is purged after a certain retention period, such as 30 days. The source data tables can grow to 1000 million rows, with timestamp indexing. There is a rollup once every 5 minutes: the new data is analyzed and the PL/SQL procedure is fired up. Thus it reads 1000 * 5 * 60 = 300,000 rows and performs the calculation once every 5 minutes.

Recently I have spent hundreds of hours on performance optimization, analyzing the workload with AWRs etc., and managed to make it quick by optimizing the joins and adding Dynamic Sampling. When the app and database are started, the procedure execution time is around 5 seconds, but then after [15 minutes...24 hours] it suddenly jumps, and it can actually jump 10x or even 50x. As a test case I dropped the PL/SQL procedures and did the same work in a [begin...end] block; that stays stable longer, around 7 days, but finally the slowdown happens, and when it does, doing "alter system flush shared_pool" makes it fast again temporarily. This does not make sense, because the load is constant and the database size stays constant.
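One thing I could check is whether the slowdown coincides with a change of execution plan; a sketch of the queries I would run (the substitution variables are placeholders):

```sql
-- Does the slowdown coincide with a plan change? List child cursors for the
-- statement ('&sql_text_fragment' is a placeholder for part of the INSERT):
SELECT sql_id, plan_hash_value, executions,
       ROUND(elapsed_time / NULLIF(executions, 0) / 1e6, 2) AS avg_elapsed_sec
FROM   v$sql
WHERE  sql_text LIKE '%&sql_text_fragment%';

-- Show the plan actually used by a suspect cursor:
SELECT * FROM TABLE(dbms_xplan.display_cursor('&sql_id'));
```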

It is not an IO issue, because I am using fast SSDs, and actually the AWR and iostat figures indicate heavy activity only during the deletes. So, the problem is that exactly the same rollup takes 10-50 times more CPU cycles tomorrow than it does today. The Oracle database initialization parameters:

background_dump_dest /export/home/oracle/admin/testdb/bdump
compatible 10.2.0.1.0
control_files /oradata/control01.ora, /oradata/control02.ora, /oradata/control03.ora
core_dump_dest /export/home/oracle/admin/testdb/cdump
db_block_size 8192
db_file_multiblock_read_count 16
db_files 80
db_name testdb
global_names TRUE
instance_name testdb
log_archive_dest_1 location=/backup/oracle/arch/archive
log_buffer 14330368
log_checkpoint_interval 10000
max_dump_file_size 10240
processes 180
service_names testdb
sga_target 8589934592
timed_statistics TRUE
undo_management AUTO
user_dump_dest /export/home/oracle/admin/testdb/udump

Any ideas what I should try to prevent this? Thank you


Re: unexplained performance degradation of PL/SQL procedures [message #553111 is a reply to message #553106] Wed, 02 May 2012 00:13
BlackSwan
Messages: 22897
Registered: January 2009
Senior Member
Please read and follow the forum guidelines, to enable us to help you:

http://www.orafaq.com/forum/t/88153/0/


does this mythical code utilize BIND variables?

http://www.orafaq.com/forum/m/433888/136107/#msg_433888
Please refer to URL above & be sure to provide the details requested:
1) DDL for all tables & indexes
2) EXPLAIN PLAN
3) output from SQL_TRACE & tkprof
Re: unexplained performance degradation of PL/SQL procedures [message #553121 is a reply to message #553106] Wed, 02 May 2012 01:15
John Watson
Messages: 4669
Registered: January 2010
Location: Global Village
Senior Member
Your problem could be statistics. You say that you are using optimizer_dynamic_sampling (what level? I usually set it to 4) but sampling occurs only once, when the statement is parsed. So if your table size changes as time goes by, the exec plans may no longer be appropriate. That would explain why flushing the shared pool (which will mean everything must be re-parsed with current statistics) helps.
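If stale statistics do turn out to be the cause, one countermeasure is to re-gather them with cursor invalidation, so that the statement is re-parsed against current data. A sketch, with an illustrative table name:

```sql
-- Re-gather statistics and invalidate dependent cursors so the next
-- execution is parsed against current data:
BEGIN
  dbms_stats.gather_table_stats(
    ownname          => USER,
    tabname          => 'TMETER',
    estimate_percent => dbms_stats.auto_sample_size,
    cascade          => TRUE,    -- gather index statistics too
    no_invalidate    => FALSE);  -- force dependent cursors to re-parse
END;
/
```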

But this is only one hypothesis. You would have to test it by capturing execution plans at various times.

Can you attach a couple of AWR reports, for good and bad periods? For the instance and the statements?
Re: unexplained performance degradation of PL/SQL procedures [message #553142 is a reply to message #553121] Wed, 02 May 2012 04:33
apaju
Messages: 6
Registered: May 2012
Junior Member
I am not posting the whole procedure because it is quite big, but below is the part of it that always starts getting slower when the problem hits:

CREATE OR REPLACE PROCEDURE CALC_APA(TSTAMP IN NUMBER, PER IN NUMBER)
AS
  STARTMSEC CONSTANT NUMBER := (TSTAMP - PER) * 1000;
  ENDMSEC   CONSTANT NUMBER := (TSTAMP + PER) * 1000;
  STARTSEC  CONSTANT NUMBER := TSTAMP;         -- not used in this excerpt
  ENDSEC    CONSTANT NUMBER := TSTAMP + PER;   -- not used in this excerpt
BEGIN
  INSERT INTO RSC (A_ID, TIMESTAMP, SCOUNT)
  SELECT A_ID, TSTAMP AS TIMESTAMP, NVL(C, 0)
  FROM  (SELECT /*+ DYNAMIC_SAMPLING(8) */ COUNT(*) C, SRC, DST, A_ID
         FROM   TMETER
         LEFT JOIN CM_VIEW
                ON TMETER.SRC = CM_VIEW.SOURCE
               AND CM_VIEW.SOURCE = CM_VIEW.DESTINATION
         WHERE  TMETER.TIMESTAMP > STARTMSEC
           AND  TMETER.TIMESTAMP <= ENDMSEC
           AND  TMETER.TESTID = CM_VIEW.TESTID
           AND  CM_VIEW.FSRPARAM = PER
         GROUP BY SRC, DST, A_ID);
END;
/


The other queries in the procedure are more complex but follow much the same structure, with a group by on 3 columns. I am actually quite sure that what the queries calculate does not matter, because what really seems to take CPU is getting the data from that TMETER table, or actually from its index, since in this case all the columns in the where clause and join condition are in the index. It is a big table (up to 1000 million rows) which is quite stable in size; it is not growing or shrinking, but the data churns constantly (FIFO).

The procedure is called through JDBC. I have made debug runs that do not call the procedure but perform exactly the same query, so that it is parsed without bind variables, and that variant seems to be more stable, but it eventually slows down as well. There are not that many queries, but their nature is more like DW: they are massive and read/write a lot. The execution takes several seconds in any case, so hard parsing before every query is probably not the issue. Dynamic Sampling level is currently 8, as you see.


*BlackSwan added {code} tags. Please do so yourself in the future.

[Updated on: Wed, 02 May 2012 08:21] by Moderator


Re: unexplained performance degradation of PL/SQL procedures [message #553171 is a reply to message #553142] Wed, 02 May 2012 08:24
BlackSwan
Messages: 22897
Registered: January 2009
Senior Member
>I have made debug runs that do not call the procedure but
>performs exactly the same query so that it is parsed without bind variables

could be impacted by Bind Variable Peeking.

Which columns are specified by Bind Variables?
Re: unexplained performance degradation of PL/SQL procedures [message #553510 is a reply to message #553171] Sat, 05 May 2012 21:04
rleishman
Messages: 3700
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Is one of those tables in the join periodically flushed and reloaded?

According to the Oracle Manual for the DYNAMIC_SAMPLING hint:
Quote:
If a cardinality statistic already exists for the table, then the optimizer uses it. Otherwise, the optimizer enables dynamic sampling to estimate the cardinality statistic.


So, if you accidentally gathered PERMANENT statistics at a time when the table was empty, the cardinality would be LOCKED IN at 0. In such a case, CBO might well choose a CARTESIAN join, which would explain the high CPU.
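A sketch of how you might check for that, and recover from it (schema/table names are from the earlier post; this assumes you want dynamic sampling to take over again):

```sql
-- Were permanent statistics gathered while the table was empty?
SELECT table_name, num_rows, last_analyzed
FROM   user_tables
WHERE  table_name = 'TMETER';

-- If num_rows is 0 or badly wrong, deleting the stats lets the
-- DYNAMIC_SAMPLING hint take effect again:
BEGIN
  dbms_stats.delete_table_stats(ownname => USER, tabname => 'TMETER');
END;
/
```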

There is some talk in this AskTom thread about DYNAMIC_SAMPLING and cardinality. If this is your problem, it appears there might be another undocumented hint you can use to re-sample the cardinality.

Ross Leishman
Re: unexplained performance degradation of PL/SQL procedures [message #553555 is a reply to message #553510] Sun, 06 May 2012 14:11
apaju
Messages: 6
Registered: May 2012
Junior Member
I changed the procedure so that it no longer uses the constants but takes STARTMSEC and ENDMSEC as IN arguments, which (as far as I have understood?) makes them automatically handled as bind variables when the procedure is called. However, the query's execution time still jumps and does not drop until the SHARED_POOL is flushed. According to the AWR timed model stats, hard parsing takes only 0.50 sec within the 15-minute period I have set for the AWRs, so I do not think the problem is due to not using bind variables.

According to the AWRs, what I actually see as the difference between quick and slow executions is the number of buffer gets. When the query is slow, the number of gets for this query is ~10 times higher than when it is quick. The TMETER table is joined with an MVIEW which is refreshed on commit; it contains only a few thousand rows and is unchanged during this test run.
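To see the buffer-gets difference per plan rather than per AWR period, something like this should work (the substitution variable is a placeholder):

```sql
-- Compare logical I/O per execution across plans for the suspect statement:
SELECT plan_hash_value, executions,
       ROUND(buffer_gets / NULLIF(executions, 0)) AS gets_per_exec
FROM   v$sql
WHERE  sql_id = '&sql_id';
```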


Re: unexplained performance degradation of PL/SQL procedures [message #553556 is a reply to message #553555] Sun, 06 May 2012 14:20
BlackSwan
Messages: 22897
Registered: January 2009
Senior Member
Please read and follow the forum guidelines, to enable us to help you:

http://www.orafaq.com/forum/t/88153/0/
Re: unexplained performance degradation of PL/SQL procedures [message #553578 is a reply to message #553556] Mon, 07 May 2012 01:06
apaju
Messages: 6
Registered: May 2012
Junior Member
Oracle version is 10.2.0.1, Standard Edition One
HW is Oracle SPARC T3-1
OS is Solaris 10u9
Re: unexplained performance degradation of PL/SQL procedures [message #553597 is a reply to message #553578] Mon, 07 May 2012 02:13
John Watson
Messages: 4669
Registered: January 2010
Location: Global Village
Senior Member
You say you are using Standard Edition One. I think you will find that using AWR is illegal; you must check that. So you can use only statspack. Gather level 7 statspack snapshots when the statement is running well (after flushing the shared pool?) and again when it is running badly. Then generate the statspack SQL report for the statement at both times, and compare the two. That should help you decide whether to nail down the execution plan with a stored outline.
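A sketch of that workflow, assuming statspack is installed and you run it as PERFSTAT:

```sql
-- Statspack equivalent of an AWR snapshot, at level 7 (SQL detail):
EXEC statspack.snap(i_snap_level => 7);   -- snapshot while running well
-- ... wait until the statement is slow, then:
EXEC statspack.snap(i_snap_level => 7);   -- snapshot while running badly
-- Generate the per-statement SQL report between the two snapshot ids:
-- SQL> @?/rdbms/admin/sprepsql.sql
```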
Re: unexplained performance degradation of PL/SQL procedures [message #553602 is a reply to message #553597] Mon, 07 May 2012 02:44
apaju
Messages: 6
Registered: May 2012
Junior Member
That is new to me; I thought that in lab/dev environments you can use whatever tools are available, just not at the end customer, which uses Standard One embedded. This is a dev environment; the end customer is not interested in AWR reports...
Re: unexplained performance degradation of PL/SQL procedures [message #553613 is a reply to message #553602] Mon, 07 May 2012 04:23
John Watson
Messages: 4669
Registered: January 2010
Location: Global Village
Senior Member
Er.... was the idea to use the tools on dev, and then make some really skillful guesses on production? It doesn't work like that! You must talk to your Oracle account manager; your customer will not thank you if you take them out of compliance.
Re: unexplained performance degradation of PL/SQL procedures [message #553615 is a reply to message #553613] Mon, 07 May 2012 04:48
apaju
Messages: 6
Registered: May 2012
Junior Member
Not sure if we are talking about the same thing, but my idea was to make the SQL code as robust, stable and quick as possible, so that the end customer does not need to care about performance issues, AWRs and anything related to them. Probably this is the wrong forum then... Thanks anyway. I already talk to the Oracle account manager very often...