Re: ORA-03135 running plsql locally

From: Tim Gorman <tim_at_evdbt.com>
Date: Thu, 11 Oct 2012 10:23:27 -0600
Message-ID: <5076F27F.1070006_at_evdbt.com>



Mike,
Sorry for late response, hope I'm not entering this thread too late...

This error is a database issue (not a network issue) having to do with establishing a connection, so it can happen regardless of whether you are connecting via a network, via IPC, or even BEQ. This is not the termination of an established connection (which would be ORA-03113 or ORA-03114), but a connection that never happened. Chances are good someone has been fiddling with the SQLNET.ORA parameter INBOUND_CONNECT_TIMEOUT (default 60 secs) with the only result that it takes longer for you to get the error. Increasing a timeout rarely resolves anything, but instead masks the problem, as is likely the case here.

When a connection is being established, several SQL statements are run as SYS before authentication and authorization are complete and the session connects as the requested user account. During this time, while these SQL statements against the Oracle data dictionary are being run, the same kind of contention or blocking can occur which might happen to any SQL statement.

In your "alert.log" files are logged ORA-03136 error messages which correspond to the ORA-03135 messages that your shell script (and likely other end-user programs). The only useful thing about these logged ORA-03136 error messages is their timestamp. Please use that timestamp to query into the Active Session History views using a SQL*Plus script similar to the following...

    set echo on feedback on timing on pagesize 100 linesize 200 trimout     on trimspool on tab off
    col sample_time format a26
    col day format a12
    col min_sample_time format a5
    col max_sample_time format a5
    col inst# format 9990
    col sid format a10
    col state format a10
    col blksid format a10
    col blkstate format a10
    col time_waited format 999,999,990
    col event format a30

    col plsql_entry_object_id format 999990
    col plsql_entry_subprogram_id format 999990
    col plsql_object_id format 999990
    col plsql_subprogram_id format 999990
    col program format a30
    col module format a50
    col action format a40
    col entry_owner format a6 heading "Entry|Owner"
    col entry_name format a35 heading "Entry|Package.procedure"
    col owner format a6 heading "Owner"

    col name format a31 heading "Package.procedure"     col cnt format 999,990
    col duration format a20
    spool ashq

    clear breaks computes
    select sql_id,

               count(*) cnt,
               sum(decode(session_state, 'ON CPU', wait_time,
    time_waited))/1000000 time_waited
    from      gv$active_session_history /* dba_hist_active_sess_history */
    where     sample_time >= to_date('xxxxx','DD-MON-YYYY HH24:MI:SS')
    and       sample_time < to_date('yyyyy', 'DD-MON-YYYY HH24:MI:SS')
    group by sql_id
    order by time_waited desc;

    select event,

               count(*) cnt,
               sum(decode(session_state, 'ON CPU', wait_time,
    time_waited))/1000000 time_waited
    from      gv$active_session_history /* dba_hist_active_sess_history */
    where     sample_time >= to_date('xxxxx','DD-MON-YYYY HH24:MI:SS')
    and       sample_time < to_date('yyyyy', 'DD-MON-YYYY HH24:MI:SS')
    group by event
    order by time_waited desc;

    select event,

               sql_id,
               count(*) cnt,
               sum(decode(session_state, 'ON CPU', wait_time,
    time_waited))/1000000 time_waited
    from      gv$active_session_history /* dba_hist_active_sess_history */
    where sample_time >= to_date('xxxxx','DD-MON-YYYY HH24:MI:SS')
    and       sample_time < to_date('yyyyy', 'DD-MON-YYYY HH24:MI:SS')
    group by event, sql_id
    order by time_waited desc;

    select e.object_name || decode(e.object_name,'','','.') ||     e.procedure_name entry_name,

               p.object_name || decode(p.object_name,'','','.') ||     p.procedure_name name,

               x.sql_id,
               x.sql_plan_hash_value,
               x.module,
               x.event,
               x.time_waited,
               x.cnt
    from      dba_procedures    e,
               dba_procedures    p,
               (select    plsql_entry_object_id,
                          plsql_entry_subprogram_id,
                          plsql_object_id,
                          plsql_subprogram_id,
                          module,
                          sql_id,
                          sql_plan_hash_value,
                          event,
                          sum(decode(session_state, 'ON CPU', wait_time,
    time_waited))/1000000 time_waited,
                          count(*) cnt
                from      gv$active_session_history /*
    dba_hist_active_sess_history */
    where     sample_time >= to_date('xxxxx','DD-MON-YYYY HH24:MI:SS')
    and       sample_time < to_date('yyyyy', 'DD-MON-YYYY HH24:MI:SS')
                group by  plsql_entry_object_id,
                          plsql_entry_subprogram_id,
                          plsql_object_id,
                          plsql_subprogram_id,
                          module,
                          sql_id,
                          sql_plan_hash_value,
                          event) x
    where     e.object_id (+) = x.plsql_entry_object_id
    and       e.subprogram_id (+) = x.plsql_entry_subprogram_id
    and       p.object_id (+) = x.plsql_object_id
    and       p.subprogram_id (+) = x.plsql_subprogram_id
    order by 7 desc, 1, 2, 3, 4;

    select sample_time,

               session_id||'.'||session_serial# sid,
               session_state state,
               decode(session_state, 'ON CPU', wait_time, time_waited)
    time_waited,
    blocking_session||'.'||blocking_session_serial# blksid,
               blocking_session_status blkstate,
               user_id,
               sql_id,
               module,
               event,
               p1, p2, p3,
               action
    from      gv$active_session_history /* dba_hist_active_sess_history */
    where     sample_time >= to_date('xxxxx','DD-MON-YYYY HH24:MI:SS')
    and       sample_time < to_date('yyyyy', 'DD-MON-YYYY HH24:MI:SS')
    order by sample_time;

    spool off
    set echo off feedback 6 timing off linesize 130 tab on

...where the date/timestamp value of "xxxxx" is about 30-60 seconds prior to one or more of the timestamps of the ORA-03136 error(s) in the alert.log, and the date/timestamp value of "yyyyy" is about 10-20 seconds after that timestamp.

So, if you have a bunch of ORA-03136 errors logged on 02-OCT 2012 at 15:35:17, 15:35:22, 15:35:27, then you might want to replace "xxxxx" with the value of "02-OCT-2012 15:34:00" and "yyyyy" with the value of "02-OCT-2012 15:35:40", or something like that.

In the FROM clauses of each of the queries above, you'll notice that GV$ACTIVE_SESSION_HISTORY is used, while DBA_HIST_ACTIVE_SESS_HISTORY is commented out. If your queries return no rows for the requested SAMPLE_TIME values, then it means that the data has been rolled out of the memory-based GV$ view and is now stored in the table-based DBA_HIST view. The data in the GV$ view is much more granular (i.e. 1-second samples) so the information there is more useful, but retained for a shorter period of time. The data in the table-based DBA_HIST views is a little less granular (i.e. 10-second samples), but it is still quite useful for diagnosing problems like these.

Good luck and please feel free to share your findings with the list as you go? The causes of the problem might be quite subtle and require more data gathering, and I don't want to influence your investigations by suggesting possible causes. It is more useful to go where the ASH data leads, rather than trying to force a conclusion.

Good luck!

Thanks!

-- 
Tim Gorman
consultant -> Evergreen Database Technologies, Inc.
postal     => PO Box 352151, Westminster CO 80035
website    => http://www.EvDBT.com/
email      => Tim_at_EvDBT.com
mobile     => +1-303-885-4526
fax        => +1-303-484-3608
Lost Data? => http://www.ora600.be/ for info about DUDE...






On 10/7/2012 5:08 AM, David Roberts wrote:

> In this scenario It appears that the 'server side process' has failed,
> although that isn't always true.
> I suggest that you inspect the alert log and the dump directories to see if
> there are logs/dumps from the appropriate time that you can pass onto
> Oracle support or us for interpretation.
>
> Dave
>
> On Wed, Oct 3, 2012 at 8:18 PM, mike boligan <mboligan_at_yahoo.com> wrote:
>
>> Hi,
>> I am running Oracle 11.2.0.2.0 64 bit on Solaris 10. I have a process
>> that runs on the database server and calls a plsql package via a shell
>> script like so:
>>
>> sqlplus user/pass _at_scriptname parm1 parm2
>>
>> The package runs for roughly 2600 seconds and sometimes gets a ORA-03135.
>> The package does not fail in the same spot every time. It does not seem to
>> fail after x minutes either. I do have sqlnet.expire_time in sqlnet.ora
>> on the server, not sure that helps any as this should be using IPC. I do
>> see alot of log switching while the job runs, but I don't see how the
>> increased activity could cause this error.
>>
>> Maybe I am missing something, but I don't think I should get this error
>> when running a script locally on the database server. Has anyone ever seen
>> a similar situation or fill me in on anything I am overlooking? I am
>> trying avoid a sqlnet trace because of space concerns on where the output
>> would go.
>>
>> TIA,
>> Mike
>>
>> --
>> http://www.freelists.org/webpage/oracle-l
>>
>>
>>
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>
>
>
-- http://www.freelists.org/webpage/oracle-l
Received on Thu Oct 11 2012 - 18:23:27 CEST

Original text of this message