RE: RE: controlfile schema global enqueue lock

From: Joan Hsieh <Joan.Hsieh_at_mirror-image.com>
Date: Wed, 07 Feb 2001 11:34:02 -0800
Message-ID: <F001.002ADF73.20010207102103@fatcity.com>

Hi Steve,

I still couldn't figure out why we have cf enqueue lock. As you suggest, I run your script repeatedly to see how long the problem persists. All the cf lock associated locks are tx and tm. Here I choose 4 sid to run. I didn't find any st lock, but we do have ts and MR lock.

RESOURCE SID SERIAL# HOLDING WANTING SECONDS

----------------- ---- ------- ------- ------- -------
CF-0-0              98   19700               X       1
TM-2750-0           98   19700      SX              10
TX-262220-157260    98   19700       X              10
CF-0-0             101      72               X       2
TM-2750-0          101      72      SX              14
TX-393292-158320   101      72       X              14
CF-0-0             102      31               X       2
TM-2750-0          102      31      SX             329
TX-131099-158619   102      31       X             328
CF-0-0             111     168               X       3
TM-2750-0          111     168      SX            1007
TX-720901-158511   111     168       X            1007


RESOURCE              SID HOLDING WANTING    SECONDS
-------------------- ---- ------- ------- ----------
MR-89-0              DBW0       S             516300
MR-9-0               DBW0       S                  0
MR-90-0              DBW0       S             515678
MR-91-0              DBW0       S              91178
MR-92-0              DBW0       S              91178
MR-93-0              DBW0       S              90454
RT-1-0               LGWR       X                  0
TM-2750-0             286      SX                 22
                     1230      SX                  0
                     1135      SX                  0
                     1062      SX                  0

RESOURCE              SID HOLDING WANTING    SECONDS
-------------------- ---- ------- ------- ----------
TS-3-16866818        SMON      SX             517376

"Looking at this, the sessions are not waiting on the TM and TX locks, but are holding them. The sessions are waiting on the CF lock (longest wait= 3 seconds)."

"The sessions are holding the TX and TM locks for a long period of time, but other sessions are not waiting on them, so doesn't appear to be an issue."

Can you explain how the timings shown above are computed? If they are by process, then how can the CF lock for sid 111 be only 3 seconds, where as the TX lock for sid 111 is 1007 seconds?

Thanks,

Joan

-----Original Message-----
Sent: Monday, February 05, 2001 3:26 PM
To: Multiple recipients of list ORACLE-L

Hi Joan,

You don't have a problem unless there is a session WANTING a lock on the resource. By "something interesting", I meant a session waiting for a lock on
either the CF or ST enqueues. Then you run it repeatedly to see how long the problem persists. Because you are going to run it repeatedly, it would be good
to hack the script to exclude all other enqueue types. Also, I note that you are
not running the new version of this script that gives the names of the background processes instead of their SIDs.

@ Regards,
@ Steve Adams
@ http://www.ixora.com.au/
@ http://www.christianity.net.au/

-----Original Message-----
Sent: Tuesday, 6 February 2001 1:11
To: Multiple recipients of list ORACLE-L

Steve,

Thank you so much for taking time to reply. I did run couple of times the enque_lock script as you suggested. CF waiting is really minimum. I noticed the maximum enqueue comes from MR which is Media Recovery in share lock. Should I need to concern this or this is just a very normal symptoms for the database nature?
The real exclusive lock comes from TX. Right now I am starting to look into it.

MR-8-0                        2       S                  0
MR-80-0                       2       S             335666
MR-81-0                       2       S             335238
MR-82-0                       2       S             334850
MR-83-0                       2       S             334473
MR-84-0                       2       S             334094

RESOURCE                    SID HOLDING WANTING    SECONDS
-------------------- ---------- ------- ------- ----------
MR-85-0                       2       S             333672
MR-86-0                       2       S             333131
MR-87-0                       2       S             332593
MR-88-0                       2       S             331998
MR-89-0                       2       S             331386
MR-9-0                        2       S                  0
MR-90-0                       2       S             330764
RT-1-0                        3       X                  0
TM-2750-0                  1459      SX               1262
                            406      SX                987
                            987      SX                968

RESOURCE                    SID HOLDING WANTING    SECONDS
-------------------- ---------- ------- ------- ----------
TM-2750-0                  1474      SX                957
                            462      SX                411
                            286      SX                336
TM-3196-0                  1318      SX                  0
TS-3-16866818                 5      SX             332462
TX-1048594-158974          1208       X                  1
TX-1048595-158908          1170       X                  0
TX-1048640-159199           475       X                  1
TX-1835008-159738          1474       X                957
TX-1835015-159892          1172       X                 64

Thanks,

Joan

-----Original Message-----
Sent: Friday, February 02, 2001 5:40 AM
To: Multiple recipients of list ORACLE-L

Hi Joan,

This filtered into my Inbox folder because you mentioned my name. I've not been
following the thread, but I'll comment on the output below.

Normally when seeing CF and ST enqueue waits and a lot of time spent in 'enqueue' waits I would suspect the ST enqueue to be the problem, rather than
the CF enqueue. CF enqueue waits are seldom longer than half a second or so. The
typical situation is one session doing a controlfile transaction and others waiting to read the controlfile. It is relatively rare to need to make two independent changes to the controlfile at the same time. By contrast, the ST enqueue is held for many seconds when it is taken, and the waiters always want
an exclusive lock too.

I suggest that you get a few samples of the enqueue waits using my 'enqueue_locks.sql' script. The new version reports the name of any background
process involved rather than just its sid, which might be helpful here. You may
also want to hack the script so that it only shows ST and CF enqueues. Then run
it intermittently until you see something interesting. Then run it several times
in quick succession to see what happens. If I'm right you see that the CF waits
are relatively brief, whereas the ST lock contention will be relatively enduring. If so, I'm sure you'll know what to do.

If it is the CF enqueue that is the source of most of the waits, then knowing
which sessions are the holders and waiters will be a help in further diagnosis
anyway.

@ Regards,
@ Steve Adams
@ http://www.ixora.com.au/
@ http://www.christianity.net.au/

-----Original Message-----
Sent: Friday, 2 February 2001 2:21
To: Multiple recipients of list ORACLE-L

Dick,
The system is not in archivelog mode. If the database crashed, I was told they don't care at all. They just rebuilt a shell database and let getter and feeder processes gradually feed the database. (The getter will check the url in the database, if didn't find then will go to the origin server to get the url) Of course, it will impact the performance. The end user at that region will experienced the slow access. We just have 5 tables, spilt into 22 separate raw disks. No i/o problem at all. Now the problems is where is control file enqueue come from? I thought since we have too many log switch can cause the cf enqueue problem. But that is not true either. Right now we don't have too many 4 per hr. still have the enqueue problem. We did'nt set any slave parameter. Oh, we set parallel_max_servers to 5, all the tables have 50 freelist and we have 8 getter and feeder processes running all the time.

I run couple Steve Adams scripts,

SQL> @enqueue_stats.sql

TY GETS WAITS
-- ---------- ----------

CF    3272862     679426
CI        140          0
CU       4071          0
DL         29          0
DR         27          0
DV         83          0
HW    3646713        437
IS      22669          0
MR        158          0
PE       5337          0
PR         15          0
PS         65          0
RT          1          0
SQ       1902          0
ST       5763       4692
TM   12264283          0
TS         57          0
TT         28          0
TX   12099800         19
US       5781          0

SQL> @response_time_breakdown.sql

MAJOR    MINOR         WAIT_EVENT                                SECONDS
-------- ------------- ---------------------------------------- --------
CPU time parsing       n/a                                          4920
         reloads       n/a                                            70
         execution     n/a                                         74532

disk I/O normal I/O    db file sequential read                    373254
         full scans    db file scattered read                       2038
         direct I/O    direct path read                            80465
                       direct path write                           36378
         other I/O     control file parallel write                 15312
                       control file sequential read                 9082
                       db file parallel read                         537

waits    DBWn writes   rdbms ipc reply                               136
                       local write wait                                1
         LGWR writes   log file switch completion                  12681
         enqueue locks enqueue                                   5716812
         other locks   latch free                                  82345
                       library cache pin                           51513
                       buffer busy waits                           27627
                       row cache lock                               7403
                       library cache lock                             15
                       buffer deadlock                                 1
                       index block split                               0

latency  commits       log file sync                              165491
         network       SQL*Net more data from client              101088
                       SQL*Net more data to client                  1143
                       SQL*Net message to client                     457
                       SQL*Net break/reset to client                 287
         file ops      file open                                      37
         process ctl   process startup                                 1
         misc          refresh controlfile command                 13197
                       reliable message                                0

This database bounced yesterday. So the data should be refresh. We pined all the sys package into share pool.

Thanks,

Joan

-----Original Message-----
Sent: Thursday, February 01, 2001 10:59 AM To: Joan Hsieh; Multiple recipients of list ORACLE-L

Joan,

HUMMMMMMM! Lots to think about here. For one, if your not doing any backups
what is the plan for when a disk drive fails? Predicting that is similar to predicting the next asteroid impact on the earth, it's not a matter of if, but
when. After that drive gets replaced, something has to be put back down there.
OH, your on a raid or mirrored system!! That's nice, same scenario, just more
drives to worry about.

But we digress. First, is this system in archivelog mode? If your not doing backups, why archive the redo information? Are each of these redo files
on separate drives, or is that the hottest drive in the system? Might be a good
idea to "spread the wealth". Is there only one control file or more than one?
are they on the same drive? are they mixed up with DB files? is one or more on
very busy drives? What is the setting of DBWR_IO_SLAVES and DB_WRITER_PROCESSES? I'd also have a discussion with the applications people. While I understand
that stuff on the Internet is constantly changing, all of it is not changing every hour. News stuff normally has a 24 hour life expectancy. What will enhance the overall performance of your site would be a process that accumulates
historical life expectancies for various types of content there by allowing the
processes that refresh your url data to have some intelligence built into them.

Dick Goulet

____________________Reply Separator____________________
Author: "Joan Hsieh" <Joan.Hsieh_at_mirror-image.com>
Date:       2/1/2001 10:25 AM

Dick,

The database size is 200Gb (datafiles), not including the redolog. This is a internet content delivery company. The database structure is very simple. Some processes called getter continually insert the http site to the database. Other processes keep check the url is expired or not. Based on that, will do delete the url. So we do have a lot of insert, update (update the expire date) and delete. The interesting thing is we don't do backup. even not cold backup. No snapshot either. Since the content of internet is keeping change. There is no point to backup. The major thing is performance. We all use raw disk on sun 5.6 or HP. The db_block_size is 8k, db_block_buffer is 50000. The hit ratio is around 85%.

Joan

-----Original Message-----
Sent: Thursday, February 01, 2001 9:33 AM To: Joan Hsieh; Multiple recipients of list ORACLE-L

Joan,

Something is indeed very fishy in London. The first question I would ask is
what is creating so much redo generation? 9x250MB = 2.1GB of data changes inside of that hour. One thought may be that the database is still in hot backup mode from a previously disturbed backup. The other is that several users
are doing a lot of temporary table creation, but not in the temp space.

What is the size of this database? Where does incoming data come from?
Are their a lot of very frequently refreshed snapshots? Do they do a full refresh vs a fast refresh? What is the db_block_buffer hit ratio like? Having
a low hit ratio can indicate an excessively busy dbwr writing dirty blocks to
disk which can cause lots of check points.

Dick Goulet

____________________Reply Separator____________________
Author: "Joan Hsieh" <Joan.Hsieh_at_mirror-image.com>
Date:       2/1/2001 6:05 AM

Dear Listers,

Our database in London has tremendous cf enque lock. Since I am new here. I checked the parameter found the log_checkpoint_interval set to 3200 and log_checkpoint_timeout set to default (1800 sec). So I suggest to set log_checkpoint_interval to 100000000 and log_checkpoint_timeout to 0. The second thing I found we have average log switch is 6 t0 8 per hour. We have 40 redo logs, each of them is 250m. Our log buffer set to 1m. I believe after we changed the parameter, the control file schema global enqueue lock should be released. But it get worse, we have 98% control file enqueue lock now. I think we have too many log switch (9 per hour now)and suggested to increase the log to 500m, but our principle dba is not convinced, he think log buffer size should play a more important role.

Any ideas,

Joan

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Joan Hsieh
  INET: Joan.Hsieh_at_mirror-image.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Joan Hsieh
  INET: Joan.Hsieh_at_mirror-image.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Steve Adams
  INET: steve.adams_at_ixora.com.au

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Joan Hsieh
  INET: Joan.Hsieh_at_mirror-image.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Steve Adams
  INET: steve.adams_at_ixora.com.au

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Joan Hsieh
  INET: Joan.Hsieh_at_mirror-image.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

Received on Wed Feb 07 2001 - 13:34:02 CST