Re: long running select min(timestamp) query

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Tue, 29 Jan 2008 15:17:10 -0800 (PST)
Message-ID: <623903e3-3844-428c-843c-57f51ea0be2d@e23g2000prf.googlegroups.com>


On Jan 29, 5:54 pm, joel garry <joel-ga..._at_home.com> wrote:
> On Jan 29, 1:36 pm, "bobdu..._at_gmail.com" <bobdu..._at_gmail.com> wrote:
> > On Jan 29, 2:53 pm, joel garry <joel-ga..._at_home.com> wrote:
>
> > > On Jan 29, 6:39 am, "bobdu..._at_gmail.com" <bobdu..._at_gmail.com> wrote:
>
> > > > On Jan 28, 7:17 pm, joel garry <joel-ga..._at_home.com> wrote:
>
> > > > > On Jan 28, 7:40 am, "bobdu..._at_gmail.com" <bobdu..._at_gmail.com> wrote:
>
> > > > > > Hi,
>
> > > > > > I have a basic event table in my 10g database, primary integer key,
> > > > > > and a nonnull timestamp "eventdate" field.  When executing the
> > > > > > following command:
>
> > > > > > select min(eventdate) from events;
>
> > > > > > It can occasionally take a VERY long time.  There is a standard btree
> > > > > > index on "eventdate", and roughly 20 other columns, a few of which
> > > > > > also have basic btree indexes.
>
> > > > > Well, I'm guessing that your 20 column index is just not the best one
> > > > > for your min.  What happens if you make a btree index on just that
> > > > > eventdate column?  It's been my experience that any table with more
> > > > > than a few columns in an index isn't properly normalized.
>
> > > > > There might be bug/patch issues, too.  timestamp is relatively new,
> > > > > and you are using an unpatched base release.
>
> > > > > How are you loading this table?  What is the exact definition of the
> > > > > index?
>
> > > > > jg
> > > > > --
> > > > > @home.com is bogus.http://www.spacefem.com/hydrox/
>
> > > > All the indexes on the table only span ONE column, not all!  Here's
> > > > the ddl (i've modified some names, none of the content has been
> > > > changed):
>
> > > Sorry 'bout that!
>
> > > Charles has bought up a good point about concurrency, and it makes me
> > > wonder if you've done mass deletes on this table.  Have you tried
> > > recreating the index?  (See Richard Foote's blog about indices, the
> > > general advice is you don't need to rebuild indices, but there are
> > > certain odd situations which you can and should check for if you are
> > > seeing noticeable issues.  I'm wondering if you've made a realllllly
> > > sparse index, and why Oracle wouldn't be smart enough to find the min
> > > right away.).
>
> > > I'm also idly wondering if delayed block cleanout can be involved
> > > (google for details).  Postulating a mass delete, followed by a simple
> > > inquiry that has to traverse all the deleted blocks and write them
> > > out.  If you are getting consistent results doing Charles' traces, it
> > > wouldn't be this.  But I'm focused on the description of "It can
> > > occasionally take a VERY long time" and am wondering what is different
> > > at those occasions.  Have you checked wait states at those times?
> > > Maybe there's more than one thing wrong here - for example, slow disks
> > > plus disks overwhelmed with with I/O while archiving logs plus
> > > concurrency issues.  Also, I have a vague thought of some process like
> > > MMON going nuts, are you seeing unexplained extremely high processor
> > > usage at times?
>
> > > > CREATE TABLE "MYSCHEMA"."EVENTS"
> > > > ( "EVENTID" NUMBER,
> > > > "EVENTDATE" TIMESTAMP(6) NOT NULL ,
> > > > "HOSTID" NUMBER NOT NULL ,
> > > > "DIRECTION" NUMBER NOT NULL ,
> > > > "INTERFACE" NVARCHAR2(17),
> > > > "FRAME" NUMBER NOT NULL ,
> > > > "SOURCEMAC" NVARCHAR2(17),
> > > > "DESTINATIONMAC" NVARCHAR2(17),
> > > > "PROTOCOL" NUMBER NOT NULL ,
> > > > "FLAGS" NVARCHAR2(64),
> > > > "SOURCEIP" NVARCHAR2(15),
> > > > "SOURCEPORT" NUMBER NOT NULL ,
> > > > "DESTINATIONIP" NVARCHAR2(15),
> > > > "DESTINATIONPORT" NUMBER NOT NULL ,
> > > > "PACKETSIZE" NUMBER NOT NULL ,
> > > > "TYPE" NUMBER NOT NULL ,
> > > > "ACTION" NUMBER NOT NULL ,
> > > > "ITEMID" NUMBER,
> > > > "REPEATCOUNT" NUMBER NOT NULL ,
> > > > "ENDEVENTDATE" TIMESTAMP(6),
> > > > "RANK" NUMBER,
> > > > "HOSTASSETVALUE" NUMBER(5),
> > > > "FILTERSEVERITYVALUE" NUMBER(5),
> > > > PRIMARY KEY ("EVENTID") VALIDATE )
> > > > TABLESPACE "USERS" PCTFREE 10 INITRANS 1 MAXTRANS 255 STORAGE
> > > > ( INITIAL 64K BUFFER_POOL DEFAULT) LOGGING
>
> > > > CREATE INDEX "MYSCHEMA"."EVENTS_EVENTDATE" ON
> > > > "MYSCHEMA"."EVENTS" ("EVENTDATE") TABLESPACE "USERS" PCTFREE 10
> > > > INITRANS 2 MAXTRANS 255 STORAGE ( INITIAL 64K BUFFER_POOL DEFAULT)
> > > > LOGGING LOCAL
>
> > > > Then there's a bunch of other indexes, 8 total, with the one above and
> > > > the one PK, 6 more look similar to the one below:
>
> > > > CREATE INDEX "MYSCHEMA"."EVENTS_ACTION" ON
> > > > "MYSCHEMA"."EVENTS" ("ACTION") TABLESPACE "USERS" PCTFREE 10 INITRANS
> > > > 2 MAXTRANS 255 STORAGE ( INITIAL 64K BUFFER_POOL DEFAULT) LOGGING
> > > > LOCAL
>
> > > jg
> > > --
> > > @home.com is bogus.
> > > "Weather forecasts are horoscopes with numbers."
>
> > I definitely have done some large scale deletes.  When i say
> > occasionally, there have been times when the query has been fast, but
> > it hasn't been fast in a long time...  likely since the last large
> > scale delete operation actually.  The CPU never spikes on the oracle
> > box, its definitely always blocking on IO.  Mind you, the disk is
> > slow, but since it is occasionally speedy, i can't blame it all on
> > disk (can i?).
>
> Sure, when all else fails blame the hardware!  :-)  (There are people
> who don't tune and just throw hardware at it - I've done that myself,
> it can work - but it is generally stupid without figuring out what is
> really going on).  I have seen failing hardware be slow and sometimes
> fast.
>
> I think we've ruled out the delayed block cleanout issue if it hasn't
> been fast in a long time, that would be an occasional thing, anything
> that touches all blocks, like getting stats without sampling, would
> clean it out.  And we've ruled out chaining.
>
> So:  http://richardfoote.wordpress.com/2007/12/11/index-internals-rebuildi...
>
> Also, please post your top waits, and continue on the path Charles has
> started, it may very well also lead to the solution(s).  I/O binding
> can be caused by many things, from actually needing lots of data to
> tuning to bad code.  It's most often bad code.  Mass deletes can be
> done many ways, some of which may be considered bad at times.  Searchhttp://asktom.oracle.comfor mass delete.  Sometimes a better way is
> to create a new table with what you want to keep, then rename.
>
> > I just tried to get more stats via the following call, and then
> > checked again for chained rows (not sure if it would've affected, but
> > tried anyways):
>
> > SQL> exec
> > DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'EVENTS,CASCADE=>TRUE)­­;
>
> > PL/SQL procedure successfully completed.
>
> > SQL> analyze table events list chained rows;
>
> > Table analyzed.
>
> > SQL> select owner_name, table_name, count(*)
> >   2  from chained_rows
> >   3  group by owner_name, table_name;
>
> > no rows selected
>
> jg
> --
> @home.com is bogus.http://www.signonsandiego.com/uniontrib/20080129/news_1b29lazar.html- Hide quoted text -

In addition to looking at the wait events, you may also want to further investigate session level statistics to see if undo blocks are being applied to produce a consistent view of the data. For example, before running your query, execute the following to obtain a beginning snapshot of the statistics:
SELECT
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.NAME IN (

    'consistent changes',
    'data blocks consistent reads - undo records applied',
    'consistent gets',
    'cleanouts and rollbacks - consistent read gets')
  AND SN.STATISTIC#=MS.STATISTIC#; Then, execute your query:
select min(eventdate) from events;

Then execute the statistics query again: SELECT
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.NAME IN (

    'consistent changes',
    'data blocks consistent reads - undo records applied',
    'consistent gets',
    'cleanouts and rollbacks - consistent read gets')
  AND SN.STATISTIC#=MS.STATISTIC#; Finally, manually subtract the beginning values from the ending values.

For example:

NAME                                                   VALUE
--------------------------------------------------- ----------
consistent gets                                        409934
consistent changes                                       1520
data blocks consistent reads - undo records applied      1468
cleanouts and rollbacks - consistent read gets              1

---

NAME                                                    VALUE
-------------------------------------------------- ----------
consistent gets                                        409940
consistent changes                                       1522
data blocks consistent reads - undo records applied      1470
cleanouts and rollbacks - consistent read gets              2

The 10046 trace for the query that I executed showed 8 consistent
reads.

CONSISTENT GETS: Requests for a Block Consistent Read - Statistic
counts the number of logical reads in consistent mode.

CONSISTENT CHANGES: Undo Records Applied to Provide Consistent Read -
Statistic indicates the number of times a database block has rollback
entries applied to perform a consistent read of the block.   Work
loads that produce a great deal of consistent changes can consume a
great deal of resources. The value of this statistic should be small
in relation to the 'consistent gets' statistic.

DATA BLOCKS CONSISTENT READS - UNDO RECORDS APPLIED: Undo Applied to
Block for a Consistent Read - Number of undo records applied to data
blocks that have been rolled back for consistent read purposes.

CLEANOUTS AND ROLLBACKS - CONSISTENT READ GETS: Undo/Block Cleanout
Required for Cons. Read - Number of consistent gets that require both
block rollbacks and block cleanouts.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
Received on Tue Jan 29 2008 - 17:17:10 CST

Original text of this message