Path: dp-news.maxwell.syr.edu!spool.maxwell.syr.edu!drn.maxwell.syr.edu!news.maxwell.syr.edu!postnews.google.com!z14g2000cwz.googlegroups.com!not-for-mail
From: "EdStevens" <quetico_man@yahoo.com>
Newsgroups: comp.databases.oracle.server
Subject: CBO influences
Date: 9 Sep 2005 09:26:20 -0700
Organization: http://groups.google.com
Lines: 35
Message-ID: <1126283179.991124.138480@z14g2000cwz.googlegroups.com>
NNTP-Posting-Host: 209.64.87.68
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Trace: posting.google.com 1126283185 15851 127.0.0.1 (9 Sep 2005 16:26:25 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Fri, 9 Sep 2005 16:26:25 +0000 (UTC)
User-Agent: G2/0.2
X-HTTP-UserAgent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6,gzip(gfe),gzip(gfe)
Complaints-To: groups-abuse@google.com
Injection-Info: z14g2000cwz.googlegroups.com; posting-host=209.64.87.68;
   posting-account=U6cb-w0AAABHLCv5oskKmlFGXS7ueXvN
Xref: dp-news.maxwell.syr.edu comp.databases.oracle.server:251179

Platform:  8.1.7 and 9.2.0.1 on various Win2k or Win2003

Last week a developer brought me a query that was running
satisfactorily in prod (8.1.7), but took 45 minutes in test (9.2.0).

As it happens, we are in the middle of transitioning a group of
databases from 8i to 9i, and so have several copies of this database
running under both versions, on a variety of hardware, ranging from
honking big 8-cpu blade servers to rusty old 2-cpu doorstops with mice
nesting on the motherboard.

Pulling an extended trace showed that the test box was spending 99+
percent of its time on 'hs message to agent' events. (The query
includes a join to a DB2 table, via the Oracle Transparent Gateway.)
At first I suspected 8i vs. 9i gateway issues, but as we have tested on
various platforms, we have eliminated that.

As we tried to narrow down the variables, we discovered that the poor
execution (high HS message events) only occured on one 9.2 database,
running on a server that was in the middle of the range of available
hardware.  We found that on every other system, the optimizer was
generating an access plan that featured several hash joins, where the
one poor performing system generates nested loops.  Even when we
export/import the entire schema from the 'bad' 9.2 box to a 'good' 9.2
box.  Initialization parms idendtical, data structure and volume
identical.  And yet access plans are different, with disasterous
results.

So ... at this point I'm grasping for what other factors could
influence the CBO.  Where do I look for a reasonably detailed
discussion of such?  The qeury is rather lengthy, and explain plans
never format well here.  And I wouldn't expect anyone to pore over
them.  Just looking for some pointers on the kinds of things I should
be looking at.  At this point I'm out of ideas.