Re: High paging condition on RS/6000 SP under AIX 4.1.4

From: David Luner <Luner_at_VNET.IBM.COM>
Date: 1997/08/13
Message-ID: <33f1148a.3922633_at_news.binc.net>#1/1


John Schneider <jdschn_at_ibm.net> wrote:

>[...]
>AIX 4.1.4 and PSSP 2.1.2.0. Four of the nodes are running an Oracle
>database application running Oracle 7.3.2.3.0 and Oracle Parallel
>Query Server. The problem is that with parallelization turned up to
>run multiple processes on multiple nodes, the processes get in to a
>weird high-paging state that they never break out of.

What exactly is paging? filemon will show you the most active segments.

>We have a certain
>query, for example, that will return it's answer in 30 seconds or less
>with parallelization of 1-1 (that is, 1 process on 1 machine). But if
>you turn the parallelization on to 4-1 ( 4 processes on 1 machine) or
>4-4 ( 4 processes on 4 machines), the processes running the query begin
>to page very heavily.

What's the query execution plan?

>The page fault rate on each node running part
>of the query will climb as high as 10,000 page faults/second.

Page faults, as shown in SAR or VMSTAT, do not necessarily mean true paging. Any access to a JFS file *requires* a (hardware) page fault to access the file's segment registers. Exactly which metric are you seeing this high?
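One rough way to separate the two cases is to take two `vmstat -s` snapshots a known interval apart and compare the growth in address translation faults with the growth in paging space page ins/outs. The sketch below is only an illustration: the counter labels are what I would expect `vmstat -s` to print (check them against your own output), and the sample numbers are made up, not taken from the system described above.

```python
import re

# Counter labels as `vmstat -s` is assumed to print them; verify on your system.
FAULTS = "total address trans. faults"
PS_IN = "paging space page ins"
PS_OUT = "paging space page outs"

def parse_vmstat_s(text):
    """Map each counter label in `vmstat -s` style output to its integer value."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters

def paging_rates(before, after, seconds):
    """Per-second rates of faults and paging-space I/O between two snapshots."""
    a, b = parse_vmstat_s(before), parse_vmstat_s(after)

    def rate(key):
        return (b[key] - a[key]) / seconds

    return {
        "faults_per_s": rate(FAULTS),
        "ps_in_per_s": rate(PS_IN),
        "ps_out_per_s": rate(PS_OUT),
    }

# Hypothetical snapshots taken 10 seconds apart (invented numbers).
SAMPLE_BEFORE = """
  1000000 total address trans. faults
    50000 page ins
    60000 page outs
      100 paging space page ins
      200 paging space page outs
"""
SAMPLE_AFTER = """
  1100000 total address trans. faults
    58000 page ins
    60500 page outs
      150 paging space page ins
      200 paging space page outs
"""

if __name__ == "__main__":
    print(paging_rates(SAMPLE_BEFORE, SAMPLE_AFTER, 10))
```

If the fault rate is in the thousands per second while the paging space rates stay near zero, the faults are probably not going to paging space, and file access is a more likely suspect.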

>Looking
>at the system monitor shows the CPU at about 80% kernel state, 20% or
>less user state, which is typical in a high paging sort of situation.
>The page space utilization climbs somewhat, but never gets above 40-50%,
>so we are not out of paging space. The processes will stay in this state
>for hours until killed.
>[...]

I agree that this, canonically, "isn't good" and there should be some explanation. How is your SGA sized on each of the nodes? Are you sure you're not forcing the Oracle Server to "ping pong", or is this a "read only" query? What about locking implications?

Contact me off-line for more.

- David
Received on Wed Aug 13 1997 - 00:00:00 CEST
