High paging condition on RS/6000 SP under AIX 4.1.4

From: John Schneider <jdschn_at_ibm.net>
Date: 1997/08/12
Message-ID: <33F0B3C1.682A_at_ibm.net>


Greetings,

        I am working on a problem with an RS/6000 SP system running AIX 4.1.4 and PSSP 2.1.2.0. Four of the nodes run an Oracle database application on Oracle 7.3.2.3.0 with Oracle Parallel Query Server. The problem is that with parallelization turned up to run multiple processes on multiple nodes, the processes get into a weird high-paging state that they never break out of. We have a certain query, for example, that will return its answer in 30 seconds or less with parallelization of 1-1 (that is, 1 process on 1 machine). But if you turn the parallelization up to 4-1 (4 processes on 1 machine) or 4-4 (4 processes on 4 machines), the processes running the query begin to page very heavily. The page fault rate on each node running part of the query will climb as high as 10,000 page faults/second. The system monitor shows the CPU at about 80% kernel state and 20% or less user state, which is typical of a high-paging situation. Page space utilization climbs somewhat but never gets above 40-50%, so we are not out of paging space. The processes will stay in this state for hours until killed.

        When running with no parallelization (that is, 1-1), the query spends very little time in kernel state (around 10%) and the rest in user state, with no paging to speak of.
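        To make the contrast between the two symptom profiles concrete, here is a minimal sketch in Python. The thresholds, the names (Sample, looks_like_thrashing), and the exact serial-case figures (90% user, 0 faults/second) are illustrative assumptions based on the approximate numbers above, not AIX-defined values or measured output.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of CPU split and page fault rate on a node."""
    user_pct: float
    kernel_pct: float
    faults_per_sec: float


# Hypothetical cutoffs chosen for illustration only; not AIX tunables.
KERNEL_PCT_LIMIT = 50.0
FAULTS_PER_SEC_LIMIT = 1000.0


def looks_like_thrashing(s: Sample) -> bool:
    """Flag a sample where kernel time dominates and faults are heavy."""
    return (
        s.kernel_pct > s.user_pct
        and s.kernel_pct >= KERNEL_PCT_LIMIT
        and s.faults_per_sec >= FAULTS_PER_SEC_LIMIT
    )


# Approximate figures from the description: 1-1 versus 4-1 or 4-4.
serial = Sample(user_pct=90.0, kernel_pct=10.0, faults_per_sec=0.0)
parallel = Sample(user_pct=20.0, kernel_pct=80.0, faults_per_sec=10_000.0)

print("1-1 run flagged:", looks_like_thrashing(serial))
print("4-x run flagged:", looks_like_thrashing(parallel))
```

        Only the parallel profile trips both conditions, which matches the pattern described: kernel time dominating together with a very high fault rate, rather than either symptom alone.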

        My first thought, of course, was that it is a bug in Oracle. But Oracle came in, ran traces of the query, and took them back for analysis. They came back and said that they could see nothing wrong with their product. They concluded that either we needed to upgrade memory on the nodes, or it was an AIX/PSSP bug.

        The four nodes have 256MB of memory each, which is not much for an Oracle server, but I still don't think it could be simply memory, since the query runs fine with parallelization at 1-1. Why would turning on the parallel query code consume so much more memory that the paging rate goes through the roof?

        Does anyone know of an AIX or PSSP bug that could be related to this problem? We are pursuing the memory upgrade path, too, but don't want to throw a lot of money into memory if it won't resolve the problem. That seems like an expensive shot in the dark. Are there any Oracle patches that may address this?

        Please email me with any information you may have. My heartfelt thanks in advance.

John Schneider


John D. Schneider | Email: jdschn_at_ibm.net
Lowery Systems, Inc. | Phone: 314-349-4556
275 Axminister
Fenton, MO 63026

Opinions expressed herein are mine alone, and do not represent those
of my company, its affiliates, or lackeys. Anyone who says otherwise
is itching for a fight.
Received on Tue Aug 12 1997 - 00:00:00 CEST
