Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Why does Oracle sort hashjoin output before parallel insert?

Re: Why does Oracle sort hashjoin output before parallel insert?

From: Jaap W. van Dijk <j.w.vandijk.removethis_at_hetnet.nl>
Date: Fri, 12 Mar 2004 23:40:22 +0100
Message-ID: <ft9450hd664fn43m6i3q3rmru2v7r10a0s@4ax.com>


On Thu, 4 Mar 2004 19:58:17 +0000 (UTC), "Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote:

(...)

>a)  When data moves from one set of parallel

> slaves to another, it does so through "table queues".
> If it is not possible for the target slaves to execute
> as the "table queues" are filled, then the "table queues"
> are dumped to sort segments. This usually happens in
> a multi-layer join/group/order query with only two
> sets of slaves - it happens that the first set of slaves
> is still performing an operation when the second set
> of slaves wants to feed results to a third (recycled first)
> set of slaves.

(...)

Jonathan,

I monitored the process while it executed with fourfold parallelism. When all records of the second table of the hash join are being read, the records that can not be joined immediately are written to HASH segments. After that, while processing the records from the HASH segments, there are 4 slave sessions, each possessing a HASH segment and a SORT segment. The SORT segments are growing due to the output of the hash join developing.

The moment all records for a slave have been processed the following happens for that slave:

- the HASH segment disappears
- the SORT segment does not grow anymore
- direct path reads and writes do not go up anymore

The moment th last slave is finished, the slaves appear simultaneously in the session statistic 'sort (rows)' for each slave SID, and at this moment the output table also starts getting written (v$filestat).

So I think it is safe to say your assumption is right: no actual sorting is done, the SORT segments are only used for keeping the end result.

The database I did my test on has PARALLEL_AUTOMATIC_TUNING set to TRUE. It's on a 2-CPU machine with a PARALLEL_THREADS_PER_CPU of 2. In this session I had disabled parallel dml. From earlier tests I gathered that the use of SORT segments was connected to the use of parallel dml, but that seems not to be the case here. Maybe it is connected to the maximal number of simultaneously active processes allowed by one of the parameters? This would explain why I did not see the SORT segments in a test in which the process ran in twofold parallelism.

Jaap. Received on Fri Mar 12 2004 - 16:40:22 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US