Re: Hash join -vs- merge sort join

From: Kendal Willets <kendall_at_slip.net>
Date: 1997/04/14
Message-ID: <5itreb$79d$1@owl.slip.net>#1/1

Emrynsoft <emrynsoft_at_aol.com> wrote:

: I am trying to figure out when it is better to use a hash join over a
: merge sort join. My understanding was that hash join works better if the
: table on which the bitmap is created is small enough to be held in memory.
: But I have noticed that the optimizer sometimes even chooses hash join for
: joins of 2 very large tables, neither of which would completely fit in
: memory. In fact it seems to me that Oracle optimizer always seems to
: choose a hash join. It also seems that hash join is a better method for
: most cases over merge-sort (just a very unscientific real-time
: observation).

If neither table will fit in memory, the tables are split into smaller chunks (using an "internal hash function") which will fit in memory. What happens then is not clear. The book I have (_Oracle_Data_Warehousing, by Corey and Abbey) claims that each chunk in table A gets loaded into memory and run against a table scan of table B. It seems more logical that if both A and B were chunked out using the same hash function, then each chunk Ax of A would have an identically-hash-valued chunk Bx in table B, so one could simply load chunk A1, say, and hash join it (the usual way) to chunk B1, followed by A2 join B2, etc.

In any case it does seem like hashing will work on larger tables, it's just obscure how it's done. Received on Mon Apr 14 1997 - 00:00:00 CDT