Hash Join vs. Nested Loops

From: Mikito Harakiri <mikharakiri_at_ywho.com>
Date: Tue, 25 Feb 2003 15:03:11 -0800
Message-ID: <hIS6a.12$Ih6.23_at_news.oracle.com>



I'm trying to understand the reason why different join methods exist, in particular, if Hash and Merge join are theoretically legitimate methods, or just some artifacts of Non Randomly Accessed Memory (aka disks). After all, when programming I'm always using "for/while" loop operator, not some obscure hash join.

Suppose both tables are cached into the memory. Then, we can ignore Random IO versus Sequential IO difference. Is Nested Loop algorithm inherently "nonoptimal" compared to Hash Join? What HJ does differently? It does put a hash table in front of the inner table. That's just extra work to me. Then, it does the similar nested looping which is only sugarcoated by the hash table probe. Specifically, database engine probes the hash table, so that if there is no match, then the target table is not accessed. But Indexed Nested Loop does the similar probing of the join index! Why Hash Join is
considered "more scalable"? Do Hash and Merge joins make sence for in-memory databases? Received on Wed Feb 26 2003 - 00:03:11 CET

Original text of this message