Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> HASH JOIN ...I want to understand

HASH JOIN ...I want to understand

From: Le JeanMimi <scjm_at_noos.fr>
Date: 30 Jul 2003 10:54:06 -0700
Message-ID: <c7be5048.0307300954.41063e79@posting.google.com>


Build hash table, probe, etc, sure i didn't understand everything. I want to make things clear in my mind with a simple example from the Design and Tuning for Performance Oracle Manual.

That's what i imagine
Please tell me how far from reality my imagination is .... :

emp
emp_id first_name last_name dept_id

001      JOE         SMITH      D001
002      BILL        BUSH       D002
003      PETER       JACKSON    D001
004      THERESA     CUBADA     D004
005      GARY        GIBB       D001

dept
dept_id name

D001     RESEARCH AND DEV
D002     SALES
D003     HUMAN RESOURCES
D004     ADMINISTRATIVE

SELECT *
FROM emp, dept
WHERE emp.deptno = dept.deptno;


STEP 1 : ORACLE builds a hash table of DEPT. (this requires a full table scan of DEPT)

-> ORACLE copies the whole (?)(because of the * in the query)(?) EMP
table in memory BUT in a hash table structure (with a hash function).
-> To access a line of this table in memory with a primary key
(dept_id) :

ex :
Reads DEPT, gets (D001,RESEARCH AND DEV)  hash_function (D001) = &FCF0D540
  Copies (D001,RESEARCH AND DEV) into memory address &FCF0D540 Reads DEPT, gets (D002,SALES)
 hash_function (D002) = &FCF0E9C2
  Copies (D002,SALES) into memory address &FCF0E9C2 ...

STEP 2 : ORACLE performs a FULL TABLE SCAN of EMP and PROBEs.

  For each row of EMP, it passes EMP.dept_id to the hash function which translates the string to a memory location.   ORACLE retrieves the DEPT data directly from the memory for this row.

  If EMP.dept_id doesn't exist in DEPT ???

ex :
Reads EMP, gets (001,JOE,SMITH,D001)
 hash_function (D001) = &FCF0D540
  Gets (D001,RESEARCH AND DEV) from memory address &FCF0D540     ORACLE returns : (001,JOE,SMITH,D001, D001,RESEARCH AND DEV) Reads EMP, gets (002,BILL,BUSH,D002)
 hash_function (D002) = &FCF0E9C2
  Gets (D002,SALES) from memory address &FCF0E9C2     ORACLE returns : (001,JOE,SMITH,D002, D002, SALES) ...


Here, I don't understand the manual.
"Oracle performs a full table scan on each of the tables and splits each into as
many partitions as possible based on the available memory." I dont understand why it would have to partition EMP... ?

That's what i would have done :

The hash table made out of DEPT is too big to fit in memory so ORACLE has to make HASH partitions of DEPT.
Overall, ORACLE performs a full scan of DEPT but divides this table into "row sets" (partitions) :

  1. ORACLE builds a hash table of the first "partition" of DEPT.
  2. ORACLE performs a FULL TABLE SCAN of EMP and PROBEs it with the hash table of the first "partition" of DEPT.
  3. ORACLE builds a hash table of the second "partition" of DEPT. This hash table replaces the first one in memory.
  4. ORACLE performs a FULL TABLE SCAN of EMP -AGAIN- and PROBEs it with the hash table of the second "partition" of DEPT.

etc

Finally :
1 FULL TABLE SCAN of DEPT
N partitions of DEPT
N hash tables of DEPT
N FULL TABLE SCAN of EMP



So does it make sense ? Am I (completely) wrong ?

Perhaps there is a faq somewhere where the hash join is thoroughly explained.
However, i didn't find it.
If you have a link for me... =)

Sorry if i wrote many stupid things.
Thanks in advance for your help.

JM Received on Wed Jul 30 2003 - 12:54:06 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US