Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> HASH JOIN ...I want to understand
Build hash table, probe, etc, sure i didn't understand everything.
I want to make things clear in my mind with a simple example from the
Design and Tuning for Performance Oracle Manual.
That's what i imagine
Please tell me how far from reality my imagination is .... :
emp
emp_id first_name last_name dept_id
001 JOE SMITH D001 002 BILL BUSH D002 003 PETER JACKSON D001 004 THERESA CUBADA D004 005 GARY GIBB D001
dept
dept_id name
D001 RESEARCH AND DEV D002 SALES D003 HUMAN RESOURCES D004 ADMINISTRATIVE
SELECT *
FROM emp, dept
WHERE emp.deptno = dept.deptno;
STEP 1 : ORACLE builds a hash table of DEPT. (this requires a full table scan of DEPT)
-> ORACLE copies the whole (?)(because of the * in the query)(?) EMP
table in memory BUT in a hash table structure (with a hash function).
-> To access a line of this table in memory with a primary key
(dept_id) :
ex :
Reads DEPT, gets (D001,RESEARCH AND DEV)
hash_function (D001) = &FCF0D540
Copies (D001,RESEARCH AND DEV) into memory address &FCF0D540
Reads DEPT, gets (D002,SALES)
hash_function (D002) = &FCF0E9C2
Copies (D002,SALES) into memory address &FCF0E9C2
...
STEP 2 : ORACLE performs a FULL TABLE SCAN of EMP and PROBEs.
For each row of EMP, it passes EMP.dept_id to the hash function which translates the string to a memory location. ORACLE retrieves the DEPT data directly from the memory for this row.
If EMP.dept_id doesn't exist in DEPT ???
ex :
Reads EMP, gets (001,JOE,SMITH,D001)
hash_function (D001) = &FCF0D540
Gets (D001,RESEARCH AND DEV) from memory address &FCF0D540
ORACLE returns : (001,JOE,SMITH,D001, D001,RESEARCH AND DEV)
Reads EMP, gets (002,BILL,BUSH,D002)
hash_function (D002) = &FCF0E9C2
Gets (D002,SALES) from memory address &FCF0E9C2
ORACLE returns : (001,JOE,SMITH,D002, D002, SALES)
...
Here, I don't understand the manual.
"Oracle performs a full table scan on each of the tables and splits
each into as
many partitions as possible based on the available memory."
I dont understand why it would have to partition EMP... ?
That's what i would have done :
The hash table made out of DEPT is too big to fit in memory so ORACLE
has to make HASH partitions of DEPT.
Overall, ORACLE performs a full scan of DEPT but divides this table
into "row sets" (partitions) :
etc
Finally :
1 FULL TABLE SCAN of DEPT
N partitions of DEPT
N hash tables of DEPT
N FULL TABLE SCAN of EMP
Perhaps there is a faq somewhere where the hash join is thoroughly
explained.
However, i didn't find it.
If you have a link for me... =)
Sorry if i wrote many stupid things.
Thanks in advance for your help.
JM Received on Wed Jul 30 2003 - 12:54:06 CDT