Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Is there a faster/better way to do this?
Hi all,
I have a puzzle for all you performance/tuning gurus out there!
I have a table with around 4 million records in it called CUSTOMERS.
Each record represents a customer with many columns that hold things like forename, surname, address, post code, telephone number etc.
The customer records have come from many different sources and as a result, there are in most cases, many customer records representing only one actual "real life" customer.
What I am trying to do is write a routine that will group together records that are likely to be the same customer.
The way I have set about this is to write an SQL statement that SELECTs across the whole CUSTOMERS table (4 million records) using the GROUP BY clause on 7 key columns. The results of this are then inserted into a MATCHED table for all groups of 2 or more records and flagged as match type "1" (best).
This is then repeated, but this time the GROUP BY is on only 6 of the above key columns. These results are stored in the MATCHED table as match type "2".
This is again repeated using GROUP BY on 5/4/3 key fields.
The above method works OK, the only problem is that it takes a very long time to run - around 20 hours!!!
So, my question is, is this the best way to do this, or is there a more efficient way to code this that will improve speed?
I am running this on Oracle 8i/Solaris platform.
Thanks in advance for any help.
John. Received on Wed Apr 03 2002 - 13:27:24 CST