Non-duplicate test on LARGE table

From: mattsteel <matteo_vitturi_at_virgilio.it>
Date: Tue, 27 May 2008 07:48:15 -0700 (PDT)
Message-ID: <f3e142ae-2589-4c0c-91a3-8397d89d8bf6@z66g2000hsc.googlegroups.com>


Hello.
My question is a little broad, but maybe you have already seen something like this ;-)

Here is my scenario:
I have a large table containing over 20 million records, and I need to find rows that are duplicated under some rule. The column involved in the comparison rule, named AREA, is declared VARCHAR2(4000) NOT NULL. Since an ordinary index on a column that wide seemed unlikely to perform well, I added a column, HASHVALUE, which always contains a hash of AREA as returned by the standard "dbms_utility.get_hash_value" function.
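
To make the idea concrete, here is a minimal sketch of the hash-column approach. It is not Oracle code: it uses Python's built-in sqlite3 module as a stand-in database, and zlib.crc32 as a stand-in for dbms_utility.get_hash_value. The table name big_table, the index name, and the helper functions are all hypothetical. The point it illustrates is that the indexed HASHVALUE column only narrows the search to candidate buckets; because different AREA values can share a hash, the full AREA value still has to be compared inside each bucket.

```python
import sqlite3
import zlib


def area_hash(text: str) -> int:
    # Stand-in for dbms_utility.get_hash_value (assumption: CRC32 is used
    # here only for illustration, it is not what Oracle computes).
    return zlib.crc32(text.encode("utf-8"))


def build_demo() -> sqlite3.Connection:
    # Hypothetical miniature of the 20-million-row table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE big_table ("
        "id INTEGER PRIMARY KEY, "
        "area TEXT NOT NULL, "
        "hashvalue INTEGER NOT NULL)"
    )
    # The small column is the one that gets indexed.
    conn.execute("CREATE INDEX big_table_hash_ix ON big_table (hashvalue)")
    sample = ["alpha", "beta", "alpha", "gamma", "beta", "alpha"]
    conn.executemany(
        "INSERT INTO big_table (area, hashvalue) VALUES (?, ?)",
        [(a, area_hash(a)) for a in sample],
    )
    return conn


def find_duplicates(conn: sqlite3.Connection):
    # Inner query: hash buckets holding more than one row (candidates).
    # Outer query: group on the real AREA value inside those buckets,
    # so rows that merely collide on the hash are not reported.
    sql = """
        SELECT t.area, COUNT(*) AS n
        FROM big_table t
        WHERE t.hashvalue IN (
            SELECT hashvalue
            FROM big_table
            GROUP BY hashvalue
            HAVING COUNT(*) > 1
        )
        GROUP BY t.hashvalue, t.area
        HAVING COUNT(*) > 1
        ORDER BY t.area
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for area, n in find_duplicates(build_demo()):
        print(area, n)
```

Whether the index pays off on the real table depends on how the optimizer chooses to run the grouping, so this shows only the shape of the query, not a performance result.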

Now that I have a much smaller column to search through for duplicates: if I put an index on column HASHVALUE, do you think overall performance would improve or not?

Thank you in advance.

Matt.

Received on Tue May 27 2008 - 09:48:15 CDT
