Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: How to hash a set of columns ?

Re: How to hash a set of columns ?

From: Jusung Yang <JusungYang_at_yahoo.com>
Date: 8 May 2003 14:17:10 -0700
Message-ID: <130ba93a.0305081317.19e92e06@posting.google.com>


Well, that's the thing with hashing. Uniqueness is never gauaranteed, yes? Not only that, the algorithm within dbms_utility.get_hash_value, which ORACLE does not publish, can change from release to release. If you want a hashing function that will always return the same value for a particular string, write your own.

It is all a matter of probability. If you increase the hash bucket size, the chance of collisions will be lower. If you want to hash 100 values with 50 as hash size, there is no way you can avoid collisions. The max hash size you can use with dbms_utility.get_hash_value is (2**30 - 1) - plenty of room for you lower the collision probability to a level that is statistically insignificant.

"Anurag Varma" <avdbi_at_hotmail.com> wrote in message news:<YWkua.5456$2c4.2458_at_news02.roc.ny.frontiernet.net>...
> "Jusung Yang" <JusungYang_at_yahoo.com> wrote in message news:130ba93a.0305071328.31c0a95b_at_posting.google.com...

> > madhu17_at_vsnl.net (S Madhusudhanan) wrote in message news:<2eead8c7.0305070825.70f20add_at_posting.google.com>...
> > > Hi Gurus,
> > >
> > > I'm trying to develop a method to hash a set of columns and store the
> > > result in another column.
> > >
> > > Any query's on the set of columns would be hashed and the result would
> > > be looked up in the Hash Value column.
> > >
> > > Could somebody point me in the right direction on how to achieve this?
> > >
> > > Thanks and Regards
> > > S Madhusudhanan
> >
> > You can get your own hash function or you can use the one from ORACLE.
> > Like:
> >
> > SQL> select dbms_utility.get_hash_value('abcd',100,800) from dual;
> >
> > DBMS_UTILITY.GET_HASH_VALUE('ABCD',100,800)
> > -------------------------------------------
> > 871
> >
> > SQL> select 1 from dual where
> > 871=dbms_utility.get_hash_value('abcd',100,800);
> >
> > 1
> > ----------
> > 1
> >
> > SQL>
> >
> > SQL> select * from test2;
> >
> > C1 C2
> > -- ----------
> > b 2
> > c 3
> > e 1
> > K 6
> > P 0
> > G 0
> > ap 9
> > G 1
> > G 1
> >
> > 9 rows selected.
> >
> > SQL> select dbms_utility.get_hash_value(c1||to_char(c2),100,800)
> > hashed_col from test2;
> >
> > HASHED_COL
> > ----------
> > 812
> > 373
> > 283
> > 161
> > 804
> > 568
> > 718
> > 390
> > 390
> >
> > 9 rows selected.
> >
> > SQL>
> >
> >
> > - Jusung Yang
> 
> To OP:
> Be aware however that dbms_utility.get_hash_value might return same/duplicate hash value for two different strings.
> In Jusung's case for example:
> SQL>select dbms_utility.get_hash_value('KU$_CONSTRAINT_COL_LIST_T',100,800) hashval
>   2  from dual;
> 
>    HASHVAL
> ----------
>        899
> 
> SQL>select dbms_utility.get_hash_value('GV$TRANSACTION_ENQUEUE',100,800) hashval
>   2  from dual;
> 
>    HASHVAL
> ----------
>        899
> 
> ... try this query yourself ..
>   1  select oname, dbms_utility.get_hash_value(oname,100,800) hashval
>   2  from (select distinct object_name oname from dba_objects)
>   3* order by hashval
> 
> ... of course if you change the ranges, the above might not clash
> SQL>select dbms_utility.get_hash_value('GV$TRANSACTION_ENQUEUE',100,8000) hashval from dual;
> 
>    HASHVAL
> ----------
>       5699
> 
> SQL>select dbms_utility.get_hash_value('KU$_CONSTRAINT_COL_LIST_T',100,8000) hashval from dual;
> 
>    HASHVAL
> ----------
>       4099
> .. however, original point being: The values are not guaranteed to be unique. Just be aware of it and take appropriate precautions.
> 
> Anurag
Received on Thu May 08 2003 - 16:17:10 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US