From LTierstein@cns.gov Mon, 09 Jun 2003 08:01:48 -0700
From: "Tierstein, Leslie"
Date: Mon, 09 Jun 2003 08:01:48 -0700
Subject: RE: RE: anyone have any soundex scripts?
Message-ID:
MIME-Version: 1.0
Content-Type: text/plain

Soundex essentially takes a single word and produces a code which can be
compared to the code for another word, to determine if the words are
logically equivalent (that is, sound alike), mostly by paying attention
to consonants.

PROD> select soundex('Fairfield') from dual
  2  /

F614

  1* select soundex('Faerfield') from dual
PROD> /

F614

PROD> select soundex('Fairfied') from dual
  2  /

F613

PROD> select soundex('Freyfeld') from dual
  2  /

F614

An enhanced algorithm would take into account false positives (like the
last one above; I doubt any human being would think that "Freyfeld" and
"Fairfield" were the same person with a variant spelling) and false
negatives, which occur very easily when you don't have the same number of
words being compared, or one word is missing or abbreviated, or an
important consonant is missing or misplaced (as with "Fairfied" above,
where the dropped "l" changes the code to F613).

You really need address merge/purge software, which exists out there,
both as complete packages and as bare APIs. Try a search on that.

-----Original Message-----
Sent: Monday, June 09, 2003 11:05 AM
To: Multiple recipients of list ORACLE-L

What advanced functionality are you looking for?

-----Original Message-----
Sent: Monday, June 09, 2003 10:29 AM
To: Multiple recipients of list ORACLE-L

Yes, I'm familiar with that function, but you have to write a soundex
algorithm in order to get advanced functionality. I wouldn't even know
where to start with something like that. I'm hoping there is one on the
web somewhere.

>
> From: "Seefelt, Beth" <[EMAIL PROTECTED]>
> Date: 2003/06/09 Mon AM 09:59:43 EDT
> To: Multiple recipients of list ORACLE-L <[EMAIL PROTECTED]>
> Subject: RE: anyone have any soundex scripts?
>
>
> There is a SOUNDEX sql function. Check tahiti.oracle.com for info.
>
> HTH.
>
> -----Original Message-----
> Sent: Monday, June 09, 2003 8:49 AM
> To: Multiple recipients of list ORACLE-L
>
>
> I was on a project a few years ago where we used a soundex algorithm to
> determine and eliminate duplicate data.
>
> For example we would have:
>
> 301 Fairfield Lane
> 301 Faerfield Lane
>
> Notice the typo? The soundex algorithm caught it. Unfortunately I forgot
> to grab a copy before I left. Every time I do a google search on soundex
> I get a theory website explaining the math behind it.
>
> Anyone have one written? Or know where I can find one? It is incredibly
> useful.
>
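
For anyone who wants a starting point for a custom version, here is a minimal sketch of the classic Soundex rules in Python. It is not Oracle's implementation, only an approximation of the commonly documented rules (keep the first letter, code the remaining consonants as digits, collapse adjacent equal codes, let H and W pass through without separating them, pad to four characters). The names soundex and address_key are hypothetical helpers made up for this illustration. For the four words queried in this thread it yields the same codes as the SQL*Plus output (F614, F614, F613, F614).

```python
# Digit groups used by the classic Soundex rules.
_GROUPS = (
    ("1", "BFPV"),
    ("2", "CGJKQSXZ"),
    ("3", "DT"),
    ("4", "L"),
    ("5", "MN"),
    ("6", "R"),
)
CODES = {letter: digit for digit, letters in _GROUPS for letter in letters}


def soundex(word):
    """Return a four-character Soundex code, or "" if word has no letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = [letters[0]]
    # The first letter's own code also suppresses an immediate repeat.
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two equal codes
        code = CODES.get(ch, "")  # vowels and Y have no code
        if code and code != prev:
            result.append(code)
        prev = code  # a vowel resets prev, so a later repeat is kept
    return "".join(result)[:4].ljust(4, "0")


def address_key(address):
    """Hypothetical helper: keep numeric tokens, Soundex-code the rest."""
    return " ".join(
        tok if tok.isdigit() else soundex(tok) for tok in address.split()
    )


if __name__ == "__main__":
    for name in ("Fairfield", "Faerfield", "Fairfied", "Freyfeld"):
        print(name, soundex(name))
    print(address_key("301 Fairfield Lane"))  # 301 F614 L500
    print(address_key("301 Faerfield Lane"))  # 301 F614 L500
```

Grouping rows by a key like address_key is one way to surface candidate duplicates such as the two "301 ... Lane" addresses. The false positives and false negatives described above still apply, so the matches are candidates for review rather than confirmed duplicates.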