Re: Detecting typos

From: DA Morgan <damorgan_at_x.washington.edu>
Date: Sun, 19 Dec 2004 13:26:06 -0800
Message-ID: <41c5f101$1_3@127.0.0.1>


CE wrote:
> Hi,
>
> I'm working on a project which involves matching customers against an
> existing database using (amongst other things) address, date of birth
> and name.
>
> One problem I have to overcome is allowing for typographical errors.
> E.g. matching "CATHERINE" with "CSTHERINE". So what I thought I'd do is
> have a little function to compare 2 strings and return the number of
> differences:
>
> CREATE OR REPLACE FUNCTION string_compare
>   (string1 IN VARCHAR2,
>    string2 IN VARCHAR2) RETURN NUMBER IS
>   diffs BINARY_INTEGER := 0;
> BEGIN
>   -- Compare position by position over the length of the shorter string
>   IF least(length(string1),length(string2)) > 0 THEN
>     FOR i IN 1..least(length(string1),length(string2)) LOOP
>       IF substr(string1,i,1) <> substr(string2,i,1) THEN
>         diffs := diffs + 1;
>       END IF;
>     END LOOP;
>   END IF;
>   RETURN diffs;
> END;
> /
>
> This works ok (I've got to watch out for comparing MARY and MARK etc,
> though), but doesn't handle a comparison where there might be a missing
> letter (e.g. "CATHERINE" and "CTHERINE"). Can anybody think of
> anything a little bit cleverer?
>
> Also if anyone could recommend a book on the rationale/logic behind
> such customer matching/de-duplicating, I'd also be grateful.
> Thanks
>
> CE

I think you are sliding down a slippery slope and should slowly back away from the edge of the cliff.

If they are not exact matches, they are not exact matches. After you are done working those that do match, perform a comparison of what is left. There may not be enough left over to make doing anything more worthwhile.
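
For the comparison of what is left, one possible approach that also copes with a missing letter is edit (Levenshtein) distance, which counts single-character insertions, deletions and substitutions. What follows is only a minimal sketch and not something from the thread: the function name edit_distance and its variable names are made up for illustration, and it assumes a release that allows associative arrays indexed by PLS_INTEGER.

```sql
CREATE OR REPLACE FUNCTION edit_distance
  (string1 IN VARCHAR2,
   string2 IN VARCHAR2) RETURN NUMBER IS
  -- Hypothetical helper: Levenshtein distance using two rows of the DP table
  TYPE row_t IS TABLE OF PLS_INTEGER INDEX BY PLS_INTEGER;
  prev_row row_t;
  curr_row row_t;
  len1 PLS_INTEGER := NVL(LENGTH(string1), 0);  -- treat NULL as empty
  len2 PLS_INTEGER := NVL(LENGTH(string2), 0);
  cost PLS_INTEGER;
BEGIN
  -- Turning an empty prefix of string1 into the first j characters
  -- of string2 takes j insertions
  FOR j IN 0..len2 LOOP
    prev_row(j) := j;
  END LOOP;

  FOR i IN 1..len1 LOOP
    -- Turning the first i characters of string1 into an empty string
    -- takes i deletions
    curr_row(0) := i;
    FOR j IN 1..len2 LOOP
      IF SUBSTR(string1, i, 1) = SUBSTR(string2, j, 1) THEN
        cost := 0;
      ELSE
        cost := 1;
      END IF;
      curr_row(j) := LEAST(prev_row(j) + 1,         -- deletion
                           curr_row(j - 1) + 1,     -- insertion
                           prev_row(j - 1) + cost); -- substitution
    END LOOP;
    prev_row := curr_row;
  END LOOP;

  RETURN prev_row(len2);
END;
/
```

With this sketch, both SELECT edit_distance('CATHERINE','CSTHERINE') FROM dual and SELECT edit_distance('CATHERINE','CTHERINE') FROM dual return 1. So does 'MARY' against 'MARK', so a small distance on the name is only a hint and still needs the address and date of birth to agree. Later Oracle releases also ship a UTL_MATCH package with an EDIT_DISTANCE function; if your version has it, that is probably preferable to a hand-rolled one.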

-- 
Daniel A. Morgan
University of Washington
damorgan_at_x.washington.edu
(replace 'x' with 'u' to respond)


-----------== Posted via Newsfeed.Com - Uncensored Usenet News ==----------
   http://www.newsfeed.com       The #1 Newsgroup Service in the World!
-----= Over 100,000 Newsgroups - Unlimited Fast Downloads - 19 Servers =-----
Received on Sun Dec 19 2004 - 15:26:06 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US