Detecting typos

From: CE <charlie3101_at_hotmail.com>
Date: 15 Dec 2004 03:28:08 -0800
Message-ID: <1103110088.787968.186880@c13g2000cwb.googlegroups.com>


Hi,

I'm working on a project which involves matching customers against an existing database using (amongst other things) address, date of birth and name.

One problem I have to overcome is allowing for typographical errors, e.g. matching "CATHERINE" with "CSTHERINE". So what I thought I'd do is have a little function to compare two strings and return the number of differences:

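CREATE OR REPLACE FUNCTION string_compare (string1 IN VARCHAR2,
                                           string2 IN VARCHAR2)
RETURN NUMBER IS
  diffs   BINARY_INTEGER := 0;
  -- Only the overlapping prefix is compared; trailing characters of the
  -- longer string are ignored. min_len is NULL if either input is NULL,
  -- in which case the function returns 0.
  min_len BINARY_INTEGER := LEAST(LENGTH(string1), LENGTH(string2));
BEGIN
  IF min_len > 0 THEN
    FOR i IN 1..min_len LOOP
      -- Count each position where the two strings disagree
      IF SUBSTR(string1, i, 1) <> SUBSTR(string2, i, 1) THEN
        diffs := diffs + 1;
      END IF;
    END LOOP;
  END IF;
  RETURN diffs;
END;
/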
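
A quick check of the function, using the names from this post. The queries assume string_compare has been created as above; the counts in the comments come from walking the two strings position by position.

```sql
-- One substituted letter: only position 2 differs (A vs S).
SELECT string_compare('CATHERINE', 'CSTHERINE') AS diffs FROM dual;  -- 1

-- One missing letter: every position after the first is shifted,
-- so positions 2 to 8 all disagree.
SELECT string_compare('CATHERINE', 'CTHERINE') AS diffs FROM dual;   -- 7
```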

This works OK for substituted letters (I've got to watch out for comparing MARY and MARK etc., though), but it doesn't handle a comparison where there might be a missing letter (e.g. "CATHERINE" and "CTHERINE"), because every character after the gap is shifted and counted as a difference. Can anybody think of anything a little bit cleverer?

Also, if anyone could recommend a book on the rationale/logic behind such customer matching/de-duplicating, I'd be grateful. Thanks.
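
One possible direction for the "cleverer" comparison, offered here as a sketch and not something from the original post, is Levenshtein edit distance: the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. The function name edit_distance_sketch and its local names are hypothetical, and NULL inputs are treated as empty strings by assumption.

```sql
CREATE OR REPLACE FUNCTION edit_distance_sketch (string1 IN VARCHAR2,
                                                 string2 IN VARCHAR2)
RETURN NUMBER IS
  TYPE row_t IS TABLE OF BINARY_INTEGER INDEX BY BINARY_INTEGER;
  prev_row row_t;  -- distances for the first i-1 characters of string1
  curr_row row_t;  -- distances for the first i characters of string1
  len1 BINARY_INTEGER := NVL(LENGTH(string1), 0);  -- assumption: NULL counts as empty
  len2 BINARY_INTEGER := NVL(LENGTH(string2), 0);
  cost BINARY_INTEGER;
BEGIN
  -- Building the first j characters of string2 from nothing takes j inserts.
  FOR j IN 0..len2 LOOP
    prev_row(j) := j;
  END LOOP;

  FOR i IN 1..len1 LOOP
    -- Reducing the first i characters of string1 to nothing takes i deletes.
    curr_row(0) := i;
    FOR j IN 1..len2 LOOP
      IF SUBSTR(string1, i, 1) = SUBSTR(string2, j, 1) THEN
        cost := 0;
      ELSE
        cost := 1;
      END IF;
      curr_row(j) := LEAST(prev_row(j) + 1,          -- delete from string1
                           curr_row(j - 1) + 1,      -- insert into string1
                           prev_row(j - 1) + cost);  -- substitute, or match if cost = 0
    END LOOP;
    prev_row := curr_row;
  END LOOP;

  RETURN prev_row(len2);
END;
/
```

With this measure, 'CATHERINE' vs 'CTHERINE' (one deletion), 'CATHERINE' vs 'CSTHERINE' (one substitution) and 'MARY' vs 'MARK' (one substitution) each score a distance of 1, so a missing letter is no longer penalised more than a mistyped one. Genuinely different names such as MARY and MARK still look close, so the score would probably need to be combined with the other matching fields (address, date of birth) before treating two rows as the same customer.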

CE