Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 


Re: Fuzzy search

From: Al Reid <areidjr_at_nospamhotmail.com>
Date: Thu, 22 Jan 2004 13:27:58 -0500
Message-ID: <10105ogdall4i77@corp.supernews.com>

<ctcgag_at_hotmail.com> wrote in message news:20040122132234.578$bh_at_newsreader.com...

> "Al Reid" <areidjr_at_nospamhotmail.com> wrote:

> > "Daniel Morgan" <damorgan_at_x.washington.edu> wrote in message
> > news:1074784505.698129_at_yasure...
> > > Al Reid wrote:
> > >
> > > > First, I have Oracle 8.1.7.
> > > >
> > > > > My "customer" has requested that they be able to retrieve only the
> > > > > correct records from the database where they do not know exactly
> > > > > how the data was entered. They are referring to it as a fuzzy search.
> > > >
> > > > > For example, they want to retrieve info from a customer database by
> > > > > customer name. The entry in the database is 'A B C Corp'
> > > > >
> > > > > They want to be able to retrieve the record if they type in any of the
> > > > following:
> > > >
> > > > A. B. Corp
> > > > A.B. Corp
> > > > AB Corp
> > > > A.B Corp, etc.
>
> uh, that's pretty fuzzy.  What happened to the "C" in "A B C Corp"?  Do
> they want to also find "NBC", "ABC", and "CBS"?  If they want "George
> Washington" but they accidentally spell it "Thomas Jefferson", do they want
> you to magically correct that, also?
>

Sorry, my bad. I meant the entry in the database is 'A B Corp'. I guess I was a little frustrated when I posted this.

> > > > I currently use SPs to retrieve the records from a VB program. Is
> > > > there something I could add to the SP to provide this functionality
> > > > > without severely affecting performance?

>
> Its effect on performance would depend on how large the customer table is.
> For some systems, doing a full table scan (FTS) of the customer table 10
> times a minute would have no meaningful impact.  For others, it would be
> fatal.  Strictly speaking, it may not have to do an FTS (for example, if you
> always insist that at least the first letter is not fuzzy), but I think
> that's a good estimate to use for performance impact.
>
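
To make the point about the first letter concrete, here is a rough Python sketch, not Oracle code. A sorted list stands in for an index, and the sample names, `full_scan`, `prefix_scan` and `is_ab_corp` are all made up for illustration. It only counts how many names each approach has to examine; real optimizer behavior will differ.

```python
import bisect
import re

# Hypothetical sample data; a sorted list stands in for an index on the name.
names = sorted([
    "A B Corp", "Allen Bradley Corporation", "Baker Supply",
    "CBS", "NBC", "Zenith Ltd",
])

def is_ab_corp(name):
    """Illustrative fuzzy test: ignore case, spaces and punctuation."""
    return re.sub(r"[^A-Z0-9]", "", name.upper()) == "ABCORP"

def full_scan(predicate):
    """Examine every name, as a full table scan would."""
    return [n for n in names if predicate(n)], len(names)

def prefix_scan(prefix, predicate):
    """Examine only names that start with a fixed prefix (range scan)."""
    lo = bisect.bisect_left(names, prefix)
    hi = bisect.bisect_left(names, prefix + "\uffff")
    candidates = names[lo:hi]
    return [n for n in candidates if predicate(n)], len(candidates)

full_hits, full_examined = full_scan(is_ab_corp)
prefix_hits, prefix_examined = prefix_scan("A", is_ab_corp)
print(full_hits, full_examined)
print(prefix_hits, prefix_examined)
```

Both calls find the same record, but the prefix version only looks at the names beginning with "A". How much that saves on a real table depends on how the names are distributed.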

There are currently 626000 customers in the table.

> > >
> > > SELECT ...
> > > FROM ...
> > > WHERE some_column LIKE 'A%B%Corp%';
> > >
> >
> > Would that not then return a record for Customer Name = 'Allen Bradley
> > Corporation' in addition? If so, it is not selective or fuzzy enough.
> >
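
The over-matching concern can be checked with a small sketch. This is Python with an in-memory SQLite table standing in for Oracle purely so it can run; it is not the stored procedure under discussion. The table, the `name_key` column and the `normalize_name` helper are assumptions for illustration. Note that SQLite's LIKE ignores ASCII case by default, which Oracle's does not. The alternative shown is to compare a normalized key (uppercase, punctuation and spaces removed) instead of using wildcards.

```python
import re
import sqlite3

def normalize_name(name):
    """Hypothetical helper: uppercase, keep only letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", name.upper())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, name_key TEXT)")
for name in ("A B Corp", "Allen Bradley Corporation", "Acme Tools Inc"):
    # Store the normalized key next to the name as entered.
    conn.execute("INSERT INTO customers VALUES (?, ?)",
                 (name, normalize_name(name)))
conn.execute("CREATE INDEX customers_name_key ON customers (name_key)")

def like_search(pattern):
    """Wildcard search on the raw name."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE name LIKE ? ORDER BY name",
        (pattern,))
    return [r[0] for r in rows]

def key_search(typed):
    """Equality search on the normalized key of whatever the user typed."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE name_key = ? ORDER BY name",
        (normalize_name(typed),))
    return [r[0] for r in rows]

wide = like_search("A%B%Corp%")
variants = ["A. B. Corp", "A.B. Corp", "AB Corp", "A.B Corp"]
narrow = {v: key_search(v) for v in variants}
print(wide)    # includes 'Allen Bradley Corporation' as well as 'A B Corp'
print(narrow)  # each variant maps to ['A B Corp'] only
```

The key comparison only tolerates differences in spacing, punctuation and case. It does nothing for misspellings, so it covers the variants listed earlier but is not a general fuzzy match.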

> > Is the customer making an unreasonable request?
>
> That depends on how much the customer is willing to pay for the feature,
> and how many customers the customer has.  The less they want to pay and
> the more cruft there is to weed out, the more unreasonable it is.
>
> Xho
>
> --
> -------------------- http://NewsReader.Com/ --------------------
> Usenet Newsgroup Service              New Rate! $9.95/Month 50GB
Received on Thu Jan 22 2004 - 12:27:58 CST
