<ctcgag_at_hotmail.com> wrote in message
news:20040122150457.752$Go_at_newsreader.com...
> "Al Reid" <areidjr_at_nospamhotmail.com> wrote:
> > > > > >
> > > > > > They want to be able to retrieve the record if they type in any of
> > > > > > the following:
> > > > > >
> > > > > > A. B. Corp
> > > > > > A.B. Corp
> > > > > > AB Corp
> > > > > > A.B Corp, etc.
> > >
> > > uh, that's pretty fuzzy. What happened to the "C" in "A B C Corp"? Do
> > > they want to also find "NBC", "ABC", and "CBS"? If they want "George
> > > Washington" but they accidentally spell it "Thomas Jefferson", do they
> > > want you to magically correct that, also?
> > >
> >
> > Sorry, my bad. I meant the entry in the database is 'A B Corp'
> > I guess I was a little frustrated when I posted this.
>
> Ah, that may be much less fuzzy, then. How about a function-based index
> (FBI) on a canonicalization function which removes all non-letter characters
> (and converts the rest to upper case while it is at it)? Of course, you'd
> still have to handle (or forbid) situations where the name (after
> transformation) is non-unique. Then all of the above would simply become
> "ABCORP".
>
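A rough sketch of the canonicalization idea quoted above, written in Python only to show the transformation itself. The name canonicalize and the collision helper are made up for illustration, and "non-letter" is assumed here to mean anything outside A-Z/a-z. In the database, the same logic would be what the function-based index is built on.

```python
import re
from collections import defaultdict


def canonicalize(name: str) -> str:
    """Remove every character that is not an ASCII letter, then upper-case.

    Hypothetical helper: treating only A-Z/a-z as letters is an assumption.
    """
    return re.sub(r"[^A-Za-z]", "", name).upper()


def find_collisions(names):
    """Group names by canonical key and keep keys shared by several names."""
    groups = defaultdict(list)
    for name in names:
        groups[canonicalize(name)].append(name)
    return {key: vals for key, vals in groups.items() if len(vals) > 1}


# The spellings from the thread all collapse to one key.
variants = ["A. B. Corp", "A.B. Corp", "AB Corp", "A.B Corp", "A B Corp"]
keys = {canonicalize(v) for v in variants}

# Two distinct (invented) customers that would need special handling.
collisions = find_collisions(["A B Corp", "Ab-Corp", "Acme Inc"])
```

The search term typed by the user would be run through the same function before the lookup, so the comparison is always key against key.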
> >
> > > > > > I currently use SPs to retrieve the records from a VB program.
> > > > > > Is there something I could add to the SP to provide this
> > > > > > functionality without severely affecting performance?
> > >
> > > Its effect on performance would depend on how large the customer table
> > > is. For some systems, doing a full table scan (FTS) of the customer
> > > table 10 times a minute would have no meaningful impact. For others, it
> > > would be fatal. Strictly speaking, it may not have to do an FTS (for
> > > example, if you always insist that at the least the first letter is not
> > > fuzzy), but I think that's a good estimate to use for performance impact.
> > >
> >
> > There are currently 626000 customers in the table.
>
> If using a canonicalization function is good enough for them,
> then the size doesn't really matter (except to the extent that name
> collisions occur). If they want something fancier, like finding the
> minimum Levenshtein edit distance, then with a table that size you have
> your work cut out for you.
>
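For comparison, a minimal sketch of the "fancier" option mentioned above, the minimum Levenshtein edit distance, using the standard dynamic-programming formulation. The function names and sample data are invented for illustration. Note that closest() has to compare the query against every candidate, which is why this approach amounts to a full scan on a large table.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character inserts, deletes, or substitutions
    needed to turn a into b (classic two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def closest(query: str, names):
    """Return (name, distance) for the candidate nearest to query.

    Hypothetical helper: every candidate is examined, so the cost grows
    linearly with the number of rows.
    """
    best = min(names, key=lambda n: levenshtein(query, n))
    return best, levenshtein(query, best)


match = closest("ABCCORP", ["ABCORP", "NBC", "CBS"])
```

Running the distance on canonicalized keys rather than raw names would at least keep punctuation and case from counting as edits.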
>
Thanks, I will explore the canonicalization function and see where it leads.
>
> Xho
>
Received on Thu Jan 22 2004 - 15:11:01 CST