Re: Merge/Purge

From: modisc1 <grooch7NOgrSPAM_at_yahoo.com.invalid>
Date: 1999/10/27
Message-ID: <136f266c.21a6712c_at_usw-ex0101-003.remarq.com>#1/1


Check out:

www.vality.com
www.trilliumsoft.com
www.g1.com
www.postalsoft.com

In article <37f00e18_1_at_news1.prserv.net>, "Tom Leylan" <tleylan_at_leylan.com> wrote:
> Gerry Thorpe <gerrythorpe_at_hotmail.com> wrote...
> > I am looking for information about and software to do Merge/Purge.
> I think it is safe to say, this is a very specialized field (if you want to
> do it right.)
> > A typical situation is as follows: John Doe may be in one
> > data source as John Doe, another as Jonny Doe, another
> > as J. Doe, Johnathan Doe, etc. I would like to be able to
> > take the data from all of those sources and construct a single
> > row of data that aggregates the data from all those rows.
> I'm not sure you actually want to aggregate the data. If "John Doe" uses
> "Jonny" these days and really lives in Philadelphia, then the fact that
> "John Doe" appears 3 times as "St. Paul, MN" is of no value; he no longer
> lives there.
> > What I need is a system that knows that John Doe, Jonny Doe,
> > J. Doe and Johnathan Doe are all likely the same person.
> And the clue that they are the same person would be?
> > Any leads would be appreciated.
> 1) Search the Internet for what not to do.
> 2) Expect a multiple-pass solution.
> 3) Expect a solution based upon specific knowledge of the domain.
> 4) Accept "reasonable" solutions.
> 5) Document your assumptions.
> 6) Consider posting your ideas (as they arrive) here, before you merge all
> the "John Smith" records in Los Angeles into a single row.
> Tom
> Oh... visit www.deja.com and search for "database duplicates"
> --
> ---> Learn a little something at http://www.leylan.com
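For what it's worth, here is a rough sketch of what a multiple-pass match along the lines of Tom's points 2 to 4 might look like. Everything in it is an assumption made for illustration: the nickname table, the (name, phone) record layout, and the choice of a shared phone number as the "clue" that two rows are the same person. It only proposes candidate groups for review and does not merge anything.

```python
import re
from collections import defaultdict

# Hypothetical nickname table; a real one would come from domain knowledge.
NICKNAMES = {
    "jonny": "john",
    "johnny": "john",
    "johnathan": "john",
    "jonathan": "john",
}


def normalize(name):
    """Pass 1: lowercase, drop punctuation, return (first, last).

    Assumes the name contains at least one word.
    """
    parts = re.sub(r"[^a-z ]", "", name.lower()).split()
    return parts[0], parts[-1]


def canonical_first(first):
    """Pass 2: map known nicknames and variants to one canonical form."""
    return NICKNAMES.get(first, first)


def candidate_groups(records):
    """Group (name, phone) records that are likely the same person.

    Two records are candidates only when last name and phone agree
    (the assumed "clue") and their first names are compatible.
    """
    buckets = defaultdict(list)
    for name, phone in records:
        first, last = normalize(name)
        buckets[(last, phone)].append((canonical_first(first), name))

    groups = []
    for members in buckets.values():
        by_first = defaultdict(list)
        for first, original in members:
            by_first[first].append(original)
        # Pass 3: attach a bare initial to a full first name only when
        # exactly one full name in the bucket starts with that letter.
        for initial in [f for f in by_first if len(f) == 1]:
            full = [f for f in by_first if len(f) > 1 and f.startswith(initial)]
            if len(full) == 1:
                by_first[full[0]].extend(by_first.pop(initial))
        groups.extend(sorted(g) for g in by_first.values())
    return groups


if __name__ == "__main__":
    records = [
        ("John Doe", "555-0100"),
        ("Jonny Doe", "555-0100"),
        ("J. Doe", "555-0100"),
        ("Johnathan Doe", "555-0100"),
        ("John Doe", "555-0199"),  # same name, different clue: kept apart
    ]
    for group in candidate_groups(records):
        print(group)
```

Requiring the extra clue is what keeps every "John Smith" in Los Angeles from collapsing into one row; which field makes a trustworthy clue depends entirely on the data at hand.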

Received on Wed Oct 27 1999 - 00:00:00 CEST
