Re: data cleansing: externally or internally?

From: Deadly Dirk <dirk_at_pfln.invalid>
Date: Fri, 4 Nov 2011 17:51:02 +0000 (UTC)
Message-ID: <pan.2011.11.04.17.50.19_at_pfln.invalid>



On Fri, 04 Nov 2011 07:51:49 +0100, geos wrote:

> there is a big text file with dirty data.

How big is "big"?

> a company wants it to be
> clean. there are some known patterns expressed as like or regexp. I
> first thought about two approaches:
> 1) do this on the system level
> 2) or in a database

A database is not well suited to that kind of job. Personally, I would use Perl. It is my favorite tool because it's extremely versatile and fast, but any scripting language with regex support will probably do.
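
To illustrate the "any scripting language with regex support" route, here is a minimal sketch in Python rather than Perl. The three rules in RULES are made-up placeholders, not the poster's actual patterns, and the names clean_line, clean_lines and clean_file are hypothetical. It assumes the data is line-oriented UTF-8 text.

```python
import re
from typing import Iterable, Iterator

# Hypothetical cleanup rules, applied in order: (compiled pattern, replacement).
# Swap in the patterns that are actually known for the data at hand.
RULES = [
    (re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]"), ""),  # drop control chars (tab is kept)
    (re.compile(r"[ \t]+"), " "),                       # collapse runs of spaces and tabs
    (re.compile(r"^ +| +$"), ""),                       # trim leading and trailing spaces
]


def clean_line(line: str) -> str:
    """Apply every rule in RULES to one line and return it without a line ending."""
    text = line.rstrip("\r\n")
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


def clean_lines(lines: Iterable[str]) -> Iterator[str]:
    """Lazily clean an iterable of lines, one line at a time."""
    for line in lines:
        yield clean_line(line)


def clean_file(src_path: str, dst_path: str) -> None:
    """Stream src_path to dst_path so the whole file is never held in memory."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for cleaned in clean_lines(src):
            dst.write(cleaned + "\n")
```

Because the file is read one line at a time, memory use should stay roughly flat however big "big" turns out to be. A call such as clean_file("dirty.txt", "clean.txt") would write the cleaned copy next to the original; both file names are placeholders.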

-- 
I don't think, therefore I am not.
Received on Fri Nov 04 2011 - 12:51:02 CDT
