Re: Open Source [or free] tool for Data cleansing

From: Paul Linehan <linehanp_at_tcd.ie>
Date: Mon, 16 Sep 2013 14:15:20 +0100
Message-ID: <CAF4RT5RJGG=ANizc_cTi=Gra1YzvnGrM8LQ3fwbphQ+XSR-cRA_at_mail.gmail.com>



2013/9/16 Lukas Lehner <weblehner_at_gmail.com>:

> Examples: email address without @, domain name without TLD, Uppercase
> hostnames, hostname in two domains, ...

Hi again Lukas,

These cases look rather "trivial" - not in the sense that your data isn't important, but rather in the sense that they look easy-peasy to code. I used to do a lot of this in Delphi, but if you're a Java shop, then it would also be quite easy. Even without a tool, good use of unique indexes and check constraints would go a long way.
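
To make the constraint idea concrete, here is a rough sketch of rules for the four examples you listed. It uses Python with an in-memory SQLite database purely so that it runs anywhere; Oracle's DDL would differ in detail. The table name, column names and LIKE patterns are my own assumptions, not anything from your schema.

```python
import sqlite3

# Hypothetical schema: SQLite stands in for Oracle only to keep the sketch runnable.
DDL = """
CREATE TABLE hosts (
    hostname TEXT NOT NULL,
    domain   TEXT NOT NULL,
    email    TEXT NOT NULL,
    CHECK (hostname = lower(hostname)),   -- no uppercase hostnames
    CHECK (domain LIKE '%_.__%'),         -- domain must carry a TLD-like suffix
    CHECK (email LIKE '%_@_%')            -- email must contain an @
);
-- A hostname may appear only once, hence in only one domain.
CREATE UNIQUE INDEX hosts_hostname_uq ON hosts (hostname);
"""

def load(rows):
    """Insert rows one by one; return (accepted, rejected) lists."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    accepted, rejected = [], []
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO hosts (hostname, domain, email) VALUES (?, ?, ?)",
                row,
            )
            accepted.append(row)
        except sqlite3.IntegrityError as exc:
            # The database names the violated constraint in the error text.
            rejected.append((row, str(exc)))
    conn.close()
    return accepted, rejected
```

The point of the design is that the database rejects bad rows for you; the loading code only has to collect the rejects for later review.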

Is your input data (i.e. data to be scrubbed) .csv or what? Doesn't matter, just curious. Oracle external tables perhaps? PL/SQL - really, the list of options is virtually endless.
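
If it does turn out to be .csv, the same checks can be coded directly before anything reaches the database. This is only a sketch in Python, assuming a header row with columns named hostname, domain and email; those names and the function find_problems are made up for illustration.

```python
import csv
import io

def find_problems(csv_text):
    """Return (line_number, problem) pairs for CSV text with the
    assumed columns hostname, domain, email."""
    problems = []
    domains_by_host = {}  # lower-cased hostname -> set of domains seen so far
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        host, domain, email = row["hostname"], row["domain"], row["email"]
        if "@" not in email:
            problems.append((line_no, "email without @"))
        if "." not in domain.strip("."):
            problems.append((line_no, "domain without TLD"))
        if host != host.lower():
            problems.append((line_no, "uppercase hostname"))
        seen = domains_by_host.setdefault(host.lower(), set())
        seen.add(domain.lower())
        if len(seen) > 1:
            problems.append((line_no, "hostname in two domains"))
    return problems
```

Each rule is a line or two, so adding further checks as new kinds of dirty data turn up is cheap.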

Also, if you're still not happy - Mr. Google is your friend - there's a truckload of stuff under this search:

"open source data scrubbing tools"

HTH, Paul...

-- 


linehanp_at_tcd.ie

Mob: 00 353 86 864 5772
--
http://www.freelists.org/webpage/oracle-l
Received on Mon Sep 16 2013 - 15:15:20 CEST
