Re: Data conversion between various schemas

From: Joe Novella <jnovella_at_voicenet.com>
Date: Tue, 16 Jul 2002 12:34:59 GMT
Message-ID: <3D3412FA.DA9ABDFB_at_voicenet.com>


> Very briefly, my thoughts here are.
>
> 1. Get each source system into 1NF
>
> 2. Use Relational Algebra (or SQL if that is all you have to hand..) to
> normalise and constrain each source individually
> 2a. Add temporal attributes to you source models, then normalise and
> constrain again. This will let you store history in the DW.
>
> 3. For each normalised, temporal source model, use Views (i.e. Relational Algebra again) to:
> 3a. Join to multiple 'lookup' tables to map source domains to common
> (target) domains
> 3b. Rename source attributes to common target names
>
> 4. Select a common subset of attributes from each source model and UNION
> them all together
> 4a. Use aggregation over the source models if the data cannot be
> consolidated at the detail level
>
> That's the basic outline. Life gets more complex when you get different
> sources stating contradictory facts.
> Note that there is no information loss as the data from each source is
> stored in individual normalised source models.

There are newer data profiling products which not only integrate multiple databases into a consolidated normalized model, they do so based on analyzing the data below the metadata. Through inferring dependencies and redundancies across different sources, the analyst can model the source data or map to a pre-defined fixed target.

From there, you modify the model and add structures for new requirements or denormalize for performance.

The idea is to identify the actual content, structure and quality of the data before migration. Of course, the data profiling products won't qualify or prioritize the problems or concerns, they do quantify them.

There are also numerous papers on inferring dependencies. Here's a query against citeseer.com.

http://citeseer.nj.nec.com/cs?q="functional+dependencies"&cs=1

Hope this helps.

--
Joseph Novella
Anthem Consulting, LLC
"Good ideas are not adopted automatically.  They must be driven into practice with courageous
patience." - Admiral Hyman Rickover
Received on Tue Jul 16 2002 - 14:34:59 CEST

Original text of this message