Re: Merging 2 "almost" identical databases.

From: Neomusashi <peter.de.winter_at_gmail.com>
Date: Mon, 29 Sep 2008 13:27:36 -0700 (PDT)
Message-ID: <f7ed53fb-2c0c-4f1e-b8aa-0b4fccc2a830@x41g2000hsb.googlegroups.com>


On 29 sep, 14:38, DA Morgan <damor..._at_psoug.org> wrote:
> Neomusashi wrote:
> > Hi,
> > So my work for the next few weeks is sorting out how to merge 2 almost
> > identical databases used by 2 almost identical applications into one
> > application with one database. These applications were developed for a
> > client.
> > Why 2 DB' s and 2 Applications? Well it’s a legacy decision. My
> > predecessors decided it should be that way and now it ' s up to me to
> > fix this. They needed small application/functional differences for
> > other distribution channels and made a clone of the original app. .
> > Now several years later they would like to merge the two apps into
> > one, but of course they grew apart in functionality and structure. No
> > big differences, but enough to make this a nice project ... .
> > Short description:
> > Each application runs on a sun solaris 10 server in a websphere 6.0
> > application server (J2EE apps of course) and an orcale 10 database is
> > used. Database queries are done using Stored Procedures. Both have 2
> > schemas. One schema has the core data in about 20 tables and the other
> > contains about 9 generic servers and about tables. The tables purposes
> > don’ t really differ from each other, but columns,  DDL and data load
> > do.
>
> > What I need to come up with is a merging process we can deliver to our
> > client so they can easily merge the two databases.  The process has
> > two deliver a clear report.
> > I was thinking of three possible ways to go:
> > 1. Shell scripts with logging:
> > pros: basic, quickest?
> > Cons: Debugging, testing, complex, report/logging not so fancy/clear,
> > inhouse expertise
> > 2. Write a mini J2EE app:
> > pros: easy to debug, easier to make?, better report/logging
> > Cons: very slow to develop, work to throw away.
> > 3. Write a java app:
> > pros: easy to debug, easy to make, better report/logging
> > cons: work to throw away, slow to develop
>
> > I’ m feeling most for the 3th solution. How would you approach this?
>
> I have a lot of questions from what you've written. Here they are in
> no particular order.
>
> 1. After merger will it be necessary for the data to have some column
> indicating its origin to keep track of its historical origin?
>
> 2. When you say "almost identical applications" are the changes to the
> application only or also to the schemas? If the schemas to column
> definitions such as names, data types, etc.?
>
> 3. Is it possible that primary key and other unique constraints may
> be violated during a merger?
>
> 4. Given that reality is that very few databases have clean data
> what are the chances of data in one or both of the schemas, at a
> customer site, need cleansing?
>
> 5. Is there any chance that a customer has, on their own, made any
> changes to the definitions in any manner such as adding indexes or
> tables to support ad hoc reporting or internally developed functionality?
> --
> Daniel A. Morgan
> Oracle Ace Director & Instructor
> University of Washington
> damor..._at_x.washington.edu (replace x with u to respond)
> Puget Sound Oracle Users Groupwww.psoug.org

Daniel,

First let me tell you that I' m only working on the application for about 3 weeks. So I got a lot of catching up to do.

  1. We have been thinking about adding some sort of flag indicator or a table containing a flag and foreign key link. We suspect the data on itself won' t be enough to tell wether the object' s are linked to one or the other application.
  2. We are currently using Toad (www.toadsoft.com) to compare the schemas. Both applications use the same schema structures. Tables and columns show only minor differences. For what I' ve seen so far some tables from application A (lets give it a name) contain extra fields used by a component connected to a back office application/component. This component is also present in the second application, but it isn' t "used". So merging these tables will imply that the merged table will probably be a reflection of the table in application A with some empty columns for the records of application B. Datatypes, for so far I' ve seen, only show minor differences. 9/10 cases it' s a size issue.
  3. We are looking in to that. Again we hope Toad will give us a better view on that. For so far we saw, no constraint issues were found.
  4. Indeed some cleansing will be needed. For example the tables containg user data. Our client gave one employee several application users in application A and only 1 in application B. Reason was business related. A user in application B is coupled to a working area and one employee could be responsible for several working area' s. In application B a working area isn' t used, so one user per employee. Our client used this "trick" in appliaction A so they wouldn' t need to provide resources for an application change... :) .
  5. No. The application are maintained by us.

I think this is interesting case study :) . I' m looking forward to how we are going to solve this. Received on Mon Sep 29 2008 - 15:27:36 CDT

Original text of this message