Re: Merging 2 "almost" identical databases.
From: DA Morgan <damorgan_at_psoug.org>
Date: Tue, 30 Sep 2008 07:52:11 -0700
Message-ID: <1222786327.983030@bubbleator.drizzle.com>
>
> Daniel,
>
> First let me tell you that I' m only working on the application for
> about 3 weeks. So I got a lot of catching up to do.
>
> 1. We have been thinking about adding some sort of flag indicator or a
> table containing a flag and foreign key link. We suspect the data on
> itself won' t be enough to tell wether the object' s are linked to one
> or the other application.
>
> 2. We are currently using Toad (www.toadsoft.com) to compare the
> schemas. Both applications use the same schema structures. Tables and
> columns show only minor differences. For what I' ve seen so far some
> tables from application A (lets give it a name) contain extra fields
> used by a component connected to a back office application/component.
> This component is also present in the second application, but it isn'
> t "used". So merging these tables will imply that the merged table
> will probably be a reflection of the table in application A with some
> empty columns for the records of application B.
> Datatypes, for so far I' ve seen, only show minor differences. 9/10
> cases it' s a size issue.
>
> 3. We are looking in to that. Again we hope Toad will give us a better
> view on that. For so far we saw, no constraint issues were found.
>
> 4. Indeed some cleansing will be needed. For example the tables
> containg user data. Our client gave one employee several application
> users in application A and only 1 in application B. Reason was
> business related. A user in application B is coupled to a working area
> and one employee could be responsible for several working area' s. In
> application B a working area isn' t used, so one user per employee.
> Our client used this "trick" in appliaction A so they wouldn' t need
> to provide resources for an application change... :) .
>
> 5. No. The application are maintained by us.
>
> I think this is interesting case study :) . I' m looking forward to
> how we are going to solve this.
Date: Tue, 30 Sep 2008 07:52:11 -0700
Message-ID: <1222786327.983030@bubbleator.drizzle.com>
Neomusashi wrote:
> On 29 sep, 14:38, DA Morgan <damor..._at_psoug.org> wrote:
>> Neomusashi wrote: >>> Hi, >>> So my work for the next few weeks is sorting out how to merge 2 almost >>> identical databases used by 2 almost identical applications into one >>> application with one database. These applications were developed for a >>> client. >>> Why 2 DB' s and 2 Applications? Well it’s a legacy decision. My >>> predecessors decided it should be that way and now it ' s up to me to >>> fix this. They needed small application/functional differences for >>> other distribution channels and made a clone of the original app. . >>> Now several years later they would like to merge the two apps into >>> one, but of course they grew apart in functionality and structure. No >>> big differences, but enough to make this a nice project ... . >>> Short description: >>> Each application runs on a sun solaris 10 server in a websphere 6.0 >>> application server (J2EE apps of course) and an orcale 10 database is >>> used. Database queries are done using Stored Procedures. Both have 2 >>> schemas. One schema has the core data in about 20 tables and the other >>> contains about 9 generic servers and about tables. The tables purposes >>> don’ t really differ from each other, but columns, DDL and data load >>> do. >>> What I need to come up with is a merging process we can deliver to our >>> client so they can easily merge the two databases. The process has >>> two deliver a clear report. >>> I was thinking of three possible ways to go: >>> 1. Shell scripts with logging: >>> pros: basic, quickest? >>> Cons: Debugging, testing, complex, report/logging not so fancy/clear, >>> inhouse expertise >>> 2. Write a mini J2EE app: >>> pros: easy to debug, easier to make?, better report/logging >>> Cons: very slow to develop, work to throw away. >>> 3. Write a java app: >>> pros: easy to debug, easy to make, better report/logging >>> cons: work to throw away, slow to develop >>> I’ m feeling most for the 3th solution. How would you approach this? >> I have a lot of questions from what you've written. Here they are in >> no particular order. >> >> 1. After merger will it be necessary for the data to have some column >> indicating its origin to keep track of its historical origin? >> >> 2. When you say "almost identical applications" are the changes to the >> application only or also to the schemas? If the schemas to column >> definitions such as names, data types, etc.? >> >> 3. Is it possible that primary key and other unique constraints may >> be violated during a merger? >> >> 4. Given that reality is that very few databases have clean data >> what are the chances of data in one or both of the schemas, at a >> customer site, need cleansing? >> >> 5. Is there any chance that a customer has, on their own, made any >> changes to the definitions in any manner such as adding indexes or >> tables to support ad hoc reporting or internally developed functionality? >> -- >> Daniel A. Morgan >> Oracle Ace Director & Instructor >> University of Washington >> damor..._at_x.washington.edu (replace x with u to respond) >> Puget Sound Oracle Users Groupwww.psoug.org
>
> Daniel,
>
> First let me tell you that I' m only working on the application for
> about 3 weeks. So I got a lot of catching up to do.
>
> 1. We have been thinking about adding some sort of flag indicator or a
> table containing a flag and foreign key link. We suspect the data on
> itself won' t be enough to tell wether the object' s are linked to one
> or the other application.
>
> 2. We are currently using Toad (www.toadsoft.com) to compare the
> schemas. Both applications use the same schema structures. Tables and
> columns show only minor differences. For what I' ve seen so far some
> tables from application A (lets give it a name) contain extra fields
> used by a component connected to a back office application/component.
> This component is also present in the second application, but it isn'
> t "used". So merging these tables will imply that the merged table
> will probably be a reflection of the table in application A with some
> empty columns for the records of application B.
> Datatypes, for so far I' ve seen, only show minor differences. 9/10
> cases it' s a size issue.
>
> 3. We are looking in to that. Again we hope Toad will give us a better
> view on that. For so far we saw, no constraint issues were found.
>
> 4. Indeed some cleansing will be needed. For example the tables
> containg user data. Our client gave one employee several application
> users in application A and only 1 in application B. Reason was
> business related. A user in application B is coupled to a working area
> and one employee could be responsible for several working area' s. In
> application B a working area isn' t used, so one user per employee.
> Our client used this "trick" in appliaction A so they wouldn' t need
> to provide resources for an application change... :) .
>
> 5. No. The application are maintained by us.
>
> I think this is interesting case study :) . I' m looking forward to
> how we are going to solve this.
Based on your answers it look straight forward.
Keep the tables from Application A modifying column definitions, only where necessary, to reflect the larger length. Add appropriate constraints so that bad data will fail, use DBMS_ERRLOG to create a table to hold constraint violations and then use the technique used here: http://www.psoug.org/reference/dbms_errlog.html.
Always run through first in a test environment before doing it in production and always have a fresh reliable backup, or flashback logs, before beginning.
-- Daniel A. Morgan Oracle Ace Director & Instructor University of Washington damorgan_at_x.washington.edu (replace x with u to respond) Puget Sound Oracle Users Group www.psoug.orgReceived on Tue Sep 30 2008 - 09:52:11 CDT