Re: Comparing rows in 2 tables

From: Paul Brewer <paul_at_paul.brewers.org.uk>
Date: Tue, 17 Dec 2002 21:27:00 -0000
Message-ID: <3e00cc96_1@mk-nntp-1.news.uk.worldonline.com>

"Jake" <Im_at_nottelling.com> wrote in message news:atna7c$c2j$1_at_bob.news.rcn.net...
> Thanks Sybrand,
>
> I don't know how to give the /*+ driving_site */ hint (or which site
should
> be driving). But I think you want me to try something like:
>
> connect scott/tiger_at_remotesite
>
> update tablea local
> set (data) =
> (select data from tablea_at_reverselink remote1
> where local.id = remote.id)
> where id in
> (select id
> from tablea local2,
> tablea_at_reverselink remote2
> where local2.id = remote2.id
> and nvl(local2.data,'1') != nvl(remote2.data,'1')
> )
>
> right?
>
> Thanks
> Jake
>
> "Sybrand Bakker" <postbus_at_sybrandb.demon.nl> wrote in message
> news:a20d28ee.0212170031.1de8d939_at_posting.google.com...
> > "Jake" <Im_at_nottelling.com> wrote in message
> news:<atm3qm$t8a$1_at_bob.news.rcn.net>...
> > > Hi, I'm trying to merge data from 2 tables. One table exists locally,
> > > and the other lives across the network on another box. Each table has
> ~1M
> > > rows. I
> > > want to update the remote table with the values from the local table
> > > where the key is the same but the data is different.
> > >
> > > My example has been (greatly) simplified. the table is created on
both
> > > database like:
> > >
> > > create table TABLEA
> > > (
> > > ID Number not null,
> > > DATA Varchar2(200) not null,
> > > constraint PK_TABLEA primary key (ID)
> > > )
> > > /
> > >
> > > when I do:
> > > select count(*)
> > > from tablea local,
> > > tablea_at_dblink remote
> > > where local.id = remote.id
> > > and nvl(local.data,'1') != nvl(remote.data,'1')
> > >
> > > I get my result back pretty quickly, ~2 mins (with a count of 212
> > > different rows).
> > > But when I update using a correlated subquery it takes 30ish mins.
> > >
> > > update tablea_at_dblink remote
> > > set (data) =
> > > (select data from tablea local
> > > where local.id = remote.id)
> > > where exists
> > > (select 1 from
> > > tablea local2
> > > where local2.id = remote.id
> > > and nvl(local.data,'1') != nvl(remote.data,'1')
> > > )
> > > /
> > >
> > > I think there has got to be a way to speed it up. I want to drive the
> > > update off the small set of different rows. I can do it
> programmatically
> > > but I'd like a SQL solution.
> > >
> > > Thanks
> > > Jay
> >
> > I would try to reverse the process: ie update locally and select
> > remotely, so connect to the remote database and use a reverse link.
> > In that case you don't have a distributed transaction (so no need for
> > 2 phase commit) and you can use the /*+ driving_site */ hint in your
> > subqueries.
> > Given the nature of your statement in that case you would probably do
> > better to rewrite the exists subquery to an in subquery and make sure
> > that this is executed once.
> >
> > Hth
> >
> > Sybrand Bakker
> > Senior Oracle DBA
>
>

Jake,

I agree with Sybrand; I'd do it as a 'pull' rather than a 'push'.

But two questions:

Have you tried minus rather than in or exists?
Did you consider Oracle replication?

Regards,
Paul Received on Tue Dec 17 2002 - 15:27:00 CST