Re: Oracle Streams and Uniqueness

From: Tim Gorman <tim_at_evdbt.com>
Date: Mon, 10 Nov 2008 13:12:34 -0700
Message-ID: <491895B2.1000105@evdbt.com>




  


Niall,

I wasn't trying to be combative or condescending in pedantically citing a phrase like "relational database theory";  my sincere apologies if it came across that way.  Rather, I was trying to provide an argument to deflect the issue into the area it belongs.  I just don't believe that any replication package can be expected to operate without that basic assumption.

By way of explanation, I should have mentioned that the aforementioned project in which we evaluated Oracle AdvRep, Oracle Streams, Quest, and Golden Gate had several tables which literally had *no* combination of columns which could uniquely identify a row.  The developers of this application (who were in-house) argued that this was necessary, for reasons I forget.  However, I recall being positive that it was something they overlooked and tried to justify later.  But the upshot was that, when the application updated or deleted a single row on the local database, it was utterly impossible to deterministically propagate that update or deletion to a remote replicated Oracle database, regardless of which tool we tried to replicate with.  So, here is an example of the problem:
  • the local table had 3 rows where all of the columns had exactly the same data values
  • a program belonging to the application updated only one of those 3 identical rows
  • how could that update be propagated to the replicated table accurately without updating all three rows?
Yes, I suppose the answer could have been to add an additional qualifier of "AND ROWNUM <= n" to the propagated UPDATE/DELETE statement during replication, but should a replication product make an assumption like that?  The fact is, the application on the local database has a "physical ROWID" to use to uniquely identify a row, but the remote database does not.  This then leads to a discussion of whether or not to add a ROWID column to the replicated tables, but in the end we all know that would be silly, and would be far better served to simply create a new "logical ROWID" column composed of a sequence number anyway.  Which circles us back to the original assertion about violations of basic "relational database theory"...

Regardless, none of the four replication products mentioned above implemented (or were able to implement) the proposed "ROWNUM <= n" solution, so the question of using that solution was moot unless we wanted to write our own replication package.

So, I don't think the problem has to do with anything being wrong with the details or implementation of the mechanism of "sql apply".  It's more to do with the simple problem of unique row identification, propagating changes to specific rows, and where the responsibility for that capability resides.  All I was trying to assert is that it resides with the designers of the application, not with the replication mechanism.

On a side note, on another project, employing Oracle Advanced Replication to replicate the database of an application called "Broadvision" (for those who know/knew that product), there were dozens of tables lacking any declared PK/UK constraints.  Being a COTS 3rd-party application, we could not get information from the vendor about "implicit" PK/UK constraints on these tables, or any help whatsoever, because they claimed that the information was "proprietary" and "confidential".  We were *pretty* sure that none of these tables had the same situation as the application mentioned above, where even the combination of all columns together did not constitute a "logical ROWID", but could not get any assurance of that.  So, as with the situation described above, I wrote some SQL*Plus scripts to:  1) add a new sequence column, 2) create a FOR EACH ROW BEFORE INSERT trigger to populate the new column, 3) declare a UK on the new column, 4) rename the table and 5) create a simple view with the table's original name to hide the new column from the application code.  AdvRep was happy, Broadvision did not care, and all was well.

Hope this helps...
Tim Gorman
consultant - Evergreen Database Technologies, Inc.
P.O. Box 630791, Highlands Ranch CO  80163-0791
website   = http://www.EvDBT.com/
email     = Tim@EvDBT.com
mobile    = +1-303-885-4526
fax       = +1-303-484-3608
Yahoo IM  = tim_evdbt


Niall Litchfield wrote:
Tim,
 
Well I don't immediately see why a statement at the source such as
 
delete from t1 where c1 = '42';
 
absolutely requires each row in t1 to be unique in order for the delete to be successfully replicated to a remote target (this is the nature of our problem transactions). I would have hoped that in a solution that trumpets itself with the description "sql apply" this would turn into
 
delete from t1 where c1 = 42; 
 
at the remote side, rather than 300,000 executions of 
 
delete from t1 where some_unique_key = :1 .
 
:(
 
I guess all this really shows is that I don't know enough about the theory of database replication.  
 
As for blaming Oracle for assuming that database apps are designed in accordance with relational principles, I think I do somewhat. I'm pretty sure that by about 1999/2001 at the latest that assumption was widely known to be untrue.
 
Niall
 
 
On Mon, Nov 10, 2008 at 4:26 PM, Tim Gorman <tim@evdbt.com> wrote:
Niall,

It is not a restriction only of Oracle Streams, but of absolutely any replication solution, anywhere, at any time.  How can any UPDATE or DELETE be propagated with a way of uniquely identifying rows, regardless of technology, whether using Oracle or Sybase or DB2 or MySQL, or any mixture of these?

A few years ago, I was on a project where we had Quest and Golden Gate each claiming that their respective products, unlike Oracle Advanced Replication and Oracle Streams, imposed no such restriction.  Which is true -- they didn't actually impose the restriction.  The Oracle products simply imposed the restriction before it caused logical data corruption.

Supplemental redo logging enables replication by recording the "logical ROWID" data values (i.e. PK/UK/substitute-key) in the generated redo, along with the usual "physical ROWID" values, for each change.  For example, prior to the advent of Streams and supplemental redo logging, the Quest SharePlex product had to perform a "lookup" (after the redo was replicated) by "physical ROWID" on the source database in order to obtain the "logical ROWID" data values.  The same is almost certainly true of other log-based replication products on Oracle prior to Oracle9i as well.

Still, none of this implies that declarative PK or UK constraints have to be created in the source database.  The combination of columns that represent uniqueness can alternatively be recorded into the metadata of Oracle Streams or Oracle Advanced Replication;  declarative PK or UK constraints are simply a convenience.  Of course, if neither of these are found, then Streams/AdvRep is forced to assume that all columns in combination represent uniqueness, which is not always true either.  In which case, should the blame be placed on Streams/AdvRep for assuming that the application was designed according to well-established relational database theory?

Hope this helps...

-Tim





Niall Litchfield wrote:
All,
 
I am being told by a support analyst at Oracle re Streams that
Oracle must be able to identify uniquely and match corresponding rows at different
databases.
So each table in a Streams environment should have a primary key, unique key constraints or a substitute key.
This in addition to supplemental logging. Is this correct? If so, surely this implies that streams is only useful in environments where tables are guaranteed (by Oracle or the application(s) running against it) to be unique. This seems a somewhat significant real-world disadvantage to me, based on the applications that I see. I guess I sort of hoped that Oracle might have used the ROWID to generated the LCR (in a way that I fully admit I have spend no time at all thinking about).     
 
 
--
Niall Litchfield
Oracle DBA
 



--
Niall Litchfield
Oracle DBA
http://www.orawin.info
-- http://www.freelists.org/webpage/oracle-l Received on Mon Nov 10 2008 - 14:12:34 CST

Original text of this message