Re: Oracle Streams and Uniqueness
From: Tim Gorman <tim_at_evdbt.com>
Date: Mon, 10 Nov 2008 13:12:34 -0700
Message-ID: <491895B2.1000105@evdbt.com>
I wasn't trying to be combative or condescending in pedantically citing a phrase like "relational database theory"; my sincere apologies if it came across that way. Rather, I was trying to provide an argument to deflect the issue into the area it belongs. I just don't believe that any replication package can be expected to operate without that basic assumption.
By way of explanation, I should have mentioned that the aforementioned project in which we evaluated Oracle AdvRep, Oracle Streams, Quest, and Golden Gate had several tables which literally had *no* combination of columns which could uniquely identify a row. The developers of this application (who were in-house) argued that this was necessary, for reasons I forget. However, I recall being positive that it was something they overlooked and tried to justify later. But the upshot was that, when the application updated or deleted a single row on the local database, it was utterly impossible to deterministically propagate that update or deletion to a remote replicated Oracle database, regardless of which tool we tried to replicate with. So, here is an example of the problem:
Regardless, none of the four replication products mentioned above implemented (or were able to implement) the proposed "ROWNUM <= n" solution, so the question of using that solution was moot unless we wanted to write our own replication package.
So, I don't think the problem has to do with anything being wrong with the details or implementation of the mechanism of "sql apply". It's more to do with the simple problem of unique row identification, propagating changes to specific rows, and where the responsibility for that capability resides. All I was trying to assert is that it resides with the designers of the application, not with the replication mechanism.
On a side note, on another project, employing Oracle Advanced Replication to replicate the database of an application called "Broadvision" (for those who know/knew that product), there were dozens of tables lacking any declared PK/UK constraints. Being a COTS 3rd-party application, we could not get information from the vendor about "implicit" PK/UK constraints on these tables, or any help whatsoever, because they claimed that the information was "proprietary" and "confidential". We were *pretty* sure that none of these tables had the same situation as the application mentioned above, where even the combination of all columns together did not constitute a "logical ROWID", but could not get any assurance of that. So, as with the situation described above, I wrote some SQL*Plus scripts to: 1) add a new sequence column, 2) create a FOR EACH ROW BEFORE INSERT trigger to populate the new column, 3) declare a UK on the new column, 4) rename the table and 5) create a simple view with the table's original name to hide the new column from the application code. AdvRep was happy, Broadvision did not care, and all was well.
Hope this helps...
Niall Litchfield wrote:
Date: Mon, 10 Nov 2008 13:12:34 -0700
Message-ID: <491895B2.1000105@evdbt.com>
Niall,
I wasn't trying to be combative or condescending in pedantically citing a phrase like "relational database theory"; my sincere apologies if it came across that way. Rather, I was trying to provide an argument to deflect the issue into the area it belongs. I just don't believe that any replication package can be expected to operate without that basic assumption.
By way of explanation, I should have mentioned that the aforementioned project in which we evaluated Oracle AdvRep, Oracle Streams, Quest, and Golden Gate had several tables which literally had *no* combination of columns which could uniquely identify a row. The developers of this application (who were in-house) argued that this was necessary, for reasons I forget. However, I recall being positive that it was something they overlooked and tried to justify later. But the upshot was that, when the application updated or deleted a single row on the local database, it was utterly impossible to deterministically propagate that update or deletion to a remote replicated Oracle database, regardless of which tool we tried to replicate with. So, here is an example of the problem:
- the local table had 3 rows where all of the columns had exactly the same data values
- a program belonging to the application updated only one of those 3 identical rows
- how could that update be propagated to the replicated table accurately without updating all three rows?
Regardless, none of the four replication products mentioned above implemented (or were able to implement) the proposed "ROWNUM <= n" solution, so the question of using that solution was moot unless we wanted to write our own replication package.
So, I don't think the problem has to do with anything being wrong with the details or implementation of the mechanism of "sql apply". It's more to do with the simple problem of unique row identification, propagating changes to specific rows, and where the responsibility for that capability resides. All I was trying to assert is that it resides with the designers of the application, not with the replication mechanism.
On a side note, on another project, employing Oracle Advanced Replication to replicate the database of an application called "Broadvision" (for those who know/knew that product), there were dozens of tables lacking any declared PK/UK constraints. Being a COTS 3rd-party application, we could not get information from the vendor about "implicit" PK/UK constraints on these tables, or any help whatsoever, because they claimed that the information was "proprietary" and "confidential". We were *pretty* sure that none of these tables had the same situation as the application mentioned above, where even the combination of all columns together did not constitute a "logical ROWID", but could not get any assurance of that. So, as with the situation described above, I wrote some SQL*Plus scripts to: 1) add a new sequence column, 2) create a FOR EACH ROW BEFORE INSERT trigger to populate the new column, 3) declare a UK on the new column, 4) rename the table and 5) create a simple view with the table's original name to hide the new column from the application code. AdvRep was happy, Broadvision did not care, and all was well.
Hope this helps...
Tim Gorman consultant - Evergreen Database Technologies, Inc. P.O. Box 630791, Highlands Ranch CO 80163-0791 website = http://www.EvDBT.com/ email = Tim@EvDBT.com mobile = +1-303-885-4526 fax = +1-303-484-3608 Yahoo IM = tim_evdbt
Niall Litchfield wrote:
-- http://www.freelists.org/webpage/oracle-l Received on Mon Nov 10 2008 - 14:12:34 CSTTim,Well I don't immediately see why a statement at the source such asdelete from t1 where c1 = '42';absolutely requires each row in t1 to be unique in order for the delete to be successfully replicated to a remote target (this is the nature of our problem transactions). I would have hoped that in a solution that trumpets itself with the description "sql apply" this would turn intodelete from t1 where c1 = 42;at the remote side, rather than 300,000 executions ofdelete from t1 where some_unique_key = :1 .:(I guess all this really shows is that I don't know enough about the theory of database replication.As for blaming Oracle for assuming that database apps are designed in accordance with relational principles, I think I do somewhat. I'm pretty sure that by about 1999/2001 at the latest that assumption was widely known to be untrue.NiallOn Mon, Nov 10, 2008 at 4:26 PM, Tim Gorman <tim@evdbt.com> wrote:
Niall,
It is not a restriction only of Oracle Streams, but of absolutely any replication solution, anywhere, at any time. How can any UPDATE or DELETE be propagated with a way of uniquely identifying rows, regardless of technology, whether using Oracle or Sybase or DB2 or MySQL, or any mixture of these?
A few years ago, I was on a project where we had Quest and Golden Gate each claiming that their respective products, unlike Oracle Advanced Replication and Oracle Streams, imposed no such restriction. Which is true -- they didn't actually impose the restriction. The Oracle products simply imposed the restriction before it caused logical data corruption.
Supplemental redo logging enables replication by recording the "logical ROWID" data values (i.e. PK/UK/substitute-key) in the generated redo, along with the usual "physical ROWID" values, for each change. For example, prior to the advent of Streams and supplemental redo logging, the Quest SharePlex product had to perform a "lookup" (after the redo was replicated) by "physical ROWID" on the source database in order to obtain the "logical ROWID" data values. The same is almost certainly true of other log-based replication products on Oracle prior to Oracle9i as well.
Still, none of this implies that declarative PK or UK constraints have to be created in the source database. The combination of columns that represent uniqueness can alternatively be recorded into the metadata of Oracle Streams or Oracle Advanced Replication; declarative PK or UK constraints are simply a convenience. Of course, if neither of these are found, then Streams/AdvRep is forced to assume that all columns in combination represent uniqueness, which is not always true either. In which case, should the blame be placed on Streams/AdvRep for assuming that the application was designed according to well-established relational database theory?
Hope this helps...
-Tim
Niall Litchfield wrote:All,I am being told by a support analyst at Oracle re Streams thatOracle must be able to identify uniquely and match corresponding rows at different
databases.
So each table in a Streams environment should have a primary key, unique key constraints or a substitute key.This in addition to supplemental logging. Is this correct? If so, surely this implies that streams is only useful in environments where tables are guaranteed (by Oracle or the application(s) running against it) to be unique. This seems a somewhat significant real-world disadvantage to me, based on the applications that I see. I guess I sort of hoped that Oracle might have used the ROWID to generated the LCR (in a way that I fully admit I have spend no time at all thinking about).--Niall LitchfieldOracle DBA
--
Niall Litchfield
Oracle DBA
http://www.orawin.info