Re: Distributed Database Issues

From: Contractor - Yuk Hon <jychan_at_corp.hp.com>
Date: 1995/08/21
Message-ID: <41b6bf$j49_at_hpcc48.corp.hp.com>


Haakon T. Soenderland (hts_at_sasdata.no) wrote:
: Hi all,
 

: I'm sitting here looking at the white-paper on Distributed Database
: Technology and Symmetric Replication from the Oracle WWW-site
: (http://www.oracle.com/info/products/entserver72/ddtsr.html).
 

: I'm wondering about some of the statements Oracle makes in this paper
: and would be interested in your comments on them.
 

: In the section on "Synchronous Distributed Technology" they say:
 

: "Synchronous technology ensures application integrity and minimizes
: complexity, but can have poorer response time performance and less
: availability if the systems and networks involved are unreliable
: and slow."
 

: and in the section on "Asynchronous Distributed Technology" they say:
 

: "Asynchronous technology maximizes availability and response time
: performance, but can be more complex and requires careful planning
: and design to enure application integrity."
 

: I'm reading (or interpreting) this (and the rest of the document)
: to mean that Oracle thinks that using PL/SQL, triggers and procedures
: is easier to set up and manage than snapshots.
 

: I would think that setting up a synchronous distribution environment
: would be just as complex as setting up an asynchronous one.

According to Oracle's definition, a "Synchronous Distributed Tech" distributed system is a system where the tables you need to access are physically located in different databases on two or more nodes. So for example, an HR application might have a main database which hold employee info but accesses a SALERY table that is physically located in another database (name it ACCT) located on another node, possible not hetergeneous and not very nearby. This access is accomplished through a database link. To make the access completely transparent, a synonym could be created which points to that SALERY table with that database link. So for example. we would create a link and synonym as:

create database link acct connect to ap identified by ap using 'ACCT'; create synonym SALERY for SALERY_at_acct;

Once this is done, the application code need only to refer to that remote table as SALERY. It doesn't need to know if its remotely located or even where it may be located...all that has been administratively handled in the setup. The application can do all the desired DML he would normally do against the table as if it were local. The Oracle DB removes that complexity from the user (as so the theory goes :)

Now, using the above example, a "Asynchronous Distributed Tech" distributed system is one where the SALERY table would by replicated from the ACCT to the main database using Symmetric Replication. In this senario, there is a SALERY table on ACCT which is copied to a new table on the Main DB that is also called SALERY. Oracle provides the procedures in the "replication option" of the db to allow the DBA to create and synchronize these tables so that updates on either one would be propogated to the other. Thus, in this model, the SALERY table physically resides on both databases. However, DML operations can be performed and (depending on some caveats below) those operations will be made immediately effective on the local copy and eventually be propagated to the remote copy. (This is essentially what is meant by a "deferred transaction")

Oracle states the synchronous distributed model maintains integrity because there's only one SALERY table and a distributed transaction which updated both local tables and the remote SALERY table will either succeed or fail (guaranteed according to Oracle using two-phase commit). Oracle also states its simpler because all you need to do is set up the link and the synonym. However, as Oracle also notes, it's not a high availability model, since if the ACCT database was down or not accessible for whatever reason, any transactions which would select from or update that table would fail. If these are mission-critical transactions or systems, that would not be acceptable. Moreover, your response time would be very dependent on the network bandwidth that connects the two nodes. If they were separated on a WAN environment, queries could take quite a lot longer to run since it has to send all this data across from the remote node to the local main db node.

Oracle states the asynchronous distributed model yields higher availability and response time because when you access the SALERY table, you are actually accessing the table that is physically residing in the local main db. No network penalty during the access and no dependence on whether the acct node is up or not. However, setting this up correctly could be pretty complex, since you now have two SALERY tables both of which are updatable and whose updates and changes are propagated to the other node. Some process of resolving conflicting updates (For example, someone increase Joe's salary in SALERY table from 500 to 100 on the main db while someone else on ACCT decreased it from 500 to 250 *before* the first update gets propagated over from the main db to ACCT. so which update is correct?) must be defined and setup. Otherwise, integrity problems are bound to pop up. Also, some infrastructure defining how often changes et propagated from the local copy to thee remote copy is needed, since changes will be held and deferred until it is told to send those deferred transactions over.

: More important, I would think that maintenance and expansion of such
: an environment would be easier using asynchronous replication (snapshots)
: than using synchronous distribution (i.e PL/SQL, triggers and procedures).
: One should think that adding one more site with synchronous
: distribution would involve writing or re-writing lots of
: PL/SQL, while adding another snapshot site would be easier (since its
: deklarative).
 

: Having little experience with this technology, I might be dead wrong.
: However, it seems to me that if one is using replication, it would
: be easier to set up and manage snapshots then writing the replication
: code yourself. I'm most concerned about management and maintenance
: of such an environment. Setting it up migth be difficult, but it is
: an onetime job, whilst one could imagine such a system expanding and
: beeing in operation for years. Thats when it needs to be managable and
: maintainable.
 

: What are your thoughts?
 

: I'm also wondering if anyone can tell me about the underlying technology
: for the "Multiple Master" replication they are talking about. They are
: saying that it uses "Deferred RPC" as the undelying technology, but does
: that mean that you'll have to write these RPC's yourself? Or is there
: a layer "over" this that you use (as with snapshots)?

Deferred RPC's is basically a fancy name for all the code Oracle added to allow Symmetric Replication to work properly. It includes procedures that a DBA would use to set up a replicated environment and propagate those eferred transactions over to the multiple masters.

Johnny Chan
Independent Oracle Specialist Received on Mon Aug 21 1995 - 00:00:00 CEST

Original text of this message