Re: San-Based replication VS DataGuard replication

From: Madison Pruet <>
Date: Fri, 03 Oct 2008 20:41:47 GMT
Message-ID: <>

joel garry wrote:
> On Oct 3, 7:24 am, Madison Pruet <> wrote:

>> DA Morgan wrote:
>>> macdba321 wrote:
>>>> Group,
>>>>  I have a database at Site1 stored on a SAN, and a disaster-recovery
>>>> site2 with identical hardware. They are connected by high-speed fiber.
>>>> (Both SANs are enterprise-class with full journaling capabilities in
>>>> case the connection were ever severed.)
>>> 4. Data Guard, interestingly enough, is more efficient. What is being
>>>    replicated is the transactions themselves not operating system
>>>    blocks so are shipping less data.
>> This does not make sense.  SAN based replication is done only when a
>> physical write occurs.  Since DG is pushing the logs to the secondary to
>> achieve replication, it is replicating for any change in the page.
>> Unless Oracle is flushing every page to disk as it is updated, then the
>> impact to performance for a SAN based solution should be much more
>> efficient than pushing the logs to the secondary.

> You got it wrong. DG is only replicating the logs, while the db
> writer can do its thing at any time later (even never, in some cases -
> while several things can, and do, signal the writer to write, there
> are cases where Oracle doesn't even bother, google delayed block
> cleanout).

Yet with DG, those same things would have to be created on the secondary, wouldn't they? And with SAN replication, those would never be replicated since they were never physically written to disk, were they?

> The SAN would have to replicate the logs _and_ the db
> writer writes as they happen, that's way more to do over the critical
> network resource.

If that is the critical resource....

> The logs are then applied on the other end in a
> continuous recovery. Yes, there is a trade-off between network and
> local bandwidth. Which is cheaper? How do you define efficient?
> Doing less stuff in the critical path usually leads to better
> performance.
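
To put rough numbers on the network side of this trade-off, here is a minimal sketch in Python. It assumes the redo logs sit on the replicated SAN volumes, so the SAN mirrors both the log writes and the datafile block writes, while log shipping sends only the redo. The function name, the 200-byte redo record and the 8 KB block are illustrative assumptions, not measurements, and the model ignores protocol overhead and the rounding of redo writes up to whole log blocks.

```python
def bytes_shipped(updates, redo_bytes_per_update, block_bytes, block_flushes):
    """Return (log_shipping_bytes, san_bytes) for changes made to one block.

    Hypothetical model: log shipping sends only redo; SAN replication
    mirrors the redo log writes plus every physical write of the block.
    """
    redo = updates * redo_bytes_per_update
    log_shipping = redo
    san = redo + block_flushes * block_bytes
    return log_shipping, san


# 100 small updates to one block, flushed to disk once by the writer.
print(bytes_shipped(100, 200, 8192, 1))  # -> (20000, 28192)
```

Under these assumptions the SAN always ships at least as many bytes as log shipping, and the gap grows with the number of physical block writes. Whether the network is the critical resource is a separate question.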

>> Also consider the case with hot pages, such as index pages.  DG will be
>> forced to send each update to the page to the secondaries while SAN
>> based replication will only replicate the page as it is flushed to disk.

> No, read the Oracle Concepts manual, available online at
> DG doesn't know jack about hot pages, doesn't
> care. The redo logs are the secret. The Achilles' heel, for that
> matter. You need to understand recovery to understand how this works
> with DG.

You miss my point about hot pages.

Yes, the updates are sent via the logs. But if the primary has a hot page, the primary will eventually flush that page, though not until there have been many updates to it. With SAN replication the network cost might increase by one additional page, but on the secondary there would be no additional activity beyond updating a page on disk. With DG, however, you'd be loading that page into memory, updating it from the logs, and then eventually writing it back to disk. All of this creates additional work on the secondary copy, which can create backflow conditions. Not only that, the same process would potentially have to be performed several times, depending on the luck of the draw.
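
A rough way to see the secondary-side cost I mean is the toy model below, in Python. Everything in it is an assumption made for illustration: the function name, the idea that the primary flushes the hot page once every `primary_flush_every` updates, and the idea that the secondary writes out and evicts the page once every `standby_evict_every` applied updates. It is not a description of how Oracle's recovery process actually manages its cache.

```python
import math


def simulate_hot_page(updates, primary_flush_every, standby_evict_every):
    """Count secondary-side I/O for one hot page under two schemes (toy model)."""
    # SAN replication: the secondary sees one page write per primary flush
    # (including a final flush of any remaining dirty state).
    san_writes = math.ceil(updates / primary_flush_every)

    # Log apply: the secondary must have the page in memory to apply redo.
    reads = writes = applies = 0
    cached = dirty = False
    for i in range(1, updates + 1):
        if not cached:          # page was evicted (or never loaded): read it
            reads += 1
            cached = True
        applies += 1            # apply one redo record to the in-memory page
        dirty = True
        if i % standby_evict_every == 0:
            writes += 1         # write the dirty page back and evict it
            cached = dirty = False
    if dirty:
        writes += 1             # final write-back of the still-dirty page

    return {
        "san": {"reads": 0, "writes": san_writes},
        "log_apply": {"reads": reads, "writes": writes, "applies": applies},
    }


print(simulate_hot_page(1000, 500, 100))
```

With 1000 updates, a primary flush every 500 updates and a secondary eviction every 100, the model gives 2 page writes for the SAN case against 10 reads, 10 writes and 1000 applies for log apply. If the secondary keeps the page resident (a large `standby_evict_every`), the extra reads and writes mostly disappear, which is the "luck of the draw" part.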


>> The only logical way that DG could be more efficient would be if the
>> Oracle database flushes every dirty page to disk as it is updated. I can
>> see the logs being flushed immediately, but the data and index pages????
>> Is that the case?

> You REALLY need to learn the architecture. The database writer
> flushes dirty blocks to disk at its leisure. It's the redo buffers
> that are critical for being flushed to the logs. Since the data being
> changed can be a lot less than a block, that's a lot less data to deal
> with.

Like I said - by using hardware solutions, the hot page on the secondary would be updated by the same trickle process as on the primary. Yet with DG, the page would be updated multiple times, which may or may not require additional IO on the secondary, which may or may not impact the delivery of logs to the secondary, which may or may not impact the availability of the secondary.
> jg
> --
> is bogus.
Received on Fri Oct 03 2008 - 15:41:47 CDT
