Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Replication - how hard is it really? (Updatable snapshots..)

Re: Replication - how hard is it really? (Updatable snapshots..)

From: Roger Hunwicks <mail.tsl_at_netcomuk.co.uk>
Date: Mon, 21 Jun 1999 20:08:47 +0100
Message-ID: <MPG.11d89c711f618da998968a@nntp.netcomuk.co.uk>


[This followup was posted to comp.databases.oracle.server and a copy was sent to the cited author.]

Pete

Your reply to my post is appended below.

Roger



Roger

Comments inline.

Pete

Roger Hunwicks wrote:

> [This followup was posted to comp.databases.oracle.server and a copy was
> sent to the cited author.]
>
> In article <37683142.6E3D5FFB_at_us.oracle.com>, psharman_at_us.oracle.com
> says...
> > I feel like running a class for clients where all they do is come in and say
> > "I want to do this (xyz). How can it be done in Oracle?". I reckon I'd
> > make a mint!
>
> If you run the course in London let me know how I book it! It would be
> be worth the mint - is that Tic Tac or Polo?

Actually, the Mint in Australia (where I come from, though I'm now based in the US) is where the Australian currency gets produced, but I'll settle for a Tic Tac!

> > > We need to run a database 24*7. We therefore need to replicate all
> > > updates to a hot standby server.
> >
> > Not necessarily. You may have already been down this road, but in case you
> > haven't, you should look at whether there is a real business need for 24x7.
> > There are obvious examples of this. For example, if eTrade used an Oracle
> > database for their data (and I must say I'm using them as an example without
> > having the faintest clue if they do or not!), downtime for them can cost
> > lots of dollars, and can in fact mean they go out of business. On the other
> > hand, I've seen sites where they think they need 24x7, then a disk crash
> > that takes out a day's work is shrugged off as something they can live
> > with! However, let's say for the rest of the discussion here that you do
> > need to run 24x7 ...
>
> We run a 24Hr Command & Control centre. It is deemed sufficiently
> important for us to have two sites, five miles apart, with replicated
> databases and disaster-recovery PCs ready to go. We are replacing the
> existing application (which doesn't use an Oracle database) and are
> required to provide equivalent functionality.

OK, looks like you've been into this reasonably thoroughly, and you do have a real need for 24x7. It's always worth checking before you start down the road of setting it up though.

> > Site failure can be protected against in a variety of ways too. The
> > simplest is off site backups, with associated data loss of anything since
> > the last available backup.
>
> We need to aim for less than 5 minutes downtime and loss of the current
> transaction only, in all but very exceptional circumstances.

Off site backups are obviously not sufficient then.

> > Alternatively, you could use standby database or
> > replication. Standby databases can be started in read-only mode with 8i,
> > but not before that.
>
> Forgive me if this is a stupid question, but I'm new to Oracle... Is a
> 'standby database' one that mirrors the live database (for example by
> applying redo logs). If it is what is the granularity (e.g. individual
> transactions are copied, or something else - our current database uses
> disk blocks, sort of). 8i wouldn't be a problem, provided it runs on
> VMS.
Standby database in Oracle terms means something very specific. It's a copy of the database set up remotely from the primary site. The standby database is mounted but not open, and permanently in recovery mode. As redo logs are archived on the primary site, they are shipped to the remote site as well, and applied to the standby. As a result, the level of granularity is the redo log. If the primary site dies, you will lose the information in the current redo log, unless you use disk geomirroring. In releases prior to 8i, opening the standby database made it the primary, and you then had to create another standby elsewhere. In 8i, you can open the standby database in read-only mode, run queries against it (but not updates), take it back to the mounted mode and it becomes your standby again. The only problem is that in the time after you open the standby, you are vulnerable to failure if the old standby fails for some reason.

> > They can be started in update mode but then they
> > become the primary database and you need to create another standby database
> > somewhere else.
>
> If we lost the main server and started the standby in update mode we
> would need to resync the two servers once the original problem had been
> resolved. When you say 'create another standby' do you mean that we
> would need 'delete' the ex-live database and copy the database from the
> up-to-date server? I would have thought that this would take a while -
> would you have to write-inhibit the database while you take the copy?

Yes, depending on the size of the database it can take some time. But the alternative is to not have a backup site for what was your old backup site once it becomes primary. What most DBA's in this situation do is rebuild the standby somewhere else (doesn't have to be on the old primary site), BEFORE opening the original standby for update. At least that way you're protected.

> > Replication can be set up with both sites updatable, but
> > you point clients to one only, then to the other if the first fails. That
> > can be done cleverly enough to avoid the need for conflict resolution.
>
> We definitely only need one site updatable. Presumably you could do
> something like have both databases set to updatable and then have a
> database table holding a flag to indicate which server is the master.
> Then the application login module could prevent users from logging in to
> the wrong server.

No, you just set up the replication as normal, so both sites appear updatable. Then set up the tnsnames.ora file for the clients to have a dual address list for a single alias. Then if the first one is unreachable, they automatically go to the second one. From a business perspective, the main issue here is duplicate data entry. What happens here is a user who has entered data on the primary site gets kicked off because the machine fails. They log on again (now on the other site) and see their data hasn't made it (because the transaction queues weren't pushed quickly enough), so they re-enter the data. When the primary site comes up again, the transaction queues get pushed and bang, you get a conflict.

> > Downside here is that the replication is transaction based rather than log
> > based, so for large throughput it's probably not the way to go.
>
> I wouldn't have characterised the application as 'large throughput' but I
> don't have any experience to judge throughput in an Oracle context.
>
> > > However, in the event of a failure of either server we would need to run
> > > the remaining server standalone until the problem was resolved and then
> > > resync the servers to get back to the original situation.
> >
> > This could be done using either standby or replication. If you use standby,
> > and the secondary database becomes the primary, then until you rebuild
> > another standby you have a window of time where you're not protected against
> > another site failure. If you use replication, likewise. There are a number
> > of expensive ways of addressing this sort of thing (geomirroring for
> > example) but you need to determine the business need for it and weigh that
> > up against the cost.
>
> This is the key issue. We are aware that once we lose one server we are
> running without protection, but we have IT staff on call 24 hours a day,
> and all the hardware is on a four-hour response maintenance contract.
> The servers are running RAID arrays and redundant power supplies, etc.
> The comms lines are duplicated and also have ISDN backup. Therefore we
> are happy to live without protection once we lose a server, provided that
> when we fix it we can resync the databases in a sort period of time (say
> less than half an hour, presuming the broken server was unavailable for 6
> hours) and with minimal downtime on the working server.
>
> > > Given that a server may fail in the middle of the night, the ability to
> > > let IT Operations staff select which configuration to use (Standalone A,
> > > Standalone B, A->B Replicated) from a menu without needing to wake up the
> > > DBA would be a big plus!
> >
> > Again, isn't it always?! From memory, the GUI's for FailSafe and
> > Replication allow you to do this in a slightly more complex way. I haven't
> > seen a way (from Oracle anyway) to allow similar functionality for standby.
>
> I don't know what Failsafe is, but if it is NT only it is no good to us,
> we need to run VMS now and Unix in the longer term. It sounds as though
> Replication is better for us than Standby.

FailSafe is only NT, so ignore that for now. Replication may be better for you, but with judicious configuration (small redo logs, disk geomirroring), standby can be just as good. I wouldn't like to make a call like that without spending a couple of days on site, rather than discussing it via email.

> > > Transaction levels would be low -> medium - less than 40 interactive
> > > users of a simple (less than 40 modules) telephone call logging
> > > application.
> >
> > It's the transactions per second that's the important measure here. You
> > could have hundreds of users doing very little and cope with it using
> > replication, or few users really hammering the system and not cope with it.
> > Also, the version of the database is important in determining whether the
> > system will cope with the load.
>
> We could run any version of the database that is supported on OpenVMS for
> Alpha. Transactions per second will be low - I can't remember the
> figures of the top of my head (no Usenet at work, so I'm writing this at
> home!) but I'll check them on Monday.

Just as an aside, I'm not sure what the latest version is on VMS. I'm pretty sure 8.0 is out, but 8.1 may not be. Check with your sales rep.

> > > It has been suggested to us that using Oracle Replication for this task
> > > would be 'a sledgehammer to crack a nut' and that we should write a
> > > package to automatically generate triggers for each table in the
> > > application and use the triggers to handle the replication. Obviously,
> > > this advice is from the person who wants to write the trigger-generator
> > > package!
> >
> > > Should we be considering Oracle Replication, or some other approach, or
> >
> > > should we pay for a trigger generator to be written?
> > >
> >
> > I prefer to stay away from home grown answers when something in the kernel
> > already does what you want. There's a number of reasons for that, and none
> > of them are because I work for Oracle! Firstly, you have to build, debug,
> > and maintain it. If it's in the kernel, at least you get Support for it.
> > Secondly, what you build probably won't have all the functionality in it
> > that something like Replication has. For example, if both databases are up
> > how do you handle conflict recognition and resolution? How do you handle
> > referential integrity? What a trigger based approach is trying to do is all
> > there in replication anyway.
>
> There won't be a problem with conflict resolution, because the
> application will only be allowed to write to one database at a time, no
> matter how we get the required resilience.
>
> But I agree with your general philosophy about using database
> functionality rather than writing code wherever possible, and I would add
> another reason to your list: a feature that is covered on the Oracle DBA
> course should mean that buying in expertise when staff leave should be
> easy - if you write it yourself, and the the key developer leaves
> supporting it could be a nightmare.

One point here - the replication training is a 5 day course on its own. You need to attend the normal DBA training first, then the replication course.

> Consequently I would prefer to use standard functionality if possible.
> However, we were told that not many people use replication, and that it
> is very complicated to setup, and (this is probably the killer) would
> need a DBA available to resync the servers (I.e. at three in the morning)
> as we wouldn't be able to automate it and shove the various options in a
> bunch of command files (be they DCL or PL/SQL or whatever).

I know of over 3000 clients who use replication, so I wouldn't say not many people use it! Its not that difficult to setup either, you just need to give some thought to how to handle the conflict side particularly. In your case, that's pretty straight forward. And as for the DBA need, resyncing happens automatically. If one site fails, people just start using the other, provided you've got the tnsnames.ora set up correctly. The work for the DBA occurs only when you need to resolve any issues when the primary site becomes available again, and even then it might not be their role to do the work. What I mean by that is if the primary is brought back up, and there are conflicts that occur due to duplicate data entry, that needs to be cleaned up. In this case, it would be best to NOT put conflict resolution on, and the conflicts would remain in an errors table to be resolved. Really, it's the application owner who should make the decisions about how to resolve the errors, since they own the data and the DBA's don't.

> > > Any advice gratefully received.
> >
> > Not a problem.
> >
> Hope this isn't all too newbie. Thanks for your help.
>
> Roger

Regards

Pete Received on Mon Jun 21 1999 - 14:08:47 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US