Oracle FAQ Your Portal to the Oracle Knowledge Grid


Re: Redolog group Members

From: Howard J. Rogers <hjr_at_dizwell.com>
Date: Fri, 26 Nov 2004 22:53:48 +1100
Message-ID: <41a71941$0$20521$afc38c87@news.optusnet.com.au>


Martin Doering wrote:
> On Fri, 26 Nov 2004 06:11:38 +1100, "Howard J. Rogers"
> <hjr_at_dizwell.com> wrote:
>
>
>>You have wonderful hardware there. You therefore have wonderful
>>protection against hardware failure.
>>
>>But if I were to walk in to your server room, and issue an 'rm
>>log1a.rdo' command, how much protection does all that hardware offer?
>>Your O/S will merrily delete the first member of that group, won't it?
>
>
> But what if I rm all logs? I once did exactly that with a bad online
> backup script. I saved all members, then deleted them - though this
> was planned just for the archived logs. They had the same ending. :-|
>
> So deleting a log member is not hypothetical - the same goes for deleting
> all members. So more files does not necessarily mean more safety.

The word "necessarily" is redundant. Of course if you are totally insane, or allow completely imbecilic people to have access to your servers, no amount of redundancy will help. But that having multiple members gives you some degree of redundancy, over and above what hardware mirroring can give you, is unarguable and irrefutable. So I hope you won't start arguing or refuting.

>
>
>>And what do you think will happen on the mirror?
>
>
> This mirror is done by the OS, oracle again does mirror for itself.
> The question is:
>
>
>>And in similar vein, LGWR is not perfect and has been known to throw the
>>odd bit of corruption into a log before now. What it writes to one log
>>group will be faithfully mirrored onto the other by your wonderful
>>hardware, won't it. At which point, you have a totally corrupt log group.
>
>
> And the logwriter does mirror its corruption to all members?

It would have to do so in exactly the same place for all members. The chances are vanishingly small. Not nothing, by any means. But then if you want absolute certainties, you're in the wrong profession anyway.

The point is that ARCH, for example, can construct a clean archived redo log from two, three or more individually corrupt online redo log members, provided they are not all corrupt in the same place.
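
To make that idea concrete, here is a toy Python model of "pick a clean copy of each block from whichever member has one". It is only a sketch of the principle: the block layout, the use of CRC32 as a checksum, and the names `make_block`, `is_clean` and `build_archive` are all assumptions made for illustration, not a description of how ARCH is actually implemented.

```python
import zlib

def make_block(payload):
    """Return a (payload, checksum) pair; CRC32 stands in for a real block checksum."""
    return payload, zlib.crc32(payload)

def is_clean(block):
    """A block is clean when its payload still matches its stored checksum."""
    payload, checksum = block
    return zlib.crc32(payload) == checksum

def build_archive(members):
    """For each block position, take the first clean copy found in any member."""
    archive = []
    for position, copies in enumerate(zip(*members)):
        good = next((c for c in copies if is_clean(c)), None)
        if good is None:
            raise ValueError(f"block {position} is corrupt in every member")
        archive.append(good[0])
    return archive

original = [b"redo-0", b"redo-1", b"redo-2", b"redo-3"]
member_a = [make_block(p) for p in original]
member_b = [make_block(p) for p in original]
member_c = [make_block(p) for p in original]

# Damage a different block in each member, leaving the stored checksum stale.
member_a[0] = (b"garbage", member_a[0][1])
member_b[2] = (b"garbage", member_b[2][1])
member_c[3] = (b"garbage", member_c[3][1])

archive = build_archive([member_a, member_b, member_c])
print(archive == original)
```

Every member here is individually corrupt, yet the rebuilt copy is complete because no block position is damaged in all three. A hardware mirror of a single member gives no such second opinion: it holds the same damaged block twice.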

>BTW, if
> I could not trust oracle's engine, I would need to multiplex all
> database files too, right? Our real-life experience is, that I never
> had a corruption in a logfile, when needing a log file since now.
> Although we had corruptions in the database files.

Your sentence structure means I don't know what you are trying to say there. But it sounds like you are simply protesting too much.

We're not talking about "trusting" anything. That's the point. We're talking about building in redundancy so that you *don't* have to trust anything.

>>Making LGWR write more than once to separate log group members means the
>>chances of software error corrupting an entire group are slender.
>
>
> Why? Is that a hope, or an experience?
> And at all: In the end, the
> online redo's get archived, and that's it. So what means software
> corruption? Just, if I want to roll forward my database in very rare
> cases, I will feel this corruption, am I right?

Again, because of sentence structure, I'm afraid I don't know what you are talking about. Your original post said multiplexing is "totally stupid" because you have excellent hardware redundancy. That is an illogical deduction from the facts. You have excellent protection against hardware failure. You don't have excellent protection against software or user error, unless you multiplex.

Them's the facts. I'm sorry if you don't like them.

You can argue about how high the chances of maniac users or software glitches are: that's fine, and weighing such things up is part of the job. But don't, please, claim they are non-existent quantities and can be dismissed accordingly. When it comes to data recovery, I want to deal with certainties as much as possible, and the elimination of risk wherever feasible. You seem to want to take risks and trust to luck. That's fine... so long as you know what you are doing. My suspicion is that you don't.

>
>>My recommendation is that you are under-multiplexed by a factor of 1:
>>There should be three members per group.
>
>
> Where should I stop then? Why 3, not 4 or 5? Am I safer, if I can make
> 3 user errors?

I've told you the answer to this already. Multiplexing costs performance. You make LGWR do more work, and every commit waits on LGWR. So why not 4 or 5 or 6? Because the performance penalties are too high for the reduction in risk gained. Usually, and for most people. But if I had an infinite hardware budget, and incredibly important data that must not be lost under any circumstances, then yes... I probably would go to 4 or more members per group. And hardware mirroring. Three or more ways.

It is, in short, precisely what I said it was the first time: a balance to be struck.
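
For anyone who wants to see the shape of that balance, here is a rough back-of-envelope model in Python. Everything in it is an assumption made for illustration: that members fail independently (which a careless `rm` of every log clearly violates), that the member writes proceed in parallel with the commit waiting for the slowest one, that write times are exponentially distributed, and the two made-up figures `P_MEMBER` and `MEAN_WRITE_MS`. It is not a measurement of any real system.

```python
def group_loss_probability(p_member, members):
    """Chance that every member of a group is lost, assuming independent failures."""
    return p_member ** members

def expected_commit_wait(mean_write_ms, members):
    """Expected time for the slowest of `members` parallel writes to finish,
    assuming exponentially distributed write times:
    mean * (1 + 1/2 + ... + 1/members)."""
    return mean_write_ms * sum(1.0 / k for k in range(1, members + 1))

P_MEMBER = 0.01      # hypothetical chance of losing any one member
MEAN_WRITE_MS = 2.0  # hypothetical mean write time per member

for n in range(1, 6):
    risk = group_loss_probability(P_MEMBER, n)
    wait = expected_commit_wait(MEAN_WRITE_MS, n)
    print(f"{n} member(s): group loss {risk:.0e}, expected commit wait {wait:.2f} ms")
```

Under these assumptions each extra member divides the risk by the same factor but keeps adding to the commit wait, so the absolute safety gained per member shrinks rapidly while the cost does not go away. Where the two cross is a judgment about your own data and budget, which is the whole point.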

>>You should learn the mantra: hardware mirroring protects you from
>>hardware failure. But it does nothing for software or user stuff-ups.
>>For that, you need Oracle multiplexing.
>
>
> Hm, this is old school knowledge.

At least it's knowledge. What makes it "old school"? What fascinating insights can you bring to the party that demonstrate this is false, misleading or harmful advice?

> This is always true.

Oh, really? Then why do you apparently not believe it? Or do you always say things are "totally stupid" which are "always true"?

> The really
> interesting question is, where to stop with safety?

Where the costs exceed the perceived benefits.

>Where to make the
> cut. I'm searching for your experiences about this, not for THE
> solution.

You are apparently looking for a kick in the teeth if your attitude is actually anything to go by. But no matter. Perhaps it's just a language thing. What I've written is based on experience. You are free to ignore it as much as you want. But one then questions the wisdom of soliciting the material in a public forum.

>>And of course there's a performance hit. What do you want? Safety or
>>performance? It's a valid thing, sometimes, to say 'performance every
>>time', but don't then run crying to management when you lose half an
>>hour's-worth of committed transactions. These things are usually a
>>balance, and I suspect that most organisations most of the time would
>>want to know their data was safe, provided performance was merely
>>acceptable.
>
>
> Safety vs. performance are not always contrahents. For the OS based
> mirroring for example, the mirroring gives you a better read
> performance, because the fastest mirror does answer first.

Of course they are not ALWAYS contra*dictory*. I didn't say otherwise. But then there's cost. And someone who posts here regularly (I forget exactly who) has or had a signature to their posts which read something like "I can give you safety, performance and cheap. Which two do you want". Which sort of sums it up.

> We want, for sure, safety and performance - both as much, as is
> possible. And, you are right, we might need to find the right
> compromise.

There's no "might" about it. Of course you will HAVE to find a compromise that is right for you.

>>If you feel that way about things, of course, you could always set
>>_disable_logging=true, watch your database run like a leaping gazelle,
>>and grow grey very quickly wondering when your database is about to
>>become completely toast.
>
>
> We have such databases. But for several production databases this is
> not an option for us.

Oh Lord. Just tell me when humour doesn't translate into solid teutonic, will you. Of *course* it's not an option. It's an utterly unsupported parameter, that renders a crashed database utterly toast. I doubt very much that you run ANY database in _disable_logging mode as a result. The point was simply that there is a spectrum of performance versus safety (and if you are being pedantic, it's an X-Y-Z graph with cost on the third axis).

> Though, the database just gets rid of the
> archiving processes (may be some bigger amount of I/O),

Right. So you don't actually know what _disable_logging does, do you? It doesn't just switch off archiving, you know. It switches off the writing of all redo, even by LGWR.

> but the online
> redo-logs are written the same way and still do have their I/O.

I assume you are intelligent and capable. I assume you care about your data, and would not want to see any of it lost through error, oversight or mishap. I assume you actually want to understand the issues involved.

In which case, stop pursuing an agenda which happens to be wrong. You can dismiss it as "old knowledge" if you wish. But it isn't. Open your ears, and be prepared to listen.

HJR

Received on Fri Nov 26 2004 - 05:53:48 CST
