Re: OCFS for Windows

From: David Fitzjarrell <oratune_at_msn.com>
Date: 6 Mar 2003 13:03:41 -0800
Message-ID: <32d39fb1.0303061303.1b509f72@posting.google.com>


"koert54" <nospam_at_nospam.com> wrote in message news:<puF9a.455$Ma.32_at_afrodite.telenet-ops.be>...
> Well - I'm glad you have it running without too much trouble - may I ask,
> are you a consultant for that site, a project manager or are you the
> day-to-day DBA for those databases?
>

I direct the Database Services department, but that doesn't mean that I don't work as a DBA as well. We all share the responsibility for our systems in the field.

> Don't get me wrong - I wish I could say the same of our RAC - but I just
> can't.
> We are experiencing problems like:
> - split brains - we have a Gigabit private interconnect + a 100Mbit public
> interconnect ... we have tried crossover cables and switches
> - when a split brain occurs - normally heartbeats can happen through the
> controlfiles in order to decide which node will go down - no such thing ...
> they time out on it and both crash
> - lotsa IPC errors for no apparent reason - resulting in the problems
> described above :-)
> - we have a shared pool now of 800MB on both instances - still the instances
> keep on crashing due to memory shortage in the shared pool - my guess is
> that because PCM locks are now automated, and thus you don't have control
> over the number of lock elements in the lock database, the lock database
> keeps on growing, trying to get a 1:1 LE:BLOCK granularity as much as
> possible. A flush of the shared pool won't help, so instead we have to
> bounce the instance. Oracle replies: increase the shared pool ... yeah
> right - till I reach 2GB and I don't have a buffer cache anymore, or no
> process space for spawning threads? :-)
> - timing problems: if the system clocks on both nodes start to differ
> you're in for a big surprise :-)
> - RAC on NT/2000 does not load balance its inter-instance messaging over
> multiple NICs - also the failover of NICs doesn't work ... if I pull the
> private interconnect you would expect it to keep on running because of the
> public interconnect - no such thing, instead you get one node going to 100%
> CPU and needing a cold boot while the other just sits there and finally
> also crashes ...
> - if we bounce one instance - 20% chance we have to bounce the whole
> cluster
> - if we add new raw devices and symbolic links, sometimes the object
> service does not replicate them over the nodes - a reboot is necessary :-)
> - we started a year ago with 9.0.1 - we're now at level 9.2.0.2 - did
> lotsa firmware upgrades on the hardware (IBM) and still no solid cluster
> - these are just some of the problems we've had - we have spent so many
> hours on this thing - we could have easily bought 2 decent Tru64 nodes and
> maybe, just maybe it would have worked. I have talked to some colleagues who
> run OPS on 16-node SP2s, RAC on Tru64 - they all had their share of
> sleepless nights. So consider yourself very lucky, or if you're not the
> DBA - ask your DBAs what they do late at night :-)
>
> Again - a bitter and tired RAC DBA :-)
>

I can't answer for what you have configured ... we use Windows 2000 for our servers and have no problems keeping Parallel Server instances running. We have had issues with some hardware, and those have been resolved (usually a firmware issue). Since we don't mix our operating systems I can't say how reliable or unreliable a configuration can be, and I won't argue with you over what you've experienced because I do know of some OPS configurations that are nightmares to administer. I originally thought the Windows configuration would be one; I've been pleasantly surprised at how reliable it's been. And I have had issues with Microsoft Cluster Server, most notably with FailSafe configurations. So, I have my share of nightmares. They just don't involve OPS.

David Fitzjarrell
> "David Fitzjarrell" <oratune_at_msn.com> wrote in message
> news:32d39fb1.0303051112.74d1f551_at_posting.google.com...
> > "koert54" <nospam_at_spam.com> wrote in message
> > news:<3e65f784$0$2184$4d4efb8e_at_news.be.uu.net>...
> > (snip)
> > > My personal opinion is - if you're bold enough to run a parallel server
> > > on a windows platform don't be surprised to be shuffling a lot of shit
> > > with your back against the wall !
> > >
> > (snip)
> >
> > We regularly run OPS on 2000 and have no problems with it. In fact
> > these systems are located in Central America, South America and the
> > Caribbean, running pre-paid cellular systems, so you can imagine the
> > traffic they get. We rarely have downtime (with the exceptions of
> > hardware issues and, sometimes, the local 'nut behind the wheel'), so
> > I WILL be surprised if I end up 'shuffling a lot of shit with your
> > (read 'my') back against the wall !'
> >
> > David Fitzjarrell
Received on Thu Mar 06 2003 - 15:03:41 CST
