Oracle FAQ Your Portal to the Oracle Knowledge Grid

RE: Q. To RAC or go vertical

From: Matthew Zito <mzito_at_gridapp.com>
Date: Tue, 05 Aug 2003 13:24:23 -0800
Message-ID: <F001.005C90F9.20030805132423@fatcity.com>

*sigh* Alright, I'll bite. See inline.

--
Matthew Zito
GridApp Systems
Email: mzito_at_gridapp.com
Cell: 646-220-3551
Phone: 212-358-8211 x 359
http://www.gridapp.com


> -----Original Message-----
> From: ml-errors_at_fatcity.com [mailto:ml-errors_at_fatcity.com] On
> Behalf Of Odland, Brad
> Sent: Tuesday, August 05, 2003 3:35 PM
> To: Multiple recipients of list ORACLE-L
> Subject: RE: Q. To RAC or go vertical
>
>
> When you do TCO analysis do you add in the costs of
> administration?
Yes (in fact, we even say that it costs three times as much to administer a Linux RAC cluster as a Sun cluster).
> The learning curve?
Yes.
> The maintenance?
Yes.
> The
> value of reliability and familiar support structures? What
> kind of proof do you have about the claim of RAC's
> reliability compared to a single multiple-processor system?
>
The value of reliability? I'm not talking about buying from some random Intel white-box vendor - I'm talking about a name like IBM, HPQ, etc. I have seen far higher reliability from those vendors than from Sun in the last four years. Case in point - I once had three E6500s supporting over 100 IBM Intel servers. I had one Intel failure in a six-month period, and three hardware failures on the Suns. That's an impressive reliability ratio from a hardware perspective. Familiar support structures should be an oxymoron - your hardware should fail rarely enough that you have to look up the 1-800 number you need to call. As far as proof of reliability, that's hard to quantify. However, from a logical perspective, on an SMP system when a processor fails, the entire system goes down. When a node fails in a cluster, the others take over for it. Yes - software bugs can rear their ugly heads and prevent that from happening, but that's a constant.
> What about when a node does fail and suddenly the users and
> batch processing are left with 1/2 or 1/4 of the processing
> power gone? How long is it going to take to get the system
> back to 100%? Lots of admins can be confident in getting a
> huge hp or sun box up in less than 12 hours. Is 6 hours of
> downtime worse than three days of processing at 50% capacity?
Is it better to have a performance-impacted system or a down system? Is it better to buy twice the capacity to compensate for the fact that your hideously expensive UNIX server tends to fall over when there's a two-bit memory error or cache corruption? I've never seen an Intel box broken so badly it takes three days to fix. On the other hand, I had an E4500 that took Sun 7 months of replacing every part in the system to figure out what was wrong with it. We had to decommission it as a production server because it was crashing every few days. Hey, if you're concerned about node downtime and want to be crafty - buy an extra node for your RAC cluster. Splurge and spend the extra $10k for a node that sits there idle until it's needed. It's _still_ better than buying twice your needed capacity. In fact - I haven't run the numbers, but I bet you could buy double the nodes you actually need and leave them idle and still be vastly cheaper than two big UNIX servers.
> What about the value of KNOWING a solution works not just
> speculating on how much money it MIGHT save.
>
Do today's solutions "work"? If you're running an enterprise database today, you need to buy two servers, pay for clustering software, spend the money to implement a clustering solution, pay through the nose for platinum support on these things, and you still need to hire smart people to run them. And the end result is a solution where, when a server dies, the other server that's been sitting there sucking down power and idling now gets to start up Oracle and begin processing transactions. Yes, it technically functions, but it seems counter-intuitive for an organization that is generally a cost center to spend extra money to compensate for the fact that when their incredibly expensive server falls over, it takes the entire system down with it.
> The IT industry has fallen because of lots of "sell them the
> sizzle, get em' the bacon later" marketing hype like the info
> floating around about RAC and grid. Software and hardware
> vendors have been jumping from one "great idea" to another.
> The result is a lot of products that end up in the bone yard
> and another round of layoffs.
I'm with ya - I'm as amused and skeptical as everyone else at grid computing and independent clustering initiatives - Sun's N-1 being the shining example of "sizzle sans bacon". But you make it sound like RAC is this brand-new creature that was introduced last week by a tiny unknown company. Totally ignoring how long OPS was around, RAC was introduced in June of 2001. That's two years in the wild and it's getting better all the time.
> What is happening is hardware and software vendors are
> feeding the market's desire to have a low cost system with
> unlimited power and scalability. I am sorry to say you STILL
> can't have both. I know what vendors are thinking, they think
> this will be the holy grail of IT that will bring us back to the
> fat days of pre-Y2K.
Hey, I'd love to have a low cost system with unlimited scalability and power. If anyone knows what it is, please email me off list. I've never said that RAC is appropriate for all environments, and would never even dream of claiming that it was.
> "Get the grid going; it's so complex that they will have to use
> our consulting services too...once we're in the door we'll be
> there for years." IT directors made the mistake of trusting
> vendors once. They aren't going to do it again.
>
Right. Don't trust your vendors. I think every IT department should have an antagonistic relationship with their vendors (I'm serious). That includes not putting all your eggs in one basket and always being willing to investigate new technology that has the possibility of improving your power stance against your vendors. Have a healthy skepticism (I see you've got that down) and take a look. How can you possibly be looking out for your organization's best interests if you're not investigating all of your options?
> Frankly I am all for reducing complexity and increasing
> reliability. Right now there is proven technology that may
> cost a bit more but in the long is going to be the right decision.
>
"Right now there is proven technology that may cost a bit more but in the long is going to be the right decision." Sounds like you're trusting your vendors a whole bunch. How do you know it's the right decision? Because your current solution works? I bet it would run on a mainframe as well - that costs a "bit more" and is definitely more reliable than a UNIX server and has been around for years. I don't mean to sound snarky, but how will anything ever be a "proven technology" if you don't investigate it? If breadth of deployment is a yardstick of a "proven technology", we should all be running Win2k for everything.
> "The above notes and my company aside, I would be shocked if
> I ever implemented a large single-image Oracle instance ever again. "
>
> Yeah right when monkeys fly out my butt.
>
Now now, play nice. If I were ordered to build a single-image Oracle instance, I would - doesn't mean I'd recommend it. And with the databases I have come across, I feel comfortable that most of them could be implemented successfully and reliably with a RAC cluster. No, RAC is not a silver bullet. Yes, I wish it was. Still - for many, many environments, I posit that it is a viable alternative to the traditional paradigm of active-passive clustered UNIX servers.

Thanks for your thoughts,
Matt
Received on Tue Aug 05 2003 - 16:24:23 CDT
