Re: RMAN impact

From: Mladen Gogala <gogala_at_sbcglobal.net>
Date: Thu, 09 Mar 2006 23:45:37 -0500
Message-Id: <1141965937l.4769l.0l@medo.noip.com>

On 03/09/2006 09:14:21 PM, Steve Perry wrote:
> we have a rman backup for 500 GB 9.2 RAC on RHEL3 (3 LTO II drives) that takes 3 hours to complete, but kills the node it runs on.
> io wait % goes to 60%, cpu is low. the server is pretty much unresponsive until it completes.
> We allocate 4 channels and are not using the large_pool or tape slaves.
>
> We tried the same thing but used Disk (seperate LUN from the database) and the server crashed.

Steve, it is not my intention to berate you or anything of the sort, but this is some weird stuff. PC equipment, even the kind with SMP motherboards, is not made for high-volume I/O. Now you are discovering the difference between a PC and mini-computers like the HP 9000/rp4400 or IBM p520: those can do massive amounts of I/O, while not even the best Dell PC can. The problem is that the PC bus, even on the best SMP motherboard, doesn't have enough capacity to carry simultaneous traffic between multiple CPU boards, peripheral devices and memory.

Disks normally use DMA and deposit the result of an I/O directly into memory. Disks also notify the CPU that an I/O is done by sending interrupts, which must be handled. With 4 channels, you have 4 active RMAN processes, each reading from your disks, depositing the result into memory and notifying any available CPU that the I/O has completed. Each RMAN process also communicates with Oracle processes and sends data to and from the network, which results in more interrupts and more DMA traffic between the NIC and memory, and your system bus is saturated. The system is unresponsive because simple interrupts, like pressing Enter, must wait to be handled.

Steve, there is a reason why PC equipment is so much cheaper: it cannot do massive amounts of I/O. What you pay for when you buy a p595 is a massive backplane which can sustain almost a TB/second and will allow your system to operate normally, even if you are writing 300 MB/sec. The secret is that the fiber channel adapter for the IBM p595 is attached to memory and not to the central system bus. When you issue an I/O request, you deposit an IORB at a location in memory, where a smart I/O controller reads it, executes it and deposits the results into memory. It then issues a single interrupt saying that it's done. The central system bus, the one used to carry data between CPU and RAM, isn't used at all.
Drivers for that kind of equipment are standard on AIX or HP-UX and require some work on Linux, where they're known as I2O. Also, architecturally speaking, those machines are much better balanced. When you have 4 screaming 3 GHz Intels inside, it is tough to feed them from memory. The fastest available memories have access times of 30-50 ns. That means that a CPU can ask memory for more data fewer than 40 million times per second. Translated into megahertz, that corresponds to a bus frequency of 40 MHz, needed, of course, to feed each of the processors. Unless we can feed them faster, our screaming 3 GHz chip is useless as it, like the angel in the movie "Barbarella", has no memory to work on.
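
To make the latency arithmetic above concrete, here is a minimal Python sketch. It assumes, for illustration only, that every memory request costs one full access time and that nothing overlaps (no caches, bursts or pipelining); the function name is made up for this example.

```python
# Rough sketch: how many memory requests per second fit into a given
# access time, assuming one full access time per request and no overlap
# (no caches, no burst transfers, no pipelining).

def max_requests_per_second(latency_ns: float) -> float:
    """Upper bound on memory requests per second for one access time."""
    return 1e9 / latency_ns  # 1 second = 1e9 nanoseconds

for latency_ns in (30, 40, 50):
    rate = max_requests_per_second(latency_ns)
    print(f"{latency_ns} ns -> {rate / 1e6:.1f} million requests/s")
```

Under these assumptions the 30-50 ns range works out to somewhere between 20 and about 33 million requests per second, which stays under the 40 million quoted above.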
The situation can be improved by large L2 and L1 caches as well as TLBs. The efficiency of those is severely impacted by things like long jumps across the address space and context switches, for instance the ones caused by interrupts or normal multiprocess work. Add, on that same bus, the additional lines for cache synchronization needed to keep the caches coherent, which are on a separate bus on the proper minis. Those lines must exist between each L1 and L2 cache, so that it doesn't happen that one address has one value in one cache and another value in another. Now, add peripheral devices: video adapter, disk controllers and NIC.

Your central bus has a frequency of 233 MHz and is 64 wires wide ("64 bits"). When you calculate the maximum speed, it gives you 1864 megabytes/second for EVERYTHING. That is the theoretical data transfer speed. The real one is significantly lower, only around 1.2 GB/sec, due to retransmits and synchronization between devices. (Which device should transmit first? This is resolved by so-called "bus arbitration", as is the question of which CPU should handle an interrupt.) That is called the "sustained data rate". The massive amount of I/O that you are trying to make your poor PC perform will simply consume the central bus and nothing will work. In addition to that, if interrupt handlers and software components start to detect timeouts, your machine will think that there is something wrong with the motherboard and will crash.

A backup with only two channels would probably finish sooner and with fewer problems than one with 4 channels. The machine doesn't crash with tape because tapes are slower and cannot do I/O fast enough to endanger the central bus. Multiple disk drives are just about fast enough to crash the system. Detecting problems like this is precisely the purpose of benchmarking and testing before you buy a machine. For things like that you should use a proper mini. Do you know why they call them Infinitely Boring Machines? Nothing ever happens. They don't crash, they don't go down, they just quietly work and don't provide any excitement or adventure in your life.
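
For anyone who wants to check the bus numbers above, here is a small Python sketch of the calculation. The 233 MHz clock, the 64-bit width and the roughly 1.2 GB/sec sustained rate are the figures from this message; the function name and the assumption of one transfer per clock cycle are only for illustration.

```python
# Back-of-the-envelope check of the bus figures quoted above.
# Assumes one transfer of the full bus width per clock cycle.

def peak_bandwidth_mb_per_s(clock_mhz: float, width_bits: int) -> float:
    """Theoretical peak: clock (MHz) times bus width in bytes, in MB/s."""
    return clock_mhz * width_bits / 8

peak = peak_bandwidth_mb_per_s(233, 64)
sustained = 1200.0  # ~1.2 GB/sec sustained rate, as quoted in the message
efficiency = sustained / peak

print(f"peak:       {peak:.0f} MB/s")    # 1864 MB/s
print(f"sustained:  {sustained:.0f} MB/s")
print(f"efficiency: {efficiency:.0%}")   # 64%
```

In other words, on the quoted figures about a third of the theoretical bandwidth is lost to arbitration and synchronization before any backup traffic is put on the bus.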

-- 
Mladen Gogala
http://www.mgogala.com

--
http://www.freelists.org/webpage/oracle-l

Received on Thu Mar 09 2006 - 22:45:37 CST