
RE: Solid state disks for Oracle?

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Sat, 11 Mar 2006 12:41:50 -0500
Message-ID: <KNEIIDHFLNJDHOOCFCDKAELAHMAA.mwf@rsiz.com>


Huh? Comments in line.



Rightsizing, Inc.
Mark W. Farnham
President
mwf_at_rsiz.com
36 West Street
Lebanon, NH 03766-1239
tel: (603) 448-1803

-----Original Message-----
From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Kevin Closson
Sent: Saturday, March 11, 2006 11:32 AM
To: oracle-l_at_freelists.org
Subject: RE: Solid state disks for Oracle?


>>>Still, if you have direct attach SSD you avoid the wires and
>>>a bunch of protocol and network overhead.

>>>>SSDs are all FCP. The problem with that is if you want to have, say,

  Well, NO. I've personally tested PCI bus SSD boards with 16 GB of memory, two onboard disk drives, and battery sufficient to dump all the memory to each disk drive. The boards were also available, at a much cheaper price, with battery-only persistence at power outage. Platypus was one company that produced them, and I'm not sure who owns what was formerly Platypus now. None of this is theoretical. TEMP can be put on non-persistent storage without compromising the Oracle recovery model.
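
  A minimal sketch of what that can look like, with an assumed (hypothetical) mount point /ssd_temp on the non-persistent board and made-up sizes:

      -- Put TEMP on the non-persistent SSD; tempfiles are never part of
      -- media recovery, so losing them at power-off costs nothing.
      CREATE TEMPORARY TABLESPACE temp_ssd
        TEMPFILE '/ssd_temp/temp_ssd01.dbf' SIZE 8G
        EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;
      ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp_ssd;
      -- If the board loses its contents, just recreate the tempfile at startup.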

20 or so servers getting redo action on the SSD, you have to carve 20 LUNs (find me an SSD that can present its internal storage as 20 LUNs). Then you have to get the switch zoning right and then work out the front side cabling. How much simpler could the SAN gateway model be than that? You present the SSD as a single high performance LUN to the SAN gateway, put a NAS filesystem on it, export 20 directories, mount them on the respective nodes, set filesystemio_options=directIO in each init.ora, relocate the redo logs to the NFS mount, and now you have <1ms redo IO for as much redo as each of the 20 servers can generate. Unless you can find me a system that is pushing more than, say, ~105MB/s of redo...then you'd have to configure a teamed/bonded NIC for that system.
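
   A rough sketch of that relocation on one node, assuming the NFS export is already mounted at the hypothetical path /ssd_redo/node1 (group numbers and sizes made up):

      -- init.ora on this node, per the description above:
      --   filesystemio_options=directIO
      -- Add new redo log groups on the SSD-backed NFS mount:
      ALTER DATABASE ADD LOGFILE GROUP 4 ('/ssd_redo/node1/redo04.log') SIZE 512M;
      ALTER DATABASE ADD LOGFILE GROUP 5 ('/ssd_redo/node1/redo05.log') SIZE 512M;
      -- Switch until the old groups go INACTIVE, then drop them:
      ALTER SYSTEM SWITCH LOGFILE;
      ALTER SYSTEM CHECKPOINT;
      ALTER DATABASE DROP LOGFILE GROUP 1;
      -- (repeat the switch/checkpoint/drop for each remaining old group)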

           No argument from me that your protocol is a good way to share SSD. I believe that is what I said in the message you are replying to. The question is whether you have a lot of distinct servers on which to share the SSD. The direct attach units are significantly less expensive, and come in persistent and non-persistent models. That is unlikely to be part of the market segment you serve, so it doesn't surprise me too much that you've never heard of them.

>>>power source of the SSD is not dependent on the machine, so
>>>you get the memory to the onboard disk drives in a server
>>>power outage, and if you have duplex onboard disks that is
>>>as good as raid 1.

Power? SSDs like the now defunct Imperial and others like DSI are so internally power redundant it is ridiculous. Like I said early on in this thread, I'm not talking from a theoretical standpoint. I have had an 8-port MG-5000 SSD since 2001 (not cheap, not small).

        As I have said, there exist SSD boards that are *NOT* persistent across a server power cycle. They are much cheaper per gig and they serve their purpose. But you should not put anything on them that is part of the Oracle recovery stream unless it is a throwaway, totally refreshable system. Since the SSD performance matches the persistent models, that is a way to cut the costs of test systems configured for the same performance as production (but not configured for reliable recovery). If you're on a UPS, that means server crashes you can reboot without a power cycle still don't require recovery or refresh, but that is not good enough in my opinion for the primary transaction stream.

>>>SSD. You preserve the utility of the SSD as long as network
>>>latency and bandwidth is sufficient for the load.

>>>>>>GigE is quite sufficient for transaction logging. You can

    Pick a large enough number of servers and that is not true. That IS theoretical. But what I wrote was that the utility is preserved as long as the network latency and bandwidth are sufficient. GigE likely does handle all reasonable cases, so that is an excellent choice. But don't expect SSD on a network to help you at all if you share it across a crappy network or even a good network connection with little headroom. You have to buy and spec a network attachment that is NOT a bottleneck compared to the load, or you will diminish the utility of the SSD. You're saying that such a network attachment exists for almost all cases and every case you've seen. I don't disagree, but whoever configures the system has to get a sufficient network attachment. Again, as long as network latency and bandwidth are sufficient for the load presented, the utility of the SSD is preserved.
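
    One rough way to sanity-check the load side against that headroom, sketched with an assumed 60-second sampling interval:

      -- Sample the cumulative 'redo size' statistic, wait a known interval,
      -- sample again, and divide the delta by the interval to get bytes/sec.
      SELECT name, value FROM v$sysstat WHERE name = 'redo size';
      -- ... wait ~60 seconds and re-run the query ...
      -- (delta in bytes) / 60 = sustained redo bytes/sec. GigE is roughly
      -- 125 MB/s raw and on the order of 105 MB/s after protocol overhead,
      -- so the sustained redo rate should sit comfortably below that.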

see some test results here:
http://www.polyserve.com/pdf/Oracle_NFS_wp.pdf

>>>
>>>Which is best for a given server farm will vary. Even if you
>>>use direct attach SSD, you still have to verify that the SSD
>>>is certified for the protocol it is emulating to be treated

Emulation? These devices are FCP. And who would "certify" such a thing?

   The certification on the PCI bus was for a given manufacturer to issue all commands in the regression suite for verifying a new model of disk drive for use in their machines. The whole point of those models of SSD was that you could use them as if they were disk drives (with extraordinarily short seek times and high bandwidth). Different manufacturers certified them for different buses, and you had to buy a model of the boards that was certified for a particular manufacturer and bus.

--
http://www.freelists.org/webpage/oracle-l


