Oracle FAQ Your Portal to the Oracle Knowledge Grid
 

Home -> Community -> Usenet -> c.d.o.server -> Re: tough choices

Re: tough choices

From: Daniel Morgan <damorgan_at_x.washington.edu>
Date: Fri, 18 Jun 2004 16:47:37 -0700
Message-ID: <1087602479.441629@yasure>


Serge Rielau wrote:

> Daniel Morgan wrote:
>

>> If I have a single table in a tablespace in Oracle stored in a single
>> datafile the data is equally accessible from all nodes. Lose a node and
>> there is no negative effect on the ability to access any part of the
>> data in that table. Federate the data as is required by shared nothing
>> architectures and the loss of a single node effectively kills the
>> system.
>>
>> Thus with shared everything the more nodes the less likely a failure
>> whereas with shared nothing loss of a node makes part of the data
>> inaccessible.

>
> Let's clarify the lingo:
> DB2 II uses federation. One of the main properties of federation is that
> any one piece of the federation is pretty independent from the rest.
> They may not even know or admit they are part of it (a bit like Bavaria
> or Quebec ;-)
> DB2 UDB for LUW + DPF is a partitioned DBMS. The partitioning is done on
> a logical level. Partitions assume logical (not physical) ownership of
> data.
> If a partition goes down it can either simply be restarted or, if there
> is a problem with the hardware that hosts the partition, the partition
> can be restarted on other hardware.

Partitioning of data in DB2 is an entirely different concept from partitioning of data in Oracle, so the word may be misunderstood. Partitioning in Oracle has nothing to do with RAC, clustering, or nodes. In Oracle, as you know, data is never federated.
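The availability claim above can be made concrete with a toy model (a hedged sketch: the 1% per-node failure probability and the node counts are illustrative assumptions, not benchmarks of either product):

```python
# Toy availability model contrasting shared-everything (RAC-style) and
# shared-nothing (partitioned) clusters. All numbers are illustrative.

def shared_everything_outage(n_nodes: int, p: float) -> float:
    """Shared everything: all data stays reachable unless EVERY node is down."""
    return p ** n_nodes

def shared_nothing_any_data_lost(n_nodes: int, p: float) -> float:
    """Shared nothing: each node owns a slice of the data, so the chance
    that SOME data is unreachable is the chance any one node is down."""
    return 1 - (1 - p) ** n_nodes

# With p = 1% per node, adding nodes drives the shared-everything outage
# probability toward zero, while the shared-nothing chance of losing
# access to some data grows with every node added.
p = 0.01
for n in (2, 4, 8):
    print(n, shared_everything_outage(n, p), shared_nothing_any_data_lost(n, p))
```

This is the asymmetry the post argues: in the shared-everything case more nodes mean more redundancy; in the shared-nothing case each node is an additional single point of partial failure.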

> The data sits on the same physical storage subsystem as it does in a
> shared disc.
> There is no problem for a partition to access that data from any
> physical node it happens to run on using the same technology Oracle
> exploits

It can't use the same technology: shared everything is not shared nothing, and vice versa. Can you clarify what you intended to say?

> So the question really boils down to this:
> 1. The time it takes to detect a partition is down. That is a universal
> problem independent of the underlying technology.

That time is as long as it takes to rip the power cord out of the back of the machine: a fraction of a second.

> 2. The time it takes to get the partition back into the game compared to
> the failover work a RAC cluster has to do (such as remastering locks).

In my tests that is less than one second. Far too small with RAC to be of concern to anyone.

> A curious fact (no more no less) is that there are numerous kinds of
> applications where a set of clients only ever accesses one or a few
> partitions of the overall system. In these cases an unrelated partition
> can fail over completely without notice by those clients. An example may
> be the operation of a retail store chain with a partitioning key on the
> store id. While headquarters may not be able to get the big picture
> while a partition fails over, most of the individual stores can operate
> without a hitch throughout the event.

You are still using the word partitions as though there were some relationship between partitions and RAC: there is not.

> It's my understanding that remastering locks by contrast has a global
> impact. Does this happen twice, btw? First time when the node goes down.
> Second time when it comes up?

Lock what? I have no idea what you are referring to. Why would an Oracle SELECT statement lock anything? Could it? Of course. But generally that is not the case, as the multiversioning architecture does not require it.
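The multiversioning point can be sketched in a few lines (a hedged toy model: Oracle implements this with SCNs, undo segments, and read consistency, not a Python dict, but the reader-never-blocks-writer property is the same):

```python
# Minimal multiversion read-consistency sketch: a reader takes a snapshot
# and sees only versions committed at or before it, taking no locks, so
# concurrent writers are never blocked by a SELECT.

class MVStore:
    def __init__(self):
        self.scn = 0          # monotonically increasing "commit number"
        self.versions = {}    # key -> list of (scn, value), oldest first

    def write(self, key, value):
        self.scn += 1
        self.versions.setdefault(key, []).append((self.scn, value))

    def snapshot(self):
        return self.scn       # the reader's consistent point in time

    def read(self, key, snap):
        """Latest version committed at or before `snap`; no lock taken."""
        for scn, value in reversed(self.versions.get(key, [])):
            if scn <= snap:
                return value
        return None

store = MVStore()
store.write("x", 1)
snap = store.snapshot()                  # reader starts here
store.write("x", 2)                      # writer proceeds without waiting
assert store.read("x", snap) == 1        # reader still sees its snapshot
assert store.read("x", store.snapshot()) == 2
```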

> While _adding_ a partition to increase capacity would usually be
> followed by a redistribute which, today, has significant impact, the
> fact that nodes are not physical has interesting implications.

That redistribution, we should note, requires that the database be taken off-line. This never happens with RAC.

> E.g. one can oversize the number of partitions per physical node.

Which equates to what in Oracle?

> When more capacity is needed additional hardware can be made available
> and the partitions themselves get redistributed amongst the new
> infrastructure. Note that no data has to move since no partition was
> added or removed.

And if one adds or removes nodes with RAC, no alteration of any database object is required, nor does any line of code require modification.
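The over-provisioned-partitions scheme described above can be sketched as follows (a hedged illustration, not DB2's actual mechanism: the partition count, node names, and round-robin assignment are assumptions made for the example):

```python
# Sketch of "more logical partitions than physical nodes": rows hash into
# a FIXED set of logical partitions, and only the partition->node map
# changes when hardware is added. No row is rehashed; whole partitions
# simply move to new nodes.

NUM_PARTITIONS = 8  # fixed for the life of the database

def partition_of(key: str) -> int:
    # A row's logical partition never changes, whatever the node count.
    # (Python's str hash is salted per process; fine for a sketch.)
    return hash(key) % NUM_PARTITIONS

def assign(nodes: list) -> dict:
    """Round-robin the fixed logical partitions across physical nodes."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}

two_nodes  = assign(["nodeA", "nodeB"])
four_nodes = assign(["nodeA", "nodeB", "nodeC", "nodeD"])
# Growing from two nodes to four changes only the hosting node of some
# partitions; every key's logical partition number is unchanged.
```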

> Similarly one would never remove a partition during an HA event or to
> lower resource consumption. One simply restarts the partition or clears
> out a physical node.

Ah, but we can. With 10g Grid Control I can dynamically reassign resources from a database node to an application server node and back. I can dynamically redistribute CPU and RAM to where they are needed.

> In reality most DB2 + DPF customers run more than one partition per
> physical node. So this is indeed a realistic option.

But how does this change the fact that with DB2, the more nodes that exist, the more likely a system failure?

-- 
Daniel Morgan
http://www.outreach.washington.edu/ext/certificates/oad/oad_crs.asp
http://www.outreach.washington.edu/ext/certificates/aoa/aoa_crs.asp
damorgan_at_x.washington.edu
(replace 'x' with a 'u' to reply)
Received on Fri Jun 18 2004 - 18:47:37 CDT

