Re: tough choices

From: Serge Rielau <srielau_at_ca.eye-be-em.com>
Date: Fri, 18 Jun 2004 22:19:20 -0400
Message-ID: <cb07rs$ijt$1@hanover.torolab.ibm.com>

Daniel Morgan wrote:

> Serge Rielau wrote:
>

>> Daniel Morgan wrote:
>>
>>> If I have a single table in a tablespace in Oracle stored in a single
>>> datafile the data is equally accessible from all nodes. Lose a node and
>>> there is no negative effect on the ability to access any part of the
>>> data in that table. Federate the data as is required by shared nothing
>>> architectures and the loss of a single node effectively kills the
>>> system.
>>>
>>> Thus with shared everything the more nodes the less likely a failure
>>> whereas with shared nothing loss of a node makes part of the data
>>> inaccessible.
>>
>>
>> Let's clarify the lingo:
>> DB2 II uses federation. One of the main properties of federation is
>> that any one piece of the federation is pretty independent from the
>> rest. They may not even know or admit they are part of it (a bit like
>> Bavaria or Quebec ;-)
>> DB2 UDB for LUW + DPF is a partitioned DBMS. The partitioning is done
>> on a logical level. Partitions assume logical (not physical) ownership
>> of data.
>> If a partition goes down it can either simply be restarted, or, if
>> there is a problem with the hardware that hosts the partition, the
>> partition can be restarted on other hardware.

> 
> 
> Partitioning of data in DB2 is an entirely different concept than is the
> partitioning of data in Oracle so the word may be misunderstood.
> Partitioning in Oracle has nothing to do with RAC or clustering or nodes.
> In Oracle, as you know, data is never federated.

Actually both these interpretations can happily live next to each other: Oracle refers to a partitioned table while DB2 refers to a partitioned database. No judgment. Just applications of the same technique at a different level.
Oracle, btw, uses Transparent Gateway, and supposedly OLEDB I would guess. So federation is very well available, although not heavily pushed.

>> The data sits on the same physical storage subsystem as it does in a
>> shared disc.
>> There is no problem for a partition to access that data from any
>> physical node it happens to run on using the same technology Oracle
>> exploits

> 
> 
> It can't use the same technology as shared everything is not shared
> nothing and vice versa. Can you clarify what you intended to say?

Yes I can, or at least I can try, being fairly detached from cables and plugs myself.
In an Oracle RAC setup each Oracle instance(?) of the cluster has access to all the data. Obviously the system that the instance is running on must have an interconnect to the storage subsystem. Now, in DB2 with DPF a given database partition is only allowed access to the data that it owns. That data usually still lives on the shared storage subsystem, and hence the system that the partition runs on has an interconnect to that subsystem, just the same as Oracle RAC has. So, the system has access to all the data. The DB2 partition has not.

>> So the question really boils down to this:
>> 1. The time it takes to detect a partition is down. That is a
>> universal problem independent of the underlying technology.
> That time is as long as it takes to rip the power cord out of the back
> of the machine: A fraction of a second.

OK, fine by me.

>> 2. The time it takes to get the partition back into the game compared
>> to the failover work a RAC cluster has to do (such as remastering
>> locks).

> In my tests that is less than one second. Far too small with RAC to be of
> concern to anyone.

OK, I take your word for it.

>> A curious fact (no more no less) is that there are numerous kinds of
>> applications where a set of clients only ever accesses one or a few
>> partitions of the overall system. In these cases an unrelated
>> partition can fail over completely without notice by those clients. An
>> example may be the operation of a retail store chain with a
>> partitioning key on the store id. While headquarters may not be able to
>> get the big picture while a partition fails over, most of the
>> individual stores can operate without a hitch throughout the event.

> You are still using the word partitions as though there is some
> relationship between partitions and RAC: There is not.

I was talking about DB2. Above is a DB2 + DPF scenario using DB2 lingo, which I set out to clarify.
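
To make the store-chain scenario concrete, here is a minimal Python sketch. It is not DB2 code: the partition count, the modulo mapping and the function names are all made up for illustration (DB2 uses its own hashing of the partitioning key). It only shows that clients keyed to other partitions never touch the one that is failing over.

```python
# Hypothetical sketch: NUM_PARTITIONS and the modulo mapping are
# illustrative assumptions, not DB2's actual hashing scheme.
NUM_PARTITIONS = 4


def partition_for(store_id: int) -> int:
    """Map a store id (the partitioning key) to a logical partition."""
    return store_id % NUM_PARTITIONS


def affected_stores(store_ids, failing_partition):
    """Return the stores whose rows live on the partition that is failing over."""
    return [s for s in store_ids if partition_for(s) == failing_partition]


stores = list(range(1, 13))   # twelve made-up store ids
failing = 2                   # the partition currently failing over

blocked = affected_stores(stores, failing)
unaffected = [s for s in stores if s not in blocked]

print("waiting on failover:", blocked)
print("operating normally: ", unaffected)
```

Headquarters, which reads across all partitions, would have to wait; the stores in the second list would not notice.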

>> It's my understanding that remastering locks by contrast has a global
>> impact. Does this happen twice, btw? First time when the node goes
>> down. Second time when it comes up?

> Lock what? I have no idea what you are referring to? Why would an Oracle
> SELECT statement lock anything? Could it ... of course? But generally
> that is not the case as the multiversioning architecture does not
> require it.

Where did SELECT come into play here? This is cache-fusion stuff. There is a need to synch up. The details you know certainly better than I do, and HJR better than both of us I presume.

>> While _adding_ a partition to increase capacity would usually be
>> followed by a redistribute which, today, has significant impact, the
>> fact that nodes are not physical has interesting implications.

> 
> 
> That redistribution we should note requires that the database be taken
> off-line. This never happens with RAC.

Significant impact, as I say.

>> E.g. one can oversize the number of partitions per physical node.
> Which equates to what in Oracle?
Does it have to equate to anything on Oracle? I just describe a DB2 + DPF setup.

>> When more capacity is needed additional hardware can be made available
>> and the partitions themselves get redistributed amongst the new
>> infrastructure. Note that no data has to move since no partition was
>> added or removed.
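
A rough Python sketch of that idea, under loud assumptions: the host names, the round-robin placement and the modulo row mapping are invented for illustration and are not how DB2 actually places partitions. The point is only that the row-to-partition mapping depends on the number of logical partitions, not on the number of machines.

```python
# Hypothetical sketch: host names and round-robin placement are
# illustrative assumptions, not DB2's placement mechanism.
def place(partitions, hosts):
    """Assign logical partitions to physical hosts round-robin."""
    return {p: hosts[i % len(hosts)] for i, p in enumerate(partitions)}


def row_partition(key: int, num_partitions: int) -> int:
    """Map a row's partitioning key to a logical partition."""
    return key % num_partitions


partitions = list(range(8))  # oversized: 8 logical partitions

before = place(partitions, ["hostA", "hostB"])                    # 4 per host
after = place(partitions, ["hostA", "hostB", "hostC", "hostD"])   # 2 per host

# Partitions that now run on a different machine.
relocated = [p for p in partitions if before[p] != after[p]]

# Rows never change partition, because the partition count is unchanged.
rows_moved = [k for k in range(100)
              if row_partition(k, len(partitions)) != row_partition(k, 8)]

print("partitions restarted elsewhere:", relocated)
print("rows that changed partition:   ", rows_moved)
```

Partitions get restarted on the new hardware, but no row changes its owning partition, so no redistribute is needed.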

> And if one adds or removes nodes with RAC no alteration of any database
> object is required nor does any line of code require modification.

OK, please read the following sentence ATOMIC. Do not take parts out of it:
BEGIN ATOMIC
Adding a node in DB2 + DPF does not require application changes. If you go from n -> m where _n>1_ there is no need. Going from DB2 without DPF to DB2 with DPF requires some thought, because now the partitioning keys need to be decided on. W.r.t. speeding up, e.g., batch jobs, Oracle and DB2 are actually in a similar boat. To speed up a procedural thing one calls it n times and partitions the dataset to keep the threads out of each other's hair. There is no magic here.
The difference between a parallelized Oracle batch job and a parallelized DB2 batch job is that in Oracle you may want to partition by ranges of data (maybe along table partitions). In DB2 the parallelization occurs first along the partitioning key; ranges can then be used in addition. If a new database partition gets added, no change. Simply more instances of the batch job.
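
The "call it n times" pattern can be sketched in a few lines of Python. This is only an illustration: the row data, the modulo partitioning and the function names are hypothetical, and a thread pool stands in for whatever actually launches the batch job instances.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_of(key: int, num_partitions: int) -> int:
    """Hypothetical mapping of a partitioning key to a database partition."""
    return key % num_partitions


def batch_job(partition: int, rows, num_partitions: int) -> int:
    """One instance of the batch job: it only touches its own partition's rows."""
    return sum(amount for key, amount in rows
               if partition_of(key, num_partitions) == partition)


def run_batch(rows, num_partitions: int):
    """Start one job instance per partition and collect the per-partition totals."""
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        return list(pool.map(
            lambda p: batch_job(p, rows, num_partitions),
            range(num_partitions)))


rows = [(k, k * 10) for k in range(1, 21)]  # made-up (key, amount) pairs

totals_4 = run_batch(rows, 4)
totals_5 = run_batch(rows, 5)  # one more partition: just one more instance

print(totals_4, sum(totals_4))
print(totals_5, sum(totals_5))
```

The job itself is unchanged when the partition count goes from 4 to 5; only the number of instances differs, and each instance stays out of the others' hair because the key sets are disjoint.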

>> Similarly one would never remove a partition during an HA event or to
>> lower resource consumption. One simply restarts the partition or
>> clears out a physical node.

> Ah but we can. With the 10g grid control I can dynamically reassign
> resources from a database node to an application server node and back.
> I can dynamically redistribute the CPU and RAM where it is needed.

We? I thought you were not tied to Oracle? Yes, and I will be the last to say that that's not a good thing (albeit I don't know it). However, do you see anything in the design described above that precludes DB2 + DPF from doing the same? There may not be a "grid-control", but that's no design flaw, just a choice of focus. Automatic failover, workload balancing and client reroute technology are independent of "shared disk" vs "shared nothing".

>> In reality most DB2 + DPF customers run more than one partition per
>> physical node. So this is indeed a realistic option.
> But how does this change the fact that with DB2 the more nodes that
> exist the more likely a system failure?

The more RAC nodes you have the more likely it is that one of these Dell boxes gets sick.
Oracle RAC can fail over and not turn the sick box into a failed system. The same is true for DB2 + DPF. A database partition can fail over and not turn a sick box into a failed system. You have to say goodbye to the notion that a DB2 database partition fails completely and _irrevocably_; it's just a program. If a program is that sick the system as a whole is sick, because it's just a copy of the same program with the same bugs, and that is as true for DB2 + DPF as it is for Oracle RAC.
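
A back-of-the-envelope sketch of the "more boxes, more sick boxes" arithmetic, in Python. The per-box failure probability is a made-up number and the boxes are assumed to fail independently, which real hardware does not promise. The second function is a deliberately crude stand-in for "failover works as long as some box is left"; it is not a model of either product.

```python
def p_any_box_sick(p: float, n: int) -> float:
    """Probability that at least one of n boxes is sick (independent failures)."""
    return 1 - (1 - p) ** n


def p_all_boxes_sick(p: float, n: int) -> float:
    """Probability that every box is sick, i.e. nothing is left to fail over to."""
    return p ** n


P_BOX = 0.01  # hypothetical chance that a given box is sick at some moment

for n in (1, 2, 4, 8):
    print(n, p_any_box_sick(P_BOX, n), p_all_boxes_sick(P_BOX, n))
```

The first column grows with the node count for any architecture. Whether a sick box becomes a failed system depends on failover, not on "shared disk" vs "shared nothing".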

Cheers
Serge

-- 
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Received on Fri Jun 18 2004 - 21:19:20 CDT