Re: Understanding partitioning by hash

From: Howard J. Rogers <hjr_at_dizwell.com>
Date: Thu, 09 Oct 2003 07:31:23 +1000
Message-Id: <3f8482bd$0$7066$afc38c87@news.optusnet.com.au>

Rick Denoire wrote:

> "Luc Gyselinck" <Luc.Gyselinck_at_nospampandora.be> wrote:
>
>
>> Oracle will make
>>sure there will be an (almost) evenly distribution. In our database the
>>difference between the less and most populated partitions is only about
>>10%.
>>
>>An example of a hash function (for VARCHAR2) is in DBMS_UTILITY:
>>
>> function get_hash_value(name varchar2, base number, hash_size number)
>>
>>These functions are deterministic, they will always return the same value
>>for the same key!
>>So all rows with the same partitioning key will go into the same
>>partition!
>
> If the partition for a record is deterministically dependent on the
> partition key, how can oracle distribute them (almost) evenly? This
> point remains still unclear for me.

Because it's choice of hashing algorithm is pretty smart. But yes, a badly-chosen hashing algorithm (such as MOD when you ask for 5 hash partitions and your partition key is a sequence number that increases in steps of 10) would mean a very nasty distribution of the data indeed. But since in 9i you can't choose your own hashing algorithm, it's difficult for the thing to completely mess up.

But Oracle's default algorithm is, as I say, pretty smart. Choice of the number of partitions is, however, important.

>
> Partitioning is transparent for the SQL code, so it does not need to
> take partitioning into consideration when doing a select statement.
> But what if doing inserts? Will Oracle automatically create the
> corresponding partitions when new values of the partitioning key are
> used?

Of course not. You create the table with (say) 6 partitions. That's it. YOu don't suddenly get a seventh partition when a new partition key is inserted. The hashing algorithm used takes account of the fact that there are only 6 partitions to work with, and allocates accordingly amongst those 6.

Does that mean you can get over-populated and under-populated partitions? Absolutely. A bad choice of the number of partitions can indeed screw you up badly. But it's not normally fatal.

>I hope so, otherwise any insert statement that would put a
> record on an non-existing partition (because the partitioning key is
> new) will fail!

It won't fail. It will be housed amongst the existing partitions.

>
> I am trying hard to learn enough about partitioning, but I am afraid
> that the "gotchas" will appear when the partitioned table is already
> in production... I need to be sure that after partitioning a table,
> everything will still work like before, with no adverse consequences
> and no need of special administrative care (of course, rebuilding
> indexes of local partitions will occur partitionwise, but that shuld
> not be a problem).

Everything will still work, but not exactly as before, because you're using a new feature. Yes, there are a number of new things to consider (such as an appropriate number of hash partitions, striking a balance between the benefits of partition elimination (the more partitios, the better) and poor data distribution (the 'right' number is important to determine, and more doesn't necessarily mean better).

But in terms of worrying about inserts failing, for hash partitions that's not an issue (for range partitions without a MAXVALUES clause, it is...but that's a different story).

Regards
HJR
>
> Thanks for your explanation!
>
> Rick Denoire

-- 
--------------------------------------------
See my brand new website, soon to be full of 
new articles: www.dizwell.com.
Nothing much there yet, but give it time!!
--------------------------------------------

Received on Wed Oct 08 2003 - 16:31:23 CDT