Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: partitioning and execution plans?

Re: partitioning and execution plans?

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: 2000/08/13
Message-ID: <966162923.15236.0.nnrp-14.9e984b29@news.demon.co.uk>

>I don't quite understand your formula? power(2,N)???? huh? and
>yes, I have 10 partitions, 1 partition per tablespace.
>

power(2,N) - the number of hash partitions should be a power of 2, i.e. 2, 4, 8, 16, 32, ...

>>In general, 10 partitions will lead to 4 small and 6 large
>>partitions, where the large partitions are twice the
>>size of the small partitions.
>>
>Can you please let me know about the formula you have for
>calculating partitions?

Formula for sizing - Oracle allows you to add or remove one hash partition at a time (split/coalesce) without having to rebuild the entire table. In your case, therefore, Oracle will treat 10 partitions as 8 + 2. Partitions 1 and 9 will be the same size as each other as will partitions 2 and 10. If you coalesce once, then 2 and 10 will collapse down to partition 2, do it again and 1 and 9 will collapse down into partition 1. Hence the comment about 6 large and 4 small.
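The arithmetic behind that 8 + 2 split can be sketched in a few lines. This is an illustrative model of the standard linear-hashing scheme the behaviour implies, not Oracle's actual internal hash function (which is not published); the function and constant names are mine:

```python
# Sketch of the linear-hashing bucket arithmetic implied above.
# Oracle's real internal hash is undocumented; any uniform hash
# produces the same partition-size shape.

def bucket_for(hash_value, n_partitions):
    """Map a hash value to one of n_partitions linear-hash buckets.

    With n partitions where 2**(k-1) < n <= 2**k, take the value
    mod 2**k; if that lands on a bucket that does not exist (>= n),
    fold it back with mod 2**(k-1).
    """
    k = (n_partitions - 1).bit_length()   # smallest k with 2**k >= n
    b = hash_value % (2 ** k)
    if b >= n_partitions:
        b = hash_value % (2 ** (k - 1))
    return b                               # 0-based partition number

def bucket_shares(n_partitions, hash_space=2 ** 16):
    """Rows landing in each partition, given a uniform hash space."""
    counts = [0] * n_partitions
    for h in range(hash_space):
        counts[bucket_for(h, n_partitions)] += 1
    return counts

shares = bucket_shares(10)
# Partitions 3-8 (1-based) get twice the share of partitions
# 1, 2, 9 and 10: six large and four small, as described above.
```

Note also that a hash value folding out of the sixteenth slot lands exactly where a coalesce would send it, which is why partition 10 collapses into partition 2.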

>Right now the largest account has 500,000 rows. The newest
>account has 0 rows and the smallest non-zero account has 100
>rows.

Awkward - as you have discovered, Oracle is likely to switch back and forth between indexed paths and scans to deal with the two extremes. What indexes do you have at present?

>Originally the table was normalized without the account_id. We
>found data growing incredibly though. For just 10 accounts, the
>table was accruing almost 1 million records every 2 weeks. I
>knew it was going to be a HUGE table, so I thought partitioning
>would help...and account_id seemed the most logical choice.
>

Hash partitioning really requires you to have a reasonably large number of different values for the partitioning column. Hashing for 'partition count = number of distinct values' is not a good setup, even experimentally. You are unlikely to get one account per partition.
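A quick simulation makes the point. This uses an arbitrary md5-based stand-in for Oracle's undocumented hash, together with the same linear-hash folding as above; the names and the choice of md5 are assumptions for illustration only:

```python
# Illustrative simulation (not Oracle's real hash): how often do 10
# distinct account_ids spread themselves one-per-partition across 10
# hash partitions? Collisions and empty partitions are the normal
# outcome; a perfect spread is a rare accident.
import hashlib
import random

def bucket_for(account_id, n_partitions=10):
    # Deterministic stand-in hash; Oracle's internal function differs.
    h = int.from_bytes(hashlib.md5(str(account_id).encode()).digest()[:4], "big")
    k = (n_partitions - 1).bit_length()
    b = h % (2 ** k)
    return b if b < n_partitions else h % (2 ** (k - 1))

random.seed(0)
trials = 2000
perfect = 0
for _ in range(trials):
    ids = random.sample(range(1, 10 ** 6), 10)
    if len({bucket_for(i) for i in ids}) == 10:
        perfect += 1

# Even with a uniform hash, far fewer than 1% of random sets of
# 10 account_ids land one-per-partition.
```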

>First of all, this is a development database and I am a new
>DBA. I am actually an app developer trying to move into the
>role of a dba...so I am experimenting a lot.
>

This is a non-trivial problem, and you are addressing it with a new feature of Oracle. You may get lucky, but really no-one should ask a 'new DBA' to try to solve a problem like this. Get your management to call in a database architect for a few days. (I think I can say that without prejudice, since I'm based several thousand miles away and not likely to be a candidate for the task).

--

Jonathan Lewis
Yet another Oracle-related web site:  http://www.jlcomp.demon.co.uk

gdas wrote in message <0becf590.960f0885_at_usw-ex0102-015.remarq.com>...

>"Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote:
>>
>>We really need an idea of the total volume
>>of data and the variation in data volumes per
>>account.

>>
>>One starting thought - you say you have partitioned
>>over 10 tablespaces, does this also mean 10 partitions ?
>>You should always hash partition over power(2,N)
>>partitions to maximise your chances of equally sized
>>partitions.
>>

>>If you partition by hash (account_id), then any query
>>of the form:
>> account_id = {constant}
>>will automatically eliminate all but one partition, whether
>>or not the account exists at all - Oracle will simply access
>>the partition where it would be if it existed.
>>
>>Your best bet for getting the best access path all the
>>time is to analyze the index for the column account_id
>>to get histograms generated.
>
>I have tried this...and have had the same or worse results
>as when I was generating table statistics.
>
>>
>>One thought, however, for handling historical account
>>information - you might consider composite partitioning -
>> range partitioning by date, and hash subpartitioning
>>by account_id.
>>
>Unfortunately, I did not give enough info previously. I can't
>explain the real intent of the application. I can, but I'm not
>supposed to due to disclosure contracts. The historical data is
>not date-based. It is more like an account acquires a
>characteristic, and as time goes by an account acquires more and
>more characteristics. When the account acquired a
>characteristic is not important. What's most important is all
>the characteristics that have been acquired by an account.
>
>The table in question looks like this:
>
>characteristic_id number,
>account_id number,
>reference_id number,
>characteristic_type varchar2(100),
>value varchar2(100)
>
>characteristic_id is the primary key
>reference_id is the foreign key that is used to join the table
>to other tables.
>account_id is the partitioning column.
>
>Logically, an account has many references (each account has
>anywhere from 1-100 references). Each reference has 1-1000
>characteristics per account.
>
>
>First of all, this is a development database and I am a new
>DBA. I am actually an app developer trying to move into the
>role of a dba...so I am experimenting a lot.
>
>As you might be able to tell, I denormalized the schema by
>adding account_id to the characteristic table. The only reason
>I did this was specifically to take advantage of partitioning.
>
>Originally the table was normalized without the account_id. We
>found data growing incredibly though. For just 10 accounts, the
>table was accruing almost 1 million records every 2 weeks. I
>knew it was going to be a HUGE table, so I thought partitioning
>would help...and account_id seemed the most logical choice.
>
>I'm not sure what to try next. I'd appreciate any tips.
>>
>>--
>>
>>Jonathan Lewis
>>Yet another Oracle-related web site: http://www.jlcomp.demon.co.uk
>>
>>gdas wrote in message <004e2dc6.f97dbd35_at_usw-ex0101-006.remarq.com>...
>>>I found an interesting issue today...
>>>
>>>I have a large table that is hash partitioned across 10
>>>tablespaces by an account_id. This was done because a user will
>>>always be viewing information for their own account. However,
>>>the information for each account can have different amounts of
>>>data and thus varied cardinalities.
>>>
>>>I've been experimenting with analyzing this table with different
>>>levels of sampling. I've even done the same for all the
>>>individual partitions.
>>>
>>>The problem is that the execution plan is highly data
>>>dependent. An execution plan for one account that works well,
>>>doesn't work well for another account if the amount of data is
>>>greatly varied.
>>>
>>>Additionally, if an account does not yet exist in this
>>>table...an incredibly slow query is sent out which I have to
>>>kill after 2 minutes. This table is tracking history. And new
>>>accounts have no history yet... accounts that do have history
>>>come back in anywhere from 2-60 seconds (due to the problem above).
>>>
>>>I can logically deduce that since there is no record for the
>>>account in the history table yet, that there is no hash bucket
>>>yet assigned. So Oracle might be doing a full table scan here.
>>>(I do have local indexes created for the account_id).
>>>Also, I did an explain plan on a simple query for an account
>>>that did not exist yet "Select count(*) from table where
>>>account_id = n "...to see what partitions were being scanned.
>>>The output showed a start and stop on just a single partition.
>>>(Why it is scanning just one partition I don't know...but I
>>>suppose if it scanned all 10, the query would be ten times
>>>slower).
>>>
>>>For manageability, I don't want to have to create 1 partition
>>>per account.
>>>
>>>
>>>Is there a better way to gather statistics in order to account
>>>for the variances in cardinalities for individual accounts that
>>>exist within a single partition?
>>>
>>>Any ideas?
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
Received on Sun Aug 13 2000 - 00:00:00 CDT

