Re: Reorganize Big Table with BLOB column

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Thu, 11 Mar 2021 10:50:19 +0000
Message-ID: <CAGtsp8kAAvSMbcsDQbqZutfJ+LKRWuUZ4az8BdmjSgQ7U8=2+w_at_mail.gmail.com>



Lothar,

Even though it doesn't say anything about partitioning, you might find some useful ideas on related topics in this note that I wrote a few years ago: https://www.red-gate.com/simple-talk/sql/oracle/text-indexes

Regards
Jonathan Lewis

On Wed, 10 Mar 2021 at 12:50, Lothar Flatz <l.flatz_at_bluewin.ch> wrote:

> Hi Jonathan,
>
>
> I am currently still in the process of finding the best way of moving the
> data. I believe I have now found the best storage characteristics for the
> BLOB: (DISABLE STORAGE IN ROW CHUNK 32K RETENTION NONE
> NOCACHE NOLOGGING). The average length of the BLOB is 130886 bytes.
> My main concern is the text index, though. From previous experience I can
> say that it was almost impossible to build a context index on a table of
> this size; the index creation simply never finished.
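> For reference, a minimal sketch of the move with those storage settings
> (table and column names are placeholders, and I am assuming a SecureFiles
> LOB, since RETENTION NONE is a SecureFiles option):
>
> ```sql
> ALTER TABLE docs MOVE
>   LOB (payload) STORE AS SECUREFILE (
>     DISABLE STORAGE IN ROW
>     CHUNK 32K
>     RETENTION NONE
>     NOCACHE NOLOGGING
>   );
> ```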
>
> Questions:
>
> 1. I wonder if it would be possible to build a global context index by
> creating it per partition using partial global indexing. You would switch
> on indexing partition by partition and rebuild the index. (It
> might rebuild the whole context index each time, though..)
> 2. Do you think that the time to build a context index can increase
> more than linearly as the number of rows increases? You were mentioning
> memory. Does this suggest that the memory needed to build the context index
> grows with its size, and that when memory is exhausted the text index
> creation stalls?
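>
> If partial indexing is honoured for context indexes in 19c (worth checking
> against the docs - this is only the shape of the idea, with placeholder
> names), question 1 might look something like:
>
> ```sql
> CREATE INDEX docs_ctx ON docs (payload)
>   INDEXTYPE IS CTXSYS.CONTEXT
>   INDEXING PARTIAL;
>
> -- then switch partitions on one at a time and refresh the index
> ALTER TABLE docs MODIFY PARTITION p2010 INDEXING ON;
> ```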
>
> *Do you mean the application doesn't support partitioning by year *
>
> Yes. The partition key won't be used for partition pruning. The table is
> rather simple: id, BLOB, timestamp created and last modified. (The
> timestamps are stored as VARCHAR2, a practice I dislike.)
> I could construct something like a partition key by extracting the year
> from the created timestamp. The application would be totally unaware of the
> partition key.
> Effectively that would mean looping over partitions each and every time
> the table is accessed via the context index.
> I remember that even looping over empty partitions can slow queries. Can
> you guesstimate the effect? We are talking 11 yearly partitions right now.
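>
> Assuming the created timestamp string starts with a four-digit year, the
> derived partition key could be a virtual column (names are placeholders):
>
> ```sql
> ALTER TABLE docs ADD (
>   created_year NUMBER GENERATED ALWAYS AS
>     (TO_NUMBER(SUBSTR(created, 1, 4))) VIRTUAL
> );
> -- then range-partition on created_year, e.g. via DBMS_REDEFINITION or CTAS
> ```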
>
> The table is accessed mainly by the id.
> Hash partitioning on the id would be an option. However, those partitions
> would be constantly growing in size.
>
> The version is 19.0. It does allow partitioning a context index.
> In this case a global index would make more sense. However we might still
> have an issue due to its size.
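>
> For the record, the local variant would be something like this (placeholder
> names; as far as I know a local context index requires a range-partitioned
> base table):
>
> ```sql
> CREATE INDEX docs_ctx ON docs (payload)
>   INDEXTYPE IS CTXSYS.CONTEXT
>   LOCAL;
> ```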
>
> "Not able to complete" - you said "write complete waits"
>
> Actually these waits happened when my colleague used ALTER TABLE MOVE. It
> turned out that the reason for these waits was that the BLOB was set to
> CACHE. I changed the property to NOCACHE and the waits are gone for good.
> We are seeing "direct path read"/"direct path write" waits now.
>
> Thanks
>
> Lothar
>
>
> Am 06.03.2021 um 09:53 schrieb Jonathan Lewis:
>
>
> Do you mean the application doesn't support partitioning by year - or is
> your version of Oracle unable to partition a context index?
>
> "Not able to complete" - you said "write complete waits" - there's a lot
> of data to be written, to several tables, indexes and lobs for a context
> index: I'd have to check the manuals to see what you can do about getting
> some of the work done with direct path writes, but I'd guess that your
> write complete waits relate to the volume of data being generated in the
> cache by a serial process. (It might be a good idea, though, to check the
> session stats for work done to achieve read-consistency e.g. undo records
> applied)
>
> How can you have "nearly no I/O" but suffer from write complete waits? The
> two symptoms seem contradictory. What is the O/S saying about it?
>
> If you partition the table AND the index can be locally partitioned in
> your version, then the scale of the problem changes because you could (I
> assume) create the index unusable then rebuild each partition in turn.
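> The create-unusable / rebuild-in-turn idea, sketched with placeholder names
> (I am assuming the syntax is accepted for a local domain index - worth
> testing on a small copy first):
>
> ```sql
> CREATE INDEX docs_ctx ON docs (payload)
>   INDEXTYPE IS CTXSYS.CONTEXT LOCAL UNUSABLE;
>
> ALTER INDEX docs_ctx REBUILD PARTITION p2010;
> ```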
> Can you extract the parameters that are currently being used so that you
> can see if there's anything that's been changed from default?
> Could you copy a couple of million rows from the table and see how Oracle
> behaves when creating an index on the subset?
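>
> For the current parameters, CTX_REPORT should show everything (index name
> is a placeholder):
>
> ```sql
> SELECT ctx_report.describe_index('DOCS_CTX') FROM dual;
>
> -- or query the attribute settings directly:
> SELECT * FROM ctx_user_index_values
> WHERE  ixv_index_name = 'DOCS_CTX';
> ```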
>
>
> Manual parallelism: I was thinking of something like:
> create an empty table with local indexes (if it's possible), with all the
> physical segments required;
> write a simple script that does "create table as select data for one year,
> create indexes, exchange partition" - then run as many copies as the
> machine will allow (vary according to what actually works).
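>
> A sketch of one iteration of that script (placeholder names; whether the
> exchange carries a domain index across is something to verify on your
> version):
>
> ```sql
> CREATE TABLE docs_2015 NOLOGGING AS
>   SELECT * FROM docs WHERE SUBSTR(created, 1, 4) = '2015';
>
> CREATE INDEX docs_2015_ctx ON docs_2015 (payload)
>   INDEXTYPE IS CTXSYS.CONTEXT;
>
> ALTER TABLE docs_part EXCHANGE PARTITION p2015
>   WITH TABLE docs_2015 INCLUDING INDEXES;
> ```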
>
> Regards
> Jonathan Lewis
>
>
>
>
>
>
>
> On Thu, 4 Mar 2021 at 17:30, Lothar Flatz <l.flatz_at_bluewin.ch> wrote:
>
>> Hi Jonathan
>>
>> thanks for answering. Some points require further explanation:
>>
>> I can get partitioning by year in place. It is however not supported by
>> the software and would just be a means of dividing the data into more
>> manageable pieces.
>> So far I have observed that, when trying to text-index the complete table,
>> progress drops after a while. Actually it never finishes.
>> So far my conclusion was that we lacked CPU, as there is nearly no I/O. Is
>> it possible that we were rather lacking memory (e.g. swapping) and I was
>> missing the point?
>> Can I improve my chances of creating the text index by partitioning the
>> data, resulting in a lower memory demand?
>>
>> By manual parallelism do you refer to dbms_parallel_execute?
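>>
>> (For reference, a minimal dbms_parallel_execute sketch - table and task
>> names are placeholders, and the copy statement is only illustrative:)
>>
>> ```sql
>> BEGIN
>>   dbms_parallel_execute.create_task('copy_docs');
>>   dbms_parallel_execute.create_chunks_by_rowid(
>>     task_name   => 'copy_docs',
>>     table_owner => USER,
>>     table_name  => 'DOCS',
>>     by_row      => TRUE,
>>     chunk_size  => 10000);
>>   dbms_parallel_execute.run_task(
>>     task_name      => 'copy_docs',
>>     sql_stmt       => 'INSERT INTO docs_new SELECT * FROM docs
>>                        WHERE rowid BETWEEN :start_id AND :end_id',
>>     language_flag  => DBMS_SQL.NATIVE,
>>     parallel_level => 8);
>> END;
>> /
>> ```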
>>
>> Regards
>>
>> Lothar
>>
>>
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Mar 11 2021 - 11:50:19 CET
