Re: Large tables, updates, selects and varchars

From: mathewbutler <mathewbutler_at_yahoo.com>
Date: Thu, 3 Apr 2008 07:22:30 -0700 (PDT)
Message-ID: <ccd66887-5da4-438b-9cee-7806cce0a56f@s13g2000prd.googlegroups.com>


On 3 Apr, 13:10, chrism..._at_gmail.com wrote:
> On Apr 2, 9:02 pm, DA Morgan <damor..._at_psoug.org> wrote:
>
> > chrism..._at_gmail.com wrote:
> > > I am using Oracle 10g (10.2.0.3.0) and have a large table (1+ billion
> > > rows).  It currently has about 15 columns in it.  I have a requirement
> > > for a new varchar2(4000) column to go with the current data in that
> > > table.  I need to update this column after it has been added to the
> > > table.  I have been told that I may be better off putting the column
> > > in a separate table because by adding it to an existing table, Oracle
> > > has to jump around the hard drive to update it and find it, and
> > > therefore this will degrade performance.  Of course, by putting it in
> > > a separate table, a join will be required on selects.
>
> > > From a purely design point of view, it makes a lot more sense to add
> > > the column to the existing table, but I don't know the internals of
> > > Oracle.  When modifying existing tables on large Oracle databases, do
> > > you generally have to be wary of what columns you add?
>
> > > I can't really test this ahead of time--the update will take a very
> > > long time to perform, because there are many rows and a calculation is
> > > involved.
>
> > > Thanks.
>
> > 1+ billion rows with or without partitioning?
>
> > Is this the last change that will ever take place or part of what may
> > be ongoing modifications?
>
> > How is the new column going to be used? Will most queries need to
> > access the new data? Why VARCHAR2(4000) and not CLOB?
> > --
> > Daniel A. Morgan
> > Oracle Ace Director & Instructor
> > University of Washington
> > damor..._at_x.washington.edu (replace x with u to respond)
> > Puget Sound Oracle Users Group www.psoug.org
>
> We are partitioning with between 35 million and 60 million rows per
> partition.  This probably will not be the last change we are going to
> make.  There are likely to be future modifications.
>
> It's going to store file paths and most queries will need to access
> it.  We chose VARCHAR2 because the size of the data will never exceed
> 4000 chars, and we assumed that CLOBs were less efficient than
> VARCHAR2s.  Is this not the case?

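This doesn't answer your question, but if your O/S is POSIX compliant then file paths are likely to be limited to something shorter than the 4000 characters you have defined (255 characters, from dim and distant memory; the exact limit varies by platform and file system, so it is worth checking).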
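
As a rough way to check the limits on a given Unix-like system, a short Python sketch follows. It was not part of the original discussion; `path_limits` is a hypothetical helper name, and the values reported depend on the platform and file system being queried.

```python
import os


def path_limits(directory="/"):
    """Return the name and path length limits the OS reports for `directory`.

    Uses os.pathconf, which is only available on Unix-like systems.
    The returned values are whatever the local platform reports.
    """
    return {
        "NAME_MAX": os.pathconf(directory, "PC_NAME_MAX"),  # longest single file name
        "PATH_MAX": os.pathconf(directory, "PC_PATH_MAX"),  # longest relative path
    }


limits = path_limits("/")
for name, value in limits.items():
    print(f"{name}: {value}")
```

If the reported path limit is well below 4000, the column could be declared correspondingly smaller.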

I think you will need to consider:

  • your transaction mix (the answer to your question will depend on the balance of inserts/updates/deletes and selects)
  • your database block size (how many rows will fit onto a block, whether you will have significant row migration with the new column, and, where this occurs, whether the additional work that Oracle needs to do will be less than the alternative join)
  • space implications of any additional storage required, depending on the modelling approach
  • whether the approach adds to or takes anything away from your system's simplicity and code maintainability; if your proposed approach takes something away, measure this against the gains.
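
To put rough numbers on the block size point, here is a back-of-envelope sketch in Python. Everything in it is an assumption for illustration: the 8 KB block, the 100 bytes of block overhead, PCTFREE of 10, the 100-byte existing row and the 120-byte average path. It is a crude model, not a description of how Oracle actually lays out a block, so substitute figures measured on your own system.

```python
def data_area(block_size, block_overhead=100):
    """Bytes in a block available for row data (overhead is an assumed figure)."""
    return block_size - block_overhead


def fill_limit(block_size, pctfree=10, block_overhead=100):
    """Bytes that inserts may fill before the PCTFREE reserve is reached."""
    area = data_area(block_size, block_overhead)
    return area - area * pctfree // 100


def rows_per_block(block_size, avg_row_bytes, pctfree=10, block_overhead=100):
    """Approximate number of rows inserted into one block."""
    return fill_limit(block_size, pctfree, block_overhead) // avg_row_bytes


def rows_growing_in_place(block_size, avg_row_bytes, growth_bytes,
                          pctfree=10, block_overhead=100):
    """Approximate number of rows per block that can grow by `growth_bytes`
    without running out of space in the block (the rest would migrate)."""
    rows = rows_per_block(block_size, avg_row_bytes, pctfree, block_overhead)
    free = data_area(block_size, block_overhead) - rows * avg_row_bytes
    return min(rows, free // growth_bytes)


# Assumed figures: 8 KB block, 100-byte rows today, 120-byte paths added later.
before = rows_per_block(8192, 100)
in_place = rows_growing_in_place(8192, 100, 120)
print(f"rows per block today: {before}")
print(f"rows that can take the new value in place: {in_place}")
print(f"rows that would have to move: {before - in_place}")
```

Under these assumed figures most rows in a full block cannot take the new value in place, which is the migration cost to weigh against the join.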

It's not clear what you mean by update, i.e. whether you are concerned about the initial population of the new column for existing table entries, or the ongoing management of values in this column. In any case, remember that if you are modelling separately you will also be storing 1+ billion PK entries to enable you to find the new column.
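
The cost of those extra key entries can be estimated with simple arithmetic. The Python sketch below is only illustrative: the 8-byte key and the per-row and per-index-entry overheads are placeholder assumptions, not measured Oracle figures, and `side_table_key_overhead` is a hypothetical name.

```python
def side_table_key_overhead(rows, key_bytes,
                            table_row_overhead=3, index_entry_overhead=10):
    """Estimate the bytes spent purely on repeating the key in a side table.

    Counts the key once in the side table's rows and once in its primary key
    index. The overhead defaults are placeholder assumptions.
    """
    table_bytes = rows * (key_bytes + table_row_overhead)
    index_bytes = rows * (key_bytes + index_entry_overhead)
    return table_bytes + index_bytes


# Assumed figures: one billion rows and an 8-byte key.
extra = side_table_key_overhead(1_000_000_000, 8)
print(f"extra storage for keys: {extra / 1024**3:.1f} GiB")
```

This is storage the single-table design does not need at all, before counting the path values themselves.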

Hope this is some help.

Received on Thu Apr 03 2008 - 09:22:30 CDT