
Re: how to rebuild index after direct path load?

From: -=< a q u a b u b b l e >=- <aquabubble_at_Remove.No.Parma.Ham.Remove.geocities.com>
Date: Sat, 27 Nov 1999 02:29:21 -0000
Message-ID: <821ear$ekm$1@news6.svr.pol.co.uk>


<hkpang_at_my-deja.com> wrote in message news:81l88d$pm3$1_at_nnrp1.deja.com...
> Using conventional path is about 10-20 times slower. I don't think I can
> do that.

It is slower, but the number of indexes and constraints determines by how much; 10-20 times would be the worst case, I think. If you turn them all off, as is recommended, then you will see it run about 2-4 times slower, in my experience.
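
Just as a sketch, the disable/re-enable dance looks like this (the table
and constraint names are made up):

  ALTER TABLE big_table DISABLE CONSTRAINT big_table_pk;

  -- ... run sqlldr with the conventional path here ...

  ALTER TABLE big_table ENABLE CONSTRAINT big_table_pk;

Of course, if the unique key is disabled during the load, the duplicates
will get in, and you're back to cleaning them up before the ENABLE will
succeed.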

> For the second method using an exception table, as I already noted, both
> the new record and the old record will be written to the exception
> table. I will have to run a query to delete rows whose rowid is smaller.
> This is also very slow.

Sorry, I was going too quickly there! I wouldn't recommend the suggestion in the other thread, as I believe it would be slower than the first one here. The way we deal with this situation is to use a DELETE based on rowids against the table containing the duplicates... is this how you were doing it?
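
In case it's useful, ours looks something like this (a sketch only; I'm
inventing the names big_table and pk_col, and assuming the direct-path
load appended the new rows above the high-water mark, so the newly
loaded duplicates have the larger rowids):

  -- Remove the older copy of each duplicated key, keeping the row
  -- with the highest rowid (assumed to be the newly loaded one):
  DELETE FROM big_table t
   WHERE t.rowid < (SELECT MAX(t2.rowid)
                      FROM big_table t2
                     WHERE t2.pk_col = t.pk_col);

Rows whose key isn't duplicated are their own MAX(rowid), so they are
left alone.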

> I am puzzled why Oracle would write both the old and new rows to the
> exception table. I would expect Oracle to skip the first one.

Yes, crap isn't it!? I think they've gone for adherence to rules over functionality here.
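
For what it's worth, the whole exceptions dance looks roughly like this
(a sketch; utlexcpt.sql ships with Oracle and creates the default
EXCEPTIONS table, everything else here is an invented name):

  -- Create the EXCEPTIONS table once:
  -- @?/rdbms/admin/utlexcpt.sql

  -- This fails, but it logs the rowid of *every* offending row
  -- (old and new) into EXCEPTIONS:
  ALTER TABLE big_table
    ENABLE CONSTRAINT big_table_pk
    EXCEPTIONS INTO exceptions;

  -- Both halves of each duplicate pair are logged, so keep the one
  -- with the larger rowid (assumed to be the newly loaded row):
  DELETE FROM big_table t
   WHERE t.rowid IN (SELECT row_id FROM exceptions)
     AND t.rowid < (SELECT MAX(t2.rowid)
                      FROM big_table t2
                     WHERE t2.pk_col = t.pk_col);

  ALTER TABLE big_table ENABLE CONSTRAINT big_table_pk;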

> In article <81k3qt$spu$1_at_news6.svr.pol.co.uk>,
> "-=< a q u a b u b b l e >=-"
> <aquabubble_at_Remove.No.Parma.Ham.Remove.geocities.com> wrote:
> > <hkpang_at_my-deja.com> wrote in message news:81hbld$6nu$1_at_nnrp1.deja.com...
> > > Hi,
> > >
> > > I need to upload a huge dataset into an existing table. If the
> > > dataset contains duplicate keys, I want to delete the old ones
> > > already in the table.
> > >
> > > My current approach is using sqlldr direct path. Since the data
> > > contains duplicate keys, the primary key index is violated and put
> > > in direct load state afterward.
> > >
> > > Now my problem is how to rebuild the index *and* eliminate the
> > > duplicates?
> > >
> > > 1) Use a DELETE command to eliminate the duplicates. This kind of
> > > works, but the performance is unacceptable; I potentially have
> > > millions to delete.
> > >
> > > 2) Drop the index before upload, re-enable the index with
> > > exceptions going to an exception table. The problem is that both
> > > the old rows and the new rows are in the table, and there is no
> > > way I can figure out which ones to eliminate.
> > >
> > > Any suggestions???
> > >
> > > Many thanks for making a happy Thanksgiving possible. :-)
> >
> > Okay, I have two suggestions. The simplest is to have your
> > unique/primary key constraint enabled when you do your SQL*Load (I
> > think for this you have to use the conventional path). Here, all
> > exceptions won't be loaded by default, and the rejected rows will be
> > written to a bad file. Oh, you have to make sure that your number of
> > allowed exceptions is defined as very large in your SQL*Loader
> > control file.
> >
> > Secondly, if you use method (2) you can determine the rowids of all
> > rows that cause an exception from the EXCEPTIONS table. It would be
> > a very efficient DELETE to delete from your table where rowid = ...
> > I'd opt for the latter, since I would think it was faster, and you
> > also have more control over what is/isn't deleted.
> >
> > HTH
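
P.S. To illustrate the first suggestion quoted above: the number of
allowed rejects is the ERRORS setting, which can go at the top of the
control file. A sketch (the file, table, and column names are invented):

  OPTIONS (ERRORS=999999999)   -- effectively unlimited rejects
  LOAD DATA
  INFILE 'big_data.dat'
  APPEND
  INTO TABLE big_table
  FIELDS TERMINATED BY ','
  (pk_col, col1, col2)

With the primary key enabled, the duplicate-key rows are rejected during
the load and end up in big_data.bad.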
