Re: Statistics during a complex load
Date: Sun, 20 Feb 2011 17:59:21 -0800 (PST)
On Feb 20, 5:00 pm, John Hurley <hurleyjo..._at_yahoo.com> wrote:
> # I've probably missed something, otherwise there would be a very
> simple solution: drop all constraints (eg. foreign keys) and indexes
> before the load, and, if necessary, create them afterwards. That's
> always been the fastest way to load data.
> Works well for non partitioned tables and you have the option of
> dropping or just disabling the foreign keys depending on exactly how
> they were setup initially.
> Of course if you need concurrent access to the table(s) while loading
> it gets more involved. Typically I would consider getting all the
> stuff into a staging table ...
> The volume of data and the timing of handling changes to it is
> critical to coming up with a complete understanding of what might
> potentially work.
What is the volume of data? If you are on 11gR2, look at exchange table and external tables and "insert into realtable select * from externaltable". This actually uses SQLLOADER under the sheets and can be blazingly fast. Bottlenecks are typically the "mappers" you described.
Just curious, but when doing a data "merge" how do you determine which column get updated and which columns stay the same? You would almost need a timestamp column for each data column... and that would be a real pain.
As for partitioning - look at interval partitions (also new in 11gR2) where there is no need to pre-create the partitions - this is really cool. In our testing we also discovered that even if we added a zero statistic value to the new partitions, performance on queries was more than acceptable.
As for the speed, I would look at something like use the external table definition, then run an insert/update procedure (that would do all of the checking using a join rather than point select statements.) Of course, when you have to "merge" data, that can be a REAL pain.
As for the previous reply to drop constraints etc, that can be very dangerous if you depend on that referential integrity. Indexes (except PK) maybe, but the thought of having to potentially clean up data so you can re-add those constraints is not a very pleasant thought. You would need to **guarantee** that the referential integrity was completed somewhere upstream - which may negate the need for it at the db level. Received on Sun Feb 20 2011 - 19:59:21 CST