Re: best way to invoke parallel in DW loads?
Date: Mon, 14 Sep 2009 10:14:09 -0700 (PDT)
I'm currently performing a ton of work with parallel processing and I've always been taught and treated parallel very differently from other Oracle features. Due to the fact that a parallel process is a very unique request for a specific process, I've always found it best to invoke it with hints.
I have tried paralle processing for certain objects, turning on PARALLEL_AUTOMATIC_TUNING, setting the degree on the tables that I wanted to run parallel processing from, forced it from the session or felt these would be the best objects to automate any processes against in parallel, etc., but I've found it rarely works as well as hard coding hints with procedures and packages to ensure consistency.
I came to the conclusion that it fell to the same arena as plan execution consistency- if you wanted this in 10g or higher, it's a fine art, never an exact science- so you would need to come up with a combination of parameter settings, hints, testing and knowing your system inside and out, (as well as training your developers that too much of a good thing is rarely a good thing....:))
As I'm also currently working in a new environment with tons of parallel processes without up to date statitistics- STATS are ESSENTIAL to efficient parallel process execution! I've been testing on a proof of concept that is showing up to 73% improvement in parallel processing performance just by having up to date statistics on partitions involved in any process. Mind you that these schemas are 2-6TB each, but the performance improvement percentage could be similar and worth the time to prove.
- On Mon, 9/14/09, Neil Kodner <nkodner_at_gmail.com> wrote:
From: Neil Kodner <nkodner_at_gmail.com>
Subject: best way to invoke parallel in DW loads?
Date: Monday, September 14, 2009, 7:21 AM
I've taken over a 9ir2 data warehouse and its loading process.
Here are some 'highlights' of the current 9i warehouse environment(I didn't do this!)
4x1200mhz sparcv9 cpu
parallel_max_servers - 70(!)
many, many, many PARALLEL and RULE hints in the warehouse load procedures. NO table statistics.
The warehouse was designed years ago by someone very well-versed in RBO. Management has insisted we leave it alone in its oddly-configured state because well, it works. It might not be efficient, and we've had moments with concurrency, but the thing works. It might take all weekend to process the load, but it runs, and it runs consistently. It is important to note that the production warehouse is NOT on a dedicated server. It's odd to see up to 30 parallel slaves on a 4-cpu machine but that's just the way it is.
My first task is to prepare the data load for a 10g migration. To this end, I've created the 10g development environment and copied all of the procs/data. I gather statistics on the tables and the procs seem to work fine with current statistics and the RULE hints removed.
In the past, the users have used PARALLEL hints to force parallel loading, often with 6 or 8 slaves on a 4-cpu box, even with multiple processes running.
My question lies with the best practices for invoking parallel. We will be moving the production warehouse to a machine with at least 8 CPU. In our development environment, I set
I'm wondering where to invoke parallel processing. If I compile the procs without hints and execute, they run without parallel. If I ALTER SESSION FORCE PARALLEL DML, I get terrific results. I'm not sure if that's the best way to enable parallel processing (and let parallel_adaptive_multi_user do its thing), or if I should be setting the degree of each and every segment. I'm sure that in this 10g and above world, parallel hints are not the way to go.
Is there a 'best practice' for handling this?Received on Mon Sep 14 2009 - 12:14:09 CDT