RE: Curiosity: single-column index on sparse data cannot be built in parallel

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Wed, 15 Jul 2015 13:52:06 -0400
Message-ID: <008c01d0bf26$f7c5eb30$e751c190$_at_rsiz.com>



Mohamed: Nice Find.

JL: Yes, sniping would be perilous and you’d have to backtrack somehow that nothing could still be expecting to reference it, ergo the “let’s hope not…”  

Lots of interesting stuff popped up via this thread.  

mwf  

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mohamed Houri Sent: Wednesday, July 15, 2015 1:37 PM
To: Jonathan Lewis
Cc: Mark W. Farnham; sacrophyte_at_gmail.com; ORACLE-L Subject: Re: Curiosity: single-column index on sparse data cannot be built in parallel  

Jonathan,  

  1. in more complex queries it would be perfectly feasible for one set of slaves to be idle for a very long time while the other set was busy

Not only this is possible but when one parallel slave stays more than 30 minutes inactive it will produces the following Real Time SQL monitoring Report oddity  

https://hourim.wordpress.com/2015/06/23/real-time-sql-monitoring-oddity/  

Best regards

Mohamed Houri  

2015-07-15 13:32 GMT+02:00 Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>:

Marks,

Thanks.
I don't think you can risk sniping, though, as

  1. in more complex queries it would be perfectly feasible for one set of slaves to be idle for a very long time while the other set was busy
  2. the query coordinator probably chatter to each other a bit, so sniping one slave might make the QC decide to terminate every other slave

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle


From: Mark W. Farnham [mwf_at_rsiz.com]
Sent: 15 July 2015 12:13
To: Jonathan Lewis; sacrophyte_at_gmail.com

Cc: 'ORACLE-L'
Subject: RE: Curiosity: single-column index on sparse data cannot be built in parallel  

JL: Nice blog post: http://oakweb02.oaktable.net/node/8065

Brings up the question of whether we need to track for and snipe idle entries in v$pq_slave to prevent anomalous choices or failures of Oracle to use the expected parallel degree (let’s hope not, or hope some bug is detected and fixed before we have to resort to that; we’d at least have to pay the restart cost of the idle slave processes that should be used. Perhaps they are pinned unusable)  

Charles: JL’s post nicely explains why adding a column spreads the work out enough to keep the slaves more equally busy since then many entries become not all null. As indicated by JL’s analysis the title “cannot” is a bit off; minimally usefully would be more on target. As for having many indexes needing to be created on each table for many tables and indexes, have you experimented with kicking off sets of index creation in parallel (each serial) up to the load you want to consume? My “all else equal” bet would be that a healthy number of creates being run in parallel would finish all the work to be completed in less elapsed time than using parallel degree and running the creates serially unless just a few index creations are so large that they prolongs the work window. (In which case get all the little stuff done in sets and then kick the big ones off maximally parallel each.)  

mwf

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Jonathan Lewis Sent: Wednesday, July 15, 2015 12:48 AM
To: sacrophyte_at_gmail.com
Cc: ORACLE-L
Subject: RE: Curiosity: single-column index on sparse data cannot be built in parallel  

Charles,

Apologies for the delay - I had to shut down for the day before your reply came through.

I've written a blog note about the anomaly - planned to re-read it and polish it a little this morning.

Your non-null data is 3% of your index, and a parallel build would mean RANGE distribution of the data, so if your build ran anything less than parallel 33 (approx) the point would come where only one slave (in the second set) would be doing anything because it would be the one that got all the real data and had to sort and build. That's why you (I believe) you might see a point where only one process seemed to be doing any work. There is still benefit in running parallel, of course, since you have to scan and discard the 97% of the data that is NULL and can't go into the index.

The anomaly is that when I examined v$pq_tqstat after creating my 310,000 row index from my 10,000,000 row table (running parallel 4) the first set of slaves had passed 10 million rows to the second set of slaves whiich means they must have been passing all the (null, rowid) entries rather than discarding them.

Are you licensed to dig back into the ASH history ? Can you find out if the one running process was a Pnnn process or a normal shadow process ? If it was a normal shadow the problem might simply be that at the moment the create index started all the PX slaves were involved in building other indexes. If you can run an AWR report for the time around the start of build you might get a clue from the statistics about "Parallel operations downgraded to ..."      

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle


From: Charles Schultz [sacrophyte_at_gmail.com] Sent: 15 July 2015 01:05
To: Jonathan Lewis
Cc: ORACLE-L
Subject: Re: Curiosity: single-column index on sparse data cannot be built in parallel

I didn't check $pq_tqstat, but I show results when I next do the test. I say it was not running in parallel because only one session was active via Enterprise Manager (yeah, I was being lazy, I know) and the creation took over 1 hour the first test, 51 minutes the next test, whereas all my other indexes (mix of local and non-partitioned) on the same table were taking less than 20 minutes (some less than 10).  

Jonathan, can you elaborate a little more on the "interesting anomaly"? :) It is entirely possible I am hitting that, but would like to learn more.  

As far as parallel 100, I was just trying to open the doors to see where my bottlenecks are. The extent sizes are included in the index ddl (5m); again, I don't have issues with the other indexes, but there is something about this being on a single column, so you may be on the right track with large extents and small volumes of data per slave. You are making me curious now. :) I love the anthromorphic angle on "refusing to cooperate". *grin*  

PS - in case it is not obvious, this is a vendor supplied table and index.

PS2 - tests currently executing on Sun T5440, for those that care about such things.

PS3 - as to "why"; I got a lot of indexes to build in a short time, so I am looking for the bounds. :)    

On Tue, Jul 14, 2015 at 4:12 PM, Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk> wrote:  

I'm not going to try for 500M rows, but I just built a table with 10M rows with your pattern of data and ran a parallel create index.

There is an interesting anomaly to the create index (11.2.0.4) that might make the build look like a serial build - how are you showing that the build isn't parallel ? I queried v$pq_tqstat after the build.

Parallel 100 seems a little optimistic for such a "small" index - 17M rows at about 13 bytes per row gives about 220MB or 2MB per slave to build. What's your default extent size - perhaps there's something about large extents and small volumes of data per slave that makes Oracle refuse to co-operate.    

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle


From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of Charles Schultz [sacrophyte_at_gmail.com] Sent: 14 July 2015 21:59
To: ORACLE-L
Subject: Curiosity: single-column index on sparse data cannot be built in parallel

Good day,  

I am trying to find the technical reason for why Oracle cannot use parallel slaves to build a single-column index on a sparse column with few distinct values:    

F COUNT(*)

  • ----------

   538937561

Y 51464

N 17486819    

Just by playing around, I discovered that if I put this column as the leading edge on an index with many columns, it can be built in parallel.  

According to the relevant documentation <http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel007.htm#i1009131> , I would expect that the base table is sampled, but perhaps the random sampling returns 0 keys due to the sparse nature of the column?    

Facts:

Oracle Enterprise Edition 11.2.0.4

table is partitioned, but not on this key (lol)

parallel_max_servers=3600  

ddl extracted via datapump:

CREATE INDEX "FIMSMGR"."FGBTRND_ENCD_INDEX" ON "FIMSMGR"."FGBTRND" ("FGBTRND_DEFER_GRANT_IND")   PCTFREE 10 INITRANS 2 MAXTRANS 255   STORAGE(INITIAL 5242880 NEXT 5242880 MINEXTENTS 1 MAXEXTENTS 2147483645   PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1   BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)   TABLESPACE "FIN_LARGE_INDX" PARALLEL 100 ;  

-- 

Charles Schultz





 

-- 

Charles Schultz





 

-- 

Houri Mohamed 

Oracle DBA-Developer-Performance & Tuning 

Member of Oraworld-team <http://www.oraworld-team.com/> 

Visit My         - Blog <http://www.hourim.wordpress.com/> 

Let's Connect - <http://fr.linkedin.com/pub/mohamed-houri/11/329/857/>  Linkedin Profile <http://fr.linkedin.com/pub/mohamed-houri/11/329/857/> 

My  <https://twitter.com/MohamedHouri> Twitter      - MohamedHouri <https://twitter.com/MohamedHouri> 



--
http://www.freelists.org/webpage/oracle-l
Received on Wed Jul 15 2015 - 19:52:06 CEST

Original text of this message