Re: 10g: parallel pipelined table functions with cursor selecting from table(cast(SQL collection)) doesn't work
Date: Fri, 10 Jun 2011 12:42:14 -0700 (PDT)
Message-ID: <f34958e8-33bf-4dba-a335-f6d5bd32fa25_at_hd10g2000vbb.googlegroups.com>
On Jun 9, 2:00 pm, Frank Bergemann <FBergem..._at_web.de> wrote:
> Pls. see my 'sqlplus' example for entry #4 of this thread.
I saw your example, and what I replied still stands: you probably won't be able to parallelize your table function, because a table function whose input row source is your in-memory collection cannot be executed in parallel, and Oracle does not support the serial-to-parallel redistribution that would be required here - there simply is no parallel row source available that could drive your table function in parallel.
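To illustrate what I mean, here is a rough, untested sketch (all object names made up; t is a hypothetical table with a numeric column n): even a PARALLEL_ENABLE pipelined function is executed serially when the cursor passed to it selects from TABLE() of an in-memory collection, whereas a cursor over a real table gives Oracle a parallel row source to work with:

    create type num_tab as table of number;
    /

    create or replace function f (p_cur sys_refcursor)
      return num_tab pipelined
      parallel_enable (partition p_cur by any)
    as
      l_val number;
    begin
      loop
        fetch p_cur into l_val;
        exit when p_cur%notfound;
        pipe row (l_val);
      end loop;
      return;
    end;
    /

    -- executed serially: the driving row source is an in-memory collection
    select *
    from   table(f(cursor(select column_value
                          from   table(num_tab(1, 2, 3)))));

    -- can be executed in parallel: the driving row source is a real table
    select *
    from   table(f(cursor(select /*+ parallel(t 4) */ t.n from t)));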
> This then resulted in delay for starting up the worker sessions (i
> didn't find a way to get rid of this).
There is no magic involved - Oracle has to "cluster" (or partition/sort) the input row source for the table function according to your function declaration, so on a larger data set this operation may take some time. I assume that this delay is significant compared to the overall elapsed time?
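For reference, this is the kind of declaration I mean (again just a sketch with made-up names): a CLUSTER BY in combination with PARALLEL_ENABLE forces Oracle to redistribute and sort the input rows before your function body sees them, and on a large input this clustering step alone can account for a noticeable delay before the workers produce their first rows:

    -- made-up source table and result collection type
    create table my_source (group_id number, val number);

    create type result_tab as table of number;
    /

    -- partition by hash/range and cluster by require a strongly
    -- typed ref cursor, hence the package type
    create or replace package my_types as
      type src_cur is ref cursor return my_source%rowtype;
    end;
    /

    -- the cluster clause makes Oracle redistribute + sort the input
    create or replace function g (p_cur my_types.src_cur)
      return result_tab pipelined
      parallel_enable (partition p_cur by hash (group_id))
      cluster p_cur by (group_id)
    as
      l_rec my_source%rowtype;
    begin
      loop
        fetch p_cur into l_rec;
        exit when p_cur%notfound;
        pipe row (l_rec.val);
      end loop;
      return;
    end;
    /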
> But: i also need to do an update of the primary driving data!
> So the stage #1 select is a 'select ... for update ...'.
> But you can't use such in CURSOR for table functions (which i can
> understand, why it's not possible).
> So i have to do my stage #1 selection in two steps:
I can't remember if we have discussed the details of what you're actually trying to achieve, but I can't shake the feeling that you are, at least partially, trying to re-invent the wheel. Wouldn't it be possible to implement your processing in pure SQL? Using (possibly multiple) parallel direct-path inserts and exchange-partition techniques (if you have Enterprise Edition plus the Partitioning option available) should allow you to perform complex transformations while making the most of your available hardware. Always remember that "the fastest update/delete operation is an insert" in many cases where large data sets need to be processed/manipulated.
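As a sketch of the kind of approach I mean (untested, all names made up): the "update" is expressed as a parallel direct-path insert into a staging table, and the result is then swapped into the target via a dictionary-only exchange:

    alter session enable parallel dml;

    -- the "update" happens as part of a parallel direct-path insert
    insert /*+ append parallel(s 4) */ into stage_tab s
    select /*+ parallel(t 4) */
           t.id,
           t.amount * t.rate   -- the transformed value
    from   src_tab t;

    commit;

    -- swap the staging table into the target partition (dictionary-only)
    alter table target_tab
      exchange partition p_current with table stage_tab
      including indexes without validation;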
Besides that, I believe the workaround you've mentioned should allow you to run your second step in parallel; how efficient this will be depends, of course, on the overhead you introduce by combining the data from step one with step two.
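If I understand your two-step approach correctly, it could look something like this (just a sketch, names and the my_pkg.process function made up): step one locks/marks the driving rows serially instead of holding a SELECT ... FOR UPDATE open, and step two then feeds the parallel-enabled function from a cursor over a real table:

    -- step 1 (serial): mark the driving rows
    update driving_tab
    set    batch_id = :batch
    where  status = 'NEW';

    commit;

    -- step 2 (parallel): the table function is now driven
    -- by a real, parallel-capable table row source
    select *
    from   table(my_pkg.process(cursor(
             select /*+ parallel(d 4) */ *
             from   driving_tab d
             where  d.batch_id = :batch)));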
Hope this helps,
Randolf
Received on Fri Jun 10 2011 - 14:42:14 CDT