RE: Wow - has anyone used ROW_NUMBER() to get around using a UNION statement when UNION ALL doesn't work??

From: Taylor, Chris David <ChrisDavid.Taylor_at_ingrambarge.com>
Date: Mon, 8 Aug 2011 13:48:56 -0500
Message-ID: <C5533BD628A9524496D63801704AE56D6A315EBD42_at_SPOBMEXC14.adprod.directory>



Sorry for the delay, our corporate spam catcher was hard at work it seems.

Greg- I'm going to look into pastebin now...Do you create the pastes at pastebin and then use the URL in a mail to the list?

Mark & Greg - I'll post the queries (orginal, and modified versions) along with plan info to pastebin. It will take some time...

Chris Taylor
Sr. Oracle DBA
Ingram Barge Company
Nashville, TN 37205
Office: 615-517-3355
Cell: 615-663-1673
Email: chris.taylor_at_ingrambarge.com
 
CONFIDENTIALITY NOTICE: This e-mail and any attachments are confidential and may also be privileged. If you are not the named recipient, please notify the sender immediately and delete the contents of this message without disclosing the contents to anyone, using them for any purpose, or storing or copying the information on any medium.

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mark W. Farnham Sent: Monday, August 08, 2011 11:27 AM
To: oracle-l_at_freelists.org
Subject: RE: Wow - has anyone used ROW_NUMBER() to get around using a UNION statement when UNION ALL doesn't work??

In the vast majority of cases I agree with Greg about posting plans. However, in this particular case the question is a purely logical consideration not based on any particular plans the optimizer is choosing. But a little bit more of context may be required.

In the general case of a UNION, the full tuples must be projected for de-duplication. That can require a lot of work area space, even presuming the CBO gets everything perfect.

In the event that you know a subset of the columns of a UNION ALL projection such that ordering by the column subset guarantees that duplicates appear together, then you can generate that special case of UNION ALL to UNION de-duplication at a lower cost. Chris has related one such mechanism, using the ROW_NUMBER() analytical function. It may have been unclear that he was relying on a subset of the columns of the projection. If he needed all the columns for the grouping, his mechanism would be at best a tie (and I write that in the sense that the best you can do in any transfer of energy is have zero increase in entropy, except you really can only break even in the math and in reality you always lose.)

An example showing a simple UNION and the re-written query relying on the column subset ordering probably would have been helpful, but it is ironic that a short enough column set to be easily understood as an archetypal case likely would not have a substantial enough work area savings for the special case to even break even with simply doing the UNION naturally.

But an example would have made it more clear that the key to the advantage is knowing a "less than all columns" guaranteed ordering that saves substantial space in the work area.

I think Chris was calling out to folks familiar with the performance challenge of the general case of UNIONs as to whether any have tried the ROW_NUMBER() analytic technique and what they thought about it as compared to other means of de-duplication to turn a UNION ALL substituted for a UNION into a result set that matches the UNION.

As far as I know, there is no way to tell Oracle that it may rely on a subset of columns to guarantee that duplicates be adjacent in the projection. If there were, then Oracle might provide the similar special case UNION optimization under the covers. The danger, of course, is that if you are wrong about the column subset providing this ordering guarantee, then there is a potential to produce a result set that is not the same as the result set of a UNION.

Before we had inline views, analytics, and GTTs, it was sometimes effective to produce a result set of just the required keys plus the column set that guaranteed no duplicates as a user interim table and then use that interim table as a spine to retrieve the additional ornaments desired. I leave comparisons of that technique to star queries for another day. Regards,

mwf

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Greg Rahn
Sent: Saturday, August 06, 2011 12:42 PM To: Taylor, Chris David
Cc: Dominic Brooks; oracle-l_at_freelists.org Subject: Re: Wow - has anyone used ROW_NUMBER() to get around using a UNION statement when UNION ALL doesn't work??

Chris -

The issue at hand here (people asking for more information) has nothing to do with a solving problem -- it has to do with having the right information to understand the full context of what your are discussing. Your question (below) requires more information for one to really offer any meaningful comments (not guessing). Specifically it requires seeing the full execution plan for all scenarios and the optimizer's estimated and actual row counts.

PS. if you dont want to post plans, etc on the list, I'd suggest making an account over at http://pastebin.com - that works quite well (and even uses a fixed width font).

On Sat, Aug 6, 2011 at 5:17 AM, Taylor, Chris David <ChrisDavid.Taylor_at_ingrambarge.com> wrote:
> Again, I'm not trying to solve a problem.  I'm curious if anyone has
> used

row_number to get around using an UNION (when a UNION ALL is not an acceptable replacement).  And correlated to that if anyone wanted to discuss the pros and cons.

--
Regards,
Greg Rahn
http://structureddata.org
--
http://www.freelists.org/webpage/oracle-l


--
http://www.freelists.org/webpage/oracle-l





--
http://www.freelists.org/webpage/oracle-l
Received on Mon Aug 08 2011 - 13:48:56 CDT

Original text of this message