Re: tune between query/join

From: David Fitzjarrell <oratune_at_yahoo.com>
Date: Fri, 21 Oct 2011 01:27:52 -0700 (PDT)
Message-ID: <1319185672.81463.YahooMailNeo_at_web65403.mail.ac4.yahoo.com>



I can't see that it will when it's a cartesian join; there is no actual join condition in the query, just the unbounded search criteria on two columns not used to join anything between the two tables.

David Fitzjarrell

From: Stephane Faroult <sfaroult_at_roughsea.com> To: ax.mount_at_gmail.com
Cc: Oracle-L Group <oracle-l_at_freelists.org> Sent: Friday, October 21, 2011 1:16 AM
Subject: Re: tune between query/join

Alex,
    Your case is actually a rather common problem - you have the same thing when you try to identify a country from an IP address, to check that a credit card number falls into an acceptable range, etc.

  The snag is that when you read the condition you tend to think that the given values are those that define the range (the lower and upper bounds), and that you are looking for the one that falls into the range (what happens in a classical range scan). In fact, it's the opposite: the value that is given is, in your case, b.amount (the value you get from the smallest table), and you search the range it falls into.

If you rewrite the condition as the equivalent

where a.lowest_amount <= b.amount
    and a.highest_amount >= b.amount

you understand the problem better - you have not one bounded range condition, but two unrelated and unbounded conditions on two columns. What I try to do in such a case is transform one of the two unbounded conditions into a range search, which requires some knowledge about the data and its order of magnitude. If you suppose (for instance) that the intervals all have an identical span, say of 10,000, you can write

where a.lowest_amount between b.amount - 10,000 and b.amount     and a.highest_amount >= b.amount

Or you can use log() to get the order of magnitude of b.amount and define a floor and a ceiling values. Or you can define (for credit cards for instance) a "root value" (ie you know that the number to check is like "root_value%") and right pad to the expected length with 0s ans with 9s to get a lower and an upper value. You get the idea.

Create an index on (lowest_amount, highest_amount) and it will fly.

HTH,

-- 
Stephane Faroult
RoughSea Ltd <http://www.roughsea.com>
Konagora <http://www.konagora.com>
RoughSea Channel on Youtube <http://www.youtube.com/user/roughsealtd>

On 10/21/2011 07:41 AM, amonte wrote:

> Hello everyone
> I am running a query which looks like following (running in 10.2.0.4):
>
> select a.account_number, a.id, b.*
>    from account a, tmp_account b
> where b.amount between a.lowest_amount and a.highest_amount
>
> account has 8000000 rows and tmp_account 150000, the execution plan shows
> merge sort join as expected, this query takes around 5 hours to run, is
> there anyway to improve this?
>
>
> Alex
>
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>
-- http://www.freelists.org/webpage/oracle-l -- http://www.freelists.org/webpage/oracle-l
Received on Fri Oct 21 2011 - 03:27:52 CDT

Original text of this message