Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: simple sql using between startnumber and endnumber not performing

Re: simple sql using between startnumber and endnumber not performing

From: Benjamin Sølberg <benjamin.soelberg_at_gmail.com>
Date: Wed, 27 Jun 2007 14:15:59 -0700
Message-ID: <1182978959.820832.326910@n2g2000hse.googlegroups.com>


On 27 Jun., 23:41, y..._at_vtn1.victoria.tc.ca (Malcolm Dew-Jones) wrote:
> =?iso-8859-1?q?Benjamin_S=F8lberg?= (benjamin.soelb..._at_gmail.com) wrote:
>
> : On 27 Jun., 05:45, y..._at_vtn1.victoria.tc.ca (Malcolm Dew-Jones) wrote:
> : > =?iso-8859-1?q?Benjamin_S=F8lberg?= (benjamin.soelb..._at_gmail.com) wrote:
> : >

SNIP
> : > $0.10
>
> : Hello Malcolm,
>
> : Thank you for your fast reply.
> : I was thinking about something like this, not an index based but just
> : the first digit as each number are all 8 digits.
> : But your idea seems like a better way.
>
> If they are actually strings then using the first digit would be the same
> general idea.
>
> : One question:
> : If there are more than 9999 between fromnumber and tonumber will this
> : still work ?
>
> Are you saying that there is a correspondence between the two numbers in
> each row? If so then some function of the two numbers should perhaps be
> indexed instead of the two numbers independently.
>
> : Say i select from 20000000 to 30000000.
> : I belive not and in this case i should select using only startnumber
> : and endnumber right ?
>
> No, the underlying issue is the same.
>
> The point of the index can be determined by trying to sort a list of pairs
> of numbers. The following is sorted on the first number.
>
> 1 10
> 2 5
> 3 8
> 4 1
> 5 1
> 6 3
>
> notice that sorting on the first number helps you find the first number
> efficiently, but it will not help in finding the second number. On
> average, the first number is in the middle of the list, but then you have
> to scan through all the rows _starting_ at that point in the list to find
> the second number. (i.e. on average you still half to scan through half
> the rows).
>
> What happens if you sort the above on both numbers? - well actually the
> list is already sorted on both numbers! It doesn't help because the first
> numbers are all unique so each first number corresponds to a single second
> number.
>
> The point of the suggested index is to partition the first numbers into
> lumps. You can efficiently find which lumps probably have the first
> number because that is sorted, but within each lump there is a sorted set
> of second numbers to help you efficiently find the second numbers in each
> lump. Each level of partitition should, on average, half the rows to be
> examined.
>
> The original numbers must be examined at some point, so you put them are
> in the index so that only the index needs to be read, not the table.
>
> HOWEVER, the indexing adds over head, so at some point the technique
> starts to become less efficient, therefore experimentation is required to
> see if this will even help in your situation.
>
> Also, if the numbers already have many repetitions then the above will do
> nothing except add over head, so as I said, experimentation and
> understanding is required to see if this will even help in your situation.

Yes each startnumber is either less or equal to the endnumber.

Here is an example:

startnumber - endnumber

20000000 - 20000000
20000010 - 20000545
31234567 - 31234567
47462832 - 47469999

and so on up til the value 99999999.
Not all numbers exists and some may cross bounderies of others.

I belive i understand the "lump" part and in my case I should do the following:

Create an index on substr(startnumber, 1, 4), substr(endnumber, 1, 4), startnumber, endnumber

Then do a SQL like this :
SELECT *
  FROM numbers
 WHERE '3131' BETWEEN SUBSTR (startnumber, 1, 4) AND SUBSTR (endnumber, 1, 4)

   AND '31313131' BETWEEN startnumber AND endnumber;

Is this correct ?

Regards
Benjamin Received on Wed Jun 27 2007 - 16:15:59 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US