Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Tuning SQL

Re: Tuning SQL

From: Chrysalis <cellis_at_clubi.ie>
Date: Tue, 05 May 1998 00:01:44 -0700
Message-ID: <cellis-ya02408000R0505980001440001@news.clubi.ie>


Andrew,

Faced with a similar problem, I improved performance considerably by adding two *extra* predicates which can use an index (which the getDistance function clearly cannot). The function may then be applied to a much smaller set of rows, thus improving performance.

The trick is to recognize that all qualifying postcodes must *also* lie within a "rectangle" approximately defined as:

  1. latitude between c_lat - 180*arcsin(radius/R)/3.1416 and c_lat + 180*arcsin(radius/R)/3.1416
  2. longitude between c_long - (180*arcsin(radius/R)/3.1416) /cos(latitude*3.1416/180) and c_long + (180*arcsin(radius/R)/3.1416) /cos(latitude*3.1416/180)

where R is the radius of the Earth (c. 6371 km for a sphere of equal volume to the Earth)

The difference between arcsin(x) and x is probably too small to worry about for distances of less than 500Km (especially since we are already using approximations in the getDistance function: see below). We can therefore simplify this to:

a1) latitude  between (c_lat  - 0.008993*radius)
                  and (c_lat  + 0.008993*radius)
b1) longitude between (c_long - 0.008993*radius/cos(0.01745*latitude))
                  and (c_long - 0.008993*radius/cos(0.01745*latitude))
                  

Next, observe that replacing latitude by c_lat on the right-hand side of these two predicates makes only a very small difference to the result. Allow for this by rounding up to give a slightly larger "rectangle". This then gives two extra predicates to add to the SQL cursor (*not* to the getDistance function!):
...

and latitude  between (c_lat - 0.009*radius) and (c_lat  + 0.009*radius)
and longitude between (c_long - 0.009*radius/cos(0.01745*c_lat))
                  and (c_long - 0.009*radius/cos(0.01745*c_lat))
and getDistance(...                  

The first two of these predicates may be evaluated during cursor optimisation by using an index on postalcode(latitude,longitude). The third predicate (involving the getDistance function) needs then to be applied only to a much smaller set of rows.

BTW, where did the value 5499(km) in your function definition come from? It looks like the radius of the circle of latitude at c. 59:40 N. If this the case, then your algorithm will lose accuracy at latitudes significantly different from this. If real precision is required (and it probably isn't), you should use 6371*sin(p_latitude) in its place. However, since you are using a value of 3.14285 for pi, I presume that absolute precision is not required, in which case you may not need to use the getDistance function at all!

Finally, qualifying all column references by their table name (or table label) not only improves parsing speed, but makes the code easier to understand for someone who is not familiar with your table definitions ;)

HTH  In article <3549CD74.C886842A_at_webworks.ca>, Andrew <andy_at_webworks.ca> wrote:

> Hi Folks.
>
> I am currently doing a cross referencing bit in oracle where I
> have a postal code and obtaining the latitude and longitude from out
> Postal Code database of 750000 records. Then I am cross referencing a
> set of criteria with a list of people of 32000 records. Now if I don't
> bother with a radius calculation to narrow down the search (which
> incidenty turns up about 10000 matches) the query comes back in about 7
> seconds. However, if I include the radius calculation (ie. return the
> people that fall in a 20 km radius).. the query takes 90 seconds... This
>
> is too long. Oracle is running on a fairly strong Solaris box (166 Mhz
> 128 MB Ram) barely tasked at all
>
> I am wondering if anyone has some advise about performing the
> distance calculation within PL/SQL and its obvious degradation in
> performance.
>
> My block looks something like:
>
> CURSOR ranks_list (c_lat,c_long)
> (c_provinceid personinfo.provinceid%TYPE) IS
> SELECT personinfo.id, firstname, lastname, cityname,
> provinces.abbrev, latitude, longitude,
> FROM postalcodes, personinfo, provinces
> WHERE provinces.id = provinceid
> AND personinfo.provinceid = c_provinceid
> AND personinfo.postalcode = postalcodes.postalcode
> AND getDistance(latitude,longitude,c_lat, c_long) <=
> radius;
>
>
> Where getDistance takes in each of the lat and long for all of the
> matches and elimiates ones that are not within the radius. Now if I
> remove the [AND getDistance(latitude,longitude,c_lat, c_long) <=
> radius], this thing returns the list in 7 seconds... with it left in, it
>
> takes 90 seconds...
>
> The function is as follows: ( we already downgraded the calcs from the
> curve distance calcs to this one which is the linear calc)
>
> FUNCTION getDistance2 (
> p_latitude1 PostalCodes.Latitude%TYPE,
> p_longitude1 PostalCodes.Longitude%TYPE,
> p_latitude2 PostalCodes.Latitude%TYPE,
> p_longitude2 PostalCodes.Longitude%TYPE)
> RETURN NUMBER
> IS
> v_distance NUMBER(20,2);
> v_radA1 NUMBER(20,6);
> v_radA2 NUMBER(20,6);
> BEGIN
> v_radA1 := p_latitude1/180*3.14285;
> v_radA2 := p_latitude2/180*3.14285;
> v_distance := 5499.0*acos(sin(v_radA1)*sin(v_radA2) +
> cos(v_radA1)*cos(v_radA2)*cos((p_longitude1/180*3.14285) -
> (p_longitude2/180
> *3.1428)));
> RETURN v_distance;
> END getDistance2;
>
>
>
> Any insight would be appreciated.
>
> Thanks
> Andrew

--
Chrysalis

"FABRICATE DIEM PVNC"
(To to protect and to serve)
Motto of the City Guard
Terry Pratchett Received on Tue May 05 1998 - 02:01:44 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US