Xref: alice comp.databases.oracle.server:20227
Path: alice!news-feed.fnsi.net!ais.net!btnet-peer!btnet!news-lond.gip.net!news.gsl.net!gip.net!news.medianet.ie!cellis
From: cellis@clubi.ie (Chrysalis)
Newsgroups: comp.databases.oracle.server
Subject: Re: Tuning SQL
Date: Tue, 05 May 1998 00:01:44 -0700
Organization: None
Lines: 140
Message-ID: <cellis-ya02408000R0505980001440001@news.clubi.ie>
References: <3549CD74.C886842A@webworks.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Newsreader: Yet Another NewsWatcher 2.4.0

Andrew,

Faced with a similar problem, I improved performance considerably by adding
two *extra* predicates which can use an index (which the getDistance
function clearly cannot). The function may then be applied to a much
smaller set of rows, thus improving performance.

The trick is to recognize that all qualifying postcodes must *also* lie
within a "rectangle" approximately defined as:

a) latitude  between c_lat  - 180*arcsin(radius/R)/3.1416
                 and c_lat  + 180*arcsin(radius/R)/3.1416
                              
b) longitude between c_long - (180*arcsin(radius/R)/3.1416)
                              /cos(latitude*3.1416/180)
                 and c_long + (180*arcsin(radius/R)/3.1416)
                               /cos(latitude*3.1416/180)

where R is the radius of the Earth (c. 6371 km for a sphere of equal volume
to the Earth)

The difference between arcsin(x) and x is probably too small to worry about
for distances of less than 500Km (especially since we are already using
approximations in the getDistance function: see below). We can therefore
simplify this to:

a1) latitude  between (c_lat  - 0.008993*radius)
                  and (c_lat  + 0.008993*radius)
b1) longitude between (c_long - 0.008993*radius/cos(0.01745*latitude))
                  and (c_long - 0.008993*radius/cos(0.01745*latitude))
                  
Next, observe that replacing latitude by c_lat on the right-hand side of
these two predicates makes only a very small difference to the result. 
Allow for this by rounding up to give a slightly larger "rectangle".
This then gives two extra predicates to add to the SQL cursor (*not* to the
getDistance function!):
...
and latitude  between (c_lat - 0.009*radius) and (c_lat  + 0.009*radius)
and longitude between (c_long - 0.009*radius/cos(0.01745*c_lat))
                  and (c_long - 0.009*radius/cos(0.01745*c_lat))
and getDistance(...                  

The first two of these predicates may be evaluated during cursor
optimisation by using an index on postalcode(latitude,longitude).
The third predicate (involving the getDistance function) needs then to be
applied only to a much smaller set of rows.

BTW, where did the value 5499(km) in your function definition come from? It
looks like the radius of the circle of latitude at c. 59:40 N. If this the
case, then your algorithm will lose accuracy at latitudes significantly
different from this. If real precision is required (and it probably isn't),
you should use 6371*sin(p_latitude) in its place.
However, since you are using a value of 3.14285 for pi, I presume that
absolute precision is not required, in which case you may not need to use
the getDistance function at all!

Finally, qualifying all column references by their table name (or table
label) not only improves parsing speed, but makes the code easier to
understand for someone who is not familiar with your table definitions ;)

HTH

 In article <3549CD74.C886842A@webworks.ca>, Andrew <andy@webworks.ca> wrote:

> Hi Folks.
> 
>         I am currently doing a cross referencing bit in oracle where I
> have a postal code and obtaining the latitude and longitude from out
> Postal Code database of 750000 records.  Then I am cross referencing a
> set of criteria with a list of people of 32000 records.  Now if I don't
> bother with a radius calculation to narrow down the search (which
> incidenty turns up about 10000 matches) the query comes back in about 7
> seconds.  However, if I include the radius calculation (ie. return the
> people that fall in a 20 km radius).. the query takes 90 seconds... This
> 
> is too long.  Oracle is running on a fairly strong Solaris box (166 Mhz
> 128 MB Ram) barely tasked at all
> 
>         I am wondering if anyone has some advise about performing the
> distance calculation within PL/SQL and its obvious degradation in
> performance.
> 
> My block looks something like:
> 
>         CURSOR ranks_list (c_lat,c_long)
>                   (c_provinceid personinfo.provinceid%TYPE) IS
>                 SELECT personinfo.id, firstname, lastname, cityname,
>                        provinces.abbrev, latitude, longitude,
>                 FROM  postalcodes, personinfo, provinces
>                 WHERE provinces.id = provinceid
>                 AND personinfo.provinceid = c_provinceid
>                 AND personinfo.postalcode = postalcodes.postalcode
>                 AND getDistance(latitude,longitude,c_lat, c_long) <=
> radius;
> 
> 
> Where getDistance takes in each of the lat and long for all of the
> matches and elimiates ones that are not within the radius.  Now if I
> remove the [AND getDistance(latitude,longitude,c_lat, c_long) <=
> radius], this thing returns the list in 7 seconds... with it left in, it
> 
> takes 90 seconds...
> 
> The function is as follows:  ( we already downgraded the calcs from the
> curve distance calcs to this one which is the linear calc)
> 
> FUNCTION getDistance2 (
>         p_latitude1       PostalCodes.Latitude%TYPE,
>         p_longitude1      PostalCodes.Longitude%TYPE,
>         p_latitude2       PostalCodes.Latitude%TYPE,
>         p_longitude2      PostalCodes.Longitude%TYPE)
>         RETURN NUMBER
>         IS
>         v_distance NUMBER(20,2);
>         v_radA1    NUMBER(20,6);
>         v_radA2    NUMBER(20,6);
> BEGIN
>   v_radA1 := p_latitude1/180*3.14285;
>   v_radA2 := p_latitude2/180*3.14285;
>   v_distance := 5499.0*acos(sin(v_radA1)*sin(v_radA2) +
> cos(v_radA1)*cos(v_radA2)*cos((p_longitude1/180*3.14285) -
> (p_longitude2/180
> *3.1428)));
>   RETURN v_distance;
> END getDistance2;
> 
> 
> 
> Any insight would be appreciated.
> 
> Thanks
> Andrew

-- 
Chrysalis

"FABRICATE DIEM PVNC"
(To to protect and to serve)
Motto of the City Guard
Terry Pratchett
