Re: Performance: SDO_RELATE vs. Stored Procedure

From: <schaef2k_at_googlemail.com>
Date: Sat, 22 Nov 2008 03:24:58 -0800 (PST)
Message-ID: <fac617a8-8ad3-4903-a042-2d3719290d2f@q9g2000yqc.googlegroups.com>


On 19 Nov., 14:43, Shakespeare <what..._at_xs4all.nl> wrote:
> schae..._at_googlemail.com schreef:
>
>
>
> > On 16 Nov., 23:10, Shakespeare <what..._at_xs4all.nl> wrote:
> >> schae..._at_googlemail.com schreef:
>
> >>> On 15 Nov., 20:00, Shakespeare <what..._at_xs4all.nl> wrote:
> >>>> schae..._at_googlemail.com schreef:
> >>>>> Hi,
> >>>>> I'm relatively new to Oracle and as I am experiencing some
> >>>>> inexplicable results I ask for your help.
> >>>>> In Oracle 10g XE I created a datatype for triangle-objects consisting
> >>>>> of 3 points, a name and a member function contains(t Triangle). An
> >>>>> object-table contains 500 randomly created triangles. By means of
> >>>>> additional static functions these triangles are converted to valid
> >>>>> SDO_GEOMETRY objects, which are stored in another table.
> >>>>> Now, evaluating a self-join of each table w.r.t. the predicate
> >>>>> "triangle a contains triangle b" I expected the SDO_GEOM.SDO_RELATE
> >>>>> operator to run faster than my own contains-operator, since it is a
> >>>>> built in function.
> >>>>> In contrast, it performs 3 times slower (~100seconds/450 results vs
> >>>>> ~35seconds/450 results). Why is that?
> >>>>> Does the SDO_RELATE operator perform some kind of filter/refine step,
> >>>>> i.e. test the spatial relationship of the geometries' bounding
> >>>>> rectangle first (my own contains operator omits such a step)? Is there
> >>>>> any way to get more implementation specific documentation about built-
> >>>>> in functions?
> >>>>> Thanks in advance!
> >>>>> Daniel
> >>>> One of the possiblities why your proc is faster is that you already KNOW
> >>>>   your geometries are triangular, where SDO_RELATE is for all kinds of
> >>>> geometries. And SDO_RELATE was built to do more than CONTAINS only
> >>>> But there's more to it: SDO_RELATE highly depends on spatial indexes.
> >>>> Take a look at Oracle Spatial Documentation.
> >>>> Shakespeare
> >>> OK, in the Oracle Spatial documentation it says:
> >>> "OVERLAPBDYDISJOINT can be defined as the relation where the objects
> >>> overlap but the boundaries are disjoint. This functionality is made
> >>> available
> >>> through an operator, SDO_RELATE, and a function, SDO_GEOM.RELATE().
> >>> The operator, SDO_RELATE, is registered with the extensible optimizer
> >>> and
> >>> hence the optimizer will evaluate various query plans that include or
> >>> exclude the
> >>> use of a spatial index. The function, SDO_GEOM.RELATE, does not use
> >>> the
> >>> spatial index and simply evaluates the two geometries that are passed
> >>> to it via
> >>> the argument list for the specified topological relationship.[...]"
> >>> But I still wonder, why there is such a huge performance difference
> >>> between my PL/SQL
> >>> code and the built-in Function. I ran the same test in PostgreSQL/
> >>> PostGIS, i.e. I compared
> >>> my own contains method with PostGIS's spatial containment operator. As
> >>> for the oracle
> >>> operator, contains() may not use any spatial index structure in my
> >>> test and of course it
> >>> may not use any optimizations that could apply to the processing of
> >>> triangles.
> >>> Both built-in operators base on the computation of the intersection
> >>> matrix as defined by
> >>> the 9-intersection model, so I suppose them to be somehow similar.
> >>> However, the postgres
> >>> operator returns its results in less than 1 second (450 rows), whereas
> >>> my plpgsql-code runs
> >>> for ~15 seconds and the oracle operator still needs ~100seconds.
> >>> Although a small performance
> >>> advantage over oracle was expected, I didn't expect it so huge and it
> >>> leaves me even more
> >>> confused.
> >>> Daniel
> >> Try it using an /*+ ordered */ hint. This may help!
>
> >> Shakespeare
>
> > Unfortunately, it did not help. Still the runtime is round about 100
> > seconds.
>
> > Daniel
>
> Did you run an explainplanfor yourquery? Did you create spatial indexes?
>
> Shakespeare

First of all: Thanks for your effort! I appreciate it.

The reason I did not use any index stuctures is, that I wanted to compare the performance of built-in functions with my own implementations. Therefore, I run the following queries, which are executed similarly (see execution plans below):

Q1: SELECT COUNT(*) FROM /*+ ordered */

	oracle_triangles t1, oracle_triangles t2
	WHERE SDO_GEOM.RELATE(t1.triangle, 'contains', t2.triangle, 0.001) <>
'FALSE'
	  AND t1.id <> t2.id

Q2: SELECT COUNT(*) FROM /*+ ordered */
	my_triangles t1, my_triangles d2
	WHERE t1.contains(VALUE(t2)) = 1
	  AND t1.id <> t2.id;

Execution Plan for Q1:
Operation

Object                  COST
------------------------------
------------------------------      ----------
SELECT STATEMENT ()                                               2068
 SORT (AGGREGATE)
  NESTED LOOPS ()
2068
   TABLE ACCESS (FULL)         oracle_triangles                 6
   TABLE ACCESS (FULL)         oracle_triangles                 4

Execution Plan for Q2:
Operation

Object                   COST
------------------------------
------------------------------      ----------
SELECT STATEMENT ()                                               1801
 SORT (AGGREGATE)
  NESTED LOOPS ()
1801
   TABLE ACCESS (FULL)         my_triangles                     5
   TABLE ACCESS (FULL)         my_triangles                     4


In PostgreSQL I do not use any indexes either, so the queries also involve a nested loops join using full table scans. Although the experimental settings equal for each query and each dbms, I still observe tremendous performance
differences that I just cannot explain.

Daniel Received on Sat Nov 22 2008 - 05:24:58 CST

Original text of this message