Re: SQL frustration - cannot make "obvious" query work well

From: What's in a namespace <xml_at_ns.com>
Date: Thu, 30 Nov 2006 15:10:16 +0100
Message-ID: <456ee648$0$324$e4fe514c@news.xs4all.nl>

<giuliano.procida_at_googlemail.com> schreef in bericht news:1164890406.726998.236320_at_j44g2000cwa.googlegroups.com...
>I have a fairly large data set, consisting of slices of points
> (essentially in 2-d). Around 100million points in 1million slices.
> Points/slice is very skewed (very many with < 10 points, very few >
> 10000 points).
>
> create table points (
> slice number(10) not null,
> x number(10) not null,
> y number(10) not null,
> value number(10) not null,
> constraint pky primary key (slice, x, y)
> ) -- organization index -- not currently an IOT, but should be
>
> It so happens that each slice is a meet semi-lattice (i.e, max(x),
> max(y) over any non-empty subset of a slice is also in the slice).
>
> Typical access is to look at subset of slices (values from another
> query) and for each slice, look for the the top-right point in a
> bottom-left quarter-plane and return the slice and value.
>
> The old query selected max(x) for a slice, and max(y) for a slice[,
> max(x)] and finally the row itself. Doing this in bulk for many slices
> involves plans with expensives scans and sorts.
>
> There is a very efficient query to fetch a value for a single slice:
>
> SELECT slice, value
> FROM (
> SELECT slice, value
> FROM points
> WHERE slice = :slice
> AND x <= :x
> AND y <= :y
> ORDER BY x DESC, y DESC
> )
> WHERE ROWNUM = 1
>
> The query plan uses the index, and a stopkey.
>
> The query cannot be rewritten to work with subsets of slices:
>
> SELECT slice, precious_data
> FROM
> all_slices,
> (
> SELECT slice, value
> FROM (
> SELECT slice, value
> FROM points
> WHERE slice = all_slices.slice
> AND x <= :x
> AND y <= :y
> ORDER BY y DESC, x DESC
> )
> WHERE ROWNUM = 1
> )
>
> -- ORA-00904: "ALL_SLICES"."SLICE": invalid identifier
>
> The nested, nested query cannot see the top-level table. This is very
> annoying.
>
> I have tried:
>
> a) Getting around the 00904 limitation somehow. No luck. This exact
> issue has popped up to asktom etc. a few times, no solution seems
> available.
>
> b) Writing a pipelined function that PIPEs out the single rows returned
> by the single-slice query, looping over slices returning by a driving
> query.
>
> This works a treat, over double the performance of the equivalent query
> using max(x) subselects, but the driving query varies a lot in the
> application (50 different queries => 50 functions to put in a package).
>
> c) Writing a version of the pipelined function that takes a cursor
> returning slices.
>
> I thought this would be flexible and as fast as b) but performance is
> worse than the original max(x), max(y) query.
>
> d) Using Oracle analytics.
>
> Selecting first_value, partitioning by slice and ordering the
> partitions by the index desc. Oracle ignores the index, fetches a load
> a data and throws away much. Slower.
>
> e) Simulating spatial/2-d space-filling curves indexes.
>
> Given that most slices are small and that selecting by subsets of the
> slices is most important, this seemed to be not worth the effort. The
> data would need to be reindexed by x, y first then slice. Genuine DBMS
> r-tree etc. indexes are hard to come by and space-filling curves
> (fitting a 2-d index into 1-d by mixing together alternating bits of x
> and y) give bad behaviour, require an extra column to index and
> application changes.
>
> I would really like a solution to a), or else get c) to be as fast as
> b) or as a last resort get the right query plan with d) so that the
> Oracle just skips through the index as it is supposed to.
>
> Any help appreciated.
>
> Giuliano.
>

Looks like you could solve the nesting with a normal join..... Received on Thu Nov 30 2006 - 08:10:16 CST