Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Usenet -> c.d.o.misc -> SQL frustration - cannot make "obvious" query work well
I have a fairly large data set, consisting of slices of points
(essentially in 2-d). Around 100million points in 1million slices.
Points/slice is very skewed (very many with < 10 points, very few >
10000 points).
create table points (
slice number(10) not null,
x number(10) not null,
y number(10) not null,
value number(10) not null,
constraint pky primary key (slice, x, y)
) -- organization index -- not currently an IOT, but should be
It so happens that each slice is a meet semi-lattice (i.e, max(x), max(y) over any non-empty subset of a slice is also in the slice).
Typical access is to look at subset of slices (values from another query) and for each slice, look for the the top-right point in a bottom-left quarter-plane and return the slice and value.
The old query selected max(x) for a slice, and max(y) for a slice[, max(x)] and finally the row itself. Doing this in bulk for many slices involves plans with expensives scans and sorts.
There is a very efficient query to fetch a value for a single slice:
SELECT slice, value
FROM (
SELECT slice, value
FROM points
WHERE slice = :slice
AND x <= :x
AND y <= :y
ORDER BY x DESC, y DESC
)
WHERE ROWNUM = 1
The query plan uses the index, and a stopkey.
The query cannot be rewritten to work with subsets of slices:
SELECT slice, precious_data
FROM
all_slices,
(
SELECT slice, value
FROM (
SELECT slice, value FROM points WHERE slice = all_slices.slice AND x <= :x AND y <= :y ORDER BY y DESC, x DESC
The nested, nested query cannot see the top-level table. This is very annoying.
I have tried:
This works a treat, over double the performance of the equivalent query using max(x) subselects, but the driving query varies a lot in the application (50 different queries => 50 functions to put in a package).
c) Writing a version of the pipelined function that takes a cursor returning slices.
I thought this would be flexible and as fast as b) but performance is worse than the original max(x), max(y) query.
d) Using Oracle analytics.
Selecting first_value, partitioning by slice and ordering the partitions by the index desc. Oracle ignores the index, fetches a load a data and throws away much. Slower.
e) Simulating spatial/2-d space-filling curves indexes.
Given that most slices are small and that selecting by subsets of the
slices is most important, this seemed to be not worth the effort. The
data would need to be reindexed by x, y first then slice. Genuine DBMS
r-tree etc. indexes are hard to come by and space-filling curves
(fitting a 2-d index into 1-d by mixing together alternating bits of x
and y) give bad behaviour, require an extra column to index and
application changes.
I would really like a solution to a), or else get c) to be as fast as b) or as a last resort get the right query plan with d) so that the Oracle just skips through the index as it is supposed to.
Any help appreciated.
Giuliano. Received on Thu Nov 30 2006 - 07:37:57 CST
![]() |
![]() |