Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.misc -> Getting the FIRST and LAST records in a group efficiently in SQL?
Hello all,
I have a table TRAIN_ROUTE which lists the route locations (field
LOCATION_I) in order of field ROUTE_ORDER_I, for a series of trains
grouped by (TRAIN_I and P_VERSION_I).
There is a primary key on TRAIN_I, P_VERSION_I, ROUTE_ORDER_I.
I can find the first and last location for each train easily enough with these two methods:
select TRAIN_I, P_VERSION_I, LOCATION_I, ROUTE_ORDER_I from TRAIN_ROUTE where (TRAIN_I, P_VERSION_I, ROUTE_ORDER_I) in(
select TRAIN_I, P_VERSION_I, MIN(ROUTE_ORDER_I) from TRAIN_ROUTE group by TRAIN_I, P_VERSION_I
8374 rows selected.
Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=420 Card=1 Bytes=59) 1 0 HASH JOIN (SEMI) (Cost=420 Card=1 Bytes=59)
2 1 TABLE ACCESS (FULL) OF 'TRAIN_ROUTE' (Cost=128 Card=6576 7 Bytes=1315340) 3 1 VIEW OF 'VW_NSO_1' (Cost=208 Card=65767 Bytes=2564913) 4 3 SORT (GROUP BY) (Cost=208 Card=65767 Bytes=854971) 5 4 INDEX (FULL SCAN) OF 'IX_TRAIN_ROUTE_ORDER_01' (UNIQ UE) (Cost=208 Card=65767 Bytes=854971)
select TR1.TRAIN_I, TR1.P_VERSION_I, TR1.LOCATION_I, TR1.ROUTE_ORDER_I
from TRAIN_ROUTE TR1
inner join
(
select TRAIN_I, P_VERSION_I, MIN(ROUTE_ORDER_I) MIN_ROUTE_ORDER_I from TRAIN_ROUTE group by TRAIN_I, P_VERSION_I ) TR2 on TR1.TRAIN_I = TR2.TRAIN_I
8374 rows selected.
Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=420 Card=65767 Bytes =3880253) 1 0 HASH JOIN (Cost=420 Card=65767 Bytes=3880253) 2 1 TABLE ACCESS (FULL) OF 'TRAIN_ROUTE' (Cost=128 Card=6576 7 Bytes=1315340) 3 1 VIEW (Cost=208 Card=65767 Bytes=2564913) 4 3 SORT (GROUP BY) (Cost=208 Card=65767 Bytes=854971) 5 4 INDEX (FULL SCAN) OF 'IX_TRAIN_ROUTE_ORDER_01' (UNIQ UE) (Cost=208 Card=65767 Bytes=854971)
They both trace to the same cost. However, I wonder if there is a better way than finding the MIN value of ROUTE_ORDER_I, but I am not sure. Do these methods here calculate the MIN(ROUTE_ORDER_I) value or use the index to do this?
My next question was how to use FIRST_VALUE using PARTITION BY (or some other method) to return only the origin (first) location of every train:
set autotrace on EXPLAIN;
select distinct TRAIN_I, P_VERSION_I,
FIRST_VALUE(LOCATION_I) over (partition by TRAIN_I, P_VERSION_I order by ROUTE_ORDER_I) FIRST_LOC
from train_route
set autotrace off EXPLAIN;
8374 rows selected.
Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=666 Card=65767 Bytes =1315340) 1 0 SORT (UNIQUE) (Cost=666 Card=65767 Bytes=1315340) 2 1 WINDOW (SORT) (Cost=666 Card=65767 Bytes=1315340) 3 2 TABLE ACCESS (FULL) OF 'TRAIN_ROUTE' (Cost=128 Card=65 767 Bytes=1315340)
Again this seems to be more expensive cost-wise than the others, which makes sense because of the distinct function running on a larger result set.
What I really want to do is this:
select TRAIN_I, P_VERSION_I,
FIRST_VALUE(LOCATION_I) over (partition by TRAIN_I, P_VERSION_I order by ROUTE_ORDER_I) FIRST_LOC
from train_route
group by TRAIN_I, P_VERSION_I;
but this does not compile:
FIRST_VALUE(LOCATION_I) over (partition by TRAIN_I, P_VERSION_I order by ROUTE_ORDER_I) FIRST_LOC
*
Thanks for any help.
Dean Received on Fri Dec 09 2005 - 09:37:44 CST