Re: Group By Noodle Twister

From: Michel Cadot <micadot_at_netcourrier.com>
Date: Mon, 13 Mar 2000 17:54:45 +0100
Message-ID: <8akrbd$2bkh$1@news5.isdnet.net>

May be that one:

select d.aid, d.dt, min(dir) dir
from test d,

     (select aid, max(dt) dt from test
      group by aid) c

where d.aid = c.aid
and d.dt = c.dt
group by d.aid, d.dt
/

--
Have a nice day
Michel

<lycovian_at_my-deja.com> a écrit dans le message : 8abf15$qoj$1_at_nnrp1.deja.com...
> OK. I've been doing Oracle now for 2 years
> (application programming) and databases for about
> 10 but this simple one has me stumped for an
> efficient way to do this. Before starting thanks
> for reading this and any help you can render
> would be appreciated.
>
> I've got the problem "figured out" but I'm
> wondering if the solution is to use sub-selects
> or is there a way to get the info in one pass
> rather than two queries or a sub-select.
>
> Look at this table:
> aid dt dir grouping
> ------------------- ----------
> a 1/1/00 0 (*) - dt,dir
> b 1/1/00 1 (*) - dt,dir
> c 1/2/00 0 (*) - dt,dir
> c 1/2/00 1 - dt
> d 1/2/00 0 (*) - dt,dir
> d 1/1/00 1 -
> e 1/1/00 0 - dir
> e 1/2/00 1 (*) - dt
> f 1/2/00 1 (*) - dt
> f 1/1/00 0 - dr
> f 1/1/00 1 -
>
> Problem: Return for each group (aid) a single row
> that is the max date (dt) for the group and the
> min dir for that max date.
>
> Desired Result (marked by *):
> aid dt dir grouping
> ------------------- ----------
> a 1/1/00 0 (*) - dt,dir
> b 1/1/00 1 (*) - dt,dir
> c 1/2/00 0 (*) - dt,dir
> d 1/2/00 0 (*) - dt,dir
> e 1/2/00 1 (*) - dt
> f 1/2/00 1 (*) - dt
>
> The grouping columns are not part of the data but
> give a clue as how I'm currently solving this
> problem. Under group dt indicates its the max
> date for the group, dir indicates it is the min
> for the group on aid, not for dt
>
> Here is my current solution:
> select a.aid, a.dt, a.dir
> from test a
> where a.dt =
> (
> select max(b.dt)
> from test b
> where b.aid = a.aid
> )
> and a.dir =
> (
> select min(c.dir)
> from test c
> where c.aid = a.aid
> and c.dt = a.dt
> )
>
> As you can see this is a rather complex solution
> for what I consider a rather simple problem
> requiring multiple sub-selects and five joins!
> Further complicating the problem is that I don't
> have a pk to work with or I would use it in the
> outermost query to join with rather than having
> to find my row with the results of two sub-
> selects.
>
> I've also tried using the IN predicate and
> nesting the sub-queries but in my speed tests the
> presented solution is several times faster. What
> I would like to do is simply return the pk for
> the max dt and the min dir for that max dt with
> one or two queries and as few joins as possible.
> My actual dataset for this is about 57K rows (and
> growing) and performance is suffering because it
> requires so many joins. Is it possible to do
> this in one query? Anyone know? I'm probably
> going to add a primary key to the table
> eventually but I'm curious as to how to make this
> more efficient without having to make a
> structural change to the db. Oh, and yes, I have
> indexes all all joined columns and in general the
> database is a speed demon. Thanks for your time
> and any help you lend.
>
> Mike Wilson
> Software Engineer
> Sony Pictures Imageworks
> e-mail: mwilson_at_imageworks.com
>
> To save some typing here is the table script:
> CREATE TABLE test
> (
> aid VARCHAR2(10),
> dt DATE,
> dir NUMBER(2)
> )
> /
>
>
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.
Received on Mon Mar 13 2000 - 10:54:45 CST