RE: tricky group by questions

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Tue, 27 Nov 2007 17:36:36 -0500
Message-ID: <012401c83145$f8f83660$1100a8c0@rsiz.com>

While that is indeed a very nice example and is probably exactly on point as a solution except for the bit about rownum <= 12, if you're building a production service it probably makes sense to build a permanent "spine" table against which to perform the outer join to enumerate the times with zero rows of actual data. In this case the spine table would simply be the one column table of times by hour certain to bound your actual data. I recommend you create that table with the data values in order by time, and if you need more speed you can test whether it is better with an index, as an index organized table, or as a hash cluster in the context of your dataset and date range. I predict it would perform best if the "spine" table is presented to the query as a virtual table with its bounds and a solicitation of rownum so that it will be projected first and statically. Then if nested loops is chosen by the optimizer it will be a single pass and the sort will be a no-op. This of course is a trade-off of a little space (miniscule, I'd guess) versus CPU at run time, and that is my presumptive bias unless I know the resources headroom in an actual case is contrary to that bias.

I predict such a solution will be faster, but I'm pretty doggone sure it will be more readable. Using all_objects is a nice source of rows usually in sufficient quantity (yet not enormous) for examples, but it is probably not a good idea to repetitively query it in production. More importantly, I wouldn't want someone wondering what all_objects has to do with the purpose of the script. In examples, we're queued up to know that all_objects is just a convenient stand-in for a row source, and we don't mind burning the cpu to fashion whatever values we need for the example, but I would suggest that an object for the purpose in this case will serve you better.

You also probably need a little attention to your greater-than less-than values to get exactly the results you intend.

Regards,

mwf

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Bobak, Mark
Sent: Tuesday, November 27, 2007 11:29 AM To: ryan_gaffuri_at_comcast.net; oracle-l_at_freelists.org Subject: RE: tricky group by questions

Ryan,

Here's a nice AskTom example, which I think will do exactly what you want: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:89123115 13313

--

Mark J. Bobak
Senior Database Administrator, System & Product Technologies ProQuest
789 E. Eisenhower, Parkway, P.O. Box 1346 Ann Arbor MI 48106-1346
+1.734.997.4059 or +1.800.521.0600 x 4059 mark.bobak_at_il.proquest.com
www.proquest.com
www.csa.com

ProQuest...Start here.

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of ryan_gaffuri_at_comcast.net
Sent: Tuesday, November 27, 2007 10:31 AM To: oracle-l_at_freelists.org
Subject: tricky group by questions

I am writing a query that is grouping by 1 hour blocks over a period of time as follows

I am pretty sure the answer involves using "where not exists", but I can't get the dates I want to return.

select to_char(mydate, 'yyyymmdd hh24') , count(*) from mytab
where mydate < sysdate
and mydate > to_date('20071125 1500', 'yyyymmdd hh24mi')

and mydatedate < to_date('20071125 1600', 'yyyymmdd hh24mi')
group by to_char(mydate, 'yyyymmdd hh24')
order by to_char(mydate, 'yyyymmdd hh24') desc

Now I have one hour periods that do not have any rows. A standard group by just ignores those periods. I want periods with no data to return and have a count(*) = 0

so I would have

2007111101 20
2007111102 0
2007111103 10

now it returns as

2007111101 20
2007111103 10
--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l Received on Tue Nov 27 2007 - 16:36:36 CST