Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Histogram/Bucketizing resultset
in article B9E96DF2.D86%markbtownsend_at_attbi.com, Mark Townsend at
markbtownsend_at_attbi.com wrote on 11/2/02 11:56 AM:
The corollary example is interesting too - If you want this done as equiheight histograms, the NTILE capabilities simplify life considerably
SELECT min(b.age)||' - '||max(b.age) Range,
count(*) Count
FROM(SELECT age,
ntile(&i) over(order by age) as bucket FROM t) b
Gives:-
Enter Number of Buckets Required: 10
RANGE COUNT -------------------- ---------- 17 - 22 123 22 - 27 123 27 - 30 123 30 - 34 123 34 - 38 123 38 - 42 123 42 - 46 122 46 - 51 122 51 - 58 122 58 - 90 122
10 rows selected.
Enter Number of Buckets Required: 20
RANGE COUNT -------------------- ---------- 17 - 20 62 20 - 22 62 23 - 25 62 25 - 27 62 27 - 29 62 29 - 30 62 30 - 32 61 32 - 34 61 34 - 36 61 36 - 38 61 38 - 40 61 40 - 42 61 42 - 44 61 44 - 46 61 46 - 48 61 48 - 51 61 51 - 54 61 54 - 58 61 58 - 63 61 63 - 90 61
20 rows selected.
> in article 3DC352BB.C4BF69FA_at_drew.edu, Steve Kass at skass_at_drew.edu wrote on
> 11/1/02 8:21 PM:
>
>> No reason not to have a database in front of you - the enterprise evaluation >> edition of SQL Server is free and lasts for 120 days... >> >> http://www.microsoft.com/sql/evaluation/trial/
>> >> Anyway, one way or another, if you find a better solution to the bucketing, >> let us know. >>
>> >> SK >> >> Mark Townsend wrote: >> >>> in article B9E741D5.C3D%markbtownsend_at_attbi.com, Mark Townsend at >>> markbtownsend_at_attbi.com wrote on 10/31/02 8:24 PM: >>> >>>> Hence the WIDTH_BUCKET function - something like (I don't have Oracle at >>>> home to test this) - >>>> >>>> SELECT WIDTH_BUCKET(age, lbound, ubound, nbr_buckets) "Bucket Nbr", >>>> count(*) >>>> FROM t >>>> GROUP BY "Bucket Nbr"; >>>> >>>> where lbound is min(age), ubound is max(age), and nbr_buckets is the number >>>> of buckets you want (10) >>>> >>>> Note that Bucket Nbr would be a number i.e 1 for the first bucket, 2 for >>>> the >>>> second bucket etc. >>>> >>>> To turn this number into the actual range statement, you would need >>>> something like an inline view >>>> >>>> SELECT lbound+(((ubound-lboundb)/nbr_buckets) * (b.bucket_nbr-1))||' - '|| >>>> lbound+(((ubound-lboundb)/nbr_buckets) * b.bucket_nbr) "Bucket", >>>> val "Count" >>>> FROM (SELECT WIDTH_BUCKET(age, lbound, ubound, nbr_buckets) "Bucket_Nbr", >>>> count(*) val >>>> FROM t >>>> GROUP BY "Bucket") b >>>> >>>> Note that this will work (assuming the theory is correct) for any range of >>>> values, but more importantly for any number of buckets. YMMV, as there is >>>> bound to be some debugging required for the above pseudo code. >>> >>> Apologies - I just realized I missed the most significant part of your >>> follow up post >>> >>>>> Assume that the lower bound and the upper bound is not preset >>> >>> I'll need to think about this a bit more, at the office, with a database in >>> front of me.