Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Histogram/Bucketizing resultset
in article 499d4ffb.0210310936.731aa87d_at_posting.google.com, NoSpamPlease at
googleposer_at_yahoo.com wrote on 10/31/02 9:36 AM:
> Thx Mark > My bad - I framed the question incorrectly > > Assume that the lower bound and the upper bound is not preset > I need to do this in 1 single query > Hence > LB = minimum of the attribute in the table > UB = max of the attribute in the table > Bucketsize = UB-LB/10 <- 10 is -need 10 buckets > Age
> > LB = 24 > UB = 76 > BucketSize =(76 -24)/10 > > Bucket Count > 24-29.2 2 > (total 10 rows) > .. > 70.8-76 1 >
Hence the WIDTH_BUCKET function - something like (I don't have Oracle at home to test this) -
SELECT WIDTH_BUCKET(age, lbound, ubound, nbr_buckets) "Bucket Nbr", count(*)
FROM t
GROUP BY "Bucket Nbr";
where lbound is min(age), ubound is max(age), and nbr_buckets is the number of buckets you want (10)
Note that Bucket Nbr would be a number i.e 1 for the first bucket, 2 for the second bucket etc.
To turn this number into the actual range statement, you would need something like an inline view
SELECT lbound+(((ubound-lboundb)/nbr_buckets) * (b.bucket_nbr-1))||' - '||
lbound+(((ubound-lboundb)/nbr_buckets) * b.bucket_nbr) "Bucket", val "Count" FROM (SELECT WIDTH_BUCKET(age, lbound, ubound, nbr_buckets) "Bucket_Nbr", count(*) val
Note that this will work (assuming the theory is correct) for any range of values, but more importantly for any number of buckets. YMMV, as there is bound to be some debugging required for the above pseudo code.
> Mark Townsend <markbtownsend_at_attbi.com> wrote in message > news:<B9E5FECF.B3D%markbtownsend_at_attbi.com>... >> in article 499d4ffb.0210302045.452ef143_at_posting.google.com, NoSpamPlease at >> googleposer_at_yahoo.com wrote on 10/30/02 8:45 PM: >>
>> >> For an Oracle solution, assuming the bucket ranges you have given, something >> like >> >> SELECT >> (CASE WHEN age BETWEEN 0 AND 10 THEN ' 0 - 10' >> WHEN age BETWEEN 11 AND 20 THEN '11 - 20' >> WHEN age BETWEEN 21 AND 30 THEN '21 - 30' >> ... >> END) >> AS Range, >> COUNT(*) AS Value >> FROM T >> GROUP BY >> (CASE WHEN age BETWEEN 0 AND 10 THEN ' 0 - 10' >> WHEN age BETWEEN 11 AND 20 THEN '11 - 20' >> WHEN age BETWEEN 21 AND 30 THEN '21 - 30' >> ... >> END); >> >> This obviously becomes a PITA with more buckets. >> >> Depending on your actual requirements, you may also want to investigate >> NTILE or WIDTH_BUCKET functions, which will automatically create either >> equiwidth buckets or histograms for you, respectively - >> http://otn.oracle.com/docs/products/oracle9i/doc_library/release2/server.920 >> /a96520/analysis.htmReceived on Thu Oct 31 2002 - 22:24:05 CST