Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: computing median

Re: computing median

From: Frank Hubeny <fhubeny_at_ntsource.com>
Date: Wed, 25 Apr 2001 23:24:18 -0700
Message-ID: <3AE7BF12.6CCE17B5@ntsource.com>

I would agree that using Excel is not the best way to get the median, especially if there are millions of records involved. However, the suggestion to use the cume_dist function does not always give the expected answer. (Perhaps I just don't understand how it is being used.)

Suppose I have a table, called testmedian with one column, A, and it contains two rows with 1 and 10 as values for A. The median of 1 and 10 is their average or 5.5. (This is what Excel says it is at any rate.)

However, when I do the cume_dist over A, I get the following:

SQL> select a, cume_dist() over (order by a) result from testmedian;

         A RESULT
---------- ----------

         1         .5
        10          1

This would imply that 1 is the median, which is incorrect.

It is also possible that cume_dist does not return any value at the .5 level. This would occur, for example, if the numbers in the table were 1,1,10.

SQL> select a, cume_dist() over (order by a) result from testmedian;

         A RESULT
---------- ----------

         1 .666666667
         1 .666666667
        10          1

One of the earlier respondants suggested counting the number of rows and selecting the median by fetching the appropriate number of ordered rows. If the number of rows is even, the median would be the average value of the field of interest in the two central rows. Otherwise it is the value of the field of interest in the central row.

This appears to be the most straightforward way to get the median. It is probably also faster than using Excel.

Frank Hubeny

Dave Fowler wrote:

> Herwig,
> Try this. Build pl/sql function passing value for col_distvalue around
> this cursor....This function could be as dynamic as you want. In the loop
> looking at rows Get the Col1 row with column cume_dist value of .5 it works
> for even/odd number of total rows. Return the col1 value.
>
> Test data Col_distvalue is varchar2 of the same value e.g. 'X' col1 is
> number
>
> SELECT col_distvalue,COL1,count(*) count
> , CUME_DIST() OVER (PARTITION BY col_distvalue ORDER BY col1) AS cume_dist
> FROM mediantest
> group by col_distvalue,col1
>
> I was suprised a median function hadn't been written yet either with all the
> advanced stats for OLAP services in Oracle 8.
>
> This gets it done competely in Oracle enviroment without EXCEL or other
> products involved and returns it with a select statement.
>
> hth,
> Dave Fowler
>
> "Herwig Henseler" <hh_at_ecce-terram.de> wrote in message
> news:hh-2D0C99.18481320042001_at_news.f.via-net-works.de...
> > Hi experts,
> >
> > [Oracle 8i 8.1.7]
> >
> > How do I compute the Median of values in a table? I guess this must be a
> > FAQ but haven't found an answer so far. The median is the value in the
> > "middle" of a sorted list of values. That is, half of all values are
> > bigger and the other half is smaller than the median.
> >
> > TIA,
> > Herwig
> >
> > --
> > ECCE TERRAM Internet Services GmbH Dr. Herwig Henseler
> > E-Business & Webagentur Tel. 0441 / 500 12-0
> > Heinrichstr. 18d Fax. 0441 / 500 12-29
> > 26131 Oldenburg URL: http://www.ecce-terram.de
> > "Since I've solved all our Y2K problems, I will be on vacation from
> > 28th Dec 1999 to 6th Jan 1900."
> >
Received on Thu Apr 26 2001 - 01:24:18 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US