Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: Calculating The Median: Error Discovered in Oracle SQL 101 Code

Re: Calculating The Median: Error Discovered in Oracle SQL 101 Code

From: Jared Still <jkstill_at_cybcon.com>
Date: Sat, 30 Jun 2001 23:16:55 -0700
Message-ID: <F001.0033E951.20010630232023@fatcity.com>

Try this link:

http://math.about.com/science/math/library/howto/htmean.htm

Jared

On Saturday 30 June 2001 19:00, MacGregor, Ian A. wrote:
> I ran the code from Oracle SQL 101 which Jared posted modifying it to find
> the median of the sal column on that table
>
> SQL> select
> 2 rownum,
> 3 sal
> 4 from (
> 5 select sal
> 6 from scott.emp
> 7 where sal is not null
> 8 union
> 9 select 1 from dual where 1=2
> 10 )
> 11 group by sal, rownum
> 12 having rownum >= (
> 13 select decode( mod(total_freq,2),
> 14 1,trunc(total_freq/2 + 1),
> 15 0,trunc(total_freq/2)
> 16 )
> 17 from (
> 18 select count(*) total_freq
> 19 from scott.emp
> 20 where sal is not null
> 21 )
> 22 )
> 23 and rownum <= (
> 24 select decode( mod(total_freq,2),
> 25 1,trunc(total_freq/2 + 1),
> 26 0,trunc(total_freq/2 + 1)
> 27 )
> 28 from (
> 29 select count(*) total_freq
> 30 from scott.emp
> 31 where sal is not null
> 32 )
> 33 )
> 34 /
>
> values
> averaged
> ROWNUM in median
> ---------- ----------
> 7 1600
> 8 2450
> ----------
> Median 2025
>
> ---------------------------------------------------------------------------
>----------- This answer is different from the result of the code I posted
> which uses the new analytical functions.
>
> select
> case
> when mod(number_salaried,2) = 0 then
> (select sum(sal)/2 from(select sal, row_number()
> over ( order by sal) as salrank
> from scott.emp)
> where salrank = number_salaried/2
> or salrank = number_salaried/2 +1)
> else
> (select sal from(select sal, row_number()
> over ( order by sal) as salrank
> from scott.emp)
> where salrank = ceil(number_salaried/2))
> end median
> from (select sal,rank() over (order by sal) as rk from scott.emp),
> (select count(sal) number_salaried from scott.emp)
> where rk = 1
> /
>
> MEDIAN
> ---------
> 1550
> ---------------------------------------------------------------------------
>--------------------- Emp is a 14 row table . The median should be the
> average of the seventh and eighth values. I cleared the computes and
> columns and ran the first part of the SQL 101 code
>
> clear computes
> utes cleared
> select
> rownum,
> sal
> from (
> select sal
> from scott.emp
> where sal is not null
> union
> select 1 from dual where 1=2
> )
> group by sal, rownum
> /
>
> ROWNUM SAL
> ------ ----------
> 1 800
> 2 950
> 3 1100
> 4 1250
> 5 1300
> 6 1500
> 7 1600
> 8 2450
> 9 2850
> 10 2975
> 11 3000
>
> ROWNUM SAL
> ------ ----------
> 12 5000
>
> and also ran the part of my code which corresponded. I changed my code
> slightly so the salrank column would print.
>
> SQL> select salrank, sal from(select sal, row_number()
> 2 over ( order by sal) as salrank
> 3 from scott.emp)
> 4 /
>
> SALRANK SAL
> ---------- ----------
> 1 800
> 2 950
> 3 1100
> 4 1250
> 5 1250
> 6 1300
> 7 1500
> 8 1600
> 9 2450
> 10 2850
> 11 2975
>
> SALRANK SAL
> ---------- ----------
> 12 3000
> 13 3000
> 14 5000
>
> 14 rows selected.
>
> ---------------------------------------------------------------------------
>------------------- The reason for the different answers is now apparent.
> The SQL 101 code is tossing duplicate records. It's been a long time
> since my stats classes, but I'm about 99.999999% confident you don't purge
> duplicates when computing a median. But even if I'm wrong about this, the
> SQL 101 code has reduced the set to 12 members, but it is still computing
> the median as if there were 14 members; that is, it is taking the average
> of the 7th and 8th values and not the average of the 6th and 7th.
>
> I hope there was a caveat in SQL 101 book stating the code only worked
> against columns with unique values, not including nulls.
>
> Ian MacGregor
> Stanford Linear Accelerator Center
> ian_at_slac.stanford.edu

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Jared Still
  INET: jkstill_at_cybcon.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Sun Jul 01 2001 - 01:16:55 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US