Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.misc -> Re: Calculate median with ORACLE??
First lets agree on a definition. In order to calculate median you need to:
The main issue here is to force ORACLE select rows ordered by
column value. To do that you need to create an index on that
column. To force ORACLE to use it you need to add a where clause
condition column >= MIN(column) and (optionally) add an index hint
to your select statement. Now you need to know upfront number of
not null (for the column) rows, is it odd or even and column MIN
value. To do that you can use subquery in from clause (I assume
your ORACLE version is 7.2 or higher. If not, you can easily move
that subquery to where clause). Then using decode you can replace
not needed column values with 0, sum everything up and divide by
two. Note that sum is always divided by 2, therefore if count is
odd query doubles the middle row value.
Below is an example:
SQL> desc xyz
Name Null? Type ------------------------------- -------- ---- X NUMBER
SQL> select * from xyz;
X
10 5 1 4 3
Notice that column X values physically are stored not ordered!!!
SQL> create index xyz_i on xyz(x);
Index created.
Index is a must!!!
SQL> set autotrace on SQL> set echo on SQL> @median SELECT /*+ INDEX(XYZ_I,XYZ) */ SUM(DECODE(ROWNUM,Half_Count,DECODE(Remainder,0,X,0), Half_Count+1,DECODE(Remainder,0,X,2*X), 0 ) ) / 2 Median FROM XYZ, (SELECT TRUNC(COUNT(*) / 2) Half_Count, MOD(COUNT(*),2) Remainder, MIN(X) Min_Value FROM XYZ WHERE X IS NOT NULL )
Median
4
Execution Plan ----------------------------------------------------------0 SELECT STATEMENT Optimizer=CHOOSE (Cost=101 Card=10000 Bytes=1170000)
1 0 SORT (AGGREGATE)
2 1 COUNT 3 2 NESTED LOOPS (Cost=101 Card=10000 Bytes=1170000) 4 3 VIEW (Cost=1 Card=100 Bytes=6500) 5 4 SORT (AGGREGATE) 6 5 TABLE ACCESS (FULL) OF 'XYZ' (Cost=1 Card=100 By tes=2600) 7 3 INDEX (RANGE SCAN) OF 'XYZ_I' (NON-UNIQUE)
This is the most important step. ORACLE is being forced to use XYZ_I index. If you will remove WHERE X >= Min_Value condition ORACLE will do full scan of XYZ table in the physical order and result will be wrong!!!
SQL> @wrong_median
SELECT /*+ INDEX(XYZ_I,XYZ) */
SUM(DECODE(ROWNUM,Half_Count,DECODE(Remainder,0,X,0), Half_Count+1,DECODE(Remainder,0,X,2*X), 0 ) ) / 2 Median FROM XYZ, (SELECT TRUNC(COUNT(*) / 2) Half_Count, MOD(COUNT(*),2) Remainder, MIN(X) Min_Value FROM XYZ WHERE X IS NOT NULL )
Median
1
Execution Plan ----------------------------------------------------------0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2001 Card=200000 Bytes=23400000)
1 0 SORT (AGGREGATE)
2 1 COUNT 3 2 NESTED LOOPS (Cost=2001 Card=200000 Bytes=23400000) 4 3 TABLE ACCESS (FULL) OF 'XYZ' (Cost=1 Card=2000 Bytes =104000) 5 3 VIEW 6 5 SORT (AGGREGATE) 7 6 TABLE ACCESS (FULL) OF 'XYZ' (Cost=1 Card=2000 B ytes=26000)
As you can see forcing ORACLE to use index is a must.
Solomon.Yakobson_at_entex.com
In article <5ef09r$qdu_at_nr1.ottawa.istar.net>, "Sylvain Grenier" <grenie02_at_cnmtl2.cn.ca> wrote:
> > Hi, > > I recently had a request to calculate some statistics on a Oracle table > values. I have to calculate minimum, maximum, average and median. The > three first are not to complicated but a have no idea how to calculate the > median. Is there someone that ever had that problem before? If yes, is > there a way? > > Thanks in advance. > > Sylvain. (grenie02_at_cnmtl2.cn.ca)Received on Wed Feb 19 1997 - 00:00:00 CST
-------------------==== Posted via Deja News ====-----------------------
http://www.dejanews.com/ Search, Read, Post to Usenet