Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Computing truncated mean

Re: Computing truncated mean

From: Walt <walt_askier_at_SHOESyahoo.com>
Date: Wed, 07 Feb 2007 11:39:41 -0500
Message-ID: <12sk06eq64dfrbb@corp.supernews.com>


John K. Hinsdale wrote:
>> On Feb 7, 7:50 am, "Charles Hooper" <hooperc2000_at_yahoo.com> wrote:

Thanks guys. I'm testing your candidate scripts with the following data:

INSERT INTO T1 VALUES (1);
INSERT INTO T1 VALUES (2);
INSERT INTO T1 VALUES (3);
INSERT INTO T1 VALUES (4);
INSERT INTO T1 VALUES (5);
INSERT INTO T1 VALUES (10);
INSERT INTO T1 VALUES (10);
INSERT INTO T1 VALUES (10);
INSERT INTO T1 VALUES (10);
INSERT INTO T1 VALUES (10);

A 20% trim should include the middle six values and return a trimmed average of 42/6 == 7.

So far, the only "formula" I've tried that gives the right answer is the first one John posted (actually a slight modification thereof). The others (see below) appear to have problems. I'll try to understand them further to pinpoint the error.

////////////////////////////////////////

select avg(n) as trim_avarage,
        count(*) as rows_considered,
        max(total_count) - count(*) as rows_ignored,
        max(total_count) as total_rows
from ( select OQ.*, rownum / OQ.total_count as row_frac
        from
              ( select t1.*, C.total_count
                from t1 ,
                     ( select count(*) as total_count from t1 ) C
                where t1.n is not null
                order by t1.n
              ) OQ
       )

/* Ignore top and bottom deciles */
where row_frac > 0.20

   and row_frac <= 0.80;

TRIM_AVARAGE ROWS_CONSIDERED ROWS_IGNORED TOTAL_ROWS ------------ --------------- ------------ ----------

            7 6 4 10 1 row selected

////////////////////////////////////////////////




select avg(n) as average_n,
            count(*) as rows_considered
     from (
       select n, cume_dist_n
       from (
         select n,
                cume_dist() over (order by n) AS cume_dist_n
         from t1
         )
         order by n
       )
       where cume_dist_n > 0.20
         and cume_dist_n <= (1 - 0.20)

  AVERAGE_N ROWS_CONSIDERED
---------- ---------------

          4 3
1 row selected

SELECT
   AVG(N) AVERAGE,
   COUNT(*) NUM_RECORDS,
   MIN(N) MINIMUM,
   MAX(N) MAXIMUM,
   PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N) MIDPOINT_1,    PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N DESC) MIDPOINT_2 FROM
   (SELECT

     N,
     PERCENT_RANK() OVER (ORDER BY N) RANKING
   FROM
     T1)

WHERE
   RANKING BETWEEN .20 AND .80
    AVERAGE NUM_RECORDS MINIMUM MAXIMUM MIDPOINT_1 MIDPOINT_2 ---------- ----------- ---------- ---------- ---------- ----------

       7.75 8 3 10 10 10

//Walt Received on Wed Feb 07 2007 - 10:39:41 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US