Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Computing truncated mean

Re: Computing truncated mean

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: 7 Feb 2007 04:50:21 -0800
Message-ID: <1170852618.557700.324260@v45g2000cwv.googlegroups.com>


On Feb 6, 4:40 pm, Walt <walt_ask..._at_SHOESyahoo.com> wrote:
> A truncated mean or a trimmed average is the average after throwing away
> the top and bottom n percent.
>
> I seem to recall a built-in function in Oracle that did this fairly
> simply, but I can't seem to locate it in the docs. I could cook up my
> own in PL-SQL, but if there's something already out there I'd prefer to
> use it. Anybody know a simple way to do it?
>
> Thanks.
>
> //Walt
>
> W2k3, Ora 9.2 if that matters...

Set up a test case:
CREATE TABLE T1 (N NUMBER(10));

INSERT INTO T1 VALUES (1);
INSERT INTO T1 VALUES (2);
INSERT INTO T1 VALUES (8);
INSERT INTO T1 VALUES (10);
INSERT INTO T1 VALUES (15);
INSERT INTO T1 VALUES (20);
INSERT INTO T1 VALUES (34);
INSERT INTO T1 VALUES (40);
INSERT INTO T1 VALUES (55);
INSERT INTO T1 VALUES (18);
INSERT INTO T1 VALUES (22);
INSERT INTO T1 VALUES (28);
INSERT INTO T1 VALUES (32);
INSERT INTO T1 VALUES (40);
INSERT INTO T1 VALUES (49);
INSERT INTO T1 VALUES (50);
INSERT INTO T1 VALUES (80);
INSERT INTO T1 VALUES (200);
INSERT INTO T1 VALUES (201);
INSERT INTO T1 VALUES (210);

SELECT
  *
FROM
  T1
ORDER BY
  N;

         N


         1
         2
         8
        10
        15
        18
        20
        22
        28
        32
        34
        40
        40
        49
        50
        55
        80
       200
       201
       210

Now, an experiement to find the point in our table that represents the bottom 20% (START_POINT) and top 20% (END_POINT): SELECT
  PERCENTILE_DISC(0.20) WITHIN GROUP (ORDER BY N) START_POINT,   PERCENTILE_DISC(0.20) WITHIN GROUP (ORDER BY N DESC) END_POINT FROM
  T1
GROUP BY
  1;

START_POINT END_POINT
----------- ----------

         10 80

Now that we have the end points defined, we need a way to limit the input into the AVG function so that only the rows that contain column values between the two extremes are retrieved. Sliding the above into an inline view and joining it to our original table solves the problem:
SELECT
  AVG(N) AVERAGE,
  COUNT(*) NUM_RECORDS,
  MIN(N) MINIMUM,
  MAX(N) MAXIMUM,
  PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N) MIDPOINT_1,   PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N DESC) MIDPOINT_2 FROM
  T1,
  (SELECT
    PERCENTILE_DISC(0.2) WITHIN GROUP (ORDER BY N) START_POINT,     PERCENTILE_DISC(0.2) WITHIN GROUP (ORDER BY N DESC) END_POINT   FROM
    T1
  GROUP BY

  1. L WHERE T1.N BETWEEN L.START_POINT AND L.END_POINT;
   AVERAGE NUM_RECORDS MINIMUM MAXIMUM MIDPOINT_1 MIDPOINT_2 ---------- ----------- ---------- ---------- ---------- ---------- 35.2142857 14 10 80 32 34

After throwing out those values outside the 20% to 80% range, our average is 35.21, there are 14 records, the minimum is 10, maximum is 80, the first number in the middle of the group sorted from lowest to highest is 32, and the first number in the middle of the group sorted from highest to lowest is 34.

Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc. Received on Wed Feb 07 2007 - 06:50:21 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US