Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Computing truncated mean

Re: Computing truncated mean

From: Walt <walt_askier_at_SHOESyahoo.com>
Date: Wed, 07 Feb 2007 12:29:04 -0500
Message-ID: <12sk330h5brc0c5@corp.supernews.com>


Walt wrote:
> John K. Hinsdale wrote:

>>> On Feb 7, 7:50 am, "Charles Hooper" <hooperc2000_at_yahoo.com> wrote:

>
>
> Thanks guys. I'm testing your candidate scripts with the following data:
>
> INSERT INTO T1 VALUES (1);
> INSERT INTO T1 VALUES (2);
> INSERT INTO T1 VALUES (3);
> INSERT INTO T1 VALUES (4);
> INSERT INTO T1 VALUES (5);
> INSERT INTO T1 VALUES (10);
> INSERT INTO T1 VALUES (10);
> INSERT INTO T1 VALUES (10);
> INSERT INTO T1 VALUES (10);
> INSERT INTO T1 VALUES (10);
>
> A 20% trim should include the middle six values and return a trimmed
> average of 42/6 == 7.
>
> So far, the only "formula" I've tried that gives the right answer is the
> first one John posted (actually a slight modification thereof). The
> others (see below) appear to have problems. I'll try to understand them
> further to pinpoint the error.

Ok, here's the deal. Trimming according to cume_dist or percent_ranking trims by value, not by record. That is, all the records with n=10 come and go as a group - either they're in the sample used to compute the average or they're not.

John's first formula trims the top and bottom 20% of records, the others trim all records with a value between the 20th percentile and 80th percentile. Not the same thing (although it is if the values are distinct, as with some of the test data.)

Perhaps this is due to some ambiguity in the way I stated the problem. If so, I apologize. A 20% truncated mean is the average over the middle 60% of *records*, not over all the records whose *values* fall in the the 20th and 80th percentile range.

In my particular situation, I'm crunching some numbers from a survey where the only possible responses are 1 through 5. Trimming by value will result in all the 1s and 5s disappearing from the average, which is not at all what the client is looking for.

//Walt Received on Wed Feb 07 2007 - 11:29:04 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US