Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Usenet -> c.d.o.server -> Re: Computing truncated mean
On Feb 7, 2:32 pm, "John K. Hinsdale" <h..._at_alma.com> wrote:
> On Feb 7, 12:51 pm, "Charles Hooper" <hooperc2..._at_yahoo.com> wrote:
>
> > Your clarification helps. What we need to do is something like this,
> > which is similar to the PERCENT_RANK(), but 1 is added to the COUNT of
> > the records:
>
> > SELECT
> > N,
> > (ROW_NUMBER() OVER (ORDER BY N))/(COUNT(*) OVER (PARTITION BY 1)+1)
> > POSITION_RANKING
>
> ... OK I think that is the best way. Works, & fast.
>
> [BTW, using just COUNT(*), instead of COUNT(*)+1, in the
> denominatorseems more intuitive to me. E.g. for ten rows
> the "rank" would rise in increments of 0.1 and not 1/9.
> In this case you would need to use the comparisons ">"
> and "<=" instead of BETWEEN. However I cannot find any cases
> for which the two methods produce differente results, at
> least for rowsets of size N or larger when you are "trimming"
> 1/N of rows on either end.]
The problem here is that we need the first row to be the same distance from 0 as the last row is from 1, and the same distance between each row, which is where the PERCENT_RANK() analytic function failed (last row is assigned a value of 1.0). By taking COUNT(*)+1, I am actually incrementing by 1/(10+1) = 1/11 = 0.090909091 for each row.
Just to keep things interesting:
INSERT INTO T1 VALUES (8);
SELECT
N,
(ROW_NUMBER() OVER (ORDER BY N))/(COUNT(*) OVER (PARTITION BY 1)+1)
POSITION_RANKING
FROM
T1;
N POSITION_RANKING
---------- ---------------- 1 .083333333 2 .166666667 3 .25 4 .333333333 5 .416666667 8 .5 10 .583333333 10 .666666667 10 .75 10 .833333333 10 .916666667
SELECT
AVG(N) AVERAGE,
COUNT(*) NUM_RECORDS,
MIN(N) MINIMUM,
MAX(N) MAXIMUM,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N) MIDPOINT_1,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY N DESC) MIDPOINT_2
FROM
(SELECT
N,
(ROW_NUMBER() OVER (ORDER BY N))/(COUNT(*) OVER (PARTITION BY
1)+1) POSITION_RANKING
FROM
T1)
WHERE
POSITION_RANKING BETWEEN .20 AND .80;
AVERAGE NUM_RECORDS MINIMUM MAXIMUM MIDPOINT_1 MIDPOINT_2
---------- ----------- ---------- ---------- ---------- ----------
7.14285714 7 3 10 8 8
INSERT INTO T1 VALUES (11);
INSERT INTO T1 VALUES (12);
AVERAGE NUM_RECORDS MINIMUM MAXIMUM MIDPOINT_1 MIDPOINT_2
---------- ----------- ---------- ---------- ---------- ----------
7.77777778 9 3 10 10 10
Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc.
Received on Wed Feb 07 2007 - 13:51:44 CST
![]() |
![]() |