Re: Help with statistics windowing function

From: Harel Safra <harel.safra_at_gmail.com>
Date: Sun, 25 Nov 2018 22:03:52 +0200
Message-ID: <CA+UC=5EH83EiOvSbGbKp77iGM+4-YxTVESaC6jE9cNF590W_kA_at_mail.gmail.com>



You normalize the values to a [0..n] range and them divide by the range size.
This should work, the trunc is to make the output more readable.

SELECT
    val,
    trunc((val - MIN(val) OVER())/ ( MAX(val) OVER() - MIN(val) OVER() ),3) norm
FROM
    test order by val;

Harel Safra

On Sun, Nov 25, 2018 at 9:47 PM Luis Santos <lsantos_at_pobox.com> wrote:

> Thanks for all replies. I forgot to mention that I really could use an
> inline view or a with subfactoring view. But, to be frank, I was playing
> with this because I want a way to normalize the data, creating a rank.
>
> I discover the PERCENT_RANK windowing function. And applied a minus 0.5 to
> it to get this results.
>
> U71013576_at_P00MDS.brux0333 [11g]> col pct_rank format 9990D999
>> U71013576_at_P00MDS.brux0333 [11g]>
>> U71013576_at_P00MDS.brux0333 [11g]> select val
>> 2 , avg(val) over ()
>> 3 , val - avg(val) over ()
>> 4 , min(val) over ()
>> 5 , max(val) over ()
>> 6 , PERCENT_RANK() over (order by val)-0.5 PCT_RANK
>> 7 from test
>> 8 /
>> VAL AVG(VAL)OVER() VAL-AVG(VAL)OVER() MIN(VAL)OVER()
>> MAX(VAL)OVER() PCT_RANK
>> ---------- -------------- ------------------ --------------
>> -------------- ---------
>> 8 44,25 -36,25 8
>> 91 -0,500
>> 12 44,25 -32,25 8
>> 91 -0,447
>> 14 44,25 -30,25 8
>> 91 -0,395
>> 14 44,25 -30,25 8
>> 91 -0,395
>> 20 44,25 -24,25 8
>> 91 -0,289
>> 25 44,25 -19,25 8
>> 91 -0,237
>> 26 44,25 -18,25 8
>> 91 -0,184
>> 32 44,25 -12,25 8
>> 91 -0,132
>> 34 44,25 -10,25 8
>> 91 -0,079
>> 35 44,25 -9,25 8
>> 91 -0,026
>> 35 44,25 -9,25 8
>> 91 -0,026
>> 49 44,25 4,75 8
>> 91 0,079
>> 64 44,25 19,75 8
>> 91 0,132
>> 65 44,25 20,75 8
>> 91 0,184
>> 65 44,25 20,75 8
>> 91 0,184
>> 70 44,25 25,75 8
>> 91 0,289
>> 73 44,25 28,75 8
>> 91 0,342
>> 73 44,25 28,75 8
>> 91 0,342
>> 80 44,25 35,75 8
>> 91 0,447
>> 91 44,25 46,75 8
>> 91 0,500
>> 20 rows selected.
>
>
> But the "normalized" scale on PCT_RANK column is positional only. If I
> change the MAX value to a real big one the value for PCT_RANK don't change.
>
> U71013576_at_P00MDS.brux0333 [11g]> update test set val=9100 where val=91;
>> 1 row updated.
>> U71013576_at_P00MDS.brux0333 [11g]> col pct_rank format 9990D999
>> U71013576_at_P00MDS.brux0333 [11g]>
>> U71013576_at_P00MDS.brux0333 [11g]> select val
>> 2 , avg(val) over ()
>> 3 , val - avg(val) over ()
>> 4 , min(val) over ()
>> 5 , max(val) over ()
>> 6 , PERCENT_RANK() over (order by val)-0.5 PCT_RANK
>> 7 from test
>> 8 /
>> VAL AVG(VAL)OVER() VAL-AVG(VAL)OVER() MIN(VAL)OVER()
>> MAX(VAL)OVER() PCT_RANK
>> ---------- -------------- ------------------ --------------
>> -------------- ---------
>> 8 494,7 -486,7 8
>> 9100 -0,500
>> 12 494,7 -482,7 8
>> 9100 -0,447
>> 14 494,7 -480,7 8
>> 9100 -0,395
>> 14 494,7 -480,7 8
>> 9100 -0,395
>> 20 494,7 -474,7 8
>> 9100 -0,289
>> 25 494,7 -469,7 8
>> 9100 -0,237
>> 26 494,7 -468,7 8
>> 9100 -0,184
>> 32 494,7 -462,7 8
>> 9100 -0,132
>> 34 494,7 -460,7 8
>> 9100 -0,079
>> 35 494,7 -459,7 8
>> 9100 -0,026
>> 35 494,7 -459,7 8
>> 9100 -0,026
>> 49 494,7 -445,7 8
>> 9100 0,079
>> 64 494,7 -430,7 8
>> 9100 0,132
>> 65 494,7 -429,7 8
>> 9100 0,184
>> 65 494,7 -429,7 8
>> 9100 0,184
>> 70 494,7 -424,7 8
>> 9100 0,289
>> 73 494,7 -421,7 8
>> 9100 0,342
>> 73 494,7 -421,7 8
>> 9100 0,342
>> 80 494,7 -414,7 8
>> 9100 0,447
>> 9100 494,7 8605,3 8
>> 9100 0,500
>> 20 rows selected.
>
>
> Is there a better way to normalize a list of values, using a windowing
> function, to a [0..1] scale?
>
> *--*
> *Att*
>
>
> *Luis Santos*
>
>
> Em dom, 25 de nov de 2018 às 17:21, Harel Safra <harel.safra_at_gmail.com>
> escreveu:
>
>> SELECT
>> t.*,
>> MIN(dtm) OVER(),
>> MAX(dtm) OVER()
>> FROM
>> (
>> SELECT
>> val,
>> AVG(val) OVER() avg,
>> val - AVG(val) OVER() dtm,
>> MIN(val) OVER() min,
>> MAX(val) OVER() max
>> FROM
>> test
>> ) t;
>>
>> Harel
>>
>> On Sun, Nov 25, 2018 at 8:59 PM Luis Santos <lsantos_at_pobox.com> wrote:
>>
>>> I have the following table, with 20 random values.
>>>
>>> U71013576_at_P00MDS.brux0333 [11g]> select val from test;
>>>> VAL
>>>> ----------
>>>> 65
>>>> 70
>>>> 12
>>>> 65
>>>> 91
>>>> 25
>>>> 8
>>>> 73
>>>> 35
>>>> 20
>>>> 26
>>>> 14
>>>> 73
>>>> 35
>>>> 49
>>>> 80
>>>> 34
>>>> 14
>>>> 32
>>>> 64
>>>> 20 rows selected.
>>>
>>>
>>> With the following query I can get the average, the min and max value,
>>> and a calculated distance from mean.
>>>
>>> U71013576_at_P00MDS.brux0333 [11g]> select val
>>>> 2 , avg(val) over () AVG
>>>> 3 , val - avg(val) over () DTM
>>>> 4 , min(val) over () MIN
>>>> 5 , max(val) over () MAX
>>>> 6 from test;
>>>>
>>>
>>>
>>>> VAL AVG DTM MIN MAX
>>>> ---------- ---------- ---------- ---------- ----------
>>>> 65 44,25 20,75 8 91
>>>> 70 44,25 25,75 8 91
>>>> 12 44,25 -32,25 8 91
>>>> 65 44,25 20,75 8 91
>>>> 91 44,25 46,75 8 91
>>>> 25 44,25 -19,25 8 91
>>>> 8 44,25 -36,25 8 91
>>>> 73 44,25 28,75 8 91
>>>> 35 44,25 -9,25 8 91
>>>> 20 44,25 -24,25 8 91
>>>> 26 44,25 -18,25 8 91
>>>> 14 44,25 -30,25 8 91
>>>> 73 44,25 28,75 8 91
>>>> 35 44,25 -9,25 8 91
>>>> 49 44,25 4,75 8 91
>>>> 80 44,25 35,75 8 91
>>>> 34 44,25 -10,25 8 91
>>>> 14 44,25 -30,25 8 91
>>>> 32 44,25 -12,25 8 91
>>>> 64 44,25 19,75 8 91
>>>> 20 rows selected.
>>>
>>>
>>>
>>> I want to get, in also windowing fashion, the MIN/MAX values from the
>>> DTM column. But...
>>>
>>> select val
>>>> , avg(val) over ()
>>>> , val - avg(val) over ()
>>>> , min(val) over ()
>>>> , max(val) over ()
>>>> , min(val - avg(val) over ()) over()
>>>> , max(val - avg(val) over ()) over()
>>>> from test
>>>> /
>>>
>>>
>>>
>>>> , min(val - avg(val) over ()) over()
>>>> *
>>>> ERROR at line 6:
>>>> ORA-30483: window functions are not allowed here
>>>
>>>
>>>
>>>
>>> *--*
>>> *Att*
>>>
>>>
>>> *Luis Santos*
>>>
>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Nov 25 2018 - 21:03:52 CET

Original text of this message