Re: Pearson-r in SQL

From: Paul Mason <latepaul_at_gmail.com>
Date: 8 Feb 2005 11:59:37 -0800
Message-ID: <1107892777.874946.275520_at_z14g2000cwz.googlegroups.com>


-CELKO- wrote:
> Add more data to the set, drop the columns with NULLs and now I can
get
> a result. For example with four more rows:
>
> (3kg, 6m)
> (5kg, 10m)
> (7kg, 14m)
> (1kg, 2m)
>
> my known values are linear. But I have those other rows of partial
> data that should reduce my confidence that I really have (r = 1). I
> just don't know what the SQL products like CA-Ingres are doing with
> this, nor what statistics would do with it.
> I did drop a note at alt.sci.math.statistics as well as here.

Since you asked, I ran a quick test, results below. I think it's reasonable that Ingres doesn't return 1 for the first two queries - whether NULL would be better than 0 I'll leave to the more statistically inclined to debate.

INGRES TERMINAL MONITOR Copyright 2004 Computer Associates Intl, Inc. Ingres Linux Version II 3.0.1 (int.lnx/109) login Tue Feb 8 19:27:04 2005

continue
* * /* SQL Startup File */
SELECT IFNULL(x,-99), IFNULL(y,-99)
FROM foobar
Executing . . .

+-----------+-----------+
|col1 |col2 |

+-----------+-----------+
|      3.000|    -99.000|
|      5.000|    -99.000|
|      7.000|    -99.000|
|    -99.000|      1.000|
|    -99.000|      2.000|
|    -99.000|      3.000|
|    -99.000|      4.000|
|      3.000|      6.000|
|      5.000|     10.000|
|      7.000|     14.000|
|      1.000|      2.000|

+-----------+-----------+
(11 rows)
continue
* * SELECT CORR(x,y) FROM foobar
Executing . . .

+-----------+
|col1 |
+-----------+
| 0.000|
+-----------+
(1 row)
continue
* * SELECT CORR(x,y) FROM foobar
Executing . . .

+-----------+
|col1 |
+-----------+
| 0.000|
+-----------+
(1 row)
continue
* * * SELECT CORR(x,y) FROM foobar
WHERE x IS NOT NULL
AND y IS NOT NULL
Executing . . .

+-----------+
|col1 |
+-----------+
| 1.000|
+-----------+
(1 row)
continue
* INSERT INTO foobar (x,y) VALUES (3.0,2.0) Executing . . .

(1 row)
continue
* * * SELECT CORR(x,y) FROM foobar
WHERE x IS NOT NULL
AND y IS NOT NULL
Executing . . .

+-----------+
|col1 |
+-----------+
| 0.942|
+-----------+
(1 row)
continue
* Your SQL statement(s) have been committed.

Ingres Version II 3.0.1 (int.lnx/109) logout Tue Feb 8 19:30:11 2005 Received on Tue Feb 08 2005 - 20:59:37 CET

Original text of this message