Home » SQL & PL/SQL » SQL & PL/SQL » Correlation Approach (Oracle 11g, Linux (CentOS 5))
Correlation Approach [message #427599] Fri, 23 October 2009 08:03 Go to next message
Vackar
Messages: 81
Registered: October 2007
Member
Hi

I have a problem where I am trying to find the correlaction of sample pairs given a set of measured observations.

So I have this data

=================================================================
SampleID     Measure1     Measure1     Measure1   ...     Measuren
=================================================================
Sample1       0.2            0.6         0.5      ...       1.2
Sample2       0.1            1.3         2.5      ...       6.3
Sample3       0.8            76          3.5      ...       1.2
Sample4       8.7            0.6         3.4      ...       5.2
...           ...            ...         ...      ...       ...
Sample55000   0.7            2.6         3.5      ...       1.8
=================================================================


And I'm trying to create a correlation matrix of the form

	    Sample1       Sample2       Sample3       Sample4       Sample5       Sample6       Sample7
Sample1     1.000
Sample2     0.274    1.000
Sample3    -0.134   -0.269    1.000
Sample4     0.201   -0.153    0.075    1.000
Sample5    -0.129   -0.166    0.278   -0.011    1.000
Sample6    -0.095    0.280   -0.348   -0.378   -0.009    1.000
Sample7     0.171   -0.122    0.288    0.086    0.193    0.002    1.000
Sample8     0.219    0.242   -0.380   -0.227   -0.551    0.324   -0.082    1.000
Sample9     0.518    0.238    0.002    0.082   -0.015    0.304    0.347   -0.013    1.000
Sample10    0.299    0.568    0.165   -0.122   -0.106   -0.169    0.243    0.014    0.352    1.000



OR
Sample1 Sample1   1.0
Sample1 Sample2   0.2
Sample1 Sample3   0.6
Sample1 Sample4   0.5
Sample1 Sample5   0.4
Sample2 Sample2   1.0
Sample2 Sample3   1.2
Sample2 Sample4   6.3



But I'm not sure what the best approach to this would be


I'm aware of the corr function but that givens you the correlation of column values, which can be grouped by some other column, so it doesn't really help in this situation.

I've also looked at doing a pivot so that samples define columns and measures become rows, that would then let me use the correlation function as corr(sample1, sample2), but it would mean having 55000 columns, and Oracle has a limit of 1000 columns per table.

Does anyone have any recommendations about how you may want to tackle this kind of problem.

Any suggestions would be warmly welcomed.

Thanks,
Vackar
Re: Correlation Approach [message #429357 is a reply to message #427599] Tue, 03 November 2009 06:23 Go to previous messageGo to next message
Vackar
Messages: 81
Registered: October 2007
Member
Anyone got any suggestions?
Re: Correlation Approach [message #429377 is a reply to message #429357] Tue, 03 November 2009 07:45 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
How about just pivoting the measure?
with src as (select 'Sample1' sample,       0.2 measure1, 0.6 measure2, 0.5 measure3, 1.2 measure4 from dual union all
             select 'Sample2'       ,0.1 , 1.3 , 2.5 ,  6.3 from dual union all
             select 'Sample3'       ,0.8 , 76  , 3.5 ,  1.2 from dual union all
             select 'Sample4'       ,8.7 , 0.6 , 3.4 ,  5.2 from dual)
, pivot as (select sample
                  ,case when lvl = 1 then measure1
                        when lvl = 2 then measure2
                        when lvl = 3 then measure3
                        when lvl = 4 then measure4
                   end val
                  ,lvl measure
                   
            from   src
                  ,(select level lvl from dual connect by level <=4))
select p1.sample
      ,p2.sample
      ,corr(p1.val,p2.val)
from   pivot p1
      ,pivot p2
where  p1.sample < p2.sample
and    p1.measure = p2.measure
group by p1.sample
      ,p2.sample
order by p1.sample,p2.sample;


Re: Correlation Approach [message #429399 is a reply to message #429377] Tue, 03 November 2009 09:00 Go to previous message
Vackar
Messages: 81
Registered: October 2007
Member
Wow that was just what I needed.

I never fail to be amazed by the things that oracle can do ( in mere Milliseconds - java, python... etc take at least 100x longer)

Thank you so much for your help
Razz
Previous Topic: Calculate with sql
Next Topic: trying to create a new column a specific position in already existing table
Goto Forum:
  


Current Time: Wed Dec 07 03:21:45 CST 2016

Total time taken to generate the page: 0.08680 seconds