Re: Reigniting Probability theory debate
From: Vadim Tropashko <vadimtro_at_gmail.com>
Date: Wed, 11 Apr 2012 17:08:20 -0700 (PDT)
Message-ID: <1a547377-8410-47ff-ad81-a150fd31aaeb_at_r2g2000pbs.googlegroups.com>
I agree, and suggest that a distribution corresponds not just to a relation, but to a relation together with a partitioning of its attributes into two sets. For example, given the relation
Classes=[Prof Course Time]
Libkin DB101 Tue200
Libkin DB101 Thu500
Gromov Math Tue200
Gromov Math Thu500
Vianu DB101 Tue200
;
If we split the attributes as {Prof}|{Course,Time}, then we have the distribution
ProfessorDistribution=[Prof Probability]
Libkin 0.4
Gromov 0.4
Vianu 0.2
;
The same can be done for the split {Course}|{Prof,Time}:
CourseDistribution=[Course Probability]
DB101 0.6
Math 0.4
;
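A minimal sketch of how the two distributions above could be derived from Classes. It assumes that the probability of a value is the fraction of tuples carrying it, which is inferred from the numbers in the example (Libkin appears in 2 of 5 tuples, hence 0.4) rather than stated explicitly. The names `distribution`, `ATTRS` and `CLASSES` are hypothetical helpers for illustration.

```python
from collections import Counter
from fractions import Fraction

# The Classes relation from the example, as a set of (Prof, Course, Time) tuples.
ATTRS = ("Prof", "Course", "Time")
CLASSES = {
    ("Libkin", "DB101", "Tue200"),
    ("Libkin", "DB101", "Thu500"),
    ("Gromov", "Math", "Tue200"),
    ("Gromov", "Math", "Thu500"),
    ("Vianu", "DB101", "Tue200"),
}


def distribution(relation, attrs, left):
    """Map each value of the `left` attributes to the fraction of tuples carrying it.

    `left` is the first block of the partition; the remaining attributes
    form the second block and are simply counted over.
    """
    positions = [attrs.index(name) for name in left]
    counts = Counter(tuple(row[i] for i in positions) for row in relation)
    total = len(relation)
    return {value: Fraction(n, total) for value, n in counts.items()}


# Split {Prof}|{Course,Time}
professor_distribution = distribution(CLASSES, ATTRS, ["Prof"])
# Split {Course}|{Prof,Time}
course_distribution = distribution(CLASSES, ATTRS, ["Course"])

for value, p in sorted(professor_distribution.items()):
    print(value[0], float(p))
for value, p in sorted(course_distribution.items()):
    print(value[0], float(p))
```

Exact fractions are used so that the probabilities sum to exactly 1; they are converted to floats only for printing.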
The FD Prof->Course implies that the entropy of CourseDistribution cannot exceed that of ProfessorDistribution, and here it is strictly lower:
http://vadimtropashko.wordpress.com/2012/01/26/caclulating-entropy-and-gini-index-for-a-partitioned-table/
Received on Thu Apr 12 2012 - 02:08:20 CEST
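The entropy comparison can be checked numerically on the two distributions given in the message. This is only a sketch: it assumes Shannon entropy with base-2 logarithms (the message does not fix the base), and the `entropy` helper is a hypothetical name, not taken from the linked article.

```python
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# Distributions copied from the message.
professor = {"Libkin": 0.4, "Gromov": 0.4, "Vianu": 0.2}
course = {"DB101": 0.6, "Math": 0.4}

h_prof = entropy(professor.values())    # roughly 1.52 bits
h_course = entropy(course.values())     # roughly 0.97 bits

print(f"H(Prof)   = {h_prof:.3f}")
print(f"H(Course) = {h_course:.3f}")
print("H(Course) <= H(Prof):", h_course <= h_prof)
```

Since Prof->Course makes Course a function of Prof, each Course probability is a sum of Prof probabilities, and merging outcomes can never increase entropy.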