Re: Reigniting Probability theory debate

From: Vadim Tropashko <vadimtro_at_gmail.com>
Date: Wed, 11 Apr 2012 17:08:20 -0700 (PDT)
Message-ID: <1a547377-8410-47ff-ad81-a150fd31aaeb_at_r2g2000pbs.googlegroups.com>


On Apr 11, 2:02 pm, com..._at_hotmail.com wrote: ...
> I doubt that relations are an appropriate abstraction for distributions per se. I expect that a relation is just one constituent of a proper distribution representation (which would include notions of dependent and independent coordinates/variables/attributes and calculating and renormalizing probabilities to sum to 1 and multiplying and summing for conditional and marginal probabilities and binary mappings in particular) and that any relevant operators on relations (which generally won't be database ones) are in turn used to define other operators on distribution representations per se. The paper's abstract says that conditional distributions and marginal distributions are given by selection and projection respectively. However I expect that what is actually the case is that for their relational representation of (some of) a distribution some notion of removing rows and columns happens as part of complex relation operators implementing distribution operations. That is human vague reminiscence, not semantic correspondence.
>

I agree, and suggest that a distribution corresponds not just to a relation, but to a relation together with a partitioning of its attributes into two sets. For example, given the relation

Classes=[Prof Course Time]

         Libkin  DB101   Tue200
         Libkin  DB101   Thu500
         Gromov  Math    Tue200
         Gromov  Math    Thu500
         Vianu   DB101   Tue200

;

If we split the attributes as {Prof}|{Course,Time}, then we have the distribution below, where each probability is the fraction of tuples carrying that Prof value:

ProfessorDistribution=[Prof Probability]

         Libkin  0.4
         Gromov  0.4
         Vianu   0.2

;

The same can be done for the split {Course}|{Prof,Time}:

CourseDistribution=[Course Probability]

         DB101   0.6
         Math    0.4

;
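
The two distributions above can be reproduced by counting tuples. What follows is only a minimal sketch in Python, not QBQL; it assumes that every tuple of Classes carries equal weight, and the names ATTRS, CLASSES and distribution are made up for this illustration.

```python
from collections import Counter
from fractions import Fraction

# The Classes relation from the post, one (Prof, Course, Time) tuple per row.
ATTRS = ("Prof", "Course", "Time")
CLASSES = [
    ("Libkin", "DB101", "Tue200"),
    ("Libkin", "DB101", "Thu500"),
    ("Gromov", "Math", "Tue200"),
    ("Gromov", "Math", "Thu500"),
    ("Vianu", "DB101", "Tue200"),
]


def distribution(rows, attrs, chosen):
    """Group rows by the `chosen` attributes; return each group's share of all rows.

    Assumption (hypothetical helper): every row has the same weight.
    """
    idx = [attrs.index(a) for a in chosen]
    counts = Counter(tuple(row[i] for i in idx) for row in rows)
    total = len(rows)
    return {key: Fraction(n, total) for key, n in counts.items()}


# Split {Prof}|{Course,Time} and split {Course}|{Prof,Time}.
professor_distribution = distribution(CLASSES, ATTRS, ["Prof"])
course_distribution = distribution(CLASSES, ATTRS, ["Course"])

for name, dist in [("ProfessorDistribution", professor_distribution),
                   ("CourseDistribution", course_distribution)]:
    print(name)
    for key, p in dist.items():
        print("   ", key[0], float(p))
```

Exact fractions are used so that the probabilities sum to exactly 1; they are converted to floats only for printing.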

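The FD Prof->Course makes Course a function of Prof, so CourseDistribution is a coarsening of ProfessorDistribution and its entropy cannot be higher. Here it is strictly lower (about 0.97 bits versus 1.52 bits, using base-2 logarithms). For the calculation, see the link below.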
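
The entropy comparison can be checked numerically. This is a rough sketch only: it assumes Shannon entropy with base-2 logarithms (the linked post may use a different base or formulation), and the helper name entropy_bits is invented for this example.

```python
import math

# Probabilities copied from the two distributions in this message.
professor_distribution = {"Libkin": 0.4, "Gromov": 0.4, "Vianu": 0.2}
course_distribution = {"DB101": 0.6, "Math": 0.4}


def entropy_bits(dist):
    """Shannon entropy in bits: -sum(p * log2(p)) over the nonzero probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


h_prof = entropy_bits(professor_distribution)
h_course = entropy_bits(course_distribution)

print(f"H(ProfessorDistribution) = {h_prof:.3f} bits")
print(f"H(CourseDistribution)    = {h_course:.3f} bits")
print("Course entropy lower:", h_course < h_prof)
```

With a different logarithm base both values scale by the same constant factor, so the ordering of the two entropies would not change.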

http://vadimtropashko.wordpress.com/2012/01/26/caclulating-entropy-and-gini-index-for-a-partitioned-table/

Received on Thu Apr 12 2012 - 02:08:20 CEST
