# Re: what are keys and surrogates?

Date: Fri, 11 Jan 2008 05:45:32 GMT

Message-ID: <0oDhj.6005$xA6.1172_at_trndny09>

"David BL" <davidbl_at_iinet.net.au> wrote in message
news:a8ec9dd4-ab6c-4117-980f-003328677c20_at_e10g2000prf.googlegroups.com...

> On Jan 11, 4:28 am, "David Cressey" <cresse..._at_verizon.net> wrote:
> > "David BL" <davi..._at_iinet.net.au> wrote in message
> > news:1d8bc808-c202-45bd-8d04-5ad80bb895ef_at_n22g2000prh.googlegroups.com...
> > > On Jan 10, 5:05 pm, "David Cressey" <cresse..._at_verizon.net> wrote:
> > > > "David BL" <davi..._at_iinet.net.au> wrote in message
> > > > Off topic.
> >
> > > > I prefer quantified as the difference in entropy between the state that
> > > > includes d and the state that excludes it. I believe that, except for a
> > > > scale factor, the two measure boil down to the same thing, except for one
> > > > subtle difference:
> >
> > > > Using entropy as the measure enables one to consider information content as
> > > > being context sensitive. That is, if d is to be included in some other
> > > > database e, then the information provided by d to e is the entropy
> > > > difference between e and e+d (where "+" is suitably defined).
> >
> > > Are you suggesting that when d is included in e, there are less states
> > > available for d?
> >
> > No. Did I say something that implies that?

> Perhaps not. My understanding is that entropy is defined as a
> logarithm on the number of states available to a system, and tends to
> be proportional to the number of bits required to represent a
> particular state. When two *independent* systems s1,s2 are combined
> into a single overall system s = s1 + s2, the total number of states
> available to s is the product of the number of states available to s1
> and s2, and by property of logarithms, the entropy is additive.
>
> I thought your comment had something to do with coupling between d and
> e. ie there being less available states for d in the context of e,
> which is why you suggested an entropy measure of information content.

Your understanding is the same as mine.
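To make the additivity point concrete, here is a minimal sketch in Python. It assumes every state is equally likely, so entropy is just the base-2 logarithm of the state count. The function name `entropy_bits` and the state counts 8 and 4 are made up for illustration.

```python
import math


def entropy_bits(num_states: int) -> float:
    """Entropy in bits of a system with num_states equally likely states."""
    return math.log2(num_states)


# Hypothetical state counts for two independent systems s1 and s2.
s1_states = 8
s2_states = 4

# Independent systems: every state of s1 can pair with every state of s2,
# so the combined system has the product of the two state counts.
combined_states = s1_states * s2_states

h1 = entropy_bits(s1_states)
h2 = entropy_bits(s2_states)
h_combined = entropy_bits(combined_states)

# log2(a * b) == log2(a) + log2(b), so the entropies add.
print(h1, h2, h_combined)
```

If the two systems are coupled, some pairings are ruled out, the combined state count is smaller than the product, and the combined entropy is less than the sum.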

I did intend some sort of coupling, but I don't completely understand your response to my comment.

Consider the following scenarios:

Case 1.

d contains "There is a person named Bob, and his age is 45."
e contains "There is a person named Bob"

Case 2.

d contains "There is a person named Bob, and his age is 45."
e contains "There might or might not be a person named Bob"

Case 3.

d contains "There is a person named Bob, and his age is 45."
e contains "There is no person named Bob"

If we ask how much entropy d+e holds, when compared to e alone, we get this.

Case 1 provides less additional information than Case 2. Case 3 puts d+e in a self-contradictory state.
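One way to put rough numbers on Cases 1 to 3 is the toy model below. Everything in it is an assumption made for illustration, not something established in this thread: a database is modeled as the set of "possible worlds" consistent with it, all worlds are equally likely, ages run from 0 to 127, and "+" is taken to be set intersection. The names `information_gain`, `ALL_WORLDS` and `NO_BOB` are hypothetical.

```python
import math
from typing import FrozenSet, Optional, Tuple

# A world is (bob_exists, age). Toy assumption: ages range over 0..127.
World = Tuple[bool, Optional[int]]

NO_BOB: World = (False, None)
ALL_WORLDS: FrozenSet[World] = frozenset(
    {NO_BOB} | {(True, age) for age in range(128)}
)


def information_gain(e: FrozenSet[World], d: FrozenSet[World]) -> Optional[float]:
    """Bits of uncertainty removed from e by combining it with d.

    Assumes "+" is intersection of possible worlds and that worlds are
    equally likely. Returns None when e and d contradict each other.
    """
    combined = e & d
    if not combined:
        return None  # no world satisfies both: d+e is self-contradictory
    return math.log2(len(e)) - math.log2(len(combined))


# d: "There is a person named Bob, and his age is 45."
d = frozenset({(True, 45)})

e_case1 = frozenset(w for w in ALL_WORLDS if w[0])  # "There is a person named Bob"
e_case2 = ALL_WORLDS                                # "There might or might not be..."
e_case3 = frozenset({NO_BOB})                       # "There is no person named Bob"

gain1 = information_gain(e_case1, d)
gain2 = information_gain(e_case2, d)
gain3 = information_gain(e_case3, d)
print(gain1, gain2, gain3)
```

Under these assumptions Case 1 comes out smaller than Case 2, because e has already ruled out the "no Bob" world, and Case 3 returns None because no world satisfies both d and e.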

Then there's Case 4.

d contains "There is a person named Bob, and his age is 45."
e contains "There is a person named Bob, and his age is 45."

d+e adds no information to e (with respect to the subset of data under discussion).

Received on Fri Jan 11 2008 - 06:45:32 CET