Re: what are keys and surrogates?

From: David Cressey <cressey73_at_verizon.net>
Date: Fri, 11 Jan 2008 05:45:32 GMT
Message-ID: <0oDhj.6005$xA6.1172_at_trndny09>


"David BL" <davidbl_at_iinet.net.au> wrote in message news:a8ec9dd4-ab6c-4117-980f-003328677c20_at_e10g2000prf.googlegroups.com...
> On Jan 11, 4:28 am, "David Cressey" <cresse..._at_verizon.net> wrote:
> > "David BL" <davi..._at_iinet.net.au> wrote in message
> >
> >
> > news:1d8bc808-c202-45bd-8d04-5ad80bb895ef_at_n22g2000prh.googlegroups.com...
> > > On Jan 10, 5:05 pm, "David Cressey" <cresse..._at_verizon.net> wrote:
> > > > "David BL" <davi..._at_iinet.net.au> wrote in message

> > > > Off topic.
> >
> > > > I prefer it quantified as the difference in entropy between the
> > > > state that includes d and the state that excludes it. I believe
> > > > that, except for a scale factor, the two measures boil down to the
> > > > same thing, except for one subtle difference:
> >
> > > > Using entropy as the measure enables one to consider information
> > > > content as being context sensitive. That is, if d is to be
> > > > included in some other
> > > > database e, then the information provided by d to e is the entropy
> > > > difference between e and e+d (where "+" is suitably defined).
> >
> > > Are you suggesting that when d is included in e, there are fewer
> > > states available for d?
> >
> > No. Did I say something that implies that?

>

> Perhaps not. My understanding is that entropy is defined as a
> logarithm of the number of states available to a system, and tends to
> be proportional to the number of bits required to represent a
> particular state. When two *independent* systems s1,s2 are combined
> into a single overall system s = s1 + s2, the total number of states
> available to s is the product of the number of states available to s1
> and s2, and by the properties of logarithms, the entropy is additive.
>

> I thought your comment had something to do with coupling between d and
> e, i.e. there being fewer available states for d in the context of e,
> which is why you suggested an entropy measure of information content.
>

Your understanding is the same as mine.
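
Concretely (with toy state counts of my own, just to pin the idea down), here
is that picture in a few lines of Python:

import math

# Toy numbers, purely to illustrate the point above.
def entropy_bits(num_states):
    """Entropy of a system as the base-2 log of its number of states."""
    return math.log2(num_states)

n1, n2 = 8, 4                                  # states available to s1 and s2
h1, h2 = entropy_bits(n1), entropy_bits(n2)    # 3 bits and 2 bits

# For *independent* systems, the combined system s = s1 + s2 has
# n1 * n2 states, so the entropies simply add.
h_s = entropy_bits(n1 * n2)                    # 5 bits
assert h_s == h1 + h2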

I did intend some sort of coupling, but I don't completely understand your response to my comment.

Consider the following scenarios:

Case 1.
d contains "There is a person named Bob, and his age is 45." e contains "There is a person named Bob"

Case 2.
d contains "There is a person named Bob, and his age is 45." e contains "There might or might not be a person named Bob"

Case 3.
d contains "There is a person named Bob, and his age is 45." e contains "There is no person named Bob"

If we ask how the entropy of d+e compares with that of e alone (that is, how much information d adds to e), we get this.

Case 1 provides less additional information than Case 2. Case 3 puts d+e in a self-contradictory state.

Then there's Case 4.

d contains "There is a person named Bob, and his age is 45." e contains "There is a person named Bob, and his age is 45."

d+e adds no information to e (wrt the subset of data under discussion).
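
For what it's worth, here is a toy possible-worlds count that bears the
comparison out (my own sketch; the age range 0..99 is an arbitrary
assumption, not anything from the example):

import math

# A world is either None ("there is no Bob") or Bob's age.
WORLDS = [None] + list(range(100))

def entropy_bits(worlds):
    return math.log2(len(worlds))

def d(w):
    return w == 45                       # "There is a Bob, and his age is 45"

cases = {
    "Case 1": lambda w: w is not None,   # "There is a person named Bob"
    "Case 2": lambda w: True,            # "There might or might not be a Bob"
    "Case 3": lambda w: w is None,       # "There is no person named Bob"
    "Case 4": lambda w: w == 45,         # e asserts exactly what d asserts
}

for name, e in cases.items():
    e_worlds = [w for w in WORLDS if e(w)]
    de_worlds = [w for w in e_worlds if d(w)]
    if not de_worlds:
        print(name, "-> d+e is contradictory (no world satisfies both)")
    else:
        gained = entropy_bits(e_worlds) - entropy_bits(de_worlds)
        print(name, "-> d adds %.2f bits to e" % gained)

On that toy model d adds about 6.6 bits in Cases 1 and 2 (slightly less in
Case 1, since e has already ruled out the "no Bob" world), Case 3 leaves no
consistent world at all, and Case 4 adds 0 bits.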
