Re: Idempotence and "Replication Insensitivity" are equivalent ?

From: Marshall <marshall.spight_at_gmail.com>
Date: 20 Sep 2006 19:41:40 -0700
Message-ID: <1158806500.319923.130250_at_m7g2000cwm.googlegroups.com>


William Hughes wrote:
> Chris Smith wrote:
> > <pamelafluente_at_libero.it> wrote:
> > > What confuses me a little is that I thought that math is a (kind of)
> > > exact science. But here everything (proofs and definitions) seems that
> > > can be subject to opinion and I do not see resolutive statements !
> >
> > Then I've failed in being clear. Certainly, math is more or less exact.
> > The ambiguous bit here is how we define what we're talking about.
> >
> > Basically, the story so far is that I tried to provide a more precise
> > definition for what Marshall meant when he said that your "replication
> > insensitive" functions were equivalent to the set of aggregate functions
> > that are defined by idempotent binary functions.
>
> And as pointed out, only a subclass of functions on M(A)
> (henceforth turquoise functions) can be definied by binary
> functions.

Sure. Of course, my formulation of aggregates is not limited just to binary functions.

Recall:

Type 1:
  f : (A, B -> B)
  e: B
creates an aggregate of type
  {A} -> B

(A, B not necessarily distinct.)

Type 1b:
  f: (A, A -> A)
creates an aggregate of type
  {A} -> A

Type 2:
  an arbitrary expression that may include terms of type 1.

(Alternatively you could think of type 1 being simply the case of type 2 where the arbitrary expression is the identity function.)

> Marshall's comments can only apply to this
> subset [whether the subset should be called the aggregate
> functions is a different issue].

Well, at this point I've lost track of which comments you're referring to. The conversation is getting long.

> > First, I came up with a way fo defining aggregate functions from binary
> > functions that I thought was sufficient. For the functions I defined,
> > it *is* true that replication sensitivity is equivalent to idempotence
> > of a binary generating function. However, it turns out that the
> > functions I initially defined don't include everything that's commonly
> > called an aggregate function.
> >
>
> And more importantly does not include commonly used
> turqoise functions (i.e. median)
>
> > So I then made a second attempt to define aggregate functions. Marshall
> > also made an attempt, which I believe is equivalent to my second
> > attempt.
>
> Well, it is less than clear that this second attempt does include
> functions such as the median,you need to be much more
> careful with your definition of the range and domain of g.
> (g: A^2 -> A doesn't work).

A, A -> A is indeed too restrictive. A, B -> B is the most general form. But you also need the final function, if you want to address even such relatively simple aggregates as avg().

> But note, it only makes sense
> to talk about idempotence for g:A^2->A.

Agreed.

> So you can either
> restrict g and thus the class of turquoise functions that
> you are dealing with, and retain the notion of idempotence,
> or you can use a more general g, and abandon idempotence.
> You cannot apply the concept of idempotence to a larger
> class of turquoise functions by changing the definition of g.

I don't see why not. If you have a type A, B -> B, you can still talk about idempotence. It just implies that A = B.

> >This second attempt does seem to include all interesting
> > aggregate functions,

In fact, I think it includes all computable functions.

> > but I believe that one of the possible definitions
> > of your NOR function is a counter-example to the equivalence of binary
> > idempotence and replication insensitivity. Specifically, that
> > equivalence is only true when x_0 (which Marshall calls e) is an
> > identity for the binary function.

Ah! Rereading this, I think I get it. For example:

f = OR
e = T

f is idempotent, but the resulting aggregate will be duplication-insensitive,
because it is in fact the constant function T. So in the form of the fold where you supply a starting value, the starting value has to be an identity for f. (Which it is in the common cases, but it isn't
a requirement.)

> > We have been talking about (and even proving) statements about both of
> > these two mathematical structures. The only ambiguity relates to which
> > best captures the very imprecise discussion that was going on before.
> > For example, early in this thread, Marshall wrote "If the binary form of
> > the aggregate function is idempotent, the aggregate will return the same
> > value even if values are repeated arbitrarily." -- and that initially
> > raised the question of what is meant by the binary form of an aggregate
> > function. Unfortunately, mathematics has no answer to the question of
> > what Marshall meant when he wrote "the binary form of the aggregate
> > function", so that's why the conversation isn't so cut and dry.
> >
>
> No, but I see no chance of a definition that will allow all
> interesting turquoise functions to be generated by binary forms
> without torturing the term "binary form" beyond endurance.
> [E.g. you could define a binary form to be a function
> g: M(A) x A -> M(A), but in this case g takes an arbitrary
> number of elements of A, so calling it a binary form
> is stretching things]

I also wouldn't call your g above a binary function. However you *can* have this:

f: A, {A} -> {A}
f(a, S) = {a} union S

and now you've made all the elements of the input available to the final expression. Admittedly at this point the framework isn't buying you anything, but there are plenty of common cases where it *does* buy you something, and this example shows that it doesn't cost you anything in terms of expressiveness.

Marshall Received on Thu Sep 21 2006 - 04:41:40 CEST

Original text of this message