Re: MERGE as the imperative form of aggregation

From: Marshall <marshall.spight_at_gmail.com>
Date: 15 Apr 2007 10:28:22 -0700
Message-ID: <1176658102.296673.147230_at_n59g2000hsh.googlegroups.com>


On Apr 9, 12:06 pm, "Aloha Kakuikanu" <aloha.kakuik..._at_yahoo.com> wrote:
> On Apr 7, 11:46 am, "Marshall" <marshall.spi..._at_gmail.com> wrote:
>
>
>
> > But there is MERGE.
>
> > Merge is annoying in that the way it is usually
> > specified it takes a lot of parameters. Wikipedia
> > gives its general form thusly:
>
> > MERGE INTO table_name USING table_name ON (condition)
> > WHEN MATCHED THEN
> > UPDATE SET column1 = value1 [, column2 = value2 ...]
> > WHEN NOT MATCHED THEN
> > INSERT column1 [, column2 ...] VALUES (value1 [, value2 ...])
>
> > It's clear why all those parameters are there, but it's
> > still a mess. Furthermore, I find that I only ever want to
> > use it in a more restricted sense. The UPDATE part
> > and the INSERT part are completely uncorrelated, but
> > that's more generality than is useful. In fact, the UPDATE
> > and the INSERT ought to be thought of as the same
> > transformation. The UPDATE is a transform on an
> > existing row, and the INSERT ought to be that same
> > transform applied to the identity for that transform.
> > The existing row must be uniquely identified (that is,
> > a key must be specified.)
>
> I personally didn't invest even a minute of my time looking into what
> Merge is -- you post changed that:-) First of all, there is a nice
> symmetry of insert and delete, and update is a combination of the
> insert and delete. So why don't we go along generalizing update,
> instead of introducing an ugly combination of insert and update?

That is certainly one way to go. In fact a friend has suggested that I ought to try to design a single generalized imperative statement: one that can insert, update, or delete. If I am ever able to come up with such a statement that has any conceptual integrity I might be more attracted to this idea. Every attempt has just resulted in an uncoordinated heap of parameters and special cases.

I find the symmetry and completeness of insert/delete to be compelling. My reaction is that update is *already* an "ugly combination of insert and delete."

Insert can be seen as a simple imperative generalization of union. It can be even the non-relational union, since it doesn't make much sense for imperative operations to alter the type of the variable. Delete is usually thought of as a subtraction from a set by a subset thereof determined by a characteristic function, but we could instead consider it a join (specifically, an intersection!) of a set and a subset of that set determined by the negation of that characteristic function.

So that's what got me to thinking about trying to make a closer correspondence between the imperative and algebraic operators. I have to say I find the idea appealing.

> Next, there are insert, update, and delete triggers. Now merge is
> expected to be combination of insert, update; naturally insert and
> update triggers are expected to be fired when issuing a merge
> statement?

Triggers, sigh. I have a poor grasp of what they are for and what they mean from a theoretical standpoint. What are they necessary for? I have seen good descriptions of how to express view updating as triggers that model the inverse view, but would that be necessary if we had full view updating in the engine? What are triggers necessary for? I am clear on the value of *notification* of changes being sent to other components, but *actions* are less clear to me. Especially "instead of" actions.

> > Example: I am maintaining a count of something
> > on a per-foo basis. I have a table of foo ids and
> > counts. For all foo ids that are not present in the
> > count table, the count is 0. The sort of MERGE
> > I would do is
>
> > MERGE into FooCount f using
> > (select id, count from NewFooCount) f'
> > ON f.id = f'.id
> > WHEN MATCHED THEN
> > UDPATE f.count = f.count + f'.count
> > WHEN NOT MATCHED THEN
> > insert id, count values (id, count)
>
> > So, what else has a transform that has a concept of
> > identity and also specifies a key? An aggregate
> > with GROUP BY.
>
> > MERGE NewFooCount f' into FooCount f
> > GROUP BY id
> > SET f.count = sum(f.count, f'.count);
>
> > It's less general, but syntactically and conceptually
> > cleaner and simpler.
>
> > Thus, MERGE can be seen as the imperative form of
> > GROUP BY and aggregates.
>
> In a way insert and delete statements change counts too. Consider
> converting a relation into characteristic function:
>
> R(x,y,z) --> RC(x,y,z,cnt)
>
> where RC is an infinite relation. For every tuple (x,y,z) in the R, we
> have tuple (x,y,z,1) in the RC. For every tuple (x,y,z) which is not
> in the R,
> we have tuple (x,y,z,0) in the RC.
>
> Now, insert, delete, or update on the original relation is just an
> update of the cnt column of the RC.

Ah! That's much like how I think of boolean functions. They could be thought of as total functions from domain D to boolean:

  f: D -> boolean

or they could be thought of as partial functions to 0 return values:

  f: D ->?

(Here "->" indicates a total function and "->?" indicates a partial function.)

Marshall Received on Sun Apr 15 2007 - 19:28:22 CEST

Original text of this message