MERGE as the imperative form of aggregation
Date: 7 Apr 2007 12:46:24 -0700
Message-ID: <1175975183.973567.299670_at_l77g2000hsb.googlegroups.com>
INSERT is the imperative form UNION. There are some restrictions, because imperative statements must operate on variables, and so are restricted in what they can do about types.
declare newRows;
var v;
v = v UNION newRows;
does the same thing as
INSERT into v values newRows;
Because we are assigning the result of the union back into v, newRows and v have to have the same header.
Similarly, DELETE is the imperative form
of SUBTRACT. DELETE could be defined
to remove a relation, but instead it is usually
expressed as a predicate P(x), where the
type of x is the header of v. (Tuple type instead
of relation type.)
v = v MINUS (v where P(x));
DELETE from v where P(x);
UPDATE is a little weird; it specifies
both an indicator predicate, like DELETE
does, and also a function f:x->x', where
the type of x and x' is the row type. It
can also be expressed as a delete and
an insert accordingly.
For a while I had the idea that there would likely be a whole bunch of useful imperative statements, defined in terms of assignment and relational operators. But that doesn't seem to be true. Outside of INSERT/DELETE/UPDATE there doesn't seem to be much.
But there is MERGE.
Merge is annoying in that the way it is usually
specified it takes a lot of parameters. Wikipedia
gives its general form thusly:
MERGE INTO table_name USING table_name ON (condition)
WHEN MATCHED THEN
It's clear why all those parameters are there, but it's
still a mess. Furthermore, I find that I only ever want to
use it in a more restricted sense. The UPDATE part
and the INSERT part are completely uncorrelated, but
that's more generality than is useful. In fact, the UPDATE
and the INSERT ought to be thought of as the same
transformation. The UPDATE is a transform on an
UPDATE SET column1 = value1 [, column2 = value2 ...]
WHEN NOT MATCHED THEN
INSERT column1 [, column2 ...] VALUES (value1 [, value2 ...])
Example: I am maintaining a count of something
on a per-foo basis. I have a table of foo ids and
counts. For all foo ids that are not present in the
count table, the count is 0. The sort of MERGE
I would do is
MERGE into FooCount f using
So, what else has a transform that has a concept of
identity and also specifies a key? An aggregate
with GROUP BY.
MERGE NewFooCount f' into FooCount f
It's less general, but syntactically and conceptually
(select id, count from NewFooCount) f'
ON f.id = f'.id
WHEN MATCHED THEN
UDPATE f.count = f.count + f'.count
WHEN NOT MATCHED THEN
insert id, count values (id, count)
GROUP BY id
SET f.count = sum(f.count, f'.count);
Thus, MERGE can be seen as the imperative form of GROUP BY and aggregates.
Comments, questions, criticisms?
Marshall Received on Sat Apr 07 2007 - 21:46:24 CEST