Re: MERGE as the imperative form of aggregation
Date: 15 Apr 2007 10:28:22 -0700
Message-ID: <1176658102.296673.147230_at_n59g2000hsh.googlegroups.com>
On Apr 9, 12:06 pm, "Aloha Kakuikanu" <aloha.kakuik..._at_yahoo.com>
wrote:
> On Apr 7, 11:46 am, "Marshall" <marshall.spi..._at_gmail.com> wrote:
>
>
>
> > But there is MERGE.
>
> > Merge is annoying in that the way it is usually
> > specified it takes a lot of parameters. Wikipedia
> > gives its general form thusly:
>
> > MERGE INTO table_name USING table_name ON (condition)
> > WHEN MATCHED THEN
> > UPDATE SET column1 = value1 [, column2 = value2 ...]
> > WHEN NOT MATCHED THEN
> > INSERT column1 [, column2 ...] VALUES (value1 [, value2 ...])
>
> > It's clear why all those parameters are there, but it's
> > still a mess. Furthermore, I find that I only ever want to
> > use it in a more restricted sense. The UPDATE part
> > and the INSERT part are completely uncorrelated, but
> > that's more generality than is useful. In fact, the UPDATE
> > and the INSERT ought to be thought of as the same
> > transformation. The UPDATE is a transform on an
> > existing row, and the INSERT ought to be that same
> > transform applied to the identity for that transform.
> > The existing row must be uniquely identified (that is,
> > a key must be specified.)
>
> I personally didn't invest even a minute of my time looking into what
> Merge is -- you post changed that:-) First of all, there is a nice
> symmetry of insert and delete, and update is a combination of the
> insert and delete. So why don't we go along generalizing update,
> instead of introducing an ugly combination of insert and update?
That is certainly one way to go. In fact a friend has suggested that I ought to try to design a single generalized imperative statement: one that can insert, update, or delete. If I am ever able to come up with such a statement that has any conceptual integrity I might be more attracted to this idea. Every attempt has just resulted in an uncoordinated heap of parameters and special cases.
I find the symmetry and completeness of insert/delete to be
compelling. My reaction is that update is *already* an "ugly
combination of insert and delete."
> Next, there are insert, update, and delete triggers. Now merge is
Triggers, sigh. I have a poor grasp of what they are for and
what they mean from a theoretical standpoint. What are they
necessary for? I have seen good descriptions of how to
express view updating as triggers that model the inverse
view, but would that be necessary if we had full view updating
in the engine? What are triggers necessary for? I am clear
on the value of *notification* of changes being sent to other
components, but *actions* are less clear to me. Especially
"instead of" actions.
> expected to be combination of insert, update; naturally insert and
> update triggers are expected to be fired when issuing a merge
> statement?
> > Example: I am maintaining a count of something
> > on a per-foo basis. I have a table of foo ids and
> > counts. For all foo ids that are not present in the
> > count table, the count is 0. The sort of MERGE
> > I would do is
>
> > MERGE into FooCount f using
> > (select id, count from NewFooCount) f'
> > ON f.id = f'.id
> > WHEN MATCHED THEN
> > UDPATE f.count = f.count + f'.count
> > WHEN NOT MATCHED THEN
> > insert id, count values (id, count)
>
> > So, what else has a transform that has a concept of
> > identity and also specifies a key? An aggregate
> > with GROUP BY.
>
> > MERGE NewFooCount f' into FooCount f
> > GROUP BY id
> > SET f.count = sum(f.count, f'.count);
>
> > It's less general, but syntactically and conceptually
> > cleaner and simpler.
>
> > Thus, MERGE can be seen as the imperative form of
> > GROUP BY and aggregates.
>
> In a way insert and delete statements change counts too. Consider
> converting a relation into characteristic function:
>
> R(x,y,z) --> RC(x,y,z,cnt)
>
> where RC is an infinite relation. For every tuple (x,y,z) in the R, we
> have tuple (x,y,z,1) in the RC. For every tuple (x,y,z) which is not
> in the R,
> we have tuple (x,y,z,0) in the RC.
>
> Now, insert, delete, or update on the original relation is just an
> update of the cnt column of the RC.
Marshall Received on Sun Apr 15 2007 - 19:28:22 CEST