Re: SQL Update with a Join Syntax

From: --CELKO-- <jcelko212_at_earthlink.net>
Date: 7 Aug 2004 16:32:22 -0700
Message-ID: <18c7b3c2.0408071532.174d4354_at_posting.google.com>


>> [UPDATE with FROM clause] it's very standard SQL (not that that means anything to Oracle). <<

Nope! There is no FROM clause in a Standard SQL UPDATE statement; it would make no sense. Other products (SQL Server, Sybase and Ingres) also use the UPDATE .. FROM syntax, but with **different** semantics. So it does not port, or even worse, when you do move it, it trashes your database. Other programmers cannot read it and maintaining it is harder.

The correct syntax for a searched update statement is

<update statement> ::=

  UPDATE <table name>
     SET <set clause list>
  [WHERE <search condition>]

<set clause list> ::=

  <set clause> [{ , <set clause> }...]

<set clause> ::= <object column> = <update source>

<update source> ::= <value expression> | NULL | DEFAULT

<object column> ::= <column name>

The UPDATE clause simply gives the name of the base table or updatable view to be changed.

Notice that no correlation name is allowed in the UPDATE clause; this is to avoid some self-referencing problems that could occur. But it also follows the data model in Standard SQL. When you give a table expression a correlation name, it is to act as if a materialized table with that correlation name has been created in the database. That table then is dropped at the end of the statement. If you allowed correlation names in the UPDATE clause, you would be updating the materialized table, which would then disappear and leave the base table untouched.

The SET clause is a list of columns to be changed or made; the WHERE clause tells the statement which rows to use. For this discussion, we will assume the user doing the update has applicable UPDATE privileges for each <object column>.

  • The WHERE Clause

As mentioned, the most important thing to remember about the WHERE clause is that it is optional. If there is no WHERE clause, all rows in the table are changed. This is a common error; if you make it, immediately execute a ROLLBACK statement.

All rows that test TRUE for the <search condition> are marked as a subset and not as individual rows. It is also possible that this subset will be empty. This subset is used to construct a new set of rows that will be inserted into the table when the subset is deleted from the table. Note that the empty subset is a valid update that will fire declarative referential actions and triggers.

  • The SET Clause

Each assignment in the <set clause list> is executed in parallel and each SET clause changes all the qualified rows at once. Or at least that is the theoretical model. In practice, implementations will first mark all of the qualified rows in the table in one pass, using the WHERE clause. If there were no problems, then the SQL engine makes a copy of each marked row in working storage. Each SET clause is executed based on the old row image and the results are put in the new row image. Finally, the old rows are deleted and the new rows are inserted. If an error occurs during all of this, then system does a ROLLBACK, the table is left unchanged and the errors are reported. This parallelism is not like what you find in a traditional third-generation programming language, so it may be hard to learn. This feature lets you write a statement that will swap the values in two columns, thus:

UPDATE MyTable
SET a = b, b = a;

This is not the same thing as

BEGIN ATOMIC
UPDATE MyTable
SET a = b;
UPDATE MyTable
SET b = a;
END; In the first UPDATE, columns a and b will swap values in each row. In the second pair of UPDATEs, column a will get all of the values of column b in each row. In the second UPDATE of the pair, a, which now has the same value as the original value of b, will be written back into column b -- no change at all. There are some limits as to what the value expression can be. The same column cannot appear more than once in a <set clause list> -- which makes sense, given the parallel nature of the statement. Since both go into effect at the same time, you would not know which SET clause to use.

If a subquery expression is used in a <set clause>, and it returns a single value, the result set is cast to a scalar; if it returns an empty set, the result set is cast to a NULL; if it returns multiple rows, a cardinality violation is raised.

>> JOINS should never be done in the WHERE clause. That's an archaic construction. They should be done using the appropriate JOIN syntax; this keeps the distinction between joins and restricts clear. <<

A lot of newbies (read: SQL Server by way of ACCESS) think that the ON clause must hold the join condition and the WHERE clause must hold the restrictions. This is not true. It can make for some nice code, and might help a smart optimizer materialize common subexpressions across mulitple queries, show you what should be in a VIEW, etc. But it is not required.

>> .. but it isn't logical to write Value = COLUMN. <<

My Arab and Chinese students used to do that, too! Received on Sun Aug 08 2004 - 01:32:22 CEST

Original text of this message