Re: Ordering dependency problem

From: Paul G. Brown <paul_geoffrey_brown_at_yahoo.com>
Date: 17 Jan 2002 21:45:26 -0800
Message-ID: <57da7b56.0201172145.5c53a45b_at_posting.google.com>


71062.1056_at_compuserve.com (--CELKO--) wrote in message news:<c0d87ec0.0201171044.656f8b4_at_posting.google.com>...
> I think this is the answer you wanted.

  It is a fine answer, Joe. Though I'm note sure that it is quite the answer   the original poster was asking for. :-)

  Stonebraker's example is a bit forced, though I'm not sure why you think it   isn't normalized: the relationship need not be transitive for the example   to exhibit the behavior described. A simple cyclic constraint illustrates   the problem, and it's quite possible to construct other examples with the   same theme. The problem is not strictly one of data modelling; it relates   to DBMS engine implementation.

  My advice is to look up "Halloween Problem", which is a similar example of   a situation where a naive physical model makes for complications at the level   of the logical model.

  CREATE TABLE Employees (

      Id         VARCHAR(12)    PRIMARY KEY,
      Department CHAR(6)        NOT NULL,
      Salary     DECIMAL(12,2)  NOT NULL
  );

   Suppose there happens to be an index on (Department, Salary).

  The query goes:

   UPDATE Employees

      SET Salary = Salary * 1.1
    WHERE Department = 'shoe'
      AND Salary > 10000.0;

   Now, the obvious query plan is to scan the Employees table via the index    mentioned above. The problem is that if you update the index structure    at the same time that you update the records in the table, then the query    may never terminate! Modified index elements are moved ahead of the scan    point in the index and are subsequently re-read as the index scan navigates    through the structure, are modified again, and moved forward again, and so    on.

   As Joe points out, though, from the perspective of a SQL level programmer,    this is a bug. But how to prevent it?

   Products vary in the details, but the rough answer is that you need to be    very, very careful with UPDATE queries. Each UPDATE ends up maintaining a    list (hash table) identifying the rows it has already touched, and it needs    to check each index entry being read by the index scan against that list    before deciding whether or not the row is "visible" to the scan. Other    approaches call for a transaction ID and query Id to be assigned to every    row and for the query processor to check this value before modifying the    row (don't laugh: this is what Postgres did, and it offers the absolute    best thing in crash recovery).

   Anyway, I hope this contributes something.

      KR

                Pb
Received on Fri Jan 18 2002 - 06:45:26 CET

Original text of this message