Re: Ordering dependency problem

From: Paul G. Brown <paul_geoffrey_brown_at_yahoo.com>
Date: 17 Jan 2002 21:45:26 -0800
Message-ID: <57da7b56.0201172145.5c53a45b_at_posting.google.com>

71062.1056_at_compuserve.com (--CELKO--) wrote in message news:<c0d87ec0.0201171044.656f8b4_at_posting.google.com>...
> I think this is the answer you wanted.

It is a fine answer, Joe. Though I'm note sure that it is quite the answer the original poster was asking for. :-)

Stonebraker's example is a bit forced, though I'm not sure why you think it isn't normalized: the relationship need not be transitive for the example to exhibit the behavior described. A simple cyclic constraint illustrates the problem, and it's quite possible to construct other examples with the same theme. The problem is not strictly one of data modelling; it relates to DBMS engine implementation.

My advice is to look up "Halloween Problem", which is a similar example of a situation where a naive physical model makes for complications at the level of the logical model.

CREATE TABLE Employees (

      Id         VARCHAR(12)    PRIMARY KEY,
      Department CHAR(6)        NOT NULL,
      Salary     DECIMAL(12,2)  NOT NULL

);

Suppose there happens to be an index on (Department, Salary).

The query goes:

UPDATE Employees

      SET Salary = Salary * 1.1
    WHERE Department = 'shoe'
      AND Salary > 10000.0;

Now, the obvious query plan is to scan the Employees table via the index mentioned above. The problem is that if you update the index structure at the same time that you update the records in the table, then the query may never terminate! Modified index elements are moved ahead of the scan point in the index and are subsequently re-read as the index scan navigates through the structure, are modified again, and moved forward again, and so on.

As Joe points out, though, from the perspective of a SQL level programmer, this is a bug. But how to prevent it?

Products vary in the details, but the rough answer is that you need to be very, very careful with UPDATE queries. Each UPDATE ends up maintaining a list (hash table) identifying the rows it has already touched, and it needs to check each index entry being read by the index scan against that list before deciding whether or not the row is "visible" to the scan. Other approaches call for a transaction ID and query Id to be assigned to every row and for the query processor to check this value before modifying the row (don't laugh: this is what Postgres did, and it offers the absolute best thing in crash recovery).

Anyway, I hope this contributes something.

Pb

Received on Fri Jan 18 2002 - 06:45:26 CET

This message: [ Message body ]
Next message: Paul G. Brown: "Re: Polygon data modelling question"
Previous message: Qingqing: "Re: Ordering dependency problem"
Maybe in reply to: Qingqing: "Ordering dependency problem"
Next in thread: Paul G. Brown: "Re: Polygon data modelling question"
Reply: Qingqing: "Re: Ordering dependency problem"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message