Re: row vs row.column level locking

From: Serge Rielau <srielau_at_ca.ibm.com>
Date: Sat, 17 Dec 2005 20:19:14 -0500
Message-ID: <40jrokF1b00tlU1@individual.net>

Mladen Gogala wrote:
> On Sat, 17 Dec 2005 18:20:59 -0500, Serge Rielau wrote:
>
>

>>Well, we agree on the price tag at least.
>>Now I have no clue how statistical column correlation has anything to do 
>>with locking granularity.

>
>
> Serge, my suspicion is the following: column C1 has some kind of
> relationship (business, logical, statistical?) with the column C2.
> Uncontrolled simultaneous update of the two columns by two unrelated
> processes might break that, possibly important, relationship thereby
> producing logical corruption of the database.
>
> I know that relational theorists cringe on the thought of related
> (derivable) data in two different columns, but sometimes it is necessary
> to have that. In the large HMO that I was working for, medical providers
> (also known as "doctors") had two types of ID: new HIPPA Id and the
> old, "local" provider ID. There was also 1-1 mapping between them, the
> local ID was the primary key, while the new HIPPA ID was made a unique
> key. Those columns were obviously related, while the relationship among
> them was not immediately apparent. Any transaction which would update one,
> without updating another would break the mapping rules, which were crucial
> in the year 2003 (deadline for the HIPPA implementation). In other words,
> database which would allow independent updates of both columns
> simultaneously would have a serious potential to cause significant
> business damage, with all ramifications (fines up to $1M /day were
> threatened to HMOs which were not HIPPA compliant by the end of 2003).
>
> What makes me particularly uncomfortable is transition from the relational
> model, which contains n-tuples as its most basic element to something that
> might be called "spreadsheet model" as each "cell" becomes individually
> addressable (and lockable) by its "coordinates". From the mathematical
> point of view, a relation is, simply put, a subset of a Cartesian product.
> Elements of a Cartesian product of sets A and B are n-tuples (or "ordered
> pairs") (a,b) where a belongs to A and b belongs to B. If you operate on a
> relation, you operate on its elements, which are n-tuples, not on its
> individual "coordinates". I'm also afraid that all ramifications of the
> switching the underlying mathematical model are not yet well understood
> and have potential to introduce tectonic changes which global business is
> not yet aware of. Also, in that case, I'd suggest changing name to SDBMS
> instead of RDBMS (Spreadsheet Database Management System).
>

In order for this application to be dangerous it would have to enable updating only one of the columns in this functional dependency to begin with. That is if only one "user" is connected. Since to change both columns (or cells is a pivoted table) need to be locked. There is no danger there I believe as long as read transactions are properly implemented by the RDBMS (that is read locks but be held to enforce serializability).

When we take into account non serializable isolation levels then things get a whole lot more interesting. This, btw, is independent of whether we consider snapshot isolation, read stability or cursor stability. (just to point out that this isn't an Oracle vs. DB2 issue) Each of these "lesser" isolation levels will allow an application to behave dangerous. Column level locking (or pivoting tables) will only add another shade to this danger.

Anyway, it appears I at least start repeating myself and given that no vendor supports such a locking level I'll just leave you with this final comment.

Cheers
Serge

-- 
Serge Rielau
DB2 Solutions Development
DB2 UDB for Linux, Unix, Windows
IBM Toronto Lab

Received on Sat Dec 17 2005 - 19:19:14 CST