Re: Theoretical Basis for SELECT FOR UPDATE

From: Jonathan Leffler <jleffler_at_earthlink.net>
Date: Wed, 05 Oct 2005 04:53:15 GMT
Message-ID: <%_I0f.7088$oc.3991_at_newsread2.news.pas.earthlink.net>


vc wrote:

> Jonathan Leffler wrote:
> 

>>vc wrote:
>>
>>>Tony Andrews wrote:
>>>
>>>
>>>>vc wrote:
>>>>
>>>>
>>>>>Please explain what I am missing.
>>>>
>>>>Well, I'm not trying to justify the TTM position, just explain it as I
>>>>understand it! You are right, nothing prevents you from writing the
>>>>two updates as separate statement (and so separate transactions) -
>>>>unless there are constraints in place that would be violated by doing
>>>>so. TTM also proscribes deferred constraint checking (in fact, the
>>>>term becomes meaningless if there are no multi-statement transactions).
>>>>Suppose there is a constraint to the effect that the sum of all
>>>>account balances must always be zero. In the multi-statement
>>>>transaction:
>>>>
>>>>update account set bal=bal+10 where ac_no=123;
>>>>-- Constraint is currently violated (must be a deferred constraint)
>>>>update account set bal=bal-10 where ac_no=456;
>>>>commit;
>>>>-- Constraint is met
>>>>
>>>>In the TTM version:
>>>>
>>>>update account set bal=bal+10 where ac_no=123,
>>>>update account set bal=bal-10 where ac_no=456;
>>>>-- Constraint is met
>>>>
>>>>Again, there is no place in the transaction where you could potentially
>>>>see inconsistent data.
>>>
>>>
>>> Is the sequence update,select,update; legal ? If so, you can
>>> 'see' the constraint violation.
>>
>>No; then the constraint has been changed - and is more complex.
> 
> 
> What is the more complex constraint for the u,s,u; sequence then ?

Good question...which also begs the other question - what was the original constraint.

The original constraint in a banking system is likely to be moderately complex, but the basic constraint is that at the boundaries of a unit of work (UOW - aka statement or transaction, depending on context), the total amount of money in all accounts matches some total. A transfer between accounts, for example, is not allowed to break that balance; there must be a debit and a credit.

Now, in the simpler case, the constrain applies at the end of each statement. So, the double update is readily seen to meet the constraint. In the 'u,s,u' sequence, the statements are separate and the constraint would be violated at the end of the first update.

>> You no longer have a zero-sum system unless you account for 'money
>> in transit' that is not in any account. Further, since the money
>> here is added to one account before being debited to the other,
>> there's a negative 'cash in transit' value - 10 currency units more
>> have been deposited than withdrawn.

> 
> All that is very nice, but that's a mental picture in your head.  How
> is it implemented in the database ?

One possible answer is 'you need to ask that question in the comp.databases.choose-your-favourite-dbms' news group; this is comp.databases.theory, and implementation is irrelevant.

Another likely answer is 'the actual constraint is not specifiable, much less enforceable, in any current DBMS'.

Given a sufficiently powerful DBMS, the logical answer is that the constraint is along the lines of 'sum of all the balances in all the accounts is X'.

I'll work harder on the revised constraint when I have a clear definition of the original constraint - maybe.

>>> Moreover, I bet that 'update,update;' has to use the same
>>> mechanism behinds the scenes, namely deferrable constraints (what
>>> else?), in order to be possible.
>>
>>No; the sequence is defined as 'do all the operations; check all the
>>constraints'.

> 
> Are you saying that update,update;  is treated as a single atomic
> operation in which case the constraints needn't be deferrable,

Yes. The 'comma' separating the 'statements' means there are actually two separate sub-statements (either of which could be treated as a standalone statement) that are tied together to form a single statement.

> or you mean something completely different? If you mean the former,
> how do you imagine such concatenated update might be implemented
> without dramatic negative impact on concurrency and therefore on
> performance?

The implementation is substantially the same as the implementation of a single update - make changes then validate those changes. The two statements are prepared as a single unit, so the optimizer can combine the operations completely.

>>>All in all, the comma separated statements do not appear to bring much
>>>new to the table in comparison to the commit-delimited old style
>>>transactions except for perhaps more concise notation.
>>
>>They leave you with the option of having complex constraints that are
>>not violated at any point. The alternative leaves you with a complex
>>constraint that is violated at the intermediate point.

> 
> 
> Please explain how the complex constraint is not violated at any point
> in a multistatement comma separated expression.

The system ensures that while the single statement is running, no-one else gets to see the modified rows - unless they're running at a dirty read isolation in which case the inaccuracies are (by definition of dirty read) immaterial.

-- 
Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler_at_earthlink.net, jleffler_at_us.ibm.com
Guardian of DBD::Informix v2005.02 -- http://dbi.perl.org/
Received on Wed Oct 05 2005 - 06:53:15 CEST

Original text of this message