Re: Lock-free databases

From: VC <boston103_at_hotmail.com>
Date: Fri, 4 Nov 2005 20:50:36 -0500
Message-ID: <O-mdncugrJRykvHenZ2dnUVZ_tqdnZ2d_at_comcast.com>


"Christopher Browne" <cbbrowne_at_acm.org> wrote in message news:m364r89due.fsf_at_mobile.int.cbbrowne.com...
>> Mark D Powell wrote:
>>>[...]
>>>Unless the db is read only I do not see how a db can be
>>> "lock free".
>>
>> It can, e.g. multiversioning with timestamp ordering does not need
>> locking.
>
> How does it ensure that concurrent updates to a tuple are applied
> correctly?
>
> Consider the table "balance_table", and two queries:
>
> create table balance_table (
> account integer,
> balance integer,
> constraint primary key (account)
> );
>
> insert into balance_table (account, balance) values (5, 100);
>
> Now, perform the following two queries concurrently:
>
> update balance_table set balance = balance + 10 where account = 5;
> update balance_table set balance = balance - 5 where account = 5;
>
> Multiversioning does not allow these to be applied concurrently
> without locking.
>
> The correct final result should be...
>
> select * from balance_table;
> account | balance
> ------------+---------------
> 5 | 105
>
> But if they are permitted to simultaneously update their own
> independent (lock-free) versions, the ending balance could be any
> value in the set (95, 105, 110).
>
> How, without holding back one or the other of the updates, do you
> ensure that the final balance is 105?
>
> That "holding back" will represent some equivalent to a lock.
>
> Multiversioning allows *READS* to not need locking; the normal claim
> is that:
>
> "Because writers modify a different version than the version read
> by readers, readers are never blocked by writers."
>
> The same is not true of writers...

In MVTSO, a data item in addition to the transaction timestamp that created the item has a read timestamp. In this model reads always succeed, and writes are rejected and restarted if their timestamp is smaller than the data item read timestamp. The latter is modified by each read transaction whose timestamp is greater than the current read timestamp. For details, see the book I've referenced earlier.

In your example, R1(100);W1(110);R2(110);W2(105) will succeed, R1;R2;W1-fail;W2;R1;W1 will succed with a restart.

> --
> output = ("cbbrowne" "_at_" "gmail.com")
> http://linuxdatabases.info/info/x.html
> Rules of the Evil Overlord #187. "I will not hold lavish banquets in
> the middle of a famine. The good PR among the guests doesn't make up
> for the bad PR among the masses." <http://www.eviloverlord.com/>
Received on Sat Nov 05 2005 - 02:50:36 CET

Original text of this message