Re: Concurrency in an RDB

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Fri, 08 Dec 2006 13:40:01 GMT
Message-ID: <RWdeh.30103$cz.453133_at_ursa-nb00s0.nbnet.nb.ca>


David wrote:
> Bob Badour wrote:
>

>>David wrote:
>>
>>
>>>I have some thoughts on how best to achieve concurrency in an RDB.  You
>>>may want to keep in mind that I am a "systems programmer" with
>>>little experience in using an RDB.  I'm posting to this newsgroup to
>>>see whether my ideas have a reasonable basis.
>>>
>>>Consider that a single mutex is used to protect an entire RDB.  This
>>>mutex offers shared read / exclusive write access modes.   It avoids
>>>writer starvation by blocking further readers once one or more writer
>>>threads are waiting on the mutex.
>>
>>Some write transactions take a long time to complete and will thus lock
>>everyone else out of the database.

>
> Can you outline for me a real-life example where long-lived mutative
> transactions are necessary?

In some places, 200 microseconds is too long: http://www.embeddedstar.com/press/content/2003/12/embedded11970.html

Are you suggesting that it is possible to acquire and release a universal exclusive lock over a distributed system in less than 200 microseconds?

It takes 500 times that long, about 100 milliseconds, just to ping my local ISP.
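
For concreteness, what David describes amounts to a writer-preferring readers-writer lock wrapped around the entire database. A minimal sketch of such a lock (Python for brevity; the class and names are illustrative, not taken from any real DBMS):

    import threading

    class WritePreferringRWLock:
        """Shared-read/exclusive-write mutex that blocks new readers
        once a writer is waiting, as in David's proposal."""

        def __init__(self):
            self._cond = threading.Condition()
            self._readers = 0          # readers currently inside
            self._writers_waiting = 0  # queued writers stall new readers
            self._writer_active = False

        def acquire_read(self):
            with self._cond:
                # Writer preference: new readers queue behind any waiting writer.
                while self._writer_active or self._writers_waiting > 0:
                    self._cond.wait()
                self._readers += 1

        def release_read(self):
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()  # let a waiting writer in

        def acquire_write(self):
            with self._cond:
                self._writers_waiting += 1
                while self._writer_active or self._readers > 0:
                    self._cond.wait()
                self._writers_waiting -= 1
                self._writer_active = True

        def release_write(self):
            with self._cond:
                self._writer_active = False
                self._cond.notify_all()  # wake readers and writers alike

Note that acquire_write blocks until every reader has drained: one long write transaction holds the whole database, which is the objection above.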

>>[snip]
>>
>>
>>>Exclusive write access to the entire RDB means there can never be
>>>dead-lock.  This eliminates a lot of complexity and overhead.
>>
>>It also eliminates almost all useful concurrency.

>
> Note that my premise is that concurrency is only needed for
> CPU-intensive tasks and that these only require shared read access.

That is one of your assumptions that is false.

> If
> that is the case then there is plenty of concurrency available during
> shared read modes. Exclusive write access takes so little time that it
> can be neglected.

That is a second assumption of yours that is false.

>>>In some database applications repeated dead-lock scenarios occur, and
>>>the database can become very inefficient because transactions are
>>>continually aborted and rolled back.
>>
>>Which applications are those? And why are dead-locks necessarily a
>>problem for those applications?

>
> I don't have the RDB experience to know how often and to what extent
> dead-lock seriously degrades performance. However, I have heard of
> real cases where repeated dead-lock kills performance.

If one loads any system beyond its capacity, it will exhibit pathological behaviour. The common term for this is "thrashing": concurrent processes spend more time on overhead than on actual work. It can happen in a lock manager. It can happen in a cache. It can happen in a virtual memory manager.

All real computer systems have finite capacity.
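
For illustration, the abort-and-retry loop a client typically runs when it is chosen as a deadlock victim looks something like this sketch (Python; run_transaction and DeadlockError are placeholder names, not a real driver API). Under overload, most iterations are rollback-and-retry overhead rather than useful work, which is exactly the thrashing described above:

    import random
    import time

    class DeadlockError(Exception):
        """Placeholder for a DBMS 'chosen as deadlock victim' error."""

    def run_with_retry(run_transaction, max_attempts=5):
        """Retry a transaction that keeps being aborted by deadlock.

        Every aborted attempt is pure overhead: the work is rolled back
        and redone. When most attempts abort, throughput collapses."""
        for attempt in range(max_attempts):
            try:
                return run_transaction()
            except DeadlockError:
                # Randomized backoff so the same transactions do not
                # simply collide again on the next attempt.
                time.sleep(random.uniform(0, 0.01 * 2 ** attempt))
        raise RuntimeError("transaction kept deadlocking; system overloaded")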

>>[snip]
>>
>>>In a shared read mode we get the ultimate in concurrency.
>>
>>Shared read/shared write is the ultimate in security. The use of the log
>>to provide multiple concurrent views of uncommitted data gets that job done.

>

> IMO the conservative locking I propose will lead to far superior
> performance. This is a systems programming question, and I can't
> back up the claim with quantitative results at present.

Your opinion doesn't count for much, and I can confidently counter that you will never back up the claim with quantitative results, except perhaps for a tiny special class of applications.
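
To make the earlier point about the log concrete: keeping multiple versions lets every reader see a consistent snapshot while writers proceed. A toy sketch of snapshot reads over version chains (Python; illustrative only, not how any particular DBMS organizes its log):

    class VersionedStore:
        """Toy multi-version store: each key maps to a chain of
        (commit_ts, value) versions. A reader pinned at snapshot T sees
        the newest version with commit_ts <= T, so writers never block
        readers and readers never block writers."""

        def __init__(self):
            self._versions = {}  # key -> [(commit_ts, value), ...]
            self._clock = 0

        def write(self, key, value):
            # A real DBMS stages this in log records and assigns the
            # timestamp at commit; here we commit immediately.
            self._clock += 1
            self._versions.setdefault(key, []).append((self._clock, value))

        def snapshot(self):
            return self._clock  # freeze a read timestamp

        def read(self, key, snapshot_ts):
            # Newest version committed at or before the snapshot.
            for commit_ts, value in reversed(self._versions.get(key, [])):
                if commit_ts <= snapshot_ts:
                    return value
            return None

    store = VersionedStore()
    store.write("x", 1)
    ts = store.snapshot()      # a reader pins its view here
    store.write("x", 2)        # a later write does not disturb that reader
    assert store.read("x", ts) == 1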

>>[snip]
>>
>>>Subject to integrity constraints, mutative work can be fine grained.
>>>For example, it is not necessary to add a whole family at once to a DB;
>>>it is possible to add one person at a time.
>>
>>One of the great things about the relational model is set-level
>>operation. It is not necessary to add one person at a time when one can
>>add a whole family.

>
> What I'm saying is that if it's not necessary to add a whole family
> at a time (according to integrity constraints or atomicity
> requirements) then it would be silly to design the application that
> way.

What I'm saying is that if it's possible to add the whole family at a time (according to integrity constraints or atomicity requirements) then it would be silly to design the application to prevent it.
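
A set-level insert also keeps the transaction as short as possible: the whole family goes in with one statement and one commit. A sketch using Python's built-in sqlite3 module (table and rows invented for illustration):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, family TEXT)")

    family = [("Alice", "Smith"), ("Ben", "Smith"), ("Cora", "Smith")]

    # Set-level: one statement, one transaction, one commit, rather than
    # adding one person at a time across several round trips.
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO person VALUES (?, ?)", family)

    print(conn.execute("SELECT COUNT(*) FROM person").fetchone()[0])  # 3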

> Mutative changes should be applied in as small a transaction as
> possible in order to promote concurrency and avoid dead-lock. That is
> commonly discussed in books on RDB.

I agree. At the physical level, the dbms should not hold shared resources any longer than absolutely necessary. I also suggest the dbms should not hold more shared resources than absolutely necessary.

>>[snip]
>>
>>I suggest if you look at any text on concurrency and transactions in
>>dbmses, you will find your proposal has been well-considered and
>>long-ago rejected.

>
> Call me pig-headed (or big-headed), but I don't always believe what I read!

Scientific publications have bibliographies for a reason. A new proposal that simply ignores all prior work, as yours does, gets rejected without much further thought.
