Concurrency in an RDB

From: David <davidbl_at_iinet.net.au>
Date: 7 Dec 2006 18:14:17 -0800
Message-ID: <1165544057.433915.4620_at_n67g2000cwd.googlegroups.com>



I have some thoughts on how best to achieve concurrency in an RDB. You may want to keep in mind that I am a "systems programmer" with little experience in using an RDB. I'm posting to this newsgroup to see whether my ideas have a reasonable basis.

Consider that a single mutex is used to protect an entire RDB. This mutex offers shared read / exclusive write access modes. It avoids writer starvation by blocking further readers once one or more writer threads are waiting on the mutex.
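As a concrete illustration, here is a minimal sketch (in Python, with hypothetical names) of such a shared read / exclusive write mutex with writer priority. The key point is in acquire_read: new readers block as soon as any writer is waiting, which prevents writer starvation.

```python
import threading

class SharedExclusiveLock:
    """One coarse lock for the whole database: any number of shared
    readers, or a single exclusive writer.  Writer starvation is
    avoided by blocking new readers once a writer is waiting."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # number of active readers
        self._writer = False       # is a writer active?
        self._waiting_writers = 0  # writers queued on the lock

    def acquire_read(self):
        with self._cond:
            # Block new readers while a writer is active OR waiting.
            while self._writer or self._waiting_writers > 0:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self._waiting_writers += 1
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._waiting_writers -= 1
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

This is only a sketch of the locking discipline, not an industrial implementation (no fairness among writers, no timeout handling).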

I'm basically suggesting conservative rather than strict 2PL. This may sound heretical, or at least far too simplistic (and inefficient) to be useful in an industrial strength RDB. However I contend that the approach deserves to be analyzed in some detail. I suspect it is a case of "less is more".

With such coarse level locking the CPU overheads of gaining and releasing locks are minimized (simply because locks are opened so infrequently).

Exclusive write access to the entire RDB means there can never be dead-lock. This eliminates a lot of complexity and overhead. For example there is no need for dead-lock detection such as by looking for a cycle in the wait-for graph. There is also no need to select a "victim" transaction, abort it and then later restart it from scratch.

In some database applications repeated dead-lock scenarios occur, and the database can become very inefficient because transactions are continually aborted and rolled back. This is a rather insidious problem because a database can work fine under lighter loads and then suddenly take on pathological behavior when the load increases slightly.

In a shared read mode we get the ultimate in concurrency. We allow any number of threads to have unrestricted access to all data in the database, without any additional locking (i.e. apart from the initial lock). It is fairly easy to support shared use of caches with LRU eviction policies. Cached state includes data loaded from disk as well as all calculated (or "derived") state, so it is easy to allow different threads to share in the calculated results that they generate. This makes it quite easy to parallelise read-only algorithms across all processors in multi-core machines. It also makes it easy to ensure that processors work concurrently with I/O.

I like the analogy to functional programming. This emphasizes the idea that outputs are (completely) calculated from read-only inputs. It is well known that FP offers one of the best ways to achieve concurrency in computer science.

I claim that virtually all CPU intensive tasks fit well into the idea of calculation of outputs on read only inputs. For example a query on a set of tables used to extract information doesn't modify any of the (persistent) tables. The query "Find the grandparents of people who have a brother in law that had open heart surgery before 2001" has to examine many database records.

By contrast, mutative changes to an RDB tend to be very simple and don't require much CPU effort. For example

    "Add person Bill Heard born in 1970, who married Janet Shaw in 1996"

Furthermore they don't depend on the results of time consuming read only queries. If they did then the ingestion rate of the RDB would be very poor. Note that this limits the allowable complexity of the DB integrity constraints.

Subject to integrity constraints, mutative work can be fine grained. For example, it is not necessary to add a whole family at once to a DB; it is possible to add one person at a time.

Even though the changes to the database are simple and can be performed quickly, they may not be well localized. When an RDB makes use of very fine grained predicates (i.e. favors lots of tables with few columns rather than the reverse), adding some information atomically to the database can require updates to many tables and secondary indexes. Therefore an exclusive write lock on the entire RDB is well suited.
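To illustrate, here is a small sketch (table and function names are mine, chosen for this example) of how a single "add person" mutation can touch several tables and a secondary index, with the coarse lock making the whole update atomic:

```python
import threading

db_lock = threading.Lock()  # stands in for the exclusive write mode
persons = {}                # person_id -> (name, birth_year)
marriages = []              # (person_id, spouse_name, year)
index_by_year = {}          # birth_year -> [person_id]  (secondary index)

def add_person(pid, name, born, spouse=None, married=None):
    # One fine-grained logical change still updates several tables
    # and indexes; the single coarse lock keeps them consistent
    # without any per-table or per-row locking.
    with db_lock:
        persons[pid] = (name, born)
        index_by_year.setdefault(born, []).append(pid)
        if spouse is not None:
            marriages.append((pid, spouse, married))

add_person(1, "Bill Heard", 1970, spouse="Janet Shaw", married=1996)
```

The mutation itself is cheap CPU-wise, as argued above; what the exclusive lock buys is that no reader ever sees the person without the matching index entry or marriage row.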

This computing model involves two "phases".

  1. Mutative work avoids concurrency completely by getting exclusive write access. The justification is that it completely avoids recalculation of derived state, and instead merely marks caches as dirty. In general dirtiness needs to propagate eagerly through all downstream derived values.
  2. Derived values are recalculated using lazy evaluation in a shared read mode that allows for the ultimate in concurrency.

A note to OO programmers: The pitfalls and inefficiencies often associated with the observer pattern are solved by this computing model. Derived values are only recalculated as often as they're needed, not whenever the inputs change (which can be far more often). The problems of dirty reads and reentrancy are also eliminated.
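The two phases above can be sketched as follows (a minimal illustration, with names of my own choosing): writers eagerly propagate dirtiness through downstream derived values, and readers recompute lazily on access.

```python
class DerivedValue:
    """A cached value computed from read-only inputs.  Writers mark it
    dirty (cheap, phase 1); readers recompute it lazily (phase 2)."""

    def __init__(self, compute):
        self._compute = compute   # pure function of current DB state
        self._value = None
        self._dirty = True
        self._dependents = []     # downstream derived values

    def depends_on(self, upstream):
        upstream._dependents.append(self)
        return self

    def invalidate(self):
        # Phase 1: eager dirty propagation during the exclusive write.
        # No recalculation happens here.
        if not self._dirty:
            self._dirty = True
            for d in self._dependents:
                d.invalidate()

    def get(self):
        # Phase 2: lazy recomputation during shared reads.
        if self._dirty:
            self._value = self._compute()
            self._dirty = False
        return self._value

db = {"people": ["alice"]}
count = DerivedValue(lambda: len(db["people"]))
doubled = DerivedValue(lambda: 2 * count.get()).depends_on(count)
```

Unlike a naive observer, invalidate() does no recalculation at all; if a value is invalidated a thousand times and read once, it is computed once.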

Cheers,
David Barrett-Lennard

Received on Fri Dec 08 2006 - 03:14:17 CET