Re: Avoiding any locks in SQL Servers - read and understand....its magic.

From: Christopher Browne <cbbrowne_at_acm.org>
Date: 23 Aug 2003 05:30:48 GMT
Message-ID: <bi6u68$5fght$1@ID-125932.news.uni-berlin.de>


A long time ago, in a galaxy far, far away, Dieter Nöth <dnoeth_at_gmx.de> wrote:
> Guido Stepken wrote:
>> PostgreSQL documentation you find everywhere, search google.
>
> I know where to find the docu, I'm just not able to find what you're
> claiming, that's why I'm asking.
>
> Some quotes...
>
> Guido Stepken:
> "Oracle hasn't really made it perfect, because a write transaction can
> block another writing transaction.
> PostgreSQL has MVCC, multi versioning concurrency control. No writes,
> updates, reads block each other."
>
> PostgreSQL Docu 9.1. Introduction:
> "...in MVCC locks acquired for querying
> (reading) data don’t conflict with locks acquired for writing data, and
> so reading never blocks
> writing and writing never blocks reading."

REALITY is that there will be _some_ writes that need to block one another. MVCC does not magically eliminate the possibility of deadlock.
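
To make the deadlock point concrete, here's a minimal sketch in psql, run as two concurrent sessions (the "accounts" table and its rows are invented for illustration):

  -- Session 1:
  BEGIN;
  UPDATE accounts SET balance = balance - 10 WHERE id = 1;

  -- Session 2:
  BEGIN;
  UPDATE accounts SET balance = balance - 10 WHERE id = 2;

  -- Session 1 (blocks: session 2 holds the row lock on id = 2):
  UPDATE accounts SET balance = balance + 10 WHERE id = 2;

  -- Session 2 (completes the cycle; PostgreSQL detects it and
  -- aborts one of the two transactions):
  UPDATE accounts SET balance = balance + 10 WHERE id = 1;
  -- psql reports: ERROR: deadlock detected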

And if you have lots of updates inside big transactions that are hitting a particular row, then those transactions are liable to queue up, blocking one after another.

Notable instances would be with database records that express "balances."

So if a customer is submitting lots and lots of concurrent updates against their account balance, those updates MUST get serialized if they are all to hit the account. The balance has to move from $80,000 to $75,000 to $72,000 to $69,000 to ... ad infinitum.

Likewise, if requests are being made to take one unit of inventory off a particularly "busy" product, well, those updates have to get serialized (leading to SOME form of blocking) in order for them to all hit the inventory account properly.
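
Under PostgreSQL's default Read Committed mode, that queueing looks like this (again a minimal sketch, two psql sessions, the "accounts" table invented for illustration):

  -- Session 1:
  BEGIN;
  UPDATE accounts SET balance = balance - 5000 WHERE id = 1;

  -- Session 2 (blocks until session 1 commits or rolls back):
  BEGIN;
  UPDATE accounts SET balance = balance - 3000 WHERE id = 1;

  -- Session 1:
  COMMIT;

  -- Session 2 wakes up, applies its decrement to the newly committed
  -- row version ($75,000 becomes $72,000), and can then COMMIT.
  -- Readers were never blocked, but the two writers queued up.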

And people elsewhere in related threads are making exceedingly rash claims that SOMETHING resembling locking is unnecessary. The "balance" problem is a good characteristic example of why it isn't.

> Guido Stepken:
> "Sorry, if forgot to say, that the client is immediately informed, that
> the data is written, although the server has them still in work within
> transaction, which may have been delayed, till other transactions are
> finished."
>
> PostgreSQL Docu 9.2.1. Read Committed Isolation Level "However, such
> a target row may have already been updated (or deleted or marked for
> update) by another concurrent transaction by the time it is
> found. In this case, the would-be updater will wait for the first
> updating transaction to commit or roll back (if it is still in
> progress). If the first updater rolls back, then its effects are
> negated and the second updater can proceed with updating the
> originally found row. If the first updater commits, the second
> updater will ignore the row if the first updater deleted it,
> otherwise it will attempt to apply its operation to the updated
> version of the row."
>
> PostgreSQL Docu 9.2.1. Read Committed Isolation Level "This behavior
> makes Read Committed mode unsuitable for queries that involve
> complex search conditions."

In effect, there is no TOTALLY free lunch. MVCC alleviates complex concurrency problems considerably, and that solves a lot of them.

You may need, all the same, to pick isolation levels carefully if you have particularly complex requirements.
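
Picking the level is the easy part in PostgreSQL; deciding which transactions need it is the careful part. The syntax is standard, per transaction or per session:

  -- For a single transaction:
  BEGIN;
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  -- ... the queries that need one consistent snapshot ...
  COMMIT;

  -- Or as the session default:
  SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;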

> Guido Stepken:
> "I is like magic, programmers never ever have to care about concurrent
> write events any longer, because they are time - shifted and rearranged
> in a manner, as if a lock had been set."
>
> PostgreSQL Docu 9.2.2. Serializable Isolation Level:
> "If the first updater
> rolls back, then its effects are negated and the serializable
> transaction can proceed with updating the
> originally found row. But if the first updater commits (and actually
> updated or deleted the row, not
> just selected it for update) then the serializable transaction will be
> rolled back with the message
> ERROR: Can’t serialize access due to concurrent update
> because a serializable transaction cannot modify rows changed by other
> transactions after the serializable
> transaction began."

Apparently Guido is being a bit _too_ exuberant about the merits of how PostgreSQL does things.

> Guido Stepken:
> "Throw away all code in your application, which has to do with
> locking....;-)) You will not lose any update, no fear."
>
> PostgreSQL Docu 9.2.2. Serializable Isolation Level: "The
> Serializable mode provides a rigorous guarantee that each
> transaction sees a wholly consistent view of the database. However,
> the application has to be prepared to retry transactions when
> concurrent updates make it impossible to sustain the illusion of
> serial execution."

Guido is probably right 90% of the time. But if you have particularly complex combinations of concurrent updates, you've got to do them intelligently.
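
Concretely, "doing them intelligently" under Serializable mode means expecting the failure the documentation describes, and re-running the transaction. A sketch of the failure, once more with two psql sessions and the invented "accounts" table:

  -- Session 1:
  BEGIN;
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;

  -- Session 2 (blocks behind session 1's update):
  BEGIN;
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  UPDATE accounts SET balance = balance - 200 WHERE id = 1;

  -- Session 1:
  COMMIT;

  -- Session 2 is now aborted:
  --   ERROR: Can't serialize access due to concurrent update
  -- The client must ROLLBACK and re-run the whole transaction from
  -- the top; that retry loop has to live in the application code.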

Concurrency is a problem, no matter how you slice it.

Ousterhout wrote a paper saying that "Threading is mostly a bad idea", on the thesis that few programmers are up to writing threaded code _correctly_ when it has to cope with unusual combinations of concurrent behaviour. I think he's quite right about that; concurrent systems of _whatever_ sort introduce the possibility of some really obscure sorts of errors.

I think MVCC _is_ a pretty big win; it does get rid of most need for locking, and, with suitable selection of serialization modes, can give good data back virtually without blocking.

But none of this allows people to totally ignore how their tools work when coping with particularly complex situations involving concurrency. You can improve how you MANAGE complexity; you can't simply make the issues vanish.

-- 
select 'cbbrowne' || '@' || 'ntlug.org';
http://www.ntlug.org/~cbbrowne/spreadsheets.html
It is better to have 100  functions operate on one data structure than
10 functions on 10 data structures.  -- Alan Perlis
