Re: ACID et al

From: David Cressey <david.cressey_at_earthlink.net>
Date: Tue, 06 Dec 2005 15:32:09 GMT
Message-ID: <Z9ilf.73$3Z.65_at_newsread1.news.atl.earthlink.net>


"paul c" <toledobythesea_at_oohay.ac> wrote in message news:xZYkf.52828$Eq5.42138_at_pd7tw1no...
> I'm interested to see any comments the group has on something I'm
> (haphazardly) working on which in part has to do with guaranteeing the
> ACID properties without locking.
>
>
> In case it's of any interest this was partly prompted by several
> previous threads regarding lock-free db's, in-memory db's and the TRM.
> Also by my feeling that the historic 'impedance mismatch' that dates
> back to batch days, has sort of been superceded by new kinds of
> impedance. Two big mismatches that I can think of are the stateless
> nature of the www versus conventional persistent db programs and the
> lack of appreciation in the www protocols such as html for the set
> theory underpinnings of the RM.
>
>
> When I think about this, there are plenty of issues to deal with, but
> one that worries a friend of mine and doesn't worry me at all, has to do
> with the transaction ACID properties (which is often thought of in an
> OLTP context, even though I don't see that context as the only important
> one).
>
>
> So here's my immediate question:
>
> 1) Assume an engine that keeps an in-memory database. By this I mean
> the whole database insofar as some application is concerned, not some
> fragment in a cache.
>
>
> 2) Assume that a 'redo' log is written and 'synced' before confirmation
> of a successful update is returned to a user or client. If the log
> write fails, the engine terminates and alarm bells go off.
>
>
> 3) Assume the engine is single-threaded. Two users' messages can't
> interleave their actions between each other, ie., messages are
> 'single-file'.
>
>
> 4) Assume that a message to the engine includes all the db actions for
> some 'business' transaction. These actions might include re-iterated
> reads of rows that were issued in an earlier 'query' message, as well as
> possibly the values that were returned earlier, as well as update actions.
>
>
> 5) Don't assume that the actions in a message are necessarily executed
> in any particular order, eg. if a message contains a write action that
> succeeds followed by a read action that fails, there is a delta or
> intent mechanism or equivalent that guarantees the write action won't
> persist after the user has been told that the whole transaction 'failed'.
>
>
> 6) Assume all the queries/actions in such a message must succeed for the
> message to succeed If any one of them fails, the whole message fails.
>
>
> I think that Durability is ensured by 2), Isolation by 3), Atomicity by
> 5), 6) and Consistency by 4).
>
>
> For example (for the moment without depending on any particular syntax
> for the contents of a message), the usual bank debit example would need
> two messages (note that if teller and branch rows were involved, they
> could be returned to the user in the reply to the first message along
> with customer and or balance rows). The second updating message (which
> might occur many minutes or hours after the first) might reiterate the
> original queries in some form, perhaps including certain data such as
> 'balance = $500' which would be asserted to still be true before the
> debit is made persistent.
>
>
> There are more complicated scenarios that people talk about such as the
> travel booking problem as well as the whole question of 'nested'
> transaction which I also don't see problems with, but here I've just
> given a simple example in order to see if people think there's a basic
> flaw in the assumptions above.
>
>
> Basically, I'm talking about an engine that is good at checking
> constraints and thus can be stateless as far as conventional locks are
> concerned, ie., it would have no 'lock manager' component per se.
>

PMFJI. I'm in a llitle over my head here. I understand how locking and ACID interact in one or two DBMS products, but not well enough to abstract what you are asking about here.

The first thing I want to say is that there are basically two ways of going about ACID: synchronization and multiversioning.

"Synchonization" basically means the same thing as locking. One thread blocks until another thread completes some action. In the rest of the thread, some people thought you meant, "using the lock manager" when you said "locking". I surmise that you only meant "synchronizing.

It reminds me of when DEC Rdb was ported from VMS, which has a lock manager service in the OS, to unix which AFAIK, doesn't have such a service. They basically had to put all the synchronization primitives inside Rdb.

"Multiversioning" basically means keeping every version of every updated cell around, until it's certain that no remaining process will need to read that version. Whenever anything is read, if there's more than one version, the multiversioner figures out which one is "right" for that thread. There's more to it than that, because you still have to make sure that transactions are atomic.

A few other comments:

  1. ... in-memory.

 As long as there is a single "server" that keeps part of the database in memory,
and combines the multiple threads in a single process, the distinction between an in memory database and a database that's on disk with a cache in memory is trivial as far as ACID is concerned. Oracle RDBMS, in the simplest configurations, works like this.

If each client process has its own server process, and they manipulate multithreading by interprocess communication, it gets lots more complicated. DEC Rdb (now Oracle Rdb) works like this.

point 2. REdo log is forced and syched at commit time.

A good point. This is enough to satisfy the D in ACID, as long as you can make the contents of the redo log durable, and as long as you can process the redo log serially.

point 3. Single threaded. This is where I get lost.

It sounds like you are saying that "transactions are serialized", and "that transactions are inherently atomic, because they take place in response to a single call by the client". If I've got that right, I'm going to call your attenttion to a long and involved recent thread in which some people compared SQL, which requires a "START TRANSACTION" (perhaps implicit), and a "COMMIT" or "ROLLBACK" to a more developed language, in which an entire atomic transaction can always be expressed in the form of a single request.

I forget what the more advanced language was. It might have been tutorial D. I let that whole thread go right by me, but it could be relevant to what you are thinking about.

4. Same as 3.

5. and 6. I'm going to wait until I'm clearer on points 1 through 4.

>
> Grateful for any comments,
>
> cheers,
> paul c.
Received on Tue Dec 06 2005 - 16:32:09 CET

Original text of this message