Re: Concurrency in an RDB
Date: Mon, 18 Dec 2006 17:28:03 +0200
Message-ID: <Pine.SOL.4.62.0612181517090.22950_at_kruuna.helsinki.fi>
On 2006-12-18, David wrote:
> Consider that 50% of the individual transactions break the constraint.
Under the rule I gave you, you'd test both transactions serially so you
won't outright lose anything. But the example does prove my thinking
wrong.
I'd say the gain isn't related to batch size, but to the amount of
zeroth-order statistical redundancy in the faulty-ok bitstream. It's also
not going to scale past a 1/2 probability of error, because an oracle for
any one of a set of transactions being okay is not available. In practice
the approach will still be usable: most transactions are correct. You'll
just have to derive the rest of the throughput by throwing hardware at
the problem; thankfully it's still embarrassingly parallel.
> Then batching won't help. Eg try a batch of size 2. Expectation
> value of number of compiles per batch = 0.25*1 + 0.25*2 + 0.5*3 =
> 2.25, which is actually worse than not batching at all.
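David's expectation value is easy to reproduce. The sketch below (Python,
mine, not from the thread) models the batch-of-two strategy: validate the
batch as a whole, and on failure fall back to validating the transactions
one at a time. The function name and the strategy details are my own
illustration of the reasoning above.

```python
# Expected number of validation runs ("compiles") for a batch of two
# transactions, each independently breaking a constraint with probability p.
#
# Strategy assumed here: run the batch once. If it fails, run txn 1 alone;
# if txn 1 passes, txn 2 must be the culprit (2 runs total), otherwise
# run txn 2 as well (3 runs total).

def expected_compiles_batch2(p: float) -> float:
    both_ok = (1 - p) ** 2          # batch passes outright: 1 run
    only_second_bad = (1 - p) * p   # batch fails, txn 1 passes: 2 runs
    first_bad = p                   # batch fails, txn 1 fails: 3 runs
    return 1 * both_ok + 2 * only_second_bad + 3 * first_bad

print(expected_compiles_batch2(0.5))  # 2.25 -- worse than 2 separate runs
print(expected_compiles_batch2(0.1))  # about 1.29 -- batching wins when errors are rare
```

At p = 0.5 this reproduces David's 2.25, worse than the 2 runs needed
without batching; at low error rates the batch usually passes on the
first try, which is where the gain comes from.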
> I thought in practise a good DB application would never keep
> transactions open while dialogs are displayed for accepting user
> input.
They often do. That's really the so-called long-running transaction
problem in a nutshell: if the DBMS doesn't understand application
semantics, all it can do to preserve integrity is note which data the
application has seen, assume the worst case (that the entire write set
could depend on the entire read set), and so enforce serializability over
the whole transaction read-write set, no matter how large or long-lived.
> However, IMO if the patient information relates more to "data" than
> "device" then it is better to use an asynchronous data entry model and
> provide powerful data validation based on read only queries that pick
> up duplicate data entry errors etc.
Then you're actually not going with locking, or even with optimism, but
with open transactions: you can't be sure the data you just entered will
*really* stick until after the validation has run and succeeded.
> Branching, merging tagging and so on can be used to distinguish a
> valid DB from a work in progress.
In theory, yes, but in practice you're then committing to maintaining enough context (in the DB) and expertise (around it) to merge the data after the fact, instead of forcing the DB to go from one valid state to another and aiming for maximal concurrency achievable within that constraint. From some personal experience with Lotus Notes, I can tell you that the result can be painful.
--
Sampo Syreeni, aka decoy - mailto:decoy_at_iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2

Received on Mon Dec 18 2006 - 16:28:03 CET