Re: Newbie question about db normalization theory: redundant keys OK?

From: Sampo Syreeni <decoy_at_iki.fi>
Date: Mon, 24 Dec 2007 17:30:53 +0200
Message-ID: <Pine.SOL.4.62.0712241655230.742_at_kruuna.helsinki.fi>


On 2007-12-23, David BL wrote:

> The literature on OT focuses on the real time collaborative editing of
> text. IMO the published solutions tend to be incorrect or impractical.
> I'm not aware of anyone else researching the use of OT for databases.

There is a long research tradition on semantic and multilevel concurrency control. As I've said before, OT seems to be a special case of those, in that all the operations on the underlying abstract datatype are engineered to be either commutative or to have easily expressed compensations. Transaction processing monitors routinely apply the more general theory, which accounts for much of the performance increase such middleware can offer.

> Is the comment about Oracle referring to Multi-Version Concurrency
> Control (MVCC)? Although very useful, MVCC only addresses the
> important case of long running read only queries.

Oracle's implementation has trouble even with those, because it's quite difficult to guarantee that the version data necessary in the later stages of the transaction do not accidentally get overwritten over the course of execution.

> I've never thought much of multiphase commit protocols. Instead I'm a
> fan of persistent message queues.

I don't quite see them as solving the same problem. If your queue is such that all of the updates posted in it are always applied, then in a distributed environment it would seem that you need something like a multiphase commit to get agreement on what to insert into the queue in the first place. Or if some of the queued transactions can be rolled back, then you need distributed agreement when the queued transaction is finally executed. Finally, it obviously solves the problem if the transactions are serialized before insertion into the queue, but then we lose in concurrency and the queue easily becomes a central hot spot.

So, could you perhaps elaborate a bit?

> A message queue can be used to bring the summaries DB up to date. A
> persistent message sequence number (stored in each DB) tells you
> whether the summary DB is synchronised with the details DB, or could
> allow a thread to block until the summaries have been synchronised,
> preferably at the start of a transaction before it has locked any
> resources.

That's how most reasonable report databases/warehouses connect to their respective OLTP sources, at least.

-- 
Sampo Syreeni, aka decoy - mailto:decoy_at_iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Mon Dec 24 2007 - 16:30:53 CET

Original text of this message