Re: schema help

From: David BL <davidbl_at_iinet.net.au>
Date: Wed, 26 Dec 2007 19:31:46 -0800 (PST)
Message-ID: <125d9cb0-bf24-46e0-8589-8e3d7fb0b685_at_i12g2000prf.googlegroups.com>


On Dec 27, 5:07 am, Rob <rmpsf..._at_gmail.com> wrote:
[Quoted] > On Dec 25, 9:10 pm, David BL <davi..._at_iinet.net.au> wrote:> Interesting to how this relates to a discussion in another thread...
>
> I'll try to respond to your well-thought-out response, though you have
> gone way beyond my question.
>
> > Let S' = S + d denote the application of delta d to state S to yield a
> > new state S'. Often S+d is much easier to calculate than d = S' -
> > S. In fact the latter is not always uniquely defined.
>
> Agreed. In current systems, d1,....,dn would seem to correspond to a
> change log. The log can be used to roll forward or roll back. At issue
> is the form of d. If each d is a before image and and after image,
> then d is uniquely defined, though S'-S is not.

I wasn't actually identifying d1,...,dn with the change log. The latter will often be framed in terms of physical changes to pages, rather than logical operations understood by the domain expert.

Furthermore recovery algorithms such as used by ARIES tend to assume that the deltas recorded in the change log are idempotent. That is quite restrictive! For example, assignment to some bytes in a page is idempotent, but insertion or deletion in a string is not.

> > Consider that there are two separate databases. The State-DB records
> > the current state S without any regard for history. The History-DB
> > only records the history as a sequence of deltas [d1,d2,...,dn] but is
> > not directly concerned with calculating the current state.
>
> Here I think you are expressing the spirit of the question: Is it OK
> to store the deltas in a database model? Just to be clear, let's
> assume that the initial database state S0 (S zero) is the empty
> database. As the OP suggests, I could maintain a sequence of database
> states S0, S1, ..., Sn in which case I could (imperfectly) compute the
> deltas. Or, I can maintain a sequence of deltas and perfectly compute
> any state. You suggest that I could have two sequences, one of deltas
> and one (of lower periodicity) for states. I think that's what your
> next paragraph says:

Not really two sequences. Only the History-DB records the history. The State-DB only records the current state. I wasn't suggesting that one would generally record lots of snapshots of State-DB's, though of course that may be useful in the sense of a cache.

> > There is a concept of applying outstanding deltas to S to bring it up
> > to date:
>
> > S' = S + [dm,...,dn].
>
> > In that sense the State-DB is just a read only view and only the
> > History-DB is regarded as the primary source of the data to which
> > updates are applied with transactions. Furthermore these updates tend
> > to only add extra tuples to relvars rather than removing or modifying
> > existing tuples. Therefore in theory the History-DB can provide
> > excellent ingestion rates and be optimised for writing new deltas as
> > sequential data to disk. Note that the information (as deltas) in the
> > History-DB is total ordered.
>
> Just one observation: Sets are not ordered, so the total ordering of
> the History-DB is up to the app, not the RDBMS. Your point about
> better "ingestion rates" intuitively makes sense, but can the database
> designer cause the RDBMS to write tuples sequentially to disk? This
> kind of optimization would likely be implemented at the system level,
> so in a way, you are suggesting an architectural meta meta model far
> different from the SQL meta meta model, at least as I understand it.

My comments are a suggestion for how the RDBMS should provide specialised support for a History-DB. This would include the total ordering (or better still use vector times and a partial ordering), efficient sequential writing of new operations to disk, and concurrent access for readers of the operations. I don't think any commercial RDBMS provides these capabilities.

> > IMO bringing the State-DB up to date is best performed
> > asynchronously. The State-DB can persist a sequence number to ensure
> > it applies each delta in the total order exactly once. A thread can
> > access the State-DB and by reading the sequence number, easily
> > determine the subset of the History-DB that is associated with the
> > current state of the State-DB. Furthermore, in theory with pure
> > additive changes to the History-DB it would be possible to allow read
> > only threads to access the History-DB concurrently with its mutative
> > transactions. This is kind of like MVCC for free. See
>
> > http://en.wikipedia.org/wiki/Multiversion_concurrency_control
>
> I see your point. But of course with "pure additive changes", your
> above assertion (d = S' - S is not uniquely defined) no longer holds.

I suspect you misunderstand me. "Pure additive changes" means that the history only grows by appending additional deltas to the sequence, rather than making changes to previously recorded deltas. ie every delta once generated is immutable. It is for this reason that readers can process deltas concurrently with writers that append new deltas to the History-DB.

It has nothing to do with whether d = S' - S is uniquely defined.

> The wikipedia discussion around MVCC gets into more "how" than is
> necessary here: The OP doesn't suggest multiple updaters, just a
> database-as-recording-device from which he could compute deltas.
> Personally, I need to understand the single user variant before wading
> into the more complex, multiuser case.

The point was only that the advantages of MVCC come for free. There is no need to implement MVCC!

Supporting concurrency between readers and writers may be important for performance in single user applications. The advantages of MVCC still apply when there is only a single updater.

> > Often deltas can be regarded as events occurring at some point in
> > space and time. For example, a marriage event causes the state to
> > change from unmarried to married. A birth event causes a person to be
> > added to a family tree database.
>
> > Clearly such events are to be recorded but shouldn't require
> > subsequent modification. I tend to think of a History-DB as an
> > *events* database, and often represents the proper way to record the
> > primary source of information (as distinct from the current state
> > which is going to depend a lot more on one's conceptual model and the
> > application). I would go as far as saying that the events DB should
> > be associated with the base relvars, and the DBMS should support the
> > asynchronous calculation of the current state as a read only view.
>
> Again, a little more complex than I can follow. If the current state
> is a read only view, then either it is never brought up to date, or,
> the database is periodically shut down and all the events since the
> last current state are applied, replacing the old current state with
> the new current state. I assume there is some read only view of the
> events that answers the question "what has changed between the old and
> new current states?" (That's the question the OP wants to answer.)

There is no need to shut down the State-DB to bring it up to date. Instead a mutative transaction is opened to apply the outstanding deltas from the History-DB. The transaction should only be opened after a batch of deltas has been read from the History-DB and cached in memory. Therefore the update to the State-DB doesn't need to block on network I/O (and perhaps not even file I/O), so it's brief.

Let me provide a contrived example. The history for an account consists of account transaction events. Once a transaction event is generated it is immutable. The current account balance is a function of all the history of account transaction events. I am suggesting that the primary information to be recorded consists of the transaction events, whereas the account balance is merely a derived and calculated quantity, and preferably should be recalculated asynchronously. When a user wants to see the account details on a web page the relevant thread simply displays the balance recorded in the State-DB, and the relevant subset of transaction events from the History-DB. So it is possible to display a consistent web page even though the system recalculates the account balance asynchronously.

The approach has many performance advantages, and will reach quiescence more quickly than conventional approaches. Paradoxically it will tend to provide a more up to date picture to the users than a synchronous approach.

> > I imagine that only an events DB has some realistic chance of making
> > the world's data look like a single information source - because it
> > has less application bias and completely avoids the quandaries of
> > distributed transactions.
>
> "making the the world's data look like a single information source" is
> beyond my ken as is the rest of your reply. (I'm just a humble
> toolmaker.) I do thank you for taking the time to reply.
Received on Thu Dec 27 2007 - 04:31:46 CET

Original text of this message