Re: the relational model of data objects *and* program objects

From: Anne & Lynn Wheeler <lynn_at_garlic.com>
Date: Thu, 14 Apr 2005 13:41:27 -0600
Message-ID: <m3ll7lywm0.fsf_at_lhwlinux.garlic.com>


Anne & Lynn Wheeler <lynn_at_garlic.com> writes:
> and small quote from previous reference
> http://www.mcjones.org/System_R/
>
> System R is a database system built as a research project at IBM San
> Jose Research (now IBM Almaden Research Center) in the 1970's. System
> R introduced the SQL language and also demonstrated that a relational
> system could provide good transaction processing performance.

there have been a number of things done for practical, implementation reasons, as opposed to theory and abstraction; there have been more than a few references to codd believing that SQL compromised his (codd's) relational model.

a big issue during the evolution of system/r ... in the battle with the earlier generation of physical databases ... was the excessive overhead of the index (it minimized some of the manual administrative overhead but typically doubled the physical disk storage and increased real memory and processing overhead). having a single "primary" index, with all the rest of the fields (columns) related to the primary index ... normally in the same physical record ... allowed things like bank account transactions to update an account balance field w/o impacting the index. a carefully constructed model could select a fairly stable characteristic as the primary index and make all the other fields (in the same physical record) related to the primary index (so that the related updates wouldn't impact the index structure).
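
a rough sketch of the idea (python, purely illustrative ... the record layout and names are invented): the only index maps a stable key to a physical record slot, so updating the balance field rewrites the record in place and never touches the index:

  # illustrative only; not from any actual system/r code. the single
  # "primary" index maps a stable key (account number) to a physical
  # record slot; every other column lives inside that record.

  class Table:
      def __init__(self):
          self.index = {}    # account_no -> record slot (the only index)
          self.records = []  # "physical" records, one slot per account

      def insert(self, account_no, name, balance):
          self.records.append({"account_no": account_no,
                               "name": name,
                               "balance": balance})
          self.index[account_no] = len(self.records) - 1  # index touched once

      def update_balance(self, account_no, delta):
          slot = self.index[account_no]           # read-only use of the index
          self.records[slot]["balance"] += delta  # rewrite the same physical
                                                  # record; index is unchanged

      def lookup(self, account_no):
          return self.records[self.index[account_no]]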

having a single index, with a large body of information physically collected under that index, also minimized the index overhead on retrieval. being able to update nearly all of the fields ... resulting in a single physical record write w/o impacting the index ... minimized the index overhead on update.

Mapping financial transactions to database "transactions" (with acid and commit characteristics), with most stuff in a single physical record, resulted in a large class of transaction updates involving only a single physical record needing to be logged as part of acid/commit (and with no impact on things like the index, where an index change might involve a large number of physical records being changed).
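
a minimal sketch (python again, names invented) of what the log for such a transaction might contain ... just the after-image of the one record plus a commit record:

  # illustrative only: the log for a one-record debit/credit transaction.
  # since no index pages change, only two log entries are needed.

  def log_single_record_txn(log, txn_id, slot, after_image):
      # after-image of the single physical record that was updated
      log.append({"txn": txn_id, "type": "after_image",
                  "slot": slot, "record": after_image})
      # commit record -- the transaction is durable once this reaches disk
      log.append({"txn": txn_id, "type": "commit"})

  log = []
  log_single_record_txn(log, txn_id=42, slot=7,
                        after_image={"account_no": 1001, "balance": 250})
  # by contrast, a transaction that changed an index might have to log a
  # whole set of index pages in addition to the data record.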

when i did the original prototype DLM (distributed lock manager) for ha/cmp http://www.garlic.com/~lynn/subtopic.html#hacmp

minor related posting
http://www.garlic.com/~lynn/95.html#13

one of the things i worked out was how to do "fast commit" across a distributed set of machines (sharing the same disk pool).

a normal "fast commit" would involve writing the "after image" record (s) and the commit record to the log. Logs are written sequentially and for high performance may having either a high-speed sequenctial I/O device and/or a dedicated disk arm. database transactions however tend towards (possibly randomly) scattered records all over a set of disks. ACID/commit require fairly time-ordered activity. Disk arms can have much higher performance if i/o is order by disk position rather than time-sequence. "fast commit" leaves updated record in main memory cache, sequentially writes the log record ... and does lazy writes of the ("dirty") cache record to database "home position". The lazy writes of dirty cache records can be ordered by disk arm location (rather than transaction time sequence) potentially significantly improving disk thruput.

At the time, "fast commit" was crippled for distributed machines. If a lock (and record) migrated to a different machine ... there was a force write of the (dirty cache) record to its database home position and the "migrated to" machine read the record from disk. This preserves the simplified transaction record logging and recovery ... somewhat implicit in the original RDBMS implementations.

The DLM prototype work allowed a cache-to-cache copy of the database record (even dirty records), potentially piggybacked on the same I/O operation that migrated the specific lock. This avoided having to force a disk transit (out and back in again). The problem was that this severely complicated log recovery in case of a failure. In the single processor case, there could be a whole sequence of (committed) transactions against the same (dirty) record (none of which have shown up in the database home location for that record) ... all recorded sequentially in the same physical log. Recovery after a failure just requires sequentially reading the log and updating the home location for each corresponding record.
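
to contrast the two migration paths, a toy sketch (python; Node, the lock handling, and the message format are all invented for illustration):

  # illustrative contrast of the two lock/record migration paths in a
  # shared-disk pool; everything here is invented for the sketch.

  class Node:
      def __init__(self, name):
          self.name = name
          self.cache = {}     # disk_addr -> record image
          self.dirty = set()  # records not yet at their home position

  # original behavior: dirty record forced out to disk, then re-read
  # by the machine the lock migrated to
  def migrate_via_disk(owner, requester, disk_addr, shared_disk):
      if disk_addr in owner.dirty:
          shared_disk[disk_addr] = owner.cache[disk_addr]  # forced write
          owner.dirty.discard(disk_addr)
      requester.cache[disk_addr] = shared_disk[disk_addr]  # forced read

  # prototype behavior: the (possibly dirty) record image rides along
  # with the lock grant, so the disk round trip disappears
  def migrate_cache_to_cache(owner, requester, disk_addr):
      requester.cache[disk_addr] = owner.cache[disk_addr]
      if disk_addr in owner.dirty:
          owner.dirty.discard(disk_addr)
          requester.dirty.add(disk_addr)  # lazy-writeback duty moves too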

In a distributed "fast commit" recovery ... multiple different transactions to the same record may in exist in several different physical logs (none of which have shown up yet in the home database location). The recovery process is attempting to recreate an integrated global sequence from multiple different physical ... when the typical transaction rate may be much higher than any global clock resolution (i.e. simple time-stamping of each record in the log may not be sufficient since it is very expensive to maintain fine-grain clock syncronization across multiple real machines).

So the approaches written up in the literature tend to involve things like log versioning or log virtual time (some globally increasing value that allows recreating the operation sequence from multiple different physical logs).
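
a toy version of the "log virtual time" idea (python, names invented): every logged after-image carries a monotonically increasing version that travels with the record when it migrates, and recovery merges the per-machine logs on that value rather than on wall-clock time:

  import heapq

  # illustrative only: each machine's log entries carry a "virtual time"
  # that is bumped on every update and carried along when the record
  # migrates, so per-record ordering survives without synchronized clocks.

  def recover(per_machine_logs, database):
      # merge the physical logs into one stream ordered by virtual time
      merged = heapq.merge(*per_machine_logs, key=lambda e: e["vtime"])
      applied = {}  # disk_addr -> highest vtime already redone
      for entry in merged:
          if entry["type"] != "after_image":
              continue
          addr = entry["disk_addr"]
          if entry["vtime"] > applied.get(addr, -1):
              database[addr] = entry["record"]  # redo to home position
              applied[addr] = entry["vtime"]

  # example: the same record (addr 7) was updated on two different machines
  log_a = [{"type": "after_image", "disk_addr": 7, "vtime": 1, "record": "v1"}]
  log_b = [{"type": "after_image", "disk_addr": 7, "vtime": 2, "record": "v2"}]
  db = {}
  recover([log_a, log_b], db)
  assert db[7] == "v2"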

-- 
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
