Upgrade as Change in Meta-state

From: Kenneth Downs <firstinit.lastname_at_lastnameplusfam.net>
Date: Mon, 18 Oct 2004 14:09:43 -0400
Message-ID: <8t01lc.6oq.ln_at_mercury.downsfam.net>



Here is something I do not see discussed often, or else I have missed it.

Is there much interest in discussing the upgrade operation on an RDB?

Compare an upgrade to everyday use. In everyday production use, a database is going through a series of serializable changes to state, with the unspoken assumption that the meta-state, the description of allowed states, is itself unchanged.

Let us say that an upgrade is a change in meta-state, a change in the description and implementation of allowed states. In English, that would be new tables, columns, and constraints.

I would posit that the above definition supports my perpetual rant that biz rules should be stored as data, and DDL should be generated, never coded by hand. If the state is data, the meta-state is data.

OK, so all biz rules are in data. This means that an upgrade is described by a set of tables that describe the new state. This makes it childishly simple to do the following:

  1. validate the new state
  2. validate the state transition prior to upgrading (absolutely crucial)
  3. diff the states and generate only the necessary DDL
  4. self-document the operation, even possibly estimate time

Consider point #1. If the biz rules are in normalized data, then validation is achieved by well-known methods. However, if your upgrade is "trapped" in files that contain DDL, you know only that it was valid against the database it ran against during development and testing. You cannot prove ahead of time that it will run successfully against any particular customer database. Anything less than validated data is less efficient than it could be.

Consider point #2. Programmer X is going to push a feature into the smoke-test system. What nobody knows is that his DDL unconditionally drops and then creates the GONKULATORS table, with an integer ID. Programmer Y has already put a feature through in which the GONKULATORS table has a character ID column. Programmer X thinks he is *creating* a table, but he is actually *altering* a table. If the specs were in data and compared, this would be trapped as an ALTER COLUMN, and pretty durn strange one at that, programmers and technical managers could be alerted. (This by the way comes from real life, where I regularly saw programmers doing this to each other).

Consider point #3. Consider a system under active development, with a dozen or so programmers pushing changes through. Some of them are steadily altering several large tables. Customer X is being upgraded after 2 years. If we upgrade them with files containing DDL statements, one after another, the system may execute 10 separate ALTER TABLE ADD Column_name... commands against a huge table. If the system were described in data, a diff operation would detect the ten missing columns and add them in a single command.

Consider point #4. The tables that contain the before and after meta-state are diff'd to generate the DDL. The diff tables contain the precise description of the upgrade, *no* *guessing*. So Customer X from the prior example says, "Geez, what exactly am I getting after 2 years?" The most complete answer is to point to the web pages that expose the diff tables, anything less is hand-waving.

I can see no way to provide the 4 points unless the system is completely described in data.

-- 
Kenneth Downs
Use first initial plus last name at last name plus literal "fam.net" to
email me
Received on Mon Oct 18 2004 - 20:09:43 CEST

Original text of this message