Re: Upgrade as Change in Meta-state

From: Laconic2 <laconic2_at_comcast.net>
Date: Thu, 21 Oct 2004 10:21:32 -0400
Message-ID: <p4KdnX-jZIVqWOrcRVn-tQ_at_comcast.com>


"Kenneth Downs" <firstinit.lastname_at_lastnameplusfam.net> wrote in message news:8t01lc.6oq.ln_at_mercury.downsfam.net...
> Here is something I do not see discussed often, or else I have missed it.
>
> Is there much interest in discussing the upgrade operation on an RDB?

I think this topic is closely related to several other topics, ones that I often rant about.

In what follows, the "enterprise" is not the enterprise you work for, Kenneth. It's the enterprise whose data is inside the database.

One is the difference between an EDS (Encapsulated Data Store) and an IDS (Integration Data Store). I started a topic on this a few months ago. An integration data store is the property of the enterprise, and is managed, right down to the bit level, if necessary, in order to meet the goals of the enterprise. An encapsulated data store is inside something the enterprise has purchased, something like a software package. In the case of an EDS, the enterprise treats the entire capsule as an asset, to be used, upgraded, or retired, as a unit, rather than disassembled.

Now, if the enterprise has tens of thousands of dollars invested in a "capsule", and tens of millions of dollars invested in the data inside the capsule, and decides to retire the capsule without retiring the data inside it, they are going to want a decent export utility inside the capsule. At this point, a lot of enterprises find that they have been bamboozled by the software vendor's ultimate marketing plan: there is no export tool inside the capsule.

I call this the "Hotel California" marketing scheme: you can check out whenever you like, but you can never leave!

Everything I have to say on the subject you raised depends on whether we are talking about an EDS or an IDS.

>
> Compare an upgrade to everyday use. In everyday production use, a
> database is going through a series of serializable changes to state,
> with the unspoken assumption that the meta-state, the description of
> allowed states, is itself unchanged.

The same can be said of a program or a package, can't it?

>
> Let us say that an upgrade is a change in meta-state, a change in the
> description and implementation of allowed states. In English, that would
> be new tables, columns, and constraints.
>

The "user of the DBMS" has the right to create new tables, columns and constraints.
We need to define who "the user of the DBMS" is. This is harder than it sounds.

> I would posit that the above definition supports my perpetual rant
> that biz rules should be stored as data, and DDL should be generated,
> never coded by hand. If the state is data, the meta-state is data.

<rant>
Nothing should be done by hand! Cloth should be made by automated looms! Autos should be built by robots!
Cities should be designed by some sort of "Sim-City" package! Most of us don't need work... we need salaries!
Anything that can't be done by a machine should be transferred "offshore" (wherever that is)!
</rant>

Code should not be coded by hand. There should be a tool, maybe a CASE tool, as Tony suggested. The tool should capture the requirements in some sort of modeling language, perhaps graphic. Once all the requirements are captured, you should be able to just "click here", wait a while, and get all the Java, DML, and DDL you need.

And when your requirements change, you should just update the model, "click here" and have the diff engine go to work. It should spit out all the upgrades needed to the Java, the DML, the DDL, and the preexisting data!
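
For what it's worth, the "model as data" half of that is easy enough to picture. A minimal sketch, with all table and column names invented for illustration:

    -- The model, stored as ordinary rows rather than hand-coded DDL.
    CREATE TABLE model_tables (
        table_name   VARCHAR(64) PRIMARY KEY
    );
    CREATE TABLE model_columns (
        table_name   VARCHAR(64) NOT NULL REFERENCES model_tables,
        column_name  VARCHAR(64) NOT NULL,
        data_type    VARCHAR(32) NOT NULL,
        is_nullable  CHAR(1)     NOT NULL DEFAULT 'Y',
        PRIMARY KEY (table_name, column_name)
    );
    -- "Click here" then means: read these rows, emit the CREATE and
    -- ALTER statements, and diff this version of the model against
    -- the previous one.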

In the meantime, what shall we do?

>
> OK, so all biz rules are in data. This means that an upgrade is described
> by a set of tables that describe the new state. This makes it childishly
> simple to do the following:
>

This means that there are "biz meta-rules," doesn't it? Who codes the meta-rules?
If I buy from you, and I want to change the rules, great! But what if I
want to revise and extend the meta-rules?

You provided me with a "maximum salary" parameter, and it's childishly simple to update that parameter. Great! But now the Gubbermint shows up, and tells me that I have to implement a "maximum value of stock options" in addition to a maximum salary, and I have to do it within 90 days. I call up the sales person at your company, who says, "Gee, we never thought of that! Wait for the next release!" "When is that?" "Two years from now."
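
To make the distinction concrete, here is a hedged sketch, with all names invented for illustration. The anticipated rule is a one-row UPDATE; the unanticipated one needs new meta-state:

    -- The rule the vendor anticipated: childishly simple, as promised.
    UPDATE biz_parameters
       SET param_value = 250000
     WHERE param_name = 'MAX_SALARY';

    -- The rule the vendor never thought of: there is no row to update.
    -- You need a new parameter AND new enforcement, i.e. new meta-state.
    INSERT INTO biz_parameters (param_name, param_value)
    VALUES ('MAX_STOCK_OPTION_VALUE', 500000);
    ALTER TABLE employees
      ADD CONSTRAINT ck_stock_options
      CHECK (stock_option_value <= 500000);  -- and who generates this?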

Do you see the problem?

> 1) validate the new state
> 2) validate the state transition prior to upgrading (absolutely crucial)
> 3) diff the states and generate only the necessary DDL
> 4) self-document the operation, even possibly estimate time
>

You've just described a transaction, haven't you? It's atomic, consistent, isolated and durable, eh?

BTW, I fully support the idea that every DDL operation should take place in the context of a transaction.
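
In a DBMS with transactional DDL (PostgreSQL is one example; many others commit DDL implicitly), that already works today:

    BEGIN;
    ALTER TABLE gonkulators ADD COLUMN frob_count INTEGER;
    -- smoke-test the new column here...
    ROLLBACK;  -- the column is gone, as if nothing happened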

> Consider point #1. If the biz rules are in normalized data, then
> validation is achieved by well-known methods. However, if your upgrade
> is "trapped" in files that contain DDL, you know only that it was
> valid against the database it ran against during development and
> testing. You cannot prove ahead of time that it will run successfully
> against any particular customer database. Anything less than validated
> data is less efficient than it could be.
>
> Consider point #2. Programmer X is going to push a feature into the
> smoke-test system. What nobody knows is that his DDL unconditionally
> drops and then creates the GONKULATORS table, with an integer ID.
> Programmer Y has already put a feature through in which the
> GONKULATORS table has a character ID column. Programmer X thinks he is
> *creating* a table, but he is actually *altering* a table. If the
> specs were in data and compared, this would be trapped as an ALTER
> COLUMN, and a pretty durn strange one at that, and programmers and
> technical managers could be alerted. (This, by the way, comes from
> real life, where I regularly saw programmers doing this to each
> other.)
>
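
If the specs lived in data, the X-versus-Y collision above is one query away. A hedged sketch, assuming a hypothetical spec_columns table holding the proposed meta-state alongside the standard INFORMATION_SCHEMA:

    -- Flag columns whose proposed type disagrees with the live catalog.
    SELECT s.table_name, s.column_name,
           s.data_type AS proposed_type,
           c.data_type AS current_type
      FROM spec_columns s
      JOIN information_schema.columns c
        ON  c.table_name  = s.table_name
        AND c.column_name = s.column_name
     WHERE c.data_type <> s.data_type;  -- GONKULATORS.ID pops out here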

> Consider point #3. Consider a system under active development, with a
> dozen or so programmers pushing changes through. Some of them are
> steadily altering several large tables. Customer X is being upgraded
> after 2 years. If we upgrade them with files containing DDL
> statements, one after another, the system may execute 10 separate
> ALTER TABLE ADD Column_name... commands against a huge table. If the
> system were described in data, a diff operation would detect the ten
> missing columns and add them in a single command.

I'm going to start a new thread on "creating the DB-DIFF engine"; it's more skunk works.
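
Until then, the missing-column half of the diff is not hard to sketch, again assuming a hypothetical spec_columns table:

    -- Columns in the spec that the live database lacks. Feed the result
    -- into one ALTER TABLE ... ADD COLUMN ..., ADD COLUMN ... statement
    -- per table, instead of ten separate ALTERs against a huge table.
    SELECT s.table_name, s.column_name, s.data_type
      FROM spec_columns s
      LEFT JOIN information_schema.columns c
        ON  c.table_name  = s.table_name
        AND c.column_name = s.column_name
     WHERE c.column_name IS NULL
     ORDER BY s.table_name, s.column_name;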

>
> Consider point #4. The tables that contain the before and after
> meta-state are diff'd to generate the DDL. The diff tables contain the
> precise description of the upgrade, *no* *guessing*. So Customer X
> from the prior example says, "Geez, what exactly am I getting after 2
> years?" The most complete answer is to point to the web pages that
> expose the diff tables; anything less is hand-waving.
>
> I can see no way to provide the 4 points unless the system is completely
> described in data.

What's the methodology at this shop? The "Mogadishu Management Model"? You can't do DDL unless you understand the impact on the work of ALL the programmers, can you?

I'm with Tony on this one. This would never fly at my shop.
