Schema changes (was: Discovering new relationships)

From: mAsterdam <mAsterdam_at_vrijdag.org>
Date: Thu, 08 Mar 2007 21:31:14 +0100
Message-ID: <45f07221$0$339$e4fe514c_at_news.xs4all.nl>


Marshall wrote:
> Walt wrote:

>> This can be one of the largest issues for "large, shared databases" going
>> forward.  On the one hand, you want flexibility,  so that requirements that
>> didn't go into the logical model of prior versions can be accommodated.  On
>> the other hand you want extreme backward compatibility, so that the
>> longevity of the database can be appropriately exploited.  The sweet spot on
>> the trade-off is difficult to find.

>
> "... so that the longevity of the database can be appropriately
> exploited"
> to me means that existing software continues to work. Software
> contains embedded assumptions about schema that may become
> stale, and it would be nice if the software continued to work even
> if the schema changes. In other words, decouple releases of
> the schema and releases of the software.

This gets away from the original topic, so I changed the subject line.

> One approach to this problem is simply not to do it. Couple to
> software and the database and change both at once. This
> is only practical if you control both the software and the
> database. It sounds bad when you say it but it's not as
> bad in practice as it sounds.

Reasons for a schema change:
Changes in the information needs and bugfixes.

> The theoretical approach that I usually hear mentioned here
> is views. I have no direct experience with them so I don't
> have an opinion on how well they work in practice. But the
> idea is certainly appealing.
>
> Another approach that I have in mind but that I never
> hear discussed is to exploit the relational algebra to
> enable active negotiation of views between software
> and dbms. There are cases where differences between
> client software and the dbms's current schema are
> obvious. I have no idea of the limits of this technique;
> obviously it can't handle everything because there are
> changes that are simply incompatible.
>
> It would be interesting to have some data on what
> kinds of changes were most common. I'd guess
> "alter table add column" would be in first position.
>
> It would also be interesting to have some theoretical
> framework for classifying schema changes. Anyone?

No theoretical framework, just scenario.

Barring bugfixes, either

	we want to know something about facts we didn't
	care about previously enough to justify the effort
	to capture, maintain and report them
or
	we don't care anymore about facts we used to care
	about etc.

Let's drop a now irrelevant table or column and add a now relevant one.

In generic CRUD frameworks there is no impact to the software, only to the user experience: Where has that table/column gone? And: Hey! A new one.

Consider the drop.
Without views all impacted applications immediately need adjustment.

Tricks may buy some time.
Views that mask the drop by returning literal values instead of maintained data only protect the read-only (as far as the dropped table or column is concerned) applications. The rest would need an 'on update/delete do nothing' trigger.

Even so, the functionality of the application is based on the now stale assumption that it is dealing with live, relevant data.
When the necessary adaptations aren't applied the application itself becomes (partly) irrelevant or worse, it lies.

Now consider the addition of a now relevant table or column. Other than the generic CRUD stuff: Yes, select * from foo is bad. Duh.

More importantly, the initial population must be provided, and the capture, maintenance and reporting must be developed and organized. Received on Thu Mar 08 2007 - 21:31:14 CET

Original text of this message