Re: Discovering new relationships

From: Walt <wamitty_at_verizon.net>
Date: Thu, 08 Mar 2007 15:51:40 GMT
Message-ID: <giWHh.13205$Tf.9955_at_trndny03>


"Marshall" <marshall.spight_at_gmail.com> wrote in message news:1173367826.938515.302300_at_30g2000cwc.googlegroups.com...
> On Mar 8, 6:18 am, "Walt" <wami..._at_verizon.net> wrote:
> >
> > This can be one of the largest issues for "large, shared databases"
> > going forward. On the one hand, you want flexibility, so that
> > requirements that didn't go into the logical model of prior versions
> > can be accommodated. On the other hand, you want extreme backward
> > compatibility, so that the longevity of the database can be
> > appropriately exploited. The sweet spot on the trade-off is difficult
> > to find.
>
> "... so that the longevity of the database can be appropriately
> exploited"
> to me means that existing software continues to work. Software
> contains embedded assumptions about schema that may become
> stale, and it would be nice if the software continued to work even
> if the schema changes. In other words, decouple releases of
> the schema and releases of the software.
>

As long as "continues to work" implies "continues to deliver correct results", agreed.

> One approach to this problem is simply not to do it. Couple to
> software and the database and change both at once. This
> is only practical if you control both the software and the
> database. It sounds bad when you say it but it's not as
> bad in practice as it sounds.

This severely hobbles an enterprise-wide database. If you have hundreds of applications accessing data in an enterprise-wide database, synchronizing database change management with application change management is simply not feasible. If you sell an application with a database in it, and your clients use the database to drive their own software, you face the same problem.

Incidentally, an "enterprise-wide database" could mean a distributed database running under heterogeneous DBMS's, themselves running under heterogeneous OS's.

>
> The theoretical approach that I usually hear mentioned here
> is views. I have no direct experience with them so I don't
> have an opinion on how well they work in practice. But the
> idea is certainly appealing.

More generally, the concept is "logical data independence".
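A minimal sketch of what logical data independence via views buys you, using Python's sqlite3 and a hypothetical "emp" table that a new schema version splits in two. (The table and column names here are invented for illustration.) A view with the old table's name and shape keeps unmodified legacy queries working against the new logical schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# New schema: the old "emp" table has been split into two base tables.
conn.execute("CREATE TABLE emp_core (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE emp_assign (id INTEGER, dept TEXT)")
conn.execute("INSERT INTO emp_core VALUES (1, 'Smith')")
conn.execute("INSERT INTO emp_assign VALUES (1, 'Sales')")

# A view reconstructs the old table's name and columns.
conn.execute("""
    CREATE VIEW emp AS
    SELECT c.id, c.name, a.dept
    FROM emp_core c JOIN emp_assign a ON c.id = a.id
""")

# An unmodified legacy query still delivers correct results.
rows = conn.execute("SELECT name, dept FROM emp").fetchall()
print(rows)  # [('Smith', 'Sales')]
```

The schema release changed; the software release did not, which is exactly the decoupling Marshall describes above.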
>
> Another approach that I have in mind but that I never
> hear discussed is to exploit the relational algebra to
> enable active negotiation of views between software
> and dbms. There are cases where differences between
> client software and the dbms's current schema are
> obvious. I have no idea of the limits of this technique;
> obviously it can't handle everything because there are
> changes that are simply incompatible.
>
> It would be interesting to have some data on what
> kinds of changes were most common. I'd guess
> "alter table add column" would be in first position.
>

My guess is that "create table" is in first position. Any application that gets broken by a "create table" would have to be intentionally perverse.

"alter table add column" is right up there. As I've discussed with you a long time ago, a view that's defined as "select * from ..." can be implemented in one of two ways: You can expand the wildcard at view definition time, or you can expand the definition at view reference time. One of these ways is logically independent of "alter table add column", and the other is not.

One way makes "create script change management" a little more complex. Think about it.

> It would also be interesting to have some theoretical
> framework for classifying schema changes. Anyone?
>
>
I'm going to suggest one categorization:

Changes made in response to changing database requirements, and changes made for other reasons. One would like all of the changes made for other reasons to be changes at the physical level only, thus protecting all software that enjoys physical data independence.
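To illustrate the second category: adding an index is a purely physical-level change, so software with physical data independence sees identical results before and after. A small sketch in Python's sqlite3, with a hypothetical "orders" table invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 9.5), (2, 20.0), (3, 15.25)])

query = "SELECT id FROM orders WHERE total > 10 ORDER BY id"
before = conn.execute(query).fetchall()

# A physical-level change: the logical schema is untouched.
conn.execute("CREATE INDEX idx_orders_total ON orders (total)")
after = conn.execute(query).fetchall()

print(before == after)  # True -- the logical answer is unchanged
```

Only the access path changed; the query, the schema the application sees, and the results are all the same.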
