Re: OO versus RDB

From: <frebe73_at_gmail.com>
Date: 29 Jun 2006 10:23:21 -0700
Message-ID: <1151601801.398616.13380_at_p79g2000cwp.googlegroups.com>


> > If the semantics of the old column "salary" are the same as the new
> > column "baseSalary", why on earth would you need to change the name? If
> > the name is not changed, no existing SQL statements would break. Only
> > the applications that needs the new "burdenedSalary", would need to be
> > rewritten.
>
> To capture the correct semantics of the attribute. In a context where
> one must distinguish between different flavors of salary, the original
> name was poorly chosen and the DBA fixed that. The attribute name is
> supposed to reflect the attribute semantics. Not changing it just leads
> to opportunities for new applications to screw up later.

Nothing forces you to change the column name. The solution will work with the old name. Would you change the name of the corresponding object property too, and break the interfaces? I don't think so. Changing column names is not a relevant motivation for decoupling SQL statements.

> However, that isn't the point. There are lots of ways to modify schemas
> without affecting individual attribute semantics.

Please give some more examples.

> For example, I can
> easily construct an example where the attribute is simply moved to
> another table (e.g., to a new [EmployeeSalaryInfo] table, which is a
> child of [Employee]). The application query would still be broken.

I assume EmployeeSalaryInfo and Employee has the same primary key. That is a totally unnecessary change. Nothing forces you to do that kind of change.

> > In what way was the schema denormalized? What did the stored procedure
> > do?
>
> As stated, the burdenedSalary attribute's value is dependent on other
> non-key attributes of Employee. That violates 3NF, which requires that
> non-key attributes be dependent only on the tuple key.

What attributes? Besides, the only case denormalization should be use (if ever), is in datawarehouse solutions.

> The stored procedure ensures that burdenedSalary is updated whenever any
> of the attributes it depends upon is modified. That is necessary for
> dependent attributes to ensure data integrity outside the scope of the
> update transaction.

This is a good motivation why denormalization in non-readonly databases must be avoided.

> >>This sort of situation is actually rather common at the enterprise
> >>level.
> >
> > Changing names of columns but keeping the sematics is extremly rare,
> > just because it breaks the interface to the applications.
>
> So do all changes to enterprise schemas, which is why /any/ change to
> such a schema is a big deal. The DBA already had to change the schema
> to provide burdenedSalary and chose that opportunity to clean up the
> semantics.

Adding burdenedSalary does not break any existing SQL statement. If he is stupid enough to change the name of an existing column, which will break existing statements, he should be fired.

> >>When requirements changes trigger changes to the
> >>schema, those requirements typically only affect some of the reasons the
> >>applications use the data. So only a fraction of the applications that
> >>use the data are affected by the change and have to be modified for the
> >>requirements changes.
> >
> > In your example, only the applications that need the new column would
> > need to change. All other remains unchanged.
>
> No, in my example any application that accesses 'salary' must have the
> query construction code modified even though they don't access the new
> attribute.

That's why you should not change the name of an existing column if not the semantics change. It is the same reason why Sun does not change the name of the put method in java.util.Map. The interface will be broken.

Please give me a better example.

Fredrik Bertilsson
http://frebe.php0h.com Received on Thu Jun 29 2006 - 19:23:21 CEST

Original text of this message