Re: OO versus RDB

From: H. S. Lahman <h.lahman_at_verizon.net>
Date: Sat, 01 Jul 2006 15:39:45 GMT
Message-ID: <5Hwpg.167$8U1.153_at_trndny07>


Responding to Erk...

>>>If the semantics of the old column "salary" are the same as the new
>>>column "baseSalary", why on earth would you need to change the name?
>>
>>To capture the correct semantics of the attribute.  In a context where
>>one must distinguish between different flavors of salary, the original
>>name was poorly chosen and the DBA fixed that.

>
>
> So you have a poor design. You can still create a view adding the old
> column name as a derived attribute. For those apps that care only about
> base salary (nee salary), there's no change. I have no clue how O-O
> would help in this situation.

Not necessarily. The postulated situation was a maintenance change due to new requirements. The original design might have been fine for the original business context.

I don't think this has anything to do with OO. I was simply responding to the assertion that when schemas change the application /solution/ is necessarily affected. Changes like this can occur for reasons that do not affect the application's solution logic at all. That assertion, in turn, was being used to refute my advocacy of encapsulating data access in client applications that are outside the CRUD/USER realm. So I needed to demonstrate that such changes do occur.

>>The attribute name is
>>supposed to reflect the attribute semantics.  Not changing it just leads
>>to opportunities for new applications to screw up later.

>
>
> Agreed, so you fix it. A view helps make the migration gradual, if you
> need it.

Talk to Bertilsson. B-) His position depends on this never happening.

However, this was nothing more than a simple example to demonstrate that schema changes exist that break the application's queries without affecting the application's use of the data.

>>However, that isn't the point.  There are lots of ways to modify schemas
>>without affecting individual attribute semantics.  For example, I can
>>easily construct an example where the attribute is simply moved to
>>another table (e.g., to a new [EmployeeSalaryInfo] table, which is a
>>child of [Employee]).  The application query would still be broken.

>
>
> In a real DBMS, there's no difference between a view and a table.
> What's the problem?

I don't think it has anything to do with views vs. tables. One must understand the schema to construct a query. Since queries are constructed in the application to access the data needed by the problem solution, that means that the application's construction of the query must change. The same would be true for changing a DBMS view that the query utilized. Therefore the application itself must change in either case.

However, the problem solution still eats exactly the same data and manipulates it in exactly the same way. So the problem solution is unaffected by the change in the way the data is accessed.

That brings us full circle to my original point when entering this thread: change management on the application side. That is, on the application side one should isolate the data access mechanisms, including DBMS views, to ensure that any changes to them will not affect the problem solution logic. IOW, one decouples the data storage and access mechanisms from the problem solution solution logic via modularization.

>>As stated, the burdenedSalary attribute's value is dependent on other
>>non-key attributes of Employee.  That violates 3NF, which requires that
>>non-key attributes be dependent only on the tuple key.

>
>
> Then it's a calculation to be done in a view. It's not an attribute in
> a base relvar / table.

The reasons that DBAs deliberately denormalize are typically related to physical access performance bottlenecks. One might provide a dependent attribute like burdenedSalary to replace the per-query overhead of the view computation with a per-update computation. That would usually only be justified if it could be demonstrated that the dependent attribute was accessed far more often than the independent attributes were updated.

>>The stored procedure ensures that burdenedSalary is updated whenever any
>>of the attributes it depends upon is modified.  That is necessary for
>>dependent attributes to ensure data integrity outside the scope of the
>>update transaction.

>
>
> There's no integrity concern if you don't store the derived value.

True. But dependent attributes nonetheless are used for valid reasons.

>>>Changing names of columns but keeping the sematics is extremly rare,
>>>just because it breaks the interface to the applications.
>>
>>So do all changes to enterprise schemas, which is why /any/ change to
>>such a schema is a big deal.  The DBA already had to change the schema
>>to provide burdenedSalary and chose that opportunity to clean up the
>>semantics.

>
>

> And that's a fine thing to do, but applications will be affected. I
> don't see O-O helping at all with this.

As I indicated above, this subthread has nothing to do with OO. My entry was related to client application partitioning, which is relevant to any application, OO or not.



There is nothing wrong with me that could not be cured by a capful of Drano.

H. S. Lahman
hsl_at_pathfindermda.com
Pathfinder Solutions -- Put MDA to Work http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman Pathfinder is hiring:
http://www.pathfindermda.com/about_us/careers_pos3.php. (888)OOA-PATH Received on Sat Jul 01 2006 - 17:39:45 CEST

Original text of this message