Re: Mixing OO and DB

From: frebe <frebe73_at_gmail.com>
Date: Wed, 19 Mar 2008 22:21:19 -0700 (PDT)
Message-ID: <6f7f7fe1-c959-4c46-a3a1-fcbfbd4f2721_at_a1g2000hsb.googlegroups.com>


On 19 Mar, 22:31, Patrick May <p..._at_spe.com> wrote:
> frebe <freb..._at_gmail.com> writes:

> > On 16 Mar, 18:20, Patrick May <p..._at_spe.com> wrote:
> >> "Brian Selzer" <br..._at_selzer-software.com> writes:
> >> > "Patrick May" <p..._at_spe.com> wrote in messagenews:m2wso9i9be.fsf_at_spe.com...>> > I'll
> >> > buy that a schema can be part of multiple applications, but that
> >> > there can be multiple ways to represent the same information does
> >> > not alter what information is to be and can be recorded.
>
> >>      We seem to be in agreement that different specific schemas can
> >> provide access to the same underlying information.  That suggests
> >> that your statement that ". . . the schema is an integral part of
> >> the application specification, and it cannot be decoupled . . ."
> >> needs clarification.  Would you agree with the formulation "The
> >> logical schema is an integral part of the application
> >> specification."?  By "logical schema" I mean the implementation
> >> independent set of data that supports the application.
>
> > The logical schema is the relations (base relations - tables, and
> > derived relations - views). But the fact whether the relations is a
> > table or view, is not part of the logical schema. That is a physical
> > detail.
>
>      That's fine, but splitting hairs from the application
> implementation point of view.  The application implementation sees a
> particular schema specification (tables, views, or some combination).

But it is still a particular logical schema.

> >>      If you agree with this, the second half of your claim
> >> ". . . and it cannot be decoupled" is clearly incorrect because the
> >> application implementation deals with the physical schema.
>
> > Wrong. The applications deals with the logical schema. It has no
> > idea if it is accessing a table or a view.
>
>      There are multiple possible sets of tables and views that can
> provide the same information.  Therefore Mr. Selzer's claim that the
> application implementation and the specific schema it uses cannot be
> decoupled is incorrect.  The specific schema is not "integral" to the
> application implementation.

Previously you have been invited to demonstrate how a general ledger application can be decoupled from its general ledger schema. Unless you accept this invitation, we have to assume this claim to be unsupported.

> >> Since you agree that multiple different physical schemas are
> >> possible, decoupling the application from any particular set of
> >> those is both possible and good design.
>
> > Multiple logical schemas are also possible. The decoupling is done
> > using views.
>
>      That's one way.  Non-relational languages provide others.

So we have an agreement that using views is a valid solution for the problem?

> > I thought that Brians examples with coordinates was pretty obvious,
> > but maybe some futher clarification is needed. Suppose you have a
> > table for Cartesian coordinates. Now you have an application that
> > needs to work with polar coordinates instead. The solution is to
> > create an updateable view for polar coordinates, which derives the
> > data from the original table. Another solution could be to create a
> > new table using polar coordinates and creating a view (with the same
> > name as the original table) providing the Caresian coordinates. The
> > logical schema will now contain two relations for the same data in
> > different format.
>
>      Alternatively, the data could be loaded into an object that does
> the conversion on demand.  The rest of the application wouldn't need
> to know about the database or the specific schema at all.

And the benefit with that is? I will give a short example of why I think views are a better solution than wrapping every SQL statement in a method.

Lets take Martins famous find_employees_eligible_for_retirment example. We have two solutions, one is to create a method.

find_employees_eligible_for_retirment()
{
  return query("select id, name, birthdate from employee where yeardiff(now(), birthdate) > 60)");
}

The corresponding view would look like:

create view employee_eligible_for_retirement as select id from employee where yeardiff(now(), birthdate) > 60

In the first example the set of attributes is fixed (id, name, birthdate), and the only way to get extra attributes (or join extra tables) is to make another database call for every employee.

Using a view you could easily join extra attributes without making extra database round-trips and have superior performance.

select e.id, e.salary
from employee_eligible_for_retirement r join employee e on e.id=r.id

Lets say you want to se how many employees in a specific department that are eligible for retirement, you could easily write another SQL query.

select count(*)
from employee_eligible_for_retirement r join employee e on e.id=r.id where e.department=?

Using your method, you would have to write a new method for every single SQL statement, duplicating the eligible_for_retirement business rule all over you "persistence layer". The other option would be to use the original find_employees_eligible_for_retirment method and performing post-processing having terrible performance.

> >>      Further, the view that the schema is integral to the
> >> application is very data centric.  Different applications may need
> >> the data in different forms, not all of which are relational.
>
> > On the logical level, virtually any data can be represented in a
> > relational form.
>
>      On the logical level, all Turing complete languages are equally
> powerful.  For particular applications, some are far more expressive
> and usable than others.  Similarly, the relational model is not always
> the most expressive for a particular application.

Can you give some examples of data that doesn't fit for the relational model on the logical level?

> >> There is therefore a need to translate between the data structures,
> >> which is another good reason to decouple the application from the
> >> specific physical schema being used.
>
> > Since the application is not dealing with the physical schema, it is
> > not able to do that translations. It is already done in the RDBMS.
>
>      That is not correct.  The application deals with a specific
> schema with data represented in the relational model.  That is not
> always the format required by the application.

What is incorrect? The fact the the applications access the logical schema? Do you claim that the application access the physical layer in the RDBMS directly. I quess some DBMS products allow you to do that, but if you are using SQL as access method, you are only dealing with the logical schema. (Some SQL dialects allow you to do different hints about index usage, which is some kind of back-door to the physical layer).

//frebe Received on Thu Mar 20 2008 - 06:21:19 CET

Original text of this message