Re: Mixing OO and DB

From: Patrick May <pjm_at_spe.com>
Date: Fri, 21 Mar 2008 18:14:35 -0400
Message-ID: <m2k5jvohbo.fsf_at_spe.com>


"Brian Selzer" <brian_at_selzer-software.com> writes:

> "Patrick May" <pjm_at_spe.com> wrote in message news:m2eja6belp.fsf_at_spe.com...

>>>> If you agree with this, the second half of your claim
>>>> ". . . and it cannot be decoupled" is clearly incorrect because
>>>> the application implementation deals with the physical schema.
>>>> Since you agree that multiple different physical schemas are
>>>> possible, decoupling the application from any particular set of
>>>> those is both possible and good design.
>>>
>>> No, it isn't incorrect. Whenever a schema evolves, views are
>>> often used to provide backward compatibility for existing
>>> applications. An application need not concern itself with whether
>>> it is accessing a view or a table; therefore it is not necessarily
>>> tied to any physical schema.
>>
>> The fact that views can be used actually demonstrates that the
>> application can be decoupled from the schema. You are suggesting
>> using views to do so. That's one possible mechanism. OO languages
>> provide others.
>
> You have a very narrow and limited view of what a schema is and what
> it can provide.

     You have a deft hand with non sequiturs.


>> Even when views are used, the application should be decoupled
>> from the schema because the two models are often very different.
>> Applications can organize information in ways other than the
>> relational model.
>
> I just don't buy this.  If the information is the same, but just
> organized differently, then there must exist a transposition between
> them.  Each is then just a different possible representation of the
> same information.

     Some representations are more expressive in terms of the problem
or solution domain. Tuples are not always the optimal data structure.
> If the transposition is done by the DBMS, then it can retain its
> responsibility for guaranteeing integrity.  If the transposition is
> done by the application, then that responsibility may need to shift
> from the DBMS to the application--every application.  Now you have
> to guarantee that the code that is used to access the information is
> identical in every application that uses the information

     If that is a requirement, it's a good argument for a shared
mapping layer or other decoupling mechanism. In fact, though, different applications often need different representations of different subsets of the data available in a relational database, plus data that is only used within the application. Because the application has a different, non-relational model of the data, decoupling is good design.

> --AND, you have to prevent ad-hoc access to the data.

     Why? It's certainly easier to maintain the integrity of the database if you can, but many systems support multiple applications and ad-hoc interaction with the underlying database. That's what locking and other concurrency techniques are for.

> In addition, you have to manage concurrency in every application.

     Yes.

> Keep it simple, stupid: let the DBMS do what it is designed to do.
> Why reinvent the wheel?  The people who built the DBMS are probably
> a lot smarter than you, or at least know more about how to persist
> information than you, since that is their focus.

     A relational database is a very generic technology.  An
application is much more specific and can therefore take advantage of less general types and data structures that improve the performance and maintainability of the application code. Except for CRUD systems, the database vendors can't address those problem domains in a generic way.

>>>> Further, the view that the schema is integral to the
>>>> application is very data centric. Different applications may
>>>> need the data in different forms, not all of which are
>>>> relational. There is therefore a need to translate between the
>>>> data structures, which is another good reason to decouple the
>>>> application from the specific physical schema being used.
>>>
>>> If by data centric you mean that the information that is to be and
>>> can be recorded must be specified before even considering how that
>>> information may behave, then I agree: it is a data centric view.
>>
>> It is also possible to define a system in terms of behavior and
>> only decide on a particular data representation once those
>> behaviors are designed. In practice, both approaches are typically
>> used.

>
> How can you possibly design a system in terms of the behavior of
> objects if you haven't first specified which objects are
> interesting?

     You focus on the behaviors of interest and partition those
behaviors into cohesive units of classes and modules.

>>> But again, you're laboring under the delusion that an application
>>> must be tied to a physical schema.
>>
>> I'm under no such delusion. An application implementation that
>> uses data from a relational databas must deal with a particular
>> schema. Since the two change for different reasons and since the
>> models are often different, decoupling them is good design.

>
> Whenever there is a change in potential information content, that
> change may involve potential information that an application can
> access or manipulate, or potential information that an application
> doesn't access.

     Fallacy of the excluded middle.  The change may also be in how
the information is modeled by either the application implementation or the specific schema being used by the application. One option is to use views to isolate the two. Another option is to decouple the two components (application implementation and specific schema) so that changes in one do not impact the other.

>>>>>> Even if an application uses the database system's
>>>>>> capabilities to implement some application functionality, there
>>>>>> are still changes to the underlying schema that do not, or
>>>>>> should not, require change to the applications that use that
>>>>>> schema. This is why approaches like dependency inversion are
>>>>>> useful. The application depends on an interface and the
>>>>>> combination of database schema and any logic running in the
>>>>>> database implements that interface. Either can change without
>>>>>> impacting the other.
>>>>>
>>>>> The database contains that which can be manipulated. An
>>>>> application is that which does the manipulating. These are
>>>>> completely different species of functionality.
>>>>
>>>> The database sometimes contains some of that which can be
>>>> manipulated.
>>>
>>> I don't follow you. Please elaborate.
>>
>> Not all data used by an application needs to be in the database.

>
> I thought that we were discussing information that is to be and can
> be recorded.  Such information needs to be in the database.

     That depends on how long it needs to remain available and if it
needs to be accessed by other clients of the database. I often work on systems where a considerable portion of the information is stored in a distributed shared object repository, in memory. You could consider that a form of database, but it doesn't use a relational model.

Sincerely,

Patrick



S P Engineering, Inc. | Large scale, mission-critical, distributed OO
                       | systems design and implementation.
          pjm_at_spe.com  | (C++, Java, Common Lisp, Jini, middleware, SOA)
Received on Fri Mar 21 2008 - 23:14:35 CET

Original text of this message