Re: Mixing OO and DB

From: Patrick May <pjm_at_spe.com>
Date: Wed, 26 Mar 2008 10:42:02 -0400
Message-ID: <m27ifp5yyt.fsf_at_spe.com>


"Brian Selzer" <brian_at_selzer-software.com> writes:
> "Patrick May" <pjm_at_spe.com> wrote in message news:m2bq534qzb.fsf_at_spe.com...
>> I don't disagree, but it's still a non sequitur. You claimed
>> that it isn't possible to decouple the application implementation
>> from the specific schema. That is clearly incorrect because the
>> same internal representation used by the application can be
>> supported by more than one specific database schema, as you
>> describe here. If the specific schema is encapsulated such that
>> the application is decoupled from it, you can change the specific
>> schema without impact to the application.
>
> Not exactly. I claimed that it isn't possible to decouple the
> application from the schema. I believe I said, and I think you
> agreed, that a schema specifies what is to be and can be recorded,
> and it is in that sense that it cannot be separated from the
> application.

     Ah, you may have identified the source of our miscommunication here. We seem to be using "schema" differently, despite both of us repeatedly attempting to clarify. Let's consider first an application that doesn't use a relational database at all. This application still has a schema in the sense of the data that it uses internally to support the behaviors it exhibits. Are we in agreement so far?

     The implementation of this internal schema may change. For a simple example, an array may be changed to a queue or a stack. In this case the same data is being stored, but the behavior of the data structure is different. A more complex example would be changing a data member to a computation or vice versa. In these scenarios, the application logic cannot be decoupled from the schema representation.
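
     To make both examples concrete, here is a minimal sketch in Python. Every name in it (StoredTotal, ComputedTotal, and so on) is hypothetical and chosen only for illustration. The first half shows the same three items held in a stack and in a queue; the second half shows a total kept as a data member versus a total that is computed on demand.

```python
from collections import deque

# Same data, different behavior: the three items are identical,
# but the order in which they come back out depends on the structure.
items = ["a", "b", "c"]

stack = list(items)     # LIFO: last item in is first out
queue = deque(items)    # FIFO: first item in is first out

stack_order = [stack.pop() for _ in range(len(items))]
queue_order = [queue.popleft() for _ in range(len(items))]


class StoredTotal:
    """Hypothetical class that keeps the total as a data member."""

    def __init__(self, prices):
        self.prices = list(prices)
        self.total = sum(self.prices)   # stored once at construction


class ComputedTotal:
    """Same information, but the total is now a computation."""

    def __init__(self, prices):
        self.prices = list(prices)

    @property
    def total(self):
        return sum(self.prices)         # recomputed on every access
```

     Code that relies on the order in which items come out, or on whether the total tracks later changes to the prices, depends on which implementation was chosen.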

     Now consider the same application modified to use a relational database. The application logic continues to use the same data structures (stacks, queues, DAGs, etc.) and classes (we'll assume it's an OO application, since this is comp.object) it was using before. The implementation of the schema in the relational database presents a particular interface. This interface is based on relations. Some form of mapping must take place to convert the data provided via that interface into the model used internally by the application.

     This mapping mechanism is exactly where the application implementation can be decoupled from the specific schema interface, typically through the use of something like the Dependency Inversion Principle.
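
     A rough sketch of that mapping layer follows, using Python and an in-memory SQLite database. It is an illustration under stated assumptions, not code from any real system: the Customer class, the CustomerSource abstraction, both mapper classes, and both table layouts are invented for this example. The application logic (greeting) depends only on the abstraction it owns, and each mapper adapts one specific schema to it.

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """The application's internal model (hypothetical)."""
    customer_id: int
    full_name: str


class CustomerSource(ABC):
    """Abstraction owned by the application; mappers implement it."""

    @abstractmethod
    def find(self, customer_id: int) -> Customer: ...


class SingleTableSource(CustomerSource):
    """Maps a schema that stores the whole name in one column."""

    def __init__(self, conn):
        self.conn = conn

    def find(self, customer_id):
        row = self.conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(row[0], row[1])


class SplitNameSource(CustomerSource):
    """Maps a different schema that splits the name across two tables."""

    def __init__(self, conn):
        self.conn = conn

    def find(self, customer_id):
        row = self.conn.execute(
            "SELECT p.id, n.first_name || ' ' || n.last_name "
            "FROM person p JOIN person_name n ON n.person_id = p.id "
            "WHERE p.id = ?",
            (customer_id,),
        ).fetchone()
        return Customer(row[0], row[1])


def greeting(source: CustomerSource, customer_id: int) -> str:
    """Application logic: knows only Customer and CustomerSource."""
    return "Hello, " + source.find(customer_id).full_name


def make_single_table_db():
    """Build the first hypothetical schema with one sample row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customer VALUES (1, 'Ada Lovelace')")
    return conn


def make_split_name_db():
    """Build the second hypothetical schema holding the same information."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE person_name "
        "(person_id INTEGER, first_name TEXT, last_name TEXT)"
    )
    conn.execute("INSERT INTO person VALUES (1)")
    conn.execute("INSERT INTO person_name VALUES (1, 'Ada', 'Lovelace')")
    return conn
```

     Calling greeting(SingleTableSource(make_single_table_db()), 1) and greeting(SplitNameSource(make_split_name_db()), 1) gives the same result, so swapping one schema for the other touches only the mapper, not the application logic.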

>>>> If that is a requirement, it's a good argument for a shared
>>>> mapping layer or other decoupling mechanism. In fact, though,
>>>> different applications often need different representations of
>>>> different subsets of the data available in a relational database,
>>>> plus data that is only used within the application. Because the
>>>> application has a different, non-relational model of the data,
>>>> decoupling is good design.
>>>
>>> And what model is that? Is OO a data model?
>>
>> It could be anything from a simple stack to a DAG to a full
>> object graph. Internally, the application isn't often using
>> tuples.
>
> Data structures are not data models.

     They are implementations of data models and they can be decoupled from other implementations of data models via DIP and other techniques.

>>> If information is held in the memory of some application and is
>>> also in the database, and if the copy in memory changes, then the
>>> copy in the database is stale, and any query against the database
>>> must be considered suspect.
>>
>> True. This is a standard problem in large distributed systems.
>> There are many techniques for dealing with it that don't require a
>> centralized database. That's not to say that a centralized
>> database is never a good solution, it's just not always the best
>> solution.
>
> So what does that have to do with allowing ad-hoc access?

     Ad-hoc access is just one other concurrent use of the database.

> You missed my point altogether: an application can take advantage of
> less general types and data structures that improve the performance
> and maintainability of the code until another application needs to
> use the data. Then the performance and maintainability improvements
> will very likely go right out the window.

     That's not necessarily the case. Multiple applications, with solution-specific types and data structures, use shared databases concurrently in the vast majority of enterprise systems.
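
     As a small illustration, here is a sketch of two hypothetical applications reading the same shared table into structures that suit their own needs. The task table, the scheduler, and the reporting tool are all invented for this example, and the sketch deliberately ignores concurrency control and cache staleness, which are separate problems.

```python
import heapq
import sqlite3

# One shared table (hypothetical schema and sample rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task (id INTEGER PRIMARY KEY, owner TEXT, priority INTEGER)"
)
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?)",
    [(1, "ann", 3), (2, "bob", 1), (3, "ann", 2)],
)


def load_scheduler_heap(conn):
    """Hypothetical scheduler: wants tasks as a priority queue."""
    heap = [
        (priority, task_id)
        for task_id, priority in conn.execute("SELECT id, priority FROM task")
    ]
    heapq.heapify(heap)
    return heap


def load_report_index(conn):
    """Hypothetical reporting tool: wants task ids grouped by owner."""
    index = {}
    for task_id, owner in conn.execute(
        "SELECT id, owner FROM task ORDER BY id"
    ):
        index.setdefault(owner, []).append(task_id)
    return index
```

     Neither application needs to know which structure the other one uses; both see only the relations the database exposes.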

> As was what immediately preceded it: information can be represented
> in many different ways yet still be the same information. What is
> fallacious is trying to argue that a change in structure constitutes
> a change in potential information content.

     That's not what I'm arguing. I'm saying that one representation (used internally by a particular application implementation) can be decoupled from another (a specific database schema, for example).

>>> I don't think we're on the same page as to what constitutes a
>>> model, either.
>>
>> Have you never developed an application that used data
>> structures other than relations?
>
> Indeed I have. But data structures are not data models.

     They are implementations of models. See above.

> Structure is only one component of a data model. An even more
> important component is a set of constraints that specifies what
> states and changes of state are possible.

     Those constraints can be implemented in either the database or the application. The behavior of changing state often resides in the application.
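
     The sketch below shows one constraint (a balance may never go negative) enforced in each place, using Python and an in-memory SQLite database. The account table, the withdraw_db_checked function, and the Account class are hypothetical names made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Constraint implemented in the database: balance may never go negative.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()


def withdraw_db_checked(conn, account_id, amount):
    """Rely on the CHECK constraint; return False if the update is rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


class Account:
    """The same constraint implemented in the application."""

    def __init__(self, balance):
        if balance < 0:
            raise ValueError("balance may not be negative")
        self.balance = balance

    def withdraw(self, amount):
        # The state change and the rule guarding it live together here.
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
```

     In the first version the database rejects the invalid state no matter which application attempts it; in the second, only code that goes through the Account class is protected.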

Sincerely,

Patrick



S P Engineering, Inc. | Large scale, mission-critical, distributed OO
                      | systems design and implementation.
       pjm_at_spe.com | (C++, Java, Common Lisp, Jini, middleware, SOA)
Received on Wed Mar 26 2008 - 15:42:02 CET
