Re: Mixing OO and DB

From: Brian Selzer <brian_at_selzer-software.com>
Date: Wed, 26 Mar 2008 20:52:32 GMT
Message-ID: <kIyGj.16682$5K1.16373_at_newssvr12.news.prodigy.net>


"Patrick May" <pjm_at_spe.com> wrote in message news:m27ifp5yyt.fsf_at_spe.com...
> "Brian Selzer" <brian_at_selzer-software.com> writes:
>> "Patrick May" <pjm_at_spe.com> wrote in message
>> news:m2bq534qzb.fsf_at_spe.com...
>>> I don't disagree, but it's still a non sequitur. You claimed
>>> that it isn't possible to decouple the application implementation
>>> from the specific schema. That is clearly incorrect because the
>>> same internal representation used by the application can be
>>> supported by more than one specific database schema, as you
>>> describe here. If the specific schema is encapsulated such that
>>> the application is decoupled from it, you can change the specific
>>> schema without impact to the application.
>>
>> Not exactly. I claimed that it isn't possible to decouple the
>> application from the schema. I believe I said, and I think you
>> agreed, that a schema specifies what is to be and can be recorded,
>> and it is in that sense that it cannot be separated from the
>> application.
>
> Ah, you may have identified the source of our miscommunication
> here. We seem to be using "schema" differently, despite both of us
> repeatedly attempting to clarify. Let's consider first an application
> that doesn't use a relational database at all. This application still
> has a schema in the sense of the data that it uses internally to
> support the behaviors it exhibits. Are we in agreement so far?
>

I wouldn't put it that way, but I think I understand what you mean.

> The implementation of this internal schema may change. For a
> simple example, an array may be changed to a queue or a stack. In
> this case the same data is being stored, but the behavior of the data
> structure is different. A more complex example would be changing a
> data member to a computation or vice versa. In these scenarios, the
> application logic cannot be decoupled from the schema representation.
>

The potential information content, however, need not differ. Whether you use an array, a queue, a linked list, a doubly linked list, or a binary tree, the information that populates those structures may be exactly the same, regardless of how it is laid out.
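
To make that concrete, here is a minimal sketch in Python. The (id, name) facts and all of the helper names are invented purely for illustration. It loads the same facts into an array, a queue, a linked list and a binary tree, then compares what each structure holds while ignoring layout and ordering:

```python
from collections import deque

# Hypothetical facts: (employee_id, name) pairs invented for illustration.
facts = [(3, "Carol"), (1, "Alice"), (2, "Bob")]

# Array and queue hold the facts directly.
as_array = list(facts)
as_queue = deque(facts)

# Singly linked list built from (value, next) cells.
as_linked = None
for fact in reversed(facts):
    as_linked = (fact, as_linked)

def linked_items(node):
    """Walk the linked list from head to tail."""
    while node is not None:
        value, node = node
        yield value

def tree_insert(node, fact):
    """Insert a fact into an unbalanced binary search tree."""
    if node is None:
        return {"fact": fact, "left": None, "right": None}
    side = "left" if fact < node["fact"] else "right"
    node[side] = tree_insert(node[side], fact)
    return node

def tree_items(node):
    """In-order traversal of the tree."""
    if node is not None:
        yield from tree_items(node["left"])
        yield node["fact"]
        yield from tree_items(node["right"])

as_tree = None
for fact in facts:
    as_tree = tree_insert(as_tree, fact)

# Compare information content only: layout and ordering are discarded.
contents = [
    set(as_array),
    set(as_queue),
    set(linked_items(as_linked)),
    set(tree_items(as_tree)),
]
same_information = all(c == contents[0] for c in contents)
```

The structures differ in how they are traversed and in the order they yield their elements, but the set of facts each one carries is identical.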

> Now consider the same application modified to use a relational
> database. The application logic continues to use the same data
> structures (stacks, queues, DAGs, etc.) and classes (we'll assume it's
> an OO application, since this is comp.object) it was using before.
> The implementation of the schema in the relational database presents a
> particular interface. This interface is based on relations. Some
> form of mapping must take place to convert the data provided via that
> interface into the model used internally by the application.
>

This is where we diverge. The data is the same data whether it is represented as a relation or as a binary tree. It is not the data that is subject to conversion, but rather its representation. This may seem like splitting hairs, but the difference is, in my opinion, critical. A schema specifies what is to be and can be recorded. It also suggests a structure, but that structure implies and is implied by a set of constraints that describes what the data is rather than how the data is laid out. That implied structure may differ significantly from what is optimal for a particular application, which is concerned less with what the data is than with how it can be used.
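
A rough Python sketch of the distinction follows. The schema, its single uniqueness constraint and the names used are assumptions made up for the example, not anything from a real system. The constraint is stated against the facts themselves; the application then regroups those facts for its own convenience, and converting back recovers exactly the same data:

```python
# Hypothetical schema: each fact is (employee_id, department), and an
# employee_id must be a positive integer that appears at most once.
def satisfies_schema(facts):
    """Check the constraints against the facts, whatever their layout was."""
    ids = [emp_id for emp_id, _ in facts]
    ids_valid = all(isinstance(i, int) and i > 0 for i in ids)
    ids_unique = len(ids) == len(set(ids))
    return ids_valid and ids_unique

# Representation suggested by the schema: a relation (a set of tuples).
relation = {(1, "Sales"), (2, "Sales"), (3, "Support")}

# Representation convenient for one application: ids grouped by department.
by_department = {}
for emp_id, dept in relation:
    by_department.setdefault(dept, []).append(emp_id)

def to_relation(grouped):
    """Convert the grouped representation back into a set of tuples."""
    return {(emp_id, dept) for dept, ids in grouped.items() for emp_id in ids}

# Only the representation was converted; the data and constraints carry over.
round_trip = to_relation(by_department)
```

Here `round_trip` equals `relation`, and both satisfy the same constraints, even though `by_department` looks nothing like a relation.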

> This mapping mechanism is exactly where the application
> implementation can be decoupled from the specific schema interface,
> typically through the use of something like the Dependency Inversion
> Principle.
>
>>>>> If that is a requirement, it's a good argument for a shared
>>>>> mapping layer or other decoupling mechanism. In fact, though,
>>>>> different applications often need different representations of
>>>>> different subsets of the data available in a relational database,
>>>>> plus data that is only used within the application. Because the
>>>>> application has a different, non-relational model of the data,
>>>>> decoupling is good design.
>>>>
>>>> And what model is that? Is OO a data model?
>>>
>>> It could be anything from a simple stack to a DAG to a full
>>> object graph. Internally, the application isn't often using
>>> tuples.
>>
>> Data structures are not data models.
>
> They are implementations of data models and they can be decoupled
> from other implementations of data models via DIP and other
> techniques.
>
>>>> If information is held in the memory of some application and is
>>>> also in the database, and if the copy in memory changes, then the
>>>> copy in the database is stale, and any query against the database
>>>> must be considered suspect.
>>>
>>> True. This is a standard problem in large distributed systems.
>>> There are many techniques for dealing with it that don't require a
>>> centralized database. That's not to say that a centralized
>>> database is never a good solution, it's just not always the best
>>> solution.
>>
>> So what does that have to do with allowing ad-hoc access?
>
> Ad-hoc access is just one other concurrent use of the database.
>
>> You missed my point altogether: an application can take advantage of
>> less general types and data structures that improve the performance
>> and maintainability of the code until another application needs to
>> use the data. Then the performance and maintainability improvements
>> will very likely go right out the window.
>
> That's not necessarily the case. Multiple applications, with
> solution-specific types and data structures, use shared databases
> concurrently in the vast majority of enterprise systems.
>
>> As was what immediately preceded it: information can be represented
>> in many different ways yet still be the same information. What is
>> fallacious is trying to argue that a change in structure constitutes
>> a change in potential information content.
>
> That's not what I'm arguing. I'm saying that one representation
> (used internally by a particular application implementation) can be
> decoupled from another (a specific database schema, for example).
>
>>>> I don't think we're on the same page as to what constitutes a
>>>> model, either.
>>>
>>> Have you never developed an application that used data
>>> structures other than relations?
>>
>> Indeed I have. But data structures are not data models.
>
> They are implementations of models. See above.
>
>> Structure is only one component of a data model. An even more
>> important component is a set of constraints that specifies what
>> states and changes of state are possible.
>
> Those constraints can be implemented in either the database or in
> the application. The behavior of changing state often resides in the
> application.
>
> Sincerely,
>
> Patrick
>
> ------------------------------------------------------------------------
> S P Engineering, Inc. | Large scale, mission-critical, distributed OO
> | systems design and implementation.
> pjm_at_spe.com | (C++, Java, Common Lisp, Jini, middleware, SOA)
Received on Wed Mar 26 2008 - 21:52:32 CET
