Re: Mixing OO and DB

From: Patrick May <>
Date: Sun, 02 Mar 2008 14:24:23 -0500
Message-ID: <>

frebe <> writes:

> On 2 Mar, 15:45, Patrick May <> wrote:

>> frebe <> writes:
>> >> >> You've asked this before and it has been fully answered.
>> >> >> The two components change at different rates for different
>> >> >> reasons, especially in distributed applications and when the
>> >> >> database supports multiple applications.
>> >> > The only answer so far is about "denormailzation for
>> >> > perforance", which is a very debatable argument.
>> >> It's not debatable at all.
>> > Ok, show me an example of such schema change that wouldn't affect
>> > the application code.
>> Denormalization for performance is just such a schema change.
>> The data supported by the schema doesn't change, only the format.
>> If you have SQL littering your application code, all of that has to
>> be changed and retested. If you have an O-R mapping layer, for
>> example, only the mapping code needs to change.
> Still no example. I was asking for the schema before and after such
> change. Then we could evaluate if the application code would have to
> change too, or not.

     I'm not sure why you need an example here.  Have you never seen
denormaliztion done for performance reasons? Do you not realize the the entire point of doing so is to reduce the bottleneck posed by the database without changing the external behavior of the business logic that uses the database?

     One common denormalization technique is to eliminate a join by adding columns to one table that replicate data in another table that has a one-to-many relationship with the table being denormalized. The business logic shouldn't have to change just because the schema does.

>> >> In addition, when a database supports multiple applications, the
>> >> schema may need to change for one application but the others
>> >> should be insulated from that change. This is common in many
>> >> enterprises.
>> > Addning columns or tables wouldn't affect the other (old)
>> > applications. If you have other changes in mind, please show an
>> > example.
>> Changes to the schema to support new regulatory requirements
>> are not uncommon. Sometimes these are simple additions, but they
>> can also involve splitting or, less commonly, merging existing
>> columns. Changing ordinality is another not infrequent change; an
>> existing application can continue to function by using the primary
>> or highest priority element on the many side of the relationship
>> but a new application or component can take advantage of the full
>> set.
> Please give one example (schema before and after).

     Again, why? Have you never experienced these kinds of changes? They happen often enough in my, and others, experience that the decoupling provided by OO techniques has proven useful. Despite what some relational proponents think, developers don't come up with these approaches because they dislike or don't understand relational databases. They do it because they're trying to solve real problems.

     Consider a regulatory change like Sarbanes-Oxley. Companies need to add additional monitoring rules, but they have existing applications in place that they don't want to change any more than absolutely necessary due to the high cost of doing so. The applications can be encapsulated to work without change within a different workflow. That workflow requires additional information in the database, in particular more detailed audit information and finer grained tracking of individual steps in the process. This results in schema changes but does not change the behavior of existing applications.

>> During development, many shops are using iterative methods
>> that require changes to both the schema and the application. When
>> there is more than one project ongoing at the same time, insulating
>> the application from the schema allows the new changes to be
>> incorporated when it makes sense for the application team, rather
>> than requiring all teams to change in lock step (a sure recipe for
>> schedule slip).

> This is all very nice words, but at least we need some examples to
> verify it.

     Particular schema examples will not help.  Either you've
experienced this or you haven't. If you haven't, you should at least be able to understand that these kinds of environments do exist and these kinds of changes do take place. This is not a discussion about particular schema changes, it is about the value of decoupling. Talking about specific schemas will take this down a rathole and miss the entire point.

>> >> > Yes it is indeed bad design to force every SQL statement to be
>> >> > wrapped inside a method.
>> >> So you prefer littering your code with embedded SQL?
>> > I wouldn't write exactly the same SQL statement in many different
>> > places, in that case I would use a function. But I would wrap
>> > every SQL statement in a function.
>> Did you mean to say that you _wouldn't_ wrap every SQL
>> statement in a function? If so, you wouldn't pass a code review in
>> my teams.

> Happily I don't belong to your team....

>> The maintenance cost of not doing so is too high.
> When someone fail to support his methods about software development,
> he can always claim his methods has lower maintenance costs, since
> it is almost impossible to verify such argument.

     This is where experience comes into play.  No one I know who has
actually developed a large system using OO techniques would ever embed SQL directly in the application logic. They've personally experienced the costs of doing so and don't want to go through it again.

>> >> This isn't even an issue for the systems I work on.
>> > Why not?
>> I've recently been working on low latency systems in the
>> financial services industry. We have to keep the database, and any
>> other disk based storage, completely out of the critical path of
>> the business transaction or we won't meet our performance
>> requirements.
> What kind of "business transactions"?

     Most recently a large risk analytics engine. The business transactions, or business use cases, if you prefer, involve calculations like value at risk for portfolios containing hundreds of thousands of positions. The only way to get acceptable latency is to co-locate the business logic, messaging mechanism, and data in the same operating system process. The only way to get acceptable throughput is to run dozens of those processes in parallel. Hitting a database would make the system unusable.

     Another example is determining the impact of faults in a telephone network. Probes return a large amount of information, only some of which is pertinent. Raising alerts that make sense at the business level, like "The SLA for Big Customer's video conference service is about to be violated." requires several levels of analysis and aggregation. That has to be done at in memory speeds.

     What kind of systems do you build? Perhaps we can find some common problem domains that would make this conversation more productive.



S P Engineering, Inc. | Large scale, mission-critical, distributed OO
                       | systems design and implementation.
  | (C++, Java, Common Lisp, Jini, middleware, SOA)
Received on Sun Mar 02 2008 - 20:24:23 CET

Original text of this message