Re: Mixing OO and DB
Date: Sun, 09 Mar 2008 14:58:08 -0400
"Brian Selzer" <brian_at_selzer-software.com> writes:
> "Patrick May" <pjm_at_spe.com> wrote in message news:m2y78vfcbv.fsf_at_spe.com...
>>>>>> One common denormalization technique is to eliminate a join
>>>>>> by adding columns to one table that replicate data in another
>>>>>> table that has a one-to-many relationship with the table being
>>>>>> denormalized. The business logic shouldn't have to change just
>>>>>> because the schema does.
>>>>> Why denormalize when you could just use a materialized view?
>>>> Materialized views are not always available and may not provide
>>>> significant performance benefits if the data is being updated
>>>> frequently and the clients need the absolute latest values.
>>> How is that different from denormalization? Updates still have to
>>> be made in a denormalized schema.
>> With a materialized view, the updates to the underlying tables
>> need to be propagated to the view at some point. If updates are
>> frequent and the clients of the materialized view need the latest
>> values, this is an extra step.
> (Didn't I just say something to that effect? Oh, you snipped it.)
Nah, I just replied to it below. Confusing layout on my part,
> In the sense that the number of steps increases, yes there is an > extra step; in the sense that more work must be performed, no, not > really. The workload is simply divided into that which must be > performed to update the base table (which is less and can be > significantly less than that for a denormalized schema) and that > which must be performed to refresh the view. > >>> I guess it could be that due to the frequency of updates the >>> materialized view could be stale more often than not: in that case >>> there wouldn't be any point in materializing it. >> >> Exactly. >> >>> But I don't see how a denormalized schema would work any better, >>> since much of the time queries would be waiting for update >>> transactions to complete. >> >> You'd save the time required to update the view. Depending on >> the granularity of the update, you could get some performance >> improvement when the query is for rows not currently locked under a >> transaction. > > No, you wouldn't save any time, since the same work must be > performed whether you're refreshing the view or updating the > denormalized base tables. In fact, since the task of refreshing the > view can be separate, you may actually lose time waiting for updates > to the denormalized base tables to complete. That implies that the materialized view could be updated beforethe underlying tables. What happens in that situation if the update to the underlying tables fails for some reason? The best you can hope for is to have enough excess CPU capacity to allow the changes to the materialized view and to the underlying tables to commit simultaneously.
If the materialized view is updated with a batch size greater than one transaction against the underlying tables, an application accessing the tables can get the updates before one accessing the view.
In any case, we're getting a little far afield from the original question. In enterprise systems, denormalization for performance does take place. This is just one of several reasons for decoupling the application logic from the database schema.
> Denormalization causes problems beyond the additional code required > to maintain consistency, and the need to duplicate that code every > time an update occurs. I'm not arguing against the principle that denormalization shouldbe avoided if at all possible.
S P Engineering, Inc. | Large scale, mission-critical, distributed OO
| systems design and implementation. pjm_at_spe.com | (C++, Java, Common Lisp, Jini, middleware, SOA)Received on Sun Mar 09 2008 - 19:58:08 CET