Normalize until neat, Automate until Complete
Date: Wed, 24 Nov 2004 13:06:15 -0500
Message-ID: <o5ve72-r03.ln1_at_pluto.downsfam.net>
We hear the term "denormalizing for performance" very often, and depending upon who is doing it, there might be an exasperated "Et tu, Brute?" along with it.
The informal slogan that serves as a nominal rationalization for denormalization is often given as "Normalize until it hurts, denormalize until it works." This slogan, even though it almost rhymes, does not give us much useful guidance. In this slogan normalization is a bad thing and is given no redeeming value. The denormalized system, by contrast, has the advantage that it *works*. So why normalize? Why not just make a system that "works"?
The fact that so many people in the real world are willing to denormalize suggests that the RDM in itself is not providing a complete set of guiding principles for real-world database operations. When people "denormalize for performance", and I don't mean amateurs here, they are reaching for something. Just as perturbations in a planet's orbit can lead scientists to suspect and discover other planets, perhaps theoreticians can look at the practitioners' efforts and realize that there must be more to the story.
I would suggest a better slogan would be "Normalize until Neat, Automate
until Complete," which actually gives some guidance to the denormalization
process. The idea is that user-entered or imported data should be
normalized for the standard reasons; nothing new there. The process of
automation requires a normalized database as a starting point, so that the
generated columns are built upon a valid base.
I have a simple question for those who oppose automation on the grounds that
it denormalizes. If normalization is intended to ensure correctness, and
your system disallows writes to automated columns, have you not preserved
correctness while also improving the lot of your users? If so, isn't that
what it's all about?
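
To make the question concrete, here is a minimal sketch of one automated column, using Python's standard sqlite3 module and triggers. The table name order_lines, the column extended, and the trigger names are all hypothetical, chosen only for illustration; this is one possible way to enforce the rule, not a prescribed implementation. The user-entered columns (qty, price) form the normalized base, the system alone maintains extended, and a direct write that disagrees with the base is rejected.

```python
import sqlite3

# Hypothetical schema: qty and price are user-entered (the normalized base);
# extended is an automated column that only the triggers may maintain.
SCHEMA = """
CREATE TABLE order_lines (
    line_id  INTEGER PRIMARY KEY,
    qty      INTEGER NOT NULL,
    price    INTEGER NOT NULL,   -- price in cents, to avoid float rounding
    extended INTEGER             -- automated: qty * price
);

-- Reject any write to the automated column that disagrees with the base.
CREATE TRIGGER order_lines_guard
BEFORE UPDATE OF extended ON order_lines
WHEN NEW.extended IS NOT NEW.qty * NEW.price
BEGIN
    SELECT RAISE(ABORT, 'extended is an automated column');
END;

-- Fill in the automated column when a row is inserted...
CREATE TRIGGER order_lines_ins
AFTER INSERT ON order_lines
BEGIN
    UPDATE order_lines SET extended = NEW.qty * NEW.price
    WHERE line_id = NEW.line_id;
END;

-- ...and recompute it whenever the base columns change.
CREATE TRIGGER order_lines_upd
AFTER UPDATE OF qty, price ON order_lines
BEGIN
    UPDATE order_lines SET extended = NEW.qty * NEW.price
    WHERE line_id = NEW.line_id;
END;
"""


def build_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def extended_of(conn, line_id):
    """Read the automated column for one row."""
    row = conn.execute(
        "SELECT extended FROM order_lines WHERE line_id = ?", (line_id,)
    ).fetchone()
    return row[0]


def try_direct_write(conn, line_id, value):
    """Attempt to overwrite the automated column; return True if rejected."""
    try:
        conn.execute(
            "UPDATE order_lines SET extended = ? WHERE line_id = ?",
            (value, line_id),
        )
    except sqlite3.DatabaseError:
        return True
    return False


if __name__ == "__main__":
    conn = build_db()
    conn.execute("INSERT INTO order_lines (line_id, qty, price) VALUES (1, 3, 250)")
    print("after insert:", extended_of(conn, 1))
    conn.execute("UPDATE order_lines SET qty = 4 WHERE line_id = 1")
    print("after qty change:", extended_of(conn, 1))
    print("direct write rejected:", try_direct_write(conn, 1, 1))
```

The stored value can only ever equal qty * price, so the extra column adds completeness for the user without opening a path to inconsistency.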
If we separate theory from implementation for a moment, which means that we lay aside performance concerns, then the only motivation for automation is to make the database more complete. If it can be made more complete without threatening correctness, is not this an a priori Good Thing?
-- Kenneth Downs <?php $sig_block="Variable scope? What's that?";?>
Received on Wed Nov 24 2004 - 19:06:15 CET
