Re: The problem with denormalization.

From: Laconic2 <laconic2_at_comcast.net>
Date: Mon, 27 Sep 2004 11:46:20 -0400
Message-ID: <Ncednc27yMJ7qMXcRVn-hA_at_comcast.com>


"Tony Andrews" <andrewst_at_onetel.com> wrote in message news:1096291773.101549.254100_at_h37g2000oda.googlegroups.com...
> I don't know what star schema is, really. Well, I suppose it's like a
> "fact" table in the middle, and lots of "dimension" tables round the
> outside;

You got it. There are more details in Kimball's books, but this is the kernel.

> but I have no idea why/where denormalisation is involved in
> that.
It's not so much that you "denormalize" as that you end up with dimension tables that are not 3rd normal form.
Let's say you have a dimension called "Products", with one row for each product. Let's say you have some kind of hierarchy of product categories, for some purpose like marketing. In an normal database, you might have a separate table for each layer of the hierarchy. In star, you might well flatten the hierarchy down into the dimension table. It makes life a lot easier for certain modes of access to the data.

The point is not that you "denormalize". The point is that if you know how to normalize, but you don't know what star is for, your first reaction is to criticize the design because it isn't normalized. That's how I reacted to the first star I saw.

> My world is generally OLTP not OLAP, and I realise different
> considerations apply.

Exactly! My world, at least in the times when I used star, was OLAP, DW, or reporting.

>
> In my OLTP experience (with some reasonably big, but not terabyte,
> databases), I have never encountered a situation where I would choose
> to denormalise. Generally, where it has been done, it was a grope in
> the dark by a database designer who failed to see and fix the real
> issues. They often didn't even bother to quantify whether it actually
> improved performance, they just assumed it would. It grieves me that
> the requirement to denormalise is taken as given by so many designers:
> "this is going to be a big, multi-user system that needs to perform, so
> which tables shall we denormalise first?" The idea of a database with
> NO denormalisation would strike them as just plain wrong. That's sad.
>

Agreed. However, even in OLTP environments, I seldom worried about normal forms beyond 3rd.

> Even in OLAP, I would want to ensure other options were considered
> first: for example, the use of Materialized Views to get the benefits
> of denormalisation (stored summaries, pre-joined tables) without the
> dangers (corrupt data).
>

Take a look at star. It isn't the silver bullet. But it's good. Received on Mon Sep 27 2004 - 17:46:20 CEST

Original text of this message