Re: Normalization by Composing, not just Decomposing

From: Alan <>
Date: Mon, 12 Apr 2004 16:34:03 -0400
Message-ID: <c5euf6$v5g3$>

See in line... --->

"Dawn M. Wolthuis" <> wrote in message news:c5esfv$hi$
> "Alan" <> wrote in message
> news:c5erh9$qt9b$
> > Normalization rules are Codd's rules, not God's rules. They are a
> to
> > distributing data among entities, not a dogmatic recipe. You seem to
> to
> > project a certain amount of dogmatism on everything, as if life were
> > and white. It isn't, it's an infinite number of shades of gray (well
> 16,384
> > at least).
> Who, me? Perhaps it comes across that way bz I am new to doing any sort
> significant study of database theory and compared to everything I have
> worked on before, relational theory IS very tightly pre/described with a
> mathematical model. The model is tight but that doesn't mean it is

---> In the strictest sense, yes, as a theory, but in practice, after the rules are followed, you go back and bend them quite often. I think of it as a starting point, not an ending point- perhaps that's the explanation you're looking for.

> tightly implemented. I'll agree completely with the infinite number of
> shades of gray -- I'm guessing even uncountably infinite.
> > Did you hear about the programmer they found dead in the shower? He was
> > stiff, grasping a bottle of shampoo, his eyes apparently fixed on the
> > instructions, "Lather, rinse repeat."
> >
> > Here's the general rule of thumb: Normalize to 3NF, and then see if that
> > works for you in your situation. If it doesn't, then denormalize or
> > normalize further. Iterate.
> But I don't want to put the data in 1NF -- there is no reason to do so
> my perspective. Since all other normal forms require the data to be in
> first, that pretty much kills the process as it is written. However,
> I can put data into (2NF - 1NF) and (3NF - 1NF) that is what I do and then
> proceed as you describe to refactor the model until it fits.

---> Ah, the rules DEscribe the data as being in 1NF first, they do not PREscribe the data to be in 1NF first. You find the data however it is, and then move it around as necessary. The data may happen to be in 1NF when you start, but maybe it isn't. Often, especially after some experience, you start out by placing the data you are familar with in 3NF, and then see where the rest goes. For examlpe. An HR database- well you know you have EMP and DEPT, and you have a pretty good idea what's going to go in them. No sense starting in 1NF when you already know where you're going. OTOH, if you are woking in a domain about which you know very little, you have no choice but to start by asking a lot of questions and placing the data in 1NF, and take it from there. Note that this is coming from an implementers POV, not a theorist. The theorists are probably ready to explode by now.

> > You ask, "Are you adhering to relational theory when deploying an OLAP
> > database where
> > the data is in fact & dimension tables?" Did you read my message? What
> I
> > write about normalization and the star schema?
> I'll re-read and see if it clearer, but it seems to me that you were
> a pitch for OLAP data being "relational" in some way. Sorry if I
> misunderstood.

---> No, I was making a pitch that it is not necessary, and can even be confusing, to view OLAP data as being relational in some way. Read Ralph Kimball (the "father" of data warehouses). He refers to what we are talking about as "dimensional modeling", and as something that is _applied to_ realtional databases. Note that he is _not_ talking about realtional modeling.

> > Data is not _modeled_ as "OLAP cubes". Cubes are an implementation,
> > modelling is analysis and maybe design.
> Cubes could be an implementation, but they could also be used to model the
> data. There is not always a need to take the data, put it into a
> data model and reform it for OLAP -- one could go from requirements to
> data model, right? --dawn

---> Yes and no. How would you model an eight (or more) dimension cube? I don't think it is practical, as models are used for analysis and design. You do build a model of the cube (using various cube-creation software) before you actually create the cube, but you really can't easily model from scratch (perform analysis) with it. You can readily model a star schema, however, and you then build the cube model from a star schema. While you can take "loose" data and turn it into a star schema, the practical reality is that you are almost always starting out with an existing relational database, so it is wise to discuss it from that point of view.

> >
> > "Dawn M. Wolthuis" <> wrote in message
> > news:c5ecb7$e2f$
> > > "Alan" <> wrote in message
> > > news:c5e6m0$lo87$
> > > > Realtional theory as a data theory is analagous to democracy as a
> > of
> > > > government- it may not be perfect, but so far, there's nothing
> in
> > > > most cases.
> > >
> > > Have you concluded this by reviewing some emperical data that has been
> > > collected or because you adhere to some philosophy or what? Could I
> state
> > > something contradictory with as much logical backing?
> > >
> > > > Denormalization in itself has nothing directly to do with OLAP,
> > > that
> > > > one may denormalize more for an OLAP application than an OLTP
> > application.
> > > > However, in OLAP, you are not necessarily denormalizing so much as
> > > > "re-normalizing", in that you are really developing a diiferent
> > > distribution
> > > > among entities for the same data, such as in a star schema. It's not
> > > > normalized, but it's not denormalized either. It's just different. I
> > > suppose
> > > > an argument could be made that (in the case of a star schema), you
> start
> > > > with a normalized schema, and then apply transformation rules (no,
> don't
> > > ask
> > > > me what they are- there are books on the topic) to transform it into
> > > star
> > > > schema. Think about it- a basic star schema is essentially a giant
> > > > many-to-many linking table (the fact table) with a bunch of
> descriptive
> > > data
> > > > tables (dimensions).
> > >
> > > Are you adhering to relational theory when deploying an OLAP database
> > where
> > > the data is in fact & dimension tables? I have not seen a star schema
> for
> > > the purposes of OLAP without some rules broken such as duplication of
> > data.
> > > But I'm fine with it either way -- just curious whether most
> > > theorists would view data modeled as OLAP cubes as following "the
> > > ules". --dawn
> > > <snip>
> > >
> > >
> >
> >
Received on Mon Apr 12 2004 - 22:34:03 CEST

Original text of this message