Re: Normalization by Composing, not just Decomposing
Date: Mon, 12 Apr 2004 09:48:04 -0400
Denormalization in itself has nothing directly to do with OLAP, except that one may denormalize more for an OLAP application than an OLTP application. However, in OLAP, you are not necessarily denormalizing so much as "re-normalizing", in that you are really developing a different distribution among entities for the same data, such as in a star schema. It's not normalized, but it's not denormalized either. It's just different. I suppose an argument could be made that (in the case of a star schema), you start with a normalized schema, and then apply transformation rules (no, don't ask me what they are- there are books on the topic) to transform it into a star schema. Think about it- a basic star schema is essentially a giant many-to-many linking table (the fact table) with a bunch of descriptive data tables (dimensions).
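To make that concrete, here is a rough sketch of the shape in Python
(all of the table and column names below are invented for illustration;
a real star schema would of course be built in SQL):

    # A toy star schema: two dimension tables keyed by surrogate IDs, and
    # a fact table that is essentially a many-to-many linking table
    # between them, carrying the measures.
    customer_dim = {
        1: {"name": "Acme Corp", "region": "East"},
        2: {"name": "Bolt Inc", "region": "West"},
    }
    product_dim = {
        10: {"sku": "WIDGET-A", "category": "widgets"},
        20: {"sku": "GADGET-B", "category": "gadgets"},
    }
    # Each fact row holds foreign keys into the dimensions plus measures.
    sales_fact = [
        {"customer_id": 1, "product_id": 10, "qty": 5, "amount": 50.0},
        {"customer_id": 2, "product_id": 10, "qty": 3, "amount": 30.0},
        {"customer_id": 1, "product_id": 20, "qty": 1, "amount": 99.0},
    ]
    # A typical OLAP query: total sales amount by region -- join the fact
    # table to a dimension and aggregate.
    totals = {}
    for row in sales_fact:
        region = customer_dim[row["customer_id"]]["region"]
        totals[region] = totals.get(region, 0.0) + row["amount"]
    print(totals)  # {'East': 149.0, 'West': 30.0}

The dimensions are deliberately wide and repetitive (denormalized by
OLTP standards), and the fact table is nothing but keys and measures.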
"Dawn M. Wolthuis" <dwolt_at_tincat-group.com> wrote in message
> "Alan" <alan_at_erols.com> wrote in message
> > You are assuming that (good) normalization is a science. It is not. It
> > is part science and part art- that's where experience (as well as ESP
> > to read the user's minds and clairvoyance to predict future needs)
> > comes into play. Oh, it is also part voodoo. Sometimes waving a dead
> > chicken in a paper bag over your head produces the results you need.
> You are preachin' to the choir-ish -- that's the type of thing I would
> say if I were not trying, oh so hard, to learn what makes relational
> theorists tick and trying, oh so hard, to use the same way of thinking
> so that I can really learn what it is about relational theory that is
> keeping it the king of the hill. There are very formalized statements,
> using very precise terminology and all, that show the process of
> normalization to be np-complete or whatever else makes some folks feel
> all warm and fuzzy (not the mathematical use of the term "fuzzy"). When
> I ask what it is about relational theory that makes it king, I hear
> that it is because it is based on mathematics, including predicate
> logic.
> > By the way, the process of
> > putting it back together is called denormalization, not composing, and
> > it is not uncommon, but as you noted, there are no rules. That's why
> > experienced data modelers get paid more than newbies.
> Yes, you are right and I'm quite familiar with denormalization used with
> OLAP, which is why I avoided that term. From what I have seen, folks
> talk about denormalization when going away from transactional data
> processing, and I didn't want to inadvertently take the thread in that
> direction.
> So, with your statements about formalization of the rules for good data
> modeling/design/implementation, are you in "the relational camp" or among
> the less orthodox (of us)? Thanks. --dawn
> > "Dawn M. Wolthuis" <dwolt_at_tincat-group.com> wrote in message
> > news:c546v4$a76$1_at_news.netins.net...
> > > Sorry I have so many questions, but I do appreciate the help I have
> > > received from this list. I just read, or rather, skimmed the document
> > > Jan pointed me to related to XML and normal forms. There were other
> > > more accessible papers there that I skimmed too.
> > >
> > > If I am understanding correctly, the process of normalization for
> > > any set of data attributes is a process of decomposing from one
> > > large set to several smaller ones. That makes sense when starting
> > > from scratch.
> > >
> > > But tests for determining whether data is normalized also seem to
> > > check whether it has been fragmented sufficiently and do not take
> > > into account whether the data has been TOO fragmented.
> > >
> > > For example, if we have attributes: ID, First Name, Last Name, Nick
> > > Names, where the ID is a primary key (or candidate key if you
> > > prefer) and for each ID there is precisely one list of Nick Names
> > > and the Nick Name list (relation, if you prefer) is determined by
> > > the ID, the whole ID, and nothing but the ID, then in the relational
> > > model, most folks would still split the Nick Names into a separate
> > > relation simply because it is, itself, a relation.
> > >
> > > More progressive relational modelers might decide it is OK to model
> > > a relation-valued attribute of Nick Names within the first relation.
> > > Either option would then be acceptable and considered normalized (by
> > > newer definitions of 1NF).
> > >
> > > But there seem to be no "rules" or even guidelines that are provided
> > > to COMPOSE or keep together the Nick Names with the ID. Such rules
> > > would be the ones I would add to what I have seen related to XML
> > > modeling and are used, without being explicitly stated, by PICK
> > > developers. The informal description of this rule is:
> > >
> > > If it is dependent on the key, the whole key, and nothing but the
> > > key, then don't split it out!
> > >
> > > More precision, but not absolute precision, would give us something
> > > like: Let A be the set of all Attributes and FD be the set of all
> > > functional dependencies among the attributes. If a is an element of
> > > A and is a key, and mv is another element (named to give a hint that
> > > it might be multivalued, aka relation-valued) and a-->mv is in FD
> > > (but no subcomponent of a determines mv), then
> > >
> > > mv should be an attribute in a relation where a is a key and, for
> > > all attributes b with this same relationship to a, mv should be in
> > > the relation with b.
> > >
> > > In other words, there ought to be some "rules" that govern when we
> > > should not split out data attributes, in general, as well as when we
> > > should split them out.
> > >
> > > Or am I missing something? Perhaps what I skimmed includes this, but
> > > I just didn't pick it up. I know I haven't read everything out there
> > > -- are there other places where normalization or rules related to
> > > data modeling are not focused exclusively on when to split
> > > attributes out, but also include bringing them together when they
> > > have already been unnecessarily decomposed?
> > >
> > > Thanks. --dawn
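
For what it's worth, the rule quoted above is mechanical enough to
sketch in code. This is only a toy check of my own devising (the FD
encoding and all the names are made up, and it ignores transitive
closure of the dependencies), but it shows the "key, the whole key"
test that would decide when to keep mv composed with a:

    from itertools import combinations

    # FDs encoded as {determinant (frozenset of attributes): set of
    # attributes it determines}.
    fds = {
        frozenset(["ID"]): {"First Name", "Last Name", "Nick Names"},
    }

    def determines(attrs, target):
        # True if some known FD whose determinant is contained in
        # `attrs` yields `target` (augmentation only, no transitivity).
        return any(det <= frozenset(attrs) and target in deps
                   for det, deps in fds.items())

    def keep_together(key, mv):
        # Dawn's rule: mv stays in the relation keyed by `key` iff
        # key -> mv and no proper, non-empty subset of key -> mv.
        if not determines(key, mv):
            return False
        for r in range(1, len(key)):
            for subset in combinations(key, r):
                if determines(subset, mv):
                    return False
        return True

    print(keep_together(["ID"], "Nick Names"))  # True: don't split it out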
Received on Mon Apr 12 2004 - 15:48:04 CEST