Re: On what OLAP can and what OLAP can't

From: David Cressey <dcressey_at_verizon.net>
Date: Thu, 07 Sep 2006 21:42:25 GMT
Message-ID: <5n0Mg.11769$hV2.6376_at_trndny05>


"Cimode" <cimode_at_hotmail.com> wrote in message news:1157639738.578212.133710_at_b28g2000cwb.googlegroups.com...
> Greetings,
>
> Last week, while auditing a 1 year-old data warehouse totally dead in
> terms of response time and aggregating accuracy, I had the following
> chat with the *data multidimensional architect*...I thought some of you
> may find it interesting to understand one sign of our time....
>
> Multidimensional Architect: How come you wrote in your report that the
> system has excessive reponse time and no accuracy because relational
> rules of design were broken. That show you don't understand anything
> about OLAP modeling
> Me: I do not need to know what OLAP modeling is. All I can see, is no
> matter how well designed is your OLAP system, it still incorrect result
> at high cost because its raw data source is not normalized
> Multidimensional Architect: But OLAP modeling has nothing to do
> normalization, star schema's are totally denormalized...
> Me: Really? I could argue that a star schema is a fully normalized
> because it is nothing else than a fact table exclusively made up of
> foreign key attributes pointing to dimension tables...Besides to
> denormalize, it would initially start from some normal form? What
> normal form did you start from to start your *denormalizing*
> Multidimensional Architect: That's not the point. The point is a
> relational model only applies to OLTP (Online Transactional Processing)
> systems while OLAP model is targetted at Decision Support
> Systems...OLAP is totally independent from relational model.
> Me: What exactly is OLAP model?
> Multidimensional Architect: It is a multidimensional model that pulls
> data from various relational data sources and filters only what is
> relevant to decision makers...
> Me: So basically, you are stating that OLAP pulls data from SQL DBMS's
> right? What if the results of data sources are wrong? What does it
> say about the results of OLAP..Are you saying you are filtering
> something that is not correct at the first place? Weird...
> Multidimensional Architect: But OLAP does not care about the accuracy
> of the data pulled from the RM as it gets filtered anyway.
> Me: How exactly do you do that?
> Multidimensional Architect: By putting filters on the views extracting
> data from the relational structure.
> Me: So basically it is the same thing as putting filters on a view to
> attemps compensate for the loss of integrity?
> Multidimensional Architect: No it is not the same thing?
> Me: What is the difference then?
> Multidimensional Architect: OLAP modeling is made of hierarchies of
> dimensions that can be reorganized at run time...
> Me: How many filters do you exacty put for each extracting process?
> Multidimensional Architect: Between 6 and 10 AND conditions in the
> WHERE clause
> Me: And how many processes are there to filter
> Multidimensional Architect: Between 60 and 100
> Me: Are'nt you afraid that may have some impact on ressources consumed?
> What if these filters are not sufficient anyway? What if you forget to
> add on excluding conditions in one of the numerous processes?
> Multidimensional Architect: That won't happen as long as I am
> around...
> Me: And how long will you be around
> Multidimensional Architect: Not very long....
>
> The rest of the discussion was counless examples of a general trend of
> ignorance about RM concepts by people who think in buzz words such as
> *multidimensional modeling* while in fact it presents all
> characteritics of good old hierarchical systems...Pathetic...
>

It's clear from the above transcript that you and the data architect were talking past each other rather than holding a genuine dialogue. Looking back, I can see myself in years gone by being on either side of the discussion, at different points in my career.

I can, if you like, shed some light on what the data architect might have been trying to show you. However, I can only surmise that the architect's opinions and knowledge were somewhat like my own. They may have been quite different. What you might get out of such a discussion in here depends on what you are seeking.

Since the DA isn't here, I can't rephrase what you were trying to communicate to the DA for that person. However, it's just vaguely possible that the DA had been through the same learning curve you have, and had grown to appreciate multidimensional modeling in addition to relational model, rather than instead of it.

If you are seeking to genuinely understand how multidimensional modeling and star schema design might be genuinely useful additions to your own set of tools and methods, the discussion could go one way. If, on the other hand, you are merely looking for validation of your own opinions, and want justification for disregarding anything novel (to you) that the data architect might have said, then the discussion is likely to go nowhere, as many prior discussions in c.d.t. have.

Before I invest the time and effort at trying to summarize and explain the benefits of star schema design, I want to have some inkling about whether it's a waste of time.

I will comment on a few things above:

First, the term "data architect" makes a great deal of sense to me. The terms "data multidimensional architect" or "multidimensional architect" sound very odd to me. I would expect any good data architect to have an understanding of relational data modeling, data normalization, ER modeling, multidimensional modeling, relational database design, star schema design, and (why not) object modeling, in their collection of useful tools and methods, and to be able to apply any one of them to good advantage, when appropriate.

While multidimensional modeling is quite different from relational data modeling, it really isn't a case of "either/or" but a case of "both/and". Many of the religious debates in this forum are quite sterile in terms of comparing different tools and methods and finding different benefits and drawbacks to each.

Second, I think the data architect (DA) chose an unfortunate term when he described a star schema as "denormalized". A better choice of words would have been "not necessarily normalized (beyond 1NF)". When doing multidimensional modeling and star schema design, it is not so much a case of normalizing and then denormalizing as it is of following some design principles that lead to a less than fully normalized design.

The multidimensional model has been dismissed as mere bunk by some acclaimed professionals. But I can tell you from repeated direct experience that it has proven very useful to me in a class of situations. And certainly Kimball is quite a respectable authority in this field, in his own right. Inmon, on the other hand, bases his discussion of data warehousing completely on normalized data as a starting place. I've never had a chance to apply Inmon's ideas, so I can't compare them to Kimball's ideas.

I will say, however that dismissing either one of them as ignorant and unprofessional only shows how little the person doing the dismissing actually knows. Received on Thu Sep 07 2006 - 23:42:25 CEST

Original text of this message