Re: On what OLAP can and what OLAP can't - A little problem

From: <pamelafluente_at_libero.it>
Date: 11 Sep 2006 23:14:00 -0700
Message-ID: <1158041640.206944.214400_at_b28g2000cwb.googlegroups.com>


David Cressey wrote:
> On the subject of ignorance, I find that the more I learn, the more of my
> own ignorance I become aware of.
>

What we know is such an insignificant part of what we don't know that there is no point
even comparing our ignorance.

It is more or less like watching ants on the ground fighting to establish which one has the bigger penis.

I like to learn, even amid insults and bashing. I just have an unstoppable desire to absorb everything I can, like a sponge ... even your soul, if I can ;)

When you start to think about how good you are, you are just taking a photograph of your current state of ignorance and freezing it.

The computer world (and database theory is mostly intended for databases that reside on computers) is growing so fast that even a short pause in keeping up turns you into an obsolete dinosaur...

Also, there are a lot of practical, programming-related issues where theory and practice must often reach a compromise...

> It might be useful to move the discussion to another forum. I'm not
> prepared to provide the theoretical underpinnings that might or might not exist
> for multidimensional modeling, and I'm not willing to debate those who
> attack any tool that doesn't meet their criteria for theoretical rectitude.
>
> Another thing that might be useful to do is for you to tell me how much you
> already know. Star schema design is actually very simple. However, star
> schema design isn't really at the heart of multidimensional modeling. In
> the meantime, I'm going to look for websites that say what I want to say,
> only better.

I am a programmer, and I see the problem from that perspective. One problem I faced lately, and perhaps you theorists can help me find a better solution, is finding a good algorithm to split a query under the following circumstances.

Keep it simple. Assume you have a few tables with a few relationships between them. For instance, take two tables A and B in a 1-N relationship, and suppose that on table A you have defined some function that is "replication sensitive", such as count (not count distinct) or sum. When we join these tables and compute such functions, we get incorrect results because the rows of A are replicated by the join. Leading software such as Business Objects is able to devise a union of subqueries to avoid this replication problem. Experiments show that such software can usually deal with it when the functions are on the "fact" table (the N side of the relationship), but it seems to have big problems when the functions are applied to the dimension table (the 1 side of the relationship).
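To make the replication problem concrete, here is a minimal sketch in Python with SQLite. Everything in it is an assumption made for illustration: the tables a and b, the columns region, budget and amount, and the sample rows are all hypothetical. The "split" query shows one common way to aggregate each table at its own grain and combine the parts with UNION ALL; it is not claimed to be what Business Objects generates, nor an optimal algorithm.

```python
import sqlite3

# Hypothetical schema: a is the "1" side (dimension), b is the "N" side (fact).
SCHEMA = """
CREATE TABLE a (id INTEGER PRIMARY KEY, region TEXT, budget INTEGER);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id), amount INTEGER);
INSERT INTO a VALUES (1, 'north', 100), (2, 'north', 50), (3, 'south', 70);
INSERT INTO b VALUES (1, 1, 10), (2, 1, 20), (3, 1, 30), (4, 2, 5), (5, 3, 7), (6, 3, 8);
"""

# Naive query: each row of a is repeated once per matching row of b,
# so SUM(a.budget) is inflated by the join.
NAIVE_QUERY = """
SELECT a.region, SUM(a.budget), SUM(b.amount)
FROM a JOIN b ON b.a_id = a.id
GROUP BY a.region
ORDER BY a.region
"""

# Split query: the measure on a is aggregated without joining to b,
# the measure on b is aggregated through the join, and the two partial
# results are stacked with UNION ALL and summed per region.
SPLIT_QUERY = """
SELECT region, SUM(budget_part), SUM(amount_part)
FROM (
    SELECT region, SUM(budget) AS budget_part, 0 AS amount_part
    FROM a
    GROUP BY region
    UNION ALL
    SELECT a.region, 0, SUM(b.amount)
    FROM a JOIN b ON b.a_id = a.id
    GROUP BY a.region
)
GROUP BY region
ORDER BY region
"""

def run(query):
    """Run a query against a fresh in-memory copy of the demo data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        return conn.execute(query).fetchall()
    finally:
        conn.close()

if __name__ == "__main__":
    print(run(NAIVE_QUERY))  # [('north', 350, 65), ('south', 140, 15)]
    print(run(SPLIT_QUERY))  # [('north', 150, 65), ('south', 70, 15)]
```

In the naive result the budget for 'north' comes out as 350 instead of 150, because row 1 of a is counted three times. The split version keeps each measure at the grain of its own table, at the cost of one subquery per table that carries a measure.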

Would you be able to suggest an optimal split/union algorithm for this situation? I do have my own empirical idea, but I would like to hear your and Bob's approach...

-P

Received on Tue Sep 12 2006 - 08:14:00 CEST
