Re: Announcing the "Instant Data Warehouse" Product

From: Phil Fernandez <philf_at_xymox.palo-alto.ca.us>
Date: 1996/02/22
Message-ID: <312C1879.6259_at_xymox.palo-alto.ca.us>#1/1


NRaden wrote:
>
> In Message-ID: <4g4ajn$his_at_oznet07.ozemail.com.au>,
> Peter Nolan <pnolan_at_ozemail.com.au> writes:
>
> <<One of the great features of a properly implemented star schema is the
> fact that the summary fact tables can be implemented as 'multi-level'
> summary tables. This means that you can have many levels of aggregation
> in the one table and the user is never aware of this. Hence there is no
> need to understand which tables/views to look at, it is all taken care
> of. When you want a new summary there are no new objects to create.>>

Granted this approach can be useful, but it can also cause problems. If multi-level aggregates are stored in the same summary fact table, quantitative data in the fact table is inherently additive. This in turn requires that ever dimension with multiple levels always be properly constrained on ever query. This is OK if the tool and application environment enforce this; however, consider a case where each of the the PRODUCT, MARKET, and TIME tables of a simple STAR contain multi-level aggregates.

Consider a question exploring sum of sales dollars by product in a specific time period, but without consideration for the MARKET dimension. If the query doesn't constrain on the "ALL" aggregate in the MARKET dimension, incorrect results will occur. This is a very easy mistake for inexperienced users to make, particularly with a powerful graphical query writing tool.  

> We call this setup the "consolidated star" and it is classic Metaphor.
> Information Advantage seems to prefer this setup too and unless I'm
> mistaken, actually requires it (or at least a view that looks like it). My
> problem with it is that it slows down queries at aggregate levels, since ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> we have to query a much large fact table, instead of a much smaller one.
> It also complicates the creation and maintenace of aggregates at load
> time, but it does vastly simplify the maintenance of metadata. Tradeoffs.

Note that with Red Brick's STARjoin technology, this approach *does not* slow down queries at the aggregate levels, since for this kind of well-constrained query, performance is almost entirely independent of the fact table size. As far as I know, the same is not true of traditional join technologies applied to a star schema.

pmf Received on Thu Feb 22 1996 - 00:00:00 CET

Original text of this message