Re: Announcing the "Instant Data Warehouse" Product

From: Michael J. Saylor <saylor_at_strategy.com>
Date: 1996/02/14
Message-ID: <31222AAA.D4_at_strategy.com>#1/1


Regarding:
"Deliverable ROLAP capacity may be less limited by theoretical database size than by performance and the hassles of maintaining hundreds or thousands of summary tables, so their deliverable capacity is not necessarily larger than MDBs, and the cost of their scalability is high in people and hardware costs."

Nigel,
Consider the Woolworths data warehousing project again, in light of the above comments, and its significance becomes much clearer. Woolworths (under the direction of Will Gee) has managed to create an OLAP application with acceptable performance against atomic (i.e. non-redundant, linearly independent) data volumes in excess of 50 gigabytes without relying upon any indexes or aggregates.

MDDB solutions rely much more heavily on aggregation than ROLAP solutions, and even though they do a good job of making the maintenance of these aggregates transparent to the developer, they cannot avoid the significant CPU cost of maintaining these tables over time. ROLAP solutions may or may not depend upon aggregation, as Woolworths has effectively demonstrated. It is true that any OLAP application (whether it is ROLAP or MDDB) with a high degree of aggregation will suffer from maintenance challenges as the degree of atomicity increases.

In my panel at the Data Warehousing Conference last week in Orlando, I debated this topic with representatives from Pilot, SAS, Dimensional Insight, and Prodea. At one point, Aaron Zornes asked point-blank: "How much raw (atomic) data in your average installation, and what is your maximum at this point in time?" I answered first, noting that we have a number of customers with atomic volumes in excess of 100 gigabytes, with the average atomic volume in the 25-100 gigabyte range.

The vendors that rely primarily on proprietary data structures for their analysis (Pilot, SAS, Dimensional Insight) did not know what their largest atomic database was (or refused to offer this information). All noted that their average installation relied upon approximately 300-700 MEGABYTES of atomic data. Let's keep in mind that Arbor's famous 40 gigabyte benchmark database had approximately 50 megabytes of atomic information. Based upon the underlying mathematics and our own industry sources, I suspect that there are few instances of MDDB applications which crack the 5 gigabyte atomicity barrier.
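
For a sense of scale, the Arbor figures quoted above imply a very large expansion from atomic data to stored data. A rough sketch of that arithmetic in Python, assuming 1 gigabyte = 1000 megabytes (the function and variable names are purely illustrative):

```python
MB_PER_GB = 1000  # assumption: decimal units, 1 GB = 1000 MB


def expansion_factor(stored_gb, atomic_mb):
    """Ratio of total stored size to atomic (non-redundant) size."""
    return stored_gb * MB_PER_GB / atomic_mb


# Figures from the text: a 40 gigabyte database holding about 50 megabytes of atomic data.
arbor = expansion_factor(stored_gb=40, atomic_mb=50)
print(f"Stored data is roughly {arbor:.0f}x the atomic data")
```

Under that unit assumption the stored database is on the order of 800 times the size of the atomic data it was built from.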

Thus, I think it is not unreasonable to offer the following conclusions:

(a) There is little overlap between MDDB and ROLAP applications.
(b) ROLAP applications are definitely more scalable than MDDB.
(c) ROLAP applications are 100x as large as MDDB applications.
(d) If you have more than 5 gigabytes of atomic data, ROLAP is the only choice, and ROLAP vs. MDDB comparisons are irrelevant.
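
As a rough check on (c), the average atomic volumes reported on the panel can be compared directly. This is only a sketch: it assumes 1 gigabyte = 1000 megabytes and simply pairs the ends of the two quoted ranges, and the names are illustrative.

```python
MB_PER_GB = 1000  # assumption: decimal units, 1 GB = 1000 MB

# Average atomic volumes as quoted on the panel.
rolap_avg_gb = (25, 100)   # ROLAP: 25-100 gigabytes
mddb_avg_mb = (300, 700)   # MDDB: 300-700 megabytes


def size_ratio(rolap_gb, mddb_mb):
    """How many times larger the ROLAP atomic volume is than the MDDB one."""
    return rolap_gb * MB_PER_GB / mddb_mb


# Smallest ROLAP average against largest MDDB average, and the reverse.
low = size_ratio(rolap_avg_gb[0], mddb_avg_mb[1])
high = size_ratio(rolap_avg_gb[1], mddb_avg_mb[0])
print(f"ROLAP/MDDB atomic size ratio: about {low:.0f}x to {high:.0f}x")
```

The quoted averages put the ratio somewhere between a few dozen and a few hundred, so 100x is an order-of-magnitude figure rather than an exact one.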

ROLAP may be more expensive than MDDB on an absolute basis, but it is certainly not 100x more expensive. When you begin to divide the cost of that "exotic" hardware by a factor of 100, it doesn't seem so bad after all. Further, ROLAP offers far more than 100x the query scope, since our technology is capable of managing significantly more sophisticated relationships (via joins and multi-SQL execution plans) between all this data.

Regards,
Michael

-- 
Michael J. Saylor			MicroStrategy, Inc.
President & CEO				8000 Towers Crescent Drive
Voice: 	(703) 848-8620			Vienna, VA 22182
FAX: 	(703) 848-8669			Office:	(703) 848-8600
saylor_at_strategy.com			http://www.strategy.com/

	"Data Warehouse Decision Support Solutions"
Received on Wed Feb 14 1996 - 00:00:00 CET
