Re: DB clasical structure violation

From: Anthony Youngman <anthony.youngman_at_eca-international.com>
Date: 28 Jun 2002 01:02:16 -0700
Message-ID: <2c63b9ee.0206280002.3cc3c8a3_at_posting.google.com>


Stephan Eggermont <stephan_at_stack.nl> wrote in message news:<afcpni$odh$1_at_news.tue.nl>...
> In comp.databases.olap Paul Vernon <paul.vernon_at_ukk.ibmm.comm> wrote:
> >> > You'll find plenty of counter arguments to your theory on Database
> Debunkings
> >>> (http://www.dbdebunk.com).
>
> >>The arguments given there are not relevant for datawarehouse-like
> >>applications.
>
> Now that's a candidate for dbdebunk's quote of the week if ever I saw one
> > :-)
>
> They are welcome to use it, if they want :-)
>
> > Stephen, have you read say http://www.dbdebunk.com/kimball1.htm ?
>
> Yes, I have. The design principle is: 'Avoid duplication of volatile
> information' (Edsger W. Dijkstra). In a datawarehouse the data is
> not volatile.
>
Again, I'm slamming SQL here, not relational ...

Let's start with a definition of "large". A large database to me is one where the active working set is greater than available RAM. Most data-warehouses are large by this definition, maybe even often "humungous".

And it's in exactly this scenario that Pick will score. It will ABSOLUTELY FLATTEN almost any other database for speed. SQL-based databases have query optimisers. Why? Because an unoptimised query is so inefficient that improvements of one or two HUNDRED percent are easy to achieve. Pick doesn't have optimisers, because even two achieve two or three percent would take performance beyond excellent, through perfect, into the mathematically impossible.

Pick was designed to use disk as virtual RAM, and memory as the working cache. It was also designed to work tolerably fast with 4Kb of magnetic core in a multi-user interactive environment. Do you know ANY RDBMS that could work at speeds remotely close to Pick on a large database?

I don't know how popular the DataStage data warehouse environment is. I do know the company (Ascential, what's left of Informix after IBM bought the bulk of it) is doing reasonably well. And I do know that DataStage is built on a MultiValue (Pick-based) engine. In fact, to the best of my knowledge, almost no data warehouses are built on relational technology. The grunt required to process the typical quantity of data in a warehouse prohibits the use of SQL technology - it's just too inefficient. After all, isn't the main point of datawarehouses to store pre-analysed data to try and reduce the overhead of relational queries because the systems storing the live data would simply curl up their toes under the load?

Cheers,
Wol Received on Fri Jun 28 2002 - 10:02:16 CEST

Original text of this message