Oracle FAQ Your Portal to the Oracle Knowledge Grid

Home -> Community -> Usenet -> c.d.o.server -> Re: A searchable datastructure for represeting attributes?

Re: A searchable datastructure for represeting attributes?

From: <no_spam_for_me_at_thankyou.very_much>
Date: Thu, 28 Feb 2002 18:30:48 -0500
Message-ID: <3C7EBDA8.75833AFE@thankyou.very_much>

Thanks, that's a good point. Do you know whether Oracle or SQL Server supports compressing data that way?

Dre

--CELKO-- wrote:

> >> Again, since car seems to be a popular example, imagine you have a
> database of cars in a town where almost 99% of people like exactly the
> same sort of car, a blue toyota. If you store 5 million rows like:
>
> id make color
> 1 toyota blue
> 2 toyota blue
> 3 toyota blue
> .....
> 5000000 toyota blue
>
> Yes, it is very good and normalized in Dr. Codd's sense, but you still
> store 100 times more information than you need to store with what I
> described in approach_1. That is, absence of a key indicates a default
> of make=toyota and color=blue. Yes, I can see how this is anathema to
> relational design, but it is a real world problem. <<
>
> You are confusing PHYSICAL storage with the LOGICAL model. Yes, many
> file-oriented versions of SQL (SQL Server, DB2, Oracle) tend to
> actually PHYSICALLY repeat values in storage. But if I were using
> the Nucleus SQL engine from Sand Technology, there would be a bit
> vector with a 1 for (make = 'toyota') and a 1 for (color = 'blue') at
> the appropriate positions. This bit vector is then compressed and all
> queries are done on the compressed form. The original data is
> re-constructed one column at a time on output.
>
> The more repetition in the data, the smaller the Nucleus database
> gets. The Nucleus engine invites you to split telephone numbers into
> (area code, exchange, phone number) columns to save space because
> area codes and exchanges repeat.
>
> In fact, a good rule of thumb for this product is that the size of the
> entire database will be 80% or less of the size of the original data.
> Your data could well be less than 20% of the original size.
>
> Obviously, this is a data warehouse tool.
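
The bit-vector idea described in the quoted text can be illustrated with a small Python sketch. This is not how Nucleus is actually implemented; run-length encoding is only an assumed stand-in for whatever compression the product uses, and every function name here is hypothetical. It shows one bit vector per distinct value, compressed into runs, with a two-column query answered without expanding the vectors.

```python
from itertools import groupby


def build_bitmaps(column):
    """Map each distinct value to a bit list with a 1 at every row holding it."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[row] = 1
    return bitmaps


def rle_encode(bits):
    """Run-length encode a bit list as (bit, run_length) pairs."""
    return [(bit, sum(1 for _ in run)) for bit, run in groupby(bits)]


def rle_and(a, b):
    """AND two run-length encoded vectors of equal length, run by run."""
    out = []
    i = j = 0
    rem_a = a[0][1] if a else 0
    rem_b = b[0][1] if b else 0
    while i < len(a) and j < len(b):
        step = min(rem_a, rem_b)          # rows covered by both current runs
        bit = a[i][0] & b[j][0]
        if out and out[-1][0] == bit:     # merge with the previous run
            out[-1] = (bit, out[-1][1] + step)
        else:
            out.append((bit, step))
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            if i < len(a):
                rem_a = a[i][1]
        if rem_b == 0:
            j += 1
            if j < len(b):
                rem_b = b[j][1]
    return out


def count_ones(runs):
    """Count matching rows directly on the compressed form."""
    return sum(length for bit, length in runs if bit == 1)


# Toy data (made up for illustration): mostly blue toyotas.
makes = ["toyota"] * 6 + ["honda"] + ["toyota"] * 3
colors = ["blue"] * 4 + ["red"] + ["blue"] * 5

toyota = rle_encode(build_bitmaps(makes)["toyota"])
blue = rle_encode(build_bitmaps(colors)["blue"])

# Equivalent of: SELECT COUNT(*) WHERE make = 'toyota' AND color = 'blue'
blue_toyotas = count_ones(rle_and(toyota, blue))
```

With 5 million identical rows, each of these vectors would collapse to a single run, which is why more repetition means a smaller store in this scheme.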
Received on Thu Feb 28 2002 - 17:30:48 CST
