Home » SQL & PL/SQL » SQL & PL/SQL » Most disk space-efficient number type? (Oracle RAC 11gR1 on Oracle EL5)
Most disk space-efficient number type? [message #313059] Thu, 10 April 2008 09:23 Go to next message
Mopper
Messages: 11
Registered: April 2008
Location: Belgium
Junior Member
Hi,

I'm currently estimating the required disk space for a table that will contain the results of a data mining algorithm. This table will contain something along the lines of 500,000,000,000 rows, and will have 3 columns: one with a foreign key to a generated primary key in the source data table, one with a foreign key to a table that contains the models, and one BINARY_DOUBLE column that contains the predicted result.

I need to keep the disk space as limited as possible, but the column that stores the result (BINARY_DOUBLE) can not be touched. So I was wondering what the best format would be for the foreign keys (and the primary keys in tables they refer to). As of now, both are declared as SMALLINT, but I am not sure if this is the most efficient way.

The value for the foreign key to the source data table should never exceed 100,000,000 and the value for the foreign key to the table with the models should never exceed 2'000'000; so any datatype that can represent all these values should do, as long as it's as small as possible on disk.

I would love to hear your suggestions on this matter, as we only recently started working with Oracle, and how things are stored at low level isn't exactly the first thing you learn Smile

Thanks in advance.

[Edit: typo.]

[Updated on: Thu, 10 April 2008 09:24]

Report message to a moderator

Re: Most disk space-efficient number type? [message #313061 is a reply to message #313059] Thu, 10 April 2008 09:29 Go to previous messageGo to next message
ThomasG
Messages: 3189
Registered: April 2005
Location: Heilbronn, Germany
Senior Member
The data type doesn't seem to matter.

According to the Concepts Manual

Quote:

Internal Numeric Format

Oracle stores numeric data in variable-length format. Each value is stored in scientific notation, with 1 byte used to store the exponent and up to 20 bytes to store the mantissa. The resulting value is limited to 38 digits of precision. Oracle does not store leading and trailing zeros. For example, the number 412 is stored in a format similar to 4.12 x 102, with 1 byte used to store the exponent(2) and 2 bytes used to store the three significant digits of the mantissa(4,1,2). Negative numbers include the sign in their length.

Taking this into account, the column size in bytes for a particular numeric data value NUMBER(p), where p is the precision of a given value, can be calculated using the following formula:

ROUND((length(p)+s)/2))+1



Re: Most disk space-efficient number type? [message #313064 is a reply to message #313061] Thu, 10 April 2008 09:36 Go to previous messageGo to next message
ThomasG
Messages: 3189
Registered: April 2005
Location: Heilbronn, Germany
Senior Member
Ah, found the additional directly below that :

Quote:

Floating-Point Numbers

....

BINARY_FLOAT is a 32-bit, single-precision floating-point number datatype. Each BINARY_FLOAT value requires 5 bytes, including a length byte.
BINARY_DOUBLE Datatype

BINARY_DOUBLE is a 64-bit, double-precision floating-point number datatype. Each BINARY_DOUBLE value requires 9 bytes, including a length byte.

Re: Most disk space-efficient number type? [message #313066 is a reply to message #313059] Thu, 10 April 2008 09:42 Go to previous messageGo to next message
Mopper
Messages: 11
Registered: April 2008
Location: Belgium
Junior Member
Thanks for your reply. If i understand correctly, Oracle will always store numbers in the most efficient way possible, and there is no way to affect this with using various datatypes. This makes me wonder a bit about how useful all these subtypes from NUMBER are, but I guess it means less work for me Smile


Thank you for your quick response.
Re: Most disk space-efficient number type? [message #313068 is a reply to message #313059] Thu, 10 April 2008 09:45 Go to previous messageGo to next message
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
Search on compression in oracle reference manual. It is useful on tables which are only inserted and selected. If you do updates then compress could prove expensive. For more information check the oracle reference manual or asktom.oracle.com.

Hope that helps

Regards

Raj
Re: Most disk space-efficient number type? [message #313070 is a reply to message #313059] Thu, 10 April 2008 09:51 Go to previous messageGo to next message
Mopper
Messages: 11
Registered: April 2008
Location: Belgium
Junior Member
Sounds interesting, I'll look into it.

Thanks.
Re: Most disk space-efficient number type? [message #313071 is a reply to message #313068] Thu, 10 April 2008 09:54 Go to previous messageGo to next message
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
To add a little bit value to what I said check this link.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8806017524595

Regards

Raj
Re: Most disk space-efficient number type? [message #313238 is a reply to message #313059] Fri, 11 April 2008 03:49 Go to previous messageGo to next message
Mopper
Messages: 11
Registered: April 2008
Location: Belgium
Junior Member
I've been playing around with the number formats and compression this morning. This is the table I've been experimenting with:
CREATE TABLE "DATAMINER"."RESULT_TEST" 
   (	"FK_DATA" NUMBER(38,0), 
	"FK_BETA" NUMBER(38,0), 
	"PREDICTION" BINARY_DOUBLE
   )


I put 581,012 records in it. So, to get the size of the table, I did:
 SELECT BYTES FROM USER_SEGMENTS WHERE SEGMENT_NAME ='RESULT_TEST';

     BYTES
----------
  15728640


That's about 15MB, and comes down to just over 27 byte per record. So, given that a BINARY_DOUBLE always takes 9 byte, the other 2 must take about 9 byte each as well, which I found rather a lot. (FK_DATA is between 2467 and 583'478, FK_BETA is 1 for every record).

I had declared both the FK_DATA as the FK_BETA columns as SMALLINT, but this still gave a scale of 38, which is a lot more than I need. So, I dropped the table and re-created it with both FK_ columns declared as NUMBER(10,0) (which is about the upper limit of what I'll ever need), and inserted all the data again. Now, when I checked the size of the table again, it was exactly the same as before. This could be explained by the fact that Oracle only takes as much space as it needs for a NUMBER, and since none of the values are bigger than NUMBER(10), none of them were affected. But. That would mean that Oracle needs 9 byte to store values between 2467 and 583'478, not to mention 9 byte to store a simple '1' value? I find this really strange.

A more successful story is the compression though. Since this table's data is most likely to get written once, never updated, but read a lot I took compression out for a spin. After a
ALTER TABLE RESULT_TEST MOVE COMPRESS;
, (the table's data was ordered by the least discriminating columns first) I got the following size:
SELECT BYTES FROM USER_SEGMENTS WHERE SEGMENT_NAME ='RESULT_TEST';



     BYTES

----------

   8388608

That's about 8MB, or 14.4 byte/record. That's almost half of what it was! This will save us gigabytes on the final result. But, even though this table is suited very well for compression, it still leads me to believe that Oracle is wasting space when not compressing the table. Anyhow, I'm happy with the result we've got so far, especially with the compression Smile

Greetings
Re: Most disk space-efficient number type? [message #313254 is a reply to message #313238] Fri, 11 April 2008 04:33 Go to previous message
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
Thanks for coming back to us with your inputs.

Regards

Raj
Previous Topic: need help on writing query for a complex situation
Next Topic: Get last 5 dates
Goto Forum:
  


Current Time: Sun Dec 04 00:35:51 CST 2016

Total time taken to generate the page: 0.07200 seconds