Representing data on disk in an MV database - was Re: foundations of relational theory? - some references for the truly starving

From: Jonathan Leffler <jleffler_at_earthlink.net>
Date: Thu, 23 Oct 2003 17:06:15 GMT
Message-ID: <bUTlb.2937$wc3.186_at_newsread3.news.pas.earthlink.net>


Mike Preece wrote:
> [...]

 > It might be worth mentioning at this stage that the amount of data you
> can store in a Pick item will require far less physical disk space
> than on any other (uncompressed) database. There are two reasons for
> this:
> 1) because every field is of variable length - every datum is just as
> long as it needs to be and no more and empty fields occupy no space at
> all (other than single character system delimiters) - and
> 2) [...]

Does that mean that all field values are stored as printable character strings? I've seen similar comments before, and always been puzzled by them. It seems to be the only way to understand the use of a fixed byte code as the value separator (sorry - I've forgotten the correct MV terminology).

If I'm correct, and if an MV database is storing integer values in the range 0..999 (or maybe 0..9999), then the space used is no worse than in a system which uses a binary data representation, but as soon as you're dealing with numbers in the millions, instead of using a fixed length (4 bytes for a 32-bit integer), this system seems to be using a lot more space. (Yes, I quietly ignored negative numbers - I'm curious about how they're represented on disk too.)

How significant that is depends on many factors. Informix databases, for example, store a TIMESTAMP(5) - actually, DATETIME YEAR TO FRACTION(5) in Informix's rather weird notation, but equivalent to the SQL standard TIMESTAMP type - in just 11 bytes of data, but the text version requires 25 bytes. Further, that's a fixed 25 bytes (or, at best, the variable part is just 0-5 digits after the decimal point separating seconds from fractional seconds, and possibly the decimal point itself - 19-25 bytes).

What about other complex data types? Indeed, does an MV database support typed data?

Obviously, if my strawman target is off-base (first paragraph), the comments in the other paragraphs are inapplicable.

-- 
Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler_at_earthlink.net, jleffler_at_us.ibm.com
Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/
Received on Thu Oct 23 2003 - 19:06:15 CEST

Original text of this message