Re: Representing data on disk in an MV database - was Re: foundations of relational theory? - some references for the truly starving
Date: Fri, 24 Oct 2003 08:53:53 -0400
Message-ID: <NbKdnRF3VqaBvASiU-KYiw_at_golden.net>
"Jonathan Leffler" <jleffler_at_earthlink.net> wrote in message
news:6L1mb.4003$wc3.803_at_newsread3.news.pas.earthlink.net...
> Mike Preece wrote:
>
> > Jonathan Leffler <jleffler_at_earthlink.net> wrote:
> >>Mike Preece wrote:
> >>>[...]
> >>> It might be worth mentioning at this stage that the amount of data you
> >>>can store in a Pick item will require far less physical disk space
> >>>than on any other (uncompressed) database. There are two reasons for
> >>>this:
> >>>1) because every field is of variable length - every datum is just as
> >>>long as it needs to be and no more and empty fields occupy no space at
> >>>all (other than single character system delimiters) - and
> >>>2) [...]
> >>
> >>Does that mean that all field values are stored as printable character
> >>strings?
> >
> > Yes they are.
> >
> >>I've seen similar comments before, and always been puzzled
> >>by them. It seems to be the only way to understand the use of a fixed
> >>byte code as the value separator (sorry - I've forgotten the correct
> >>MV terminology).
> >
> > Apology not necessary. Thanks anyway - it's like a breath of fresh
> > air.
> > Char(255) = End of item mark
> > Char(254) = Attribute mark
> > Char(253) = Value mark
> > Char(252) = SubValue mark
> >
> >>If I'm correct, and if an MV database is storing integer values in the
> >>range 0..999 (or maybe 0..9999), then the space used is no worse than
> >>in a system which uses a binary data representation, but as soon as
> >>you're dealing with numbers in the millions, instead of using a fixed
> >>length (4 bytes for a 32-bit integer), this system seems to be using a
> >>lot more space. (Yes, I quietly ignored negative numbers - I'm
> >>curious about how they're represented on disk too.)
> >
> > OK. I'm prepared to concede this point - to a degree. Packed decimal
> > isn't allowed in Pick - out of the box, you'd have to 'roll-your-own'
> > functions to allow it. This would potentially mess with system
> > delimiters (x'7F' would be a problem for example). The number 12345
> > takes 5 bytes, as does 123.45 (numbers are generally stored in
> > internal format with input and output conversions applied). Minus
> > 123.45 takes 6 bytes. Obviously packed decimals take up less space. I
> > suppose then, if your data comprises mostly packed decimals in a
> > database that caters for variable length data (I wonder how fields
> > would be delimited though), you would be able to get more data into
> > less physical disk space. As a matter of academic interest I'd like to
> > know which database can do this and how it does it.
>
> Well, maybe I can shed some light on this - at least as it applies to
> the Informix databases (SE, OnLine, IDS, XPS that is - not RedBrick,
> nor the U2 systems, nor yet Cloudscape or the other half-dozen or so
> DBMS that Informix acquired at various times), and slightly less
> definitively to DB2 in its various forms. I'm fairly sure that the
> basic ideas I'll outline apply to Oracle, MS SQL Server, Sybase -- and
> even some of the non-relational systems such as IMS.
It also applies to small footprint systems like SqlBase. Received on Fri Oct 24 2003 - 14:53:53 CEST