Re: Column space requirements - VARCHAR2 versus NUMBER

From: Rockhound57 <Rockhound57_at_NSA.com>
Date: Sun, 08 May 2011 23:08:59 -0400
Message-ID: <lmles61cbe5gc5it8731gal724nbg15sb9_at_4ax.com>



On Sat, 7 May 2011 11:56:46 +0200, "Peter J. Holzer" <hjp-usenet2_at_hjp.at> wrote:

>On 2011-05-04 01:26, Luke <luke_airig_at_hotmail.com> wrote:
>> Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit
>>
>> I have a table that will contain around 300 million rows so space
>> utilization is a primary criterion for the design of the table.
>>
>> One of the columns will be loaded with a variable length numeric value
>> from 1 to 9999999999 that will contain leading zeroes.
>
>What does "variable length ... that will contain leading zeroes" mean
>exactly:
>
>1) The value without leading zeroes is variable length, leading zeroes
> are used to pad to 10 digits ("0000000042" is a legal value, but
> "00042" is not).
>2) The length including leading zeroes is variable, but the leading
> zeroes are not significant ("0000000042" and "00042" could both occur
> but are considered the same value)
>3) The length including leading zeroes is variable and all digits are
> significant ("0000000042" and "00042" can both occur and are
> different values).
>
>> Since this numeric column value is not used in any computations, if
>> space were not so paramount, I would normally define it as
>> varchar2(10) but we are considering defining it as NUMBER (realizing
>> that we will lose the leading zeroes when the column is populated).
>
>This will only really work in the case 1 above. It will sort of work in
>case 2 (i.e. you can still identify your records but users may be
>irritated if "the computer eats their zeroes").
>
>It will not work at all in case 3. You can, however, make it work in
>case 3 by prepending a fixed digit. So you store "0000000042" as
>10000000042 and "00042" as 100042. For display purposes you strip the
>first digit. This will add 0.5 bytes per row on average.
>Alternatively you could use a varchar2 and store each pair of digits in
>a single ASCII character. That would shave off another byte compared to
>storing it as number, but has the disadvantage that you have cryptic
>strings instead of readable numbers in your database - probably not
>worth the bother.
>
> hp

Leading zeros... Think everyone haves them. There's a special place in Hell for people who define things identified by things using leading zeros!

Would help if we knew what this column content represented. Might lead to less guessing, better modeling.

Two blind thoughts.

(1) This is 10 numeric characters in one column in a 300 million row table. How many other columns, how big are they? or is it a waste of effort to try to tweak a teeny bit of storage out of what may be insignificant, in the grand scheme of things? If each record is fifty columns, some of which are rather "fat", why bother? just store 10 characters & keep all the zeros.

(2) If, for some reason, these mysterious leading zeros all must be there to a total of ten charactes worth in the string, (God knows why - some sort of part number, maybe?), why not just get the zeros "back" when retrieving them using LPAD in SQL & not bother with storing them in the table at all?

But a purist would say that if something is part # "000564" I'd want to store 000564, NOT 564. And I'd have to say I'd agree, too. Integrity of the data is more important than saving a byte or 2.

5 years from now, when the systems have grown, mutated & changed beyond all recognition, nobody will ever know why that particular column happened to be fiddled with - best to "keep it real". :) Received on Sun May 08 2011 - 22:08:59 CDT

Original text of this message