Re: SQL Humor

From: Paul <paul_at_test.com>
Date: Fri, 19 Aug 2005 11:16:51 +0100
Message-ID: <4305b5ac$0$1317$ed2619ec_at_ptn-nntp-reader02.plus.net>


Mikito Harakiri wrote:

>>Many companies use a short mnemonic code for their customers. The use of
>>a short mnemonic code for products is not uncommon either. Imagine a
>>customers table with 10,000 customers, a products table with 5,000
>>products and an orders table with a few million rows. Now would you
>>prefer char(6) or varchar(6) for CustomerCode? And char(5) or varchar(5)
>>for ProductCode?

>
> But if lookup table cardinality goes up, then, the storage factor
> char(n) vs varchar(n) goes down! Perhaps, you can convince me that the
> effect of the two trailing bytes is not miniscule, as I previously
> thought, but it just can't be significant. Any benchmark demonstrating
> that the performance degradation is not in single percentage digits is
> welcome.

I guess the char() is really just being used as a hint to the DBMS so it can make an informed decision of what physical data structures to use.

Maybe a better solution would be to have a single type ("string" or whatever) but then have a check constraint like len(column) < 6.

So you could regard the char datatype as shorthand for a varchar datatype with a check constraint. The two ways of looking at it are functionally identical. I think in practice certain DBMSs will internally store a varchar(n) as a char(n) for sufficiently low values of n anyway.

In theory, check constraints should be helpful to the DBMS, both for queries and for deciding physical storage structures.

So there are three possibilities:
1) have char(n) and varchar(n) datatypes 2) scrap char(n) and just have varchar(n). The DBMS can decide to use different internal storage methods if n is small enough. 3) have varchar with no maximum length specified. Now if you forget to specify a maximum length via a check constraint, the DBMS isn't able to optimize things so well.

You might have a similar argument with tinyint, smallint, int, bigint types.

tinyint is really just an int column with a check constaint of 0 <= column <= 255

The question is: should the length constraint be part of the type or part of the database? Does it matter even?

Paul.

Paul. Received on Fri Aug 19 2005 - 12:16:51 CEST

Original text of this message