Re: SQL Humor

From: Hugo Kornelis <hugo_at_pe_NO_rFact.in_SPAM_fo>
Date: Thu, 18 Aug 2005 22:50:33 +0200
Message-ID: <e7s9g15sq4c1p45kgdvooh4d0ckgbdpemo_at_4ax.com>


On 18 Aug 2005 13:17:37 -0700, Mikito Harakiri wrote:

>Stu wrote:
>> Not that I totally disagree with the concept of deprecating char vs
>> varchar, but I see this argument used a lot, and there's two flaws with
>> it.
>>
>> 1. Storage size may be cheap and plentiful, but performance should
>> always be foremost in the DBA mind. 2 bytes in a single column pf
>> storage may not be much, but you also need to write queries that
>> retrieve that extra 2 bytes. It's not just disk space; it's also
>> memory and CPU.
>
>How much performance difference does it really make? Keep in mind that
>in a typical table you can declare char a couple of boolean (Y/N)
>columns at most.

Hi Mikito,

The extra performance cost of declaring Y/N columns varchar instead of char is trivial compared to the extra cost of doing so for foreign key columns.

Most databases have lots of so-called "lookup tables". In the main table, the state is stored as a 2-letter code, country as a 3-letter ISO code and currency as another 3-letter ISO code. These codes are foreign keys into the States, Countries and Currencies tables, that have the primary key 2- or 3-letter code, a unique long name/description and possibly some other columns as well.

All these 2- and 3-letter codes will be in indexes, and these indexes will be heavily used during inserts and updates (to verify the foreign key constraint) and in queries (because users generally prefer to see the full name of the state/country/currency instead of the code).

Using varchar for a 2-letter code means that the space taken is doubled. This halves the number of rows that fit on one leaf page of the index. The result will be: more logical reads, lower cache hit ration, more physical reads --> slower performance.

>> 2. Failing to appreciate the differences between varchar and char sets
>> us on a path of lazy design. If it doesn't matter if I use char(10) vs
>> varchar(10), then what's the matter with varchar(50)? How about
>> varchar(51) etc? Eventually we could get away with "oh screw it; disks
>> are cheap, memory is cheap, CPU's are powerful, let's just put
>> everything in an text column and parse it on the fly".
>
>Failing to appreciate the differences between varchar and char sets is
>an ability to raise the level of abstraction. (BTW, the skill many DBAs
>lack.)

It's not a prime skill for DBAs. The abstraction level where the actual data type is irrelevant is where you'll find the information analyst, functional designer, conceptual modeler or whatever the current name of those people is. It's not the DBA's task to check if they did their job properly. The DBA should ensure that the database runs smoothly, that as blocking and deadlocks are minimized and that queries perform as fast as possible. Especially the latter is impossible to do without appreciating the difference between char, nchar, varchar, nvarchar, text, and ntext.

> With low level of abstraction you would never quit chaising
>perceived problems (eg. trying to figure out the "optimal" block size,

No SQL Server DBA will ever try tio figure out a block size. (Remember that this thread started in a SQL Server group and was crossposted to .theory later!)

>pondering if
>
>select * from table
>
>is faster than
>
>select col1, col2, ... from table

No need to ponder that -- all SQL Server DBAs (and presumably all DBAs for all serious RDBMS's) know that SELECT * should never be used in production code (except in a EXISTS(..) subquery).

Best, Hugo

-- 

(Remove _NO_ and _SPAM_ to get my e-mail address)
Received on Thu Aug 18 2005 - 22:50:33 CEST

Original text of this message