Re: Problems with SELECT *

From: Pablo Sanchez <pablo_at_dev.null>
Date: Wed, 12 Mar 2003 12:58:24 -0600
Message-ID: <Xns933C79CD0CE72pingottpingottbah_at_216.166.71.233>


"Curtis \"Ovid\" Poe" <cp_at_onsitetech.com> wrote in news:b4nuvl$nab_at_dispatch.concentric.net:

> SELECT * FROM emails
>
> [ versus ]
>
> SELECT userid, email, contest FROM emails
>
> That SQL statement has a variety of benefits:
>
> 1. It's immediately obvious to the maintenance programmer exactly what
> is being requested.

I think that's probably the most key issue in your example.

> 2. It's more efficient than selecting a bunch of columns you don't
> need, if you have more columns in the table than you're actually using.

Databases physically store data in an atomic page of information: data page or block. Within the block, as many rows are stored as possible. When a SELECT is done, the data page/block is read into the 'buffer cache' Whether you do SELECT * or SELECT userid, all the data is in the buffer cache.

Your point becomes valid if and only if you're moving 'SELECT *' from the backend to the middleware, otherwise the data is there and it's not that big of a deal.

> 3. If you access fetched data in a particular order (fetching the
> fields into an array, for example), you don't have to worry about
> columns in the database being reordered.

Good point! I'm all for independence between the operational and the development. :)

> 4. If access the data by name (in a hash or dictionary) you will
> understand faster why "emails.userid" failed when the field was renamed
> to "user_id".

I'm unclear on your point here.

> 5. If someone removes a column from the database, you get the same
> benefit as mentioned in the fourth point.

I was thinking it's similar to point 1 - the apps will break during QA (what's that?! :) and you can fix accordingly.

> I've now been asked where I can find an online article which details
> similar pitfalls (including the "select * from foo" example). I've
> had trouble finding anything and I was wondering if anyone could
> suggest some?

I think the maintainability is the biggest issue because applications evolve over time. 'select *' opens the possibility for the 'unknown' and that's always bad for applications. :)

-- 
Pablo Sanchez, High-Performance Database Engineering
http://www.hpdbe.com
Received on Wed Mar 12 2003 - 19:58:24 CET

Original text of this message