Re: all foreign key should have index?

From: Christopher Browne <cbbrowne_at_acm.org>
Date: Thu, 02 Feb 2006 17:35:43 -0800
Message-ID: <m34q3hns0g.fsf_at_mobile.int.cbbrowne.com>


> Gene Wirchenko wrote:
>
>> On Wed, 1 Feb 2006 20:17:11 +0000 (UTC), "Murdoc"
>> <murdoc_0_at_hotmail.com> wrote:
>>
>> > Gene Wirchenko wrote:
>> >
>> >> On Tue, 31 Jan 2006 22:12:56 +0000, Eric Junkermann
>> >> <eric_at_deptj.demon.co.uk> wrote:
>> >>
>> >> [snip]
>> >>
>> >> > When you delete a parent row, or update its key, the DBMS needs to find
>> >> > the children, either to cascade the operation or to forbid it - how can
>> >> > it do this efficiently without an index? But of course if you never do
>> >> > those things,
>> >> > you might still need it to find child rows efficiently anyway.
>> >>
>> >> Why does it have to be an index?
>>
>> > Generally: efficiency. If you want the DB to enforce referential integrity, the operation of
>> > doing so needs to be efficient. A search of the entire 'FK' table to ensure that a record can
>> > be deleted from the 'PK' table is (a) inefficient; and (b) pointless. Or even worse, on a
>> > cascade delete. The search (when using an index) really comes down to "Is there an entry in the
>> > index for this field value?".
>> >
>> > Again, it does come down to the size of the table. A table with a maximum of about 10 rows is
>> > not going to have a large performance gain with an index (or maybe a performance detriment).
>>
>> That is not the point. Why does it have to be an INDEX?
>>
>> You have been assuming that an index is the only way. It is not.
>
> Then what is the other way? What other mechanism allows efficient
> searching of a table without a full search?

Sometimes, if the table is fairly small, a full search is in fact the fastest method. Pointedly, if the table is likely to fit in one page, then a full search guarantees finding the result with one disk access. In comparison, any index-like indirection would require two disk accesses, and therefore be more expensive than full search. Indeed, the result of that is that full scan is "probably optimal" if the table fits into two pages of disk.

Some people assume that "index" is synonymous with "B-tree"; there are other data structures that they may not believe should be called an "index." Pointedly, hashing is a sometimes-efficient mechanism for searching for data. (It tends to seem less attractive when applied to disk-based structures than it is for in-memory data, but that's noth particularly relevant here...)

-- 
(format nil "~S_at_~S" "cbbrowne" "gmail.com")
http://linuxdatabases.info/info/spreadsheets.html
"How much more helpful could I be than to provide you with the
appropriate e-mail address? I could engrave it on a clue-by-four and
deliver it to you in Chicago, I suppose." -- Seen on Slashdot...
Received on Fri Feb 03 2006 - 02:35:43 CET

Original text of this message