Re: satisfies algorithm

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 27 Jul 2008 11:00:49 -0400
Message-ID: <Q40jk.17392$89.5634_at_nlpi069.nbdc.sbc.com>


"David Portas" <REMOVE_BEFORE_REPLYING_dportas_at_acm.org> wrote in message news:GZudndkIeMw_1xHVnZ2dneKdnZydnZ2d_at_giganews.com...
> "Brian Selzer" <brian_at_selzer-software.com> wrote in message
> news:wRRik.17363$89.1080_at_nlpi069.nbdc.sbc.com...
>>
>> In nearly all of the commercial implementations and implementations of
>> systems built with commercial implementations that pervade the industry,
>> where there is a unique constraint or a primary key constraint that needs
>> to be enforced, there is also an index. But thank you for pointing out
>> that there are alternatives to using indexes without also providing the
>> algorithms for those alternatives for the OP's edification. In other
>> words, "where's the beef?" Can you back up your claims with more than
>> marketing hype? I read the white papers Netezza provided on their web
>> site, but I didn't feel a thrill going up my leg, if you know what I
>> mean. Can you provide links to academic papers that describe these
>> alteratives in detail. I'm curious, and I would bet that others here are
>> too.
>>
>> By the way, how Netezza is implemented internally is anyone's guess.
>> They may sort active domains--storing them in a particular order. Who
>> knows? but wouldn't that be similar enough to be considered a facsimilie
>> of an index?
>>
>> --Brian
>>
>>
>
> I had hoped I wouldn't need to explain that it is possible to validate
> uniqueness without an index.

Of course you can validate uniqueness without an index: you can simply scan the entire relation every time you do an insert or update (anyone with half a brain could figure that out), but with a billion rows, that could be both time and resource intensive.

> For example some DBMSs will allow you to use a trigger or an assertion to
> enforce uniqueness. An index is simply a method for speeding that process
> up. In any case I don't expect discussions about fundamental matters to be
> limited to what "nearly all of the [current] commercial implementations"
> can support. The facts are what they are regardless of what Microsoft,
> Oracle, et al are selling. A key is a logical construct and an index is a
> physical one.
>

The facts are what the liberal media say they are. I thought you knew that.

A key has the uniqueness property. Enforcing the uniqueness property of a key without scanning the entire relation every time an insert or update occurs requires implementing some form of indexing--whether that be a hash index or table or a b-tree index or storing active domains as ordered sets (or some facsimilie thereof, such as binary trees) or storing the relation itself as an ordered set (making it its own index, but that only works if the relation has only one key).

Prove me wrong.

> I'm really not sure what "claims" you want me to back up since you've
> already agreed that an index isn't always required (quote: "In nearly
> all..."). If you mean you want to know about optimisation strategies that
> don't involve indexes then you could start here: http://tods.acm.org/.
> There is a wealth of material on optimisation. Optimisation has nothing to
> do with keys of course, which brings us back to where we started...
>

I was hoping for more than a link that I already have in my favorites.

I notice that you didn't find fault with the meat of my original post, which was to normalize so that there can be no nontrivial functional dependencies where the determinant is not also a key. Why nitpick over the stupid stuff? I'll rephrase the last part of my original response: There should already be a mechanism in the dbms to ensure that keys are unique. Usually, that is accomplished through the use of unique indexes. Happy? My point was that a sound design obviates the need for a functional dependency enforcement algorithm. It should be noted that the least efficient algorithm for enforcing uniqueness--that is, scanning every row each time that there is an insert or update--is more efficient than the algorithm that the OP supplied due to the elimination of the sort. Received on Sun Jul 27 2008 - 17:00:49 CEST

Original text of this message