Re: Hashing for DISTINCT or GROUP BY in SQL

From: paul c <anonymous_at_not-for-mail.invalid>
Date: Sat, 16 Oct 2010 00:32:52 +0000 (UTC)
Message-ID: <i9arrj$o9f$1_at_tioat.net>


On 15/10/2010 4:38 PM, Cimode wrote:
> Hi paul
>
> Thanks for elaborating about the historic context. I believe you have
> misunderstood the point I am trying to make. This is not about
> physical vs logical confusion or a database theory vs database
> implementation confusion. It is about defining the math the physical
> must satisfy to effectively support the logical.
>
> I can conceive that VSAM is not significant for somebody observing
> database science from a purely logical perspective of relation
> operation and structural definition. But logical database science is
> not all database science. The history of theory behind physical
> implementations following the development of database science is just
> as significant. I suspect that it is the inability of logical
> theorists to conceive mathematical models that could allow
> implementations of higher abstraction logical principles on binary
> current mechanized addressing schemes that explains a part of the failure
> of the database science as a whole in contemporary times.
>
> For instance, I sometimes find logical theorists naive when they assume
> no math could exist behind the algorithms of database implementation
> attempts. And I find them even more naive, when they believe that the
> lack of such math can not have an impact on the bias of how the
> logical database science is conceivable. As for me, I can not
> dissociate one from the other: though the logical(RM) dictates the
> intent, the physical dictates the method to respect the intent.
>
> I consider VSAM significant from a physical database implementation
> perspective, along with the lower level theory behind it, as compared to
> previous systems. In previous systems, seek algorithms were mostly
> relying on run time sorting and linear dichotomic searches. VSAM
> introduced a sophisticated seeking scheme based on the concept of
> dichotomic leaf search aka *register zig zags* which is widely
> implemented on direct image systems. The algorithms probably inspired
> by fractal theory allowed an order of magnitude reduction of number of
> logical IO's necessary to reach a specific pre-determined value. In a
> sense, the clusters were not *just* a dumb stack of files, but were
> also the ancestors of today's ordered indexes (known as clustered
> indexes) frequently used on SQL systems.
>
> The logic behind the data structure also relies on the concept of pre
> order presenting similarities with the latest transrelational model. I
> believe somehow that this constitutes an evolution that can't be
> neglected since it allowed to conceive as possible the implementation
> of a number of operators such as NOT EXISTS that were previously
> considered resource prohibitive under ISAM systems.
>

Cimode, to the extent I understand it, no argument. Like most people I saw VSAM only from the outside. It did have a couple of good theoretical underpinnings, one being B+ trees. IBM would have had some thinkers aware of that idea since the originator (forget his name) was at Boeing, a big customer of theirs. I'm usually pretty hard on IBM even though for much of my life it was indirectly responsible for most of my income. To be fair, I'm sure lots of the s/w products that came out of it started out as pretty pure theory that got 'adjusted' to be downward-compatible with the stuff customers had already been sold.

Received on Sat Oct 16 2010 - 02:32:52 CEST
