Re: A better SQL implementation?

From: Cimode <cimode_at_hotmail.com>
Date: 8 Jun 2006 02:03:35 -0700
Message-ID: <1149757415.762602.95960_at_i39g2000cwa.googlegroups.com>


David Portas wrote:
> Cimode wrote:
> > Below is a link to a friend's website who developped his own
> > implementation of SQL considering that all current SQL implementations
> > are unefficient. This SQL is embedded into his database engine (Atlas)
> > Atlas defined a DBMS *set system* as a reference to set theory.
> >
> > As an indicative annotation, he was a part of the developping team that
> > worked on System-R.
> >
> > He has an interesting truth based historical perspective about
> > evolution of SQL implementations from early days and supports that SQL
> > should have been implemented otherwise. Here's a link to a white paper
> > put on his website. I would like your opinion on that.
> >
> > http://www.armadillo.fr/english/whitepapers/WHITEPAPER_2004.htm
>
>
> Cimode wrote:
> > Below is a link to a friend's website who developped his own
> > implementation of SQL considering that all current SQL implementations
> > are unefficient. This SQL is embedded into his database engine (Atlas)
> > Atlas defined a DBMS *set system* as a reference to set theory.
> >
> > As an indicative annotation, he was a part of the developping team that
> > worked on System-R.
> >
> > He has an interesting truth based historical perspective about
> > evolution of SQL implementations from early days and supports that SQL
> > should have been implemented otherwise. Here's a link to a white paper
> > put on his website. I would like your opinion on that.
> >
> > http://www.armadillo.fr/english/whitepapers/WHITEPAPER_2004.htm
>
> The paper is rather confusing to me. It claims the index structure is
> the novel feature of Atlas but the author doesn't explain what the
> index structure is or how it differs from other indexes. The only clue
> is the diagram of "Normal" vs "Set" indexes but those terms and the
> diagrams are strange to me and they explain nothing. How does this
> system compare to bitmap indexes? Hashing? B-trees? Clustering? etc.
Yes it is confusing to me as well on RM perspective. From what the author claims it is more designed for user's audiences than technically knowledgeable audiences as he has troubles trying to make the company take off.
> Throughout there seems to be the mistaken idea that SQL is a set-based
> data model and language. Informally, we often say that (I do anyway)
> but that's not really good enough for a serious discussion about
> database design and implementation.
Yes I know terminology traduces as confusions about RM. Still, based on the
implementation I have observed and tested it is a promising track to explore. For instance (looking at pseudo code), the idea of limiting logical disk accesses by computations of intersection and union rather than obtaining results by comparing values is a plausible idea...
> Then there is this:
>
> Select * from Document where parag = 'kw1' and parag = 'kw2'
>
> If the author wants that statement to return anything other than zero
> rows then he has to define a whole new SQL syntax and apparently a
> different *logical* data model as well. He obviously isn't using SQL's
> model. But none of that is explained other than with trivial examples.
> Where is the grammar that describes the language he has used? Where is
> the formal specification of the data model?

At first look, yes but not a much as you think...it appears clearly that parag refers to a document data type that uses '=' syntaxic representation of inclusion of a specific word into that document. The following expression...

select * from Document where parag = 'kw1' and parag = 'kw2'

is in fact equivalent to (using pseudo-code)

select * from Document where parag INCLUDING ('kw1' and 'kw2')

Otherwise, the logical meaning of the symbol '=' has been determined by audiences (commitees) that know as much about RM as I know about chinese language. Considering that SQL's current support of complex data type, how would you express in current SQL standard the following pseudo code

select * from Document where parag INCLUDING ('kw1' and 'kw2')

In common SQL implementations this kind of data type is usually expressed as

select * from Document where parag CONTAINS('kw1' and 'kw2')

using Full text search. And there is no clue onto how it is implemented physically. This implementation has the advantage of allowing to define indexes on a complex data type. and uses the '=' operator in a coherent manner. Think about it, in natural language, if you ask your assistant to pull out for you all documents that includes 2 specific words, you usually express it as saying to your colleague: "Could you find the document that include 'school' and 'student'". It can make sense...

> Finally, since Atlas is neither SQL nor Relational why bother making it
> look like SQL at all? If the author thinks we need a new data model or
> a better language then better design one without all the baggage and
> the mistakes that have hindered SQL from the start. He patently isn't
> doing that. For example, SELECT * implies that logical column order is
> still a "feature" of his SQL. He even allows tables without keys. I
> can't understand why anyone who was serious about reforming SQL would
> keep such huge flaws in the language.
>
> --
> David Portas

I do not believe that the author tries to reengineer SQL or give it a SQL look and feel. I sincerely believe from our discussions that his version of SQL may give us an idea of what it may have looked like if it would have taken a different path based on IBM's early decisions that shaped current form of SQL. I thought it would be interresting to share that with this respectable community. I am just as doubtful as you are but thrust me and give it a chance, I am convinced you won't be disappointed...;) Received on Thu Jun 08 2006 - 11:03:35 CEST

Original text of this message