Re: Is it possible to build a purely relational database on top of SQL?

From: Erich Schultz <purerelational_at_yahoo.ca>
Date: 17 Jul 2004 09:28:25 -0700
Message-ID: <edeabcd.0407170828.69bd3657_at_posting.google.com>


"A step in the right direction" isn't exactly glowing praise.

"It's too bad that you are limited to Access and don't have the opportunity to work with TRDBMSs (they don't exist). You may want to take a look at Alphora's Dataphor data language, a step in the right direction."

http://www.dbdebunk.com/page/page/1270152.htm

jingleheimerschmitt_at_hotmail.com (John Jacob) wrote in message news:<72f08f6c.0407122316.29c30515_at_posting.google.com>...
> > My question: is there any reason why a purely relational database
> > interface, in the "Third Manifesto" sense of, say, Tutorial D,
> > could not be built on top of any existing commercial SQL DBMS?
> > That is, could one build a true RDBMS without having to rewrite
> > everything from scratch? (Note, I do not require that such
> > an interface be expected to work for an arbitrary DB containing
> > unpleasantness such as NULLs and duplicate rows etc.)
> >
> > If this is possible, has it been done? Are there performance
> > issues with such an approach?
> >
> > If it is not possible, what is it about SQL that lacks
> > sufficient expressive power?
>
> The short answer is yes, the long answer is no ;)
>
> I am part of a software team that has built what I believe is as close
> to "truly" relational as is possible while still taking as much
> advantage of the performance and transaction management capabilities
> of existing systems as possible. It turns out to be a pretty tall
> order to combine the two, in fact, I believe, impossible without
> writing the system from scratch (or at least from a modified b-tree).
>
> The overall architecture of the system is based on providing a pure
> logical environment based on the relational model, and then
> transforming requests against that logical model into SQL queries
> against various back-end storage systems, or "devices." Initially, the
> logical model was 100% pure, and (almost) 100% TTM compliant (if you
> would like, I can post a Manifesto Compliance document from the 1.0
> version of the product). However, the design goals of the system
> called for translation into existing systems for performance. We knew
> we could not match, at least initially, existing systems in terms of
> sheer data processing power and query optimization (they've got thirty
> years on us in this department). What we were after was the logical
> and physical data independence that would be afforded by a more
> faithful relational implementation. We needed this not because it was
> relational in some esoteric sense, but because it would enable
> declarative development. That is the primary motivation behind the
> product.
>
> At any rate, the architecture necessarily involves a very abstract
> mechanism for mapping expressions of the logical model into equivalent
> expressions in the various dialects of the target systems. This
> mapping has two main components: connectivity, which is concerned with
> the physical communication aspects such as ODBC, OLEDB, (your data
> access paradigm here), and translation, which is concerned with the
> semantic mapping of expressions. The latter component is by far the
> more interesting, and consists of type mappings, and operator
> mappings. Type mappings handle value translation, and operator
> mappings handle expression translation.
>
> Initially, a given query is parsed and compiled exclusively in terms
> of the logical model. Then, a binding phase is performed which
> essentially decides where each portion of the query will be executed.
> Whenever an operator is encountered in the execution plan, the device
> involved is asked whether it "supports" the operation using the
> operator mapping. If the device responds yes, it is given the task of
> translating the operation into the appropriate dialect, and then
> evaluating and returning the results. If the device responds no, the
> internal query processor takes over and continues processing.
>
> In this way, operations that are ordinarily not available such as
> transitive closure, or blob comparison, can be performed, while still
> taking full advantage of the query processing capabilities of a given
> system.
>
> For type mapping, the logical model allows types to be defined exactly
> as prescribed by TTM. We then transform the physical representation of
> values of the type into some format that can be stored and manipulated
> within an existing system. We initially supported type inheritance as
> well, but for various reasons we dropped it. One reason was the
> complexity of mapping a polymorphic operator invocation into a
> statically typed SQL-based system, as pointed out elsewhere in this
> thread.
>
> In creep nulls. The null semantics of existing systems are so
> ingrained that to completely avoid them, we would have had to support
> nothing but base table retrieval, leaving our query processor to do
> all the work. While we spent a great deal of time making sure that the
> operations we did perform were done efficiently, the fact remains that
> Oracle and SQL Server will outperform us every time (at least today
> <evil grin>). In the end, we were forced to bow out and implement null
> semantics in exactly the same way that SQL-based systems do.
>
> It is worth noting that not even this compromise is 100% correct. For
> example, Oracle systems do not distinguish between null and the empty
> string, most systems ignore trailing spaces when comparing character
> data, some systems consider rows with nulls less than rows without
> them for the purposes of sorting, other systems consider them greater,
> and the list goes on and on.
>
> It is also worth noting that we made some very specific inclusions in
> our compiler for use in detecting when nulls are possible in a given
> expression. In this way, the optimizer can tell when it can perform
> optimizations that would be invalid in the presence of nulls.
>
> If you would like more specific examples of the translation problems
> that forced us to the "null" conclusion, I would be happy to provide
> them. It's been a very instructive journey for us, and we plan on
> documenting the whole process once we have time.
>
> One area where I am happy to say we did not budge is duplicates. This
> is relatively easy to enforce (although it does prevent an otherwise
> off-loadable query from being supported in some cases) by using the
> distinct clause.
>
> As far as the relative expressive power of the various dialects is
> concerned, as long as a system supports nested queries in the from
> clause, almost all queries can be handed off. There are various
> operators that are either not supported, or poorly defined such as
> quota queries and explode operators, but these operations are
> relatively rare, and our implementations for these are at least as
> efficient as a hand-coded implementation would be.
>
> By far the more aggravating problem is the inconsistency of SQL, both
> internally within each dialect, and the subtle nuances of syntax and
> semantics between the various dialects. The goal is that once an SQL
> system is safely wrapped up in a device, we will never have to speak
> that horrible dialect again. Eventually, we'll have wrappers for every
> system we have to speak to, and we'll never have to speak that
> horrible language again.
>
> Regards,
> Bryn Rhodes
> Alphora
Received on Sat Jul 17 2004 - 18:28:25 CEST

Original text of this message