Re: Storing query language in relations

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Sat, 5 Oct 2002 18:51:14 +0100
Message-ID: <ann90n$12bs$1_at_sp15at20.hursley.ibm.com>


>You could I suppose regard SQL DDL as a view of the DEFINITION_SCHEMA
>tables just as the INFORMATION_SCHEMA views are. In fact I think most
>current DBMSs don't actually store the DDL, but generate it from the
>system tables on request.

Tell me about it... one side project of mine has being building DB2 DDL from the DB2 system catalog just by using SQL views (using recursive SQL of course).

>So perhaps the point is more of a conceptual one - that we should
>move away from thinking of a query language as the primary means
>of interacting with the database, and the system tables as only an
>"underlying" storage for DDL.

Any query language, at it's heart, is simply relational algebra (or, if you prefer it, relational calculus).

Now we could do without fancy layers on top of the algebra, but we cannot do without the algebra itself.

Surprisingly, systems based on SQL seem to manage without a explicitly exposed (version of) the algebra as a specific datatype. I find a slight shame that the current D&D RM does not explicitly suggest such a type, but then neither do they explicitly suggest that an implementation of the RM needs to be able to hold a description of the relational model itself. All they _require_ is the type BOOLEAN, plus of course the ability to generate TUPLE and RELATION types.

Such a silly minimal system would not be much bottle however, as you could not interact with it. It would need to accept at least values of type R_ALEGBRA and return values of generated RELATION types. Also, we generally find it difficult to input most datatypes directly. The traditional method is to take a value of type STRING and cast it to the desired type. Of course one downside of this method is that the cast will fail if the STRING is not a valid representation of the type it is cased to. A better system, as we know, is to parse the key strokes as they come in, so preventing the creation of invalid strings in the first place. Furthermore, most strings are representations of non-scalar types. A minimal R_ALGEBREA would be a non-scalar type with components of type RA_OPERATOR (defined as say the set {<NAND>, <REMOVE>, <TCLOSE>, <ASSIGNMENT>} ), RELVAR and generated RELATION and TUPE types. Therefore as our keystokes are being parsed, they need to be parsed against the component types of R_ALGEBRA.

All good stuff you reckon!?

BTW D&D do say the catalog should be self describing and updateble (and hopefully constrainable (including being able to constrain on what constraints are allowable) - but they don't mention that)

Now back to the question at hand. Are languages a better interface to databases (and computer systems in general), than relational assignment of language component values? Well in the words of Steven Pinker, language is an instinct. It's hard wired in our genes and our brains, so we should find them easy to work with. Also it is _more efficient_ in number of keystoke terms than relational assignments because it utilises spatial co-loctation i.e. the order of the symbols in the language have a meaning. We forgo this (for very good reason) in the relational model per se. In other words a _raw_ interface to a RDBMS will only appeal to those perverse enough, and with enough time on their hands to work with such an interface. For such reasons, the relational model needs to support multiple, non-raw interfaces. Stuff like hierarchies, arrays, linked-lists, fancy graphics etc all show aspects of spatial co-location that are fundamentally more efficient as human interfaces. And anyhow, you can't truly display a Relation in our reality as any display format will exhibit information (i.e. orderings, or placement in space) that do not exist in the relation per se.

It might have been easier just to do an example to answer the question. Say compare the ease of input of say the string "create table a(a integer not null, t CHAR(1) NOT NULL CHECK (t IN ('A', 'B')), primary key(a, t));"

compared to assignments into the following SQL catalog tables: TABLES, COLUMNS and CONSTRAINTS, but where would be the fun in that? Argument from first principles is much more satisfying. :-)

>Instead the system tables become the
>main way of storing your DDL & DML, and SQL or other query
>languages are relegated to a mere view of these.

>Though as someone pointed out views and tables should be equivalent.

Wow, they must be sharp.

>I'm not sure how this would work here though because the views
>containing the SQL strings wouldhave non-atomic values whereas
>the underlying tables wouldn't.

So?

> So the tables would be in some sense superior to the views.

What sense would that be? I say they would be logically equivalent - they would be information equivalent for sure. Superiority is often just a psychological issue. :-)

>And a "external" query language like SQL becomes an optional
>add-on to a DBMS instead of an essential part.

I for one would stump up the money to buy that add-on.

Regards
Paul Vernon
Business Intelligence, IBM Global Services Received on Sat Oct 05 2002 - 19:51:14 CEST

Original text of this message