Re: Storing query language in relations

From: Paul <pbrazier_at_cosmos-uk.co.uk>
Date: 24 Sep 2002 03:42:54 -0700
Message-ID: <51d64140.0209240242.21fe1fa3_at_posting.google.com>


> > If we accept that a relational database is the
> > most logical way to
> > hold structured data, why not go a step further
> > and store the actual
> > queries in relational format?
>
> What is "unstructured" data anyway?! :-)

OK, bad terminology, I guess by structured data I really mean data that is held in a non-atomic form, so it's not really the data that is unstructured; rather the logical storage format.  

> > ...why not go a step further and store the actual
> > queries in relational format?
>
> Sure... but there is no reason you couldn't do a whole imperative
> language including queries (expressions). It would only be a matter
> of having a table for each language element. Something like this:

An imperative language is one like C or Java where you specify "how" rather than "what"? So a query would just be an expression in the imperative language of type "relation"? Do we make a distinction between a table with a single row and column containing the integer "3" and the integer "3" itself?

although SQL Server 7 doesn't have relation types as far as I know, this:

IF (SELECT 3) = 3 PRINT "equal"

does print "equal", it must do an implicit type conversion.

What I'm thinking is that everything could be done in a declarative language including the imperative parts. So rather than embedding SQL within a procedural language, you do the opposite: store your procedural language in relations within a RDBMS.

It wood be good to get the self-referentiality of C (compilers compiling themselves) for a declarative query language. Imagine an RDBMS that could compile itself. You might say how can a database affect the underlying operating system but the basic C can't really do anything other than program logic - it needs all the libraries to do any sort of I/O. You could have the equivalent of I/O libraries for an RDBMS which are implemented at the physical level but controlled through relations at the logical level.

For the example of just storing SQL queries in relations a first naive attempt to store a basic query that can select columns from a single table but not restrict the rows (this is called projection in relational algebra?) might be:

CREATE TABLE sqlselect (
selectID int,
PRIMARY KEY (selectID)
)

CREATE TABLE sqlcolumn (
selectID int,
colname char(20),
alias char(20),
PRIMARY KEY (selectID, alias),
FOREIGN KEY (selectID) REFERENCES sqlselect(selectID) )

CREATE TABLE sqlfrom (
selectID int,
tablename char(20),
PRIMARY KEY (selectID),
FOREIGN KEY (selectID) REFERENCES sqlselect(selectID) )

INSERT INTO sqlselect VALUES (1)
INSERT INTO sqlcolumn VALUES (1, 'a', 'a')
INSERT INTO sqlcolumn VALUES (1, 'b', 'b1')
INSERT INTO sqlcolumn VALUES (1, 'b', 'b2')
INSERT INTO sqlfrom VALUES (1, 'tbl')

this would represent the SQL query:
SELECT a, b AS b1, b AS b2 FROM tbl

Instead of having the actual table & column names stored you could have the relevant keys from, and constraints with, the system catalogue tables. Then if you changed a column name the "queries" would still be valid. Also you'd have to delete all the "queries" that referenced a table before you could drop the table.

Also to allow for joins, subqueries, expressions such as "a + b", aggregates etc. you'd obviously need a lot more complicated schema.

Paul.

> create table InsertStatement
> {
> Statement_ID : StatementID,
> SourceExpression : ExpressionID,
> TargetExpression : ExpressionID,
> key { Statement_ID },
> ...
> };
>
> create table Expression
> {
> ID : ExpressionID,
> ResultType : TypeID,
> ExpressionType : ExpressionType,
> key { ID }
> };
>
> create table UnaryOperatorExpression
> {
> Expression_ID : ExpressionID,
> Operand : ExpressionID,
> key { Expression_ID },
> ...
> };
>
> create table IntegerLiteralExpression
> {
> Expression_ID : ExpressionID,
> Value : Integer,
> key { Expression_ID },
> ...
> };
>
> Of course you would want to build some supporting operators:
>
> create operator EvaluateExpression(AExpressionID : ExpressionID) :
> Generic
> ...
>
> What would also be an interesting exercise would be to model a word
> processor! So much for "unstructured" data!
Received on Tue Sep 24 2002 - 12:42:54 CEST

Original text of this message