Re: Storing query language in relations

From: Paul <pbrazier_at_cosmos-uk.co.uk>
Date: 25 Sep 2002 03:08:14 -0700
Message-ID: <51d64140.0209250208.25f5412a_at_posting.google.com>


Paul Vernon <paul.vernon_at_ukk.ibmm.comm> wrote in message news:<ampb91$f3e$2_at_sp15at20.hursley.ibm.com>...
> Paul. You are raising the question of when to model non-scalar types as
> single attribute values and when to model them as relation values.
>
> The cheating answer is to point out that in the RM you can have your cake
> and eat it.
> All you need do is to create a set of virtual relvars over your non-scalar
> attribute.

should it not be the other way round - at least on the logical level? i.e. have a view to display the non-scalar type as a single attribute but have it in relations in the database. Maybe on the physical level you could then store the non-scalar type attribute in a single physical file.
Or are we just saying the same thing in different ways?

> >I know some DBMSs store the text of queries, view definitions,
> >stored procedures etc. in tables but not in a normalised fashion.
> >An advantage would be that the DB constraints would automatically
> >validate the query so no need for parsing SQL code for syntax.
>
> The downside being that you would need some horrendous multiple relvar
> assignments to enter a new query.

Well you might say: if all queries have to be stored in relations, how do you INSERT any data to begin with? Instead of having a non-relational query language like SQL sitting on top of the database, you could have a method of entering data into relations *under* the database i.e. at the physical level.

Maybe this is a GUI or maybe it is a conventional SQL INSERT statement but as you type it (or once you've finished) the DBMS would be putting it into relations for you. In a sense to the end user this is indistinguishable from the usual practice I suppose.

But you wouldn't need SELECTs or DELETEs at all: just use the built-in "bootstrap" INSERT statement to put your SELECT or DELETE query into the appropriate relations and then maybe there would be a relation where you can enter the ID of a query and a time to execute it, and once this had been done the query would run. The output would be built in to the RDBMS and could be a text file or whatever.  

> Can I suggest that a more pragmatic (and human friendly) idea would be to
> recognise the need for a new non-scalar type generator. Such a generator
> would take a BNF grammar and create a non-scalar 'language' type for that
> grammar. It would also create a parser to take a text string and return a
> value of the new type for all valid inputs.
>
> If called from a Relation Type generator (i.e. CREATE TABLE, VALUES in
> SQL), such a language generator could, if desired, produce virtual relvars
> that normalise any non-scalar language attributes. Along with such virtual
> relvars, a mass of relational constraints corresponding to the particular
> BNF grammar would be generated, thereby ensuring that the relvars could
> only be set to values corresponding to valid paragraphs of the language.

So does this mean for example that the generator knows the syntax for SQL statements, and when you feed it a string like this:

"SELECT foo FROM bar WHERE foo > 1"

it will "normalise" it and put it into the appropriate relations for you?
The relational constraints would only need to be done once because you could store all your SQL statements in the same relations.

> But I would ask why? What would we gain? The usual answer for normalising
> non-scalar types is that we want to be able to ad-hoc query the components
> of the non-scalar types using the expressive power of relational algebra.
> As an example, breaking written language attributes into their consistence
> parts (e.g. words) is a sensible step if you are building a simple text
> search database. This would allow you to say find the set of sentences
> with 6 or more words in common, using a relational query.
>
> In the general case of languages, (e.g. SQL, C, relational algebra, etc) I
> find it hard to conceive of a common need to relationally query their
> components. I guess that a DIFF function could be coded relationally, but
> frankly I think that traversing parse trees is going to be easier.

For example you might want to know how many of your views use a certain relation or how often a particular column is used. Or changing a variable name.
Or listing all the functions used in a C program.

Isn't normalization good just from a logical storage point of view even before you begin wanting to write queries?  

> Having said that I think it would be a good test of a true relational
> language. If such a language Type generator was provided, it would show
> that they were eating their own food.
>
> >Wasn't one of Codd's guidelines that metadata should also be
> > stored relationally?
>
> Indeed. I don't have the quote to hand, but the upshot is that EVERYTHING
> that is needed to be stored persistently, should be stored relationally.

So from this could you not argue that any DBMS that stores queries, view definitions, etc. in unnormalised text files without a corresponding relational interface is not a true RDMBS :)

> >I've not thought too deeply about how a schema for SQL queries (for
> >example) could be implemented but it must be possible (kind of like
> >the MS Access query grid but normalised).
>
> So the MS Access query grid can specify all possible relational
> expressions? Humm. As above, I would suggest using the grammar for the
> language as your schema. Nasty hey?

Well the MS Access query grid is very limited but it can define basic queries.
Joins are stored in the "diagram" part but if you stick to single-table queries (projections and restrictions?) I think it gives you a vague idea of what I'm driving at. (Not that I'm advocating Access as a great relational product ;)) Actually I think Access may in fact store the queries as text but just use the grid as a data entry mechanism. Though that would be OK if you think of the text storage as physical and the grid as the logical level.  

> >Is this taking things too far or might this line of thinking
> >produce something useful? Has this been done already in some
> >DBMS?
>
> As long as the ideas hold water, you have to go a long way to go to far in
> a theory newsgroup.

I'm not sure if this is thinking along similar lines as mountain man's thread or if this is taking it further to its logical absurdity (or purity depending on your viewpoint).

Paul. Received on Wed Sep 25 2002 - 12:08:14 CEST

Original text of this message