Re: Expressing SQL in relational algebra

From: Paul <pbrazier_at_cosmos-uk.co.uk>
Date: 1 Apr 2003 07:04:43 -0800
Message-ID: <51d64140.0304010607.bce82c2_at_posting.google.com>


neo55592_at_hotmail.com (Neo) wrote in message news:<4b45d3ad.0303280739.6213c846_at_posting.google.com>...
> How can I express "John likes Mary" using relational algebra using
> only "John", "likes", "Mary", intersection operator(^) and grouping
> operators ("()")? The expression should not find "Mary likes John",
> "like John Mary", etc. Would the following expression be correct:
> (((John ^ John) ^ likes) ^ Mary)

I think you might be going to a lower level here to something that is outside the scope of relational algebra.

Say you have an employee relation with columns for the employee id, name, dept and salary. Then you can think of this as a logical predicate or a "template" sentence that can either be true or false. i.e.

"employee E is called N, works for dept D and is on salary S"

any row in the corresponding relation will be a true sentence (proposition) like:

"employee 001 is called "John Smith", works for dept 10 and is on salary 20,000"

so then if we use relational algebra to project this relation onto D say, we get
the predicate:

"there is an employee in dept D"

the actual meaning (semantics) behind the sentences aren't known to the DBMS, it just does the algebra mechanically. So in your example you'd have a relation L which corresponds to the sentence "A likes B", where A stands for the person doing the liking and B stands for the person being liked. So in relational algebra "John likes Mary" would just be L restricted to A=John and B=Mary.

Now the employee predicate means that all employee information we add to the database must adhere to this template. We might have some other facts we know:

  1. employee 002 is called "Mary Jones" but, we don't know their dept or salary
  2. no employee earns less than 10,000
  3. Either Bill or Karen are in dept 10, but not both.
  4. Kate is either in dept 20 or 30.
  5. There is someone in dept 50.
  6. employee 003 is called John and he is not in dept 20. etc.

Because none of these fit the template though we can't add these facts to the database. NULLs are a (dubious?) attempt to cover case 1 and 5. Case 2 could be covered by an integrity constraint but what if it is a volatile fact? Maybe in case 6 you could have a separate relation that details all the departments that an employee is definitely not in. I think in relational theory a trade-off is made between simplicity and flexibility - you agree to have a rigid template for your facts to avoid massively complicated logical expressions.

I'm not an expert but I believe that "constraint databases" will cover this kind of thing. I'm not sure whether they are specific applications of a relational DBMS or something totally separate to relational theory. You've also got logical programming languages like Prolog which sounds more like the sort of thing you're talking about above. I'm not sure exactly where they sit in relation to RDBMSs or constraint databases, maybe someone could enlighten me.

Now I suppose if you wanted to you could encode your grammar in a relational database, so you'd have a relation holding your verbs like "likes", another holding your nouns like "John" and "Mary", and some other relations indicating how valid sentences can be formed. So you could then write a query (knowing the semantics of your database) that lists all valid sentences of your language. But all you would be doing here is just shifting down the level a bit - there would still be semantics unknown to the DMBS. I think this is due to the fact that relational theory is restricted to first-order predicates. Maybe a richer theory exists if you allowed higher-order predicates as well - has there been any research into what would happen if you try to develop a relational theory with higher-order logic?

What do you mean by "intersection" above? Set intersection i.e. elements that are in both sets? The only sets used in relational terms are sets of tuples within a relation, which correspond to sets of propositions that are true for a predicate.
So you can say "R1 intersection R2" where R1 and R2 are both relations (with the same columns). But "John", "Mary" and "likes" aren't relations.

Paul. Received on Tue Apr 01 2003 - 17:04:43 CEST

Original text of this message