Re: [Help]How to translate from Algebra->SQL and viceverse?

From: Jan Hidders <hidders_at_REMOVE.THIS.win.tue.nl>
Date: 7 Nov 2000 13:11:41 GMT
Message-ID: <8u8v2d$oc5$1_at_news.tue.nl>


macefindu wrote:
> Hi,
> My professor have teached us how to translate from Logic
> Language(LL) to SQL, and LL to Algebra, and then in exam appeared
> question about tranlate Algebra code->SQL code and SQL->Algebra, so I
> assumed that the only way was through the LL, becasue the direct
> tranlate from Algebra->SQL is almost impossible for me, especially
> with subqueries of more than 4 levels, im talking about kind of
> monster like this:
>
> select(select(select ... where (blablabla) or exists (select ...)));
>
> So my first question is: Is there a book i could learn more about
> this issue from?

Your professor probably already explained to you how the translation works, so I am not sure what more you would like to know. But the standard reference for this stuff would be:

  Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems,   Volume I. Computer Science Press 1988, ISBN 0-7167-8158-1

> One more quesion from the last exam was:
> "to list all the clients who suffered more than one accidents in Algebra."
>
> table Client(client_id, insurance#,...) <-strong entity
> (primary key:client_id;foreign key:insurance#)
>
> table Accidents(insurance#, date, time) <-weak entity
>
> to do this one, i used "having count(*)>1" for table Accidents, but
> the problem is they didnt ask for SQL, so my question is: What is the
> equivalent algebra codes for "groups ... having ..." ?

To my knowledge there is no general solution for such queries. In fact, in the literature some extensions of algebra were proposed that included such operations. But simple HAVING-conditions like the one above can usually be translated with some "tricks" like the following.

We start by noting that every client has only one insurance number. So we can first determine which insurance number has had more accidents (this involves only the Accidents table) and later look up which clients belong to the insurance numbers.

We start by taking the cartesian product of the Accidents table with itself. I will explain later why we are doing this, so please bear with me. Before we can do this we need to rename all the columns, otherwise the result will simply be the original table.

 T1 := RN[cid->cid2,ins#->ins#2,date->date2,time->time2](Accidents)

                                  where RN[a->b..] is the rename
                                  operator that renames column a to b,
                                  et cetera. Note that I have
                                  abbreviated the column names
                                  somewhat.

Now we do the actual join:

 T2 := JN(Accidents,T1)           where JN is the natural join. Since
                                  all columns have been renamed this
                                  amount to taking the cartesian
                                  product.

So, now T2 contains all combinations of all accidents. What we are now going to do is select all those combinations that involve the same client but are different accidents. First, we select those pairs that involve the same client:

 T3 := SL[cid=cid2](T2)           where SL[..] is the selection
                                  operator.

Next, we select from these the pairs that involve different accidents. Note that accidents are different if either the date or time is different or both are different. If we can only use boolean conditions in the select than this can be done as follows:

 T4 := SL[time<>time2 OR date<>date2](T3)

Of course, we can do this also the hard way with only simple conditions:

 T4a := SL[time=time2](T3)
 T4b := SL[date=date2](T4b)
 T4  := T3 - T4b                  where '-' is the set minus

So, what do we have now? We have the pairs of different accidents that involve the same insurance number. So these insurance numbers have more than 1 accident associated with them. So we now only have to take the projection of the result to get just these numbers:

 T5 := PJ[cid](T4)

But we didn't want the insurance numbers but the clients so these have to be joined in and then projected:

 T6 := PJ[cid](JN(Client,T5))

And this is the result that was asked.

-- 
Kind regards,

    Jan Hidders
Received on Tue Nov 07 2000 - 14:11:41 CET

Original text of this message