Re: Object Oriented Wrapper classes to SQL statements

From: Clifford Heath <cjh_nospam_at_osa.com.au>
Date: Thu, 13 Dec 2001 11:30:40 +1100
Message-ID: <3C17F6B0.E33A5DF4_at_osa.com.au>


Rachel wrote:
> Can you refer me to links to resources on the web on the issues of
> calculus vs. algebra, preferably with examples?

No, sorry, but I can try to give a simple example below.

> Can you give me a very simple example to show why it is easier to
> model algebra than calculus with objects?

The calculus says what conditions qualify the result set without saying anything about how to compute the result set. The algebra is a mathematical expression describing a possible computational path (a query plan). The actual query plan to be used will always be modified according to what indices exist and what types of scan are useful in reducing work, and indeed by many other factors which the query optimiser takes into account, but a human understand the algebra directly nonetheless.

Ok, taking as an example a schema that has Company, Person, and Directorship. The database shows a Person's employer, and all Directorships of companies (if any) which the person holds:

CREATE TABLE Company (

        CompanyNumber   varchar(12)     -- Company/business number
                CONSTRAINT CompanyKey PRIMARY KEY,
        CompanyName     varchar(60)

)

CREATE TABLE Person (

        FamilyName      varchar(20),
        GivenName       varchar(40),
                CONSTRAINT PersonKey PRIMARY KEY
                        (FamilyName, GivenName),
        CompanyNumber   varchar(12)     -- Company/business number of employer
                CONSTRAINT FK_Employee
                        REFERENCES Company(CompanyNumber)
)

CREATE TABLE Directorship (

        FamilyName      varchar(20),    -- Director's name
        GivenName       varchar(40), 
                CONSTRAINT FK_Director
                        FOREIGN KEY (FamilyName, GivenName)
                        REFERENCES Person(FamilyName, GivenName),
        CompanyNumber   varchar(12)     -- Company/business number
                CONSTRAINT FK_Directs
                        REFERENCES Company(CompanyNumber)
)

Now, to get the names of directors of "Acme Trading" using an SQL statement you say:

SELECT p.FamilyName, p.GivenName
FROM Person as p, Company as c, Directorship as d

WHERE c.CompanyName = 'Acme Trading'
  AND c.CompanyNumber = d.CompanyNumber
  AND p.FamilyName = d.FamilyName
  AND p.GivenName = d.GivenName

Note that although it's clear what is wanted, no information is given on how to get it. This is a calculus expression: it defines a set of x by saying "x such that (some conditions hold)".

The relational algebra might be written as a set of nested functions (using a syntax I'm making up for this example):

project(

    { FamilyName, GivenName },
    join(

        Person
            using { FamilyName, GivenName },
        join(
            Directorship 
                 using CompanyNumber,
            filter(Company where CompanyName = 'Acme Trading')
                using CompanyNumber
        )

    )
)

Note that each "function" returns a table (a set), and to execute the query you must start at the innermost nested function, evaluate that to a table, then pass that table to the enclosing function, until you reach the outside.

The functions (relational operators) I've used here are:

  • filter(table where <condition>) Select only those rows of the table which meet the condition
  • join(table1 using <fieldlist1>, table2 using <fieldlist2>, ...) Find all composite rows from table1 and table2 where the specified fieldlist values match.
  • project(<fieldlist>, table) Reduce the number of columns in the table to those specified, and remove duplicates (necessary because a set cannot have duplicate elements).

There are many other operators that would be used in an actual query plan, including index scans of various kinds (filter from index where value in <range>) and a variety of physical methods for joins, for example.

The above query might well be efficiently executed using the implied query plan, but there are many other ways of writing and executing it which will also get the right result - the SQL query optimiser changes the SQL into relational algebra, then manipulates the expression using algebraic operations until it has the apparently cheapest form.

--
Clifford Heath, ManageSoft Corporation
Received on Thu Dec 13 2001 - 01:30:40 CET

Original text of this message