The Theoretical Foundations of the Relational Model

From: Paul G. Brown <paul_geoffrey_brown_at_yahoo.com>
Date: 14 Jun 2002 16:42:43 -0700
Message-ID: <57da7b56.0206141542.4a694b5d_at_posting.google.com>



In the midst of the current flare-up in the long-running 'objects vs relations' flame-war, I thought it might be useful to take a moment, step back, and explain to the object folk something of where the relational crowd are coming from. I'm going to try to do this without once mentioning the word 'database'. (Oops.)

The origins of the 'relational model' lie in mathematical philosophy, and specifically in something known variously as 'predicate logic' or 'symbolic logic'. (I'll use the latter term.) If you want to read more about all of this I'd recommend the three books at the end of this post.

Symbolic logic was an attempt to put a set of framing principles around reasoned (i.e. logical or rational) discourse: how to tell when someone is talking crap, even if you can't check their facts. For the longest time there was this way of reasoning called 'syllogistic logic', and it has the form you're probably all familiar with. (P1) All men are mortal. (P2) Socrates is a man. (P3) Therefore, Socrates is mortal. Lots of people talked this way and we all 'knew' how the rules worked (things like '(or (a) (not a))' is always true, and that from (P1) and (P4) 'Kate Moss is not a man' it does *not* follow that (P5) 'Therefore, Kate Moss is not mortal'), but no one had paid a lot of attention to the topic of logical structure for a while.
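The two rules just mentioned can be checked mechanically. What follows is only a rough sketch in Python, not anything from the logic texts: it flattens the syllogism into propositional form for a single individual (an assumption on my part), and the helper names `implies` and `is_valid` are made up for illustration. It brute-forces every truth assignment and asks whether the conclusion can fail while the premises hold.

```python
from itertools import product

def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q

def is_valid(premises, conclusion, n_vars):
    """An argument form is valid if the conclusion holds in every
    truth assignment that makes all of the premises true."""
    for values in product([False, True], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# '(or (a) (not a))' is always true: it needs no premises at all.
excluded_middle = is_valid([], lambda a: a or not a, 1)

# 'man implies mortal' and 'man', therefore 'mortal' (the Socrates form).
socrates = is_valid(
    [lambda man, mortal: implies(man, mortal),
     lambda man, mortal: man],
    lambda man, mortal: mortal,
    2,
)

# 'man implies mortal' and 'not man', therefore 'not mortal'
# (the Kate Moss form, which should be rejected).
kate_moss = is_valid(
    [lambda man, mortal: implies(man, mortal),
     lambda man, mortal: not man],
    lambda man, mortal: not mortal,
    2,
)

print(excluded_middle, socrates, kate_moss)
```

The point is that the checker knows nothing about Socrates or Kate Moss. It only looks at the form of the argument, which is exactly the 'can't check their facts' situation described above.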

Symbolic logic was the result of a renewed focus on thinking about how to think. One of the earliest thunkers about this stuff was a guy called George Boole, who said in the introduction to his book that what he was trying to do was to think about logic (about thinking) from a mathematical point of view, rather than to try to establish the laws of logic from our acquaintance with reality. At the time (mid-to-late 19th century) it was becoming clear that a lot of what human beings concluded about the world from observation was plumb wrong. What the logical philosophers did was to try to see if this dissonance was the consequence of poor habits of thought. The whole exercise culminated in an attempt to place all of mathematics upon a foundation of logic (until one M. Goedel put the kybosh on that! But I digress. . .)

In his book _Introduction_to_Symbolic_Logic_and_its_Applications_ Rudolf Carnap highlights that one of the differences between syllogistic and symbolic logic is the latter's emphasis on the relation. The idea is that we are generally interested in reasoning about 'propositions': true sentences describing the relationship between things (nouns) and/or their properties. A sentence like 'There exists a planet called Earth which is 12,756 km in diameter and has a mass of 5.98 x 10^24 kg.' is a proposition. So is "There is a Product called 'A Toothbrush', that sells for $7.49, is red, is plastic, and costs $5.75 to make.".

Now, an important point to note is that there are infinitely many possible propositions. (Even if you were to AND together everything in existence, you could still say 'AND there exists a proposition of the following form' and repeat yourself.) In practice what we are doing is not representing reality in a schema, but imposing an order on the world which does not really exist there. However, if you look at any pair (or set) of propositions there are rules to follow about manipulating them. For example, if the first sentence above is true, you can deduce from it that 'There exists something with a mass of 5.98 x 10^24 kg.', and that (for example) there is at least one proposition in our universe of discourse. (oops - now there are two, now there are three . . . )

    [some time later]

When we point to a group of propositions with an identical structure (referring to the same kinds of things in the same relationship to one another) we label that group a 'relation'. It doesn't really matter what order the propositions appear in, or what order the elements within each one appear in. An example of a relation would be all of the sentences about planets (let's just stick to the solar system for the time being, and presume that Pluto is the last one.) We call the different 'kinds of things' in the relation's elements instances of 'domains'. There is a domain of Planet Names, a domain of Weights and Distances and so on. If you want an object-speak term, the closest thing to a relation in object-land is a 'pattern'.
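For the programmers in the audience, here is a minimal sketch of that idea in Python. The `Planet` type, the attribute names and the choice of a plain `set` are my own illustration (nothing here is a DBMS), and the masses are approximate, with Earth's being the figure quoted earlier. Each tuple stands for one proposition, attributes are referred to by name, and the relation is just the set of them.

```python
from typing import NamedTuple

class Planet(NamedTuple):
    """One proposition: 'There exists a planet called <name>
    which has a mass of <mass_kg> kg.'"""
    name: str      # drawn from the domain of Planet Names
    mass_kg: float # drawn from the domain of Masses

# Approximate values, for illustration only.
planets_a = {
    Planet("Earth", 5.98e24),
    Planet("Mars", 6.42e23),
    Planet("Venus", 4.87e24),
}

# The same propositions stated in a different order, one of them twice.
planets_b = {
    Planet("Venus", 4.87e24),
    Planet("Earth", 5.98e24),
    Planet("Mars", 6.42e23),
    Planet("Earth", 5.98e24),  # repeating a proposition adds nothing
}

same_relation = planets_a == planets_b
print(same_relation, len(planets_b))
```

Because a set has no ordering and no duplicates, the two collections compare equal: the order in which the propositions were stated carries no information.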

Now, in pure mathematical logic, there are *only* relations and domains. For example, there is a relation like this: Equal { <'Earth', 'Earth'>, <'Mars', 'Mars'>, etc } and another like this: LessThan { <0.0, 0.1>, <0.1, 0.2>, . . . }. The first relation is finite (because the domain of Planet Names is finite) but the second is not. It pairs every possible Mass value with every possible Mass greater than it (even if there is nothing at all that actually weighs that amount), in a vast list of pairs. Not very practical for a database (oops) but incredibly powerful conceptually.

Why? Because we can reason about the propositions contained within relations in an orderly, deterministic fashion. In other words, we can automate reason (we can't say anything about the correctness of the propositions themselves, but we can say quite a lot about how they can be manipulated). The whole sum of Ted Codd's great insight is that all of the programming language stuff about 'references' and 'identity' and 'order' can (and should) be eliminated without losing any representational power. The principles and practices that find expression in 'the relational model' are not really about programming at all. They are an attempt to describe a model of rational thought that can be written into a computer program (a DBMS).

We can take a proposition of the following form: "Those Planet Names where the mass of the planet is less than the mass of the planet called 'Earth'" (a proposition which does not really exist in the same sense that our Planet relation does), and turn it into something like this:

       WITH Planet as P1, P2 
       [ P2.Name, P2.Mass ]:
         (Equal < P1.Name, 'Earth' > ), 
         (LessThan < P2.Mass, P1.Mass >);
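To make the mechanics concrete, here is one way that expression could be evaluated, sketched in Python under my own assumptions: `Planet`, `PLANET`, `equal` and `less_than` are names invented for this illustration, the masses are approximate, and the two infinite relations are stood in for by membership tests (read as 'first is less than second', matching the query) rather than by vast lists of pairs.

```python
from typing import NamedTuple

class Planet(NamedTuple):
    name: str
    mass_kg: float

# Approximate masses; Earth's is the figure quoted earlier in the post.
PLANET = {
    Planet("Mercury", 3.30e23),
    Planet("Venus", 4.87e24),
    Planet("Earth", 5.98e24),
    Planet("Mars", 6.42e23),
    Planet("Jupiter", 1.90e27),
}

def equal(a, b):
    """Membership test: is the pair <a, b> in the Equal relation?"""
    return a == b

def less_than(a, b):
    """Membership test: is the pair <a, b> in the LessThan relation,
    i.e. is a less than b?"""
    return a < b

# WITH Planet as P1, P2
# [ P2.Name, P2.Mass ]:
#   (Equal < P1.Name, 'Earth' >), (LessThan < P2.Mass, P1.Mass >)
result = {
    (p2.name, p2.mass_kg)
    for p1 in PLANET          # P1 ranges over the Planet relation
    for p2 in PLANET          # so does P2
    if equal(p1.name, "Earth") and less_than(p2.mass_kg, p1.mass_kg)
}

print(sorted(result))
```

Note that the answer is itself a set of tuples, which is to say another relation. Nothing in the evaluation depends on the order in which the planets are stored or on any reference from one planet to another.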

And this helps to explain why Relational people are so hostile to 'object speak', or 'references', or physical programming in any of its manifestations. The R model is a delight precisely because it rids us of loosey-goosey notions like 'class' and 'object' and replaces them with a systematic, and very powerful, way of thinking about the problem. It corresponds to something like the rules of chess without which no game of chess could exist, whereas object speak feels like a bag of chess pieces and a board (which are utterly superfluous). There is no way, in an object-base, of saying whether one schema design is better than another, or why. In relation-land, we can look at a schema design and the rules it follows and make formal judgements about the design. And we can take query expressions and turn them into a sequence of automated steps which will compute the answer (we can even select one sequence out of the manifold possibilities because it is the 'cheapest' way to compute the answer.)

Anyway, I hope this clarifies the issues somewhat. This kind of back-and-forth gets really old, really fast, because it usually all boils down to the object camp not understanding where the relational guys are coming from. We're all very familiar with object-speak because we are, mostly, programmers. But James (for example) repeatedly indicates that he has no clue about what the hell the R guys are talking about. (BTW: Don't get too cranky about this, James. You're not the first and I doubt you'll be the last.)

Relational theory people are only really interested in computers to the extent that they can use them as tools to automate their model (and it needs to be pointed out that the only kinds of models you can automate are those with a firm theoretical foundation.)

   Books:

   Carnap, Rudolf. _Introduction_to_Symbolic_Logic_and_its_Applications_.
   Russell, Bertrand. _Introduction_to_Mathematical_Philosophy_.
   Langer, Susanne. _An_Introduction_to_Symbolic_Logic_ (This is the best.)
Received on Sat Jun 15 2002 - 01:42:43 CEST
