Re: Objects and Relations

From: Marshall <marshall.spight_at_gmail.com>
Date: 30 Jan 2007 01:33:02 -0800
Message-ID: <1170149582.074055.87040_at_a75g2000cwd.googlegroups.com>


On Jan 29, 9:41 pm, "David BL" <davi..._at_iinet.net.au> wrote:
> On Jan 30, 2:01 pm, Gene Wirchenko <g..._at_ocis.net> wrote:
>
> > "David BL" <davi..._at_iinet.net.au> wrote:
>
> > [snip]
>
> > >These are clear disadvantages of RM here. Are there any advantages?
>
> > Oh, come on! Surely you can troll better than that?
>
> Do you realise I'm only talking about strings?
>
> I'm not trolling. I'm quite serious.

I believe that you believe what you're saying, so in some narrow sense you're not a troll. But you're still a troll in the broader sense,
in that you're an outsider who doesn't exhibit much mastery of the RM, pops in with no prior posting history, and speaks of "clear disadvantages" without any signs of having done any actual analysis. If Gene dismisses you out of hand, I cannot fault him.

You give a couple of small signals that indicate that you might not be a complete loser, though; you've supplied an actual table header and problem statement, and you mention functional and logic programming. That by itself puts you ahead of most of the folks who show up here to helpfully explain to us just where OO is so much better than RM.

> Consider the following relation
>
> charsInString(StringId, Index, Char) :- StringId has Char at
> Index

This isn't right as is, though; you're putting every string in a single table. Coincidental cohesion; grouping things based on a superficial type commonality. Are you going to put all your ints in a single table also? I would expect not.

Instead, let's imagine we're in a nested relational system, and strings are a relation type {index:int, c:char} I will use the name S to denote the string at hand; sort of the analog of "this" in an OO language.

> What benefit does this relation give you (ie on top of an OO
> implementation of a string).

All the benefits of the RM apply. What would be first on my list would be the relational algebra, but others might put logical/physical independence first, or any of a variety of things.

> Show me some elegant and useful string
> processing that can be achieved using RA.

Okay. (But please note the "show me" is trollish; you are the one who wants to prove a point; why shouldn't you be the one to do the analysis?)

> Won't you almost always
> join/select for the simple purpose of looking up a character in a
> given string at a given index position?

You say that like it's a bad thing. Join is the primary tool of the RA, so yes, we will join. And select. We'll use the algebra, just like an OO programmer would use loops and method calls.

> I can only think of some "silly" uses like finding frequencies of
> characters, or finding all the locations of a given character.

Ha! I would have more sympathy for this position if I hadn't had an application a few months ago where I had to write a bunch of "silly" character frequency analysis in Java. It was excruciatingly low level, whereas what I wanted to do was just variations on:

  select c, count(*) from S group by c

But yes, you've pointed out one area where the relational way of doing things will be much easier than the OO way: aggregates.

> Something important like searching for a given sub-string is actually
> quite difficult to specify in the RA.

Using any new set of tools is quite difficult, until you get the hang of it. The first time I tried functional and logic programming, it was mind-bendingly painful to specify anything. How was it for you? Haskell looked really weird to me for a while. The different associativity and precedence of function application and juxtaposition, let alone currying, made everything harder at first. But then I saw that 3 line quicksort in Haskell and in a flash I realized that there was a lot of expressive power to be had, even if it meant I had to discard familiar imperative techniques and learn new functional ones to get it.

Yes, our first hypothesis to explain your difficulty must be lack of familiarity. However we can't say with confidence until we've done some analysis. Maybe the relational implementation *is* more complex. Let's see.

So: substring.

Let's compare two substring operators from java.lang.String to a relational specification of same.

indexOf(int ch, int fromIndex)

The source to java.lang.String can be downloaded as part of the JDK; I will refer to it here. (But not excerpt it, much as I would like to for
comparison, because it is copyrighted.)

Here is the relational implementation in pseudo-SQL:

  select min(index) from S where c = ch and index >= fromIndex

The body of the Java implementation is 37 lines long, and contains 8 if statements and 2 for loops, and can return from any one of five places.

But hey, that's maybe "silly" because it only uses one character, and *obviously* the relational implementation will win there. So let's do the next one:

indexOf(String str, int fromIndex)

Okay, so this finds the first occurrence of str inside "this" that is greater
than or equal to fromIndex. To do this I need an aggregate conjunction function; that takes the boolean-and of every boolean in the aggregate input. I don't know if this is standard in SQL. In any event, this is simply
the min() function applied to the boolean domain, under the ordering false < true. Think of it as "every".

Relationally:

  select min(S1.index) from S S1 where
    (select min(S2.c = str.c) from S S2, str

      where S2.index+S1.index = str.index) and
      S1.index >= fromIndex

"Select the smallest S.index where
  every character in S' offset by S.index equals   every character in str,
  that is greater than or equal to fromIndex"

The body of the java implementation calls an internal helper function that is 33 lines, containing 6 ifs, 2 for loops, 1 while loop, and that
can return from any one of 4 different places.

I could try to write the Haskell version of indexOf, but my Haskell sucks.
I imagine it would require a helper function, and be defined inductively.
I don't believe it would be any more succinct, but I suppose an argument
could be made that it was more comprehensible. But I also believe that there will be as many places where the set-theoretic approach will be more comprehensible than the inductive approach as the reverse.

> If you want to deal with strings declaratively (not procedurally) then
> use a functional programming or logic programming language. RM seems
> a poor fit.

Does it still seem so after reading the above?

Marshall Received on Tue Jan 30 2007 - 10:33:02 CET

Original text of this message