Re: Objects and Relations

From: Kevin Kirkpatrick <kvnkrkptrck_at_gmail.com>
Date: 31 Jan 2007 13:49:22 -0800
Message-ID: <1170280162.242191.293360_at_p10g2000cwp.googlegroups.com>


On Jan 31, 9:24 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
> Marshall wrote:
> > On Jan 30, 6:32 am, "David BL" <davi..._at_iinet.net.au> wrote:
>
> >>On Jan 30, 6:33 pm, "Marshall" <marshall.spi..._at_gmail.com> wrote:
>
> > Okay. Going to try to recreate my earlier post. Will not be
> > as good but hope to get the basic sense across, along
> > with some actual references to stuff.
>
> >>>>Consider the following relation
>
> >>>> charsInString(StringId, Index, Char) :- StringId has Char at
> >>>>Index
>
> >>>This isn't right as is, though; you're putting every string in
> >>>a single table. Coincidental cohesion; grouping things based
> >>>on a superficial type commonality. Are you going to put all
> >>>your ints in a single table also? I would expect not.
>
> >>I'm glad to see that you don't like the idea of putting all characters
> >>of all strings in a single table.
>
> >>I believe a relational database should treat strings as an ADT to be
> >>used for domains for attributes.
>
> > Yes, I'm aware of that perspective, and some people I really
> > respect subscribe to it. However I don't; if you encapsulate
> > a string as an ADT then you lose the ability to use the RA
> > on the characters, which I believe I've already demonstrated
> > has some utility.
>
> > I also have an argument that encapsulation itself is merely
> > a crutch to make up for the fact that a language lacks
> > declarative integrity constraints.
>
> >>>Instead, let's imagine we're in a nested relational system,
> >>>and strings are a relation type {index:int, c:char} I will use
> >>>the name S to denote the string at hand; sort of the analog
> >>>of "this" in an OO language.
>
> >>Well, I was assuming a flat relational system. If you are going to
> >>allow nested relations then I agree that things are rather
> >>different. In fact I believe this is a bit of a cop out. Please
> >>correct me if I'm wrong.
>
> > Not 100% sure I understand, but it sounds like you're saying
> > nested relations are a cop-out. If so I disagree. Again, this
> > is a theory group, so we do not have to restrict ourselves
> > to discussing existing implementations or SQL or whatever.
> > As a logical model nested relations have a long history of
> > study, and they do solve some problems. My personal model
> > includes nested relations.
>
> NF^2 nested relations? Or relation-valued attributes?
>
> >>At the outermost nesting level you have
> >>chosen to deal with strings opaquely - as a domain for attributes. In
> >>other words at that level of abstraction you have chosen not to deal
> >>with strings relationally.
>
> > Yes however I consider this a non-issue. The advantage of
> > recursive structures is that one can deal with each level
> > at a time, and not have to address them all at once.
>
> And I would expect a String type to permit multiple possible
> representations. If one can represent a String as an array, one already
> represents them as a relation given the operations to convert back and
> forth. So why not have a relation-valued representation?
>
> /amplification
>
> >>You don't seem to appreciate the fact that the onus of proof often
> >>depends on the nature of the claim rather that who is making it. It
> >>would obviously be difficult to construct a formal proof that there
> >>doesn't exist elegant and useful string processing algorithms using
> >>RA, whereas it is comparatively easy to show examples if they in fact
> >>exist. All I said is that I can't think of any and perhaps there
> >>aren't any. I make no apologies for having rather less experience in
> >>RM than you do. If you look carefully at my post you will find that I
> >>never even claim that there are no such examples. I only raise the
> >>question.
>
> > I don't believe I have made any claims as to where the anus of
> > proof lies.
>
> Intentional? ;> LOL
>
> >>I note that this is the second time you have expected proofs of non-
> >>existence of examples (cf our discussions about fine grained mutative
> >>operations in a previous thread). Do you agree that this is often
> >>impossible?
>
> > I don't believe that I have expected proofs. Let me change
> > my earlier phrase "prove a point" to "make a point" so that
> > we avoid any potentially-ambiguous use of the word "prove."
>
> > Some of what we're discussing are design questions and
> > they are not subject to proof. (In fact, surprisingly little *is*
> > subject to proof.)
>
> Indeed. The more we make subject to proof, the better, however.
>
>
>
>
>
> >>>>Won't you almost always
> >>>>join/select for the simple purpose of looking up a character in a
> >>>>given string at a given index position?
>
> >>>You say that like it's a bad thing. Join is the primary tool
> >>>of the RA, so yes, we will join. And select. We'll use the
> >>>algebra, just like an OO programmer would use loops
> >>>and method calls.
>
> >>Of course. What I mean is that the access pattern on the relation is
> >>almost always the same, and no different to the functionality provided
> >>by the ADT.
>
> >>This contrasts with relations like Employee which join on different
> >>attributes at different times, revealing the power of RA for adhoc
> >>query.
>
> > Joining on single-attribute relations is a practical and useful
> > thing to do; it doesn't indicate any problem, any more than
> > a function that takes only a single argument indicates a
> > problem.
>
> > Join of two one-attribute relations on their single attributes is
> > set intersection, which I hope you will agree is a meaningful
> > and useful operator.
>
> Or cross-product which I hope everyone will agree is a meaningul and
> useful operation too.
>
>
>
>
>
> >>>indexOf(int ch, int fromIndex)
>
> >>>The source to java.lang.String can be downloaded as part of the JDK;
> >>>I will refer to it here. (But not excerpt it, much as I would like to
> >>>for
> >>>comparison, because it is copyrighted.)
>
> >>>Here is the relational implementation in pseudo-SQL:
>
> >>> select min(index) from S where c = ch and index >= fromIndex
>
> >>>The body of the Java implementation is 37 lines long, and
> >>>contains 8 if statements and 2 for loops, and can return
> >>>from any one of five places.
>
> >>I shall look at this Java code when I get the time. It seems overly
> >>complicated.
>
> >>In my C++ code I use a StringRange class that represents an aliased
> >>range of characters using a pair of pointers, allowing me to write
>
> >> while(s && *s != ch) ++s;
>
> >>to scan through characters in range s not equal to ch.
>
> > I am aware of the StringRange techinque but not familiar with
> > it. I am not immediately clear what the above code does; in
> > a C idiom (as opposed to C++) it appears merely to advance
> > the (uninitialized) s pointer. Even if it's a complete solution, you
> > omit the code from the StringRange class, which has to be
> > accounted for to do a fair comparison with my code.
>
> I wonder how the above loop terminates if ch does not occur in s. Then
> again, if the increment operation can make s evaluate to false at the
> end of the string, I suppose it is a fair operation to assume just as
> join is a fair operation to assume.
>

Bob makes this point in a later context, but I think it can be made even more accutely in the context of indexOf(int, int).

David has implemented indexOf(int,int) as a fixed algorithm... sure, his implementation is somewhat simpler than Marshall's declarative version, but this metric misses a mammoth disadvantage.

In almost any modern SQL DBMS, if string is stored as a relation mapping indexes to characters, and indexOf() is written as Mashall defined it, one could slap an index on the string relation and immediately have a O(log n) for indexOf(). In fact, for rare characters such as delimiters (which are most likely to be searched for within an application making routine use of indexOf), they would have an implementation approaching O(1). Even better, modern databases allow for the kind of dynamic sampling which would allow an optimizer to make runtime decisions that choose whether to use a binary search for rare characters, or an array scan starting at fromIndex for more common characters.

For David to compete, he should present a procedural version of indexOf() code with equivalent behavior characteristics; i.e. in which indexOf() will be resolved via a binary search for rare characters and an array scan for more common characters.

I've added the requisite additions for enhancing Marshall's code (using standard DBMS-supplied utilities) Create table s (i int, c int);
Create index n (c); -- new
GatherStats(s); -- new
IndexOf(ch int, fromIndex int) : select min(i) from S where c = ch and i >= fromIndex

David's enhanced code (using standard OO-supplied libraries):

[David, please present the code here.]

>
>
>
>
> >>> select min(S1.index) from S S1 where
> >>> (select min(S2.c = str.c) from S S2, str
> >>> where S2.index-S1.index = str.index) and
> >>> S1.index >= fromIndex
>
> > [corrected from original]
>
> >>>"Select the smallest S.index where
> >>> every character in S' offset by S.index equals
> >>> every character in str,
> >>> that is greater than or equal to fromIndex"
>
> >>>The body of the java implementation calls an internal helper function
> >>>that is 33 lines, containing 6 ifs, 2 for loops, 1 while loop, and
> >>>that
> >>>can return from any one of 4 different places.
>
> >>Thinking off the top of my head, in C++ I would write something based
> >>along the lines of
>
> >>while(s && str.size() <= s.size() && s.prefix(str.size()) != str) ++s;
>
> > Again, I note that you're not accounting for the size() and prefix()
> > methods;
> > you have to include them in the total count.
>
> I tend to disagree with that. You do not account for the physical
> implementation of join or min.
>
> My code didn't use any
>
> > function abstraction; what you see is the whole deal. Otherwise
> > we both just wrap our implementations in a function and call it a
> > day. My implementation used only min(), subtraction, and comparison.
>
> I don't think it unfair to assume standard library-supplied code in a
> comparison, but I agree the other extreme of wrapping everything in a
> function makes no sense either.
>
> I note that David's algorithm is necessarily at least O(n*m) where n is
> the number of characters in s and m is the number of characters in
> str--squared if the .size function scans for null-termination.
>
> Your solution is potentially O(n) depending more on the skill and
> investment of the compiler writers and less on your skill and investment.
>
> (Unless someone can point to a C++ compiler that has implemented
> algorithmic replacement.)
>
>
>
> >>>>If you want to deal with strings declaratively (not procedurally) then
> >>>>use a functional programming or logic programming language. RM seems
> >>>>a poor fit.
>
> >>>Does it still seem so after reading the above?
>
> >>Yes!
>
> > Okay. What would you find to be a convincing demonstration? If nothing
> > else I claim my substring find is quite declarative; as much so as any
> > logic program. (In fact, one could make an argument that mine *is*
> > a logic program.)
>
> > It is in general nearly impossible to separate out one's familiarity
> > with
> > a particular computational model from how "intuitive" code in that
> > model appears. Again, Haskell on first contact appeared ridiculously
> > opaque, but now I find it generally very readable.
>
> > The Stroustroup quote:
>
> >http://www.research.att.com/~bs/bs_faq.html#compare
>
> > The only way to get around the familiarity issue is with hard
> > metrics of expressiveness, and these are few and far between.
> > The one that stands out is Felleisen's "On the Expressive Power
> > of Programming Languages." I have it on my to-do list to
> > master Felleisen expressiveness and use it to compare a
> > relational language.
>
> > Until we do that, or something like it, pretty much all we're
> > doing with
>
> ...
>
> read more »- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -
Received on Wed Jan 31 2007 - 22:49:22 CET

Original text of this message