Re: Concurrency in an RDB

From: David <davidbl_at_iinet.net.au>
Date: 27 Dec 2006 16:02:04 -0800
Message-ID: <1167264124.418907.257940_at_i12g2000cwa.googlegroups.com>


Sampo Syreeni wrote:
> On 2006-12-27, paul c wrote:
>
> > What possible reason would one have to apply relational operators to
> > strings, at least strings as most humans would read or write them?
>
> I can't see any. But I also read this as a shortcoming of the relational
> model.
>
> We do have a number of operations on strings and also full-fledged
> running prose which are practically important, but which haven't yet
> been neatly included in the relational model of data. Say, the
> equivalence between a low level string-of-characters representation, and
> a fully parsed, hierarchical, more annotated, "more semantic" one.
> Apparently there's something about text and/or strings which isn't
> straightforwardly amenable to relational treatment.
>
> Given the current, practical importance of both running text and the RM,
> I wonder why a) there haven't been any genuine attempts at treating
> strings, text and language in general in relational terms, or b) why the
> RM folks won't confess it can't be done, given the current state of
> knowledge, thereby acknowledging that there is data that just isn't
> currently amenable to relational treatment.

This reminds me of a difference I have noted between Prolog and RM.

Prolog allows for complex data structures to be built using nested functors. These data structures don't themselves represent predicates. So although Prolog forces all algorithms to be written using logic programming, it doesn't impose such restrictions on the data itself. Consider the example of using Prolog for symbolic algebra. The algebraic expressions to be manipulated are not recorded as assertable or retractable predicates.
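
To make the symbolic algebra point concrete, here is a minimal sketch in Python rather than Prolog. It assumes nested tuples as a stand-in for nested functors, so ("+", a, b) plays the role of the term +(a, b). The function name diff and the tuple layout are illustrative choices of mine, not anything Prolog prescribes. The point is that the expression is just a value passed to the algorithm; it is never asserted as a fact.

```python
# Nested tuples stand in for Prolog functors: ("+", a, b) plays the role of +(a, b).
# The expression is a plain value; nothing here is asserted or retracted as a fact.

def diff(expr, var):
    """Differentiate expr with respect to var (sums and products only)."""
    if isinstance(expr, (int, float)):  # a numeric constant
        return 0
    if isinstance(expr, str):           # a variable name
        return 1 if expr == var else 0
    op, left, right = expr
    if op == "+":                       # sum rule
        return ("+", diff(left, var), diff(right, var))
    if op == "*":                       # product rule
        return ("+", ("*", diff(left, var), right), ("*", left, diff(right, var)))
    raise ValueError(f"unsupported functor: {op}")

expr = ("+", ("*", 3, "x"), "y")        # 3*x + y
derivative = diff(expr, "x")            # unsimplified derivative term
```

The result is left unsimplified; a real symbolic algebra program would add rewrite rules for things like multiplication by zero.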

By contrast, in RM the data itself is stored in the form of predicates (or facts). Although this is clearly very powerful and useful, it would surprise me if all knowledge should be stored that way. For example, RM doesn't seem right for storing algebraic expressions.

Here is my hypothesis for what's going on: It relates to the interpretation mapping between entities and values stored in the RDB. Ideally a given tuple from a given relation can be easily (and independently) confirmed as a fact with respect to the real world entities. This independence between all the facts seems to be one of the main reasons why RM is so easy and intuitive to work with.

However, that all breaks down when the RDB is used to store compound expressions whose parts (i.e. sub-expressions), once decomposed, represent nothing other than themselves, rather than some "real" external entity mapped under interpretation. The benefits of relational query are lost because we aren't normally interested in searching for sub-expressions.
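
As a rough illustration of that decomposition, the sketch below flattens the expression 3*x + y into rows of a hypothetical relation (node_id, symbol, left_id, right_id). The column layout and the function name flatten are assumptions made for the example. Every node_id is invented by the encoder and refers to nothing outside the expression itself.

```python
def flatten(expr, rows):
    """Append one row per node as (node_id, symbol, left_id, right_id).

    Returns the node_id assigned to expr. Leaves have no children.
    """
    if not isinstance(expr, tuple):     # a constant or a variable name
        node_id = len(rows)
        rows.append((node_id, expr, None, None))
        return node_id
    op, left, right = expr
    left_id = flatten(left, rows)
    right_id = flatten(right, rows)
    node_id = len(rows)                 # id fabricated purely for bookkeeping
    rows.append((node_id, op, left_id, right_id))
    return node_id

rows = []
root = flatten(("+", ("*", 3, "x"), "y"), rows)   # 3*x + y
```

A single row such as (2, "*", 0, 1) cannot be checked against the real world on its own; it only makes sense together with the rows it points to.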

Consider the example of storing a string relationally using a head-tail list. This requires generating a unique identifier to represent each tail, so the table ends up holding a lot of fabricated entities that don't map directly to the problem domain.
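
A small sketch of the head-tail encoding, assuming a hypothetical relation with columns (cell_id, head, tail_id) where a tail_id of None marks the end of the list. The function names and the use of consecutive integers as identifiers are my own choices for illustration.

```python
def string_to_rows(text):
    """Encode text as (cell_id, head, tail_id) rows; tail_id None ends the list."""
    rows = []
    for i, ch in enumerate(text):
        tail_id = i + 1 if i + 1 < len(text) else None
        rows.append((i, ch, tail_id))   # cell ids are fabricated, not domain data
    return rows

def rows_to_string(rows, start_id=0):
    """Rebuild the string by following tail_id links from start_id."""
    by_id = {cell_id: (head, tail_id) for cell_id, head, tail_id in rows}
    if not by_id:                       # the empty string has no cells
        return ""
    chars = []
    cell_id = start_id
    while cell_id is not None:
        head, cell_id = by_id[cell_id]
        chars.append(head)
    return "".join(chars)

rows = string_to_rows("cat")
# rows is [(0, 'c', 1), (1, 'a', 2), (2, 't', None)]
```

A three-character string needs three identifiers, none of which means anything to a user of the data.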

It seems to me that this is associated with why RM is not particularly suitable for storing things like strings, algebraic expressions or source code.

Cheers,
David

Received on Thu Dec 28 2006 - 01:02:04 CET