Re: what are keys and surrogates?

From: David BL <davidbl_at_iinet.net.au>
Date: Tue, 8 Jan 2008 21:59:52 -0800 (PST)
Message-ID: <3e5975f0-5255-4914-aecd-2ed62f8afc1a_at_1g2000hsl.googlegroups.com>


On Jan 9, 1:25 pm, Marshall <marshall.spi..._at_gmail.com> wrote:
> On Jan 8, 6:17 pm, David BL <davi..._at_iinet.net.au> wrote:
>
> > On Jan 8, 9:26 am, JOG <j..._at_cs.nott.ac.uk> wrote:
>
> > In November I started a thread called "RM and abstract syntax trees"
> > in which I suggested that RM was poorly suited for the representation,
> > never mind manipulation of ASTs.
>
> Hmmm, I think I remember that. ;-)
>
> > The problem is that the only
> > reasonable way to represent the structure is to introduce meaningless
> > node identifiers. An important principle in the RM is that a tuple
> > should always represent a proposition that makes sense to the problem
> > domain expert, so I agree with you that we cannot allow hidden
> > identifiers. Therefore the RM cannot help but expose the node
> > identifiers for all to see.
>
> > Prolog is able to parse string expressions entered by users and build
> > and manipulate ASTs. Behind the scenes, nested functor expressions
> > are usually implemented using dynamically allocated nodes wired up
> > with pointers. However, as far as the programmer is concerned, only
> > unification is available to decompose the structure. It seems to me
> > that Prolog has a more general support for data modeling than
> > available in the RM, to the extent that nested functor expressions
> > avoid the need to introduce lots of meaningless identifiers.
>
> This issue goes away if we relax 1NF and allow attributes that are
> lists or relations. This gives us nested structures. (Nested relations
> are not particularly controversial around here.)

When you say "nested relations", do you intend to nest a relation at every node as one recurses down the AST?

Can you please explain how an expression like

    (5 + 6) * x

would be represented? I can imagine, for example, that the top node would be stored in a relation R as follows:

    R: { (0,R0), (1,R1), (2,R2) }

Here 0, 1 and 2 index the elements of a list. The 0th element, R0, is an RVA (relation-valued attribute) that represents the type of node (in this case a multiplicative node). The subsequent elements are the child nodes, which are also RVAs: R1 represents "5+6" and R2 represents "x".
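To make the encoding concrete, here is a minimal Python sketch, not a claim about how any relational DBMS would do it. Relations are modelled as frozensets of tuples. The helper names (listrel, tag, elem, untag, show), the "lit" and "var" node types, and the choice to wrap every atom in a one-tuple relation are all assumptions made for illustration; the post itself only specifies the (index, RVA) layout of R.

```python
def listrel(*elems):
    """Relation of (index, element) tuples encoding a list, like R above."""
    return frozenset(enumerate(elems))

def tag(name):
    """Assumed encoding: a one-tuple relation standing for a node type or atom."""
    return frozenset({(name,)})

def elem(r, i):
    """Pick element i of a list-relation (a restriction on the index attribute)."""
    (value,) = [v for (k, v) in r if k == i]
    return value

def untag(r):
    """Extract the single value held by a one-tuple relation."""
    ((name,),) = r
    return name

# (5 + 6) * x, with every node a distinct relation value and no node identifiers.
R1 = listrel(tag("+"), listrel(tag("lit"), tag(5)), listrel(tag("lit"), tag(6)))
R2 = listrel(tag("var"), tag("x"))
R = listrel(tag("*"), R1, R2)

def show(r):
    """Render a node back to infix text by recursing through the nested relations."""
    kind = untag(elem(r, 0))
    if kind in ("lit", "var"):
        return str(untag(elem(r, 1)))
    return f"({show(elem(r, 1))} {kind} {show(elem(r, 2))})"

print(show(R))
```

Note that the only relational operation the sketch ever uses is a restriction on the index attribute; everything else is ordinary recursion over nested values.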

An alternative approach (which would look even more like LISP) would be to use head-tail lists.
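A head-tail encoding could be sketched in Python along the following lines. This is an illustration under stated assumptions only: each cell is a relation with two tuples, ("head", h) and ("tail", t), the empty relation plays the role of nil, and atoms such as "*" or 5 are kept as plain scalar values rather than RVAs. The names NIL, cons, from_items and to_items are hypothetical.

```python
NIL = frozenset()  # the empty relation stands in for the empty list

def cons(head, tail):
    """A cell is a two-tuple relation holding a head value and a tail relation."""
    return frozenset({("head", head), ("tail", tail)})

def from_items(*items):
    """Build a head-tail list from right to left."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result

def to_items(cell):
    """Walk the tail links and collect the heads in order."""
    items = []
    while cell != NIL:
        fields = dict(cell)
        items.append(fields["head"])
        cell = fields["tail"]
    return items

# (5 + 6) * x in prefix form, as in LISP: (* (+ 5 6) x)
expr = from_items("*", from_items("+", 5, 6), "x")

op, left, right = to_items(expr)
print(op, to_items(left), right)
```

As with the indexed form, no node identifiers appear anywhere; the structure is carried entirely by nesting.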

I agree this avoids the need to introduce meaningless identifiers.

I guess it comes down to how one defines the relational approach. It strikes me as counter to the intentions behind RM/RA to use a distinct relation value for every node as we recurse down the AST. If you're happy to call that approach "relational" then I won't disagree. I will, however, ask whether much of the theory and practice discussed on cdt is at all relevant. For example, which parts of the RA are useful? Where is there any set-based processing?

How does the typing system work with such an approach? That is, how do you constrain the allowed RVAs? Would there be some concept of inheritance for relation types (by analogy to an OO implementation of an AST that uses a Node base class)?

Received on Wed Jan 09 2008 - 06:59:52 CET
