# Re: RM and abstract syntax trees

Date: Thu, 01 Nov 2007 08:05:39 -0700

Message-ID: <1193929539.677726.197470_at_y27g2000pre.googlegroups.com>

On Oct 31, 4:29 pm, David BL <davi..._at_iinet.net.au> wrote:

*> On Nov 1, 7:23 am, Tegiri Nenashi <TegiriNena..._at_gmail.com> wrote:
**>
**>
**>
**> > On Oct 29, 7:06 pm, David BL <davi..._at_iinet.net.au> wrote:
**>
**> > > The inappropriateness of the RM for ASTs is highlighted when we
**> > > interpret tuples as propositions or facts. Consider the following
**> > > expression
**>
**> > > (x + 1) * 3
**>
**> > > In order to state the fact that this expression consists of the
**> > > product of subexpressions '(x+1)' and '3', one must clearly assign
**> > > names to the things about which we want to state facts. No wonder the
**> > > RM forces us to go on a naming spree.
**>
**> > And when you consider the above expression in the language theory you
**> > don't assign grammar symbols to each subexpression? In your example
**> > they clearly are:
**>
**> > number : 1 | 3;
**> > expr : number | expr + expr | expr * expr | ( expr );
**>
**> I don't understand your point. Naming grammar symbols is quite
**> different from naming actual instances of subexpressions. Imagine
**> how many names one would need in a large source code program.
*

The actual instances of subexpressions are binary predicates. In your example

(x + 1) * 3

let's enumerate all the tokens:

0 -> ( 1 -> x 2 -> + 3 -> 1 4 -> ) 5 -> * 6 -> 3

Then by parsing this fragment we establish that, for example, the predicate "expr"(x,y) evaluates to true for the arguments x=0, y=5. (By convention a semiopen interval of tokens [x,y) includes all the integers between x, included, and y, excluded). Essentially you have a structure of nested intervals that encodes the parse tree:

expr[0,7)

...expr[0,5) ......."(" [0,1) .......expr[1,4) .......")" [4,5) ..."+"[5,6) ...expr[6,7)Received on Thu Nov 01 2007 - 16:05:39 CET