# Re: RM and abstract syntax trees

Date: Mon, 05 Nov 2007 23:22:42 -0800

Message-ID: <1194333762.116623.131770_at_v29g2000prd.googlegroups.com>

On Nov 6, 6:06 am, Tegiri Nenashi <TegiriNena..._at_gmail.com> wrote:

> On Nov 3, 9:06 am, Marshall <marshall.spi..._at_gmail.com> wrote:

*>
**>
**>
**>
**>
**> > On Nov 1, 11:24 am, Tegiri Nenashi <TegiriNena..._at_gmail.com> wrote:
**>
**> > > On Nov 1, 8:04 am, paul c <toledobythe..._at_ooyah.ac> wrote:
**>
**> > > > Surely the application here is the manipulation
**> > > > of the parse tree, not the AST.
**>
**> > > Parse tree indeed is the king, not AST. Given the parse tree one can
**> > > easily derive AST by selecting only "interesting" nodes. The aguments
**> > > for AST -- that it is more human readable and more compact -- are both
**> > > meningless from the theory perspective.
**>
**> > Various compiler people I know would disagree with this. The
**> > parse tree has lots of irrelevant info in it. Also the two may
**> > not have the same structure. For example a set of child
**> > nodes may appear as a left (or right) heavy tree, as a
**> > result of the structure of the productions. Whereas the
**> > semantic view might be a list or even a set.
**>
**> How do you define "semantic view"? IMO the only significant relation
**> encoded in the tree structure is the parent-child, but in order to
**> work comfortably with this you have to transitively close it to obtain
**> "ancestor of" partial order relation. Both AST and parse tree have the
**> same partial order, and AST simply has less number of elements.
**>
**> However, I read your reply as if you have a counter example which I
**> fail to see. Consider the previous example:
**>
**> (x + 1) * 3
**>
**> Parse tree:
**>
**> expr[0,7)
**> ...expr[0,5)
**> ......."(" [0,1)
**> .......expr[1,4)
**> ..........expr[1,2)
**> ............."x"
**> ..........expr[2,3)
**> ............."+"
**> ..........expr[3,4)
**> ............."1"
**> .......")" [4,5)
**> ..."*"[5,6)
**> ...expr[6,7)
**> ......"1"
**>
**> versus AST
**>
**> expr
**> ...expr
**> ......."x"
**> ......."+"
**> ......."1"
**> ..."*"
**> ..."1"
**>
**> Clearly one is reduction of the other. One can further reduce it by
**> renaming symbols so that expression is labeled by
**>
**> "*"
**> ..."+"
**> ......."x"
**> ......."1"
**> ..."1"
**>
**> but, yet again, do you actually gain anything?
*

I once wrote a (toy) Prolog program to do symbolic algebra. The AST consisted of nested terms, and I associated the AST with the internal data structure in which to simplify expressions and solve equations. The (top down) parser was a breeze to write, and I had no need to explicitly generate a parse tree.

In the context of your above question I would say that in this particular example, avoidance of the parse tree gained performance and simplicity.

I don't know enough about grammars and parsing theory to know whether parse trees are important more generally. Received on Tue Nov 06 2007 - 08:22:42 CET