Re: Hierarchical query

From: Jan Hidders <hidders_at_gmail.com>
Date: Wed, 13 Jun 2007 14:25:24 -0700
Message-ID: <1181769924.505218.128520_at_z28g2000prd.googlegroups.com>


On 13 jun, 21:10, Vadim Tropashko <vadimtro_inva..._at_yahoo.com> wrote:
> On Jun 13, 9:19 am, Jan Hidders <hidd..._at_gmail.com> wrote:
>
> > Ok. I'm going to assume the following DTD (in a notation of my own
> > making to make it a bit more readable) for the syntax tree. (It uses
> > no attributes to keep things simple):
>
> > <stat_bl> --> <decl_kw> <decl_item_list> <begin_kw> <statements>
> > <semicol_kw> <end_kw>
> > <decl_kw> --> EMPTY
> > <begin_kw> --> EMPTY
> > <end_kw> --> EMPTY
> > <semicol_kw> --> EMPTY
> > <statements> --> ( <stat_bl> | <assignment> )+
> > <assignment> --> <var> <assign_kw> ( <var> | <number> ) <semicol_kw>
> > <var> --> PCDATA
> > <assign_kw> --> EMPTY
> > <number> --> PCDATA
> > <decl_item_list> --> ( <var> <type> <semicol_kw> )+
> > <type> --> PCDATA
>
> All right, the DTD is a bastardized [context free?] grammar describing
> a language that XML document is an element of. Although the adjectives
> "cumbersome" and "ugly" still apply, I grudgingly admit that the idea
> that a grammar fits into DTD effortlessly is quite powerful (so I'm
> removing the "XML sucks" image from my homepage:-)

Of course, it combines ugliness with great power. Welcome to the Dark Side. :-)

> > For starters I'll first do the reverse query,
> > so I will assume there
> > is a variable $dvar that contains a <var> element that describes a
> > variable in a declaration. The XPath expression that walks to all the
> > <var> nodes in an assignment that are in the scope of $dvar is as
> > follows:
>
> > (1) $dvar/(
> > (2) ../../statements//assignment/var[string() = $dvar/string()]
> > (3) minus
> > (4) ../../statements//stat_bl[decl_item_list/var/string() = $dvar/
> > string()]/statements//var
> > (5) )
>
> > The idea is quite simple: the path expression in line (2) walks to all
> > variables that are nested within the statement block of the
> > declaration $dvar, ...
>
> I have trouble comprehending this line
>
> (2) ../../statements//assignment/var[string() = $dvar/string()]
>
> should I read it left to right? Then the ".." selects the parent of
> the current node, and what is the current node?

First a small correction: the 'minus' I used should actually be 'except', which has the semantics of set difference.

What is the current node? Note that the global structure is p_1/(p_2 except p_3). Here (p_2 except p_3) means that from the context node you evaluate p_2, which gives you a set (or rather a sequence) of nodes, and from that you subtract the set of nodes you get when you evaluate p_3 starting from the same context node. So the "current node" that you ask about is the same for both subexpressions, namely the node in $dvar.
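To make the shared context node concrete, here is a minimal Python sketch of p_1/(p_2 except p_3) for the scoping query. It is only an illustration under assumptions: the sample document, the `parent` map and the helper names `p2`, `p3` and `in_scope` are made up for this example, and the standard library's ElementTree stands in for a real XPath 2.0 engine (it has no 'except', so plain set difference is used instead).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tree following the DTD above: an outer block declares
# x and y, and a nested block redeclares x.
DOC = """
<stat_bl>
  <decl_kw/>
  <decl_item_list>
    <var>x</var><type>int</type><semicol_kw/>
    <var>y</var><type>int</type><semicol_kw/>
  </decl_item_list>
  <begin_kw/>
  <statements>
    <assignment><var>x</var><assign_kw/><number>1</number><semicol_kw/></assignment>
    <stat_bl>
      <decl_kw/>
      <decl_item_list><var>x</var><type>int</type><semicol_kw/></decl_item_list>
      <begin_kw/>
      <statements>
        <assignment><var>x</var><assign_kw/><number>2</number><semicol_kw/></assignment>
      </statements>
      <semicol_kw/>
      <end_kw/>
    </stat_bl>
    <assignment><var>y</var><assign_kw/><var>x</var><semicol_kw/></assignment>
  </statements>
  <semicol_kw/>
  <end_kw/>
</stat_bl>
"""

root = ET.fromstring(DOC)
# ElementTree has no parent axis, so build a child -> parent map for "..".
parent = {child: p for p in root.iter() for child in p}


def p2(ctx):
    """../../statements//assignment/var[string() = $dvar/string()]"""
    statements = parent[parent[ctx]].find("statements")
    return {v
            for a in statements.iter("assignment")
            for v in a.findall("var")
            if v.text == ctx.text}


def p3(ctx):
    """../../statements//stat_bl[decl_item_list/var/string()
       = $dvar/string()]/statements//var"""
    statements = parent[parent[ctx]].find("statements")
    shadowed = set()
    for block in statements.iter("stat_bl"):
        if any(v.text == ctx.text for v in block.findall("decl_item_list/var")):
            shadowed.update(block.find("statements").iter("var"))
    return shadowed


def in_scope(ctx):
    # Both subexpressions start from the same context node, here $dvar.
    return p2(ctx) - p3(ctx)


dvar = root.find("decl_item_list/var")  # the outer declaration of x
result = in_scope(dvar)
print(len(result))
```

Both `p2` and `p3` receive the same `ctx`, which is exactly the point: the subtraction happens between two node sets computed from one and the same starting node.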

Btw., if you forget about sequence order, document order and value comparisons for a moment, the core of XPath is actually a relatively neat calculus of binary relations:
- p_1/p_2 is the concatenation of binary relations
- p_1[p_2] is the selection of pairs in p_1 whose right-hand side matches a left-hand side in p_2 (semijoin anyone?)
- p_1 union p_2 is the set union of two binary relations
- p_1 intersect p_2 is the set intersection of two binary relations
- p_1 except p_2 is the set difference between binary relations
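As a rough sketch of this reading, the operators can be modelled in Python with a path expression represented as a set of (start node, end node) pairs. The names `compose` and `select` and the toy `child` relation are assumptions made for illustration only; sequence order, document order and value comparisons are ignored, as said above.

```python
def compose(p1, p2):
    """p1/p2: concatenation (composition) of two binary relations."""
    return {(a, c) for (a, b) in p1 for (b2, c) in p2 if b == b2}


def select(p1, p2):
    """p1[p2]: keep the pairs of p1 whose right-hand side starts a pair in p2."""
    starts = {a for (a, _) in p2}
    return {(a, b) for (a, b) in p1 if b in starts}


# union, intersect and except are simply |, & and - on sets of pairs.

# Toy "child" relation for a tree r -> (a, b), a -> c.
child = {("r", "a"), ("r", "b"), ("a", "c")}

grandchild = compose(child, child)      # child/child
with_children = select(child, child)    # child[child]
leaves = child - with_children          # child except child[child]

# p_1/(p_2 except p_3): the inner difference is taken first, then composed.
nested = compose(child, child - with_children)

print(grandchild, with_children, leaves, nested)
```

Read this way, the filter p_1[p_2] is a semijoin of p_1 with p_2 on "end node of p_1 = start node of p_2", and the step operator "/" is the corresponding join followed by a projection.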

Did I mention Tarski already? :-) Of course we had to wait until XPath 2.0 before all three set operations were allowed everywhere, but we have them now.

  • Jan Hidders
Received on Wed Jun 13 2007 - 23:25:24 CEST
