I recently read a paper on federating ontologies on the Web. One subproblem was matching subtrees, and the authors used Gentner's Structure-Mapping Engine (SME) for it. Perhaps this link would help:

> Suppose I have a collection of labeled ordered trees (vulgo XML
> documents). Suppose I have a query on the tree structure. Suppose I
> want this query to be interpreted in a vague way.
> Any key words to search for in the literature?
> I know of similarity between trees, but that does not appear to apply
> here. First of all, the query is likely to be much smaller than the
> documents, and therefore the similarity is likely to be low.
> Secondly, the query may specify things which can't be expressed as
> trees. For instance, suppose people search for nodes labeled A which
> have either a child labeled B or a grandchild labeled C. This
> disjunction cannot be expressed as a tree.
> The second question is the relationship between the similarity and
> the relevance. How do we know that a similar tree is also more
> relevant to the user? It would be nice to have a theoretical
> foundation for this.
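
To make the disjunctive query from your example concrete: "nodes labeled A which have either a child labeled B or a grandchild labeled C" is awkward as a single tree pattern, but it is easy to state as a predicate over nodes. Below is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The sample document, its `id` attributes, and the function names `matches` and `find_matches` are my own assumptions for illustration. It only does exact matching, so it says nothing yet about vagueness or relevance.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the labels A, B, C come from the question,
# everything else (root, X, Y, the id attributes) is made up for illustration.
DOC = """
<root>
  <A id="1"><B/></A>
  <A id="2"><X><C/></X></A>
  <A id="3"><X><Y><C/></Y></X></A>
  <A id="4"><C/></A>
</root>
"""

def matches(node):
    """True if node is labeled A and has a child B or a grandchild C."""
    if node.tag != "A":
        return False
    # Iterating over an Element yields its direct children in document order.
    has_child_b = any(child.tag == "B" for child in node)
    has_grandchild_c = any(
        grandchild.tag == "C" for child in node for grandchild in child
    )
    return has_child_b or has_grandchild_c

def find_matches(root):
    """Return all matching nodes in document order."""
    return [node for node in root.iter() if matches(node)]

if __name__ == "__main__":
    root = ET.fromstring(DOC)
    print([node.get("id") for node in find_matches(root)])
```

Nodes 3 and 4 are rejected because their C is a great-grandchild and a direct child respectively. A vague interpretation would presumably replace the boolean result with a score, and how to define that score is exactly your open question.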
