| Oracle FAQ | Your Portal to the Oracle Knowledge Grid | |
Home -> Community -> Usenet -> comp.databases.theory -> Approximate structural conditions in trees
Suppose I have trees (ordered node-labeled trees, vulgo XML documents,
to be precise) and I want to search them for some structural
conditions. For example, I might want to search nodes labeled A
which have child-nodes labeled B.
Now I assume that the queries are not precise since the user formulating them does not know what the trees look like. In our example, A could be a grandparent of B but the system should still find it.
Another example would be that the user looks for something like this:
A
/ \
B B
But the tree looks like this:
B
/ \
A A
Maybe this tree should still be found.
Now a number of questions arise:
I'm aware that there is some work on similarity between trees (one approach is the tree editing distance), but some things are not clear to me:
(1) A query is not the same as a complete tree. For instance, a
query is much smaller than a complete tree. Consider that the trees are books in XML format, and that the query might be to search for chapters with a certain heading. Then the query just contains two nodes, but the documents will presumably contain thousands of nodes.
Also, a query could contain other kinds of conditions. For example, what if the query says to search for A elements which have either a B child or a C child? It's clear that this does not map to a simple tree.
So it is problematic to apply a similarity measure between trees if one tree could be very different from the other one, or maybe one tree is not a tree at all!
(2) Is the tree edit distance a good measure of the users' idea of
relevance? If one tree has a low distance and the other has a high distance, will the user really prefer the tree with the low distance?
Is there work on this? What are the keywords to feed to Google for this?
kai
-- A large number of young women don't trust men with beards. (BFBS Radio)Received on Tue Jul 09 2002 - 22:55:23 CDT
![]() |
![]() |