Re: Three Kinds of Logical Trees

From: Marshall Spight <marshall.spight_at_gmail.com>
Date: 24 Jul 2005 14:06:24 -0700
Message-ID: <1122239184.099956.55070_at_g43g2000cwa.googlegroups.com>


dawn wrote:
> Marshall Spight wrote:
>
> hmmm. I'll process, but I figured that putting a person record on each
> node when trying to classify logical trees was muddying the example. I
> don't think I'm fully tapped into what you are thinking yet, but this
> gives me more clues.

I just picked org chart because it shows an example of a complex structure in which the structure of each node is the same.

> > > > dynamic heterogeneous: parse tree. There are specific kinds of
> > > > nodes, but the structure of the tree is relatively unconstrained.
> > >
> > > There are people here much more knowledgeable about trees than I (but
> > > that has not stopped me before, eh?) but even with the lose definition,
> > > I wouldn't say "relatively unconstrained" but perhaps "either
> > > unconstrained or constrained by a grammar." The grammar constraint
> > > might not be fully encoded into the tree, due to the complexity of the
> > > grammar.
> >
> > Yes, exactly. In other words, one has, say, 20 different kinds of
> > nodes; each node type may have a fixed or variable number of children;
> > each specific child may be constrained to be of a particular type:
> > a parse tree. I'm looking for a term that captures the fact that
> > a node has a specific structure. Each node has fixed *local* structure,
> > but the tree's *overall* structure is not fixed. (Hence "dynamic
> > heterogeneous.")
>
> OK, so the tree structure is not the only variable here -- you are also
> looking at the structure within a node. Got it.

Well, yes, but only insofar as the node structure influences the overall structure. In other words, a node for + would have two children, a node for ! would have only one.

> > (I am unclear why you use the term "metadata" here.
>
> If you look at a DOM tree for an XML document (or just zero in on an
> XHTML document if easier), you see that you move from tag to tag in a
> path until you get to a value that has no children. So, you step down
> the tree from <html> to <body> to <div> to <p> to a value that is the
> text in the paragraph. The leaf nodes have values and others have
> metadata.

That strikes me as a nonstardard definition of the use of metadata, but no matter.

> > I agree "scalar" is an unsatisfying term. I have in mind a better one,
> > but it requires a type system that's a lot different from SQL's.
> > (Which I assert is an "opportunity." :-)
>
> I'm planning to start, uh, blogging, this Fall and I decided to test
> out various tools this past spring, so I have a first entry (which
> likely looks like I abandoned the cause) at my not-exactly-perfect web
> site (they say that women apologize more than men) about the types I
> have at the top of my types tree (under the "type" type) in my ideal
> LOGICAL system.
>
> http://tincatgroup.com/mewsings

The pun in the URL is awful. I like it!

> I'm guessing yours are different?

Well, that column seemed to be mostly about document management and not data management. I'm really starting to see them as two entirely different disciplines, and I believe that most of what's going on in XML is about document management and not data management.

I admit that text documents are an important data type, but it's *just one* datatype among millions. Limiting ourselves to looking at only a single datatype doesn't put us in a good position to think about datatypes in general.

> > > Otherwise, if you have such a structure
> > > and you have a value that is either multipart or multivalued within
> > > this static structure, you still don't have SQL-92 support and most
> > > SQL-DBMS's don't make it easy to have such structures, even if they are
> > > supported in some way.
> >
> > I'm not saying that this is necessarily the best way to go, but you
> > can certainly handle this case in a 1NF way.
>
> Sure, but that doesn't mean that if you have a tree that matches your
> definition then it IS in a SQL-DBMS structure. If so, then, yes,
> obviously SQL handles that type of tree well, but if you permit other
> instances of such a type then, no, SQL doesn't handle all trees of this
> type well, so this partitioning of types of trees did not isolate
> information about SQL and trees. Did that make sense?

Sorry, but it didn't. At least not enough to tell you whether I agree or not.

I'll try rephrasing: If you have data that conceptually matches what I'm calling "static heterogeneous", (specifically, the data hierarchy is fixed, as in Customer/Account/Invoice/InvoiceLineItem) then you will be able to model, query, update, and constrain this data very well in SQL. In constrast, if you have an org chart, in which the structure is not fixed, then you will have a harder time, particularly with querying and constraining, and maybe also updating.

> If you are saying that you can put data structures that could be
> implemented in a SQL-DBMS into such a structure, OK, but if you are
> saying that SQL handles such tree structures well, then NO -- it only
> handles the SQL-like flavors well.

I don't really get it. *Any* data structure can be implemented in SQL. I'm trying to get at the question of what specific ones are a good match for sql, and I'm saying static heterogeneous is, and the others aren't.

> > > Interesting topic, but I not quite sure about this partitioning as yet.
> > > Are you trying to carve out how the relational model fits within a
> > > tree model by giving a term to such trees?
> >
> > Yes!
>
> You have possibly made it to a category that is a superset. It might
> be necessary to use such a structure (I'm not quite fully tapped into
> it, so I'll hedge on that), but it is not sufficient unless you add a
> translator to take a tree with multipart values or multivalue and
> explode it (that really is formally the term we use in my neck of the
> woods) into a tree that SQL does like.
>
> > SQL handles the static heterogeneous case with ease,
>
> am I correct that this is only tree for a subset of this type of tree
> by your definitions?

I'm going to assume you mean "true" when you first said "tree", because otherwise I can't make the sentence parse. If so, then *no*, I'm saying it's true for all tree structures that match my definition of static homogeneous. (And, explicitly, *not* true for the other two kinds. They're hard to work with in SQL.)

> > but really
> > chokes on the other two cases. The big difference seems to be
> > the fixed depth vs. variable depth issue.
>
> Yes, that is an issue, but only one.

Okay. What are the others, as you see it?

> > It's impossible to
> > handle variable depth without something more powerful than the
> > basic RM; you need something at least as powerful as transitive
> > closure; recursive queries or (ugh) iteration will also work.
>
> yes & yes and iteration is precisely how I drive from my house to my
> parents' house, among other things.

I do not agree. It is just as accurate to say that in going from point A to B to C to D, you invoke a pure function to transform your position A to B, which then invokes a function to transform your position from B to C, which then invokes a function to transform your position from C to D. If you want to prove to me that the universe is inherrently iterative, you're going to have to point out to me a loop counter somewhere in the real world. Under a rock, say.

The fact that so many progammers favor iteration over recursion is something I consider odd, given that iteration is strictly less powerful than recursion. It is possible to transform any iterative algorithm into a recursive one; the reverse is not true.

> How do you think relational operators are implemented? ;-)

Alas, this is irrelevant. We build software in layers, and each layer is implemented in whatever paradigm it chooses, but this does not constrain the choice of paradigm of higher layers.

> > So the area I'm exploring is, what kinds of operations do we want
> > to do on the dynamic tree types, and what smallest bit of power
> > to we have to add to the RM to handle that well?
>
> That is precisely the wrong question ;-) if you are talking about the
> logical data model IMO. If a human is interacting with the data model,
> then that person can apply either the metaphor of a graph (tree or
> otherwise) or of sets and benefit from the use of both metaphors,
> rather than being restricted by one.

I would be interested to hear what you consider to be the right question.

You regularly mention graphs. Do you have some demonstration of why you consider them a good choice? Maybe a description of some minimal set of operators and what you can do with them? I am quite pleased with the fact that the RM has its minimal set of operations and its correspondence with first order logic; can you describe anything comparable?

> If we can model the idea of
> "travel" by showing a picture of a person on a bicycle or by showing an
> airplane, then how can we extend the picture of the bicycle minimally
> so that it gets at the idea of being able to go over water? That
> question is as relevant in my mind as the one you asked (OK, that is
> almost the case -- I used a metaphor, so it is necessarily flawed and
> also limited).

Alas, I did not follow this metaphor at all. :-( Could you try to rephrase your point, perhaps without using a metaphor?

> > > If so and if I understand
> > > your terms, then there are at least NF2 (non-first normal form) models
> > > that would fit your "static" heterogeneous term.
> >
> > The issue for me is not the model so much, as it is what operators
> > do we need to work on them.
>
> Give freedom to the functions -- they can be relational operators or
> functions to move down a path or whatever is useful.
>
> > Modeling structure is relatively easy;
> > the hard part is querying, updating, and constraining.
>
> Yup. I check in on XQuery on occasion just for the entertainment.
> smiles. --dawn

I hear XMLSchema is a laugh riot.

Marshall Received on Sun Jul 24 2005 - 23:06:24 CEST

Original text of this message