Re: Canonical DB

From: mAsterdam <mAsterdam_at_vrijdag.org>
Date: Fri, 23 Jun 2006 00:05:42 +0200
Message-ID: <449b13ca$0$31652$e4fe514c_at_news.xs4all.nl>


Dmitry A. Kazakov wrote:
> mAsterdam wrote:

>>Robert Martin wrote:

>
>>>Things like tree searches, graph walks, etc.
>>
>>No, tree searches and graph walks are things
>>you *need* to do (and specify) when all you have
>>is navigational structures. They are part of their cost.
>>
>>Now where is the benefit - what are you computing:
>>Which (or which types of) computations are easier
>>with a navigational structure?

>
> That's easy.

Thank you for your reply. I can go along with its direction as a good start.
The nit-picks below are not meant as disrespect; I hope they help to sharpen things.

> The benefit is in reducing the
> cardinality of the problem space.

That is the benefit of modeling in general, not of any particular way of modeling (in this case, modeling with navigational structures).

> Tree, graph and a table, too, can serve as sparse descriptions of
> some large problem space. Consider a route map.

A navigational problem space.
I prefer simple beginnings, but so be it. It just means we have to be extremely careful not to confuse navigation in our universe of discourse (UOD) with possibly navigational aspects of our model.

So, ok. Reluctantly.

> The original space is 3D continuum.
> You cannot handle it as-is in any finite system.

So, we have to model. Ok.

> A graph here is just a sparse description, of something,
> which otherwise might be impossible to compute.

http://en.wikipedia.org/wiki/Graph_theory says "a graph is a set of objects called points or vertices connected by links called lines or edges." Yet another pair of terms: nodes and links.

> Talking about cases where usage of the "original" relational structure is
> possible, but computationally inefficient, consider distance mappings.

Ok. Why the scare quotes?

> Technically each ordered structure induces some distance.

Yes.

> For example, a table with ordered keys induces one
> between rows d(k1,k2)=k2-k1.

That depends on whether you are talking about tables drawn on paper (or some other medium) or about 'table' as an SQL reserved word, which (to avoid confusing implementation issues with modeling issues) is to be treated as a relation variable. Relation variables have, by definition, no order.

The keys in the table on paper even have a distance, in cm. Between the keys in the SQL table, distance is undefined.

Common sense (as in: I have no theory to back this up) tells me that for k2-k1 to denote a distance, both k1 and k2 should denote a point in some space.
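
To make that concrete, here is a quick Python sketch (my own illustration; the rows, keys and coordinates are invented): the distance induced by an ordered key says nothing, by itself, about distance in the problem space the rows describe.

import math

# Rows keyed by an ordered surrogate key; 'pos' is a point in the
# problem space (2D coordinates, purely illustrative).
rows = {
    1: {"name": "depot",   "pos": (0.0, 0.0)},
    2: {"name": "station", "pos": (40.0, 3.0)},
    3: {"name": "harbour", "pos": (0.5, 0.5)},
}

def key_distance(k1, k2):
    # Distance induced by the ordered key: |k2 - k1|.
    return abs(k2 - k1)

def space_distance(k1, k2):
    # Distance in the problem space the rows are about.
    return math.dist(rows[k1]["pos"], rows[k2]["pos"])

# The two notions need not agree at all:
print(key_distance(1, 2), space_distance(1, 2))   # 1 vs. ~40.1
print(key_distance(1, 3), space_distance(1, 3))   # 2 vs. ~0.71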

> A tree structure does it as well, it is so-called tree distance.
> This induced distance is obviously not the problem space distance.

Indeed. To the problem space this tree-distance is meaningless. It may say something about a computational distance for a (navigational) computation, but nothing at all about the UOD.

> Which might have an immense computational impact.

How so? We don't yet have any way to compute problem-space distance, so there are no computations to compare yet.

> Example: kd-trees vs. relational tables.

Nodes on a kd-tree *do* denote a (carefully chosen) point/line/plane (a (k-1)-dimensional thing) in problem space.

> [Euclidean (any 3D) distance cannot be mapped to *any* linear order. It is
> a hard fact. So there is no key which could work. This was the thing which
> killed Prolog in machine learning (nobody would even try RA.)]

Using something completely meaningless to compute something meaningful kills nothing but the idea of doing so in the first place. It is a bad idea (TM).

Nodes on a kd-tree denote a point/.. in problem space, so it is a candidate navigational structure for the question at hand - if we have some computations to compare.

A kd-tree is built from a set of points. We don't start from the kd-tree, so it
can only support /part/ of a solution.
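
To have something concrete to compare computations on, here is a minimal kd-tree sketch in Python (the point set is invented, and this is only an illustration of the structure, not a claim about any particular implementation). It is built from a given set of points and then answers a nearest-neighbour query that a single linearly ordered key cannot answer directly:

import math

def build(points, depth=0):
    # Recursively build a kd-tree node: (point, left, right, axis).
    if not points:
        return None
    axis = depth % len(points[0])         # splitting dimension cycles with depth
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1),
            axis)

def nearest(node, target, best=None):
    # Nearest-neighbour search with branch pruning.
    if node is None:
        return best
    point, left, right, axis = node
    if best is None or math.dist(point, target) < math.dist(best, target):
        best = point
    # Descend the side of the splitting plane containing the target first.
    near, far = (left, right) if target[axis] < point[axis] else (right, left)
    best = nearest(near, target, best)
    # Cross the plane only if a closer point could lie on the other side.
    if abs(target[axis] - point[axis]) < math.dist(best, target):
        best = nearest(far, target, best)
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(pts)
print(nearest(tree, (9, 2)))                          # (8, 1)
print(min(pts, key=lambda p: math.dist(p, (9, 2))))   # brute force agrees: (8, 1)

The brute-force line at the end answers the same query against the bare set of points; with many points and many queries, that is where the computational impact would show up.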

> Another advantage of graphs and other "non-linear" structures, is that you
> might do things which would otherwise be illegal. For example,
> self-referential structures. Self-containment is illegal, but a thing which
> behaves as if it contained itself is possible through indirection. There
> are many cases where you have to use such models, because other models
> would be complex/inadequate/illegal.

While I agree that recursion can be very convenient in order to reduce complexity and to enhance adequacy, I don't see how it makes possible (or legal) what otherwise isn't.
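
For what it is worth, this is what I understand by "behaving as if it contained itself" through indirection (my own toy example, not Dmitry's):

class Node:
    def __init__(self, label):
        self.label = label
        self.contains = None       # a reference, not a copy

root = Node("root")
root.contains = root               # behaves as if it contained itself

print(root.contains.contains.label)   # "root", as deep as you care to go
print(root is root.contains)          # True: one object, referring to itself

Convenient, certainly; but I don't see what this makes possible that, say, a relation with a foreign key referring to its own key does not.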

> If you want a generalized description,
> it is the problems where local descriptions are still "good-behaving",
> linear, but global ones are a nightmare. A graph is locally almost as good
> as a table.

A kd-tree is a graph, so again we have sets of nodes and links to maintain. However,

http://en.wikipedia.org/wiki/Kd-tree:

"Balancing a kd-tree requires care. Because kd-trees are sorted in multiple dimensions, the tree rotation technique cannot be used to balance them — this may break the invariant."

ISTM this means that kd-trees are suitable for data rendering, but not for capturing or maintaining the same data that these rendering computations build on.
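
For what it is worth, a tiny illustration (my reading of that quote, nothing more) of the invariant a rotation would break: the dimension a node discriminates on is tied to its depth, the same depth % k rule used in the build sketch above, so moving a node up or down leaves it splitting on a dimension its subtrees were never partitioned by.

def splitting_axis(depth, k=2):
    # In the classic kd-tree the splitting dimension cycles with depth.
    return depth % k

# A node at depth 1 of a 2-d tree splits on axis 1 (y).
# Rotated to depth 0 it would have to split on axis 0 (x),
# which its subtrees were never partitioned by.
print(splitting_axis(1), splitting_axis(0))   # 1 0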

My question was:

"Which (or which types of) computations are easier with a navigational structure?".

So far, your answer appears to me as:

One candidate type/class of computations is:

	those computations that benefit
	from a representation of the problem space
	as (nodes on) a kd-tree:
	with distance mappings in a kd-tree
	there is a definable computational
	distance, which relates to the problem
	space distance in a definable way.


Am I understanding correctly?