Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> comp.databases.theory -> Re: Canonical DB
On Fri, 23 Jun 2006 00:05:42 +0200, mAsterdam wrote:
> Dmitry A. Kazakov wrote:
>> mAsterdam wrote: >>>Robert Martin wrote: >> >>>>Things like tree searches, graph walks, etc. >>> >>>No, tree searches and graph walks are things >>>you *need* to do (and specify) when all you have >>>is navigational structures. They are part of their cost. >>> >>>Now where is the benefit - what are you computing: >>>Which (or which types of) computations are easier >>>with a navigational structure? >> >> That's easy.
>> The benefit is in reducing the >> cardinality of the problem space.
Sure
>> Talking about cases where usage of the "original" relational structure is >> possible, but computationally inefficient, consider distance mappings.
Because we are talking about models.
The world isn't relational, it just isn't! (:-))
I wished to consider cases where relational models are thinkable, but not optimal.
>> Technically each ordered structure induces some distance.
>> For example, a table with ordered keys induces one >> between rows d(k1,k2)=k2-k1.
Let you have some set of keys: K={k1,k2,...} (a countable set). If K is linearly ordered: forall k1,k2,k3 in K
k1>k2 V k1<k2 V k1=k2
not (k1<k2 & k2<k1)
not (k1<k2 & k1=k2)
k1<k2 & k2<k3 => k1<k3
Then you can enumerate all K preserving the order. I.e. if k# denotes the ordinal number of k, then k1#<k2# <=> k1<k2.
So you *can* define a distance on rows as
d(r1,r2) = | key(r1)# - key(r2)# |
(which will satisfy axioms of distance)
It can be shown, that this apparently arbitrary distance, plays a sufficient role in some cases.
> Common sense (as in: I have no theory to back
> this up) tells me that for k2-k1 to denote a
> distance, both k1 and k2 should denote a point
> in some space.
Yes.
The space is K, and, equivalently, the space of rows R, because key:R->K is a bijection.
>> A tree structure does it as well, it is so-called tree distance. >> This induced distance is obviously not the problem space distance.
As any artifact of any model is. Models are meaningless. We give them meaning. We could fail to give a meaning or choose a wrong one. That would make a model inadequate. So if the distance immanent to our model does not map the problem space distance, and, at the same time, this model distance is directly or indirectly used in a reasoning based on the model, then we are in a deep trouble. Consider a model: "the skin color determines the intellectual level of a person." I.e. d(White, Person) --> IQ(Person). It is an inadequate model.
> It may say something about something like a
> computational distance for a (navigational)
> computation, but nothing at all about the UOD.
Yes. It is our belief which does.
>> Which might have an immense computational impact.
A problem can be stated in terms of the problem space distances. Most of classification/optimization/approximation problems this or that way are.
>> Example: kd-trees vs. relational tables.
Yes. This makes the tree distance close to the Euclidean one.
>> [Euclidean (any 3D) distance cannot be mapped to *any* linear order. It is >> a hard fact. So there is no key which could work. This was the thing which >> killed Prolog in machine learning (nobody would even try RA.)]
I feel you still have the idea: let's feed the monster with "facts" and see what it will come from, hmm, the other end. No need to guess, what it will be! (:-))
Unfortunately, it does not work this way. So far it is impossible to do without substantial re-wiring the monster for each new problem. It is also a great controversy, if that were fundamentally possible to do. There are strong arguments against (Goedel) and for (existence of Goedel himself and other intelligent beings). Let's leave that to philosophers.
>> Another advantage of graphs and other "non-linear" structures, is that you >> might do things which would otherwise illegal. For example, >> self-referential structures. Self-containment is illegal, but a thing which >> behaves as if it contained itself is possible through indirection. There >> are many cases where you have to use such models, because other models >> would be complex/inadequate/illegal.
Indirection does. You cannot put a set into itself because that would be illegal. But you can put some other object into it, which resembles the enclosing set. It is almost the set. To put it differently, you more the recursion from the language to the objects. You talk about recursion instead of recursively talking.
>> If you want a generalized description, >> it is the problems where local descriptions are still "good-behaving", >> linear, but global ones are a nightmare. Graph is locally is almost as good >> as a table.
Right
> ISTM this means that kd-trees are suitable for
> data-rendering, but not for capturing or
> maintenance of the same data these rendering
> computations build on.
They aren't good when you want to add and remove many points. Balancing is a problem even when nodes are only added and never removed. There are cases when you might get a very bad tree. Higher dimensions k>10 are also problematic.
> My question was:
>
> "Which (or which types of) computations are easier
> with a navigational structure?".
>
> So far, your answer appears to me as:
>
> One candidate type/class of computations is:
> those computations that benefit
> from a representation of the problem space
> as (nodes on) a kd-tree:
> with distance mappings in a kd-tree
> there is a definable computational
> distance, which relates to the problem
> space distance in a definable way.
>
> Am I understanding correctly?
Yes. "Navigational" presumes existence of some context. Your actions depend on it. It is the focus of attention. You value things depending on how close are they to the focus. This allows you to reduce resources needed, by ignoring anything out of context. It also allows using adaptive techniques, i.e. to vary your strategy from point to point or/and from time to time. But it also may lead to aberrations, dead ends etc, which are quite common for "local" methods as opposed to "global" ones.
I think that local vs. global is a better dichotomy than navigational vs. not. We could find it everywhere.
-- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.deReceived on Fri Jun 23 2006 - 05:08:41 CDT