Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> comp.databases.theory -> Re: Canonical DB
Hi Dmitry,
The posts are getting way to long for my taste, so I snipped (... and [snip]), maybe oversnipped - feel free to unsnip :-)
Dmitry A. Kazakov wrote:
> mAsterdam wrote:
>>Dmitry A. Kazakov wrote: >>>mAsterdam wrote: >>>> ...Which (or which types of) computations are >>>> easier with a navigational structure?...
Heh. Agreed.
>>>... ordered structure induces some distance.
>>...The keys in the table on paper even have a distance in cm. >>On the keys in the SQL table distance is undefined.
Yes, that is clear.
These requirements establish K as a clean point type.
Aside: With the last condition deleted one can make a
circular K, having distance(k1, k2) <> distance(k2, k1).
> Then you can enumerate all K preserving the order. I.e. if k# denotes the
> ordinal number of k, then k1#<k2# <=> k1<k2.
>
> So you *can* define a distance on rows as
>
> d(r1,r2) = | key(r1)# - key(r2)# |
>
> (which will satisfy axioms of distance)
>
> It can be shown, that this apparently arbitrary distance, plays a
> sufficient role in some cases.
K is now a point type, so distance(k1, k2) is defined.
Aside: That is, indeed, assuming we have agreed on some algorithm to calculate the distance, given two points. See for instance http://en.wikipedia.org/wiki/Manhattan_distance
I'm still missing a (IMO crucial) step:
What does it mean - or how to interpret
this distance between two values in K as
something in the problem space?
The reason I had to state the above in a
backward way is: Normally when modeling
this isn't a problem, because at some time
we introduced K into the model for
some purpose - but here we have a
partial solution wanting a problem space:
the computations we are looking for.
>>Common sense (as in: I have no theory to back >>this up) tells me that for k2-k1 to denote a >>distance, both k1 and k2 should denote a point >>in some space.
Now I do (have some theory to back it up). Thanks :-)
> The space is K, and, equivalently, the space of rows R,
> because key:R->K is a bijection.
Well, yes.
In order to have a distance between keys, the keys must be of a point type.
Now I think you are suggesting that this bijection implies that there is also a distance defined between the tuples.
I reject that. There maybe other point-typed
attributes in the relation and hence several valid
distances between any two tuples.
Which one is *the* distance?
The crucial step here is:
What does it mean: how to interpret
this distance between tuples as
something in the problem space?
(Again, necessarily still phrased backwards.
I think that if I had a concrete problem
example it would be easier to continue).
>>... To the problem space ... tree-distance is meaningless.
I'd say until there is meaning there is no model of the UOD (maybe a template, a pattern, a proto-model, but no model of the UOD). This may be a slight terminological difference, this may be important - I don't know yet.
> So if the distance immanent to our model does not
> map the problem space distance, and, at the same time, this model distance
> is directly or indirectly used in a reasoning based on the model, then we
> are in a deep trouble. Consider a model: "the skin color determines the
> intellectual level of a person." I.e. d(White, Person) --> IQ(Person). It
> is an inadequate model.
I reject this model because of its meaning.
>>It may say something about something like a >>computational distance for a (navigational) >>computation, but nothing at all about the UOD.
The modeler comes up with artifacts
and interpretations.
It is the modelers responsibility
to explore whether the UOD interpretations
and the model's limitations are understood.
>>>Which might have an immense computational impact. >> >>How so? We haven't any way yet to compute problem space distance. >>That's it. No computations to compare yet.
Could you please give some problem space examples?
>>>Example: kd-trees vs. relational tables. >> >>Nodes on a kd-tree *do* denote a (carefully chosen) >>point/line/plane (k - 1 d thingy) in problem space.
>>>[Euclidean (any 3D) distance cannot be mapped to *any* linear order. It is >>>a hard fact. So there is no key which could work. This was the thing which >>>killed Prolog in machine learning (nobody would even try RA.)] >> >>Using something completely meaningless to compute >>something meaningful kills nothing but the idea >>to do so in the first place. It is a bad idea (TM). >> >>Nodes on a kd-tree denote a point/.. in problem >>space, so it is a candidate navigational structure >>for the question at hand - if we have some computations >>to compare. >> >>A kd-tree is built from a set of points. >>We don't start from the kd-tree, so it >>can only support /part/ of a solution.
Heh. Sorry you feel that way. I just thought,
hey where do all these data come from?
They must have been captured somehow.
I'd welcome ideas from other posters as to which types of computations are easier with a navigational structure, to be honest preferably a bit more tangible than kd-trees (which to me are a solution space template).
> Unfortunately, it does not work this way.
I am not surprised.
Somebody who would "feed the monster with "facts" and see what it will come from, hmm, the other end." would be disappointed, I imagine.
(Same scarequotes, yes?)
> So far it is impossible to do
> without substantial re-wiring the monster for each new problem. It is also
> a great controversy, if that were fundamentally possible to do. There are
> strong arguments against (Goedel) and for (existence of Goedel himself and
> other intelligent beings). Let's leave that to philosophers.
>
>>>Another advantage of graphs and other "non-linear" structures, is that you >>>might do things which would otherwise illegal. For example, >>>self-referential structures. Self-containment is illegal, but a thing which >>>behaves as if it contained itself is possible through indirection. There >>>are many cases where you have to use such models, because other models >>>would be complex/inadequate/illegal. >> >>While I agree that recursion can be very convenient in order >>to reduce complexity and to enhance adequacy, I don't see >>how it makes possible (or legal) what otherwise isn't.
Sorry, I do not understand this. Is it crucial to the topic?
[snip]
>>One candidate type/class of computations is: >> those computations that benefit >> from a representation of the problem space >> as (nodes on) a kd-tree: >> with distance mappings in a kd-tree >> there is a definable computational >> distance, which relates to the problem >> space distance in a definable way. >> >>Am I understanding correctly?
>
> Yes. "Navigational" presumes existence of some context. Your actions depend
> on it. It is the focus of attention. You value things depending on how
> close are they to the focus. This allows you to reduce resources needed, by
> ignoring anything out of context. It also allows using adaptive techniques,
> i.e. to vary your strategy from point to point or/and from time to time.
> But it also may lead to aberrations, dead ends etc, which are quite common
> for "local" methods as opposed to "global" ones.
>
> I think that local vs. global is a better dichotomy than navigational vs.
> not. We could find it everywhere.
It is another dichotomy.
local vs. global is better for what?
ISTM local vs. global is within the navigational realm, so
inappropriate for comparisons of computations with navigational
vs. relational structures.
Aside: I replaced 'not' by relational.
Is non-navigational synonymous to relational?
Some are reluctant to even consider that.
Received on Fri Jun 23 2006 - 08:37:54 CDT