Re: Interpretation of Relations

From: JOG <jog_at_cs.nott.ac.uk>
Date: 22 Jan 2007 08:26:22 -0800
Message-ID: <1169483182.218899.74970_at_v45g2000cwv.googlegroups.com>


Joe Thurbon wrote:

> On 2007-01-22 10:46:47 +1000, "JOG" <jog_at_cs.nott.ac.uk> said:
> >
> > This is extremely close to my line of research. As such this seems like
> > a good opportunity to dig out something very similar, that started my
> > line of thought in this direction. I had been attempting to look at the
> > consequences of different encoding strategies for stating NL
>
> NL is natural language?
>
> > sentences
> > as formal propositions, and the effects that the choices made have on
> > the issue of missing information within the resulting data model. In
> > the course of this I produced the following simple example of the
> > effects of CWA and missing information that concerned me (I have
> > reworked the example to correspond to the OP).
>
> I think we're thinking along similar lines. I am still really new with
> the RM side, so I'm going to ask what might be pretty basic questions,
> and make what might be basic observations. Hopefully my understanding
> of the logic side might be able to pay you back (although I'm not
> really an expert there, just a lot more experienced than I am in the
> RM).

I hope so too.

>
> >
> > * Consider the dual predicates Joes_hair(x) and Not_Joes_hair(x), and
> > an RM representation of them with a trivial domain:
>
>
> >
> > Domain D_Hair = {Red}
> > Relation R_Joes_Hair = <value: D_Hair>
> > Relation R_Not_Joes_Hair = <value: D_Hair>
> >
> > * Constraints:
> > C1 = FORALL x R_Joes_hair(x) <-> ~R_Not_Joes_hair(x)
> > C2 = FORALL x R_Not_Joes_hair(x) <-> ~R_Joes_hair(x)
>
> Is it possible to have these sorts of constraints in the RM? I thought
> that there was no 'inferencing' that went on inside the model.

Yes. Through a CHECK constraint in SQL for example.

>
> What does ~R_Joes_hair(x) mean: just that x does not appear in the body
> of R_Joes_hair?

Yes, that is what I meant in my notation; however, there is no standard for this, I think (I may be wrong).

>
> And finally, is this the standard way of handling predicates where you
> want to assert 'negative facts?'

AFAIK there is no concept of negative facts in RM, just predicates. So if P1 = A(x) and P2 = ¬A(x), then P1 and P2 are definitions for two separate relations.

>
> >
> > * We obviously know from our encoding that:
> > R_Joes_hair(value : x) -> Joes_hair(x)
> > R_Not_Joes_hair(value : x) -> Not_Joes_hair(x)
> >
> > * Also, by the CWA we know that:
> > ~ R_Joes_hair(value : x) -> ~Joes_hair(x)
> > ~ R_Not_Joes_hair(value : x) -> ~Not_Joes_hair(x)
>
> Just to check that I'm with you here, Joes_hair is a logical predicate
> and R_Joes_hair is a relation?

Yes, that is the intention, to correspond the encoding to the actual predicate being modelled.

>
> I had written down a couple of formulae somewhere which I think capture
> this notion. I'll put it at the bottom of this post, but yes, assuming
> that ~R_Joes_Hair(x) just means that x does not appear in R_Joes_Hair's
> body, I think this is the right way to interpret this logically.

This is good news to me. Or bad, given the issues it generates. Depends which side of the bed I get up on in the morning. ;)

>
>
> >
> > * Now, if Joe's hair is red one should encode:
> > R_Joes_Hair = { (value:Red) }
> > R_Not_Joes_Hair = { }
> >
> > * Or if he does not have red hair one encodes:
> > R_Joes_Hair = { }
> > R_Not_Joes_Hair = { (value:Red) }
> >
> > * However, I don't know Joe, so this information is missing, and this
> > puts me in rather a spot. I cannot state R_Joes_Hair(Red) or
> > R_Not_Joes_Hair(Red) because:
> > Joes_Hair(Red) = UNKNOWN
> > Not_Joes_Hair(Red) = UNKNOWN
> >
> > * But worse still, if I do nothing at all and insert no tuples I have:
> > R_Joes_Hair = { }
> > R_Not_Joes_Hair = { }
> >
> > * From this, by the CWA, we could infer:
> > ~Joes_hair(Red) ^ ~Not_Joes_Hair(Red)
> > => ~Joes_hair(Red) ^ Joes_Hair(Red)
> > => CONTRADICTION
>
> Yes. Right.
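
The derivation quoted above can be replayed mechanically. What follows is only a minimal sketch, assuming Python sets stand in for relation bodies; the names cwa_truth and check_constraints are made up for illustration and are not part of any RM notation.

```python
# Hypothetical names throughout: a sketch of the argument, not an RM implementation.
D_HAIR = {"Red"}


def cwa_truth(body, value):
    """Closed world assumption: a proposition is true iff its tuple is in the body."""
    return value in body


def check_constraints(r_joes_hair, r_not_joes_hair, domain=D_HAIR):
    """C1/C2: each domain value must appear in exactly one of the two bodies."""
    return all((x in r_joes_hair) != (x in r_not_joes_hair) for x in domain)


# Joe's hair colour is unknown, so no tuples are inserted at all.
r_joes_hair = set()
r_not_joes_hair = set()

joes_hair_red = cwa_truth(r_joes_hair, "Red")
not_joes_hair_red = cwa_truth(r_not_joes_hair, "Red")

# Under the CWA both predicates come out false, although each one is
# defined as the negation of the other.
contradiction = (not joes_hair_red) and (not not_joes_hair_red)

# The empty/empty state is also exactly the state that C1 and C2 reject.
constraints_hold = check_constraints(r_joes_hair, r_not_joes_hair)
```

With both bodies empty, contradiction evaluates to True and constraints_hold to False, so under these assumptions the "insert nothing" state is not a legal database state in the first place.
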
>
> >
> > This frustrated me somewhat when I first jotted it down, and even if it
> > is missing a trick, it has given me some useful insights into how the
> > issue might be addressed through a description of 'facts /about/ our
> > knowledge of the world' (as you put it) via a SOL formalization - I'm
> > not sure that modal logic is necessary in the db-algebra itself.
>
> I agree. I think that the last set of assertions made in your example
> (where you derive the contradiction using the CWA) is already outside
> the RM. And if it is, my intuition is that really the root of all the
> problems with missing information is using the CWA to induce negative
> information. It ends up with your logical interpretation being
> completely asymmetric. In particular, you need to have these 'special
> constraints' to infer a relationship between Hair_Colour and
> Not_Hair_Colour, whereas Hair_Colour and ~Hair_Colour are logical
> complements 'for free'.
>
> So, what are the alternatives?
>
> I think there are quite a few.
>
> For example, we could extend the notion of relation to include two
> bodies. (I don't think that this is the right way to go about it in
> general, but it's a starting point)
>
> For example,
>
> Domain D_People = {Joe}
> Domain D_Hair = {Red, Blond, Black}
>
> Relation R_Hair_Colour = <<D_People X D_Hair>: {{Joe, Blond}}: {{Joe, Red}}>
>
> would indicate that
>
> Joes hair is blond (it is in the 'positive' body)
> Joes hair is not red (it is in the 'negative' body)
> Whether Joe's hair is black or not is unknown.
>
> Now, I am aware that this approach, followed purely as stated, would be
> completely intractable in practice. For example, many domains are just
> too big to enumerate all the possibilities. But at least it gets rid of
> the CWA.
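
As a rough illustration of the two-body idea quoted above, here is a small sketch. The class name TwoBodyRelation and its holds method are assumptions made up for this example, and None is used to stand for "unknown".

```python
from dataclasses import dataclass, field


@dataclass
class TwoBodyRelation:
    """Hypothetical relation with a 'positive' and a 'negative' body."""

    positive: set = field(default_factory=set)
    negative: set = field(default_factory=set)

    def holds(self, tup):
        """Return True if asserted, False if denied, None if unknown."""
        if tup in self.positive and tup in self.negative:
            raise ValueError(f"{tup!r} is both asserted and denied")
        if tup in self.positive:
            return True
        if tup in self.negative:
            return False
        return None  # no CWA: absence from both bodies means 'unknown'


r_hair_colour = TwoBodyRelation(
    positive={("Joe", "Blond")},
    negative={("Joe", "Red")},
)

blond = r_hair_colour.holds(("Joe", "Blond"))  # asserted
red = r_hair_colour.holds(("Joe", "Red"))      # denied
black = r_hair_colour.holds(("Joe", "Black"))  # neither, so unknown
```

The sketch also makes the tractability worry concrete: every tuple that is known to be false has to be stored explicitly in the negative body.
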
>
> Another strategy would be to slightly weaken the CWA in some
> circumstances. More below.
>
> > However I am a long way from being a logician and as such have a
> > healthy skepticism of the validity of absolutely any maths I generate,
> > so any critical analysis is /more/ than welcome.
>
> It all seemed to be right to me. A more general characterization might
> be like this:
>
> (The below is for unary relations over finite domains, because usenet
> is a plain text medium and the notation is difficult enough already,
> but the n-ary case is the natural extension):
>
> A relation R is defined over a Domain D = <d1, ..., dn>, with a Body B,
> containing 0 or more 1-ary tuples, each tuple containing one element of
> D.
>
> We then define a logical predicate L, also defined over D, whose truth
> value for each di in D is defined as
>
> L(di) iff "di is an element of B", where B is the body of R.
>
> The "iff" effectively closes the predicate over D, so the closed world
> assumption happens at the logic level, rather than the algebraic level.
>
> So far, this is identical to your example above. One 'workaround' to
> the missing information problem is:
>
> For concepts which are 'facts about the world' L(di) is just a logical
> assertion.
>
> For concepts which are 'facts about our knowledge about the world' I'd
> use modal logic (more precisely something like epistemic logic) and say
> that L(di) should actually be thought of as
>
> K(L(di))
>
> where K is the modal operator 'Known'.
>
> Why is this interesting (well, at least I think it is)?
>
> K(L(di)) -> L(di)
>
> That is, all the stuff that you "know" is true.
>
> but, and this is important
>
> ~K(L(di)) does not entail ~L(di)
>
> That is, just because you don't "know" something, it doesn't necessarily
> follow that it's false.
>
> This gives a nice consistent interpretation of missing information,
> keeps the CWA around, and doesn't change the meaning of the 'positive
> examples' in the body of R. Of course, sometimes, you _do_ want to
> entail ~L(d) from d missing from R's body. In this case, you just
> choose your L's interpretation to be a standard logical one.
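
The difference between the two readings of a relation body described above can be sketched as follows. The function names fact_holds and known_holds are hypothetical, and None is used here to mean "not entailed either way"; this is an illustration of the idea, not a treatment of epistemic logic.

```python
def fact_holds(body, d):
    """Standard reading: L(d) iff d is in the body, so absence entails ~L(d)."""
    return d in body


def known_holds(body, d):
    """Epistemic reading: presence means K(L(d)), and hence L(d).

    Absence means only ~K(L(d)), which does not entail ~L(d),
    so nothing is concluded and None is returned.
    """
    return True if d in body else None


body = {"Blond"}

fact_red = fact_holds(body, "Red")        # absence read as falsity
known_red = known_holds(body, "Red")      # absence read as 'not known'
known_blond = known_holds(body, "Blond")  # positive tuples mean the same in both
```

The body itself is identical in both cases; only the function used to interpret it changes, which matches the remark that the positive examples keep their meaning.
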
>
> A few posts ago, you made a closing remark that query results are 'as
> far as the DB knows'. The idea that some relations should be
> interpreted modally allows a model to make that notion explicit, and to do so
> selectively. Note that none of this modal treatment actually affects
> the relational model, just what happens when you start trying to do some
> inference with the facts as stated in relations.
>
> What I'm trying to get a handle on now is whether it is (a) correct,
> (b) useful, and (c) possible to incorporate into the model. For example,
> you might consider how 'known' and 'fact' relations behave under
> joins. Another consideration is, how many of these modal operators are
> needed? Possibly as many as there are different faces of NULL?
> Actually, before I get to (c), I'm not sure if it would be better to
> just leave it out of the RM altogether, and keep it in the inferencing
> part of the 'system'.
>
> Anyway, I've rambled on quite a bit. The ideas are pretty new to me,
> still in development, and really, I'm getting ahead of myself because I
> still don't fully understand the RM. It's nice to see that someone has
> at least had a similar idea, too.
>
> Does any of this make any sense to you? To anyone?
>
> Cheers,
> Joe
Received on Mon Jan 22 2007 - 17:26:22 CET
