Re: Interpretation of Relations

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Mon, 22 Jan 2007 13:39:14 GMT
Message-ID: <683th.3966$1x.66908_at_ursa-nb00s0.nbnet.nb.ca>


Joe Thurbon wrote:
> On 2007-01-22 10:46:47 +1000, "JOG" <jog_at_cs.nott.ac.uk> said:
>

>>
>> This is extremely close to my line of research. As such this seems like
>> a good opportunity to dig out something very similar, that started my
>> line of thought in this direction. I had been attempting to look at the
>> consequences of different encoding strategies for stating NL

>
>
> NL is natural language?
>
>> sentences
>> as formal propositions, and the effects that the choices made have on
>> the issue of missing information within the resulting data model. In
>> the course of this I produced the following simple example of the
>> effects of  CWA and missing information that concerned me (I have
>> reworked the example to correspond to the OP).

>
>
> I think we're thinking along similar lines. I am still really new with
> the RM side, so I'm going to ask what might be pretty basic questions,
> and make what might be basic observations. Hopefully my understanding of
> the logic side might be able to pay you back (although I'm not really an
> expert there, just a lot more experienced than I am in the RM).
>
>>
>> * Consider the dual predicates Joes_hair(x) and Not_Joes_hair(x), and
>> an RM representation of them with a trivial domain:

>
>
>
>>
>> Domain D_Hair   = {Red}
>> Relation R_Joes_Hair =  <value: D_Hair>
>> Relation R_Not_Joes_Hair =  <value: D_Hair>
>>
>> * Constraints:
>> C1 = FORALL x R_Joes_hair(x) <-> ~R_Not_Joes_hair(x)
>> C2 = FORALL x R_Not_Joes_hair(x) <-> ~R_Joes_hair(x)

>
>
> Is it possible to have these sorts of constraints in the RM? I thought
> that there was no 'inferencing' that went on inside the model.
>
> What does ~R_Joes_hair(x) mean: just that x does not appear in the body
> of R_Joes_hair?
>
> And finally, is this the standard way of handling predicates where you
> want to assert 'negative facts?'
>
>>
>> * We obviously know from our encoding that:
>> R_Joes_hair(value : x) ->  Joes_hair(x)
>> R_Not_Joes_hair(value : x) ->  Not_Joes_hair(x)
>>
>> * Also, by the CWA we know that:
>> ~ R_Joes_hair(value : x) ->  ~Joes_hair(x)
>> ~ R_Not_Joes_hair(value : x) ->  ~Not_Joes_hair(x)

>
>
> Just to check that I'm with you here, Joes_hair is a logical predicate
> and R_Joes_hair is a relation?
>
> I had written down a couple of formulae somewhere which I think capture
> this notion. I'll put them at the bottom of this post, but yes, assuming
> that ~R_Joes_Hair(x) just means that x does not appear in R_Joes_Hair's
> body, I think this is the right way to interpret this logically.
>
>
>>
>> * Now,  if Joe's hair is red one should encode:
>> R_Joes_Hair = { (value:Red) }
>> R_Not_Joes_Hair = { }
>>
>> * Or if he does not have red hair one encodes:
>> R_Joes_Hair = { }
>> R_Not_Joes_Hair = { (value:Red) }
>>
>> * However, I don't know Joe, so this information is missing, and this
>> puts me in rather a spot. I cannot state R_Joes_Hair(Red) or
>> R_Not_Joes_Hair(Red) because:
>> Joes_Hair(Red) = UNKNOWN
>> Not_Joes_Hair(Red) = UNKNOWN
>>
>> * But worse still, if I do nothing at all and insert no tuples I have:
>> R_Joes_Hair = { }
>> R_Not_Joes_Hair = { }
>>
>> * By the CWA, from this we could infer:
>> ~Joes_hair(Red) ^  ~Not_Joes_Hair(Red)
>> => ~Joes_hair(Red) ^ Joes_Hair(Red)
>> => CONTRADICTION

>
>
> Yes. Right.
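
The contradiction can be checked mechanically. The following is only a minimal sketch, assuming each relation body is modelled as a Python set and the CWA as plain set membership; the helper names cwa and constraints_hold are hypothetical and not part of the original example.

```python
# Hypothetical encoding of the quoted example; all names are illustrative.
D_HAIR = {"Red"}

def cwa(body, x):
    """Closed world assumption: the predicate holds iff x is in the body."""
    return x in body

def constraints_hold(r_joes_hair, r_not_joes_hair):
    """C1/C2: for every x, exactly one of the two predicates holds."""
    return all(cwa(r_joes_hair, x) != cwa(r_not_joes_hair, x) for x in D_HAIR)

red_hair = constraints_hold({"Red"}, set())  # Joe's hair is red
not_red = constraints_hold(set(), {"Red"})   # Joe's hair is not red
unknown = constraints_hold(set(), set())     # no tuples inserted at all

print(red_hair, not_red, unknown)
```

Under these assumptions only the third case, where both bodies are empty, violates C1 and C2, which is the situation described above.
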
>
>>
>> This frustrated me somewhat when I first jotted it down, and even if it
>> is missing a trick, it has given me some useful insights into how the
>> issue might be addressed through a description of 'facts /about/ our
>> knowledge of the world' (as you put it) via a SOL formalization - I'm
>> not sure that modal logic is necessary in the db-algebra itself.

>
>
> I agree. I think that the last set of assertions made in your example
> (where you derive the contradiction using the CWA) is already outside
> the RM. And if it is, my intuition is that really the root of all the
> problems with missing information is using the CWA to induce negative
> information. It ends up with your logical interpretation being
> completely asymmetric. In particular, you need to have these 'special
> constraints' to infer a relationship between Hair_Colour and
> Not_Hair_Colour, whereas Hair_Colour and ~Hair_Colour are logical
> complements 'for free'.
>
> So, what are the alternatives?
>
> I think there are quite a few.
>
> For example, we could extend the notion of relation to include two
> bodies. (I don't think that this is the right way to go about it in
> general, but it's a starting point)
>
> For example,
>
> Domain D_People = {Joe}
> Domain D_Hair = {Red, Blond, Black}
>
> Relation R_Hair_Colour = <<D_People X D_Hair>: {{Joe, Blond}}: {{Joe, Red}}>
>
> would indicate that
>
> Joe's hair is blond (it is in the 'positive' body)
> Joe's hair is not red (it is in the 'negative' body)
> Whether Joe's hair is black or not is unknown.
>
> Now, I am aware that this approach, followed purely as stated, would be
> completely intractable in practice. For example, many domains are just
> too big to enumerate all the possibilities. But at least it gets rid of
> the CWA.
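
A minimal sketch of the two-body idea, assuming the positive and negative bodies are plain Python sets of (person, colour) tuples and that "unknown" is represented by None. The function name hair_colour is hypothetical.

```python
from typing import Optional

# Hypothetical two-body relation: a positive body and a negative body.
POSITIVE = {("Joe", "Blond")}
NEGATIVE = {("Joe", "Red")}

# A tuple must not be asserted and denied at the same time.
assert POSITIVE.isdisjoint(NEGATIVE)

def hair_colour(person: str, colour: str) -> Optional[bool]:
    """Return True, False, or None (unknown); no closed world assumption."""
    t = (person, colour)
    if t in POSITIVE:
        return True
    if t in NEGATIVE:
        return False
    return None  # in neither body: nothing is claimed either way

print(hair_colour("Joe", "Blond"), hair_colour("Joe", "Red"), hair_colour("Joe", "Black"))
```

Absence from both bodies yields None rather than False, which is where this sketch departs from the CWA. It also shows the cost: every negative fact has to be stored explicitly.
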
>
> Another strategy would be to slightly weaken the CWA in some
> circumstances. More below.
>
>> However I am a long way from being a logician and as such have a
>> healthy skepticism of the validity of absolutely any maths I generate,
>> so any critical analysis is /more/ than welcome.

>
>
> It all seemed to be right to me. A more general characterization might
> be like this:
>
> (The below is for unary relations over finite domains, because usenet is
> a plain text medium and the notation is difficult enough already, but
> the n-ary case is the natural extension):
>
> A relation R is defined over a Domain D = <d1, ..., dn>, with a Body B,
> containing 0 or more 1-ary tuples, each tuple containing one element of D.
>
> We then define a logical predicate L, also defined over D, whose truth
> value for each di in D is defined as
>
> L(di) iff "di is an element of B", where B is the body of R.
>
> The "iff" effectively closes the predicate over D, so the closed world
> assumption happens at the logic level, rather than the algebraic level.
>
> So far, this is identical to your example above. One 'workaround' to the
> missing information problem is:
>
> For concepts which are 'facts about the world' L(di) is just a logical
> assertion.
>
> For concepts which are 'facts about our knowledge about the world' I'd
> use modal logic (more precisely something like epistemic logic) and say
> that L(di) should actually be thought of as
>
> K(L(di))
>
> where K is the modal operator 'Known'.
>
> Why is this interesting (well, at least I think it is)?
>
> K(L(di)) -> L(di)
>
> That is, all the stuff that you "know" is true.
>
> but, and this is important
>
> ~K(L(di)) does not entail ~L(di)
>
> That is, just because you don't "know" something, it is not necessarily
> false.
>
> This gives a nice consistent interpretation of missing information,
> keeps the CWA around, and doesn't change the meaning of the 'positive
> examples' in the body of R. Of course, sometimes, you _do_ want to
> entail ~L(d) from d missing from R's body. In this case, you just choose
> your L's interpretation to be a standard logical one.
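
One way to make the modal reading concrete is a small possible-worlds sketch. This is only an illustration under a stated assumption: the candidate worlds are taken to be every subset of the domain that contains the body B, which is my simplification and not something the quoted text specifies. The names candidate_worlds, known and possible are hypothetical.

```python
from itertools import combinations

D = {"Red", "Blond", "Black"}
B = {"Blond"}  # body of R: what the database states

def candidate_worlds(domain, body):
    """Every subset of the domain that contains the body (illustrative assumption)."""
    rest = sorted(domain - body)
    return [body | set(extra)
            for n in range(len(rest) + 1)
            for extra in combinations(rest, n)]

WORLDS = candidate_worlds(D, B)

def known(d):
    """K(L(d)): L(d) holds in every candidate world."""
    return all(d in w for w in WORLDS)

def possible(d):
    """~K(~L(d)): L(d) holds in at least one candidate world."""
    return any(d in w for w in WORLDS)

# K(L(d)) -> L(d): whatever is known holds in every candidate world.
veridical = all(d in w for d in D if known(d) for w in WORLDS)

print(known("Blond"), known("Red"), possible("Red"), veridical)
```

In this sketch "Red" is not known, yet it is still possible, so the absence of a tuple does not license ~L(d). Reading the same body under the ordinary CWA would instead treat B as the only candidate world.
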
>
> A few posts ago, you made a closing remark that query results are 'as
> far as the DB knows'. The idea that some relations should be interpreted
> modally allows a model to make that notion explicit, and selectively so.
> Note that none of this modal treatment actually affects the relational
> model, just what happens when you start trying to do some inference with
> the facts as stated in relations.

The RM includes the operations on relations by which one does the inferencing. Does it not?

> What I'm trying to get a handle on now is whether it is (a) correct, (b)
> useful, and (c) can it be incorporated into the model. For example, you
> might consider how 'known' and 'fact' relations behave under joins.
> Another consideration is, how many of these modal operators are needed?
> Possibly as many as there are different faces of NULL? Actually, before
> I get to (c), I'm not sure if it would be better to just leave it out of
> the RM altogether, and keep it in the inferencing part of the 'system'.
>
> Anyway, I've rambled on quite a bit. The ideas are pretty new to me,
> still in development, and really, I'm getting ahead of myself because I
> still don't fully understand the RM.

Would the observation that the relational calculus is basically 1st order predicate logic help you understand it better?
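
As a rough illustration of that correspondence, here is a sketch that evaluates the same query twice: once with algebra-style operations (join, then project) and once as a calculus-style formula with an existential quantifier. The PARENT relation and its data are hypothetical, chosen only for the example.

```python
# Illustrative relation as a set of tuples; names and data are hypothetical.
PARENT = {("Ann", "Bob"), ("Bob", "Cy"), ("Bob", "Di")}

# Algebra: join PARENT with itself on the shared attribute, then project.
joined = {(a, b, c) for (a, b) in PARENT for (b2, c) in PARENT if b == b2}
algebra = {(a, c) for (a, _, c) in joined}

# Calculus: {(x, z) | EXISTS y: PARENT(x, y) AND PARENT(y, z)}
people = {p for t in PARENT for p in t}
calculus = {(x, z) for x in people for z in people
            if any((x, y) in PARENT and (y, z) in PARENT for y in people)}

print(algebra == calculus, sorted(algebra))
```

Join plays the role of conjunction and projection the role of existential quantification, so deriving the result relation amounts to inferring the new predicate from the stored ones.
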

> It's nice to see that someone has
> at least had a similar idea, too.
>
> Does any of this make any sense to you? To anyone?
>
> Cheers,
> Joe
Received on Mon Jan 22 2007 - 14:39:14 CET
