Re: RM formalism supporting partial information

From: Jan Hidders <hidders_at_gmail.com>
Date: Mon, 19 Nov 2007 02:11:27 -0800 (PST)
Message-ID: <79f96dd3-616e-410e-870c-4c80a5bd49fe_at_f13g2000hsa.googlegroups.com>


On 19 nov, 05:20, David BL <davi..._at_iinet.net.au> wrote:
> On Nov 17, 7:54 pm, Jan Hidders <hidd..._at_gmail.com> wrote:
>
>
>
> > On 17 nov, 04:35, David BL <davi..._at_iinet.net.au> wrote:
> > > Also I think one could argue that the CWA is at odds with a model of
> > > partial information, unless you keep in mind the idea of "negation as
> > > failure to prove true".
>
> > Here we touch the core of why I think your approach is problematic.
> > The CWA does apply for the does-not-apply interpretation of null
> > values. It does however not apply in its classical form for the value-
> > unknown interpretation. Actually Raymond Reiter himself (he introduced
> > the CWA) explains very well how it then changes in a paper with the
> > title "A sound and sometimes complete query evaluation algorithm for
> > relational databases with null values".
>
> > What you seem to be doing is mixing the two interpretations, resulting
> > in something that IMO doesn't really seem to have any consistently
> > meaningful interpretation at all. If you really want to combine the
> > two interpretations you need to first define the meaning of a relation
> > with null values in terms of sets of "possible worlds" where the null
> > values are removed by either giving a concrete value for them or
> > declaring them not applicable. That might actually be quite
> > interesting, and I don't remember seeing a paper that did this
> > properly. Even Zaniolo doesn't really get this right, although he
> > claims that he combines the two approaches. So you are in very good
> > company. :-)
>
> The two main interpretations of null are apparently
>
> 1. value exists but is unknown
> 2. value doesn't exist
>
> Zaniolo combines these into a single interpretation
>
> 3. no information
>
> I think you're saying 3 is a mixture of 1,2 and doesn't lead to a
> consistent interpretation.

It can lead to a consistent interpretation, but I think you and Zaniolo don't do it in a consistent way.

> IMO the problem is actually the reverse.
> It is easy to interpret "no information" (it simply means that the
> predicate instantiation isn't available for logical deduction),
> whereas interpretations 1,2 quickly take us outside the realm of the
> RM/RA. I say that because it seems clear that the RM/RA is only
> concerned with a strict subset of the FOPL involved with logical
> deduction from sets of fully instantiated predicates.

Wow. There's so much I disagree with here that I'm not sure where to begin. To start with a detail: the RM/RA naming convention suggests to me a deep misunderstanding. The algebra is not an integral part of the data model. If you use another query language, it is still the relational model. If anything, FOL and the related calculi are more fundamental for understanding the meaning of the data.

Next, you claim that your interpretation is simple, but your description of it is clearly not complete. A complete description tells you which FO sentences are true, false or neither. See Reiter's paper on null values for how this is done properly for the value-unknown interpretation. Since you are combining it with another interpretation, the result will be at least as complex as his. To begin with, you will need to extend the language of FOL with extra atoms that allow places to be absent. So next to the classical atoms such as P(x, y, z) we also have P(x, y, _) (for simplicity let's take the unlabeled perspective), which says that the third value may be undefined. So tell me: given a language of formulas with the usual logical connectives and quantifiers over such atoms, which formulas are true, false and unknown given a certain relation with null values? Again, see Reiter for how this is done. Oh, and try to keep it simple. :-P
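
To make the question concrete, here is a minimal brute-force sketch of the "possible worlds" reading described in the quoted text above: every null is replaced either by a concrete value or by a not-applicable marker, and a query counts as true if it holds in all resulting worlds, false if it holds in none, and unknown otherwise. This is not Reiter's algorithm. The names `NULL`, `NA`, `possible_worlds` and `evaluate` are hypothetical, the domain is assumed to be finite, and each null is treated as independent of the others.

```python
from itertools import product

NULL = None   # assumed marker for a null value in the stored relation
NA = "n/a"    # assumed marker for "declared not applicable"


def possible_worlds(relation, domain):
    """Yield every completion of `relation`: each null becomes either a
    concrete value from the (finite) `domain` or the NA marker."""
    rows = [list(row) for row in relation]
    null_positions = [(i, j)
                      for i, row in enumerate(rows)
                      for j, value in enumerate(row)
                      if value is NULL]
    choices = list(domain) + [NA]
    for fill in product(choices, repeat=len(null_positions)):
        world = [row[:] for row in rows]
        for (i, j), value in zip(null_positions, fill):
            world[i][j] = value
        yield frozenset(tuple(row) for row in world)


def evaluate(query, relation, domain):
    """Classify a boolean query as 'true', 'false' or 'unknown'.

    Within each world the relation is complete, so the query is evaluated
    there in the classical closed-world way."""
    outcomes = {bool(query(world)) for world in possible_worlds(relation, domain)}
    if outcomes == {True}:
        return "true"
    if outcomes == {False}:
        return "false"
    return "unknown"


# One tuple P(a, b, _) whose third place is null; the domain is an assumption.
P = [("a", "b", NULL)]
domain = ["c", "d"]

first_is_a = evaluate(lambda w: any(t[0] == "a" for t in w), P, domain)
third_is_c = evaluate(lambda w: any(t[2] == "c" for t in w), P, domain)
first_is_z = evaluate(lambda w: any(t[0] == "z" for t in w), P, domain)
```

Under these assumptions the first query holds in every world, the second in only some of them, and the third in none. The enumeration grows exponentially with the number of nulls, so it only illustrates the semantics and says nothing about how simple a practical evaluation procedure can be.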

Finally, you say that it seems clear that the RM/RA is only concerned with a strict subset of the FOPL involved with logical deduction from sets of fully instantiated predicates. To the extent that this is true, you have already left that safe and well-understood realm the moment you allowed the value-unknown interpretation. Adding the not-applicable interpretation makes the situation worse, not better.

PS. A busy week is coming up, so it might be next weekend before I reply again.

  • Jan Hidders
Received on Mon Nov 19 2007 - 11:11:27 CET
