Re: RM formalism supporting partial information

From: Jan Hidders <hidders_at_gmail.com>
Date: Sun, 2 Dec 2007 04:22:24 -0800 (PST)
Message-ID: <2b76729c-3f82-45d3-8ec8-fffa4f9cc037_at_a39g2000pre.googlegroups.com>


On 1 dec, 04:17, David BL <davi..._at_iinet.net.au> wrote:
> On Dec 1, 12:48 am, Jan Hidders <hidd..._at_gmail.com> wrote:
>
>
>
> > On 29 nov, 03:54, David BL <davi..._at_iinet.net.au> wrote:
>
> > > The concept of "possible answers" isn't universally applicable, and
> > > therefore seems to represent quite a problem for any model of partial
> > > information that emphasises that concept as fundamental.
>
> > The concept of 'possible answers' applies and is well defined for all
> > databases where you have precisely defined what it means if certain
> > data is missing, and note that this includes the definition that says
> > that it means nothing. So what you mean by "isn't universally
> > applicable" is completely beyond my comprehension.
>
> Consider the following predicates, all with OWA intensional
> definitions
>
> age(Person,Age)
> occupation(Person, Occupation)
> married(Person,Person)
> died(Person,Date)
>
> You say the concept of possible answers is well defined. How exactly
> would you calculate the possible 27 year old pilots?

You seem to assume that "well defined" and "can be computed" are the same, which they aren't. But to answer your question: assuming that everybody has only one occupation, that would be every person p for which there is no tuple (p, a) with a<>27 in relation age, and no tuple (p, o) with o<>"pilot" in relation occupation. If the domain of Person is infinite and not restricted by a relation person(Person), then the result may be infinite.

> What does it mean precisely?

It contains every person that might be a 27 year old pilot as far as the given database is concerned.
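
As a minimal sketch of that definition, assuming the domain of Person is bounded by a finite set (here called persons, a name not in the original discussion) and that each person has at most one age and one occupation, the possible and the certain 27 year old pilots could be computed as follows. The sample data is made up purely for illustration.

```python
def possible_27_year_old_pilots(persons, age, occupation):
    """Persons that might be 27 year old pilots as far as the data says.

    A person is excluded only if the database records an age other than 27
    or an occupation other than "pilot" for that person.
    """
    not_27 = {p for (p, a) in age if a != 27}
    not_pilot = {p for (p, o) in occupation if o != "pilot"}
    return {p for p in persons if p not in not_27 and p not in not_pilot}


def certain_27_year_old_pilots(age, occupation):
    """Persons for which both facts are actually recorded."""
    is_27 = {p for (p, a) in age if a == 27}
    is_pilot = {p for (p, o) in occupation if o == "pilot"}
    return is_27 & is_pilot


# Hypothetical sample data: nothing at all is recorded about "dee".
persons = {"ann", "bob", "cid", "dee"}
age = {("ann", 27), ("bob", 31), ("cid", 27)}
occupation = {("ann", "pilot"), ("cid", "nurse")}

print(sorted(possible_27_year_old_pilots(persons, age, occupation)))  # ['ann', 'dee']
print(sorted(certain_27_year_old_pilots(age, occupation)))            # ['ann']
```

Note that "dee" is a possible answer only because nothing is known about her, which is exactly why the result can be infinite when the Person domain is not restricted.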

> > > What do you think of the suggestion that the formalism (which is
> > > concerned with extensions rather than intensions)
>
> > > 1) ignores the CWA/OWA distinction;
>
> > > 2) assumes the CWA applies everywhere; and
>
> > > 3) null is *always* interpreted as non-existence w.r.t.
> > > the (carefully worded) intensional definitions?
>
> > > This approach seems simple and self consistent.
>
> > If I ignore for the moment 1) (because 1) and 2) seem contradictory
> > because I cannot assume there is no difference between X and Y and at
> > the same time assume that only Y applies everywhere) this is just the
> > classical value-does-not-apply interpretation.
>
> I meant that the actual CWA/OWA distinction is absorbed into the
> intensional definition, so that it can be assumed that with respect to
> the intensional definition the formalism assumes a CWA. I thought
> that was clear.

It was.

> > > It doesn't however, attempt to model the case of "value exists but is
> > > unknown". IMO that case should be modeled *explicitly* with a
> > > different predicate.
>
> > Sure, the value-does-not-apply interpretation can always also be
> > represented without null values.
>
> > The thing is that you have now fully ignored the real problem of
> > incomplete information which is that in practice the CWA does not
> > always fully apply. Your main solution seems to be to redefine the
> > meaning of the relations such that it does, which, of course, doesn't
> > solve anything at all and simply puts the problem back on the plate of
> > the user.
>
> You say "of course doesn't solve anything at all" without giving any
> hint at all why you say that. Can you elaborate?
>
> What problem doesn't it address? Can you provide a specific example?

Suppose you have a table R(a,b,c) with candidate key {a} where column c may contain null values that indicate that we don't know its value. You can now solve this by splitting R into R1(a,b) and R2(a,c) and thus remove the null values. It could be that R was, apart from the null values, complete, so that would mean that the CWA applies to R1, but not to R2. So it will be the case for some queries over R1 and R2 that when computed in the usual way they return the exact answer, some will return the possible answers, and some will return neither. Wouldn't it be nice if the DBMS could tell you which ones do what? Or if you could tell the DBMS that it shouldn't compute the query as given, but rather such that it returns the set of possible (or certain) answers for the given query, if it can?
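
To make this concrete, here is a small sketch with made-up data, assuming that the CWA holds for R1, that R2 simply lacks a tuple for every key whose c value is unknown, and that every key in R1 has exactly one (possibly unknown) c value. The function names are illustrative only. It shows one query that is exact, and two ways of evaluating "keys whose c is not 5" that yield the certain and the possible answers respectively.

```python
# R1(a, b): complete, so the CWA applies.
R1 = {(1, "x"), (2, "y"), (3, "z")}
# R2(a, c): the c value for key 3 exists but is unknown, so no tuple for it.
R2 = {(1, 5), (2, 7)}


def exact_b_equals(b_value):
    """Query over R1 only: the usual evaluation gives the exact answer."""
    return {a for (a, b) in R1 if b == b_value}


def certain_c_not_equal(value):
    """Selection on R2: only keys with a known c different from `value`."""
    return {a for (a, c) in R2 if c != value}


def possible_c_not_equal(value):
    """Difference against R1: every key not known to have c equal to `value`."""
    known_equal = {a for (a, c) in R2 if c == value}
    return {a for (a, _) in R1} - known_equal


print(exact_b_equals("y"))       # {2}
print(certain_c_not_equal(5))    # {2}
print(possible_c_not_equal(5))   # {2, 3}
```

Both formulations of the second query are equivalent when R2 is complete, but here they differ on key 3, and nothing in the usual evaluation tells the user which of the two kinds of answer they are getting.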

-- Jan Hidders
Received on Sun Dec 02 2007 - 13:22:24 CET
