Re: RM formalism supporting partial information

From: Jan Hidders <hidders_at_gmail.com>
Date: Wed, 28 Nov 2007 07:23:57 -0800 (PST)
Message-ID: <39e95632-e884-4371-b6fa-b938e5d66b13_at_w40g2000hsb.googlegroups.com>


On 28 nov, 15:05, "David Cressey" <cresse..._at_verizon.net> wrote:
> "paul c" <toledobythe..._at_ooyah.ac> wrote in message
>
> news:e_d3j.62082$cD.25240_at_pd7urf2no...
>
>
>
> > Jan Hidders wrote:
> > > On 28 nov, 01:58, David BL <davi..._at_iinet.net.au> wrote:
> > ...
> > >> Consider a query to find all the 27 year old pilots from a census
> > >> recorded in an RDB. If the age or occupation is missing we could
> > >> think of the person as a possible answer.
>
> > > I believe there is a terminology problem here concerning the terms
> > > "possible answers" and "certain answers". In the context of research
> > > on incomplete databases (i.e. anywhere the classical CWA does not
> > > apply fully) that usually means the following. Given a query and the
> > > assumptions about "closedness" the set of all tuples with the right
> > > header can be partitioned into three groups: the certain answers
> > > (those that are certain to be in the result of the query on the
> > > omniscient database), the possible answers (those that might be in the
> > > aforementioned result) and the impossible answers (those that are
> > > certain not to be in the aforementioned result).
>
> > > In that sense the tuple describing the person you mentioned above
> > > (presuming it is projected on the non-null fields) is a certain
> > > answer, not a possible answer.
> > > ...
>
> > When it comes to a public census I believe the possible answers or
> > non-answers are planned for. As Bob B pointed out a "Don't Know"
> > response or even a refusal is often considered a specific answer, ie.,
> > some number of those is expected. Interesting that even statisticians
> > who are more interested in probability than db theory do this. Seems
> > quite different from the usual null examples. (I'm not touting census
> > methods in general - I've seen outrageous cheating by census-takers,
> > making up answers or even non-existent people in order to meet quota
> > maximums for DK's/NA's.)
>
> A database derived from census data might be about two different subject
> matters:
>
> The first is the responses to the census questions.
> The second is the demographics the census purports to pin down.
>
> If it's the first, a "Don't Know" is a specific answer, and should be
> recorded as such.
> If it's the second, a "Don't Know" probably means that the database
> doesn't know.

Exactly. And which interpretation applies determines which CWA should be assumed.

> BTW, in the case of the query about 27 year old pilots, a person with a
> missing age and a person with a missing occupation are clear enough. But
> what about a person who is missing from the database altogether? Is that
> not a possible answer?

Yes, it can be.
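
To make the three-way partition quoted above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the rows, the names, the tiny finite domains for missing values, and the helper functions are all hypothetical. It also assumes the database is closed with respect to which persons exist and open only with respect to missing attribute values, so it does not model a person who is absent from the database altogether. A row is classified by checking the query predicate against every way of filling in its missing values.

```python
from itertools import product

# Hypothetical census rows; None marks a missing value.
ROWS = [
    {"name": "Ann", "age": 27, "occupation": "pilot"},
    {"name": "Bob", "age": None, "occupation": "pilot"},
    {"name": "Cy", "age": 27, "occupation": None},
    {"name": "Dee", "age": 40, "occupation": None},
]

# Assumed finite domains for missing values (kept tiny for illustration).
DOMAINS = {"age": [27, 40], "occupation": ["pilot", "clerk"]}


def completions(row):
    """Yield every fully specified version of a row."""
    missing = [key for key, value in row.items() if value is None]
    for values in product(*(DOMAINS[key] for key in missing)):
        yield {**row, **dict(zip(missing, values))}


def is_answer(row):
    """The query predicate: 27 year old pilots."""
    return row["age"] == 27 and row["occupation"] == "pilot"


def classify(row):
    """Certain: holds in every completion; impossible: in none; else possible."""
    outcomes = {is_answer(c) for c in completions(row)}
    if outcomes == {True}:
        return "certain"
    if outcomes == {False}:
        return "impossible"
    return "possible"


RESULT = {row["name"]: classify(row) for row in ROWS}
```

Under these assumptions a row with a missing age or occupation only counts as "possible" while no known field already contradicts the query; a known age of 40 makes the row "impossible" regardless of the missing occupation.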

  • Jan Hidders
Received on Wed Nov 28 2007 - 16:23:57 CET
