Re: RM formalism supporting partial information

From: David Cressey <>
Date: Wed, 28 Nov 2007 14:05:52 GMT
Message-ID: <4Be3j.15661$281.1303_at_trndny06>

"paul c" <> wrote in message news:e_d3j.62082$cD.25240_at_pd7urf2no...
> Jan Hidders wrote:
> > On 28 nov, 01:58, David BL <> wrote:
> ...
> >> Consider a query to find all the 27 year old pilots from a census
> >> recorded in an RDB. If the age or occupation is missing we could
> >> think of the person as a possible answer.
> >
> > I believe there is a terminology problem here concerning the terms
> > "possible answers" and "certain answers". In the context of research
> > on incomplete databases (i.e. anywhere the classical CWA does not
> > apply fully) that usually means the following. Given a query and the
> > assumptions about "closedness" the set of all tuples with the right
> > header can be partitioned into three groups: the certain answers
> > (those that are certain to be in the result of the query on the
> > omniscient database), the possible answers (those that might be in the
> > aforementioned result) and the impossible answers (those that are
> > certain not to be in the aforementioned result).
> >
> > In that sense the tuple describing the person you mentioned above
> > (presuming it is projected on the non-null fields) is a certain
> > answer, not a possible answer.
> > ...


> When it comes to a public census I believe the possible answers or
> non-answers are planned for. As Bob B pointed out a "Don't Know"
> response or even a refusal is often considered a specific answer, ie.,
> some number of those is expected. Interesting that even statisticians
> who are more interested in probability than db theory do this. Seems
> quite different from the usual null examples. (I'm not touting census
> methods in general - I've seen outrageous cheating by census-takers,
> making up answers or even non-existent people in order to meet quota
> maximums for DK's/NA's.)

A database derived from census data might be about either of two different subject matters:

The first is the responses to the census questions. The second is the demographics the census purports to pin down.

If it's the first, a "Don't Know" is a specific answer, and should be recorded as such.
If it's the second, a "Don't Know" probably means that the database doesn't know.
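
To make the difference concrete, here is a rough sketch in Python, not a proposal for how an RDB should handle it. The names (Person, classify, demographic_view, DONT_KNOW) are made up for illustration, and the three-way labelling follows the informal "possible answer" reading used earlier in the thread rather than any formal definition.

```python
from dataclasses import dataclass
from typing import Optional

DONT_KNOW = "Don't Know"  # hypothetical marker for a recorded census response


@dataclass
class Person:
    name: str
    age: Optional[int]         # None: the database has no value for age
    occupation: Optional[str]  # None: the database has no value for occupation


def classify(p: Person, age: int = 27, occupation: str = "pilot") -> str:
    """Classify a person against the query 'age = 27 and occupation = pilot'."""
    # A known value that contradicts the query rules the person out.
    if p.age is not None and p.age != age:
        return "impossible"
    if p.occupation is not None and p.occupation != occupation:
        return "impossible"
    # Both values are known and both match.
    if p.age is not None and p.occupation is not None:
        return "certain"
    # At least one value is missing and nothing contradicts the query.
    return "possible"


def demographic_view(response: str) -> Optional[str]:
    """Second reading: a 'Don't Know' response means the database doesn't know."""
    return None if response == DONT_KNOW else response
```

Under the first reading, "Don't Know" is just another recorded value, so classify(Person("a", 27, DONT_KNOW)) rules the person out as a pilot. Under the second reading the same response is first mapped through demographic_view, the occupation becomes unknown, and the person comes back as "possible".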

BTW, in the case of the query about 27 year old pilots, a person with a missing age and a person with a missing occupation are clear enough. But what about a person who is missing from the database altogether? Is that not a possible answer?

Received on Wed Nov 28 2007 - 15:05:52 CET
