Re: RM formalism supporting partial information

From: Jan Hidders <hidders_at_gmail.com>
Date: Wed, 28 Nov 2007 03:03:00 -0800 (PST)
Message-ID: <9a3b57d2-17fd-498e-a528-0c04e3aaef84_at_y20g2000hsy.googlegroups.com>


On 28 nov, 01:58, David BL <davi..._at_iinet.net.au> wrote:
> On Nov 27, 9:43 pm, Jan Hidders <hidd..._at_gmail.com> wrote:
>
>
>
> > On 26 nov, 15:06, David BL <davi..._at_iinet.net.au> wrote:
> > > On Nov 26, 7:47 pm, Jan Hidders <hidd..._at_gmail.com> wrote:
> > > > On 26 nov, 08:52, David BL <davi..._at_iinet.net.au> wrote:
>
> > > > > Firstly a minor nit pick: you can't say "possible answers", because
> > > > > they don't actually represent an upper bound on the result in the
> > > > > omniscient database.
>
> > > > ?? They do so by definition.
>
> > > What I meant was that unless CWA is available on an appropriate
> > > projection there may be so much missing information (eg all
> > > information about an entity) that the query purported to return the
> > > "possible answers" does no such thing. ie it suffers a similar
> > > problem to negation (it returns neither the certain nor the possible
> > > answers).
>
> > I'm not sure what you mean by "the query purported to return the
> > 'possible answers'". If the user formulates a query then this will now
> > include an indication of whether he or she wants the possible/certain
> > answers. It is up to the DBMS to efficiently compute the answer, and
> > this is not necessarily done by the usual translation of calculus to
> > algebra or even one very similar to it.
>
> Consider a query to find all the 27 year old pilots from a census
> recorded in an RDB. If the age or occupation is missing we could
> think of the person as a possible answer.

I believe there is a terminology problem here concerning the terms "possible answers" and "certain answers". In the context of research on incomplete databases (i.e. anywhere the classical CWA does not fully apply) these terms usually mean the following. Given a query and the assumptions about "closedness", the set of all tuples with the right header can be partitioned into three groups: the certain answers (those that are certain to be in the result of the query on the omniscient database), the possible answers (those that might be in the aforementioned result) and the impossible answers (those that are certain not to be in the aforementioned result).

In that sense the tuple describing the person you mentioned above (presuming it is projected on the non-null fields) is a certain answer, not a possible answer.
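
A minimal sketch of the three-way partition described above, assuming the incomplete database can be represented as a small, explicitly enumerated set of "possible worlds" (complete databases consistent with what is known). The names classify, make_worlds and pilots_aged_27, the toy census data, and the assumption that the unknown age is either 26 or 27 are all hypothetical and only serve as illustration; a real DBMS would not enumerate worlds like this.

```python
def make_worlds():
    """Hypothetical toy census: Bob's age is unknown.

    Assumption for illustration only: his age is either 26 or 27, so there
    are exactly two complete databases consistent with what is known.
    """
    known = [("alice", 27, "pilot"), ("carol", 27, "teacher")]
    return [frozenset(known + [("bob", age, "pilot")]) for age in (26, 27)]


def pilots_aged_27(world):
    """The query: names of all 27-year-old pilots, as 1-tuples."""
    return {(name,) for (name, age, job) in world if age == 27 and job == "pilot"}


def classify(candidates, worlds, query):
    """Partition candidate tuples into certain / possible / impossible answers.

    certain:    in the query result in every possible world
    possible:   in the result in some worlds but not in all of them
    impossible: in the result in no possible world
    """
    results = [set(query(w)) for w in worlds]
    in_all = set.intersection(*results)
    in_some = set.union(*results)
    certain = {t for t in candidates if t in in_all}
    possible = {t for t in candidates if t in in_some and t not in in_all}
    impossible = {t for t in candidates if t not in in_some}
    return certain, possible, impossible


worlds = make_worlds()
candidates = {("alice",), ("bob",), ("carol",)}
certain, possible, impossible = classify(candidates, worlds, pilots_aged_27)
print("certain:", certain)
print("possible:", possible)
print("impossible:", impossible)
```

In this sketch the "closedness" assumptions are encoded entirely by which worlds are listed: adding worlds that contain people who did not take part in the census would enlarge the set of possible answers.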

> However we cannot say the
> query returns all possible answers unless we assume every person took
> part in the census.

Of course. If the set of possible answers is non-empty then the database cannot give you an exact answer. But it might do the next best thing, which is to give you the set of certain answers and indicate that this might not be the exact answer. If that is still too hard for the DBMS, it might give you a proper subset of the certain answers, namely those it can derive given its limited inference mechanism, and indicate that it is doing so.

I'm not sure how that would be relevant for the discussion we were having, but I hope this clears up the misunderstanding.

Cheers,

  • Jan Hidders
Received on Wed Nov 28 2007 - 12:03:00 CET
