# Re: RM formalism supporting partial information

Date: Sun, 18 Nov 2007 12:34:49 -0800 (PST)

Message-ID: <1c493284-09ff-415c-9a6d-dbdddeb8ce36_at_p69g2000hsa.googlegroups.com>

On 18 nov, 15:28, "David Cressey" <cresse..._at_verizon.net> wrote:

> "Jan Hidders" <hidd..._at_gmail.com> wrote in message
> news:2d1cf135-63e5-4bf0-b293-191338a21e62_at_v4g2000hsf.googlegroups.com...
>
> > PS. Note that I am oversimplifying because I left out the headers,
> > which need to be taken into account to get a completely correct
> > definition of "information content" in this setting. The relation with
> > header {a,b} and body {(a=1, b=null)} does not necessarily have the
> > same information content as the one with header {a} and body {(a=1)}.
> > But considering that (1) it is not hard to see how to add that and (2)
> > it would make the definitions more complex without really adding to
> > the essential insight, I left them out anyway.
>
> Jan,
>
> I am sure you are right about this. What's not clear to me is whether the
> difference in "information content" between the two cases you give is
> relevant to the information content in the context of the subject matter,
> or whether it's a difference in information content that's relevant only in
> the context of the implementation we are looking at.

By "context of the subject matter" do you mean the particular interpretation of null values that was under discussion? Yes, I made this statement in that context, i.e., the does-not-apply interpretation of null values. However, under the value-unknown interpretation there is in general also a difference in information content between the two cases: in the first you are told that an associated 'b' value exists, although not what it is, while in the second you do not know whether one exists at all.

> I suspect that this quandary is exactly what's responsible for so much
> misinterpretation of NULLs in the world of application programming.

That's also my intuition. Programmers are rarely aware of the different possible interpretations and their consequences. For example, suppose you have a relation R(a,b,c) with candidate key {a}, where 'b' is not null but 'c' may be null. If you want to avoid the nulls you might want to split this into R1(a,b) and R2(a,c). However, if the interpretation of the null was value-unknown, then the closed-world assumption does not apply to R2, although it does hold for R1: a tuple missing from R2 does not tell you that the corresponding fact is false. This matters for what you think the relations mean.
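To make the split concrete, here is a minimal Python sketch. The sample tuples, the use of `None` for null, and the helper `holds_r2` are all made up for illustration. It only shows how the answer to "is this (a, c) fact true?" changes depending on whether the closed-world assumption is taken to hold for R2.

```python
# Hypothetical sample data for R(a, b, c); None stands in for a null 'c'.
R = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 2, "b": "y", "c": None},
]

# Decomposition that avoids nulls: R1 keeps every key,
# R2 keeps only the tuples whose 'c' value is actually present.
R1 = [{"a": t["a"], "b": t["b"]} for t in R]
R2 = [{"a": t["a"], "c": t["c"]} for t in R if t["c"] is not None]


def holds_r2(a, c, closed_world):
    """Return True, False, or None (unknown) for the fact (a, c) in R2.

    closed_world=True  models the does-not-apply reading: absence means false.
    closed_world=False models the value-unknown reading: absence means unknown.
    """
    if {"a": a, "c": c} in R2:
        return True
    return False if closed_world else None


print(R2)                                  # only the tuple for a=1 survives
print(holds_r2(2, 20, closed_world=True))  # False: not stored, so not true
print(holds_r2(2, 20, closed_world=False)) # None: not stored, so we cannot say
```

The stored data in R2 is identical in both readings; only the meaning assigned to a missing tuple differs, which is why the interpretation has to be known before the decomposition can be read correctly.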

- Jan Hidders