Re: RM formalism supporting partial information

From: David BL <davidbl_at_iinet.net.au>
Date: Fri, 16 Nov 2007 19:35:37 -0800 (PST)
Message-ID: <359f6993-66c7-4be6-8049-aeafda9a5afa_at_s36g2000prg.googlegroups.com>


On Nov 17, 12:30 am, Jan Hidders <hidd..._at_gmail.com> wrote:
> On 16 nov, 02:51, David BL <davi..._at_iinet.net.au> wrote:
> > On Nov 16, 8:31 am, Jan Hidders <hidd..._at_gmail.com> wrote:
> > > On 15 nov, 20:11, David BL <davi..._at_iinet.net.au> wrote:
>
> > > > I'm sure your time is precious and I don't want to be presumptuous,
> > > > but have you digested much of the document? Do you have any
> > > > particular comments on the operators, such as the information
> > > > comparison operator which gives a partial ordering and a concept of
> > > > information equivalence?
>
> > > It's actually not a partial order, but a preorder because it is not
> > > antisymmetric.
>
> > I didn't know partial order had that more specific meaning. Nor did I
> > know that the term preorder was available for what I wanted to say.
> > Thanks for pointing that out.
>
> The term pseudo order is also sometimes used.
>
> > > It's also a bit strange in that it says that the
> > > following relation bodies (for simplicity the tuples are unlabeled)
> > > all have the same information:
> > > - { }
> > > - { ({}, {}) }
> > > - { ({a}, {}) }
> > > - { ({}, {b}) }
>
> > I don't think that's correct.
>
> Indeed, my apologies, you are right. A case of "reading what I think
> it should say", I'm afraid. :-)
>
> > Intuitively the information content can be regarded as the set of all
> > conventional propositions that can be "read out" of the mv-relation,
> > for all possible projections.
>
> Aha! Now *that* makes more sense, and as far as I can tell that is
> roughly the "value does not apply" interpretation of null values. But
> I'm not really convinced that the equivalence relationship over
> relations that you derive from that really follows from it. I'd
> strongly advice you to read the following paper, "Database Relations
> with Null values" by Carlo Zaniolo, which starts more or less from the
> same principle, but works it out in a subtly different way:

I came across that paper about a year ago, but only read the initial discussion. I actually commented on it in a post to cdt on January 11, saying that it seemed like a good way to understand nulls, but no discussion took place. At the time I was still learning the fundamentals of RM/RA.

Around that time I thought of an interesting join operator that takes intersections on multi values instead of simply comparing single values. I was vaguely aware that it could deal with partial information.

Anyway, there is little doubt I have been inspired by Zaniolo's ideas, and it's been on my todo list to compare the two approaches. I was guessing that they may be equivalent in some sense.

>
> http://citeseer.ist.psu.edu/534211.html
>
> Let me give a brief summary of what he does to explain what my problem
> is. I'll give a formal but simplified version of what he does:
>
> - We postulate the sets A (attribute names) and D (domain values)
>
> - A *tuple* is a partial function from A to D, i.e., a subset t of AxD
> such that if (a1,d1) and (a1,d2) in t, then d1=2. An attribute name
> for which a tuple is not defined is informally assumed to represent a
> null value.
>
> - We define an ordering <= over tuples such that t1 <= t2 iff t1 is a
> subset of t2. Intuitively t1 <= t2 means that t2 contains more
> information.

I can see an equivalence to my definition of t2 --> t1 if we introduce the constraint that for every attribute a of mv-tuple t, |t(a)| <= 1

Let z-tuple stand for Zaniolo's definition of tuple. We can map an mvtuple  t to a z-tuple z(t) as follows

    z(t) = { (a,d) such that a in attribs(t) and d in t(a) }

I'm not bothering about the detail of defining sets A and D. Anyway z(t) is indeed a partial function if |t(a)| <= 1

My definition states t1 --> t2 if

    for all a in (attribs(t1) intersect attribs(t2)),

        t2(a) is a subset of t1(a)
    for all a in (attribs(t2) \ attribs(t1)), t2(a) = {}

Claim t1 --> t2 <=> z(t2) is a subset of z(t1) Proof

    (=>)
    let (a,d) in z(t2)
    then a in attribs(t2) and d in t2(a)     we show a in attribs(t1)

        suppose not
        so a in (attribs(t2) \ attribs(t1))
        so t2(a) = {}
        contradicts d in t2(a)

    so a in (attribs(t1) intersect attribs(t2))     so t2(a) is a subset of t1(a)
    so d in t1(a)
    so (a,d) in z(t1)

    (<=)
    if a in (attribs(t1) intersect attribs(t2))

        we show t2(a) is subset of t1(a)
            let d in t2(a)
            so (a,d) in z(t2)
            so (a,d) in z(t1)
            so d in t1(a)
    if a in (attribs(t2) \ attribs(t1))
        we show t2(a) = {}
            suppose not
            so exists d in t2(a)
            so (a,d) in z(t2)
            so (a,d) in z(t1)
            so a in attribs(t1)
            contradiction


> - Given a subset X of A the *restriction* of a tuple t to X, denoted
> as t[X], is defined as the partial function { (a,d) in t | a in X }

That's just like my projection function on mv-tuples.

>
> - The *domain* of tuple t, denoted as dom(t), is the subset of A for
> which t is defined

In my formalism that would be

    dom(t) = { a in attribs(t) | t(a) <> {} }

> - A *relation* is a set of tuples
>
> - We define a binary relation <=1 over relations such that R1 <=1 R2
> iff for every tuple t1 in R1 there is a tuple t2 in R2 such that t1 <=
> t2.

Interesting.

>
> - We say that two relations R1 and R2 are *1-equivalent*, denoted as
> R1 ~1 R2, if R1 <=1 R2 and R2 <=1 R1
> Do you see how this is similar but different from what you are doing?
> Let me try to make this link a bit more explicit.

> - Given a relation R and a subset X of A we define the *null-free
> projection* of R on X, denoted as NFP[X](R), such that NFP[X](R) =
> { t[X] | t in R, dom(t[X])=X }.

I think that is the counterpart to I(proj(X,r)) in my notation.

> - The *interpretation* of a relation R, denoted as int(R), is defined
> such that int(R) = { NFP[X](R) | X is a subset of A }.

In my formalism I could define the counterpart "total interpretation" which is the set of I(proj(X,r)).

> I can now define an ordering based on this in two ways:
>
> - We define a binary relation <=2 over relations such that R1 <=2 R2
> iff int(R1) is a subset of int(R2)

Without thinking about it too deeply, in my notation I suspect that would be the counterpart to defining

    r1 <=2 r2 if

        for all X subset of attribs(r1), I(proj(X,r1)) = I(proj,X,r2))

It doesn't seem particularly useful in a model of partial information. I can understand some relevance to CWA as you say below.

> - We define a binary relation <=3 over relations such that R1 <=3 R2
> iff for each relation R' in int(R1) there is a relation R" in int(R2)
> such that R' is a subset R".

Consider the following binary relation where A,B are sets

    A <<< B if

        for each subset X in A,
            there exists subset Y of B st
                X is a subset Y

Claim: A <<< B iff A subset of B
Proof:

    (=>)
    let X = A
    X is a subset of A (since A is subset of A)     A <<< B so there exists Y st

        Y subset of B   and
        X subset of Y

    so X is a subset of B (by transitivity)     so A is a subset of B (since X = A)

    (<=)
    let X be a subset of A
    put Y = X.
    then X is subset of Y (since X = Y)     and Y is subset of B (since X=Y and X is subset of A)

This makes me think R1 <=3 R2 is the counterpart to my definition

    r2 --> r1

        for all X subset of attribs(r1), I(proj(X,r1)) is subset of I(proj,X,r2))

> - As before we can use <=2 and <=3 to define the equivalence relations
> ~2 and ~3.

I have the feeling that ~2, ~3 are the same even though <=2, <=3 are not. Is that right?

> I'm running a bit out of time now, so I'm closing with the relevant
> questions:
> - What is the relationship between the three equivalence
> relationships?

I'm still thinking about that!

> - Which corresponds to yours, and which to Zaniolo's?

I'm trying to work out the difference between <=1 and <=3. Could you show me an example?

> - Which is more intuitive?

Hard to say. I suspect that has more to do with how the human mind works than some mathematical distinction!

> To give a hint about my opinion on the final question. If you assume
> the closed world assumption, and we usually do in this context, then
> it is not true that a smaller relation necessarily contains less
> information. The absence of tuples then also carries information.

Not sure what you are alluding to there. I don't yet fully appreciate the difference between these information comparison operators.

Also I think one could argue that the CWA is at odds with a model of partial information, unless you keep in mind the idea of "negation as failure to prove true".

I think the CWA is also particularly relevant to difference operators which bring in the idea of negation as failure. Did you notice that I defined two such operators in the word document? I haven't studied their properties yet.

Thanks by the way for the most interesting discussion I've had on cdt. Received on Sat Nov 17 2007 - 04:35:37 CET

Original text of this message