Re: Principle of Orthogonal Design

From: mAsterdam <mAsterdam_at_vrijdag.org>
Date: Mon, 21 Jan 2008 23:10:11 +0100
Message-ID: <4795176a$0$85794$e4fe514c_at_news.xs4all.nl>


Jan Hidders schreef:
> mAsterdam wrote:

>> Jan Hidders wrote:
>>> ..., you can always trivially satisfy
>>> it by renaming R(a,b) to R(r_a,r_b).
>>> How exactly does that improve the design?
>> Not. But just renaming the attributes does not change the
>> tuple-type (which I earlier referred to as signature of
>> the relation), so, while it indeed does not improve the
>> design, it also has no bearing on satisfying PoOD as
>> I understand it from EE.

>
> As far as I can see their definition of both tuple type and tuple
> includes the attribute names, so then it matters.

Yes, depending on 'it matters', because , as you said, compliance by renaming becomes trivial.

>> I could not find your definition, BTW.

>
> That's probably because I didn't explicitly state it as a definition.
> Apologies for making you go through the thread again. Here it is
> again:
>
>>>>> What they probably should have said is: R and S are said to have
>>>>> overlapping meaning if it does not follow from the constraints /
>>>>> dependencies that the intersection of R and S is always empty.

Thank you.

> For the record. I think this definition is better than Erki Eesaar's,
> but as I will argue later on I also think that it is still too crude
> and that there is a better definition possible.
>

>> It is clear from JOG's OP that renaming the attribute doesn't
>> PoOdify the design - which made Darwen reject it.
>>
>>  From JOG's OP:
>>
>> OP> Darwen rejected the original POOD paper outright given that
>> OP> McGovern posits that:
>> OP>
>> OP> R1 { X INTEGER, Y INTEGER }
>> OP> R2 { A INTEGER, B INTEGER }
>> OP>
>> OP> violates the principle, whatever the relations' attribute names.

>
> Interesting. I'm missing the context here, so I'm not sure about their
> positions, but I suspect that to some extent both are right. Darwen is
> right that McGoverns' definition is probably too strict, but McGovern
> is right that the attribute names shouldn't matter. I'll come back to
> this later, because I think there is a solution that might be
> acceptable for both. Well, acceptable for me, anyway. :-)
>
>>>>> What they probably should have said is: R and S are said to have
>>>>> overlapping meaning if it does not follow from the constraints /
>>>>> dependencies that the intersection of R and S is always empty.
>>>> Maybe, but it would not take away my objection.
>>>> As long as R\S and S\R are allowed to be non-empty,
>>>> R and S are independent, regardless of their heading.
>>> In that case you still might have redundancy.
>> Redundancy in what sense?

>
> The same fact being represented in more than one place. Note the
> "might have" in the sentence. It could very well be that there is in
> fact no such redundancy, even if it does have overlapping meaning
> according to my definition. In that sense my definition of overlapping
> meaning is still too crude because it is a sufficient condition but
> not a necessary condition. This gets even worse if we start ignoring
> the attribute names.
>
> Can this be solved with a more refined notion of "overlapping meaning"
> that still is defined in terms of dependencies? I think it can. For
> that I need a new notion for a certain restricted class of
> dependencies: a qualified inclusion dependency. An example of such a
> dependency is a constraint like "if R(a=x,b=y) and x > 5 then
> S(c=x,d=y)". If such a constraint holds then certain facts represented
> in R will also be represented in S, and so there will very probably be
> redundancy and update anomalies.

The constraint only applies to a subset of R I.o.w. R is a mix of constrained and unconstrained data. We are talking a /design/ principle here, so my question would be: how did these different sets end up disguised as one in the first place?

But indeed, redundancy.

> In general a qualified inclusion dependency is of the form "if Q1 then
> Q2" where Q1 is a conjunction of atoms and simple equations, Q2 is a
> single atom, and there is at least one atom in Q1 with the same free
> variables as Q2. Such an inclusion dependency is said to hold between
> R and S if at least one atom in Q1 that has the same free variables as
> Q2 concerns R and the atom in Q2 concerns S.

> Our new definition of "overlapping meaning" might then be as follows:
> R and S are said to have overlapping meaning if there is a qualified
> dependency between R and S or between S and R that does not follow
> from the dependencies at relation level.

An inclusion dependency on a proper subset would be a tell, right?

> As you can see it also deals with the case where R and S have
> differently named attributes, and at the same time arguably does not
> see overlapping meaning where there actually is none, so it is more
> refined then my preceding definition. Darwen and McGovern could
> perhaps both be happy with this. :-)
>
> What would really establish its correctness is a theorem that says
> that it characterizes exactly if there is a certain type of redundancy
> (defined for example a la Libkin) at schema level or not. Much like we
> also know is the case for 5NF. I think that's possible, although we
> should probably take all full, single-head dependencies into account
> for that and redefine the normal form accordingly, but unfortunately I
> don't have time right now.
>

>>> If you decompose in to the following three,
>>> R' = R/S,
>>> RS' = R intersect S,
>> RS' =  R ⋂ S     (long live Unicode!)

>
> Mmm. I'm not sure if all clients support this (although Google groups
> does), or even if all relay servers relay it well. AFAIK the usenet
> standards do not require that all 8 bits of every byte in a message
> are relayed.

Last time I checked, with レ (re) and ル (ru), there were no complaints.

ID: R(a) レ S(b) japanese re, for 'references'

                   (mnemonic: check)
FK: R(a) ル S(b)  japanese ru, for 'references unique'
                   (mnemonic: check one)

>>> S'  = S/R,
>>> then you have removed that redundancy.
>> I can see that it is a lossless decomposition.
>> What I don't see is which update anomaly is prevented by it.
>> If I apply it to less abstract relations (say Brians example),
>> I find myself having much more difficulty to come up with
>> sensible predicates for R' , S' and especially for RS'.
>> This makes me suspicious.

>
> Yes, me too. But I strongly suspect my new definition doesn't have
> that problem. It's hard for me to see how there could be a qualified
> inclusion dependency without some corresponding real overlap in
> meaning.

I tried to pick nit, but I could not find objections. Received on Mon Jan 21 2008 - 23:10:11 CET

Original text of this message