Re: Principle of Orthogonal Design

From: Jan Hidders <hidders_at_gmail.com>
Date: Tue, 22 Jan 2008 04:49:26 -0800 (PST)
Message-ID: <0712c556-9d28-4e87-95dd-48eff9219ab8_at_q39g2000hsf.googlegroups.com>


On 21 Jan, 23:10, mAsterdam <mAster..._at_vrijdag.org> wrote:
> Jan Hidders wrote:
>
> > mAsterdam wrote:

> >> Jan Hidders wrote:
> >>> ..., you can always trivially satisfy
> >>> it by renaming R(a,b) to R(r_a,r_b).
> >>> How exactly does that improve the design?
> >> Not. But just renaming the attributes does not change the
> >> tuple-type (which I earlier referred to as signature of
> >> the relation), so, while it indeed does not improve the
> >> design, it also has no bearing on satisfying PoOD as
> >> I understand it from EE.
>
> > As far as I can see their definition of both tuple type and tuple
> > includes the attribute names, so then it matters.
>
> Yes, depending on 'it matters', because, as you said,
> compliance by renaming becomes trivial.
>
> >> I could not find your definition, BTW.
>
> > That's probably because I didn't explicitly state it as a definition.
> > Apologies for making you go through the thread again. Here it is
> > again:
>
> >>>>> What they probably should have said is: R and S are said to have
> >>>>> overlapping meaning if it does not follow from the constraints /
> >>>>> dependencies that the intersection of R and S is always empty.
>
> Thank you.
>
>
>
> > For the record. I think this definition is better than Erki Eesaar's,
> > but as I will argue later on I also think that it is still too crude
> > and that there is a better definition possible.
>
> >> It is clear from JOG's OP that renaming the attribute doesn't
> >> PoOdify the design - which made Darwen reject it.
>
> >>  From JOG's OP:
>
> >> OP> Darwen rejected the original POOD paper outright given that
> >> OP> McGovern posits that:
> >> OP>
> >> OP> R1 { X INTEGER, Y INTEGER }
> >> OP> R2 { A INTEGER, B INTEGER }
> >> OP>
> >> OP> violates the principle, whatever the relations' attribute names.
>
> > Interesting. I'm missing the context here, so I'm not sure about their
> > positions, but I suspect that to some extent both are right. Darwen is
> > right that McGovern's definition is probably too strict, but McGovern
> > is right that the attribute names shouldn't matter. I'll come back to
> > this later, because I think there is a solution that might be
> > acceptable for both. Well, acceptable for me, anyway. :-)
>
> >>>>> What they probably should have said is: R and S are said to have
> >>>>> overlapping meaning if it does not follow from the constraints /
> >>>>> dependencies that the intersection of R and S is always empty.
> >>>> Maybe, but it would not take away my objection.
> >>>> As long as R\S and S\R are allowed to be non-empty,
> >>>> R and S are independent, regardless of their heading.
> >>> In that case you still might have redundancy.
> >> Redundancy in what sense?
>
> > The same fact being represented in more than one place. Note the
> > "might have" in the sentence. It could very well be that there is in
> > fact no such redundancy, even if it does have overlapping meaning
> > according to my definition. In that sense my definition of overlapping
> > meaning is still too crude because it is a sufficient condition but
> > not a necessary condition. This gets even worse if we start ignoring
> > the attribute names.
>
> > Can this be solved with a more refined notion of "overlapping meaning"
> > that still is defined in terms of dependencies? I think it can. For
> > that I need a new notion for a certain restricted class of
> > dependencies: a qualified inclusion dependency. An example of such a
> > dependency is a constraint like "if R(a=x,b=y) and x > 5 then
> > S(c=x,d=y)". If such a constraint holds then certain facts represented
> > in R will also be represented in S, and so there will very probably be
> > redundancy and update anomalies.
>
> The constraint only applies to a subset of R.
> In other words, R is a mix of constrained and unconstrained data.
> We are talking a /design/ principle here, so my question
> would be: how did these different sets end up disguised
> as one in the first place?

Very good point. But note that just horizontally splitting R turns the qualified inclusion dependency into a normal inclusion dependency, which is a good sign, but does not remove the redundancy.
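
To make that concrete, here is a small Python sketch. It is only an illustration under my own assumptions: relations are modelled as plain sets of tuples, the sample data is invented, the function names are hypothetical, and the threshold 5 is taken from the example constraint quoted above.

```python
# Relations as sets of tuples: R has attributes (a, b), S has attributes (c, d).
# The sample data is invented purely for illustration.
R = {(3, "x"), (7, "y"), (9, "z")}
S = {(7, "y"), (9, "z"), (11, "w")}


def qualified_inclusion_holds(r, s, threshold=5):
    """Check the constraint: if R(a=x, b=y) and x > threshold then S(c=x, d=y)."""
    return all((x, y) in s for (x, y) in r if x > threshold)


def horizontal_split(r, threshold=5):
    """Split R into the tuples the constraint talks about and the rest."""
    r_high = {(x, y) for (x, y) in r if x > threshold}
    r_low = r - r_high
    return r_high, r_low


r_high, r_low = horizontal_split(R)

# After the split the qualified dependency is a plain inclusion dependency:
# every tuple of r_high must appear in S (attributes matched by position).
plain_inclusion_holds = r_high <= S

# The facts that were stored twice before the split are still stored twice.
duplicated_facts = r_high & S

print(qualified_inclusion_holds(R, S), plain_inclusion_holds, duplicated_facts)
```

The split changes the shape of the constraint, not the amount of duplicated data: every tuple of r_high is still also present in S.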

> But indeed, redundancy.
>
> > In general a qualified inclusion dependency is of the form "if Q1 then
> > Q2" where Q1 is a conjunction of atoms and simple equations, Q2 is a
> > single atom, and there is at least one atom in Q1 with the same free
> > variables as Q2. Such an inclusion dependency is said to hold between
> > R and S if at least one atom in Q1 that has the same free variables as
> > Q2 concerns R and the atom in Q2 concerns S.
> > Our new definition of "overlapping meaning" might then be as follows:
> > R and S are said to have overlapping meaning if there is a qualified
> > dependency between R and S or between S and R that does not follow
> > from the dependencies at relation level.
>
> An inclusion dependency on a proper subset would be
> a tell, right?

Close, but not exactly. *Every* inclusion dependency in some sense causes redundancy, but if the set of attributes in which it ends is also a component of a non-trivial join dependency, then you can remove this redundancy. That is why the rule should say that there is a problem only in that case.

> > As you can see it also deals with the case where R and S have
> > differently named attributes, and at the same time arguably does not
> > see overlapping meaning where there actually is none, so it is more
> > refined than my preceding definition. Darwen and McGovern could
> > perhaps both be happy with this. :-)
>
> > What would really establish its correctness is a theorem that says
> > that it characterizes exactly whether there is a certain type of redundancy
> > (defined for example a la Libkin) at schema level or not. Much like we
> > also know is the case for 5NF. I think that's possible, although we
> > should probably take all full, single-head dependencies into account
> > for that and redefine the normal form accordingly, but unfortunately I
> > don't have time right now.
>
> >>> If you decompose into the following three,
> >>> R' = R\S,
> >>> RS' = R intersect S,
> >> RS' =  R ⋂ S     (long live Unicode!)
>
> > Mmm. I'm not sure if all clients support this (although Google groups
> > does), or even if all relay servers relay it well. AFAIK the usenet
> > standards do not require that all 8 bits of every byte in a message
> > are relayed.
>
> Last time I checked, with レ (re) and ル (ru),
> there were no complaints.
>
> ID: R(a) レ S(b)  Japanese re, for 'references'
>                    (mnemonic: check)
> FK: R(a) ル S(b)  Japanese ru, for 'references unique'
>                    (mnemonic: check one)

:-) FWIW, that works for me also.

  • Jan Hidders
Received on Tue Jan 22 2008 - 13:49:26 CET
