Re: Principle of Orthogonal Design

From: JOG <jog_at_cs.nott.ac.uk>
Date: Sun, 20 Jan 2008 13:35:04 -0800 (PST)
Message-ID: <c324c654-1373-43d2-adc7-da249a0d7400_at_i72g2000hsd.googlegroups.com>


On Jan 20, 12:40 pm, Jan Hidders <hidd..._at_gmail.com> wrote:
> On 19 jan, 16:55, mAsterdam <mAster..._at_vrijdag.org> wrote:
>
>
>
> > Jan Hidders wrote:
> > > mAsterdam wrote:
> > >> Jan Hidders wrote:
> > >>> mAsterdam wrote:
> > >>>> Jan Hidders wrote:
> > >> ...
> > >>>>> Anyway, the stronger POOD that requires that headers are distinct
> > >>>>> sounds like nonsense to me. Why would R(A, B) be a worse design than
> > >>>>> R(R_A, R_B)?
> > >>>>> The weaker POOD looks more interesting to me. I even found a published
> > >>>>> paper about it:
> > >>>>>http://www.wseas.us/e-library/conferences/2006madrid/papers/512-451.pdf
> > >>>> Unfortunately the link times out.
> > >>> Hmm, not for me. But to help you out:
> > >>>http://www.adrem.ua.ac.be/bibrem/pubs/pood.pdf
> > >> Thank you for helping me out by providing the document.
>
> > >> I started reading "Extended Principle of Orthogonal Design"
> > >> by Erki Eessaar
>
> > below: EE
>
> > >> User defined datatypes (UDT) give the designer of a database
> > >> more decisions, more room for wrong decisions.
> > >> Row-, array-, reference-, multiset collection types
> > >> - choices, choices, choices.
> > >> How to deal with that freedom?
> > >> Enter the Principle of Orthogonal Design (POD):
> > >> If you have a tuple-type, make sure to have only one base
> > >> relvar for recording that tuple-type.
>
> > > Hmm, I would call that the strong POOD (or strong POD), and that, I
> > > would agree, seems to make no sense.
>
> > We agree on that, that is clear.
> > So ok, but - from the huge category of easily asked but
> > hard to answer questions: - why?
>
> > Why doesn't the (strong) PoOD make sense to you?
>
> It is at the same time too strong and too weak. It is too strong
> because it forbids cases where there is in fact no redundancy, and at
> the same time allows cases where there is redundancy. Moreover, you
> can always trivially satisfy it by renaming R(a,b) to R(r_a,r_b). How
> exactly does that improve the design?

Well, this is an interesting question, because I have yet to see any reason why it would be beneficial, but I can see how it makes life harder, given that it prevents a union without a preceding rename.
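To make that cost concrete, here is a minimal Python sketch. It is a toy model of relations of my own devising, not any real DBMS API: the helpers relation, rename and union, the relvars R(r_a, r_b) and S(s_a, s_b), and the sample values are all assumptions made up for illustration.

```python
# Toy model: a relation is (heading, body). The heading is a frozenset of
# attribute names; the body is a set of tuples, each stored as a frozenset
# of (attribute, value) pairs. All names and values here are hypothetical.

def relation(heading, rows):
    heading = frozenset(heading)
    body = set()
    for row in rows:
        if frozenset(row) != heading:
            raise ValueError("row does not match heading")
        body.add(frozenset(row.items()))
    return heading, body

def rename(rel, mapping):
    # Rename attributes according to mapping; unmapped attributes are kept.
    heading, body = rel
    new_heading = frozenset(mapping.get(a, a) for a in heading)
    new_body = {frozenset((mapping.get(a, a), v) for a, v in t) for t in body}
    return new_heading, new_body

def union(r, s):
    # Relational union is only defined for identical headings.
    if r[0] != s[0]:
        raise TypeError("union requires identical headings")
    return r[0], r[1] | s[1]

# Headings made distinct purely by prefixing, as the strong form would allow.
R = relation({"r_a", "r_b"}, [{"r_a": 1, "r_b": 2}])
S = relation({"s_a", "s_b"}, [{"s_a": 3, "s_b": 4}])

try:
    union(R, S)
    direct_union_ok = True
except TypeError:
    direct_union_ok = False

# The union only works once both operands are renamed to a common heading.
combined = union(rename(R, {"r_a": "a", "r_b": "b"}),
                 rename(S, {"s_a": "a", "s_b": "b"}))
```

With R(a, b) and S(a, b) the two rename calls would be unnecessary, which is the whole of my complaint.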

I find the example of phone numbers useful (a scenario also discussed on the TTM lists a while back): for a selection of propositions about people's work, home and mobile numbers, is it possible to state a consistent preference between the following schemas?

  1. Contacts = {phone_type, name, number}
  2. Home = {name, number}, Work = {name, number}, Mobile = {name, number}

I find it interesting that the PoOD recommends 1, the option that I believe most designers would intuitively go for, but that in its 'strong form' it also accepts 2 as perfectly satisfactory if the roles are simply renamed as:

Home = {name, home_number}
Work = {name, work_number}
Mobile = {name, mobile_number}

Without any evidence of an improvement this might confer on view updating, I certainly see little sense in it. However, I wonder whether there is a preferable guideline that also recommends 1, along the lines of eliminating application bias (a Principle of Bias Elimination, perhaps). Such a guideline would also reject other feasible schemas such as:

Toms_numbers = {type, number}
Franks_numbers = {type, number}
etc...

that one might never realistically consider, but that differ little conceptually from the Home, Work, Mobile division. Would the guideline instead be that if it is possible for propositions to be collectivized under a single predicate, then they should be?
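As a rough illustration of what I mean by collectivizing, the Python sketch below folds both the per-type split and the per-person split into a single Contacts relation. The phone numbers and the function names contacts_from_by_type and contacts_from_by_person are invented for the example; relations are modelled simply as sets of tuples.

```python
# Split by phone type: one relvar per type, each with heading {name, number}.
# All names and numbers below are made-up sample data.
by_type = {
    "home":   {("Tom", "555-0100"), ("Frank", "555-0101")},
    "work":   {("Tom", "555-0199")},
    "mobile": {("Frank", "555-0142")},
}

# Split by person: one relvar per person, each with heading {type, number}.
by_person = {
    "Tom":   {("home", "555-0100"), ("work", "555-0199")},
    "Frank": {("home", "555-0101"), ("mobile", "555-0142")},
}

def contacts_from_by_type(relvars):
    # The relvar name becomes the value of the phone_type attribute.
    return {(ptype, name, number)
            for ptype, body in relvars.items()
            for name, number in body}

def contacts_from_by_person(relvars):
    # The relvar name becomes the value of the name attribute.
    return {(ptype, name, number)
            for name, body in relvars.items()
            for ptype, number in body}

# Both splits collapse to the same Contacts = {phone_type, name, number}.
contacts_a = contacts_from_by_type(by_type)
contacts_b = contacts_from_by_person(by_person)
same_information = contacts_a == contacts_b
```

In both splits the relvar name is doing the job of an attribute value, and which attribute gets promoted to a relvar name is exactly the application bias I have in mind.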

Either way, I certainly find that appealing to notions of "meaning" within formal design recommendations seems to head towards very slippery ground.

>
> > It does not make sense to me, and I am putting some effort
> > into finding out why it doesn't. The track I am on now gives
> > rejection because of inappropriate equalization of meaning
> > with form (i.c. the heading).
> > This track does not give me any distinction between the validity
> > of the strong (EE: original) and the weak (EE: extended)
> > PoOD, because both build on the sentence quoted hereunder.
>
> Note that my definition differs in an important way from theirs. I
> forgot to emphasize this the last time.
>
>
>
> > >> EE: Chapter 2:
> > >> "The meanings of R1A(t) and R1B(t) are said to overlap
> > >> iff it is possible to construct some tuple t so that R1A(t)
> > >> and R1B(t) are both true".
>
> > >> Note that it does /not/ say 'all tuples t', but 'some tuple t',
> > >> So it does, in particular, /not/ exclude the possibility
> > >> that R1A(t') is true and R1B(t') is not true or vice versa,
> > >> in other words, that R1A and R1B may be mutually independent.
> > >> With this trick, meaning is forced into synonymity with the
> > >> signature of the relation.
>
> > > No, no, not exactly. It could be that there are tuple constraints that
> > > don't allow you to construct the tuple in question. Say you have
> > > R(a,b) and S(a,b) and for R the tuple constraint that b > 5 and for S
> > > the tuple constraint that b <= 5. In that case you cannot construct a
> > > tuple that is both in R and S, but they still have the same header.
>
> > Which could be rephrased as different B domains for R and S (so that
> > R, S becomes an acceptable schema to PoOD), but I won't for the sake
> > of argument.
>
> For the sake of the argument, please do. :-) It would allow me to
> reply with the following: In that case there is still a
> counterexample, namely if there is a tuple constraint a < b for R and
> a >= b for S. Of course if you want, you can redefine the notion of
> header such that this is also included.
>
> > > What they probably should have said is: R and S are said to have
> > > overlapping meaning if it does not follow from the constraints /
> > > dependencies that the intersection of R and S is always empty.
>
> > Maybe, but it would not take away my objection.
> > As long as R\S and S\R are allowed to be non-empty,
> > R and S are independent, regardless of their heading.
>
> In that case you still might have redundancy. If you decompose into
> the following three, R' = R\S, RS' = R intersect S, S' = S\R, then
> you have removed that redundancy. That's the motivation for the
> definition of "overlapping meaning" that I gave.
>
> Actually, to really remove all such redundancy one would need a
> stronger notion of "overlapping meaning", so that you can also deal with
> overlap modulo renaming, but I don't want to complicate things too
> much at this point.
>
> -- Jan Hidders
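
For what it is worth, the three-way decomposition Jan describes can be sketched in a few lines of Python. R and S here are hypothetical relations over the same heading (a, b), modelled as plain sets of tuples with made-up values; the names R_only, RS and S_only stand in for his R', RS' and S'.

```python
# Two hypothetical relations with the same heading (a, b) and one shared tuple.
R = {(1, 6), (2, 7), (3, 8)}
S = {(3, 8), (4, 2)}

R_only = R - S   # tuples recorded only in R
RS = R & S       # tuples recorded in both, now stored once
S_only = S - R   # tuples recorded only in S

# The originals are recoverable as unions of the disjoint pieces.
R_rebuilt = R_only | RS
S_rebuilt = S_only | RS
```

The shared tuple lives in exactly one place after the split, and R and S become views over the three pieces.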
Received on Sun Jan 20 2008 - 22:35:04 CET
