Re: Principle of Orthogonal Design

From: Brian Selzer <brian_at_selzer-software.com>
Date: Tue, 29 Jan 2008 12:33:14 GMT
Message-ID: <e2Fnj.2608$5K1.239_at_newssvr12.news.prodigy.net>


"Jan Hidders" <hidders_at_gmail.com> wrote in message news:d4f7d641-ea3e-4c2b-8050-ae66c469416d_at_d4g2000prg.googlegroups.com... On 29 jan, 06:28, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> "Jan Hidders" <hidd..._at_gmail.com> wrote in message
>
> news:12c08729-2bf7-4080-ba63-d3c90b8d95e0_at_1g2000hsl.googlegroups.com...
> On 28 jan, 22:26, "Brian Selzer" <br..._at_selzer-software.com> wrote:
>
>
>
> > > "Jan Hidders" <hidd..._at_gmail.com> wrote in message
>
> > >news:43bf804b-4ed0-493c-8499-2b3e8a83ee52_at_z17g2000hsg.googlegroups.com...
> > > On 28 jan, 15:29, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> > > > "Jan Hidders" <hidd..._at_gmail.com> wrote in message
>
> > > >news:0a872b75-448e-4131-ba5b-6bcee88da815_at_e10g2000prf.googlegroups.com...
>
> > > > > On 28 jan, 02:12, mAsterdam <mAster..._at_vrijdag.org> wrote:
> > > > >> Jan Hidders wrote:
> > > > >> > mAsterdam wrote something very much like:
> > > > >> >> Pragmatical redefinitions must be temporary and tracked.
> > > > >> > Sure, we agree on that.
>
> > > > >> <unsnip>
>
> > > > >> Wether the relation between heading and tuples goes
> > > > >> via names or ordering is relevant or not.
>
> > > > >> If it is not I want it out of scope.
>
> > > > >> </unsnip>
>
> > > > > I don't think that it is possible to get it out of scope. If you
> > > > > think
> > > > > it is, then by all means provide an equivalent and complete
> > > > > definition
> > > > > where it is. I'm also not sure what your problem exactly is. We
> > > > > have
> > > > > a
> > > > > definition that works for the named perspective, which is arguably
> > > > > the
> > > > > most appropriate for the relational model anyway, so can we now
> > > > > please, please, please, pretty please, move on with the
> > > > > discussion?
>
> > > > I don't think the definition is sufficient even for the named
> > > > perspective:
>
> > > Certainly. I think I already said earlier that there is a stronger
> > > version that removes even more redundancy. But I wanted to wait a
> > > little until mAsterdam had gotten his head around the current one.
>
> > > > consider
>
> > > > R1 {J, K}
> > > > KEY {J}
> > > > KEY {K}
>
> > > > R2 {J, A}
> > > > KEY {J}
> > > > FOREIGN KEY {J} REFERENCES R1
>
> > > > R3 {K, A}
> > > > KEY {K}
> > > > FOREIGN KEY {K} REFERENCES R1
>
> > > > Supposing that J, K and A have different types and discounting any
> > > > meaning
> > > > attributed by relation names, there is overlap between R2 and R3.
>
> > > > J and K are both keys for R1, so J --> K and K --> J.
>
> > > > And due to the foreign keys between R2 and R1 and R3 and R1:
>
> > > > from J --> K and K --> A, J --> A can be inferred;
> > > > from K --> J and J --> A, K --> A can be inferred.
>
> > > You can reason like that about FDs in the context of a single
> > > relation, but you seem to do it here at schema level. What exactly
> > > does it mean that J --> K holds at schema level? The only way I can
> > > make sense of your statements is if you are working under the
> > > universal relation assumption. Are you? Or are you perhaps assuming a
> > > few extra dependencies you haven't told us about? Dependencies like
> > > (R1 NJN R2)[K,A] = R3?
>
> > I didn't have the universal relation assumption in mind. The foreign key
> > constraints require that whenever there is a tuple in either R2 or R3,
> > there
> > must also be a tuple in R1, so since there must be a tuple in both R2
> > and
> > R1
> > or both R3 and R1, it stands to reason that
>
> > whenever R2 then R2 JOIN R1,
> > or
> > whenever R3 then R3 JOIN R1.
>
> > From that it follows that
>
> > the FDs from R2 and the FDs from R1 must also hold in R2 JOIN R1
> > and
> > the FDs from R3 and the FDs from R1 must also hold in R3 JOIN R1
>
> > because the foreign keys ensure that
>
> > (R2 JOIN R1) {J, A} = R2
> > and
> > (R3 JOIN R1) {K, A} = R3
>
> That actually follows even if there are no inclusion dependencies /
> foreign keys.
> <<<<<<<<<<<<<
>
> Actually, without the foreign key, (R2 JOIN R1) {J, A} is not necessarily
> R2, since there could be one ore more tuples in R2 with a value for J that
> does not appear in R1.

True, but that "that" in my sentence referred to the FDs holding in the result of the natural joins.

> But so, yes, it follows that K --> A holds on R2 JOIN R1 and J --> A holds
> on R3 JOIN R1. Why then do you think it follows that if there are
> tuples {j1, k1} in R1, {j1, a1} in R2, and {k1, a2} in R3, the
> database is inconsistent? I really don't
> see the contradiction here. Sure, we now have two A values associated
> with a J value and a K value, but so what? That certainly doesn't
> contradict the fact that K --> A holds on R2 JOIN R1 and J --> A holds
> on R3 JOIN R1.
>
> -- Jan Hidders
> <<<<<<<<<<<<<<
>
> Since K --> A holds in R2 JOIN R1 and also in R3 JOIN R1, shouldn't it
> hold
> in (R2 JOIN R1) JOIN (R3 JOIN R1), too?

It does.

> But the information in the tuples,
> {j1, k1}, {j1, a1}, {k1, a2} disappears when (R2 JOIN R1) is joined to (R3
> JOIN R1), even though there is a tuple with the same value for J in each
> of
> R2 and R1 and a tuple with the same value for K in each of R3 and R1. The
> information disappears because the database is inconsistent.

I have no idea what you mean by that. I don't see any inconsistencies. And the information disappears because your are computing the intersection of two different access paths from J/K to A and this intersection happens to be empty. This is entirely valid. If there is more than one way to get from A to B then these have apparently more than one relationship, and there is no reason why these relationships cannot be completely unrelated.

> Perhaps a concrete example would be in order. Suppose that you have an
> EMPLOYEE relation with two keys, Badge# and SSN. Suppose also that you
> have
> a CLOCK relation that contains the Badge#, Date, TimeIn, TimeOut and Hours
> for each employee that worked for each day. Now suppose that you have a
> PAYROLL relation that contains the SSN and Hours for each employee for
> each
> day. Note that in the PAYROLL relation, there is a tuple for each employee
> for each workday even if there isn't one in the CLOCK relation. An
> employee
> may be on paid vacation, for example, and therefore should still get paid
> for the hours he otherwise would have worked.
>
> What you have is three relations, (other attributes not relevant to the
> discussion omitted)
>
> EMPLOYEE {Badge#, SSN}
> KEY {Badge#}, KEY {SSN}
>
> CLOCK {Badge#, Date, TimeIn, TimeOut, Hours}
> KEY {Badge#, Date}
> FOREIGN KEY {Badge#} REFERENCES EMPLOYEE
>
> PAYROLL {SSN, Date, Hours}
> KEY {SSN, Date}
> FOREIGN KEY {SSN} REFERENCES EMPLOYEE
>
> Now, if the employee with badge number 2011 worked 9 hours on 1/28/08 as
> represented in a tuple in CLOCK, then wouldn't it be a bad thing for there
> to be a tuple in PAYROLL with that employee's SSN, '123-45-6789,'
> indicating that he only worked 8 hours on 1/28/08?

(There's something up with my news reader: when I hit reply, it doesn't quote what you wrote.)

>>>>>>>>>>>>>>>>>>>>>>

If you don't want that you need to specify an extra dependency / database constraint that states this. Probably an inclusion dependency from the join of EMPLOYEE and CLOCK to PAYROLL. The ones that you specified so far don't imply this. So I don't see how this supports your point.

  • Jan Hidders <<<<<<<<<<<<<<<<<<<<<

In the context of the existing inclusion dependencies, doesn't the value '123-45-6789' for SSN in PAYROLL represent the EMPLOYEE that exemplifies the values in the tuple {Badge# = 2011, SSN = '123-45-6789'}, and doesn't the value 2011 for Badge# in CLOCK represent that same EMPLOYEE? Wouldn't that mean that the tuple, {Badge# = 2011, Date = 1/28/08, Hours = 8} in the projection over CLOCK {Badge#, Date, Hours} means exactly the same thing as the tuple, {SSN = '123-45-6789', Date = 1/28/08, Hours = 8} in PAYROLL--that is, that the employee with Badge# 2011 /and/ SSN '123-45-6789' worked 8 hours on 1/28/08? So since they mean exactly the same thing, wouldn't it be inconsistent to have in the database a statement that the employee with Badge# 2011 and SSN '123-45-6789' worked 8 hours on 1/28/08 and a statement that the employee with Badge# 2011 and SSN '123-45-6789' worked 9 hours on 1/28/08? Received on Tue Jan 29 2008 - 13:33:14 CET

Original text of this message