Re: constraints in algebra instead of calculus

From: Brian Selzer <brian_at_selzer-software.com>
Date: Mon, 21 May 2007 22:32:18 -0400
Message-ID: <SCs4i.1650$C96.846_at_newssvr23.news.prodigy.net>


"paul c" <toledobythesea_at_oohay.ac> wrote in message news:nmq4i.205308$aG1.145355_at_pd7urf3no...
> Brian Selzer wrote:

>> "paul c" <toledobythesea_at_oohay.ac> wrote in message 
>> news:JQ94i.200044$DE1.108062_at_pd7urf2no...
>>
>>>Brian Selzer wrote:
>>>
>>>>"paul c" <toledobythesea_at_oohay.ac> wrote in message 
>>>>news:dX84i.201343$6m4.74192_at_pd7urf1no...
>>>>
>>>>
>>>>>Marshall wrote:
>>>>>
>>>>>
>>>>>>Okay, a while back we were talking about writing constraints
>>>>>>in a language with aspects of the relational calculus, specifically
>>>>>>the existential and universal quantifiers. The point was made
>>>>>>that that's unnecessary; the calculus is no more expressive
>>>>>>than the algebra.
>>>>>>
>>>>>>So it ought to be possible to write any constraint from the
>>>>>>calculus in the algebra.
>>>>>>
>>>>>>Well, I'm having a hard time figuring out how to do it. Can
>>>>>>anyone help?
>>>>>>
>>>>>>How does one write a functional dependency in the algebra?
>>>>>>A foreign key?
>>>>>>
>>>>>>
>>>>>>Marshall
>>>>>>
>>>>>
>>>>>I thought that one was easy, if FK is the set of 'foreign key' 
>>>>>attributes and A is the 'referencing table' and using an op like D&D 
>>>>><AND>, then it's something like A{FK} <AND> B{FK} = A{FK}.  (I like 
>>>>>this one because it's particularly easy to implement.)
>>>>>
>>>>
>>>>
>>>>Very slick.  One point, though.  The constraint
>>>>
>>>>A{FK} <AND> B{FK} = A{FK}
>>>>
>>>>Is not sufficient.  Also required is the constraint
>>>>
>>>>COUNT(B{FK}) = COUNT(B)
>>>>
>>>>
>>>>
>>>>
>>>
>>>I've never seen anybody justify why a foreign key must be a candidate key 
>>>in the referenced relation/relvar/table except for 'so-and-so says so'. 
>>>Whereas I've seen examples where the non-key reference is essential.
>>
>>
>> There is a paper by Mark Levene and Millist W. Vincent, "Justification 
>> for Inclusion Dependency Normal Form."  In it they argue that a database 
>> schema should be free of circular inclusion dependencies and should only 
>> contain key-based inclusion dependencies (which are as far as I can tell 
>> from their definition, foreign key constraints)
>> ...
>

> What is their argument?
>

They talk about avoiding attribute redundancy, preventing update anomalies, and avoiding interactions between FDs and INDs because the joint implication problem for FDs and INDs is undecidable--at least where the set of INDs is circular. The paper is not long, but they pack a lot of information in it. I don't think I can rehash their argument here without making hash of it:) I found it out on the Internet a few years ago. I can't remember where.

>> An inclusion dependency that is not also a foreign key constraint denotes 
>> a many-to-many relationship, ...
>

> Not necessarily. My attitude is very much from an engine perspective
> where the feature definitions ought to be as elemental as possible. This
> shouldn't stop somebody who is using a product that has a hard time with
> circular dependencies from using a so-called catalog who doesn't want them
> for stylistic reasons from preventing their definition.
>

> In my view, the candidate key aspect is a second constraint, combining the
> two in one "foreign key" directive buys nothing since anytime the
> reference is not to a key, another directive would be needed anyway. When
> Dave P says the reason is historical, I'm sure he's right, but the biggest
> lesson of history, if you ask me, is that it as much as not shows why we
> shouldn't always follow history. Somewhere, Hugh Darwen argued for the
> deprecation of the term "foreign key" - I think I'm with him. Of course,
> I would hope an ideal dbms would allow simple macro-ization for users who
> want to combine features with their own choice of constraint keyword.
>

I agree, but it's easier when visualizing a directed graph between values participating in an inclusion dependency to count the arrows originating from values in the referencing relation. If all of the values have only one arrow, then it's a foreign key constraint, whereas if any of the values has more than one arrow, then it's not. Also, when considering the set of INDs in isolation, it makes sense to fold in the effect of that second constraint in order to differentiate between INDs that denote a many-to-many relationship and INDs that denote a one-to-many relationship.

> Here is a quote from my possibly inaccurate echo of Darwen's example that
> I sent in a another thread:

>

> (quote)
> Student { StudentId, Name } KEY { StudentId }
> Course { CourseId, Title } KEY { CourseId }
> Staff { StaffId, Name } KEY { StaffId }
>

> Teaches { StaffId, CourseId } KEY { StaffId }
>

> (note that a teacher may teach only one course, the argument depends on
> this)
>

> Enrolment { StudentId, CourseId } KEY { StudentId, CourseId }
>

> (note also that a student may take many courses and that a student may
> enrol in a course before a teacher is assigned to it)
>

> TutorFor { StudentId, StaffId } KEY { StudentId, StaffId }
>

> It's possible to enter a row in TutorFor where StaffID stands for a
> teacher who doesn't teach any course the student is enrolled in. If I
> understand the argument it is that it would be simpler/easier for an
> implementation if the system allowed a foreign key F1
> (StudentId,StaffId) for TutorFor referencing the join of Enrolment and
> TutorFor, eg., the view V={StudentId,StaffId,CourseId} where V is
> constrained by its own foreign key F2 (StaffId,CourseId) referencing
> Teaches. If I've got this right, the schema is in BCNF, but neither F1
> or F2 references a key.
> (end quote)
>

F2 references a superkey. The constraint, COUNT(B[FK]) = COUNT(B), requires only that the projection B[FK] be a superkey of B, not necessarily a candidate key. This sqares with SQL, which requires that a unique constraint has been defined on the referenced columns.

V is not in BCNF: since StaffId --> CourseId in Teaches, StaffId --> CourseId in V.

F1 does reference a key: since StaffId --> CourseId, {StudentId, StaffId} is a candidate key for V.

> Sorry for the long post about something so basic, but no apology for
> talking about basics!
>
> p
Received on Tue May 22 2007 - 04:32:18 CEST

Original text of this message