Re: Principle of Orthogonal Design

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 13 Jan 2008 11:39:15 -0500
Message-ID: <U8rij.14047$6%.12238_at_nlpi061.nbdc.sbc.com>


"JOG" <jog_at_cs.nott.ac.uk> wrote in message news:a9d463eb-2d44-4946-ac76-7b5d8a2a2b04_at_v4g2000hsf.googlegroups.com...

> On Jan 13, 1:54 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:

>> "JOG" <j..._at_cs.nott.ac.uk> wrote in message
>>
>> news:30ef4f38-4b2b-4e15-9ed1-f96ab5efd545_at_1g2000hsl.googlegroups.com...
>>
>>
>>
>> > On Jan 9, 9:01 pm, TroyK <cs_tr..._at_juno.com> wrote:
>> >> On Jan 8, 6:51 pm, JOG <j..._at_cs.nott.ac.uk> wrote:
>>
>> >> > I was wondering what your current stances towards the principle of
>> >> > current design is cdt - info about the POOD is actually pretty
>> >> > sparse
>> >> > on google, which has not helped my own understanding. I gather that
>> >> > Date has realigned his opinion - although what to I know not - and
>> >> > that Darwen rejected the original POOD paper outright given that
>> >> > McGovern posits that:
>>
>> >> > R1 { X INTEGER, Y INTEGER }
>> >> > R2 { A INTEGER, B INTEGER }
>>
>> >> > violates the principle, whatever the relations' attribute names.
>> >> > Instinctively it does seem rather odd that a predicates such as:
>>
>> >> > * on Day:X the shop had noCustomers:Y
>> >> > * on Roll:A, the dice showed the Number:B
>>
>> >> > cannot share the same database. Have I interpreted the debate
>> >> > correctly? Any insights or corrections are, as ever, appreciated -
>> >> > POOD is certainly thought provoking, and the concept that an update
>> >> > need not require specifcation of a table name is an interesting one.
>>
>> >> I thought I'd work up an example that illustrates part of McGovern's
>> >> argument (as I understand it). Comments and corrections are most
>> >> welcome:
>>
>> >> Assuming we want to represent employees and their status as either
>> >> salaried, commissioned, or both, we might come up with the following
>> >> design (assuming the most obvious interpretations):
>>
>> >> Table: Emp
>> >> Emp# IsSalaried IsCommissioned
>> >> ==== ---------- --------------
>> >> 1 Y N
>> >> 2 N Y
>> >> 3 Y Y
>>
>> >> Another possible design could be:
>> >> Table: Emp
>> >> Emp#
>> >> ====
>> >> 1
>> >> 2
>> >> 3
>>
>> >> Table: SalariedEmp (FK, Emp# references Emp.Emp#)
>> >> Emp#
>> >> ====
>> >> 1
>> >> 3
>>
>> >> Table: CommissionedEmp (FK, Emp# references Emp.Emp#)
>> >> Emp#
>> >> ====
>> >> 2
>> >> 3
>>
>> >> Although we can derive the same information from each design, the
>> >> former is to be prefered because the latter violates POOD (two tables
>> >> with identical headers, even semantically speaking). Additionally, it
>> >> requires that we inspect the tables' names in order to gather the full
>> >> meaning of the propositions.
>>
>> I think that the requirement that we inspect table names comes from the
>> correlation between RM and predicate logic. In predicate logic, there
>> are
>> predicate symbols and there are individual symbols, and under an
>> interpretation, both the predicate symbols and the individual symbols are
>> assigned meaning. The two relations,
>>
>> sine {x, y}
>>
>> and
>>
>> cosine {x, y}.
>>
>> have the same heading, but have totally different meanings
>
> Interesting example. So sine and cosine as stated would break POOD
> (which appears intended to assist with view updates).  ...

Wasn't POOD intended to eliminate redundancy across relations? For example, in the two following relations,

ES {EmployeeNumber, EmployeeName, Salary} EC {EmployeeNumber, EmployeeName, Commission}

where EmployeeNumber is the key, the dependent attribute, EmployeeName, is redundant for employees that receive both salary and commission.

So because EmployeeNumber --> EmployeeName appears in both and "means the same thing" in both, it is a violation of POOD. I submit that it would not be a violation of POOD if there were a database constraint,

IS_EMPTY(ES JOIN EC {EmployeeNumber})

Which would mean that employees couldn't receive both a salary and a commission. Since it would not then be possible for an individual to be exemplified by both ES and EC, there isn't any overlapping meaning, so there isn't a POOD violation.

For a similar example, consider the following database schemata,

Case 1:

PS {Part#, Serial#, Warehouse, Bin}
  KEY {Part#, Serial#}

PSL {Part#, Serial#, Warehouse, Bin, Lot#}   KEY {Part#, Serial#}

IS_EMPTY(PS JOIN PSL {Part#, Serial#})

In this case, PS is for the location of parts that are being tracked by serial number only, whereas PSL is for the location of parts that are being tracked by both lot and serial number.

Note that {Part#, Serial#} --> {Warehouse, Bin} appears in both.

Case 2:

PS {Part#, Serial#, Warehouse, Bin, HasLot}   KEY {Part#, Serial#}
  HasLot IN {'Yes', 'No'}

PSL {Part#, Serial#, Lot#}
  KEY {Part#, Serial#}
  FOREIGN KEY {Part#, Serial#} REFERENCES PS

In this case, PS is for the location of parts that are being tracked by serial number or by lot number and serial number, PSL is for recording the lot numbers for parts that are being tracked by both lot and serial number, and there is an explicit indication in HasLot as to whether there should be a tuple in PSL for parts that are being tracked by both lot and serial.

I think that Case 1 does not violate POOD due to the database constraint,

IS_EMPTY(PS JOIN PSL {Part#, Serial#}).

This database constraint limits the possible interpretations of the predicates of PS and PSL so that the meanings of {Part#, Serial#, Warehouse, Bin} in PS and PSL must be different.

I prefer case 1 because in it the distinction between parts that are being tracked by serial number only and parts that are being tracked by both lot number can be determined due to the schema only, whereas in case 2, that distinction can only be interpreted by examining the value of HasLot.

> ...  To avoid
> breaking the POOD, we'd have either:
>
> operation{ input:x, output, y, type:t }
> t E {sin, cos}
>
> or,
>
> sin  {sin_input:x, sin_output:y}
> cos {cos_input:x, cos_output:y}
>
> Well, my instinct tells me that the latter is kludgy given x plays
> exactly the same role in both predicates, and as such one would expect
> them to have exactly the same name (and who knows, that might be
> useful information). But then the former would generate enormous
> tables given it would have to incorporate every conceivable operation
> with an input and an output of the same type as used by sin and cos.
> Practically that appears to be undesirable. Theoretically I guess
> there is nothing wrong with it.
>
>

>> --even though the
>> same individuals are exemplified in both whenever (x - pi / 4) modulus pi
>> is
>> zero. The tuple, {pi / 4, sqrt(2) / 2}, appears in both relations yet
>> has a
>> different meaning assigned to it from each predicate. Interestingly,
>> though, there is still only one individual represented by that tuple even
>> though it appears in both relations.
>>
>> >> TroyK
>>
>> > I like this example Troy - It is both concise and clear. I was just
>> > mulling and came up with a possible schema for a "kennel club", that
>> > might help clarify some of the issues I'm seeing.
>>
>> > owners = {name, age, joined} PK(name)
>> > dogs = {name, age, joined} PK(name)
>> > ownership = {owner, dog} FK(owner, owners.name) FK(dog, dogs.name)
>>
>> > As it stand this is clearly not in POOD, and a proposition such as
>> > <Bess, 18, 01/08/08> is ambiguous. So it got me thinking about how we
>> > know in real life what a proposition refers to.
>>
>> > 1) I know the context implicitly of <Bess, 18, 01/08/08> (from
>> > previous conversation, who i'm talking to, etc)
>> > 2) Surrounding text: <Bess, 18, 01/08/08> = "The dog called Bess is 18
>> > years old and joined the kennel club on 01/08/08"
>>
>> > I don't think we can deign to rely on a computer ever being able to
>> > cope with implict context (leave that to the CYC people to bang there
>> > heads against ad infintum), so that appears to leave me with option 2.
>> > In turn, I see two ways of integrating this explicit surrounding
>> > context.
>>
>> > 1) Via role names so, the proposition becomes:
>> > There is a dog with (dogsName:Bess) of (age:18) and who (joined:
>> > 01/08/08)
>>
>> > 2) Via another attribute:
>> > There (is_a:dog) with (name:Bess) of (age:18) and who (joined:
>> > 01/08/08)
>>
>> > Which is preferable, and why that would be so I am uncertain. It would
>> > be easy to say "this is all just a design decision", but I'm starting
>> > to think there might be some sort of salient lesson in there.
>> > Somewhere. Maybe.
> Received on Sun Jan 13 2008 - 17:39:15 CET

Original text of this message