Re: Modelling Disjoint Subtypes

From: Joe Thurbon <usenet_at_thurbon.com>
Date: Thu, 22 Mar 2007 20:47:03 GMT
Message-ID: <bXBMh.492$M.380_at_news-server.bigpond.net.au>


Bob Badour wrote:

> Joe Thurbon wrote:
> 

>> I have (another) question about examples in Pascal's "Practical Issues
>> in Database Management" book. This time it is about modelling disjoint
>> entity subtypes.
>>
>> The example given in the book (section 6.2.2 for those who have it) is
>> of an "employee" entity with a subtype commissioned employee. The
>> relational representation for this example is
>>
>> Employee: EMP# Name HireDate Salary
>> ====---------------------------
>> 1 Smith 1/1/2001 16150
>> 2 Jones 1/1/2002 14350
>>
>>
>> Commissioned_Employee: EMP# Commission
>> ====-------------
>> 1 5000
>>
>> There is also the natural generalisation to allowing employees to be
>> salary-only, commission-only, or both.
>>
>>
>> Employee: EMP# Name HireDate
>> ====
>>
>> Commissioned: EMP# Commission
>> ====
>>
>> Salaried: EMP# Salary
>> ====
>>
>> An exercise left to the reader (why is it always the one I'm
>> interested in that's left to the reader!?) is to model the case
>> whereby employees can be either salaried, commissioned, but _not_
>> both. That is, each of the subtypes is disjoint.
>>
>> I can think of perhaps three ways to do this, but none seem to be as
>> elegant as the above examples, and so I'm wondering if I'm missing
>> something simple.
>>
>> The first approach I can think of uses some sort of
>> type-discriminator, in the employee table. This seems fraught with
>> difficulties: how do you select the table to join on based on the type
>> discriminator other than using multiple queries? How do you prevent
>> nonsense data appearing, for example, in the 'commissiones' table for
>> an employee whos type discriminator is is 'salaries'. So I'm guessing
>> that a type discriminator does not work.
>>
>> The second approach is to have only one sub-type, but have two columns
>> "Type", "Amount" whereby type gives a clue as to how to interpret the
>> ammount (i.e. is it salary or commission). This seems problematic,
>> since , in some sense, the domain of commissions is different to the
>> domain of salaries (at least, according to my understanding of chapter
>> 1 of "Practical Issues..."
>>
>> The other approach requires some sort of check constraint that
>> simulates a kind of 'distributed key' (c.f. Darwen's 'Final Null In Th
>> Coffin'). I can't really work out how this would be done relationally.
>>
>> So I'm stuck. Any hints or clues?
> 
> What if you had an attribute that was a constant? How might that change 
> your analysis above?

I'm afraid I've found this clue (hint?) a little too cryptic.

It spawned a few thoughts. For example, the second approach above might be changed to

EMP# Salary SalaryFactor Commission CommisssionFactor


Where SalaryFactor and CommissionFactor were either 0 or 1, and the employees total remuneration was a linear equation. But that seemed, um, fanciful. (Not to mention that it takes a very liberal interpretation of your clue, and doesn't work in the general case).

I can't see how to apply your hint to either of the other approaches. In approach one, you need at least as many values as there are sub-types. In approach three, the desire is to have a constraint which ensures that for a given key, that key appears only in one of the subtype tables, and I don't see how a constant attribute helps there. I guess you could use the constant attribute as some sort of token to indicate which sub-type is the 'active' one, but I don't see how that is different to approach three.

Am I just barking up the wrong tree? Would you like to tell me the answer?

Cheers,
Joe Received on Thu Mar 22 2007 - 21:47:03 CET

Original text of this message