Re: SQL for Modeling Generalization Hierarchies

From: Payson <payson_b_at_hotmail.com>
Date: 28 May 2004 11:37:27 -0700
Message-ID: <771aa5a4.0405281037.16b7597d_at_posting.google.com>


This is always messy. With dynamically added subtypes, I would consider keeping everything soft. Here is a first approximation for the tables:

Employee     Valid Subtypes    Attributes/Subtype    Valid Attributes
  Subtype<--------Subtype--------->Subtype              ?Datatype
  Employee #      ?Desc            Attribute<-----------Attribute 
     |                                                  ?Desc
     |                                                      
Employee Attributes

   Employee #
   Attribute
   Attribute data

You could then build a list of available attributes for an employee based on his subtype from the attributes/subtype table. The data entry screen(s) could be built from the Valid Attributes table(depending on how smart it needs to be).

This is a bit rough, but I hope the idea comes through. Control over the Valid Subtypes table, the Attributes table, etc. would need to be defined (who and how), but, gosh, I would have to have to create a new table every time a new subtype was added.

LOL.. here I go again, ask me what time it is and I tell you how to build a watch. Hope this helps more than it hurts.

Payson

robertbrown1971_at_yahoo.com (Robert Brown) wrote in message news:<240a4d09.0405270856.4ce55c7d_at_posting.google.com>...
> Is there a good approach to modelling many heterogeneous entity types
> with that have some attributes in common?
>
> Say I have entities "employees" which share some attibutes (e.g.
> firstname, lastname, dateofbirth) but some subsets of employees (e.g.
> physicians, janitors, nurses, ambulance drivers) may have additional
> attributes that do not apply to all employees. Physicians may have
> attributes specialty and date of board certification, ambulance
> drivers may have a drivers license id, janitors may have
> preferredbroomtype and so on.
>
> There are many employee subtypes and more can be dynamically added
> after the application is deployed so it's obviously no good to keep
> adding attributes to the employees table because most attributes will
> be NULL (since janitors are never doctors at the same time).
>
> The only solution I found for this is a generalization hiearchy where
> you have the employee table with all generic attributes and then you
> add tables for each new employee subtype as necessary. The subtype
> tables share the primary key of the employee table. The employee table
> has a "discriminator" field that allows you to figure out which
> subtype table to load for a particular entity.
>
> This solution does not seem to scale since for each value of
> "discriminator" I need to perform a join with a different table. What
> if I need to retrieve 1,000 employees at once?
>
> Is that possible to obtain a single ResultSet with one SQL statement
> SQL?
>
> Or do you I need to iterate look at the discriminator and then
> perform the appropriate join? If this kind of iteration is necessary
> then obviously this generalization hierarchy approach does not work in
> practice
> since it would be painfully slow.
>
> Is there a better approach to modelling these kind of heterogeneous
> entities with shared attributes that does not involve creating a table
> for each new employee type or having sparce tables (mostly filled with
> NULLS)
>
> I guess another approach would be to use name/value pairs but that
> would make reporting really ugly.
>
> Seems like a very common problem. Any ideas? Is this a fundamental
> limitation of SQL?
>
> Thanks!
>
> - robert
Received on Fri May 28 2004 - 20:37:27 CEST

Original text of this message