Re: SQL for Modeling Generalization Hierarchies

From: Robert Brown <robertbrown1971_at_yahoo.com>
Date: 27 May 2004 17:45:05 -0700
Message-ID: <240a4d09.0405271645.c5ddfdc_at_posting.google.com>


"Laconic2" <laconic2_at_comcast.net> wrote in message news:<E8KdnXHBZNMsuSvdRVn-jg_at_comcast.com>...
> "Robert Brown" <robertbrown1971_at_yahoo.com> wrote in message
> news:240a4d09.0405270856.4ce55c7d_at_posting.google.com...
> > Is there a good approach to modelling many heterogeneous entity types
> > with that have some attributes in common?
>
> This is a frequently asked question. Unfortunately, the frequently given
> responses don't usually settle the matter.

  Could you please post links to any relevant past threads if you remember where to find them?I tried to search the news group for "generalization hierarchy" but did not find anything useful. Is there a better search term I can use?

>
> > There are many employee subtypes and more can be dynamically added
> > after the application is deployed so it's obviously no good to keep
> > adding attributes to the employees table because most attributes will
> > be NULL (since janitors are never doctors at the same time).
> >
> > The only solution I found for this is a generalization hiearchy where
> > you have the employee table with all generic attributes and then you
> > add tables for each new employee subtype as necessary. The subtype
> > tables share the primary key of the employee table. The employee table
> > has a "discriminator" field that allows you to figure out which
> > subtype table to load for a particular entity.
>
> If you invent new subtypes on an adhoc basis, (e.g. a DBA is never either a
> janitor or a doctor)
> and you invent new tables for subtypes on an equally ad hoc basis, I'd
> suggest that you are altering your data model on an ad hoc basis.
>
> You can do that if you want, but don't expect the same power and simplicity
> that you get from a stable data model.
>
>
>
> >
> > This solution does not seem to scale since for each value of
> > "discriminator" I need to perform a join with a different table. What
> > if I need to retrieve 1,000 employees at once?
>
> Why would you ever need the attributes of more than one subtype? If your
> query is about doctors, why would you need to join the table about janitors
> into the query?

Because I need to display users and their heterogenious attributes in the UI as one list (e.g. in response to a search).

> If your query is about employees, why would you need to know any of the
> subtype attributes?

Same reason, our customer's HR department does not see the employees as separate entities. To them an employee is an employee and they want to see all the associated data while browsing employees. I guess you could show the generic attributes on the first pass and then require the end user to click to drill down and see the subtype attributes - but that's not what they want.

> >
> > Or do you I need to iterate look at the discriminator and then
> > perform the appropriate join? If this kind of iteration is necessary
> > then obviously this generalization hierarchy approach does not work in
> > practice
> > since it would be painfully slow.
>
> You need to ask two questions about the model, and about any imlpementation
> based on the model:
> first, is it correct and complete? Second, if yes to the first, is it fast
> enough?
>
> In general, different implementations of the same model should be
> transparent to users of the imlpementation who only look at the features
> visible in the model.
>
>
> >
> > Is there a better approach to modelling these kind of heterogeneous
> > entities with shared attributes that does not involve creating a table
> > for each new employee type or having sparce tables (mostly filled with
> > NULLS)
>
> If you have new entities, you are going to have new relations. That either
> means having new tables, or fudging the relationship between tables and
> rleations. Do the second one at your own peril.
>
> > Seems like a very common problem. Any ideas? Is this a fundamental
> > limitation of SQL?
>
> I would suggest it goes beyond SQl to the very heart of using foreign key/
> primary key aossciations to establish linkages.
Received on Fri May 28 2004 - 02:45:05 CEST

Original text of this message