| Oracle FAQ | Your Portal to the Oracle Knowledge Grid | |
Home -> Community -> Usenet -> comp.databases.theory -> candidate keys in abstract parent relations
Hi. I'm a newbie data modeler, and I'm having a tough time with what
seems like a simple data modeling / identity modeling problem, but
which has been giving me fits.
For the purposes of this discussion, I need four relations, working from the "top" down: Albums are put out in the form of one or more Releases, and each Album consists of one or more Tracks. A Track is a particular instance of a Recording, so each Recording may appear as different Tracks on multiple Albums.
So the candidate key for Releases is Release("Album", Label, Locale, Release Date). (Please correct me if I'm using the notation incorrectly.)
2. The Recording reaches outside of the subset of the model we're discussing here, so assume it has a straightforward natural key as well.
3. The Track is really an association between Recordings and Albums
with some additional attributes (track length, credited performer,
credited title, etc). It also has a possible unique ID in the ISRC
(International Standard Recording Code), which is issued for pretty
much every track available on a semi-modern format (but not, of course,
on bootlegs, which I would also like my model to encompass).
4. The hard part: An Album is a collection of Tracks by a Credited Performer with a Title. The difficulty comes in because I'm trying to be scrupulous, and defining uniqueness on those attributes is proving to be very difficult.
Therefore, Album(Name, Credited Performer) is not guaranteed to be unique.
Releases often do have generated unique IDs (aka ISBN / UPC / EAN codes), but not all of them, and these are tied to a specific Release. "Original release dates" are a functional dependency of the earliest known Release, and "recording date" is a functional dependency of the Recording's "Session", and doesn't work for many kinds of albums (i.e. compilations, best-ofs) anyway.
Upon thinking about it, Albums and Tracks are defined in terms of each other, which is to say the uniqueness of the Album is defined by the collection of Tracks associated with it, and those tracks don't really exist independent of the album. Also, as the collection of tracks can theoretically be entirely distinct between two Albums with the same name (which comes remarkably close to happening sometimes in reality), properties like the number of tracks on the album are not only a functional dependency, but not guarantors of uniqeuness.
There have been at least a few attempts to solve this problem within
the recording industry, but I'm not really satisfied with any of them.
The Library of Congress has its own system for filing sound recordings,
but as far as I'm concerned as an end user with no ability to generate
my own LoC classification codes, that's just another generated key
(also, the LoC doesn't even try to catalog all sound recordings).
Musicbrainz (http://www.musicbrainz.org/) uses GUIDs for both Albums
and Tracks, and I'd rather just use my own semi-semantic sequence
generator (based on one of Celko's non-repeating, pseudorandom sequence
generators) than go with GUIDs, although I might stash away the
Musicbrainz GUID as an attribute, purely to integrate my application
transparently with Musicbrainz (I personally think my data model is
richer and cleaner than theirs).
Is there a method to generate a primary key based on entities that are children of a parent relation (and if there is, is it all a good idea to use it?)? Is the Album an overly-abstract construct that should be pushed into the Tracks and Releases? Or am I being overly finicky and should I just write a sequence generator / use an autoincrementing column? In the real world, labels issue the same Album at different times on different formats in different Locales that are struck from the same masters with the same name (but as separate Releases) and the same Recordings on the Release, and I think most people would intuitively understand the relationships as I've sketched them out above, as well as intuitively understanding that two albums with the same title and artist can, on occasion, contain different Tracks, and therefore do not share the same identity.
Any hints on what I should look at next? Received on Tue Jan 17 2006 - 21:50:43 CST
![]() |
![]() |