Re: candidate keys in abstract parent relations

From: Forrest L Norvell <spankysyourpal_at_gmail.com>
Date: 20 Jan 2006 14:09:33 -0800
Message-ID: <1137794973.660236.90460_at_g44g2000cwa.googlegroups.com>


David Cressey wrote:
> ... [I]f you have a table with no candidate keys (two rows mean
> something different, but are duplicates of each other) then you haven't got
> a relational model, yet.

I think I'm almost there, though. Or at any rate, I'm a lot closer than I was when I started this thread!

> It seems to me that the OP has yet to identify a
> track.

No, the Track is one of the few entities I really am sure about: a Track is an indexed element of a Side (of a Disc (of an Album)) that is associated with a Performance, with its own name and duration. The same Performance may be included on different albums as a separate Track, but a Track is ALWAYS tied to specific Side->Disc->Album. Sides and Discs are logical groupings, and could probably be denormalized to albums if it weren't for things like run-out groove etchings (on Sides) and named Discs (and CD pressing plant codes (look at the inner ring on the bottom of a CD sometime)) in multi-disc sets.

I just realized that this even further muddies the waters between Releases and Albums, because run-out groove etchings are specific to the release. So I can fold Releases, Albums, and AlbumVersions into one entity called Album, or maybe I just need to rejigger which attributes are on which of those tables. I probably don't need three tables to model that relationship, though.

At any rate, the natural key for a track is Track(side, index, name, performance, duration). "name" may look like a denormalized attribute, but this allows me to deal with cases where the song's actual name (as registered with the publishers / originally conceived by the songwriters) differs from the name as it is used on the actual album. I.e. the Rolling Stones song is "Paint It Black", but a surprising number of covers call it "Painted Black". Even though you could argue that the latter is a mistake, that's how it's written on the release, so that's how I want it recorded in the database (while still maintaining a connection to the actual, correct information about the song elsewhere).

Another aside: I'm using surrogate keys for all my tables (so the above is actually "performance id" rather than "performance", it's "side id" instead of "side", etc.), but only after I've derived a natural key (enforced by a constraint) to which the surrogate key can act as a surrogate. PostgreSQL just penalizes performance too badly for using non-integer primary keys, and Rails (which is the first application I'm going to be layering over the database) is unwieldy when dealing with tables that don't observe its conventions (as well as presently having poor to nonexistent support for multi-columnar primary keys). This seems like a pretty common real-world pattern, except in the real world the designers never bother to figure out the natural keys first. ;) Received on Fri Jan 20 2006 - 23:09:33 CET

Original text of this message