Re: Question on Structuring Product Attributes

From: Hugo Kornelis <hugo_at_perFact.REMOVETHIS.info.INVALID>
Date: Tue, 18 Oct 2011 21:29:37 +0200
Message-ID: <e5jr975rcsni1bojb4lvr2neu9i11k4tgc_at_4ax.com>


On Wed, 12 Oct 2011 17:51:11 -0700 (PDT), Derek Asirvadem <derek.asirvadem_at_gmail.com> wrote:

(...)
>----------
>2. The case of Sedans and TwoDoor.
>----------
>
>(Staying with the posted method.) I do not believe it is a good idea
>to mix vehicle_type {SED|SUV}, with sedan_type, they are two separate
>Things. We already know it is a sedan because it is in the Sedans
>table, likewise, every row in TwoDoor is a two-door sedan (not an
>SUV). A hierarchy, which I agree with, would be:
>
>CREATE TABLE Sedans (
> vin CHAR(17) NOT NULL
> PRIMARY KEY,
>   sedan_type CHAR(3) DEFAULT '2DR' NOT NULL
>       CHECK (sedan_type IN ('2DR', '4DR') ),
>   FOREIGN KEY (vin)
>   REFERENCES Vehicles(vin)
> CONSTRAINT Sedan_Excl_ck
> CHECK ...
> )
>
>CREATE TABLE TwoDoor (
> vin CHAR(17) NOT NULL PRIMARY KEY,
>   FOREIGN KEY (vin)
>   REFERENCES Sedans(vin)
> CONSTRAINT 2dr_Excl_ck
> CHECK ...
>
>As the the model progresses, VehicleType and SedanType will be
>Normalised into separate Entities, and Vehicle and Sedan
>(respectively) will have Foreign Key References to them.

Hi Derek,

I've tried to follow your debate with Joe Celko with interest, though it's sometimes hard to keep track of the arguments in between the mud-slingin.

As Joe can confirm, I often disagree with him, on many subjects. But when I first saw him post the subtype/supertype implementation model you criticise, I liked it for its robust protection agains integrity violations. You are correct that the redundant columns and the extraneous keys require more disk space and hurt performance, but I always think that integrity comes first, all the rest comes later.

But I have of course an open mind for better alternatives, and you seem to have one. Unfortunately, the most relevant part for integrity is left out of your post, or rather replaced by ellipsis. Given the two tables above (plus the implied vehicles table), how will the following (inconsistent) transaction be rejected:

BEGIN TRANSACTION;
INSERT INTO Vehicles (vin, vehicle_type) VALUES ('12345678901234567', 'SUV');
INSERT INTO Sedans (vin, sedan_type)
VALUES ('12345678901234567', '4DR');
INSERT INTO TwoDoor (vin)
VALUES ('12345678901234567');
COMMIT TRANSACTION; I think you intend the constraints Sedan_Excl_vk and 2dr_Excl_ck to prevent this, but you left out their specification. In a complete ANSI SQL implementation, I could easily create this constraint, as the ANSI SQL standard permits the use of subqueries in a CHECK constraint (SQL-92: full compliance; SQL-2003: Feature F671, "Subqueries in CHECK constraints"). But the product I work with (Microsoft SQL Server) does not implement this feature, and a quick Google search gives me the impression that most (if not all) major vendors suffer from the same limitation.

If your point is that in a fully ANSI-compliant RDBMS, the version Joe posted suffers from needless redundancy (just like this sentence), I agree. But if you have a way of coding the CHECK constraint that is supported by some, or preferably all, major current RDBMS implementations and that will disallow the inconsistent data I gave above, please let me know. (Or, if you already gave it and I missed it when wading through the discussion, please tell me where I can find it).

-- 
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Received on Tue Oct 18 2011 - 21:29:37 CEST

Original text of this message