Re: Are redundant fields ever appropriate?

From: Clifford Heath <cjh_nospam_at_osa.com.au>
Date: Wed, 14 Nov 2001 13:55:32 +1100
Message-ID: <3BF1DD24.AC118996_at_osa.com.au>


> > Is this something I'd ever want to do? > No.

I disagree. The join is implicit in the foreign key reference, but sometimes when three tables are frequently joined, one of the tables should be clustered by a key which isn't naturally a foreign key in that table. The data is denormalised, but that's necessary for the DBMS to produce appropriate clustering. Here's an example simplified from one I used recently with MS SQL Server. It records Organisation Units as SDOU (Site/Domain/OrganisationUnit) using X.500 DNs, then records Computers belonging to those SDOUs, then records Occurrences of known Software objects on those computers. The catch is that the most important query is to summarise over a single SDOU. The query is easy to write without the non-normal SDOUID in SoftwareOccurrences, and fairly efficient to perform, but clustering the SoftwareOccurrences records by SDOU makes it more efficient. Clustering by ComputerID isn't as big a win. Any thoughts on this?

Apologies for using SQL-Server extensions in the form of IDENTITY fields - this reduces the keys to a manageable size. What equivalent support is there in other DMBSs (Oracle?) or in the SQL standard for this sort of thing (apart from allocating unique IDs manually)?

create table SDOU (

  • Auto generated identity number OUID int NOT NULL IDENTITY CONSTRAINT PK_SDOU PRIMARY KEY NONCLUSTERED,
  • OU String (DistinguishedName) OU varchar(800) NULL ); create unique clustered index SDOUByName on SDOU(OUID, OU);

create table Computer (

  • Auto-generated identity # ComputerID int NOT NULL IDENTITY CONSTRAINT PK_Computer PRIMARY KEY NONCLUSTERED,
  • CN of the Computer ComputerCN varchar(32) NOT NULL,
  • OU where Computer belongs ComputerOUID int NOT NULL CONSTRAINT FK_Computer_OU REFERENCES SDOU(OUID) ); create unique clustered index ComputerByName on Computer(ComputerOUID, ComputerCN);

create table SoftwareVersion (

  • Auto-generated identity # SoftwareID int NOT NULL IDENTITY CONSTRAINT PK_SoftwareVersion PRIMARY KEY NONCLUSTERED,
  • Which software? SoftwareName varchar (128) NOT NULL,
  • Software Version? Version varchar (32) NOT NULL ); create unique clustered index SoftwareVersionByName on SoftwareVersion (SoftwareName, Version);

create table SoftwareOccurrence (

  • SoftwareVersion SoftwareID int NOT NULL CONSTRAINT FK_SoftwareOccurrence_Software REFERENCES SoftwareVersion(SoftwareID),
  • Computer where the SoftwareVersion was found, FK to Computer ComputerID int NOT NULL CONSTRAINT FK_SoftwareOccurrence_Computer REFERENCES Computer(ComputerID),
        /*
         * FK to OU of Computer. This is denormalised, since it's
         * dependent on the ComputerID in this row, but useful for
         * clustering!
         */
        ComputerOUID int NOT NULL
                CONSTRAINT FK_SoftwareOccurrence_ComputerOU
                        REFERENCES SDOU(OUID)

	-- other fields...

);
create unique clustered index SoftwareOccurrenceBySWComputer

        on SoftwareOccurrence(SoftwareId, ComputerOUID, ComputerID, UserID);

--
Clifford Heath, ManageSoft Corporation
Received on Wed Nov 14 2001 - 03:55:32 CET

Original text of this message