Re: theoretical question on the RDBMS
Date: 16 Aug 2002 04:54:10 -0700
Message-ID: <51d64140.0208160354.5e78dd96_at_posting.google.com>
"David Cressey" <david_at_dcressey.com> wrote in message news:<kFS69.68$lq1.8187_at_petpeeve.ziplink.net>...
> Paul,
>
> > I think perhaps (guess who's been reading the dbdebunk website) this
> > is something that should be done at the physical, not logical level.
> > i.e. behind the scenes the database could implement "sparse" columns
> > in a separate "file" for performance reasons but at the logical (SQL)
> > level it should be in the same table.
> >
>
> I'd suggest that there are both logical and physical implications to columns
> which may contain NULLS.
>
> The physical implications have to do with sparseness, as you remarked. In
> addition to whatever space may be saved, there may be speed considerations
> when retrieving either or both tables, if they are separated into two
> tables.
>
> There are also logical implications to storing the optional column in a
> separate table, along with the primary key.
> First, every truly optional column (one where NULLS might actually appear)
> is a departure from the strict relational model. In a strict relational
> model, the tables represent relations. And, in a relation, there are no
> NULLS.
> Second, every truly optional column represents an outer join of some kind,
> and therefore a departure from full normalization.
>
> The above may or may not be reasons to choose one logical design over
> another, but there are consequences, at the logical level, to either design
> choice.
OK, maybe the example of a "mainly NULL" column was a bad one. What about cases where the column is NOT NULL but most of the entries are the same? e.g. the original example was an entry in a "persons" table for "language spoken". Or say a column called "HasAWoodenLeg" - maybe not an ideal example because it's just a boolean value. Or one where an actual value is used to represent a kind of "none of the above" entry e.g. "FavouritePolitician" where most of the entries would be "None" ;-)
Maybe also they would be modelled with separate "language" or "politician" tables with foreign key constraints but I don't think this drastically alters the argument.
I'm sure there are better examples but now the column isn't optional at all but the vast majority of the entries are identical.
Now they might be a physical argument for having two "tables" but would there be any logical reason for splitting the table?
Paul. Received on Fri Aug 16 2002 - 13:54:10 CEST
