Re: Is multi-type data in one field normal?
Date: 12 Apr 2003 17:22:03 -0700
Message-ID: <c0d87ec0.0304121622.5e65e652_at_posting.google.com>
>> Inspite of various claims, it is kind of normal to store different
I agree with that statement. That is one of the differences between
FIELDS and COLUMNS in a relational database. A column has a scalar
value of one and only one type by definition. A file system with
variant records, an OO system with typing rules, or a NFNF database
might be able to store different unrelated data in a single field;
doing in an SQL database involves a kludge to work around the system
and the relational model.
unrelated data in a single field. <<
>> And don't pay too much attention to Mr. Celko's, his understanding
You can read about my view of data types in DATA & DATABASES, but more
to the point, this is being done in an SQL database
of data types is probably limited to SQL's view of data types which is
very primitive, to say the least. <<
>> He apparently has a lot of trouble to come to terms with the
The Boolean/ multi-value logic thread is irrelevent to this thread,
but ignoring the "ad hominem", SUM() is a function on a numeric set;
it is not a datatype at all.
boolena data type so asking him to understand the concept of a sum
type is perhaps too much. <<
>> ... you should design the solution the best you can, and then find
Agreed.
>> To paraphrase the principle, you should not design your schema in
Wrong. The conceptual model is at one level and that is where the
system is designed. This leads to a logical model, and finally I *do*
design the schema in SQL at the implementation level. You are
advocating kludges that skip over the various levels. (I have to do
this "cut & paste"; it is too good not to use!):
Kludge or kluge: n. Slang
Sources:
From the old Scottish word "kludgie" meaning an outside toilet; A
Scottish engineering term for anything added in an ad hoc manner; the
spelling "kludge" adapted by American enginners in World War II.
"How to Design a Kludge", Jackson Granholme, Datamation, February
1962, pp. 30-31], which defined it as "An ill-assorted collection of
the most suitable "translation" that fits the limitations of your
particular environment. <<
SQL, but rather design your schema as if you had the perfect DBMS in
front of you, and after that translate it to SQL. <<
>> The fact that SQL's type system is limited should not prevent you
.. as long as those SQL facilities match the relational translation
from the conceptual model.
from using a better type system in the abstract (including union
type), and then "encode" or translate your design using whatever SQL
facilities you see most fit for that purpose. <<
>> There's nothing wrong in principle with having a column that is a
Yes, there is. It's called First Normal Form (1NF) and it required
scalar, atomic values from one domain. This kind of nightmare where
each row is fundamentally different leads to weird SQL queries and
falls apart in about a year. Every typo becomes a new datatype, every
query has to have a monster CASE expression to read the table, the
column is declared with the most general datatype possible (it used to
be VARCHAR(255) but now you see VARCHAR(8000) a lot in SQL Server) and
has to be CAST() to strings, numerics and temporal types, etc.
type discriminant (in your case StatisticsTypeId) and a second column
(StatisticAmount) who's semantics depends on the type discriminator.
<<
>> Although it apparently "looks" bad it is a perfect "translation" of
When you are writing code in ML, Haskell, etc. then use a union type.
I still remember the problems they caused in "C", but I will azssume
that newer, higher level language have the proper type checking and
boundary protections. But back in the real world, databases are in
SQL.
union type ideas who is perfectly sound and validated both in type
theory and in practical languages like ML, Haskell and most advanced
programming languages of today support union types. <<