Re: 4 the FAQ: Are Commercial DBMS Truly Relational?

From: Christopher Browne <cbbrowne_at_acm.org>
Date: 10 Oct 2004 01:34:41 GMT
Message-ID: <2srhtgF1ldm0vU1_at_uni-berlin.de>


Quoth Kenneth Downs <firstinit.lastname_at_lastnameplusfam.net>:
> Laconic2 wrote:
>>> What are some of the other objections?
>>>
>> Another objection is this: in mathematics, a relation has no need
>> for NULLS. Every tuple is a distinct point in the same space. A
>> row with a NULL in it is projected onto a subspace of the original
>> space.
>
> Hmmm, have never needed NULLS myself, and the customers have never
> complained. I believe they can always be engineered out. Let the
> flames begin :)

One of {Darwen,Date,Pascal} wrote a paper presenting how one might do this.

It's mostly obvious; if you have some attribute that is permitted to be NULL, that means that attribute should be separated out into another table so that it expressly becomes optional. If the value doesn't exist, there will be no tuple in that secondary table.

The obvious example is that of an employee record that points to dependent children. It's DOWNRIGHT DUMB to have that folded into the employee record; it is much better to create a "dependents" table that gets joined to employees when needed.

But this transformation does not prevent there from being thorns on the rose.

The problem with that approach is that you no longer have a simple query that lists employees and their dependents where the dependents are presented as columns.

The obvious sort of query is:

  select e.employee_id, e.employee_name, d.name, now() - d.dob as age   from employees e, dependents d
  where d.parent_id = e.employee_id

Some readers of the report that results from that will get pretty antsy about the fact that some employees are getting listed multiple times.

I'm in the "Internet domain" realm; we have an analagous situation where confusion arises. Each domain has four kinds of contacts: a registrant, adminstrative, billing, and technical. The first kind is a 1:1 relationship, but with the other three, multiple contacts of the three types are permissible. That's a somewhat unusual scenario, but it does happen, and it is certainly contemplated by the data model.

If there are 3 admin contacts, 2 billing contacts, and 4 technical contacts (admittedly an unusual kind of case), there's the zowie result using simple joins to build a report that lists all four contacts is liable to list the domain 24 times. No "evils of NULLs" involved; it's just the result of the relational cross product. If someone sees an entry listed 24 times with some bizarre set of repeating values, they KNOW something's deranged here.

In SQL, it's easy enough to generate these sorts of query that are easy, attractive to use, and which are fairly deranged.

I don't think that's SQL's fault; the same temptingly simple query approach would be just as simple, tempting, and off-kilter in Tutorial D. It would be much the same in Rand /rdb (see http://cbbrowne.com/info/rdbmsmisclinux.html). It would probably be the same in APL and other array-oriented languages.

Building a query that 'feeds' this sort of multivalued column result is just plain nontrivial, at least in a declarative manner.

It's certainly doable by embedding the problem into a procedural language, where you set up some form of "control break."

But that isn't what I want. I would like to "declare" it, and neither with the clever hacks of _SQL for Smarties_ nor by embedding the hairy control logic in some scripting language.

I have nothing against Celko, but I'm not particularly interested in depending on having overly clever DBAs around to construct overly baroque SQL queries to compute things. It's good to have the ability to do such queries, but using them for "production" work smacks indeed of "doing magick." There's plenty enough magick around already. There's plenty of unmaintainable Perl. More than enough unmaintainable C++. More than enough unmaintainable Java. And ABAP/4. And so forth.

-- 
(reverse (concatenate 'string "moc.liamg" "_at_" "enworbbc"))
http://www.ntlug.org/~cbbrowne/oses.html
MS-Windows: Proof that P.T. Barnum was correct. 
Received on Sun Oct 10 2004 - 03:34:41 CEST

Original text of this message