Love or hate, or? domains with cardinality two

From: Nicola <nvitacolonna_at_gmail.com>
Date: Mon, 2 Nov 2015 18:29:54 +0100
Message-ID: <n186ig$36t$1_at_adenine.netfront.net>



Not long ago, I had a quick chat with a logician, where we talked about boolean domains (domain with two elements) in connection to logic and databases. All started from an instance of a relational schema:

Employee(id, dept, mgr?)

where mgr? is boolean. A tuple (x,y,true) means that employee x belongs to department y and x is a manager of y. A tuple (x,y,false) means that employee x belongs to department y but x is not a manager of y.

I was arguing about how schemas like that are commonly found in practice, and about how I feel uncomfortable with them, because, let apart the obvious difficulties with enforcing some integrity constraints (e.g., a department should have one and only one manager at a time) in current DBMSs, truth of a predicate is coded as a value of a domain. Suprisingly (for me), she did not dismiss the approach at once, but pointed out that you are trading a predicate (Manager(id)) for a couple of new constant symbols and the two things should be essentially interchangeable (from a logical point of view).

Not long after that, I stumbled across one of those lengthy arguments by Chris Date, in which he critiqued a schema like:

Loves(x,y) ("x loves y")
Hates(x,y) ("x hates y")

on the ground of a principle by which each tuple should be insertable only in one schema of a database (in the example above, ('Romeo', 'Juliet') may be inserted in both), and he proposes to use a single schema R(x, z, y) where z ranges over {'loves','hates'}.

I am still uncomfortable with boolean domains, but I can't pinpoint the exact reasons. In the Loves/Hates example I don't see a particular advantage in using a single schema, because in either case you must impose similar kinds of constraints (if x loves y then x doesn't hate y...); the only difference is that in the first case those are interrelational constraints, and you might say that they are not as "easy" to enforce as intrarelational constraints (is it true?).

In the Employee example, boolean values may lead to redundancy, but of a kind that is not captured by the usual normal forms. For example, if each department may only have one manager, then in this instance:

id dept mgr?



1 Math true
2 Math ?
3 Math ?

all the values denoted by ? must necessarily be false. This schema is in 5NF, however (it is not in DKNF because it has an insertion anomaly, e.g, inserting (4, 'Math', true) leads to an inconsistent instance). Of course, this argument doesn't hold if a department may have many managers.

But maybe my intuition is wrong. What do you think? Are there compelling reasons to avoid boolean domains in logical database design?

Nicola

Received on Mon Nov 02 2015 - 18:29:54 CET

Original text of this message