Re: Network Example: Sibling of Opposite Gender

From: Sampo Syreeni <decoy_at_iki.fi>
Date: Fri, 5 Jan 2007 11:55:35 +0200
Message-ID: <Pine.SOL.4.62.0701041557570.4002_at_kruuna.helsinki.fi>


On 2007-01-03, Marshall wrote:

> If your domain model specifies things to a greater degree than NULL
> can, then I don't see why you *wouldn't* want to join on equality.

Normally the domain model doesn't control this sort of thing. It can be used to restrict which collections of columns you can join on, but not how values join within a column. Of course you could expand it by designating unknown values and the like for a domain, but if you then implemented the unknown value semantics, all you'd have would be a number of conventional nulls going by different names.

Codd's idea was that the common features of missing values are implemented by the DBMS as null(s), so that you don't have to. It does mean that the DBMS now has to have an equality predicate that is aware of what amounts to a purely semantic concern. It also leads to multivalued logic. Both effects are counter-intuitive, they may even be counter-productive, and they can be avoided at the RM level by sticking to 2VL and no nulls. The latter is precisely what the decomposition approach is about. It's just that then a concern that seems to be rather common has to be implemented over and over by the user on top of the semantically poorer data model.

> One thing I like about tagged union values is that, unlike null, they
> always really are values. You get more expressiveness yet simpler
> semantics.

Not so, because what I think of as "expressiveness" comes from more complicated semantics. Simple semantics on the other hand mean we're working with a simplified, or more abstract, representation of the data, which is then incapable of making some distinctions that a data model with richer semantics could.

That is, you can always put in an enumerated value like MISSING_VALUE. But then that value won't have the join semantics that a non-value like a null does. The reason Codd went with nulls is that he *wanted* all of those nasty non-value, 3VL (even 4VL) behaviors there, *because* those behaviors are precisely the semantics that *define* missing information. If the nastiness cannot be there, neither can missing information, semantically speaking. You can emulate it by always including the appropriate conditions with every invocation of the equality predicate in a join where your chosen representation of missing information could be present ("and a.salary!='UNKNOWN' and b.salary!='UNKNOWN'"), but then you're just reimplementing a null in 2VL.
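
To make the difference concrete, here is a minimal sketch using Python's sqlite3 module. The table names (with_null, with_sentinel), the sample rows and the helper same_salary_pairs are made up for illustration; only the 'UNKNOWN' marker and the extra join conditions come from the text above. It self-joins on salary equality three ways: with real nulls, with a sentinel value, and with a sentinel plus the hand-written guards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: two people whose salary is not known.
    create table with_null (name text, salary integer);
    insert into with_null values ('ann', 100), ('bob', null), ('cid', null);

    create table with_sentinel (name text, salary text);
    insert into with_sentinel values
        ('ann', '100'), ('bob', 'UNKNOWN'), ('cid', 'UNKNOWN');
""")


def same_salary_pairs(table, extra=""):
    """Return pairs of distinct people whose salaries compare equal."""
    sql = f"""
        select a.name, b.name
        from {table} a join {table} b
          on a.salary = b.salary {extra}
        where a.name < b.name
        order by a.name, b.name
    """
    return conn.execute(sql).fetchall()


# null = null is not true, so the two unknown salaries do not join.
null_pairs = same_salary_pairs("with_null")

# 'UNKNOWN' = 'UNKNOWN' is an ordinary true comparison, so they do join.
sentinel_pairs = same_salary_pairs("with_sentinel")

# Reimplementing the null behavior by hand, as described above.
guarded_pairs = same_salary_pairs(
    "with_sentinel",
    "and a.salary != 'UNKNOWN' and b.salary != 'UNKNOWN'",
)

print(null_pairs, sentinel_pairs, guarded_pairs)
```

The sentinel version pairs up bob and cid as having "the same" salary, which is exactly the conclusion missing information should not support; the guarded version gets rid of it, at the cost of repeating the guard in every such join.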

>> Wouldn't a type-aware equality predicate suffice as well? E.g.
>>
>> select sum(salary) from pers_info where salary in Integer;
>
> No, because what if two of the tags use the same type?

select sum(salary) from pers_info where salary in Integer2;

What you do with tags, I do with type names. That's getting rather close to the universal relation scheme assumption, I know, but then both URSA and the domain abstraction were largely motivated by the same reasons.

> Maybe you have people paid with dollars and people paid with store
> credit:
>
> SALARY{ SALARIED(integer), CREDIT(integer), UNPAID }

select sum(salary) from pers_info where salary in SALARIED;

If you conceive of tagged unions as instances of disjoint unions, the latter have to come from somewhere. They'll have to be constructed from some collection of existing domains, which always have names. The usual way we deal with (disjoint) unions in math is that they're just sets of all the possible instances, so embedding that into RM, you get precisely "where salary in <name of one of the underlying sets>". Your notation would then be short for something like

create domain Credit as derived type of Integer;
create domain Salaried as derived type of Integer;
create domain Unpaid as enumeration (UNPAID);
create domain SalaryType as disjoint union of (Credit, Salaried, Unpaid);
create table pers_info (PersonName String, Salary SalaryType);
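
As a rough illustration of the same idea outside SQL, here is a sketch in Python where each named domain is a class and "salary in <domain>" becomes a membership test on the class. The sample rows and the helper sum_salary_in are hypothetical; the domain names follow the declarations above. Python here only stands in for a DBMS with such a domain system, it is not a claim about how one would implement it.

```python
from dataclasses import dataclass
from typing import Union


# Two distinct domains derived from the same underlying integer type.
@dataclass(frozen=True)
class Salaried:
    amount: int


@dataclass(frozen=True)
class Credit:
    amount: int


# A one-element enumeration domain.
@dataclass(frozen=True)
class Unpaid:
    pass


# The disjoint union: a value belongs to exactly one of the named domains.
SalaryType = Union[Salaried, Credit, Unpaid]

# Hypothetical contents of pers_info (PersonName, Salary).
pers_info = [
    ("ann", Salaried(100)),
    ("bob", Credit(100)),
    ("cid", Unpaid()),
    ("dee", Salaried(250)),
]


def sum_salary_in(rows, domain):
    """select sum(salary) from rows where salary in <domain>.

    Only meaningful for domains that carry an amount (Salaried, Credit).
    """
    return sum(s.amount for _, s in rows if isinstance(s, domain))


salaried_total = sum_salary_in(pers_info, Salaried)
credit_total = sum_salary_in(pers_info, Credit)
print(salaried_total, credit_total)
```

Salaried(100) and Credit(100) wrap the same integer yet stay distinguishable by domain name alone, which is the point of the reply to "what if two of the tags use the same type?".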

-- 
Sampo Syreeni, aka decoy - mailto:decoy_at_iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Fri Jan 05 2007 - 10:55:35 CET
