Re: Storing data and code in a Db with LISP-like interface

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Mon, 17 Apr 2006 13:54:52 GMT
Message-ID: <M6N0g.61220$VV4.1138577_at_ursa-nb00s0.nbnet.nb.ca>


Marshall Spight wrote:

> Bob Badour wrote:
>

>>Marshall Spight wrote:
>>
>>
>>>If we have a requirement to distinguish between
>>>exists-but-we-don't-know
>>>and doesn't-exist, then the cardinality-0 model is insufficient. But
>>>there
>>>are also cases where we won't need to so distinguish, and cases
>>>where doesn't-exist is not a possibility. In those cases, I would
>>>propose that cardinality-0 is often a better choice than SQL's NULL.
>>
>>Yes, well, you are not putting the bar very high. Now, are you? Besting
>>SQL's NULL for handling missing information is rather like tripping over
>>a shoelace.
>>
>>The closed world assumption is going to bite you more often than you
>>seem to suppose. Presumably, if you have a relation valued attribute,
>>you intend to use the relational algebra or calculus on it. However, the
>>closed world assumption is central to both which means they will yield
>>incorrect results in many cases unless the user takes extraordinary
>>measures to account for the not-quite-closed nature of the model's world.
>>
>>One must consider those consequences when considering relation valued
>>attributes as a means to describe missing information.

>
> I do not disagree, but I'm not sure if I'm seeing all the consequences
> you refer to. I agree that how the algebra behaves is important.
>
> Lame example:
> Given a Persons relation with an Age attribute. (I know it is better
> to have a Birthdate attribute, but for this artifical example, this
> will
> suffice.) The Age attribute is an RVA of a single int attribute with
> an empty key (that is, it is either an int or empty, indicating we
> don't
> know the age.) If we wanted to calculate the average age of
> Persons, and we took sum(Age) / count(Age), (waving my hands
> around the nest/unnest issue), then it would seem to me that
> what you would most likely want to know is, of the people for whom
> the age is known, what is the average, which is exactly what
> I'd expect the RVA-Age schema to give you.

Your example works exactly as NULL does, and I consider the result incorrect. The average age of the entire cohort is unknown, and it is misleading to pretend otherwise.

Consider the situation where one also has a (possibly unknown) Weight attribute. If one queries for average Age, one gets a numeric answer. If one queries for average Weight, one gets a numeric answer. If one queries for average Age and average Weight, one gets two numeric answers that may not equal the answers above.

This sort of thing has been a source of many errors I have been asked to correct. Basically, it breaks the identity: SUM(A) + SUM(B) = SUM(A+B)

Some folks get really antsy when the numbers don't balance.

>>>I don't think it's a complete solution, though. I think some kind of
>>>sum type, as in SML, is also quite desirable. This lets the
>>>modeller create special values according to the requirements
>>>of the domain.
>>
>>I could not find a comprehensive enough reference for SML online to
>>decypher what you are saying above. Is a sum type similar to a union
>>type of some sort?

>
> Yes; specifically a tagged union, unlike the untagged C union.
> Here's a decent article:
>
> http://en.wikipedia.org/wiki/Tagged_union

Yes, I agree. That would be very useful. Received on Mon Apr 17 2006 - 15:54:52 CEST

Original text of this message