Re: 3vl 2vl and NULL

From: Hugo Kornelis <hugo_at_pe_NO_rFact.in_SPAM_fo>
Date: Sat, 10 Dec 2005 00:05:31 +0100
Message-ID: <ig1kp1po0o5tqld38abgo80l6ha60qs46e_at_4ax.com>


On 9 Dec 2005 07:43:37 -0800, dawn wrote:

>
>Hugo Kornelis wrote:
(snip)
>> "The age of Aunt Marge is 47 years" is a fact.
>> "The age of Uncle Vernon is unknown" is also a fact - but it's not the
>> same fact type as the former.
>
>I'll refer to that statement below.

Hi Dawn,

So will I, so I didn't snip it.

>> And since they are different fact types,
>> they should be stored in differrent columns. Attempting to store them in
>> the same column would be a violation of 1NF.
>>
>> Or, to put it differently, and almost in the words of Codd:
>>
>> The meaning of the fact that (and the reason why) a value is missing
>> from some part or column of a relational database is quite different
>> from the meaning of a value that is legitimate within that part or
>> column. And since the meanings are different, they can never be combined
>> in the same column.
>
>I don't think this necessarily follows. You are suggesting that the
>meaning of the proposition is different. That would mean that the
>propositions should be modeled by different Relations. The "types" of
>your "facts" are different implies separate Relations, right?

Yes, that's what I meant. My modeling knowledge is based on learning a Dutch variant of Object Role Modeling, with Dutch terminology. It's sometimes hard to find the proper terms when you have to translate between ORM and Relational AND translate Dutch to English at the same time.

>
>>
>>
>> > Don't try to tell me
>> (snip)
>>
>> I'll tell you, or anybody, how I think about this subject. If you rather
>> not read my opinions, skip my messages or killfile me.
>
>I'm enjoying them.

Thanks!

>
>>
>> > Much of the discussion between
>> >Codd & Date (see link to article in dbdebunk in the other thread)
>> >concerned the *fact* that NULL can be interpreted to mean things other
>> >than simply "unknown".
>>
>> I won't deny that NULL *can* be interpreted to mean other things than
>> simply "no value here". Your "unknown" is one of them.
>>
>> For a lot of NULLs in my tables, I know why there's no value in that
>> cell. I know it, because I understand the real-life entities and their
>> properties that are represented by the values in my database. But that
>> understanding is not part of the NULL marker in the DB itself.
>
>I don't have a problem with including a NULL marker, but with its
>interpretation in the languages of the database. It is feasible to
>include a null and yet use a 2VL. Sure there are issues, but not so
>many as with a 3VL and it is far simpler to understand and employ.

Are you sure?

"Show me all family members who are older than Aunt Marge" - should Uncle Vernon be included or not? Why?

"Show me all family members who are NOT older than Aunt Marge" - should Uncle Vernon be included or not? Why?

"Show me all family members who are older than Uncle Vernon" - should this return noone, everyone, or only some? Why?

In all these cases, you run into the problem that Uncle Vernon's age is not in the database. With or without databases, fact remains is that you don't know if Aunt Marge is older, younger or the same age as Uncle Vernon. This proves that, as soon as you allow values to be missing in your database, True and False are no longer the only possible results of a predicate. Reality forces 3VL on you.

Using 2VL might be simpler to understand and employ, but it can't properly depict all facets of reality. I prefer a model that is faithful to the reality I try to model over a model that's easy but wrong.

>When DeMorgan's law goes out the window, the data and interpretation
>become unnecessarily complex (for most applications and most people).

I must be missing something. How exactly does DeMorgan's law cease to apply with 3VL?

(snip)
>> > If I look at my empty mug I can truthfully say
>> >that there is an absence of any liquid in it - because it's empty.
>
>Sortof like an empty set, eh?
>
>> And that fact can truthfully be represented in a column MugEmpty with
>> domain {'Yes', 'No'}, or in a column MugContents with a numeric domain
>> that stores the amount of liquid (in cc) in your mug.
>>
>>
>> > That
>> >does *not* mean I don't know what's in it. I don't have to look for any
>> >other facts. It is, very plainly and obviously, empty. If I have a rule
>> >that I only store data (known facts) in a database then I'll know that
>> >where I have a NULL then the NULL represents an absence of any value.
>> >It most certainly will not signify that I have an unknown value. It
>> >will signify that there is no value where one would normally be
>> >expected.
>>
>> Exactly! That's what I've been saying all the time: NULL represents the
>> absence of a value. Since the question "what is in your mug" can't be
>> answer currently, there is no value to store in the cell that is
>> reserved to hold this answer. It'll be NULL - no value here. This cell
>> in the database is empty.
>
>Or if you want to have values for every cell so that you can answer
>comparison questions, and if the dbms permits sets in cells, then
>thinking of the cell has holding an empty set would make sense, right?

*IF* the database allowed to store a set in a cell, then it might make sense to store an empty set for an empty mug. And it would also make sense to store the contents of my mug as {Coffee}, but the contents of my wife's mug as {Coffee, Sugar}.

In SQL databases, sets can't be stored in a cell. If the complete set of contents of the mug has to be stored, the first rule of normalisation requires us to use a seperate table. My wife's mug would result in two rows. My mug in one row. And Mike's empty mug would get no rows at all in this table, since that's how SQL databases store an empty set.

But the original discussion was about NULL being used to represent a missing *scalar*.

>
>>
>> > It will signify that the "thing" is empty.
>>
>> If "thing" refers to the cell in the database: indeed.
>>
>> If "thing" refers to your mug: NO. There can be many reasons why the
>> answer to the question "what is in your mug" is not in the database.
>> Maybe you refused to answer. Maybe I forgot to ask. Maybe you didn't
>> know. Maybe the mug was empty.
>
>If I KNOW it is empty, it seems like a value that represents this
>knowledge might be a better approach. This is easy with a char
>attribute which could be coded with EMPTY or NA or NONE, but not as
>easy in a strongly typed environment with numeric types.

All these values (EMPTY, NA, NONE) are not in the domain of valid liquids. They should therefore be disallowed in this column (either by a CHECK constraint or by a FOREIGN KEY constraint).

(snip)
>> Using
>> NULL to say that there definitely are none might be valid in Pick (I'll
>> just have to trust you on that, as I don't know PicK),
>
>no, no, I'll differ on that

Since I don't know Pick, I'll have to believe everything Mike says about it. I'll also have to believe everything you say about it. With the interesting result that I am now totally believeing two mutually exclusive statements about Pick. <g>

>
>> but it's
>> absolutely not valid in an SQL table. In SQL, storing NULL means
>> removing the fact "New Zealand has 0 nuclear warheads" from the table.
>> That would place New Zealand -incorrectly!- in the group of countries
>> that have no data at all (either because there is no reliable
>> information, or for whatever other reason - I don't kinow, though the
>> users of this DB will).
>>
>> And if either you, with your Pick DB, or me, with my SQL DB, have to
>> build a report, then we should make absolutely sure that we get the
>> requirements straigth - does Mr. President want a list of all countries
>> that we *know* to have WMD, or does he wish to also include all
>> countries that *might* have WMD. Your query will look different from
>> mine.
>
>Does NULL also mean "it could be something"? ;-)

The only thing NULL means is "no value here". All other meanings are interpretations by the end user.

In some contexts, NULL might surely mean "it could be soomething". But it'll mean different things in different contexts. Without knowledge of the context, there should be no meaning beyond "no value here".

Talking about the meaning of NULL is like talking about the meaning of 'A'.

>> MVL is like a bread knife - dangerous in unskilled hands, but invaluable
>> in skilled hands.
>
>Not at all invaluable -- quite unnecessary, in fact. Cheers! --dawn

<smile> I'll withhold my comment on the "unnecessary" part until you answered my questions about Aunt Marge and Uncle Vernon.

Best, Hugo

-- 

(Remove _NO_ and _SPAM_ to get my e-mail address)
Received on Sat Dec 10 2005 - 00:05:31 CET

Original text of this message