Re: Proposal: 6NF

From: dawn <dawnwolthuis_at_gmail.com>
Date: 8 Oct 2006 14:54:31 -0700
Message-ID: <1160344471.664311.321760_at_m73g2000cwd.googlegroups.com>


Marshall wrote:
> In my not-so-humble opinion, this whole thread is a bunch of
> useless talking-past-each-other. The word "null" (like all other
> words) means different things in different contexts,

Yes, so we ought to be able to put our definitions down and see where there are differences.

> and "null"
> in particular means a variety of different, *incompatible* things.
> It seems to me that everyone is assuming it means the same
> thing to them as it does to the person they're talking to,

I tried to be clear that I would like definitions. I was contesting a blanket statement (perhaps by JOB, IIRC) that, taken at face value and without specific definitions for its terms, asserted something completely inaccurate. While I understand that there are differences in the definition of NULL, I had no idea there were such differences of opinion on the definition of "function", although I have yet to see Cimode's definition.

Would you agree that if NULL is a value in a particular context (e.g. where the logic in working with that value is to treat it as the empty set), then there surely can be functions where the output of the function is a NULL value?
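
To make the question concrete, here is a minimal Python sketch, assuming a model in which the empty set plays the role of NULL. The names PHONES and phone_numbers are hypothetical, chosen only for illustration, and are not taken from any particular DBMS.

```python
# Hypothetical data: each person maps to a set of phone numbers.
PHONES = {
    "alice": frozenset({"555-0100"}),
    "bob": frozenset(),  # recorded, but with no phone numbers
}


def phone_numbers(person: str) -> frozenset:
    """Return the set of numbers for a person.

    Assumption for this sketch: the empty set stands in for NULL.
    """
    return PHONES.get(person, frozenset())


# Every input yields exactly one output, so this is an ordinary function
# whose output happens to be the "NULL" value for some inputs.
bob_numbers = phone_numbers("bob")
```

Under this assumption, comparing two such outputs for equality is ordinary set equality and always yields True or False.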

> at
> least some of the time.
>
>
> [context] term
> -----------------
> [math] empty set
> [SQL] null
> [3VL] unknown
> [Java] null
> [lisp] nil
> [C] pointer 0
> [type theory] bottom
> [SML] Maybe algebraic data type
> [Haskell] Maybe monad
> [Nested RA] value with an empty determinant-set functional dependency
> [TTM] Table Dum
> [TTM] omega
>
> ... and probably many more.
>
> Every one of these is a distinct concept, with distinct semantics.

Agreed.

> None of them can be separated from the underlying theory.

Agreed.

> It only makes sense to consider these within the context of
> the theory they are embedded in, and a direct comparison that
> doesn't take into account how well the concept fits into its
> theory is leaving out important details. It is simply pointless
> to talk about one being better than the other without context.

In this case I was only trying to describe a scenario/theory where NULL is a value, that Cimode seems unfamiliar with, using terms that relational theorists typically understand.

> What we have to do instead is consider the context, consider
> the desired behavior, consider the possible failure modes,
> and measure the semantics of any given proposal within
> the context of a particular theory. We can't even do that
> without an agreed-upon set of use cases.

I suspect that those who wish to use 3VL will disagree with those who wish to use 2VL on use cases, but it might be worth a try to mock some up. Here is a requirement:

When a developer compares two attribute variables from a relation accessed via a DBMS for equality using a compiled language, given that their code compiles successfully, they will know that the result will be either True or False.

E.g.
The condition (a == b), where a refers to one relation variable attribute, b refers to another, and == is equality, must be either True or False.

This requirement would lead us to 2VL. How would you write a complete set of use cases so as not to force a conclusion of either 2VL or 3VL?
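
A rough Python sketch of the difference, assuming None stands in for a missing attribute value. The function names eq_2vl and eq_3vl are hypothetical; the second one only imitates SQL-style comparison and is not any product's actual implementation.

```python
from typing import Optional


def eq_2vl(a: Optional[int], b: Optional[int]) -> bool:
    """Two-valued equality: the missing marker is treated as an ordinary value."""
    return a == b  # always True or False, even when a or b is None


def eq_3vl(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Three-valued equality: any comparison involving a missing value is unknown."""
    if a is None or b is None:
        return None  # None here plays the role of the third truth value
    return a == b
```

With eq_2vl the developer can rely on a boolean result, which is what the requirement asks for. With eq_3vl every caller has to handle a third outcome.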

> I don't think we've got anything approximating consensus
> on what the desired behavior is.
>
> So: what is the desired behavior? What are the use cases?

I indicated one desired behavior above. Here is another one:

If a DBMS combined with a DBMS schema permits a developer to persist a tuple value in a relation when the developer and their executables do not know the value of each attribute, the developer must have a standard way of expressing the fact that there are data values missing in a stored "proposition", for each attribute for which the attribute value is missing.
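
One possible shape for such a "standard way", sketched in Python under the assumption that a tuple is modeled as a dict and a single marker object flags each missing attribute. MISSING, missing_attributes and the sample attribute names are all hypothetical.

```python
class _Missing:
    """Single marker meaning 'a value for this attribute is missing'."""

    def __repr__(self) -> str:
        return "MISSING"


MISSING = _Missing()


def missing_attributes(tup: dict) -> list:
    """Return the sorted names of attributes whose value is missing."""
    return sorted(name for name, value in tup.items() if value is MISSING)


# A stored "proposition" in which two attribute values are not supplied.
person = {"name": "Pat", "birth_date": MISSING, "phone": MISSING}
gaps = missing_attributes(person)
```

The marker is recorded per attribute, so the stored tuple itself says which parts of the proposition are incomplete.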

Here's one that involves a big trade-off; I would not suggest this requirement had I not used various approaches myself:

The developer must be able to decide, for each attribute, whether or not the fact that it is known that there is no value is expressed identically to the fact that it is not known what the value is.

Alternatively, there could be a requirement to be less flexible, such as

The developer must have a standard way of designating that it is known that there is no value for a particular attribute that is distinct from the designation that it is not known what the value is.

This has the potential to be less ambiguous and, therefore, more accurate, but there is a related cost which might not have a corresponding benefit for any particular attribute (thus the requirement that I gave to permit the developer to decide this for each attribute).
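
A small Python sketch of the per-attribute choice, assuming three markers: two distinct ones and one undifferentiated one. Absent, DISTINGUISH, store and the attribute names are hypothetical illustrations, not a proposal for actual syntax.

```python
from enum import Enum


class Absent(Enum):
    NO_VALUE = "known to have no value"
    UNKNOWN = "value is not known"
    MISSING = "no value recorded, reason not distinguished"


# Hypothetical schema: True means the attribute keeps the two cases distinct.
DISTINGUISH = {"middle_name": True, "fax": False}


def store(attribute: str, value):
    """Return the value as it would be persisted for the given attribute.

    For attributes that do not distinguish the two cases, both markers
    collapse into the single undifferentiated marker.
    """
    if isinstance(value, Absent) and not DISTINGUISH.get(attribute, False):
        return Absent.MISSING
    return value
```

Here middle_name pays the cost of tracking two markers, while fax keeps the simpler single-marker behavior.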

And so on and so on.
--dawn

Received on Sun Oct 08 2006 - 23:54:31 CEST
