Re: 3vl 2vl and NULL

From: dawn <dawnwolthuis_at_gmail.com>
Date: 15 Dec 2005 08:38:39 -0800
Message-ID: <1134664719.359019.30540_at_f14g2000cwb.googlegroups.com>


Hugo Kornelis wrote:
> On 13 Dec 2005 12:37:00 -0800, dawn wrote:
>
> >Hugo Kornelis wrote:
> >> On 10 Dec 2005 05:22:38 -0800, dawn wrote:
> >>
> >> (snip)
> >> >> "Show me all family members who are NOT older than Aunt Marge" - should
> >> >> Uncle Vernon be included or not? Why?
> >> >
> >> >Two choices of how to ask the question --
> >> >1) Show me all members who have an age where that age is less than or
> >> >equal to Aunt Marge's age
> >> >
> >> >In this case Uncle Vernon does not show up because he does not have an
> >> >age
> >> >
> >> >2) Show me all family members who do not have an age greater than Aunt
> >> >Marge's
> >> >
> >> >In this case Uncle Vernon does show up
> >>
> >> Hi Dawn,
> >>
> >> So the results of the query would depend on exactly how you formulate
> >> it?
> >
> >I cannot think of any situation where that is not the case, can you ;-)
>
> Hi Dawn,
>
> I have a lot more trouble thinking of a situation where it IS the case.
>
> In SQL, the following queries will return the same results:

I'll number these

1)
> SELECT Name
> FROM MyFamily
> WHERE Age <= (SELECT Age FROM MyFamily WHERE Name = 'Aunt Marge')

2)
> SELECT Name
> FROM MyFamily
> WHERE NOT (Age > (SELECT Age FROM MyFamily WHERE Name = 'Aunt Marge'))
>
>
> >> Hmmm. AFAICT, this means that in your 2VL algebra, missing data results
> >> in ~(A) not being the same as (~A).
> >
> >No, I don't think so, but before I respond to more, can you give me an
> >example that makes you think this? Thanks. --dawn
>
> See the above queries. Rules of Booolean algebra say that
> NOT (Age(x) > Age(Aunt Marge))
> should be the same as
> (Age(x) <= Age(Aunt(Marge))
>
> Your explanation made me believe that this is not the case in your 2VL
> system.

Then I did not explain it correctly or I'm missing something now.

Let's have a data set that includes
Aunt Marge, 47
Uncle Vernon, null = null set = empty set, nada Dale, 12
Dot, 70

  1. Inner select grabs the age of 47, then selects names of everyone but Dot as everyone else, including Uncle Vernon in this 2VL scenario, has an age less than or equal to 47 (null is less than 47 as surely as nothing is less than something -- that's the twist that is different than a 3VL)
  2. Inner select grabs the age of 47, then we test to see if NOT (Age >
  3. is the case. That is the case for everyone but Dot.

If the data are changed so that Aunt Marge also has a null set age, then with

  1. The Age of null set is selected and both Aunt and Uncle are chosen for this query
  2. Similarly, the only ones with NOT (Age > Marge's age) are the Aunt and Uncle.

So, the results are the same with both queries. Now, perhaps Uncle Vernon is 110, clearly older than Aunt Marge, but that fact has not yet been recorded since people were pouring over medical and legal records to verify this first. He will then incorrectly be identified as someone whose age (in this database) is less than or equal to Marge's age. When the data are not correct in the system, the results of the queries are not correct either.

I have not done any tests with this, but there is something about seeing the incorrect data on a report that helps with data cleansing, with the downside that someone might believe that Uncle Vernon is younger than Aunt Marge and this could cause some problem. There are pros and cons to each, but it really is far easier to work with a 2VL. I doubt anyone who has worked extensively with both would say otherwise.

Did that clarify? --dawn

>
> Best, Hugo
> --
>
> (Remove _NO_ and _SPAM_ to get my e-mail address)
Received on Thu Dec 15 2005 - 17:38:39 CET

Original text of this message