Re: 3vl 2vl and NULL

From: dawn <dawnwolthuis_at_gmail.com>
Date: 15 Dec 2005 14:21:18 -0800
Message-ID: <1134685278.596134.11750_at_o13g2000cwo.googlegroups.com>


Hugo Kornelis wrote:
> On 15 Dec 2005 08:38:39 -0800, dawn wrote:
>
> >
> >Hugo Kornelis wrote:
> >> On 13 Dec 2005 12:37:00 -0800, dawn wrote:
> >>
> >> >Hugo Kornelis wrote:
> >> >> On 10 Dec 2005 05:22:38 -0800, dawn wrote:
> >> >>
> >> >> (snip)
> >> >> >> "Show me all family members who are NOT older than Aunt Marge" - should
> >> >> >> Uncle Vernon be included or not? Why?
> >> >> >
> >> >> >Two choices of how to ask the question --
> >> >> >1) Show me all members who have an age where that age is less than or
> >> >> >equal to Aunt Marge's age
> >> >> >
> >> >> >In this case Uncle Vernon does not show up because he does not have an
> >> >> >age
> >> >> >
> >> >> >2) Show me all family members who do not have an age greater than Aunt
> >> >> >Marge's
> >> >> >
> >> >> >In this case Uncle Vernon does show up
> >> >>
> >> >> Hi Dawn,
> >> >>
> >> >> So the results of the query would depend on exactly how you formulate
> >> >> it?
> >> >
> >> >I cannot think of any situation where that is not the case, can you ;-)
> >>
> >> Hi Dawn,
> >>
> >> I have a lot more trouble thinking of a situation where it IS the case.
> >>
> >> In SQL, the following queries will return the same results:
> >
> >I'll number these
> >
> >1)
> >> SELECT Name
> >> FROM MyFamily
> >> WHERE Age <= (SELECT Age FROM MyFamily WHERE Name = 'Aunt Marge')
> >
> >2)
> >> SELECT Name
> >> FROM MyFamily
> >> WHERE NOT (Age > (SELECT Age FROM MyFamily WHERE Name = 'Aunt Marge'))
> >>
> >>
> >> >> Hmmm. AFAICT, this means that in your 2VL algebra, missing data results
> >> >> in ~(A) not being the same as (~A).
> >> >
> >> >No, I don't think so, but before I respond to more, can you give me an
> >> >example that makes you think this? Thanks. --dawn
> >>
> >> See the above queries. Rules of Booolean algebra say that
> >> NOT (Age(x) > Age(Aunt Marge))
> >> should be the same as
> >> (Age(x) <= Age(Aunt(Marge))
> >>
> >> Your explanation made me believe that this is not the case in your 2VL
> >> system.
> >
> >Then I did not explain it correctly or I'm missing something now.
> >
> >Let's have a data set that includes
> >Aunt Marge, 47
> >Uncle Vernon, null = null set = empty set, nada
> >Dale, 12
> >Dot, 70
> >
> >1) Inner select grabs the age of 47, then selects names of everyone but
> >Dot as everyone else, including Uncle Vernon in this 2VL scenario, has
> >an age less than or equal to 47 (null is less than 47 as surely as
> >nothing is less than something -- that's the twist that is different
> >than a 3VL)
> >
> >2) Inner select grabs the age of 47, then we test to see if NOT (Age >
> >47) is the case. That is the case for everyone but Dot.
> >
> >If the data are changed so that Aunt Marge also has a null set age,
> >then with
> >
> >1) The Age of null set is selected and both Aunt and Uncle are chosen
> >for this query
> >2) Similarly, the only ones with NOT (Age > Marge's age) are the Aunt
> >and Uncle.
> >
> >So, the results are the same with both queries.
>
> Hi Dawn,
>
> Okay, thanks for clarifying. I guess this means that you made a mistake
> in one of your previous replies (see quote above).

I'm not seeing the error right away, but I'll trust you on that. Sorry if I led you off course.

>
> This definitely improves the quality of how your 2VL logic system
> handles missing data. But I still don't like it. :-)

If you worked with it for a bit or if you had not worked with a 3VL at all, you would likely be happier with it.

> And you have already given the reason.
>
> > Now, perhaps Uncle
> >Vernon is 110, clearly older than Aunt Marge, but that fact has not yet
> >been recorded since people were pouring over medical and legal records
> >to verify this first. He will then incorrectly be identified as
> >someone whose age (in this database) is less than or equal to Marge's
> >age. When the data are not correct in the system, the results of the
> >queries are not correct either.
>
> But in this case, the data was not incorrect.

Yes, it was incorrect. In order to simply, we could say that we modeled the proposition:

Uncle Henry's age is unknown

with the database data of

Uncle Henry has no age

One could say this flies in the face of a database being a set of facts. However, simplifications are part of modeling and this little white lie is a way to simplify the data processing functions and human understanding thereof. We are still modeling an accurate statement.

> Since there's no way for a
> database to check if data is correct or not, I can live with the
> "garbage in garbage out" principle.
>
> But this case is different. The data was missing. And instead of
> treating this data as being missing, the DB simply assumed some value
> (apparently a zero in this case) and reported on it as if that was the
> value I entered on the form yesterday. This would be the "nothing in
> garbage out" principle.

I understand your argument. Were I not a practioner, I would likely agree with your reasoning and dismiss a 2VL as over-simplified for reality.

> >
> >I have not done any tests with this, but there is something about
> >seeing the incorrect data on a report that helps with data cleansing,
> >with the downside that someone might believe that Uncle Vernon is
> >younger than Aunt Marge and this could cause some problem. There are
> >pros and cons to each, but it really is far easier to work with a 2VL.
> >I doubt anyone who has worked extensively with both would say
> >otherwise.
>
> Would ten years of programming applications in various thrid-generation
> languages qualify as "worked extensively" with 2VL?

Yes.

> Would nine years of developing database applications in SQL databases
> qualify as "worked extensively" with 3VL?

Yes.

> If the answer to both these questions is yes, you can now say "Hi" to
> someone who has worked extensively with both and who would definitely
> say otherwise. ;-)

OK, how about if I insert words like "database access language" in there? I'll also ask you if you are sure you prefer the 3VL of SQL to the 2VL of your programming language(s). Which language(s) are you using? Do you think they would be better if "enhanced" with a 3VL? Ugh. --dawn

> Best, Hugo
> --
>
> (Remove _NO_ and _SPAM_ to get my e-mail address)
Received on Thu Dec 15 2005 - 23:21:18 CET

Original text of this message