Re: All hail Neo!

From: JOG <>
Date: 26 Apr 2006 06:54:33 -0700
Message-ID: <>

Frank Hamersley wrote:
> Marshall Spight wrote:
> > Frank Hamersley wrote:
> >> I can cope with that - I would have no problem being forced to write
> >>
> >> "select avg(age) from table where age is not null"
> >>
> >> to get the crappy statistic that it is if the user demanded it.
> >
> > I dunno. If you have a whole lot of people and most of them
> > have their ages filled in and a few don't, are you ever going
> > to want to ask, "what is the average age", since the answer
> > will always be "unknown." Doesn't seem much use to me.
> > If you want to know if any of them are unknown, you could
> > ask that specifically. But if you want to know the average
> > age, then you want to know the average of the data you have;
> > you're not asking about the data you don't have because
> > you don't have it. The only useful query in there is "give
> > me the average age for the data I have"; why should we
> > make the way you ask for that longer winded than other,
> > never-useful queries?
> >
> > And how "crappy" is that statistic anyway? Probably not
> > at all crappy. It's probably exactly what you want.
> >
> > The idea of null as something that taints everything it
> > touches doesn't seem useful or practical to me.
> I fully understand the sentiment however in cases like this I prefer
> arrangements that retain flexibility for the (awake) programmer and
> provide a form of simple clarity. i.e. if there is one null the avg()
> is null. It would not take long for the industry to adopt this although
> with all existing code out there a change might take on Y2K proportions!
> This simplistic approach means any stray nulls creeping into a dataset
> where none are expected will not go undetected if inadequately
> constrained queries are framed. Of course this a very late stage to be
> worrying about data integrity but better late than never.
> Of course this is a simple case which is why I am interested to see if
> Bob comes up with insights into something more difficult.
> Cheers, Frank.

PMFJI, but I thought it might be useful to point out that spreadsheets have had to address this problem since their inception. As far as I know, by default they ignore empty cells when averaging a range, rather than bailing out with an error. In effect they treat an empty cell as not being in the range at all. If every cell is empty then there is nothing left to average, and an error results.
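
To make the comparison concrete, here is a rough Python sketch of the two rules being discussed. It is only an illustration: the function names spreadsheet_average and strict_average are made up for this post, None stands in for an empty cell or a null, and I am not claiming any particular spreadsheet or DBMS is implemented this way.

```python
from typing import Optional, Sequence


def spreadsheet_average(cells: Sequence[Optional[float]]) -> float:
    """Average only the non-empty cells (hypothetical helper).

    Empty cells (None) are treated as if they were not in the range.
    """
    present = [c for c in cells if c is not None]
    if not present:
        # Every cell is empty, so there is nothing to average: raise an error.
        raise ValueError("no non-empty cells to average")
    return sum(present) / len(present)


def strict_average(cells: Sequence[Optional[float]]) -> Optional[float]:
    """Frank's proposed rule (hypothetical helper): one null taints the result."""
    if not cells or any(c is None for c in cells):
        # Assumption: an empty input is also reported as null here.
        return None
    return sum(cells) / len(cells)
```

With ages of 30, unknown and 40, the first rule averages the two known ages, while the second returns None and so forces the caller to notice the stray null.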
