Re: Computing Averages in MS Excel

From: dawn <dawnwolthuis_at_gmail.com>
Date: 8 Dec 2005 18:49:33 -0800
Message-ID: <1134096573.429324.11350_at_g49g2000cwa.googlegroups.com>


David Cressey wrote:
> "dawn" <dawnwolthuis_at_gmail.com> wrote in message
> news:1134059689.752533.87750_at_f14g2000cwb.googlegroups.com...
> >
> > David Cressey wrote:
> > > A little while ago, dawn mentioned in the monster NULL thread that she
> was
> > > busy trying to
> > > reduce NULLS from SQL into an SQL-wrapper.
> >
> > IIRC It was the smaller 2VL 3VL thread -- I don't recall contributing
> > to the larger one.
> >
> > >
> > > After she said that she couldn't simply filter out the missing data,
> the
> > > subject dropped.
> > > I'll note in passing that 2vl do, eventually, run into the necessity for
> > > dealing with missing data. I suspect that Pick programmers do, too.
> >
> > Yes, of course. I believe I indicated that an attribute with no value
> > can be seen as having an empty set of values assigned to it. That way
> > the data attribute has a value (the null set) that indicates the
> > absense of a value. An issue that arises here is that it is a good
> > idea to have expressed values for the case when you know there is no
> > value for an attribute (e.g. NA for not available) compared to when you
> > do not know if there is one or not but do not have one.
> >
> > >
> > > Anyway, that led me to reframe the question, as follows:
> > >
> > > How to present missing data to users in MS Excel, and do it in such a
> way
> > > that it won't skew averages?
> > >
> > > So I cranked up a blank worksheet, and I started out by defining a
> > > rectangular area on it.
> > > I set up four cells that would take functions on that area: AVERAGE,
> > > COUNT, SUM, and SUM divided by COUNT.
> > >
> > > I then proceeded to enter various things in the cells, to see what I
> would
> > > get. I was pleased with the way COUNT works:
> > >
> > > COUNT includes cells that contain numbers, and also cells that contain
> > > formulas that evaluate to numbers.
> > > COUNT excludes empty cells, labels (cells that contain text), and
> cells
> > > that contain formulas that evaluate to text strings.
> > >
> > > I didn't mess around with dates.
> > >
> > > AVERAGE and SUM use the same criterion that COUNT does. And SUM divided
> by
> > > COUNT always gave the same answer as AVERAGE.
> > >
> > > So here's my answer (before looking at any documentation): In MS
> Excel, I
> > > would use an empty cell to represent the SQL NULL. If my users asked
> what
> > > an empty cell means, I'd say, "it means there isn't anything in it."
> > > I would expect 99% of users to go on to the next issue without missing a
> > > beat.
> >
> > If I'm understanding, that is the way that Pick handles it. That is
> > why end-users never give NULL a thought there. It averages only what
> > is there. The problem arises if you add SQL to the mix and say that
> > you have a numeric attribute.
>
> Simply not true. If you add SQL to the mix, and say you have a numeric
> attribute, and data is missing
> in one cell, you just deliver a NULL in that cell. SQL's average function
> will filter out the missing data.
>
> Anyone who needs to "see" the missing data can simply check for NULLs.
>
> It's as easy as shooting fish in a barrel. I don't understand why you are
> having a problem.

You are right on that one. It has been some years, and my memory is fuzzy on it right now -- I think the problem I encountered was that a SQL NULL was not an option as a value from the 2VL using this particular wrapper. I cannot see a reason right off why a wrapper could not be set to work as you suggested. I'll admit I am wrong in theory while I do recall that in practice there was definitely a problem. Ah well. --dawn Received on Fri Dec 09 2005 - 03:49:33 CET

Original text of this message