Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> comp.databases.theory -> Re: All hail Neo!

Re: All hail Neo!

From: David Cressey <dcressey_at_verizon.net>
Date: Thu, 27 Apr 2006 12:08:13 GMT
Message-ID: <Nu24g.1345$Te.1329@trndny04>

"Marshall Spight" <marshall.spight_at_gmail.com> wrote in message news:1146083416.518617.324390_at_i40g2000cwc.googlegroups.com...
> David Cressey wrote:
> > "Marshall Spight" <marshall.spight_at_gmail.com> wrote in message
> > >
> > > I just fired up Excel 2003. I entered two columns of numbers, and
> > > left one cell empty in each column. Then I set the formula for the
> > > third column to be col1 + col2. Then I got the sum of the first two
> > > column. It behaved exactly as I propose: missing values were
> > > ignored, both on the vertical sum and on the horizontal sum.
> > > This preserves the SUM(A) + SUM(B) = SUM(A+B) property
> > > as well.
> >
> > Repeat the experiment with avg. The results are even more interesting
than
> > sum.
>
> I was thinking about this just before you suggested it, and it occurred
> to me that the average of an empty set should produce the same
> result as a divide by zero. Sure enough, if you ask excel what the
> average of three empty cells is, it says "#DIV/0!".
>
> Using average in the grid does *not* have the property
> (AVG(A) + AVG(B)) / 2 = AVG(A+B). That property
> won't hold unless A and B have the same cardinality,
> which they won't (necessarily) if A or B is allowed to
> be empty. We can generalize this to say that
> a fold of a fold won't necessarily return the same
> results as a flattened fold.

I was thinking of something somewhat less subtle. If the average function is asked to compute the average of a column of four values and one null, there are three ways it could be implemented:

  1. report an exception
  2. add up the four values, and divide by four.
  3. add up the four values, and divide by five.

I think we can all agree that alternative C is not useful. Some people prefer A. I prefer B.
Both Excel and SQL, to my knowledge, implement choice B. That's what I was referring to, when I suggested doing the experiment with avg. Received on Thu Apr 27 2006 - 07:08:13 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US