Re: Examples of SQL anomalies?

From: Brian Selzer <brian_at_selzer-software.com>
Date: Mon, 7 Jul 2008 22:26:19 -0400
Message-ID: <gfAck.12047$89.8921_at_nlpi069.nbdc.sbc.com>


"Cimode" <cimode_at_hotmail.com> wrote in message news:ad00994a-1166-4d7c-ac66-a018648c222d_at_34g2000hsh.googlegroups.com...
> On 7 juil, 17:28, "Brian Selzer" <br..._at_selzer-software.com> wrote:

>> "Cimode" <cim..._at_hotmail.com> wrote in message
>>
>> news:b6313a3d-a07c-40d7-859c-6c70db4fa0c9_at_f63g2000hsf.googlegroups.com...
>>
>>
>>
>> > On Jul 7, 1:30 am, "Brian Selzer" <br..._at_selzer-software.com> wrote:
>> >> "Cimode" <cim..._at_hotmail.com> wrote in message
>>
>> >>news:623888cb-7e35-4aaf-a496-1654f8fed040_at_k37g2000hsf.googlegroups.com...
>>
>> >> > On 6 juil, 17:46, "Brian Selzer" <br..._at_selzer-software.com> wrote:
>> >> >> "Cimode" <cim..._at_hotmail.com> wrote in message
>>
>> >> >>news:f4a29563-3e69-4d50-a3d8-07d434aac374_at_27g2000hsf.googlegroups.com...
>>
>> >> >> > On 6 juil, 15:43, "Brian Selzer" <br..._at_selzer-software.com> 
>> >> >> > wrote:
>> >> >> > [Snipped]
>> >> >> >> >> The users know what the data is supposed to mean.  The system
>> >> >> >> >> hasn't a
>> >> >> >> >> clue.
>> >> >> >> > I do not understand how you place system and people on the 
>> >> >> >> > same
>> >> >> >> > standpoint.  Don't you believe in need-to-know basis?
>>
>> >> >> >> What has that to do with it?
>> >> >> > Everything. Why do you want to place the burden of data
>> >> >> > interpretation
>> >> >> > on people at run time.  What is the point designing systems to
>> >> >> > present
>> >> >> > information to people if they have to spend hours guessing what 
>> >> >> > it
>> >> >> > means?
>>
>> >> >> I would place the burden of data interpretation on people because 
>> >> >> it
>> >> >> is
>> >> >> only
>> >> >> people that can interpret data.
>> >> > OK but do they need to do it at run time.  What would be the point 
>> >> > of
>> >> > design then?
>>
>> >> In some cases, yes, they need to do it at run time.  With an unlimited
>> >> budget of time and money, you can handle every possible exception in
>> >> code,
>> >> but back here in the real world, you can't waste precious resources on
>> >> something that seldom occurs, so you just print out an exception 
>> >> report
>> >> and
>> >> provide the means for the users to deal with them (if it doesn't 
>> >> already
>> >> exist).
>> > Who says design should be unlimited?  Design simply sufficient to
>> > handle most cases and it should correct on RM standpoint.  What about
>> > time wasted on poor design?
>>
>> From what I've seen, poor design usually translates to inadequate design.
>> If you don't spend enough time on design, you end up reinventing the 
>> wheel
>> halfway through the implementation, and it ends up costing upwards of 
>> double
>> the budget.

> How about the idea that poor(inadequate if you feel better with this
> term) design = using NULLS
>

NULLs are only one way to deal with missing information. From your comments it appears that you are equating a design that uses NULLs with a design that permits missing information. While a design that uses NULLs necessarily permits missing information, there are designs that do not use NULLs that also permit missing information. In fact, for every design that employs NULLs, there is an equivalent design that does not.

>> >> >> Why would they have to spend hours?  Hopefully any system would
>> >> >> present
>> >> >> enough complete information to obviate most guesses, but for the 
>> >> >> rest,
>> >> >> given
>> >> >> a choice between being given a potentially wrong answer and being 
>> >> >> told
>> >> >> that
>> >> >> the information is incomplete, I (and most people I know) would 
>> >> >> prefer
>> >> >> the
>> >> >> latter, even if it required a little extra effort on my part.
>> >> > But then, following this logic and the 8 item weigth example, the
>> >> > system would then require the information that the sum is relevant
>> >> > only up to 8 items to be available to people for them to make
>> >> > *educated* guesses.  Don't you think that makes things more
>> >> > complicated?  Most people I know would tell you, you did a poor 
>> >> > design
>> >> > job and that they don't need this irrelevant information.
>>
>> >> The information is already available.  If SUM returned an 
>> >> indeterminant
>> >> result, then the user should be able to issue additional queries
>> >> qualified
>> >> with something like, WHERE weight IS NULL or WHERE weight is NOT NULL, 
>> >> in
>> >> order to find out why the result was indeterminant.  It certainly 
>> >> isn't
>> >> rocket science.

> What information is available ? The information fact that it is not
> correct(suspect) ? How is it already available ?
>

If the information were not already available, then how could a query that performs a SUM be issued?

> What interest in building systems that bring the information that the
> system is not reliable and impose on people a two step that would
> force them to re-interpretate and force them to permanently using NOT
> NULL/IS NOT NULL.
>

Who said that the system is not reliable? Returning an indeterminate result to a query that cannot be exactly answered exemplifies robustness, not a lack of reliability. It is certainly better than telling the user that no answer can be had, since there may be instances where the same query /can/ be exactly answered.

> Rocket science is understanding how is that simpler than simply taking
> the NULLS out of the equation and not have to deal with all this.

If the design permits missing information, then you will have to deal with all this. Again, NULLs are just one way to deal with missing information. If there should be a value, but it hasn't been supplied, then there needs to be an indicator that there should be a value. This applies regardless of whether NULLs are used or not. So to handle missing information without NULLs, you need an indication (possibly implicit) that there should be a value, which may need to be in a separate table, and another table for those values that should have been supplied and have. That way the absence of a tuple containing a value can be correctly interpreted, because it could be that there shouldn't be a value, and that's why there isn't a tuple.

>> > But what user are you refering the end user or the designer?  Should
>> > an application end user have to know about IS NULL / IS NOT NULL?
>>
>> Both, and neither.  The user could be the designer, or an application
>> developer, or the user could be an end-user writing a Crystal Report or
>> integrating a spreadsheet.  I would think that in order to integrate a
>> spreadsheet, the end-user would have to know about IS NULL / IS NOT NULL 
>> if
>> any of the columns in any of his queries can have NULLs.

> So, basically the end user is intended as designer on a per use and
> per application basis? Don't you believe this would leave the meaning
> of information open to designer's interpretation?

Isn't it already? Received on Tue Jul 08 2008 - 04:26:19 CEST

Original text of this message