Re: boolean datatype ... wtf?

From: Erwin <e.smout_at_myonline.be>
Date: Fri, 1 Oct 2010 11:41:25 -0700 (PDT)
Message-ID: <196cabd3-1cb7-45bf-b6f8-a960231153a2_at_a36g2000yqc.googlegroups.com>


On 1 okt, 16:51, Paul Mansour <psmansour2..._at_yahoo.com> wrote:
> On Oct 1, 9:54 am, Erwin <e.sm..._at_myonline.be> wrote:
>
>
>
>
>
> > On 1 okt, 14:41, Tony Andrews <tony.andrew..._at_gmail.com> wrote:
>
> > > On Sep 30, 1:04 pm, Erwin <e.sm..._at_myonline.be> wrote:
>
> > > > On 30 sep, 12:32, Tony Andrews <tony.andrew..._at_gmail.com> wrote:
> > > > > Or we could create a plethora of tables like:
> > > > >  create table applications_with_garages (application_id references
> > > > > applications primary key);
> > > > >  create table applications_with_immobolisers (application_id
> > > > > references applications primary key);
> > > > > ... etc.
>
> > > > > That may be the right approach in a theoretical true RDBMS, but I'm
> > > > > pretty sure it would get me sacked as a lunatic in any SQL-based DBMS
> > > > > team!
>
> > > > If your SQL-based DBMS had proper physical data independence, then I
> > > > am quite convinced your claim would be false.
>
> > > But would this unary table approach to avoid a Boolean attribute
> > > really be considered good practice in a true RDBMS?  If I want to know
> > > "does the applicant's car have an immboliser", is the absence of a row
> > > in applications_with_immobilisers sufficient to answer it?  I thought
> > > that was a proposed solution to missing information ("it is unknown
> > > whether the applicant's car has an immobiliser"), now it seems to be
> > > acting as available information - i.e. the predicate "applicant 123's
> > > car does not have an immboliser".  Seems dubious to "record" a known
> > > fact by, er, not recording it.
>
> > Does this make a meaningful difference ?
>
> > I mean, you came up with the example of checkboxes on paper.  The
> > checkbox on paper can be "marked" if the corresponding label/property
> > is "true".  If the checkbox on paper is not marked, you interpret that
> > as "false", no ?  You wouldn't consider/interpret this as
> > "unanswered", or "unknown", or whatever, no ?
>
> > If the answer is two-way, then those two ways are isomorphic to "tuple
> > present"/"tuple absent", no ?  And "unknown"/"unanswered" simply
> > doesn't enter the picture, no ?
>
> > If you want to _explicitly_ take the option of "unknown"/"unanswered"
> > into account, then on the paper version, you would be _forced_ to
> > provide an _additional_ checkbox saying "I know the answer to the next
> > question", or "You can consider the next checkbox as having been
> > answered", no ?- Hide quoted text -
>
> > - Show quoted text -
>
> Erwin, suppose then that we do want to explicity take "unknown" into
> account with two checkboxes as you say, or I think better yet, with
> three mutually exclusive radio buttons, to explicity specifiy  Yes,
> No, and Unknown or Male,Female and Unkown, or what have you. How then
> would you model this? Is this a column  with three allowable values,
> or a column with two values where no corresponding row indicates the
> third option? Alternatively, there could be just two radio buttons
> indicating, say, Male and Female, and neither one is pre-selected, so
> it is clear whether or not the question was answered. Same question,
> how is this then modeled?  I think Tony Andrews has a good question,
> and it's more fundamental than the GUI.- Tekst uit oorspronkelijk bericht niet weergeven -
>
> - Tekst uit oorspronkelijk bericht weergeven -

Well, one column with three allowable values is a possible and by no means "definitely invalid" way to do it, but at any rate it makes the issue off-topic, because the original topic was "boolean attributes", and "three possible values" cannot possibly be represented by a single boolean. Note that since "question unanswered" is very explicitly made part of the business domain, it is not invalid in this case to include in the database explicit assertions of absence.

Three relvars (instead of the theoretical four - because no known answer can exist if an answer is asserted to not exist) are another possible and by no means "definitely the only" way to do it. It's the one I prefer, but I might reconsider this position if I have to apply this technique for thousands and thousands of questions in a survey (thousands and thousands of relvar names to remember, you know).

One relvar with two boolean attributes (a nullable boolean in an SQL table would essentially boil down to the same, if SQL supported it) is yet another way to do it that is not invalid per se, but it's the option I would only resort to as a last refuge. Received on Fri Oct 01 2010 - 20:41:25 CEST

Original text of this message