Re: boolean datatype ... wtf?

From: Paul Mansour <psmansour2000_at_yahoo.com>
Date: Sat, 2 Oct 2010 04:27:50 -0700 (PDT)
Message-ID: <ffc76ea0-ffac-402c-b0bb-39a21cd43f2c_at_z28g2000yqh.googlegroups.com>


On Oct 1, 6:24 pm, Erwin <e.sm..._at_myonline.be> wrote:
> On 1 okt, 23:29, Gene Wirchenko <ge..._at_ocis.net> wrote:
>
>
>
> >      I do not want the exponential explosion if I have customers with
> > boolean attributes.  In my client billing system, there are two:
> > whether the client billing is PDF only and whether they pay by credit
> > card transaction where we process a payment after a delay if we have
> > not heard otherwise.
>
> >      Why would I want to split the client table into four tables?
>
> > [snip]
>
> > Sincerely,
>
> > Gene Wirchenko
>
> OK.  I admit that on the sheer face of it, your argument seems right.
>
> But I propose we get a look _beyond_ the "face of it".
>
> If you speak about "whether client billing is PDF only", you are
> already suggesting that "client billing" really isn't just a boolean.
> In particular, you are suggesting that there really are other billing
> modes than "PDF", and that there may exist clients who want "PDF +
> postal mail", that there are other clients who want "postal mail
> only", and (for the humorous part of it) "no billing at all".
>
> I propose you consider a design with a relvar that simply records
> "client wants PDF billing", another one that records "client wants
> postal mail billing", a constraint "client must express at least one
> billing mode".
>
> With such a design, if newly arising technology allows for "novel
> billing modes", then all you have to do to your database is add
> another relvar, plus include that relvar in the union that is part of
> the constraint.
>
> Similar observations apply to "customer's preferred mode of payment".
>
> When I was a "young Turk" in IT land, I formulated my own design
> principle : "codes are poor, entities are rich".  What I meant by that
> is that "A boolean is a piece of 'codified' information, and therefore
> it doesn't convey much information after all.  A separate entity (ER
> terminology, relational equivalent being 'a separate relvar' holding
> particular tuples) allows for specifying WHY that boolean piece of
> information happens to be true.".
>
> IOW: if a boolean is true, then somewhere deeper down inside, _THERE
> IS ALWAYS A REASON WHY_, and _KNOWING_ that 'reason why' is _ALWAYS
> MORE AND BETTER_ than just 'knowing that it is so'.
>
> Oh, and I am very much aware of the case where legislation does not
> allow you to actually be 'aware' of 'the reason why'.

Erwin, thanks for your response to my previous questions. I was trying to get my head around why if it is a good idea to split a column with two potential values into two separate relvars, it would not also be a good idea to put a column with n potential values into n separate relvars, when, in your response to Gene Wirchenko, you suggest just such a design, on the grounds that it is then easy to add new values, by simply adding a new relvar. What does such a design possibly get you over a single column (perhaps with a lookup table) with n possible values whether in the main entity table or off in its own table to avoid nulls? Furthermore it takes a schema change to add a new category, rather than simply adding a new row to a lookup table. So now you are suggesting that not only boolean columns are bad design but even columns with 3 or more unique values. Is there a number (say 25?), where you just say to heck with it, I'll go to a single table? I must be missing something. Received on Sat Oct 02 2010 - 13:27:50 CEST

Original text of this message