Re: boolean datatype ... wtf?

From: Hugo Kornelis <hugo_at_perFact.REMOVETHIS.info.INVALID>
Date: Thu, 07 Oct 2010 00:51:01 +0200
Message-ID: <vntpa6556neb5mln7mu7h43qkvaga7aa9d_at_4ax.com>


On Wed, 6 Oct 2010 06:29:45 -0700 (PDT), Erwin wrote:

>On 6 okt, 13:50, Hugo Kornelis <h..._at_perFact.REMOVETHIS.info.INVALID>
>wrote:
>
>> Probably. But I guess Brian kind of lost me when he wrote that a single
>> nullable column has to be replaced by not one but two seperate tables in
>> a null-less design.
>
>In lots of situations where nulls show up in a design, the reality is
>that "missing" information may be "genuinely unknown" (e.g. natural
>person is known to have a professional occupation, but which
>profession that is, happens to be unknown), or it may be
>"inapplicable" (e.g. natural person is known to be unemployed, i.e. is
>known to not have any professional occupation at all). Or take
>religious belief: "known to have one, but unknown which one it is",
>versus "known to not have one".
>
>It's explored in more depth in "How to handle missing information
>without using nulls" on www.thethirdmanifesto.com, and in "Database
>Explorations", chpt 23. If you have only a single separate relvar for
>the "null" case, AND it is the case that "inapplicable" applies, then
>the predicate of that separate table must be a disjunction, such that
>in the end, if a tuple appears in that table, you still don't _really_
>know what that tuple means.

Thanks for the link. Interesting reading. I was aware of the vertical decomposition, but I had not seen the "very loosely speaking" horizontal decomposition before. Since the concept of horizontal decomposition builds on the assumption that the distinction between "unknown" and "not applicable" HAS to be made, I don't think this approachh should always be used.

However, my understanding of vertical decomposition may always have been incorrect. I have always thought that the vertical decomposition of   PERSON:

  ID    NAME      JOB     SALARY
  1234  Anne      Lawyer  100,000
  1235  Boris     Banker  ?
  1236  Cindy     ?       70,000
  1237  Davinder  ?       ?
leads to these three relvars:
  CALLED            DOES_JOB          EARNS
  ID    NAME        ID    JOB         ID    SALARY
  1234  Anne        1234  Lawyer      1234  100,000
  1235  Boris       1235  Banker      1236  70,000
  1236 Cindy
  1237 Davinder
which requires outer joins to reconstruct the original. But the presentation you point me to includes all persons in all three relvars, so that the original can be reconstructed with an innder join, but no nulls are eliminated at all, and the DOES_JOB and EARNS relvars still have the awkward predicate problem.

In the three relvars I sketch above:
* No NULL values are required
* It's not possible to distinguish between "unsalaried" and "salary unknown", so **IF** that distinction is required, extra relvars are required.
* Predicates for each table are simple - they are given in the presentation, so no need to repeat them here, but there are no tuples with question marks to blur their meaning. * So for each tuple that appears in a table, we know _exactly_ what it means.

I was surprised to see the rows with the question marks in the presentation. Removing them is not exactly rocket science, so there must obviously be some other problem that prohibits this apparently simple solution. But I fail to see what that problem might be.

Best, Hugo Received on Thu Oct 07 2010 - 00:51:01 CEST

Original text of this message