Re: Multiple-Attribute Keys and 1NF

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Tue, 28 Aug 2007 20:42:37 -0300
Message-ID: <46d4b2a3$0$4050$9a566e8b_at_news.aliant.net>


JOG wrote:

> On Aug 29, 12:12 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
> 

>>JOG wrote:
>>
>>>On Aug 28, 8:23 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>>
>>>>JOG wrote:
>>
>>>>>On Aug 28, 7:41 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>>
>>>>>>JOG wrote:
>>
>>>>>>>On Aug 28, 5:43 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>>
>>>>>>>>JOG wrote:
>>
>>>>>>>>>On Aug 28, 5:05 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>>
>>>>>>>>>>JOG wrote:
>>
>>>>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
>>>>>>>>>>>such, any comments would be greatfully accepted. The reason for my
>>>>>>>>>>>concern is that there /seems/ instances where 1NF is insufficient. An
>>>>>>>>>>>example occurred to me while I was wiring up a dimmer switch (at the
>>>>>>>>>>>behest of mrs. JOG, to whom there may only be obeyance). Now I don't
>>>>>>>>>>>know the situation in the US, but in the UK a while back the colour
>>>>>>>>>>>codes for domestic main circuit wiring changed. Naturally the two
>>>>>>>>>>>schemes exist in tandem, as exhibited in every house I've had the joy
>>>>>>>>>>>of doing some DIY in:
>>
>>>>>>>>>>>Brown -> live.
>>>>>>>>>>>Red -> live
>>>>>>>>>>>Blue -> neutral.
>>>>>>>>>>>Black -> neutral.
>>>>>>>>>>>Green and yellow -> earth.
>>
>>>>>>>>>>>The issue with encoding these propositions is that the candidate key
>>>>>>>>>>>for each proposition may consist of one _or_ two colours. Now I have a
>>>>>>>>>>>couple of options, none of which seem satisfactory. I could leave
>>>>>>>>>>>green & yellow as some sort of set-value composite, but obviously this
>>>>>>>>>>>would affect my querying capabilities, so thats out straight off the
>>>>>>>>>>>bat. Similarly adding attributes Colour1 and a nullable Colour2 is
>>>>>>>>>>>simply so hideous it isn't worth consideration. So, I could ungroup to
>>>>>>>>>>>give me:
>>
>>>>>>>>>>>Colour Type
>>>>>>>>>>>-----------------
>>>>>>>>>>>Brown live
>>>>>>>>>>>Red live
>>>>>>>>>>>Black neutral
>>>>>>>>>>>Blue neutral
>>>>>>>>>>>Green earth
>>>>>>>>>>>Yellow earth
>>>>>>>>>>>-----------------
>>
>>>>>>>>>>>But again this is unsatisfactory as I have lost the information that
>>>>>>>>>>>one wire is green and yellow, but none is brown /and/ red.
>>
>>>>>>>>>>>I could introduce a surrogate to give me:
>>
>>>>>>>>>>>Id Colour Type
>>>>>>>>>>>-----------------
>>>>>>>>>>>1 Brown live
>>>>>>>>>>>2 Red live
>>>>>>>>>>>3 Black neutral
>>>>>>>>>>>4 Blue neutral
>>>>>>>>>>>5 Green earth
>>>>>>>>>>>5 Yellow earth
>>>>>>>>>>>-----------------
>>
>>>>>>>>>>>But this seems wholly artificial given that all the information I
>>>>>>>>>>>required for identification was available in the original
>>>>>>>>>>>propositions, and that did not require some artificial id. A [shudder]
>>>>>>>>>>>non 1NF variation such as:
>>
>>>>>>>>>>>Id Colour Type
>>>>>>>>>>>-----------------
>>>>>>>>>>>1 Brown live
>>>>>>>>>>>2 Red live
>>>>>>>>>>>3 Black neutral
>>>>>>>>>>>4 Blue neutral
>>>>>>>>>>>5 Green, earth
>>>>>>>>>>>Yellow
>>>>>>>>>>>-----------------
>>
>>>>>>>>>>>is clearly hideous as it denies the fundamental mathematical principle
>>>>>>>>>>>that that one attribute should take one value from one domain,
>>>>>>>>>>>nevermind the fact that it introduces query bias.
>>
>>>>>>>>>>>I could of course introduce nested relations, but I am uncertain as to
>>>>>>>>>>>the theoretical consequences of having nested relation as a key (I
>>>>>>>>>>>guess it would be fine, if adding seemingly unnessecary complexity to
>>>>>>>>>>>subsequent queries). But moreover it again seems unintuitive, given
>>>>>>>>>>>that in this case it would indicating that the original propositions
>>>>>>>>>>>contained, as a value for one of their attributes, a further
>>>>>>>>>>>proposition, and this was not the case.
>>
>>>>>>>>>>>I am having a crisis of faith with the way 1NF is currently viewed.
>>>>>>>>>>>Any ideas to solve my dilemma? Am I on my own in being perturbed?
>>
>>>>>>>>>>>Regards, Jim.
>>
>>>>>>>>>>There is one obvious way to represent that in 1NF:
>>
>>>>>>>>>>Create a color domain where a single value represents green and yellow,
>>>>>>>>>>another value represents green, and a third represents yellow etc. The
>>>>>>>>>>domain could even represent thick green/thin yellow as a separate value
>>
>>>>>>>>>>from thick yellow/thin green if one chooses.
>>
>>>>>>>>>>Regardless whether one creates only the domain or also uses it as a
>>>>>>>>>>candidate key for some sort of lookup table, the resulting relation is
>>>>>>>>>>simply:
>>
>>>>>>>>>>Colour Type
>>>>>>>>>>======= -------
>>>>>>>>>>...
>>
>>>>>>>>>>Your ID above is one example of such a domain. However, the domain need
>>>>>>>>>>not be numeric or have any external numeric representations. It need
>>>>>>>>>>only exist with a distinct value for green and yellow.
>>
>>>>>>>>>Well, practically, the surrogate key is the way that I would go. My
>>>>>>>>>question is rather whether this corresponds naturally to the original
>>>>>>>>>propositions, which don't require a new domain in order to be
>>>>>>>>>manipulated in FOL.
>>
>>>>>>>>You assume a color domain so imagining a different color domain changes
>>>>>>>>the design without adding anything new.
>>
>>>>>>>Okay, you're right - not a new domain, just a different one. If I had
>>>>>>>started with domain of all colours C (clearly containing the colour
>>>>>>>"grey" given the presence of the u there), I read you as proposing
>>>>>>>that it be replaced with a labelled powerset of C. Howwwever, would
>>>>>>>occams razor not suggest that we should prefer a domain made up of
>>>>>>>atomic individuals, as opposed to aliased sets, which will require an
>>>>>>>extra step to decompose?
>>
>>>>>>I don't recall suggesting anything about sets--just a domain that has a
>>>>>>distinct value that means "green and yellow".
>>
>>>>>Okay, sure. But then to be able to query for green and yellow
>>>>>individually one must employ a further relation encoding two more
>>>>>propositions that state "'Green and yellow' contains 'Green'" and that
>>>>>"'Green and yellow' contains 'Yellow'" respectively. One then has a
>>>>>schema with two domains - one for the composites and one for
>>>>>individual colours (which is what I was inferring when I initially
>>>>>said a new one was being added).
>>
>>>>Assuming one has a need to query for green separately, I suppose one can
>>>>define an operator on the domain to that effect. If one invents a
>>>>requirement that requires a second domain, then one will need a second
>>>>domain regardless.
>>
>>>Well that sort of brings us full circle back to to my query as to
>>>whether a structure that doesn't require that second domain,
>>
>>Let me be clear: Unless you invent a requirement that requires a second
>>domain, no second domain is required. If one invents such a requirement,
>>one is.
>>
>>such as a
>>
>>
>>>set where elements themselves are pure mathematical relations
>>>containing attribute/value pairs:
>>
>>>Wires = { {(Color, Yellow), (Color, Green), (Type, earth)}} {(Color,
>>>blue), (Type, live)} }
>>
>>But a set is a second domain. You have 1) colors and 2) sets of colors.
>>Actually, you have sets of some supertype of color and type?!? Yuck!
> 
> 
> I'm worrying that you have misinterpreted what I have sketched out
> there. I haven't specified any domain sets at all in the  above - it
> is just a set of propositions, and as with RM, each element is a
> mapping from attribute names onto values, that's all. I have no idea
> why you think I have supertypes, etc, in there. (which I agree would
> be yuck)

Based on this set: {(Color, Yellow), (Color, Green), (Type, earth)}

That is not a tuple. A tuple would be:

{(Color, {Yellow, Green}), (Type, earth)}

The names must be unique within a tuple.

>>>has any negative theoretical impacts. I can see immediately that this
>>>would affect WHERE and ON clauses in the algebra, and one would get
>>>more use out of an GROUP/UNGROUP statements, but I see nothing
>>>inherently /bad/.
>>
>>>Yet that is.
>>
>>Supposing you have a requirement that you must be able to use the green
>>and the yellow separately and supposing you choose to use a set of
>>values (i.e. an RVA), you have already identified the problem that when
>>one ungroups, one loses the information that green and yellow belong as
>>a pair.
>>
>>To preserve this information, the dbms would have to have some facility
>>to generate an artificial identifier for the pair. However, if one
>>normalized the base relations, one would already have the identifier,
>>and it would be rather simple to construct the RVA in a derived relation.
>
> Much food for thought. Thanks for the responses Bob.

You are very welcome. Some of this stuff seems obvious to me now, but at one time it was anything but. Received on Wed Aug 29 2007 - 01:42:37 CEST

Original text of this message