Re: 1 NF

From: dawn <dawnwolthuis_at_gmail.com>
Date: 2 Mar 2007 14:28:05 -0800
Message-ID: <1172874485.298524.150580_at_30g2000cwc.googlegroups.com>


On Mar 2, 12:34 pm, "JOG" <j..._at_cs.nott.ac.uk> wrote:
> On Mar 2, 4:45 pm, "dawn" <dawnwolth..._at_gmail.com> wrote:
>
>
>
> > On Mar 2, 6:54 am, "JOG" <j..._at_cs.nott.ac.uk> wrote:
>
> > > On Mar 2, 10:40 am, Stefan Nobis <sno..._at_gmx.de> wrote:
>
> > > > Gene Wirchenko <g..._at_ocis.net> writes:
> > > > > "Alfredo Novoa" <alfred..._at_gmail.com> wrote:
> > <snip>
>
> > > N.B. That if the domain is {1, 2, 3, 4, 5} the value "{1, 2}" is not
> > > in this domain. This is why an MV approach /using sets/ makes no
> > > sense to myself.
>
> > But the domain of an attribute could include sets, right? The domain
> > could be sets with elements from the domain of colors, for example.
> > So, if you have {yellow, blue, green} as valid colors, then the domain
> > of an attribute could be all subsets of this set or even all lists
> > composed of elements from this set.
>
> Yes. And of course then, to be consistent, every element would have to
> be a set,

Why? Because if we have one element that is a number, all have to be? Of course not! But if you like it to be tidy, you could model each as a set, where a value might be the null set {}, a set with one element {"john_at_aol.com"}, a set with multiple elements, or even a list, perhaps also modeled as a set: {("john_at_aol.com",0),("jdoe_at_gmail.com", 1)}

In any case, each attribute may have its own domain. If one has a domain of sets, others need not. You chide me often, so this time around I'm gonna suggest you could be more careful than you were with that last statement, doncha think?
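
A minimal Python sketch of that modeling, offered only as an illustration: the names COLORS, powerset, COLOR_SETS and list_as_set are hypothetical and do not come from any particular DBMS. It builds a domain consisting of all subsets of a color set, and models a list as a set of (value, position) pairs, as in the e-mail example above.

```python
from itertools import chain, combinations

# Hypothetical base domain of colors (illustrative only).
COLORS = frozenset({"yellow", "blue", "green"})

def powerset(base):
    """Return every subset of `base`, each as a frozenset."""
    items = sorted(base)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

# Domain of a set-valued attribute: all subsets of COLORS.
COLOR_SETS = powerset(COLORS)

def list_as_set(values):
    """Model an ordered list as a set of (value, position) pairs."""
    return frozenset((v, i) for i, v in enumerate(values))

# Three values from a domain of sets: empty, singleton, and a list-as-set.
no_email = frozenset()
one_email = frozenset({"john_at_aol.com"})
email_list = list_as_set(["john_at_aol.com", "jdoe_at_gmail.com"])
```

Note that a bare color such as "yellow" is not a member of COLOR_SETS, while frozenset({"yellow"}) is. That is the distinction between the domain of colors and the domain of sets of colors, and only the attribute declared over COLOR_SETS has to hold sets.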

> even if they were just singletons. How screwy would that be.
> One would need two mechanisms - one to deal with propositions and the
> other to deal with sets. And then another to deal with multi-sets
> right? Oh, and another for lists. And on and on ad infinitum.

Yes, there is an infinite variety of possible domains. I don't find this very disturbing in practice, nor in theory, although I do understand models like Pick and MUMPS opting for all values to be strings or sets/lists/bags of strings, which can then be cast to numbers, dates, etc. It simplifies some things, while eliminating all attribute values that are sets, bags or lists simplifies other things. There are pros and cons to these things (along with beliefs, of course ;-)

> I'll stick
> to a nice neat algebra with one type thank you very much.

I just cleaned up my bedroom and now the guest room is a mess! You can make one little thing tidy, but when trying to optimize the whole, it might make sense to complicate something, such as within a DBMS engine, in order to simplify application software maintenance, for example.

> > > I have swung wildly in my understanding of 1NF and whether or not it
> > > made sense to me. I've come out the other side with the belief
>
> > It's all religion, brother ;-)
>
> It's bloody well not for me. It's an engineering problem.

That would be my take. I wondered why you brought those "beliefs" into it.

> > > that
> > > 1NF is absolutely essential for good data modelling. However I also
> > > believe that being in 1NF does not necessarily preclude propositions
> > > with multiple-values (you just couldn't use relations to store them).
>
> > Why not use relations to model these propositions?
> > Here is one tuple (add quotes as desired)
>
> > Person(John,Doe,[blue, green, blue, yellow],23,4/6/1956)
> > where [...] is a list
>
> > > It is very hard to debate this though because many often assume that
> > > 1NF => relations
>
> > which is surely not the case with the old def of 1NF and the
> > mathematical def of relation
>
> > > and MV => NFNF, both of which are wrong. Rather
>
> > MV is NF2 by the def of NF2. Non-first normal form means that at
> > least nested sets are possible because NF2 takes its cues from the
> > old (still most commonly-used) definition(s) of 1NF. Deciding that 1NF
> > means something different just means that we have overloaded the term
>
> I was not born in 1969, so care little about the origins of the term
> 1NF. I just care that in my sphere of work there is general consensus
> that 1NF means that for any element in a tuple, I pick one value from
> one domain. (i.e attribute-value pairs form a binary mathematical
> relation - a finite partial mapping in the RM)

Most of us can adjust to the term "bad" being a good thing as long as there is still a term we can use for "bad" that doesn't mean "good." Please provide me with a term for the form-formerly-known-as-1NF that will be generally understood in our profession so I don't have to refer to it that way again. Otherwise I will continue to use the term 1NF when I think people still know that "bad" means "bad" and will have to use the longer form in this forum.

> > and now NF2 refers only to one of these two (categories of)
> > definitions. But yes, by Def MV ==> NFNF aka NF2
>
> > > relations => 1NF
>
> > That is just a silly redefinition
>
> It is not a definition. It is a material implication.

I was referring to the def of 1NF. I understood your notation.

> If I have
> relations then I also have 1NF. But I do not require relations to have
> a model in 1NF.
>
> > that serves to make it that we do
> > not change our practices to align with our new understanding of
> > theory. Relations do not need to be in 1NF by the original and most-
> > commonly used def of 1NF. Let's say instead that we now understand
> > that relations do not need to be in 1NF. I really, really dislike
> > this toying with terms in a way that obscures changes in theory to the
> > detriment of our practices. Once there was a pretty widely agreed
> > upon understanding that the form-formerly-known-as-1NF was not
> > required, we should have said "We don't need to put relations in 1NF
> > anymore."
>
> > > and NFNF => MV.
>
> > This is also inaccurate IMO. MV is acronym for "MultiValue" which is
> > a trademark (Spectrum International owns it) for those databases that
> > use the Nelson-Pick data model employed by the database formerly known
> > as PICK, among others.
>
> I appreciate the correction in terminology - you see, that is helpful.
> By 'MV' I meant the desire to store or view a proposition of the order
> "Employee 5 has name 'John Smith' and works for departments 4,5 and
> 6." without breaking it down into 3 separate propositions. What would
> you think a better general description/acronym to use if I am
> referring to this broad category? (obviously not 'NF2' as that is only
> a single methodology for encoding the proposition).

NF2, sorry. Saying that you will keep this in non-first normal form would be understood by the broadest range of IT/DP professionals to mean what I think you are intending, even though that uses (old) terminology from the relational model for the proposition. You may also call it MV, as most will understand that to be Non-1NF.
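
For concreteness, here is a small Python sketch of the two shapes under discussion: the single nested (non-1NF) tuple for the "Employee 5" proposition and its three flat counterparts. The attribute names and the helper functions unnest and nest are hypothetical, chosen only for illustration, and are not taken from any MV or relational product.

```python
# One nested tuple: the proposition kept whole, with a set-valued attribute.
nested = {
    "emp_id": 5,
    "name": "John Smith",
    "departments": frozenset({4, 5, 6}),
}

def unnest(row, attr):
    """Flatten one set-valued attribute into one row per element."""
    rest = {k: v for k, v in row.items() if k != attr}
    return [dict(rest, **{attr: v}) for v in sorted(row[attr])]

def nest(rows, attr):
    """Group flat rows that agree on every attribute other than `attr`."""
    groups = {}
    for row in rows:
        key = tuple(sorted((k, v) for k, v in row.items() if k != attr))
        groups.setdefault(key, set()).add(row[attr])
    return [dict(key, **{attr: frozenset(vals)}) for key, vals in groups.items()]

# Three separate propositions, one per department.
flat = unnest(nested, "departments")

# Nesting the flat rows again recovers the original tuple.
round_trip = nest(flat, "departments")
```

In this sketch the two forms carry the same information, so the choice between them is about where the complexity lives rather than about what can be stated.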

> > But there are other data models, such as M/
> > MUMPS, the one employed by Caché that are non-first normal form and
> > are not MV. Cheers! --dawn
>
> > > Regards, J.

cheers! --dawn

Received on Fri Mar 02 2007 - 23:28:05 CET
