Re: Define "flatten database" ?

From: Alan <not.me_at_rcn.com>
Date: Fri, 28 Jan 2005 21:25:15 -0500
Message-ID: <360ah3F4s9igfU1_at_individual.net>


"Dawn M. Wolthuis" <dwolt_at_tincat-group.comREMOVE> wrote in message news:ctee0t$o47$1_at_news.netins.net...
> "Alan" <alan_at_erols.com> wrote in message
> news:35vj76F4sdq3pU1_at_individual.net...
> >
> > "Dawn M. Wolthuis" <dwolt_at_tincat-group.comREMOVE> wrote in message
> > news:cte2b6$h5u$1_at_news.netins.net...
> >> > OK, I'd vote for both terms being incorrect. It is not denormalizing
> >> > because it is starting with non-1NF data -- agreed? And it is not
> >> > normalizing because, well, it clearly isn't (the key, the whole keyu
> >> > and
> >> > nothing but the key ...). However, the reason that "we" sometimes
call
> > it
> >> > normalizing is that it takes a non-SQL92-compatible structure and
turns
> > it
> >> > into something that can be used with ODBC, for example. By formal
> >> > definitions (until perhaps recently) the source data are not in first
> >> > normal form. So, informally, to get ODBC to work (for example)
someone
> >> > might suggest they are normalizing it. I'm one of the people who
says
> >> > "flatten" instead because normalizng is clearly wrong. However, it
is
> >> > still an informal way people might say it as in "we need to nornalize
> > the
> >> > data in order to pull it into Excel". (Wrong, but you can see how
such
> >> > language evolved).
> >>
> >> I'll just add one other point that the reason that "flattening" from a
> >> non-1NF is sometimes referred to as normalizing is that there is a
sense
> >> that the nested structures are pulled into separate "rows" thereby
> >> normalizing, and then a view created that ties them to their original
> > table,
> >> but now as if both relations (the parent and the child) were normalized
> > and
> >> then joined. Did that help clarify the langauge?
> >>
> >
> > No. Putting anything into rows (seperate or otherwise) is not
normalizing

>

> I'm not sure you understood me. I really do know how to normalize data
> according to the usual practices -- I'm trying to explain a term, which is
> the topic of this thread.

You are not explaining- you are twisting a simple explanation so it fits your view of the world. Flattening is denormalizing. It is not normalizing.

  I also explained WHY when working with embedded
> lists, putting the data into a form that is usable as a SQL VIEW is
> sometimes referred to this way. This is an incorrect use of the term
> "normalize" but if you listen a bit, you might be able to hear how such
> language evolved:

>

> George: Can you pull that into Excel using ODBC?
> Harry: Do you want to have a person's list of e-mail addresses all in one
> cell or in separate cells?
> G: Separate ones
> H: The e-mail addresses are in a list, so I'll have to normalize the data
> first
> G: OK, let me know when you've flattened the data into Excel, OK?

This example makes no sense. I've NEVER heard ANYONE refer to the things you've mentioned in this way, or anything close to it, and I've heard some pretty dumb conversations in meetings.

>

> > Normalizing is simply a process of removing redundancy from data
>
> That is true for 2NF and beyond, but that is where I get on my soapbox
about
> 1NF. 1NF is not about reducing redundancy, is it?

In a sense, it is. 1NF is a description of a step in the process of acheiving a minimum of 3NF. You are under the misunderstanding that is a (first?) type of finsihed normalization. Technically speaking, I suppose it is, but no one I know of would stop at 1NF. It is just the first gas station between Chaostown and Organizedville. Fill'er up and keep driving. Next stop 2NF. No services available- just a porta-potty. Hardly even worth a visit.

>

> > and
> > reducing the possibility of insert, update and delete anomalies. It is
not
> > done at the row level, it is done at the individual data element level.
>

> Of course, but one can have a row with embedded lists, thereby indicating
> that the data are not in 1NF, right?
>

> > (Remember functional dependencies? Did you ever read the Elmasri/Navathe
> > book you bought?)
>
> About half of it and almost all of Date's 8th edition. I'll sitll admit
to
> ignorance, but perhaps not as much as you might think, so you might have
to
> decide whether or not I'm dumb instead ;-)
>

> > The structure of a row is defined +as a result+ of the
> > process, it is not the process itself. Flattening is the process of
doing
> > the opposite (DEnormalizing)- taking normalized data from many
normalized
> > tables (think of it as a kind of hierarchy (not that it is), or
> > parent-child
> > relationships (Joe Celko, do not yell at me- I am just trying to explain
> > this in simple terms)) and putting it into fewer tables (usually to one
> > table) (IOW, taking away the hierarchy "pyramid", therfore causing it to
> > appear "flat" or "flattened", hence the term "flattening"). This could
> > introduce redundancy, and therefore cause the above mentioned anomalies.
> >
> > I _still_ don't understand why you insist on making this complex. It
> > isn't -
> > it is soooo basic - but I give up.
>

> I was not trying to make it complex but to answer the original question
> since I have often used the non-RDBMS phrase "flatten the data" and,
> therefore, thought I might be able to shed some light on at least what I
> meant when I said it.
>

> >
> >> > You have it backwards and twisted.
> >
> >
> >>
> >> This might still be true. --dawn
> >
> > Well, yes...
>

> I'm not sure you are reading me right, but I'm sure you will feel free to
> continue to hold that opinion. Ah well ...
> --dawn
>
> Received on Sat Jan 29 2005 - 03:25:15 CET

Original text of this message