Re: Define "flatten database" ?

From: Dawn M. Wolthuis <dwolt_at_tincat-group.comREMOVE>
Date: Sat, 29 Jan 2005 11:20:16 -0600
Message-ID: <ctggko$j6h$1_at_news.netins.net>


<snip>
> Geez, Dan, I was on your side. I was responding to Dawn's insitence that
> flattening was something other than denormalizing..

But I didn't insist on that, Alan -- I tried to explain that the term "flattening" was sometimes used interchangably with the term "normalize" (even if incorrectly). In that case it means that data that were in a structure not usable with ODBC (to put it in concrete terms) were virtually put into 1NF (one aspect of being normalized) and then virtually joined into a VIEW of the data that SQL92 understands (that's where it is then denormalized).

And because I think you might have what it takes to understand this point, I'm going to give it one more shot and I'm really, really hoping that you come back and say "OK, I do understand your point now".

Take two people, A & B, who work with non-1NF data. If you have no other reference for data stored in a non-1NF structure, think of data in XML documents (others might prefer to think of a VSAM or indexed-sequential file defined with a COBOL OCCURS clause, even if that shows your age). Use the same example I used earlier with (PersonID, LastName, FirstName, Email*) tuples that include a list of Email addresses.

  1. Can you put that data in Excel? B. Sure. The data are not normalized -- do you want all of the e-mail addresses for a person in a single cell? A. No -- why don't you normalize it for me. B. Will do. I'll flatten it and send you the spreadsheet. A. Thanks.

This conversation is one that could definitely happen in the world of non-RDBMS products. Cam you see why that is? Have I had any success explaining why a person using the term "flatten" might be thinking in terms of normalizing, even if the 1NF step is only the first of a process like this in our example:

  1. put the list of Email addresses into its own (virtual) table, with a candidate key of the PersonID plus the e-mail address -- that is, put into 1NF, the first step in "normalizing" (at least in the "old" relational version of normalizing).
  2. create a view of the data spanning both the original table sans Email and the new (virtual) table of e-mail addresses
  3. use the view of the now "normalized" (from the first step) data, recognizing that the step of making a view was a denormalizing step (but not the same one that would put us back to the original data -- it is still a step that keeps us in the world of SQL92).

That was my last shot at it, so if you still think I'm either trying to skew my information or am making no sense, well, I tried my best. Cheers! --dawn Received on Sat Jan 29 2005 - 18:20:16 CET

Original text of this message