Re: Define "flatten database" ?

From: Dawn M. Wolthuis <dwolt_at_tincat-group.comREMOVE>
Date: Fri, 28 Jan 2005 12:50:46 -0600
Message-ID: <cte1i9$fka$1_at_news.netins.net>


"Alan" <alan_at_erols.com> wrote in message news:35vdofF4rvh8nU1_at_individual.net...
>
> "Dawn M. Wolthuis" <dwolt_at_tincat-group.comREMOVE> wrote in message
> news:ctdn9o$7sq$1_at_news.netins.net...
>> "DA Morgan" <damorgan_at_x.washington.edu> wrote in message
>> news:1106859773.807419_at_yasure...
>> > silversw2000_at_yahoo.com wrote:
>> >
>> >> I had a database analyst interview yesterday on the phone, and the
>> >> interviewer brought up the phrase: "I flattened the database".
>> >>
>> >> I did not obfuscate in replying to her thread, but was able to avoid
>> >> having to know what she meant by that term.
>> >>
>> >> Can someone define "flatten database" for me, in 100 words or less (OK
>> >> 500 words is fine).
>> >>
>> >> Fred Z.
>> >>
>> >
>> > Denormalize.
>>
>> I would guess "normalize" is more likely. If the source dbms permits
>> non-1NF structures, then the term "flatten" is sometimes used to mean
>> that
>> the data are put in1NF. Terms I might use as synonyms to "flatten" are
>> "unnest", "explode", "normalize". It isn't quite the same as
>> normalizing,
>> however, because the embedded lists are not placed in a separate
>> relation,
>> but rather the "scalar" data are repeated in each row, while the list has
>> one value per row. Reasons to flatten the structure would include a need
> to
>> use a SQL-92-based tool or pour the data into a single Excel worksheet.
>>
>> If the source dbms is a SQL-based product, then the user might still be
>> dumping tag-delimited lists in as attribute values (we all know there are
>> people who do that, right?) or even using some non-1NF capabilities of
>> the
>> particular RDBMS.
>>
>> Example:
>>
>> Start with:
>>
>> (12345,DeSmith,John,{jsmith_at_aol.com,JohnSmith_at_yahoo.com})
>> (23127,Meador,Beth,{beth_meador_at_msn.com,bmeador_at_hotmail.com,btm_at_ibm.org})
>>
>> Then flatten it to get:
>>
>> (12345,DeSmith,John,jsmith_at_aol.com)
>> (12345,DeSmith,John,JohnSmith_at_yahoo.com)
>> (23127,Meador,Beth,beth_meador_at_msn.com)
>> (23127,Meador,Beth,bmeador_at_hotmail.com)
>> (23127,Meador,Beth,btm_at_ibm.org)
>>
>> That's my best guess. --dawn
>>
>> > --
>> > Daniel A. Morgan
>> > University of Washington
>> > damorgan_at_x.washington.edu
>> > (replace 'x' with 'u' to respond)
>>
>>
>
> Dawn,
>
> Flattening is DEnormalizing. Period.

OK, I'd vote for both terms being incorrect. It is not denormalizing because it is starting with non-1NF data -- agreed? And it is not normalizing because, well, it clearly isn't (the key, the whole keyu and nothing but the key ...). However, the reason that "we" sometimes call it normalizing is that it takes a non-SQL92-compatible structure and turns it into something that can be used with ODBC, for example. By formal definitions (until perhaps recently) the source data are not in first normal form. So, informally, to get ODBC to work (for example) someone might suggest they are normalizing it. I'm one of the people who says "flatten" instead because normalizng is clearly wrong. However, it is still an informal way people might say it as in "we need to nornalize the data in order to pull it into Excel". (Wrong, but you can see how such language evolved).

> You have it backwards and twisted.

I think I had the concept correct and the "language that gets used" correct, even if such language is inaccurate. Agreed?

> It
> could be called Neo-normalizing.

That's going a bit too far now, don't you think?! smiles. --dawn Received on Fri Jan 28 2005 - 19:50:46 CET

Original text of this message