Views for denomalizing

From: Dawn M. Wolthuis <dwolt_at_tincat-group.comREMOVE>
Date: Wed, 2 Feb 2005 22:34:40 -0600
Message-ID: <cts9lf$9h3$1_at_news.netins.net>



The top of this posting is intended to be devoid of opinions, so please correct if I have any misconceptions prior to the question.
  1. SQL-DBMS's (at least those that conform to SQL92) provide no constraints on the user creating new tables to restrict base tables from being denormalized EXCEPT in the case of the first normal form.
  2. For relations that are in 3rd normal form, for example, the user can still have a denormalized View of that data and, in fact, many Views are denormalized.
  3. This denormalization is for the purposes of the user viewing the data they need, together in one view, so they can report against it in a single report, for example.
  4. The only normal form that cannot be breached when creating SQL92 views is the same one required for base tables -- 1NF. Even though the purpose of a View is to provide the user a "view" of their data, if the user would like to see lists as attributes (e.g. a list of e-mail addresses as an attribute for a person), there is no provision for this in a SQL92 View.

Is that all correct?

Question: I recall that SQL Server UDF's permit lists to be returned, while the last time I asked, Oracle did not. Do SQL Server customers create views that include these user defined functions (returning derived data values) with lists of values? If so, how does SQL handle queries against such views? If not, what prohibits this?

Note: This might sound like a comp.databases question instead of cdt, but it relates to my questions about the legitimacy of 1NF.

----the rest of this note is more blog-like, so if you aren't interested in the author's opinions, stop here

1NF is the only normal form that is forced upon us by many common industry tools. This is really backwards, however, given that most of us do not contest the importance of functional dependency-based normal forms, but 1NF is the one that doesn't have the same backing across the industry.

For example, relational folks have now tossed in relation-valued attributes, even if they can't bring themselves to say the data would then not be in 1NF -- instead they completely redefine 1NF so that it is meaningless, that is, so that relation=1NF by definition. OO folks; XML folks, and long-time developers using non SQL-based tools all "ignore" 1NF. Yet SQL-DBMS's and related tools consider it so much more important than the other normal forms that even in the views, where denormalization is common, acceptable, and clearly useful, non-1NF is still not (typically, as best I can tell) employed.

I started this line of thinking a few years ago when doing mappings that were similar to XML to SQL -- non-1NF to 1NF, two-valued logic to three, etc for the purpose of ODBC access to such data. The users were then pleased to be able to retrieve their data from Excel, for example, but wanted the ease of use of their previous "views" of the data and query language that went with it (similar to SQL, but 2VL and including EVERY and other features for handling embedded lists). I was not as hopeful then as I am now that the industry would move somewhat quickly in my preferred direction, away from an insistence on 1NF.

However, with web services requiring SQL to and from XML mappings for developers who work with data in SQL-DBMS's, tools do seem to be changing. I am hopeful that within the coming decade fewer and fewer students will graduate with Computer Science degrees thinking of normalization as starting with what the last couple of decades of students were taught was 1NF. But we as a profession will need to attend to those who already graduated to eliminate the bias against lists within attributes, at least when it comes to views of the data, if not base relations.

Agree? --dawn Received on Thu Feb 03 2005 - 05:34:40 CET

Original text of this message