Re: Hierarchical structures - an overview

From: Dawn M. Wolthuis <dwolt_at_tincat-group.com>
Date: Mon, 5 Jan 2004 12:04:22 -0600
Message-ID: <btc8vb$721$1_at_news.netins.net>


"Marshall Spight" <mspight_at_dnai.com> wrote in message news:USgKb.64924$I07.282153_at_attbi_s53...
> "Dawn M. Wolthuis" <dwolt_at_tincat-group.com> wrote in message
news:bta3ac$69i$1_at_news.netins.net...
> > >
> > and if I thought that there really were gains by taking an application
that
> > could be 10 non-RDBMS database"files" (tables that include tables as
> > elements) and exploding it to more than 100 tables, I would be in favor
of
> > all those tables, but I don't, so I'm not. ;-)
>
> I think it's a mistake to use such simplified metrics as counting the number
> of tables or counting the number of trees. Neither metric gets at the
> fundamental complexity of the schema. Whatever the degree of complexity
> of the schema is, that's what it is, and efforts to paper over fundamental
> complexity do more harm than good.
>
> By analogy, I have noticed that junior Java programmers are sometimes
> reluctant to add classes, even when they have a perfectly valid
> abstraction. They will try to wedge the abstraction into an existing
> class. When I ask them why, they say they are trying to minimize
> the number of classes, because a lot of classes makes the system
> complicated. Their mistake is that in Java, classes are the fundamental
> unit of organization. Likewise, in an RDBMS, tables are the fundamental
> unit of organization. Consider: if reducing table count was a useful
> metric, we'd want to denormalize everything as much as possible. Ugh!
>
> If you want to be sure your schema (or structure) is no more complex
> than it needs to be, then the best way to address that is with a formal
> set of normalization rules. If your table definitions are fully normalized,
> then you have the right number of tables, whatever that number may
> be.
>
> Is there a formalized set of tree schema normalization rules?
>
>
> Marshall

Number of tables is not a primary metric for me. However, I do think it is counter-productive as well as counter-intuitive that if one prepares a table of books and another of authors, for example, and wants to associate the authors with their books, then in a relational model (and only in a relational model, it seems) one needs a book-author relationship table with two rows for a book that has two authors. In most other models, one could have the book "file" point to the two authors with a field that is a list. It is often the case in such models that, for efficiency, there are "return links" on the authors as well, pointing back to their books, so referential integrity must be maintained in both files. But the advantage is much more than having fewer tables -- it is in having tables that make sense to human beings in the way we use language and perceive objects.
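To make that concrete, here is roughly what the relational shape looks like in SQL-92 terms (the table and column names are just my illustration):

CREATE TABLE book (
    book_id   INTEGER PRIMARY KEY,
    title     VARCHAR(200) NOT NULL
);

CREATE TABLE author (
    author_id INTEGER PRIMARY KEY,
    name      VARCHAR(100) NOT NULL
);

-- the many-to-many association gets its own table
CREATE TABLE book_author (
    book_id   INTEGER REFERENCES book (book_id),
    author_id INTEGER REFERENCES author (author_id),
    PRIMARY KEY (book_id, author_id)
);

-- one book with two authors takes two rows here
INSERT INTO book_author VALUES (1, 10);
INSERT INTO book_author VALUES (1, 11);

The book_author table is the piece that has no counterpart in the other models; there, the list of author keys simply lives in a field of the book record.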

I have searched high and low for a good set of rules for performing normalization in a non-1NF structure and have decided I need to prepare such a list myself, so I'm working on that in my spare time.

I do agree with you regarding Java classes, however. I see classes much like data elements, the building blocks of any data model -- you don't want to cram more data into a single element than belongs there. With the relational model and a single relation, we tend to remove elements that DO belong there simply because of their cardinality. I suspect that is why many relational theorists are backing off from earlier definitions of 1NF that prohibited lists (or relations) as elements. However, SQL-92 (still the standard most implementations follow) does not support relations as elements (although SQL-99 does -ish).
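For what it's worth, the SQL-99 flavor of the book example would look something like the sketch below -- assuming a DBMS that actually implements the standard's ARRAY constructor, and the real syntax varies by vendor:

-- the author list lives directly in the book row
CREATE TABLE book (
    book_id    INTEGER PRIMARY KEY,
    title      VARCHAR(200) NOT NULL,
    author_ids INTEGER ARRAY[10]
);

INSERT INTO book VALUES (1, 'An Example Title', ARRAY[10, 11]);

Even there, as far as I know, you get no declarative referential integrity from the elements of author_ids back to the author table, so that bookkeeping still falls on the application -- much like the "return links" maintenance I described above.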

Cheers! --dawn

Received on Mon Jan 05 2004 - 19:04:22 CET
