Re: Relation or attribute and why

From: J M Davitt <jdavitt_at_aeneas.net>
Date: Fri, 26 May 2006 02:20:52 GMT
Message-ID: <8Ctdg.47027$P2.3916_at_tornado.ohiordc.rr.com>


Bob Badour wrote:
[snip]
>
> Requirements (identified need/business case)
> |
> V
> Analysis -> a conceptual model (information)
> _________________/
> |
> V
> Design -> a logical design (data)
> _________________/
> |
> V
> Implementation (physical design) -> a physical layout (media encoding)
>
>
> Separating design and implementation (logical and physical) as much as
> possible has numerous benefits including adaptability, error avoidance,
> automated maintenance etc. Most products and all non-relational products
> force one to combine them to a large extent.
>
> Separating analysis and design as much as possible has numerous benefits
> too including early cost estimation, identifying buy vs. build
> opportunities, scope identification, early identification and
> reconciliation of diverse needs, identification and verification of
> unspoken assumptions, automating design etc.

[The following is excerpted from something I wrote for another audience.]

The first "database" work product to be produced in a database project is a conceptual model. This model is the result of requirements analyses during which the entities and events of interest are identified, analyzed, and correctly described.

The next work product is a logical design. This design is the result of carefully and completely mapping the conceptual design to a machine-sensible design. The logical design need not be implemented - but it could be: all the analysis and hard* work necessary to realize the automation of a physical design must be done or demonstrably doable before the logical design is complete.

  • "Hard" as in rock, not as in difficult. I'm referring to the phenomenal amount of precise and painstakingly correct work that must be done in order to get computers to the point where they are useful.

The logical design then serves as a starting point for physical design. Knowledge of the data to be stored, the system's required availability, the database management system to be used, and anticipated uses of the stored data determine which storage schemes are most appropriate and, consequently, which physical design alternative is best.

The greatest potential for divergence from the user's perception of entities and their associations occurs during the physical design. Because it is, after all, for the benefit of users that we set upon any implementation, such divergence is undesirable and may, in fact, render the entire effort useless. To restore cohesiveness, features which affect closure between the physical design and the user's perception of the data and processes must be added to the product.

The point is that, whether in logical design or physical design, some entities or features or processes or artifacts which would only confuse or confound users may well find their way into the product. We should not casually expect the user to accept the the burden of dealing with such things. Some methodical effort must be made to make the product look and feel as users expect it to.

That's presentation design. And, strictly speaking, it has nothing to do with database design. It does, however, greatly determine how users will perceive the product.

[End of excerpt.]

I think my description is essentially identical to Bob's, except my approach makes one final step explicit: the presentation design.

This is the portion of the product where "dimensional designs" and "star schemas" and "materialized views" can appear -- because there are a good number of useful tools that can easily and effectively be connected to such designs. But this is not the "layer" in which the "real data" are stored: only a fully normal design is acceptable for that purpose.

In other words, denormalization is proper solution to some problems -- but denormalization requires a fully normal design as a starting point. An there's a universe of difference between 'denormalized' and 'non-normalized.'

One thought occurred to me as I was reading this: although I tried to be careful and consistent with my choice of words, it seems that there's plenty of opportunity for confusion when so many products, work products, designs, models, projects, tools, implementations, and systems are all crowded into s small space.

I recently sat in a meeting attended by developers and project managers. Someone used the terms "scope" and "requirements" in the same sentence, and I knew a train wreck was not far away: these terms have precise and very different meanings in the two worlds. Very confusing.

> The difference between information and data is one of the most
> fundamental and important differences in our field.

And I am amazed that ignorance of the distinction is widespread among "seasoned professionals." Received on Fri May 26 2006 - 04:20:52 CEST

Original text of this message