Rookie asks about "Normalization"

From: <Rubenm_at_sys653.chatlink.com>
Date: 1995/12/21
Message-ID: <173293_at_653.chatlink.com>


normalizing your database is only half of the task. you need to make sure that your SQL statements are processing set-at-a-time (set processing). and not R.A.T.s (record at a time). i've seen a lot of people use SQL regardless of the platform (Oracle, DB2, Sybase, Informix, etc.) and complain and blame the following in this order... first the hardwar vendor.. then the database vendor... but never the database designer and programmer.

 you see, as you move in to higher normalization levels 4NF..and 5NF you even need to make sure that you are maximizing your SQL thru SET at a timeprocessing. (it doesn't mean that if you are using SQL you are already doing set processing.) Relational database design goes hand-in-hand with SQL set processing. Therefore, even if you are using set operators like union, minus, intersect, etc.. but inside a declare cursor statement, and a corresponding fetch thereafter.... then you are not processing your data in sets...and if your database is normalized...3NF and up... then your system will ....CRAWL... specially if its a large database (250 million records is typical) when it comes to processing in batch mode... (so the guys who don't understand set processing will tell you to denormalize your data right away...good thing they are a dying breed)

If you are using a relational database platform that supports parallel database query and SMP or MPP harware with an O.S. that is tuned for SMP/MPP hardware then I suggest that you even further normalize your database into fourth normal form or even 5th normal form specially if its an operational database.

relational design basically benefits operational systems (OLTP) because these type of systems involve a lot of inserts updates and deletes. however the problem with relational design is that it may require multi-table joins on your selects or inquiries (selects...)...

most likely your systems has been designed in a manner typical of legacy systems (mostly heirarchical designs coming from old COBOL type applications)... i've even seen systems using SQL but programmed COBOL style (typically the select statement was treated like the cobol read statement)...
quite sad... i've seen some big shops claiming that they have moved their old systems and converted it into a relational database.... you see my friend buying an RDBMS... does not make a database system relational... you have to make sure if your applications have conformed to proper normalization rules (this is regarding your file/table design) and and that your are processing your rows/records set-at-time (try insert into .... with a select clause or a create table as select).

the best way to understand the set paradigm is to look at your tables as decks of cards the you cut group at a time instead of picking up one card at a time.   

however, if some of your systems work on a lot on historical data and will use a lot of selects (only few inserts, updates, and deletes... ) then you may be safe following some guys suggesting denormalization. however, may i suggest that you try looking for the following design principles.... GNF or Grammatical Normal Forms or STAR schemas... look up the following companies... Relational Systems and Ralph Kimball and Associates... By the way, these two methods of design GNF and STAR could actually be used on 'data warehouse'/DSS projects.... but please don't use these on your OLTP application. Try getting a copy of Bill Inmon's "Client/Server Appications Development" and his data warehousing book to get an insight and a more detailed explanation of what i am saying... Codd would have been delighted to see you doing it the right way....
projects..... but please don't use GNF and STAR on your OLTP/operational databases.

Lastly my friend, keep on looking at your optimizer's query plan because it tells you how your SQL will execute....try as much as possible to work on test data that's almost as big as your production volume. your optimizer may change it's query plan at certain volumes (this concerns the use of your indexes). if you don't have the volume data available then you can generate one by making a small program (you can do it in pro* C) of pure C and then load them to your schema. then validate your SQL by looking at SET EXPLAIN..

In my group I require that the SQL's be formed already one's the table design has become stable.. only after we have STRESS tested our SQL's against our schema will we proceed into construction (SQL generation and testing happens during detailed design) not during construction.

Hope this would be of help a bit!!!!

Merry Christmas and yes there is a Santa Clause!!!

from your friendly Informix user ...... (I've also used Oracle 7.0 and DB2 on VLDB projects)...

P.S.

  Proper relational database design also helps you enforce referential integrity effectively. Received on Thu Dec 21 1995 - 00:00:00 CET

Original text of this message