Re: normalization problems

From: Vaughan Powell <vaughan_at_workmail.com>
Date: Fri, 22 Dec 2000 11:04:22 GMT
Message-ID: <91vcfl$vtc$1_at_nnrp1.deja.com>


In article <3a428996.34700317_at_news.ntlworld.com>,   kosmond_at_softhome.net wrote:
>
> >So you are looking at something like this:
> >

 <snip>
> >where the prod-cat table contains all of the valid product-category
> >combinations, and then all of the attributes specific to just
 products
> >or categories listed in those seperate tables, yes?
>
> Thanks for the reply! However that's not quite what I meant, my fault
> for being unclear...
>
> I'm looking at something like this
>
> Vendor
> -----------
> vendor_code
> (etc)
>
> Product
> -----------
> product_code
> vendor_code
> (etc...product stats eg release date, max users)
>
> Product_Category
> --------------------------
> product_code
> table_ref
>
> CRM
> --------------------
> product_code
> response_cycle_analysis
> kpi_trending
> (other crm specific fields)
>
> OLAP
> -----------
> product_code
> smp_parallel_processing
> (other olap specific fields)
>
> Vendor 1-n Product
> Product n-1 Product_Category
>
> Product 1-1 CRM
> Product 1-1 OLAP
>
> CRM and OLAP are two of a number of product categories which will be
> needed. Depending on what type of product a product is, I need to
> store different information in different formats, that's why I've put
> CRM, OLAP etc in separate tables.
>
> Each product will fall into at least one of the tables CRM, OLAP etc.
>
> The Product_Category table is really a workaround so that I can tell
> which category table I need to look in to find more data on a product.
> So I would be hardcoding in something like:
> if (tableref='OLAP') {
> SELECT * FROM OLAP...
> }
> etc
>
> Although I can implement what I need to with the layout above, it
> seems like an awful solution! Any ideas? Afraid I've only worked
> with 'classic' library/stock style databases before, and this has got
> me stumped!
>
> -Karen.
>
>
Karen,

This is just a 'get you started' suggestion, not a complete database design - you will have to follow through my ideas and see if they a suitable for your application.

You could generisise the Product Category details with the Category_Attribute table:

Category_Code    -- Primary Key
Attribute_Code   -- Primary Key - -- e.g. response_cycle_analysis
Attribute_Type   -- e.g. datetime, numeric, string
Attribute_Label  -- e.g. Response Cycle Analysis

This will hold as many attributes for a category as you like and serves to define the attributes for a Category.

You will then need a Product_Category_Attribute table:

Product_Code     -- Priamry Key
Category_Code    -- Primary Key
Attribute_Code   -- Primary Key -- -- e.g. response_cycle_analysis
Attribute_Value  -- Value of attribute

This will hold the values of the attributes for each product-category combination. You can either hold the attribute values in a generic varchar column, i.e. Attribute_Value or use separate numeric, date, varchar, e.g. Attribute_Value_Varchar, Attribute_Value_DateTime, Attribute_value_Numberic etc - one for each datatype that you expect to use. Your code would the the appropriate column depending on the Attribute_Type. The benefit of separate columns is that indexed search will work correctly as you will not have to trasform the data types when select data, e.g.

     Instead of:

        WHERE CAST (Attribute_Value AS DATETIME) = '01-Jan-2000'

     you will use:

        WHERE Attribute_Value_Date = '01-Jan-2000'

     which is much more efficient.

This solution means that the value of each attribute for a productcategory  link will appear on a separate row, but the solution will grow as your categories grow and will gracefully handle addition of new category attributes with no change to the DB design.

This solution eleiminates the explosion of Category tables at the expense of more complex code to retrieve and interpret the data.

Note that I have suggested table design in the same spirit as your existing ones. However I would recommend that you use integer surrogate keys (or identities) rather than codes in your primary key fields. This is an additional normalisation as it copes with changes to code (which do happen in real life) and is also much faster as key seeks on integer indexes are much faster than on character indexes.

Hope this is of use.

--
Vaughan Powell MCDBA, MCSD, MCSE
Data Architecture and Applications Design Manager
BuildOnline


Sent via Deja.com
http://www.deja.com/
Received on Fri Dec 22 2000 - 12:04:22 CET

Original text of this message