Re: A Normalization Question

From: Alan <not.me_at_uhuh.rcn.com>
Date: Mon, 28 Jun 2004 12:52:40 GMT
Message-ID: <swUDc.10129$Av3.4182_at_nwrdny01.gnilink.net>


"VHarris001" <vharris001_at_aol.com> wrote in message news:20040626120209.29625.00000578_at_mb-m16.aol.com...
> Over the years, I've tried several different relational database schemes
for
> tracking personal and professional information, and in every instance have
been
> frustrated with the results.
>
> The biggest problem seems to lie in identifying entities, or 'things.'
For
> instance, when I set up a corporation as a business entity in a table, and
use
> other tables for employees, vendors, customers, strategic partners, etc.,
I
> find that some employees are also vendors, customers, and partners. Of
course,
> this means that I must enter the same person (or 'thing') into multiple
tables.
>
> At one point, I thought I could work around this difficulty by
'normalizing'
> the data -- for instance, by creating just one table for 'things.' But in
> trying to do this, I discovered that I still could not 'normalize' the
data
> sufficient to eliminate duplicate entries in tables -- to which the only
> solution seems to be dramatically increasing the number of tables.
>
> The only way to accomplish this that seems to make sense to me is to
create one
> table that has a unique instance for each entity and another table that
relates
> entities. (I note that a entity-relationship approach was was being
discussed
> in another thread here.).
>
> In this conception, the unique corporation would have a unique identifier
> (probably a unique number) in the entity table. Same with the person who
is an
> employee. Also, in the entity table would be the entities, for example,
of
> "Delaware Corporation." and 'Human Being.'
>
> In the relationship table, the corporation entity would be related to the
> "Delaware Corporation' entity, the employee entity would be related to the
> 'Human Being' entity, and the corporation entity and the employee entity
would
> be tied together in the relationship table as type 'employer-employee.'
>
> When the employee also purchases from the employer, the relationship table
> would be updated to tie the corporation and the human being as type
> 'vendor-customer,' etc.
>
> In this way, the entity table contains all entities, and the relationship
table
> contains all the pointers between entites, and describes the relationships
> between those entities.
>
> To relate this to the other thread about things and relations, in this
> conception, the names Brown and Browne, and the color brown would all
three be
> entities in the entity table. If human being Brown was incorrectly tied
to the
> entity 'Browne' in the relationship table, only the relationship table
would be
> updated to point to the entity name Brown.
>
> In this way, as was commented on in the other thread, even the letter 'b,'
if
> helpful, could be set up as an entity or a thing.
>
> I guess this would be normalization in the extreme, except that once the
data
> is normalized to this degree, it no longer makes sense to keep different
data
> types in different tables, because the number of tables becomes unwieldy.
>
> Is there any database scheme that is set up in this manner, and that could
be
> used to eliminate the necessity of entering the same 'entity' in multiple
> tables?
>
> V Harris

Your normalization problem is really a business definitions problem. If you were to create an ERD first, you would have no problems normalizing. This is a standard employee-supervisor problem, where an employee has a supervisor and a supervisor is, of course, an employee. It is a recursive relationship. You need one table with multiple FKs (one for each relationship) that refer to itself.

For example, we'll use Tax ID Number (TIN) as the PK. In the U.S., all TINs are nine digits ssn and company tin and do not overlap. So your table may look like this:
THINGS
thing_tin (pk),
name,
address,
etc
vendor_tin (fk),
emp_tin (fk),
supplier_tin( fk),
customer_tin (fk),
partner_tin (fk),
etc

You then join the table to itself, using the proper FK. If (when!) you have multiple addresses (or other attributes), you then split out all of the non-fk attributes in the THINGS table, leaving behind a many-to-many linking table:

THINGS
thing_tin (pk),
vendor_tin (fk),
emp_tin (fk),
supplier_tin( fk),
customer_tin (fk),
partner_tin (fk),
etc (fk)

VENDOR_ATTRIBUTES
vendor_tin (pk/fk),
name,
address,
etc

EMPLOYEE ATTRIBUTES
emp_tin (pk/fk),
name,
address,
etc

ETC_ATTRIBUTES

.
.
.

That should do it. Received on Mon Jun 28 2004 - 14:52:40 CEST

Original text of this message