Re: The wisdom of the object mentors (Was: Searching OO Associations with RDBMS Persistence Models)

From: Neo <neo55592_at_hotmail.com>
Date: 3 Jun 2006 19:57:40 -0700
Message-ID: <1149389860.815874.4660_at_i39g2000cwa.googlegroups.com>


>>> colors :
>>> name part color
>>> ---- ---- -----
>>> john skin black
>>> john hair red
>>> john hair blue
>>> mary skin white
>>> mary hair red
>>>
>>> textures :
>>> name part texture
>>> ---- ---- -------
>>> mary skin smooth
>>> mary hair silky
>>> mary hair smooth

>> ... mary is represented in two tables and in multiple rows and rmdb has no reliable method of knowing that they are the same.

> What do you mean by "knowing that they are the same"? What is "they"?

In the original relational implementation representing persons and their skin and hair types, there was no reliable method of determining whether different values referred to the same mary or to different marys. This becomes apparent when a new mary with purple hair is added. For example, after adding a new mary with purple hair, the color table would be as follows:

name  part  color
----  ----  ------
john  skin  black
john  hair  red
john  hair  blue
mary  skin  white    ; old mary
mary  hair  red      ; old mary
mary  hair  purple   ; new mary

Since a person (e.g. john) can have hair with multiple colors (red and blue), after the new mary is added the app processing the data cannot tell whether there is one mary with white skin and red/purple hair, or two marys: one with white skin and red hair, and a second with purple hair.
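
The ambiguity can be sketched in a few lines of Python. The row layout follows the color table above; the names `one_mary`, `two_marys` and `to_color_rows` are hypothetical and exist only for this illustration.

```python
# Two hypothetical real-world situations, described per person.
# Each person is a (name, [(part, color), ...]) pair.
one_mary = [
    ("mary", [("skin", "white"), ("hair", "red"), ("hair", "purple")]),
]
two_marys = [
    ("mary", [("skin", "white"), ("hair", "red")]),   # old mary
    ("mary", [("hair", "purple")]),                   # new mary
]


def to_color_rows(persons):
    """Flatten persons into (name, part, color) rows, as in the color table."""
    return sorted(
        (name, part, color)
        for name, attrs in persons
        for part, color in attrs
    )


rows_one = to_color_rows(one_mary)
rows_two = to_color_rows(two_marys)

# Both situations flatten to exactly the same rows, so the rows alone
# cannot tell the app which situation it is looking at.
print(rows_one == rows_two)
```

Once the persons are flattened into rows keyed only by the name string, the information about how many marys there were is gone.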

> What redundancy are you referring to?

In the original color table above, the person john is represented twice, once by the value in the first tuple and again by the value in the second tuple. Representing a thing multiple times in a db leaves it susceptible to various problems and inflexibilities. In subsequent extensions of this example, I will show how the relational implementation will require additional schema updates to handle new data requirements, but dbd won't (because, among other reasons, it doesn't represent things redundantly). Also, various colors and textures are represented redundantly.

> If we take (name part color) and (name part texture) as the primary keys for colors and textures, then how is the database not normalized?

Sorry, by not normalized, I mean it has redundant data. Defining the primary keys as above will prevent the RM implementation from representing a second person with the same characteristics. Imagine we have two androids. One uses the above relational schema, the other dbd. Both droids meet two marys with the same skin color. The RM-based droid will not be able to represent the second mary and will require a schema/constraint update. The dbd-based droid will represent the second mary with the same skin color.
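
A minimal sketch of the RM-based droid's problem, using Python's standard `sqlite3` module. It assumes the color table is declared with (name, part, color) as its primary key, as proposed in the quoted question; the helper `add_second_mary` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: the whole row is the primary key.
conn.execute(
    "CREATE TABLE colors ("
    " name TEXT, part TEXT, color TEXT,"
    " PRIMARY KEY (name, part, color))"
)

# The first mary with white skin.
conn.execute("INSERT INTO colors VALUES ('mary', 'skin', 'white')")


def add_second_mary(conn):
    """Try to record a second mary with the same skin color."""
    try:
        conn.execute("INSERT INTO colors VALUES ('mary', 'skin', 'white')")
        return True
    except sqlite3.IntegrityError:
        # The key constraint treats the row as a duplicate of the first mary.
        return False


second_mary_stored = add_second_mary(conn)
row_count = conn.execute("SELECT COUNT(*) FROM colors").fetchone()[0]
print(second_mary_stored, row_count)
```

Under this assumed key, the second insert is rejected and the table still holds a single row, so the second mary cannot be recorded without changing the schema or the constraint.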

> What scenario can you repeat? Please do so within this example.

We have already begun the process. So far, you added an ID column to handle multiple marys. Once that solution is cleaned up, I will present a new data requirement that will require RM schema updates but not dbd updates, because dbd uses a more general/flexible/systematic method of representing things.

> By the way, isn't MV a type of data model?

Depends on your definition of data model. I would classify it as a type of data model.

> How does a data model "implement" something?

I am having trouble grasping the context of your question. A data model is a human-created methodology for representing things. In general, humans implement that methodology as faithfully as possible on some computing device in the form of an actual database product, e.g. SQL Server or MS Access. In turn, the db implements that methodology to represent things.

> Or do you mean "dbd doesn't suffer from the flaw in most/all current implementations /of/ MV models"?

As far as I know, the MV model itself does not provide a way to "normalize" values with data-independent references. Therefore, its implementations can't. On the other hand, RM provides a way to implement multiple attribute values such that the values can be "normalized" using data-independent references (keys).

> And here we would be talking about DBMS and not data models or databases correct?

With respect to implementing multiple values such that they can be "normalized" using data-independent references, RM is capable, MV is not.

> What do you mean by "similar values"?

In the original RM implementation, the value john in the first row is "similar" or actually redundant with respect to the value john in the second row.

> What is a data-independent reference?

Anything that connects/relates data and is itself unrelated to the data being related. For example, in the table above, the john in the first row is related to the john in the second row by a data-dependent reference (the string "john"). The problem with data-dependent references is that they don't allow the db to resolve which data is related when another john is added. Using data-dependent references leaves open the possibility that the original schema will have to be redesigned to accommodate new data requirements.
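
One way to picture a data-independent reference is the ID column mentioned earlier in this thread. The sketch below uses Python's standard `sqlite3` module; the table layout, the column names and the `add_person` helper are assumptions made for illustration. It is neither dbd's method nor the exact schema proposed by the other poster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The id carries no information about the person; it only identifies.
    CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE colors (
        person_id INTEGER REFERENCES persons(id),
        part TEXT,
        color TEXT
    );
""")


def add_person(name, attrs):
    """Insert one person and their (part, color) pairs; return the new id."""
    cur = conn.execute("INSERT INTO persons (name) VALUES (?)", (name,))
    pid = cur.lastrowid
    conn.executemany(
        "INSERT INTO colors VALUES (?, ?, ?)",
        [(pid, part, color) for part, color in attrs],
    )
    return pid


old_mary = add_person("mary", [("skin", "white"), ("hair", "red")])
new_mary = add_person("mary", [("hair", "purple")])

# The name is now just data; the id decides which rows belong together.
people_named_mary = conn.execute(
    "SELECT COUNT(*) FROM persons WHERE name = 'mary'"
).fetchone()[0]
new_mary_colors = conn.execute(
    "SELECT part, color FROM colors WHERE person_id = ?", (new_mary,)
).fetchall()
print(people_named_mary, new_mary_colors)
```

Because rows are related through the id rather than through the string "mary", adding another mary does not blur which colors belong to which person.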

> If there are "multiple marys" then what is /a/ mary?

With respect to the current example, mary is a name. It can be the name of zero to many persons.

Received on Sun Jun 04 2006 - 04:57:40 CEST
