Re: MV Keys

From: Jon Heggland <heggland_at_idi.ntnu.no>
Date: Mon, 6 Mar 2006 11:11:00 +0100
Message-ID: <MPG.1e76291efc220e11989787_at_news.ntnu.no>


In article <2o2Of.56113$dW3.4736_at_newssvr21.news.prodigy.com>, brian_at_selzer-software.com says...
> > I don't really understand. What kind of rules?
>
> Rules for constructing true statements by combining values from one or more
> domains. Wouldn't that be called a grammar? I'm kind of in uncharted
> territory here. I think in images, and while I can visualize concepts, I
> often have difficulty articulating them.

This sounds a bit like the "relations are sentences, domains are nouns" viewpoint.

> > Why is this redundant? It is not very clear to me. Can you show an
> > example of such redundancy? Is it redundant to have both Strings and
> > chars?
>
> If you have a domain for primary colors, the values are red, blue and
> yellow. If you have a domain for lists of primary colors, you might have an
> entry, (blue, blue, yellow). Assuming there's an operator for extracting
> values from a list, in a discussion about blue, which blue are you talking
> about? Is it the primary color blue, is it the blue that is the first
> element of a list, is it the blue that is the second element of the same
> list? Which is it?

They are the same colour; there is only one blue. In the list (1, 1, 2) is the first number different from the second number? No, they are the same.

> This problem becomes more immediate if you consider
> active objects. Assuming you have a domain for widgit objects, and a domain
> for lists of widgit objects, how can you discern which widgit you're talking
> about? Are they the same widgit, or are they separate instances with the
> same property values?

Now you are talking about variables, not values.

> > Then how can it have any practical significance? Can you give any
> > examples of anomalies or redundancy caused by "stuffing nonscalar values
> > into attributes"?
>
> Consider the following scenario: Bob Smith and Jane Smith have three
> children: Peter Smith, Paul Smith and Mary Smith. Now assuming that there
> is a relation with the following attributes: Parent, Children where Parent
> is the parent's name and Children is a comma-separated list containing the
> names of each child. So, some of the rows in this relation would look like
> this:
>
> "Bob Smith", "Peter Smith, Paul Smith, Mary Smith"
> "Jane Smith", "Peter Smith, Paul Smith, Mary Smith"
> "Gerald Smith", "Mary Smith, Amy Smith"
>
>
> Aren't the lists of children for Bob and Jane Smith redundant?

No, why? They could have other children with other people. And what has this to do with list attributes? You can create an equivalent design without them, which would have exactly the same issues.

> What happens when Mary Smith becomes Frank Jones' wife?

Nothing, I think---your relation doesn't mention marriage. Or do you fear she might change her name?

> As an aside, which Mary Smith are we talking about?

If there can be more than one (as it seems there can be, assuming a person can only have two parents), it is not a good idea to use the name to identify her. In that case, you have a list of names, not a list of children---and there is only one "Mary Smith" name. But this also has nothing to do with list attributes.

> Assuming that you can determine Mary Smith has Bob
> Smith as a parent, don't you have to also update the list associated with
> Jane Smith?

I'm not sure what you mean here, since (a) Mary Smith is already present as a child of both Bob and Jane. But (depending on you definition of "having children", of course) I'd say you wouldn't necessarily have to.

> This is why I cringe every time I come across a database that
> stuffs comma-separated lists into columns: I know that I'm going to
> encounter nightmare after nightmare after nightmare, and it may cost more to
> fix it than to scrap the whole thing and start over.

I agree that the design above is horrible. But

"Bob Smith", "Peter Smith"
"Bob Smith", "Paul Smith"
"Bob Smith", "Mary Smith"
"Jane Smith", "Peter Smith"
"Jane Smith", "Paul Smith"
"Jane Smith", "Mary Smith"
"Gerald Smith", "Mary Smith"
"Gerald Smith", "Amy Smith"

isn't much better, even though it doesn't use lists. Now, in a real database, you would use some unique identifier to represent persons--- which would probably be a foreign key to some other relvar. If you can't enforce this constraint when you put these identifiers into lists, you certainly shouldn't do it. If you can, or there is no such constraint, using lists may or may not be a good idea, depending on the expected use of the list. In most cases, I think using lists is a bad idea---but not because of some inherent redundancy.

-- 
Jon
Received on Mon Mar 06 2006 - 11:11:00 CET

Original text of this message