Re: MV Keys

From: Brian Selzer <brian_at_selzer-software.com>
Date: Fri, 03 Mar 2006 21:17:50 GMT
Message-ID: <2o2Of.56113$dW3.4736_at_newssvr21.news.prodigy.com>


"Jon Heggland" <heggland_at_idi.ntnu.no> wrote in message news:MPG.1e727fcc8185158a989783_at_news.ntnu.no...
> In article <NTXNf.62440$PL5.60906_at_newssvr11.news.prodigy.com>,
> brian_at_selzer-software.com says...
>> "Jon Heggland" <heggland_at_idi.ntnu.no> wrote in message
>> > What universe? A concrete example: I can use a varchar as a string
>> > (where, for the sake of the argument, I postulate that the individual
>> > characters "have no meaning"), or as an array of characters (where they
>> > do). The DBMS doesn't know (or care) what meaning I apply to either
>> > varchar---so what is the point of the distinction? Just to say that one
>> > design (the one using varchar as an array) is probably bad, and the
>> > other is not?
>>
>> I take issue with stuffing nonscalar values into attributes. The universe
>> of discourse is (at least for relational databases) the set of all possible
>> values for all relevant domains and a set of rules that describe how those
>> values can be combined.
>
> I don't really understand. What kind of rules?

Rules for constructing true statements by combining values from one or more domains. Wouldn't that be called a grammar? I'm kind of in uncharted territory here. I think in images, and while I can visualize concepts, I often have difficulty articulating them.

>
>> It is clearly redundant to have the same value in a
>> scalar domain and a domain of lists, because undoubtedly, there's an
>> operator available that can extract that value from the list so that it can
>> be discussed.
>
> Why is this redundant? It is not very clear to me. Can you show an
> example of such redundancy? Is it redundant to have both Strings and
> chars?

If you have a domain for primary colors, the values are red, blue and yellow. If you have a domain for lists of primary colors, you might have an entry (blue, blue, yellow). Assuming there's an operator for extracting values from a list, in a discussion about blue, which blue are you talking about? Is it the primary color blue, the blue that is the first element of the list, or the blue that is the second element of the same list? This problem becomes more immediate if you consider active objects. Assuming you have a domain for widget objects and a domain for lists of widget objects, how can you discern which widget you're talking about? Are they the same widget, or are they separate instances with the same property values?

Values have identity with respect to the universe; propositions have identity with respect to the database.
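
The "which blue?" and "which widget?" questions can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Widget class, its fields and the variable names are hypothetical, and Python's == and "is" merely stand in for "same property values" and "same instance".

```python
from dataclasses import dataclass


@dataclass
class Widget:
    # Hypothetical "active object"; dataclasses compare by field values.
    name: str
    size: int


# A list of primary colors: equality alone cannot say which "blue" is meant.
color_list = ("blue", "blue", "yellow")
blue_positions = [i for i, c in enumerate(color_list) if c == "blue"]

# One widget on its own, and a list holding that widget plus a look-alike.
standalone = Widget("gear", 3)
widget_list = [standalone, Widget("gear", 3)]

# Same property values? Both list elements match.
same_values = [w == standalone for w in widget_list]
# Same instance? Only the first element is the standalone widget itself.
same_object = [w is standalone for w in widget_list]
```

Both list elements compare equal to the standalone widget, yet only one of them is that widget, so a statement about "the gear widget" is ambiguous unless something outside the values settles the question.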

>
> Is this only an issue with (variable-length) collection types? What
> about a phone number with an area code part? Or a location with latitude
> and longitude?
>
>> I think that redundancy in the universe of discourse is worse
>> than redundancy in the database, because it undermines the logical
>> foundation of the database.
>
> How so?
>
>> How can you know if you're talking about the
>> same thing if values in the universe of discourse don't have identity
>> with respect to that universe?
>
> Why wouldn't they have?
>
>> A database is a logical thing, so it is not relevant what the DBMS knows or
>> cares.
>
> Then how can it have any practical significance? Can you give any
> examples of anomalies or redundancy caused by "stuffing nonscalar values
> into attributes"?
>

Consider the following scenario: Bob Smith and Jane Smith have three children: Peter Smith, Paul Smith and Mary Smith. Now assume that there is a relation with two attributes, Parent and Children, where Parent is the parent's name and Children is a comma-separated list containing the names of that parent's children. Some of the rows in this relation would look like this:

"Bob Smith", "Peter Smith, Paul Smith, Mary Smith"
"Jane Smith", "Peter Smith, Paul Smith, Mary Smith"
"Gerald Smith", "Mary Smith, Amy Smith"


Aren't the lists of children for Bob and Jane Smith redundant? What happens when Mary Smith becomes Frank Jones' wife? As an aside, which Mary Smith are we talking about? Assuming that you can determine Mary Smith has Bob Smith as a parent, don't you have to also update the list associated with Jane Smith? This is why I cringe every time I come across a database that stuffs comma-separated lists into columns: I know that I'm going to encounter nightmare after nightmare after nightmare, and it may cost more to fix it than to scrap the whole thing and start over.
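
A minimal Python sketch of that update problem, using the rows above. The helper names and the numeric person ids are assumptions made purely for illustration (the ids are there only to tell the two Mary Smiths apart); this is not a recommended schema, just one way to see the difference.

```python
# Packed design: one row per parent, children stuffed into a comma-separated string.
packed = {
    "Bob Smith": "Peter Smith, Paul Smith, Mary Smith",
    "Jane Smith": "Peter Smith, Paul Smith, Mary Smith",
    "Gerald Smith": "Mary Smith, Amy Smith",
}


def rename_child_packed(rows, parent, old, new):
    """Rename one child inside a single parent's list (hypothetical helper)."""
    names = [n.strip() for n in rows[parent].split(",")]
    rows[parent] = ", ".join(new if n == old else n for n in names)


# Mary marries Frank Jones; we update the row we happened to find her in.
rename_child_packed(packed, "Bob Smith", "Mary Smith", "Mary Jones")
# Jane's list was not touched, so it still carries the old name.
jane_is_stale = "Mary Smith" in packed["Jane Smith"]

# Decomposed design: each person appears once, each parent-child fact once.
people = {
    1: "Bob Smith", 2: "Jane Smith", 3: "Peter Smith", 4: "Paul Smith",
    5: "Mary Smith", 6: "Gerald Smith", 7: "Mary Smith", 8: "Amy Smith",
}
parent_child = {(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (6, 7), (6, 8)}

# A single update, and it is unambiguous which Mary Smith is meant.
people[5] = "Mary Jones"


def children_of(parent_id):
    """Return the sorted names of a parent's children (hypothetical helper)."""
    return sorted(people[c] for p, c in parent_child if p == parent_id)
```

In the packed version, Bob's and Jane's rows disagree after the update, and nothing in the data says whether Gerald's Mary Smith should have been renamed too. In the decomposed version, children_of(1) and children_of(2) both reflect the new name after one change, and Gerald's daughter is left alone.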

>> > My point is that you can't say that a type (e.g. varchar) is scalar or
>> > not a priori; you have to say "the way varchar is used by this operation
>> > in this particular database means it's not a scalar here". Hence,
>> > scalar-ness is a property of some use of some variable of a type, not of
>> > the type as such. I think we actually agree; you do say "scalar /with
>> > respect to the universe of discourse/" (my emphasis).
>> >
>>
>> Whether something is scalar is a logical concept, not a physical one. Also,
>> it's not the operation in a particular database, but rather the definition
>> of a domain in the logical universe that determines whether or not something
>> is scalar.
>>
>> I think we do agree. Something is scalar with respect to the universe, or
>> less precisely, with respect to the context in which it is used.
>
> I don't think your "universe" concept is very precise, but ymmv.
>
>> > Or the other way around. Are substrings of strings components or
>> > transformations? What about subranges, or individual values, of arrays
>> > or lists? What is the difference?
>>
>> If a string is atomic within the universe, then a substring must be a
>> transformation. If it's not, then both the string and the substring are
>> compound entities subject to the rules of combination that are defined in
>> the universe. I'm using the term entity deliberately here, because such
>> instances have identity with respect to the database. They're propositions,
>> not values.
>
> So basically, it is whatever we define it to be? Anyway, whenever the
> term "entity" enters a discussion, I leave it to its misery. :)
> --
> Jon
Received on Fri Mar 03 2006 - 22:17:50 CET
