Re: Extending my question. Was: The relational model and relational

From: Steve Kass <skass_at_drew.edu>
Date: Tue, 18 Feb 2003 20:00:15 -0500
Message-ID: <b2ukru$g79$1_at_slb2.atl.mindspring.net>


Bernard,

  Then perhaps we agree more than we disagree. I went back and read http://www.dbdebunk.com/cjddtdt.htm, which was referred to earlier in this thread, and two comments on it might be useful, if not specifically in answer to this post of yours:

  Date characterizes the bag-advocate as saying "But I don't need to distinguish among the duplicates--all I want to do is be able to count them." I think that's an unfair characterization, and I would say "But I don't need to distinguish among the duplicates--all I want is to know how many there are." I would go further and ask Date, to whom it is important to be able to count (requiring distinguishability) three cans of cat food, whether it is equally important for him to distinguish and count five pounds of flour, or better yet, $1000 in a bank account. Surely (well, perhaps not so surely) Date does not demand that his bank represent his one thousand dollars as one thousand distinguishable entities in their records so that he might verify their number by counting them.

I think that for some purposes only one fact is needed to record the purchase of three cans of cat food, and many stores reflect this conceptualization by representing this fact with a single line on a receipt. Similarly, and probably more agreeably even to Date, it is a single fact that someone has $1000 in a particular bank account. Surely it is not 1000 facts.

The point here is that there are things that can be counted and things that cannot be counted but which have integral measure nonetheless. This is not a strict distinction, and neither is it the same distinction as the one between things that need to be counted and things that do not need to be counted. There are many useful examples, such as a bag of 15 apples in a refrigerator, of which 10 are mine and 5 are yours (or vice versa, if you'll let me have a banana or two).

Does Date, or anyone, think that a withdrawal of ten dollars from a bank account requires the deletion of ten identical facts from a table?
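
To make the contrast concrete, here is a minimal Python sketch of the two representations: the balance as one fact carrying an integral measure, versus a bag of indistinguishable one-dollar rows. It is my own illustration, not anything from Date's article, and the function names `withdraw_from_measure` and `withdraw_from_bag` are hypothetical.

```python
from collections import Counter


def withdraw_from_measure(balance: int, amount: int) -> int:
    """One fact: the account holds `balance` dollars. A withdrawal updates it."""
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount


def withdraw_from_bag(dollars: Counter, amount: int) -> Counter:
    """A bag of identical 'dollar' rows. A withdrawal deletes `amount` of them."""
    if amount > dollars["dollar"]:
        raise ValueError("insufficient funds")
    remaining = Counter(dollars)  # copy, so the caller's bag is left untouched
    remaining["dollar"] -= amount
    return remaining


# Withdraw ten dollars from a $1000 account under each representation.
balance_after = withdraw_from_measure(1000, 10)
bag_after = withdraw_from_bag(Counter({"dollar": 1000}), 10)
```

Both versions end up recording the same quantity. The only difference is whether that quantity is stored as a value or as a multiplicity, and neither version ever needs to tell one dollar from another.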

  Later in the article, Date provides two bags, of parts and of suppliers, and asks the question (query) "list part numbers for parts that either are screws, or supplied by supplier S1, or both." Then he proposes twelve queries that might answer the question, and uses the fact that nine different results appear to support his argument against bags.

Unfortunately, when Date asks a stupid question, he gets many stupid answers. The problem is that in his sample bag-database, "part number" is not an entity. So even rephrasing the query as "list THE part numbers ...", which might mean "list each part number in the part numbers bag ...", doesn't fly, since there is no part numbers bag. The question he asks is not well-defined. For it to be a valid query, it must specify the source of the items to be listed. I would interpret it to mean "For each part in the bag of Parts, list the part's part number if the part is a screw or if the part is supplied by supplier S1, or both."
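
A rough Python sketch of that interpretation follows. The sample rows are assumptions: the three P1 screws and the duplicated S1-P1 row follow the description in this post, but the P2 rows and the function name `screws_or_supplied_by` are hypothetical, chosen only for illustration.

```python
# Hypothetical sample data. The three P1 screws and the duplicated S1-P1 row
# match what this post describes; the P2 rows are an assumption.
parts = [("P1", "Screw"), ("P1", "Screw"), ("P1", "Screw"), ("P2", "Screw")]
supplier_parts = [("S1", "P1"), ("S1", "P1"), ("S1", "P2")]


def screws_or_supplied_by(parts, supplier_parts, supplier):
    """For each part in the bag of parts, list its part number if the part
    is a screw or is supplied by `supplier` (or both)."""
    # "Is supplied by" is an existence test, so a set is enough here.
    supplied = {pno for sno, pno in supplier_parts if sno == supplier}
    return [pno for pno, pname in parts if pname == "Screw" or pno in supplied]


result = screws_or_supplied_by(parts, supplier_parts, "S1")
```

Because the source of the listed items is the bag of parts, the multiplicities in the answer come from that bag alone. Under this reading the duplicated supplier-part row has no effect on the result.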

This question has an unambiguous answer, despite the fact that Date has intentionally thrown a wrench into the works as well, by providing a table SP (suppliers-parts?) that contains the same fact twice. A suppliers-parts table should be subject to a constraint that the multiplicity of any supplier-part fact be 1, since there is no real-world meaning to the two identical rows he lists, in contrast to the clear real-world meaning of the three P1-screws in the parts table. His example is entirely unconvincing.
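
As a sketch of what such a constraint amounts to, the following Python checks a bag of rows for any fact whose multiplicity exceeds 1. The rows and the name `violations_of_multiplicity_one` are hypothetical. In an SQL system the same rule would normally be declared with a UNIQUE or PRIMARY KEY constraint rather than checked by hand.

```python
from collections import Counter


def violations_of_multiplicity_one(rows):
    """Return each row that occurs more than once, mapped to its count."""
    return {row: n for row, n in Counter(rows).items() if n > 1}


# Assumed sample rows: the same supplier-part fact is recorded twice.
supplier_parts = [("S1", "P1"), ("S1", "P1"), ("S1", "P2")]
dupes = violations_of_multiplicity_one(supplier_parts)

# The parts bag is deliberately NOT subject to this check: three P1 screws
# are meaningful there, so multiplicity greater than 1 is allowed.
```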

SK

Bernard Peek wrote:

> In message <b2uat1$f0t$1_at_slb9.atl.mindspring.net>, Steve Kass
> <skass_at_drew.edu> writes
>
>> Bernard,
>>
>> This isn't a matter of opinion. There is one determinant: "there are
>> two employees named John Smith". There are many consequential
>> truths, such as "there is at least one employee", "there is an employee
>> whose first name is not Nancy", "there are at least two employees
>> whose first and last names share a common letter of the alphabet.",
>> and so on.
>>
>> I don't deny that it can be important to distinguish between two
>> John Smiths.
>
>
> That's not my argument. My argument is that there may be no need to
> distinguish between two real-world objects, each of which is
> referenced by a single record in a database. The relational model (and
> databases based on it) requires that a distinguishing key be created
> even if there is none in the logical data structure.
>
> I don't dispute that there are real pragmatic reasons for accepting
> that deviation from the logical structure of the data. But as this is
> a theory newsgroup I wanted to point out that this is a (minor)
> failing in the relational model.
>
> [...]
>
>> I'm not redefining any words, but we have a fundamental
>> difference in understanding logical vs. physical models. You are
>> saying that the real-world scenario of books in a library must be
>> represented by a logical model that does not keep track of an
>> actual attribute of "book" (acquisition number), and then you
>> blame the model for not being able to distinguish two identical
>> books,
>
>
> No, that's not my objection. My objection is that the relational model
> declares that there must be a distinction, when there is no such
> requirement in the real world.
>
> It does make the maths easier, and it makes the implementation much
> easier.
>
>
>
Received on Wed Feb 19 2003 - 02:00:15 CET