Re: What predicates the following relation represents

From: Paul <paul_at_test.com>
Date: Thu, 06 May 2004 20:48:22 +0100
Message-ID: <cEwmc.563$wI4.53161_at_wards.force9.net>


robert wrote:
> <snip>

>>OK I think you understand what I'm saying but I think some others in
>>the thread maybe don't. I'm in agreement with Date that saying
>>something twice doesn't make it any truer. What I'm also saying is
>>that saying something twice doesn't make it wrong.

> <snip>
>
> of course it makes it wrong; unless, of course, you're treating each
> table like a VSAM/COBOL flat file. as soon as you join to this table
> (and if you're phobic about joins, find another line of work), you
> get wrong results. unless you write explicit code to deal with
> duplicates, etc.

I must be very bad at explaining myself; I think you're still not understanding what I mean. I'm thinking of things in terms of logical predicates, which is what underlies the relational model. I'm also supposing we have a truly relational DBMS i.e. duplicate rows cannot exist in any relation.

Now consider an employee relation with columns for id, name, dept, salary. The primary key is defined to be the id column. I might have these tuples in the relation:

(1, Ada,  10, 15000)
(2, Bill, 10, 12000)
(3, Carl, 20, 12000)

these correspond to the logical propositions:

"Employee 1 is called Ada,  is in dept 10 and has a salary of 15,000"
"Employee 2 is called Bill, is in dept 10 and has a salary of 12,000"
"Employee 3 is called Carl, is in dept 20 and has a salary of 12,000"

Now I can repeat the proposition:
"Employee 3 is called Carl, is in dept 20 and has a salary of 12,000" as many times as I want: it doesn't make it more true, nor does it make it less true, nor does it make it false.

So what I'm saying is that correspondingly, I should be able to INSERT the tuple (3, Carl, 20, 12,000) into my relation as many times as I want without error. BUT the points I'm making are:

  1. that after doing this, my relation would still only have 3 rows. i.e. this INSERT operation would leave the relation unchanged. (actually this is just standard relation theory, it's not really a point: I'm just stressing it so I'm not misunderstood again).
  2. the DBMS shouldn't return an error if the row already exists, it should just leave the relation alone and report back "success".

This is really just standard set theory: duplicates aren't wrong, they're just meaningless. I can define a set as {1, 2, 2, 2, 3}. It just happens to be identical to the set {1, 2, 3}. I can add the number 3 to this set, it just so happens that the result is the set {1, 2, 3} again.

If I tell a computer to remember the set {1, 2, 2, 2, 2, 3} I expect it to tell me {1, 2, 3} when I ask it to regurgitate the set.

Adding the number 3 to the set is really just doing {3} U {1, 2, 3} (where U is set union). I can add it as many times as I want: the answer is still {3} U {3} U {3} U .... U {3} U {1, 2, 3} = {1, 2, 3}

Paul. Received on Thu May 06 2004 - 21:48:22 CEST

Original text of this message