Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Paul <>
Date: 17 Feb 2003 03:30:12 -0800
Message-ID: <>

"Mikito Harakiri" <> wrote in message news:<nGb3a.11$>...
> "Paul" <> wrote in message
> > What's the interpretation given to a table with duplicate rows?
> Interpretation is not always the driving force of the theory. Widely
> successful theory can have poor interpretation. Quantum Mechanics Copenhagen
> Interpretation, for example, has some conflicts with "common sense".

But in science the "reality" comes first and then the model is designed to fit.
With maths you create your model first and the actual applications of it (like relational databases) come afterwards. So it's important to get the foundation and the motivation right. To me this seems one of the biggest problems with multisets.  

> > So is "multi-set" relational algebra not based on predicate logic?
> Correct.

So what is it based on? What is the interpretation?  

> > Also, there's no way for users to identify a row because the thing
> > that differentiates them is held internally and is thus invisible to
> > the user - and this breaks the logical/physical distinction.
> This is a subtle issue. For example, a number 12 is 10^1*1+2*10^0. Any
> numerical value can be thought as an aggregate of its base coefficients. We
> could even store it in the database in dismembered state

But isn't there a distinction between the abstract number itself and any given representation of it?  

> Adding count column is certainly possible, as Bob demonstrated it yet one
> more time. I'm just verifying if this practice is not inferior to explicitly
> having multiset concept in the model.

OK suppose I have an employee "relation" which is a multiset. I have two employees called John Smith, in the same dept on the same salary.
So my multi-relation contains two outwardly identical tuples:
("John Smith", 10, 20000)
("John Smith", 10, 20000)

Now one of the John Smiths has a sex-change and becomes Jane Smith.

How does the user update only one of the rows? Surely it's impossible because the two rows are only distinguished internally?
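
A minimal sketch of the problem, assuming SQLite (via Python's sqlite3 module) as a stand-in for any SQL product that allows duplicate rows. The table name "employee" and its column names are made up for illustration. The only thing the user can put in the WHERE clause is the visible column values, so the update hits both rows:

```python
import sqlite3

# In-memory database; the table has no key, so duplicate rows are allowed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("John Smith", 10, 20000), ("John Smith", 10, 20000)],
)

# Try to rename "one" John Smith using only the visible column values.
cur = conn.execute(
    "UPDATE employee SET name = 'Jane Smith' "
    "WHERE name = 'John Smith' AND dept = 10 AND salary = 20000"
)
rows_changed = cur.rowcount  # number of rows the UPDATE modified

remaining_johns = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE name = 'John Smith'"
).fetchone()[0]

print(rows_changed, remaining_johns)
```

Both rows match the predicate, so both get renamed. Getting at just one of them needs something outside the declared columns, such as an internal row identifier.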

Or are you saying that base relations must be sets, derived relations can be multi-sets? Aren't these two categories supposed to be interchangeable though?  

> I'm not claiming that multisets are superior to sets. There seems to be
> some cases where they do, and I want to see the whole picture.

Can't multisets always be defined in terms of sets? So you could for example define a multiset [a,a,b,c,d,d,d] as the set {(a,1),(a,2),(b,3),(c,4),(d,5),(d,6),(d,7)} (where (x,y) denotes an ordered pair which you could define as the set {{x},{x,y}} if you wanted)

or maybe {(a,2),(b,1),(c,1),(d,3)} would be better because it doesn't add extra information (sequential order). Which is really the same as adding the extra "count" column I guess.  
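
A small sketch of the two encodings in Python, using collections.Counter as a stand-in for a multiset. The variable names are mine, purely for illustration. It shows that both encodings recover the same multiset, but that the numbered one depends on an arbitrary ordering while the counted one does not:

```python
from collections import Counter

items = ["a", "a", "b", "c", "d", "d", "d"]

# Encoding 1: pair each occurrence with a sequence number.
numbered = {(x, i) for i, x in enumerate(items, start=1)}

# Encoding 2: pair each distinct element with its multiplicity
# (the "count column" approach).
counted = set(Counter(items).items())

# Either set of pairs is enough to rebuild the multiset.
from_numbered = Counter(x for x, _ in numbered)
from_counted = Counter(dict(counted))

# Listing the same multiset in a different order changes encoding 1
# but leaves encoding 2 untouched.
reordered = list(reversed(items))
numbered_reordered = {(x, i) for i, x in enumerate(reordered, start=1)}
counted_reordered = set(Counter(reordered).items())

print(sorted(counted))
print(numbered == numbered_reordered, counted == counted_reordered)
```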

So multisets really are just syntactic shorthand. Now I suppose ultimately everything is syntactic shorthand for set theory :) but I'm not convinced that at the logical level the advantages of multisets outweigh the disadvantages. Quite possibly it might be different for the physical level though.

In fact I'd say confusion over unique identifiers for (possibly derived) relations is one of the common causes of problems I see in practice.

> Given that multiset is essentially a mapping of a rows to nonnegative
> integers, wouldn't it be more advantageous to generalize it to mapping of a
> rows to reals?

Well this is straying off-topic I guess since databases are necessarily finite so a suitably large finite set of integers would suffice.
But what if you wanted a multiset of even greater cardinality than the reals? e.g. the set of all subsets of the reals?

Paul.

Received on Mon Feb 17 2003 - 12:30:12 CET
