Re: MV Keys

From: Jon Heggland <heggland_at_idi.ntnu.no>
Date: Thu, 9 Mar 2006 11:56:38 +0100
Message-ID: <MPG.1e7a2843ae34134098979b_at_news.ntnu.no>

In article <1141846796.927114.151230_at_i40g2000cwc.googlegroups.com>, marshall.spight_at_gmail.com says...
> What I would say is that "a subset of a relation is also a relation."

Yes, I know you have said that---and do of course agree. What I take issue with is that you *also* say that an *element* of a relation is a relation. There is a difference between a subset and an element.

> > So I ask: What do you call an element of a relation?
>
> What does it matter what we call it?

To begin with, I asked just because I was curious, since you redefined the traditional notion of "tuple". Now I ask because you contradict yourself.

> My claim is that it's not
> a useful nor necessary abstraction, since it's just a kind of relation.
> Let's just stop talking about it as if it were something distinct.

I could accept your defining "tuple" as a special case of relation, even though it clashes with established terminology. But to simultaneously say that an *element* of a relation is also a tuple, and that tuples and relations are not distinct, is tantamount to saying that there is no difference between { X } and X. An element of a set is distinct from the set!

> To me, it is like the distinction between list-of-char and string;
> entirely unnecessary; historical only.

False analogy. List-of-char and string is like set-of-tuple (classical definition) and relation; fair enough. A subset of of a product of sets is a tuple if it has cardinality one, by your definition. Thus, it has one element. What is that element? What kind of "thing" is it?

> > For example the
> > sole element of R, which is (1,b)? You say:
> >
> > > "Tuple," "element" and "relation" all seem valid to me.
> >
> > Yet (1,b) is obviously not a relation, and equally obviously not a tuple
> > by your definition! It is not the subset of anything, because it is not
> > a set. Do you still not see the problem?
>
> Yes, I still don't see the problem.

The problem is that you contradict yourself. You say (1,b) is a tuple, and you say that a tuple is a set. (1,b) is not a set.

> There is no way this issue can be resolved using arguments that
> use 'tuple' as a required abstraction, since that very thing is
> the issue in question.

Huh? I didn't presuppose what a tuple was in the above demonstration. I just showed that your definitions are contradictory. Maybe I should emphasise the point that if you call an element of a relation something *other* than "tuple" or "relation", the contradiction goes away. But isn't there an established term for the kind of element you get when you take the product of sets? Isn't that term indeed "tuple"?

> The way to resolve the issue is
> to either show that some important quality is lost if we don't
> make the distinction (and again, that important quality *cannot*
> be the distinction in question), or to show that everything we want
> to accomplish can be accomplished without it.

I'm just trying to show that your definition is self-contradictory. Before that is resolved, it doesn't even make sense to talk about qualities lost or gained.

> Proposal:
> A relation of multiple elements can be specified as follows
>
> {
> (attr1=val1_1, attr2=val1_2, ... attrn=val1_n),
> (attr1=val2_1, attr2=val2_2, ... attrn=val2_n),
> ...
> (attr1=valm_1, attr2=valm_2, ... attrn=valm_n)
> }
>
> As a syntactic shorthand, perhaps a one-element relation could
> be specified as follows:
>
> (attr1=val1, attr2=val2, ... attrn=valn)
>
> but this is unnecessary and if it introduces any ambiguities
> it clearly cannot be allowed.

I'd say this "shorthand" clearly *is* necessary---it is what you use in the multiple-element relation specification above! If you didn't have it, the relation specification would never get past the endless sequence of opening braces. Besides, if an element of a relation is a relation, (attr1=val1_1, attr2=val1_2, ... attrn=val1_n) is implicitly a relation, since it appears as an element of a relation.

How do you envision the grammar for the above expressions? You obviously need to distinguish between "relation" and "element of relation" (or "singleton relation")---and when you do, what on earth is gained (except confusion) by *not* using the established distinction of "relation" and "tuple"?

> > > But since you asked,
> > > I see nothing wrong with using a tuple-specific syntax for relations
> > > that happen to have only one element. :-)
> >
> > Even though you make a point of treating relations and tuples the same
> > for simplicity? Seems to me that you lose whatever gain you may have
> > had.
>
> I don't see how.

It now has special syntax and special rules. It's like saying (in Java) that the int 1 really is the List<Integer> [1], but that you can use the shorthand "1" to refer to it, and you can't add or remove anything to/from the list. But hey, now you don't need the int type anymore! And you can use an Iterator on 1! :)

> Anyway, the value of this idea is hard to evaluate
> in the absence of a working system, or a formal semantics, or
> whatever.

Yes, that is why I ask you about syntax, grammar and reasonably formal definitions.

> > > As I mentioned in another post, the RA doesn't have any operations
> > > on elements,
> >
> > I don't see what that has to do with it. Don't you agree that the
> > ability to specify relation literals and tuple literals is needed?
>
> I agree that the ability to specify relation literals is needed. I
> agree this includes cardinality-1 relation literals.

Do you also agree that the cardinality-1 relation literals must (or at least should) be specified in such a way that the compiler knows that the cardinality is 1?

> > > so this issue doesn't come up.
> >
> > It did come up for me trying to understand your system.
>
> Is there any reason to think that this isn't just the result
> of the fact that you are used to working with a system
> in which this abstraction is important?

Certainly! I can easily envision a system where a tuple is a singleton relation, but not one where it simultaneously is an element of a relation. I can also envision a system that doesn't use tuple types, though I think I would consider it cumbersome and unorthogonal. I believe tuple types as distinct from relation types exists by necessity/definition, since { (1,b) } is distinct from (1,b)---so I would wonder why I wasn't allowed to use them. Analogously, I consider the boolean type to exist in SQL (since the WHERE clause requires an expression of that type), and I wonder why I'm not allowed to use booleans elsewhere.

> > > The operations on
> > > tuples that are sometimes described are just special cases of
> > > the relational operations; they may be discarded without loss
> > > of generality. Operations on attributes are as they ever were;
> > > specific to the attribute type and not part of the RA.
> >
> > But how can anyone ever get to the attributes? It's sets all the way
> > down!
>
> What do you mean by "get to?"

Put it this way: If a tuple is a relation, then the element(s) of a tuple isn't attribute values (like I am used to), but a relation. Which in turn contains a relation, which again in turn contains a relation...

> I don't see that anything is
> necessary beyond the ability to invoke a function, and that
> can be considered a relational operation. Consider a function
> add(a:int, b:int) : (c: int). (This function has three attributes;
> two input and one output. After invocation, the c attribute has
> the value a+b.)
>
> Here's a relational operation:
>
> add natural_join { (a=1, b=2) }
>
> The result is
>
> { (a=1, b=2, c=3) }

Yes, I've seen this idea in TTM, and I think it is elegant (in theory, but cumbersome in practise). I just wonder how the code implementing the (infinite) relation "add" looks like. Do you envision this as implemented in a separate language and thus irrelevant to this discussion? Should the users / developers be able to make their own "operation relations" � la "add"?

> > > > Anyway, what is *good* about this definition? :)
> > >
> > > It's simpler. You have fewer kinds of types. You also have
> > > fewer things to implement.
> >
> > I think you are mixing interface/model and implementation here.
>
> Certainly not! There were two sentences after "it's simpler." The first
> was about interface and the second about implementation. There
> are benefits at both levels. But that doesn't mean I'm mixing them.

Fair enough. I disagree on both points, however. I think you need special syntax and behaviour for tuples; not considering them distinct from relations is therefore confusing, not simple. I also believe that not taking advantage of tuples being fixed-size (as opposed to relations with a variable number of tu... I mean elements:) makes the implementation more complicated than it needs to be.

> > But
> > consider this example: I have a procedure that takes/wants a tuple as an
> > argument. It only works on tuples, or singleton relations in your world.
> > With tuple types, this is fairly trivial. Without, it seems to me you
> > must do a lot of extra work for this to function reliably:
> >
> > * Either, you must in your procedure check that the relation indeed
> > contains exactly one element, and throw a run-time error if it doesn't.
> >
> > * Or, you must at compile time check that no invocation of the procedure
> > ever uses a non-singleton relation as an argument. That may be hard, but
> > more importantly: You then probably have to indicate to the compiler
> > that your procedure expects/demands a singleton relation---by using some
> > special syntax/annotation/metadata, I guess. And if you do, what is the
> > meaningful difference between a proper tuple type and a singleton-
> > relation-type-with-special-syntax?
>
> Or, you could have it such that prodecurally defined functions were
> always, necessarily expressed in single-element terms (as is
> customary,)
> and rely on the relation operators to handle the cardinality issues.

Isn't a relation a single element? Do you propose that a procedure shouldn't be allowed to take a general relation as an argument? Or that if it does so, it is by definition a relation operator? Will you allow user-defined relation operators?

> > > So, what are you going to do with this string now that you've got it?
> >
> > Compare it to another string? Put it into a variable of type String?
> > Examine its third character?
>
> All of these can be done with either relational operators on other
> relations, or relational operators on functions.

So you want me to do

RowVariable over { FirstName } = { "Jon" FirstName }

instead of

RowVariable.FirstName = "Jon"

and

RowVariable add { FirstName[2] PointlessName } over { PointlessName }

(or even

RowVariable rename { FirstName String } join StringToCharacterRelation join { 2 Index } over { Character }

where StringToCharacterRelation is a "operation relation" with attributes String, Index and Character)

instead of

RowVariable.FirstName[2]

? Not very simple, in my opinion.