Re: Argument for 1NF by counter-example

From: Marshall Spight <mspight_at_dnai.com>
Date: Sat, 30 Oct 2004 06:22:02 GMT
Message-ID: <eqGgd.23408$HA.19911_at_attbi_s01>


"Tony Douglas" <tonyisyourpal_at_netscape.net> wrote in message news:bcb8c360.0410281320.496ab93a_at_posting.google.com...
> Well, the algebra is pretty much independent of Tutorial D; with
> Haskell (say) extended to know about relations and the relation
> operators, you'd be pretty much in the same place re. the algebra.

Agreed. It's funny, though; these days my favorite thing about TTM is chapter 4 and their "deconstruction" (if you will) of relational algebra, and not a whole lot else.

> > > I like Haskell's algebraic data types, parametric
> > > polymorphism, and list comprehensions, but I'm not sure if one really
> > > needs algebraic data types if one has relations, nor am I sure about
> > > parametric polymorphism when one has relations, which are
> > > polymorphic by nature. List comprehensions are awesomely
> > > cool but provide nothing that a good relational language gives
> > > you. (Although the haskell people clearly understand reduce
> > > better than the relational people, the relational people have
> > > a better handle on map and filter, which are just simple cases
> > > of select/join.) Do we need union types? Dunno. I'm pretty
> > > sure we need enumerated types.
> > >
>
> Hum. I think here I go back to the "types are orthogonal to relations"
> argument. I'm a little confused by your "relations are polymorphic by
> nature" comment. Could you expand on that ?

If we are talking about *typed* relations (and we pretty much always are, but untyped relations are possible) then relations may be relations across any number of any type. In other words, relation is a type generator, that is parameterized by zero or more attribute types. Which is to say, relation is parametrically polymorphic, with the attribute types being the parameters.

Kinda like templates in C++ or generics in Java.

With such a powerful facility at the core, it's not clear to me whether any *further* parametric polymorphism is necessary. After all, pretty much 100% of the example uses of parametric polymorphism are container classes. Do we need parametric functions or modules if we have relations? Dunno.

> In one sense, relations are one thing the language expresses; they
> have an additional interesting property that they survive between
> program invocations / expression evaluations. Having relations
> shouldn't preclude doing anything else. Whether you *need* the other
> stuff, or whether you pretend to have it but use syntactic sugar for
> relation constructs, is an interesting, alternative question.
>
> I also think that algebraic data types finesse an answer to the
> question of null and 3VL. Which is a very good thing.

I'm strongly against 3VL, but I have a slightly different take. In my mind, "missing" information is different than any of the other things that 3VL is usually used for, such as "N/A" or what have you. I think of missing information as a cardinality issue; the data is merely absent, as opposed to using a special marker, such as one might use for N/A or NAN.

There was a very interesting paper out of MS Research a few years ago, "Unifying Tables, Objects, and Documents."

http://citeseer.ist.psu.edu/577443.html

I'm not thrilled with everything in the paper (they have XML on the brain, like almost everyone else these days) but there was one thing that really stood out for me, which was their cardinality annotations, which can be verified statically. That is, you could say of a int variable that it had either 0 or 1 int values in it, which would roughly correspond to SQL's NULLABLE. Or you could indicate it had exactly 1, or zero or more, or 1 or more. Pretty cool.

> > Prolog seems like a good answer to the recursive queries problem.
> > You get a nice *general* solution for transitive closure, without
> > a special-purpose tclose or connect-by or whatever. You can
> > also do reflexive closure, and some other cool stuff.
>
> Well, Prolog is a start, but it would need several issues sorting
> before progressing, the main two being the ordering of functor
> arguments and the ordering of facts/clauses affecting the evaluation
> of a predicate. Types would be nice too ;)

[Did you mean "sorted out" instead of "sorting"?]

Yes, being able to assert facts more than once, and having the order be significant is just wrong. Still, the idea of having rules in the language is a good one.

> > Do we need subtyping polymorphism? Not clear; maybe
> > the generalize/specialize techniques of relational are enough.
> >
> > Do we need a better set of relational operators? You bet!
>
> Hm ! Could you expand on this ? I'm pretty sure what you mean isn't
> what I'm reading :)

This sort of gets back to the part about relational algebra at the start of this post. I would like to see an operator like <OR> from TTM's A algebra. In fact, I have a vague idea for a framework for describing all possible relational operators, analogous to the complete set of 16 boolean operations. (Although there are more than 16 for relations.)

I tend to shy away from <NOT>.

> > > Do we need some kind of nested relations? I'm pretty sure we do.
> > >
>
> Well, I get into discussions on this. Whether we *need* them or not,
> the definition of a relation permits them, so we *should* have them.
> Otherwise we can't claim to be properly implementing relations.

Either explanation is fine by me.

> > > We also need a module system. What about some kind of dynamic
> > > dispatch? The OO single-dispatch solution to this has an excellent
> > > power-weight ratio, and also some kind of object mechanism
> > > is a great solution to a number of different modularity concerns.
> > > I'm still not clear how objects (specificially, non-constant fields)
> > > interact with relations, though.
> > >
>
> I'm going to chuck something up in the air and see what happens.
>
> I think the whole idea of putting an object in a database is absurd.
>
> Firstly, the object oriented languages don't seem to agree on what
> *precisely* an object is; the views of say, Java, C++, Simula and
> Smalltalk seem to differ, albeit sometimes subtly, on this.

I dunno; I think this whole line of argumentation is a snow job from the dbdebunk camp. In some ways it is a *good* thing that the OO languages each have their particular semantics; it's a sign that the field is vital and active, and trying out new ideas. In contrast, relational languages are stultifyingly stuck under the mighty thumb of SQL.

And in fact, the basic idea of what an object is is fairly clear; it's a data structure with both a value and an identity, and an associated dynamic dispatch mechanism. The details vary, but AFAIK all "OO" languages have these features.

Furthermore, the specific semantics of each OO language are quite well nailed down; there is little or no fuzziness in the Java spec, or the C++ report, etc.

> If they
> can't agree on what an object is/means, how can we agree on what it
> means to store an object in a database?

Why do we have to agree? Why can't we try out lots of different ideas and see which ones work best?

> Also, the objects have, for
> me, a very particular application - modeling dynamic interacting
> processes. What does it mean to store one of these things in a
> database? If we're using them for, effectively, abstract data typing,
> then there are better, simpler, more conceptually robust answers than
> object orientation for that.

As far as what makes a good abstract data type mechanism, I'd say objects are it. They've pushed out every other way of doing things, and with good reason.

As to the question of what it means to store an object in a database, the only real question is what do you do about variables inside objects, and pointers. And I don't think it's out of the question to say that they just get flattened, and go away, subsumed by the "variable-ness" of the database itself, in which case the answer is very simple indeed.

Marshall Received on Sat Oct 30 2004 - 08:22:02 CEST

Original text of this message