Re: repeating groups

From: dawn <dawnwolthuis_at_gmail.com>
Date: 19 Feb 2006 00:12:05 -0800
Message-ID: <1140336725.827201.24000_at_g14g2000cwa.googlegroups.com>


Marshall Spight wrote:
> Jonathan Leffler wrote:
> > mAsterdam wrote:
> > > Marshall Spight wrote:
> > >>>[...snippage...]
> > >> Sure. But there are several ways out of the repeating groups problem
> > >>
> > >> 1) decomposing relations, aka "classical" 1NF
> > >> 2) higher-than-1 cardinality attributes: lists or sets

You either have to toss (nested) bags in there or have me referring to lists and ordered lists as if they were conceptually different even if implemented identically. It works OK for me to handle nested sets, bags, and lists all as lists. If the system provides nested lists with an arbitrary number of attributes, my experience is that the user can distinguish whether the ordering is relevant to them or not.

> > >
> > > Aren't you jumping over the "order may have meaning" problem
> > > here by taking these two together?
> >
> > It depends whether you take the OR in 'lists or sets' to be inclusive or
> > exclusive, doesn't it? If it is inclusive, then lists handle the 'order
> > has meaning' case and sets handle the 'order is not significant case'.
> > But you've complicated the algebra - instead of just sets, you've now
> > got to deal with lists and sets (and the alternatives you list below).

>

> Yes, *exactly*!
>

> Complicating the algebra should give us pause, but I now believe
> it is something that we have to do. Lists are essential; sets are
> essential; I am not willing to admire a language that doesn't make
> both of them first class. I am not convinced there is any other
> structure that we need to give such importance to. And lists and
> sets need to interoperate. Hence we have to have the unified
> algebra.
>

> I've been thinking about this problem for a while. A half-assed
> job is remarkably easy. What is remarkably hard is to actually
> do a good job.

I suspect that the existing implementations which serve users very well would be considered in the former category. They were built before discussions of relational algebra. I'd be curious what aspects of the interface to the database would be considered awful (lots I'm sure) by those trying to implement a mathematical theory. I'd be interested in learning of and discussing those features which developers think are great, but database theorists say are wrong.

> > >> 3) Fully nested relations.
> > >
> > > lists-of-lists? sets-of-sets? lists-of-sets? sets-of-lists?
> > >
> > >> Actually, 2) was something I hadn't really thought of before:
> > >> add *one* level of nesting. This idea is interesting but less
> > >> appealing than 3, because it is less regular. I tend to prefer
> > >> orthogonal designs; they have fewer arbitrary limitations.
> >
> > Was it Hoare who said "anything in computer science that isn't recursive
> > isn't any good" or words to that effect? (I found two references at
> > Google to Jim Gray saying "Anything in computer science that's not
> > recursive is no good", but it is not an easy expression to search for.)

>

> An excellent expression! I had not heard that before.
>

> > 1-level of nesting would be arbitrary and would very quickly become
> > restrictive. Indeed, you'd have to be very careful not to lose the
> > closure property of the algebra.
>
> Indeed.

MUMPS has n levels of nesting where PICK has 2 built in: multivalues and subvalues (within the multivalue). This mathematically arbitrary limit relates quite well to language. Propositions often have lists and occasionally have lists within lists.

>

> But there is the existence proof for the usefulness of one level
> of additional structure, and that is the special handling given to
> the type list-of-character. Still, I favor the recursive type
> approach. (I actually prefer to say "inductive".)

We are looking to improve the API for developers. Modeling with two levels of nesting (and possibly even one) gives benefits to those using the interface and is not overly restrictive. Think how complex it would be for those working with the data to work with 50 or 500 levels of nesting. Forcing them to restructure the data after 2 (rather than before 1 as with SQL-DBMS's today) seems like a reasonable constraint on the design, even if not mathematically elegant. I fully understand the impulse to go from 0 to n rather than a seemingly arbitrary 2, however. Cheers! --dawn Received on Sun Feb 19 2006 - 09:12:05 CET

Original text of this message