Re: More on lists and sets

From: Marshall Spight <marshall.spight_at_gmail.com>
Date: 26 Mar 2006 18:24:21 -0800
Message-ID: <1143426260.988651.277030_at_e56g2000cwe.googlegroups.com>


David Cressey wrote:

>

> I want to turn my attention to the discussion of lists and sets from the
> logical and or physical design perspective. Marshall Spight recently said
> that nearly all programming languages offer support, in one way or another,
> for lists. Support for sets in much more spotty. Marshall figures that a
> good language will have to allow for both. Without disputing that
> conjecture, I want to turn my attention to facilities that can convert
> easily beck and forth between sets and lists.

I think that's a great idea. Let's think about what it means for a minute.

A set is an unordered collection of distinct elements. (As an aside, there are a few programming languages with support for plain sets; Icon has it and I believe Python does as well. These are simple sets, though, not relations.) The sweet spot IMHO comes with relations, whose awesome power we are all familiar with, if not equally appreciative of. :-) Relations are sets whose element type is a product type. I know of no general purpose programming language with any kind of support for relations. (Maybe Dataphor qualifies?) Although you can do some impressive things with languages with list comprehensions; see Clean for a good example.

So, what is a list? A list is a mapping from the natural numbers to a set. A finite list (the usual case) is a mapping from a finite continuous subset of the natural numbers, that is, from 0 .. n, where n+1 is the number of elements in the list. Note that this allows duplicates.

Interestingly, I never hear anyone talk about lists of product types. Why is that? (Guess: OOP is hogging the conversation as usual.) It seems to me that lists of product types would be almost as much more useful than lists of scalar types as sets of product types are more useful than sets of scalar types. So let's think of lists as being lists of product types.

How then do we convert from a list to a set? There are two useful techniques I can think of, one information-preserving and one lossy.

The information preserving way simply considers a list to be a set with an additional attribute that is the position of the rest of the attributes in the list. Converting back and forth simply means thinking about (or typing) the data in one way or another.

The lossy way is to consider the position as something extra compared to the set representation, and create it when converting set -> list, and discard it when converting list -> set.

In the set -> list case, we need to supply a total order to the set if we want to unambiguously convert it to a list.

But not every interesting order is a total order. Often, partial orders are also useful. Now, I have also observed that no programming language I know of makes any kind of distinction between a total order and a partial order, which is a shame. (Roughly, a total order is one in which there are no ties-- every distinct element is either strictly less than or strictly greater than every other element. In a partial order, two element may compare the same, even if they are not the same element.) SQL doesn't make this distinction. In fact, many order-by clauses in SQL end up specifying a partial order, which SQL more or less treats as a total order. :-(

So I would propose that we need to consider sorting with partial orders and sorting with totals orders separately. In the total order case, we end up with a list whose element type is the same as the element type of the set. In the partial order case, we end up with a list-of-sets, where the set has cardinality-1 in the no-ties case and cardinality > 1 in the case of ties.

It is interesting that we see list-of-sets emerging directly from the nature of ordered data.

Marshall

PS. I realize this post is more brain-dump than anything coherent. Received on Mon Mar 27 2006 - 04:24:21 CEST

Original text of this message