Re: More on lists and sets
Date: 26 Mar 2006 18:24:21 -0800
Message-ID: <1143426260.988651.277030_at_e56g2000cwe.googlegroups.com>
David Cressey wrote:
>
> I want to turn my attention to the discussion of lists and sets from the
> logical and or physical design perspective. Marshall Spight recently said
> that nearly all programming languages offer support, in one way or another,
> for lists. Support for sets in much more spotty. Marshall figures that a
> good language will have to allow for both. Without disputing that
> conjecture, I want to turn my attention to facilities that can convert
> easily beck and forth between sets and lists.
I think that's a great idea. Let's think about what it means for a
minute.
A set is an unordered collection of distinct elements. (As an aside,
there are a few programming languages with support for plain
sets; Icon has it and I believe Python does as well. These
are simple sets, though, not relations.) The sweet spot IMHO
comes with relations, whose awesome power we are all
familiar with, if not equally appreciative of. :-) Relations are
sets whose element type is a product type. I know of no
general purpose programming language with any kind of
support for relations. (Maybe Dataphor qualifies?) Although
you can do some impressive things with languages with
list comprehensions; see Clean for a good example.
So, what is a list? A list is a mapping from the natural numbers
to a set. A finite list (the usual case) is a mapping from a
finite continuous subset of the natural numbers, that is,
from 0 .. n, where n+1 is the number of elements in the list.
Note that this allows duplicates.
How then do we convert from a list to a set? There are two
useful techniques I can think of, one information-preserving
and one lossy.
The information preserving way simply considers a list to
be a set with an additional attribute that is the position of
the rest of the attributes in the list. Converting back and
forth simply means thinking about (or typing) the data in
one way or another.
The lossy way is to consider the position as something extra
compared to the set representation, and create it when converting
set -> list, and discard it when converting list -> set.
In the set -> list case, we need to supply a total order to the
set if we want to unambiguously convert it to a list.
But not every interesting order is a total order. Often, partial
orders are also useful. Now, I have also observed that no
programming language I know of makes any kind of distinction
between a total order and a partial order, which is a shame.
(Roughly, a total order is one in which there are no ties-- every
distinct element is either strictly less than or strictly greater than
every other element. In a partial order, two element may compare
the same, even if they are not the same element.) SQL doesn't
make this distinction. In fact, many order-by clauses in SQL end
up specifying a partial order, which SQL more or less treats as
a total order. :-(
So I would propose that we need to consider sorting with partial
orders and sorting with totals orders separately. In the total order
case, we end up with a list whose element type is the same as
the element type of the set. In the partial order case, we end up
with a list-of-sets, where the set has cardinality-1 in the no-ties
case and cardinality > 1 in the case of ties.
It is interesting that we see list-of-sets emerging directly from the
nature of ordered data.
Marshall
PS. I realize this post is more brain-dump than anything coherent. Received on Mon Mar 27 2006 - 04:24:21 CEST