Re: More on lists and sets

From: David Cressey <dcressey_at_verizon.net>
Date: Mon, 27 Mar 2006 13:09:10 GMT
Message-ID: <WtRVf.327$tZ.81_at_trndny03>


"Marshall Spight" <marshall.spight_at_gmail.com> wrote in message news:1143426260.988651.277030_at_e56g2000cwe.googlegroups.com...
> David Cressey wrote:
> >
> > I want to turn my attention to the discussion of lists and sets from the
> > logical and or physical design perspective. Marshall Spight recently
said
> > that nearly all programming languages offer support, in one way or
another,
> > for lists. Support for sets in much more spotty. Marshall figures that
a
> > good language will have to allow for both. Without disputing that
> > conjecture, I want to turn my attention to facilities that can convert
> > easily beck and forth between sets and lists.
>
> I think that's a great idea. Let's think about what it means for a
> minute.
>
> A set is an unordered collection of distinct elements. (As an aside,
> there are a few programming languages with support for plain
> sets; Icon has it and I believe Python does as well. These
> are simple sets, though, not relations.) The sweet spot IMHO
> comes with relations, whose awesome power we are all
> familiar with, if not equally appreciative of. :-) Relations are
> sets whose element type is a product type. I know of no
> general purpose programming language with any kind of
> support for relations. (Maybe Dataphor qualifies?) Although
> you can do some impressive things with languages with
> list comprehensions; see Clean for a good example.
>
> So, what is a list? A list is a mapping from the natural numbers
> to a set. A finite list (the usual case) is a mapping from a
> finite continuous subset of the natural numbers, that is,
> from 0 .. n, where n+1 is the number of elements in the list.
> Note that this allows duplicates.
>
> Interestingly, I never hear anyone talk about lists of product
> types. Why is that? (Guess: OOP is hogging the conversation
> as usual.) It seems to me that lists of product types would be
> almost as much more useful than lists of scalar types as
> sets of product types are more useful than sets of scalar
> types. So let's think of lists as being lists of product types.
>
> How then do we convert from a list to a set? There are two
> useful techniques I can think of, one information-preserving
> and one lossy.
>
> The information preserving way simply considers a list to
> be a set with an additional attribute that is the position of
> the rest of the attributes in the list. Converting back and
> forth simply means thinking about (or typing) the data in
> one way or another.
>
> The lossy way is to consider the position as something extra
> compared to the set representation, and create it when converting
> set -> list, and discard it when converting list -> set.
>
> In the set -> list case, we need to supply a total order to the
> set if we want to unambiguously convert it to a list.
>
> But not every interesting order is a total order. Often, partial
> orders are also useful. Now, I have also observed that no
> programming language I know of makes any kind of distinction
> between a total order and a partial order, which is a shame.
> (Roughly, a total order is one in which there are no ties-- every
> distinct element is either strictly less than or strictly greater than
> every other element. In a partial order, two element may compare
> the same, even if they are not the same element.) SQL doesn't
> make this distinction. In fact, many order-by clauses in SQL end
> up specifying a partial order, which SQL more or less treats as
> a total order. :-(
>
> So I would propose that we need to consider sorting with partial
> orders and sorting with totals orders separately. In the total order
> case, we end up with a list whose element type is the same as
> the element type of the set. In the partial order case, we end up
> with a list-of-sets, where the set has cardinality-1 in the no-ties
> case and cardinality > 1 in the case of ties.
>
> It is interesting that we see list-of-sets emerging directly from the
> nature of ordered data.
>
>
> Marshall
>
> PS. I realize this post is more brain-dump than anything coherent.

Thanks for the brain dump. I found it great. Especially the explanation of "partial order". I didn't know the meaning of that phrase. Makes me think of IBM card sorters. But that's a digression.

With regard to lossless conversions, I think it's important to distinguish cases where the order pertains to the problem domain, and cases where the order pertains to the solution domain. I want to talk a bit about B-trees that permit duplicates, but I'll put that off to another subthread.

One offshoot of your brain dump is this odd idea: The concept of "scalars" in Pascal is a mapping from some other type to (a finite subset of) the natural numbers, 0..n

Example: char maps into 0..255 or Boolean maps into 0..1.

I wonder what the connection, if any, between that concept and this thread is. Received on Mon Mar 27 2006 - 15:09:10 CEST

Original text of this message