A Logical Model for Lists as Relations

From: Marshall Spight <marshall.spight_at_gmail.com>
Date: 9 May 2006 16:41:53 -0700
Message-ID: <1147218113.519014.27620_at_e56g2000cwe.googlegroups.com>



I am interested in the question of how best to handle lists in a relation-oriented world. I have considered various approaches, usually oriented around adding a list collection type.

But a list can be described as a relation. Most simply, an infinite list is a relation from the natural numbers to the target set, and a finite list is a relation from some finite contiguous subset [0..n] of the naturals to the target set. Generalizing, we could describe an n-ary list as a relation with an index attribute and zero or more other attributes.

Generalizing still further, we could accomodate multidimensional lists as relations with multiple index attributes.

There are various selection operations we might want to do on lists. The full power of something like SQL's SELECT on a list, where the index was one of the attributes, would be useful. This would necessarily produce a relation result.

There might be cases where one wanted a list result.

For example, the substring operation. If one had a string
"abcdefg" (actually { (0,a), (1,b) ... etc }) and wanted
everything where the index was in [1..3], one would
want { (0,b), (1,c), (2,d) } rather than { (1,b), (2,c), (3,d) }. Perhaps this could be accomplished with a library routine. (Otherwise you get into the territory of trying to make list a subtype of relation, and worse, wanting contravariant return types, ugh.)

There are also two-argument operations on generic lists. One, commonly called "zip", is the pairwise putting together of list elements. Thus, zip([1,2], [5,6]) is [(1,5), (2,6)]. If we consider lists as relations, this is simply a join on the index attribute.

Another binary operation on lists is taking the product. In essence, the dimensions of the second list are considered dimensions above those of the first list. product([1,2], [5,6]) is [[(1,5), (2,5)], [(1,6), (2,6)]]. This is simply a rename of the second lists 0th index to be the 1st index, and a join.

Both zip and product can be generalized to multidimesional lists in the obvious way.

What about imperative operations? This is a bit more complicated. We have to consider whether we want to enforce the usual list semantics, where for example if we delete an element, the higher indicies are accordingly reduced. Thus, insert/delete also renumber. Again this might be best handled with a library routine. (Alternative: triggers?)

The more I type, the more this seems like a simple matter of some library routines for applying the usual list semantics.

Anyway, comments, suggestions, criticisms welcome. Although note I'm not much interested in the question of whether ordered data is useful or not.

Marshall Received on Wed May 10 2006 - 01:41:53 CEST

Original text of this message