Re: A Logical Model for Lists as Relations

From: Alvin Ryder <alvin321_at_telstra.com>
Date: 11 May 2006 16:00:37 -0700
Message-ID: <1147388437.437200.143000_at_y43g2000cwc.googlegroups.com>


Marshall Spight wrote:
> I am interested in the question of how best to handle lists in a
> relation-oriented world. I have considered various approaches,
> usually oriented around adding a list collection type.
>

I agree, this is quite an interesting question.

You can of course simply have attributes you later sort on. You can also represent any data structure, including maps and lists in the RM so that isn't an issue either. I think your question is can you have "Lists as Relations"?

> But a list can be described as a relation. Most simply, an infinite
> list is a relation from the natural numbers to the target set,
> and a finite list is a relation from some finite contiguous subset
> [0..n] of the naturals to the target set. Generalizing, we could
> describe an n-ary list as a relation with an index attribute and
> zero or more other attributes.
>

Sets are non-ordered collections of distinct elements, lists are ordered and non-distinct. The two are opposite in both areas and this is no small difference.

But what you've described is more like what many would call a "Map", Hash Table or even LinkedHashMap where the LHS is an ordinal value (0..n) and the RHS is a relational key.

At any rate the RM allows for *one and only one* way to represent data, obviously in relations .. having a *set* of tuples. There is no way around this, sets not lists here.

The above also appears to be a "positional addressing" scheme, these are explicitly disallowed by the RM. Data independence requires avoidance of "ordering dependence", "index dependence" and "access path dependence".

So for many reasons I'm having trouble seeing a list or map at the fundamental relational level. This isn't to say they cannot be represented as data, of course they can.

> Generalizing still further, we could accomodate multidimensional
> lists as relations with multiple index attributes.
>
> There are various selection operations we might want to do
> on lists. The full power of something like SQL's SELECT
> on a list, where the index was one of the attributes, would
> be useful. This would necessarily produce a relation result.
>
> There might be cases where one wanted a list result.
> For example, the substring operation. If one had a string
> "abcdefg" (actually { (0,a), (1,b) ... etc }) and wanted
> everything where the index was in [1..3], one would
> want { (0,b), (1,c), (2,d) } rather than { (1,b), (2,c), (3,d) }.
> Perhaps this could be accomplished with a library routine.
> (Otherwise you get into the territory of trying to make list
> a subtype of relation, and worse, wanting contravariant
> return types, ugh.)
>
> There are also two-argument operations on generic lists.
> One, commonly called "zip", is the pairwise putting together
> of list elements. Thus, zip([1,2], [5,6]) is [(1,5), (2,6)].
> If we consider lists as relations, this is simply a join
> on the index attribute.
>
> Another binary operation on lists is taking the product.
> In essence, the dimensions of the second list are considered
> dimensions above those of the first list. product([1,2], [5,6])
> is [[(1,5), (2,5)], [(1,6), (2,6)]]. This is simply a rename of
> the second lists 0th index to be the 1st index, and a join.
>
> Both zip and product can be generalized to multidimesional
> lists in the obvious way.
>
> What about imperative operations? This is a bit more complicated.
> We have to consider whether we want to enforce the usual
> list semantics, where for example if we delete an element, the
> higher indicies are accordingly reduced. Thus, insert/delete
> also renumber. Again this might be best handled with a library
> routine. (Alternative: triggers?)
>
> The more I type, the more this seems like a simple matter
> of some library routines for applying the usual list semantics.
>

Agreed - I think this is really the only option.

You can have an attribute type "List" with any opeartors it needs but I don't think we can get too carried way. Should the RM be required to break or decompose this list into elements? Shouldn't it only be known as one value? Will these elements be primary keys into a relation? But isn't that just positional addressing revisited?

> Anyway, comments, suggestions, criticisms welcome. Although
> note I'm not much interested in the question of whether ordered
> data is useful or not.
>
>
> Marshall

Cheers. Received on Fri May 12 2006 - 01:00:37 CEST

Original text of this message