Re: A Logical Model for Lists as Relations

From: Brian Selzer <brian_at_selzer-software.com>
Date: Wed, 10 May 2006 04:38:14 GMT
Message-ID: <W6e8g.70352$F_3.10413_at_newssvr29.news.prodigy.net>


"Marshall Spight" <marshall.spight_at_gmail.com> wrote in message news:1147218113.519014.27620_at_e56g2000cwe.googlegroups.com...
>I am interested in the question of how best to handle lists in a
> relation-oriented world. I have considered various approaches,
> usually oriented around adding a list collection type.
>
> But a list can be described as a relation. Most simply, an infinite
> list is a relation from the natural numbers to the target set,
> and a finite list is a relation from some finite contiguous subset
> [0..n] of the naturals to the target set. Generalizing, we could
> describe an n-ary list as a relation with an index attribute and
> zero or more other attributes.

I thought that whole numbers begin with 0 and natural numbers begin with 1.

I agree that using a separate attribute to represent the order makes sense, but from a practical standpoint, mapping list elements to the natural numbers often makes operations like insert, update and delete cumbersome and resource intensive. All that is really important is to be able to determine (1) whether or not two tuples are adjacent within a relation, and (2) which of the two adjacent tuples comes first.

Since the value assigned to the order attribute is arbitrary, notwithstanding the above conditions, wouldn't it make more sense to use a varchar to represent the order? For example, given the list: (15, 33, 2), you could represent it as the relation:

"A", 15
"B", 33
"C", 2

If the value 88 is to be inserted between 15 and 33, you get:

"A", 15
"AM", 88
"B", 33
"C", 2

If the value 33 is then deleted from the list, you get:

"A", 15
"AM", 88
"C", 2

If the value 25 is then appended to the list, you get:

"A", 15
"AM", 88
"C", 2
"D", 25

It should be obvious that with this mechanism, modifications would require a lot less work and consequently, fewer writes while preserving the ability to present the information in order.

On the other hand, if it is often necessary to find the 100th element, then mapping to the natural numbers might make more sense. The best solution would depend on the volatility of the data, the average cardinality of the lists, and the aggregate nature of the queries.

>
> Generalizing still further, we could accomodate multidimensional
> lists as relations with multiple index attributes.
>
> There are various selection operations we might want to do
> on lists. The full power of something like SQL's SELECT
> on a list, where the index was one of the attributes, would
> be useful. This would necessarily produce a relation result.
>
> There might be cases where one wanted a list result.
> For example, the substring operation. If one had a string
> "abcdefg" (actually { (0,a), (1,b) ... etc }) and wanted
> everything where the index was in [1..3], one would
> want { (0,b), (1,c), (2,d) } rather than { (1,b), (2,c), (3,d) }.
> Perhaps this could be accomplished with a library routine.
> (Otherwise you get into the territory of trying to make list
> a subtype of relation, and worse, wanting contravariant
> return types, ugh.)
>
> There are also two-argument operations on generic lists.
> One, commonly called "zip", is the pairwise putting together
> of list elements. Thus, zip([1,2], [5,6]) is [(1,5), (2,6)].
> If we consider lists as relations, this is simply a join
> on the index attribute.
>
> Another binary operation on lists is taking the product.
> In essence, the dimensions of the second list are considered
> dimensions above those of the first list. product([1,2], [5,6])
> is [[(1,5), (2,5)], [(1,6), (2,6)]]. This is simply a rename of
> the second lists 0th index to be the 1st index, and a join.
>
> Both zip and product can be generalized to multidimesional
> lists in the obvious way.
>
> What about imperative operations? This is a bit more complicated.
> We have to consider whether we want to enforce the usual
> list semantics, where for example if we delete an element, the
> higher indicies are accordingly reduced. Thus, insert/delete
> also renumber. Again this might be best handled with a library
> routine. (Alternative: triggers?)
>
> The more I type, the more this seems like a simple matter
> of some library routines for applying the usual list semantics.
>
> Anyway, comments, suggestions, criticisms welcome. Although
> note I'm not much interested in the question of whether ordered
> data is useful or not.
>
>
> Marshall
>
Received on Wed May 10 2006 - 06:38:14 CEST

Original text of this message