Re: A Logical Model for Lists as Relations

From: Frank Hamersley <terabitemightbe_at_bigpond.com>
Date: Wed, 10 May 2006 12:49:48 GMT
Message-ID: <Mjl8g.1287$S7.282_at_news-server.bigpond.net.au>


Brian Selzer wrote:
> "Marshall Spight" <marshall.spight_at_gmail.com> wrote in message
> news:1147218113.519014.27620_at_e56g2000cwe.googlegroups.com...

>> I am interested in the question of how best to handle lists in a
>> relation-oriented world. I have considered various approaches,
>> usually oriented around adding a list collection type.
>>
>> But a list can be described as a relation. Most simply, an infinite
>> list is a relation from the natural numbers to the target set,
>> and a finite list is a relation from some finite contiguous subset
>> [0..n] of the naturals to the target set. Generalizing, we could
>> describe an n-ary list as a relation with an index attribute and
>> zero or more other attributes.

[..]

> I agree that using a separate attribute to represent the order makes sense,
> but from a practical standpoint, mapping list elements to the natural
> numbers often makes operations like insert, update and delete cumbersome and
> resource intensive. All that is really important is to be able to determine
> (1) whether or not two tuples are adjacent within a relation, and (2) which
> of the two adjacent tuples comes first.
>
> Since the value assigned to the order attribute is arbitrary,
> notwithstanding the above conditions, wouldn't it make more sense to use a
> varchar to represent the order? For example, given the list: (15, 33, 2),
> you could represent it as the relation:
>
> "A", 15
> "B", 33
> "C", 2
>
> If the value 88 is to be inserted between 15 and 33, you get:
>
> "A", 15
> "AM", 88
> "B", 33
> "C", 2
>
> If the value 33 is then deleted from the list, you get:
>
> "A", 15
> "AM", 88
> "C", 2
>
> If the value 25 is then appended to the list, you get:
>
> "A", 15
> "AM", 88
> "C", 2
> "D", 25
>
> It should be obvious that with this mechanism, modifications would require a
> lot less work and consequently, fewer writes while preserving the ability to
> present the information in order.

Your proposed mechanism is quite practical in limited cases (as you note later in the post) however it is not a general solution. Take for instance a situation where I am inserting millions of list elements and I happen to perform it in list order. Using your approach I would quickly reach "Z" and then start repeatedly splitting the last ordering attribute so I expect the 27th insert would produce "ZZ" the 28th "ZZZ".   I expect you could mitigate against this by the 27th element being keyed as "ZA" then the 28th as "ZB" etc but at the end you will end up with a very skewed outcome.

> On the other hand, if it is often necessary to find the 100th element, then
> mapping to the natural numbers might make more sense. The best solution
> would depend on the volatility of the data, the average cardinality of the
> lists, and the aggregate nature of the queries.

[..]

As Bob mentions in his post // to yours, a general solution would arise if the list tuple had a foreign key (or two for doubly linked lists) identifying the next (or previous) tuple. This design whilst requiring   more than 1 "IO" per insert remains linear in cost.

The ordinals if needed could then be produced on retrieval although the cost could become prohibitive. Others may be able to affirm that algorithms exist to produce these on the fly.

Cheers, Frank. Received on Wed May 10 2006 - 14:49:48 CEST

Original text of this message