Re: A Logical Model for Lists as Relations
Date: Wed, 10 May 2006 12:49:48 GMT
Message-ID: <Mjl8g.1287$S7.282_at_news-server.bigpond.net.au>
Brian Selzer wrote:
> "Marshall Spight" <marshall.spight_at_gmail.com> wrote in message
> news:1147218113.519014.27620_at_e56g2000cwe.googlegroups.com...
>> I am interested in the question of how best to handle lists in a >> relation-oriented world. I have considered various approaches, >> usually oriented around adding a list collection type. >> >> But a list can be described as a relation. Most simply, an infinite >> list is a relation from the natural numbers to the target set, >> and a finite list is a relation from some finite contiguous subset >> [0..n] of the naturals to the target set. Generalizing, we could >> describe an n-ary list as a relation with an index attribute and >> zero or more other attributes.
[..]
> I agree that using a separate attribute to represent the order makes sense,
> but from a practical standpoint, mapping list elements to the natural
> numbers often makes operations like insert, update and delete cumbersome and
> resource intensive. All that is really important is to be able to determine
> (1) whether or not two tuples are adjacent within a relation, and (2) which
> of the two adjacent tuples comes first.
>
> Since the value assigned to the order attribute is arbitrary,
> notwithstanding the above conditions, wouldn't it make more sense to use a
> varchar to represent the order? For example, given the list: (15, 33, 2),
> you could represent it as the relation:
>
> "A", 15
> "B", 33
> "C", 2
>
> If the value 88 is to be inserted between 15 and 33, you get:
>
> "A", 15
> "AM", 88
> "B", 33
> "C", 2
>
> If the value 33 is then deleted from the list, you get:
>
> "A", 15
> "AM", 88
> "C", 2
>
> If the value 25 is then appended to the list, you get:
>
> "A", 15
> "AM", 88
> "C", 2
> "D", 25
>
> It should be obvious that with this mechanism, modifications would require a
> lot less work and consequently, fewer writes while preserving the ability to
> present the information in order.
Your proposed mechanism is quite practical in limited cases (as you note later in the post) however it is not a general solution. Take for instance a situation where I am inserting millions of list elements and I happen to perform it in list order. Using your approach I would quickly reach "Z" and then start repeatedly splitting the last ordering attribute so I expect the 27th insert would produce "ZZ" the 28th "ZZZ". I expect you could mitigate against this by the 27th element being keyed as "ZA" then the 28th as "ZB" etc but at the end you will end up with a very skewed outcome.
> On the other hand, if it is often necessary to find the 100th element, then
> mapping to the natural numbers might make more sense. The best solution
> would depend on the volatility of the data, the average cardinality of the
> lists, and the aggregate nature of the queries.
[..]
Cheers, Frank. Received on Wed May 10 2006 - 14:49:48 CEST