Re: bags vs. sets

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Tue, 15 May 2007 22:21:30 -0300
Message-ID: <464a5c56$0$4013$9a566e8b_at_news.aliant.net>


Vadim Tropashko wrote:

> On May 15, 3:38 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>

>>Marshall wrote:
>>
>>>We regularly deride SQL for using bags rather than sets.
>>>Now, I get many of the reasons we point and laugh at SQL,
>>>but I'm not so clear on this one.
>>
>>>We could have a language that covered sets, bags, and lists.
>>>We could have a language that used bags as the basic
>>>collection type and built lists and relations out of that.
>>>It's not hard: a relation is a bag with a uniqueness constraint.
>>>A list is a bag with a uniqueness constraint on a column
>>>that belongs to the natural numbers. (Details glossed over.)
>>
>>>OTOH, I'm less clear how to model a bag with relations.
>>
>>A bag depends on physical location for identifying elements. Because
>>elements in relations have no particular physical location, the very
>>idea seems a little absurd.

>
> Compared to sets bags occur in math very rarely. Yet there are some
> prominent cases such as equation roots. Consider:
>
> x*(x-1)*(x-1) = 0
>
> The root x=1 is counted with its multiplicity 2.

Is the bag 0, 1, 1 the same as 1, 0, 1 ?

>>>Bags also show up in things like aggregation. The
>>>sum of a column of a relation is the fold of + over
>>>the bag-projection over that column.
>>
>>At the internal physical level, physical location is perfectly
>>acceptable for identifying elements. It's just not such a good thing at
>>the external logical level.

>
> The question is if counting fits naturally into logical level. It may
> be argued that combinatorics, arithmetics, and many other areas escape
> logical level. Bags are not idempotent, perhaps they are better
> suitable for counting.

If one has sets of tuples, counting fits naturally. The above example would use tuples with a root and a multiplicity.

>>Internally, dbmses have to deal with streams and blocks and arrays and
>>all sorts of physical structures. I don't see what the problem is. As
>>long as the dbms manages the physical aspects without burdening the
>>user, what's the problem?

>
> My guess is that Marshall asked this question in the context of
> general-purpose relational programming language. The main question is
> if we want to expand relational algebra to embrace structures other
> than relations, or fit everything into relations. Arguably, fitting
> everything into relations sometimes gives quite an awkward solutions.
>
> Consider languages and grammars. They have 2 fundamental operators:
> join and union. While union is very similar in its properties to the
> relational union, the join is very different. And language
> intersection is no longer a join as in relational case. Sure we can
> model words (which are lists of symbols) as relations by introducing
> an auxiliary ordering number attribute, but this is IMO too awkward to
> result into anything practical. As an exersize, one may try expressing
> list join relationally. Besides the fact that one has to do very
> artificial manipulation with ordering numbers, the expression for list
> join doesn't look particularly exciting.
>
Received on Wed May 16 2007 - 03:21:30 CEST

Original text of this message