Re: A question for Mr. Celko

From: Paul <paul_at_test.com>
Date: Sat, 24 Jul 2004 23:47:24 +0100
Message-ID: <4102e6fc$0$63391$ed2e19e4_at_ptn-nntp-reader04.plus.net>


Marshall Spight wrote:

>> Do all values in the RVA column have to have the same columns? i.e.
>>  is there just one RVA type or a different type for each relation 
>> that has a different column setup?

>
> Why are you trying to make it so complicated? If you spent as much
> effort trying to make it easy as you spend trying to make it hard,
> you'd be done now.

I'm just trying to make things easy :)
I'm trying to be open-minded though.

> All attribute values are of the same type, just the same as any other
> attribute.

What I mean is do you just have one relation-type "relation", or do you have a whole family of types "relation(char, int, char, date)" etc.

So could you have a relation with tuples like:

(1, 'abc', {(1, 2, 3)})
(2, 'xyz', {('t', '2004-07-24')})

I think the jargon is whether "relation" is a type or a type-generator.

>> In first-order logic your variables can only be values, not 
>> predicates. If you have RVAs that the RDBMS "understands" then you
>>  are going to have variables that are relation-valued ie. 
>> predicates.
>> 
>> Now this isn't a problem if you are just storing the relations as 
>> an "encoding" that is invisible to the RDBMS, but once you let the
>>  RDBMS see the structure of your RVAs directly it jumps from being
>> a value to being a predicate.

>
> Again, you're just making it hard for no reason I can discern.

Logic is extremely subtle and counter-intuitive so I think it's important that proposed extensions to the relational model are scrutinised very thoroughly.

Check out this paradox for example:
http://en.wikipedia.org/wiki/Banach-Tarski_Paradox Basically using the foundations of logic it can be proved that you can chop a ball into a *finite* number of pieces, then reassemble them into *two* balls of the same size as the original!!!

>> OK then, how does a view created in this way differ logically from
>> a standard view created by merely JOINing the two tables?

>
> What makes you think I think there should be a logical difference?
> Because in fact I *don't* think there should be a logical difference.

So what do you gain by introducing RVAs then? Why not just keep things simple and stay with standard views?

> Let me expand your persons table to include
 > (3, 'Joe')
 > and I'll leave your fruit table the same. Joe is fruitless.
>
> Now I want to know what fruit everyone had. The right answer is

 > (1, 'Fred', {('a', 'apple'), ('b', 'banana')})
 > (2, 'Bill', {('c', 'cherry')})
 > (3, 'Joe', {} )

>
> Can't do that with vanilla join. You need to introduce the evil outer
> join, which means you need nulls now, unless you have RVAs. Death to
> nulls; long live empty sets!

You can still have an equivalent to the outer join if you don't allow NULLs: it just means you have to explicitly specify a default value for each column that possibly doesn't have a match.

Or maybe you could prescribe that every type has to have one value marked as the "default" value that would be used in place of a NULL in outer joins? This is an "orthogonal" issue I think. :)

>> I'm thinking for example of an invoice table that stores invoice 
>> lines in a RVA in each row Pick-stylee, but I think this might not
>>  be the kind of application you have in mind.

>
> No, I think that's pretty much what I'm talking about. It's not
> actually a very big deal; it's mostly just for convenience.
>
>> Because the problem in this case would be you are taking your 
>> single logical relation-value and splitting it up into multiple 
>> smaller relation-values, one for each invoice.
>> 
>> You're also losing the invoice-line relvar and kind of 
>> incorporating it within the invoice relvar.

>
> Not losing. Two views of the same data. The invoice-line relvar is
> still there.

So are you saying that for every existence of an RVA there must also be a relvar (with a name etc.) that corresponds to it? This is a *requirement* of having RVAs, or is it optional?

I'm not quite comfortable with the idea that views and "base tables" are indistiguishable I must admit, but in this case there's even more reason that they are different because one is a relvar and the other a relation-value.

Suppose I have two relations R and S.
Relation R has two columns, the first of integer type, and the second of "relation" type. How would I update a row of R to have relation S as its value in the relation-type column?

UPDATE R SET r2=S WHERE r1=1?

I remain to be convinced that RVAs that are accessible by the main RDBMS are a good thing, but I think the potential complications are too much and the potential benefits too small. Please try to convince me otherwise though!

Paul. Received on Sun Jul 25 2004 - 00:47:24 CEST

Original text of this message