Re: Relation-valued attributes (again)

From: <jingleheimerschmitt_at_hotmail.com>
Date: 11 Feb 2005 01:12:58 -0800
Message-ID: <1108113178.258473.264330_at_f14g2000cwb.googlegroups.com>


Philip,

> (I have been thinking you mean D group/ungroup for nest/unnest--am I
> mistaken?)
No, you are not mistaken; group and ungroup are the terms preferred by Date and Darwen. We have opted for the terms nest and unnest because we feel they more accurately capture the semantics of the operation, and they do not clash with the expected meaning of group (from SQL).

> I understand from your last msg that you are doubting that ungroup is
> primitive--more on this below.
Yes, I am trying to express the query using only the traditional primitives of the relational algebra, namely restrict, project, join, union, and difference. Date and Darwen have suggested that the actual primitive set is even smaller, but that is another matter about which I remain unconvinced. More to the point, I am not convinced that extend is non-primitive either.

> However I don't understand your choice of counterexample;
> I don't understand how it matters whether Op is a function or
> relation,
There are two cases where it matters: 1) when Op is non-functional, and 2) when Op is non-deterministic.
Date and Darwen dispense with the non-functional case by insisting that any operator that returns a value must not update the database. While I disagree with this position for pragmatic reasons, I am willing to let this one slide for the purpose of this discussion. However, the non-deterministic case is much more problematic. For example, how do I use the "join to a constant relation value" technique to write the query:

T add { Now() DateTime }
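To make the difficulty concrete, here is a rough Python sketch (not D4 or Tutorial D). It assumes a hypothetical now() that returns a fresh value on each query evaluation, modelled with a counter instead of a real clock so the behaviour is repeatable. The relation that the join would need is different every time the query runs, so it cannot be written down in advance as a constant.

```python
import itertools

# Hypothetical stand-in for a clock: each call yields a new "time".
_ticks = itertools.count(1)

def now():
    return next(_ticks)

T = [{"ID": 1}, {"ID": 2}]

def extend_with_now(rows):
    # Assumption: Now() is evaluated once per query evaluation,
    # and every tuple of the result carries that same value.
    stamp = now()
    return [{**row, "DateTime": stamp} for row in rows]

first = extend_with_now(T)
second = extend_with_now(T)

# Same query text, same T, different results: no single constant
# relation joined to T reproduces both evaluations.
print(first)
print(second)
```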

In addition, I'm not sure I am comfortable with using this technique to dispense with extend attributes defined in terms of references to attributes in the "current" tuple. For example:

T add { ID ID1 }

We could define a relation R of type { ID : Integer, ID1 : Integer } populated appropriately, and then use it to express the query as a join, but this seems different in kind to me, in that it involves constructing the "constant relation" from the current state of the database. While I cannot dismiss it entirely, something doesn't seem right about it to me.
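As a rough illustration of what bothers me, here is a Python sketch (not D4). The representation is an assumption made for the example: a tuple is a frozenset of (attribute, value) pairs and a relation is a frozenset of tuples. The helper names extend and natural_join are hypothetical. The join formulation gives the same answer as the extend, but only because R is computed from the current value of T.

```python
def tup(**attrs):
    # A tuple is modelled as a frozenset of (attribute, value) pairs.
    return frozenset(attrs.items())

def natural_join(r, s):
    # Pair up tuples that agree on all common attributes.
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return frozenset(out)

def extend(r, name, fn):
    # Add one attribute computed from each "current" tuple.
    return frozenset(
        frozenset({**dict(t), name: fn(dict(t))}.items()) for t in r
    )

T = frozenset({tup(ID=1), tup(ID=2), tup(ID=3)})

# T add { ID ID1 } written directly as an extend.
direct = extend(T, "ID1", lambda t: t["ID"])

# The same query as a join. Note that R is built from T itself,
# so it is not a constant in any ordinary sense.
R = frozenset(tup(ID=dict(t)["ID"], ID1=dict(t)["ID"]) for t in T)
via_join = natural_join(T, R)

print(direct == via_join)
```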

> or whether you have relation-valued functions (you can always just
> insert an rva directly).
> The point I was trying to make last msg was that
> 1. if you want T-join-Op then you can calculate it by join on a
> relation-Op or extend on a function-Op, and
> 2. if you want T-join-Op-unnested then why does it matter how you did
> the join?
> But as I say, I seem to be missing something.
Only if that operator is functional, deterministic, and makes no reference to attributes in the current context. (We called this notion context-literal to give it a name because it came up so often. I'm not sure what the official term is.) In this case, the Op(ID) invocation is not context-literal, so I cannot see how to express the query primitively (at least not without "materializing" a constant relation).

> Re whether it "may be", see Date's Intro to DBS section 7.0 Grouping
> and Ungrouping.
Yes, I just read that the other day. You are right.

> By the way, I've never seen a proposal for GROUP PER, but it would be
> a way of putting empty values into ("back into" with respect to an
> UNGROUP) an rva.
You should propose that to Date and Darwen; I think they would like the idea. It has a nice symmetry with the SUMMARIZE extensions.

> One of the points I was trying to make is that even if ungroup is
> primitive, indeed even if it's not present, we can still have and use
> nested relations.
Agreed, but there are certain queries that we cannot express without them. This is the part that bothers me, because then we don't have a complete algebra. Again, it's not that I want to get rid of them, quite the contrary, I think disallowing them is arbitrary and ill-advised. But I want to be sure that if we have them, we have a complete and simple set of primitives for dealing with them.

> But aggregation should be allowed for arbitrary types and appropriate
> functions, eg
> SUMMARIZE rint ADD PRODUCT(myint) AS summary
> SUMMARIZE rrel ADD UNION(myrel) AS summary
Nice. I love the idea. In fact, on reflection, I don't think RVA support would be complete without them.
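For what it is worth, the idea is easy to sketch in Python (not Tutorial D). The summarize helper below is hypothetical and handles only the simplest case: one aggregate over the whole relation, given a binary operator and its identity value. The attribute names come from your examples; the data is made up.

```python
from functools import reduce
import operator

def summarize(rows, attr, combine, identity):
    # Fold a binary operator over one attribute of every tuple.
    # The identity value is the result for an empty relation.
    return reduce(combine, (row[attr] for row in rows), identity)

# SUMMARIZE rint ADD PRODUCT(myint) AS summary
rint = [{"myint": 2}, {"myint": 3}, {"myint": 4}]
product_summary = summarize(rint, "myint", operator.mul, 1)

# SUMMARIZE rrel ADD UNION(myrel) AS summary
# Each myrel value is itself a relation, modelled here as a frozenset.
rrel = [
    {"myrel": frozenset({("x", 1)})},
    {"myrel": frozenset({("x", 2), ("x", 3)})},
]
union_summary = summarize(rrel, "myrel", operator.or_, frozenset())

print(product_summary)
print(sorted(union_summary))
```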

> Also, a basic system capability is a way to extract a value out of a
> relation
> (I don't found how D does this),
D uses tuple and attribute extractors. The keyword is FROM in Tutorial D. For example:

ID FROM TUPLE FROM T WHERE ID = 1;

We dislike prefix notation generally, so we chose to use index-style access for the tuple extractor and a dot-qualifier for the attribute extractor:

T[1].ID

or

(T where ID = 1)[].ID
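The semantics of the two extractors can be sketched in Python as follows. This is only an illustration: tuple_from is a hypothetical helper, and raising an error when the relation does not contain exactly one tuple is an assumption for the sketch, not a statement about what either language does.

```python
def tuple_from(rows):
    # Tuple extractor: defined only for a relation of exactly one tuple.
    rows = list(rows)
    if len(rows) != 1:
        raise ValueError("expected exactly one tuple, got %d" % len(rows))
    return rows[0]

T = [{"ID": 1, "Name": "a"}, {"ID": 2, "Name": "b"}]

# Roughly: ID FROM TUPLE FROM T WHERE ID = 1
# Restrict first, extract the single tuple, then extract the attribute.
value = tuple_from(row for row in T if row["ID"] == 1)["ID"]

print(value)
```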

> Then one way to express
> ungroup (T add {Op(ID) RVA}) by RVA
> is the single value in
> summarize
> (T add {Op(ID) RVA}) add {(RVA add {ID ID}) eachid})
> union (eachid) as summary
Again, nice. And if we accept that extend is not primitive, and we restrict ourselves to functional, deterministic operators, then you have indeed provided a formulation of the query in terms of the primitive relational operators. However, I still feel this is subject to the context-literal objection mentioned above.
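To check my reading of your formulation, here is a Python sketch of it (not D4). Everything in it is an assumption for illustration: op is a made-up relation-valued operator, tuples are frozensets of (attribute, value) pairs, and relations are frozensets of tuples. Note that the second step still computes a value from the ID of the "current" tuple, which is where the context-literal objection applies.

```python
def op(i):
    # Hypothetical relation-valued operator: a relation over {N}
    # holding the values 1..i.
    return frozenset(frozenset({"N": n}.items()) for n in range(1, i + 1))

T = [{"ID": 1}, {"ID": 2}, {"ID": 3}]

# Step 1: T add { Op(ID) RVA }
with_rva = [{**t, "RVA": op(t["ID"])} for t in T]

# Step 2: add { (RVA add { ID ID }) eachid }
# Each nested tuple is extended with the ID of its enclosing tuple.
with_eachid = [
    {
        **t,
        "eachid": frozenset(
            frozenset({**dict(inner), "ID": t["ID"]}.items())
            for inner in t["RVA"]
        ),
    }
    for t in with_rva
]

# Step 3: summarize ... union(eachid) as summary
summary = frozenset().union(*(t["eachid"] for t in with_eachid))

# Reference result: ungroup written directly as a nested loop.
ungrouped = frozenset(
    frozenset({"ID": t["ID"], **dict(inner)}.items())
    for t in with_rva
    for inner in t["RVA"]
)

print(summary == ungrouped)
```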

> (I will have more to say about the details of the primitiveness of
> group/ungroup, extend and summarize in a later msg.)
I am very interested to hear it.

> You may say, I am cheating, that ungroup is only non-primitive if I
> allow abitrarily typed aggregates and value extraction.
I am fine with arbitrarily typed aggregate operators. And these operators are not primitive, in that they could always be performed using the constant relation technique. As for value extraction, I'll have to think about that one some more.

> I can only say that a system overall has not just operators from
> relations to relations but operators between relations and other
> types,
> and if you don't treat any type as special (eg integers, or relations)
> and are thoroughly orthogonal in your use of types,
> then having added sensible primitives for those purposes also gives
> you ungroup.
I agree that a system that supports RVAs should certainly support nest and unnest. I am happy to have as many operators as are useful. The crux of the issue is that I don't feel the set of primitives we have is sufficient, even without RVAs. I have never been fully convinced that the traditional primitives are complete. I would also add that the question is far from academic. Even if the traditional primitives are complete in some sense, I would like to see a more pragmatic set of primitives that is closer to the actual implementation. This would provide a simpler basis for implementing an RVA-enabled query processor.

Regards,
Bryn

Received on Fri Feb 11 2005 - 10:12:58 CET