Re: Conflicts in Relationships

From: Jan Hidders <jan.hidders_at_pandora.be>
Date: Fri, 19 Sep 2003 23:12:12 GMT
Message-ID: <g3Mab.28843$n05.1488993_at_phobos.telenet-ops.be>


bill_dev wrote:
> Originally posted by Jan Hidders

>> It's not very simple, but also not very complicated. Just think of 
>> the join as a graph: nodes are the relations, edges are join
>> conditions. You start from the node/relation that has to be fully
>> in the result. If there is a simple path (no cycles) that starts in
>> this node and ends with the edge (R1,R2) then the join between R1
>> and R2 should be an outer join on the side of R1. >

>
> Awesome. I'm just on the verge of comprehending :) I was thinking
> nodes are tables but no, they're relations. That makes more sense.

It does? Because actually I meant "tables" when I said relations. :-)

> So in my above example I'd have 3 nodes...A is related to B, B is
> related to C, and C is related to A and 3 edges. That would be a cycle
> (#edges > (#nodes-1)), is that right?

Yes, it is. And as a conseqence you need a full outer join between B and C.

> I assume verifying a cycle would
> be harder than verfying the simple path you describe above. Do you know
> of any documentation that describes this further? Thanks again!

You could simply do a depth-first traversal with a recursive function that checks if the node that it is about to visit is not already in the path that it used to get to the node where it is at the moment. The function looks roughly like this:

FUNCTION check_all( current_path, current_node ) /* current path is a list of nodes that describes the path */

/* to the current_node                                     */
BEGIN
   FOR ALL edges (R1,R2) with R1 = current_node DO
     IF R2 is not in current_path
     THEN
       Check if the join condition that corresponds with the edge
       is "outer" on the side of R1;
       check_all( current_path ++ [ current_node ], R2 );
       /* ++ = list append */
     FI

   OD
END If table A is the one that has to be fully in the result then the initial call is: check_all( [A], A)

Note that I didn't say what to do when the check fails, you can probably figure that out for yourself.

> ..or are you saying only cycles have potential for conflict and my
> software should evaluate the simple path leading up to the last edge -
> then if the last edge works as you describe then no conflict?

If you look at the algorithm you will see that if there is a cycle then it will walk around the cycle in both ways, which means that all the join conditions have to be full on both sides, i.e., they have to be *full* outer joins. All other join conditions only have to full on one side.

  • Jan Hidders
Received on Sat Sep 20 2003 - 01:12:12 CEST

Original text of this message