Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Can this be done in SQL? Find the transitive relation
On Aug 4, 12:06 pm, "da. Ram" <dba2..._at_yahoo.com> wrote:
> Dear Group,
>
> I am strugling get the following done in SQL. Could anyone help me to
> figure out a way?
>
> Oracle version 8i .
>
> create table t_grp (id1 char(1), id2 char(2)) ;
>
> insert into t_grp values ('A','C'); --> A:C are related
> insert into t_grp values ('B','D'); --> B:D are related
> insert into t_grp values ('F','H');
> insert into t_grp values ('G','H');
> insert into t_grp values ('B','G');
> insert into t_grp values ('X','Y');
> insert into t_grp values ('W','Y');
>
> I want to group the values based on the following rules,
>
> 1, if any of the value from id1 or id2 can be linked to any other
> values,
> they all will be treated as one group
>
> Ex, in the above case
> A has relationship to C, neither of them has any other relationship to
> any
> other values in the table
> B has relationship to D, B has relationship to G, G has relationship
> to H, H
> has relationship to F
> F has relationship to H, H has relationship to G ...
>
> A:C - Group 1 (A,C) - new elements
> B:D -> B:G -> G:H -> H:F - Group 2 (B, D, G, H, F)
> F:H -> H:G -> G:B -> B:D - Group 2 (F, H, G, B, D) - same elements as
> above
> G:H -> G:B -> H:F -> B:D - Group 2 (G, H, B, F, D) - do -
> G:B -> G:H -> H:F -> B:D - Group 2 (G, B, H, F, D) - do -
> X:Y -> Y:W - Group 3 (X, Y, W) - new set of elements
>
> My objective is to get the final result in the below form.
>
> grp id1 id2
> ----- ------ ------
> 1 A C
> 2 B D
> 2 F H
> 2 G H
> 2 B G
> 3 X Y
> 3 W Y
>
> The following output is also fine [This would be the final output, but
> I can covert from the above to this one]
>
> grp id
> ---- -------
> 1 A
> 1 C
> 2 B
> 2 D
> 2 F
> 2 H
> 2 G
> ...
> 3 X
> 3 Y
> 3 W
> 3 Y
>
> Thank you.
http://www.bookpool.com/sm/0977671542, section on transitive closure
Transitive Closure
There are several key graph concepts that would guide your intuition
when writing queries on graphs.
1. Reflexive closure of a graph is build by adding missing loops -
edges with the same endpoints. Reflexive closure is expressed easily
in SQL:
select head, tail from Edges -- original relation
union
select head, head from Edges -- extra loops
union
select tail, tail from Edges -- more loops
2. Symmetric closure of a (directed) graph is build by adding inversely oriented edge for each edge in the original graph. In SQL:
select head, tail from Edges -- original relation
union
select tail, head from Edges -- inverse arrows
3. Transitive closure of a (directed) graph is generated by connecting edges into paths and creating new edge with the tail being the beginning of the path, and the head being the end. Unlike the previous two cases, transitive closure can't be expressed with bare SQL essentials - the select, project, and join relational algebra operators.
For now, let's assume that we know how to query transitive closure,
and demonstrate how armed with these definitions we can approach some
problems. Here is a puzzle from the comp.databases.oracle.misc forum:
Suppose I have a table that contains Account# & RelatedAccount#
(among
other things).
How could I use CONNECT BY & START WITH in a query to count
relationships or families.
For example, in
ACCT REL_ACCT
Bob Mary Bob Jane
then there'd be only 1 larger family. Can I use CONNECT BY & START
with to detect such relationships and count them? In my case I'd be
willing to go any number of levels deep in the recursion.
Let's ignore oracle specific SQL keywords as inessential to the
problem at hand. With proper graph terminology the question can be
formulated in just one line:
"Find a number of connected components in a graph."
Connected component of a graph is a set of nodes reachable from each
other. A node is reachable from another node if there is an undirected
path between them.
Figure 6.4: A graph with two connected components. Reachability is an equivalence relation: it's reflective, symmetric, and transitive. Given a graph, we formally obtain reachability relation by closing the Edges relation to become reflective, symmetric and, transitive (fig. 6.5).
Figure 6.5: Reachability as an equivalence relation: graph from fig. 6.4 symmetrically and transitively closed. Returning back to the problem of finding the number of connected components, let's assume that we already calculated the reachability relation EquivalentNodess somehow. Then, we just select a smallest node from each component. Informally,
Select node(s) such that there is no node with smaller label reachable from it. Count them.
Formally:
select count(distinct tail) from EquivalentNodes e
where not exists (
select * from EquivalentNodes ee
where ee.head<e.tail and e.tail=ee.tail
);
Being able to query the number of connected components earned us an unexpected bonus: we can redefine a connected graph as a graph that has a single connected component. Next, a connected graph with N nodes and N-1 edges must be a tree. Thus, counting nodes and edges together with transitive closure is another opportunity to enforce tree constraint.
Now that we established some important graph closure properties, we can move on to transitive closure implementations. Unfortunately, our story has to branch here, since database vendors approached hierarchical query differently. Received on Fri Aug 10 2007 - 17:18:32 CDT