Re: Floyd's algorithm in SQL

From: John Gilson <jag_at_acm.org>
Date: Thu, 02 Dec 2004 22:39:59 GMT
Message-ID: <3RMrd.18429$Yh2.7072712_at_twister.nyc.rr.com>


"Lennart Jonsson" <lennart_at_kommunicera.umea.se> wrote in message news:6dae7e65.0412021227.6e0e87ce_at_posting.google.com...
> "John Gilson" <jag_at_acm.org> wrote in message news:<ykGrd.18407$Yh2.6979338_at_twister.nyc.rr.com>...
> > "--CELKO--" <jcelko212_at_earthlink.net> wrote in message
> > news:18c7b3c2.0412011610.3891e0bc_at_posting.google.com...
> > > I am re-doing the graphs section for the third edition of SQL FOR
> > > SMARTIES and would like some feedback and help.
> > >
>
>
> [...]
> >
> > The following was tested with DB2 so it might not be fully compliant
> > with Standard SQL. For example, I believe the Standard requires a
> > recursive WITH to be defined as WITH RECURSIVE.
> >
> > Using your DDL, here's a sample graph.
> >
> > INSERT INTO Edges (source, destination, lgth)
> > VALUES ('1', '2', 50);
> > INSERT INTO Edges (source, destination, lgth)
> > VALUES ('1', '3', 30);
> > INSERT INTO Edges (source, destination, lgth)
> > VALUES ('1', '4', 100);
> > INSERT INTO Edges (source, destination, lgth)
> > VALUES ('1', '5', 10);
> > INSERT INTO Edges (source, destination, lgth)
> > VALUES ('3', '2', 5);
> > INSERT INTO Edges (source, destination, lgth)
> > VALUES ('4', '2', 20);
> > INSERT INTO Edges (source, destination, lgth)
> > VALUES ('4', '3', 50);
> > INSERT INTO Edges (source, destination, lgth)
> > VALUES ('5', '4', 10);
> >
> > To find the shortest paths:
> >
> > CREATE VIEW ShortestPaths (source, destination, path_length)
> > AS
> > WITH Paths (source, destination, path_length)
> > AS
> > (
> > SELECT source, destination, 1
> > FROM Edges
> > UNION ALL
> > SELECT E1.source, P1.destination, P1.path_length + 1
> > FROM Edges AS E1, Paths AS P1
> > WHERE E1.destination = P1.source
> > )
> > SELECT source, destination, MIN(path_length)
> > FROM Paths
> > GROUP BY source, destination;
> >
>
> Dont you need a termination condition in case of cycles? I dont think
> this will terminate if you add (2,1) to the set of edges. The best
> condition I could think of was that we never have to walk more then
> the nr of edges in the graph, thus when path_length > cnt we can stop.
>
> It would be nice to be able to do something like:
>
> WITH Paths (source, destination, path_length) AS (
> SELECT source, destination, 1 FROM Edges
> UNION ALL
> SELECT E1.source, P1.destination, P1.path_length + 1
> FROM Edges AS E1, Paths AS P1
> WHERE E1.destination = P1.source
> AND NOT EXISTS (
> SELECT 1 from Paths P2
> WHERE (E1.source, P1.destination) = (P2.source,
> P2.destination)
> )
> )
> SELECT source, destination, MIN(path_length)
> FROM Paths
> GROUP BY source, destination;
>
> But that is not supported by DB2.
>
> I BTW found another limitation for the with clause in DB2. It is not
> allowed to use x inner join y on ... inside the with clause.
>
>
> /Lennart
>
> [...]

Yes, good observation! I was considering a DAG but if the graph has cycles then one needs to bound the recursion to prevent infinite loops. The WITH RECURSIVE in Standard SQL has a CYCLE clause to handle this sort of thing but DB2 requires a manual solution.

We know that the length of a path through a directed graph can't be greater than the number of nodes in the graph. If we don't allow an edge from a node to itself then the length of a path must be less than the number of nodes in the graph. This is our bounding condition. I'll repeat the code from my previous posting with this constraint added.

CREATE TABLE Nodes
(
node_name CHAR(1) NOT NULL PRIMARY KEY
);

INSERT INTO Nodes (node_name)
VALUES ('1');
INSERT INTO Nodes (node_name)
VALUES ('2');
INSERT INTO Nodes (node_name)
VALUES ('3');
INSERT INTO Nodes (node_name)
VALUES ('4');
INSERT INTO Nodes (node_name)
VALUES ('5'); CREATE TABLE Edges
(source CHAR(1)NOT NULL REFERENCES Nodes (node_name),  destination CHAR(1)NOT NULL REFERENCES Nodes (node_name),  PRIMARY KEY (source, destination),
 lgth INTEGER DEFAULT 1 NOT NULL);

CREATE TABLE Paths
(source CHAR(1)NOT NULL REFERENCES Nodes (node_name),  destination CHAR(1) NOT NULL REFERENCES Nodes (node_name),  lgth INTEGER DEFAULT 0 NOT NULL);

  • Note that this graph now has a cycle between nodes 1 and 2 INSERT INTO Edges (source, destination, lgth) VALUES ('1', '2', 50); INSERT INTO Edges (source, destination, lgth) VALUES ('2', '1', 50); INSERT INTO Edges (source, destination, lgth) VALUES ('1', '3', 30); INSERT INTO Edges (source, destination, lgth) VALUES ('1', '4', 100); INSERT INTO Edges (source, destination, lgth) VALUES ('1', '5', 10); INSERT INTO Edges (source, destination, lgth) VALUES ('3', '2', 5); INSERT INTO Edges (source, destination, lgth) VALUES ('4', '2', 20); INSERT INTO Edges (source, destination, lgth) VALUES ('4', '3', 50); INSERT INTO Edges (source, destination, lgth) VALUES ('5', '4', 10);
  • Shortest paths for any directed graph CREATE VIEW ShortestPaths (source, destination, path_length) AS WITH NumberNodes (n) AS (SELECT COUNT(node_name) FROM Nodes), Paths (source, destination, path_length) AS (SELECT source, destination, 1 FROM Edges UNION ALL SELECT E1.source, P1.destination, P1.path_length + 1 FROM Edges AS E1, Paths AS P1, NumberNodes AS N WHERE E1.destination = P1.source AND P1.path_length < N.n -- condition to prevent infinite recursion ) SELECT source, destination, MIN(path_length) FROM Paths GROUP BY source, destination;

SELECT source, destination, path_length
FROM ShortestPaths
ORDER BY source, destination;

source destination path_length

1      1                     2
1      2                     1
1      3                     1
1      4                     1
1      5                     1
2      1                     1
2      2                     2
2      3                     2
2      4                     2
2      5                     2
3      1                     2
3      2                     1
3      3                     3
3      4                     3
3      5                     3
4      1                     2
4      2                     1
4      3                     1
4      4                     3
4      5                     3
5      1                     3
5      2                     2
5      3                     2
5      4                     1
5      5                     4

To find the cheapest paths:

CREATE VIEW CheapestPaths (source, destination, path_cost) AS
WITH NumberNodes (n) AS (SELECT COUNT(node_name) FROM Nodes),

           Paths (source, destination, path_length, path_cost)
           AS
           (SELECT source, destination, 1, lgth
            FROM Edges
            UNION ALL
            SELECT E1.source,
                           P1.destination,
                           P1.path_length + 1,
                           P1.path_cost + E1.lgth
            FROM Edges AS E1, Paths AS P1, NumberNodes AS N
            WHERE E1.destination = P1.source AND
                           P1.path_length < N.n  -- condition to prevent infinite recursion
           )

SELECT source, destination, MIN(path_cost) FROM Paths
GROUP BY source, destination;

SELECT source, destination, path_cost
FROM CheapestPaths
ORDER BY source, destination;

source destination path_cost

1      1                    85
1      2                    35
1      3                    30
1      4                    20
1      5                    10
2      1                    50
2      2                    85
2      3                    80
2      4                    70
2      5                    60
3      1                    55
3      2                     5
3      3                    85
3      4                    75
3      5                    65
4      1                    70
4      2                    20
4      3                    50
4      4                    90
4      5                    80
5      1                    80
5      2                    30
5      3                    60
5      4                    10
5      5                    90

The iterative solutions to find the shortest paths and cheapest paths from my previous posting will handle cycles as is by virtue of the NOT EXISTS subquery when doing insertions. As you've pointed out, DB2 won't allow this very same NOT EXISTS subquery to be placed in the common table expression of a WITH clause. Unfortunate as that would be a more direct way of avoiding infinite recursion from cycles without having to fall back on the path-length condition.

--
JAG
Received on Thu Dec 02 2004 - 23:39:59 CET

Original text of this message