Re: Transitive Reflexive Closure And Its Discontents

From: Vadim Tropashko <vadimtro_at_gmail.com>
Date: Thu, 22 Jul 2010 10:23:41 -0700 (PDT)
Message-ID: <a7fa0276-ba0f-4ae5-b4a3-e0db913b0db7_at_m17g2000prl.googlegroups.com>


On Jul 21, 2:11 pm, da..._at_fetter.org (David Fetter) wrote:
> Folks,
>
> I'm reading over Vadim Tropashko's SQL Design Patterns book, and
> trying to figure out how to maintain transitive reflexive closure,
> using PostgreSQL.
>
> Let's assume, for the purposes of this discussion, that I've looked
> over all the alternatives to representing graphs as adjacency
> relations and decided to go ahead with adjacency relations.
>
> For simplicity I'm using two tables: edges and
> transitive_reflexive_closure.
>
> CREATE TABLE edges (
>     tail INTEGER NOT NULL,
>     head INTEGER NOT NULL,
>     PRIMARY KEY(tail, head),
>     CHECK (tail <> head)
> );
>
> CREATE UNIQUE INDEX one_way_edges ON edges(
>     least(tail,head),
>     greatest(tail,head)
> ); /* Yes, this should probably be expressible as a constraint, but
> it's not yet in PostgreSQL */
>
> CREATE TABLE transitive_reflexive_closure (
>     tail INTEGER NOT NULL,
>     head INTEGER NOT NULL,
>     weight INTEGER NOT NULL,
>     CHECK(weight > 0),
>     PRIMARY KEY(tail, head),
>     FOREIGN KEY(tail, head) REFERENCES edges(tail,head)
> );
>
> To populate transitive_reflexive_closure initially, I'd use
> PostgreSQL's arrays:
>
> INSERT INTO transitive_reflexive_closure(tail, head, weight)
> WITH RECURSIVE t (tail, head, chain) AS (
>     SELECT tail, head, ARRAY[tail,head]
>     FROM edges
>     WHERE NOT EXISTS ( /* Start from only "roots" */
>         SELECT 1 FROM edges e WHERE e.head = edges.tail
>     )
> UNION ALL
>     SELECT t1.tail, t1.head, t.chain || t1.head
>     FROM
>         edges t1
>     JOIN
>         t ON t.head = t1.tail
>         AND t1.tail <> ANY(t.chain) /* No cycles */
> )
> SELECT tail, head, count(*)
> FROM t
> GROUP BY tail, head;

For DAGs why do you want to start with roots? Two more minor observations: the query doesn't seem to include identity relation (so the its not reflective), and in case of more than one path from node A to B it seems to violate transitive_reflexive_closure primary key constraint.

> I'd like to see whether continuing along this line will allow for
> "graph constraints" like "must be tree" or "must be DAGs."
>
> Does this seem like a valid approach?

Just to avoid any confusion, if you are referring to transitive closure method you seems to have outlined only half of it. The second step is transitive closure maintenance, that is what edges to insert into the transitive closure table when an adge is added into the source table.

Tree constraint is easy to express: no two edges should have the same tail (pp 208-209). The DAG constraint is also easy to express on the transitive closure table. Every cycle in the graph is represented in transitive closure by a reflective edge. (If your variation of transitive closure breaks cycles early then, the condition of not having cycles in an original agjacency relation becomes a self-join query upon transitive closure table asserting that any two nodes A and B are not connected by both edges(A,B) and (B,A) ). Received on Thu Jul 22 2010 - 19:23:41 CEST

Original text of this message