Re: SQL, related records (quotes)

From: Dan Guntermann <>
Date: Sun, 26 Jun 2005 10:58:37 GMT
Message-ID: <xTvve.3718$4M1.1866_at_trnddc07>

>> I'll have to get Celko's book myself, I guess.
>> Ideally and abstractly, I think something like this would work:
>> hierarchies(child_vertex *vertex*, parent_vertex *vertex*,
>> CONSTRAINT primary_key (child_vertex),
>> CONSTRAINT CHECK (child_vertex <> parent_vertex),
>> );
> A few questions:
> 1. Did you just forget the foreign key constraint
> (child_vertex referencing parent_vertex)
> or did you leave it out for a purpose?

No nulls in my model. I don't believe an FK is necessary, and with some implementations could not be achievable. I could be convinced otherwise, but consider the following:

child parent
  B A

how would one insert this tuple with the FK established? What would B reference?

Moreover, the model I was conceptualizing would allow for multiple hierarchies.

> 2. Both the '<>' and the ****ANTISYMMETRIC**** constraint serve to
> establish that 'hierarchies' can just contain hierarchies (trees)
> indeed.
> What is special about the '<>' constraint? Shouldn't it be included
> in the (as yet magical) ****ANTISYMMETRIC**** constraint?

I think Mr. Rybacki begs to differ, but I had the following rationale:

Starting with an empty relation with just a key constraint on 'child', the following would be allowable, but undersireable:

child     parent
  A          A

Thus, to guarantee the nonreflexive property of a hierarchy, the constraint is need to guarantee all cases. Note that this is only possible with a vertex involving itself as a root node of a hierarchy. It is not possible for child vertexes by virtue of the key constraint since they already exist in conjunction with a single parent vertex.

To address the issue of antisymmetry, I should identify the conditions where I think it such a constraint would be pertinent. Direct cycles are not an issue anywhere but for a certain case, and that is in the case where a root can be associated with a child and a child can be the parent of the root. All other vertices of the hierarchy will be immune to such a condition because each child is defined in terms of its relationship with a parent and, by definition, the number of arrows representing a directed parent-child relationship and thus pointing into child is limited to one (as enforced by the key).

example1:     child    parent
                       B         A
                       C         A
                       A         B        <--  cyclic (symmetric) 
example2        B          A
                      C          A
                      D          B
                      B          D       XXXX <-- can't happen,
                                                because of the key 

Nor can a leaf node or an intermediate branch node reference higher level vertices as children.

As Mr. Rybacki points out, a relation is antisymmetric <=> for all x, y in some domain, if x R y and y R x, then x = y. The antisymmetric constraint will disallow the extensional relation given in previous posts:

child     parent
   3          6
   6          3

Why? Because the DBMS will see that <child: 3, parent 6> exists and that if <child: 6, parent: 3> were allowed to be inserted, the assertion of antisymmetry would not hold because 6 <> 3, and thus the DBMS would reject the insertion on the basis of the ANTISYMMETRIC constraint.

> 3. Would ****NOLOOP*** (some other as yet magical constraint over
> child_vertex and parent_vertex) suggest the same mechanics
> to you as ****ANTISYMMETRIC**** ?

??? I'm not sure what NOLOOP would do. Properties of relations such as symmetry, reflexivity, and transitivity are states of relations that are very well defined.

> 4. (should have been the first Q to the OP - to late for that I guess:)
> What does one tuple mean / what is the predicate for 'hierarchies'?

Well, in my little encapsulated world, it can represent several types of conceptual objects, or specific aspects of hierarchy, really. Possibility:
Child Vertex: X of type *vertex* has one and only one Parent Vertex: Y of type *vertex*
  AND (informally) Child Vertex: X belongs to only one edge as the child and therefore has only one parent
  AND (informally) if the Child Vertex = X, then Parent Vertex <> X
  AND (informally) there should never be the case where EXISTS (child_vertex: X, parent_vertex Y) AND (child_vertex: Y, parent_vertex: X)

             AND (child vertex: X <> parent_vertex: Y).

The table constraints are really meta constraints across multiple tuples to enforce structure over the hierarchy.

> 5. (Proposal, not really a question) Let's isolate the roots
> (and get rid of the NULLs).
> (PS: later I saw that you already did just that)

>> This ideal but unachievable representation would solve the problem and it
>> enforces:
>> * a single root vertex for any hierarchy.
>> * a single parent for any child vertex and allows for multiple children
>> for any parent.
>> * ensures an nonreflexive property and thus eliminates the
>> self-referential cyclic issue.
>> * eliminates the possibility of a cycle between a root and single child
>> with the antisymmetric constraint.
>> * eliminates the possibility for indirect cycles by virtue of the
>> constraint that any given vertex can only have one parent.
>> But alas, not many implementations allow for assertions of antisymmetry,
>> though it could be done with a trigger.
> Let's forget implementation for now and focus on minimal needs.
Sure, why not.

> Hmm... the formatting makes this a tough read.

> FROM roots WHERE roots.root_vertex = current.child_vertex )
> no children are root. Ok.
> EXISTS ( SELECT 'x' FROM roots
> WHERE roots.root_vertex = current.parent_vertex)
> all parents should be root? To restrictive.
The formatting was tough, I know. I'll rewrite the constraint in its entirety here:


     FROM roots
     WHERE roots.root_vertex = current.parent_vertex
      FROM roots r2
      WHERE r2.child_index = current.parent_index) );

Basically, the intent is to only allow a tuple to be inserted into the hierarchies relation if the proposed parent vertex value does not already exist as a child and its value has been previously inserted as a value in the roots relation.
> To check we should have something like this:
> A length 1 loop:
> Values:
> parent_vertex child_vertex
> 3 3
> can't be in 'hierarchies', it will be rejected by
> 'CONSTRAINT CHECK (child_vertex <> parent_vertex)'. Good.
> A length 2 loop:
> Values:
> parent_vertex child_vertex
> 3 6
> 6 3
> can't be in 'hierarchies', it will be rejected by
> ???.
It will be rejected by the candidate key for almost all cases except for the possibility of such a case happening between a root and one of its children. In that case, the ANTISYMMETRIC constraint should ensure that that special case does no occur.

> A length 3 loop:
> Values:
> parent_vertex child_vertex
> 3 6
> 6 7
> 7 3
> can't be in 'hierarchies', it will be rejected by
> ???.

Well, I think the special roots relation and the added constraints I mentioned would be helpful in such cases. Admittedly, I overlooked this in my first proposal.

>> The last constraints, which are DB constraints, of the tables would most
>> likely have to be implemented as a database constraint either with
>> trigger logic or by modeling manipulation within the scope of a
>> transaction boundary from the application side. This overcomes the
>> obstacle that a hierarchy could conceivably be composed of a single root
>> node.
> The non-hierarchy hierarchy.
> Do I care wether it is in or out?
> Let's take the easy road. Out.

Makes sense. However, one might be neglecting some useful applications (e.g. modeling a B*Tree or something similar).


  • Dan
Received on Sun Jun 26 2005 - 12:58:37 CEST

Original text of this message