Re: SQL, related records (quotes)

From: mAsterdam <mAsterdam_at_vrijdag.org>
Date: Sun, 26 Jun 2005 21:20:02 +0200
Message-ID: <42beffe2$0$32835$e4fe514c_at_news.xs4all.nl>


Dan Guntermann wrote:
> <snip>
>

>>>I'll have to get Celko's book myself, I guess.
>>>
>>>Ideally and abstractly, I think something like this would work:
>>>
>>>hierarchies(child_vertex  *vertex*, parent_vertex *vertex*,
>>>                   CONSTRAINT primary_key (child_vertex),
>>>                   CONSTRAINT CHECK (child_vertex <> parent_vertex),
>>>                   CONSTRAINT CHECK ****ANTISYMMETRIC****
>>>                 );
>>
>>A few questions:
>>1. Did you just forget the foreign key constraint
>>   (child_vertex referencing parent_vertex)
>>   or did you leave it out for a purpose?

>
> No nulls in my model.

Ok with me. As a consequence we need a model change, maybe more tables. Anyway we are changing the model. (In the OP's question there where quotes _without_ previous quotes)

So which FK are we talking about now?

> I don't believe an FK is necessary,

Maybe, maybe not. It depends on the - yet to be defined - new model.

> and with some
> implementations could not be achievable. I could be convinced otherwise,
> but consider the following:
>
> child parent
> B A
>
> how would one insert this tuple with the FK established? What would B
> reference?
>
> Moreover, the model I was conceptualizing would allow for multiple
> hierarchies.

Does that matter?
Multiple trees are easily mapped to branches of one root node, so whichever is easiest (one or more trees) is easily used for the other - or are you talking about multiple hierarchies over the same nodes? I don't think so.

>>2. Both the '<>' and the ****ANTISYMMETRIC**** constraint serve to
>>   establish that 'hierarchies' can just contain hierarchies (trees)
>>   indeed.
>>   What is special about the '<>' constraint? Shouldn't it be included
>>   in the (as yet magical) ****ANTISYMMETRIC**** constraint?

>
> I think Mr. Rybacki begs to differ, but I had the following rationale:
>
> Starting with an empty relation with just a key constraint on 'child', the
> following would be allowable, but undersireable:
>
> child parent
> A A
>
> Thus, to guarantee the nonreflexive property of a hierarchy, the constraint
> is need to guarantee all cases. Note that this is only possible with a
> vertex involving itself as a root node of a hierarchy. It is not possible
> for child vertexes by virtue of the key constraint since they already exist
> in conjunction with a single parent vertex.

It never stops bugging me how many arbitrary choices one has to make to put trees - or even just order - into tables). It seems you define a root as a node which has itself as a parent, right? That is one, not uncommon, way of dealing with this, but not the only way.

> To address the issue of antisymmetry, I should identify the conditions where
> I think it such a constraint would be pertinent. Direct cycles are not an
> issue anywhere but for a certain case, and that is in the case where a root
> can be associated with a child and a child can be the parent of the root.

Is this just a way of restating the root definition?

> All other vertices of the hierarchy will be immune to such a condition
> because each child is defined in terms of its relationship with a parent
> and, by definition, the number of arrows representing a directed
> parent-child relationship and thus pointing into child is limited to one (as
> enforced by the key).
>
> example1: child parent
> B A
> C A
> A B <-- cyclic (symmetric)
> relationship
>
> example2 B A
> C A
> D B
> B D XXXX <-- can't happen,
> because of the key
> constraint.
>
> Nor can a leaf node or an intermediate branch node reference higher level
> vertices as children.
>
> As Mr. Rybacki points out, a relation is antisymmetric <=> for all x, y in
> some domain, if x R y and y R x, then x = y. The antisymmetric constraint
> will disallow the extensional relation given in previous posts:
>
> child parent
> 3 6
> 6 3
>
> Why? Because the DBMS will see that <child: 3, parent 6> exists and that if
> <child: 6, parent: 3> were allowed to be inserted, the assertion of
> antisymmetry would not hold because 6 <> 3, and thus the DBMS would reject
> the insertion on the basis of the ANTISYMMETRIC constraint.
>
>

>>3. Would ****NOLOOP*** (some other as yet magical constraint over
>>   child_vertex and parent_vertex) suggest the same mechanics
>>   to you as ****ANTISYMMETRIC**** ?

>
>
> ??? I'm not sure what NOLOOP would do.

Something like this:
It would not allow loops. A loop being a set of vertices, where there exists a vertex A being both child and ancestor of a vertex B (which may, of course be A).

It does not play well with having root as parent of itself, though.

> Properties of relations such as
> symmetry, reflexivity, and transitivity are states of relations that are
> very well defined.
>
>

>>4. (should have been the first Q to the OP - to late for that I guess:)
>>   What does one tuple mean / what is the predicate for 'hierarchies'?

>
>
> Well, in my little encapsulated world, it can represent several types of
> conceptual objects, or specific aspects of hierarchy, really.
> Possibility:
> Child Vertex: X of type *vertex* has one and only one Parent Vertex: Y of
> type *vertex*
> AND (informally) Child Vertex: X belongs to only one edge as the child and
> therefore has only one parent
> AND (informally) if the Child Vertex = X, then Parent Vertex <> X
> AND (informally) there should never be the case where EXISTS
> (child_vertex: X, parent_vertex Y) AND (child_vertex: Y, parent_vertex: X)
> AND (child vertex: X <> parent_vertex: Y).

This is structure, not meaning.

> The table constraints are really meta constraints across multiple tuples to
> enforce structure over the hierarchy.

Yep.

>>5. (Proposal, not really a question) Let's isolate the roots
>>   (and get rid of the NULLs).
>>
>>   (PS: later I saw that you already did just that)
>>

>
> Yes!!
>
>>>This ideal but unachievable representation would solve the problem and it 
>>>enforces:
>>>*  a single root vertex for any hierarchy.
>>>*  a single parent for any child vertex and allows for multiple children 
>>>for any parent.
>>>*  ensures an nonreflexive property and thus eliminates the 
>>>self-referential cyclic issue.
>>>*  eliminates the possibility of a cycle between a root and single child 
>>>with the antisymmetric constraint.
>>>*  eliminates the possibility for indirect cycles by virtue of the 
>>>constraint that any given vertex can only have one parent.
>>>
>>>But alas, not many implementations allow for assertions of antisymmetry, 
>>>though it could be done with a trigger.
>>
>>Let's forget implementation for now and focus on minimal needs.
>>

>
> Sure, why not.
>
>
>>Hmm... the formatting makes this a tough read.
>>

>
> Sorry.
>
>
>>NOT EXISTS ( SELECT 'x'
>>  FROM roots WHERE roots.root_vertex = current.child_vertex )
>>no children are root. Ok.
>>
>>
>>EXISTS ( SELECT 'x' FROM  roots
>>   WHERE  roots.root_vertex = current.parent_vertex)
>>
>>all parents should be root? To restrictive.
>>

>
> The formatting was tough, I know. I'll rewrite the constraint in its
> entirety here:
>
> CONSTRAINT EXISTS
> (SELECT 'x'
> FROM roots
> WHERE roots.root_vertex = current.parent_vertex
> AND NOT EXISTS
> (SELECT 'x'
> FROM roots r2
> WHERE r2.child_index = current.parent_index) );
>
> Basically, the intent is to only allow a tuple to be inserted into the
> hierarchies relation if the proposed parent vertex value does not already
> exist as a child and its value has been previously inserted as a value in
> the roots relation.

The intent is clear and common - I think. That makes it more annoying that it is so difficult to come up with a good set of constraints. I'm puzzled wether the above does what it is supposed to do. Did you test it?

>>To check we should have something like this:
>>A length 1 loop:
>>Values:
>>    parent_vertex    child_vertex
>>          3           3
>>can't be in 'hierarchies', it will be rejected by
>>'CONSTRAINT CHECK (child_vertex <> parent_vertex)'. Good.
>>
>>A length 2 loop:
>>Values:
>>    parent_vertex    child_vertex
>>          3           6
>>          6           3
>>can't be in 'hierarchies', it will be rejected by
>> ???.

>
> It will be rejected by the candidate key for almost all cases except for the
> possibility of such a case happening between a root and one of its children.

This complication sounds like a consequence of the (arbitrarily chosen) root definition.

> In that case, the ANTISYMMETRIC constraint should ensure that that special
> case does no occur.
>
>

>>A length 3 loop:
>>Values:
>>    parent_vertex    child_vertex
>>          3           6
>>          6           7
>>          7           3
>>can't be in 'hierarchies', it will be rejected by
>> ???.

>
>
> Well, I think the special roots relation and the added constraints I
> mentioned would be helpful in such cases. Admittedly, I overlooked this in
> my first proposal.
>
>
>>>The last constraints, which are DB constraints, of the tables would most 
>>>likely have to be implemented as a database constraint either with 
>>>trigger logic or by modeling manipulation within the scope of a 
>>>transaction boundary from the application side.  This overcomes the 
>>>obstacle that a hierarchy could conceivably be composed of a single root 
>>>node.
>>
>>The non-hierarchy hierarchy.
>>Do I care wether it is in or out?
>>Let's take the easy road. Out.

>
> Makes sense. However, one might be neglecting some useful applications
> (e.g. modeling a B*Tree or something similar).

Ever heard of YAGNI? KISS, so neglect it unless you specifically aim for "modeling a B*Tree or something similar" - are you? Received on Sun Jun 26 2005 - 21:20:02 CEST

Original text of this message