SUPPORT FOR DECLARATIVE TRANSITION CONSTRAINTS

From: Brian <brian_at_selzer-software.com>
Date: Sat, 18 Sep 2010 18:42:32 -0700 (PDT)
Message-ID: <d67dea08-45fb-493e-99fc-554028a670c0_at_k11g2000vbf.googlegroups.com>



SUPPORT FOR DECLARATIVE TRANSITION CONSTRAINTS A proposed enhancement to D

By Brian S. Selzer

ABSTRACT A non-procedural mechanism is proposed for declaring transition constraints using set-based relational operators, without altering relvar headings and independent of whether updates affect keys.

INTRODUCTION Transition constraints can be used to enforce business rules. Limited support for defining transition constraints has been available for some time in SQL implementations through the use of triggered procedures, but that support is limited to one table at a time, and in implementations that don't support FOR EACH ROW triggers, the tables must be altered to include IDENTITY or GUID columns.

C.J. Date and Hugh Darwen suggest on pages 220-221 of TTM, Third Edition, that transition constraints can be enforced by defining constraints using primed relvar names that refer to the corresponding relvar as it was prior to the update, but that suggestion suffers from the same limitation that SQL implementations that don't support FOR EACH ROW triggers do.

There has to be a better way.

A MOTIVATING EXAMPLE What follows is a fragment of a database that is used to record labor activities performed by employees in a manufacturing facility.

/* stub work order master */
VAR WORKORDER REAL RELATION { WO# CHAR, PART# CHAR } KEY { WO# }; /* work order detail */
VAR WODETAIL REAL RELATION
    { WO# CHAR, SEQ INTEGER, OP CHAR, INSTRUCT CHAR }     KEY { WO#, SEQ };
  CONSTRAINT FK_WODETAIL_WORKORDER ( COUNT ( WODETAIL { WO } )

  • COUNT ( ( ( WODETAIL { WO } ) JOIN WORKORDER ) { WO# } ) );
/* labor activities are recorded here. */ VAR LABOR REAL RELATION
    { EMP# CHAR, WO# CHAR, SEQ INTEGER, M# CHAR, OP CHAR,
      LBRTYPE CHAR, LBRDATE DATE, TIMEON TIME, TIMEOFF TIME,
      ELAPSED INTEGER, APPLIED RATIONAL, PRODUCED INTEGER,
      REJECTED INTEGER, STATUS CHAR },

    KEY { EMP#, WO#, SEQ, OP, M#, LBRTYPE, LBRDATE, TIMEON },     KEY { EMP#, WO#, SEQ, OP, M#, LBRTYPE, LBRDATE, TIMEOFF }; /* descriptions of attributes

    EMP# identifies the employee who performed the activity.

    WO# identifies the work order to which the activity applies.

    SEQ is the sequence number in the work order.

    OP is the operation performed.

    M# identifies the machine on which the activity was performed.

    LBRTYPE identifies the type of labor: S[etup], D[irect],     R[ework].

    LBRDATE identifies the date on which the labor activity was     performed. Activities performed after midnight on a shift that     starts before midnight are applied to the previous day.

    TIMEON and TIMEOFF are the times the labor activity began and     ended respectively. Labor activities that are currently being     performed have the same TIMEON and TIMEOFF. TIMEOFF can be     less than TIMEON when an activity starts before and ends after     midnight.

    ELAPSED is the elapsed working time in minutes. Working time     does not include breaks or lunch. ELAPSED is always less than     1440 minutes. ELAPSED is zero when TIMEON and TIMEOFF are the     same.

    APPLIED is the working time in hours that is to be applied to     the work order. (This is not necessarily the same as     ELAPSED/60. An employee may have been on more than one job at     the same time, for example, monitoring several different     machines at the same time.) Like ELAPSED, APPLIED is zero when     TIMEON and TIMEOFF are the same.

    PRODUCED and REJECTED are the quantities of the WIP part that     were produced and rejected as a result of the labor activity.     It is possible for these to be zero--there are operations that     can take more than one shift to complete, or the activity may     have begun late in the shift. These are also zero when TIMEON     and TIMEOFF are the same.

    STATUS is one of O[pen], C[losed], or A[pproved]. STATUS is     O[pen] for labor activities that are currently being performed;     STATUS is C[losed] for activities that are no longer being     performed; STATUS is A[pproved] for activities that have been     approved by management.

  • foreign key and inclusion dependency declarations */ CONSTRAINT FK_LABOR_WORKORDER ( COUNT ( LABOR { WO# } )
    • COUNT ( ( ( LABOR { WO# } ) JOIN WORKORDER ) { WO# } ) ); CONSTRAINT IND_LABOR_WODETAIL ( COUNT ( LABOR WHERE ( STATUS != 'A' ) { WO#, SEQ, OP } )
      • COUNT ( LABOR WHERE ( STATUS != 'A' ) { WO#, SEQ, OP } JOIN WODETAIL ) { WO#, SEQ, OP } ) );
/* state constraints to implement the rules described above */

  CONSTRAINT IS_EMPTY ( LABOR WHERE NOT ( LBRTYPE = 'S'     OR LBRTYPE = 'D' OR LBRTYPE = 'R' ) );   CONSTRAINT IS_EMPTY ( LABOR WHERE ELAPSED < 0     OR ELAPSED >= 1440 );   CONSTRAINT IS_EMPTY ( LABOR WHERE APPLIED < 0.0     OR APPLIED >= 24.0 );   CONSTRAINT IS_EMPTY ( LABOR WHERE PRODUCED < 0     OR REJECTED < 0 );

  CONSTRAINT IS_EMPTY ( LABOR WHERE NOT ( STATUS = 'O'     OR STATUS = 'C' OR STATUS = 'A' ) );   CONSTRAINT IS_EMPTY ( LABOR WHERE ( TIMEON = TIMEOFF     AND STATUS <> 'O' ) );

  CONSTRAINT IS_EMPTY ( LABOR WHERE ( STATUS = 'O'     AND ( TIMEON <> TIMEOFF OR ELAPSED <> 0 OR APPLIED <> 0.0       OR PRODUCED <> 0 OR REJECTED <> 0 ) ) );

Below are five transition constraints that need to be declared to implement specific business rules:

Transition constraint #1:
  STATUS can only transition from O[pen] to C[losed]     or from C[losed] to A[pproved].

  Business rules implemented:

  • Closed activities cannot be reopened; instead, a new activity must be opened.
  • Labor activities can't be approved while they’re being performed.

Transition constraint #2:
  Labor activities with STATUS A[pproved] cannot be INSERTed,     UPDATEd or DELETEd.

  Business rules implemented:

  • Approved labor activities have to have been reviewed before they can be approved. They can't have been reviewed if they hadn't already been recorded in the database with STATUS C[losed].
  • Labor activities that have been approved affect WIP (work in progress inventory); as a consequence, once an activity has been approved, it cannot be altered or eliminated. Corrections to offset activities approved in error are accomplished through auditable journal entries recorded elsewhere in the database.

Transition constraint #3:
  When STATUS transitions from C[losed] to A[pproved], only STATUS     transitions from C[losed] to A[pproved].

  Business rule implemented:

  • A labor activity that is being approved has to have been reviewed before it can be approved. That can't have happened if any component other than STATUS differs.

Transition constraint #4:
  LBRDATE can't be different.

  Business rule implemented:

Transition constraint #5:
  Updates to SEQ in WODETAILS must cascade into LABOR--that is,   updates to SEQ must also be reflected in each referencing labor   activity.

  Business rule implemented:

  • When the sequence of operations required to manufacture a part is altered, all labor activities that haven't yet been approved must refer to the new sequence.

DISCUSSION It is not possible to implement the above transition constraints using primed relvar names. The keys are compound, and it's possible for both keys to be updated. For example, Joe entered the wrong work order when he clocked onto labor. When he tried to clock off, the system couldn't find the work order, so the next day an entry showed up on his supervisor's error log. The labor activity from the day before remained open. After determining what actually occurred, the supervisor specifies the correct work order number along with the time off, the quantities produced and rejected, the elapsed time and the hours to be applied, setting STATUS to C[losed].

Before: RELATION
{ TUPLE { EMP# '22', WO# '2343', SEQ 12, OP '46', M# 'M3',
    LBRTYPE 'D', LBRDATE '2010-06-30', TIMEON '08:00',     TIMEOFF '08:00', ELAPSED 0, APPLIED 0.0,     PRODUCED 0, REJECTED 0, STATUS 'O' } } After: RELATION
{ TUPLE { EMP# '22', WO# '2334', SEQ 12, OP '46', M# 'M3',
    LBRTYPE 'D', LBRDATE '2010-06-30', TIMEON '08:00',     TIMEOFF '15:00', ELAPSED 330, APPLIED 5.5,     PRODUCED 33, REJECTED 2, STATUS 'C' } } The work order number is an element of both candidate keys, so both keys differ as a result of the supervisor's update. How would it be possible to match the tuple in the relvar as it was prior to the update under consideration to the tuple in the relvar as it would be after the update to verify that LBRDATE isn't different?

Just in case you're tempted, it can't be assumed that the update under consideration involves only one tuple. There may have been a multiple assignment, and constraints are checked at statement boundaries--that is, after the entire multiple assignment, not between its component assignments. For example, when the employee tried to clock on today, an error occurred because he was already clocked on (the activity from yesterday remained open), so the supervisor also had to manually enter the activity.

LABOR Before: RELATION
{ TUPLE { EMP# '22', WO# '2343', SEQ 12, OP '46', M# 'M3',
    LBRTYPE 'D', LBRDATE '2010-06-30', TIMEON '08:00',     TIMEOFF '08:00', ELAPSED 0, APPLIED 0.0,     PRODUCED 0, REJECTED 0, STATUS 'O' } } LABOR After: RELATION
{ TUPLE { EMP# '22', WO# '2334', SEQ 12, OP '46', M# 'M3',
    LBRTYPE 'D', LBRDATE '2010-06-30', TIMEON '08:00',     TIMEOFF '15:00', ELAPSED 330, APPLIED 5.5,     PRODUCED 33, REJECTED 2, STATUS 'C' },   TUPLE { EMP# '22', WO# '2343', SEQ 12, OP '46', M# 'M3',     LBRTYPE 'D', LBRDATE '2010-07-01', TIMEON '08:00',     TIMEOFF '08:00', ELAPSED 0, APPLIED 0.0,     PRODUCED 0, REJECTED 0, STATUS 'O' } } At first glance, this appears to be a clear violation of Transition Constraint #4 (Every component of the tuples with WO# '2343' are identical except for LBRDATE!), but it isn't a violation. The tuple with WO#:2343 before matches the tuple with WO#2334 after, and the tuple with WO#2343 represents a new labor activity that began on '2010-07-01'. How can that be determined, given just the relvar as it was prior to the update under consideration and the relvar as it is afterward?

Let's look at another situation. In this scenario, Joe started filing off burrs yesterday and informed his supervisor that it is taking forever. The supervisor then tells Joe to grind each piece smooth, and then file off the burrs. When Joe tried to clock off operation '67', 'GRIND SMOOTH' it bombed because there wasn't already an open activity for operation '67'. When Joe tried clocking on today, it bombed because as far as the system was concerned, he was still filing off burrs since yesterday. The supervisor now has to adjust the work order to reflect the additional step, close the open labor activity from yesterday with the correct operation, and open up a new activity for Joe today.

WODETAIL Before: RELATION
{ TUPLE { WO# '2334', SEQ 11, OP '23', INSTRUCT 'CUT OFF EXCESS' },
  TUPLE { WO# '2334', SEQ 12, OP '46', INSTRUCT 'FILE OFF BURRS' },   TUPLE { WO# '2343', SEQ 12, OP '46', INSTRUCT 'FILE OFF BURRS' } } LABOR Before: RELATION
{ TUPLE { EMP# '22', WO# '2334', SEQ 12, OP '46', M# 'M3',
    LBRTYPE 'D', LBRDATE '2010-06-30', TIMEON '08:00',     TIMEOFF '08:00', ELAPSED 0, APPLIED 0.0,     PRODUCED 0, REJECTED 0, STATUS 'O' } } WODETAIL After: RELATION
{ TUPLE { WO# '2334', SEQ 11, OP '23', INSTRUCT 'CUT OFF EXCESS' },

  TUPLE { WO# '2334', SEQ 12, OP '67', INSTRUCT 'GRIND SMOOTH' },
  TUPLE { WO# '2334', SEQ 13, OP '46', INSTRUCT 'FILE OFF BURRS' },
  TUPLE { WO# '2343', SEQ 12, OP '46', INSTRUCT 'FILE OFF BURRS' } }

LABOR After: RELATION
{ TUPLE { EMP# '22', WO# '2334', SEQ 12, OP '67', M# 'M3',
    LBRTYPE 'D', LBRDATE '2010-06-30', TIMEON '08:00',     TIMEOFF '17:00', ELAPSED 390, APPLIED 6.5,     PRODUCED 313, REJECTED 0, STATUS 'C' },   TUPLE { EMP# '22', WO# '2334', SEQ 13, OP '46', M# 'M3',     LBRTYPE 'D', LBRDATE '2010-07-01', TIMEON '08:00',     TIMEOFF '08:00', ELAPSED 0, APPLIED 0.0,     PRODUCED 0, REJECTED 0, STATUS 'O' } } In this case, given just the before and after values of WODETAIL and LABOR, it's not clear whether the tuple in LABOR as it was prior to the update matches the tuple with OP '67' or the tuple with OP '46' in LABOR after. If the OP is what's different, then the update doesn't violate Transition Constraint 4; if the SEQ is what's different, then it does violate it. It's also not clear in WODETAIL whether SEQ 12 or SEQ 13 for WO# '2334' was added. If SEQ 12 was added, then what had SEQ 12 in LABOR Before must be what has SEQ 13 in LABOR After, otherwise Transition Constraint #5 is violated, but that violates Transition Constraint #4. If SEQ 13 was added, then what had SEQ 12 in LABOR before must be what has SEQ 12 in LABOR After, otherwise Transition Constraint #5 is violated, and here Transition Constraint #4 is not violated. These are two possible mappings from the tuples in the initial state to the tuples in the terminal state of the database, one that violates transition constraints and one that doesn't.

THE PROBLEM There can be more than one mapping from the tuples in database as it was prior to an update to the tuples in the database as it would be if the update were to succeed. Each mapping constitutes a possible transition. Unless there is a mechanism to determine exactly which mapping should be checked, it is not possible to enforce some constraints that permit one mapping but not another.

One possible solution is to introduce permanent surrogates, but that requires altering relvar headings, along with additional machinery to prevent surrogates from being reused. Another possible solution would be to introduce temporal attributes and maintain histories for entities, thereby treating transition constraints as temporal constraints, but again, that requires altering relvar headings and may also require the introduction of permanent surrogates.

There is a better way: simply let the user inform the system which mapping should be checked. In a multiple assignment consisting of just DELETEs, INSERTs and UPDATEs, there is enough information to determine exactly which mapping the user selected. Here's how:

To start out with, there are two collections of tuples: the tuples in the database as it was prior to the update under consideration and the tuples in the database as it would be if the multiple assignment were to succeed.

The tuples in the database as it was prior to the update that are targeted by a DELETE don't correspond to any tuples in the database as would be if the multiple assignment were to succeed.

The tuples in the database as it would be if the multiple assignment succeeds that are the result of an INSERT don't correspond to any tuples in the database as it was prior to the update. (I don't think it makes sense to allow duplicate tuples to be INSERTed.)

There is a one-to-one correspondence--a bijective mapping--from the tuples in the relations in the database as it was prior to the update that are targeted by an UPDATE to the tuples in the relations in the database as it would be if the multiple assignment were to succeed. The tuples that are targeted are EXTENDed with additional attributes, and at that point a one-to-one correspondence is established. Renaming occurs, and then that which established the one-to-one correspondence is projected away.

All that are left are the tuples that were not targeted by a DELETE, an INSERT, or an UPDATE, and those must appear unchanged in both the database as it was prior to the update and the database as it would be if the multiple assignment were to succeed.

It may be significant that a tuple was the target of an UPDATE or a "DELETE, INSERT" multiple assignment, even though there is no apparent difference--that is, even if the same tuple is an element of both the database as it was prior to the update and the database as it would be if the multiple assignment were to succeed. For example, some renters may already be paying $750 per month for a single bedroom apartment when a set-based update sets the rent at $750 per month for all single bedroom apartments. It wouldn't make sense to send legally required notifications of rent increases to renters that were already paying the new rate.

THE MECHANISM Following the convention introduced in the section, RM VERY STRONG SUGGESTION 4: TRANSITION CONSTRAINTS, I propose that three additional relvar names be significant during an update. Given an arbitrary database D, for each relvar Ri in D, {i = 1, 2, …, n}, in addition to Ri' which is understood to refer to the corresponding relvar as it was prior to the update under consideration:

The relvar name Ri- is understood to refer to the union of all tuples in Ri as it was prior to the update that are identified by the WHERE of any DELETEs that target Ri.

The relvar name Ri+ is understood to refer to the union of all of the relation expressions for INSERTs that target Ri.

The relvar name Ri~ is understood to refer to the information supplied by all UPDATEs that target Ri. What follows is the information supplied by means of UPDATE:

  1. WHERE <bool exp> identifies which tuples in Ri as it was prior to the update were the target of the update.
  2. For each tuple identified by a WHERE <bool exp>, <attribute assign commalist> identifies which components of the tuple are the target of the update.
  3. For each tuple identified by a WHERE <bool exp>, <attribute assign commalist> specifies the proposed value for each component targeted by the update.

There are several ways to make this information available for defining transition constraints. What I propose is that each tuple in Ri~ have a set of components with attribute names postfixed with ' that correspond to the components of the tuple as it was prior to the update, a set of Boolean components with attribute names postfixed with ~ that indicate whether each component in the tuple was targeted by the update, and a set of components with unaltered attribute names which correspond to the components of the tuple as it would be if the update were to succeed.

Since

  <relvar target> := <relation exp>

is semantically equivalent to the multiple assignment,

  DELETE <relvar target>, INSERT <relvar target> <relation exp>,

the description of Ri-, Ri+ and Ri~ above apply also to := assignments, the equivalent multiple assignment being substituted prior to evaluation.

With this mechanism, not only can it be determined exactly what is different, but also the exact scope of the transition can be determined, regardless of whether the components of tuples within that scope actually differ.

Now that the mechanism has been defined, the transition constraints required in the example above can be declared:

Transition Constraint #1 would be

  CONSTRAINT IS_EMPTY ( LABOR~
    WHERE ( STATUS = 'A' AND STATUS' = 'O' )

      OR ( STATUS = 'O' AND STATUS' = 'C' )
      OR ( STATUS' = 'A' AND STATUS <> 'A' ) );

  We're only concerned with UPDATEs here.

Transition Constraint #2 would be

  CONSTRAINT IS_EMPTY ( ( LABOR- { STATUS } WHERE STATUS = 'A' )     UNION ( ( LABOR~ WHERE STATUS' = 'A' ) { STATUS } )     UNION ( LABOR+ { STATUS } WHERE STATUS ='A' ) );   The only permissible way that STATUS can be 'A' is if it has been   UPDATEd from 'C' to 'A' at some point.

Transition Constraint #3 would be

  CONSTRAINT IS_EMPTY ( LABOR~ WHERE STATUS = 'A' AND STATUS' = 'C'     AND ( EMP# <> EMP#' OR WO# <> WO#' OR SEQ <> SEQ'

      OR OP <> OP' M# <> M#' OR LBRTYPE <> LBRTYPE'
      OR LBRDATE <> LBRDATE' OR TIMEON <> TIMEON'
      OR TIMEOFF <> TIMEOFF' OR ELAPSED <> ELAPSED'
      OR APPLIED <> APPLIED' OR PRODUCED <> PRODUCED'
      OR REJECTED <> REJECTED' ) );


Transition Constraint #4 would be

  CONSTRAINT IS_EMPTY ( LABOR~ WHERE LBRDATE <> LBRDATE' );

Transition Constraint #5 would be

  CONSTRAINT COUNT ( ( LABOR~
      WHERE STATUS' <> 'A' ) { WO#, WO#', SEQ, SEQ' OP, OP' } )

  • COUNT ( ( ( LABOR~ WHERE STATUS' <> 'A' ) { WO#, WO#', SEQ, SEQ' OP, OP' } ) JOIN ( WODETAIL~ { WO#, WO#', SEQ, SEQ' OP, OP' } ) );
  This constraint enforces the constraint specified by a cascading   update. What is objectionable about cascading referential actions   is that changes to one real relvar affects other real revars. A   cascading delete or update can easily be transformed into a   multiple assignment, but a cascading referential action does more   than that. It imposes a constraint that prevents updates that   don't cascade. This mechanism for declaring transition   constraints makes it possible for those constraints to be enforced   as well.

A brief sketch of how to arrive at what each additional relvar name references:

Given an arbitrary database D,
  for each relvar Ri in D, {i = 1, 2, …, n}:

Populating Ri- and Ri+ are pretty straightforward.   Given DELETE Ri WHERE delete_bool_exp_Ri_k, {k = 1, 2, ..., m}.   Populate Ri- by evaluating the union of the expression:

      Ri' WHERE delete_bool_exp_Ri_k
    for each k,{k = 1, 2, ..., m}

  Given INSERT Ri insert_relation_exp_Ri_k, {k = 1, 2, ..., m}.   Populate Ri+ by evaluating the union of

      insert_relation_exp_Ri_k
    for each k, {k = 1, 2, ..., m}

Populating Ri~ is a bit more involved.
  Given UPDATE Ri
    WHERE update_bool_exp_Ri_k ( attribute_assign_commalist_Ri_k ),       {k = 1, 2, ..., m}.
  For each UPDATE,

  1. Construct a relation literal, relation_literal_Ri_k, consisting of a single tuple, such that: for each Aj, {j = 1, 2, ..., l} in Ri, if Aj appears as an attribute target in attribute_assign_commalist_Ri_k, Aj~ TRUE appears as a component of the tuple in relation_literal_Ri_k, otherwise, Aj~ FALSE appears as a component.
  2. Transform attribute_assign_commalist_Ri_k into an <extend add commalist>, complete_extend_add_commalist_Ri_k, such that: for each Aj, {j = 1, 2, ..., l}, if Aj appears as an attribute target in attribute_assign_commalist_Ri_k, the <attribute assign>, Aj := <exp> is transformed into the <extend add>, <exp> AS Aj, in complete_extend_add_commalist_Ri_k, otherwise the <extend add>, Aj' AS Aj, appears in complete_extend_add_commalist_Ri_k.
  3. Populate Ri~ by evaluating the union of the expression: EXTEND ( ( Ri' WHERE update_bool_exp_Ri_k ) RENAME { SUFFIX '' AS '''' } ) JOIN relation_literal_Ri_k ) ADD { complete_extend_add_commalist_Ri_k ); for each k, {k = 1, 2, ..., m}. Note: RENAME (SUFFIX '' AS '''') adds a single ' to each attribute name, the convention being that '' stands for ' within a character literal.

FURTHER ENHANCEMENT Another enhancement to D that I think would be worthwhile would be the addition of FROM in DELETE and UPDATE.

DELETE <relvar target> [ FROM <relation exp> ] [ WHERE <bool exp> ]

UPDATE <relvar target> [ FROM <relation exp> ]

    [ WHERE <bool exp> ] ( <attribute assign commalist> )

In both cases, <relvar target> is understood to have been joined to <relation exp>, and <bool exp> and <attribute assign commalist> permit <attribute ref>s (but not <attribute target>s) not just from the attributes of <relvar target> but instead from the attributes of ( <relvar target> JOIN <relation exp> ). If FROM <relation exp> is omitted, <relation exp> is understood to be TABLE_DEE.

The tuples in <relvar target> that are the target of DELETE or UPDATE becomes

<relvar target> SEMIJOIN ( ( <relvar target> JOIN <relation exp> )   WHERE <bool exp> ).

In the preceding section,
  Ri' WHERE delete_bool_exp_Ri_k
becomes
  Ri' SEMIJOIN ( ( Ri' JOIN from_expression_Ri_k )     WHERE delete_bool_exp_Ri_k )
and
  EXTEND ( ( Ri' WHERE update_bool_exp_Ri_k )     RENAME { SUFFIX '' AS '''' } )

      JOIN relation_literal_Ri_k )
        ADD { complete_extend_add_commalist_Ri_k );
becomes
  EXTEND ( ( Ri' SEMIJOIN ( ( Ri' JOIN from_expression_Ri_k )     WHERE update_bool_exp_Ri_k ) ) RENAME { SUFFIX '' AS '''' } )
      JOIN relation_literal_Ri_k )
        ADD { complete_extend_add_commalist_Ri_k );
Received on Sun Sep 19 2010 - 03:42:32 CEST

Original text of this message