Re: circular relationships ok?

From: Alexandr Savinov <spam_at_conceptoriented.com>
Date: Fri, 03 Mar 2006 09:48:55 +0100
Message-ID: <440802f8$1_at_news.fhg.de>


David Portas wrote:
> Alexandr Savinov wrote:

>> David Portas wrote:
>>> Let's try an even simpler and even more common example: An
>>> organizational hierarchy.
>>>
>>> CREATE TABLE Employees
>>>  (employee_id INTEGER PRIMARY KEY,
>>>  employee_name VARCHAR(50) NOT NULL,
>>>  manager_id INTEGER NOT NULL
>>>   REFERENCES Employees (employee_id),
>>>   CHECK (employee_id =0 OR employee_id <> manager_id));
>>>
>>> The business rule is that each employee has a manager who is also an
>>> employee.
>> An arrow goes from Employee up, then turns down and then comes to this
>> concept from down:
>>
>>          /\
>> Employee |
>>          \/
>>
>> That is very simple. Loops (self-references) can be used because they
>> are very convenient and frequently used. And they have clear and
>> unambiguous semantics. My opinion that loops should be permitted but not
>> cycles. Cycles do not have unambiguous semantics. If you can explain to
>> your database how to interpret them then cycles can also be used but I
>> do not know such an interpretation. And, IMHO, it is reason why
>> contemporary database do not manage such relationships - it is a task of
>> the programmer.
>>
>> --
>> http://conceptoriented.com

>
> I understand now that you don't recognize cycles because your diagrams
> are directed graphs. Does that mean you can only support common
> cardinality and not N-cardinality?

What do you mean by "common cardinality and not N-cardinality"?

The concept graph is a directed *acyclic* graph.

> For example, suppose I want to implement a constraint that each Order
> must have exactly one Invoice and each Invoice must have exactly one
> Order. That seems like a reasonable business rule, even if it's a
> slightly unusual one.

Theoretically, any cycle can be represented indirectly by using an additional common subconcept. In this case we can do it as follows:

Order    Invoice
  |         |
  OrderInvoice

Now for each order there is an invoice and vice versa (we of course need a constraint to enforce the one-to-one relationship). For a database system that is able to maintain and make use of such a graph, this structure is then unambiguous.
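
A minimal sketch of this structure, using Python's sqlite3 module only as a convenient way to run SQL. The table and column names (Orders, Invoices, OrderInvoice, order_id, invoice_id) and the helper functions are assumptions made for illustration; they are not part of the concept-oriented model or of any schema discussed in this thread. The one-to-one constraint is expressed here with two UNIQUE columns.

```python
import sqlite3

def make_db():
    """Build an in-memory database with the hypothetical three-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
    conn.executescript("""
        CREATE TABLE Orders   (order_id   INTEGER PRIMARY KEY);
        CREATE TABLE Invoices (invoice_id INTEGER PRIMARY KEY);
        -- Common subconcept: it references both parents, neither parent references it.
        CREATE TABLE OrderInvoice (
            order_id   INTEGER NOT NULL UNIQUE REFERENCES Orders (order_id),
            invoice_id INTEGER NOT NULL UNIQUE REFERENCES Invoices (invoice_id)
        );
        INSERT INTO Orders   VALUES (1), (2);
        INSERT INTO Invoices VALUES (10), (20);
    """)
    return conn

def try_pair(conn, order_id, invoice_id):
    """Try to relate one order to one invoice; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO OrderInvoice (order_id, invoice_id) VALUES (?, ?)",
                (order_id, invoice_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With a fresh database, try_pair(conn, 1, 10) succeeds, while a following try_pair(conn, 1, 20) or try_pair(conn, 2, 10) is rejected by the UNIQUE columns. Note that this sketch only guarantees at most one partner on each side; it does not by itself force every order to have an invoice.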

In practice, however, such an implementation is frequently too expensive. So loops (self-references) can easily be permitted directly. Equivalent relationships such as the one between Orders and Invoices can also be permitted, provided that there is a concrete interpretation (operational semantics) for them. The main problem is that the database needs to know how to manage such relationships. In particular, what should it do if you delete an order or an invoice? How should these operations be propagated?

> You say that cycles should not be permitted. I'm not clear if that is a
> limitation of your model or if it is just a recommendation. Do you mean
> that I *cannot* enforce my business rule that invoices and orders must
> match one for one?

Yes, the absence of cycles is deliberate and is one of the main properties of the concept-oriented model. It can, however, also be viewed as a recommendation or a general pattern for data modeling (cycles are evil).

You can enforce your business rule that your data items are mutually *related*. However, you are not allowed to mutually *reference* items. In other words, cycles cannot be implemented directly via references, because references are managed by the database system, which has built-in semantics for them. However, you can implement cycles and any other arbitrary relationships (according to your problem domain and its business rules) indirectly using references. (Please note again that loops and some special kinds of cycles should be permitted for reasons of performance and convenience.)

You always say that you do not want to adapt your business rules to any data model, and in the next moment you adapt them to the SQL query language. I think we always adapt the real world to some model or conventions. Why should I use classes in OOP if I could use assembly language, which is much more powerful and has almost no limitations? According to your logic I have to use assembly language, because with it I am able to directly implement whatever an expert in the problem domain says. Having no cycles in COM is a constraint which allows us to automate many tasks in data management and analysis. When the database system knows that the model has no cycles, it can do many such tasks automatically. Otherwise we have to do them ourselves. That is the main argument. Of course, we lose some flexibility and freedom; however, we also lose the freedom to make errors, because only a small portion of specialists can write correct SQL queries and design a correct schema.

I suppose that you are mixing up two issues:

  1. a formal model, and
  2. using this model to model a graph (with cycles) or any other problem domain specific structure.

The first is what your database will manage and what it knows. The second is what your database does not know and does not care about. It is your responsibility to maintain the consistency of your graph. In particular, in the concept-oriented model you can easily model any graph, hypergraph or any other structure - the database will simply not know about that. Edges are common subconcepts of nodes. COM will manage only its acyclic directed graph of concepts.
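
The distinction between the two levels can be sketched as follows. This is only an illustration under stated assumptions: the function has_cycle, the dictionary layout (concept name mapped to the concepts it references) and the concept names Node and Edge are hypothetical and are not taken from any COM implementation. Loops (self-references) are skipped, in line with the position that they should be permitted.

```python
def has_cycle(refs):
    """Return True if the concept graph contains a cycle other than a self-loop.

    refs maps a concept name to the list of concepts it references.
    """
    state = {}  # concept -> "active" (on the current path) or "done"

    def visit(concept):
        state[concept] = "active"
        for parent in refs.get(concept, ()):
            if parent == concept:
                continue  # a loop (self-reference) is permitted
            seen = state.get(parent)
            if seen == "active":
                return True  # came back to a concept on the current path
            if seen is None and visit(parent):
                return True
        state[concept] = "done"
        return False

    return any(state.get(c) is None and visit(c) for c in refs)

# Level 1: the schema the database manages. Edge is a common subconcept of Node
# (it references Node twice), so the concept graph is acyclic.
schema = {"Node": [], "Edge": ["Node", "Node"]}

# Level 2: the data. These Edge records describe a cyclic graph a -> b -> a,
# which the database stores without knowing that it is a cycle.
edges = [("a", "b"), ("b", "a")]
```

Here has_cycle(schema) is False even though the stored edges form a cycle, whereas two concepts that reference each other directly, such as {"Orders": ["Invoices"], "Invoices": ["Orders"]}, are reported as a cycle.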

-- 
http://conceptoriented.com
Received on Fri Mar 03 2006 - 09:48:55 CET
