Date: Fri, 03 Mar 2006 09:48:55 +0100
From: Alexandr Savinov <spam@conceptoriented.com>
User-Agent: Thunderbird 1.5 (Windows/20051201)
MIME-Version: 1.0
Newsgroups: comp.databases.theory
Subject: Re: circular relationships ok?
References: <du4k8t$te7$1@nntp.fujitsu-siemens.com>   <du6v7s$41i$3@nntp.fujitsu-siemens.com>   <440705d2$1@news.fhg.de>   <1141312379.065049.193100@p10g2000cwp.googlegroups.com>   <440718b9$1@news.fhg.de>   <1141317138.737621.294520@i40g2000cwc.googlegroups.com>   <1141318257.897955.150140@v46g2000cwv.googlegroups.com>   <440727b9$1@news.fhg.de> <1141333040.551974.105530@e56g2000cwe.googlegroups.com>
In-Reply-To: <1141333040.551974.105530@e56g2000cwe.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID: <440802f8$1@news.fhg.de>
Organization: Fraunhofer Gesellschaft (http://www.fraunhofer.de/)

David Portas wrote:
> Alexandr Savinov wrote:
>> David Portas wrote:
>>> Let's try an even simpler and even more common example: An
>>> organizational hierarchy.
>>>
>>> CREATE TABLE Employees
>>>  (employee_id INTEGER PRIMARY KEY,
>>>  employee_name VARCHAR(50) NOT NULL,
>>>  manager_id INTEGER NOT NULL
>>>   REFERENCES Employees (employee_id),
>>>   CHECK (employee_id =0 OR employee_id <> manager_id));
>>>
>>> The business rule is that each employee has a manager who is also an
>>> employee.
>> An arrow goes from Employee up, then turns down and then comes to this
>> concept from down:
>>
>>          /\
>> Employee |
>>          \/
>>
>> That is very simple. Loops (self-references) can be used because they
>> are very convenient and frequently used. And they have clear and
>> unambiguous semantics. My opinion that loops should be permitted but not
>> cycles. Cycles do not have unambiguous semantics. If you can explain to
>> your database how to interpret them then cycles can also be used but I
>> do not know such an interpretation. And, IMHO, it is reason why
>> contemporary database do not manage such relationships - it is a task of
>> the programmer.
>>
>> --
>> http://conceptoriented.com
> 
> I understand now that you don't recognize cycles because your diagrams
> are directed graphs. Does that mean you can only support common
> cardinality and not N-cardinality?

What do you mean by "cardinality and not N-cardinality"?

The concept graph is a directed *acyclic* graph.

> For example, suppose I want to implement a constraint that each Order
> must have exactly one Invoice and each Invoice must have exactly one
> Order. That seems like a reasonable business rule, even if it's a
> slightly unusual one.

Theoretically any cycle can be represented indirectly by using an 
additional common subconcept. In this case we can do it as follows:

Order Invoice
  |       |
OrderInvoice

Now for each order there is an invoice and vice versa (we of course need 
a constraint to enforce the one-to-one relationship). For a database 
system that is able to maintain and make use of such a graph, this 
structure is then unambiguous.
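
A minimal sketch of this structure, assuming SQLite driven from Python's 
sqlite3 module (chosen only for illustration; the table and column names 
and the helper pair_rejected are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks references only when enabled
conn.executescript("""
CREATE TABLE Orders   (order_id   INTEGER PRIMARY KEY);
CREATE TABLE Invoices (invoice_id INTEGER PRIMARY KEY);
-- Common subconcept: it references both superconcepts, never the reverse,
-- so the schema graph stays acyclic.
CREATE TABLE OrderInvoice (
    order_id   INTEGER NOT NULL UNIQUE REFERENCES Orders (order_id),
    invoice_id INTEGER NOT NULL UNIQUE REFERENCES Invoices (invoice_id)
);
INSERT INTO Orders VALUES (1);
INSERT INTO Orders VALUES (2);
INSERT INTO Invoices VALUES (10);
INSERT INTO OrderInvoice VALUES (1, 10);
""")

def pair_rejected(order_id, invoice_id):
    """Return True if the database refuses to pair this order and invoice."""
    try:
        conn.execute("INSERT INTO OrderInvoice VALUES (?, ?)",
                     (order_id, invoice_id))
    except sqlite3.IntegrityError:
        return True
    return False

# Invoice 10 already belongs to order 1, so a second pairing is refused.
print(pair_rejected(2, 10))
```

The two UNIQUE constraints only guarantee *at most* one partner on each 
side. That every order actually has an invoice (order 2 above has none) 
still has to be checked separately.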

In practice, however, such an implementation is frequently too expensive. 
In particular, loops (self-references) can easily be permitted. 
Equivalence relationships such as the one between Orders and Invoices can 
also be permitted, provided that there is a concrete interpretation 
(operational semantics) for them. The main problem is that the database 
needs to know how to manage such relationships, in particular what to do 
if you delete an order or an invoice. How should these operations be 
propagated?
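
To make the propagation question concrete, here is a sketch of two 
possible interpretations, again assuming SQLite via Python's sqlite3 
module. The helpers build, delete_order and count are hypothetical, and 
neither policy is claimed to be the right one:

```python
import sqlite3

def build(on_delete):
    """Create Orders, Invoices and their common subconcept with a delete policy."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
    CREATE TABLE Orders   (order_id   INTEGER PRIMARY KEY);
    CREATE TABLE Invoices (invoice_id INTEGER PRIMARY KEY);
    CREATE TABLE OrderInvoice (
        order_id   INTEGER NOT NULL UNIQUE
            REFERENCES Orders (order_id) ON DELETE {on_delete},
        invoice_id INTEGER NOT NULL UNIQUE
            REFERENCES Invoices (invoice_id) ON DELETE {on_delete}
    );
    INSERT INTO Orders VALUES (1);
    INSERT INTO Invoices VALUES (10);
    INSERT INTO OrderInvoice VALUES (1, 10);
    """)
    return conn

def delete_order(conn, order_id):
    """Try to delete an order; return False if the database refuses."""
    try:
        conn.execute("DELETE FROM Orders WHERE order_id = ?", (order_id,))
        return True
    except sqlite3.IntegrityError:
        return False

def count(conn, table):
    """Number of rows in one of the three tables above."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

cascade = build("CASCADE")    # deleting an order also deletes its pairing
restrict = build("RESTRICT")  # deleting a paired order is refused
```

With CASCADE the order and the pairing disappear but the invoice stays 
behind without an order. With RESTRICT the delete is simply rejected. 
Which behaviour is wanted is exactly the operational semantics the 
database has to be told.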

> You say that cycles should not be permitted. I'm not clear if that is a
> limitation of your model or if it is just a recommendation. Do you mean
> that I *cannot* enforce my business rule that invoices and orders must
> match one for one?

Yes, the absence of cycles is deliberate and is one of the main 
properties of the concept-oriented model. It can, however, also be viewed 
as a recommendation or a general pattern for data modeling (cycles are evil).

You can enforce your business rule that your data items are mutually 
*related*. However, you are not allowed to make items mutually 
*reference* each other. In other words, cycles cannot be implemented 
directly via references, because references are managed by the database 
system, which has built-in semantics for them. However, you can 
implement cycles and any other arbitrary relationships (according to 
your problem domain and its business rules) on top of references, for 
example through a common subconcept. (Please notice again that loops and 
some special kinds of cycles should be permitted for reasons of 
performance and convenience.)

You always say that you do not want to adapt your business rules to any 
data model. And in the next moment you adapt them to the SQL query 
language. I think we always adapt the real world to some model or 
conventions. Why should I use classes in OOP if I could use assembly 
language, which is much more powerful and has almost no limitations? 
According to your logic I have to use assembly language because then I am 
able to directly implement whatever an expert in the problem domain says. 
Having no cycles in COM is a constraint which allows us to automate many 
tasks in data management and analysis. When the database system knows 
that the model has no cycles it can do many such tasks automatically. 
Otherwise we have to do them ourselves. That is the main argument. Of 
course, we lose some flexibility and freedom, but we also lose the 
freedom to make errors, because only a small portion of specialists can 
write correct SQL queries and design a correct schema.

I suppose that you are mixing two issues:

1. a formal model, and

2. using this model to model a graph (with cycles) or any other problem 
domain specific structure.

The first is what your database will manage and what it knows. The 
second is what your database does not know and does not care about. It 
is your responsibility to maintain the consistency of your graph. In 
particular, in the concept-oriented model you can easily model any 
graph, hypergraph or any other structure - the database will simply not 
know about that. Edges are common subconcepts for nodes. COM will manage 
only its acyclic directed graph of concepts.
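
As a rough illustration of what "the database knows that the model has 
no cycles" could mean, here is a sketch in plain Python of a check over 
the concept graph. The function find_cycle and the dictionary layout 
(each concept mapped to the concepts it references) are my assumptions 
for this example, not part of COM itself. Loops are deliberately 
tolerated, as argued above:

```python
def find_cycle(references):
    """Return the concepts along one cycle, or None if the graph is acyclic.

    `references` maps each concept to the concepts it references.
    Self-references (loops) are skipped because they are permitted.
    """
    state = {}  # concept -> "visiting" or "done"
    path = []   # concepts on the current depth-first path

    def visit(concept):
        state[concept] = "visiting"
        path.append(concept)
        for target in references.get(concept, ()):
            if target == concept:
                continue  # loop (self-reference): allowed
            if state.get(target) == "visiting":
                # Back edge: the cycle runs from target to here and back.
                return path[path.index(target):] + [target]
            if target not in state:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        state[concept] = "done"
        return None

    for concept in references:
        if concept not in state:
            cycle = visit(concept)
            if cycle:
                return cycle
    return None

# Common subconcept plus a loop on Employee: accepted.
acyclic = {"OrderInvoice": ["Order", "Invoice"], "Order": [],
           "Invoice": [], "Employee": ["Employee"]}
# Order and Invoice referencing each other: rejected.
cyclic = {"Order": ["Invoice"], "Invoice": ["Order"]}

print(find_cycle(acyclic))  # None
print(find_cycle(cyclic))   # ['Order', 'Invoice', 'Order']
```

A graph with cycles in the problem domain would then be stored as data, 
with an edge concept referencing its node concept, and this check would 
still pass because only the schema-level references are examined.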

-- 
http://conceptoriented.com
