Re: Entity and Identity

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Mon, 28 Sep 2009 15:46:21 -0300
Message-ID: <4ac10480$0$23779$9a566e8b_at_news.aliant.net>


Clifford Heath wrote:

> Walter Mitty wrote:
>
>> My initial motive in starting the thread was precisely to get some
>> kind of rational discussion going between people who find value in
>> object models and people who find value in relational models.
>
> I believe that it's possible to unite them, and I've set myself that task.

That was already done a decade ago.

> It does require a change to the programming language paradigm; my API
> has no new/delete, only assert/deny within a constellation of facts.
> Most of the rest of the O-O paraphernalia remains intact, and doesn't
> interfere with the relational and transactional nature of the system.

So you assert. I guess I will have to read your work to judge for myself. Thus far, it doesn't look very hopeful.
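
For readers trying to picture the claim, here is a minimal sketch of what an API with "no new/delete, only assert/deny within a constellation of facts" might look like. It is not Clifford's actual API, which is not shown in this thread; the names `Constellation`, `assert_fact`, `deny_fact`, `holds` and `query` are hypothetical, and the only thing taken from the thread is the "Person directs Company" fact type quoted further down.

```python
class Constellation:
    """Hypothetical sketch: a population of facts held as a set of tuples.

    Nothing is constructed or destroyed; a fact is either asserted
    (present in the set) or denied (absent from it).
    """

    def __init__(self):
        self._facts = set()

    def assert_fact(self, predicate, *roles):
        # Asserting a fact that already holds changes nothing (set semantics).
        fact = (predicate, roles)
        self._facts.add(fact)
        return fact

    def deny_fact(self, predicate, *roles):
        # Denying a fact that does not hold is a no-op, not an error.
        self._facts.discard((predicate, roles))

    def holds(self, predicate, *roles):
        return (predicate, roles) in self._facts

    def query(self, predicate):
        # All role tuples currently asserted for the given predicate.
        return sorted(roles for p, roles in self._facts if p == predicate)


c = Constellation()
c.assert_fact("directs", "Alice", "Acme")
c.assert_fact("directs", "Alice", "Acme")  # same fact again, no duplicate
c.assert_fact("directs", "Bob", "Acme")
c.deny_fact("directs", "Bob", "Acme")
print(c.query("directs"))
```

The point of the sketch is only that identity here is the fact's value: two assertions of the same fact are the same fact, which is what distinguishes assert/deny from new/delete on objects with hidden identity.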

>> When we take on the task of building an information system that
>> represents some part of the universe (read "real world"), we accept
>> as given the division of that universe into identifiable component
>> parts (what I call "entities"). This division is inherent in the
>> problem statement for the information system we are to build.
>
> Right; that's why we call it "modeling", and why it's viewed as a
> design activity rather than a descriptive one (see Simsion's PhD
> thesis for an exploration of this position).

I don't think I would go as far as calling what Walter wrote "right". I am not sure it is even wrong.

The Simsion reference is interesting, though. His conclusion that design is design is almost a "Well, duh!", except that apparently a significant number of so-called "thought leaders" espouse some sort of Platonic ideal. Scary, isn't it?

>> If my practical experience is any guide,  the community of 
>> stakeholders often has a hazy idea about just what those entities are, 
>> and very haphazard notions about how to identify them.

sigh

Those entities are figments of someone's imagination. Why the hell should anyone else have a clear view of them? Seriously.

The principal hurdle when dealing with various stakeholders is to get people to communicate without unstated assumptions. People are often so wed to their assumptions they have difficulty articulating them at all.

>> Different
>> parts of the community will use different identifiers to identify the
>> same entity, and even the same identifier to identify different
>> (although closely related) entities.
>
> To my mind this is at the root of perhaps the preeminent problem in
> delivering information systems, namely the problem of specification.
> It's for that reason that the Constellation Query Language is plain
> text, so as not to exclude anyone (capable of understanding the domain)
> from participating. The text can be synchronised with diagrams, but
> no special training is required to read and critique a CQL model.

That's a remarkable claim, and one I have heard made about a large number of quite opaque languages. I will have to look into that.

>> Sorting that confusion out is (part of) data analysis (in the case of
>> building a database), and it has to be done regardless of whether one
>> intends to take a O-O or an RM view of the data.
>
> Sadly it's often not done in O-O projects, and those projects suffer
> from the lack of it. I think *that* issue is the main reason for
> complaining about O-O...

No, the main reasons for complaining about O-O are the imprecise, heavy overloading of terms until they cease to have any useful meaning, and the general regression to a gussied-up network data model.

> but really, the problem is in the training
> of the teams, and to an extent in the languages.

The problem, really, is just a demonstrably bad model for data management.

> In CQL, it's not possible to define an entity without defining its
> identification pattern; in fact it's not possible to include more in
> the initial definition of an entity than what is required to identify
> the entity. Even the syntax enshrines the need for identification.
> There are four kinds of object type:
>
> * Value types (also known as lexical types). Because instances of
> value types are identified by a lexical (written) form, the CQL
> syntax encodes that, for example, "Name is written as String(20);"
>
> * Entity types (non-lexical types) of three forms:
>   * Subtypes, defined as for example "Employee is a kind of Person".
>     Here the identification is inherited from the first supertype.

Sounds like the 1st Great Blunder to me. Why do you not have subtypes for value types? Subtyping is far more useful for value types than for structures.

>   * Normal entity types, defined using the keyphrase "is identified by".
>     For example, "Company is identified by CompanyName where <fact type>;" or
>     "Company is identified by its Name;". This format can be mixed with
>     the supertype syntax, for example "Employee is a kind of Person
>     identified by its Nr";
>
>   * Objectified Fact Types, for example
>     "Directorship is where Person directs Company;"
>     It's also possible to mix "is identified by" into this pattern,
>     where the identification of an objectified fact type is external.
>
> I haven't published much of the actual query syntax yet, but from what
> I've done, it's easy to see how SQL can be completely hidden under a
> truly relational definition and query language. I have no goal to make
> CQL a language for expressing updates, but that might be a possible
> future direction. In the meantime the API will suffice.

How is it truly relational with all these superfluous structural elements?

Received on Mon Sep 28 2009 - 20:46:21 CEST