Re: some information about anchor modeling

From: vldm10 <vldm10_at_yahoo.com>
Date: Mon, 25 Mar 2013 02:08:52 -0700 (PDT)
Message-ID: <92044e13-325d-45e2-a8c7-f3d99003b180_at_googlegroups.com>


On Monday, 11 February 2013 08:41:14 UTC+1, Derek Asirvadem wrote:

Hi Derek,

> Which begs the question: what exactly are you defining as surrogates ?

(a) Codd defined the surrogate key in his RM/T paper, and Wikipedia gives the same definition. The surrogate key is the primary key of the corresponding binary relations. It exists only in the database, not in the real world. Codd wrote “now the surrogate that is the primary key and provides truly permanent identification of each entity.” Note that he wrote “provides identification of each entity”; he did not write “identifies the entity”. This means that he uses the primary key of the original relation (the RM relation) to identify the real-world entity, not the surrogate key, which is the primary key of the binary relations. This implies that he must maintain two primary keys, and that he must join the attributes from the primary key and keep them unique. (I hope I have not made a mistake in this explanation.) Note that Codd did not maintain history; he was not even aware of the “history of events”. So my explanation here is my guess at how “system-assigned surrogates” must work. Note that Codd did not explain how these things work; he only said that the system does this behind the scenes.


In my paper “Some ideas about a new Data Model”, dated September 17, 2005, at http://www.dbdesign10.com/ (see section 1.1), I wrote: “Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity. This Identifier has one value for all the states of one entity or relationship.” In this definition I wrote “…or can provide identification of the entity”. This “can provide” means that my solution covers the case of the surrogate key, so the surrogate key is just a special case of my solution. But here the identifier does not have to be a surrogate. I was not thinking specifically about the surrogate key; I was thinking of any identifier that can “identify or provide identification”. For example, my identifier of a state is 100 times more complex than the surrogate key.

My approach to identifiers, surrogates and keys is different from Codd's: it is based on abstract objects and on memory manipulation with abstract objects. My approach solves more complex problems than surrogates do. (See the algorithm in my post of February 25 in this thread. I named that algorithm “How a database stores an object and how a database can remember of its objects?”)


The paper “Anchor Modeling An Agile Modeling Technique Using the Sixth Normal Form for Structurally and Temporally Evolving Data” has reference [19].

At [19] the authors of Anchor Modeling wrote that the anchor is the surrogate key. In their paper they also “proved” that all structures in Anchor Modeling are in “6nf”. This implies that Anchor Modeling is based on nothing, because I showed in this thread that “6nf” is absurd. The paper was signed by all five authors. Why did the authors of “Anchor Modeling” put “6nf” in the title? Because they use these atomic structures in their paper without proof: they use “6nf” and RM/T as an implicit proof of their decomposition. As I wrote, “6nf” is nonsense.

There are also problems in Anchor Modeling related to the transitions from one data model to another. It seems to me that the authors walk with giant steps through these data models. First they have entities, then they have a set consisting of different types of attributes. But nowhere is the decomposition of an entity into these attributes proven. Nor is the reverse proven, namely that these attributes form the entity. What is surprising is that at the beginning of the paper the authors write that Anchor Modeling is the Relational Model: "An anchor model is a relational database schema" (see section 1).

On August 6, 2010 (see my thread “The original version”) I wrote the following on this user group: “There is one other thing here, which is more important, which is badly done in the "Anchor Modeling". This is about how to do the transition from E / R model in the relational model and vice versa. I think it is necessary to define the mapping from E / R to RM, then the inverse mapping for the given mapping and in the end it is necessary to define the composition mapping. In my model I have at the outset, the binary concepts. Each binary structure has its own unique identifier of the state. Therefore, each tuple or a binary concept is uniquely defined. In "Anchor Modeling" They start from the E / R and go in the RM, so do 6NF, and return to the E / R. But it was not discussed in the paper, so it's not clear how to do it. We can note that mapping of schemas between two db models can be complex, for examples it can include constrains.”

The basic terms are not correct, or they do not belong to database theory. For example, section 2.1 starts with the following text: “An anchor represents a set of entities, such as a set of actors…” A set of entities does not exist, because we do not put physical entities into sets. We can say, for example, that we have a set whose elements denote actors. Databases work mostly with names, not with physical objects.

Just after the above-mentioned sentence come the following definitions:

Def1 Let ID be an infinite set of symbols, which are used as identities.

Def2 An anchor A(C) is a table with one column. The domain of C is ID . The primary key for A is C.
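
To make Def1 and Def2 concrete, here is a minimal sketch of what such an anchor might look like as an actual table. It uses Python's sqlite3 module. The names AC_Actor and AC_ID are my own assumptions for illustration, not names taken from the paper, and the "infinite set of symbols" ID is simply approximated by integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Def2: an anchor A(C) is a table with one column C, and C is the primary key.
# AC_Actor and AC_ID are hypothetical names chosen for this sketch.
conn.execute("CREATE TABLE AC_Actor (AC_ID INTEGER PRIMARY KEY)")

# Def1: ID is a set of symbols used as identities (approximated here by integers).
conn.executemany("INSERT INTO AC_Actor (AC_ID) VALUES (?)", [(1,), (2,), (3,)])


def is_duplicate_rejected(connection, identity):
    """Return True if inserting an already used identity violates the primary key."""
    try:
        connection.execute("INSERT INTO AC_Actor (AC_ID) VALUES (?)", (identity,))
        return False
    except sqlite3.IntegrityError:
        return True


anchor_rows = [row[0] for row in conn.execute("SELECT AC_ID FROM AC_Actor ORDER BY AC_ID")]
duplicate_rejected = is_duplicate_rejected(conn, 2)
```

Under these assumptions the table holds nothing except the identities themselves. The primary key only guarantees that each symbol appears once; the table by itself carries no information about what a given symbol refers to.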

Anyone who wants to check what an identity is can visit the Stanford Encyclopedia of Philosophy; the articles on that web site are written by prominent scientists. There is no single definition of identity there, but there are tens of pages about identity. It is among the most important terms in philosophy. However, databases do not work with philosophical terms. I just want to point out that the most basic term in Anchor Modeling is defined inaccurately.

This paper has the following title: “Anchor Modeling An Agile Modeling Technique Using the Sixth Normal Form for Structurally and Temporally Evolving Data”

I was looking specifically for the part related to “Using the Sixth Normal Form for Structurally Evolving Data”. This is a really impressive title and notation. However, I could not find anything about "using sixth normal form for structurally evolving data". If the authors believe they can add attributes to existing entities in an "agile" way, then they have to realize that they need to swap the existing "identity" of the corresponding entity, i.e. they would have to change the "anchor". I have many doubts about "agile evolving", especially because the paper gives no explanation or example of it.

This paper was awarded first prize at a conference that bears the name "International Conference on Conceptual Modeling", but nowhere in the paper is there a definition of a concept. Note that the authors introduce unusual concepts, which are about how to keep the identity of an entity that is changing. Also, we are talking about atomic entities. Therefore definitions of the concepts are important. Note that P. Chen also did not give any definition of a concept in his work on ERM. ERM is a conceptual model.

On the Anchor Modeling web site the authors write about “meta data”. They have a discussion forum there, where they correct errors and announce new, improved versions. I want to say the following:

1. “Meta data” is an undefined concept.
2. The authors wrote nothing about “meta data” in their paper. For example, they did not include “meta data” in the schema and they did not define which “meta data” are included.
3. I am sure that Anchor Modeling cannot support “meta data”, but it would be frivolous to criticize something that is not defined. So no reasonable person can say anything about it.

My point here is that these authors do not understand what “meta data” is, and this implies that they do not know what history is. So their solution cannot handle history. In fact these authors think that history is a kind of temporal database, which is not true at all. Obviously the editors of this paper do not understand the nature of history. With these few examples I wanted to draw attention to the low level of this work. It also raises the question of how the editors did not notice the low level of the paper. By the way, reference [19] has disappeared; it is no longer at the given address.

> 5.1. I would like to be able to say, at this point that my [4] is the same as your "DbDesign 10 Knowledge Data Model", at least in the sense that [4] is an implementation of dbdesign10, and dbdesign10 is a generic or template definition (not an implementation). But I can't say that yet, because:
>
> • the one big difference that stands out (in my reading thus far) is that I totally accept RKs, and RKs are compound keys, that AFAIC cannot be decomposed. Whereas, your "Keys" do not allow compound keys.
>
> • on the face of it your "Keys" are surrogates, but since you decry surrogates, I am sure you are trying to convey something else, that I have not absorbed yet.
>
> ••• CarId is the Car Key. CarKey is not a Key, it is a surrogate, and the column is therefore incorrectly and named, and leads to confusion.



(b) CarKey is not a surrogate key by definition. Users can see the value of this so-called surrogate key, and a user who wants to can delete the value he saw. CarKey is the identifier of the abstract object, i.e. it is the key of the state of an entity. In contrast to Codd’s surrogate key, which is related to a real entity, CarKey is related to the abstract object. CarKey directly identifies the corresponding state, so it is a key, whereas a surrogate key cannot identify entities directly: it identifies the real-world entity only indirectly, and it uses additional database structures.

But more importantly, CarKey is about how to store complex objects in a memory and how to recall complex objects from a memory. (A person can remember ideas, emotions, music, thoughts, shapes and other very complex objects. Here I mean a memory for databases, but this can be a clue for a general theory of memories, and CarKey goes in that direction. See the algorithm in my post of February 25 in this thread; I named that algorithm “How a database stores an object and how a database can remember of its objects?”.) The surrogate is about simple objects, and the surrogate does not work correctly. The surrogate is a naïve technical solution (it looks to me like a kind of index). In particular, Codd’s surrogate does not work for general database theory. For example, Codd did not notice a very important and huge field in database theory: history. Now it seems that the authors of Anchor Modeling want to “include” Codd’s surrogate in the theory of the history of events, although Codd did not notice this very important field.


Note that each primary key is an identifier. So the corresponding identifier can be physically associated with the corresponding object in the real world. See my paper “Semantic databases and semantic machines”, section 5.1(i), at http://www.dbdesign11.com/ , which is about primary keys.

Keep in mind that a lot of people do not understand what a surrogate key is. For example, if we have an invoice and the invoice has an identifier (an invoice number), then this is not a surrogate key by definition, because the identifier is put on the object in the real world. I also think that the authors of Anchor Modeling have not fully understood the surrogate key. To me this is obvious.
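
The invoice example can be sketched roughly as follows, again with Python's sqlite3 module. The table and column names (invoice, invoice_sid, invoice_number, amount) and the sample value are assumptions made only for this illustration. The point is the difference between an identifier printed on the real-world document and a value the system generates for itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema for illustration only.
conn.execute("""
    CREATE TABLE invoice (
        invoice_sid    INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated by the system
        invoice_number TEXT NOT NULL UNIQUE,               -- printed on the paper invoice
        amount         REAL NOT NULL
    )
""")

# The user supplies only the real-world identifier; the system assigns invoice_sid.
conn.execute(
    "INSERT INTO invoice (invoice_number, amount) VALUES (?, ?)",
    ("INV-2013-0042", 150.0),
)

generated_sid, printed_number = conn.execute(
    "SELECT invoice_sid, invoice_number FROM invoice WHERE invoice_number = ?",
    ("INV-2013-0042",),
).fetchone()
```

In this sketch invoice_number is the identifier that exists on the object in the real world, so it is not a surrogate. Only invoice_sid is a candidate for being called a surrogate, and in the sense discussed in this post only for as long as it stays hidden from users.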

>
> • (I think dbdesign10 needs to be elevated in terms of specific statements and clarity, because it takes undue effort to understand it, but let's not get into that here)
>

(c) I posted my solution for the first time on September 23, 2005 on this user group. Many of the members understood my paper, and they immediately started a discussion. It is OK if one does not understand something; you (or anybody else) can post your questions to the group or send them to my email, and I will respond.

> 8.1. I disagree that AM *substitutes* or replaces the RK with a surrogate. Clearly, one of their attribute tables (P-Relation) contains the RK, the K-Relation or K-Role. So the surrogate is used in the normal manner, as a permanent Identifier, a substitute PK, that is an FK in all its child tables.

(d) An entity is a fundamental term here. It is also a fundamental semantic unit. Entities are the basic units for relationships. Surrogate keys are the keys for entities. The anchor is the surrogate key; this is written in reference [19]. In my opinion the surrogate in Anchor Modeling is wrong, because users can see surrogates in Anchor Modeling, so it is not a surrogate key by definition. The consequence is the following question: on what is the decomposition of entities in Anchor Modeling based? Is it based on "6nf", on E. Codd's "decomposition", or maybe on some combination of the mentioned approaches?

Vladimir Odrljin

Received on Mon Mar 25 2013 - 10:08:52 CET