Re: some information about anchor modeling

From: vldm10 <vldm10_at_yahoo.com>
Date: Thu, 25 Jul 2013 13:15:12 -0700 (PDT)
Message-ID: <71808748-b502-4165-b948-5794597ad6e5_at_googlegroups.com>


Hi Derek,

Now I would like to summarize my posts to you regarding the following two points:

1.
In my post from April 1, 2013, I listed eleven major fields that "Anchor Modeling" cannot solve at all. This shows that "Anchor Modeling" is not a solution at all.

2.
All the main ideas of the Anchor Modeling paper can be found in my paper. My paper was published four years before the Anchor Modeling paper, and it was presented to, and discussed by, many participants in this user group over the years. Examples of these important ideas are listed in this thread and in the thread “The original version” in this user group.

In all these cases, which are very important for database theory, my solution is more general and gives an accurate solution, as opposed to "Anchor Modeling." The authors of Anchor Modeling do not understand the basic ideas that are used in their work. They did not understand the nature of these fundamental ideas or their essential properties.

The authors of Anchor Modeling have made some "cosmetic" changes, so these similarities with my solution are not obvious at first glance. What did they change? First they introduced naval terms. They changed the two main things from my model. These two things are the identifier of an entity and the identifier of the state of the entity:

(i) Instead of the identifier of an entity, they have introduced the "anchor key." In fact they have introduced a "surrogate key". Here, right at the start, the authors of Anchor Modeling show that they do not understand database theory. In fact, nearly all databases use keys that are defined by the International Organization for Standardization. These keys are not surrogates; they are originals, and they are externally verifiable. For example, you can verify a VIN by phone, but you cannot verify a surrogate key. The industry-standard keys usually have procedures for verification, decoding, etc. In contrast to the industry-standard keys, surrogate keys do not have these procedures.

(ii) Instead of the identifier of a state, the authors of Anchor Modeling introduce the key K (C, T). Here C is the "anchor key", that is, a surrogate key, and T is time. Obviously the authors of Anchor Modeling think that "the identifier of the entity + time" can stand in for my identifier of a state. Again, this is a serious misunderstanding of database theory and logic. In this thread I showed that this key is a bad solution.

Note that this T is not defined in the paper "Anchor Modeling"; it is not clear what T is. Is T the time when information about the new attribute value was received by the IT department? Is T the time when the attribute got a new value in the real world? Is T the time when the new attribute value was entered into the database? Note that the key is the basic concept in database theory. For more details see Def 5 in the paper “Anchor Modeling”.
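As an illustration of the point in (i) that industry-standard keys come with verification procedures, here is a minimal sketch of the check-digit test applied to 17-character VINs in North America. The function name is hypothetical, and the sketch only checks the internal consistency of the number; it does not consult any registry. A system-generated surrogate value has no such built-in check, which is the contrast drawn in (i).

```python
# Hedged sketch: VIN check-digit validation (position 9 of a 17-character VIN).
# Letters are transliterated to numbers; I, O and Q are not allowed in a VIN.
_TRANSLIT = {
    **{str(d): d for d in range(10)},
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8,
    "J": 1, "K": 2, "L": 3, "M": 4, "N": 5, "P": 7, "R": 9,
    "S": 2, "T": 3, "U": 4, "V": 5, "W": 6, "X": 7, "Y": 8, "Z": 9,
}
# One weight per position; the check-digit position itself has weight 0.
_WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]


def vin_check_digit_ok(vin: str) -> bool:
    """Return True if the 9th character matches the computed check digit."""
    vin = vin.upper()
    if len(vin) != 17 or any(ch not in _TRANSLIT for ch in vin):
        return False  # wrong length or a character that cannot occur in a VIN
    total = sum(_TRANSLIT[ch] * w for ch, w in zip(vin, _WEIGHTS))
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected
```

Calling vin_check_digit_ok on a candidate VIN tells you only whether the number is well formed; whether the vehicle exists still has to be verified externally.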

I will now briefly summarize all the main ideas of Anchor Modeling, which can also be found in my paper:

(a)

The idea of “history”.
This is a very important idea in my paper. The idea of history is not understood in the paper "Anchor Modeling." For example, Anchor Modeling allows data to be deleted. If you allow deletion of data, then you have no history. Even worse, if you allow deletion of data, then you have no database that can be supported on-line. In the Internet age this design is extremely bad. These are capital items in the design phase of the database. Evidently, neither the authors of "Anchor Modeling" nor the editors of their work understand these things. However, the authors of Anchor Modeling use all the important constructs related to history from my paper.

(b)

The immutable key.
The immutable key is the identifier of the entity (real or abstract), that is, it belongs to the entity. The identifier is not the "identity" of the entity that is changing, as it is defined in the paper Anchor Modeling (see "Def 1" and "Def 2" in the paper Anchor Modeling). The immutable key is a matter of identification; it is not a matter of identity. The idea of the immutable key is not related to the surrogate key. In this thread I have explained that identity is not a defined term. In my model the immutable key is a general solution, in contrast to "Anchor Modeling." In my solution this key can be immutable without limit, but it can also be immutable only for a limited period of time, depending on the real database application. This second case is important in practice. I wrote about this case in more detail in this thread.

(c)

Bitemporal Data in Anchor Modeling.
My data model has the general solution; it supports n-temporal data.

(d)

Simple key.
In all my published papers the key is simple. I introduced Simple Form in May 2006. One of the main reasons for the introduction of Simple Form was the construction of the simple key. It was clear to me that this was one of the most important steps in the construction of atomic structures, and also for some other important things. Viewed from the standpoint of theory, the simple key is incomparably more important than the surrogate key. Obviously, neither Codd nor the authors of Anchor Modeling realized this. They thought that this was a matter for a technical solution. They also did not realize all the other things that are associated with the construction of the simple key. Of course, if you have a simple key, then you can easily import the visible surrogate key. However, these authors use the simple key only in the very restricted form of the surrogate key.

Note that my dbdesign always starts from entities and relationships. The construction of a key of a relationship is predetermined. Entities have only intrinsic attributes. These predefined conditions significantly improve the structure of entities from the beginning of the db design. Note that only constraints can introduce the need for NFs.

In 2006 I presented “Simple Form”, see section 4 at http://www.dbdesign10.com : Relation schema R (K, A1, A2, …, An) is in Simple Form if R satisfies: R (K, A1, A2, …, An) = R1 (K, A1) join R2 (K, A2) join … join Rn (K, An) if and only if

  1. Key K is simple
  2. A1, A2,…, An are mutually independent.
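The join identity in this definition can be sketched on a small example. The helper names (decompose, natural_join) and the sample data are illustrative assumptions, not taken from the paper. The sketch only demonstrates that a relation with a simple key K can be split into binary relations Ri (K, Ai) and rebuilt by a join on K; it does not test whether the attributes are mutually independent.

```python
from functools import reduce


def decompose(rows, key, attrs):
    """Project R(K, A1..An) onto one binary relation Ri(K, Ai) per attribute."""
    parts = []
    for attr in attrs:
        pairs = {(row[key], row[attr]) for row in rows}  # a set drops duplicate tuples
        parts.append([{key: k, attr: v} for k, v in pairs])
    return parts


def natural_join(left, right, key):
    """Natural join of two relations whose only shared attribute is the key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def rebuild(parts, key):
    """Join all binary relations back into R(K, A1..An)."""
    return reduce(lambda acc, part: natural_join(acc, part, key), parts)


# Sample relation R(K, name, city) with a simple (single-attribute) key K.
R = [
    {"K": 1, "name": "Ann", "city": "Oslo"},
    {"K": 2, "name": "Bob", "city": "Rome"},
]
parts = decompose(R, "K", ["name", "city"])
assert sorted(rebuild(parts, "K"), key=lambda row: row["K"]) == R
```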

Note that a relvar which is in Simple Form is an “all-key” relvar. So the simple key corresponds to “all-key” in Simple Form. I use the term relvar because it is the traditional term in RM. However, I think that “relational schema” is a more appropriate term. The term “relational schema” is in compliance with Model Theory.

In fact, the current theory of database design does not precisely define what the first steps are. Simple Form defines a first step in the db design. Simple Form is introduced for only one reason: it determines and constructs the attributes and the simple key for entities. It separates the construction of an entity from the construction of the corresponding constraints. Note that even an axiom is a constraint. In contrast to the so-called “6NF”, Simple Form completely determines the natural (without constraints) construction of entities. However, the most important thing in Simple Form is the simple key; I will elaborate on this statement later.

(e)

My model is more general than ERM. I start from states of entities (or relationships), which is essentially more general than entities and relationships. Historic attributes from AM are a special case of the entities. My model defines the concept; this is the first time that a correct definition of the concept has been introduced. In my paper "Russell's paradox" has been resolved and dismissed as an erroneous idea. My solution uses Frege's results and adds what is lacking in Frege's theory. It is shown that identification is another link between the mind and the real world. It has also been shown that identification and concept are related and form an integrated semantic structure.

(f)

Anchor Modeling builds its model on a finite number of some structures (knots, ties ...), and there is no evidence that every business application can be presented through these structures.

(g)

In my paper, I have demonstrated the decomposition of entities (or relationships) into the corresponding atomic structures. This has been proven using only tools from my general ERM. The Anchor Modeling paper has no evidence that entities and relationships can be decomposed into the corresponding atomic structures. For more details see my post from 25 March 2013 in this thread.

(h)

The authors of Anchor Modeling transfer their ERM "atomic structure" into the RM. They do this without evidence. I also wrote about this in my post from March 25, 2013. Note that many well-known scientists deal with this difficult topic, known as schema mapping and data transfer.

(i)

Although Anchor Modeling is on conceptual level, the authors did not define the concept. In fact there is not one word that has something to do with concepts.

(j)

StartDate - EndDate. In Anchor Modeling they use only StartDate for History structures; EndDate is "implicitly" defined. In this thread I showed that this technique, which is based on an "implicitly" defined EndDate, is wrong. Most importantly, this is not a Temporal DB. General databases, which include databases that maintain a history of events, are not temporal DBs as the authors of Anchor Modeling claim. Obviously they do not understand the importance of this model. My data model is event oriented; this means that my data model is much more general than a Temporal DB. For example, I use only two events to describe all operations on the data. I have shown that these two (existential) events define time. I introduced these two events in 2005. See also section 7.4 of my paper from 2009, “Database design and data model founded on concept constructs and knowledge constructs”.

Note that the authors of Anchor Modeling use a combination of StartDate and EndDate at the level of atomic structure. As I said, atomic structures were introduced in Anchor Modeling without evidence.
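To make clear what an "implicitly" defined EndDate means, here is a minimal sketch under the assumption that the end of one version is simply taken to be the StartDate of the next version of the same attribute. The function name and the tuple layout are hypothetical; this is an illustration of the technique under discussion, not code from the Anchor Modeling paper.

```python
from datetime import date


def with_implicit_end(history):
    """history: list of (start_date, value) rows for one attribute of one entity.

    Returns (start_date, end_date, value) rows, where end_date is derived from
    the next row's start_date, and None marks the latest (open-ended) row.
    """
    ordered = sorted(history, key=lambda row: row[0])
    result = []
    for i, (start, value) in enumerate(ordered):
        end = ordered[i + 1][0] if i + 1 < len(ordered) else None
        result.append((start, end, value))
    return result


rows = [(date(2013, 3, 1), "B"), (date(2013, 1, 1), "A")]
for start, end, value in with_implicit_end(rows):
    print(start, end, value)
```

Note that in this sketch every version ends exactly where the next one begins, so a period in which the attribute had no value at all cannot be expressed without storing an extra row.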

(k)

In my model I use the idea of knowledge. My main structure, roughly speaking, has the following schema: IdentifierOfEntity, IdentifierOfState, Knowledge

“Knowledge” is the total knowledge about an entity; therefore the Anchor Modeling data structures (knots, static attributes, etc.) are just a part of my solution. In my model a subject has this knowledge. It may be that more than one subject has its own knowledge about an entity. This led me to the idea of using flexible structures. For example, in my paper from 2005, example 5, I wrote, "Now we can assign a different number of columns of knowledge to each attribute. These columns can also be different, concerning what they represent." In contrast to my solution, Anchor Modeling has a fixed and limited number of data structures. In my model, one can add or delete an arbitrary part of knowledge, depending on the real world application. In my paper I wrote: if there are no changes to a certain entity, then the identifier of the entity is equal to the identifier of the state of the entity. Note also that my dbdesign enables the construction of databases which can maintain an individual entity which does not belong to an entity set. Note also that the “knot” structure can cause very bad consequences, for example, if someone enters wrong data (accidentally or intentionally).
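The rough schema (IdentifierOfEntity, IdentifierOfState, Knowledge) can be pictured with a small sketch. The class name, field names and sample values below are illustrative assumptions, and Knowledge is simplified here to a tuple of (subject, attribute, value) facts; the papers define it in much more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateRow:
    entity_id: str    # IdentifierOfEntity: one value for all states of the entity
    state_id: str     # IdentifierOfState: one value per state
    knowledge: tuple  # Knowledge, simplified to (subject, attribute, value) facts


def knowledge_about(rows, entity_id, attribute):
    """List (state_id, subject, value) for one attribute of one entity."""
    return [
        (row.state_id, subject, value)
        for row in rows
        if row.entity_id == entity_id
        for subject, attr, value in row.knowledge
        if attr == attribute
    ]


rows = [
    StateRow("car-1", "s-1", (("dealer", "color", "red"),)),
    # In the second state two subjects hold different knowledge about one attribute.
    StateRow("car-1", "s-2", (("dealer", "color", "blue"),
                              ("insurer", "color", "navy"))),
]
print(knowledge_about(rows, "car-1", "color"))
```

Because each fact carries its subject, adding or removing a part of the knowledge is a change to the data of a state and not a change to the schema.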

Knowledge is precisely defined in my papers. Let me mention some main characteristics of Knowledge introduced in my model:
(i) Knowledge is based on atomic facts; I have the procedure which decomposes entities into atomic structures.
(ii) Knowledge is strictly related to a subject (or subjects). This implies that in my model it is possible that different subjects can have different knowledge about one attribute.
(iii) I have a factual sentence and the corresponding fact; these are two very different things in my approach to knowledge. The fact in my model is on the level of thoughts, it is related to meaning and awareness, the fact is subjective and it links the corresponding data in a memory to the subject. So facts and data are very different things.
The factual sentence is just a set of symbols.  

(l)

Meta data.
Meta data is defined as "data about data", which is a kind of circular definition. On the other hand, "meta data" can have their own "meta data", so in this case we have "meta meta data," which is a really unusual dbdesign. In my paper I use the term knowledge, which I explained in the text above. Note that the theoretical case "meta ... meta data" is possible. I wrote in this thread that the authors of Anchor Modeling did not define meta data in their db schemas at all. That is because they did not solve this problem. In Anchor Modeling there is neither any theoretical text nor any example about “meta data”. Note that “meta data” are among the most important constructs in Anchor Modeling.

As I wrote above I use "knowledge" in my data model. Knowledge is based on atomic facts. Facts that are stored in memory become data and so data become permanent.
(For more details about facts, knowledge and permanence, see section 3 of my paper “Database design and data model founded on concepts and knowledge constructs”. About facts, see also section 3.5 from January 2006 at my website http://www.dbdesign10.com and section 2 of my paper “Semantic databases and semantic machines”.)

(m)

Surrogate key.
The name “surrogate key” is very useful; it carries the authority of Codd's name. E. Codd defines the surrogate key as invisible. E. Codd is the biggest authority on RM, and therefore people use his definition of the surrogate key, especially when the surrogate key is part of binary relations. Today you can find a definition attributed to Codd on Wikipedia, but it is a false definition. On Wikipedia someone wrote that by Codd’s definition the surrogate key is visible. This is not true. In my opinion, someone committed this fraud intentionally. Obviously, this person is trying to fix this huge mistake of Codd's by non-scientific means. This person claims that Codd defined the surrogate key as visible in 1976. To my knowledge, Codd did not publish a single paper in 1976. You can see Codd’s definition of the invisible surrogate key in his paper RM/T.

The authors of Anchor Modeling defined their anchor key as the surrogate key. You can find this in their paper Anchor Modeling, reference [19] “Analysis of normal forms for anchor models, http://www.anchormodeling.com/tiedostot/6nf.pdf”. This reference disappeared, so you could not find it at this address. After I wrote about this paper, it appeared in the corrected version of Anchor Modeling. I have the original version of the mentioned article.

So the use of the surrogate key in Anchor Modeling is nonsense, because the surrogate key is invisible by definition.

If authors of Anchor Modeling apply some other definition, then, they should say which version they use because the anchor key is basic term in their paper. Especially, because there is another definition of the surrogate key, in which the surrogate key is invisible. See Wieringa and De Jonge (1991).

If they use surrogate key as a visible key, then it is plagiarism of the part of my definition about the identifier of an entity.

My definition of the identifier of an entity is divided into two parts, as I wrote in my post from June 15, 2013. I want to say that this definition is not simple; in fact the matter is very complex and must be examined case by case. The authors of Anchor Modeling did not notice this important fact at all. The same is true of Codd and RM/T.

There is one bigger mistake about the surrogate key. The surrogate key is a kind of technical solution. However, a theoretical solution is much more important; it is more general and it fits into a theory. I was looking for something theoretical. In May 2006 I introduced Simple Form (see section (d) in this post). Simple Form has the simple key, and that is it; the simple key can solve everything. If you have the simple key, then instead of this simple key you can put some “industry-standard” key or a surrogate key, etc.

Note that Anchor Modeling assigns the surrogate key to an entity unconditionally. They wrote only that the surrogate key is the “identity” of an entity. Note that AM is on the ERM level. Now, there are many questions. Let me mention two of them: 1. What if the entity has corresponding relations with anomalies? Note that the paper Anchor Modeling has no procedure which enables mappings from ERM to RM. 2. If they can somehow transfer their structures from AM (that is, ERM) to RM, and if they do some lossless decomposition, then no one knows what these identities (surrogates) represent. I mean, which entity do these surrogates represent after the decompositions? (Note that the same question can be applied to RM/T.) This part should be explained by Springer.

Definition of The Identifier of the entity



Here is this definition from year 2005 that I posted on this thread on June 15, 2013: “Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity. This Identifier has one value for all the states of one entity or relationship.”

(See section 1.1 from my paper at my website http://www.dbdesign10.com )
Look for the thread “Database design, Keys and some other things”, from September 2005, on this user group and for the corresponding discussion.


As I have already said, the definition of the identifier of the entity is complex; there are several cases, and the surrogate key is an irrelevant case. I called the first part of this definition CaseA in my post from June 15, 2013 and explained it in detail.

Now I am going to analyze the second part of this definition, which has the following text: "or can provide identification of the entity." I called it CaseB. "Anchor Modeling" is based on this CaseB. As I already wrote, surrogates can support only a small part of real world applications. Because of this, I have not devoted much attention to surrogate keys. This CaseB part is related to a small number of real world business applications. It can be divided into sub-cases; here I have chosen only a portion of all the cases that are of type CaseB. These are the following cases:

CaseB1. This case is appropriate for my Simple Form. The simple key should be constructed by using the values of the attributes which form the primary key. The key can be constructed by concatenating the corresponding values. (See 5.1.(iii) in my paper "Semantic database and semantic machines"). Note that this simple key should be the immutable key in my General database model. Therefore one can apply the immutable key starting from an arbitrary state, replacing this simple key with the immutable key. The simple key becomes the immutable key, and the states of the entity are determined by applying the corresponding identifier of the state. Moreover, it is possible to set “certain” constraints on the corresponding attributes, but this is a separate field. So this case shows how one can start from a Simple database and then switch to the General Database, and vice versa.
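CaseB1 can be sketched as follows. The names simple_key and EntityHistory, the separator, and the sample attributes are all illustrative assumptions; the papers may construct the key differently. The sketch builds the simple key once by concatenating the primary-key values of the starting state, keeps it fixed as the immutable key, and gives every later state its own state identifier.

```python
def simple_key(row, pk_attrs, sep="|"):
    """Concatenate the primary-key attribute values into one simple key.

    Assumption: the separator does not occur inside the values; otherwise
    two different composite keys could produce the same string.
    """
    return sep.join(str(row[attr]) for attr in pk_attrs)


class EntityHistory:
    """States of one entity under an immutable identifier (hypothetical helper)."""

    def __init__(self, first_state, pk_attrs):
        # The identifier is derived from the starting state and never recomputed.
        self.entity_id = simple_key(first_state, pk_attrs)
        self.states = [(1, dict(first_state))]

    def add_state(self, new_state):
        """Record a new state; returns the identifier of that state."""
        state_id = len(self.states) + 1
        self.states.append((state_id, dict(new_state)))
        return state_id


h = EntityHistory({"dept": "D1", "no": 7, "name": "Ann"}, ["dept", "no"])
h.add_state({"dept": "D2", "no": 7, "name": "Ann"})  # a key attribute changed
assert h.entity_id == "D1|7"  # the identifier of the entity did not change
```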



Note that this procedure is better and more general than the surrogate key from Anchor Modeling. The procedure can solve all the cases which can be solved by using the surrogate key.

As I have already said, the basic idea is the simple key. "Simple Key" is the general theoretical idea and solution. That is what the authors of Anchor Modeling did not understand.


CaseB2. In this case the start is again at Simple Form, but now the values which will be assigned to the simple key come from an arbitrary domain.

CaseB3. In this case, we can apply the visible surrogate key.

CaseB4. In this case, we can apply indexes as the starting value for the immutable keys. Note that in this case, we can use some other technique that is proven in the maintenance of keys.

CaseB5. In this case, we can apply my structure “sequence” as the starting point. This structure was defined in my paper “Semantic databases and semantic machines”, section 5.12, at http://www.dbdesign.com. Note that this structure can maintain entities whose identifiers are immutable for a limited period of time. In fact, this is the idea of changing the identity of an entity. Note that the authors of Anchor Modeling define the immutable key as “eternal”. Sequence is a powerful structure and can support any kind of identifier of an entity (industry-standard keys, as well as visible surrogate keys which use different domains, etc.).



CaseC. This is the case, which works with complex problems. Note that my solution is based on states. Technically speaking History is solved by using two identifiers; the identifiers of states and the identifiers of entities. However, if you pay more attention to this case, then you can see that this is theoretical approach with some complex cases. The states are complex db structures. See my posts from February 25, 2013 and from May 13, 2013 in this thread.

These B-cases show that the visible surrogate key is just one of many technical solutions. Note that Anchor Modeling could not solve the problems even with visible surrogate keys. In this post I show that all the important ideas from “Anchor Modeling” exist in my papers in a more general form. My ideas and solutions were published in 2005, and the paper Anchor Modeling in 2009. A corrected version of the paper Anchor Modeling was published in 2010. When I started working in this field, it had not yet crystallized what the main things in this area are. Moreover, at that time it was not known what the rules of the game are. You saw how far Codd was from a solution for the decomposition into atomic structures.

5. Identification
I devoted special attention to the process of identification. I started with this field in 2007; see section 7 at http://www.dbdesign10.com. I introduced new results related to identification in my papers from 2008 and 2012.

(i) CASE - SIMPLE DATABASE

In my opinion identification is fundamental for database theory. I will mention the aspect of identification that is linked to identifiers and concepts. I would also like to say that the surrogate key has a specific solution on the level of the corresponding concept. In my work the identification process is defined recursively. First, I defined the concept and identification for properties, then for entities, relationships, and finally for states. An entity is determined by its properties, a relationship is determined by the entities and other relationships, and states are determined by the entities and relationships. I will explain the concept and identification only for properties. Note that in my papers a property is a concept, while an attribute is an instance of a property. For example, color is the concept, while red is the attribute. The identification is determined by (3.3.3) in section 3.3 of my paper “Database design and data model founded on concept and knowledge constructs”, at http://www.dbdesign11.com :



S (the m-attribute, the concept of the property) = T iff the m-attribute matches the entity’s attribute
  1. On the left side of this equivalence, we have the relation: "satisfies this concept".
  2. On the right side of this equivalence, we have the identification of the attribute.

Note that section 3.2.1(i) of the mentioned paper states the following: “The m-attribute is created by the match between an entity’s attribute and the corresponding attribute in our mind.”

The identification of entities proceeds similarly to that of attributes. In my opinion, the above mentioned (3.3.3) “crashes” Russell's paradox. For more details about this see my thread: “Does the phrase “Russell’s paradox " should be replaced with another phrase?”



(ii) CASE – GENERAL DATABASE. This includes databases that maintain history. Here I am using my main data structure that fully models states. This structure has the following scheme: ConceptStateName (P, E, A1… An, Kp1…Knr, Dp1,…,Dns)
where P is the concept of the identifier of a state of the entity (or relationship); E is the concept of the identifier of the entity; A1,…,An are concepts of the properties of an entity (or relationship); Each property, including E and P, can have different sets of knowledge K associated to them and defined in 3.6 – 3.9. Thus:
P has knowledge Kp1, Kp2… Kpi; 
E has knowledge Ke1, Ke2… Kej; 
A1 has knowledge K11, K12…K1k;

….
An has knowledge Kn1, Kn2… Knr.
Knowledge Dp1,…,Dns is defined in 3.8.

For more details see sections 4.2.5 and 4.2.6 of my paper “Database design and data model founded on concept and knowledge constructs”, at http://www.dbdesign11.com
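The shape of the scheme ConceptStateName (P, E, A1… An, Kp1…Knr, Dp1,…,Dns) can be sketched by generating its column list from the number of knowledge columns chosen for each property. The function name, the label convention ('p', 'e', '1', …, 'n') and the generated column names are assumptions made only for this illustration; the exact meaning of the K and D columns is given in the cited sections of the paper.

```python
def state_schema(n_attrs, k_counts, d_counts):
    """Build the column list P, E, A1..An, K..., D... for a state structure.

    k_counts and d_counts map a property label ('p', 'e', '1', ..., 'n') to the
    number of K (respectively D) knowledge columns attached to that property.
    Labels that are missing from a mapping get no columns.
    """
    labels = ["p", "e"] + [str(i) for i in range(1, n_attrs + 1)]
    columns = ["P", "E"] + [f"A{i}" for i in range(1, n_attrs + 1)]
    for prefix, counts in (("K", k_counts), ("D", d_counts)):
        for label in labels:
            for j in range(1, counts.get(label, 0) + 1):
                columns.append(f"{prefix}{label}{j}")
    return columns


# Two properties; A1 gets two knowledge columns, A2 gets one, P gets one K and one D.
print(state_schema(2, {"p": 1, "1": 2, "2": 1}, {"p": 1}))
```

The point of the sketch is only that the number of knowledge columns may differ from property to property, so the structure is not fixed in advance.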



Conclusion:
When I do a database design, I usually start from Simple Form and make schemas for the concepts of the entities. So again, I emphasize that there are two kinds of identifiers of entities: The first group is where the identifier of the entity belongs to the entity and the m-entity. This is CaseA, which is discussed above. The second group is where the identifier of the entity belongs only to the m-entity. These are the CaseBs. The authors of Anchor Modeling did not realize that there are two groups of identifiers. E. Codd also didn’t understand this fact in RM/T. On the level of concepts, the two groups differ in the phase of identifying the attribute; this is the right side of the above (3.3.3).

In the above-mentioned CaseC I use the mentioned Concept for the General database. Note that in all cases I have the Identifier of the Entity, which enables the simple key. The simple key implies the surrogate keys and much more.


Vladimir Odrljin
