Re: The anatomy of plagiarism that was made by authors of "Anchor Modeling"
From: vldm10 <vldm10_at_yahoo.com>
Date: Tue, 19 May 2015 14:33:24 -0700 (PDT)
Message-ID: <002ed969-4ac2-4c42-bbd0-cb3a2597b2bc_at_googlegroups.com>
In this thread I will write about the major steps in the design of a database. As I wrote, the authors of "Anchor Modeling" claim that "In this paper we propose a modeling technique for data warehousing, called anchor modeling, ..." The main anchor modeling concept is introduced in the following two definitions (see page 3):
Def 1 (Identities). Let ID be an infinite set of symbols, which are used as
identities.
Def 2 (Anchor). An anchor A(C) is a table with one column. The domain of C is ID. The
primary key for A is C.
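A minimal sketch of Def 2 as quoted above, not taken from the Anchor Modeling paper. It assumes SQLite through Python's sqlite3 module, uses integers as stand-ins for the symbols of ID, and the names anchor_a and c are hypothetical. It shows only that a one-column table whose column is the primary key stores each identity at most once.

```python
import sqlite3

# Hypothetical names: "anchor_a" stands for the anchor A, "c" for its single column C.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE anchor_a (c INTEGER PRIMARY KEY)")

# Assumption: integers stand in for symbols drawn from the set ID.
conn.executemany("INSERT INTO anchor_a (c) VALUES (?)", [(1,), (2,), (3,)])


def insert_identity(connection, identity):
    """Return True if the identity was stored, False if the primary key rejected it."""
    try:
        connection.execute("INSERT INTO anchor_a (c) VALUES (?)", (identity,))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = insert_identity(conn, 2)  # 2 is already in the anchor
new_accepted = insert_identity(conn, 4)        # 4 is not yet in the anchor
rows = [row[0] for row in conn.execute("SELECT c FROM anchor_a ORDER BY c")]
```

The table carries nothing except the identities themselves; any attribute, relationship or state would have to live in other tables that refer to c.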
Note that the term "Identities" is not defined. Note that this term is also undefined in philosophy. I want to emphasize that they start from an undefined term. In this post I will try to present a real database design. Note that I am using the terms "identification" and "identifier". In my opinion these are very different from "identities".

1. Step

I adopt the following view of the world, due to Gödel: “By the theory of simple types I mean the doctrine which says that the objects of thought (or, in another interpretation, the symbolic expressions) are divided into types, namely: individuals, properties of individuals, relations between individuals, properties of such relations, etc...” (Kurt Gödel 1944)
In relation to Gödel's view I would like to say the following. I found this text by Gödel about 2-3 years ago. Until then I used the Entity/Relationship Model, which is applied in Peter Chen's data model. Obviously Gödel's definition has priority of ideas. I also believe that many philosophers and mathematicians worked on such a view of the world before Gödel, and that some of them made significant contributions to this theory. The term "object of thought" is important here, and it seems to me that this short text shows the strong mindset of K. Gödel.

In my paper from 2008 I introduced the concepts of m-attributes, m-entities, m-relationships and m-states, with the prefix "m". I used the prefix "m" because these objects are in memory. These terms of mine are only in part "objects of thought". I also do not believe in the term "type".

In my paper „Semantic databases and semantic machines“ (at http://www.dbdesign11.com), in section 1 I defined abstract objects. In section 2 I defined facts. Facts represent elementary (or atomic) thoughts that correspond to atomic data structures. In section 3 I introduced factual sentences. Factual sentences express facts. Section 4 is about awareness. I will not explain these ideas in detail now, because it is a very broad topic; I will just mention some of my thoughts and solutions related to them.

I build facts from atomic structures; facts are based on atomic structures. Here you can immediately see a great advantage of the atomic structures that are obtained by applying my theory of states. If you try to use "normal forms" for an entity, then you have to go through all the normal forms to reach 6NF, for which there is no procedure and which does not work in a number of cases that I described in this thread. One more thing is important here.
If you want to apply "normal forms", then you must have a wrong data structure, because "normal forms" repair only those data structures that are wrong. In order to formalize the work with atomic structures, we need a tool that constructs these atomic structures directly. So the basic idea here, related to facts, is to try to formalize work with thoughts. I think this is a fine place to quote G. Frege:

„I am not here in the happy position of a mineralogist who shows his audience a rock-crystal: I cannot put a thought in the hands of my readers with the request that they should examine it from all sides. Something in itself not perceptible by sense, the thought is presented to the reader – and I must be content with that – wrapped up in a perceptible linguistic form. The pictorial aspect of language presents difficulties. The sensible always breaks in and makes expressions pictorial and so improper. So one fights against language, and I am compelled to occupy myself with language although it is not my proper concern here. I hope I have succeeded in making clear to my readers what I want to call 'thought' ...“

So, in fact, Frege studied thoughts. He found a tool with which one is able to work with thoughts. This tool is language. Note that G. Frege discovered a major part of propositional logic. He discovered predicate calculus and semantics completely, starting from scratch, 120 years ago. It seems to me that this thinking of Frege has a lot to do with Gödel's „objects of thought“ and „the symbolic expressions“ in the above-mentioned sentence: „I mean the doctrine which says that the objects of thought (or, in another interpretation, the symbolic expressions).“ So, in fact, this is about entities (or individuals, in Gödel's notation).

2. Step

We identify entities by using Leibniz's Law of the identity of indiscernibles (the indiscernibility of identicals). Another important thing here is the following: Leibniz's Law makes the identification of an object a mathematical discipline. So Leibniz's Law is the mathematical tool. Things become precise because we realize Leibniz's Law in the ERM. So we can only identify entities and relationships.

Many system analysts think that each entity is determined by its attributes, i.e. that it is determined by its intrinsic attributes. This is not true. I have divided Leibniz's Law into two laws: Leibniz's Law, which uses intrinsic attributes, and the General Law, which uses intrinsic + extrinsic attributes (see my paper „Semantic databases and semantic machines“, sections 5.5 and 5.6). This division is realized within the Entity/Relationship model of the world. I will repeat the following example once again, because it is important in this part of the theory. It argues against the claim that intrinsic attributes determine the corresponding entity.

Example 1: A Honda dealer received 200 new Honda Civic cars, which all have the same attributes. Imagine now that someone has wiped the VIN numbers off all these Honda Civics. Then we get 200 cars whose attributes are all the same. If we apply surrogates in this situation, we get a disaster. If we keep the industry-standard identifiers, then we do not need surrogates. Note that this problem with surrogate keys exists for all industrial products of this type. So we have to say on which basis we give a VIN to each of these cars. We affirm the uniqueness of an entity by using the General Law. Note that when we apply the General Law in this case, the newly introduced identifier of the entity becomes an intrinsic attribute of the corresponding entity.

3. Step

This step is about identification. Change of identity is allowed in some countries.
Note that "anchor modeling" bans changes to the identities of entities.

1. In my model, identification is defined recursively. In my first paper, from 2005, I am able to write all keys in the form of simple (not composite) identifiers. I have identifiers of entities, relationships and states. See my website http://www.dbdesign10.com, sections 1 and 2. Simple Form was introduced in section 4. This form gives the conditions for decomposing data structures into atomic structures, for databases that maintain current states. In this case, the binary structure consists of the simple key and one attribute. Simple Form fully describes the identifiers of entities, i.e. surrogate keys, locally defined keys and internationally defined keys. To my knowledge, this is the first work that fully describes surrogate keys, locally defined keys and internationally defined keys, and the conditions under which they may be designed as simple keys.

2. Section 1 of my 2005 paper contains a passage that I have mentioned several times. It contains several things that are important for my data model. Here is the text:

„We determine the Conceptual Model so that every entity and every relationship has only one attribute, all of whose values are distinct. So this attribute doesn’t have two of the same values. We will call this attribute the Identifier of the state of an entity or relationship. We will denote this attribute by the symbol Ack. All other attributes can have values which are the same for some different members of an entity set or a relationship set. Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity. This identifier has one value for all the states of one entity or relationship...“

Here I would like to analyze the following phrase: "or can provide identification of the entity." Thus it covers anything that can provide identification of an entity, "directly" or "indirectly".
Let's call this rule "IdentificationOfEntity". To highlight the importance of this rule, I will now mention three important examples from practice in which "IdentificationOfEntity" is realized "indirectly".
(i) Surrogate key. Here, using a "surrogate key", we identify the corresponding
attributes that are in the database (i.e. in the memory). Based on these m-attributes, we find the corresponding real-world attributes on the corresponding real-world object.
(ii) Application of the General Leibniz's Law. Here I use that "intrinsic +
extrinsic" properties "can provide identification of the entity".
(iii) I am most interested in this case. In my opinion, human memory does not operate like a db memory, that is, it does not use keys. For example, someone can recall a certain person based on a date: someone notices a date and recalls that a certain person died on that date. Thus, in this example, we identify an entity based on the date (but not by using a key). In my post of 5 May 2015 in this thread, I presented an example showing how to find a person if we know his date of birth. This is achieved through good organization of the data, i.e. by using atomic data structures.

In the above-mentioned paper there is another important sentence: „This identifier has one value for all the states of one entity or relationship...“ I named it „Procedure A“. I wrote about it in my thread „The original version“, in my post of 26 May 2010, where I marked it as (a). This very important part of the design of General databases, which for the first time solves a set of important problems, was plagiarized by the authors of "Anchor Modeling", who named it anchor and immutable key. Now there are names such as "immutable objects", which is an example of not understanding the essence at the level of db design. In this thread I wrote that the term "immutable key" has begun to appear in the theory of object-oriented languages, and that this term is wrong. In my thread "some information about anchor modeling", in my post of 18 July 2012, I wrote that the surrogate key is a weak point in OOP and OODB. I wrote that these problems in OOP and OODB can be solved just by using the mentioned "procedure (a)".

3. In my paper from 2008, "Database design and data model founded on the concept and knowledge constructs", in section 3, I defined an important constraint on the subject, under the following title: Limitation of Interpretation. Our assumption related to real-world objects is that we can recognize or match those objects for which we have perceptual, inferential or rational abilities.
Therefore, I have defined that attributes are identifiers. We can identify attributes by using our capacity in terms of the "Limitation of Interpretation" introduced above. Attributes are determined by formula (3.3.3); see my paper from 2008. So these attributes are determined by the subject's ability to identify them. Formula (3.3.3) provides a link between conceptual thinking and identification. In my opinion it is not enough to work with concepts alone. In addition to formula (3.3.3), general knowledge is also associated with data structures. The knowledge associated with an attribute can be as extensive as the project leader decides. Knowledge in my data model is determined by factual sentences. I defined facts and factual sentences in sections 2 and 3 of my paper „Semantic databases and semantic machines“.

4. All other structures, which are not attributes and which are complex, we construct in the following way:
(i) Each complex structure is constructed from previous simpler structures.
(ii) We build the identifier of a complex structure using the identifiers of the previous structures.

For example: entities are built from attributes by using Leibniz's Law (or the General Law). We construct the identifier of an entity using the attributes of this entity. Note that attributes are atomic identifiers in my data model. Identifiers of relationships are built from the identifiers of the entities participating in the relationships. Identifiers of states are constructed by using identifiers of entities and general knowledge related to the corresponding entity. Thus, for the attributes in my db design, there are two important structures:
a) the associated general knowledge that is related to the corresponding attribute
b) the formula designated (3.3.3)

In addition to the attributes, I also apply formula (3.3.3) to m-entities, m-relationships and m-states. Procedures for the identification of relationships and states are similar. So constructions of complex objects are derived (recursively) from simpler objects. For example, identifiers of states are determined by identifiers of the corresponding entities and by general knowledge related to the corresponding entity. As stated above, the identifiers for attributes are not derived. Attributes are identifiers. They are given (they depend on the subject's abilities).

I have already explained that the identifiers of entities are related to the subject's operations with a memory: how to store an identifier of an m-entity in a memory and how to recall it from the memory. When talking about thoughts, I explained that surrogates are related to a subject and one memory (the memory where surrogates and the corresponding m-attributes are stored). I also explained that industry-standard identifiers can be used to explain how thoughts and semantic content are conveyed between two (or more) subjects, that is to say between two (or more) memories.

4. Step

1. Now I will briefly describe the history of my work.
When I talk about "General databases", I mean that I solved "decomposition" into atomic structures and introduced the theory of states of entities and relationships. In April 2006 I introduced Simple Form, for Simple databases, i.e. databases that maintain the current state. General and Simple Form enable the decomposition of a database's structures into atomic structures.

I submitted my paper "Database design and data model founded on the concept and knowledge constructs" on 21 August 2008 to the „Journal of Computing and Information Technology“ (from Croatia). Croatia is my country of origin. Much time passed after I submitted the paper without any information about whether it had been accepted or rejected. I realized that the paper might not be published even after a year, so I contacted the Editor-in-Chief, Sven Loncaric, and informed him that I would publish the paper on my website and that, in case it was accepted, I would give all the rights to his journal. I quickly received a message from D. Mladenic (from Slovenia), the Associate Editor, that my paper was rejected. I posted my paper on my website and on the user group comp.databases.theory on 7 March 2009.

I was sure that my work is good, so over a few days I carefully examined D. Mladenic's reviews. I was really astonished by them. I found that she does not know elementary things about databases. Then I found that S. Loncaric does not know databases either; his field is "imaging". D. Mladenic is also not a database professional; her specialty is "machine learning". I was asked to correct my work. I refused to do it, because I am sure that my paper is correct. I presented my paper on my website exactly as it was submitted to the Croatian journal. I also put D. Mladenic's reviews on the Web, because this work is important and I spent years on it.
This review can be found in my thread "The original version", in my post of 30 January 2011. Then I accidentally discovered that Anchor Modeling had plagiarized my work. I informed S. Loncaric about it; he did not respond. After a long correspondence between me, S. Loncaric and D. Mladenic, which took one year, I realized that they had in fact blocked the printing of my paper. By the way, there are no world-famous, influential names at the Croatian journal "Journal of Computing and Information Technology". Of well-known names there is only Yuri Gurevich (as far as I know this area). Gurevich is a famous mathematician and has published papers with world-famous mathematicians. Here is the web address of the Croatian journal: http://cit.srce.unizg.hr/index.php/CIT/about/editorialTeam

When I saw that Microsoft had started making software for what I published in 2005, when I saw the "Anchor Modeling" plagiarism of my work, and when I saw that in 2010 Tine Borovnik from the University of Ljubljana analyzed "Anchor Modeling" in his master's thesis (by the way, Dunja Mladenic is from Ljubljana), the question was: what should I do now? On May 26, 2010 I decided to start the thread "The original version", which would present the truth about "Anchor Modeling". Now I see that it was the only chance to save my work to some extent. Otherwise there would be only "Anchor Modeling", and my work would be dead.

In the fall of 2010, within just a few months, the authors of "Anchor Modeling" released a number of new papers. Their main work was published in the journal Data & Knowledge Engineering, editor Peter Chen. In this paper the authors of "Anchor Modeling" corrected the mistakes that I presented in the thread "The original version". This correction was done in such a way that this time they plagiarized my theory of states. Some of the papers they published on September 15, 2010 on their website. All these papers are connected, and the main thing in this work is mapping between data models.

2. The most important part of my design is "decomposition into atomic structures" and the theory of the states of entities and relationships. So my data model design is about states of entities and relationships, not about the entities and relationships themselves. In fact these states are decomposed into atomic structures.

3. In the second part of this paper I introduced databases that store programs. Specifically, these databases keep states of programs. One important idea here is that processes and events can be realized by the execution of a program, or of one state of the program. Of course, we can start a collection of programs, and then, formally speaking, we have a history of the future, implemented from the database that keeps the states of the programs and "knows" how to implement a set of future events. In my post of April 18, 2015 I called this database a "small world", for which the following two important things hold:
(i) the world can maintain its past, present and future.
(ii) the main control part of the world is a collection of information, i.e. data from the appropriate database.

---------------------------------------------------------------------------------------------------
For these databases that keep programs, I will start a new, short thread, with maybe three posts. For this current thread I have one more post. I think it would be useful for this user group if someone started a thread on intellectual property, plagiarism, etc.
------------------------------------------------------------------------------------------------------
Vladimir Odrljin