Re: some information about anchor modeling

From: vldm10 <vldm10_at_yahoo.com>
Date: Mon, 28 Jan 2013 11:54:17 -0800 (PST)
Message-ID: <542bbd62-671c-48de-8106-dfe02d7eb6df_at_googlegroups.com>

K       K  A       K  B       K  C

k1      k1 a1      k1 b1      k1 c3
k2
k3                            k3 c3
k4                            k4 c3
k5
k6

From this post it can be seen that surrogates have problems with nulls at the level of data entry, and that these problems cannot be solved at all.

I showed in this example that no operation can be performed on the data in the relation with the surrogate k3, because the user has no way to find this relation. It is also shown that even if the user sees the relation with k3 (knows exactly which one it is), he cannot identify the corresponding object in the real world. Note that the user has exactly the same problems if he writes the data from the relation with K = k3 on paper. For example, if a user first tries to gather all the data for the relation with K = k3 gradually on paper, he still has the same problems as when entering the data into the DB. So the following question arises: how do we do the elementary thing, that is, how do we collect the data and where do we keep the collected data (with nulls)?
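The lookup problem described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionaries K, KA, KB and KC mirror the example tables, a missing entry stands for a null, and the helper find_surrogates is a hypothetical name, not part of Anchor Modeling or of any library.

```python
# Anchor table K and attribute tables KA, KB, KC from the example above.
# A missing entry stands for a null (no value was ever recorded).
K = ["k1", "k2", "k3", "k4", "k5", "k6"]
KA = {"k1": "a1"}
KB = {"k1": "b1"}
KC = {"k1": "c3", "k3": "c3", "k4": "c3"}


def find_surrogates(known):
    """Return the surrogates whose recorded values match every known attribute.

    `known` maps an attribute name ("A", "B" or "C") to the value the user
    observed on the real-world object. Hypothetical helper for illustration.
    """
    tables = {"A": KA, "B": KB, "C": KC}
    return [
        k for k in K
        if all(tables[name].get(k) == value for name, value in known.items())
    ]


# The user knows only C = c3 about the object that is stored under k3.
candidates = find_surrogates({"C": "c3"})
print(candidates)  # more than one surrogate matches, so k3 cannot be singled out

# Surrogates with no attribute values at all can never be reached by a search.
unreachable = [k for k in K if k not in KA and k not in KB and k not in KC]
print(unreachable)
```

In this sketch the search by C = c3 returns several surrogates, and the surrogates without any attribute values cannot be reached by any search on attributes.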

On the other hand, the identifier given in my solution has many advantages compared to surrogates. However, notice that my identifier also has certain problems with nulls. If I have this key and nulls, then I can solve many of the mentioned problems. With the key given in my solution, one can find the real object from the data, and vice versa. If I have nulls, then I can apply three-valued logic, or I can extract the tuples with nulls and process them with a programming language, etc. It is not possible to apply nulls, surrogates and three-valued logic all together.
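As a rough illustration of the two options mentioned above (applying three-valued logic, or extracting the tuples with nulls for separate processing), here is a minimal Python sketch. It assumes Kleene-style three-valued logic with None standing for "unknown"; the function names and the sample identifiers id-1 and id-3 are made up for the example and do not come from the paper.

```python
# Kleene three-valued logic, with None standing for "unknown" (null).
def and3(p, q):
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True


def or3(p, q):
    if p is True or q is True:
        return True
    if p is None or q is None:
        return None
    return False


def not3(p):
    return None if p is None else not p


def eq3(x, y):
    """Comparison that yields unknown when either operand is null."""
    return None if x is None or y is None else x == y


# Hypothetical tuples keyed by a real-world identifier; "a" is null in one.
rows = [
    {"id": "id-1", "a": "a1", "c": "c3"},
    {"id": "id-3", "a": None, "c": "c3"},
]

# Evaluate the condition  a = 'a1' AND c = 'c3'  for every tuple.
results = {r["id"]: and3(eq3(r["a"], "a1"), eq3(r["c"], "c3")) for r in rows}
print(results)

# Alternatively, extract the tuples that contain nulls for separate handling.
with_nulls = [r["id"] for r in rows if None in r.values()]
print(with_nulls)
```

The tuple with a null evaluates to unknown rather than true or false, and it can also be pulled out by its identifier for later completion.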

Today more than 90% of databases have identifiers that are part of my solution; these are industry-standard identifiers. For entities with these keys it does not make sense to introduce surrogates. So, for over 90% of today's databases, it is nonsense to apply surrogates. This number is an astonishing example of the amount of misunderstanding. I am referring to the wide usage of surrogates in scientific papers related to OOA, RM/T and Anchor Modeling.

On the other hand, the identifiers given in my solution do not have to be industry-standard identifiers. Every company can define its own system of identifiers and identification, based on my solution. This db design gives each company a great advantage and great independence. In this database design, it is essential that these identifiers are placed on the real objects of our business; for example, these identifiers should appear in the documentation, receipts, invoices, etc. In this way, the identification is completely under our control. Of course, there are many variations on this solution.

Imagine now a situation in which everyone uses their own surrogate keys. For example, instead of the ISBN standard for books, every project leader uses his own system of surrogates. It is obvious that such a solution is impossible in real life. It is also obvious that if we use the ISBN identifier, then we do not need the surrogate key at all.

In this thread I pointed to a large group of objects from business applications that cannot be handled with surrogates. This is the example of a Honda dealer who sells Honda cars that all have the same attributes. Here we cannot use surrogates, because they would show the same entities in a database. Therefore, we must introduce the VIN. And, again, it is obvious that if we use the VIN identifier, then we do not need the surrogate key at all.
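The dealer example can be sketched as follows. This is a simplified illustration, not a database design: the Car class, the attribute values and the VIN strings are placeholders invented for the example (they are not real VINs).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Car:
    vin: str    # identifier that is physically present on the real car
    model: str
    color: str
    year: int


# Two cars on the lot with identical descriptive attributes.
lot = [
    Car("VIN-PLACEHOLDER-0001", "Civic", "silver", 2013),
    Car("VIN-PLACEHOLDER-0002", "Civic", "silver", 2013),
]

# Looking only at the descriptive attributes, the two cars collapse into one.
descriptions = {(c.model, c.color, c.year) for c in lot}
print(len(descriptions))

# Keyed by VIN, each row corresponds to exactly one real car, and the VIN
# read from a car on the lot leads straight back to its row.
by_vin = {c.vin: c for c in lot}
print(len(by_vin))
print(by_vin["VIN-PLACEHOLDER-0002"])
```

The point of the sketch is that the VIN exists both on the real car and in the data, so the lookup works in both directions.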

Now, after the above examples, we can raise an important question: is there a good theory of surrogates? Note that such a theory does not exist, and this is the main problem with surrogates.

In my model, the identifier is an intrinsic or extrinsic attribute of an object. It has been designed in accordance with the rules of identification. See my paper "Semantic databases and semantic machines", sections 5.5 and 5.6, at http://www.dbdesign11.com

As regards identification, my model identifies the following: attributes, entities, relationships and states. Attributes and entities are in the real world, but relationships and states are slightly different structures. Note that the interpretations and abstractions of these objects exist in our mind. I call these abstractions m-attributes, m-entities, m-relationships and m-states, and I defined them as abstract objects. I also introduced a definition of abstract objects. In this way I have tried to give a formalization of these objects. Note that we store these abstract objects in a db, using a data model. Identifiers of attributes and entities can be found in the real world and in the database, while identifiers of relationships and states can be found only in the database.

In my model, m-states and m-relationships are complex abstract objects that are constructed from less complex abstract objects. For example, an m-relationship is constructed from m-entities. Identifiers of m-entities provide the "m-relationship - the real world" link. So I have introduced a hierarchy among abstract objects, according to their complexity. For attributes we have innate abilities. We identify entities by the mentioned identifiers; they are set in both the real entity and the m-entity. We identify relationships by the corresponding entities. The states of relationships are identified by the corresponding relationships. See the comment in Example 8 in my paper "Database design and data model founded on concept and knowledge constructs" at http://www.dbdesign11.com

In this paper, I introduced a relation marked with (3.3.3). This is the first time that the concept is defined in an accurate manner. Note that Russell's paradox shows that defining a concept by means of properties leads to a paradox. Formula (3.3.3) allows the definition of a concept based on properties. This formula provides an important link between the relation of satisfying and the identification of members of the extension. In my opinion, formula (3.3.3) solves Russell's paradox. As I have already written, Russell made two mistakes: 1. This is a semantic procedure, but Russell used logic. 2. When we work with concepts, we need to identify the objects to which we apply the concept.

In the design of the concept, I introduced the identifiers of the objects that satisfy the concept. Identification of abstract objects is also introduced. Knowledge is defined in a new way and it is incorporated into the construction of the concept.

My identifier is associated with columns of knowledge, while surrogates and anchors have a strictly unary structure.

I can change the identifier of the entity and maintain the history of these identifiers. See "Semantic databases and semantic machines" section 5.12 at http://www.dbdesign11.com
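One generic way to picture an identifier that changes while its history is kept is sketched below. This is not the construction from section 5.12 of the paper; the Entity class, its methods and the sample identifiers are assumptions made only for illustration, and the sketch assumes identifiers are assigned in chronological order.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Entity:
    name: str
    # Chronological list of (identifier, valid_from) pairs.
    history: list = field(default_factory=list)

    def assign(self, identifier, valid_from):
        """Record a new identifier; earlier ones stay in the history."""
        self.history.append((identifier, valid_from))

    @property
    def current(self):
        """The most recently assigned identifier."""
        return self.history[-1][0]

    def identifier_on(self, day):
        """The identifier that was valid on the given day, or None."""
        valid = [ident for ident, start in self.history if start <= day]
        return valid[-1] if valid else None


e = Entity("customer")
e.assign("A-100", date(2010, 1, 1))   # hypothetical first identifier
e.assign("B-200", date(2012, 6, 1))   # identifier changed later

print(e.current)
print(e.identifier_on(date(2011, 1, 1)))
print(e.identifier_on(date(2009, 1, 1)))
```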

Vladimir Odrljin

Received on Mon Jan 28 2013 - 20:54:17 CET
