Re: some information about anchor modeling

From: vldm10 <vldm10_at_yahoo.com>
Date: Wed, 20 Feb 2013 07:53:56 -0800 (PST)
Message-ID: <dd2dbf21-662c-4113-8af6-8d524dd5778f_at_googlegroups.com>


In my post on February 19, 2013 I made a typo at ExampleA. This is a corrected version.

> I disagree, he did not suggest that he invented said theory (others have suggested that he did, which is incorrect).
  

I do not think that E. Codd invented significant mathematical theories, but I think he's done some things that need to be classified as applied mathematics (relational algebra, functional dependencies). He, along with others began to establish this theory of databases as a mathematical theory. This paved the further development of the theory of databases as science.      

> Which begs the question: what exactly are you defining as surrogates ?
  

  1. Before I give you a definition of the surrogate key, I would like to say that surrogate key is not the main thing here. The main thing here is the decomposition into the binary (atomic) structures. This is crucial not only for database theory but also in some other areas.

So far in this thread, I gave examples showing that the surrogate key does not work. Some of these examples show that the surrogate key can not be used for over 90% of business applications. These examples were more of practical character, that is, they were not theoretical.

Now I would like to present some theoretical problems related to RM and RM/T which are very serious. I will try to present it as much as it is possible on the user group. First, E. Codd did this decomposition of relations into binary relations, without any prove. Let me now give you examples which are of the theoretical character and which show that RM cannot solve some serious problems. These problems form a large field in the theory of databases. Example A:

State1.  Person X from a company that supplies consumers with electricity reads a device for measuring the energy consumption for a person Y. Person X writes down on paper that, on May 1st, 2011, person Y fully spent 128 units of energy. 
State2. Person X submits this list to the IT department on June 1st, 2011, a month later. 
State3. The person from the IT department has put this list in a drawer and forgotten about it. After two months, he was reminded of this list and promptly handed it over to the person doing data entry. 
State4. Thus, the data was entered into the database on September 1st, 2011. 
State5. Then the IT department filed a lawsuit against a person Y, because this person has not paid the electricity consumed. 
State6. However, it turns out person Y passed away on May 15th, 2011. 
State7. During the trial it is determined that the person who entered the data, made a mistake and entered 728 instead of 128 (as the amount of energy consumed). 
The son of person Y is a good lawyer represents in court his late father.   

This kind of problems cannot be solved using the RM / T. It cannot be solved using the RM (or others db models). Of course, for this kind of problems, the decomposition into binary relationships with surrogates, what is demonstrated by Codd, does not work at all.

This set of problems is solved for the first time in 2005, and this is my solution, which is presented in this user group in 2005.

If you want to see how to do the decomposition into binary structures for these cases, then please see my post from February 13, 2013, see second case. In my post which is from February 13, 2012. I explained the problems that General databases can solve, while Simple databases cannot solve these problems. I use the term "General database theory" (General databases as shortcut) for the group of database’s fields that I roughly presented in my post from 13 February, 2013 in this thread.

Keep in mind that "General databases" does not have Delete and Update operations. So, there are no Delete anomalies or Update anomalies. The purpose of General database is to keep all the data that are entered into it. Therefore there are no Insert anomalies.

I would like to emphasize that the paper "RM/T" has so many practical and theoretical errors that it is scandalous that the paper was published.   

> So let me take the definition of surrogate from todays wiki entry (gratefully "surrogate" and not "surrogate key"). And let me assume that your definition of surrogate is the same.

I take the definition of surrogate from RM/T. This is the only official definitions of Codd about the surrogate key. By the way, today I visited the site which is related to the surrogate key on Wikipedia. It is really sad what some people write.   

> ---------------
>
> dbdesign10
>
> ---------------
>
>
>
> 5.1. I would like to be able to say, at this point that my [4] is the same as your "DbDesign 10 Knowledge Data Model", at least in the sense that [4] is an implementation of dbdesign10, and dbdesign10 is a generic or template definition (not an implementation). But I can't say that yet, because:
>
> • the one big difference that stands out (in my reading thus far) is that I totally accept RKs, and RKs are compound keys, that AFAIC cannot be decomposed. Whereas, your "Keys" do not allow compound keys.
>
> • on the face of it your "Keys" are surrogates, but since you decry surrogates, I am sure you are trying to convey something else, that I have not absorbed yet.
>
> ••• CarId is the Car Key. CarKey is not a Key, it is a surrogate, and the column is therefore incorrectly and named, and leads to confusion.
>
> • (I think dbdesign10 needs to be elevated in terms of specific statements and clarity, because it takes undue effort to understand it, but let's not get into that here)

In this example, I have two identifiers, both belonging to one database structure. The first identifier identifies a real object car; while the second identifier identifies an abstract object that is a state of this car. Everything else in this relation represents the knowledge about one attribute. Once again, notice that the state of an object is an abstract object. For example, you can not touch the state, while the object car you can. A state of an entity, I have defined as any knowledge about that entity, which has some subject. As I said, these two identifiers are linked and are located in one database structure, which I call the state of an entity. The identifier of the state identifies the state of the entity, but can not identify the entity. The identifier of the entity identifies the entity, but can not identify a state of the entity. However, the state structure connects these two identifiers. This is my solution, which connects the identification of an abstract object with the identification of the real object, and vice verse.

Obviously, my solution is not a surrogate.   

> For the record, Celko is an idiot, and Cressey is an even bigger idiot, that even Celko can destroy.
  

Joe and David are IT professionals. I have also in the past in some way insulted these guys, but now I know, it was my mistake.

I've already explained to you, the industry-standard identifiers are the same type as my identifiers. More than 90% of all business applications use these identifiers. You and I are discussing at the level of "db designer - database". Joe talks about “the industry-standard, externally verifiable keys with known validation rules”. It is not possible externally check the surrogates. The question is how a police officer can check a license, using the "RM / T"? In order that an officer might check the passport of a traveler, how should be implemented "RM / T"? In order to use ATMs, how should be implemented "RM / T"? There is another important question here. How the logical and semantic content is conveyed between a db and user. Using surrogates this is not possible. In RM/T this is possible only between db designer and a database.

Since Anchor Modeling and RM /T use surrogates, it is clear that the damage is global. I want to show you the extent of the damage, which occurs with implementation of the RM / T into industry and science.

> I think you are honest, you have put (a) and (b) squarely. But you damage its credibility because your (c) is laughable. Users do not walk up to databases (let alone highly normalised ones, at the cutting edge, which are not common), and make changes to single RM/T rows. No. First, they are isolated from the low level implementation of the database (if anything, they will see Views [2.4] ); second, there will be various constraints in place that prevent incorrect updates; third, whatever they are attempting will be encapsulated in a transactions (that convert the logical business action into a series of single-row updates, all of which together, constitute an Atomic change to the db).
>
> So, no, the user will not be doing any such thing, and as we agree (I think) the surrogate is not visible to them anyway, so they will not be looking for k4 that they cannot see. The user will execute the relevant transaction, that looks up C=c3, and finding that 3 identifiers exist, it will fail. Or else the user will look up c3 on a search window first, find that 3 logical 5NF rows exist for it; choose one via proper Keys (let's assume column A is the Key), then execute the relevant transaction using Key A{value}, which will succeed or fail. Since column A is not given in your example, there is not enough detail to suggest either success or failure.
>

Up in this thread, I explained that I took this example because it is similar to the example from the book of C. Date. I was surprised with the db design in Date's book. In the example I wanted to show the following:

  1. RM / T can not work with Nulls at all.
  2. If someone is working with Nulls then he should be aware that does not help Codd's recommended three-value logic at all.
  3. The surrogate key should be never displayed.
  4. It is shown that the binary relation with Nulls looks really bad and it would be best to forbid the work with Nulls, especially because the user cannot see the primary key (that is surrogate key) of the binary relation.
  5. Where and how to keep the uncompleted data?
  6. If one instead of surrogate key uses VIN number (that is industry-standard key), then he can do it fairly.

Vladimir Odrljin Received on Wed Feb 20 2013 - 16:53:56 CET

Original text of this message