Re: Atomic Structures
Date: Fri, 15 Jan 2016 14:52:31 -0800 (PST)
Message-ID: <aa107115-d269-4268-831f-369a4355cfaa_at_googlegroups.com>
In my previous post in this thread, dated 6 January 2016, I presented an
example which proves that the changes to an entity can be precisely
determined only by applying atomic structures. This example shows that
the atomic structures are necessary.
In this example I explained that "6NF" is just a name, which replaces the
term "the atomic structures". So "6NF" is nothing new and has no theoretical
significance.
In this post some themes are repeated, but with new aspects added. For example, some problems are analyzed in terms of type theory, and the difference between the data model and the corresponding (relational) algebra is discussed.
Now I will demonstrate some other important applications of these atomic structures.
- The use of the atomic structures for the construction of schema mappings.
Primarily this is a mapping between schemas of base data structures, for
example base relations in RM. By the term "data model" I mean, roughly
speaking, the "base data structures". There is another kind of data, namely
the data derived from these base data, for example views, queries and,
generally, any data obtained by applying the relational algebra of RM.
There are many papers on this topic. In this post I cite a couple of papers
from two groups: the first group consists of papers by R. Fagin (IBM), and
the other group is represented by the paper of Alagic & Bernstein
(Microsoft).
See, for example, among others, the following two papers by Ronald Fagin (IBM):
Towards a Theory of Schema-Mapping Optimization
Inverting Schema Mappings
and the paper
Mapping XSD to OO Schemas by Suad Alagic and Philip A. Bernstein at http://research.microsoft.com/pubs/76534/alabernmsr-tr-2008-183.pdf
When we have the atomic structures, then schema-mapping between the schemas and the corresponding schema instances is easy - then we have a mapping between atoms. Similarly we do the inverse mapping.
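The idea of a mapping between atoms can be illustrated with a small Python sketch. It is only an illustration under my own assumptions: the atom names (CustName, ClientName, and so on), the sample data and the helper functions map_schema and invert are hypothetical and do not come from the cited papers. Each atomic structure is modelled as a binary structure from a key to one attribute value.

```python
# Hypothetical atomic (binary) structures: each maps a key to one attribute value.
source = {
    "CustName": {1: "Ann", 2: "Bob"},
    "CustCity": {1: "Oslo", 2: "Rome"},
}

# Assumed schema mapping: source atom name -> target atom name.
mapping = {"CustName": "ClientName", "CustCity": "ClientTown"}

def map_schema(instance, atom_map):
    """Rename each atomic structure; the (key, value) pairs are carried over unchanged."""
    return {atom_map[name]: dict(pairs) for name, pairs in instance.items()}

def invert(atom_map):
    """Inverse mapping; it exists only if no two source atoms share a target."""
    inverse = {target: src for src, target in atom_map.items()}
    if len(inverse) != len(atom_map):
        raise ValueError("mapping is not one-to-one, so it has no inverse")
    return inverse

target = map_schema(source, mapping)
round_trip = map_schema(target, invert(mapping))
```

In this sketch the instance mapping follows directly from the atom-to-atom mapping, and applying the inverse mapping to the target instance gives back the source instance.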
--- Schema mapping between schemas is not done in RM/T, nor in 6NF. In this area RM/T and 6NF cannot help you at all.
Note that E. Codd used the term "entity" intensively. He defines the entity type. But Codd did not define two very important things:
- what an entity is
- the mapping between an ER model and RM.

2. At this point I will write about some large areas of database theory which have not been addressed in RM/T and 6NF. Atomic structures, however, play an important role in solving the problems of these areas.
(a)
RM/T contains no theory about maintaining "history". E. Codd did not even mention "history" in his RM/T paper. The authors of 6NF do not distinguish "databases that maintain history" from "temporal databases". For example, the authors of 6NF replaced the temporal intervals with the "historical relvar". They do not understand that temporal intervals are not "history". This is a serious confusion.
(b)
RM/T and 6NF say nothing about history in databases that are supported on-line
(databases that are accessed over the Internet). Note that in these
databases the data must be current. For example, internet banking must have up-to-date data. These databases must also have solutions to prevent crime. They must keep a very precise history of events and offer a wide range of additional solutions to many problems.
In their book "Temporal Data and the Relational Model", in section 10.3, which is about "Historical relvar only", the authors of 6NF wrote: "Now we turn our attention to historical or fully temporal relvar SSSC_DURING." Of course, temporal databases are not "history" at all. I will mention that the authors of 6NF allow the operations "delete" and "update".
==================================================================
If someone uses the "delete" and "update" operations, then there is no history of data, nor a history of events. In that case what corresponds to such a "history" is chaos.
==================================================================
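The effect of "delete" and "update" on history can be sketched in Python. This is only an illustration under my own assumptions, not a construction taken from the post or from the book: the two stores, the account number 17 and the function names are hypothetical. One store is overwritten in place, the other only appends events. The events carry a sequence number rather than a time, so the history is kept without any temporal interval.

```python
# Two hypothetical stores for the same fact (the balance of account 17).
overwritten = {}   # store that allows "update" and "delete"
appended = []      # store that only ever appends events

def update_in_place(key, value):
    overwritten[key] = value          # the previous value is lost

def delete_in_place(key):
    overwritten.pop(key, None)        # no trace remains that the key ever existed

def append_event(key, value, deleted=False):
    # "seq" is a plain sequence number, not a timestamp
    appended.append({"seq": len(appended) + 1, "key": key,
                     "value": value, "deleted": deleted})

def current_state(events):
    """Replay the events in order to obtain the currently valid data."""
    state = {}
    for e in events:
        if e["deleted"]:
            state.pop(e["key"], None)
        else:
            state[e["key"]] = e["value"]
    return state

for value in (100, 250):
    update_in_place(17, value)
    append_event(17, value)
delete_in_place(17)
append_event(17, None, deleted=True)
```

After these operations the overwritten store is empty and nothing can be said about account 17, while the appended store still holds all three events, and the current data can be recomputed from them at any point.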
(c)
RM/T and 6NF offer no theory that allows us to solve problems related to erroneous data in databases. Note that the atomic structures indicate precisely which data are erroneous and who entered them. (I have already written about this problem.)

3. In RM/T and 6NF there is no theory of the atomic structures. Such a theory should provide the following:
(i) the construction of atomic structures
(ii) a theory of atomic propositions, atomic predicates, atomic concepts,
atomic extensions of a concept (that is, atomic sets), atomic relations, atomic sentences and atomic facts. We also need a theory for databases that maintain current, past and future states.
(iii) a (relational) algebra for the atomic structures.
Note that my atomic structures are constructed so that they are associated with states. As I have already mentioned, both Codd and Date & Darwen presented only what the atomic structures should look like. But we all know what the atomic structures must look like.

4. There are some misunderstandings on the part of the authors of RM/T and 6NF that are related to atomic structures.
(a)
In his book "Database Design and Relational Theory: Normal Forms and All That Jazz", page 141, C. Date wrote the following:
(start quote)
In our book Temporal Data and the Relational Model (Morgan Kaufmann, 2003), Hugh Darwen, Nikos Lorentzos, and I define:
a. Generalized versions of the projection and join operators, and hence
b. A generalized form of join dependency, and hence
c. A new normal form, which we call 6NF.
As the title of that book might suggest, these developments turn out to be particularly important in connection with temporal data, and they're discussed in detail in that book.
(end quote)
=========================================================
Let us consider the following atomic structure: {Key, Attribute, Operator}
=========================================================
Here Operator represents the person (or procedure) that enters the data. If data is entered by a person, then we require that person to enter his password. In this way, without this person's knowledge, the system enters in "Operator" the name of the person who enters the data. In other words, for every piece of information we know who entered it into the database.
==================================================================
In this example we see that the above atomic structure is not related to "temporal data". There is no "time" in this atomic structure. But it is related to the history of events. Sometimes we want to know who entered certain information into a database; we do not need the time when the data was entered. This example shows that "history" in a database is not necessarily "temporal data". Obviously, the authors of 6NF did not notice this. As another example, we can record in the database the number of the station from which the information was entered. A station number is not temporal data, but it is a kind of history of the database.
===================================================================
Note that atomic structures have a much more important role than just holding temporal data.
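The structure {Key, Attribute, Operator} can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the operator registry, the operator name "vera", the password and the function names are hypothetical, and a plain SHA-256 digest stands in for a proper salted password hash. The point is only that the Operator component is filled in by the system after the password check, and that no time is stored.

```python
import hashlib

# Hypothetical operator registry: name -> SHA-256 digest of the password.
# A real system would use a salted, slow password hash; this is only a sketch.
OPERATORS = {"vera": hashlib.sha256(b"s3cret").hexdigest()}

facts = []  # each row is one atomic structure (key, attribute, operator)

def authenticate(name, password):
    digest = hashlib.sha256(password.encode()).hexdigest()
    if OPERATORS.get(name) != digest:
        raise PermissionError("unknown operator or wrong password")
    return name

def enter(key, attribute, name, password):
    """Store the fact; the operator is recorded by the system after the password check."""
    operator = authenticate(name, password)
    facts.append((key, attribute, operator))

def who_entered(key, attribute):
    """Answer a history question without using any time: who entered this fact?"""
    return [op for k, a, op in facts if (k, a) == (key, attribute)]

enter(1, "Oslo", "vera", "s3cret")
```

A station number could be recorded in the same way, as a further non-temporal component next to Operator.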
(b)
In the mentioned book C. Date wrote the following:
==============================================================
(start quote)
- A single "anchor" relvar for the pertinent some particular primary key together with
- Zero or more subsidiary relvars giving further information about entities of that type, each having a foreign key that refers back to the primary key of that anchor relvar.
(Does this state of affairs remind you of the RM/T discipline discussed in
Chapter 15?)
(end quote), see page 211.
===============================================================
I will mention only three important differences between Anchor Modeling (AM) and RM/T:
(i) RM/T has an invisible surrogate key, while Anchor Modeling has a
visible surrogate key.
(ii) Edgar Codd did not mention history, while AM has "historized
structures". By the way, I have explained in this user group that the "historized structures" in AM are a plagiarism of my solution.
(iii) Anchor Modeling is based on 6NF and uses 6NF, as the authors of
Anchor Modeling wrote in the title of their paper. So AM constructs the atomic structures by "using" 6NF. They do not use RM/T. By the way, in this user group I have demonstrated that 6NF is just another name for the atomic structures, and that it does not provide a solution or algorithm that determines the construction of atomic structures.
So in conclusion I can say that AM is very different from RM/T in important matters, not to mention the other differences between RM/T and AM. In my opinion, the above statement by C. Date is an unsuccessful attempt to repair RM/T.
Now I would like to discuss types a little more. In the mentioned book C. Date wrote the following:
==================================================================
"Let E be an "entity type" and let ID be a data type such that every entity of type E has exactly one primary identifier (my term, not Codd's) of type ID. For example, E and ID might be the entity type "suppliers" and the data type "character string" respectively."
==================================================================
Note that the entity type and ID (i.e. the data type ID) are handled in this way in AM, but not in RM/T. Here Date talks about two kinds of types and a correspondence between them: a correspondence between the entity type and the data type. But as I understand RM/T, it seems to me that Codd talks about entity type and domain. More precisely, Codd writes about the E-domain, which is "the source of all surrogates." Codd also writes in RM/T: "Surrogates behave as if each entity (regardless of type) has its own permanent surrogate, unique within the entire database."
==================================================================
My emphasis here is on the following words of Codd: "... within entire database." Note that a key in RM is defined within the corresponding relation.
==================================================================
However, a key in RM takes its values from the attributes of the corresponding relation, whereas an RM/T surrogate is valid across the entire database. I want to say that C. Date has slightly changed some things. Note that OO also has an identifier that is valid across the entire database, and that this is a source of problems with OO databases.
As for the theory of types, it is important to understand that RM is based on Frege's theory. In Frege's theory the entity is defined. Neither RM nor RM/T uses Frege's definition of an entity. On the other hand, I have several times quoted Godel's definition of entities from the theory of simple types, and this definition is closer to the approach described by Date. It seems to me that Codd and Date have not noticed this. Although B. Russell introduced the theory of types in order to solve the paradox related to Frege's concepts, today some well-known mathematicians accept that G. Frege is the originator of the theory of types, which stands alongside Frege's semantics and logic.
Finally, regarding types, I would recommend the work of Suad Alagic and Philip A. Bernstein (I gave the web address of their paper above), which discusses the problem of mismatch between (data) type systems during schema mapping between two data models (they work with XML and OO). I think this should be linked to the above problem, namely that there is no theory of schema mapping between ER and (RM/T and RM-6NF).
Note that the authors of 6NF did not say anything about keys. Are they simple or complex?
(c)
Here I will mention the following paper by C. Date: Codd's First Relational Papers: A Critical Analysis. (You can find this paper on the website of The Third Manifesto.) According to C. Date, the best definition of the relational model was given by E. Codd in his RM/T paper. This claim seems a bit unusual, because in my opinion the RM/T model cannot be applied anywhere. Roughly speaking, this definition is as follows:
================================================================
"The relational model consists of (1) a collection of time-varying tabular relations, (2) the entity and referential integrity rules, and (3) the relational algebra." (see page 9)
================================================================
In my post of 25 September 2015 in the thread "Tarski school influence on Database Theory" I wrote the following about the data model:
--------------------------------------------------
In my opinion there is a big difference between relational algebra and the relational model. I'll try to explain this difference. As I presented it in my discussion with Jan, the most important models in human activities are mathematical models. Roughly, this is about the following triplet:
Man ------- Model ------ Real world (real objects)
For example, in architecture an architect (man) makes the model (an architectural drawing of a building), and on the basis of this plan we build the real building. So, according to the above-mentioned triplet, we have:
Architect ------- Plan ------ Building
In other words, we have an architect with his solutions, ideas and thoughts. Then we have a drawing, that is, an architectural plan developed in mathematical notation. It is a model. And finally we have a real object in the real world, that is, the corresponding building. In the architectural drawing (model) we have simple mathematics; mostly we apply geometry. In the relational model we have very complex mathematics.
The relational model was completely worked out by Gottlob Frege, 120 years ago. (You can see Frege's "relations" in my post of September 24, 2013, in the thread "Sensible and NonsenSQL Aspects of the NoSQL Hoopla".) Relational algebra, as opposed to the relational model, deals mainly with derived data (various queries, views, reports, etc.).
--------------------------------------------------
In the mentioned post my emphasis is on the data model rather than on the relational algebra. I'll give important examples where we use the data model, that is, the base relations. Then I'll give an example where relational algebra is used, that is, where we work with derived data. In the data model we work with the following:
(i) Schema mapping: here we work at the level of data models.
(ii) At the level of the data model in the relational model, the order of attributes
is not important. Suppose we have the following binary (atomic) relations {K, A1}, {K, A2}, ..., {K, An}, and say that we need to recompose them and decompose them again. Then at the data model level we can write {K, A1} = {A1, K}, or we can write {K, A1, A2} = {A2, K, A1} = {A1, A2, K}, etc. Note that the atomic (binary) relations should be:
(i) joinable
(ii) nonloss joinable
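The decomposition into binary relations {K, A1}, {K, A2} and their nonloss recomposition can be checked with a small Python sketch. The relation R, its sample values and the helper functions are hypothetical and serve only as an illustration. Tuples are represented as dicts and compared as sets of (attribute, value) pairs, so the order of attributes plays no role.

```python
# A hypothetical relation with key K and attributes A1, A2; each tuple is a dict,
# so attribute order plays no role.
R = [
    {"K": 1, "A1": "x", "A2": 10},
    {"K": 2, "A1": "y", "A2": 20},
]

def as_set(rel):
    """Order-free view: a set of tuples, each tuple a frozenset of (attribute, value) pairs."""
    return {frozenset(t.items()) for t in rel}

def decompose(rel, key, attrs):
    """Project the relation onto {key, A} for every non-key attribute A."""
    return {a: [{key: t[key], a: t[a]} for t in rel] for a in attrs}

def natural_join(left, right, key):
    """Join two relations on their common key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

atoms = decompose(R, "K", ["A1", "A2"])
recomposed = natural_join(atoms["A1"], atoms["A2"], "K")
other_order = natural_join(atoms["A2"], atoms["A1"], "K")
```

Because K is a key of R, joining the atoms in either order gives back exactly the tuples of R, with nothing lost and nothing spurious added; without that key property the join could produce extra tuples.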
If we are talking not about the data model but about relational algebra, then we must build a relational algebra for the atomic relations. That is why I am surprised by Date's assertion that the best definition of the relational model is the one that Codd gave in RM/T. First, I think that RM/T cannot be applied anywhere. Secondly, I think that a relational algebra over atomic relations is very difficult to construct. Note that Codd did nothing concrete towards the construction of a relational algebra on the atomic structures in RM/T.

Vladimir Odrljin