Re: Migrating from Hierarchial DB model to RDBMS OR ?

From: Derek Ignatius Asirvadem <derek.asirvadem_at_gmail.com>
Date: Sat, 11 Jun 2016 00:24:51 -0700 (PDT)
Message-ID: <0d918a38-a48f-486e-94cf-49dc0e616421_at_googlegroups.com>


James

> On Thursday, 19 May 2016 10:16:32 UTC+10, James K. Lowden wrote:
>
> No general transformation is possible. Any data can be modelled
> relationally. That they have some hierarchical representation just
> provides a starting point.

I have already stated, with reasoning, that general transformation is stupid, when no transformation is necessary. Of course, if one doesn’t understand that, one has a problem, and one needs a solution, where others have neither the problem nor the need for the solution.

> As it happens I've had cause lately to investigate IMS. c.d.t readers
> might not know how little it resembles a SQL DBMS:

You do not seem to understand that you cannot learn a subject by reading the product manuals of a platform that supports the subject. Eg. one cannot learn or understand Relational Theory or Relational Database requirements from SQL manuals, even though SQL implements the Relational Model.

You might have more success, and certainly eliminate incorrect understandings and wasted time, if you learn from someone who knows the subject. Hence the value of education, why we pay for it. If you examine the diagrams I did for you last year, which give a correct and complete definition of the structure and Access Paths, you might understand the subject better.

The context for Giri is that he is replacing a *legacy* IMS database with a hopefully Relational database. No one is actually implementing new systems in IMS these days.

Given that IMS is 1960’s technology, and an implementation of the Hierarchical paradigm, it is anachronistic to compare it with the 1970’s Relational Model, and the 1980’s platforms that provide it. So the starting premise is false. But typical of Modernists, who think only in terms of their fragments, and look for database design or full Predicate articulation in the RM.

> Any comparison [of IMS] to the relational model is purely specious.

Correct.

And anachronistic. Like comparing motor vehicles of today with ox carts.

And therefore stupid. Typical of people who are used to erecting Straw Man arguments. Yes, when viewed from the perspective of today’s motor vehicles, ox carts in the time when motor vehicles had not been invented, are decidedly “lacking”. Oh dear. Ox carts are bad because they don’t have a steering wheel. Oooh. Oooh. But they did have bull5y!t collectors, which are badly needed these days.

But you do it anyway …

> It is not a model.

As of our interaction last year, one point that I did agree with, is that it is not a model such as the Relational Model, which has a formal theoretical basis, an RA, etc. But it does have a Normalisation basis (Hierarchical) and it requires an understanding of that, in order to implement a successful database. Henceforth I refrain from using the common term *Hierarchical Model*, because it isn’t a model, and use the label *Hierarchical Paradigm*, to reflect the design requirement.

> It lacks a basic datatype (analogous to relations),

False. It has complete datatype enforcement, and record format (field) enforcement.

(Datatypes are not restricted to relations. Read the RM. Codd uses the term *Domain* no less than 83 times. Some of those instances are for datatypes, and yes, others are for relations, and there are yet others.)

> [It lacks] an algebra,

False. A bit silly to expect it, given that you know it is not a model, etc. Therefore it is not a lack. It just doesn’t have it when compared anachronistically, as you are doing.

> [It lacks] constraint enforcement.

False. You contradict yourself:
> There is a small amount of what might be seen as foreign key enforcement.

Given that the notion of constraint is in the RM, which was invented two decades after the HP, it is silly to expect constraint enforcement. Given that it is a database platform, and not a filing system, it is reasonable to expect the then DBMS equivalent of constraints. IMS does have that.

The “small amount” is false. It does the full amount that can be done in its paradigm, which is Referential Integrity to the parent Key.

Further, because the RI is to the Key, and not to a Record ID or surrogate, it is Relational Integrity, even though that concept was not invented for another two decades. Something which your “databases” of today, as proved in the Hierarchical thread, and all the “databases” implemented using the books written by post-Codd authors, do not have.

Only the HP and genuine Relational Databases (ie. implemented using the genuine Relational Model, and not the writings of the post-Codd filth) have Relational integrity.

> As it happens I've had cause lately to investigate IMS. c.d.t readers
> might not know how little it resembles a SQL DBMS:
>
> 1. IMS reserves a "segment" as a key-value store. The value need not
> be, unless it is a searchable term. The key is known to
> IMS as a single field, however the application may interpret it.

Yes. That was common, standard practice in the 1960’s and 70’s. The idea is, the Key is Atomic within the platform, but may not be Atomic outside.

Something you seem to have missed: the Key lower down in the Data Hierarchy is such a Key. What you are calling the “key-value” is in fact a child Key, made up of the parent Key plus at least one additional column (field). Exactly as stated in Codd’s RM, the Normal Form section.

> 2. The "key" need not be unique.

That is a stupid statement. Possibly from trying to understand the HP from the product manuals, otherwise from your own capability. Refer below.

> A search returns the first of N. To
> find the nth element (as with Berkeley DB) one iterates n times.

Yes. As described in my posts to you last year in the *Hierarchical Model and its Relevance in the Relational Model* thread. That is the variable segment. That supports the child records belonging to a parent Key. The child records are 1::0-to-many with the parent Key. As such, it is stupid to expect uniqueness of the parent Key, which is repeated. Same as a child relation in Relational SQL, the parent Key which is repeated, is “non-unique” in the child. Doh. In Relational SQL, the child Key is a larger Key (made up from the Parent), and it is unique. In IMS, the child Key does not need to exist as such, the segment needs only to be a repeating block of records belonging to the parent key, and the integrity is enforced.
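To make the parent Key / child Key point concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table and column names (customer, customer_order, customer_code, order_no) are hypothetical, chosen only for illustration, and SQLite stands in for any SQL platform. It shows the parent Key repeating in the child, the larger child Key being unique, and a child with no parent being rejected.

```python
import sqlite3


def build_demo():
    """Create a parent and a child table in an in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    con.execute("PRAGMA foreign_keys = ON")
    con.execute("CREATE TABLE customer (customer_code TEXT PRIMARY KEY)")
    # The child Key is the parent Key plus one additional column.
    con.execute(
        """
        CREATE TABLE customer_order (
            customer_code TEXT    NOT NULL REFERENCES customer (customer_code),
            order_no      INTEGER NOT NULL,
            PRIMARY KEY (customer_code, order_no)
        )
        """
    )
    con.execute("INSERT INTO customer VALUES ('ACME')")
    con.executemany(
        "INSERT INTO customer_order VALUES (?, ?)",
        [("ACME", 1), ("ACME", 2), ("ACME", 3)],
    )
    return con


def try_insert(con, customer_code, order_no):
    """Return True if the child row is accepted, False if a constraint rejects it."""
    try:
        con.execute(
            "INSERT INTO customer_order VALUES (?, ?)", (customer_code, order_no)
        )
        return True
    except sqlite3.IntegrityError:
        return False


con = build_demo()

# The parent Key value repeats across the child rows.
parent_key_repeats = con.execute(
    "SELECT COUNT(*) FROM customer_order WHERE customer_code = 'ACME'"
).fetchone()[0]

# The full child Key must be unique, and the parent must exist.
duplicate_child_key = try_insert(con, "ACME", 2)
orphan_child = try_insert(con, "NOBODY", 1)
```

Here the reference is to the parent's Key itself, not to a record ID or surrogate, which is the distinction being drawn in this post.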

> 3. A segment need not have a single definition. Different elements may
> have different interpretations, depending on a discriminator field
> (that may or may not be defined to IMS).

Yes. In later editions, IMS supports Subtypes.

Just like any historical feature, it allows backward compatibility, ie. it allowed informal subtypes that were implemented before the IMS Subtype feature was implemented. Obviously, if one were to implement in IMS /after/ the Subtype edition, one should use that feature, and then Subtypes would be (a) formal, and (b) enforced. No s/w can enforce what it is unaware of.   

> 4. Fields may repeat, and be of varying lengths, and may be defined as
> nested within other fields. Every violation of normal forms you can
> imagine is fully supported.

Yes. You seem to be unaware that SQL, and every other language, allows that as well. It is not a special failing of IMS. s/w cannot do any thinking for you, you have to think for yourself, and implement what you think, and the s/w will enforce that, and only that.

But still, the claim is rich in hypocrisy, given the evidence that you do not know, and do not implement, the Normal Forms and Relational Integrity that is defined in the RM.

> The query language offers no generalized support for these structures.

What “query language” ??? IMS has only read/write verbs. All query languages were developed from Codd’s definition of a Data Sublanguage, which was written in 1970. System R SEQUEL, and then SQL. Something that is based on a mathematical definition (query languages) can be decomposed; normalised; optimised; etc.

Something as primitive as Read/Write verbs cannot be decomposed; normalised; optimised; etc.  

> > I would like to mention that this post is one of the most serious
> > issues and problems in DB theory.
>
> I doubt it. You say serious; I say chimera.

That is simply ignorant. Academics should be careful to postulate within their little sphere of knowledge and not outside it. The fact is, there are tens of thousands of instances of IMS; IDMS; RDB; TOTAL; etc, that are running on current (high-end) SQL platforms. All that is required is a simple layer that transforms IMS; IDMS; RDB; TOTAL, commands into SQL commands. We have had those since the 1980’s. I have written a couple myself.
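As a rough illustration of what such a layer does, here is a minimal Python sketch. It is not any of the layers referred to above. The Segment class, the translate function and the table and column names are all hypothetical. GU, GNP, ISRT and DLET are DL/I call names, but the mapping shown is a deliberate simplification: a real layer must also maintain position between calls, carry the data fields, and handle the hold/replace calls, none of which is attempted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical description of one segment type and its SQL table."""
    table: str
    parent_key: tuple  # columns inherited from the parent segment
    own_key: tuple     # additional column(s) that complete the child Key

    @property
    def full_key(self):
        return self.parent_key + self.own_key


def _where(columns):
    """Build a parameterised equality predicate over the given columns."""
    return " AND ".join(f"{c} = ?" for c in columns)


def translate(call, seg):
    """Map a DL/I-style call name to a parameterised SQL statement (sketch only)."""
    if call == "GU":    # get unique: one segment, located by its full Key
        return f"SELECT * FROM {seg.table} WHERE {_where(seg.full_key)}"
    if call == "GNP":   # get next within parent: the children of one parent, in Key order
        return (
            f"SELECT * FROM {seg.table} WHERE {_where(seg.parent_key)} "
            f"ORDER BY {', '.join(seg.own_key)}"
        )
    if call == "ISRT":  # insert: only the Key columns are shown in this sketch
        cols = ", ".join(seg.full_key)
        marks = ", ".join("?" for _ in seg.full_key)
        return f"INSERT INTO {seg.table} ({cols}) VALUES ({marks})"
    if call == "DLET":  # delete: located by the full Key
        return f"DELETE FROM {seg.table} WHERE {_where(seg.full_key)}"
    raise ValueError(f"unsupported call: {call}")


ORDER = Segment("customer_order", ("customer_code",), ("order_no",))
gu_sql = translate("GU", ORDER)
gnp_sql = translate("GNP", ORDER)
```

The sequential "iterate n times" access of the original platform becomes, in this sketch, an ordered cursor over the child rows of one parent Key.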

Further, there is something you do seem to understand, at least some of the time: that the RM is not just a database definition or a platform definition, that it is a [far superior] way of perceiving and modelling data, full stop. If you stand in that understanding, which is correct, then you should be able to imagine (a) the transformations that are required between platforms (ie. one that implements the RM vs others that implement some other Paradigm), and (b) the nullity of transformations in the event that the RM was used first, as the principal model, or principle.

Eg. last year we discussed, and I provided examples, of a complete piece of s/w (awk) that uses a database (awk arrays) that is defined in pure RM terms, with full Relational Keys, FDs and Relational Integrity, that serves up the data any which way you like (the Codd principle and goal, realised). No platform required.
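For readers who did not see that thread, the general idea can be sketched as follows. This is not the awk code from last year. It is a small Python analogue under stated assumptions, with plain dictionaries standing in for awk arrays, and with hypothetical names (customers, orders, add_order, and so on). The dictionaries are keyed by the full Key, the child Key contains the parent Key, and the insert routines enforce uniqueness and the existence of the parent.

```python
# Each "table" is a dictionary keyed by its full Key, held as a tuple.
customers = {}  # key: (customer_code,)            value: name
orders = {}     # key: (customer_code, order_no)   value: amount


def add_customer(code, name):
    """Insert a parent row; the Key must be unique."""
    if (code,) in customers:
        raise ValueError(f"duplicate key {(code,)}")
    customers[(code,)] = name


def add_order(code, order_no, amount):
    """Insert a child row; the parent must exist and the child Key must be unique."""
    if (code,) not in customers:
        raise LookupError(f"no parent for {(code, order_no)}")
    if (code, order_no) in orders:
        raise ValueError(f"duplicate key {(code, order_no)}")
    orders[(code, order_no)] = amount


def orders_of(code):
    """Navigate parent to children: all child Keys under one parent Key, in order."""
    return sorted(key for key in orders if key[0] == code)


def total_by_customer():
    """Serve the same data a different way: amounts summed per parent Key."""
    totals = {}
    for (code, _order_no), amount in orders.items():
        totals[code] = totals.get(code, 0) + amount
    return totals


add_customer("ACME", "Acme Ltd")
add_order("ACME", 1, 100)
add_order("ACME", 2, 50)
```

Because every row is addressed by its Key, the same two dictionaries answer both questions (children of a parent, totals per parent) without any restructuring.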

Eg. I always model a data problem using the RM, regardless of whether the data is intended to be implemented in a database. Last year I destroyed both the Jan Hidders’ paper and the Henning Köhler paper using precisely that method. The source papers were not simple Straw Man, they were Advanced Straw Man by Virtue of Ignorance, typical of Modern “theoreticians”, addressing a “problem” in the client or inventing a new “normal form”, where the “problem” was fixed in the data four and five decades prior.

> 5. There is nothing resembling a CHECK constraint in SQL.

Correct. Anachronistic expectation, since Constraints had not been invented.

> There is a
> small amount of what might be seen as foreign key enforcement.

False. Treated above.

> Why would anyone use it, then? How is it IBM continues to sell IMS?
>
> There still exist applications for which the services and features
> provided by SQL are not needed. If you have a record-keeping system
> for which the requirements are quite static and the data volumes very
> high, what need is there for predicate enforcement in the DBMS? If the
> rules are already enforced and the access patterns already established,
> why incur the overhead of dynamic storage allocation and location
> independence? Why "interpret a query" when you can just fetch a record
> by its key?

You can’t “interpret a query” because there ain’t no query to be interpreted. Already treated.

You paint the picture that IMS is a record filing system. That is understandable, since as per the evidence of (a) your Securities system last year, and (b) the books that you read, from which you implement, that teach only record filing systems, with no Relational capability or Integrity. They do have Referential integrity, but you did not use that.

But that is false, IMS is a proper DBMS, admittedly pre-Relational, with full control of records and integrity that could be expected prior to the Relational Model. The HP, certainly IMS as one example platform, and the RM, certainly SQL as the platform, both support Relational Integrity. In the case of IMS, which was written ten years before the RM, and twenty-four years before an implementation of the RM, we can’t say that it implemented Relational Integrity by intent, since that was invented afterward, but it did implement Referential Integrity per Key, and that by intent and design, which was later defined as Relational Integrity.

As for how IBM does manage to keep selling IMS, there exists a market called the legacy market. You might want to read up on it. Similarly, there exists a market called the Wall Street market, which still uses 2MB (not GB) disks. Modern systems, written by Modern developers cannot compete with older systems, despite the fact that Modern machinery is 1,000s of times faster. With that in mind, if you speculate about what is behind those flashy modern activity boards, you might be right.

There are hundreds of vendors running Unix boxes, that support an IMS or an Open VMS system, which latter runs, unchanged since 196x or 198x. Think Parallels on Mac, which runs an instance of Windows, then think big Unix boxes and IMS.

> These are two different systems. Codd recognized that the power of
> mathematics would make database management more tractable and
> rational. He understood that not all data processing need be pure
> record-keeping. He invented an alternative, and was proved right. But
> the thing he invented was utterly, completely different.

Yes. The introduction to the RM does a much better job of describing it.

I wouldn’t call it an “alternative”. It is a progression. Codd was not one of the Modern “theoreticians” who as evidenced severally, work in total isolation from reality. He was employed by IBM, and he had specific goals to overcome specific problems. He constructed something that IMS did not have: formal theory; FOPC; the first Relational Algebra … that is the formal foundation. But the HP remains, clearly, as per the historical evidence and the specific definitions in the RM, the material foundation of the RM. That is, it carried the Hierarchical *Paradigm*, complete and unchanged, into the RM. And of course Relational is more advanced than the HP, due to the formal foundation. Codd also used the one feature from the Network *Paradigm*, that made it superior to the HP.

> Any
> comparison risks pure sophistry.

I wouldn’t use that word to describe it, but if I understand what you meant from your post, yes. And idiocy.

And you did it anyway, despite your own warning. The mind boggles.

You failed to mention, IMS has complete support for ACID Transactions, something that your teachers cannot understand, for which they postulate “multiple assignments” and the hysterical “MVCC”, which (a) denies science, the principles of a database, and (b) does not have a scientific basis. SQL (the real ones, not the s/w that is labelled “SQL” that does not comply, hence they are fraudulent) has ACID Transactions.

Cheers
Derek

Received on Sat Jun 11 2016 - 09:24:51 CEST