Progress Please

From: Derek Asirvadem <derek.asirvadem_at_gmail.com>
Date: Wed, 11 Feb 2015 23:00:52 -0800 (PST)
Message-ID: <385bd9eb-f641-4114-b7c4-05dec0191bd5_at_googlegroups.com>


James

> On Wednesday, 4 February 2015 16:45:27 UTC+11, James K. Lowden wrote:

I am reminded, again, just how far apart our worlds are.

> select birthyear, salary
> from salaryhistory as j join children as c on j.man# = c.man#
>
> If that kind of access is possible, in what sense do the four tables
> form a hierarchy? Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?

First, I have to say, while some posters have demonstrated consistent self-contradiction, where virtually every para contradicts the next, you have not. Sure, you have been rigid; refused to take up minor challenges that would demonstrate your declared knowledge; refused to give me an example where I could demonstrate that the ideas you have are false; etc, but you have not been self-contradictory. Therefore when I saw the staggering contradiction between your two paras above, it stayed with me. How could someone who could figure out the SQL required in p1, NOT understand what he is asking in p2 ? The two paras from the one person just did not jibe.

> Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?

Yes, of course that is true, but that is not the reason I suggest they are hierarchical keys. That would be a very weak argument to support the notion that they are "hierarchical keys".

The reason they are hierarchical keys is because employee->jobhistory->salaryhistory are related through their IDENTIFIERS, their PRIMARY keys (components of which, btw, happen to be foreign keys). Why did you not see that ? Why were you thinking that I was picking a weak reason, when the massive reason is standing out there front and centre, like testosterone on a bull, the same massive reason that you must have used when you wrote that SQL ?

Despite the fact that Codd defined KEY in the preceding section, and then gave the keys:

> > Codd gives the Keys in Fig 3(b), they are in italics, the non-key attributes are not in italics.

you have implemented them as non-keys. Foreign keys, sure, but as attributes. Which means Non-identifying relations (dashed lines) rather than Identifying (solid lines).

The only way you would not see the advertisement on the bull, is that you are so very, very used to record filing systems; with surrogate record IDs (not "surrogate _keys_", there is no such thing); where all the relations are Non-identifying, that even when the Keys and Identifying relations are given, you see only non-keys, non-identifying relations. In that case, yes, the links (can't call them relations) that connect the files (can't call them tables) do not represent an hierarchy, one has to stretch to see it, the notion is weak.

I have known for some time, that theoreticians in this unserved space know only RFS, but I did not place you amongst them.

> Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?

No. The reason they are hierarchical keys is because employee->jobhistory->salaryhistory are related through their IDENTIFIERS, their PRIMARY keys.

If you look at the keys Codd gave, the keys are Hierarchical:

__Employee ( ManNo ) 
____JobHistory ( ManNo, JobDate )
______Salary ( ManNo, JobDate, SalaryDate )
____Children ( ManNo, ChildName )

The FKs are used to form the PKs, as per Codd's definition of KEYS and construction in [1.4].
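
A minimal sketch of those keys, assuming SQLite through Python's sqlite3 module. The column types, the non-key columns (Name, Title, Salary, BirthYear) and the sample rows are assumptions for illustration, not taken from Codd's paper. It shows each child's Primary Key being built from its parent's Primary Key, and the quoted join running directly on ManNo.

```python
import sqlite3

# Hypothetical rendering of Fig 3(b); types and non-key columns are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Employee (
    ManNo      INTEGER NOT NULL,
    Name       TEXT    NOT NULL,
    PRIMARY KEY (ManNo)
);
CREATE TABLE JobHistory (
    ManNo      INTEGER NOT NULL,
    JobDate    TEXT    NOT NULL,
    Title      TEXT    NOT NULL,
    PRIMARY KEY (ManNo, JobDate),
    FOREIGN KEY (ManNo) REFERENCES Employee (ManNo)
);
CREATE TABLE SalaryHistory (
    ManNo      INTEGER NOT NULL,
    JobDate    TEXT    NOT NULL,
    SalaryDate TEXT    NOT NULL,
    Salary     INTEGER NOT NULL,
    PRIMARY KEY (ManNo, JobDate, SalaryDate),
    FOREIGN KEY (ManNo, JobDate) REFERENCES JobHistory (ManNo, JobDate)
);
CREATE TABLE Children (
    ManNo      INTEGER NOT NULL,
    ChildName  TEXT    NOT NULL,
    BirthYear  INTEGER NOT NULL,
    PRIMARY KEY (ManNo, ChildName),
    FOREIGN KEY (ManNo) REFERENCES Employee (ManNo)
);
""")

# Sample rows (made up for the sketch).
conn.execute("INSERT INTO Employee VALUES (1, 'Smith')")
conn.execute("INSERT INTO JobHistory VALUES (1, '1968-01-01', 'Clerk')")
conn.execute("INSERT INTO SalaryHistory VALUES (1, '1968-01-01', '1969-01-01', 9000)")
conn.execute("INSERT INTO Children VALUES (1, 'Ann', 1965)")

# The quoted join: SalaryHistory to Children directly on ManNo, without
# visiting JobHistory, because ManNo is carried in every Primary Key.
rows = conn.execute("""
    SELECT c.BirthYear, s.Salary
    FROM SalaryHistory AS s JOIN Children AS c ON s.ManNo = c.ManNo
""").fetchall()

# A SalaryHistory row whose parent JobHistory row does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO SalaryHistory VALUES (1, '1970-01-01', '1970-06-01', 9500)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The same tables serve both readings: the join ignores the hierarchy, while the composite keys and the rejected orphan row show it.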

> If that kind of access is possible, in what sense do the four tables
> form a hierarchy?

Refer my para immediately above, *and* my diagrams for Codd's Fig 3(b). The tables form a classic hierarchy, and the keys within each table form the classic Relational Hierarchy.

The mental block for me is that you said you had implemented many Relational systems, and I took you at your word. I thought, yeah sure, this guy knows some Relational, and I will get him across the line, to understand hierarchies in the RM.

So it is a bit of a shock, horror, for me to realise, hang on, he is a product of his teachers; they are clueless about the RM; they diminish the RM at every turn (attempting to show how their "vision" is somehow "better"); as evidenced, they know and implement only pre-1970 RFS; non-FD; non-key, so there is no way you could be anything but.

The reason your p1 contradicts your p2, is that you are using a non-relational Record Filing system, and even when Codd gives you Keys, you place them as fields, with an FK. Your rendition of Codd's words is not in Relational Normal Form.


Now for the rest.

> You raise too many points for me to answer one by one. Let me call
> out some I think are important. If I mistake your meaning at any
> juncture, please correct me.
>
> > Note that in the RM, more than half of Codd's references are to
> > products and product manuals. There were not too many theoretical
> > papers in the field.
>
> You seem to think one implies the other, that commercial products
> preclude theoretical papers.

No, I have already posted a fair amount of detail that the opposite is true.

Perhaps I should have stated, There were not too many theoretical papers PUBLISHED in the field, as detailed previously, we had lots of internal proprietary papers.

> But you must know that's not true. The
> reason there were no papers is that there was no theory. You don't feel
> that's important; I suggest that's one reason pre-relational systems
> were so inelegant.

(Already answered that we had sound theory, proprietary, that was not published.)

They weren't inelegant for the time, they were quite elegant, if you consider that ISAM was all we had before the HM came along. And I haven't even started on the Network Model, which was even more elegant, less restrictive, than the HM.

Sure, they are inelegant now, because we have the RM as the measure against which the comparison is made.

> One giant leap owed to Codd that I think was (and often still is)
> underappreciated is his adoption of value semantics. Your helpful
> citation illustrates that point quite well, see next.
>
> > There are many terms that Codd uses in the RM, which have gone out of
> > fashion.
>
> It does take some work to read Codd's 1970 paper while trying to
> embrace the technological perspective of his audience in the days of
> punch cards and drum memory.

Nonsense.

You don't honestly believe that ISAM and the HM were implemented using punched cards and drum memory, do you ? Whoever told you that, whoever taught you that, is a disgusting liar. It is a transparent attempt to demean the fact that the RM was founded on the HM; that the HM had a sound basis. By rewriting history (the tell-tale sign of a liar), they wipe out the Hierarchy in the RM, and suggest the RM was fresh, new, first-time based on theory, and all that bull dust that contradicts the evidenced facts.

In 1976, when I took my first job in a computer service bureau, as an apprentice programmer, we had no drums, no punched cards. We had a machine with one disk (for loading the o/s and programs, not for data), and eight mag tapes (for data). Each mag tape was a data file, accessed serially, only. Typically a program would use six files, six tapes, repositioning each of them. Just like a six-file merge that unix sort would perform.

During that year, we installed 16 fixed disks, and eliminated the tapes for files (kept them for backups only). We had complete databases implemented in ISAM, pointer-based, of course. And Normalisation, without declared NFs, of course.

By 1978 we had removable disks of various sizes.

By 1978 we had full OCR readers, all of Canada's Catholic schools did their exams on OCR forms, and our service bureau read them, and tabulated them. The rest of Canada switched over slowly. The readers often jammed, so when the exams were on, we had a 24-hour watch, we took it in turns. While waiting, we programmed extensions to the Star Trek game we had, written in BASIC-PLUS, with the clever bits in Assembler.

When DBMS platforms came along, they gave us a much easier and much more secure (eg. the pointers) method of implementing those databases that we did have; ACID Transactions; better concurrency; etc, etc. It would be silly to think that we didn't have databases or Normalisation until DBMS platforms came along. Once DBMS took off, any errors in the database design (file design), any errors in Normalisation (fields within the files), were magnified.

We did all that without the benefit of Date, Darwen, Fagin, Abiteboul, Hull, Vianu's contributions to mankind. Which, AFAIC, is still a great big zero.

> Looking at the example in section 1.4, I finally see what you mean by
> "hierarchy". And, fair enough, Codd says of Figure 3 {employee,
> jobhistory, salaryhistory children} ,"The tree ... shows just these
> interrelationships...." Having worked with the relational model all
> these years, I look at that diagram and I don't see a tree. I see an
> ancestor of a Chen diagram, and automatically assume the "nonsimple
> domains" are tables. To Codd's contemporaries, the tree-ness was
> obvious.
>
> > when he gives the pre-requisites to his __Relational Normal Form__
> > [1.4](1)(2), and in (1) states "collections of trees", we take that
> > to mean:
> > - trees with integrity
> > - normalised to the extent that we did prior to the RM
> > - no circular references
> > - what I am calling, in retrospect __Hierarchical Normal Form__
>
> Codd certainly knew that a tree is a kind of DAG.

No. He was a strong proponent of a single Large Shared Data Bank, the classic single-version-of-the-truth. The tree is the hierarchy, the tree is the Relational hierarchy. In a single location. Not a DAG at all. Distributed databases are for the birds, and a DAG is just the latest flavour of birdseed.

> I don't know what
> "normalized [before] RM" refers to,

Do you understand that DRY, Agile, etc, is Normalisation for a program ?

Imagine that "before the RM" disk space was extremely expensive, the waste was zero. Imagine that Update Anomalies had much greater negative effect (the duplicated field might be on a disk that was not mounted), to be avoided at all costs. We Normalised very carefully in those days, to eliminate data duplication. We just did not have a formal declaration and name. All our Normalisation prior to Codd was 3NF minus the Relational Key aspect, within the limits of whatever it was that we used, ISAM or HM or NM, as opposed to Codd's 3NF, which was Relational.

Imagine that even today, if one were to implement data in an ISAM system, one can Normalise to 3NF minus the Relational Key aspect.

Imagine that even today, if one were to implement data in awk arrays, which I have done recently, one can Normalise to 3NF fully, and implement the arrays as Relational tables, with full Relational (hierarchical) Keys. Of course, the other features of a DBMS are absent.
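
A rough analogue of the awk-array idea, sketched here in Python rather than awk. The `insert` helper, the table names and the sample rows are hypothetical. Each table is an associative array keyed by its full composite key, and a child row is accepted only if the leading part of its key exists in the parent.

```python
# Each "table" is an associative array keyed by its full composite key.
employee = {}        # key: (man_no,)
job_history = {}     # key: (man_no, job_date)
salary_history = {}  # key: (man_no, job_date, salary_date)


def insert(table, key, row, parent=None):
    """Insert row under key; enforce key uniqueness and, when a parent
    table is given, that the key minus its last component exists there."""
    if key in table:
        raise KeyError(f"duplicate key {key}")
    if parent is not None and key[:-1] not in parent:
        raise KeyError(f"no parent row for {key}")
    table[key] = row


insert(employee, (1,), {"name": "Smith"})
insert(job_history, (1, "1968-01-01"), {"title": "Clerk"}, parent=employee)
insert(salary_history, (1, "1968-01-01", "1969-01-01"), {"salary": 9000},
       parent=job_history)

# A salary row for a job that was never recorded is refused.
try:
    insert(salary_history, (1, "1970-01-01", "1970-06-01"), {"salary": 9500},
           parent=job_history)
    orphan_refused = False
except KeyError:
    orphan_refused = True
```

As stated above, this covers only the keys; transactions, concurrency and the rest of a DBMS are absent.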

> and I wouldn't rely on this example
> to prove circular references can't exist in a relational database,

Not this example, but Codd's words in the RM. As explained in my para, ten paras above this one.

Do you honestly believe that a tree, in the days of the HM and NM, could survive a circular reference ? That a leaf node could point to a branch node ? What would the program that followed such a reference do ? I trust you understand the concept of the Infinite Loop, that it is to be avoided at all costs. In those days, we could not interrupt a program that was caught in a tight infinite loop, the o/s never got the chance (in its execution vector) to interrupt a tight loop, because the looping program never let go of the CPU. We had to HALT the machine, two hundred and fifty customer programs would be halted. The restart of all those programs was a massive affair. The HP-2000's and 3000's were excellent machines for the day.
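
To illustrate the question of what a program following such a reference would do, here is a small sketch. The `path_to_root` function and the dictionary of parent references are hypothetical. It follows parent references upward and stops with an error if a reference loops back; without the `seen` check, the cyclic case below would never terminate.

```python
def path_to_root(node, parent_of):
    """Follow parent references from node up to the root.
    Raise ValueError if a reference leads back to a node already visited."""
    path = [node]
    seen = {node}
    while parent_of.get(node) is not None:
        node = parent_of[node]
        if node in seen:
            raise ValueError(f"circular reference at {node!r}")
        seen.add(node)
        path.append(node)
    return path


# A proper tree: every node has one parent, the root has none.
tree = {
    "Employee": None,
    "JobHistory": "Employee",
    "SalaryHistory": "JobHistory",
    "Children": "Employee",
}
tree_path = path_to_root("SalaryHistory", tree)

# The same structure with the root pointing back down at a leaf.
cyclic = dict(tree, Employee="SalaryHistory")
try:
    path_to_root("SalaryHistory", cyclic)
    cycle_detected = False
except ValueError:
    cycle_detected = True
```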

No, a tree meant a tree, with the integrity properties of a tree, not a weed with incestuous properties.

The tree that Codd refers to in the pre-requisite to the [1.4]_[Relational]_Normal_Form_ is a tree with integrity, no circular references. The pre-requisite is obviously the _Hierarchical_Normal_Form_.

Therefore no circular references are allowed in the RM, in the RNF.

Which is why the commercial RDBMSs do not have the idiotic features such as "deferred constraint checking" that the idiots "must have".

To be clear, while this simple example itself does not prove it, Codd's RM prohibits circular references, by definition (if one understands the contemporaneous definitions).

The corollary is, that is why the freaks that teach you guys suppress the Hierarchical Model; why they lie that there is nothing of the HM in the RM, etc, etc. The freaks need their circular references. Put another way, the ordering in the HM resolves circular references into trees.

Further, circular references are simply not necessary. If you Normalise the data into HNF, then RNF, as demanded by Codd, or Normalise with an overall understanding of Normalisation without specific reference to HNF and RNF, you will not need circular references. Which is why I ask for your example, it is easy to prove.

> but
> I don't want to argue that point just yet, because we're talking
> hierarchies.

Ok. Whenever you are ready.

> Requoting,
>
> > - the set of such tables form an hierarchy of tables
> > - such keys may well be called Hierarchical Keys
>
> Sure, but only at a severe cost to meaning!

I did state:
> > - such keys may well be called Hierarchical Keys (they are well-known
> > as Relational Keys, I am not suggesting that we change it)

> Codd's reader doubtless saw a hierarchy (effortlessly, as you do, as I
> did not). But the example shows that they are *not* a hierarchy,
> despite appearances.

(I have treated this section of your post in detail, earlier, so my comments here are to be taken with that in mind.)

It is a classic hierarchy, visually; in terms of the tables; in terms of the Keys.

> Figure 3(b) shows each relation (his term) having
> "man#" as part of the key. It is not necessary to go through
> jobhistory to get to salaryhistory.

Absolutely, it is Relational.

> It is perfectly possible, as you
> know, to
>
> select birthyear, salary
> from salaryhistory as j join children as c on j.man# = c.man#
>
> If that kind of access is possible, in what sense do the four tables
> form a hierarchy? Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?
>
> If that's what you mean, OK. Given that the tables don't have to be
> used hierarchically, ISTM that calling them a hierarchy is to adopt a
> blinkered view.

(Treated)

> > To the extent that any hierarchy that exists in the data, is
> > maintained as an hierarchy, after transformation to the Relational
> > Model, the hierarchy lives, exists, breathes
>
> No. What you're really saying is that the tables are related, and that
> their relationships are manifest in their keys.

More. The relationships are manifest in their PRIMARY keys, which forms their identity, and which forms the hierarchy.

> The hierarchical systems you remember so well adopted the idea --
> and required the schema to manifest the idea -- that e.g. jobhistory is
> a *property* of employee. (They didn't use that term, of course.)

Not at all.

There was no *idea* to adopt, with or without the modern term.

The HM provided one file for each Hierarchy. Each Hierarchy consisted of several record-types. In Codd's example, he is clearly discussing a single hierarchy, a single file, with four record types. The DBMS handled grouping of records belonging to a single parent record type; fragmentation; etc, that was its job.

JobHistory would be a separate record-type, in a separate physical location (grouped) to the location (grouped) of the Employee record-type. There are multiple JobHistories per Employee. The Employee record had First/Last pointers to its JobHistories. The JobHistories had Previous/Next pointers to the JobHistories for that /one Employee record/.
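
A minimal in-memory sketch of that pointer arrangement, under the assumption that object references stand in for the physical pointers. The class and field names are hypothetical and do not reflect any particular HM product's storage format.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class JobHistoryRecord:
    job_date: str
    title: str
    prev: Optional["JobHistoryRecord"] = None  # Previous pointer
    next: Optional["JobHistoryRecord"] = None  # Next pointer


@dataclass
class EmployeeRecord:
    man_no: int
    name: str
    first_job: Optional[JobHistoryRecord] = None  # First pointer
    last_job: Optional[JobHistoryRecord] = None   # Last pointer

    def add_job(self, job: JobHistoryRecord) -> None:
        """Append a JobHistory record to this Employee's chain."""
        job.prev = self.last_job
        if self.last_job is None:
            self.first_job = job
        else:
            self.last_job.next = job
        self.last_job = job

    def jobs(self) -> Iterator[JobHistoryRecord]:
        """Walk the chain: the only access path to the JobHistories."""
        current = self.first_job
        while current is not None:
            yield current
            current = current.next


emp = EmployeeRecord(1, "Smith")
emp.add_job(JobHistoryRecord("1968-01-01", "Clerk"))
emp.add_job(JobHistoryRecord("1970-01-01", "Supervisor"))
job_dates = [job.job_date for job in emp.jobs()]
```

Note that a JobHistoryRecord carries no ManNo of its own: it can be reached only by starting at its Employee record and following pointers.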

That is hardly a *property* of employee.

Refer my diagram, which shows the logical storage in the HM (the hierarchies *modelled*, and any model is abstracted from the physical). If you request it, I will give you a diagram of the physical storage for the HM.

> One
> could not access jobhistory records except through a *pointer* acquired
> through an employee record.
 

Correct.

> The hierarchy wasn't just a notional (or
> notational) communication convention; it constituted the access path.

Correct.

And that Access Path Dependence is specifically prohibited in the RM.

> With that example, I really think the fairest thing to say is that it
> shows it's *not* a hierarchy.

(Treated. Look again.)

> By adopting value semantics -- by making
> the keys values instead of pointers --

Making the references keys instead of pointers, which is what Codd did, is quite different to "making the keys [there were none] values instead of pointers", the latter makes no sense.

> each relation becomes
> free-standing and self-consistent.

How, exactly, is that "self-consistency" achieved ? I don't see Codd's tables being "self-consistent" at all (except the head of the hierarchy, Employee). The other three are quite Dependent.

Yeah, but Hierarchical and Relational are not mutually exclusive. They can be independently-accessible tables, Relational, AND hierarchical.

If all you see in the RM is that each relation is now a "free-standing" table, with "independent" access, then you do not have the RM, you have a Record Filing System.

Now we know from your comments (at the top), that

- you don't have Codd's tables as defined in the RM Fig 3(b), as per my IDEF1X model of the same figure. You are just not getting that the Hierarchy is within the Primary Keys.
- So you have something else in mind (your post that I am responding to, your understanding of Codd's words)
- I presume it is not a Record Filing System of the first order (RFS 1), where records are referenced by record ID, and there is no integrity.
- I presume that you have been through the traps, that you have some integrity in them, that the records reference some unique key in the parent record (not the record ID which is NOT a key, and has no integrity).
- In order to proceed and close this, have a look at this page, please confirm that what you have in mind is RFS 5, not RFS 1
- or something in-between

http://www.softwaregems.com.au/Documents/Article/Normalisation/RM%20Foo%20RFS.pdf

Based on your response, then we can move ahead.

> We can think of them as forming a
> hierarchy as a convenience; perhaps they'll be commonly used that way in
> some application. But we're not required to.

Never said we were. (That would be an Access Path Dependence.)

> The new, non-hierarchical
> relations can be combined in arbitrary ways. We can find the highest
> salary for each year, without ever learning the men's names.

Yeah, yeah. You can do that with the new hierarchical relations as well.
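
As a sketch of that query against a table with the hierarchical key, again assuming SQLite through Python's sqlite3 module; the ISO date strings, the `SalaryYear` alias and the sample rows are made up for illustration. Neither Employee nor JobHistory is touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalaryHistory (
    ManNo      INTEGER NOT NULL,
    JobDate    TEXT    NOT NULL,
    SalaryDate TEXT    NOT NULL,
    Salary     INTEGER NOT NULL,
    PRIMARY KEY (ManNo, JobDate, SalaryDate)
);
INSERT INTO SalaryHistory VALUES
    (1, '1968-01-01', '1968-01-01',  9000),
    (1, '1968-01-01', '1969-01-01',  9500),
    (2, '1967-06-01', '1968-03-01', 11000),
    (2, '1967-06-01', '1969-03-01',  9200);
""")

# Highest salary for each year, taken from the year part of SalaryDate.
highest = conn.execute("""
    SELECT substr(SalaryDate, 1, 4) AS SalaryYear, MAX(Salary)
    FROM SalaryHistory
    GROUP BY SalaryYear
    ORDER BY SalaryYear
""").fetchall()
```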

> > I said the RM is a progression of the HM.
>
> I suppose that's true, in the sense that the United States as
> constituted in 1789 was a progression of government from what had
> existed in 1775. Something came before the thing that came later, and
> many people would call it "progress".
 

And the government that existed in 1775 was a progression of the government in England, etc. And that was a progression of Henry V and Elizabeth I. Et cetera.

But I have said much more than that, that the HM is fundamental to the RM. Until you understand Codd's words, you will not see that. (you have come a fair way.)

> The RM was also revolutionary in 1) using math as a foundation, and 2)
> rejecting the tree -- and with it, pointer semantics -- as the basis
> for data organization.

I agree with [1]. It rejected the pointers but it did not reject the tree, it kept the tree. As you can see from Codd's words, in [1.4 Normal Form], the tree lives, the keys in the tree are transformed to a Relational Normal Form, and retained in the new Relational Primary Key.

And further, the denial of the tree, the suppression of the tree, by those teachers, causes people to fail to see trees in the data (where such exists). Hence they have grossly inefficient "adjacency lists" and "nested sets". Just look at the agony that the theoreticians are experiencing in the Normalisation thread, they are completely impotent, unable to produce anything.

> On Friday, 6 February 2015 10:14:01 UTC+11, Derek Asirvadem wrote:

No response yet.

  1. It would be good if you could confirm that you
     - got something out of that "further detail" post of mine
     - or this one, with additional diagrams
     - that you have a better understanding of the hierarchies, and the HM in the RM, than you had in your last post
     - that my diagrams assisted (or not) in the second reading of section [1.4] of the RM
     - that you can now see the hierarchy that Codd gives in Fig 3(b), visually in my diagrams, as hierarchical keys
  2. Further, It would be good if you could confirm that from the RM itself, hierarchies, and the HM (the essence, not the storage) exist in the RM.
  3. Re your teachers' allegations that hierarchies cannot be implemented in the RM; that there is nothing of the HM in the RM, I have proved such claims to be false, and that such teachers are evidently quite ignorant re the RM.

3.a I repeat, if you are interested, I have a set of tables and code that I use to teach the proper implementation of Relational Hierarchies, that are online. They include the projection of "nested relation"-type data using ordinary SQL.

4. Re your teachers' allegations that Codd's FOL and RM doesn't provide for Hierarchies; that SOL/42OL is required for hierarchies; etc, I am waiting for the requested example, in order to show you how all that can be done within the RM, and to prove that such claims are false. If you give me a reference to the AHV book or whatever, that will be fine.

If you do not respond, do consider that based on the information I have provided in this thread thus far, which proved their foundation claims re hierarchies in the RM [3] to be blatantly false, and ignorant of the RM, their secondary claims [4], are likewise false, ignorant, and without basis. And when challenged, no evidence is given. Which proves in and of itself, that such claims are baseless.

Cheers
Derek

Received on Thu Feb 12 2015 - 08:00:52 CET
