Re: Progress Please

From: James K. Lowden <jklowden_at_speakeasy.net>
Date: Sun, 15 Feb 2015 20:07:44 -0500
Message-Id: <20150215200744.b6b48e0c.jklowden_at_speakeasy.net>


On Wed, 11 Feb 2015 23:00:52 -0800 (PST) Derek Asirvadem <derek.asirvadem_at_gmail.com> wrote:

> > Are we to say they have "hierarchical keys" simply
> > because employee->jobhistory->salaryhistory are related through
> > their foreign keys?
>
> The reason they are hierarchical keys is because
> employee->jobhistory->salaryhistory are related through their
> IDENTIFIERS, their PRIMARY keys (which, btw, components thereof
> happen to be foreign keys).

To me, that's a distinction without a difference. I just don't see what you find significant about it.

	employee is identified by {man#}
	jobhistory is identified by {man#, jobdate}

For some reason I can't fathom, you believe:

  1. It's vastly more important that the primary key for jobhistory incorporates the primary key for employee than that

        FOREIGN KEY (man#) REFERENCES employee(man#)

even though the two statements are equivalent.

  2. The very fact that the primary key for jobhistory incorporates the primary key for employee deserves the special designation of a "hierarchical" relationship, perhaps to reflect how the relationship would have been designed in pre-relational DBMSs.

To support this assertion, you list the keys vertically, and note that each longer one incorporates the shorter one above, ergo hierarchy. Also you note Codd visually arranged the boxes in a way that suggests a hierarchy. You'll forgive me if I find that unpersuasive?!

No, I don't think the fact that one table's primary key is a subset of another's is interesting, let alone significant. It doesn't deserve any special designation, "hierarchical" or other.
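
For concreteness, here is a minimal sketch of the two tables as they might be declared, run through SQLite from Python. The name man_no (standing in for man#), the column types, and the sample rows are assumptions for illustration, not anything taken from the thread. It shows only that man_no in jobhistory sits in both the PRIMARY KEY clause and the FOREIGN KEY clause of one declaration, and that each clause rejects its own kind of bad row.

```python
import sqlite3

# Hypothetical schema: man_no stands in for man#; column types are guesses.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE employee (
        man_no INTEGER PRIMARY KEY
    );
    CREATE TABLE jobhistory (
        man_no  INTEGER NOT NULL,
        jobdate TEXT    NOT NULL,
        PRIMARY KEY (man_no, jobdate),
        FOREIGN KEY (man_no) REFERENCES employee (man_no)
    );
""")
conn.execute("INSERT INTO employee VALUES (1)")


def try_insert(man_no, jobdate):
    """Insert one jobhistory row; return 'ok' or 'rejected'."""
    try:
        conn.execute("INSERT INTO jobhistory VALUES (?, ?)", (man_no, jobdate))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


first = try_insert(1, "1975-01-01")      # employee 1 exists, key is new
orphan = try_insert(2, "1975-01-01")     # no employee 2: foreign key violation
duplicate = try_insert(1, "1975-01-01")  # same {man_no, jobdate}: primary key violation
print(first, orphan, duplicate)
```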

> > > Note that in the RM, more than half of Codd's references are to
> > > products and product manuals. There were not too many theoretical
> > > papers in the field.
> >
> > You seem to think one implies the other, that commercial products
> > preclude theoretical papers.
>
> No, I have already posted a fair amount of detail that the opposite
> is true.
>
> Perhaps I should have stated, There were not too many theoretical
> papers PUBLISHED in the field, as detailed previously, we had lots of
> internal proprietary papers.

If you say so. IBM was big in the field and not shy about publishing papers. But, taking you at your word, I can't evaluate them without having read them. I have this filed under "don't care" because even today there's no model for databases comparable to the relational model.

> > It does take some work to read Codd's 1970 paper while trying to
> > embrace the technological perspective of his audience in the days of
> > punch cards and drum memory.
>
> Nonsense.
...
> In 1976, when I took my first job in a computer service bureau, as an
> apprentice programmer, we had no drums, no punched cards. We had a
> machine with one disk (for loading the o/s and programs, not for
> data), and eight mag tapes (for data).

You arrived a little ahead of me. Doubtless you remember some things I've only read about.

In 1976, though, you were already 6 years in, and there were still plenty of punch cards around. I programmed with them in college after that. When I arrived at work in 1982, we had CICS and 3270 terminals, but they had only arrived two years before. My mother in the mid-70s programmed on punch cards, too. (One compile per day, taken to the computer and back via the unusual RJE mechanism known as a "station wagon".) But, if you thought I meant punch cards were used for data storage, no, sorry. I was being allusive.

Granted, "drum memory" is an exaggeration, but not much. The big IBM 360 machine sold in 1968 came with 1-4 MB core memory, but many came with much less.

It's very easy to imagine people in IT management in those days whose knowledge of computer science was nil and whose understanding of programming was limited to whatever IBM classes the firm had sent them to. No need to imagine them, in fact, because I worked with and under some of them. Hurrah for Syncsort [TM]. But it does take some work to try to read Codd's paper through their perspective.

> > Codd certainly knew that a tree is a kind of DAG.
>
> No. He was a strong proponent of a single Large Shared Data Bank, the
> classic single-version-of-the-truth. The tree is the hierarchy, the
> tree is the Relational hierarchy. In a single location. Not a DAG
> at all. Distributed databases are for the birds, and a DAG is just
> the latest flavour of birdseed.

I think you did not take my meaning: directed acyclic graph. I can't account for your answer otherwise.

> > I don't know what "normalized [before] RM" refers to,
>
> Do you understand that DRY, Agile, etc, is Normalisation for a
> program ?

If you say so. Not Agile, which is just methodology fetish. I've never once heard an application programmer call his data structures "normalized", whether or not he knew what the term meant.

> We Normalised very carefully in those days, to eliminate data
> duplication. We just did not have a formal declaration and name.

OK, so now I know what you mean. But you can minimize redundancy without eliminating repeating groups, a requirement for 1NF. And going to 1NF for a repeating group means repeating the key, definitely *not* minimizing redundancy wrt disk storage.

Are you going to claim you never used repeating groups in your "normalized" HM databases? Surely you know their use was standard practice, one that violated no theory.

So I can accept your definition for purposes of discussion, but I reject it in general because the practice had no theoretical underpinning and was a mere suggestion of what we mean by the term today.
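
To illustrate the storage point about repeating groups, a small sketch in Python. The record layout, the field names man_no and jobs, and the dates are all hypothetical. It just counts how many times the key is stored before and after the repeating group is flattened into 1NF rows.

```python
# Hypothetical record with a repeating group: the key is stored once and
# the job dates repeat inside the record.
record = {"man_no": 1, "jobs": ["1975-01-01", "1977-06-01", "1980-03-15"]}


def to_1nf(rec):
    """Flatten the repeating group into one (man_no, jobdate) row per job."""
    return [(rec["man_no"], jobdate) for jobdate in rec["jobs"]]


rows = to_1nf(record)

key_copies_before = 1  # the single man_no field in the record
key_copies_after = sum(1 for man_no, _ in rows if man_no == record["man_no"])
print(key_copies_before, key_copies_after)
```

With three jobs the key is stored three times instead of once, which is the sense in which 1NF does not minimize redundancy on disk.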

> Do you honestly believe that a tree, in the days of the HM and NM,
> could survive a circular reference ?

No. In fact I would go further: a tree with a circular reference is not a tree. A tree is a kind of directed acyclic graph. A "tree" with a cycle is a cyclic graph (directed or not is hard to say).

By posing the question as you do, I am led to think you're working with an informal definition of "tree".
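
One way to pin down the formal definition I have in mind is the following sketch, assuming edges are directed (parent, child) pairs. The function name is_tree and the sample node names are mine, for illustration only. A rooted tree has exactly one node with no parent, every other node has exactly one parent, and every node is reachable from the root.

```python
def is_tree(nodes, edges):
    """Return True if the directed (parent, child) edges form a rooted tree."""
    parents = {n: [] for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        parents[child].append(parent)
        children[parent].append(child)

    roots = [n for n in nodes if not parents[n]]
    if len(roots) != 1:
        return False  # no root (a cycle swallowed it) or a forest
    if any(len(p) > 1 for p in parents.values()):
        return False  # a node with two parents: maybe a DAG, not a tree

    # Walk down from the root. With at most one parent per node, reaching
    # every node from the root rules out any cycle.
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        if node in seen:
            return False
        seen.add(node)
        stack.extend(children[node])
    return seen == set(nodes)


chain = [("employee", "jobhistory"), ("jobhistory", "salaryhistory")]
names = ["employee", "jobhistory", "salaryhistory"]

print(is_tree(names, chain))                                   # a tree
print(is_tree(names, chain + [("salaryhistory", "employee")]))  # cycle: not a tree
print(is_tree(["a", "b", "c", "d"],
              [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]))  # DAG, not a tree
```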

I hope at this point that we understand each other's position regarding what the so-called hierarchical model means, why it's not a "model" in the sense of "relational model", and why I think it's pointless to make any claim about a "hierarchy" based on the components of the keys.

It was an interesting foray into the systems of that bygone era, and I think I understand, vaguely, why you say that pre-relational systems, Cullinet et al., influenced relational ones.

> But I have said much more than that, that the HM is fundamental to
> the RM. Until you understand Codd's words, you will not see that.

I reject that flatly. Whether or not I can convince you I understand Codd's words, I cannot see any way in which "the HM is *fundamental* to the RM". You say there's sound theory in proprietary papers that never came to light, lo these 45 years later, despite the immense importance of the RM and the ever-present reinvent-the-past interest in graph databases today. I can't prove you're wrong. All I can say is I don't believe you and won't until I can see for myself.

> 3. Re your teachers' allegations that hierarchies cannot be
> implemented in the RM

Au contraire. I said tables can represent graphs, and trees are graphs, and hierarchies are trees. Therefore hierarchies can be represented relationally. Furthermore, they can be represented more simply, because value semantics allow both relation and relationship to be represented using one structure.
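
As a hedged illustration, one common way to do it is an adjacency table walked with a recursive query. The table name node, its columns, and the sample rows are assumptions of mine, and the sketch leans on SQLite's WITH RECURSIVE support through Python's sqlite3 module. One table holds both the things and the parent-child relationship among them, as plain values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical adjacency table: each row names a node and its parent.
    CREATE TABLE node (
        name   TEXT PRIMARY KEY,
        parent TEXT REFERENCES node (name)   -- NULL marks the root
    );
    INSERT INTO node VALUES
        ('root', NULL), ('a', 'root'), ('b', 'root'), ('a1', 'a');
""")

# Collect every descendant of a starting node by repeatedly joining
# the table to the rows found so far.
descendants = [row[0] for row in conn.execute("""
    WITH RECURSIVE subtree(name) AS (
        SELECT name FROM node WHERE parent = ?
        UNION ALL
        SELECT node.name FROM node JOIN subtree ON node.parent = subtree.name
    )
    SELECT name FROM subtree ORDER BY name
""", ("root",))]
print(descendants)
```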

Over to you.

--jkl

Received on Mon Feb 16 2015 - 02:07:44 CET
