Re: Sensible and NonsenSQL Aspects of the NoSQL Hoopla

From: James K. Lowden <jklowden_at_speakeasy.net>
Date: Tue, 3 Sep 2013 00:02:38 -0400
Message-Id: <20130903000238.9100fec3.jklowden_at_speakeasy.net>


On Mon, 2 Sep 2013 21:45:34 +0100
Eric <eric_at_deptj.eu> wrote:

> Codd was trying to eliminate order
> dependencies which existed in many systems of the time, i.e. he was,
> in effect, banning any implementation which forced applications to
> process some data in an order determined by the storage method.

Just to expand on that well-placed point, the issue isn't so much "process" as "specify". You might well want to process something in some particular order. But how do you want to specify it to the DBMS?

Codd sought to relieve us of the need to know something about the data not intrinsic to the data.

In days of yore -- which sadly we seem destined to return to -- it was common to have the data stored "in order":

	1  A
	2  B
	3  C
	4  D

Here, the second column is data, and the first column is the navigation number supplied by the system on insertion and used by the application on retrieval. The program specified not "B" but instead "record 2". This was handy because no one had invented auto-incrementing columns yet! But there was a problem with deletion:

	1  A
	2   
	3  C
	4  D

To preserve C's location -- which we know is 3, right? -- it was necessary to preserve B's location, even though B had been evicted. Then, when retrieving the whole [1,4] set, it was necessary to mention somehow that B's place was there, but not B. That's the not 2 B Hamlet was worried about.
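To make the hole concrete, here is a minimal sketch in Python, assuming a toy store in which the record number is nothing more than a list position. The names `records`, `fetch`, and `delete` are invented for illustration and do not describe any particular historical system.

```python
# Hypothetical positional store: the record number is the 1-based list position.
records = ["A", "B", "C", "D"]

def fetch(recno):
    """Return the value at a 1-based record number, or None if the slot is a hole."""
    return records[recno - 1]

def delete(recno):
    """Evict a record but keep its slot, so later record numbers do not shift."""
    records[recno - 1] = None

delete(2)

# Reading the whole [1,4] range now has to step around the hole.
surviving = [(n, v) for n, v in enumerate(records, start=1) if v is not None]
```

C is still "record 3" only because the empty slot 2 is kept around, and every reader of the full range has to know to skip it.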

OTOH there were advantages some still seek today. You could easily have

	1  B
	2  B
	3  B
	4  B

and not get tangled up with pesky primary keys and the like.

Now arises the question, though: which B? Codd recognized that the only thing intrinsic about the data was the data themselves. All those B's are the same, so

        1 B

will do, or, better

        B

If you really care that there were four B's, count them

        B 4

Here "4" is a quantity, not an instrument of navigation. You can find all the things whose quantity is 4, and you can remove a B by decrementing the quantity.
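As a rough sketch of quantity-as-data, assuming Python's `collections.Counter` as a stand-in for a two-column (value, quantity) relation; the helper `remove_one` is a made-up name for illustration:

```python
from collections import Counter

# Each distinct value appears once; the quantity is an ordinary attribute.
bag = Counter({"B": 4})

def remove_one(bag, value):
    """Remove one occurrence of value by decrementing its quantity."""
    if bag[value] <= 0:
        raise KeyError(value)
    bag[value] -= 1
    if bag[value] == 0:
        del bag[value]  # no quantity left, so the value itself goes away

# Find all the things whose quantity is 4: a lookup by value, not by position.
with_quantity_4 = [v for v, q in bag.items() if q == 4]

remove_one(bag, "B")
```

Nothing here asks "which B?"; the only questions are about values and how many there are.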

By eliminating the order, Codd removed extraneous nondata from the system, and permitted the programmer to access the data strictly on their terms: by value. And lo, it works with the first example just as well

	A
	B
	C
	D

and, happily, deletion works pretty well, too

	A
	C
	D
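The same deletion can be sketched by value, assuming a Python `set` as a loose stand-in for a one-column relation (a simplification; a real relation holds tuples):

```python
# An unordered collection of values: no record numbers, no slots.
relation = {"A", "B", "C", "D"}

# Delete by naming the value itself; nothing else has to be preserved.
relation.discard("B")

# C is found by asking for C, not by remembering where it lives.
has_c = "C" in relation
```

There is no hole to report afterwards, because there was never a position for B to vacate.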

Michelangelo produced David by starting with a block of marble and removing the parts that weren't David. That's what Codd did with database theory.

Lots of things are missing from RM. That's the point.

--jkl

Received on Tue Sep 03 2013 - 06:02:38 CEST
