Re: Newbie question about db normalization theory: redundant keys OK?

From: paul c <toledobythesea_at_oohay.ac>
Date: Fri, 28 Dec 2007 22:23:23 GMT
Message-ID: <vHedj.31443$vd4.14902_at_pd7urf1no>


David Cressey wrote:
...
> What I worked with in that time frame was DEC Rdb/VMS. I guess you would
> have to say that Rdb/VMS was not a "big name database". However I think you
> could say that Digital was a "big name computer company" in 1985.
> ...

DEC was big and Rdb/VMS was widespread, if not in all of the giant datacentres of the time. But they were in a lot of medium-sized shops and they were to be found at some location of probably nearly all of the Fortune 50, if not 500. I vaguely remember writing an interface to Rdb and though I've forgotten the details, I'm sure the Rdb implementers were quite hip to the standard physical techniques which were well known and quite similar among the various OS'es.

> ...
> With regard to concurrency being a myth, I have to say this: not at the
> physical level. Overlapping disk seeks, disk transfers, as CPU cycles
> makes a big difference in throughput.
...

At the level of a physical commit mechanism/component, it is exactly those factors that conspired to induce any self-respecting multi-user system to hide concurrency in the sense that an application program wasn't aware that its "I/O" might be interleaved with that of other programs. I forget how VMS worked but in Unix there was usually a pre-determined number of forked processors and in the IBM world "tasks" with a similar motivation that would aim at a physical overlap of IO and CPU in the hope that the incoming mix of requests/messages didn't overlap as far as the physical structures, such as bit maps or index paths, were concerned. Usually, there were different kinds of such tasks that depended on the particular design, whether it was optimistic or pessimistic and sometimes there might be several distinct locking components, one at the application level and one at the physical 'sync' level. Sometimes there'd be a component that ordered the tables within requests in a consistent way, which was one way to prevent deadlocks (ie., prevent one of the four necessary deadlock conditions). There'd usually be some kind of cache or page manager that would collaborate with a checkpoint mechanism and redo or undo log component. If the commit process cpu was seen to be 100% busy that was some indication that the particular machinery was being used to its utmost throughput effect and usually it was only during checkpoints every so many minutes that the concurrent overlap paused. While log writes had to be synchronized, enough commit processes/tasks meant that all users weren't waiting for each log write to complete.

Fundamentally, not much has changed since those days, except for the advent of relatively massive communications bandwidth and the appearance of nearly seekless memory that exceeds the size of disk db's of those days. It is amusing in a pathetic kind of way, to read every so often somebody writing about persistent message queues or somesuch as if the concepts/techniques were only recently invented/discovered.

Not to tout Oracle then or now, but I'd bet that apps that introduced phantoms were failing to observe some recommended message protocol or other. Personally, I've often felt justified in not Reading-TFM as quite often those were written by non-developer tech writers who were not capable of testing their own instructions themselves. Received on Fri Dec 28 2007 - 23:23:23 CET

Original text of this message