Re: What would be a truly relational operating system ?

From: paul c <toledobythesea_at_oohay.ac>
Date: Wed, 11 Nov 2009 05:39:51 GMT
Message-ID: <H6sKm.52356$PH1.25791_at_edtnps82>


Cimode wrote:
> The current hype about operating systems (Windows, Unix, Mac OS,
> Linux) got me thinking about how the RM's main principles could be
> implemented, to extend the possibility of having a TRDBMS implemented at a
> lower physical layer (I mean on current disk, RAM, and CPU
> architectures). That said, I am curious about which prerequisites
> should be respected at the lower level to preserve independence between
> the data layer and the physical layer when thinking of an operating
> system that would manage the relationship between the two layers.
> I came up with the following ideas I would be glad to exchange upon:
>
>

>> The operating system should be IO relation-aware (for lack of a better word), meaning that it should implement a physical storage mechanism that minimizes the number of logical operations required to represent an in-memory relation and its relational operations.
>> The operating system should not be a direct-image system, meaning that all information (files, file groups) should only be the result of a logical relation operation at runtime.  As a consequence, the information cannot be physically stored on an as-is basis.  In other words, a traditional file would be a relation that does not exist before the user interprets it.
>> The mechanism by which a file would be represented at runtime would be a particular case of presentation of a relation, the same way an RTable could be one presentation.

>
> Here are a few ideas that came up, but they bring other questions:
>> Can current CPU architectures, RAM and disk addressing schemes sustain such a model?
>> What would be the benefits? The threats?

>
> Regards...

No answers, just some probably rambling comments. I think these are really useful questions, maybe not for implementing a 'DBOS' but for appreciating the few things that a dbms really needs to do. Even if one thinks one knows them, they are easy to forget when one takes on a very ambitious task - in the last few days even the language experts on the TTM discussion list have gotten themselves tripped up over the difference between recording a function invocation and a function value, which reminds me of the criticism I made here about confusing logical variable names with attribute values. Ideally, chief among the 'few things' would be figuring out how to make a typical physical machine look declarative.
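
To make the quoted idea a little more concrete, here is a toy Python sketch (every name in it is invented for illustration, it is nobody's actual design): the 'file' is never stored as-is, it is just one presentation of a relation, derived by a relational expression at the moment somebody reads it.

    # A toy sketch (all names invented here) of the quoted idea: a "file" is
    # never stored as-is; it is one presentation of a relation, produced by a
    # relational expression only when someone asks for it.

    # The base relation is what the hypothetical DBOS actually manages.
    PROCESSES = [
        {"pid": 1, "name": "init",  "state": "running"},
        {"pid": 2, "name": "httpd", "state": "sleeping"},
    ]

    def derive(relation, predicate, attributes):
        """Restrict then project: the only kind of 'read' the system offers."""
        return [{a: row[a] for a in attributes}
                for row in relation if predicate(row)]

    def present_as_file(relation):
        """One presentation among many: render a derived relation as a byte stream."""
        return "\n".join(
            "\t".join(str(v) for v in row.values()) for row in relation
        ).encode("utf-8")

    # The 'file' comes into existence only at the moment of interpretation.
    listing = derive(PROCESSES, lambda r: r["state"] == "running", ["pid", "name"])
    print(present_as_file(listing).decode("utf-8"))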

One perhaps tangential problem is that mainstream OS designers aren't really into making coherent interfaces, let alone applying information theory, even though some of them jabber about that. There is a huge bio-feedback loop between the OS and HW designers and another one between the HW and compiler designers. Each group looks at what the other two did in the latest release and reacts to it. I guess there are three big loops altogether. To a big extent they are isolated from the rest of us; the loops create their own private logic. 'Good' marketeers understand this and know that they are not necessarily selling what's apropos but rather what's available.

Years ago, I had to cope with an IBM mainframe OS called MVS (a successor to the original 'OS/360', which many referred to as 'Obstacle System/360'). The vendor documentation was extremely uncoordinated, arcane, obtuse and never-ending; I thought the obscurity was often deliberate. I didn't learn to make my way around MVS until I found a non-IBM book called 'Invitation to MVS'. This was a pretty short book, while at the time IBM was said to be the second-largest publisher in the world, next to the USA government. It explained MVS in terms of the relationships between its key storage structures, but it didn't do much good since very few MVS users were aware of the book. But things have changed, and today's desktop OS's are more complicated than MVS ever was (eg., MVS was designed to communicate with 'dumb' terminals and a very small selection of devices).

There have been similar though much thicker books about Unix, but today it's ironic that even the Linux developers operate in just as ad hoc a fashion as the others who have various corporate baggage on their backs, plus other insecurities. I guess part of the reason is that they have some baggage too in the form of Unix compatibility, so technique impersonates real design. Of course the marketeers help confuse things with their doubletalk; 'baggage' doesn't sound so bad when you call it 'legacy'. It all reminds me of how a genome researcher I knew explained what he did: "It's a big race". In other words, nobody knows where they're really going. The lack of consistent lingo, along with what Edward de Bono called 'porridge' words and the general decline in the ability to read critically, is also a big impediment (not just in IT - the recent USA health-care debates show this: those against universal health care scare their constituents with TV ads that talk of unrestricted cost increases, which is the basic problem with their private system in the first place!)

There have been a few embedded special-purpose OS's that had minimal (ie., only the essential) logical and coherent programmer interfaces. If I had to give the main design goal of an OS, that is it. But there is a big difference between an interface to hardware and an interface to a logical machine. Probably none of today's mainstream OS's could adapt. Unix originally had what Fred Brooks called 'conceptual integrity', or suchlike, with its file-stream metaphor, but like the others you mention it certainly doesn't have any comprehensive foundation theory akin to relational algebra, and the emphasis remains physical, with lots of physical device library functions, each expressed in terms of the device's characteristics or the underlying machine language. In theory, such a base, if it existed, would offer advantages similar to a relational dbms's: tight definition, logical rules for manipulation, and therefore prediction and correctness proofs.
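
As a tiny illustration of what 'logical rules for manipulation' could buy an OS - again toy Python, my own names, not any real interface - restriction distributes over union, so two different evaluation orders are provably equivalent, which is the kind of prediction and correctness argument a file-stream interface can't give you:

    # Toy code, my own names, nothing to do with any real OS interface. The
    # point: restriction distributes over union, so two different evaluation
    # orders must give the same answer.

    def restrict(rel, pred):
        """Keep only the tuples satisfying the predicate."""
        return {t for t in rel if pred(t)}

    r = {("a", 1), ("b", 2)}
    s = {("b", 2), ("c", 3)}
    big = lambda t: t[1] > 1

    # restrict(r UNION s) == restrict(r) UNION restrict(s)
    assert restrict(r | s, big) == restrict(r, big) | restrict(s, big)
    print(restrict(r | s, big))   # {('b', 2), ('c', 3)}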

Around the time Codd was writing his first papers, IBM had a project called 'Future System' which resulted in a scaled-down product called the System/38. It was only packaged as a mini-computer because the mainframe sales force was afraid a bigger one would cut into the large-system sales that had better margins and bigger maintenance dollars. It had a linear address space, eg., you didn't have to cope with disk or memory architecture, and an abstract machine language, but it copped out by offering most of the same procedural application languages that the mainframes used. At one time, I was unknowingly in charge of a System/38 until one day the customer engineer showed up to look it over. It took an hour to find the room it was locked up in. It was actually a development machine, but nothing ever went wrong with it! At the time, this was unheard of, even among the mini makers of the day - DEC, Prime, Data General and some others I forget. I think part of the reason was not to do with hardware: it was hard for application developers to bring things down. I remember working for a competitor where the doors were locked, not to keep secrets in, but to keep certain IBM materials out for fear of lawsuits - but somebody sneaked some S/38 material in. The reason I mention the S/38 is that I think its designers were hip to the idea of logical-physical separation even though IBM never tried to sell that idea.

I'm probably an old-timer compared to most people in this group and remember having to squeeze IO support that worked fine on a 128K cpu into 16K. In those days IBM expected programmers to know and use the disk architecture; it was called 'channel programming', and the channel was a mini-processor in its own right. Today even the software technocrats miss the point - it's laughable to me how much horsepower gets spent on off-base technique, eg., XML fans think the message is the medium, failing to understand that the key ingredient is compatible database definitions at the message ends. If they understood that, there would probably be only three kinds of messages between db's, eg., insert, delete, replace - or maybe just assert and retract. Or maybe something totally declarative.

I've mentioned here before how I think that if today's memory space had been available thirty years ago, the database implementation field might be much different today, eg., there might be very little need for paging memory support in the typical OS that is concerned with databases. One little craziness is the segment registers of the early Intel processors. At the time they were needed because of the small word size. Now that consumer cpus can address gigabytes and more, the OS people, eg., Windows, have hidden them from the application layer. This is a shame, because they have obvious uses for a dbms that can manage its own memory much better than an OS can. Meanwhile Intel/Wintel regularly add machine instructions to optimize graphics and multi-media.
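
Going back to the message idea above, here is a minimal sketch in Python (the names and shapes are mine, purely illustrative, not a proposal for a real protocol): with compatible definitions at both ends, db-to-db traffic could reduce to assert and retract over tuples, with replace being retract followed by assert.

    # A minimal sketch (names and shapes are mine, not a real protocol) of the
    # idea that database-to-database traffic could reduce to a handful of
    # tuple-level message kinds - or even just assert and retract - provided
    # the database definitions at both ends are compatible.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class DbMessage:
        kind: str       # "assert" or "retract"; replace = retract then assert
        relation: str   # relation name both databases have agreed on beforehand
        row: tuple      # the tuple itself, as (attribute, value) pairs

    def apply(msg, db):
        """Apply one message to a toy in-memory database: name -> set of tuples."""
        rel = db.setdefault(msg.relation, set())
        if msg.kind == "assert":
            rel.add(msg.row)
        elif msg.kind == "retract":
            rel.discard(msg.row)
        else:
            raise ValueError("unknown message kind: " + msg.kind)

    db = {}
    apply(DbMessage("assert", "supplier", (("sno", 1), ("city", "Paris"))), db)
    apply(DbMessage("retract", "supplier", (("sno", 1), ("city", "Paris"))), db)
    print(db)   # {'supplier': set()}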

The reason I mentioned messages has to do with implementation. Just for argument's sake, assuming one wants a 'DBOS' that uses today's hardware, the obvious bootstrap is to use existing hardware drivers, eg., to avoid all the bootstrapping work that Linus Torvalds had to do. Most people I know aren't aware that many drivers are proprietary, which is one of the semi-secrets of Windows' commercial success. If somebody wanted to make a similar solitary effort, this might mean eschewing anything to do with Windows, which might in turn mean avoiding, for example, graphical presentation functions, so I'd think one really wants a very limited kind of OS, which means depending on more general OS'es on separate motherboards. Maybe paging support isn't needed, but I think the two main breakthroughs needed would involve a relational translation compiler and a database-to-database communication theory. The first matters if dependence on today's baggage-laden, bio-feedback driven cpus/motherboards/storage devices is to be avoided, and the second matters if one wants a single-purpose 'DBOS' platform. Back in the 1970's there was much inventiveness among the calculator manufacturers; basically there were no taboos in a new field. Now there are market taboos. Today there are single-purpose calculators, sometimes called 'personal organizers' or even cell-phones, but they aren't what I would call DBOS platforms because their programming interfaces are application interfaces, not db interfaces.

But practically, if I wanted to make a DBOS, I think I would not start with the hardware, rather with something that already exists, such as the physical layer of some relatively open-source dbms - maybe sqlite, maybe a non-Java version of Berkeley DB, anything that has its own abstract interface - but skip the sql layer. One interesting aspect of sqlite is that its virtual machine opcodes are exposed; another is that it is part of the Firefox package, which might have some distribution advantages, as well as maintenance advantages for a small-team or solo effort.
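
For what it's worth, those opcodes are easy to peek at. Assuming a Python build with the standard sqlite3 module (the table and query below are made up for the demo), EXPLAIN prints the virtual machine (VDBE) program that sqlite compiles a statement into:

    # Assuming the standard sqlite3 module; table and query invented for the
    # demo. EXPLAIN returns the VDBE opcodes for the compiled statement.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE part (pno INTEGER PRIMARY KEY, city TEXT)")
    con.execute("INSERT INTO part VALUES (1, 'London'), (2, 'Paris')")

    for addr, opcode, p1, p2, p3, p4, p5, comment in con.execute(
            "EXPLAIN SELECT pno FROM part WHERE city = 'London'"):
        print(addr, opcode, p1, p2, p3, p4, p5)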

Well, I've rattled on too long but even at that I have left lots out!

Received on Wed Nov 11 2009 - 06:39:51 CET
