A research effort on a computing model...

From: Cimode <cimode_at_hotmail.com>
Date: Thu, 7 Feb 2008 15:01:50 -0800 (PST)
Message-ID: <8232213f-1353-40e6-92ed-7ef596016a47_at_q21g2000hsa.googlegroups.com>

Over the last few years, I have been putting some modest but consistent effort into building a computing model whose principles I have embodied in the form of a db core that allows relation definition and manipulation as well as basic relational operations.

As part of a genuine research effort, I hope this computing model and the resulting db core
will be more faithful to core relational principles than what we have become accustomed to with
direct image systems. Because I am getting close to producing a first stable version of
it (probably end of 2008), which I plan on making open source (a mix of assembly and C), I
would like to share some aspects of this model to get some feedback and possibly
correct mistakes I may have overlooked or underestimated.

I have always felt frustrated that we are using systems that do not deliver a tenth of what RM promises, so here are some achievements/aspects (I let you decide) of how this db core may get us one step closer to what could be a TRDBMS. As a note, I have made choices which may seem *unconventional* but nevertheless have serious fundamental justifications that I would be happy to
discuss or amend if I am wrong.

The computing model has the following features and characteristics:

About the computing model and core fundamental concepts...

> The computing model and subsequent db core exclusively respects binary logic.
> The model and the implemented system have 3 layers of abstraction: the media layer (which represents the methodology and protocols used to present information to the user), the
logical layer and the physical layer.
> The resulting core uses new algorithms that, based on a specific logical representation, bind relational concepts to rules of probability and to set theory in order to represent sets in a way I have never observed elsewhere. The immediate consequence of this relationship is to minimize the number of logical reads and writes at run time, among other benefits. The system also makes it possible to exploit properties such as symmetry or commutativity in an interesting way.
> Relation definition is done exclusively through domain definition and the addition/update of constraints on attributes.
> Basic inter relation operations are supported.
> The notion of relation is equated with the notion of type. A new relation *de facto* constitutes a new type that may be reused in other relation definitions.
> The system supports subtyping and makes relation decomposition more practical. It makes
use of the only useful concept in OO: inheritance.
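
To make the "relation as type" and subtyping points more concrete, here is a minimal sketch in Python (not the db core's own language). It assumes a relation heading can be approximated by a frozen dataclass; the names Part, Fastener, Stock and the attribute thread_mm are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass

# Hypothetical heading for the PART relation: {number, name}.
@dataclass(frozen=True)
class Part:
    number: int
    name: str

# Hypothetical subtype: inherits the attributes of Part and adds one more.
@dataclass(frozen=True)
class Fastener(Part):
    thread_mm: float

# A relation type reused as the type of an attribute in another relation.
@dataclass(frozen=True)
class Stock:
    part: Part
    quantity: int

# The body of a relation is modeled as a set of tuples of that type.
stock = {
    Stock(Part(1, "screw"), 100),
    Stock(Fastener(2, "bolt", 6.0), 40),  # a subtype value is accepted as a Part
}

fasteners_in_stock = {s.part for s in stock if isinstance(s.part, Fastener)}
```

This only illustrates the idea of reusing one relation's type inside another definition; it says nothing about how the actual core represents types.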

About a possible language and the opportunities presented by the sublanguage...

> The SQL SELECT verb is not used. Only WITH is emphasized to express a condition
applying to an attribute.
> The SQL INSERT/UPDATE/DELETE verbs are not used, in favor of a more generic MAKE verb,
associated with WHEN (in case of UPDATE), to modify the body of a relation. The purpose of
the WHEN verb is to reflect the dynamic nature of a specific relation.
> The concept of table in the SQL sense totally disappears. A relation has a runtime logical
representation that must explicitly be embodied as whatever is required in the media layer.

For instance, I use the PRESENT2D verb and associate it with the relation definition to represent data as a table. In short, the compiler I am currently working on allows one to write:

PRESENT2D -->states that the expected presentation layer is a bidimensional table
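
As an illustration of keeping presentation separate from the logical representation, here is a small Python sketch. It is not the actual compiler or PRESENT2D implementation; the function name present2d, the header argument and the sample tuples are assumptions made for the example.

```python
def present2d(relation, header):
    """Render a set of tuples as a plain-text two-dimensional table.

    Ordering is imposed only here, at presentation time; the relation
    itself (a set) carries no order.
    """
    rows = sorted(relation)
    # Column widths computed from the header and every row.
    widths = [max(len(str(v)) for v in col) for col in zip(header, *rows)]
    lines = []
    for row in [header, *rows]:
        cells = (str(v).ljust(w) for v, w in zip(row, widths))
        lines.append(" | ".join(cells).rstrip())
    return "\n".join(lines)

# Hypothetical body for PART: {number, name}.
part = {(2, "bolt"), (1, "screw")}
table = present2d(part, ("number", "name"))
```

Another presentation (a file sent by FTP, for instance) would be a different function over the same unchanged set.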

Example using the PART relation with PART: {number, name}


Supposing I needed to send the output of the PART relation by FTP, I
would simply write:

SENDFTP('') [PART WITH NAME = 'screw']

But I can also write...

[MAKE PART = 'other' WHEN NAME NOT 'screw'] for an UPDATE of the PARTs whose name is not 'screw',
[MAKE PART = 1, 'other'] for an INSERT of an additional row, and
[MAKE PART = PART MINUS [PART WITH NAME = 'screw']] for a DELETE of all PARTs named 'screw'.

These are the basics but the grammatical and semantic separation allows interesting
properties when expressing more complex operations.
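
For readers who want to experiment with the semantics of these statements, here is a rough Python sketch that maps WITH and MAKE onto ordinary set operations. It is an approximation under stated assumptions, not the sublanguage itself: the helper names restrict and make_when are hypothetical, and the UPDATE is read as "set NAME to 'other' for every tuple whose NAME is not 'screw'".

```python
# PART: {number, name}, with the body modeled as a frozenset of tuples.
part = frozenset({(1, "screw"), (2, "bolt"), (3, "nut")})

def restrict(relation, predicate):
    """Rough equivalent of [PART WITH ...]: keep the tuples satisfying the predicate."""
    return frozenset(t for t in relation if predicate(t))

def make_when(relation, predicate, change):
    """Rough equivalent of [MAKE ... WHEN ...]: rewrite the tuples satisfying the predicate."""
    return frozenset(change(t) if predicate(t) else t for t in relation)

def is_screw(t):
    return t[1] == "screw"

# [PART WITH NAME = 'screw']
screws = restrict(part, is_screw)

# [MAKE PART = 'other' WHEN NAME NOT 'screw']
updated = make_when(part, lambda t: not is_screw(t), lambda t: (t[0], "other"))

# [MAKE PART = 1, 'other'], read as a union with one additional tuple
inserted = part | {(1, "other")}

# [MAKE PART = PART MINUS [PART WITH NAME = 'screw']]
deleted = part - restrict(part, is_screw)
```

In this reading every modification is just the assignment of a new set to PART, which is why a single verb can cover INSERT, UPDATE and DELETE.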

On the logical side of things...
> Logical order is never a prerequisite in the logical layer for operating on relations.
> The concept of key does not exist, nor does the concept of primary key in the traditional SQL sense. The unique identifier is *de facto* the total set of attributes constituting the relation.
> The concept of foreign key does not exist either, for the same reason.
> Based on the above principles, duplicates as a result of an insert are impossible. The
system automatically discards any duplicate that may potentially be the output of an
insert operation.
> Based on the above principles, duplicates as a result of an update are impossible. The
system automatically discards any duplicate that may potentially be the output of an
update operation.
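
The duplicate-discarding behavior can be sketched with plain Python sets. This is only an illustration assuming the body of a relation behaves like a mathematical set whose identifier is the whole tuple; the sample data is made up and nothing here reflects the core's actual storage.

```python
# PART: {number, name}; the whole tuple acts as the identifier.
part = {(1, "screw"), (1, "bolt"), (1, "nut")}

# Insert of a tuple that is already present: the body is unchanged.
after_insert = part | {(1, "screw")}

# Update that maps two distinct tuples onto the same value:
# (1, 'bolt') and (1, 'nut') both become (1, 'other'), and only one copy remains.
after_update = {
    (number, name if name == "screw" else "other")
    for (number, name) in part
}
```

Note that an update can therefore reduce the cardinality of a relation, which may be surprising for people used to SQL tables.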

On the physical side of things...

> Physical order is irrelevant because of the nature of physical data organization.
> No indexing scheme of any kind is used.
> As a consequence of non direct image representation, the size of physical files may actually decrease as the number of tuples increases.
> Operations are exclusively done on disk. No RAM caching of any sort is performed to operate relations.
> As far as response time is concerned, I have no PRESENT2D statement running above 5 ms (7200 rpm disk), no matter how big the table (up to the limit of 64-bit memory registers). The increase in response time as the number of rows increases is almost negligible.

There are many other things I would like to add, but I will instead try to make them part of future documentation. Any comments would be appreciated. Just keep in mind that this is one man's research effort and far from perfect; I am aware of it. So please don't be too harsh.

Received on Fri Feb 08 2008 - 00:01:50 CET
