A research effort on a computing model...
Date: Thu, 7 Feb 2008 15:01:50 -0800 (PST)
Message-ID: <8232213f-1353-40e6-92ed-7ef596016a47_at_q21g2000hsa.googlegroups.com>
Over the last few years, I have been putting a modest but consistent effort into building a computing model whose principles I have embodied in the form of a db core that allows relation definition and manipulation as well as basic relational operations.
As part of a genuine research effort, I hope this computing model and the resulting db core will be more faithful to core relational principles than what we have become accustomed to with direct image systems. Because I am getting closer to producing a first stable version of it (probably by the end of 2008), which I plan to make open source (a mix of assembly and C), I would like to share some aspects of this model to get some feedback and possibly correct some mistakes I may have overlooked or underestimated.
I have always felt frustrated that we are using systems that do not deliver a tenth of what RM promises, so here are some achievements/aspects (I let you decide) of how this db core may get us one step closer to what could be a TRDBMS. As a note, I have made choices which may seem *unconventional* but nevertheless have serious fundamental justifications that I would be happy to discuss, or amend if I am wrong.
The computing model has the following features and characteristics:
About the computing model and core fundamental concepts...
> The computing model and the resulting db core respect binary logic exclusively.
> The model and the implemented system have 3 layers of abstraction: the media layer (which represents the methodology and protocols used to present information to the user), the logical layer and the physical layer.
> The resulting core uses new algorithms that, based on a specific logical representation, bind relational concepts to rules of probability and set theory in order to represent sets in a way I have never observed elsewhere. The immediate consequence of this relationship is to minimize the number of logical reads and writes at run time, but not only that. The system also makes it possible to exploit properties such as symmetry or commutativity in an interesting way.
> Relation definition is done exclusively through domain definition and the addition or update of constraints on attributes.
> Basic inter-relation operations are supported.
> The notion of relation is equated with the notion of type. A new relation *de facto* constitutes a new type that may be reused in other relation definitions.
> The system supports subtyping and makes relation decomposition more practical. It makes use of the only useful concept in OO: inheritance.
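To make the "relation as type" and subtyping points above more concrete, here is a minimal Python sketch. It is only an analogy and says nothing about how the db core actually implements these ideas: the names `MetalPart`, `Shipment`, `alloy` and `quantity` are hypothetical, and frozen dataclasses merely stand in for relation headings.

```python
from dataclasses import dataclass

# Hypothetical heading for the PART relation: {number, name}.
@dataclass(frozen=True)
class Part:
    number: int
    name: str

# Subtyping via inheritance: a MetalPart is a Part with one more attribute.
@dataclass(frozen=True)
class MetalPart(Part):
    alloy: str

# Reuse of a relation as a type: Shipment has an attribute whose domain is Part.
@dataclass(frozen=True)
class Shipment:
    part: Part
    quantity: int

# Relation bodies modelled as plain sets of (hashable) tuples.
parts = {Part(1, "screw"), MetalPart(2, "bolt", "steel")}
shipments = {Shipment(Part(1, "screw"), 100)}

# Every MetalPart is usable wherever a Part is expected.
metal_parts = {p for p in parts if isinstance(p, MetalPart)}
```

The design choice illustrated is that a subtype tuple can appear anywhere a supertype tuple is allowed, which is one way decomposition into related relations could stay practical.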
About a possible language and the opportunities presented by the sublanguage...
> The SQL SELECT verb is not used. Only WITH is emphasized, to express a condition on an attribute.
> The SQL INSERT/UPDATE/DELETE verbs are not used, in favor of a more generic MAKE, associated with WHEN (in the case of an UPDATE), to modify the body of a relation. The purpose of the WHEN verb is to reflect the dynamic nature of a specific relation.
> The concept of a table in the SQL sense disappears totally. A relation has a runtime logical representation that must be explicitly embodied as whatever is required in the media layer.
For instance, I use the PRESENT2D verb and associate it with the relation definition to represent data as a table. In short, the compiler I am currently working on allows one to write
PRESENT2D -->states that the expected presentation layer is a bidimensional table
[<RELATION NAME> WITH ATTRIBUTE1 = <ATTRIBUTE 1 DOMAIN VALUE>]
Example using the PART relation with PART: {number, name}
PRESENT2D [PART WITH NAME = 'screw']
Supposing I needed to send the output of the PART relation by FTP, I would simply write:
SENDFTP('192.168.10.99:2050') [PART WITH NAME = 'screw']
But I can also write...
[MAKE PART = 'other' WHEN NAME NOT 'screw'] in the case of an UPDATE of the PARTs whose name is not 'screw'
and...
[MAKE PART = 1, 'other'] in the case of an INSERT of an additional row
and...
[MAKE PART = PART MINUS [PART WITH NAME = 'screw']] in the case of a DELETE of all PARTs with the name 'screw'
These are the basics, but the grammatical and semantic separation allows interesting properties when expressing more complex operations.
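As a rough illustration of the statements above, the following Python sketch models a relation body as a set of tuples and maps WITH, MAKE ... WHEN, MINUS and PRESENT2D onto ordinary set operations. This is an assumption-laden reading, not the compiler's actual semantics: the helper names `with_`, `make_when` and `present2d` are hypothetical, and the UPDATE example is interpreted here as setting NAME to 'other'.

```python
from typing import NamedTuple

class Part(NamedTuple):
    number: int
    name: str

def with_(relation, **conditions):
    """[R WITH attr = value]: keep the tuples whose attributes match."""
    return {t for t in relation
            if all(getattr(t, a) == v for a, v in conditions.items())}

def make_when(relation, predicate, **new_values):
    """[MAKE ... WHEN ...]: replace attribute values in tuples matching predicate."""
    return {t._replace(**new_values) if predicate(t) else t for t in relation}

def present2d(relation, fields=Part._fields):
    """Media layer: render a relation as a text table (sorted only for display)."""
    rows = [" | ".join(fields)]
    rows += [" | ".join(str(v) for v in t) for t in sorted(relation)]
    return "\n".join(rows)

part = {Part(1, "screw"), Part(2, "nut"), Part(3, "bolt")}

# PRESENT2D [PART WITH NAME = 'screw']
screws = with_(part, name="screw")
table = present2d(screws)

# [MAKE PART = 'other' WHEN NAME NOT 'screw'], read as an update of NAME
updated = make_when(part, lambda t: t.name != "screw", name="other")

# [MAKE PART = 1, 'other'], read as adding one tuple
inserted = part | {Part(1, "other")}

# [MAKE PART = PART MINUS [PART WITH NAME = 'screw']]
deleted = part - with_(part, name="screw")
```

Note that the logical value (`screws`) exists independently of how it is shown; `present2d` is just one possible media-layer rendering, and a different function could send the same value over FTP.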
On the logical side of things...
On the physical side of things...
> Physical order is irrelevant because of the nature of physical data organization.
There are many, many other things I would like to add, but I will instead try to put them into future documentation. Any comments would be appreciated. Just keep in mind that this is a single man's research effort, far from being perfect, and I am aware of it. So please don't be too harsh.
Received on Fri Feb 08 2008 - 00:01:50 CET
> Logical order is never a prerequisite in the logical layer for operating on relations.
> The concept of key does not exist, nor does the concept of primary key in the traditional SQL sense. The unique identifier is *de facto* the total set of attributes constituting the relation.
> The concept of foreign key does not exist either, for the same reason.
> Based on the above principles, duplicates as a result of an insert are impossible. The system automatically discards any duplicate that might be the output of an insert operation.
> Based on the above principles, duplicates as a result of an update are impossible. The system automatically discards any duplicate that might be the output of an update.
> No indexing scheme of any kind is used.
> As a consequence of the non-direct-image representation, the size of the physical files may actually decrease as the number of tuples increases.
> Operations are done exclusively on disk. No RAM caching of any sort is performed to operate on relations.
> As far as response time is concerned, I have no PRESENT2D statement running above 5 ms (7200 rpm disk), no matter how big the table (up to the limit of 64-bit memory registers). The increase in response time as the number of rows grows is almost negligible.
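The points above about the whole set of attributes acting as the identifier, and duplicates being discarded on insert and update, follow directly from set semantics. A minimal Python sketch, assuming relation bodies are modelled as sets of plain tuples (the `insert` and `update` helpers are hypothetical and unrelated to the on-disk representation of the db core):

```python
def insert(relation, *tuples):
    """Set union: a tuple that is already present is silently absorbed."""
    return relation | set(tuples)

def update(relation, predicate, change):
    """Rebuild the set; tuples that become identical collapse into one."""
    return {change(t) if predicate(t) else t for t in relation}

part = {(1, "screw"), (2, "nut"), (2, "bolt")}

# Inserting an existing tuple leaves the relation unchanged.
after_insert = insert(part, (1, "screw"))

# Renaming both number-2 parts makes them identical, so only one survives.
after_update = update(part, lambda t: t[0] == 2, lambda t: (t[0], "other"))
```

Because the identity of a tuple is its full list of attribute values, no separate key declaration is needed in this model to rule out duplicates.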