An object-oriented network DBMS from relational DBMS point of view

From: Dmitry Shuklin <shuklin_at_bk.ru>
Date: 9 Mar 2007 03:44:37 -0800
Message-ID: <1173440677.467627.35460_at_q40g2000cwq.googlegroups.com>



Shuklin D.E.

An object-oriented network DBMS from relational DBMS point of view

Naive ORDBMS

Let's examine the implementation of an abstract object-oriented DBMS, taking a relational DBMS as the starting point. Rows in OODB tables will be object instances. All rows of a table will be instances of the class that corresponds to that table. Classes are defined by table headings: a table column corresponds to a class field, and the value of a row field corresponds to the value of an instance field. This first iteration of the ODBMS design lets us work with the DBMS in terms of classes, instances and values of instance fields. This OO system still lacks methods, virtual method overriding, inheritance and encapsulation. Nevertheless, it is clear that nothing of value in the RDBMS is lost under this ORDBMS interpretation.
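The mapping described above can be illustrated with a minimal Python sketch. The Person class, its columns and the select helper are hypothetical names chosen for illustration only; they are not part of any real DBMS.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:      # table heading -> class definition (hypothetical example table)
    name: str      # column -> class field
    age: int


# The "table" is simply the collection of all instances of the class.
person_table = [Person("Ann", 31), Person("Bob", 27)]


def select(table, predicate):
    """Relational-style selection expressed over object instances."""
    return [row for row in table if predicate(row)]


columns = [f.name for f in fields(Person)]           # the table heading
over_30 = select(person_table, lambda r: r.age > 30)
```

At this stage the objects carry only data, which matches the text: no methods, inheritance or encapsulation yet.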

Object-relational DBMS

Let's move further. By analogy with OOP, single inheritance can be implemented. A table that inherits from a base table also inherits the columns of that base table or, equivalently, the fields of the base class. The base table should contain all class instances (= rows) from all tables derived from it. Methods can be associated with a row (= instance). The class interface defined in the base table is inherited by the derived tables. Each row method implicitly receives one row as a hidden parameter (= this). These row methods can be regarded as class (= table) methods; they are executed on the data stored in the table row (= class instance, or this). The identifiers of all virtual methods of a single class can be stored in a table known as the vtbl. Let's add to each data table a hidden field (= column) holding the identifier of a vtbl. The vtbl identifier stored in the hidden field of each row makes it possible to override virtual methods. A call to a virtual method of a row is implemented by looking up the virtual method table through the vtbl identifier stored in that row, then finding the implementation of the method by name in that table and calling it. The vtbl makes it possible to invoke overridden methods, defined in derived tables, on rows reached through the base table. So the developed abstract DBMS supports the concepts of inheritance and polymorphism. It is also clear that adding these capabilities preserves all existing RDBMS capabilities.
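The vtbl lookup described above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: rows are Python dicts, the hidden column is called _vtbl, and the names VTBLS, define_table and call are invented for this example.

```python
# Registry of virtual method tables: vtbl id -> {method name -> implementation}
VTBLS = {}


def define_table(vtbl_id, base_id=None, methods=None):
    """Register a table (class); a derived table starts from a copy of the base vtbl."""
    vtbl = dict(VTBLS[base_id]) if base_id is not None else {}
    vtbl.update(methods or {})        # overriding replaces inherited entries
    VTBLS[vtbl_id] = vtbl


def call(row, method_name, *args):
    """Virtual call: find the vtbl via the row's hidden field, then the method by name."""
    impl = VTBLS[row["_vtbl"]][method_name]
    return impl(row, *args)           # the row is passed as the hidden 'this'


define_table("shape", methods={
    "area": lambda this: 0.0,
    "describe": lambda this: f"{this['name']} with area {call(this, 'area')}",
})
define_table("square", base_id="shape",
             methods={"area": lambda this: this["side"] ** 2})

# The base table contains the rows of all derived tables.
shape_table = [
    {"_vtbl": "shape", "name": "generic"},
    {"_vtbl": "square", "name": "sq", "side": 3},
]
areas = [call(row, "area") for row in shape_table]
```

Note that describe is defined only for the base table, yet when it is called on the square row it dispatches to the overridden area, which is the polymorphic behaviour the text describes.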

Encapsulation is implemented in the developed DBMS by using VIEWs. A VIEW protects some table fields from direct access while exposing an interface to the fields. Because each row carries a vtbl identifier, the VIEW also provides access to the method interface.
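As a rough illustration of encapsulation through a view, the sketch below exposes only selected columns of the underlying rows. The View class and the accounts example are hypothetical; a real DBMS would enforce this with its own VIEW and permission mechanisms.

```python
class View:
    """Exposes only selected columns of the underlying rows; the rest stay hidden."""

    def __init__(self, table, public_columns):
        self._table = table
        self._public = tuple(public_columns)

    def rows(self):
        # Project every row onto the public columns only.
        return [{c: row[c] for c in self._public} for row in self._table]

    def get(self, index, column):
        if column not in self._public:
            raise PermissionError(f"column {column!r} is not exposed by this view")
        return self._table[index][column]


accounts = [{"owner": "Ann", "balance": 120, "pin": "0000"}]
public_accounts = View(accounts, ["owner", "balance"])
visible = public_accounts.rows()
```

Asking the view for the pin column raises PermissionError, while owner and balance remain readable.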

As is easy to see, such an implementation of OOP concepts in an RDBMS is not new. PostgreSQL is an example of an existing implementation.

Let's go on. Each row of a table is physically located in storage at some unique address. Even if the RDBMS used as a prototype does not yet do so, it is technically possible to keep this address unchanged for the whole lifetime of the row. An analogue of this address is the bookmark, which modern RDBMSs use to address rows. Having a unique and invariant logical row address lets us implement the concept of pointers to rows. Together with the concepts of inheritance, polymorphism and encapsulation developed earlier, this turns the abstract ORDBMS into a full-fledged OO programming system.
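A minimal sketch of invariant logical row addresses used as pointers might look like this. The Storage class and its method names are assumptions made for the example; real bookmarks are implemented very differently inside a storage engine.

```python
import itertools


class Storage:
    """Hands out logical row addresses that stay fixed for the row's lifetime."""

    def __init__(self):
        self._rows = {}
        self._next = itertools.count(1)

    def insert(self, row):
        address = next(self._next)
        self._rows[address] = row
        return address                 # acts like a bookmark / pointer to the row

    def deref(self, address):
        return self._rows[address]

    def delete(self, address):
        del self._rows[address]        # addresses are never reused


store = Storage()
dept = store.insert({"name": "R&D"})
emp = store.insert({"name": "Ann", "dept": dept})   # field of reference type
dept_name = store.deref(store.deref(emp)["dept"])["name"]
```

Because the address never changes, the reference stored in the employee row stays valid even if the department row's contents are updated.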

Note that the concept of references to rows is also not new and was implemented long ago in such well-known RDBMSs as Oracle.

Nevertheless, it is clear that, once again, nothing of value in the RDBMS has been lost under this ORDBMS interpretation. As before, the developed abstract ORDBMS includes the RDBMS as a special case.

Network OODBMS

Let's continue building. A table in the ORDBMS is a collection of bookmarks that refer to rows. So several tables can hold a reference to one row, and the same row instance can be contained in several tables. A row instance is contained in the table that corresponds to the row's class, and the same row instance is contained in all the base tables (= classes). The interface implemented by the row (= columns + methods) corresponds to the class of which the row is an instance. It is significant that this interface is not equivalent to the interfaces of the base tables: it is wider than they are. However, it is compatible with the interfaces of all tables that contain the row. This is how we arrive at the concepts of interfaces, abstract classes and multiple inheritance of interfaces.
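The idea of tables as collections of bookmarks, with one row instance held by a derived table and all of its base tables, can be sketched as follows. The Table class and the person/employee tables are hypothetical, and plain Python object references stand in for bookmarks.

```python
class Table:
    def __init__(self, name, columns, base=None):
        self.name = name
        self.base = base
        # A derived table's interface includes the interface of its base table.
        self.columns = (base.columns if base else ()) + tuple(columns)
        self.bookmarks = []            # references to rows, not copies

    def insert(self, row):
        """Add the row to this table and to every base table up the chain."""
        table = self
        while table is not None:
            table.bookmarks.append(row)
            table = table.base


person = Table("person", ["name"])
employee = Table("employee", ["salary"], base=person)

ann = {"name": "Ann", "salary": 100}
employee.insert(ann)
```

Both tables now hold the same row object, and the row's interface (name, salary) is wider than the interface of the base table (name) while remaining compatible with it.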

This is a very important step, because membership of a row in a table is now determined by the compatibility of the row's interface with the interface fixed for that table. This makes the next step possible: to consider tables not just as physical storage for rows, but as collections of instances of classes whose interfaces are compatible with the interfaces defined for those collections. Note the potential independence of the interface defined for the table from the interface fixed for the row. Although the developed system can, in a special case, operate just like an RDBMS, tables in the developed DBMS are not relations in the classical sense. In general, the rows of these tables (= collections) are not a subset of the Cartesian product of the interface declaration and the possible domains of the interface. True, collections in the developed system can still be interpreted as such a subset, and this is what makes the RDBMS a special case of the developed system. But the polymorphism of instances (= rows), which gives access to their internal structure through public interfaces, leads in this DBMS to interfaces (= columns and/or methods) being shared among many tables (= collections). These tables cannot be considered relations, because a change to a row's fields made through one table causes changes in all the other tables.
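To make the two points above concrete (membership by interface compatibility, and changes made through one table showing up in all others), here is a small sketch. The Collection class and the sample data are assumptions for illustration only.

```python
class Collection:
    """A table seen as a collection of rows whose interface is compatible with its own."""

    def __init__(self, interface):
        self.interface = frozenset(interface)
        self.rows = []

    def accepts(self, row):
        # Compatible means: the row offers at least the columns the collection requires.
        return self.interface.issubset(row)

    def add(self, row):
        if not self.accepts(row):
            raise TypeError("row interface is not compatible with this collection")
        self.rows.append(row)          # stored by reference, not copied


named = Collection({"name"})
priced = Collection({"name", "price"})

book = {"name": "SQL primer", "price": 30, "pages": 200}
named.add(book)
priced.add(book)

priced.rows[0]["name"] = "SQL primer, 2nd ed."   # update through one collection
```

After the update, named.rows[0]["name"] also reads "SQL primer, 2nd ed.", which is exactly why such collections do not behave like independent classical relations.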

Let's examine data interfaces in more detail. The interface implemented by a row consists of the signatures of the methods applicable to that row, together with the definitions of the fields (= columns) that the row inherits from base tables. Table inheritance makes it possible for the same columns (= fields) to be owned by different tables simultaneously. As a result, a row (= object) instance has values for the columns described in all inherited interfaces. However, only a subset of those values is available through a given table: the intersection of the set of all columns inherited by the row with the set of all columns fixed as the interface of the collection (= table). What is a row in this case? A row is a subset of the Cartesian product of all possible columns and all possible values of those columns. Here we obtain an interesting and unexpected result: although the developed abstract ORDBMS includes the RDBMS as a special case, it is the row, not the table, that is a relation.
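The intersection rule and the view of a row as a set of (column, value) pairs can be sketched like this. The function name visible_through and the sample columns are hypothetical.

```python
def visible_through(row, table_interface):
    """Values visible through a table: row columns intersected with the table interface."""
    shared = row.keys() & set(table_interface)
    return {column: row[column] for column in sorted(shared)}


row = {"name": "Ann", "salary": 100, "badge": 7}
person_interface = ["name", "birthday"]     # the table also declares a column the row lacks

view_of_row = visible_through(row, person_interface)

# The row itself, seen as a subset of (all columns) x (all values):
row_as_relation = set(row.items())
```

Only name survives the intersection here: salary and badge are not part of the table interface, and birthday is not a column of the row.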

Summary

During development, no RDBMS capabilities were removed from the abstract DBMS; new capabilities were only added. The developed ORDBMS includes the RDBMS as a special case. Rows are instances that inherit many columns but publish only some of them through table interfaces. Both rows and columns can be placed in several different tables simultaneously. Tables are collections of instances and are not classical relations. We could say that class instances (= rows) are nodes and the fields of these instances are attributes of the nodes. Fields of reference type hold pointers to other nodes. Nodes can have millions of attributes while publishing only some of them through the interfaces of collections. Attributes can hold both scalar values and references to other nodes, thus forming a network. So a relational DBMS is just a special case of a network DBMS.
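The node-and-attribute picture summarized above can be illustrated with a short sketch. The Node class and the reachable helper are invented for this example and say nothing about how an actual network DBMS stores its graph.

```python
class Node:
    def __init__(self, **attributes):
        # Attribute values are either scalars or references to other nodes.
        self.attributes = dict(attributes)


def reachable(start):
    """All nodes reachable from start by following reference-valued attributes."""
    seen, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        stack.extend(v for v in node.attributes.values() if isinstance(v, Node))
    return order


city = Node(name="Kyiv")
ann = Node(name="Ann", lives_in=city)
bob = Node(name="Bob", lives_in=city, friend=ann)
ann.attributes["friend"] = bob      # cycles are allowed: this is a net, not a tree

names = sorted(n.attributes["name"] for n in reachable(ann))
```

A purely relational table corresponds to the special case in which every attribute is a scalar and no node refers to another.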

Whether this OODBMS can be implemented efficiently requires further research.

A prototype of such a network object-oriented DBMS can be downloaded from
http://www.codeplex.com/Cerebrum .

The object-oriented network knowledge base (OONKB) Cerebrum has the following features:

It saves the current state of the object graph or neural network in the OONKB between user sessions, including the current topology of the objects, so that the objects do not have to be created again on the next run.

It limits the amount of memory used by an object graph or neural network with a large number of class instances. The most frequently used objects are kept in RAM; the others are moved to the physical storage area and loaded into RAM on demand. Rarely used objects are unloaded when other objects are loaded into RAM. Limiting memory use makes it possible to avoid the paging file, which significantly improves performance when modeling networks with large numbers of class instances.
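The load-on-demand and unload-when-rarely-used behaviour described above resembles a bounded cache in front of persistent storage. The sketch below uses a simple least-recently-used policy as a stand-in; it is not Cerebrum's actual implementation, and the ObjectCache class, its methods and the dict used as "disk" are all assumptions for illustration.

```python
from collections import OrderedDict


class ObjectCache:
    """Keeps at most `capacity` objects in memory; the rest live in `storage`."""

    def __init__(self, capacity, storage):
        self.capacity = capacity
        self.storage = storage             # stands in for the physical storage area
        self.in_memory = OrderedDict()     # least recently used entry comes first

    def get(self, key):
        if key in self.in_memory:
            self.in_memory.move_to_end(key)          # mark as recently used
        else:
            self.in_memory[key] = self.storage[key]  # load on demand
            self._evict()
        return self.in_memory[key]

    def put(self, key, obj):
        self.in_memory[key] = obj
        self.in_memory.move_to_end(key)
        self._evict()

    def _evict(self):
        while len(self.in_memory) > self.capacity:
            old_key, old_obj = self.in_memory.popitem(last=False)
            self.storage[old_key] = old_obj          # write back before unloading


disk = {}
cache = ObjectCache(capacity=2, storage=disk)
cache.put("a", 1)
cache.put("b", 2)
cache.put("c", 3)      # exceeds capacity, so "a" is written to disk and unloaded
```

The sketch assumes a capacity of at least 1. A later cache.get("a") reloads the object from disk and unloads the entry that has gone unused the longest.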

The primary goal of this research is to create a virtual machine supporting a free-topology object-oriented network with up to 2 billion object instances within one physical storage area. This is made possible by the implementation of the network object-oriented knowledge base, so that only a few class instances are in RAM while most objects are frozen in the file system.

I would greatly appreciate your feedback. Thank you for reading.

WBR, Dmitry Shuklin, PhD, Ukraine

Received on Fri Mar 09 2007 - 12:44:37 CET
