The TransRelational Model: Performance Concerns
Date: 10 Nov 2004 23:54:59 -0800
Message-ID: <1c92edeb.0411102354.45157060_at_posting.google.com>
Recently there has been some excitement created amongst database enthusiasts (though not researchers) by the patented technology called the TransRelational(TM) Model. It was brought to the attention of the general public by C.J. Date who gave a brief description of it in the appendix of the 8th edition of his classic book "An Introduction to Database Systems". Elsewhere he talks about it as "a radically new and dramatically different technology" which would let us build an implementation of the relational model which "would be orders of magnitude faster than, and would deliver a far greater degree of data independence than, today's SQL products."
Unfortunately, the available documentation on the TransRelational Model is quite scarce. Besides C.J. Date's appendix the only source of information that an interested person can turn to is the published patent. (A side note: if we do a search on 'TransRelational' on Google we can get to the web site of the patent assignee, Required Technologies Inc. [http://www.requiredtech.com/]. The site is a quick read.)
I took the time and read the published patent (and patent applications) because I wanted to know more about the technology that (the otherwise often skeptical) C.J. Date calls "the most significant advance in this field since Codd gave us the relational model, nearly 35 years ago." This is a long email so I copied the 'Conclusions' section from the end of the text so you can read it as a kind of an 'Abstract' of the analysis.
As for the published patent description, it does not provide more information than C.J. Date's appendix but it is much harder to read. (As it is usual with patent publications.)
In the appendix C.J. Date mentions a book in the works, called "Go Faster! The TransRelational(TM) Approach to DBMS Implementation." When I contacted the Database Debunkings site for the publication date of the book I was told that it is unlikely to appear in the near future due to legal issues (of undisclosed nature.)
Source of C.J. Date quotes:
http://www.dbdebunk.com/page/page/619867.htm
Links to the relevant patents:
http://www.dbdebunk.com/page/page/618862.htm
As a part of my preparation for my Ph.D. dissertation I had to read a
couple of articles on the physical implementation issues of relational
DBMS's. In particular, I was interested in indexing techniques
(costs/benefits) and ways of improving natural join performance. These
articles made me think a little bit about the TransRelational Model
(abbreviated as 'TRM' from now on) and what I came up with is a bit
discouraging.
Here follows the detailed analysis:
('disk page', 'page', 'disk block', and 'block' are used interchangeably
in the text.)
First, in the TRM 'columns' of a relation are stored separately. To support effective searching, these columns are sorted individually based on attribute values. If we want to access more than one attribute of a given record we have to be able to 'reconstruct' the record. This is the Record Reconstruction (RR) step. RR in the TRM is handled in the following way: each attribute value of a given record has a 'link' to the value of the 'next' attribute of the same record. ('next' is between quotes because the actual order in which attributes are linked together is more or less arbitrary.) These links form a cycle so we can start the reconstruction of the record from any attribute value.
Example:
relation EMP:
E# ENAME HIREYEAR SALARY
30 Johnson 1993 5000
20 Smith 2000 6000
40 Hewitt 1998 9000
50 Bailey 1990 7000
TRM representation of relation EMP:
Field-Values (FV) Table (notice that each column is sorted):
ROW E# ENAME HIREYEAR SALARY
1 20 Bailey 1990 5000 2 30 Hewitt 1993 6000 3 40 Johnson 1998 7000 4 50 Smith 2000 9000
Record Reconstruction (RR) Table:
E# ENAME HIREYEAR SALARY
4 1 3 +---> 2 --+
+-> 3 3 +--> 1 ---+ 1 |
| 2 (2) --+ 4 4 |
| 1 4 2 3 |
| |
+-------------------------------------------+
Fig. 1: RR for the record where ENAME = 'Johnson'
What are the advertised benefits of this data organization?
- All attributes are stored in sorted order so practically the relation is simultaneously indexed by _all_ of its attributes. This makes range queries and equivalence queries quite efficient since we can do an initial binary search on the search attribute.
- Performing ORDER BY queries that involve one attribute of the relation becomes trivial. Even in the case of multiple attributes the bulk of the sorting is already done since we can traverse the FV table using the first attribute in the sort order.
- Since values that are the same are stored consecutively, performing GROUP BY (or SUMMARIZE BY) queries becomes much more efficient. In addition, various compression techniques (e.g. collapsing runs of values that are the same) become feasible so significant storage savings can be achieved.
(The organization can also be extended to support joins using 'merged
columns' but in this analysis I will stick to the single relation case.)
How do we use the TRM?
Let's suppose we would like to bring up the records of employees whose name is 'Johnson' (using Tutorial D):
EMP WHERE ENAME = 'Johnson'
How to compute the result in the TRM? Well, first we perform a binary search on the ENAME column in the FV table (cost: log ||EMP||, where ||EMP|| is the number of tuples in relation EMP) and then follow the attribute value links in the RR table (shown on Fig. 1)
So far so good. This is how the TRM is supposed to work.
Now let's compare the TRM's performance on this example to that of any of today's major DBMS's. For the sake of the example, let's say that in the major DBMS we have a B-Tree index defined on the E# attribute (this is a primary key so it is a reasonable assumption) and also that we have a secondary B-Tree index on the attribute ENAME. HIREDATE and SALARY are not indexed. To make the example more realistic, let's suppose that we have four additional attributes (non-indexed) in EMP: DEPT#, EMAIL, ADDRESS and PHONENO.
First, notice that in any 'standard' DBMS implementation, attribute values that belong to the same record are usually stored on the same page (disk block.) In the standard setting record reconstruction is trivial (a single block read) once we located the record.
In the analysis that follows I will use the following symbols:
ACNT 8 number of attributes in EMP |EMP| 1`000 number of pages of relation EMP ||EMP|| 100`000 number of tuples in relation EMP ||RS|| number of tuples in the Result Set
Let ||EMP|| be quite large (say, around 100`000). Let's say we store 100 employee records per page on average. This means that |EMP| is 1000.
The TRM stores each column separately so we can think of them as binary relations where the first attribute is the attribute from the original record and the second attribute is the link from the RR table. Each binary relation is sorted on the attribute values. Let's call each such table with the same name as the original attribute with '_FV' added as suffix. For example: E#_FV, ENAME_FV, etc.
The ENAME_FV Table:
ENAME LINK
Bailey 1
Hewitt 3
Johnson 2
Smith 4
The assumption that the TRM stores each FV column separately along with its corresponding links is not an assumption that is very important from the point of view of the analysis. What is important is that no matter how the TRM actually chooses to physically store column values the storage must facilitate binary search on the values and then provide easy access to the link that points to the next attribute value. If we want to do binary search on disk (since we cannot fit the whole relation in main memory) it is a good idea to keep as many attribute values on one disk page as possible to reduce the number of disk accesses. It is also a good idea to keep the next-attribute-link close to the attribute value so that once we found the attribute value we were looking for we can immediately begin the reconstruction of the record. Otherwise we would need an additional block read to fetch the link corresponding to the attribute value. The design of _FV tables shows these desirable properties and any other design for storing FV columns has either the same performance or worse. (For example, if fewer ENAME values are stored on a single page, then any binary search on ENAME will use more block reads than it would use on the ENAME_FV table.)
[[ A little side note, regarding the column compression feature of TRM:
The column compression feature of the TRM *does not* reduce the size of the RR table only the FV table. Also, the FV table will be reduced in size only if the cardinality of the given attribute is low. In our example only the DEPT# and, to some extent, the SALARY columns seem to lend themselves for compression. Compressing these two columns would modify the results of the analysis only slightly, so for the sake of simplicity we will ignore this possibility for now.) ]]
Obviously, the number of records in each _FV table is the same as in the original relation:
||EMP|| = ||E#_FV|| = ||ENAME_FV|| = ... However, since the record size is smaller the _FV tables will occupy less number of disk pages. If we could store 100 records of the original relation with 8 attributes on one page then we can store 800 records of the _FV tables on one page (on average). This means that each _FV table will use 125 pages on average:
|EMP| / 8 = 1000 / 8 = 125 = |E#_FV| = |ENAME_FV| = ... Now let's compare the cost of some simple retrieval queries on both storage organizations (ie: Traditional and TRM)
First, observe that we have at least two factors that contribute to the
total cost of executing a query: (L) the cost of locating the record,
(RR) the cost of reconstructing the record.
Query 1:
EMP WHERE E# = 30 This is a restriction on the primary key so the result set is of size one at most. Let's suppose we have an employee with E# equal to 30.
Traditional RDBMS:
L: B-Tree search that results in a physical RowID.
Cost: at most 2 blocks (the branching factor of a B-Tree is very high)
RR: Accessing the record using the RowID.
Cost: 1 block
TRM: L: Binary search in table E#_FV.
Cost: log |E#_FV| blocks = log 125 blocks = 7 blocks
RR: Accessing all _FV tables for the remaining attributes of the record.
Cost: (ACNT - 1) blocks = 7 blocks
Total cost of fetching a record based on its single attribute primary key
(assuming a B-Tree index in the traditional case):
Trad: 3 blocks TRM: 14 blocks
Query 2:
EMP WHERE ENAME = 'Johnson'
Let's suppose that the size of the result set is 20. (We have twenty employees whose last name is Johnson.)
Traditional DBMS:
L: B-Tree search that results in the physical RowID of the first record
that has 'Johnson' as its ENAME attribute. Since RowIDs in the leaf nodes of the B-Tree are kept in key order all we have to do is sequentially read the RowIDs that follow until we hit a key with a different ENAME.
Cost: at most 2 blocks for search (the branching factor is very high)
RR: Accessing the records using their RowIDs.
Cost: ||RS|| blocks = 20 blocks
TRM: L: Binary search in table ENAME_FV.
Cost: log |E#_FV| blocks = log 125 blocks = 7 blocks
RR: Accessing all _FV tables for the remaining attributes of the records.
There is no guarantee that two records that follow each other when we sort the relation on one attribute will follow each other (or will be even close to each other) if we sort the records on some other attribute. The consequence is that when we follow the links to the next attribute in the RR phase it will be more or less random access of _FV tables. So for each record that appears in the result set we have to read 1 block of each of the different attribute value tables. This is why the cost of RR is so high in this case.
Cost: (ACNT - 1) blocks * ||RS|| = 7 * 20 blocks = 140 blocks
Total cost of fetching a record based on single non-key attribute
(assuming a B-Tree index in the traditional case):
Trad: 3 blocks TRM: 147 blocks (!)
Query 3:
EMP WHERE SALARY = 5000 The final case is when we the attribute that appears in the query is not indexed in the traditional RDBMS. Let's suppose that the result set is again of size 20.
Total cost:
Trad: |EMP| blocks = 1000 blocks (full table-scan) TRM: 147 blocks
It is easy to see that the TRM does not care about which attribute we search on (it is 'unbiased') so the its execution cost is the same as for the second query (EMP WHERE ENAME = 'Johnson').
The total cost function of the TRM based on this analysis is the following:
Locating the result set using binary search (equality or range):
log (|EMP|/ACNT) blocks
Record Reconstruction:
(ACNT - 1) * ||RS|| blocks
Two important observations:
- After all, binary search is not that cheap on large relations.
- Record reconstruction cost is a linear function of *both* the attribute count and the size of the result set.
For example, in Q2, where ACNT was 8 and ||RS|| was 20 the total cost of generating the result set using the TRM turned out to be 147 blocks!
The TRM had to fetch 15% of the total relation *after* locating the records in question to generate the result set that consisted of 20 records, 0.02% (!) of the relation.
As the result set grows the performance of TRM degrades rapidly:
||RS|| % of ||EMP|| Total Cost % of |EMP|
40 0.04 287 blocks 28.7 60 0.06 427 blocks 42.7 80 0.08 567 blocks 56.7
Keep in mind that we have 100000 records in EMP. 40 records in the result set will make the TRM read in almost one third of the database!
Of course, as the result set grows the likelihood that we 'hit' the same field-value block is getting higher so the actual numbers will be somewhat smaller. Nevertheless, for small to medium sized result sets the overhead of the TRM is enormous. Of course, if the query is such that it has to process almost the entire relation than the TRM is not worse than the traditional approach.
Notice, that we are still talking about retrieval costs where the TRM supposedly shines because it keeps the database records sorted by all attributes at the same time.
Let's look at insertion costs now.
INSERT NEW_EMP_REC INTO EMP
(NEW_EMP_REC is a record that contains the data of a new employee.)
Traditional RDBMS:
Insertion: Two B-Tree inserts (the E# and ENAME indexes) and a single block access for inserting the record itself.
Cost: 2 * 2 + 1 = 5 blocks
TRM: Insertion: If we want to insert a new tuple into the EMP relvar the TRM has to insert _every_ attribute value into its corresponding _FV table using binary search. But this is not enough in itself. Since the TRM uses positional links we are forced to deal with the ordered array insert problem. We have to make room for the new value which means that we have to shift the rest of the table and we have to update the links in the 'previous' _FV table. Let's be generous and ignore the costs of making room/updating links step. As we will see, even without this additional overhead the cost of insertion is quite high.
Cost:
Binary searches:
ACNT * log (|EMP|/ACNT) = 8 * log (125) = 8 * 7 = 56 blocks (!)
Conclusions
Although the TransRelational Model seems to be very appealing at first glance, closer analysis brings out its shortcomings. To summarize: (1) retrieval is prohibitively expensive for any reasonably sized relation because of the huge record reconstruction costs incurred by the vertical decomposition of tuples into 'Field-Value' tables, (2) insertion is prohibitively expensive because of the many binary searches and value insertions that have to be performed for each attribute of the record to be inserted.
Another observation is that the reward offered for all the pain, binary search on all attributes, is 'just' a log(n) search which is too slow compared to B-Trees. Not to mention that binary search is also a random access search which is not very cache friendly.
Also, the current analysis completely ignored the cost of making room for new values on insertion and maintaining the links between the _FV tables after deletion and insertion.
Regards,
Josh Hewitt
Ph.D. Student
Florida Institute of Technology
Received on Thu Nov 11 2004 - 08:54:59 CET
