First Impressions on Using Alphora's Dataphor

From: Josh Hewitt <lajos.nagy_at_gmail.com>
Date: 26 Aug 2004 10:10:56 -0700
Message-ID: <1c92edeb.0408260910.60242a59_at_posting.google.com>



I have recently downloaded Alphora's Dataphor product and begun experimenting with it. I tried to write a simple (non-GUI), toy application based on the principles put forward in the The Third Manifesto (TTM) and also in the articles published on the Database Debunkings [http://www.dbdebunk.com] site. I defined scalar types and scalar operators and I also spent some time designing a normalized database schema along with integrity constraints. I did not use any external storage device (that is to say, a SQL database back-end). I only experimented with the 'in-process' server. (Although I plan to connect to an Oracle instance later to see how it performs.)

Good things

  • Expressing relational algebraic (relalg) queries in D4 was much easier than it used to be in SQL. The resulting queries were usually clearer, significantly more compact, and, of course, were seamlessly integrated with the rest of the language. It is important to mention that several queries that would have been too cumbersome to bother with in SQL became expressible in D4, quite often easily expressible. I think this is mainly due to the orthogonal design of the language which allows arbitrary nesting of relalg expressions. Needless to say that with D4 being truly relational, or, in other words, with D4 removing duplicates at each point of the evaluation, one can rest assured that the results will be relations in the mathematical sense.
  • The ability to declaratively define arbitrary constraints in Dataphor's D4 language was truly impressive. Knowing that the DBMS would enforce the constraints at all times gave great confidence that the database could not be modified in an inconsistent way. As it turned out later, this helped in finding subtle logic errors in the algorithms that tried to update the database.

Nuisances

  (I will further elaborate on this point and its implications a bit more   at the end of the post.)

Musings

How can one do software maintenance in a live Dataphor installation?

Since scalar types form the building blocks of the database design, almost every database object will depend on them, either directly or indirectly. Relvars and operators are all defined using types while constraints are defined in terms of expressions involving relvars and operators. The consequence is that every scalar type will have several dependencies and several dependents. How can one fix a bug that requires more than cosmetic changes in this setup? You cannot just drop the type since everything depends on it. Also, what are you going to do with the data in your relvars? You cannot just drop your production tables. What are you going to do with older applications that depend on (the now obsolete) old types/operators/tables/views. Some suggested creating views for the old applications based on the new tables. I do not think that this approach is viable in the long run. It would mean duplicating the affected parts every time there is a bug fix. Also, the added complexity of maintaining 'parallel universes' for old and new programs is not something that one would want.

Now I digress a little bit so the following musings are about UDTs in RDBMSs in general. Please do understand that I am in full support for truly relational systems and I pose the following question in order to provoke a healthy debate that might focus attention to areas neglected by relational theory but which are still important in practice:

What if major vendors 'do get it' but enterprise environments simply 'squeeze out' anything but atomic types (numbers, character strings, booleans, etc.) from the database?

Quite often, in enterprise environments, development teams rarely have access and/or control over applications written by other teams. Given the tangled web of dependencies in a 'UDT-enabled' RDBMS, achieving success suddenly becomes a factor of how well-designed those UDTs are in the first place, and if they need modification, who has the right (and the resources) to perform those modifications. I think this is the make or break question of the real life success of truly relational databases right along with the questions of performance (which I do not want to address right now.) Most development projects try to have control over as many factors of their success as possible. (That is why everybody is building several layers of abstractions in order to 'insulate' themselves from the ever-changing environment.) Average workplaces have average developers on average. This means that it is rarely the case that developers can work on a database that is well-designed. Even in database design there are a lot of ways to screw up and I do not think that designing UDTs is easier business in any way. So in a UDT-rich world developers would be forced to work with someone else's not-so-well designed UDTs most of the time. Not to mention that fixing a badly designed mess can be very time-consuming, especially if it is not the job one were hired to do in the first place. Even fixing serious database design flows in today's UDT-free databases often meets with severe resistance because of the (not completely unsubstantiated) fear of breaking other applications. Is it not better then to forget about UDTs in the database and only allow them in programming languages where developers have complete control over them? Is it not the case that the lack of UDTs in today's databases is more of a blessing then a curse?

Sorry for the long post, but I felt that this might be an interesting topic to discuss in this news group.  

Josh Hewitt

Ph.D. Student
Department of Computer Sciences
Florida Institute of Technology Received on Thu Aug 26 2004 - 19:10:56 CEST

Original text of this message