Catalog Concepts and Types for Business Application Development

From: jlowery <jslowery_at_gmail.com>
Date: 26 Feb 2007 08:32:43 -0800
Message-ID: <1172507563.157196.89780_at_j27g2000cwj.googlegroups.com>



I recently finished one of my old college text books that I never read named "An Introduction to Database Systems" by Date. A lot of the concepts in the book seemed to be natural conclusions to my continuously developing sense of software design. The reason I am writing this post is to elicit comments from others who may have walked down a similar path before, being able to provide insight into what works and what doesn't.

As a member of a consulting group, I develop web-based applications for businesses with standard database-oriented requirements: access control systems, data entry forms, work flow management, etc.

As of late, I have been wanting to develop more library routines and framework components to help produce product more efficiently.

It occurred to me that there is much productivity to be gained from a more relational design perspective. However, our tools of the trade are 100% open source, Python being our de facto language and MySQL our standard database.

As I am sure everyone is aware, MySQL is crippled compared to the full power of relational programming. Likewise, the Python web-application components of today do not have the appropriate facilities for business application development, most of them being focused on web site-oriented applications. I came across the Dataphor platform. While intriguing, it is completely incompatible with our technologies.

As such, I am looking to implement some system components to provide some of the more useful features of relational programming to my context. Unfortunately, I do not have the resources for a full-scale product implementation. My biases lean heavily toward pragmatism and implementation time.

In attempting to derive a methodology, my thoughts immediately drifted to the catalog abstraction used to store meta data about a given database. My initial idea is to create a custom catalog database that overlays a target database, providing meta data about the relations, fields, domains, views and access control. Synchronization tools would be written to keep the custom catalog consistent with the target database.

Such a catalog would allow me to implement additional data structures useful for web applications, especially concerning views, forms and access control.

Domains (types) would be implemented as immutable Python classes. For example:

ZIP_RE_STD = re.compile(r'^\d{5}$')
ZIP_RE_EXP = re.compile(r'^(\d{5})[\-](\d{4})$')

class zip_code(str):

    def __new__(cls, s):

        if not isinstance(s, basestring):
            s = str(s)
        inst = str.__new__(cls, s)
        if ZIP_RE_STD.match(s):
            inst._expanded = False
            inst._base = inst
            inst._addon = ''
            return inst

        match = ZIP_RE_EXP.match(s)
        if match:
            inst._expanded = True
            inst._base = match.groups()[0]
            inst._addon = match.groups()[1]
            return inst

        raise TypeError, "Invalid zip code %r" % s

   _at_property
    def expanded(self):

        return self._expanded

    _at_property
    def addon(self):

        return self._addon

    _at_property
    def base(self):

        return self._base

Now the immediate problem with this approach is that mapping the proper type to the real database type becomes a conversion or casting process without SQL language parsing. Simple retrieval operations on view or tables could be automatically mapped based on the field definitions.

However, for more complex queries, an explicit set of types would need to be provided.

query((currency, currency, string), """

    SELECT base_salary+insurance_benefits_total, base_salary, employee_name

    FROM employee""")

Or the type information could be stored in the database with some mapping between queries, stored procedures, views, etc.

This a lot of trouble, but I see a few advantages as well:

  • easily storing references to procedures and types defined in application code in fields
  • use the type names in relations that map them to handlers for application functionality

For example, mapping domain types to procedure HTML form element generation code.

/* type definition as closure */

def textbox(size=10):

    class textbox_gen(object):

        def __init__(self, value):
              self.value = value

        def __str__(self):
            return '<input type="text" size="%s" value="%s" />' %
(cgi.quote(str(size), True), cgi.quote(self.value))

    return textbox_gen

/* simplified catalog definition of field */
relation |field |type



type2html|type |type
type2html|html_gen|html_gen

/* type2html sample value */

type |html_gen



string|textbox(20)

Using the same technique, procedural values could also be stored on tables to allow the library algorithms to use higher-order, referentially-transparent functions.

Although I am not big on object-oriented programming, I do believe that types provide facilities for powerful abstractions. Interweaving types into application-level logic with database programming seems the most natural thing in the world to me.

However, I find it difficult to discern when to use a type with property accessors or when to use a full out relation. I would assume that its proper to use the immutable types when it makes no logical sense to set the values on a given property.

For example, the above zip_code has two provided attributes. base and addons.

It would seem silly to create a relation like so:

zip_code(PK) |base |addon


55555               |55555|
55555-4444       |55555|44444

Unless I was maybe writing an application for the post office :)

One thing I do know is that the more meta logic is folded out of the database, the more adversely affected the performance will be.

Anyway, this post turned out to be a hodgepodge. I have been writing standard SQL programs for a very long time, but the idea of conceptually extending the database is something I'm still toying with in my mind. In the end, I realize that most of this mental play is simply to get around the limitations of the SQL database status quo. A part of me feels cheated.

At the same time, I am confident that someone already wrote everything I'm thinking of for green screens back in the late 70's. Received on Mon Feb 26 2007 - 17:32:43 CET

Original text of this message