Re: OO versus RDB

From: Christian Brunschen <>
Date: Wed, 28 Jun 2006 20:13:05 +0000 (UTC)
Message-ID: <e7unsh$s41$>

In article <eZxog.3346$>, Bob Badour <> wrote:
>Christian Brunschen wrote:
>> In article <>,
>> erk <> wrote:
>>>topmind wrote:
>>>>[...] (Unlike OO, where encapsulation
>>>>encourages each object/class to reinvent its own
>>>>add/change/delete/cross-reference/search rules and interfaces so that
>>>>they are all different for each project or shop.
>>>Agreed. I wouldn't have a problem with inconsistencies if the languages
>>>just offered some powerful basic operations. You can't even write
>>>something in Java like this, which would be completely type-safe:
>>>Set<LineItem> items =
>> Perhaps somewhat interestingly, in a dynamic OO language such as
>> Smalltalk, Objective-C or Ruby, you can use the technique of Higher-Order
>> Messaging (HOM), as described in the 2005 OOPSLA paper
>> <>, to do
>> something quite similar; in Objective-C syntax something like
>> NSSet *lineItems =
>> [[[[order lineItems] selectWhere] status] equals:STATUS_SHIPPED];
>You have given an example involving only restriction.

If I interpret things correctly, this is what is called 'selection' in some places (such as <> and <>)?

>Does the method work for project, extend and join as well?

Caveat: I am an OO developer, and use relational databases a lot, but there is a lot of precise terminology with which I am not conversant, and I *will* make mistakes. But they will be just that, mistakes.

A bit more background:
Objective-C today is mainly used by Apple in their Cocoa frameworks. Objective-C has some very nice and useful features:

- 'Categories' allow you to extend classes with new methods after-the-fact
- it uses late, dynamic binding
- it actually separates messages from methods
- there is a 'catch-all' method that is invoked for any message sent to an
  object, for which that object does not implement a matching method
- Objective-C has extensive type information available not just at compile
  time but also at run time, which allows for extensive
  introspection/reflection (and with good performance, too)

This allowed the developer of Higher-Order Messaging for Objective-C, Marcel Weiher, to develop an approach based on objects which intercept messages, and handle them by, for instance, applying them to a collection of objects, or similar. In effect, one message takes another message as an argument - hence, 'higher-order messaging'.

Because new messages can be added to existing classes after-the-fact, the methods implementing these higher-order messages can be added to the existing Cocoa classes, including the collection classes such as NSSet, NSArray, and so on.
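
To make the mechanism a bit more concrete, here is a minimal sketch of a 'collect' trampoline built on the catch-all forwarding method and a category. It is an illustration only, not Marcel Weiher's actual code: the names CollectTrampoline, HigherOrderSketch, LineItem and productName are made up for the example, it assumes a current clang with ARC, and it only copes with messages that return objects.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model class, used only for this illustration.
@interface LineItem : NSObject
@property (copy) NSString *productName;
@end
@implementation LineItem
@end

// Hypothetical trampoline: it implements almost nothing itself, so nearly
// every message sent to it ends up in -forwardInvocation:.
@interface CollectTrampoline : NSProxy
- (instancetype)initWithElements:(NSArray *)elements;
@end

@implementation CollectTrampoline {
    NSArray *_elements;
    NSArray *_results;   // keeps the returned array alive until the caller has it
}

- (instancetype)initWithElements:(NSArray *)elements {
    _elements = elements;   // NSProxy declares no -init to call
    return self;
}

// The runtime needs a signature before it can build the NSInvocation;
// borrow it from the first element.
- (NSMethodSignature *)methodSignatureForSelector:(SEL)selector {
    NSMethodSignature *signature =
        [[_elements firstObject] methodSignatureForSelector:selector];
    if (signature == nil) {
        // Fallback (e.g. empty collection): "returns id, takes no arguments".
        signature = [NSMethodSignature signatureWithObjCTypes:"@@:"];
    }
    return signature;
}

// The captured message is re-sent to every element, and the results are
// handed back as the return value of the original message.
- (void)forwardInvocation:(NSInvocation *)invocation {
    NSMutableArray *results = [NSMutableArray array];
    for (id element in _elements) {
        [invocation invokeWithTarget:element];
        __unsafe_unretained id value = nil;
        [invocation getReturnValue:&value];
        if (value) [results addObject:value];   // nil results are skipped
    }
    _results = results;
    [invocation setReturnValue:&_results];
}
@end

// The category adds the higher-order message to an existing Cocoa class.
@interface NSArray (HigherOrderSketch)
- (id)collect;
@end
@implementation NSArray (HigherOrderSketch)
- (id)collect {
    return [[CollectTrampoline alloc] initWithElements:self];
}
@end

int main(void) {
    @autoreleasepool {
        LineItem *a = [LineItem new]; a.productName = @"widget";
        LineItem *b = [LineItem new]; b.productName = @"gadget";
        id names = [[@[a, b] collect] productName];
        NSLog(@"%@", names);
    }
    return 0;
}
```

The result is typed 'id' in this sketch because the compiler only sees the declared return type of 'productName' (an NSString), while the trampoline actually hands back an array.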

Apple, in their Cocoa frameworks, already use a number of conventions that I personally find quite nice. One of these is 'Key-Value Coding', which is fundamentally a convention that compliant objects expose any properties they have (either attributes or relationships to other objects) through a pair of methods,
  [anObject valueForKey:@"foo"];
  [anObject setValue:someValue forKey:@"foo"];

Cocoa includes a class, NSDictionary, which is simply an arbitrary collection of key-value pairs, and which implements key-value coding in the fairly obvious fashion of accessing the dictionary's contents.

For other objects, the Cocoa framework uses the powerful reflection and introspection facilities in Objective-C to make available through key-value coding any attribute that is accessible through a pair of accessor methods like '[object foo]' and '[object setFoo:someValue]'. A call to
  [object valueForKey:@"foo"];
will invoke
  [object foo];
and a call to
  [object setValue:aValue forKey:@"foo"];
will invoke
  [object setFoo:aValue];

Of course, the key-value coding methods ('valueForKey:' and 'setValue:forKey:') only require that there be a matching pair of methods, not that they necessarily access an instance variable: they can, if desired, do something else entirely (for instance, a 'Point' class might internally store cartesian coordinates, but also offer accessors for cylindrical coordinates, which could still be accessed through key-value coding).
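
A small sketch of that last point, assuming a two-dimensional point with a derived 'radius' key (the class PlanePoint and its accessors are invented for the illustration, and a current clang with ARC is assumed):

```objc
#import <Foundation/Foundation.h>
#include <math.h>

// Hypothetical class: only the Cartesian coordinates are stored.
@interface PlanePoint : NSObject
@property double x;
@property double y;
- (double)radius;                  // derived, no instance variable behind it
- (void)setRadius:(double)radius;  // rescales x and y, keeping the direction
@end

@implementation PlanePoint
- (double)radius {
    return hypot(self.x, self.y);
}
- (void)setRadius:(double)radius {
    double old = [self radius];
    if (old == 0.0) return;        // direction is undefined at the origin
    double scale = radius / old;
    self.x = self.x * scale;
    self.y = self.y * scale;
}
@end

int main(void) {
    @autoreleasepool {
        PlanePoint *p = [PlanePoint new];
        [p setValue:@3.0 forKey:@"x"];        // invokes -setX:
        [p setValue:@4.0 forKey:@"y"];        // invokes -setY:
        NSLog(@"radius = %@", [p valueForKey:@"radius"]);   // invokes -radius
        [p setValue:@10.0 forKey:@"radius"];  // invokes -setRadius:
        NSLog(@"x = %@, y = %@", [p valueForKey:@"x"], [p valueForKey:@"y"]);
    }
    return 0;
}
```

Key-value coding wraps the 'double' values in NSNumber objects on the way out and unwraps them on the way in, so the caller never needs to know which keys are stored and which are computed.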

Key-Value Coding also includes a method 'valuesForKeys:', which takes an array of attribute keys, and returns an NSDictionary containing that object's values for those keys, like so:
  NSArray *interestingKeys = /* ... */;
  NSDictionary *interesting = [object valuesForKeys:interestingKeys];

This can be done with any key-value-coding compliant object, including, of course, NSDictionaries.
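
As a rough sketch of this, assuming a made-up Customer class: current Foundation spells the method 'dictionaryWithValuesForKeys:', so that is the name used in the code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model class; the declared properties make it KVC compliant.
@interface Customer : NSObject
@property (copy) NSString *name;
@property (copy) NSString *city;
@property (copy) NSString *phone;
@end
@implementation Customer
@end

int main(void) {
    @autoreleasepool {
        Customer *c = [Customer new];
        c.name = @"Ann";
        c.city = @"Oslo";
        c.phone = @"555-0100";

        NSArray *keys = @[@"name", @"city"];

        // Works on an ordinary object ...
        NSDictionary *fromObject = [c dictionaryWithValuesForKeys:keys];

        // ... and equally on an NSDictionary.
        NSDictionary *row = @{@"name": @"Ann", @"city": @"Oslo", @"phone": @"555-0100"};
        NSDictionary *fromDictionary = [row dictionaryWithValuesForKeys:keys];

        NSLog(@"%@", fromObject);
        NSLog(@"%@", fromDictionary);
    }
    return 0;
}
```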

Considering this, a 'project' operation (if I recall and understand correctly, this extracts from each tuple in a relation a subset of its 'columns', thus creating a narrower and potentially shorter relation) should be possible by 'just' using the existing 'collect' higher-order message using 'valuesForKeys:' as the argument message:
  NSArray *keys = /* ... */;
  NSSet *projectedSet = [[set collect] valuesForKeys:keys];

(Caveat: It might be that the 'collect' higher-order message will return an array rather than a set; but that is easily fixed:
  NSSet *projectedSet =
    [NSSet setWithArray:[[set collect] valuesForKeys:keys]]; )
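
For comparison, the same projection can be spelled out with a plain loop instead of a higher-order message. This is only a sketch under assumptions: the function name Project is invented, the 'tuples' are NSDictionaries, and 'dictionaryWithValuesForKeys:' (the current Foundation name) stands in for 'valuesForKeys:'.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: projects every element of 'relation' onto 'keys'.
// Collecting into a set means equal projected tuples collapse into one,
// which is why the result can be shorter than the input.
static NSSet *Project(NSSet *relation, NSArray *keys) {
    NSMutableSet *result = [NSMutableSet set];
    for (id tuple in relation) {
        [result addObject:[tuple dictionaryWithValuesForKeys:keys]];
    }
    return result;
}

int main(void) {
    @autoreleasepool {
        NSSet *customers = [NSSet setWithArray:@[
            @{@"name": @"Ann",  @"city": @"Oslo"},
            @{@"name": @"Bob",  @"city": @"Oslo"},
            @{@"name": @"Cleo", @"city": @"Bergen"},
        ]];
        NSSet *cities = Project(customers, @[@"city"]);
        NSLog(@"%lu tuples: %@", (unsigned long)cities.count, cities);
    }
    return 0;
}
```

Three customers go in, but only two distinct 'city' tuples come out, since the two Oslo rows become equal once 'name' is dropped.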

While I don't think that anyone has actually written join code yet, you could certainly write a higher-order method such that, given two sets - 'A' containing objects that respond to the message 'foo', and 'B' containing objects responding to the message 'bar', which can be compared against each other - you would end up being able to do (for an equi-join)

NSSet *joined =
  [[[A equijoinWith:B] foo] bar];

.. only with more usefully chosen message names than the horrid one I just came up with here. In the case of a join it could be argued whether the result should be a set of NSDictionaries containing just the attributes extracted from each joined pair of objects (i.e., trying to emulate as closely as possible the real relational model), or something like a set of pairs of objects, where each such pair would simply identify the objects from the original A and B sets that were thus joined together (since we are, after all, working with objects rather than tuples). Of course, we could simply offer both operations and let the programmer decide which is appropriate to use.
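
A very rough sketch of the 'set of pairs' variant follows. It deliberately sidesteps the higher-order syntax and takes key-value coding keys as strings instead; the function name EquiJoin and the sample data are invented, and the join values are assumed to be usable as dictionary keys (NSString, NSNumber and the like).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: pairs up every object in 'left' with every object in
// 'right' whose value for 'rightKey' equals the left object's 'leftKey' value.
static NSSet *EquiJoin(NSSet *left, NSString *leftKey,
                       NSSet *right, NSString *rightKey) {
    // Index the right-hand side by join value so each left object is a lookup.
    NSMutableDictionary *index = [NSMutableDictionary dictionary];
    for (id r in right) {
        id value = [r valueForKey:rightKey];
        if (value == nil) continue;          // no value, nothing to join on
        NSMutableArray *bucket = index[value];
        if (bucket == nil) {
            bucket = [NSMutableArray array];
            index[value] = bucket;
        }
        [bucket addObject:r];
    }

    NSMutableSet *pairs = [NSMutableSet set];
    for (id l in left) {
        id value = [l valueForKey:leftKey];
        if (value == nil) continue;
        for (id r in index[value]) {
            [pairs addObject:@[l, r]];       // a pair is a two-element array
        }
    }
    return pairs;
}

int main(void) {
    @autoreleasepool {
        NSDictionary *ann = @{@"id": @1, @"name": @"Ann"};
        NSDictionary *bob = @{@"id": @2, @"name": @"Bob"};
        NSDictionary *o1 = @{@"customerId": @1, @"item": @"widget"};
        NSDictionary *o2 = @{@"customerId": @1, @"item": @"gadget"};
        NSDictionary *o3 = @{@"customerId": @3, @"item": @"gizmo"};

        NSSet *joined = EquiJoin([NSSet setWithArray:@[o1, o2, o3]], @"customerId",
                                 [NSSet setWithArray:@[ann, bob]], @"id");
        NSLog(@"%@", joined);
    }
    return 0;
}
```

Getting from these pairs to the 'set of NSDictionaries' variant would then be a matter of merging the wanted attributes of each pair into one dictionary.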

I am uncertain what the 'extend' operation is that you refer to; I didn't recall it from my database classes at university, and googling for 'relational algebra extend' doesn't seem to give anything that seems immediately enlightening. Would you care to elucidate?

Best wishes,

// Christian Brunschen

Received on Wed Jun 28 2006 - 22:13:05 CEST
