Re: OO versus RDB
Date: Thu, 29 Jun 2006 10:24:22 +0000 (UTC)
Message-ID: <e809om$8nt$1_at_news.lth.se>
In article <04Cog.3422$pu3.83147_at_ursa-nb00s0.nbnet.nb.ca>,
Bob Badour <bbadour_at_pei.sympatico.ca> wrote:
>Christian Brunschen wrote:
>
>> In article <eZxog.3346$pu3.80765_at_ursa-nb00s0.nbnet.nb.ca>,
>> Bob Badour <bbadour_at_pei.sympatico.ca> wrote:
>>
>>>Christian Brunschen wrote:
>>>
>>>>In article <1151501485.755962.108350_at_p79g2000cwp.googlegroups.com>,
>>>>erk <eric.kaun_at_gmail.com> wrote:
>>>>
>>>>>topmind wrote:
>>>>>
>>>>>>[...] (Unlike OO, where encapsulation
>>>>>>encourages each object/class to reinvent its own
>>>>>>add/change/delete/cross-reference/search rules and interfaces so that
>>>>>>they are all different for each project or shop.
>>>>>
>>>>>Agreed. I wouldn't have a problem with inconsistencies if the languages
>>>>>just offered some powerful basic operations. You can't even write
>>>>>something in Java like this, which would be completely type-safe:
>>>>>
>>>>>Set<LineItem> items =
>>>>>theOrder.lineItems.where(item.status==Status.SHIPPED);
>>>>
>>>>Perhaps somewhat interestingly, in a dynamic OO language such as
>>>>Smalltalk, Objective-C or Ruby, you can use the technique of Higher-Order
>>>>Messaging (HOM), as described in the 2005 OOPSLA paper
>>>><www.metaobject.com/papers/Higher_Order_Messaging_OOPSLA_2005.pdf>, to do
>>>>something quite similar; in Objective-C syntax something like
>>>>
>>>> NSSet *lineItems =
>>>> [[[[order lineItems] selectWhere] status] equals:STATUS_SHIPPED];
>>>
>>>You have given an example involving only restriction.
>>
>> If I interpret things correctly, this is what is called 'selection' in
>> some places (such as <http://en.wikipedia.org/wiki/Generalized_selection>
>> and <http://www.cse.ohio-state.edu/~gurari/course/cse670/cse670Ch4.html>)?
>
>Suggesting a synonym for restrict does not answer my question.
This particular paragraph of my response was not intended to answer your
question, but instead to ensure that I had understood the question
correctly. A quick Google search for 'relational algebra operations' shows
only one of the top five results, the fifth, actually referring to this
operation as 'restrict'; the others call it 'select'. I just wanted to
avoid any possible misunderstanding about the terminology in use, also
because I am coming mainly from the direction of being a pragmatic
software developer rather than a deep theorist on any particular subject.
Please also keep this in mind if my use of terminology is imprecise or
inaccurate - that will probably be because I have lots of things to learn.
>The relational algebra also has union, join, project, extend, quantification
>etc.
>>>Does the method work for project, extend and join as well?
>>
>> Caveat: I am an OO developer, and use relational databases a lot, but
>> there is a lot of precise terminology with which I am not conversant, and
>> I *will* make mistakes, but they will be just that, mistakes.
>
>[lengthy example using objects, dictionaries, higher-order-methods snipped]
>> In the case of a join it could be argued whether the
>> result should be a set of NSDictionaries containing just the attributes
>> extracted from each joined pair of objects (i.e., trying to emulate as
>> closely as possible the real relational model), or something like a set of
>> pairs of objects, where each such pair would simply identify the objects
>> from the original A and B sets that were thus joined together (since we
>> are, after all, working with objects rather than tuples). Of course, we
>> could simply offer both operations and let the programmer decide which is
>> appropriate to use.
>
>With all due respect, you have just broken one of the most fundamental
>properties of the relational algebra/calculus: closure. (Actually, you
>have proposed two alternates that both break closure.) I cannot stress
>enough the importance of closure and nesting. Without that, your
>proposal falls flat.
Before beginning to address the substance of your comment, let me add
another 'terminology aside': I am presuming that by 'closure' you are
specifically referring to the property of relational algebra that any
operation on relation(s) returns another relation, which can then be used
immediately as one of the operands for another relational-algebra
operation, and so on and so forth; i.e., it enables arbitrary nesting of
relational operations (and that it is that 'nesting' you are referring to,
as well). If these assumptions of mine are incorrect, obviously my
arguments below (which are based in part on these assumptions) will fall.
Given my above understanding of 'closure' and 'nesting', I must say that I
disagree that my suggestions 'break closure' - in the context where my
suggestion is set, namely in the context of an object-oriented
environment, which is *not* the same as the relational model. Where the
relational model works with relations, which are fundamentally sets of
tuples, the object-oriented environment I've sketched out works with sets
of _objects_. Each operation thus fulfils closure if it, too, returns a
set of objects. Consider now that a 'pair of objects' would be
straightforwardly represented in an object-oriented context precisely as
an object itself, something like:
@interface Pair : NSObject {
    id left;
    id right;
}
@end
@implementation Pair
[...]
@end
Now, the join operation returning 'pairs of objects' a) operates on two
sets of objects, and b) returns a set of objects, which can in turn be
used as input to other operations (such as projection to extract a subset
of the properties available in the 'left' and 'right' objects of the
pair). Under the above informal definition of 'closure', this fulfils it,
and certainly it allows for nesting of these operations. For example,
using the fact that the 'Pair' class overrides its 'valueForKey:' method
to permit access to the properties of its 'left' and 'right' objects
directly (by prefixing the key with 'left_' and 'right_' respectively), we
could do something like
// these are the keys we want to extract from the join result
NSArray *interestingKeys =
    [NSArray arrayWithObjects:@"left_x", @"right_y", nil];
// these are the keys we want to use in the extracted result
NSArray *renamedKeys =
    [NSArray arrayWithObjects:@"a_x", @"b_y", nil];
// perform the join, extract the interesting keys and rename, all in one go
NSSet *extracted =
    [[[[A equijoinWith:B] foo] bar] project:interestingKeys
                                  andRename:renamedKeys];
The 'extracted' set would then contain NSDictionaries, each containing as [...]
NSSet *matching =
    [[[[[[A equijoinWith:B] foo] bar] selectWhere] left_x] equals:47];
... again demonstrating that a result in the shape of a set of pairs of
objects certainly permits nesting, and to my understanding (which may [...]
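To make the closure argument concrete, here is a minimal sketch in Python rather than Objective-C. Everything in it is an assumption made for illustration: the sample `Point` class, the `value_for_key` method (standing in for the 'valueForKey:' override described above), and the `equijoin` and `project` helpers are hypothetical names, not part of any library. It only shows that a join yielding a set of Pair objects can feed straight into a further selection or projection.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Point:
    """Hypothetical sample object with two plain attributes."""
    x: int
    y: int


@dataclass(frozen=True)
class Pair:
    """One joined pair; an ordinary object, so a set of Pairs is a set of objects."""
    left: Any
    right: Any

    def value_for_key(self, key: str) -> Any:
        # 'left_x' reads attribute 'x' of the left object,
        # 'right_y' reads attribute 'y' of the right object.
        side, _, name = key.partition("_")
        if side not in ("left", "right") or not name:
            raise KeyError(key)
        return getattr(getattr(self, side), name)


def equijoin(a, b, left_key, right_key):
    """Pair every object in a with every object in b whose key values are equal."""
    return {Pair(lhs, rhs) for lhs in a for rhs in b
            if getattr(lhs, left_key) == getattr(rhs, right_key)}


def project(pairs, keys, renamed):
    """Extract `keys` from each pair and store the values under the `renamed` names."""
    rows = {tuple(p.value_for_key(k) for k in keys) for p in pairs}
    return [dict(zip(renamed, row)) for row in sorted(rows)]


A = {Point(1, 10), Point(47, 20)}
B = {Point(1, 30), Point(47, 40), Point(5, 50)}

joined = equijoin(A, B, "x", "x")    # a set of Pair objects
# Nested selection on the join result.
matching = {p for p in joined if p.value_for_key("left_x") == 47}
# Nested projection with renaming on the same join result.
extracted = project(joined, ["left_x", "right_y"], ["a_x", "b_y"])
```

The projection deliberately returns plain dictionaries, mirroring the NSDictionary result above, while the selection keeps returning Pair objects.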
One thing that very much differentiates the system I have sketched from
the relational model and relational algebra is that in my system there is
no difference between what a set can contain and what the properties of
the set's contents are: everything is objects (or, well, references to
objects). These objects have properties, which can in turn be objects,
and which can be accessed uniformly and traversed even through different
depths. More importantly, from an object-orientation viewpoint, these
objects don't just hold data; they also associate certain behaviour with
it. This is a reason why an object-oriented developer might prefer to get
a set of pairs of objects as the result of a join rather than a set of
dictionaries just holding the values of the respective properties of the
objects: extracting the properties and putting them in a dictionary loses
information (and can always, as demonstrated above, be done explicitly if
desired).
Now, using either key-value coding or higher-order messaging, accessing attributes or relationships is all done uniformly. This means that both types of properties, attributes and relationships, can be used for any purpose, including for join conditions, selection criteria and so on.
For instance, if you have a set of objects and want to select from it
only the subset that are all related to a common other object (say, from a
set of employees, all those that work in the same department), you could
actually perform the join comparing not on the 'departmentId' but on the
department object itself:
Department *someDepartment;
NSSet *employees;
NSSet *departmentEmployees =
[[[employees selectWhere] department] equals:someDepartment];
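For readers without an Objective-C HOM implementation to hand, the same shape can be imitated in Python. This is only a sketch under stated assumptions: `ObjectSet`, `select_where`, `_AttributeCapture` and `_Condition` are hypothetical names invented here, and the trick relies on Python's `__getattr__` hook to capture the property name, which is not how the HOM paper's implementation works.

```python
class ObjectSet(frozenset):
    """A set of objects that understands a higher-order 'select_where' message."""

    def select_where(self):
        return _AttributeCapture(self)


class _AttributeCapture:
    """Records the next attribute name sent to it instead of evaluating it."""

    def __init__(self, source):
        self._source = source

    def __getattr__(self, name):
        # Only called for names not found normally, i.e. the property under test.
        return _Condition(self._source, name)


class _Condition:
    def __init__(self, source, name):
        self._source = source
        self._name = name

    def equals(self, value):
        # Returning an ObjectSet keeps the result usable by further operations.
        return ObjectSet(o for o in self._source
                         if getattr(o, self._name) == value)


class Department:
    def __init__(self, name):
        self.name = name


class Employee:
    def __init__(self, name, department):
        self.name = name
        self.department = department


sales = Department("Sales")
research = Department("Research")
employees = ObjectSet([Employee("Ann", sales),
                       Employee("Raj", research),
                       Employee("Cy", sales)])

# Compare on the department object itself rather than on a departmentId.
department_employees = employees.select_where().department.equals(sales)
names = sorted(e.name for e in department_employees)
```

Because `equals` hands back another `ObjectSet`, a second `select_where()` can be chained directly onto the result.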
Of course, this presumes that we haven't already included an 'employees' relationship on the Department object, in which case we could simply have used that. But there may be good reasons to exclude such a relationship and only calculate it through an explicit join when necessary, in which case the above would make an excellent candidate for being a method on the Department objects:
@implementation Department
- (NSSet *)matchingEmployees:(NSSet *)candidates {
    return [[[candidates selectWhere] department] equals:self];
}
@end
The fact that to-many relationships are modeled as sets of objects is also
quite useful, because it means that we can first traverse a graph of
objects to find a fairly small candidate set of objects, and then perform
a join or other operation only on that small set of objects rather than
on the set of all existing objects of that type. Essentially, it allows
us to combine the best bits of using an object graph with a lot of what
makes the relational model so useful.
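A small Python sketch of that "traverse first, join second" pattern follows. The `Badge` class, the `badge_id` attribute and the `join_on_badge` helper are all invented for the illustration; the only point is that the join receives the small set reached by traversal, not every employee in existence.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)      # identity-based equality and hashing, like plain objects
class Employee:
    name: str
    badge_id: int


@dataclass(eq=False)
class Department:
    name: str
    employees: set = field(default_factory=set)   # to-many relationship as a set


@dataclass(frozen=True)
class Badge:
    badge_id: int
    floor: int


def join_on_badge(employees, badges):
    """Nested-loop equijoin on badge_id; work grows with len(employees) * len(badges)."""
    return {(e.name, b.floor) for e in employees for b in badges
            if e.badge_id == b.badge_id}


ann, raj, cy = Employee("Ann", 1), Employee("Raj", 2), Employee("Cy", 3)
sales = Department("Sales", {ann, cy})
research = Department("Research", {raj})
badges = {Badge(1, 4), Badge(2, 7), Badge(3, 4)}

# Traverse the object graph first: only the two Sales employees enter the join.
sales_floors = join_on_badge(sales.employees, badges)
```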
>> I am uncertain what the 'extend' operation is that you refer to; I didn't
>> recall it from my database classes at university, and googling for
>> 'relational algebra extend' doesn't seem to give anything that seems
>> immediately enlightening. Would you care to elucidate?
>
>Extend derives new attributes from existing attributes. Thus given a
>relation including a date_of_birth attribute, one might derive a new
>relation that extends the original relation with an age attribute.
This is a very straightforward thing to do in object-orientation in
general. If you consider the 'Point' class I sketched in my previous
post, it could be implemented to have two 'intrinsic' properties, 'x' and
'y', which would be its cartesian coordinates (aka 'abscissa' and
'ordinate' if you are so inclined); but could also expose the derived
properties 'r' and 'theta', its polar coordinates, like so:
@interface Point : NSObject {
    double x;
    double y;
}
@end
@implementation Point
// the usual 'x' and 'y' accessors omitted for brevity
- (double) r { return hypot([self x], [self y]); }
// atan2() takes the ordinate first, then the abscissa
- (double) theta { return atan2([self y], [self x]); }
@end
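The same idea can be sketched in Python, where a derived attribute is read exactly like a stored one. The `Person` class and its `age_on` method are assumptions added to mirror the date_of_birth/age example quoted above; `age_on` takes the reference date as a parameter so that the result does not depend on when the code is run.

```python
import math
from datetime import date


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def r(self):
        return math.hypot(self.x, self.y)

    @property
    def theta(self):
        # atan2 takes the ordinate first, then the abscissa.
        return math.atan2(self.y, self.x)


class Person:
    def __init__(self, name, date_of_birth):
        self.name = name
        self.date_of_birth = date_of_birth

    def age_on(self, today):
        """Whole years elapsed between date_of_birth and `today`."""
        born = self.date_of_birth
        had_birthday = (today.month, today.day) >= (born.month, born.day)
        return today.year - born.year - (0 if had_birthday else 1)


points = [Point(3, 4), Point(0, 2), Point(1, 0)]
# The derived property 'r' is used in a selection just like a stored attribute.
far_points = [p for p in points if getattr(p, "r") > 1.5]
```

Callers cannot tell from the outside whether `r` is stored or computed, which is what lets the earlier selection and projection operations work on derived attributes without change.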
Best wishes,
// Christian Brunschen
Received on Thu Jun 29 2006 - 12:24:22 CEST