Re: Mixing OO and DB
Date: Sat, 08 Mar 2008 13:53:25 +0100
David Cressey wrote:
>Before I move on, I have to give an opinion based on my own data-centric
>world view. If you don't understand the data, then you don't know what
>you're talking about. In short, I completely fail to grasp how one can
>understand a system in terms of "behavior" without understanding the data
>that the behavior affects. This is something that it's going to take me
>years of lurking in comp.objects to grasp.
I don't think so. Can we understand the differences between sets, multisets, ordered lists, queues and stacks just by "understanding the data"? Well, it depends on what you mean by "data", but my guess is that you'll agree that the differences between them are not so much in their data, in what information they store, but in how we can access and update this information: the laws that govern their interaction with the rest of the world. This is what the OO world calls "behaviour". Behaviour is what an object looks like from the outside. Data structure is what it looks like from the inside, its implementation. The behaviour of sets, or multisets, or lists, can be implemented with many different concrete data structures, and conversely, the same concrete data structure can be used in implementing many different abstract data types.
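To illustrate that last point, here is a minimal Python sketch (the class and function names are made up for this example, they are not from any library). Both classes store their items in exactly the same concrete structure, a plain list; only the behaviour of get differs, and that difference is all that separates a stack from a queue.

```python
class Stack:
    """Last in, first out; items are kept in a plain Python list."""

    def __init__(self):
        self._items = []

    def put(self, x):
        self._items.append(x)

    def get(self):
        return self._items.pop()       # remove the newest item


class Queue:
    """First in, first out; items are kept in the same kind of list."""

    def __init__(self):
        self._items = []

    def put(self, x):
        self._items.append(x)

    def get(self):
        return self._items.pop(0)      # remove the oldest item


def drain(container, values):
    """Put all values in, then take them all out again, in the order given back."""
    for v in values:
        container.put(v)
    return [container.get() for _ in values]
```

Feeding the same values to both, drain(Stack(), [1, 2, 3]) and drain(Queue(), [1, 2, 3]) hand them back in opposite orders, although the stored data is identical right up to the first get.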
Can we understand the behaviour of sets without having a concrete data structure in mind? Yes, definitely, we can write down set operations and the laws that govern them, e.g.
empty: Set<T>
isempty: Set<T> -> Boolean
in: T x Set<T> -> Boolean
singleton: T -> Set<T>
union: Set<T> x Set<T> -> Set<T>
intersection: Set<T> x Set<T> -> Set<T>
isempty(empty) = true
for all e in T: not in(e, empty)
for all e in T: in(e, singleton(e))
for all e in T, X,Y in Set<T>:
  in(e, X) or in(e, Y) <=> in(e, union(X,Y))
for all e in T, X,Y in Set<T>:
  in(e, X) and in(e, Y) <=> in(e, intersection(X,Y))
(etc.)
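Laws like these can also be checked mechanically against an implementation. The following is only a sketch under my own assumptions: ListSet, BitSet and laws_hold are hypothetical names, "in" is spelled contains because in is a reserved word in Python, and the check runs over a small finite universe, so it is a test and not a proof. The point is that two entirely different representations satisfy the same laws, including the law that no element is in the empty set.

```python
from itertools import product


class ListSet:
    """Set represented as a tuple of elements without duplicates."""

    def __init__(self, items=()):
        self._items = tuple(dict.fromkeys(items))   # drop duplicates, keep order

    @classmethod
    def empty(cls):
        return cls()

    @classmethod
    def singleton(cls, e):
        return cls((e,))

    def isempty(self):
        return len(self._items) == 0

    def contains(self, e):
        return e in self._items

    def union(self, other):
        return ListSet(self._items + other._items)

    def intersection(self, other):
        return ListSet(x for x in self._items if other.contains(x))


class BitSet:
    """Set of small non-negative integers represented as the bits of one int."""

    def __init__(self, bits=0):
        self._bits = bits

    @classmethod
    def empty(cls):
        return cls(0)

    @classmethod
    def singleton(cls, e):
        return cls(1 << e)

    def isempty(self):
        return self._bits == 0

    def contains(self, e):
        return bool((self._bits >> e) & 1)

    def union(self, other):
        return BitSet(self._bits | other._bits)

    def intersection(self, other):
        return BitSet(self._bits & other._bits)


def laws_hold(S, universe):
    """Check the set laws for set type S over a small finite universe."""
    # Sample sets: the empty set, all singletons, all unions of two singletons.
    singles = [S.singleton(e) for e in universe]
    samples = [S.empty()] + singles + [a.union(b) for a, b in product(singles, repeat=2)]

    ok = S.empty().isempty()
    ok = ok and all(not S.empty().contains(e) for e in universe)
    ok = ok and all(S.singleton(e).contains(e) for e in universe)
    for e, X, Y in product(universe, samples, samples):
        ok = ok and (X.contains(e) or Y.contains(e)) == X.union(Y).contains(e)
        ok = ok and (X.contains(e) and Y.contains(e)) == X.intersection(Y).contains(e)
    return ok
```

Calling laws_hold(ListSet, range(4)) and laws_hold(BitSet, range(4)) exercises both representations through the operations only; the checker never looks at _items or _bits.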
It's not easy, and I'm no expert in it, but it can be done, and I don't think you'll call this "understanding the data". What is more, I'll claim that all of the "understanding data" that you claim to be capable of is essentially of this nature: even with a concrete data structure in mind to aid understanding and the implementation, the data structure doesn't really mean anything without specifying the operations that can be performed on it; and that meaning essentially consists of the laws that govern the behaviour of those operations as observable from the outside, and is thereby essentially independent of that concrete data structure.
>And I suspect that, based on the
>experience of people like Marshall Spight, that I'm going to conclude, at
>the end of the day, that behavior is not the holy grail of computing.
In an RDBMS the focus is on data structures with "relational" behaviour, where the operations and their behaviour are fixed in the query language; this is a good fit for much of the data we need to work with in practice, but not for everything.
>In my original perspective, the single thing that ties together all the
>applications and all the databases that collaborate by sharing data is just
>one thing: data.
>If you understand the data, and you understand the
>(observable) behavior of each of the applications and each of the databases,
>you can understand the system. Otherwise, you can't understand the system.
The opposite is even more true: no data can be understood without
understanding what operations are used to obtain and apply the data.
Here, I have some data for you:
1,129,960,000 March 8, 2008
303,569,100 March 5, 2008
186,315,468 March 1, 2008
162,652,500 February 29, 2008
141,933,955 March 1, 2008
127,790,000 December 1, 2007
Completely useless, unless you understand which interactions with the real world these figures correspond to.
So I think your suggestion that understanding is somehow tied to data, not to behaviour, is flat-out wrong. We do need to understand the operations on our data before we can understand the data. The reason relational database designers want to store data and not operations or laws of behaviour has little to do with where the information is; it is purely due to the fact that in most RDBMS applications we can "factor out" the behaviour into the fixed set of common, parametrizable operations provided by the relational query language. (In reality, of course, this rarely suffices, because the query language isn't powerful enough, and we have to kludge around with transactions and stored procedures to actually get the job done.)
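A small sketch of what I mean by "factoring out", using Python's standard sqlite3 module. The account table, its columns and the transfer function are invented for this illustration. The table stores only rows; the behaviour comes from generic, parametrized query operations, until an operation such as a transfer needs two updates that must succeed or fail together, and then we are back to wrapping things in a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (owner TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("ann", 120), ("bob", 40), ("cat", 75)],
)

# Selection, parametrized by a threshold: a generic operation of the
# query language, not behaviour stored with the data.
rich = [row[0] for row in conn.execute(
    "SELECT owner FROM account WHERE balance > ? ORDER BY owner", (50,))]

# Aggregation over the same rows, again a generic operation.
total = conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates happen or neither does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE owner = ?",
            (amount, src))
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE owner = ?",
            (amount, dst))


transfer(conn, "ann", "bob", 20)
balances = dict(conn.execute("SELECT owner, balance FROM account"))
```

The rule that a transfer conserves the total is a law of behaviour, and nothing in the table definition records it; it lives in transfer, outside the data.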
-- Reinier Post
Received on Sat Mar 08 2008 - 13:53:25 CET