Re: Modeling question...

From: Roy Hann <specially_at_processed.almost.meat>
Date: Wed, 22 Oct 2008 11:33:19 -0500
Message-ID: <pPidnVX_0LxSyGLVnZ2dnUVZ8tLinZ2d_at_pipex.net>


Walter Mitty wrote:

>
> "JOG" <jog_at_cs.nott.ac.uk> wrote in message
> news:96e644dc-2063-44ea-9d41-ce27970927e0_at_l64g2000hse.googlegroups.com...
>
>> I'm telling them, and you, that the relational model can't do it
>> because it was designed to handle "formatted" propositions (sets of
>> data with a high level of common predication). It is important to
>> recognize that the EAV approach you are looking at just happens to use
>> the RM as its physical layer, and that's it. It does not use the RM as
>> a logical model, and you therefore lose all of its algebraic power.
>> (Sure you keep the management system's transactional capabilities, but
>> that's nothing to do with the RM).
>>
>> In fact, having abandoned it, you might as well use XML, OO or RDF
>> databases and cut out the middle man. However, better imo to convince
>> the client that designing a robust a priori conceptual model is worth
>> doing, and that you can come and update it at appropriate intervals (I
>> say this because currently the RM is the most solid framework we
>> have).
>>
>> I do have sympathy, because the issue of handling semistructured and
>> dynamic schema is simply an unsolved problem (as is how to handle
>> missing data). Proposed "solutions" are all woeful (in fact completely
>> retrograde, whisking us back to 1960's tech). As such, anything you
>> try and implement for your client will inevitably be an ad-hoc hack
>> /in some way or other/. We're still in the stone age of informatics, I'm
>> afraid. Regards, Jim.
>>
>
> The above states the case better than I can. I'm going to throw in my two
> cents, in addition to agreeing with JOG's comment.
>
> The big problem with using a semistructured approach to data, such as the
> EAV, is setting user expectations. If users were able to understand and
> appreciate that there is not necessarily any way to integrate the data
> between users after users have each used EAV to, in effect, design their own
> idiosyncratic database, that would be one thing. But my experience is that
> either users, or at least upper management, always fall back on the notion
> that databases are for sharing data, and therefore when they ask for outputs
> that require integration, the hard work has already been done when the
> database was built.
>
> In one sense, it's hard to argue with management. Databases are for sharing
> data. That's what they were invented for, and that's what they are good at.
> So the expectation lives on that getting an output from a database that
> requires integrated massaging of the data is a simple request. Just take
> the required information, map it to the way the database represents the
> data, crank up a report writer, and presto!
>
> The problem here is in the phrase "the way the database represents the
> data." With approaches like EAV, there is no ONE way the database
> represents data. Each user's data is represented the way that seems good to
> that user. Getting the users and management to understand that fact and set
> their expectations accordingly, is very very difficult. The easiest way to
> do this is to bypass a DBMS completely, and just store the semistructured
> and unintegrated data in a text file. Then at least you don't get the
> illusion that, because we manage the database with a DBMS, we must
> therefore have stored the data according to some coherent general plan.

I don't want to disagree too violently with Walter's re-telling of JOG's very sound position, but I get the sense that even Walter doesn't fully appreciate the idiocy of EAV.

The point to get across to management is that they don't need EAV because even an SQL DBMS already does everything EAV does, and more. If management want users to be able to dream up and implement their own fact types, each user can just go ahead and create suitable tables in the usual way.

Now, how do you get all the users to share an understanding of all these tables so they can usefully collaborate (share data)? Good question. The same way they imagine EAV would do it, I guess. (Only it will be easier to implement because all you need is dynamic SQL.)
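To make the dynamic SQL point concrete, here is a minimal sketch, assuming Python's standard sqlite3 module and an in-memory database. The function name create_fact_table, the sample_reading table, and its columns are all made up for illustration; a real system would need more care over reserved words, privileges, and keys.

```python
import re
import sqlite3

# Hypothetical whitelist: identifiers and column types a user may supply.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_TYPES = {"INTEGER", "REAL", "TEXT"}


def create_fact_table(conn, table, columns):
    """Build and run CREATE TABLE from a user-supplied fact type.

    columns is a list of (name, type) pairs. Identifiers cannot be bound
    as SQL parameters, so they are validated before being interpolated.
    """
    for name in [table] + [col for col, _ in columns]:
        if not _IDENT.fullmatch(name):
            raise ValueError(f"bad identifier: {name!r}")
    for _, col_type in columns:
        if col_type not in _TYPES:
            raise ValueError(f"unsupported type: {col_type!r}")
    cols = ", ".join(f"{col} {col_type} NOT NULL" for col, col_type in columns)
    sql = f"CREATE TABLE {table} ({cols})"
    conn.execute(sql)
    return sql


conn = sqlite3.connect(":memory:")

# A user invents a new fact type; it becomes an ordinary table.
ddl = create_fact_table(
    conn, "sample_reading", [("sample_id", "INTEGER"), ("ph", "REAL")]
)

# Values go in through bound parameters, as usual.
conn.execute("INSERT INTO sample_reading (sample_id, ph) VALUES (?, ?)", (1, 6.8))
conn.execute("INSERT INTO sample_reading (sample_id, ph) VALUES (?, ?)", (2, 7.4))

# The attribute is a typed column, so a restriction on it is plain SQL.
rows = conn.execute(
    "SELECT sample_id FROM sample_reading WHERE ph > 7.0"
).fetchall()
```

With EAV the same restriction would have to dig the "ph" attribute out of a generic value column and cast it by hand. Here the DBMS knows the column's name and type, which is the whole point.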

And before I leave this alone, there is no such thing as "semi-structured" data. That term makes as much sense as semi-understood knowledge. The concept that people using that term might be struggling to convey is "semi-shared business model", or to put it another way, "(only) some of us know what (only) some of this means". My attitude to that is: fine, just don't expect me to know what any of it means.

-- 
Roy
Received on Wed Oct 22 2008 - 18:33:19 CEST
