Re: Modeling question...

From: Walter Mitty <wamitty_at_verizon.net>
Date: Wed, 22 Oct 2008 14:22:17 GMT
Message-ID: <tGGLk.2944$Rx2.1106_at_nwrddc01.gnilink.net>


"JOG" <jog_at_cs.nott.ac.uk> wrote in message news:96e644dc-2063-44ea-9d41-ce27970927e0_at_l64g2000hse.googlegroups.com...

> I'm telling them, and you, that the relational model can't do it
> because it was designed to handle "formatted" propositions (sets of
> data with a high level of common predication). It is important to
> recognize that the EAV approach you are looking at just happens to use
> the RM as its physical layer, and that's it. It does not use the RM as
> a logical model, and you therefore lose all of its algebraic power.
> (Sure you keep the management system's transactional capabilities, but
> that's nothing to do with the RM).
>
> In fact, having abandoned it, you might as well use XML, OO or RDF
> databases and cut out the middle man. However, better imo to convince
> the client that designing a robust a priori conceptual model is worth
> doing, and that you can come and update it at appropriate intervals (I
> say this because currently the RM is the most solid framework we
> have).
>
> I do have sympathy, because the issue of handling semistructured and
> dynamic schema is simply an unsolved problem (as is how to handle
> missing data). Proposed "solutions" are all woeful (in fact completely
> retrograde, whisking us back to 1960's tech). As such, anything you
> try and implement for your client will inevitably be an ad-hoc hack
> /in some way or other/. We're still in the stone age of informatics, I'm
> afraid. Regards, Jim.
>

The above states the case better than I can. I'm going to throw in my two cents, in addition to agreeing with JOG's comment.

The big problem with using a semistructured approach to data, such as EAV, is setting user expectations. Once users have each used EAV to design, in effect, their own idiosyncratic database, there is not necessarily any way to integrate the data across users. If users understood and accepted that, it would be one thing. But my experience is that users, or at least upper management, always fall back on the notion that databases are for sharing data, and they therefore assume that when they ask for outputs that require integration, the hard work was already done when the database was built.
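To make the integration problem concrete, here is a minimal sketch in Python using the standard sqlite3 module. Everything in it is made up for illustration: the `eav` table, the `owner` column, and the attribute names `salary` and `monthly_pay` are assumptions, not anything from the original poster's system.

```python
import sqlite3

# Hypothetical EAV table: every fact is one (entity, attribute, value) row.
# The 'owner' column is an assumption, added only to show who wrote each row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eav (owner TEXT, entity TEXT, attribute TEXT, value TEXT)"
)

rows = [
    # User A describes people with 'surname' and a yearly 'salary'.
    ("user_a", "emp1", "surname", "Smith"),
    ("user_a", "emp1", "salary", "50000"),
    # User B records the same kind of fact under other names and units.
    ("user_b", "p-17", "last_name", "Jones"),
    ("user_b", "p-17", "monthly_pay", "4000"),
]
conn.executemany("INSERT INTO eav VALUES (?, ?, ?, ?)", rows)


def total_salary(conn):
    """The 'simple' report: sum every value stored under 'salary'."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(CAST(value AS INTEGER)), 0) "
        "FROM eav WHERE attribute = 'salary'"
    )
    return cur.fetchone()[0]


def attributes_by_owner(conn):
    """List the attribute names each user invented."""
    cur = conn.execute(
        "SELECT owner, attribute FROM eav ORDER BY owner, attribute"
    )
    result = {}
    for owner, attribute in cur:
        result.setdefault(owner, []).append(attribute)
    return result


print(total_salary(conn))
print(attributes_by_owner(conn))
```

The report query runs without error, yet it only sees user A's row. Nothing in the table tells the DBMS that `monthly_pay` and `salary` describe the same thing in different units, so that mapping has to be worked out by hand for every pair of users.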

In one sense, it's hard to argue with management. Databases are for sharing data. That's what they were invented for, and that's what they are good at. So the expectation lives on that getting an output from a database that requires integrated massaging of the data is a simple request. Just take the required information, map it to the way the database represents the data, crank up a report writer, and presto!

The problem here is in the phrase "the way the database represents the data." With approaches like EAV, there is no ONE way the database represents data. Each user's data is represented the way that seems good to that user. Getting users and management to understand that fact, and to set their expectations accordingly, is very, very difficult. The easiest way to do it is to bypass the DBMS completely and just store the semistructured, unintegrated data in a text file. Then at least you don't get the illusion that, because we manage the database with a DBMS, we must therefore have stored the data according to some coherent general plan.

Received on Wed Oct 22 2008 - 16:22:17 CEST
