Re: Modeling question...

From: Walter Mitty <wamitty_at_verizon.net>
Date: Fri, 24 Oct 2008 08:10:42 GMT
Message-ID: <6qfMk.3266$Rx2.2190_at_nwrddc01.gnilink.net>


"Roy Hann" <specially_at_processed.almost.meat> wrote in message news:pPidnVX_0LxSyGLVnZ2dnUVZ8tLinZ2d_at_pipex.net...
> Walter Mitty wrote:
>
>>
>> "JOG" <jog_at_cs.nott.ac.uk> wrote in message
>> news:96e644dc-2063-44ea-9d41-ce27970927e0_at_l64g2000hse.googlegroups.com...
>>
>>> I'm telling them, and you, that the relational model can't do it
>>> because it was designed to handle "formatted" propositions (sets of
>>> data with a high level of common predication). It is important to
>>> recognize that the EAV approach you are looking at just happens to use
>>> the RM as its physical layer, and that's it. It does not use the RM as
>>> a logical model, and you therefore lose all of its algebraic power.
>>> (Sure you keep the management system's transactional capabilities, but
>>> thats nothing to do with the RM).
>>>
>>> In fact, having abandonded it, you might as well use XML, OO or RDF
>>> databases and cut out the middle man. However, better imo to convince
>>> the client that designing a robust a priori conceptual model is worth
>>> doing, and that you can come and update it at appropriate intervals (I
>>> say this because currently the RM is the most solid framework we
>>> have).
>>>
>>> I do have sympathy, because the issue of handling semistructured and
>>> dynamic schema is simply an unsolved problem (as is how to handle
>>> missing data). Proposed "solutions" are all woeful (in fact completely
>>> retrograde, whisking us back to 1960's tech). As such, anything you
>>> try and implement for your client will inevitably be an ad-hoc hack /
>>> in some way or other/. We're still in the stone age of informatics i'm
>>> afraid. Regards, Jim.
>>>
>>
>> The above states the case better than I can. I'm going to throw in my
>> two
>> cents, in addition to agreeing with JOG's comment.
>>
>> The big problem with using a semistructured approach to data, such as the
>> EAV, is setting user expectations. If users were able to understand and
>> appreciate that there is not necessarily any way to integrate the data
>> between users after users have each used EAV to, in effect, design their
>> own
>> idiosyncratic database, that would be one thing. But my experience is
>> that
>> either users, or at least upper management, always fall back on the
>> notion
>> that databases are for sharing data, and therefore when they ask for
>> outputs
>> that require integration, the hard work has already been done when the
>> database was built.
>>
>> In one sense, it's hard to argue with management. Databases are for
>> sharing
>> data. That's what they were invented for, and that's what they are good
>> at.
>> So the expectation lives on that getting an output from a database that
>> requires integrated massaging of the data is a simple request. Just take
>> the required information, map it to the way the database represents the
>> data, crank up a report writer, and presto!
>>
>> The problem here is in the phrase "the way the database represents the
>> data." With approaches like EAV, there is no ONE way the database
>> represents data. Each user's data is represented the way that seems good
>> to
>> that user. Getting the users and management to understand that fact and
>> set
>> their expectations accordingly, is very very difficult. The easiest way
>> to
>> do this is to bypass a DBMS completely, and just store the
>> semistructured
>> and unintegrated data in a text file. Then at least you don't get the
>> illusion that, because we manage the database with a DBMS, we must
>> therefore have stored the data according to some coherent general plan.
>
> I don't want to disagree too violently with Walter's re-telling of JOG's
> very sound position, but I get the sense that even Walter doesn't fully
> appreciate the idiocy of EAV.
>

I do understand some things about the idiocy of EAV that I didn't bother to include in my earlier response. But there are probably plenty of things about the idiocy of EAV that I don't really appreciate. I looked only briefly at EAV before deciding that I didn't want to go there. So I didn't fully explore its consequences.

> The point to get across to management is that they don't need EAV
> because even an SQL DBMS already does everything EAV does, and more. If
> management want users to be able to dream up and implement their own
> fact types, each user can just go ahead and create suitable tables in
> the usual way.
>

For some strange reason, the EAV enthusiasts have accepted, grudgingly, that only the DBAs should be able to do DDL. Perhaps this was already the satus quo when they began dreaming about user defined data. So EAV is one of those things that gets "rediscovered" from time to time as a brilliant way of allowing each user to invent his/her own data structure, while keeping the DDL under control.

> Now, how do you get all the users to share an understanding of all these
> tables so they can usefully collaborate (share data)? Good question.
> The same way they imagine EAV would do it, I guess. (Only it will be
> easier to implement because all you need is dynamic SQL.)
>

Using dynamic SQL, it's possible to create the same level of confusion that gets created under EAV. Without being too formal about it, you can lump the whole class of messes under the term "undocumented data". Received on Fri Oct 24 2008 - 10:10:42 CEST

Original text of this message