Re: Modelling objects with variable number of properties in an RDBMS

From: Bernard Peek <bap_at_shrdlu.com>
Date: Mon, 31 Oct 2005 20:49:58 +0000
Message-ID: <iQqJ$jn2NoZDFwaT_at_shrdlu.com>


In message <WsKdnWPOleBm8_veRVnyjg_at_pipex.net>, Roy Hann <specially_at_processed.almost.meat> writes

>> An EAV system probably wouldn't need to use many tables, because it
>> holds each attribute in a separate row.
>
>I assume from this that I am supposed to take the need for fewer tables as a
>virtue. Why would that be a virtue?

It might be better, might not. It all depends on the application.

>
>> >No, no, no! A thousand times no! What could be more agile than creating
>a
>> >real table? It takes literally seconds.
>>
>> It only takes seconds to create a table, once you know what the table
>> structure is going to be. In systems where the data structure is unknown
>> at design-time you would need to give the users the capacity to add or
>> modify tables.
>
>What is a "user" in this context? If it is an end-user (i.e. a warehouse
>clerk or tax officer or a laboratory technician), why would they want to
>create tables?

They would want tables to hold data that doesn't fit into the existing structure, unless you use EAV in which case they would add new rows.

> Consider how hard it is to get end-users to understand how
>to do ad hoc reporting on existing tables. They don't get it, and they
>(quite rightly IMO) don't see why they should be burdened with having to.
>What sort of tables might they concoct? Would we want them?

We don't know. The need is predicated on the assumption that we don't have a usable data model at the prototype's design-time. How else can you capture not just the data but its unknown metadata.

>
>I think your instinct here is to empower the users somehow. We could have
>pleasant chat over a beer sometime about whether that is a problem that even
>needs solving. I think both of us would be just defending hunches though.
>My opening position is that if the business doesn't underwrite the database
>design then the users can just go home and invent tables on their own time.

That is the reason why it might be useful. If the user doesn't have the power to modify the system's metadata then they have to live with a crippled system until the developer comes back with the next iteration. I'm suggesting a way that the end-user can modify the data model on-the-fly to short-circuit the delays.

In theory the EAV system could support this, and would make system development faster and eliminate some of the errors introduced by developers' assumptions about the data. In practise, I'm not so sure.

>
>> That isn't going to be any better than using EAV.
>
>Actually I agree with that (if we are talking about end-users).
>
>> >> An EAV database could be used to collect
>> >> metadata during the early development stages, and that metadata could
>> >> then be used to build the final database system, replacing the
>temporary
>> >> EAV model.
>> >
>> >Please elaborate on this, because I can't make head nor tail of it.
>>
>> In the early stages of agile development you probably have only a vague
>> idea of what entities and attributes are going to be required in the
>> final system.
>
>Fair enough. So capture what you do know with real tables. You don't need
>a papal bull giving consent to drop them so you can build new ones when your
>understanding becomes clearer. Using real tables you would have confidence
>that your rapidly evolving code (which is necessarily unstable) has not
>missed anything because the DBMS is testing and validating the mutual
>understanding of the tables. And you would have captured what understanding
>you do have in a transparent and easily comprehensible form. In the rapidly
>changing and chaotic world of agile development, you don't want to be taking
>the time to reverse-engineer the table design-of-the-moment from a lot of
>Java code or whatever. You want it handed to you on a plate with no
>nonsense.
>
>> By building a system where the users can add entities and
>> attributes without involving the developer you allow the users to define
>> what the final system needs.
>
>I think not. I think at best you will see one particular users' perspective
>on the data, and I think you've probably been around long enough to have
>worked in a more conventional way and so you know that even a handful of
>users often have wildly divergent ideas about the data they work with. I am
>afraid I just don't buy that approach at all. Deferring the ugly scenes
>that occur when you insist they agree doesn't make the problem go away.

I think that the reverse would apply. The EAV system would capture the perspective of each user independently. I've had problems in the past when I've taken instructions from a nominated worker (or even worse, a manager) and worked to their instructions. At the end of the project that nominated person signs off a system that does the job as they see it, but other users find it unusable.

>
>I am not trying to be nasty about this, but this just sounds lazy, not
>agile.

I'll definitely plead guilty to that. Nobody ever made a fortune by finding the hard way to do a job.

>
>> After collecting the metadata you have the
>> information required for the data modeller to put together a properly
>> structured relational data model.
>
>Where did this metadata come from all of a sudden? That's the bit I didn't
>get before. That's the bit I really want to understand.

The metadata comes from the data that the users enter into the system. You may have some idea of what entities and attributes the system needs to capture but the model you build needs to change as you capture more of the users' knowledge about their data. Current systems require the user to document ways in which the systems fail so that those can be fixed in the next iteration. In theory it would be better to give the users' a way of fixing at least some of those issues on the fly so that in effect the users and the developer are working in parallel rather than serially.

-- 
Bernard Peek
London, UK. DBA, Manager, Trainer & Author.
Received on Mon Oct 31 2005 - 21:49:58 CET

Original text of this message