Re: Modelling objects with variable number of properties in an RDBMS

From: Roy Hann <specially_at_processed.almost.meat>
Date: Wed, 2 Nov 2005 19:26:36 -0000
Message-ID: <>

"vc" <> wrote in message
> > I find both claims quite astonishing. That is entirely contrary to what
> > would expect. Evidently it was contrary to what they expected too since
> > say they tried one of my suggestions. Since it seems we agree that
> > the expected outcome that makes it an extraordinary claim.
> OK, a SQL Server row can be 8K long maximum. Say, a drug formula is a
> relation like (some_stable_info_about 2K long,
> ~5000-6000_lab_data_results). The SQL Server row can be 8K long
> maximum which leaves room for 750 double precision columns for storing
> trial results. In a multi-table, approach it would amount to 8 tables
> partitioned horizontally, by attribute subsets.

I assume you mean "vertically" not horizontally.

> A simple request like
> select formulas for a given date with drug trial parameters in some
> given ranges can be expressed as a single SQL statement with an EAV
> table and would require 8 unions, or thereabout, with multiple tables.

I guess if I wanted to make the effort I might come up with the sort of queries you are thinking about here, but it's not my argument and I can't be bothered. You are re-asserting they were tough queries, but you already said that.

Also, I am starting to wonder if the tables you are describing are really EAV tables (like what the OP came up with). I wonder if your tables might not be a perfectly legitimate design--something along the lines of

create table result

   experiment ...,
   datetime ...,
   analyte ...,

   concentration ...

> The optimizer could not build an access plan with acceptable
> performance with the multiple table approach. Joins complicated the
> problem even more.

No doubt there are very hard queries that tax an optimizer. You are asserting this is one of those cases but I choose not to take your word for it. Only examples will do.

> Additionally, on a daily basis, about a dozen or more drug trial
> parameters were added or dropped from consideration which would mean
> potentially modifying all the 9 tables with DDL statements. Clearly,
> it creates a severe maintenance headache.

Maybe, maybe not. Would it have been so hard to create a tool to do it? I assume a tool was created to make sure the EAV tables were well-behaved, that no one entered spurious rows, nor left rows out. But perhaps not?

> > An extraordinary claim always requires extraordinary proof and I think
it is
> > quite reasonable for me to remain skeptical about this until the
> > makes it perverse of me to go on denying it. I can be persuaded by
> > evidence. Can you provide full details?
> Unfortunately, I cannot provide more detail than above since the design
> discussion I mentioned happened about two years ago.

Pity. I remain skeptical then.

> > I am not clear if it it is your opinion that the EAV design is
beneficial in
> > general.
> If you'd carefully read my other contributions to the thread, you'd
> have been able to figure out that, in fact, I am strongly opposed to
> the EAV approach.

Fair point. I confess I read the contributions more carefully than the sigs.

> The implication of your original statement was that EAV is bad in all
> the cases.

Hmm. That's pretty close to my intended meaning, yes. (This is a theory group.) But I do of course concede that bugs and misfeatures might cripple a specific version of a specific product so badly that a really bad approach at least works even when the "proper" approach fails entirely. But that is a banal observation.

> My objection was that the generalization is not fair and I
> gave an example when it made sense to represent a table (or a couple of
> table) as an EAV set.

Unfortunately you only assert that is the case. I remain skeptical for want of substantial evidence. I can't even tell what your tables look like. Even if I wanted to believe you (but I don't) I'd be a fool to do so on the basis of what you've shown us so far. I can think of lots of reasons why those folks might not have got their first (and therefore presumably preferred) approach to work.

> You've offered first a saying

In fairness it was a saying that summed up an important point. Do you disagree that one should only reluctantly discard a successful approach? How eager should one be?

> and then a
> multitable approach that was not acceptable for the reasons I tried to
> describe above.

I have no doubt that you passionately believe in your position. I don't but I can be made to. I just need proof. Hazy recollections don't do it for me.

> > The EAV design has been widely and repeatedly criticised, has numerous
> > weaknesses with respect to the established approach, and there are
> > reasons to doubt it should be intrinsically easier to code for. It is
> > contrary to the established position. There is no burden on me to
> > the established position.
> The "established position" means nothing in a logical discourse.

That's not so. One is never required to re-establish what has already been established.

There is an established approach to table design. It says attributes are columns, not rows.

> Every
> situation in a database design world has to be argued on its own merits
> -- it's not a perpetuum mobile problem that has been settled once and
> for all.

That is not the implication of what I said. Of course the established orthodoxy might be toppled. I am just saying that the burden of toppling it is not on me if I take the orthodox position. I do.

Roy Received on Wed Nov 02 2005 - 20:26:36 CET

Original text of this message