Re: EAV (Re: Object-relational impedence)

From: topmind <topmind_at_technologist.com>
Date: Mon, 31 Mar 2008 08:11:49 -0700 (PDT)
Message-ID: <a5fa390d-025e-4708-93e0-588d573faa2d_at_u10g2000prn.googlegroups.com>


Brian Selzer wrote:
> "topmind" <topmind_at_technologist.com> wrote in message
> news:a742d466-e6a9-4d13-a1ba-953567fcb3dc_at_u10g2000prn.googlegroups.com...
> [big snip]
>
> > It's a gut feeling for everyone here because it's never been tried in
> > practice. We cannot really know with near certainty all the possible
> > usage scenarios (query patterns) and which are more frequent. It's an
> > educated WAG at best.
> >
>
> Perhaps some testing is in order.
>
> > There's another suggestion that nobody's mentioned here that I know
> > of: Add each new column to the Widgets table as needed, creating a
> > "sparse" table. If we are going to allow add/change to schemas, then
> > why not just add the new frippen column to the Widgets table and be
> > done with it? Sparse tables are not really space hogs on most modern
> > RDBMS like they were in the past. The overhead for unused (null)
> > columns is small.
> >
>
> It's not about space. Separate tables can be indexed separately, permitting
> the selection of better query plans. A lot of nulls, such as you would find
> in a sparse table, skews the statistics for columns, making some indexes
> that would otherwise appear optimal less attractive, leading to poor
> performance.

I doubt queries against sparse columns would be common. Generally there would first be a query involving the "widget type" and/or attribute type, which would greatly narrow what is being searched. And, depending on how the optimizer and index system is built, you don't need to store Nulls *in* the indexes. Think about it. There are a few operations where indexing nulls makes sense, but they are not common enough to bother in my opinion. There are other ways to get the same info. But, a given vendor may have seen it different. In short, it depends on the DB engine.

EAV's don't need to store nulls. You just don't include that entry. If a given index engine is not null-friendly, then perhaps an EAV is better than a sparse table as far as performance.

>
> The idea of a sparse table is ugly because it involves nulls.
> Nulls--especially those that indicate 'there shouldn't be a value
> here'--should be avoided whenever possible. If you're interested, there
> have been many discussions on cdt regarding nulls. If you can wade through
> the flames, you might even find some useful information.
>
> Data has an inherent structure.

Sometimes its "inherent structure" does NOT fit static tables well. The real world is not always friendly to a given abstraction. And, although I am against "nulls" for strings, the concept of some kind of "empty" cell (zero length) is not necessarily a bad thing. (In some RDBMS, zero length and null are the same thing, in others they are not.)

> Stuffing everything into one table, whether
> it be an EAV table or a sparse table imposes an alien structure on the data,
> introducing redundancy and complexity and usually reducing performance. A
> solution that has one table per widget type also imposes a structure on the
> data, one that inevitably introduces redundancy. The 'table-happy' design
> that I suggested earlier does not introduce redundancy and does not require
> nulls.

Your solution does NOT reduce redundancy because the foreign key has to be repeated over and over again. And, empty cells don't necessarily reduce performance, per above. And, filling up table-space is ugly and not debugging-friendly. Fat table-spaces slow me down. I cannot speak for everybody's psychology and hand-eye-mouse-brain coordination, but it slows down MINE.

I will respect your choice because the tradeoffs are all sticky either way. But, I do not agree with it, nor do I agree with your performance criticisms and redundancy claims. They appear incorrect.

-T- Received on Mon Mar 31 2008 - 17:11:49 CEST

Original text of this message