Re: Best way to design table to store attributes?

From: <carl.henthorn_at_gmail.com>
Date: Thu, 22 Jan 2009 16:18:13 -0800 (PST)
Message-ID: <ee792bf7-69b8-4423-ba03-1d7f3c73e916_at_w24g2000prd.googlegroups.com>


On Jan 22, 12:53 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
> carl.henth..._at_gmail.com wrote:
> > I have an entity that along with all of the usual attributes, has a
> > set of 140 bit flag attributes that I need to track all yes/no values
> > of. I started going down the path of creating a table 150 columns
> > wide, but after typing a few of them thought there may be a better
> > way. :-)
> > I am thinking of creating a table like this:
> > Create table details (
> >   DetailsID int identity(1000,1),
> >   Name nvarchar(40),
> >   Descr nvarchar(60),
> >   Category nvarchar(40) )
> > Thus I can take what would have been my column headers and use them in
> > Name field, throw what would have been the bit flag value and throw
> > that into the description field. I would tie them together with a
> > category value to combine them into logical groups.
> > I have around 1.2 million entities to store this info for, so this new
> > table will get long fast.
>
> > Another passing thought was to store these sets of values as XML for
> > each entity. I have to admit I dont know what the advantage is for
> > that other than the row count savings. how easy would it be to search
> > through a 140 field xml blob for 1 million entites?
>
> > Any advice is appreciated on how to tackle this.
>
> Before folks start pointing and laughing and snickering about you
> re-inventing EAV yet again, I thought I would point out that the
> simplest logical representation for a boolean in a relational system is
> simply the existence of a tuple in a relation.
>
> Thus, a design that has 150 tables each with a single column
> theoretically suffices.
>
> On the other hand, such a design would clearly violate the POOD. Some
> might question the principle of orthogonal design, but I think one
> ignores it at one's own risk when creating 150 tables with identical
> predicates.
>
> At a more basic level, though, are you sure you have correctly modelled
> your problem? 150 independent booleans creates a state machine with
> somewhere around 10^45 states. That's a big state machine. Without any
> sort of transition constraints, that creates a fully connected state
> machine with 10^45 states and somewhere around 10^90 allowable transitions.
>
> Those are big numbers, and it seems unlikely you really need such an
> unwieldy state machine for each row of your table.- Hide quoted text -
>
> - Show quoted text -

Thank you for bypassing the hazing of the rookie, I appreciate it greatly! :-)
For the purposes of this table, my entity is a person, and the 150 bit fields are a combination of columns describing all of the travel interests that they may have in a "yes I am interested", or "No, I am not interested" format. While it is possible to have a person interested in all 150 travel destinations, it is very unlikely, thus giving me a subset to work with. Since the set of "yes" values determines which marketing pieces get sent out, I do need to keep track of those, obviously, but I do not see the reason for keeping track of the "no's" since marketing would not be sent to them anyways. In my proposed table the absense of a row is the same as a "No" value.

Being moslty self taught, I was not familier with the term EAV. But after looking it up, I can see that my issue is very simualr to a patient-symptom set of data for a doctor. Sounds like this is the way to go after all.

I still have to wonder if there is a better way, however, to store/ read that info. Perhaps a quasi-binary where each bit represents a field in a footprint? or even the XML blob. thank you again for the response! Received on Fri Jan 23 2009 - 01:18:13 CET

Original text of this message