Re: Newbie question: DB design for variable fields in Database

From: Jerome H. Gitomer <jgitomer_at_rcn.com>
Date: Tue, 29 Oct 2002 00:37:33 -0500
Message-ID: <apl6jd$s8f$1_at_bob.news.rcn.net>


"Nandan Bagchee" <bagchee.1_at_osu.edu> wrote in message news:1103_1035835366_at_news.cis.ohio-state.edu...
> Hi
>
> I have come across a difficult problem while designing my
> database; basically I guess it involves mapping my data to a
> relational format.
>
> Each of the records of my database has a variable number
> of attributes, further, the attributes can have different
> names. Each record corresponds to a disk file, which is a
> binary image file containing lots of tags. Each tag is
> identified by a name, length and value triple. We want to
> put these tags into a database to facilitate finding required
> files using searches of the tag space.
>
> The problem is that a file can have any tags, ie the file
> author can define and use his own tags. In a particular
> subset of files the tags might be consistent, but across the
> whole lot they will not be.
>
> The tag names, I guess, correspond to the attributes in a
> relational schema; but these cannot be fixed for the entire
> data set. However one attribute at least is fixed, that is the
> file name (fully qualified path names are unique, other than
> that, it's a mush).
>
> I was thinking of a lookup table for the rest of the tags, ie
> have a table, call it tagstbl. It has an external key based on
> the primary key of the filename table; attributes tag name
> and tag value.
>
> The problem is that there are approx 200 tags per file; and
> the total number of files is large (and growing), I can't
> imagine decent performance if I put all my tag lookups into
> one table. Perhaps if I could segment the tags over 2 or
> more tables (instead of 1 tagtbl) that would help, but I don't
> know what criterion I could use.
>
> Is there a generalized way to deal with mapping this data
> organization to the relational domain? Is what I am
> planning even appropriate? Any pointers are welcome.
>
> Thanks in advance,
> Nandan
>
> BTW is this group moderated? my previous post has
> disappeared...
>
>

    If you use a single table and create an index on tag name it doesn't matter how big the table gets. Retrievals from multimillion row tables should still appear to be instantaeous to your users.

Jerry Received on Tue Oct 29 2002 - 06:37:33 CET

Original text of this message