Database design for large numbers of attributes

From: John <rjg96_at_yahoo.com>
Date: 15 Feb 2004 23:48:47 -0800
Message-ID: <63394c1f.0402152348.7f8db6b2_at_posting.google.com>


I'm developing an application for medical use that will be used to capture patient background and visit data. The application will have approximately 50 forms, with an average of about 20 fields each-- so a total of 1000+ fields. Almost always, the fields are unique to a particular form--and it will be pretty unlikely that new forms and fields will be added--although not out of the realm of possibility. There will also be a search feature that will allow physicians to query for certain forms based on as many attributes as they want to use. Another relevant piece of info is that each form will have a status, and that it will be necessary to aggregate all of the form statuses for a patient to determine a patient status. The target database is ORacle 9i (9.2.0.2). Total number of patients will probably be around 10,000-- each will probably have an average of 50 forms (including multipe visit forms)-- so about 500,000 forms. I'm trying to decide how to design the database: SHould I
1. use a "survey" type design, where every attribute will be a row in a response table?
Pros:
-most normalized
-most flexible and easy to make "schema" changes
Cons
-Large number of rows in response table
-cumbersome to work with
-reduced performance

2. create a separate table for each form w/ its unique attributes, and a master table that has the common attributes and the primary keys Pros
-Easier to work with
-updates and selects will have better performance
Cons
-Not normalized?
-The "ad-hoc" search queries and reports may be difficult to write and
maintain (possibly solvable w/ a large view?) 3. Store all the data for each form as XML in an XMLType column (non schema based)
Pros:
-Easy to work with
-updates and selects will have good performance (since i'm always
updating or selecting an entire form)-- may actually have the best performance of all 4 options for updating and retrieving form data

-normalized

Cons
-XML bloats storagce space requirements (but not really that big a
deal)
-Will make ad-hoc search and reporting queries perform(and probably
use lots of RAM) more slowly since XML will have to be parsed in order for Xpath queries to run
4. USe oracle 9i's schema based XML feature Pros
-Faster Xpath queries than option 3

COns
-If shcema changes, all generated objects have to be dropped and
re-created-- seems like a big pain
-If I was going to do tihs, why not just go with option 2.

I'm most tempted by options 2 and 3-- and would really prefer 3 if there was a way to make sure that the ad-hoc queries would perform decently. Received on Mon Feb 16 2004 - 08:48:47 CET

Original text of this message