Advice required for more efficient Oracle database definition/solution

From: Monty <mmontreaux_at_hotmail.com>
Date: 13 Apr 2002 05:52:14 -0700
Message-ID: <6284dd3.0204130452.39ee3266_at_posting.google.com>


Hello, just fielding opinion on a better solution than what is currently implemented for a large Oracle database. I've been brought in to a firm and asked to change the way a database is structured.

Here's the spec, a single Oracle table

VARCHAR2(20) NGD number (census form number) VARCHAR2(1) A digit (0..9) or character (Y, N, or X). 0 and X essentially mean Not Applicable and there are lots and lots of these entries in the table columns.

The table was probably not initially created using something like the SQL below, but it has been shown this way as it clearly defines the existing table structure.

CREATE TABLE FrmC1997(NGDid VARCHAR2(20), q001 VARCHAR2(1), q002 VARCHAR2(1) ..... q464 VARCHAR2(1));

This is not a typo. There are 464 question columns in this table. Each q??? value corresponds to a census question number (or where the government clerks have substituted a digit/character for ethnic race, profession etc).

There are approximately 85 million records and the database and it has been provided to us to propose and demonstrate a better table structure for all this census information (for a government stats department) for large batch database queries.

The database is currently being queried with very simply SQL queries from web-applications (SELECT something FROM tbl_Frm1997 WHERE q1='Y' AND q2='3' AND q3='X' AND q9='7' AND q387='N') and as we have no control over the many many web front ends that access this data, it has to stay like this. The SQL strings are often 4-5 hundred characters in length but are always of this simple SELECT something WHERE followed by lots of predicates. Thus we imagine the proposed solution would be to create a view called Frm1997 with this external schema definition, and access the underlying different format base tables.

So, my query, a better table structure would be ? For any information that is missing, just imagine a standard census questionairre.

The goal - faster access times for large processing jobs, and a more elegant solution.

Our answer is currently the obvious answer (put each question number into it's own table and perform table joins in the view) but believe there maybe a more efficient and logical solution that we are overlooking. Any ideas greatly appreciated.

Monty. Received on Sat Apr 13 2002 - 14:52:14 CEST

Original text of this message