Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Interesting warehouse design challenge (long)

Interesting warehouse design challenge (long)

From: markoos <no_at_spam.com>
Date: Sat, 26 Jan 2002 04:09:38 GMT
Message-ID: <6_p48.51619$d16.8708171@typhoon.nyc.rr.com>


We have a 9i system functioning as a data warehouse on Win2k. We're stuck with a bad old design of a previous warehouse and need to optimize things.

We have a table with 700,000 records and 10 columns that tracks our effort data (person works n hours on project 'abc' performing function 'def'). Lets call it tbl_effort.

We have ~10 manhour reports that run against this table from a CF web application. Each report can be run by any one of multiple user departments. The issue is that each department has a completely different set of criteria in which to query this table. Some departments filter by one column, others another column, others combinations of columns. Sometimes there are OR clauses as the rules change between date ranges. There is no direct rhyme or reason to these filters, they are just the way the business rules work. It's making it very difficult to index.

Unfortunately we're applying these filters at runtime. The web application pulls the 'where clause' from tbl_where_clause, and dynamically constructs the query (not my idea, trust me). Each department has one and only one where clause. Some effort records could be used by two departments, this is many-to-many.

The reports run between 10 seconds-2 minutes, often kicking off full table scans. I would like to improve this speed of this system by pre-filtering the data at night, creating any other tables, views, indexes, or materialized views as necessary to optimize the runtime speed.

One of the requirements is that departments and where clauses can be added or edited mid-day, and the system needs to reflect the new filter immediately.

Option 1:

My thought is that we add a new table that joins to tbl_effort on effort_ID, call it tbl_effort_join.

CREATE TABLE tbl_effort_join (

    effort_ID INT NOT NULL,
    dept_ID INT NOT NULL
);
--and the appropriate indexes.

At night we loop through a cursor of the departments, applying each where clause and inserting one record in the join table for each record returned f rom the effort table. This could be implemented as a procedure that takes in one department ID, dynamically constructs the query against tbl_effort, and runs an INSERT INTO tbl_effort_join SELECT (dynamic query). I haven't done this yet but I don't know why it wouldn't work.

We then could tweak the end-user queries to use this table joined (or create this as a view). All queries simply filter on dept=dept_ID instead of the long-winded where clause.

If a where clause changes during the day, we call the procedure once for that department. That procedure would delete & re-insert rows, or more efficiently if possible.

Option 2:

My second design candidate is to have only one table. That table is created with the same schema as tbl_effort, but with dept_ID added. Any of the original effort records that crossed depts would now have duplicate records with different dept_IDs. Storage space is not an issue.

We construct the table in the same way as option 1, with a procedure and a loop through the dept cursor. Also if where clause is added or updated, we run the procedure which deletes & reinserts the records.

I'm thinking that this is a stronger design, as having one table will be easier to index and create materialized views against. Also I'm thinking the queries may potentially run faster without having to join.

I would consider index organizing this table if that sped up the queries.

So the question is, which design is better? Should I opt for full normalization or go with one repetitive table? Is there a third design candidate I'm missing? Am I missing anything else?

Thanks for any suggestions! Received on Fri Jan 25 2002 - 22:09:38 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US