
Re: Best solution: What would you do?

From: Mark Rosenbaum <mjr_at_netcom.com>
Date: 1997/03/23
Message-ID: <mjrE7IE6C.AI2@netcom.com>

>Subject: Best solution: What would you do?
>From: Bruce Bristol <bbristol_at_ix.netcom.com>
>Date: Sun, 23 Mar 1997 00:20:49 -0800
>Message-ID: <3334E7E1.7CC5_at_ix.netcom.com>
>
>Hello,
>
>I would like to get feedback from people on the following situation and
>my solution...
>
>My employer has a history database which consists of 8 base tables each
>containing 18 months of data and 8 corresponding incremental tables each
>containing the latest 3 months of data. There are views for each of the
>base and incremental tables.
>
>The tables are used for batch processing where input data is matched
>against the tables to make a flat file for further processing. At the
>end of processing we use sqlldr to insert the unmatched data into the
>tables.
>
>Here's the brutal part...
>

>Every 3 months we 'purge' the history database. We read in the 21
>months of base+incremental data, write out the oldest 3 months to a
>flat file to be archived to tape and the most recent 18 months to a
>flat file to be re-loaded into the base table. Once done, the
>incremental table is then truncated.
>
>This process takes approximately 2 weeks to complete. This is due to
>the 85GB of data housed in the tables, ranging from 17 million to 60
>million rows per base table.
>
>We are currently on Oracle 7.1.4 on a Sun Solaris environment. We
>will be upgrading to the latest 7.3.x.x release within the next 2
>months.
>
>I want to decrease the amount of time it takes to do the 'purge'
>significantly. While we are purging we are NOT in archivelog mode,
>and we are NOT loading current data into the database. This
>methodology was put in place by my predecessors.
>
>Anyway, here's my 'solution'. I'd like to know what you think and if
>you have any better ideas before I implement...

>I want to take each of the 8 base tables and break them into 6 pieces,
>where they'd each contain 3 months of data (6*3 = 18 months). When
>it's time to do the 'purge' again, we can simply unload and archive to
>tape the oldest table, truncate it, unload the incremental table and
>load it into the table that was just truncated, and then truncate the
>incremental table. We can do this for each of the 8 base tables.
>
>Another thought would be to unload and archive to tape the oldest
>table and then rename the tables using ALTER...RENAME...

The alter/rename should be a faster way to go.
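
For illustration, here is a rough sketch of that rotation (the table
names are invented, and in 7.x the standalone RENAME statement is the
syntax I'd reach for):

    -- Unload and archive the oldest quarter to tape first, then drop
    -- it and promote the incremental table into the base set.
    DROP TABLE hist_95q4;

    RENAME hist_incr TO hist_97q2;

    -- Recreate an empty incremental table with the same shape. Note
    -- that CREATE TABLE AS SELECT copies no indexes, grants or
    -- triggers, so those must be re-created by hand.
    CREATE TABLE hist_incr AS
      SELECT * FROM hist_97q2 WHERE 1 = 0;

Any view over these tables also has to be rebuilt to pick up the new
names, but all of that is seconds of dictionary work instead of weeks
of unload/reload.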

>My greatest concern in breaking the tables up is that we'd go from
>8 base + 8 incremental tables to 8*6 base + 8 incremental, and what
>that does to Oracle's performance.
>
>Look at it this way, per base+incremental table:
>
> Today: Unload 21 months, load 18 months
>Solution 1: Unload 6 months, load 3 months
>Solution 2: Unload 3 months, load 0 months
>
>With Oracle 7.3.x.x we can use partition views to access the data.
>As I understand it, there will be little to no performance
>degradation querying 7 tables vs. 2 tables in 7.1.4.

You have a very sound approach, but (you knew that was coming, didn't you) partition views are a very new feature and need to be tested on a per-platform, per-release basis. I would STRONGLY RECOMMEND that you test both volume and function before going operational. More than one clever DBA has been bitten by an RDBMS not functioning at volume.
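
If you do go that route, the basic shape in 7.3 is a UNION ALL view
over tables that each carry a CHECK constraint on the partitioning
column, so the optimizer can prove which branches a query cannot
touch. A sketch, with made-up table names and date column:

    -- One CHECK constraint per 3-month table, bounding the column
    -- the tables are split on.
    ALTER TABLE hist_96q4 ADD CONSTRAINT hist_96q4_ck CHECK
      (tx_date >= TO_DATE('01-OCT-1996', 'DD-MON-YYYY') AND
       tx_date <  TO_DATE('01-JAN-1997', 'DD-MON-YYYY'));

    -- The partition view itself is just a UNION ALL of the pieces.
    CREATE OR REPLACE VIEW hist AS
      SELECT * FROM hist_96q1 UNION ALL
      SELECT * FROM hist_96q2 UNION ALL
      SELECT * FROM hist_96q3 UNION ALL
      SELECT * FROM hist_96q4 UNION ALL
      SELECT * FROM hist_97q1 UNION ALL
      SELECT * FROM hist_97q2 UNION ALL
      SELECT * FROM hist_incr;

If memory serves, branch elimination also wants the
partition_view_enabled init.ora parameter set and comparable indexes
on every branch, so check the 7.3 docs, and, as above, test at volume.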

>The other concern is a trigger that was built to make sure duplicate
>data isn't added to the view (base+incremental). Every time we do a
>load using sqlldr, the trigger checks the base table to make sure the
>row doesn't already exist. I have to wonder about the best way to
>handle that using Oracle 7.3.x.x and 6 'little bases' that our loads
>need to check against. Do partition views work the opposite way as
>well, meaning will they check the 6 'little base' tables at the same
>time vs. 1 at a time, just as they'll query the 6 tables at the same
>time when doing a query?

It sounds like you may have a process problem if the first time a duplicate record is found is when it is being loaded into the database. Be that as it may, it may be faster to load all of the data with a direct path load and then dedup the new table. It may also be possible to eliminate dups in UNIX before loading: if every 3-month extract is run through sort, cut and compress, the disk space needed could be very reasonable.
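
Two details worth knowing here. First, a direct path load does not
fire insert triggers at all, so the existing dup-check trigger would
be bypassed anyway. Second, the post-load dedup can be a single
delete per base table; a sketch, with an invented two-column key:

    -- Remove freshly loaded rows whose key already exists in a base
    -- table. Repeat per base table, or aim it at the partition view.
    DELETE FROM hist_incr i
     WHERE EXISTS (SELECT NULL
                     FROM hist_96q4 b
                    WHERE b.acct_id = i.acct_id
                      AND b.tx_date = i.tx_date);

With a real multi-column key, substitute the actual key columns; with
no usable key, the UNIX sort route before loading is probably the
better bet.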

Hope this helps

mjr

Received on Sun Mar 23 1997 - 00:00:00 CST
