
Re: Backing up a data warehouse

From: RSH <RSH_Oracle_at_worldnet.att.net>
Date: Tue, 23 Apr 2002 06:08:16 GMT
Message-ID: <kT6x8.45705$QC1.3221498@bgtnsc04-news.ops.worldnet.att.net>

Well, RMAN seems to be the miracle cure for everything. I'm not buying that solution, for a couple of reasons; not that I think the advice was bad, as such.

Especially with partitioning and ROTs [read-only tablespaces] (a feature one of my groups funded much of the development of, along with a few design inputs), a well-managed solution would examine your data along financial and chronological axes and, at some point, migrate the stable portion to a ROT.

If you take advantage of all the tools that at one time we could only dream of and pray for, they can work to your great advantage. Actively updated data are of course not candidates; but segment your information. For example, if you are dealing with general ledger data, it is generally accepted as legitimate (except perhaps at Enron) that two months past the audited and accepted GL book close, those data can be migrated into read-only tablespaces. Even if they still reside on magnetic online storage, just designating a tablespace as read-only reduces Oracle's overhead in accessing such data.
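
As a minimal sketch of that migration (every tablespace, file, table, and column name here is made up), it is just a copy into a dedicated tablespace followed by freezing it:

  -- hypothetical names throughout
  CREATE TABLESPACE ts_gl_2001_ro
    DATAFILE '/u05/oradata/dw/ts_gl_2001_ro01.dbf' SIZE 2000M;

  CREATE TABLE gl_detail_2001 TABLESPACE ts_gl_2001_ro AS
    SELECT * FROM gl_detail WHERE fiscal_year = 2001;

  ALTER TABLESPACE ts_gl_2001_ro READ ONLY;

Once the tablespace is read-only, one last backup of its datafiles is all it ever needs again.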

With all the tools and features we have now, the multitude of options can be more confusing than what we had years ago.

But, just as a hypothetical, GL or billing or other date-keyed information might be arranged into a 13-month rotation, using partitioning, perhaps IOTs (index-organized tables), and other such features, rolling Business Month 13 (last year's January book close) off to a ROT structure. Depending on business and transaction activity, one might migrate closed-books data from active read/write tablespaces to a ROT as soon as the book close is certified.
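
A rough sketch of that rotation with 8i range partitioning (names are invented, including the hypothetical local index, and only two of the thirteen partitions are shown):

  CREATE TABLE gl_fact (
    post_date   DATE         NOT NULL,
    account_id  NUMBER       NOT NULL,
    amount      NUMBER(14,2)
  )
  PARTITION BY RANGE (post_date) (
    PARTITION bm_2001_01 VALUES LESS THAN (TO_DATE('2001-02-01','YYYY-MM-DD'))
      TABLESPACE ts_gl_rw,
    PARTITION bm_2001_02 VALUES LESS THAN (TO_DATE('2001-03-01','YYYY-MM-DD'))
      TABLESPACE ts_gl_rw
  );

  -- once Business Month 13 is certified, move it out and freeze it
  ALTER TABLE gl_fact MOVE PARTITION bm_2001_01 TABLESPACE ts_gl_2001_01_ro;
  ALTER INDEX gl_fact_ix REBUILD PARTITION bm_2001_01 TABLESPACE ts_gl_2001_01_ro;
  ALTER TABLESPACE ts_gl_2001_01_ro READ ONLY;

(Local index partitions go UNUSABLE after the move, hence the rebuild before the tablespace is made read-only.)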

If we had had IOTs and so many of the other features, like partitioning, available when we were building our first data warehouses, life would have been a bit easier.

As it was, in our beta trials of Oracle 7.1, there was a feeling of great joy at the notion of things like PL/SQL: an enabling language giving us the unprecedented ability to define, construct, and enforce business rules inside the database, wrapped around the (hopefully normally followed) design constructs of primary keys, unique indexes, and relational constraints (PK/UK/FK) backed by the corresponding indexes. When you have massive tables that cannot easily be recovered to a consistent state, PL/SQL is something to learn very quickly.
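
A toy illustration of that division of labor (the tables and the rule are invented): the declarative constraints do the relational work, and a small piece of PL/SQL enforces the rule they cannot express.

  CREATE TABLE gl_batch (
    batch_id  NUMBER        CONSTRAINT gl_batch_pk PRIMARY KEY,
    status    VARCHAR2(10)  NOT NULL
  );

  CREATE TABLE gl_entry (
    entry_id  NUMBER        CONSTRAINT gl_entry_pk PRIMARY KEY,
    batch_id  NUMBER        NOT NULL
              CONSTRAINT gl_entry_batch_fk REFERENCES gl_batch (batch_id),
    amount    NUMBER(14,2)  NOT NULL
  );

  -- business rule the constraints cannot state: no postings to a closed batch
  CREATE OR REPLACE TRIGGER gl_entry_biu
  BEFORE INSERT OR UPDATE ON gl_entry
  FOR EACH ROW
  DECLARE
    v_status  gl_batch.status%TYPE;
  BEGIN
    SELECT status INTO v_status FROM gl_batch WHERE batch_id = :NEW.batch_id;
    IF v_status = 'CLOSED' THEN
      RAISE_APPLICATION_ERROR(-20001, 'Batch is closed; no further entries.');
    END IF;
  END;
  /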

Now that I have made all these largely irrelevant comments, you can begin reading again.

In our DW/DSS environment, we segregated the active changes (i.e., the online changes and tweaks the lawyers, accountants, and so forth made), which amounted to 1% to 2% of database change as expressed monthly. 99% of the activity occurred after final book close, when hundreds of millions to close to a billion rows would be drawn from flat feeders into input tables and then post-processed by PL/SQL procedures.
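
That load-then-derive pattern looks roughly like this (feeder layout, file names, and tables are all invented):

  -- gl_feed.ctl: SQL*Loader control file for one flat feeder
  LOAD DATA
  INFILE 'gl_feed_200204.dat'
  APPEND
  INTO TABLE gl_input
  FIELDS TERMINATED BY '|'
  (post_date DATE "YYYYMMDD",
   account_id,
   amount)

  -- derivation/condensing pass in PL/SQL after the loads complete
  BEGIN
    INSERT INTO gl_monthly_summary (account_id, book_month, total_amount)
    SELECT account_id, TRUNC(post_date, 'MM'), SUM(amount)
    FROM   gl_input
    GROUP  BY account_id, TRUNC(post_date, 'MM');
    COMMIT;
  END;
  /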

Our processing cycle allowed us the leisure, if one might call it that, of ceasing access to everyone but the DBA God at 2000 hours (8 PM) Friday, close of business, and doing full and incremental exports to tape, in duplicate. Once import tests on the tapes had been run, after the 12+ hours it took to write them, import processing was run on all the feeders, the PL/SQL and related derivation and condensing procedures were conducted, and balance tests were done early Sunday morning. Consistent cross-footing and balancing meant an acceptable run. Any error meant a retreat to the last consistent and accurate set of books, which might mean "desperate measures" up to and including reloading the entire set of derivative tables from the EXPORT.
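
With the 8i-era exp/imp utilities, the export-and-verify step comes down to something like this (credentials, devices, and file names are placeholders):

  # full and incremental exports, written in duplicate in practice
  exp system/manager FULL=Y INCTYPE=COMPLETE    FILE=/dev/rmt/0n LOG=full_exp.log
  exp system/manager FULL=Y INCTYPE=INCREMENTAL FILE=/dev/rmt/1n LOG=inc_exp.log

  # "import test": SHOW=Y reads the entire dump and lists its contents
  # without actually creating or loading anything
  imp system/manager FULL=Y FILE=/dev/rmt/0n SHOW=Y LOG=imp_test.log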

Good, bad, dead, ugly, or infected with anthrax: at 0700 Monday, a consistent, functional, and accessible set of numbers was available to the bean counters. If there were any questions about cross-footings, other balancing, or bad feed issues, they would see the last certified book-close numbers (while we murdered people quietly in the background). Otherwise, a new dawn would glow on numbers we could honestly say balanced to the penny, and at 0700 our users could do non-joins resulting in unspeakable Cartesian products, or whatever they wanted.

I highly recommend, particularly in a high DSS/low TP environment, doing full EXPORTs as part of your backup cycle, at least monthly. In addition, if you have the ability to do image backups with quiescent databases, do that as well.
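
A quiescent image backup in that era is just a clean shutdown and a file copy; a rough sketch (all paths are placeholders):

  SVRMGR> connect internal
  SVRMGR> shutdown immediate

  $ cp /u02/oradata/dw/*.dbf /backup/dw/
  $ cp /u02/oradata/dw/*.ctl /backup/dw/

  SVRMGR> connect internal
  SVRMGR> startup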

Finally, depending on your volume and processing cycles, of course, enabling ARCHIVELOG mode is an option, but not one I would recommend for primarily DSS systems that receive gigabytes of data once a month or so. If you do run it for full security, you could always disable it before your SQL*LOADs, and possibly during your derivation processing, and turn it back on afterwards.
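
Switching ARCHIVELOG mode off and on requires a clean shutdown and a restart to the MOUNT state; a rough sketch (run as INTERNAL/SYSDBA):

  SHUTDOWN IMMEDIATE
  STARTUP MOUNT
  ALTER DATABASE NOARCHIVELOG;
  ALTER DATABASE OPEN;

  -- ... SQL*Loader runs and derivation processing happen here ...

  SHUTDOWN IMMEDIATE
  STARTUP MOUNT
  ALTER DATABASE ARCHIVELOG;
  ALTER DATABASE OPEN;

Bear in mind that backups taken before the NOARCHIVELOG window cannot be rolled forward through it, so take a fresh backup once archiving is back on.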

In a 24 x 7 x 99.99999 environment, you're stuck with archive logging whether you like it or not. We managed not to need that, and we processed billions of dollars.

But we balanced.

To the penny.

Or we retreated to the last set of books until we could find out why we were off.

RSH. P.S.: I am courtesy-copying your originating mail address, but something tells me it's not what I would qualify as a reliable address.

"S. K." <askme_at_foo.com> wrote in message news:a9j9tg$3mavd$1_at_ID-87429.news.dfncis.de...
>
>
> Database: Oracle 8.1.6 Enterprise
>
> How can one effectively backup a data warehouse containing
> historical data?
>
> I was thinking about creating a consistent, self-contained time slice
> of the warehouse, whereby this time-slice contains everything from the
> previous month -- fact tables, dimension tables, indexes... and putting
> all those in a transportable tablespace.
>
> This would involve the following:
>
> - if the main tablespace for my data is TS_DATA, I would create a new
> tablespace TS_DATA_<date>,
>
> - if a table in TS_DATA is called TABLEX, I would create a new table
> TABLEX_<date> in TS_DATA_<date> and transfer all the records from
> TABLEX to TABLEX_<date>.
>
> - I would then create indexes for TABLEX_<date> in TS_DATA_<date>
>
> - finally, I would create a transportable tablespace out of TS_DATA_<date>,
> and archive it. This transportable tablespace would then represent
> a time slice for the data warehouse, containing tables, indexes, etc.
> for the previous month.
>
> Alternative suggestions?
>
> SK
>
>
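
For reference, the transportable-tablespace route described in the question comes down to a few steps in 8i (all names here are placeholders; the tablespace must be made read-only before the metadata export):

  ALTER TABLESPACE ts_data_200204 READ ONLY;

  # export only the metadata for the tablespace
  exp system/manager TRANSPORT_TABLESPACE=Y TABLESPACES=ts_data_200204 FILE=ts_200204.dmp

  # archive ts_200204.dmp together with the tablespace's datafiles;
  # to bring the slice back later, copy the datafiles into place and plug it in:
  imp system/manager TRANSPORT_TABLESPACE=Y DATAFILES='/u03/oradata/dw/ts_data_200204.dbf' FILE=ts_200204.dmp
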
Received on Tue Apr 23 2002 - 01:08:16 CDT
