Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: refreshing DW environment from APPS (9i)

Re: refreshing DW environment from APPS (9i)

From: jaromir nemec <jaromir_at_db-nemec.com>
Date: Thu, 18 Jan 2007 22:54:43 +0100
Message-ID: <10d301c73b4b$4c3890f0$3c02a8c0@JARAWIN>


Hi Cos,

some thoughts to the topic
> The problem with this approach, because of the "OR"
> clause, you're forced into full table scans. (and here, I'm only giving as
> an example a 2 table join however there are FACTS, etc, that require many
> more joins so the "full table scan" stuff gets really ugly)

this a is typical extraction problem. The OLPT databases are not suitable to perform large joins.

> so, basically, what we're tying to do is capture any changes for the year
> (month/day/whatever)

I my experience to get a year of data from the source system is sometimes better to get all required tables separately (whole tables or maybe there is a limit how long in the history can the application change the data; in this case you may limit the extract) and join the data in the stage area of the DW (using full scan and parallel hash join).

to extract a small number of data (e.g. daily changes) the application the above brute force approach is not suitable. The application should provide some change data capture (CDC) functionality, i.e. to store the whole required information in separated (delta) tables. There is a CDC provided by Oracle: synchronous base don triggers and asynchronous based on Oracle log. Some application implement proprietary CDC defining there own DELTA tables.

If this is not an option, last resort is to define an index of the last_update column and try to tune the join to get cascaded nested loop plan. Important to note that this is valid only in case of extraction of *small* number of data. The FULL SCAN you mentioned above is in my opinion not caused by the OR but due to extraction of large parts of the table.

> One can probably circumvent this by using materialized views to determine
> the delta,

The materialized view LOG contains the delta. The materialized view replicates the state of the table (or a select) in the DW database. The MV can be partitioned, check the CREATE MATERIALIZED VIEW command. Not in all case is it possible to have a) fast refresh and b) the required join in the MV definition.

Similar to the discussion above my experience with MV is: to get the data once a year make a full refresh of the MV to get the data daily or so make two steps: replicate the source tables with fast refresh and rebuild the new state (join) in DW to get the data more fervently some kind of fast refresh must be found.

> and ((ooha.last_update_date > '1-jan-2005' and ooha.last_update_date <=
> 31-dec-2005')

last note - remember if a record can be updated after insert, you may get an incomplete state of 2005 changes in your example. If a record is updated 2005 *and* 2006 it will be missing the 2005 extract.

HTH Jaromir D.B. Nemec

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jan 18 2007 - 15:54:43 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US