Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: datawarehouse extraction, snapshots?

Re: datawarehouse extraction, snapshots?

From: Jusung Yang <jusungyang_at_yahoo.com>
Date: 17 Aug 2002 02:35:25 -0700
Message-ID: <42ffa8fa.0208170135.2455fc67@posting.google.com>


Well, best that I shut up now. Data warehousing is kindda complicated. Without knowing your system fully, it's hard for me to say what's best for you. Once the DW is built, it is usually difficult to change it. We spent 4 months just to build the model. Our data sources came from SAP, SYBASE, ORACLE and EDI. Delta load design was one of the biggest design issues. Data partitioning was another. Refresh frequency and downtime management was another. We used Informatics PowerMart for the ETL. What I said before was based on the information provided in your posts.

Just a couple more things.
1.
Delta load design does not have to depend on a timestamp column. Snapshot log is a type of delta, no? I don't see ORACLE depend on timestamp columns to capture the changed data and place them in these logs. Trigger is what you need - if you can not use the 9i CDC. You basically do your own "limited" CDC. Capture the changed data and decide how to use them to refresh your warehouse. Conceptually, I would do this:
a. Use triggers to capture newly inserted or deleted data, flag them as 'I' and 'D'.
b. For updated data, you log the old record as 'D' and the new record as 'I'.
c. You Refresh your DW with these data. Transform them if needed.

2.
Here is what I understand you would like to do:

                Current Design
-------------------------------------------------
                ETL queris
(Source tables)     ->     (DW tables) <- (DW user queries)


                       Your Plan
-----------------------------------------------------------------
                                                   ETL queries
(Source tables) -> (Replicated source tables in DW)     ->    (DW
tables) <- (DW user queries)

Your DW tables are being completely refreshed every night, yes? By replicating source tables into DW, like I said before, you only shift your problem from OLAP to DW. It may be OK today since the DW hardware is more powerful, may not be in the future when the data volume grows. ETL queries are still doing the full table search, as opposed to delta, on the replicated tables.

But if doing CDC and modify ETL is not an option for you, you may indeed have to do this. I have nothing further to say. Good luck!

sg <s4v4g3_at_europe.com> wrote in message news:<3D5D1C8D.9010407_at_europe.com>...
> re-working ETL processes to start doing delta load is virtually
> impossible because source data are all third party software, therefore
> there are no timestamp columns for us to do delta extraction
>
> there are quite a warehouse users, this warehouse has been up for around
> 2 years, so far the performance is not too bad. the problem is
> extraction which kills OLTP source...
>
> regarding creating snapshots, I just did a small research in oracle
> documentation about datawarehouses and seems like snapshots are
> recommended for delta extractions.... in my case there wont be any
> complex snapshots because most probably is i will replicate tables and
> not data based on queries
>
> the perfect way would be adding timestamp columns but that seems far far
> far away although timestamp will not be able to deal with deleted data
> and for that we will probably have to rely on delete triggers and a log
> table
>
> I was hoping someone who has worked in datawarehouse share their data
> extraction experience :-)
>
> cheers
Received on Sat Aug 17 2002 - 04:35:25 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US