Re: Uploading data for star schema

From: Mike Krolewski <mkrolewski_at_rosetta.org>
Date: Wed, 06 Dec 2000 18:54:19 GMT
Message-ID: <90m20o$m3u$1_at_nnrp1.deja.com>


In article <3A2D47A9.98FAE0C0_at_weinerfamily.org>,   sw_at_weinerfamily.org wrote:
> I have a textfile that contains rows of data in a non-normalized
 fashion
>
> What I need to do is load that data into a star schema.
> In order to this I need to load each of the dimension tables and get
 the
> ids of the entry. Of course if the row already exists I need to just
 get
> the ids. After I have done this on all measure (8 of them). I need to
> stuff the ids and other data into a record in a fact table.
>
> Currently, I am doing this using a VB program and it is incredibly
 slow.
> Is there a way to this using SQLLoader or some other technque that
 would
> be more efficient?
>
> NOTE - probably doesn't matter but I am not using Oracle OLAP.
>
> Thank you.
>
>

Another process is handle some of the processing upfront in the original database. As you add records, time stamp them to determine which ones need to be loaded. Review the dimension (measure) fields to see if they are in the table -- then create new rows in the dimension fields are you go including getting sequences out to the Oracle rdbms.

When you load the data, reload all of the dimension data. You would have to disable the contraints on the fact table. Then start loading the fact table. You should be able to create from the original datasource a file with each record in the fact table defined. SQLLoader can then be used to add the new records quickly.

Alternatively, you could make the Oracle side do more of the work. Use sqlloader to load the raw data into a temporary table ( not part of the normal schema).

In a PL/SQL block:

Query the temp table for missing dimension elements. Assuming that all missing dimension elements should exist, create new rows in the appropriate dimesion table.

       insert into dimension1 ( primarykey, value, describtion)
            select
                seq1.nextval, v1.dim1_value, 'dummy description'
            from
               ( select dim1_value from temp_table
                 minus
                 select value from dimension1) v1

 repeat for each dimension table

 insert the fact table

 delete temp table data

 manually clean up the dimension information as appropriate.

you need indexes on the dimension tables to speed the join to the temp table.
Then use PL/SQL to merge the dimension tables and the fact tables.

--
Michael Krolewski
Rosetta Inpharmatics
mkrolewski_at_rosetta.org
              Ususual disclaimers


Sent via Deja.com http://www.deja.com/
Before you buy.
Received on Wed Dec 06 2000 - 19:54:19 CET

Original text of this message