Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Uploading data for star schema
In article <3A2D47A9.98FAE0C0_at_weinerfamily.org>,
sw_at_weinerfamily.org wrote:
> I have a textfile that contains rows of data in a non-normalized
fashion
>
> What I need to do is load that data into a star schema.
> In order to this I need to load each of the dimension tables and get
the
> ids of the entry. Of course if the row already exists I need to just
get
> the ids. After I have done this on all measure (8 of them). I need to
> stuff the ids and other data into a record in a fact table.
>
> Currently, I am doing this using a VB program and it is incredibly
slow.
> Is there a way to this using SQLLoader or some other technque that
would
> be more efficient?
>
> NOTE - probably doesn't matter but I am not using Oracle OLAP.
>
> Thank you.
>
>
Another process is handle some of the processing upfront in the original database. As you add records, time stamp them to determine which ones need to be loaded. Review the dimension (measure) fields to see if they are in the table -- then create new rows in the dimension fields are you go including getting sequences out to the Oracle rdbms.
When you load the data, reload all of the dimension data. You would have to disable the contraints on the fact table. Then start loading the fact table. You should be able to create from the original datasource a file with each record in the fact table defined. SQLLoader can then be used to add the new records quickly.
Alternatively, you could make the Oracle side do more of the work. Use sqlloader to load the raw data into a temporary table ( not part of the normal schema).
In a PL/SQL block:
Query the temp table for missing dimension elements. Assuming that all missing dimension elements should exist, create new rows in the appropriate dimesion table.
insert into dimension1 ( primarykey, value, describtion) select seq1.nextval, v1.dim1_value, 'dummy description' from ( select dim1_value from temp_table minus select value from dimension1) v1
repeat for each dimension table
insert the fact table
delete temp table data
manually clean up the dimension information as appropriate.
you need indexes on the dimension tables to speed the join to the temp
table.
Then use PL/SQL to merge the dimension tables and the fact tables.
-- Michael Krolewski Rosetta Inpharmatics mkrolewski_at_rosetta.org Ususual disclaimers Sent via Deja.com http://www.deja.com/ Before you buy.Received on Wed Dec 06 2000 - 12:54:19 CST