Re: copying large data on production environment

From: joel garry <joel-garry_at_home.com>
Date: Thu, 30 Sep 2010 09:45:57 -0700 (PDT)
Message-ID: <a2fe16c0-a27b-46fa-82c9-9b2fb12243f2_at_y12g2000prb.googlegroups.com>



On Sep 30, 2:06 am, "mick.w" <mic..._at_poczta.fm> wrote:
> Thanks for your response!
>
> > Well, something has got to give.  First of all, which exact versions of everything are you using?  
> Oracle 10.2.0.4.0
> > What kind of volumes of data are we talking about?  
> What do you mean?

100 rows? A million rows? 100 million rows? A terabyte? A petabyte?

At some combination of configuration and data volume, it may be quicker to move the data out of Oracle and reload it, if you can run on the server. This is less of a consideration than it used to be.
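
For what it's worth, and only if someone can run this on the server, a
minimal sketch of that unload/reload idea -- the file path, target table
name and credentials below are mine, purely for illustration:

-- unload.sql: run in SQL*Plus on the server, spools one partition to a flat file
set pagesize 0 linesize 32767 trimspool on feedback off
spool /tmp/part1.dat
select a || ',' || b from abc partition (part1);
spool off

-- part1.ctl (SQL*Loader control file):
load data
  infile '/tmp/part1.dat'
  append into table target_1
  fields terminated by ','
  (a, b)

-- then reload with direct path (subject to the usual direct path restrictions):
-- sqlldr userid=scott/tiger control=part1.ctl direct=true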

> > Are there times when the system is less heavily loaded?
> Yes
> > How long do you have to do it?
>
> As I wrote, I can do this in a few parts. Probably client side will do
> this at the weekends or nightly hours, so the time is enough.
> > Are you using any tools like Informatica, Toad, or just sql, or pl/sql?
> Just sql and pl/sql
> >  Are you on the server or is a client machine involved?
>
> I am not on the server. I need to write scripts which client side will
> execute

So then you probably don't want to do anything that will involve moving data to the client.

> > Does the data need any transform?
>
> I need to split the data into two tables with the same structure. The
> primary key must be changed.
> > Are constraints the same?
> Yes
> > What kind of storage are you using?
>
> Storage will be enough.

Enough doesn't say what kind. I ask because in mass data movement, some kinds of storage benefit from various types of physical separation. If all you have is one controller on one san, this may be a bottleneck as massive reads and writes conflict. If you have multiple physical devices and multiple controllers, where you put things can make a big difference. If you have two controllers and multiple logical volumes on a san, there might be things a sysadmin can do.
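
Purely as an illustration of physical separation (the paths and names
below are made up), the copy's target could live in a tablespace whose
datafile sits on a different device or controller than the source:

create tablespace copy_target_ts
  datafile '/u05/oradata/PROD/copy_target_ts01.dbf' size 10g;

-- empty clone of the source structure, placed in that tablespace
create table target_1 tablespace copy_target_ts
  as select * from abc where 1 = 0;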

>
> Will executing the command insert into ... select a, b from abc
> partition(part1) produce a big slowdown? Could the 'append' hint
> improve performance of this operation? (The tables will be empty and
> no records will be deleted from them for now.)

A slowdown? It depends. It can only be determined empirically under your system's load. The answer is often yes, since systems are tuned for online operations with small transactions, and this is a batch operation with large transactions. But you would also need to tell us what kind of apps are running on the system.
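
As a rough sketch of the statement itself -- the target table, column
names and sequence are mine, just to make it concrete -- the
per-partition copy with the primary key regenerated from a sequence
would look something like:

-- new_pk_seq and target_1 are illustrative; adjust to your real names
insert into target_1 (id, a, b)
select new_pk_seq.nextval, a, b
  from abc partition (part1);

commit;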

Append may or may not make a difference, since its purpose is to add data above the high water mark, and various things will silently turn it off. What might make a difference is using a direct path load, which has specific requirements. If you were working on the server, there are data volume and configuration combinations where it makes sense to take the data out of the db and load it back in with a direct path load, though other combinations favor create table as. As a client, you are more constrained, and whether you can do better with parallel operations... depends. Nologging, bypassing the buffer cache and the lack of undo could help a lot if the append works.
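
If the append does take effect, the direct path version of the same
statement might look roughly like this (again, illustrative names; and
nologging is a recoverability trade-off your DBA has to sign off on):

-- optional: cut redo for the load; take a backup afterwards
alter table target_1 nologging;

-- optional: try parallel direct path (must be enabled per session)
alter session enable parallel dml;

insert /*+ append parallel(t, 4) */ into target_1 t (id, a, b)
select new_pk_seq.nextval, a, b
  from abc partition (part1);

-- a direct path insert must be committed (or rolled back) before the
-- same session can read the table again
commit;

alter table target_1 logging;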

See http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1211797200346279484 for a decent discussion of insert append. See oracledoug.com for some papers about parallelization. Some variants on how much redo is generated: http://teymur-hajiyev.blogspot.com/2010/05/relation-between-oracle-direct-path.html

jg

--
_at_home.com is bogus.
So that's what iPods are for.
http://www.signonsandiego.com/news/2010/sep/29/man-arrested-sex-case-involving-encinitas-teen/