Re: Question on RMAN restore from tape

From: Keith Moore <keithmooredba_at_gmail.com>
Date: Wed, 18 Dec 2019 13:57:37 -0600
Message-Id: <142FC732-C529-4BA2-8FEE-F25ACDBE67A7_at_gmail.com>



Thanks! I am also responding to oracle-l so maybe that will go through.

I agree this performance is poor, about 1 TB per DAY! It is only one channel since I’m only restoring one file. But even with 16 channels and perfectly linear scaling, that is still only somewhere around 700 GB per hour.
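
As a rough back-of-the-envelope check of those numbers, here is a minimal sketch. It assumes decimal units (1 TB = 1000 GB) and perfectly linear scaling across channels, which real restores are unlikely to achieve; the function name is only for illustration.

```python
def hourly_throughput_gb(tb_per_day: float, channels: int = 1) -> float:
    """Convert a TB/day rate into GB/hour.

    Assumes decimal units (1 TB = 1000 GB) and perfectly linear
    scaling with the number of channels (an optimistic assumption).
    """
    return tb_per_day * 1000 / 24 * channels


# Observed rate with a single channel: about 1 TB per day.
single = hourly_throughput_gb(1.0)

# Best case with 16 channels, if scaling were perfectly linear.
sixteen = hourly_throughput_gb(1.0, channels=16)

print(f"1 channel:   {single:.1f} GB/hour")
print(f"16 channels: {sixteen:.1f} GB/hour")
```

So a single channel works out to roughly 42 GB per hour, and 16 channels to roughly 667 GB per hour in the best case.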

I also agree there are better solutions but this is the Oracle cloud (at customer) solution that was sold. The object storage is basically a ZFS storage appliance and I initially assumed it was attached to the Exadata (cloud at customer) through a direct Infiniband connection but that is not the case, it goes through the client’s network.

Yes, I have run both an RMAN debug and a trace on the sbtio driver. As for Oracle being able to use that to determine the issue, well….

But, whatever the cause, my question is more about how other hardware / software solutions work. I would think that, at least with a hardware solution, the storage device would be able to know what data from a backup set was required and not need to download everything over the network only to determine that 99% of it isn’t needed.

For cloud storage that may not be possible but I’m not positive.
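
To put numbers on that 99% figure, a small sketch using the sizes from my original message quoted below (a backup set of around 500 GB and a 5 GB archivelog). The function name is hypothetical, and the sizes are the approximate ones reported, not measured values.

```python
def read_amplification(backup_set_gb: float, wanted_gb: float) -> tuple[float, float]:
    """Return (amplification factor, wasted fraction) for a restore that
    must download the whole backup set to extract one file from it."""
    wasted_gb = backup_set_gb - wanted_gb
    return backup_set_gb / wanted_gb, wasted_gb / backup_set_gb


# Approximate sizes from the thread: ~500 GB backup set, 5 GB archivelog.
factor, wasted_fraction = read_amplification(500, 5)

print(f"Data transferred per GB restored: {factor:.0f}x")
print(f"Fraction of the transfer thrown away: {wasted_fraction:.0%}")
```

With those sizes, the restore transfers about 100 times the data it actually needs. A smaller filesperset shrinks the backup set and therefore this ratio.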

Keith

> On Dec 18, 2019, at 1:36 PM, Mladen Gogala <gogala.mladen_at_gmail.com> wrote:
>
> It looks like the list is rejecting emails with attachments. Sorry for responding privately, but my first message has bounced.
>
> Regards
>
>
>
> -------- Forwarded Message --------
> Subject: Re: Question on RMAN restore from tape
> Date: Wed, 18 Dec 2019 13:34:15 -0500
> From: Mladen Gogala <gogala.mladen_at_gmail.com>
> To: oracle-l_at_freelists.org
>
> Hi Keith!
>
> Using the SBT interface to point to the storage is perfectly normal. The typical configuration is to back up to S3 storage and make a long-term copy to Glacier. Glacier is up to 3 times slower than S3, primarily because of the 2-pass deduplication algorithm. I humbly apologize for using the terminology from the "wrong cloud", but I don't have too much experience with Oracle Cloud. On the other hand, I have more than 15 deployments on AWS and RDS, so AWS is something that I am fairly familiar with.
>
> However, the performance is abysmal. For the usual 10 Gb Ethernet, the expected throughput is around 3 TB per hour. I would run RMAN with the "debug" option. Sample output is attached. Oracle should be able to determine where the bottleneck is. As general advice, I would strongly prefer to use better-established backup products, rather than the "cloud module", which is relatively new and relatively untested. Products like Commvault, Rubrik, TSM, DD Boost, Avamar or NetBackup have a much longer tradition, and even they encounter a bug here and there. There is no bug-free software. Of course, I have the most experience with Commvault, but the other products also have a long tradition on the market and are fairly decent. Rubrik and Veeam are increasingly popular. BTW, I no longer work for Commvault. I would still strongly prefer them, because I know their product very well and it's a very good product, but the decision is yours, not mine.
>
> Interestingly enough, I will be doing a database deployment on Azure tomorrow morning.
>
> Regards
>
> On 12/18/19 12:59 PM, Keith Moore wrote:

>> I am working for a client that has an Exadata Cloud at customer. We just migrated a large database and I am setting up backups. The backups go to the Object Storage that is part of the Cloud at Customer environment and backups and restores are done through a tape interface.
>> 
>> As part of the testing, I tried to restore a single 5 GB archivelog and eventually killed it after around 12 hours.
>> 
>> After tracing and much back and forth with Oracle support, it was found that the issue is related to filesperset. The archivelog was part of a backup set with 45 archive logs and was around 500 GB in size. To restore the archive log, the entire 500 GB has to be downloaded, throwing away what is not needed.
>> 
>> The obvious solution is to reduce filesperset to a low number.
>> 
>> But, my question for people with knowledge of other backup systems (hello Mladen) is whether this is normal. It is horribly inefficient for situations like this. Since object storage is “dumb”, maybe there is no other option but it seems like this should be filtered on the storage end rather than transferring everything over what is already a slow interface.
>> 
>> Keith
>> --
>> http://www.freelists.org/webpage/oracle-l
>> 
>> 

> --
> Mladen Gogala
> Database Consultant
> Tel: (347) 321-1217
>
>
> <rman_debug.out>
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Dec 18 2019 - 20:57:37 CET
