RE: Question on Corrupted blocks

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Tue, 6 Jun 2017 10:25:42 -0400
Message-ID: <046d01d2ded0$c7c954f0$575bfed0$_at_rsiz.com>



Not answering your question, but a side note when you write “all data files.” I have recently run into an exasperating situation with freespace.  

IF you experience an ora-600 abort during a parallel direct insert, space that contains internally corrupt blocks may be returned to your freespace (at least for bitmap [called auto] freespace, I have never seen it for freelists [called manual].)  

None of the tools of which I am aware do a scan of freespace. The symptom of the issue is when that space is attempted to be used you get a new ora-600 reporting a corrupt block. Creating a dummy table and allocating extents to cover the corrupt space is effective as a bridge until you can repair the damaged blocks (anyone have a good tool for repairing freespace blocks?) or relocate data out of the damaged file. That was the interim work-around I thought of since allocate extent does not check the blocks (apparently).  

I’ll happily add to my kitbag any tools anyone can identify (especially if I missed any and Oracle support missed any that are authorized while dealing with this SR.)  

One by one any ora-600 aborts that causes returning “bad blocks” to freespace can be handled normally via SR. But so far fixing the point problem of failing the insert as ora-600 instead of trapping it as a specific error has been ignored as each “don’t let it happen in the first place” fix has appeared. Once the customer’s problem is solved the SR is closed, so the general issue of cleaning up the blocks before returning them to the free list on any internally aborted direct insert has not been addressed.  

I have not been able to reproduce the issue by manually terminating a direct insert, so the normal case is apparently already handled just fine. Indeed it may be only a few corner cases that fail to do the cleanup.  

No data loss is associated with the problem, but you can get into a situation where inserts fail when new freespace with a declaration of corrupt blocks. Since any insert after that point can potentially trigger the issue and a scan of the objects associated with the insert report a clean slate, the issue is not obvious unless you happen to think of it.  

Anyway, freespace scanners and cleaners would be useful, so if you know of any I’m all ears.  

I also look forward in case Tim has an opportunity to write a comprehensive answer to your actual question.  

mwf  

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Ashoke Mandal Sent: Monday, June 05, 2017 3:50 PM
To: tim.evdbt_at_gmail.com
Cc: ORACLE-L
Subject: Re: Question on Corrupted blocks  

Hi Tim, I have been busy in doing production upgrade and didn't follow-up on your email. Luckily the block corruption I identified in one of my test databases and didn't find any corruption two of the production databases, which I upgraded recently.  

As I mentioned earlier that dbv utility identified that some blocks on some files were corrupted and I understand that dbv is not the right way to check block corruption since the datbase stays down when I run the dbv command.  

So please suggest how to check block/data corruption on the entire database (including all data files).  

Thanks,

Ashoke  

On Wed, May 24, 2017 at 4:30 PM, Tim Gorman <tim.evdbt_at_gmail.com> wrote:

Ashoke,

Please consider "dbv" as more of a *screening* for block corruption than a definitive diagnosis. For one thing, "dbv" does not connect to the database instance, so it cannot determine whether the buffer in the Buffer Cache in the SGA is also corrupted.

To determine if a datafile block is truly corrupted, please connect to SQL*Plus as SYSDBA and run the following...

ALTER SYSTEM DUMP DATAFILE [ file# | 'filename' ] BLOCK [ block# | MIN block# BLOCK MAX block# ]

...which in your situation might mean running the following commands, based on "dbv" output...

alter system dump datafile 40 block 35586;

alter system dump datafile 40 block 35598;

alter system dump datafile 40 block 35599;

alter system dump datafile 40 block 35600;

alter system dump datafile 40 block 35601;

alter system dump datafile 40 block 35603;

alter system dump datafile 40 block 35605;

alter system dump datafile 40 block 35607;

alter system dump datafile 40 block 35834;

alter system dump datafile 40 block 36579;

...or maybe something like this instead...

alter system dump datafile 40 block min 35586 block max 36579;

...but for that you'd have to be careful only to pay attention to the blocks listed above as corrupted within that range from 35586..36579, of course.

The dumped block information will go to a trace file in your USER_DUMP_DEST, so it might be wise to first run a command like "alter session set tracefile_identifier = dump" so that the trace file(s) generated from your ALTER SYSTEM DUMP will have the file-name suffix of "_DUMP.trc", making it easier to identify them inside the USER_DUMP_DEST directory on the database server.

If the indicated database blocks are indeed corrupted, then the output in the trace file will state so. If instead the ALTER SYSTEM DUMP command does not list the block as corrupted, then it was a "false positive" from the "dbv" program.

Hope this helps...

-Tim

On 5/24/17 15:10, Ashoke Mandal wrote:

Hello,  

I shutdown the database, ran dbverify($ORACLE_HOME/bin/dbv) and it reported the following block corruption in the respective logfile (as listed below - 1 through 4).  

But "select * from v$database_block_corruption;" doesn't return any rows.

Q1. Could you provide me any idea what are the possible cause for these corruption?

Q2. What is best way to check for such block corruption before I upgrade a production database?

Q3. Is it possible there might be some corrupt blocks but we don't receive any complain from anywhere (user or alert log etc.)?    

  1. $ORACLE_HOME/bin/dbv file=/u03/oradata/file_1_ind_idx02.dbf feedback=100 logfile=/<logfile_location>/file_1_ind_idx02.log blocksize=8192

Corrupt block relative dba: 0x0a80271a (file 42, block 10010)

Total Pages Marked Corrupt : 1  

2. $ORACLE_HOME/bin/dbv file=/u03/oradata/file_2_data.dbf feedback=100 logfile=/<logfile_location>/file_2_data.log blocksize=8192  

Corrupt block relative dba: 0x0a008b02 (file 40, block 35586)

Corrupt block relative dba: 0x0a008b0e (file 40, block 35598)

Corrupt block relative dba: 0x0a008b0f (file 40, block 35599)

Corrupt block relative dba: 0x0a008b10 (file 40, block 35600)

Corrupt block relative dba: 0x0a008b11 (file 40, block 35601)

Corrupt block relative dba: 0x0a008b13 (file 40, block 35603)

Corrupt block relative dba: 0x0a008b15 (file 40, block 35605)

Corrupt block relative dba: 0x0a008b17 (file 40, block 35607)

Corrupt block relative dba: 0x0a008bfa (file 40, block 35834)

Corrupt block relative dba: 0x0a008ee3 (file 40, block 36579)

Total Pages Marked Corrupt : 10  

3. $ORACLE_HOME/bin/dbv file=/u03/oradata/file_3_idx01.dbf feedback=100 logfile=/<logfile_location>/file_3_idx01.log blocksize=8192  

mdl_ind_medium_idx01.log:Corrupt block relative dba: 0x090075a5 (file 36, block 30117)

mdl_ind_medium_idx01.log:Total Pages Marked Corrupt : 1  

4. $ORACLE_HOME/bin/dbv file=/u03/oradata/file_3_ind_large_idx01.dbf feedback=100 logfile=/<logfile_location>/file_3_ind_large_idx01.log blocksize=8192  

wip_ind_large_idx01.log:Corrupt block relative dba: 0x02c0f581 (file 11, block 62849)

wip_ind_large_idx01.log:Corrupt block relative dba: 0x02c0f583 (file 11, block 62851)

wip_ind_large_idx01.log:Corrupt block relative dba: 0x02c0f586 (file 11, block 62854)

wip_ind_large_idx01.log:Total Pages Marked Corrupt : 3  

Thanks,

Ashoke    

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jun 06 2017 - 16:25:42 CEST

Original text of this message