Re: will there be data loss?

From: Steve Howard <stevedhoward_at_gmail.com>
Date: Thu, 28 Feb 2008 07:50:52 -0800 (PST)
Message-ID: <c9f8319b-63e1-4e22-b826-1e202add8973@41g2000hsc.googlegroups.com>


On Feb 28, 8:58 am, Ben <benal..._at_yahoo.com> wrote:
> On Feb 27, 7:46 pm, hpuxrac <johnbhur..._at_sbcglobal.net> wrote:
>
>
>
> > On Feb 27, 2:33 pm, Ben <benal..._at_yahoo.com> wrote:
>
> > > 10.2.0.2 EE, AIX 5.2 64bit. RMAN 10.2.0.2
>
> > > I have run the scenario and know the outcome, but I was somewhat
> > > surprised by it.
>
> > > The database is in archivelog mode. I start up RMAN, connect to the
> > > target and the catalog, and take a hot backup of the USERS tablespace,
> > > which is made up of one datafile.
> > > I then proceed to log in via sqlplus and create a new table
>
> > > balvey.testing( txt varchar2(200), numb1 number) tablespace users;
>
> > > and insert 4 records and commit
>
> > > alter system switch logfile
>
> > > select count(*) from balvey.testing and see the 4 records.
>
> > > I then exit sqlplus and, using OS commands, delete the one and only
> > > datafile that makes up the tablespace.
>
> > > If I log into the database again and insert records into the newly
> > > created balvey.testing and commit. Will the data that was inserted
> > > after deleting the datafile be lost?
>
> > Does a commit require only that the online redo log buffer space that
> > belongs to the transaction gets flushed to disk or does a commit also
> > force data buffers to be written to datafiles?
>
> > If you know the answer to that one ... it might give you a hint about
> > possible outcomes.
>
> From what I gathered in the concepts guide, it looks like a commit
> indeed only has to succeed in writing the redo buffers to the log
> files. But isn't there more to it than that?
>
> This all came up because of an upgrade to our Veritas NetBackup
> software and wanting to do a test recovery. I went into it expecting
> the commit to fail and the data never to be saved, so that only data
> committed before the file deletion would be recoverable.
>
> What I found was a mixed bag of treats. I believe it is what you guys
> are alluding to, partially.
>
> When I logged back in and performed the insert and commit it was all
> successful. I could select from the table and see the new records. I
> inserted another row and then updated all the rows and committed.
> Still successful.
>
> Then I selected from balvey.big_table (a 1 million row table): FAILURE.
> I got the ORA-01116 "error in opening database file" message. Then I
> tried another insert & commit into balvey.testing: SUCCESS. I went on
> inserting a few more rows and committing. Then I tried an insert that
> selected about 400 rows from dba_objects. Finally I got a failure.
>
> This leads me to believe that since the table was small and the
> database was fairly dormant as far as activity goes, the entire table
> was in the buffer cache and the commit was being written to the redo
> log files. Only when it was going to insert enough data to grab
> another data block and bring it into the buffer cache did it fail.
>
> After all this I went back to read up on the processes DBWR, LGWR,
> and CKPT, and that still left me a little confused.
> From the Oracle University manual for Fundamentals I course:
>
> "DBWn defers writing to the data files until one of the following
> events occurs:
> - Incremental or normal checkpoint
> ......."
>
> "LGWR performs sequential writes from the Redo Log Buffer to the
> online redo log file under the following situations:
> - when a transaction commits
> ...."
>
> "Every three seconds the CKPT process stores data in the control file
> to identify that place in the online redo log file where recovery is
> to begin, which is called a checkpoint. The purpose of a checkpoint is
> to ensure that all of the buffers in the Database Buffer Cache that
> were modified prior to a point in time have been written to the data
> files.
>
> Note: CKPT does not write data blocks to disk or redo blocks to the
> online redo log files."
>
> So if a checkpoint happens every 3 seconds and DBWn writes blocks to
> the datafiles on a checkpoint, why did it not fail until I tried to
> insert enough data to grab another data block?
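
Just to restate your test in one place, this is roughly the sequence as I
read it (a condensed sketch based on your description; the literal values
and the datafile name are made up, and I have not re-run the Oracle side
myself):

RMAN> BACKUP TABLESPACE users;   -- hot backup, database stays open

SQL> CREATE TABLE balvey.testing (txt VARCHAR2(200), numb1 NUMBER) TABLESPACE users;
SQL> INSERT INTO balvey.testing VALUES ('row 1', 1);   -- repeated for 4 rows
SQL> COMMIT;
SQL> ALTER SYSTEM SWITCH LOGFILE;
SQL> SELECT COUNT(*) FROM balvey.testing;              -- shows 4

-- at the OS prompt, remove the lone USERS datafile (path depends on your layout)
$ rm users01.dbf

SQL> INSERT INTO balvey.testing VALUES ('row 5', 5);
SQL> COMMIT;                                           -- still succeeds
SQL> SELECT COUNT(*) FROM balvey.big_table;            -- ORA-01116
SQL> INSERT INTO balvey.testing
     SELECT object_name, object_id FROM dba_objects WHERE ROWNUM <= 400;  -- finally fails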

I think Joel nailed it. My guess is that the block you were writing to was
already in the buffer cache; the "new" one it had to get from disk.

I just successfully ran the following on AIX 5.2 (sorry, no full C compiler
on the box, so I did it the poor man's way). There may be a system call or
flag that would make these writes to a deleted file fail, but the Oracle
kernel apparently does not use it. I guess they figured that if I delete a
datafile and they then issue a "phantom" write to it, I can still recover
with a normal recovery.

import java.io.*;

class file {
  public static void main(String[] args) {
    try {
      // open /tmp/foo.txt and hold the descriptor open across the sleep
      FileOutputStream fin = new FileOutputStream("/tmp/foo.txt");

      fin.write(1);
      System.out.println("Successfully wrote 1, now delete the file in the OS...");

      Thread.sleep(60000);  // wait 60 seconds to give time to delete the file

      fin.write(2);         // still succeeds even though the file has been deleted
      System.out.println("Successfully wrote 2...how did this happen???");

      fin.close();
    }
    catch (Exception e) {
      e.printStackTrace();
    }
  }
}
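
To try it yourself: save it as file.java, compile it with javac and run it,
then rm /tmp/foo.txt from another session during the 60-second sleep. The
second write still goes through, because on AIX (as on any Unix) rm only
removes the directory entry; the open file descriptor keeps the inode and
its blocks alive until the process closes it. I assume the datafile writes
in your test were succeeding the same way.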

Regards,

Steve