Re: ORA-8102, why?

From: Cristian Cudizio <cristian.cudizio_at_yahoo.it>
Date: Thu, 17 Jan 2008 00:20:32 -0800 (PST)
Message-ID: <b57c5858-9a5d-4178-a5a1-39e44115d1f0@v67g2000hse.googlegroups.com>


On Jan 16, 7:33 pm, joel garry <joel-ga..._at_home.com> wrote:
> On Jan 16, 9:15 am, Cristian Cudizio <cristian.cudi..._at_yahoo.it>
> wrote:
>
>
>
> > On Jan 16, 5:20 pm, "shakespeare" <what..._at_xs4all.nl> wrote:
>
> > > "Cristian Cudizio" <cristian.cudi..._at_yahoo.it> schreef in berichtnews:71a7ef21-5cb7-4b44-948b-466a5d7d1031_at_s13g2000prd.googlegroups.com...
>
> > > > On Jan 16, 3:11 pm, "shakespeare" <what..._at_xs4all.nl> wrote:
> > > >> "Cristian Cudizio" <cristian.cudi..._at_yahoo.it> schreef in
> > > >> berichtnews:26ce818d-7d12-4eb7-b0c1-221cf4aae15d_at_x69g2000hsx.googlegroups.com...
>
> > > >> > Hello,
> > > >> > i've an Oracle database with RAC 2 node version 10.1.0.5 on Linux
> > > >> > Suse SLES9 64 bit. Recently i've
> > > >> > got an error ORA-8102 during the execution of a procedure that was
> > > >> > deleting some records. So i have found that
> > > >> > there are two records (the table has over 90 milions of records) for
> > > >> > which there are no corresponding index entry on one of the indexes.
> > > >> > I've read metalink notes 395006.1 and 1081343.6 and i can't understand
> > > >> > how can it happen. The note 395006.1 say's about "lost write ore some
> > > >> > similar OS issue" but how can be that there are no database block
> > > >> > corruptions? The notes say also that dropping and recreating the index
> > > >> > does not solve the issue. How can it be? is there anyone that can
> > > >> > explain me that? Since the problem occurs only when i try to change
> > > >> > the value of the indexed column for the two record or to delete the
> > > >> > records i've choose to let things as are now, because exporting e re-
> > > >> > importing the entire table woud be to expensive.
>
> > > >> > Thanks,
>
> > > >> > Cristian Cudizio
> > > >> >http://oracledb.wordpress.com
> > > >> >http://cristiancudizio.wordpress.com
>
> > > >> Are you using a function based index on this table?
>
> > > >> Shakespeare
>
> > > > No, it is a normal b-tree index on one field of type DATE. there are
> > > > other 5 indexes on other field and they are correct.
>
> > > > Thanks,
> > > > Cristian Cudizio
> > > >http://oracledb.wordpress.com
> > > >http://cristiancudizio.wordpress.com
>
> > > I think you misunderstand note 395006.1: it does not say dropping and
> > > recreating the index won't help. It's a description of the symptom.
> > > You could drop and recreate the index and see if the error remains.
>
> > > Shakespeare
>
> > Yes you are right, i misunderstand note 395006.1, it say that are
> > "other" indexes corrupted
> > with same problem.
> > But still remains that i found curious that there is no block
> > corruption detected by oracle
> > but there is a logical corruption that prevents deletion of the
> > records or updating of field
> > whose values are missing in the index. Those records are there from 02
> > Jan 2006 and have
> > never caused problems until we tried to delete them.
>
> > Thanks,
> > Cristian Cudiziohttp://oracledb.wordpress.comhttp://cristiancudizio.wordpress.com-Hide quoted text -
>
> > - Show quoted text -
>
> Have you tried CHECK LOGICAL with RMAN? Just curious, if the problem
> has been there since the index was written, not much RMAN could do to
> fix it. But if you had a hardware problem at one time it might show
> other time-bombs.
>
> Also curious if you have some sort of cached write SAN and if you are
> using direct i/o. Lost write errors are scary things that SAN
> salespeople say never happen, because if you can't trust the SAN,
> well, how can Oracle?
>
> jg
> --
> @home.com is bogus.
> "Flashlights are tubular metal containers kept in a flight bag to
> store dead batteries."

Interesting tip, i've to admit that i haven't seen before the "CHECK LOGICAL" option of RMAN.
I've tried with BACKUP CHECK LOGICAL VALIDATE DATAFILE 4; and there are no messages.
On the other hand also block dump does not highlight any anomaly. The keys are simply missing.

We have ASM with ASMLib over RAW devices so, for what i know, it uses DIRECT I/O, and if not
i don't know as to verify it.
I don't know much about the SAN i could tell you that it is an HP MSA1000 but i don't know
anything about caching. But i think that if oracle could not detect any type of logical or physical
corruption the absence of keys in the index is an handled situation, otherwise we have been "very
lucky".

Thanks,
 Cristian Cudizio
http://oracledb.wordpress.com
http://cristiancudizio.wordpress.com Received on Thu Jan 17 2008 - 02:20:32 CST

Original text of this message