Re: Fastest way to count exact number of rows in a very large table

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Mon, 05 Oct 2020 15:55:36 -0400
Message-ID: <2665cbdcb3fcdf1c83c859c327136637e4d052ec.camel_at_gmail.com>



You can also try the following:

SELECT /*+ PARALLEL(16) */ 100000*COUNT(*) FROM <table_name> SAMPLE(0.001);

That would give an almost correct count of the rows in the table, since
SAMPLE(0.001) reads roughly 0.001% of the rows (hence the 100000 multiplier).
Accuracy would be similar to

SELECT NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME = '<TABLE_NAME>';

BTW, has anyone played with the APPROX_FOR_COUNT_DISTINCT parameter in 19c?

Regards
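A quick sketch of the sampling idea against the table from this thread
(test_data; the degree 16 and the 0.001 sample size are just example numbers,
tune them to taste). SAMPLE BLOCK samples whole blocks instead of individual
rows, which is usually faster at the cost of a less uniform sample:

```sql
-- Row sampling: reads ~0.001% of rows, then scales the count back up.
SELECT /*+ PARALLEL(16) */ 100000 * COUNT(*) AS approx_rows
FROM   test_data SAMPLE (0.001);

-- Block sampling: faster, slightly less uniform.
SELECT /*+ PARALLEL(16) */ 100000 * COUNT(*) AS approx_rows
FROM   test_data SAMPLE BLOCK (0.001);

-- For comparison, the estimate already sitting in the dictionary:
SELECT num_rows, last_analyzed
FROM   user_tables
WHERE  table_name = 'TEST_DATA';
```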
On Mon, 2020-10-05 at 19:40 +0000, Reen, Elizabeth wrote:
> Since it is partitioned, why don’t you just count each
> partition separately? Have you run stats on the old partitions?
> Assuming that prior years do not change, you should be able to get a
> count there. If you do an import, then you can get the number of rows
> from the log. A transportable tablespace will not mount if there is
> something wrong.
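> A sketch of that per-partition approach (the partition name below is
> made up; list the real ones from USER_TAB_PARTITIONS):
>
> ```sql
> -- Frozen partitions (prior years) only need to be counted once:
> SELECT /*+ PARALLEL */ COUNT(*) FROM test_data PARTITION (p_2019_12);
>
> -- Stats-based per-partition counts to compare source vs. target:
> SELECT partition_name, num_rows, last_analyzed
> FROM   user_tab_partitions
> WHERE  table_name = 'TEST_DATA'
> ORDER  BY partition_position;
> ```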
>
> Liz
>
> Liz Reen
>
> CPB Database Administration
>
> Dev Servers: Oracle | OraaS | Golden Gate | Sybase | MS-SQL
> | MSaaS | Neo4j | HBase - Prod Servers: Golden Gate | Neo4j
> | HBase - Team
> Email: *CPB Database - Hotline: 1 718 248 1416
>
> ServiceNow For: Oracle | OraaS | Golden Gate :
> ICG NA EX US CPB ORACLE APPLICATION DBAS – For Sybase | MS-SQL |
> Neo4j | HBase:
> ICG NA EX CPB SYBASE APPLICATION DBAS
>
>
> From: [gmail.com] Ashoke Mandal <ramukam1983_at_gmail.com>
>
>
> Sent: Monday, October 5, 2020 3:14 PM
>
> To: Reen, Elizabeth [ICG-IT]
>
> Cc: andysayer_at_gmail.com; ahmed.fikri_at_t-online.de; list, oracle
>
> Subject: Re: Fastest way to count exact number of rows in a very
> large table
>
>
> Hello
> Ahmed/Andy/Jackson/Mark/Gogala/Liz, Thanks for your responses to my
> posting. Even though my question was about improving the time of a
> query that counts the rows in a big table, you have also brought up
> many good points related to cross-platform migration.
>
> Here is some information regarding my table:
>
>
> We have used transportable tablespace for data migration.
> The primary key consists of three columns (UT_ID, UT_SEQ, TEST_DATE).
> This table is partitioned by date and has one partition for every
> month, so 12 partitions per calendar year, with data for the last 20
> years.
>
> After adding a parallel hint as described below, the query time went
> down from 2.2 hours to 42 min.
> Let me know if you have any more recommendations to improve the query
> time for this select statement.
>
> select /*+ parallel */ to_char(count(*), '999,999,999,999') from
> test_data;
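> For what it's worth, forcing an explicit degree (assuming the box can
> sustain it) sometimes beats the auto-computed one, and V$PQ_SESSTAT
> shows whether the hint actually took effect:
>
> ```sql
> SELECT /*+ PARALLEL(test_data, 32) */
>        TO_CHAR(COUNT(*), '999,999,999,999')
> FROM   test_data;
>
> -- Run right after the query in the same session:
> SELECT * FROM v$pq_sesstat;
> ```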
>
> Ashoke
>
> On Mon, Oct 5, 2020 at 10:40 AM Reen, Elizabeth <
> elizabeth.reen_at_citi.com> wrote:
>
> >
> > We just completed such a transition. We kept the Oracle version
> > the same so we could see the impact of Linux. Transportable
> > tablespaces were how we did it. We were able to move a 17-terabyte
> > database in under 10 hours.
> >
> >
> > Liz
> >
> >
> > From: [External]
> > oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>
> > On Behalf Of [External] Andy Sayer
> >
> > Sent: Friday, October 2, 2020 3:09 PM
> >
> > To: ahmed.fikri_at_t-online.de
> >
> > Cc: list, oracle; ramukam1983_at_gmail.com
> >
> > Subject: Re: Fastest way to count exact number of rows in a very
> > large table
> >
> >
> > Just because a table has the same number of rows doesn’t mean it
> > has the same data. With 108 billion rows, your data is going to be
> > changing quickly; to get accurate counts at the right point in
> > time, you’re going to end up keeping your application offline for a
> > window before and after your migration.
> >
> > What you need to do is determine where you expect data to go
> > missing and work out a way to check.
> >
> > This will depend on how you’re doing your migration. I would
> > suggest you use Cross-Platform Transportable Tablespaces (Doc
> > ID 371556.1), as that would allow you to do a physical import and
> > just convert the files to the right endianness. This starts by
> > making sure all data has been written to your data files (so they
> > can be made read-only on the source system). As you’re working with
> > the physical data files rather than the logical data (rows in
> > tables), the only way you’re going to lose rows is by corrupting
> > your files. You can check for corruption using RMAN once you’ve
> > imported the converted files. No need to count all your rows, and
> > no need to hope that that’s all you need to compare.
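> > As a concrete sketch of that check: run RMAN's VALIDATE against the
> > imported files, then query the corruption view (an empty result
> > means the files are clean):
> >
> > ```sql
> > -- In RMAN first:  RMAN> VALIDATE DATABASE;
> > -- Then, in SQL*Plus on the target:
> > SELECT file#, block#, blocks, corruption_type
> > FROM   v$database_block_corruption;
> > ```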
> >
> > Hope that helps,
> >
> >
> > Andy
> >
> > On Fri, 2 Oct 2020 at 19:38,
> > ahmed.fikri_at_t-online.de <ahmed.fikri_at_t-online.de> wrote:
> >
> > > Hi Ashoke,
> > >
> > >
> > >
> > > Could you send the execution plan of the query too? I think there
> > > is no general approach for that; it depends on several factors:
> > > whether the table has indexes (normal/bitmap), and if so, how the
> > > size of the table compares to that of the existing indexes. But
> > > generally, parallel processing should help.
> > >
> > >
> > >
> > > Best regards
> > >
> > > Ahmed
> > >
> > > -----Original Message-----
> > >
> > > Subject: Fastest way to count exact number of rows in a very
> > > large table
> > >
> > > Date: 2020-10-02T19:45:19+0200
> > >
> > > From: "Ashoke Mandal" <ramukam1983_at_gmail.com>
> > >
> > > To: "ORACLE-L" <oracle-l_at_freelists.org>
> > >
> > > Dear All,
> > >
> > > I have a table with 108 billion rows and am migrating this
> > > database from Oracle 11g on Solaris to Oracle 12c on Linux.
> > >
> > > After the migration I need to compare the row count of this table
> > > in both the source DB and the destination DB. It takes over two
> > > hours to get the row count from this table.
> > >
> > >
> > >
> > > SQL> select to_char(count(*), '999,999,999,999') from test_data;
> > >
> > >
> > >
> > > TO_CHAR(COUNT(*)
> > >
> > > ----------------
> > >
> > > 108,424,262,144
> > >
> > > Elapsed: 02:22:46.18
> > >
> > > Could you please suggest some tips to get the row count faster so
> > > that it reduces the cut-over downtime.
> > >
> > > Thanks,
> > >
> > >
> > >
> > > Ashoke
> > >
> >
>

-- 
Mladen Gogala

Database Consultant

Tel: (347) 321-1217


--
http://www.freelists.org/webpage/oracle-l
Received on Mon Oct 05 2020 - 21:55:36 CEST
