RE: Fastest way to count exact number of rows in a very large table

From: Clay Jackson (cjackson)
Date: Fri, 2 Oct 2020 20:06:15 +0000
Message-ID: <MWHPR19MB0141C1923D23304603BC7E569B310_at_MWHPR19MB0141.namprd19.prod.outlook.com>



“Interesting problem”. If a count takes 2 hours, I can only imagine how long a backup, a physical copy and then an “endian conversion” might take. And, as others mention, without something like replication the only way to guarantee integrity would be to quiesce the application for the ENTIRE time.

So, start at the “top” – how much downtime can you “afford”? If your user can tolerate it, then TTS or even DataPump would work just fine.

If not, then you’ll need to look at other options, all of which will cost something (time, resources, hard cash). One thing you might want to look at is ways to “partition” and parallelize the work. Are all 108 billion rows in this table “active” or even “active at the same rate”? Could you partition (not necessarily Oracle partitioning, although the tooling there makes it easier) the table and migrate in pieces? Could some of the “partitions” be made read-only? Partitioning might allow you to migrate small subsections; and then use views and perhaps even db-links to retain access to the entire data set.
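A minimal sketch of the views-plus-db-link idea, assuming the table can be split on some natural boundary such as a date. The names test_data_2019, test_data_current and old_solaris_db are hypothetical; nothing in this thread describes the real layout.

```sql
-- Hypothetical state on the new (Linux) database: the closed, read-only
-- 2019 slice has already been migrated, while the still-active slice
-- remains on the old system.
-- old_solaris_db is an assumed database link pointing back to the source.
CREATE OR REPLACE VIEW test_data AS
  SELECT * FROM test_data_2019                     -- migrated, local, read-only
  UNION ALL
  SELECT * FROM test_data_current@old_solaris_db;  -- not yet migrated, remote
```

Each slice that has been made read-only can be migrated and verified on its own schedule, and the view is redefined as slices move across, so only the final, active slice needs application downtime.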

Good luck!

Clay Jackson

From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Andy Sayer
Sent: Friday, October 2, 2020 12:09 PM
To: ahmed.fikri_at_t-online.de
Cc: list, oracle <oracle-l_at_freelists.org>; ramukam1983_at_gmail.com
Subject: Re: Fastest way to count exact number of rows in a very large table


Just because a table has the same number of rows doesn’t mean it has the same data. With 108 billion rows, your data is going to be changing quickly, so to get accurate counts at the right point in time you’re going to end up keeping your application offline for a window before and after your migration.

What you need to do is determine where you expect data to go missing and work out a way to check.

This will depend on how you’re doing your migration. I would suggest you use Cross-Platform Transportable Tablespaces (Doc ID 371556.1), as that would allow you to do a physical import and just convert the files to the right endianness. This starts by making sure all data has been written to your data files (so they can be made read only on the source system). As you’re working with the physical data files rather than the logical data (rows in tables), the only way you’re going to lose rows is by corrupting your files. You can check for corruption using RMAN once you’ve imported the converted files. No need to count all your rows, and no need to hope that a matching count is all you need to compare.
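As a rough sketch of that RMAN check, assuming the converted files have been plugged in as a tablespace called test_ts (a hypothetical name), something like the following could be used. The validate step is an RMAN command, shown here as a comment, and the query is run afterwards from SQL*Plus.

```sql
-- Step 1 (run in RMAN, not SQL*Plus): read every block of the plugged-in
-- tablespace and check it for physical and logical corruption.
-- test_ts is a hypothetical tablespace name.
--
--   RMAN> BACKUP VALIDATE CHECK LOGICAL TABLESPACE test_ts;
--
-- Step 2 (SQL*Plus): corrupt blocks found by the validate are recorded here.
SELECT file#, block#, blocks, corruption_type
FROM   v$database_block_corruption;
-- No rows returned means no corrupt blocks were recorded.
```

BACKUP VALIDATE only reads the files and produces no backup pieces, so it can be run against read-only tablespaces.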

Hope that helps,
Andy

On Fri, 2 Oct 2020 at 19:38, ahmed.fikri_at_t-online.de <ahmed.fikri_at_t-online.de> wrote:

Hi Ashoke,

Could you send the execution plan of the query too? I think there is no general approach for that; it depends on several factors: whether the table has indexes (normal/bitmap) and, if it does, the size of the table compared to the existing indexes. But generally parallel processing should help.
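To illustrate the two options mentioned here, a sketch of a parallel count and of a count driven from an index. The degree of parallelism (32) is arbitrary and the index name test_data_pk is hypothetical; whether either is faster depends on the actual plan and hardware.

```sql
-- Option 1: full table scan with parallel query.
-- 32 is an arbitrary degree of parallelism, not a recommendation.
SELECT /*+ FULL(t) PARALLEL(t, 32) */ COUNT(*)
FROM   test_data t;

-- Option 2: if a much smaller index exists, a parallel index fast full scan
-- reads fewer blocks than the table scan.
-- A B-tree index can answer COUNT(*) only if at least one of its columns is
-- NOT NULL (e.g. a primary key index); a bitmap index also stores NULLs.
-- test_data_pk is a hypothetical index name.
SELECT /*+ INDEX_FFS(t test_data_pk) PARALLEL_INDEX(t, test_data_pk, 32) */ COUNT(*)
FROM   test_data t;
```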

Best regards

Ahmed

-----Original Message-----

Subject: Fastest way to count exact number of rows in a very large table

Date: 2020-10-02T19:45:19+0200

From: "Ashoke Mandal" <ramukam1983_at_gmail.com>

To: "ORACLE-L" <oracle-l_at_freelists.org>

Dear All,
I have a table with 108 billion rows and am migrating this database from Oracle 11g on Solaris to Oracle 12c on Linux.

After the migration I need to compare the row count of this table in both the source DB and the destination DB. It takes almost two hours to get the row count from this table.

SQL> select to_char(count(*), '999,999,999,999') from test_data;

TO_CHAR(COUNT(*)
----------------
 108,424,262,144

Elapsed: 02:22:46.18

Could you please suggest some tips to get the row count faster, so that the cut-over downtime is reduced?

Thanks,

Ashoke


--

http://www.freelists.org/webpage/oracle-l Received on Fri Oct 02 2020 - 22:06:15 CEST
