Re: "Re-sequencing" a pseudo-key

From: De DBA <dedba_at_tpg.com.au>
Date: Fri, 10 Oct 2014 10:43:49 +1000
Message-ID: <54372BC5.6080600_at_tpg.com.au>

Hi Kim,

We don't know which options you have at your disposal, however if you have access to GoldenGate you could also consider the following variant on Iggy's solution:

Create a new schema (in a new database, ideally) with the same objects as the original (perhaps reserve storage as you create the tables), but no triggers or foreign keys yet (GG cannot handle those). As newly created rows have a unique, incrementing RECID, we can use the offset between the new recid and the maximum negative old_recid in the translation table -- as we're currently slowly climbing to 0 -- to calculate the new recid for new rows as new.recid = x + abs( old.recid ) - abs(y) where x = max(new_recid) and y is the greatest negative recid in the translation table. GoldenGate can be configured to do such transformations on the fly, if I am not mistaken.
I don't know if it makes much difference, but I'd create the master translation table as IOT with old_recid as the leading column, in the new schema. For this variant it is necessary to populate the entire translation table up front, as the maximum new recid is needed by GG.
Populate tables in the new schema with the existing rows. You can repopulate table by table and do a recid/newrecid lookup for each row in the new schema. GG copies & resequences new rows as the are created.
(no longer needed)
When resequencing of the old rows is completed, schedule a short outage to create foreign keys and swap the application from the old schema to the new.

I believe this will have some additional advantages:

the original production schema (database) remains untouched and complete, and so a provides a reliable rollback option if resequencing/cutover fails for some reason
the new schema is identical to the old, no object renaming needed. Only the translation table is added.
no need to repeat the translation process (point 4) and less chance of data loss due to oversight or error.
if the new schema is created in a second database/machine there is minimal stress on the production database, as transforms and recid lookups run in the new database
the outage will be short, as all indexes & MVs already exist and only foreign keys need to be created

There are some cons of course:

GG is not cheap
GG can be fickle and "abend" (= abort) when problems arise - it needs TLC, lots of it
GG does not support all Oracle datatypes
Extra storage & resource use (ideally a second server with dedicated network link)

Hth,
Tony

On 09/10/14 09:50, Iggy Fernandez wrote:
> It's an amazing problem with numerous restrictions but where there is a will there may be a way.
>
> To summarize, you have a list of sequence number with holes. The holes are probably caused by the way the sequence numbers are being used. A process that reads 10 sequence numbers may end up only using a few of them. This results in the sequence numbers being gobbled up faster than expected.
>
> The challenge is doing billions and billions of lookups into the translation table in a short downtime window without an Exadata. Winning the lottery or finding that much spare change is unrealistic. Not to mention that you have to update indexes and refresh materialized views.
>
> I think an incremental solution is possible as follows:
>
> 1. Add new_recid columns to each table. There will be as many new columns in each tables as there are columns that need to be translated. For now, they need to be nullable.
> 2. Create a master translation table with 600 million rows and two columns: recid and new_recid. New rows are being constantly added to the database but that's not a problem for this algorithm. New rows can be tackled later. Using a master translation table for all recid columns will ensure that all the translations are in the same order as the original values. Preserving this order is not necessary but is aesthetically pleasing.
> 3. Incrementally update the values of the new_recid column with the translation of the corresponding recid value. This can be done incrementally over the next six months. If updating a large number of rows, you can commit at regular intervals to avoid long transactions.
> 4. Purge the master translation table and repeat steps 2 and 3 for new entries only. Keep doing this until the day of cutover. In fact, you don't have to tackle 600 million entries in the first shot. You could do perhaps 10 million entries every day for 60 days.
> 5. As you get close to the cutover date, you can begin building indexes involving the new_recid columns instead of the recid columns using ONLINE build. You can also create a second set of MVs using the new_recid columns. For fast refreshability you can use ROWID materialized views instead of PRIMARY KEY materialized views. The only downside is that any DDL operation that requires row movement will force a complete refresh but what are the chances.
>
>
> With this incremental approach, you have not impacted the application in any way up to the time of cutover. The cutover steps:
>
> 1. Perform the last few updates.
> 2. Put the database into flashback recovery mode.
> 3. Drop primary key and foreign key constraints
> 4. Rename the recid columns to old_recid and new_recid columns to recid
> 5. The required indexes already exist
> 6. The new materialized views already exist. You don't need to rename them if you are using automatic query rewrite. If you are accessing them directly, you will have to work out some solution involving prebuilt tables.
> 7. You may not need the primary key constraints any more
> 8. You can drop the old_recid columns; it would happen instantaneously but you can postpone this step to a future time
> 9. Test the application and flash back if there is a failure
>
>
> The downside to an approach such as this is that you have added new columns and caused some chained rows and row migrations. Lots of details still need to be worked out. Lots of testing will ensure that you have a good plan.
>
> Now if you had an Exadata ...
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>

--
http://www.freelists.org/webpage/oracle-l

Received on Fri Oct 10 2014 - 02:43:49 CEST

This message: [ Message body ]
Next message: Ric Van Dyke: "RE: New policy on Oracle Certifications"
Previous message: Andrew Kerber: "Re: Logout system trigger"
In reply to: Iggy Fernandez: "RE: "Re-sequencing" a pseudo-key"
Next in thread: Iggy Fernandez: "RE: "Re-sequencing" a pseudo-key"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message