What happened to our system and database? (Oracle 12.2.0.1, Oracle Unbreakable Linux 7, Oracle GoldenGate 21) [message #690359]
Thu, 05 June 2025 22:06
trantuananh24hg
Messages: 747 Registered: January 2007 Location: Ha Noi, Viet Nam
Senior Member
Dear all,
We have two RAC databases, replicated from 12c to 11g using GoldenGate 21. Both are configured with DBFS, which holds the trail files, and the GoldenGate Manager is controlled by XAG.
Replication is traditional one-way: 21 small tables and one big partitioned table (about 699GB). The small tables are no problem; only the big partitioned table is, and that is why I am posting here, looking for your help.
A week ago we had a problem in the application: someone changed the Java code. In short, he did not use array processing or anything like that, so a huge number of rows were inserted into the big table directly, bypassing the cache system. Then my nightmare began:
- The DBFS_TS tablespace grew from 21GB to 300GB in a single day.
- The file system mounted as '/mnt/dbfs_user' grew to 99% usage in only 4 hours.
- The GoldenGate replication checkpoint lag increased from 30 minutes to 61 hours in only 4 hours.
When I found the problem in his code and stopped that module, the DBFS problem went away, GoldenGate started normally again, and it has worked until now.
However, I still have to find a solution for what was left behind:
- The DBFS_TS tablespace is 99% full on both RAC systems (12c and 11g).
- The file system '/mnt/dbfs_user' on 11g went down from 99% to 1% usage.
- But the file system '/mnt/dbfs_user' on 12c DID NOT DECREASE. At first it dropped from 99% to 31%, but a week ago it started to INCREASE again.
Some more information:
1. Oracle RAC 12c and Oracle RAC 11g configuration:
- A 370GB bigfile for the DBFS_TS tablespace
- Mounted in the file system as '/mnt/dbfs_user/'
- The DBFS resource added in CRS
- All of the GoldenGate directories (dirprm, dirchk, dirdat, ...) are symbolic links from $GGATE to '/mnt/dbfs_user/'
2. The GoldenGate MGR on both 12c and 11g is configured to purge trail files with PURGEOLDEXTRACTS ... MINKEEPHOURS 2 (or 3).
3. The 11g RAC file system '/mnt/dbfs_user/' sits at 4% to 10% whenever MGR purges trail files; it does not increase.
4. The 12c RAC file system '/mnt/dbfs_user' increases day by day, from 51% to 71% (as of today). This is the big problem for me: I cannot understand the reason, and I cannot work out the right way to analyze it. The lsof command does not find any hung process keeping files open, and the purging of trail files has worked normally until now. In fact, du -sh '/mnt/dbfs_direct/FS/' shows only 6GB to 10GB at most, including all subdirectories (see the query at the end of this post).
5. I know I must reorganize the DBFS_TS tablespace (Doc ID 2331565.1), or find another solution such as moving the GoldenGate directories to a normal file system (the ACFS option is not usable because the driver library does not support my OS).
But I really want to know what causes '/mnt/dbfs_user' to increase in size day by day on the 12c RAC system only, while it does not happen on the 11g RAC system. Can you help me? Thank you so much.
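As a minimal cross-check (this is only a sketch, assuming a DBA account and the DBFS_TS tablespace name above; the DBFS segment names are system-generated, so yours will differ), a query like this shows how much space the DBFS segments actually allocate, to compare with the 6GB-10GB that du reports:

-- Space allocated to the DBFS segments (table, LOB segment, LOB index) in DBFS_TS.
select owner, segment_name, segment_type,
       round(bytes/1024/1024/1024, 1) as gb
from   dba_segments
where  tablespace_name = 'DBFS_TS'
order  by bytes desc;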
[Updated on: Thu, 05 June 2025 22:12]
Re: What happened to our system and database? [message #690361 is a reply to message #690359]
Fri, 06 June 2025 02:06
John Watson
Messages: 8978 Registered: January 2010 Location: Global Village
Senior Member
If your developer was using direct load, in some cases that can result in tables increasing in size hugely, because every insert can add a new extent to the segment, so your table has large numbers of almost empty extents. For example:

orclz> alter session enable parallel dml;

Session altered.

orclz> create table emp2 parallel(degree 2) as select * from emp;

Table created.

orclz> select count(*) from user_extents where segment_name='EMP2';

  COUNT(*)
----------
         1

orclz> insert /*+ append */ into emp2 select * from emp;

14 rows created.

orclz> commit;

Commit complete.

orclz> select count(*) from user_extents where segment_name='EMP2';

  COUNT(*)
----------
         2

orclz>

The above is of course very artificial, but it shows what can happen. There is supposed to be some sort of automatic extent trimming to alleviate the problem, but I don't know how effective it is or how it varies between releases. It can't work with uniform extents. Parallelism is important, and that does change between 11 and 12, which might be why you are seeing different results.
You could try disabling parallelism and direct load to see if that helps.
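Roughly something like this, just as a sketch (the owner and table names here are placeholders for your big partitioned table):

-- Check whether the table carries a parallel degree, and remove it so that
-- inserts use the conventional path rather than parallel direct load.
select owner, table_name, degree
from   dba_tables
where  owner = 'APP_OWNER' and table_name = 'BIG_PART_TAB';

alter table app_owner.big_part_tab noparallel;

-- In the loading session, do not enable parallel DML and avoid the /*+ APPEND */ hint.
alter session disable parallel dml;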
Re: What happened to our system and database? [message #690362 is a reply to message #690361]
Fri, 06 June 2025 11:13
trantuananh24hg
Messages: 747 Registered: January 2007 Location: Ha Noi, Viet Nam
Senior Member
Thank you for your reply, Mr. John Watson.
The application problem went away once I found and removed the module that was causing the high I/O and CPU. That is not my problem today.
Let me give more detail.
1. Description of my RAC systems.
- On both the 12c and 11g (RAC) databases, GoldenGate is installed and configured, replicating between them in the traditional one-way fashion (from 12c to 11g).
- I created and configured the tablespace DBFS_TS (bigfile, nologging, autoallocate) to hold the trail files (GoldenGate's dirdat, dirprm, dirchk, ...) and mounted it into the OS file system as '/mnt/dbfs_direct'. This tablespace contains only the SecureFile LOBs for the trail files; it is DBFS (Database File System) and holds no other data. It is used for the RAC GoldenGate service with the GoldenGate VIP under XAG control.
- All other data is in other tablespaces, as normal.
- GoldenGate worked well until that day (the developer's change).
2. What remains now?
- The tablespace DBFS_TS grew to 99% (and we will do the reorganization per Oracle's documentation: https://docs.oracle.com/en/database/oracle/oracle-database/21/adlob/shrinking-reorganizing-DBFS-file-systems.html#GUID-68FE78C8-798C-40DB-8566-F09EB27F4CC9).
- However, before doing that, I need to analyze a few things:
+ The 11g RAC (the destination of the GoldenGate replication) had the problem just like the 12c system: DBFS_TS at 99%, and 38% usage of dbfs_user@ (mounted in the OS file system).
+ The problem went away at 4am on 28/05/2025, and dbfs_user@ on 11g decreased from 38% to 1% usage. It has not gone above 14% of total space since then (11g).
+ On 12c (the source system), dbfs_user@ climbed to 54% and has increased day by day; it is now at 71% usage.
+ MGR on both 11g and 12c is configured with PURGEOLDEXTRACTS ... MINKEEPHOURS 2 (or 3) in the Manager parameters.
+ Where is my space???
I did a few things to try to answer the question "Where is my space - dbfs_user@?":
+ Used lsof on both nodes of the 12c RAC to find any process holding files open under /mnt/dbfs_direct/FS/dirdat. Nothing; no GoldenGate process is holding anything.
+ Used a script to find how many blocks are expired:
set serveroutput on;
declare
v_segment_size_blocks number;
v_segment_size_bytes number;
v_used_blocks number;
v_used_bytes number;
v_expired_blocks number;
v_expired_bytes number;
v_unexpired_blocks number;
v_unexpired_bytes number;
begin
dbms_space.space_usage ('DBFS_OWNER', 'LOB_SFS$_FST_12345', 'LOB',
v_segment_size_blocks, v_segment_size_bytes,
v_used_blocks, v_used_bytes, v_expired_blocks, v_expired_bytes,
v_unexpired_blocks, v_unexpired_bytes );
dbms_output.put_line('Segment Size blocks = '||v_segment_size_blocks);
dbms_output.put_line('Segment Size bytes = '||v_segment_size_bytes);
dbms_output.put_line('Used blocks = '||v_used_blocks);
dbms_output.put_line('Used bytes = '||v_used_bytes);
dbms_output.put_line('Expired Blocks = '||v_expired_blocks);
dbms_output.put_line('Expired Bytes = '||v_expired_bytes);
dbms_output.put_line('UNExpired Blocks = '||v_unexpired_blocks);
dbms_output.put_line('UNExpired Bytes = '||v_unexpired_bytes);
end;
/
OK: 85% of the blocks are expired and 15% are in use. But keep in mind that NOLOGGING and RETENTION NONE are configured for the DBFS_TS tablespace, so it should really only be using that 15% and should not keep growing in the tablespace. So why does my file system dbfs_user@ increase day by day - 31%, 54%, 67% and now 71%? I really don't understand it: no process is holding files, all the trail files in dirdat roll over to new files and the old ones are purged, dirdat uses at most 8GB, ... I think it may be a bug. Oracle's SR does not give any more detail, apart from referring to the Oracle document linked above (reorganizing the DBFS tablespace).
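As one more cross-check (a sketch only, assuming the 12c database and the DBFS owner name used in the script above), the LOB's retention setting and segment name can be confirmed from dba_lobs:

-- Confirm the DBFS SecureFile LOB really has RETENTION NONE and identify its segment.
select owner, table_name, column_name, segment_name, securefile, retention_type
from   dba_lobs
where  owner = 'DBFS_OWNER';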
Updated:
- The inode count in dbfs_user@ is normal, still 102 (if it were full, it would be around 369K inodes).
- Stopping the GoldenGate extract/pump and restarting MGR does not help; stopping and remounting dbfs_direct does not help either.
- A second check for processes holding files open found none of the GoldenGate processes. It is clean.
I found this link, another victim: https://bryangrenn.blogspot.com/2012/05/where-is-my-space-on-dbfs.html
[Updated on: Fri, 06 June 2025 11:21]