
DBA Blogs

Oracle Cloud Machine - Move the Cloud to your Data Center

While public cloud computing can make a significant difference to your business, handing governance and control over to someone else isn't always a simple option. The cloud is generally perceived...

Categories: DBA Blogs


Partner Webcast – Oracle Mobile Strategy and Mobility Offerings Overview

Mobile is on the mind of every business, as Mobile Is the New First Screen. Enabling existing business applications on handhelds can be very challenging, but even building new business applications...

Categories: DBA Blogs

FBDA -- 2 : FBDA Archive Table Structure

Hemant K Chitale - Sun, 2016-04-03 10:10
Following up on my earlier post, I look at the FBDA Archive Tables.

[oracle@ora12102 Desktop]$ sqlplus hemant/hemant

SQL*Plus: Release 12.1.0.2.0 Production on Sun Apr 3 23:26:27 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Last Successful login time: Sat Apr 02 2016 23:32:30 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> select table_name from user_tables;

TABLE_NAME
--------------------------------------------------------------------------------
TEST_FBDA
SYS_FBA_DDL_COLMAP_93250
SYS_FBA_HIST_93250
SYS_FBA_TCRV_93250

SQL> desc test_fbda
Name Null? Type
----------------------------------------- -------- ----------------------------
ID_COLUMN NUMBER
DATA_COLUMN VARCHAR2(15)
DATE_INSERTED DATE

SQL>
SQL> desc sys_fba_hist_93250
Name Null? Type
----------------------------------------- -------- ----------------------------
RID VARCHAR2(4000)
STARTSCN NUMBER
ENDSCN NUMBER
XID RAW(8)
OPERATION VARCHAR2(1)
ID_COLUMN NUMBER
DATA_COLUMN VARCHAR2(15)
DATE_INSERTED DATE

SQL> desc sys_fba_ddl_colmap_93250
Name Null? Type
----------------------------------------- -------- ----------------------------
STARTSCN NUMBER
ENDSCN NUMBER
XID RAW(8)
OPERATION VARCHAR2(1)
COLUMN_NAME VARCHAR2(255)
TYPE VARCHAR2(255)
HISTORICAL_COLUMN_NAME VARCHAR2(255)

SQL> desc sys_fba_tcrv_93250
Name Null? Type
----------------------------------------- -------- ----------------------------
RID VARCHAR2(4000)
STARTSCN NUMBER
ENDSCN NUMBER
XID RAW(8)
OP VARCHAR2(1)

SQL>


The HIST table is the History table for my active table. It adds columns that track the Rowid, Start SCN and End SCN for the range of rows copied into the History Table, the Transaction Identifier and the Operation, followed by the actual columns of the active table.
The DDL_COLMAP table seems to track Column Mappings.  See the example below.
The TCRV table seems to be tracking Transactions?

Let's try some DDL to ADD and DROP columns to the active table.

SQL> alter table test_fbda add (new_col_1 varchar2(5));

Table altered.

SQL> desc test_fbda
Name Null? Type
----------------------------------------- -------- ----------------------------
ID_COLUMN NUMBER
DATA_COLUMN VARCHAR2(15)
DATE_INSERTED DATE
NEW_COL_1 VARCHAR2(5)

SQL> desc sys_fba_93250
ERROR:
ORA-04043: object sys_fba_93250 does not exist


SQL> desc sys_fba_hist_93250
Name Null? Type
----------------------------------------- -------- ----------------------------
RID VARCHAR2(4000)
STARTSCN NUMBER
ENDSCN NUMBER
XID RAW(8)
OPERATION VARCHAR2(1)
ID_COLUMN NUMBER
DATA_COLUMN VARCHAR2(15)
DATE_INSERTED DATE
NEW_COL_1 VARCHAR2(5)

SQL>
SQL> select * from sys_fba_ddl_colmap_93250
2 /

STARTSCN ENDSCN XID O
---------- ---------- ---------------- -
COLUMN_NAME
--------------------------------------------------------------------------------
TYPE
--------------------------------------------------------------------------------
HISTORICAL_COLUMN_NAME
--------------------------------------------------------------------------------
1697151
ID_COLUMN
NUMBER
ID_COLUMN

1697151
DATA_COLUMN
VARCHAR2(15)
DATA_COLUMN

1697151
DATE_INSERTED
DATE
DATE_INSERTED

1728713
NEW_COL_1
VARCHAR2(5)
NEW_COL_1


SQL>


The new column added to the active table is also now reflected in the History Table.  The DDL_COLMAP shows the effective start of each column (notice the STARTSCN mapped to the COLUMN_NAME).

Let's set some values in the new column and see if they appear in the History Table.

SQL> update test_fbda set new_col_1 = 'New'
2 where id_column < 6;

5 rows updated.

SQL> commit;

Commit complete.

SQL> select id_column, new_col_1, scn_to_timestamp(startscn), scn_to_timestamp(endscn)
2 from sys_fba_hist_93250
3 where id_column < 6
4 order by 1,3;

ID_COLUMN NEW_C
---------- -----
SCN_TO_TIMESTAMP(STARTSCN)
---------------------------------------------------------------------------
SCN_TO_TIMESTAMP(ENDSCN)
---------------------------------------------------------------------------
1
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

2
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

3
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

4
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

5
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM


SQL>


The rows that are copied into the History Table are *prior* image rows (copied from the Undo Area).  The STARTSCN and ENDSCN are from *yesterday* (02-April).

Let me DROP the new column.

SQL> alter table test_fbda drop (new_col_1);

Table altered.

SQL> desc test_fbda
Name Null? Type
----------------------------------------- -------- ----------------------------
ID_COLUMN NUMBER
DATA_COLUMN VARCHAR2(15)
DATE_INSERTED DATE

SQL> desc sys_fba_hist_93250;
Name Null? Type
----------------------------------------- -------- ----------------------------
RID VARCHAR2(4000)
STARTSCN NUMBER
ENDSCN NUMBER
XID RAW(8)
OPERATION VARCHAR2(1)
ID_COLUMN NUMBER
DATA_COLUMN VARCHAR2(15)
DATE_INSERTED DATE
D_1729869_NEW_COL_1 VARCHAR2(5)

SQL>
SQL> select * from sys_fba_ddl_colmap_93250;

STARTSCN ENDSCN XID O
---------- ---------- ---------------- -
COLUMN_NAME
--------------------------------------------------------------------------------
TYPE
--------------------------------------------------------------------------------
HISTORICAL_COLUMN_NAME
--------------------------------------------------------------------------------
1697151
ID_COLUMN
NUMBER
ID_COLUMN

1697151
DATA_COLUMN
VARCHAR2(15)
DATA_COLUMN

1697151
DATE_INSERTED
DATE
DATE_INSERTED

1728713 1729869
D_1729869_NEW_COL_1
VARCHAR2(5)
NEW_COL_1


SQL>


The dropped column is no longer in the active table and has been renamed in the History table.  (The data in the column has to be preserved, but the column is renamed.)  Notice how the DDL_COLMAP table now shows an ENDSCN for this column, with the new (renamed) column name as it appears in the History table.  The new column name seems to include the SCN (the ENDSCN?).

Let's confirm what data is now present in the History table (remember: our earlier query showed the pre-update image for this column).

SQL> select id_column, D_1729869_NEW_COL_1, scn_to_timestamp(startscn), scn_to_timestamp(endscn)
2 from sys_fba_hist_93250
3 where (id_column < 6 OR D_1729869_NEW_COL_1 is not null)
4 order by 1,3;

ID_COLUMN D_172
---------- -----
SCN_TO_TIMESTAMP(STARTSCN)
---------------------------------------------------------------------------
SCN_TO_TIMESTAMP(ENDSCN)
---------------------------------------------------------------------------
1
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

1
02-APR-16 11.46.11.000000000 PM
03-APR-16 11.41.33.000000000 PM

1 New
03-APR-16 11.41.33.000000000 PM
03-APR-16 11.45.24.000000000 PM

2
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

2
02-APR-16 11.46.11.000000000 PM
03-APR-16 11.41.33.000000000 PM

2 New
03-APR-16 11.41.33.000000000 PM
03-APR-16 11.45.24.000000000 PM

3
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

3
02-APR-16 11.46.11.000000000 PM
03-APR-16 11.41.33.000000000 PM

3 New
03-APR-16 11.41.33.000000000 PM
03-APR-16 11.45.24.000000000 PM

4
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

4
02-APR-16 11.46.11.000000000 PM
03-APR-16 11.41.33.000000000 PM

4 New
03-APR-16 11.41.33.000000000 PM
03-APR-16 11.45.24.000000000 PM

5
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM

5
02-APR-16 11.46.11.000000000 PM
03-APR-16 11.41.33.000000000 PM

5 New
03-APR-16 11.41.33.000000000 PM
03-APR-16 11.45.24.000000000 PM


15 rows selected.

SQL>
SQL> select scn_to_timestamp(1729869) from dual;

SCN_TO_TIMESTAMP(1729869)
---------------------------------------------------------------------------
03-APR-16 11.45.27.000000000 PM

SQL>


Why do we now have 3 rows in the History table for each row in the Active Table?  Take ID_COLUMN=1.  The first row -- for the time range 02-Apr 11:32pm to 02-Apr 11:46pm -- is as of yesterday, the same row we saw in the History table after the update to the active table.  The second row preserves the row for the time range 02-Apr 11:46pm to 03-Apr 11:41pm, to support AS OF queries up to the time of the UPDATE.  The third row, for the time range 03-Apr 11:41pm to 03-Apr 11:45pm, presents the UPDATEd value ('New') in the column up to the last transaction updating it before the column was dropped at 03-Apr 11:45:27pm.

Thus, Oracle maintains multiple versions of the same row, including versions for DROPped columns, in the History Table.

Note: The History Table is not supposed to be queried directly in the manner I have shown here.  The proper query against the active table would be an AS OF query, which is automatically rewritten / redirected to "hit" the History table when necessary.
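For instance, an AS OF query against the active table (the timestamp below is purely illustrative) lets Oracle decide whether the data is served from Undo or from the archive table:

select id_column, data_column, date_inserted
from test_fbda
as of timestamp (systimestamp - interval '1' day)
where id_column < 6;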

What about the third table -- the TCRV table?

SQL> l
1 select scn_to_timestamp(startscn), op , count(*)
2 from sys_fba_tcrv_93250
3 group by scn_to_timestamp(startscn), op
4* order by 2,1
SQL> /

SCN_TO_TIMESTAMP(STARTSCN) O
--------------------------------------------------------------------------- -
COUNT(*)
----------
03-APR-16 11.45.24.000000000 PM U
1000


SQL>
SQL> select count(distinct(rid)) from sys_fba_tcrv_93250;

COUNT(DISTINCT(RID))
--------------------
1000

SQL>



It shows 1000 rows as having been UPDATEd? (Does OP='U' mean 'UPDATE'?)  We do know that ADD and DROP column are changes to the table.  But are they UPDATEs?

Next post: continuing with DML operations (more rows, some updates).  We'll see if we can decipher anything from the TCRV table as well.  (That post changed to showing support for TRUNCATEs.)
.
.
.



Categories: DBA Blogs

FBDA -- 1 : Testing Flashback Data Archive in 12c (NonCDB)

Hemant K Chitale - Sat, 2016-04-02 09:53
Note : At the bottom of this post, you'll find links to more (subsequent) posts on this topic.

Some testing I'd done with Flashback Data Archive (henceforth called FBDA in this and subsequent posts, if any) in 11.2.0.4 left me with uncertainty about the automatic purging of data beyond the Retention Period.  I might return to testing 11.2.0.4, but here I shall begin testing in 12.1.0.2  (NonCDB).

Setting up FBDA :

[oracle@ora12102 ~]$ sqlplus system/oracle

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 2 23:23:53 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Last Successful login time: Sat Apr 02 2016 23:20:47 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> create tablespace fbda ;

Tablespace created.

SQL> create flashback archive fbda tablespace fbda retention 3 day;

Flashback archive created.

SQL> create tablespace hemant;

Tablespace created.

SQL> create user hemant identified by hemant
2 default tablespace hemant;

User created.

SQL> grant create table to hemant;

Grant succeeded.

SQL> grant create session to hemant;

Grant succeeded.

SQL> alter user hemant quota unlimited on hemant;

User altered.

SQL> alter user hemant quota unlimited on fbda;

User altered.

SQL> grant flashback archive administer to hemant;

Grant succeeded.

SQL> grant flashback archive on fbda to hemant;

Grant succeeded.

SQL>
SQL> connect / as sysdba
Connected.
SQL> grant execute on dbms_flashback_archive to hemant;

Grant succeeded.

SQL>
SQL> connect hemant/hemant
Connected.
SQL> create table test_fbda (id_column number, data_column varchar2(15), date_inserted date) tablespace hemant;

Table created.

SQL> alter table test_fbda flashback archive fbda;

Table altered.

SQL> select table_name from user_tables;

TABLE_NAME
--------------------------------------------------------------------------------
TEST_FBDA

SQL>


Note the Flashback Archive history table corresponding to TEST_FBDA doesn't get created immediately.

SQL> connect hemant/hemant
Connected.
SQL> insert into test_fbda
2 select rownum , to_char(rownum), trunc(sysdate)
3 from dual connect by level < 1001;

1000 rows created.

SQL> commit;

Commit complete.

SQL> select table_name from user_tables;

TABLE_NAME
--------------------------------------------------------------------------------
TEST_FBDA

SQL> select count(*) from test_fbda;

COUNT(*)
----------
1000

SQL> select flashback_archive_name, retention_in_days, status
2 from user_flashback_archive;

FLASHBACK_ARCHIVE_NAME
--------------------------------------------------------------------------------
RETENTION_IN_DAYS STATUS
----------------- -------
FBDA
3


SQL> select table_name, flashback_archive_name, archive_table_name, status
2 from user_flashback_archive_tables;

TABLE_NAME
--------------------------------------------------------------------------------
FLASHBACK_ARCHIVE_NAME
--------------------------------------------------------------------------------
ARCHIVE_TABLE_NAME STATUS
----------------------------------------------------- -------------
TEST_FBDA
FBDA
SYS_FBA_HIST_93250 ENABLED


SQL>
SQL> !sleep 300

SQL> select table_name from user_tables;

TABLE_NAME
--------------------------------------------------------------------------------
TEST_FBDA
SYS_FBA_DDL_COLMAP_93250
SYS_FBA_HIST_93250
SYS_FBA_TCRV_93250

SQL>
SQL> select object_id
2 from user_objects
3 where object_name = 'TEST_FBDA'
4 and object_type = 'TABLE'
5
SQL> /

OBJECT_ID
----------
93250

SQL>


So, it took some time for the flashback archive history table (identified on the basis of the OBJECT_ID) to appear.  The background fbda process seems to run (wake up) every 5 minutes, although it may wake up more frequently if there is more activity in the database.

SQL> select trunc(date_inserted), count(*)
2 from test_fbda
3 group by trunc(date_inserted)
4 /

TRUNC(DAT COUNT(*)
--------- ----------
02-APR-16 1000

SQL> select trunc(date_inserted), count(*)
2 from sys_fba_hist_93250
3 group by trunc(date_inserted)
4 /

no rows selected

SQL> update test_fbda
2 set data_column = data_column
3 where rownum < 451;

450 rows updated.

SQL> commit;

Commit complete.

SQL> select trunc(date_inserted), count(*)
2 from sys_fba_hist_93250
3 group by trunc(date_inserted)
4 /

no rows selected

SQL>
SQL> !sleep 180

SQL> select trunc(date_inserted), count(*)
2 from sys_fba_hist_93250
3 group by trunc(date_inserted)
4 /

TRUNC(DAT COUNT(*)
--------- ----------
02-APR-16 450

SQL>
SQL> select scn_to_timestamp(startscn), scn_to_timestamp(endscn), date_inserted, count(*)
2 from sys_fba_hist_93250
3 group by scn_to_timestamp(startscn), scn_to_timestamp(endscn), date_inserted
4 order by 1;

SCN_TO_TIMESTAMP(STARTSCN)
---------------------------------------------------------------------------
SCN_TO_TIMESTAMP(ENDSCN)
---------------------------------------------------------------------------
DATE_INSE COUNT(*)
--------- ----------
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM
02-APR-16 450


SQL>


Notice that not all 1000 rows got copied to the FBDA.  Only the 450 rows that I updated were copied in.  They are tracked by SCN and timestamp.  (The "DATE_INSERTED" column is my own date column; Oracle wouldn't be using that column to track DML dates for rows, as the values in that column are controlled by me -- the application or developer -- not Oracle.)

Note :  The History Table is not supposed to be directly queried in the manner I have shown here.
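Instead of hitting SYS_FBA_HIST_93250 directly, the supported route is a flashback query against the base table itself; for example, a Flashback Version Query along these lines (the time window is purely illustrative):

select id_column, data_column, versions_starttime, versions_endtime, versions_operation
from test_fbda
versions between timestamp (systimestamp - interval '1' hour) and systimestamp
where id_column < 6;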

Tomorrow :  More Rows, and some DDLs as well.

Post 2 here.  It covers some of the architectural components and support for ADD / DROP column DDL.

Post 3 here.  It shows support for TRUNCATEs.

Post 4 here.  On Partitions and Indexes.

Post 5 here.  On (Auto)Purging.

Post 6 here.  On Bug Notes
.
.
.


Categories: DBA Blogs

sql for first day of month and last day of month

Learn DB Concepts with me... - Fri, 2016-04-01 19:00

select SYSDATE ,
last_day(sysdate) as LAST_DATE_CURR_MNTH,
ADD_MONTHS(last_day(sysdate),-1) as PREVIOUS_MON_LAST_DATE,
last_day(sysdate)+1 as NEXT_MON_FIRST_DATE,
ADD_MONTHS(last_day(sysdate),+1) as NEXT_MON_LAST_DATE,
ADD_MONTHS(last_day(sysdate),+5) as LAST_DATE_OF_5TH_MON,
ADD_MONTHS(last_day(sysdate),+5) +1 as FIRST_DATE_IN_6TH_MON_AFTR_NOW  
from dual;

"SYSDATE"    "LAST_DATE_CURR_MNTH"    "PREVIOUS_MON_LAST_DATE"    "NEXT_MON_FIRST_DATE"    "NEXT_MON_LAST_DATE"    "LAST_DATE_OF_5TH_MON"   
-----------  ---------------------   ------------------------    ---------------------   --------------------    -----------------------
"FIRST_DATE_IN_6TH_MON_AFTR_NOW"
-----------------------

01-APR-16        30-APR-16                31-MAR-16                    01-MAY-16                31-MAY-16            30-SEP-16   
-----------------------
01-OCT-16
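A closely related value the query above doesn't include is the first day of the current month; a minimal sketch using TRUNC (the column aliases are just illustrative):

select sysdate,
       trunc(sysdate,'MM')                 as FIRST_DATE_CURR_MNTH,
       add_months(trunc(sysdate,'MM'),-1)  as PREVIOUS_MON_FIRST_DATE
from dual;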
Categories: DBA Blogs

Best practice for setting up MySQL replication filters

Pythian Group - Fri, 2016-04-01 13:23

It is not uncommon that we need to filter out some DBs or tables while setting up replication. It is important to understand how MySQL evaluates and processes the replication filtering rules, to avoid conflicts or confusion while setting them up. The purpose of this blog is to illustrate the rules and provide some suggestions for best practice.

MySQL provides 3 levels of filters for setting up replication: binary log, DB and table. The binlog filters apply on the master to control how the changes are logged. Since MySQL replication is based on the binlog, they form the first-level filter and have the highest priority. The DB-level and Table-level filters apply on the slaves; since each table belongs to a schema, the DB-level filters have higher priority than the Table-level ones. Within the Table-level filters, MySQL evaluates the options in the order: --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table, --replicate-wild-ignore-table.

Based on that, we have the following suggestions for setting up MySQL replication filters as best practice:

I) Do not set up any binlog-level filters unless you really need to and can afford losing the chance of having an extra full copy of the data changes for the master.

II) In DB-level filters, use either one or none of the two options: --replicate-do-db or --replicate-ignore-db. Never use both at the same time (see the sketch after this list).

III) While using binlog_format='statement' or 'mixed' (in mixed mode, if a transaction is deterministic it will be stored in statement format) with --replicate-do-db or --replicate-ignore-db set on the slaves, make sure you never change tables across the default database on the master; otherwise you might lose those changes on the slave because the default database does not match.

IV) In Table-level filters, use only one of the 4 options, or use the following two in combination: --replicate-ignore-table and --replicate-wild-do-table, to avoid conflicts and confusion.
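As an illustration of suggestions I) and II) only (the schema name here is made up), this would mean leaving the master's binlog filter options unset and putting at most one DB-level rule in the slave's option file:

[mysqld]
# master: no binlog-do-db / binlog-ignore-db entries at all (suggestion I)
# slave: at most one DB-level filter option, never both (suggestion II)
replicate-do-db=app_db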

For MariaDB, replication filters within a Galera cluster should be used with caution. As a general rule, except for InnoDB DML updates, the following replication filters are not honored in a Galera cluster: binlog-do-db, binlog-ignore-db, replicate-wild-do-db, replicate-wild-ignore-db. The replicate-do-db and replicate-ignore-db filters are honored for DDL and DML for both the InnoDB and MyISAM engines, but they might create discrepancies and replication may abort (see MDEV-421, MDEV-6229 and https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/). For slaves replicating from the cluster, the rules are similar to the normal replication settings described above.

Here are the details/reasons:

1) Binlog-level filters

A) How MySQL processes the Binlog-level filters

There are 2 options for setting a binlog filter on the master: --binlog-do-db and --binlog-ignore-db. MySQL checks --binlog-do-db first; if it is set, MySQL applies it and ignores --binlog-ignore-db. If --binlog-do-db is NOT set, then MySQL checks --binlog-ignore-db. If both of them are empty, it logs changes for all DBs.

See the examples below. In scenario 1) no binlog-level filters are set, so all changes were logged. In scenario 2) --binlog-do-db and --binlog-ignore-db are both set to m_test; changes on the DB m_test were logged and changes on the DB test were NOT logged. In scenario 3) only --binlog-ignore-db is set to m_test, so changes on the DB m_test were NOT logged and changes on the DB test were logged.
Scenario 1) --binlog-do-db and --binlog-ignore-db are NOT set:

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000003 |      120 |              |                  |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

mysql> show binlog events in "vm-01-bin.000003" from 120;

Empty set (0.00 sec)

mysql> insert into t1(id,insert_time) values(10,now());

Query OK, 1 row affected (0.05 sec)

 

mysql> show binlog events in "vm-01-bin.000003" from 120;

+——————+—–+————+———–+————-+—————————————————————+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                          |

+——————+—–+————+———–+————-+—————————————————————+

| vm-01-bin.000003 | 120 | Query      |         1 |         211 | BEGIN                                                         |

| vm-01-bin.000003 | 211 | Query      |         1 |         344 | use `m_test`; insert into t1(id,insert_time) values(10,now()) |

| vm-01-bin.000003 | 344 | Xid        |         1 |         375 | COMMIT /* xid=17 */                                           |

+——————+—–+————+———–+————-+—————————————————————+

3 rows in set (0.00 sec)

Scenario 2) --binlog-do-db=m_test and --binlog-ignore-db=m_test:

-- inserts into tables of DB m_test were logged

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000004 |      656 | m_test       | m_test           |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

 

mysql> use m_test

 

mysql> insert into t1(insert_time) values(now());

Query OK, 1 row affected (0.02 sec)

 

mysql> show binlog events in "vm-01-bin.000004" from 656;

+——————+—–+————+———–+————-+———————————————————+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                    |

+——————+—–+————+———–+————-+———————————————————+

| vm-01-bin.000004 | 656 | Query      |         1 |         747 | BEGIN                                                   |

| vm-01-bin.000004 | 747 | Intvar     |         1 |         779 | INSERT_ID=13                                            |

| vm-01-bin.000004 | 779 | Query      |         1 |         906 | use `m_test`; insert into t1(insert_time) values(now()) |

| vm-01-bin.000004 | 906 | Xid        |         1 |         937 | COMMIT /* xid=26 */                                     |

+——————+—–+————+———–+————-+———————————————————+

4 rows in set (0.00 sec)

-- inserts into tables of DB test were NOT logged

mysql> use test;

 

mysql> show master status ;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000004 |      937 | m_test       | m_test           |                   |

+——————+———-+————–+——————+——————-+

 

mysql> insert into t1(`a`) values('ab');

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in "vm-01-bin.000004" from 937;

Empty set (0.00 sec)

 

Scenario 3) --binlog-do-db not set and --binlog-ignore-db=m_test:

mysql> use m_test

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000005 |      120 |              | m_test           |                   |

+——————+———-+————–+——————+——————-+

mysql> insert into t1(insert_time) values(now());

Query OK, 1 row affected (0.01 sec)

 

mysql> show binlog events in "vm-01-bin.000005" from 120;

Empty set (0.00 sec)

 

mysql> use test

mysql> insert into t1(`a`) values('ba');

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in "vm-01-bin.000005" from 120;

+——————+—–+————+———–+————-+———————————————-+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                         |

+——————+—–+————+———–+————-+———————————————-+

| vm-01-bin.000005 | 120 | Query      |         1 |         199 | BEGIN                                        |

| vm-01-bin.000005 | 199 | Query      |         1 |         305 | use `test`; insert into t1(`a`) values('ba') |

| vm-01-bin.000005 | 305 | Xid        |         1 |         336 | COMMIT /* xid=22 */                          |

+——————+—–+————+———–+————-+———————————————-+

3 rows in set (0.00 sec)

 

B) Best practice for setting up the Binlog-level filters

So, for the Binlog-level filter, use at most one of the 2 options: --binlog-do-db to make MySQL log changes only for the DBs in the list, or --binlog-ignore-db to make MySQL log changes only for the DBs NOT in the list. Or leave both of them empty to log changes for all DBs.

However, we usually recommend NOT setting up any binlog-level filters. Logging changes for all DBs and setting up filters only on the slaves achieves the same purpose, and it leaves us with an extra full copy of the data changes on the master in case we need it for recovery.

 

2) DB-level filters

A) How MySQL processes the DB-level filters

There are 2 options for setting DB-level filters: --replicate-do-db and --replicate-ignore-db. MySQL processes these two filters in a similar way to the Binlog-level filters; the difference is that they apply ONLY on the slaves and so affect how the slaves replicate from their master. MySQL checks --replicate-do-db first; if it is set, it replicates the DBs in the list and ignores --replicate-ignore-db. If --replicate-do-db is NOT set, then MySQL checks --replicate-ignore-db and replicates all the DBs except the ones in that list. If both of them are empty, it replicates all the DBs. You can find the decision process in the chart at http://dev.mysql.com/doc/refman/5.7/en/replication-rules-db-options.html

There is a trick with DB-level filters, though, if binlog_format is set to statement or mixed. (binlog_format=mixed also applies here because, in mixed-mode replication, a deterministic transaction is resolved to statement format, which is equivalent to statement mode.) Since "with statement-based replication, the default database is checked for a match" (http://dev.mysql.com/doc/refman/5.7/en/replication-rules-db-options.html), if you set up --replicate-do-db and you update a table outside the default database on the master, the update statement will not be replicated when the default database you are running the command from is not in --replicate-do-db. For example, suppose there are 2 DBs on the master, you set binlog_format='statement' or 'mixed', and you set --replicate-do-db=DB1 on the slave. When you execute the following commands: use DB2; update DB1.t1 … the update will not be executed on the slave. To make the update statement replicate to the slave, you need to do: use DB1; update t1 …

For example, with binlog_format=statement or binlog_format=mixed, we insert into m_test.t1 in two ways: one using m_test as the default DB, the other using test as the default DB. Both changes are logged on the master. But on the slave, after it caught up, only the insert with default DB m_test was replicated; the insert with default DB test was NOT replicated. As shown below:

Scenario 1) binlog_format=statement

On the master: insert into m_test.t1 in two ways, one with default DB m_test and the other with default DB test; both changes are logged.

mysql> use m_test

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> delete from t1;

Query OK, 16 rows affected (0.02 sec)

 

mysql> select * from m_test.t1;

Empty set (0.00 sec)

 

mysql> use m_test

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in "vm-01-bin.000006" from 654;

+——————+——+————+———–+————-+—————————————————————-+

| Log_name         | Pos  | Event_type | Server_id | End_log_pos | Info                                                           |

+——————+——+————+———–+————-+—————————————————————-+

| vm-01-bin.000006 |  654 | Xid        |         1 |         685 | COMMIT /* xid=39 */                                            |

| vm-01-bin.000006 |  685 | Query      |         1 |         768 | BEGIN                                                          |

| vm-01-bin.000006 |  768 | Query      |         1 |         860 | use `m_test`; delete from t1                                   |

| vm-01-bin.000006 |  860 | Xid        |         1 |         891 | COMMIT /* xid=48 */                                            |

| vm-01-bin.000006 |  891 | Query      |         1 |         982 | BEGIN                                                          |

| vm-01-bin.000006 |  982 | Intvar     |         1 |        1014 | INSERT_ID=17                                                   |

| vm-01-bin.000006 | 1014 | Query      |         1 |        1148 | use `m_test`; insert into m_test.t1(insert_time) values(now()) |

| vm-01-bin.000006 | 1148 | Xid        |         1 |        1179 | COMMIT /* xid=52 */                                            |

| vm-01-bin.000006 | 1179 | Query      |         1 |        1268 | BEGIN                                                          |

| vm-01-bin.000006 | 1268 | Intvar     |         1 |        1300 | INSERT_ID=18                                                   |

| vm-01-bin.000006 | 1300 | Query      |         1 |        1432 | use `test`; insert into m_test.t1(insert_time) values(now())   |

| vm-01-bin.000006 | 1432 | Xid        |         1 |        1463 | COMMIT /* xid=60 */                                            |

+——————+——+————+———–+————-+—————————————————————-+

12 rows in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 18 | 2016-03-20 15:00:01 |

+—-+———————+

2 rows in set (0.00 sec)

 

On the slave: after it caught up, only the first insert (default DB m_test) was replicated, and the insert (default DB test) was NOT replicated.

mysql> show slave status\G

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                 Master_Host: 10.0.2.6

                 Master_User: repl

                 Master_Port: 3306

               Connect_Retry: 10

             Master_Log_File: vm-01-bin.000006

         Read_Master_Log_Pos: 1463

              Relay_Log_File: ewang-vm-03-relay-bin.000017

               Relay_Log_Pos: 1626

       Relay_Master_Log_File: vm-01-bin.000006

            Slave_IO_Running: Yes

           Slave_SQL_Running: Yes

             Replicate_Do_DB: m_test

         Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

     Replicate_Wild_Do_Table:

 Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

         Exec_Master_Log_Pos: 1463

             Relay_Log_Space: 1805

             Until_Condition: None

              Until_Log_File:

               Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

             Master_SSL_Cert:

           Master_SSL_Cipher:

              Master_SSL_Key:

       Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

               Last_IO_Errno: 0

               Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

 Replicate_Ignore_Server_Ids:

            Master_Server_Id: 1

                 Master_UUID: a22b3fb2-5e70-11e5-b55a-0800279d00c5

            Master_Info_File: /mysql/data/master.info

                   SQL_Delay: 0

         SQL_Remaining_Delay: NULL

     Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it

          Master_Retry_Count: 86400

                 Master_Bind:

     Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

           Executed_Gtid_Set:

               Auto_Position: 0

1 row in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

+—-+———————+

1 row in set (0.00 sec)

 

Scenario 2) binlog_format=mixed

In master:

mysql> show variables like 'binlog_format';

+—————+——-+

| Variable_name | Value |

+—————+——-+

| binlog_format | MIXED |

+—————+——-+

1 row in set (0.00 sec)

 

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000007 |      120 |              |                  |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

 

mysql> use m_test

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> show binlog events in "vm-01-bin.000007" from 120;

+——————+—–+————+———–+————-+—————————————————————-+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                           |

+——————+—–+————+———–+————-+—————————————————————-+

| vm-01-bin.000007 | 120 | Query      |         1 |         211 | BEGIN                                                          |

| vm-01-bin.000007 | 211 | Intvar     |         1 |         243 | INSERT_ID=19                                                   |

| vm-01-bin.000007 | 243 | Query      |         1 |         377 | use `m_test`; insert into m_test.t1(insert_time) values(now()) |

| vm-01-bin.000007 | 377 | Xid        |         1 |         408 | COMMIT /* xid=45 */                                            |

| vm-01-bin.000007 | 408 | Query      |         1 |         497 | BEGIN                                                          |

| vm-01-bin.000007 | 497 | Intvar     |         1 |         529 | INSERT_ID=20                                                   |

| vm-01-bin.000007 | 529 | Query      |         1 |         661 | use `test`; insert into m_test.t1(insert_time) values(now())   |

| vm-01-bin.000007 | 661 | Xid        |         1 |         692 | COMMIT /* xid=53 */                                            |

+——————+—–+————+———–+————-+—————————————————————-+

8 rows in set (0.00 sec)

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 18 | 2016-03-20 15:00:01 |

| 19 | 2016-03-20 15:09:14 |

| 20 | 2016-03-20 15:09:25 |

+—-+———————+

4 rows in set (0.00 sec)

 

In slave:

mysql> show variables like 'binlog_format';

+—————+——-+

| Variable_name | Value |

+—————+——-+

| binlog_format | MIXED |

+—————+——-+

1 row in set (0.00 sec)

 

mysql> show slave status\G

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                 Master_Host: 10.0.2.6

                 Master_User: repl

                 Master_Port: 3306

               Connect_Retry: 10

             Master_Log_File: vm-01-bin.000007

         Read_Master_Log_Pos: 692

              Relay_Log_File: ewang-vm-03-relay-bin.000023

               Relay_Log_Pos: 855

       Relay_Master_Log_File: vm-01-bin.000007

            Slave_IO_Running: Yes

           Slave_SQL_Running: Yes

             Replicate_Do_DB: m_test

         Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

     Replicate_Wild_Do_Table:

 Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

         Exec_Master_Log_Pos: 692

             Relay_Log_Space: 1034

             Until_Condition: None

              Until_Log_File:

               Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

             Master_SSL_Cert:

           Master_SSL_Cipher:

              Master_SSL_Key:

       Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

               Last_IO_Errno: 0

               Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

 Replicate_Ignore_Server_Ids:

            Master_Server_Id: 1

                 Master_UUID: a22b3fb2-5e70-11e5-b55a-0800279d00c5

            Master_Info_File: /mysql/data/master.info

                   SQL_Delay: 0

         SQL_Remaining_Delay: NULL

     Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it

          Master_Retry_Count: 86400

                 Master_Bind:

     Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

           Executed_Gtid_Set:

               Auto_Position: 0

1 row in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 19 | 2016-03-20 15:09:14 |

+—-+———————+

2 rows in set (0.00 sec)

 

B) Best practice for setting up the DB-level filters

Use either one or none of the two options: --replicate-do-db or --replicate-ignore-db. Never use both at the same time.

If you use binlog_format='statement' or 'mixed' and set up --replicate-do-db or --replicate-ignore-db on the slaves, make sure you never change tables across the default database; otherwise data discrepancies are to be expected on the slaves.

 

3) Table-level filters

There are 4 options for setting Table-level filters: --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table and --replicate-wild-ignore-table. MySQL evaluates the options in that order. You can find the decision process in the chart at http://dev.mysql.com/doc/refman/5.6/en/replication-rules-table-options.html

That chart shows that MySQL first checks --replicate-do-table; the tables listed there will be replicated and so won't be ignored by the later options such as --replicate-ignore-table or --replicate-wild-ignore-table. Then MySQL checks --replicate-ignore-table; the tables listed there will be ignored even if they show up in the later option --replicate-wild-do-table. The lowest priority is --replicate-wild-ignore-table.

B) Best practice for setting up the Table-level filters

Due to the priorities of the 4 Table-level options, to avoid confusion and conflicts we suggest using only one of the 4 options, or using the following two together: --replicate-ignore-table and --replicate-wild-do-table, so that it is clear that the tables in --replicate-ignore-table will be ignored and the tables matching --replicate-wild-do-table will be replicated.
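For illustration only (the schema and table names are made up), that combination might look like this in the slave's option file:

[mysqld]
# ignore one specific table explicitly
replicate-ignore-table=app_db.debug_log
# replicate everything else in the schema that matches the wildcard pattern
replicate-wild-do-table=app_db.%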

 

Categories: DBA Blogs

Deploying your Oracle MAF Apps on Windows Platform

As you may already know, Oracle Mobile Application Framework (MAF) 2.3 has been released, and one of its signature features is support for the Universal Windows Platform (UWP). This means that starting...

Categories: DBA Blogs

What Are Your Options For Migrating Enterprise Applications to the Cloud?

Pythian Group - Fri, 2016-04-01 08:16

Migrating your enterprise applications from on-premises infrastructure to the public cloud is attractive for a number of reasons. It eliminates the costs and complexities of provisioning hardware and managing servers, storage devices, and network infrastructure; it gives you more compute capacity per dollar without upfront capital investment; and you gain opportunities for innovation through easier access to new technologies, such as advanced analytical capabilities.

So how do you get there?

You have a few options. At one end of the spectrum, you could simply wait and rationalize, making continuous incremental changes to gain efficiencies. This is obviously a “slow burn” approach. In the middle is a “lift-and-shift” from your current environment into the public cloud. And at the far extreme, you could plunge right in and re-architect your applications—a costly and probably highly complex task.

 

In fact, a true migration “strategy” will involve elements of each of these. For example, you could perform short-term optimizations and migrations on a subset of applications that are ready for the cloud, while transforming the rest of your application stack over the longer term.

 

What to expect from the major public cloud platforms

There are three leading public cloud platforms: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). As Google doesn’t seem to be driving customers to lift-and-shift their applications to GCP, I’m going to focus on AWS and Azure as potential cloud destinations and, for specificity, take Oracle enterprise databases as the use case.

 

Amazon Web Services

You have two options for migrating Oracle databases to the AWS cloud: infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS).

 

Deploying Oracle applications in AWS IaaS is much like deploying them on your in-house infrastructure. You don’t get flexible licensing options, but you do have the ability to easily allocate more or less capacity as needed for CPU, memory, and storage. However, because AWS IaaS is virtualized infrastructure, you may experience slower performance due to suboptimal CPU core allocation or processor caches. You’ll also have less flexibility with instance sizes, network topology, storage performance tiers, and the like.

 

AWS Relational Database Service (RDS) for Oracle is a managed PaaS offering where, in addition to giving you the benefits of IaaS, Amazon takes on major DBA and system administrator tasks including provisioning, upgrades, backups, and multi-availability zone replication. This significantly simplifies your operations—but also results in less control over areas such as configuration, patching, and maintenance windows. AWS RDS for Oracle can also be used with a pay-as-you-go licensing model included in the hourly rate.

 

Microsoft Azure

Azure does not have a managed offering for Oracle databases, so the only way to run Oracle Database on Azure is through its IaaS platform. The benefits are very similar to AWS IaaS, but Azure offers additional licensing options (with Windows-based license-included images) and its instances are billed by the minute rather than by the hour. What’s important to keep in mind is that Azure is not as broadly adopted as AWS and offers less flexibility for storage performance tiers and instance sizes. Oracle Database software running on Windows is also not as common as running on Linux.

 

For more in-depth technical details on these options, I encourage you to read our white paper, Migrating Oracle Databases to Cloud. My next blog in this series will look at one other option not discussed here: migrating to Oracle Cloud.

migratingtocloud

Categories: DBA Blogs

CHANGE STANDBY DATABASE PROTECTION MODE

Learn DB Concepts with me... - Fri, 2016-04-01 08:00
SQL> select protection_mode from v$database;

PROTECTION_MODE
--------------------
MAXIMUM PERFORMANCE

SQL> show parameter log_archive_dest_2

NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2             string     SERVICE=ORCLSTB1 NOAFFIRM ASYN
                         C VALID_FOR=(ONLINE_LOGFILES,P
                         RIMARY_ROLE) DB_UNIQUE_NAME=OR
                         CLSTB1
log_archive_dest_20             string
log_archive_dest_21             string
log_archive_dest_22             string
log_archive_dest_23             string
log_archive_dest_24             string
log_archive_dest_25             string
log_archive_dest_26             string

NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_27             string
log_archive_dest_28             string
log_archive_dest_29             string
SQL> show parameter db_unique_name

NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
db_unique_name                 string     ORCL
SQL> show parameter log_archive_config

NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
log_archive_config             string     dg_config=(ORCL,ORCLSTB1,ORCLS
                         TB2)
SQL> alter system set log_archive_dest_2='SERVICE=ORCLSTB1 NOAFFIRM ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=ORCLSTB1';

System altered.

SQL> show parameter log_archive_dest_2

NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2             string     SERVICE=ORCLSTB1 NOAFFIRM ASYN
                         C VALID_FOR=(ONLINE_LOGFILES,P
                         RIMARY_ROLE) DB_UNIQUE_NAME=OR
                         CLSTB1
log_archive_dest_20             string
log_archive_dest_21             string
log_archive_dest_22             string
log_archive_dest_23             string
log_archive_dest_24             string
log_archive_dest_25             string
log_archive_dest_26             string

NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_27             string
log_archive_dest_28             string
log_archive_dest_29             string

SQL> alter system set log_archive_dest_2='SERVICE=ORCLSTB1 NOAFFIRM SYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=ORCLSTB1';

System altered.


SQL> alter database set standby database to maximize availability;

Database altered.

NOTE: You don't need to shut down your instance when changing the protection mode from MAXIMUM PERFORMANCE to MAXIMUM AVAILABILITY. But you do need to if you are going to MAXIMUM PROTECTION.
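A minimal sketch of that second path (assuming the redo transport destination is already set to SYNC and standby redo logs are in place) would be:

-- going to MAXIMUM PROTECTION requires a restart of the primary
shutdown immediate
startup mount
alter database set standby database to maximize protection;
alter database open;
select protection_mode from v$database;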

SQL> alter system switch logfile;

System altered.

SQL> select protection_mode from v$database;

PROTECTION_MODE
--------------------
MAXIMUM AVAILABILITY

SQL> archive log list;
Database log mode           Archive Mode
Automatic archival           Enabled
Archive destination           /u01/app/oracle/oraarch/
Oldest online log sequence     239
Next log sequence to archive   241
Current log sequence           241
SQL> select group#,bytes/1024/1024 from v$standby_log;

    GROUP# BYTES/1024/1024
---------- ---------------
     4        52
     5        52
     6        52
     7        52


SQL> select thread#,max(sequence#) from v$archived_log group by thread#;

   THREAD# MAX(SEQUENCE#)
---------- --------------
     1          240

SQL> select protection_mode from v$database;

PROTECTION_MODE
--------------------
MAXIMUM AVAILABILITY



SQL> alter system switch logfile;

System altered.

Categories: DBA Blogs


I am Oracle ACE

Oracle in Action - Thu, 2016-03-31 23:49

RSS content

It gives me immense pleasure to share the news that I have been honored with the prestigious Oracle ACE award. I am grateful to the Oracle ACE Program for accepting my nomination. I would like to thank Murali Vallath Sir, who nominated me for this award. I am also thankful to my family members, without whose support and motivation this would not have been possible.  Thanks a lot to all the readers of my blog, whose comments and suggestions helped me to learn and share whatever little knowledge I have.

I will do my best to participate in the Oracle ACE program.

 

 



Copyright © ORACLE IN ACTION [I am Oracle ACE], All Right Reserved. 2016.

The post I am Oracle ACE appeared first on ORACLE IN ACTION.

Categories: DBA Blogs

5 Phases for Migrating to a Cloud Platform

Pythian Group - Thu, 2016-03-31 13:11

Businesses today are increasingly looking to migrate to the cloud to realize lower costs and increase software velocity. They are now asking themselves “when” they should migrate rather than if they “should”, and with many vendors and solutions in the market, it can be difficult to take the first steps in creating a cloud strategy.   

In our latest on-demand webinar, Chris Presley, Solution Architect at Pythian, and Jim Bowyer, Solution Architect at Azure-Microsoft Canada, discuss a five phase framework for cloud transformations, and the benefits of migrating to the cloud with Microsoft Azure.

The five phase framework helps businesses understand the journey to successfully migrate current applications to a cloud platform. Here is a snapshot of the five phases:

 

1. Assessment: Analysis and Planning

A majority of the time investment should be upfront in assessment and preparation because it sets the stage for the actual development and migration, resulting in faster projects, lower costs, and less risk.

In this phase, businesses want to begin understanding the performance and user characteristics of their applications, and any other additional information that will be important during the transformation, such as regulatory, compliance, and legal requirements.

 

2. Preparation: POC, Validation and Final Road Map

The preparation phase is meant to help understand what the rest of the migration is going to look like.

While beneficial in any project, proof of concepts (POCs) are increasingly simple to create and are a great strength when leveraging the cloud. POCs are used to show some functionality and advantage early so you can get everyone – especially business owners – excited about the migration.

 

3. Build: Construct Infrastructure

Once the expectations around the final migration road map are developed, the infrastructure can be built. Jim discusses that beginning to think about automation during this phase is important, and Chris agrees, in particular with developing an automated test bed to help smooth out the migration.

 

4. Migration: Execute Transformation

The migration activity for cloud environments is very short. By this stage, if the planning and preparation has been done properly, “flicking the light switch” to the new environment should be seamless and feel like the easiest part.

Chris talks about creating both detailed success and rollback criteria and how they are both crucial for success in the migration phase. Jim mentions that Microsoft Azure provides a variety of tools to help make rollbacks easier and safer.

 

5. Optimization: IaaS Enhancements

Continually transforming and enhancing after the migration is complete is important for increasing software velocity, which is why businesses migrate to the cloud in the first place. While a piece of functionality may not be available today, it may be available tomorrow.

By going back to iterate and take advantage of new functionalities, businesses are able to squeeze out more improvements and create opportunities for differentiation.

 

Learn More

To learn about these five cloud transformation phases in more depth, and how to leverage the cloud with Microsoft Azure, download our free on-demand webinar.

Azure_Webinar (1)

Categories: DBA Blogs

Log file parallel write wait graph

Bobby Durrett's DBA Blog - Thu, 2016-03-31 09:50

I got a chance to use my onewait Python-based graph to help with a performance problem. I'm looking at slow write times from the log writer on Thursday mornings. Here is the graph with the database name erased:

log_file_parallel_write_waits

We are still trying to track down the source of the problem but there seems to be a backup on another system that runs at times that correspond to the spike in log file parallel write wait times. The nice thing about this graph is that it shows you activity on the top and average wait time on the bottom so you can see if the increased wait time corresponds to a spike in activity. In this case there does not seem to be any increase in activity on the problematic database.  But that makes sense if the real problem is contention by a backup on another system.

Anyway, my Python graphs are far from perfect but still helpful in this case.

Bobby

Categories: DBA Blogs

GoldenGate 12.2 Big Data Adapters: part 3 – Kafka

Pythian Group - Thu, 2016-03-31 09:39

This post continues my review of the GoldenGate Big Data adapters that started with the reviews of the HDFS and Flume adapters. Here is the list of all posts in the series:

  1. GoldenGate 12.2 Big Data Adapters: part 1 – HDFS
  2. GoldenGate 12.2 Big Data Adapters: part 2 – Flume
  3. GoldenGate 12.2 Big Data Adapters: part 3 – Kafka

In this article I will try the Kafka adapter and see how it works. Firstly, I think it may be worth reminding readers what Kafka is. Kafka is a streaming publisher-subscriber system. One can ask how it is different from Flume; I asked myself that question when I first heard about Kafka. I think one of the best comparisons between Flume and Kafka has been made by Gwen Shapira & Jeff Holoman in the blog post Apache Kafka for Beginners. In essence, Kafka is a general-purpose system where most of the control and consumer functionality relies on your own purpose-built consumer programs, whereas in Flume you have pre-created sources and sinks and can use interceptors for changing data. So, with Kafka you get at the destination exactly what you put in at the source. Kafka and Flume can work together pretty well, and in this article I am going to use them both.
Let's recall what we have in our configuration. We have an Oracle database running as a source, and Oracle GoldenGate for Oracle capturing changes for one schema in this database. We have OGG 12.2 and an integrated extract on the source. The replication goes directly to trail files on the destination side, where we have OGG for Big Data installed on a Linux box. You can get more details about the installation on source and target from the first post in the series. I've made the configuration as simple as possible, dedicating most attention to the Big Data adapter functionality, which is after all the main point of the article.

Having installed OGG for Big Data, we need to set up the Kafka adapter. As for the other adapters, we copy the sample configuration files from the $OGG_HOME/AdapterExamples/big-data directory.

bash$ cp $OGG_HOME/AdapterExamples/big-data/kafka/* $OGG_HOME/dirdat/

We need to adjust our kafka.props file to define the Kafka/Zookeeper topics for data and schema changes (the TopicName and SchemaTopicName parameters), and the gg.classpath for the Kafka and Avro Java classes. I left the rest of the parameters at their defaults, including the format for the changes, which was defined as "avro_op" in the example.

[oracle@sandbox oggbd]$ cat dirprm/kafka.props

gg.handlerlist = kafkahandler
gg.handler.kafkahandler.type = kafka
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties
gg.handler.kafkahandler.TopicName =oggtopic
gg.handler.kafkahandler.format =avro_op
gg.handler.kafkahandler.SchemaTopicName=mySchemaTopic
gg.handler.kafkahandler.BlockingSend =false
gg.handler.kafkahandler.includeTokens=false

gg.handler.kafkahandler.mode =tx
#gg.handler.kafkahandler.maxGroupSize =100, 1Mb
#gg.handler.kafkahandler.minGroupSize =50, 500Kb


goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE

gg.log=log4j
gg.log.level=INFO

gg.report.time=30sec

gg.classpath=dirprm/:/u01/kafka/libs/*:/usr/lib/avro/*:

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

[oracle@sandbox oggbd]$

The next file we have to adjust is custom_kafka_producer.properties, which contains information about our running Kafka server and defines some additional parameters such as compression. I left all the parameters unchanged except "bootstrap.servers", where I put the information about my Kafka service.

[oracle@sandbox oggbd]$ cat dirprm/custom_kafka_producer.properties
bootstrap.servers=sandbox:9092
acks=1
compression.type=gzip
reconnect.backoff.ms=1000

value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
# 100KB per partition
batch.size=102400
linger.ms=10000
[oracle@sandbox oggbd]$

If we plan an initial load through Kafka, we can use something like the parameter file I prepared for a passive replicat:

[oracle@sandbox oggbd]$ cat dirprm/irkafka.prm
-- Trail file for this example is located in "dirdat" directory
-- Command to run passive REPLICAT
-- ./replicat paramfile dirprm/irkafka.prm reportfile dirrpt/irkafka.rpt
SPECIALRUN
END RUNTIME
EXTFILE /u01/oggbd/dirdat/initld
--
TARGETDB LIBFILE libggjava.so SET property=dirprm/kafka.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP ggtest.*, TARGET bdtest.*;
[oracle@sandbox oggbd]$

Before starting any replicat we need to prepare our system to receive the data. Since Kafka itself is a pure streaming system, it cannot pass files to HDFS without another program or connector. In this first case we will have Kafka pass data to Flume, and Flume will use its sink to write to HDFS. Please be aware that you need Zookeeper to manage topics for Kafka. I am not going to discuss setting up Zookeeper in this article; just assume that we already have it up and running on port 2181.
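As a quick, hedged sanity check (assuming the nc utility is available on the box), Zookeeper's "ruok" four-letter command should answer "imok" when the service is healthy on that port:

# should print "imok" if Zookeeper is up and healthy on 2181
echo ruok | nc localhost 2181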
I used Kafka version 0.9.0.1, downloading it from http://kafka.apache.org/downloads.html. After downloading the archive I unpacked it, slightly adjusted the configuration, and started it in standalone mode.

[root@sandbox u01]# wget http://apache.parentingamerica.com/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz
--2016-03-15 15:22:09--  http://apache.parentingamerica.com/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz
Resolving apache.parentingamerica.com... 70.38.15.129
Connecting to apache.parentingamerica.com|70.38.15.129|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 35650542 (34M) [application/x-gzip]
Saving to: `kafka_2.11-0.9.0.1.tgz'

100%[=========================================================================================================================================>] 35,650,542  2.95M/s   in 16s

2016-03-15 15:22:26 (2.10 MB/s) - `kafka_2.11-0.9.0.1.tgz' saved [35650542/35650542]

[root@sandbox u01]# tar xfz kafka_2.11-0.9.0.1.tgz

[root@sandbox u01]# ln -s kafka_2.11-0.9.0.1 kafka

[root@sandbox u01]# cd kafka

[root@sandbox kafka]# vi config/server.properties
[root@sandbox kafka]# grep -v '^$\|^\s*\#' config/server.properties
broker.id=0
listeners=PLAINTEXT://:9092
num.network.threads=3

num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
delete.topic.enable=true
[root@sandbox kafka]#
[root@sandbox kafka]# nohup bin/kafka-server-start.sh config/server.properties > /var/log/kafka/server.log &
[1] 30669
[root@sandbox kafka]# nohup: ignoring input and redirecting stderr to stdout
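Before going further, a quick hedged check (just one of several ways to do it) that the broker actually came up and is listening on the port configured in server.properties; the log path is the one used in the nohup redirect above:

# the broker should be listening on port 9092
netstat -tlnp | grep :9092
# and the startup should be visible in the server log
grep -i "started" /var/log/kafka/server.log | tail -1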

Now we need to prepare our two topics for the data received from GoldenGate. As you remember, we have defined the topic "oggtopic" for our data flow using the gg.handler.kafkahandler.TopicName parameter in our kafka.props file, and the topic "mySchemaTopic" for schema changes. So, let's create the first topic using Kafka's supplied scripts:

[root@sandbox kafka]# bin/kafka-topics.sh --zookeeper sandbox:2181 --create --topic oggtopic --partitions 1 --replication-factor 1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/kafka_2.11-0.9.0.1/libs/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Created topic "oggtopic".
[root@sandbox kafka]# bin/kafka-topics.sh --zookeeper sandbox:2181 --list
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/kafka_2.11-0.9.0.1/libs/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
oggtopic
[root@sandbox kafka]#

As a matter of fact, all the necessary topics will also be created automatically when you start your GoldenGate replicat. You need to create a topic explicitly only if you want to use custom parameters for it; you also have the option to alter the topic later to adjust its configuration parameters.
Here is a description of our two topics, where one was created manually and the other was created automatically by the replicat process.

[root@sandbox kafka]# bin/kafka-topics.sh --zookeeper sandbox:2181 --describe --topic oggtopic
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/kafka_2.11-0.9.0.1/libs/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Topic:oggtopic	PartitionCount:1	ReplicationFactor:1	Configs:
	Topic: oggtopic	Partition: 0	Leader: 0	Replicas: 0	Isr: 0
[root@sandbox kafka]# bin/kafka-topics.sh --zookeeper sandbox:2181 --describe --topic mySchemaTopic
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/kafka_2.11-0.9.0.1/libs/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Topic:mySchemaTopic	PartitionCount:1	ReplicationFactor:1	Configs:
	Topic: mySchemaTopic	Partition: 0	Leader: 0	Replicas: 0	Isr: 0
[root@sandbox kafka]#
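For completeness, here is a hedged, purely illustrative sketch of creating a topic with a custom setting and altering it afterwards. The topic name oggtopic_test, the retention value, and the partition count are made-up examples, not values used in this setup:

# create a topic with a non-default retention
bin/kafka-topics.sh --zookeeper sandbox:2181 --create --topic oggtopic_test \
  --partitions 1 --replication-factor 1 --config retention.ms=604800000
# later, increase the number of partitions for the same topic
bin/kafka-topics.sh --zookeeper sandbox:2181 --alter --topic oggtopic_test --partitions 2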

In our configuration we have only one server and the simplest possible Kafka setup; in a real business case it can be far more complex. Our replicat is going to post data changes to the oggtopic topic and all schema definitions and changes to the mySchemaTopic topic. We have already mentioned that we are going to use Flume to write to HDFS, so I have prepared Flume with two sources and two sinks that write to the /user/oracle/ggflume HDFS directory. We could have split data and schema changes into different directories if we wished. Here is my configuration for Flume:

[root@sandbox ~]# cat /etc/flume-ng/conf/flume.conf
# Name/aliases for the components on this agent
agent.sources = ogg1 ogg2
agent.sinks = hdfs1 hdfs2
agent.channels = ch1 ch2

#Kafka source
agent.sources.ogg1.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.ogg1.zookeeperConnect = localhost:2181
agent.sources.ogg1.topic = oggtopic
agent.sources.ogg1.groupId = flume
agent.sources.ogg1.kafka.consumer.timeout.ms = 100

agent.sources.ogg2.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.ogg2.zookeeperConnect = localhost:2181
agent.sources.ogg2.topic = mySchemaTopic
agent.sources.ogg2.groupId = flume
agent.sources.ogg2.kafka.consumer.timeout.ms = 100

# Describe the sink
agent.sinks.hdfs1.type = hdfs
agent.sinks.hdfs1.hdfs.path = hdfs://sandbox/user/oracle/ggflume
agent.sinks.hdfs2.type = hdfs
agent.sinks.hdfs2.hdfs.path = hdfs://sandbox/user/oracle/ggflume
#agent.sinks.hdfs1.type = logger

# Use a channel which buffers events in memory
agent.channels.ch1.type = memory
agent.channels.ch1.capacity = 1001
agent.channels.ch1.transactionCapacity = 1000
agent.channels.ch2.type = memory
agent.channels.ch2.capacity = 1001
agent.channels.ch2.transactionCapacity = 1000

# Bind the source and sink to the channel
agent.sources.ogg1.channels = ch1
agent.sources.ogg2.channels = ch2
agent.sinks.hdfs1.channel = ch1
agent.sinks.hdfs2.channel = ch2

As you can see, we have a separate source for each of our Kafka topics, and two sinks pointing to the same HDFS location. The data is going to be written in Avro format.
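As mentioned earlier, splitting data and schema changes into separate directories would only require repointing the second sink; a hedged tweak (the ggflume_schema path is a made-up example) would be a single line change:

agent.sinks.hdfs2.hdfs.path = hdfs://sandbox/user/oracle/ggflume_schema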
All preparations are complete: the Kafka server is running, the two topics exist, and Flume is ready to write data to HDFS. Our HDFS directory is still empty.

[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
[oracle@sandbox oggbd]$
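(For reference, a Flume agent with the configuration above would typically be started with something like the following; the agent name "agent" matches the property prefixes in flume.conf, while the conf paths and logger options are assumptions.)

flume-ng agent -n agent -c /etc/flume-ng/conf -f /etc/flume-ng/conf/flume.conf \
  -Dflume.root.logger=INFO,console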

Let's run the passive replicat with our initial data load trail file:

[oracle@sandbox oggbd]$ cd /u01/oggbd
[oracle@sandbox oggbd]$ ./replicat paramfile dirprm/irkafka.prm reportfile dirrpt/irkafka.rpt
[oracle@sandbox oggbd]$

Now we can have a look at the results. We got three files on HDFS: the first two describe the structure of TEST_TAB_1 and TEST_TAB_2 respectively, and the third contains the data changes, or rather the initial data for those tables. You can see that the schema definitions were put into separate files, while the data changes were all posted to one file.

[oracle@sandbox ~]$ hadoop fs -ls /user/oracle/ggflume/
Found 3 items
-rw-r--r--   1 flume oracle       1833 2016-03-23 12:14 /user/oracle/ggflume/FlumeData.1458749691685
-rw-r--r--   1 flume oracle       1473 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691686
-rw-r--r--   1 flume oracle        981 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691718
[oracle@sandbox ~]$

[oracle@sandbox ~]$ hadoop fs -cat  /user/oracle/ggflume/FlumeData.1458749691685
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?????k?\??????S?A?%?{
  "type" : "record",
  "name" : "TEST_TAB_1",
  "namespace" : "BDTEST",
  "fields" : [ {
    "name" : "table",
    "type" : "string"
.........................


[oracle@sandbox ~]$ hadoop fs -cat  /user/oracle/ggflume/FlumeData.1458749691686
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?*
?e????xS?A?%N{
  "type" : "record",
  "name" : "TEST_TAB_2",
  "namespace" : "BDTEST",
  "fields" : [ {
    "name" : "table",
    "type" : "string"
  }, {


...............................

[oracle@sandbox ~]$hadoop fs -cat  /user/oracle/ggflume/FlumeData.1458749691718
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable??????c?C n??S?A?b"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-23T12:14:35.373000(00000000-10000002012
PK_ID1371O62FX&2014-01-24:19:09:20RJ68QYM5&2014-01-22:12:14:30"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-23T12:14:35.405000(00000000-10000002155
PK_ID2371O62FX&2014-01-24:19:09:20HW82LI73&2014-05-11:05:23:23"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-23T12:14:35.405001(00000000-10000002298
PK_ID3RXZT5VUN&2013-09-04:23:32:56RJ68QYM5&2014-01-22:12:14:30"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-23T12:14:35.405002(00000000-10000002441
PK_ID4RXZT5VUN&2013-09-04:23:32:56HW82LI73&2014-05-11:05:23:23"BDTEST.TEST_TAB_2I42016-02-16 19:17:40.76289942016-03-23T12:14:35.408000(00000000-10000002926
PK_IDRND_STR_1ACC_DATE7IJWQRO7T&2013-07-07:08:13:52[oracle@sandbox ~]$
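The raw output above is a Hadoop SequenceFile wrapping Avro records, so it is not meant to be read directly. As a hedged convenience (assuming the Flume HDFS sink's default SequenceFile file type), hadoop fs -text at least decodes the SequenceFile layer, although the Avro payload itself still needs a proper consumer or avro-tools for full deserialization:

hadoop fs -text /user/oracle/ggflume/FlumeData.1458749691718 | head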

Now we need to set up our ongoing replication. Our extract was configured the same way as described in the first post of the series. It is up and running, passing changes to the ./dirdat directory on the replicat side.

GGSCI (sandbox.localdomain) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     GGEXT       00:00:09      00:00:03


[oracle@sandbox oggbd]$ ls -l dirdat/
total 240
-rw-r-----. 1 oracle oinstall   3028 Feb 16 14:17 initld
-rw-r-----. 1 oracle oinstall 190395 Mar 14 13:00 or000041
-rw-r-----. 1 oracle oinstall   1794 Mar 15 12:02 or000042
-rw-r-----. 1 oracle oinstall  43222 Mar 17 11:53 or000043
[oracle@sandbox oggbd]$

I've prepared a parameter file for the Kafka replicat:

[oracle@sandbox oggbd]$ cat dirprm/rkafka.prm
REPLICAT rkafka
-- Trail file for this example is located in "AdapterExamples/trail" directory
-- Command to add REPLICAT
-- add replicat rkafka, exttrail dirdat/or, begin now
TARGETDB LIBFILE libggjava.so SET property=dirprm/kafka.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP GGTEST.*, TARGET BDTEST.*;

[oracle@sandbox oggbd]$

We only need to add and start our rkafka replicat in GoldenGate for Big Data.

GGSCI (sandbox.localdomain) 1> add replicat rkafka, exttrail dirdat/or, begin now
REPLICAT added.


GGSCI (sandbox.localdomain) 2> start replicat rkafka

Sending START request to MANAGER ...
REPLICAT RKAFKA starting


GGSCI (sandbox.localdomain) 3> info rkafka

REPLICAT   RKAFKA    Last Started 2016-03-24 11:53   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:06 ago)
Process ID           21041
Log Read Checkpoint  File dirdat/or000000000
                     2016-03-24 11:53:17.388078  RBA 0

You may remember that we don't have a dirdat/or000000000 file in our dirdat directory, so our replicat has to be adjusted slightly to work with the proper trail files. I am altering the sequence for my replicat to reflect the actual sequence number of my last trail file.

GGSCI (sandbox.localdomain) 10> stop replicat rkafka

Sending STOP request to REPLICAT RKAFKA ...
Request processed.


GGSCI (sandbox.localdomain) 11> alter replicat rkafka EXTSEQNO 43

2016-03-24 12:03:27  INFO    OGG-06594  Replicat RKAFKA has been altered through GGSCI. Even the start up position might be updated, duplicate suppression remains active in next startup. To override duplicate suppression, start RKAFKA with NOFILTERDUPTRANSACTIONS option.

REPLICAT altered.


GGSCI (sandbox.localdomain) 12> start replicat rkafka

Sending START request to MANAGER ...
REPLICAT RKAFKA starting


GGSCI (sandbox.localdomain) 13> info rkafka

REPLICAT   RKAFKA    Last Started 2016-03-24 12:03   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:12 ago)
Process ID           21412
Log Read Checkpoint  File dirdat/or000000043
                     First Record  RBA 0


GGSCI (sandbox.localdomain) 14>

Let’s change some data:

orclbd> select * from test_tab_2;

           PK_ID RND_STR_1  ACC_DATE
---------------- ---------- ---------------------------
               7 IJWQRO7T   07/07/13 08:13:52


orclbd> insert into test_tab_2 values (8,'TEST_INS1',sysdate);

1 row inserted.

orclbd> commit;

Commit complete.

orclbd>
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
Found 5 items
-rw-r--r--   1 flume oracle       1833 2016-03-23 12:14 /user/oracle/ggflume/FlumeData.1458749691685
-rw-r--r--   1 flume oracle       1473 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691686
-rw-r--r--   1 flume oracle        981 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691718
-rw-r--r--   1 flume oracle        278 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268086
-rw-r--r--   1 flume oracle       1473 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268130
[oracle@sandbox oggbd]$

[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458836268086
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?Q???n?y?1?R#S?j???"BDTEST.TEST_TAB_2I42016-03-24 16:17:29.00033642016-03-24T12:17:31.733000(00000000430000043889
PK_IDRND_STR_1ACC_DATE8TEST_INS1&2016-03-24:12:17:26[oracle@sandbox oggbd]$
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458836268130
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?6F!?Z?-?ZA8r^S?j?oN{
  "type" : "record",
  "name" : "TEST_TAB_2",
  "namespace" : "BDTEST",

We got our schema definition file and a file with data changes.

orclbd> update test_tab_2 set RND_STR_1='TEST_UPD1' where pk_id=8;

1 row updated.

orclbd> commit;

Commit complete.

orclbd>

[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
Found 6 items
-rw-r--r--   1 flume oracle       1833 2016-03-23 12:14 /user/oracle/ggflume/FlumeData.1458749691685
-rw-r--r--   1 flume oracle       1473 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691686
-rw-r--r--   1 flume oracle        981 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691718
-rw-r--r--   1 flume oracle        278 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268086
-rw-r--r--   1 flume oracle       1473 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268130
-rw-r--r--   1 flume oracle        316 2016-03-24 12:28 /user/oracle/ggflume/FlumeData.1458836877420
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458836877420
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable]??u????????qS?t,??"BDTEST.TEST_TAB_2U42016-03-24 16:27:39.00035642016-03-24T12:27:42.177000(00000000430000044052
PK_IDRND_STR_1ACC_DATE8TEST_INS1&2016-03-24:12:17:268TEST_UPD1&2016-03-24:12:17:26[oracle@sandbox oggbd]$

You can see that we got only a file with data changes, since no DDL changes were made. The transactions will be grouped into files according to our Flume parameters, as we discussed in the previous blog post.

You can also see the old value for the updated record alongside the new one. Using that information we can reconstruct the changes, but we need to apply some logic of our own to decode them.

For a delete operation we get the operation flag "F" and the values of the deleted record. Again, there is no schema definition file, since no DDL changes were made.

Let’s try some DDL.

orclbd> truncate table test_tab_2;

Table TEST_TAB_2 truncated.

orclbd>
GGSCI (sandbox.localdomain) 4> info rkafka

REPLICAT   RKAFKA    Last Started 2016-03-24 12:10   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:02 ago)
Process ID           21803
Log Read Checkpoint  File dirdat/or000043
                     2016-03-24 12:40:05.000303  RBA 45760


GGSCI (sandbox.localdomain) 5>

No new files on HDFS.

orclbd> insert into test_tab_2 select * from test_tab_3;

1 row inserted.

orclbd> commit;

Commit complete.

orclbd>
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
Found 8 items
-rw-r--r--   1 flume oracle       1833 2016-03-23 12:14 /user/oracle/ggflume/FlumeData.1458749691685
-rw-r--r--   1 flume oracle       1473 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691686
-rw-r--r--   1 flume oracle        981 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691718
-rw-r--r--   1 flume oracle        278 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268086
-rw-r--r--   1 flume oracle       1473 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268130
-rw-r--r--   1 flume oracle        316 2016-03-24 12:28 /user/oracle/ggflume/FlumeData.1458836877420
-rw-r--r--   1 flume oracle        278 2016-03-24 12:35 /user/oracle/ggflume/FlumeData.1458837310570
-rw-r--r--   1 flume oracle        277 2016-03-24 12:42 /user/oracle/ggflume/FlumeData.1458837743709
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458837743709
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable*?2??????>iS??\??"BDTEST.TEST_TAB_2I42016-03-24 16:42:04.00020042016-03-24T12:42:06.774000(00000000430000045760
PK_IDRND_STR_1ACC_DATE7IJWQRO7T&2013-07-07:08:13:52[oracle@sandbox oggbd]$

Again, we got only a file with data changes. I tried to compare the file we got for the previous insert with the one for the insert after the truncate, but could not find any difference except in the binary part of the Avro file. It will require additional investigation, and maybe clarification from Oracle. In the current state it looks like it would be easy to miss a truncate command for a table on the destination side.

Let us change the table and add a column there.

orclbd> alter table test_tab_2 add test_col varchar2(10);
Table TEST_TAB_2 altered.

orclbd>

We do not get any new files with the new table definition until we run some DML against the table. Both files (the new schema definition and the data changes) appear after we insert, delete or update any rows there.

orclbd> insert into test_tab_2 values (8,'TEST_INS1',sysdate,'TEST_ALTER');

1 row inserted.

orclbd> commit;

Commit complete.

orclbd>
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
Found 10 items
...................................................
-rw-r--r--   1 flume oracle       1654 2016-03-24 12:56 /user/oracle/ggflume/FlumeData.1458838582020
-rw-r--r--   1 flume oracle        300 2016-03-24 12:56 /user/oracle/ggflume/FlumeData.1458838584891
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458838582020
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable-??ip??/?w?S??/{
  "type" : "record",
  "name" : "TEST_TAB_2",
  "namespace" : "BDTEST",
................
        "name" : "TEST_COL",
        "type" : [ "null", "string" ],
        "default" : null
.................

[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458838584891
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritabletr?V?_$???:2??S??/w?"BDTEST.TEST_TAB_2I42016-03-24 16:56:04.00026042016-03-24T12:56:08.370000(00000000430000047682
PK_IDRND_STR_1ACC_DATETEST_COL8TEST_INS1&2016-03-24:12:56:01TEST_ALTER

I used JMeter to generate some load, and the replication easily handled 225 transactions per second (30% inserts, 80% updates) with almost no delays. It was not a test of Kafka or Flume, which could sustain far more load, but rather of the combination of GoldenGate with the Big Data infrastructure. It was stable, without any errors. I do understand that this test is very far from any potential production workflow, which may include Oracle Database (or any other RDBMS) + GoldenGate + Kafka + Storm + …, and the final data format may be completely different. So far the adapters are looking good and doing the job. In the next post I will look at the HBASE adapter. Stay tuned.

Categories: DBA Blogs

Log Buffer #467: A Carnival of the Vanities for DBAs

Pythian Group - Thu, 2016-03-31 08:40

This Log Buffer Edition brings some of the top blog posts from Oracle, SQL Server and MySQL.

Oracle:

An Exadata quarter rack has two database servers and three storage cells. In a typical setup, such a system would have three ASM disk groups, say DATA, RECO and DBFS_DG. Usually the disk group DATA would be high redundancy and the other two disk groups would be normal redundancy.

Best practice for calling web services from Oracle Process Cloud Service

2 Min Tech Tips at Oracle OpenWorld: Are You Ready for Your Close-Up?

Are your SQL Plus scripts going to ‘ell ?

New ways of input still on the verge of the enterprise

SQL Server:

Why Every SQL Server Installation Should Be a Cluster

When AUTO_UPDATE_STATISTICS Doesn’t Happen

Fixing Maintenance Plan Error code 0x534

SQL Server Table Smells

Some companies have been slow to acquire big data applications. They discovered that modern hardware platforms and database management systems were more than adequate for most of their business analytics needs.

MySQL:

Galera Cluster and Docker Swarm

MariaDB 10.1.13 and Connector/J 1.3.7 now available

Why an App-Centric View Isn’t Enough

How to Install and Configure MySQL Cluster on CentOS 7

Invalid datetime when converting to timestamp

Categories: DBA Blogs

New Oracle Cloud Offering – Indexing as a Service (IDXaaS) (I Pity The Fool)

Richard Foote - Thu, 2016-03-31 07:09
This of course is an April Fools joke. Sorry !! A very exciting announcement !! I’ve recently been promoted within Oracle Corporation to lead their brand new Oracle Cloud offering “Indexing as a Service” (IDXaaS) team, based on my previous work and experience in the indexing space. Yes, I’m both thrilled and excited to be […]
Categories: DBA Blogs

Python DBA Graphs Github Repository

Bobby Durrett's DBA Blog - Tue, 2016-03-29 16:40

I decided to get rid of the Github repository that I had experimented with and to create a new one. The old one had a dump of all my SQL scripts but without any documentation. But, I have updated my Python graphing scripts a bit at a time and have had some recent value from these scripts in my Oracle database tuning work. So, I created a Github repository called PythonDBAGraphs. I think it will be more valuable to have a repository that is more focused and is being actively updated and documented.

It is still very simple but I have gotten real value from the two graphs that are included.

Bobby

Categories: DBA Blogs

In Depth: MySQL 5.6+ DDL

Pythian Group - Tue, 2016-03-29 09:07
Overview

DDL (Data Definition Language) statements create, alter, and remove database objects. These types of changes can be very dangerous to make on such a critical piece of your infrastructure. You want to make sure that the command you are executing has been given proper thought and testing.

In this post I go through multiple versions of MySQL and verify the best course of action to take when executing DDL statements. There are many things you have to consider when making these types of changes, such as disk space, load on the database server, slave replication, the type of DDL statement you are executing, and whether it will lock the table.

Because of these risks, there are tools that can be used to help mitigate some of the dangers. But unless you have tested and verified their functionality, these tools can themselves cause trouble. Whenever in doubt, take the time to test and verify any changes that you will make.

In my testing I will be using pt-online-schema-change in particular, since it is a very popular tool and I have used it many times; the primary reason it was created was that, at the time, MySQL did not offer online DDL changes. In some cases, depending on your environment, the best course of action may be to remove the database server from being accessed, by failing over to a slave or taking a cluster node offline.

I will be focusing on the most common DDL statements, as I want to keep this post to a reasonable size. In MySQL 5.6 and later, many DDL statements use the INPLACE algorithm by default where they are able to; earlier versions (5.5, and 5.1 with the InnoDB plugin) had fast index creation, but all other table alters were blocking. Online DDL with the INPLACE algorithm allows MySQL to build the modified table in the background, copy the data to it, apply your table alters, and then swap the tables, all without locking the table. Some DDL statements can be done instantaneously, such as dropping an index or renaming a column. When MySQL isn't able to use the INPLACE algorithm, it has to fall back to the COPY algorithm, which will lock the table; an example of this is changing a column definition from VARCHAR to BLOB. Whenever you are doing an INPLACE alter, you will want to specify the algorithm in your command. This protects you in the case that MySQL is unable to do an INPLACE alter: MySQL will return an error rather than running the command with the COPY algorithm.


ALTER TABLE employee_test ALGORITHM=INPLACE, CHANGE COLUMN first_name first_name BLOB NULL;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.

All of my testing was done without specifying the algorithm, allowing MySQL to determine the best algorithm to use. If there are any DDL statements that you want more information on, please refer to the documentation for the release of MySQL that you are using; I will not be going into foreign keys.

The Setup

All of my testing was done in virtual machines (VMs) on my laptop. I have a VM running mysqlslap to perform remote DML statements such as SELECT, UPDATE, DELETE and INSERT, causing load on the database server. This allows me to see any potential table locks or performance impact. Here is the setup of the MySQL machine and its components. I created the table shown below and imported 10 million rows. While mysqlslap was running I performed each of the DDL statements and watched that the DML statements kept executing with no table locks. I then recorded the time as each DDL statement completed.
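Here is a hedged sketch of the kind of mysqlslap invocation used to generate this load; the host, credentials, concurrency and exact statements are illustrative assumptions rather than the values used in the test:

mysqlslap --host=192.168.56.101 --user=root -p \
  --create-schema=employees --concurrency=10 --number-of-queries=100000 \
  --delimiter=";" \
  --query="SELECT * FROM employee_test WHERE last_name='SmallGreenCat'; UPDATE employee_test SET first_name='BigPurpleDog' WHERE last_name='SmallGreenCat';"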

MySQL Server Stats
  • CPU : 4x CPUs at 2.6 GHz Intel Core i7
  • Memory allocated to VM : 2 Gig
  • Memory allocated to MySQL Innodb buffer pool: 1 Gig
  • Flash Storage
  • Table has 10 Million Rows.
  • DML (Data Manipulation Language) statements such as select, insert, update, and delete, that will be executed against the table during DDL statements
Table Structure
CREATE TABLE `employee_test` (
`emp_no` int(11) NOT NULL AUTO_INCREMENT,
`birth_date` date NOT NULL,
`first_name` varchar(14) NOT NULL,
`last_name` varchar(16) NOT NULL,
`gender` enum('M','F') NOT NULL,
`hire_date` date NOT NULL,
PRIMARY KEY (`emp_no`),
KEY `ix_lastname` (`last_name`),
KEY `ix_firstname` (`first_name`)
) ENGINE=InnoDB AUTO_INCREMENT=10968502 DEFAULT CHARSET=latin1
MySQL DDL Commands
CREATE INDEX ix_hire_date ON employee_test (hire_date); --CREATE INDEX
CREATE FULLTEXT INDEX ix_lastname_fulltext ON employee_test(last_name); --CREATE FULLTEXT INDEX
DROP INDEX ix_hire_date ON employee_test; --DROP INDEX
OPTIMIZE TABLE employee_test; --OPTIMIZE TABLE
ALTER TABLE employee_test ADD COLUMN test_column INT NULL; --ADD COLUMN
ALTER TABLE employee_test DROP COLUMN f_name; --DROP COLUMN
ALTER TABLE employee_test CHANGE first_name f_name varchar(14) NOT NULL; --RENAME COLUMN
ALTER TABLE employee_test MODIFY COLUMN emp_no BIGINT AUTO_INCREMENT NOT NULL; --CHANGE COLUMN TYPE
pt-online-schema-change DDL Commands
pt-online-schema-change --execute --alter 'ADD FULLTEXT INDEX ix_lastname_fulltext (last_name)' D=employees,t=employee_test
pt-online-schema-change --execute --alter 'ENGINE=INNODB' D=employees,t=employee_test
pt-online-schema-change --execute --alter 'ADD COLUMN test_column3 INT NULL' D=employees,t=employee_test
pt-online-schema-change --execute --alter 'MODIFY COLUMN gender BLOB NULL' D=employees,t=employee_test
Results

This matrix is a representation of the testing I performed and how quickly the commands executed. Be careful with fulltext indexes on your tables: creating the first fulltext index builds the necessary infrastructure in the background, and this requirement causes a great deal of locking on the table. Please see MySQL InnoDB Fulltext Indexes for more details.

DDL Matrix

pt-online-schema-change

For the DDL statements that cause locking of the table we wanted to look at incorporating pt-online-schema-change, to help us overcome this obstacle.

pt-online-schema-change results

pt-online-schema-change allowed us to perform the operations that previously locked the table, with no locking. pt-online-schema-change also has many other features, such as limiting the impact on slave replication and handling foreign keys. But it also has its limitations, such as not being able to run on a table that already has triggers, and complications with foreign keys. There can also be impacts on your environment if it is not properly tested and verified. One such example: every time I ran pt-online-schema-change in my test, it caused a deadlock, which made mysqlslap die and stop executing any further statements.

mysqlslap: Cannot run query UPDATE employee_test SET first_name = ‘BigPurpleDog’ WHERE last_name = ‘SmallGreenCat’; ERROR : Deadlock found when trying to get lock; try restarting transaction

This is why it is very important to determine what impact, if any, pt-online-schema-change may have on your environment before starting to use it. I did not encounter this behavior with any of the MySQL DDL statements that I ran.
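One cheap, hedged precaution is a --dry-run pass before --execute; per the tool's documentation it creates and alters the empty copy table but does not create triggers, copy rows, or swap tables:

pt-online-schema-change --dry-run --alter 'ADD COLUMN test_column3 INT NULL' D=employees,t=employee_test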

Performance Impact

While performing the changes there were consistent increases in CPU load, disk I/O, and disk usage as the new tables were being created for the table alters. We have to remember that when certain DDL statements are executed, a full copy of the table is made, so you will want to make sure you have enough disk space to complete the change. This is why it is very important to take into consideration the size of the table you are altering and the load on the MySQL server while performing DDL statements. It is preferable to run any DDL statements that cause table copies during off hours, so as to avoid delays or outages for the application that is using the data.
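A hedged sketch of a quick pre-flight check, comparing the on-disk size of the table with the free space in the datadir; the schema and table names come from this post, while the datadir path is an assumption:

mysql -e "SELECT ROUND((data_length+index_length)/1024/1024) AS size_mb
          FROM information_schema.tables
          WHERE table_schema='employees' AND table_name='employee_test';"
df -h /var/lib/mysql   # assumed datadir location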

Query Execution Impact

Query Execution Baseline

Server Performance Impact

MySQL Alter Load

Conclusion

As I observed in performing these tests, there are many things to consider when executing DDL statements in order to avoid potential pitfalls. Here is a summary of the recommendations for executing DDL statements or using pt-online-schema-change. Before considering any of this, determine whether the statement you are going to perform will copy the table, and if it does, make sure you have enough disk space.

Without Fulltext
With Fulltext

If you are going to make changes to your production servers, make sure that you run your DDL statements during off hours, when the server is at its lowest utilization for both CPU and disk.

As an added safety measure, when you are performing any MySQL DDL statement that you expect to be executed INPLACE without locking the table, make sure you specify ALGORITHM=INPLACE in your statement. If MySQL is unable to execute the command in place, it will just return an error instead of executing the statement with the COPY algorithm, which would lock the table. Here are samples of the DDL statements that you should be able to run INPLACE without causing any locking of your table.

ALTER TABLE employee_test ALGORITHM=INPLACE, ADD INDEX ix_hire_date (hire_date); --CREATE INDEX
ALTER TABLE employee_test ALGORITHM=INPLACE, DROP INDEX ix_firstname; --DROP INDEX
ALTER TABLE employee_test ALGORITHM=INPLACE, ENGINE=INNODB; --OPTIMIZE TABLE
ALTER TABLE employee_test ALGORITHM=INPLACE, ADD COLUMN test_column INT NULL; --ADD COLUMN
ALTER TABLE employee_test ALGORITHM=INPLACE, DROP COLUMN f_name; --DROP COLUMN
ALTER TABLE employee_test ALGORITHM=INPLACE, CHANGE first_name f_name varchar(14) NOT NULL; --RENAME COLUMN

 

 


Categories: DBA Blogs