Pythian Group

Love Your Data
Enabling Large Pages on Oracle Database 11g running on IBM AIX

Mon, 2016-04-04 17:30

To implement Large Pages on AIX, you first need to choose a large page size at the OS level.
AIX supports multiple page sizes: 4KB, 64KB, 16MB, and 16GB.

In this example we will be using a large page size of 16MB.

Steps for implementation:

1- Based on MOS Doc ID 372157.1, first enable Large Pages at the OS level:

# vmo -r  -o lgpg_size=16777216 -o lgpg_regions=<Total number of pages>
# vmo -o lru_file_repage=0
# vmo -p -o v_pinshm=1
# lsuser -a capabilities oracle
# chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
# bosboot -a

This requires a server reboot.

For complete instructions, please review note ID 372157.1.

2- Setting parameters at instance level

On AIX databases you only need to set LOCK_SGA to TRUE:

alter system set lock_sga=TRUE scope=spfile;

Note: On AIX databases, the USE_LARGE_PAGES parameter has NO impact.
This parameter is only valid for databases running on Linux; on AIX its value is ignored, even if it is set to FALSE.

By default, when Large Pages are available on AIX they will be used by database instances regardless of the USE_LARGE_PAGES parameter value. You only need to set LOCK_SGA.

3- Restart the instance and confirm Large Pages is in use:

After setting lock_sga, the instance must be restarted.
As explained above, when Large Pages are available at the OS level they will be used by the instance. The key point here is how to confirm whether Large Pages are in use or not.

How to check if Large Pages are used by the Oracle instance.

For Oracle 11g running on AIX, no informational message is written to the alert log, unlike what we see in the alert log of databases running on Linux.

So for a database instance running on AIX, do NOT expect the following lines in the alert log:

****************** Large Pages Information *****************
Total Shared Global Region in Large Pages = xx MB (100%)
Large Pages used by this instance: xxx (xxx MB)
Large Pages unused system wide = x (xxx MB) (alloc incr 4096 KB)
Large Pages configured system wide = xxx (xxx MB)
Large Page size = 16 MB
***********************************************************

The only way to make sure large pages are being used by the instance is to check memory usage at the OS level:

Assume SGA_TARGET in your instance is 8G.
The total number of Large Pages (with a size of 16M) will be 8G/16M + 1, which is: 8589934592 / 16777216 + 1 = 513
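
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb used in this example (SGA size divided by page size, plus one); the function name and the rounding up for sizes that are not a multiple of 16M are my own assumptions, not something taken from the MOS note.

```python
# Rough estimate of how many 16 MB large pages an SGA needs,
# following the rule of thumb in the text: SGA size / page size + 1.
LARGE_PAGE_BYTES = 16 * 1024 * 1024  # 16 MB, matches lgpg_size=16777216


def large_pages_needed(sga_bytes: int, page_bytes: int = LARGE_PAGE_BYTES) -> int:
    """Return ceil(sga_bytes / page_bytes) plus one extra page."""
    full_pages = -(-sga_bytes // page_bytes)  # ceiling division (assumption)
    return full_pages + 1


sga_target = 8 * 1024 ** 3  # 8G, as in the example instance
print(large_pages_needed(sga_target))  # 513
```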

Check the number of large 16M pages in use at OS level before starting your instance:

$ vmstat -P all

System configuration: mem=98304MB

pgsz            memory                           page
----- -------------------------- ------------------------------------
           siz      avm      fre    re    pi    po    fr     sr    cy
   4K  4420992  2926616   487687     0     0     5  1280   2756     0
  64K   582056   581916      754     0     0     0     0      0     0
  16M     2791       87     2704     0     0     0     0      0     0

In this example, the number of 16M pages in use before starting the instance is 87, out of a total of 2791 pages.

We start the instance with SGA size of 8G:

SQL> startup
ORACLE instance started.

Total System Global Area 8551575552 bytes
Fixed Size                  2238616 bytes
Variable Size            2348812136 bytes
Database Buffers         6190792704 bytes
Redo Buffers                9732096 bytes
Database mounted.
Database opened.
SQL> show parameter sga_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
sga_target                           big integer 8G

SQL> show parameter lock_sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     TRUE

Then we check the Large Pages in use again:

$ vmstat -P all

System configuration: mem=98304MB

pgsz            memory                           page
----- -------------------------- ------------------------------------
           siz      avm      fre    re    pi    po    fr     sr    cy
   4K  4428160  2877041   420004     0     0     5  1279   2754     0
  64K   581608   530522    51695     0     0     0     0      0     0
  16M     2791      600     2191     0     0     0     0      0     0

As you can see, the total number of 16M pages in use is now 600, which is exactly 513 pages more than it was before instance startup (600 - 87 = 513).
This shows that 16M pages have been used by our instance.
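
If you want to script this before/after comparison, a minimal Python sketch could look like the following. It assumes the row layout shown in the vmstat -P all output above (page size, siz, avm, fre, ...) and treats the avm column as "pages in use", as this example does; the function name is hypothetical.

```python
def pages_in_use(vmstat_output: str, page_size: str = "16M") -> int:
    """Return the avm column (pages in use) for one page size row."""
    for line in vmstat_output.splitlines():
        fields = line.split()
        # Data rows look like: pgsz siz avm fre re pi po fr sr cy
        if len(fields) >= 3 and fields[0] == page_size:
            return int(fields[2])
    raise ValueError(f"no row for page size {page_size}")


before = """
   4K  4420992  2926616   487687     0     0     5  1280   2756     0
  64K   582056   581916      754     0     0     0     0      0     0
  16M     2791       87     2704     0     0     0     0      0     0
"""
after = """
   4K  4428160  2877041   420004     0     0     5  1279   2754     0
  64K   581608   530522    51695     0     0     0     0      0     0
  16M     2791      600     2191     0     0     0     0      0     0
"""
delta = pages_in_use(after) - pages_in_use(before)
print(delta)  # 513
```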

You can also check memory usage of your instance by checking one of the instance processes like pmon:

$ ps -ef|grep pmon
  oracle 14024886 31392176   0 14:05:34  pts/0  0:00 grep pmon
  oracle 41681022        1   0   Mar 11      -  3:12 ora_pmon_KBS

Then check that the memory used by this process comes from 16M pages:

$ svmon -P 41681022

-------------------------------------------------------------------------------
     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
41681022 oracle         2180412  2109504     1778  2158599      Y     N     Y

     PageSize                Inuse        Pin       Pgsp    Virtual
     s    4 KB               31820          0       1650       9975
     m   64 KB                2959        516          8       2961
     L   16 MB                 513        513          0        513

I hope this will be useful for you, and good luck.

Categories: DBA Blogs

More Fun With Oracle Timestamp Math

Mon, 2016-04-04 14:23

Several years ago I wrote an article on Oracle date math.
Amazingly, that article was still available online at the time of this writing.

Working With Oracle Dates

An update to that article is long overdue.
While date math with the DATE data type is fairly well known and straightforward, date math with Oracle TIMESTAMP data is less well known and somewhat more difficult.

Data Types and Functions

Let’s begin by enumerating the data types and functions that will be discussed.

Datetime and Interval Data Types

Documentation for Datetime and Interval Data Types

  • Timestamp
  • Timestamp with Time Zone
  • Interval Day to Second
  • Interval Year to Month

Datetime Literals

Documentation for Datetime Literals

  • Date
  • Timestamp
  • Timestamp with Time Zone
  • Timestamp with Local Time Zone

Interval Literals

Documentation for Interval Literals

  • Interval Day to Second
  • Interval Year to Month

Datetime/Interval Arithmetic

Documentation for Datetime/Interval Arithmetic

There is no link to the heading; just scroll down the page until you find it.

Timestamp Functions

Documentation for Datetime Functions

There are quite a few of these available. Most readers will already be familiar with many of them, so only some of the more interesting functions related to timestamps will be covered.

  • extract
  • to_dsinterval
  • to_yminterval

Timestamp Internals

It is always interesting to have some idea of how different bits of technology work. In Working With Oracle Dates I showed how Date values are stored in the database, as well as how a Date stored in the database is somewhat different than a date variable.

Let’s start by storing some data in a timestamp column and comparing how it differs from systimestamp.

Test table for Timestamp Math Blog:

col c1_dump format a70
col c1 format a35
col funcname format a15

set linesize 200 trimspool on
set pagesize 60

drop table timestamp_test purge;

create table timestamp_test (
c1 timestamp
)
/

insert into timestamp_test values(systimestamp );
insert into timestamp_test values(systimestamp - 366);
commit;

select
'timestamp' funcname, c1, dump(c1) c1_dump
from timestamp_test
union all
select
'systimestamp' funcname, systimestamp, dump(systimestamp) systimestamp_dump
from dual
/

FUNCNAME        C1                                  C1_DUMP
--------------- ----------------------------------- ----------------------------------------------------------------------
timestamp       26-MAR-16 03.09.27.649491 PM -07:00 Typ=180 Len=11: 120,116,3,26,16,10,28,38,182,114,56
timestamp       26-MAR-15 03.09.27.000000 PM -07:00 Typ=180 Len=7: 120,115,3,26,16,10,28
systimestamp    26-MAR-16 03.09.27.687416 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,9,27,0,192,34,249,40,252,0,5,0,0,0,0,0

3 rows selected.

One of the first things you might notice is that the value for Typ is 180 for TIMESTAMP columns, but for SYSTIMESTAMP Typ=188.
The difference is due to TIMESTAMP being an internal data type as stored in the database, while SYSTIMESTAMP is dependent on the compiler used to create the executables.

Another difference is the length; the first TIMESTAMP column has a length of 11, whereas the SYSTIMESTAMP column’s length is 20. And what about that second TIMESTAMP column? Why is the length only 7?
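
When comparing many DUMP() results it can help to split them into their parts programmatically. The following Python sketch is only a convenience for reading the output shown in this article; it assumes the default decimal DUMP format ("Typ=n Len=n: b1,b2,...") and the function name is my own.

```python
import re


def parse_dump(dump_text: str):
    """Split 'Typ=180 Len=11: 120,116,...' into (type, length, bytes)."""
    match = re.fullmatch(r"Typ=(\d+) Len=(\d+): ([\d,]+)", dump_text.strip())
    if match is None:
        raise ValueError("unexpected DUMP format")
    type_code, length = int(match.group(1)), int(match.group(2))
    values = [int(v) for v in match.group(3).split(",")]
    if len(values) != length:
        raise ValueError("Len does not match the number of bytes")
    return type_code, length, values


type_code, length, values = parse_dump(
    "Typ=180 Len=11: 120,116,3,26,16,10,28,38,182,114,56"
)
print(type_code, length, values[:7])
```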

TIMESTAMP with length of 7

An example will show why the second row inserted into TIMESTAMP_TEST has a length of only 7.

  1* select dump(systimestamp) t1, dump(systimestamp-1) t2, dump(sysdate) t3 from dual
15:34:32 ora12c102rac01.jks.com - jkstill@js122a1 SQL- /

T1                                       T2                                       T3
---------------------------------------- ---------------------------------------- ----------------------------------------
Typ=188 Len=20: 224,7,3,22,22,34,35,0,16 Typ=13 Len=8: 224,7,3,21,18,34,35,0      Typ=13 Len=8: 224,7,3,22,18,34,35,0
0,181,162,17,252,0,5,0,0,0,0,0


1 row selected.

T2 was implicitly converted to the same data type as SYSDATE because standard date math was performed on it.

The same thing happened when the second row was inserted into TIMESTAMP_TEST.

Oracle implicitly converted the data to a DATE data type, and then implicitly converted it back to a timestamp. Only the standard date information is available following the first implicit conversion, so the fractional seconds are lost.

You may have noticed that in this example the length of the data is 8, while that stored in the table was 7. This is due to the use of SYSDATE, which is an external data type, whereas any data of DATE data type that is stored in the database is using an internal data type which always has a length of 7.
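
To see that the 7 stored bytes really carry only date and time down to the second, here is a small Python sketch that decodes them. It assumes the encoding described later in this article (century and year stored plus 100; hour, minute and second stored plus 1) and handles AD dates only; the function name is hypothetical.

```python
from datetime import datetime


def decode_internal_date(raw):
    """Decode the 7-byte internal form: century and year are stored plus 100,
    hour, minute and second are stored plus 1."""
    century, year, month, day, hour, minute, second = raw
    return datetime((century - 100) * 100 + (year - 100), month, day,
                    hour - 1, minute - 1, second - 1)


# Second row of TIMESTAMP_TEST from the DUMP output shown earlier
print(decode_internal_date([120, 115, 3, 26, 16, 10, 28]))
# 2015-03-26 15:09:27
```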

SYSTIMESTAMP Byte Values

Let’s see if we can determine how each byte is used in a SYSTIMESTAMP value.

The following SQL uses the current time as a baseline and starts at a point 10 seconds earlier, showing the timestamp value and the internal representation.

 

col t1 format a35
col t2 format a38
col dump_t1 format a70
col dump_t2 format a70

set linesize 250 trimspool on

/*
 using to_dsinterval() allows performing timestamp math without implicit conversions

see https://en.wikipedia.org/wiki/ISO_8601
 for an explanation of the PTnS notation being used in to_dsinterval()

*/

alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';

-- subtract 1 second from the current date
-- do it 10 times
select
 --systimestamp t1,
 --dump(systimestamp) dump_t1,
 systimestamp - to_dsinterval('PT' || to_char(level) || 'S') t2,
 dump(systimestamp - to_dsinterval('PT' || to_char(level) || 'S')) dump_t2
from dual connect by level <= 10
order by level desc
/


T2                                     DUMP_T2
-------------------------------------- ----------------------------------------------------------------------
26-MAR-16 03.34.55.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,55,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.34.56.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,56,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.34.57.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,57,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.34.58.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,58,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.34.59.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,59,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.00.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,0,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.01.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,1,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.02.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,2,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.03.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,3,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.04.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,4,0,152,108,205,20,252,0,5,0,0,0,0,0

10 rows selected.

 

From the output we can see that seconds are numbered 0-59, and that the 7th byte in the internal format is where the second is stored. We can also see that the month is represented by the 3rd byte, and the day by the 4th byte.

One would then logically expect the 5th byte to show us the hour. Given that the actual time is just after 3:30 PM, it seems curious that the value we expect to be the hour is 19 rather than 15.

The server where these queries are being run has a time zone of EDT. Next I ran the same queries on a server with a TZ of PDT, and though the time in the timestamp appeared as 3 hours earlier, the value stored in the 5th byte was still 19. Oracle is storing the hour in UTC time, then using the TZ from the server to get the actual time.

Playing with Time Zones

We can modify the local session time zone to find out how Oracle is calculating the times.

The first attempt is made on the remote client where scripts for this article are developed. The TZ will be set for Ethiopia and then the time checked at the Linux command line and in Oracle.

 

# date
Sat Mar 26 13:16:12 PDT 2016

# TZ='Africa/Addis_Ababa'; export TZ

# date
Sat Mar 26 23:16:17 EAT 2016

# sqlplus jkstill/XXX@p1
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
SQL- !date
Sat Mar 26 23:16:40 EAT 2016

SQL-  l
 1 select
 2 systimestamp t1,
 3 dump(systimestamp) dump_t1
 4* from dual
23:16:50 ora12c102rac01.jks.com - jkstill@js122a1 
SQL- /
T1                                  DUMP_T1
----------------------------------- ----------------------------------------------------------------------
26-MAR-16 04.16.56.254769 PM -04:00 Typ=188 Len=20: 224,7,3,26,20,16,56,0,104,119,47,15,252,0,5,0,0,0,0,0

Setting the TZ on the client clearly has no effect on the time returned from Oracle. Now let’s try while logged on to the database server.

$ date
Sat Mar 26 16:20:23 EDT 2016

$ TZ='Africa/Addis_Ababa'; export TZ

$ date
Sat Mar 26 23:20:38 EAT 2016

$ sqlplus / as sysdba
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

SQL- l
 1 select
 2 systimestamp t1,
 3 dump(systimestamp) dump_t1
 4* from dual
SQL- /

T1                                  DUMP_T1
----------------------------------- ----------------------------------------------------------------------
26-MAR-16 11.22.48.473298 PM +03:00 Typ=188 Len=20: 224,7,3,26,20,22,48,0,80,244,53,28,3,0,5,0,0,0,0,0

 

This experiment has demonstrated two things for us:

  1. Oracle is storing the hour as UTC time
  2. Setting the TZ on the client does not have any effect on the calculation of the time.
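
The two dumps above also hint at where the offset lives: byte 13 is 252 when the offset is -04:00 and 3 when it is +03:00. The Python sketch below treats bytes 13 and 14 as signed hour and minute offsets and applies them to the UTC hour and minute. That interpretation is an assumption inferred from these two samples only, not documented behavior, and the function names are my own.

```python
def signed_byte(value: int) -> int:
    """Interpret an unsigned byte (0-255) as a signed 8-bit integer."""
    return value - 256 if value > 127 else value


def local_hour_minute(dump_bytes):
    """Apply the offset in bytes 13-14 (assumed) to the UTC time in bytes 5-6."""
    utc_hour, utc_minute = dump_bytes[4], dump_bytes[5]
    offset_minutes = signed_byte(dump_bytes[12]) * 60 + signed_byte(dump_bytes[13])
    total = (utc_hour * 60 + utc_minute + offset_minutes) % (24 * 60)
    return divmod(total, 60)


client_run = [224, 7, 3, 26, 20, 16, 56, 0, 104, 119, 47, 15, 252, 0, 5, 0, 0, 0, 0, 0]
server_run = [224, 7, 3, 26, 20, 22, 48, 0, 80, 244, 53, 28, 3, 0, 5, 0, 0, 0, 0, 0]
print(local_hour_minute(client_run))  # (16, 16) -> 04.16 PM -04:00
print(local_hour_minute(server_run))  # (23, 22) -> 11.22 PM +03:00
```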

What About the Year?

Given the location of the month, it would be expected to find the year in the byte just before the month byte. There is not just one byte before the month, but two. You will recall that SYSTIMESTAMP has a different internal representation than does the TIMESTAMP data type. Oracle is using both of these bytes to store the year.

Working with this timestamp from an earlier example, we can use the information in Oracle Support Note 69028.1 to see how this works.

 

T2                                     DUMP_T2
-------------------------------------- ----------------------------------------------------------------------
26-MAR-16 03.34.55.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,55,0,152,108,205,20,252,0,5,0,0,0,0,0

 

For the timestamp of March 26, 2016, the first two bytes are used to represent the year.

The Formula for AD dates is Byte 1 + ( Byte 2 * 256 ). Using this formula the year 2016 can be arrived at:

224 + ( 7 * 256) = 2016
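
The same formula is simply a two-byte integer with the low byte first. The Python sketch below decodes the leading fields of the Typ=188 dump shown above. Reading bytes 9-12 as nanoseconds with the low byte first is my own inference from the sample data in this article (it reproduces .349007000), so treat it as an assumption; the layout may differ on other platforms.

```python
def decode_systimestamp(dump_bytes):
    """Decode the leading fields of a Typ=188 dump as observed in this article."""
    raw = bytes(dump_bytes)
    year = int.from_bytes(raw[0:2], "little")   # Byte 1 + Byte 2 * 256
    month, day, hour, minute, second = raw[2:7]  # hour is in UTC
    # Assumption: bytes 9-12 hold nanoseconds, low byte first
    nanos = int.from_bytes(raw[8:12], "little")
    return year, month, day, hour, minute, second, nanos


sample = [224, 7, 3, 26, 19, 34, 55, 0, 152, 108, 205, 20, 252, 0, 5, 0, 0, 0, 0, 0]
print(decode_systimestamp(sample))
# (2016, 3, 26, 19, 34, 55, 349007000)
```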

For the TIMESTAMP data type, the format is somewhat different for the year; actually it works the same way it does for the DATE data type, in excess 100 notation.

 

SQL- l
1* select c1, dump(c1) dump_c1 from timestamp_test where rownum < 2
SQL- /

C1                             DUMP_C1
------------------------------ ---------------------------------------------------
26-MAR-16 03.09.27.649491 PM   Typ=180 Len=11: 120,116,3,26,16,10,28,38,182,114,56

 

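The 1st byte holds the century plus 100 (120 - 100 = 20) and the 2nd byte holds the two-digit year plus 100 (116 - 100 = 16), which together give 2016.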
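
Excess-100 notation for the year can be illustrated with a short Python sketch that converts in both directions. It covers AD years only and the function names are hypothetical; it is meant to show the arithmetic, not Oracle's actual implementation.

```python
def year_to_bytes(year: int):
    """Return the (century, year) byte pair in excess-100 notation (AD years)."""
    return year // 100 + 100, year % 100 + 100


def bytes_to_year(century_byte: int, year_byte: int) -> int:
    """Reverse of year_to_bytes."""
    return (century_byte - 100) * 100 + (year_byte - 100)


print(year_to_bytes(2016))      # (120, 116)
print(year_to_bytes(1999))      # (119, 199)
print(bytes_to_year(120, 116))  # 2016
```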

Decode All Timestamp Components

Now let’s decode all of the data in a TIMESTAMP. First we need some TIMESTAMP test data.

Creating the test data

The following SQL will provide some test data for following experiments.

We may not use all of the columns or rows, but they are available.

 

drop table timestamp_tz_test purge;

create table timestamp_tz_test (
 id integer,
 c1 timestamp,
 c2 timestamp with time zone,
 c3 timestamp with local time zone
)
/

-- create 10 rows, each one second apart

begin
for i in 1..10
loop
 insert into timestamp_tz_test values(i,systimestamp,systimestamp,systimestamp );
 dbms_lock.sleep(1);
 null;
end loop;
commit;
end;
/

 

We already know that TIMESTAMP data can store fractional seconds to a billionth of a second.

Should you want to prove that to yourself, the following bit of SQL can be used to insert TIMESTAMP data into a table, with each row being 1E-9 seconds later than the previous row. This will be left as an exercise for the reader.

 

create table t2 as
select level id,
 to_timestamp('2016-03-29 14:25:42.5' || lpad(to_char(level),8,'0'),'yyyy-mm-dd hh24.mi.ssxff') c1
from dual
connect by level <= 1000
/

col dump_t1 format a70
col c1 format a35
col id format 99999999

select id, c1, substr(dump(c1),instr(dump(c1),',',-1,4)+1) dump_t1
from t2
order by id
/

 

Oracle uses 4 bytes at the end of a timestamp to store the fractional seconds, expressed in nanoseconds.

The value of the last (least significant) byte is used as shown.

Each preceding byte is multiplied by the next higher power of 256.
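
As a quick check of that rule, here is a Python sketch applied to the last four bytes of the first TIMESTAMP_TEST row dumped earlier (38,182,114,56), whose fractional seconds were .649491. The function name is my own.

```python
def fractional_seconds(last_four):
    """Combine the last four TIMESTAMP bytes, most significant byte first."""
    parts = [b * 256 ** (3 - i) for i, b in enumerate(last_four)]
    return parts, sum(parts)


parts, nanos = fractional_seconds([38, 182, 114, 56])
print(parts)   # [637534208, 11927552, 29184, 56]
print(nanos)   # 649491000 -> .649491 seconds
```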

The following SQL will make this more clear. Don’t spend too much time at first trying to understand the SQL, as it will become more clear after you see the results.

SQL to decode 1 row of TIMESTAMP data.

 

col id format 99
col t1 format a35
col dumpdata format a50
col tz_type format a10
col ts_component format a40
col label format a6
col real_value format a50

set linesize 200 trimspool on

alter session set nls_timestamp_format = 'yyyy-mm-dd hh24.mi.ssxff';
alter session set nls_timestamp_tz_format = 'yyyy-mm-dd hh24.mi.ssxff tzr';


with rawdata as (
 select c2 t1, dump(c2) dump_t1
 from timestamp_tz_test
 where id = 1
),
datedump as (
 select t1,
 substr(dump_t1,instr(dump_t1,' ',1,2)+1) dumpdata
 from rawdata
),
-- regex from http://nuijten.blogspot.com/2009/07/splitting-comma-delimited-string-regexp.html
datebits as (
 select level id, regexp_substr (dumpdata, '[^,]+', 1, rownum) ts_component
 from datedump
 connect by level <= length (regexp_replace (dumpdata, '[^,]+')) + 1
),
labeldata as (
 select 'TS,DU,CC,YY,MM,DD,HH,MI,SS,P1,P2,P3,P4' rawlabel from dual
),
labels as (
 select level-2 id, regexp_substr (rawlabel, '[^,]+', 1, rownum) label
 from labeldata
 connect by level <= length (regexp_replace (rawlabel, '[^,]+')) + 1
),
data as (
 select db.id, db.ts_component
 from datebits db
 union
 select 0, dumpdata
 from datedump dd
 union select -1, to_char(t1) from rawdata
)
select d.id, l.label, d.ts_component,
 case l.label
 when 'DU' then d.ts_component
 when 'CC' then 'Excess 100 - Real Value: ' || to_char(to_number((d.ts_component - 100)*100 ))
 when 'YY' then 'Excess 100 - Real Value: ' || to_char(to_number(d.ts_component - 100 ))
 when 'MM' then 'Real Value: ' || d.ts_component
 when 'DD' then 'Real Value: ' || d.ts_component
 when 'HH' then 'Excess 1 - Real Value: ' || to_char(to_number(d.ts_component)-1)
 when 'MI' then 'Excess 1 - Real Value: ' || to_char(to_number(d.ts_component)-1)
 when 'SS' then 'Excess 1 - Real Value: ' || to_char(to_number(d.ts_component)-1)
 when 'P1' then 'Fractional Second P1 : ' || to_char((to_number(d.ts_component) * POWER(256,3) ) / power(10,9))
 when 'P2' then 'Fractional Second P2 : ' || to_char((to_number(d.ts_component) * POWER(256,2) ) / power(10,9))
 when 'P3' then 'Fractional Second P3 : ' || to_char((to_number(d.ts_component) * 256 ) / power(10,9))
 when 'P4' then 'Fractional Second P4 : ' || to_char(to_number(d.ts_component) / power(10,9))
 end real_value
from data d
join labels l on l.id = d.id
order by 1
/

When the values for the Pn fractional second columns are added up, they will be equal to the fractional seconds shown in the timestamp (.488265).

 ID LABEL  TS_COMPONENT                             REAL_VALUE
--- ------ ---------------------------------------- --------------------------------------------------
 -1 TS     2016-03-31 09.14.29.488265 -07:00
  0 DU     120,116,3,31,17,15,30,29,26,85,40,13,60  120,116,3,31,17,15,30,29,26,85,40,13,60
  1 CC     120                                      Excess 100 - Real Value: 2000
  2 YY     116                                      Excess 100 - Real Value: 16
  3 MM     3                                        Real Value: 3
  4 DD     31                                       Real Value: 31
  5 HH     17                                       Excess 1 - Real Value: 16
  6 MI     15                                       Excess 1 - Real Value: 14
  7 SS     30                                       Excess 1 - Real Value: 29
  8 P1     29                                       Fractional Second P1 : .486539264
  9 P2     26                                       Fractional Second P2 : .001703936
 10 P3     85                                       Fractional Second P3 : .00002176
 11 P4     40                                       Fractional Second P4 : .00000004

13 rows selected.

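Time zones are recorded in an additional two bytes in the TIMESTAMP WITH TIME ZONE data type. TIMESTAMP WITH LOCAL TIME ZONE does not store a time zone with the column data; its values are normalized to the database time zone.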

Decoding those two bytes is left as an exercise for the reader.

Timestamp Arithmetic

Now that we have had some fun exploring and understanding how Oracle stores TIMESTAMP data, it is time to see how calculations can be performed on timestamps.

Note: See this ISO 8601 Article to understand the notation being used in to_dsinterval().
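
For readers unfamiliar with the notation, the following Python sketch parses the day-to-second subset of ISO 8601 durations (for example 'P1D', 'PT30S', 'PT1H15M42S') into a timedelta. It is only meant to illustrate how the pieces of the string map to days, hours, minutes and seconds; the regular expression and function name are my own and this is not how to_dsinterval() is implemented.

```python
import re
from datetime import timedelta

# P[nD][T[nH][nM][n[.n]S]]
DURATION = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?"
)


def parse_day_second(text: str) -> timedelta:
    """Parse the day-to-second subset of ISO 8601 durations, e.g. 'P1DT2H'."""
    match = DURATION.fullmatch(text)
    if match is None or not any(match.groupdict().values()) or text.endswith("T"):
        raise ValueError(f"not a day-to-second duration: {text!r}")
    fields = {k: float(v) for k, v in match.groupdict().items() if v}
    return timedelta(**fields)


print(parse_day_second("P1D"))          # 1 day, 0:00:00
print(parse_day_second("PT30S"))        # 0:00:30
print(parse_day_second("PT1H15M42S"))   # 1:15:42
```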

Interval Day to Second

It is a common occurrence to add or subtract time to or from Oracle Dates.

How that is done with the Oracle DATE data type is fairly well known.

  • Add 1 Day
    • DATE + 1
  • Add 1 Hour
    • DATE + (1/24)
  • Add 1 Minute
    • DATE + ( 1 / 1440)
  • Add 1 Second
    • DATE + (1/86400)
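
The fractions in the list above are simply parts of a day. A minimal Python sketch of the same idea follows, using exact fractions to avoid floating point noise. The function name is hypothetical, and truncating to whole seconds is a simplification of this sketch, chosen because DATE has no fractional seconds.

```python
from datetime import datetime, timedelta
from fractions import Fraction

SECONDS_PER_DAY = 86400


def add_days(moment: datetime, days: Fraction) -> datetime:
    """Add a (possibly fractional) number of days, truncated to whole seconds."""
    return moment + timedelta(seconds=int(days * SECONDS_PER_DAY))


now = datetime(2016, 3, 30, 13, 39, 6)
offset = Fraction(1, 24) + Fraction(15, 1440) + Fraction(42, 86400)
print(add_days(now, offset))            # 2016-03-30 14:54:48
print(add_days(now, Fraction(-1)))      # 2016-03-29 13:39:06
```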

Following is a brief refresher on that topic:

 

alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';

select sysdate today, sysdate -1 yesterday from dual;

select sysdate now, sysdate - (30/86400) "30_Seconds_Ago" from dual;

select sysdate now, sysdate + ( 1/24 ) + ( 15/1440 ) + ( 42/86400) "1:15:42_Later" from dual;

SQL- @date-calc

Session altered.

TODAY               YESTERDAY
------------------- -------------------
2016-03-30 13:39:06 2016-03-29 13:39:06

NOW                 30_Seconds_Ago
------------------- -------------------
2016-03-30 13:39:06 2016-03-30 13:38:36

NOW                 1:15:42_Later
------------------- -------------------
2016-03-30 13:39:06 2016-03-30 14:54:48

 

While this same method will work with timestamps, the results may not be what you expect. As noted earlier, Oracle will perform an implicit conversion to a DATE data type, resulting in truncation of some timestamp data. The next example makes it clear that implicit conversions have converted the TIMESTAMP to a DATE data type.

alter session set nls_timestamp_format = 'yyyy-mm-dd hh24.mi.ssxff';
alter session set nls_date_format = 'DD-MON-YY';

select systimestamp today, systimestamp -1 yesterday from dual;

select systimestamp now, systimestamp - (30/86400) "30_Seconds_Ago" from dual;

select systimestamp now, systimestamp + ( 1/24 ) + ( 15/1440 ) + ( 42/86400) "1:15:42_Later" from dual;

SQL- @timestamp-calc-incorrect

TODAY                                                                       YESTERDAY
--------------------------------------------------------------------------- ---------
2016-03-31 11.35.29.591223 -04:00                                           30-MAR-16

NOW                                                                         30_Second
--------------------------------------------------------------------------- ---------
2016-03-31 11.35.29.592304 -04:00                                           31-MAR-16

NOW                                                                         1:15:42_L
--------------------------------------------------------------------------- ---------
2016-03-31 11.35.29.592996 -04:00                                           31-MAR-16

 

Oracle has supplied functions to properly perform calculations on timestamps. The previous example will work properly when to_dsinterval is used, as seen in the next example.

 

col c30 head '30_Seconds_Ago' format a38
col clater head '1:15:42_Later' format a38
col now format a35
col today format a35
col yesterday format a38

alter session set nls_timestamp_format = 'yyyy-mm-dd hh24.mi.ssxff';
alter session set nls_timestamp_tz_format = 'yyyy-mm-dd hh24.mi.ssxff tzr';


-- alternate methods to subtract 1 day
select systimestamp today, systimestamp - to_dsinterval('P1D') yesterday from dual;
select systimestamp today, systimestamp - to_dsinterval('1 00:00:00') yesterday from dual;

-- alternate methods to subtract 30 seconds
select systimestamp now, systimestamp - to_dsinterval('PT30S') c30 from dual;
select systimestamp now, systimestamp - to_dsinterval('0 00:00:30') c30 from dual;

-- alternate methods to add 1 hour, 15 minutes and 42 seconds
select systimestamp now, systimestamp + to_dsinterval('PT1H15M42S') clater from dual;
select systimestamp now, systimestamp + to_dsinterval('0 01:15:42') clater from dual;

TODAY                               YESTERDAY
----------------------------------- --------------------------------------
2016-03-30 18.10.41.613813 -04:00   2016-03-29 18.10.41.613813000 -04:00

TODAY                               YESTERDAY
----------------------------------- --------------------------------------
2016-03-30 18.10.41.614480 -04:00   2016-03-29 18.10.41.614480000 -04:00

NOW                                 30_Seconds_Ago
----------------------------------- --------------------------------------
2016-03-30 18.10.41.615267 -04:00   2016-03-30 18.10.11.615267000 -04:00

NOW                                 30_Seconds_Ago
----------------------------------- --------------------------------------
2016-03-30 18.10.41.615820 -04:00   2016-03-30 18.10.11.615820000 -04:00

NOW                                 1:15:42_Later
----------------------------------- --------------------------------------
2016-03-30 18.10.41.616538 -04:00   2016-03-30 19.26.23.616538000 -04:00

NOW                                 1:15:42_Later
----------------------------------- --------------------------------------
2016-03-30 18.10.41.617161 -04:00   2016-03-30 19.26.23.617161000 -04:00

 

Extract Values from Timestamps

The values for years, months, days, hours, minutes and seconds can all be extracted from a timestamp via the extract function. The following code demonstrates a few uses of this, along with examples of retrieving intervals from two timestamps.

The values in parentheses for the day() and year() intervals specify the numeric precision to be returned.

 

def nls_tf='yyyy-mm-dd hh24.mi.ssxff'

alter session set nls_timestamp_format = '&nls_tf';

col d1_day format 999999
col full_interval format a30
col year_month_interval format a10

with dates as (
   select
      to_timestamp_tz('2014-06-19 14:24:29.373872', '&nls_tf') d1
      , to_timestamp_tz('2016-03-31 09:42:16.8734921', '&nls_tf') d2
   from dual
)
select
   extract(day from d1) d1_day
   , ( d2 - d1) day(4) to second full_interval
   , ( d2 - d1) year(3) to month year_month_interval
   , extract( day from d2 - d1) days_diff
   , extract( hour from d2 - d1) hours_diff
   , extract( minute from d2 - d1) minutes_diff
   , extract( second from d2 - d1) seconds_diff
from dates
/


 D1_DAY FULL_INTERVAL                  YEAR_MONTH  DAYS_DIFF HOURS_DIFF MINUTES_DIFF SECONDS_DIFF
------- ------------------------------ ---------- ---------- ---------- ------------ ------------
     19 +0650 19:17:47.499620          +001-09           650         19           17   47.4996201
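
For comparison, the same day, hour, minute and second breakdown can be sketched in Python with datetime subtraction. Python datetimes only carry microseconds, so d2 is entered here with six fractional digits; that is a simplification of this sketch, not a property of Oracle.

```python
from datetime import datetime

d1 = datetime(2014, 6, 19, 14, 24, 29, 373872)
d2 = datetime(2016, 3, 31, 9, 42, 16, 873492)  # microsecond precision only

diff = d2 - d1
# timedelta stores days, seconds within the day, and microseconds
hours, remainder = divmod(diff.seconds, 3600)
minutes, seconds = divmod(remainder, 60)
print(diff.days, hours, minutes, seconds, diff.microseconds)
# 650 19 17 47 499620
print(d1 + diff == d2)   # True
```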

Building on that, the following example demonstrates how the interval value that represents the difference between dates d1 and d2 can be added back to d1 to yield a date with the same value as d2.

 

def nls_tf='yyyy-mm-dd hh24.mi.ssxff'

alter session set nls_timestamp_format = '&nls_tf';

col d1 format a30
col d2 format a30
col full_interval format a30
col calc_date format a30

with dates as (
   select
      to_timestamp('2014-06-19 14:24:29.373872', '&nls_tf') d1
      , to_timestamp('2016-03-31 09:42:16.873492', '&nls_tf') d2
   from dual
)
select
   d1,d2
   , ( d2 - d1) day(4) to second  full_interval
   , d1 + ( d2 - d1) day(4) to second calc_date
from dates
/


D1                             D2                             FULL_INTERVAL                  CALC_DATE
------------------------------ ------------------------------ ------------------------------ ------------------------------
2014-06-19 14.24.29.373872000  2016-03-31 09.42.16.873492000  +0650 19:17:47.499620          2016-03-31 09.42.16.873492000

 

PL/SQL Interval Data Types

 

The ISO 8601 Article previously mentioned will be useful for understanding how time durations may be specified with interval functions.

The following combination of SQL and PL/SQL is used to convert the difference between two timestamps into seconds. The code is incomplete in the sense that the assumption is made that the largest component of the INTERVAL is hours. In the use case for this code that is true; however, there could also be days for larger values of the INTERVAL.

The following code is sampled from the script ash-waits-use.sql and uses PL/SQL to demonstrate the use of the INTERVAL DAY TO SECOND data type in PL/SQL.

 

var v_wall_seconds number
col wall_seconds new_value wall_seconds noprint

declare
	ash_interval interval day to second;
begin

	select max(sample_time) - min(sample_time) into ash_interval from v$active_session_history;


	select
		max(sample_time) - min(sample_time) into ash_interval
	from v$active_session_history
	where sample_time 
	between
		decode('&&snap_begin_time',
			'BEGIN',
			to_timestamp('1900-01-01 00:01','yyyy-mm-dd hh24:mi'),
			to_timestamp('&&snap_begin_time','yyyy-mm-dd hh24:mi')
		)
		AND
		decode('&&snap_end_time',
			'END',
			to_timestamp('4000-12-31 23:59','yyyy-mm-dd hh24:mi'),
			to_timestamp('&&snap_end_time','yyyy-mm-dd hh24:mi')
		);

	:v_wall_seconds := 
		(extract(hour from ash_interval) * 3600 )
		+ (extract(minute from ash_interval) * 60 )
		+ extract(second from ash_interval) ;
end;
/


select round(:v_wall_seconds,0) wall_seconds from dual;
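
If the interval can span more than a day, the day component has to be included as well. The following Python sketch shows the complete sum for a day-to-second interval and checks it against the built-in total_seconds(). The function name is my own, and this is an illustration of the arithmetic rather than a replacement for the PL/SQL above.

```python
from datetime import timedelta


def wall_seconds(interval: timedelta) -> float:
    """Sum every component of a day-to-second interval, days included."""
    hours, remainder = divmod(interval.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return (interval.days * 86400 + hours * 3600 + minutes * 60
            + seconds + interval.microseconds / 1_000_000)


sample = timedelta(days=2, hours=3, minutes=4, seconds=5)
print(wall_seconds(sample))                            # 183845.0
print(wall_seconds(sample) == sample.total_seconds())  # True
```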

 

Similarly, the to_yminterval function is used to perform timestamp calculations with years and months.


col clater head 'LATER' format a38
col now format a35
col today format a35
col lastyear format a38
col nextyear format a38

alter session set nls_timestamp_format = 'yyyy-mm-dd hh24.mi.ssxff';
alter session set nls_timestamp_tz_format = 'yyyy-mm-dd hh24.mi.ssxff tzr';

-- alternate methods to add 1 year
select systimestamp today, systimestamp + to_yminterval('P1Y') nextyear from dual;
select systimestamp today, systimestamp + to_yminterval('01-00') nextyear from dual;


-- alternate methods to subtract 2 months
select systimestamp now, systimestamp - to_yminterval('P2M') lastyear from dual;
select systimestamp now, systimestamp - to_yminterval('00-02') lastyear from dual;

-- alternate methods to add 2 years, 4 months, 2 days, 1 hour, 15 minutes and 42 seconds
select systimestamp now, systimestamp + to_yminterval('P2Y4M')  + to_dsinterval('P2DT1H15M42S') clater from dual;
select systimestamp now, systimestamp + to_yminterval('02-04')  + to_dsinterval('2 01:15:42') clater from dual;

TODAY                               YESTERDAY
----------------------------------- --------------------------------------
2016-03-31 09.06.22.060051 -07:00   2016-03-30 09.06.22.060051000 -07:00

TODAY                               YESTERDAY
----------------------------------- --------------------------------------
2016-03-31 09.06.22.061786 -07:00   2016-03-30 09.06.22.061786000 -07:00


NOW                                 30_Seconds_Ago
----------------------------------- --------------------------------------
2016-03-31 09.06.22.063641 -07:00   2016-03-31 09.05.52.063641000 -07:00


NOW                                 30_Seconds_Ago
----------------------------------- --------------------------------------
2016-03-31 09.06.22.064974 -07:00   2016-03-31 09.05.52.064974000 -07:00

NOW                                 1:15:42_Later
----------------------------------- --------------------------------------
2016-03-31 09.06.22.066259 -07:00   2016-03-31 10.22.04.066259000 -07:00


NOW                                 1:15:42_Later
----------------------------------- --------------------------------------
2016-03-31 09.06.22.067600 -07:00   2016-03-31 10.22.04.067600000 -07:00
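
Year and month arithmetic has one edge case worth testing: the shifted date may not exist (January 31 plus one month, for example). Oracle's datetime documentation notes that an interval calculation must produce an actual datetime value or an error is returned. The sketch below mimics that behaviour in Python; the helper name add_year_month and the sample timestamp are hypothetical, and this is an illustration rather than a reimplementation of to_yminterval.

```python
from datetime import datetime

def add_year_month(ts: datetime, years: int = 0, months: int = 0) -> datetime:
    """Shift ts by whole years and months, keeping the day and time as-is.

    Raises ValueError when the target month has no such day, instead of
    silently clamping to the last day of the month.
    """
    month_index = ts.year * 12 + (ts.month - 1) + years * 12 + months
    year, month0 = divmod(month_index, 12)
    return ts.replace(year=year, month=month0 + 1)

now = datetime(2016, 3, 31, 9, 6, 22)   # hypothetical stand-in for systimestamp
print(add_year_month(now, years=1))     # one year later
print(add_year_month(now, months=-2))   # two months earlier

try:
    add_year_month(datetime(2016, 1, 31), months=1)  # February 31 does not exist
except ValueError as exc:
    print("rejected:", exc)
```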

While date math with the DATE data type is somewhat arcane, it is not too complex once you understand how it works.

When Oracle introduced the TIMESTAMP data type, that all changed. Timestamps are much more robust than dates, and also more complex.

Timestamps bring a whole new dimension to working with dates and times; this brief introduction to working with timestamp data will help demystify the process of doing math with timestamps.

Categories: DBA Blogs

Migrate a SQL Server environment with complex replication without reinitializing or rebuilding replication

Mon, 2016-04-04 11:14

When you have a SQL Server environment with a very complex replication setup in place, and you need to migrate or move (without upgrading) some or all of the servers in the replication topology to new servers or virtual machines, or to a new data center or cloud, this blog post is for you!

Let’s assume you also have Transactional and/or Merge publications and subscriptions in place, and you need to move the publisher(s) and/or distributor(s) to a new environment. You also have one or more of the following restrictions:

  • You are not sure if the schema at the subscribers is identical to the publisher (e.g., different indexes, different columns, etc.).
  • You cannot afford downtime to reinitialize the subscriber(s)
  • There are too many subscribers to reinitialize and you cannot afford the downtime if anything goes wrong.

Here are the general steps for this migration:
Prior to the migration date:

  • The new instance must have the same SQL Server version, edition and patch level as the old instance. The Windows version and edition can differ, but you need to ensure that the version of Windows supports the version of SQL Server.
  • The directory structure for the SQL Server files should be identical on the new server and the old server, with the same permissions:
    • Same path for SQL Server binaries
    • Same path and database file names on both servers for system databases
    • Same directories where user database files and T-logs reside
    • Same path for the replication directories (where applicable)
  • Copy over any instance-level objects (logins, linked servers and jobs) to the new instance; leave jobs disabled where applicable, or stop SQL Server Agent on the new server

On migration date:

  • Disable any jobs, backups and maintenance that should run during the migration window on old server
  • Stop all database activity on old instance or disable logins
  • Restart old instance and verify there is no activity
  • Synchronize all replication agents that are related to the server being migrated
  • Stop and disable replication agents related to the server being migrated
  • Stop both instances
  • Copy over all system database files from old to new server
  • Copy over all user database files from old server to new one
    • Alternatively, backup all user databases on old server before stopping service and copy the files to new server
  • Shut down the old server
  • Rename the new server to the name of the old server and change the IP of the new server to the old server’s IP
  • Start the new server
  • Verify that the new instance reports the same name as the old server and that this name is registered as the local server
  • If you backed up the user databases previously, restore them to the same locations and file names as on the old server, with RECOVERY and KEEP_REPLICATION
  • Verify that all user databases are online and publications + subscribers are there
  • Start all replication agents related to the migrated server and verify replication is working properly
  • Verify that applications are able to connect to the new instance (no need to modify instance name as it is the same as before and same IP)

In any case, it is strongly recommended to test the migration prior to the real cutover, even if the test environment is not identical to Production, just to get a feel for it. Ensure that you include most of the replication scenarios you have in Production during your test phase.

The more scripts you have handy for the cutover date, the less downtime you may have.

It is extremely important to also have a good and tested rollback plan.

In future Blog posts I will discuss more complex replication scenarios to be migrated and rollback plans.

If you would like to make suggestions for future blogs, please feel free to add a comment and I will try to include your request in future posts.

Categories: DBA Blogs

Why You Should Consider Moving Your Enterprise Application to the Oracle Cloud

Mon, 2016-04-04 09:32

 

If you’ve decided to migrate your Oracle enterprise applications to the public cloud, it’s a good idea to consider Oracle Cloud alongside alternatives such as Amazon Web Services (AWS) and Microsoft Azure.

Oracle has made big strides in the cloud lately with platform-as-a-service (PaaS) offerings for its middleware and database software, culminating in the release of its first infrastructure-as-a-service (IaaS) offering in late 2015.

Oracle has a clear advantage over the competition when it comes to running its own applications in the cloud: it has full control over product licensing and can optimize its cloud platform for lift-and-shift migrations. This gives you a low-risk strategy for modernizing your IT portfolio.

 

What to expect from Oracle Cloud IaaS

Because Oracle’s IaaS offering is quite new, it has yet to match the flexibility and feature set of Azure and AWS. For example, enterprise VPN connectivity between cloud and on-premises infrastructure is still very much a work in progress. Unlike AWS, however, Oracle provides a free software appliance for accessing cloud storage on-premises. In addition to offering an hourly metered service, Oracle also provides unmetered compute capacity with a monthly subscription. Some customers prefer this option because it allows them to more easily control their spending through a predictable monthly fee rather than a pure pay-as-you-go model.

At the same time, Oracle Cloud IaaS has a limited selection of instance shapes, there is no SSD storage yet or guaranteed input/output performance levels, and transferring data is more challenging for large-volume migrations.

 

What to expect from Oracle Cloud PaaS

Oracle’s PaaS offerings are quickly becoming among the most comprehensive cloud-based services for Oracle Database. They include:

 

Oracle Database Schema Service

This is the entry-level unmetered offering, available starting at $175 a month for a 5GB database schema limit. Tenants share databases but are isolated in their own schemas. This means you have no control over database parameters, only the schema objects created. This service is currently available only with Oracle Database 11g Release 2 (i.e., it is not yet included in the latest release of Oracle Database 12c).

 

Oracle Exadata Cloud Service

This is a hosted service with monthly subscriptions starting at $70,000 for a quarter rack with 28 OCPUs enabled and 42TB of usable storage provisioned. You have full root OS access and SYSDBA database access, so you have total flexibility in managing your environment. However, this means Oracle manages only the bare minimum—the external networking and physical hardware—so you may end up expending the same effort as you would managing Exadata on-premises.

 

Oracle Database Virtual Image Service

This is a Linux VM with pre-installed Oracle Database software. The license is included in the rate. It’s available metered (priced per OCPU per hour of runtime) and unmetered (priced per OCPU allocated per month). As you’ll need to manage everything up from the VM level, including OS management and full DBA responsibilities, the metered service is a particularly good option for running production environments that require full control over the database deployment.

 

Oracle Database-as-a-Service (DBaaS)

This is an extension of Virtual Image Service and includes additional automation for database provisioning during service creation, backup, recovery, and patching. While you are still responsible for the complete management of the environment, the embedded automation and tooling can simplify some DBA tasks.

I should point out that, with the exception of Oracle Database Schema Service, these are not “true” PaaS offerings; they function more like IaaS-style services but with database software licenses included. A true PaaS option is on the way, however, as Oracle recently announced plans for a fully managed DBaaS offering similar to the one available through AWS.

 

While Oracle’s cloud options are still quite new and require additional features for broad enterprise adoption, if this option sparks your interest, now is the time to take the first steps. If you want to learn more about the migration path to Oracle Cloud, check out our white paper, Migrating Oracle Databases to Cloud.


Categories: DBA Blogs

Best practice for setting up MySQL replication filters

Fri, 2016-04-01 13:23

It is not uncommon to need to filter out some DBs or tables when setting up replication. It is important to understand how MySQL evaluates and processes the replication filtering rules, to avoid conflicts and confusion when setting them up. The purpose of this blog is to illustrate the rules and provide some suggestions for best practice.

MySQL provides 3 levels of filters for setting up replication: binary log, DB and table. The binlog filters apply on the master and control which changes are logged. Since MySQL replication is based on the binlog, this is the first-level filter and has the highest priority. The DB-level and table-level filters apply on the slaves; since each table belongs to a schema, the DB-level filters have higher priority than the table-level ones. Within the table-level filters, MySQL evaluates the options in this order: --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table, --replicate-wild-ignore-table.

Based on that, we suggest the following best practices for setting up MySQL replication filters:

I) Do not set up any binlog-level filters unless you really need to and can afford to lose the extra full copy of data changes on the master.

II) In DB-level filters, use at most one of the two options: --replicate-do-db or --replicate-ignore-db. Never use both at the same time.

III) When using binlog_format='statement' or 'mixed' (in mixed mode, a deterministic statement is stored in statement format) with --replicate-do-db or --replicate-ignore-db set on the slaves, never change tables outside the default database on the master; otherwise you might lose the changes on the slave because the default database does not match.

IV) In table-level filters, use only one of the 4 options, or use the combination of --replicate-ignore-table and --replicate-wild-do-table, to avoid conflicts and confusion.

Replication filters within a MariaDB Galera cluster should be used with caution, as they might create discrepancies and replication may abort (see MDEV-421, MDEV-6229). As a general rule, except for InnoDB DML updates, the following replication filters are not honored in a Galera cluster: binlog-do-db, binlog-ignore-db, replicate-wild-do-db, replicate-wild-ignore-db. However, the replicate-do-db and replicate-ignore-db filters are honored for DDL and DML for both the InnoDB and MyISAM engines (https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/). For slaves replicating from the cluster, the rules are similar to the normal replication settings above.

Here are the details/reasons:

1)Binlog-level filters

A) How MySQL processes the Binlog-level filters

There are 2 options for setting binlog filters on the master: --binlog-do-db and --binlog-ignore-db. MySQL checks --binlog-do-db first; if any are set, it applies them and ignores --binlog-ignore-db. If --binlog-do-db is NOT set, MySQL checks --binlog-ignore-db. If both are empty, it logs changes for all DBs.
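
The decision order just described can be sketched as a small function. This is a simplified model for illustration, not MySQL's implementation: the name is_logged is hypothetical, and db stands for the database the server checks (the default database under statement-based logging).

```python
def is_logged(db, binlog_do_db=(), binlog_ignore_db=()):
    """Return True if a change checked against db is written to the binlog."""
    if binlog_do_db:
        # --binlog-do-db wins; --binlog-ignore-db is not consulted at all
        return db in binlog_do_db
    if binlog_ignore_db:
        return db not in binlog_ignore_db
    # No binlog-level filters: everything is logged
    return True

# No filters set: all changes are logged
print(is_logged("m_test"))
# Both options set to m_test: only m_test is logged
print(is_logged("m_test", {"m_test"}, {"m_test"}), is_logged("test", {"m_test"}, {"m_test"}))
# Only --binlog-ignore-db=m_test: everything except m_test is logged
print(is_logged("m_test", (), {"m_test"}), is_logged("test", (), {"m_test"}))
```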

See the examples below. In scenario 1) no binlog-level filters are set, so all changes were logged. In scenario 2) --binlog-do-db and --binlog-ignore-db are both set to m_test; changes on the DB m_test were logged and changes on the DB test were NOT logged. In scenario 3) only --binlog-ignore-db is set to m_test, so changes on the DB m_test were NOT logged and changes on the DB test were logged.

scenario 1) --binlog-do-db and --binlog-ignore-db are NOT set:

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| vm-01-bin.000003 |      120 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql> show binlog events in "vm-01-bin.000003" from 120;
Empty set (0.00 sec)

mysql> insert into t1(id,insert_time) values(10,now());
Query OK, 1 row affected (0.05 sec)

mysql> show binlog events in "vm-01-bin.000003" from 120;
+------------------+-----+------------+-----------+-------------+---------------------------------------------------------------+
| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                          |
+------------------+-----+------------+-----------+-------------+---------------------------------------------------------------+
| vm-01-bin.000003 | 120 | Query      |         1 |         211 | BEGIN                                                         |
| vm-01-bin.000003 | 211 | Query      |         1 |         344 | use `m_test`; insert into t1(id,insert_time) values(10,now()) |
| vm-01-bin.000003 | 344 | Xid        |         1 |         375 | COMMIT /* xid=17 */                                           |
+------------------+-----+------------+-----------+-------------+---------------------------------------------------------------+
3 rows in set (0.00 sec)

scenario 2) --binlog-do-db=m_test and --binlog-ignore-db=m_test:

-- insert into tables of DB m_test was logged

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| vm-01-bin.000004 |      656 | m_test       | m_test           |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql> use m_test

mysql> insert into t1(insert_time) values(now());
Query OK, 1 row affected (0.02 sec)

mysql> show binlog events in "vm-01-bin.000004" from 656;
+------------------+-----+------------+-----------+-------------+---------------------------------------------------------+
| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                    |
+------------------+-----+------------+-----------+-------------+---------------------------------------------------------+
| vm-01-bin.000004 | 656 | Query      |         1 |         747 | BEGIN                                                   |
| vm-01-bin.000004 | 747 | Intvar     |         1 |         779 | INSERT_ID=13                                            |
| vm-01-bin.000004 | 779 | Query      |         1 |         906 | use `m_test`; insert into t1(insert_time) values(now()) |
| vm-01-bin.000004 | 906 | Xid        |         1 |         937 | COMMIT /* xid=26 */                                     |
+------------------+-----+------------+-----------+-------------+---------------------------------------------------------+
4 rows in set (0.00 sec)

-- insert into tables of DB test was NOT logged

mysql> use test;

mysql> show master status ;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| vm-01-bin.000004 |      937 | m_test       | m_test           |                   |
+------------------+----------+--------------+------------------+-------------------+

mysql> insert into t1(`a`) values('ab');
Query OK, 1 row affected (0.03 sec)

mysql> show binlog events in "vm-01-bin.000004" from 937;
Empty set (0.00 sec)

 

scenario 3) --binlog-do-db is not set and --binlog-ignore-db=m_test:

mysql> use m_test

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| vm-01-bin.000005 |      120 |              | m_test           |                   |
+------------------+----------+--------------+------------------+-------------------+

mysql> insert into t1(insert_time) values(now());
Query OK, 1 row affected (0.01 sec)

mysql> show binlog events in "vm-01-bin.000005" from 120;
Empty set (0.00 sec)

mysql> use test

mysql> insert into t1(`a`) values('ba');
Query OK, 1 row affected (0.03 sec)

mysql> show binlog events in "vm-01-bin.000005" from 120;
+------------------+-----+------------+-----------+-------------+----------------------------------------------+
| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                         |
+------------------+-----+------------+-----------+-------------+----------------------------------------------+
| vm-01-bin.000005 | 120 | Query      |         1 |         199 | BEGIN                                        |
| vm-01-bin.000005 | 199 | Query      |         1 |         305 | use `test`; insert into t1(`a`) values('ba') |
| vm-01-bin.000005 | 305 | Xid        |         1 |         336 | COMMIT /* xid=22 */                          |
+------------------+-----+------------+-----------+-------------+----------------------------------------------+
3 rows in set (0.00 sec)

 

B)Best practice for setting up the Binlog-level filters

So, for binlog-level filters, use at most one of the 2 options: --binlog-do-db to make MySQL log changes only for the DBs in the list, or --binlog-ignore-db to make MySQL log changes for the DBs NOT in the list. Or leave both of them empty to log changes for all the DBs.

However, we usually recommend NOT setting up any binlog-level filters. Logging changes for all DBs and setting up filters only on the slaves achieves the same purpose, and gives us an extra full copy of the master's data changes in case we need it for recovery.

 

2)DB-level filters

A) How MySQL processes the DB-level filters

There are 2 options for setting DB-level filters: --replicate-do-db and --replicate-ignore-db. MySQL processes these two filters in a similar way to the binlog-level filters; the difference is that they apply ONLY on the slaves, and so affect how the slaves replicate from their master. MySQL checks --replicate-do-db first; if any are set, it replicates the DBs in the list and ignores --replicate-ignore-db. If --replicate-do-db is NOT set, MySQL checks --replicate-ignore-db and replicates all the DBs except the ones in that list. If both of them are empty, it replicates all the DBs. The process is shown in the chart at http://dev.mysql.com/doc/refman/5.7/en/replication-rules-db-options.html

There is a catch with DB-level filters, though, if binlog_format is set to statement or mixed. (Mixed applies here because in mixed mode a deterministic statement is logged in statement format, which is equivalent to statement mode.) As the manual puts it, "With statement-based replication, the default database is checked for a match." (http://dev.mysql.com/doc/refman/5.7/en/replication-rules-db-options.html). If you set --replicate-do-db and update a table outside the default database on the master, the update will not be replicated if the default database you run the command from is not listed in --replicate-do-db. For example, suppose there are 2 DBs on the master, binlog_format is 'statement' or 'mixed', and --replicate-do-db=DB1 is set on the slave. If you execute: use DB2; update DB1.t1 ... the update will not be executed on the slave. To have the update replicated to the slave, you need to run: use DB1; update t1 ...
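
The default-database catch can be modelled in a few lines. This is a simplified sketch, not MySQL's implementation: the function name db_filter_allows and its parameters are hypothetical, and "mixed" is treated like "statement" on the assumption that the event was logged in statement form.

```python
def db_filter_allows(default_db, table_db, binlog_format,
                     replicate_do_db=(), replicate_ignore_db=()):
    """Return True if the slave applies the event under the DB-level filters."""
    # Statement events are matched on the default database (set by USE);
    # row events are matched on the database of the table being changed.
    checked_db = table_db if binlog_format == "row" else default_db
    if replicate_do_db:
        return checked_db in replicate_do_db
    if replicate_ignore_db:
        return checked_db not in replicate_ignore_db
    return True

# use DB2; update DB1.t1 ...  with --replicate-do-db=DB1 on the slave
print(db_filter_allows("DB2", "DB1", "statement", replicate_do_db={"DB1"}))
# use DB1; update t1 ...
print(db_filter_allows("DB1", "DB1", "statement", replicate_do_db={"DB1"}))
```

With binlog_format set to "row" the first call would return True, because the table's own database is the one checked.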

For example, with binlog_format=statement or binlog_format=mixed, we insert into m_test.t1 in two ways: once with m_test as the default DB and once with test as the default DB. Both changes are logged on the master. On the slave, after it caught up, only the insert made with default DB m_test was replicated; the insert made with default DB test was NOT replicated, as shown below:

Scenario 1) binlog_format=statement

On the master: insert into m_test.t1 in two ways, once with default DB m_test and once with default DB test; both changes are logged

mysql> use m_test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> delete from t1;
Query OK, 16 rows affected (0.02 sec)

mysql> select * from m_test.t1;
Empty set (0.00 sec)

mysql> use m_test
Database changed
mysql> insert into m_test.t1(insert_time) values(now());
Query OK, 1 row affected (0.04 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> insert into m_test.t1(insert_time) values(now());
Query OK, 1 row affected (0.03 sec)

mysql> show binlog events in "vm-01-bin.000006" from 654;
+------------------+------+------------+-----------+-------------+----------------------------------------------------------------+
| Log_name         | Pos  | Event_type | Server_id | End_log_pos | Info                                                           |
+------------------+------+------------+-----------+-------------+----------------------------------------------------------------+
| vm-01-bin.000006 |  654 | Xid        |         1 |         685 | COMMIT /* xid=39 */                                            |
| vm-01-bin.000006 |  685 | Query      |         1 |         768 | BEGIN                                                          |
| vm-01-bin.000006 |  768 | Query      |         1 |         860 | use `m_test`; delete from t1                                   |
| vm-01-bin.000006 |  860 | Xid        |         1 |         891 | COMMIT /* xid=48 */                                            |
| vm-01-bin.000006 |  891 | Query      |         1 |         982 | BEGIN                                                          |
| vm-01-bin.000006 |  982 | Intvar     |         1 |        1014 | INSERT_ID=17                                                   |
| vm-01-bin.000006 | 1014 | Query      |         1 |        1148 | use `m_test`; insert into m_test.t1(insert_time) values(now()) |
| vm-01-bin.000006 | 1148 | Xid        |         1 |        1179 | COMMIT /* xid=52 */                                            |
| vm-01-bin.000006 | 1179 | Query      |         1 |        1268 | BEGIN                                                          |
| vm-01-bin.000006 | 1268 | Intvar     |         1 |        1300 | INSERT_ID=18                                                   |
| vm-01-bin.000006 | 1300 | Query      |         1 |        1432 | use `test`; insert into m_test.t1(insert_time) values(now())   |
| vm-01-bin.000006 | 1432 | Xid        |         1 |        1463 | COMMIT /* xid=60 */                                            |
+------------------+------+------------+-----------+-------------+----------------------------------------------------------------+
12 rows in set (0.00 sec)

mysql> select * from m_test.t1;
+----+---------------------+
| id | insert_time         |
+----+---------------------+
| 17 | 2016-03-20 14:59:41 |
| 18 | 2016-03-20 15:00:01 |
+----+---------------------+
2 rows in set (0.00 sec)

 

On the slave: after it caught up, only the first insert (default DB m_test) was replicated; the insert with default DB test was NOT replicated

mysql> show slave status\G
*************************** 1. row ***************************
              Slave_IO_State: Waiting for master to send event
                 Master_Host: 10.0.2.6
                 Master_User: repl
                 Master_Port: 3306
               Connect_Retry: 10
             Master_Log_File: vm-01-bin.000006
         Read_Master_Log_Pos: 1463
              Relay_Log_File: ewang-vm-03-relay-bin.000017
               Relay_Log_Pos: 1626
       Relay_Master_Log_File: vm-01-bin.000006
            Slave_IO_Running: Yes
           Slave_SQL_Running: Yes
             Replicate_Do_DB: m_test
         Replicate_Ignore_DB:
          Replicate_Do_Table:
      Replicate_Ignore_Table:
     Replicate_Wild_Do_Table:
 Replicate_Wild_Ignore_Table:
                  Last_Errno: 0
                  Last_Error:
                Skip_Counter: 0
         Exec_Master_Log_Pos: 1463
             Relay_Log_Space: 1805
             Until_Condition: None
              Until_Log_File:
               Until_Log_Pos: 0
          Master_SSL_Allowed: No
          Master_SSL_CA_File:
          Master_SSL_CA_Path:
             Master_SSL_Cert:
           Master_SSL_Cipher:
              Master_SSL_Key:
       Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
               Last_IO_Errno: 0
               Last_IO_Error:
              Last_SQL_Errno: 0
              Last_SQL_Error:
 Replicate_Ignore_Server_Ids:
            Master_Server_Id: 1
                 Master_UUID: a22b3fb2-5e70-11e5-b55a-0800279d00c5
            Master_Info_File: /mysql/data/master.info
                   SQL_Delay: 0
         SQL_Remaining_Delay: NULL
     Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
          Master_Retry_Count: 86400
                 Master_Bind:
     Last_IO_Error_Timestamp:
    Last_SQL_Error_Timestamp:
              Master_SSL_Crl:
          Master_SSL_Crlpath:
          Retrieved_Gtid_Set:
           Executed_Gtid_Set:
               Auto_Position: 0
1 row in set (0.00 sec)

mysql> select * from m_test.t1;
+----+---------------------+
| id | insert_time         |
+----+---------------------+
| 17 | 2016-03-20 14:59:41 |
+----+---------------------+
1 row in set (0.00 sec)

 

Scenario 2) binlog_format=mixed

On the master:

mysql> show variables like 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | MIXED |
+---------------+-------+
1 row in set (0.00 sec)

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| vm-01-bin.000007 |      120 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql> use m_test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> insert into m_test.t1(insert_time) values(now());
Query OK, 1 row affected (0.04 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> insert into m_test.t1(insert_time) values(now());
Query OK, 1 row affected (0.04 sec)

mysql> show binlog events in "vm-01-bin.000007" from 120;
+------------------+-----+------------+-----------+-------------+----------------------------------------------------------------+
| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                           |
+------------------+-----+------------+-----------+-------------+----------------------------------------------------------------+
| vm-01-bin.000007 | 120 | Query      |         1 |         211 | BEGIN                                                          |
| vm-01-bin.000007 | 211 | Intvar     |         1 |         243 | INSERT_ID=19                                                   |
| vm-01-bin.000007 | 243 | Query      |         1 |         377 | use `m_test`; insert into m_test.t1(insert_time) values(now()) |
| vm-01-bin.000007 | 377 | Xid        |         1 |         408 | COMMIT /* xid=45 */                                            |
| vm-01-bin.000007 | 408 | Query      |         1 |         497 | BEGIN                                                          |
| vm-01-bin.000007 | 497 | Intvar     |         1 |         529 | INSERT_ID=20                                                   |
| vm-01-bin.000007 | 529 | Query      |         1 |         661 | use `test`; insert into m_test.t1(insert_time) values(now())   |
| vm-01-bin.000007 | 661 | Xid        |         1 |         692 | COMMIT /* xid=53 */                                            |
+------------------+-----+------------+-----------+-------------+----------------------------------------------------------------+
8 rows in set (0.00 sec)

mysql> select * from m_test.t1;
+----+---------------------+
| id | insert_time         |
+----+---------------------+
| 17 | 2016-03-20 14:59:41 |
| 18 | 2016-03-20 15:00:01 |
| 19 | 2016-03-20 15:09:14 |
| 20 | 2016-03-20 15:09:25 |
+----+---------------------+
4 rows in set (0.00 sec)

 

On the slave:

mysql> show variables like 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | MIXED |
+---------------+-------+
1 row in set (0.00 sec)

mysql> show slave status\G
*************************** 1. row ***************************
              Slave_IO_State: Waiting for master to send event
                 Master_Host: 10.0.2.6
                 Master_User: repl
                 Master_Port: 3306
               Connect_Retry: 10
             Master_Log_File: vm-01-bin.000007
         Read_Master_Log_Pos: 692
              Relay_Log_File: ewang-vm-03-relay-bin.000023
               Relay_Log_Pos: 855
       Relay_Master_Log_File: vm-01-bin.000007
            Slave_IO_Running: Yes
           Slave_SQL_Running: Yes
             Replicate_Do_DB: m_test
         Replicate_Ignore_DB:
          Replicate_Do_Table:
      Replicate_Ignore_Table:
     Replicate_Wild_Do_Table:
 Replicate_Wild_Ignore_Table:
                  Last_Errno: 0
                  Last_Error:
                Skip_Counter: 0
         Exec_Master_Log_Pos: 692
             Relay_Log_Space: 1034
             Until_Condition: None
              Until_Log_File:
               Until_Log_Pos: 0
          Master_SSL_Allowed: No
          Master_SSL_CA_File:
          Master_SSL_CA_Path:
             Master_SSL_Cert:
           Master_SSL_Cipher:
              Master_SSL_Key:
       Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
               Last_IO_Errno: 0
               Last_IO_Error:
              Last_SQL_Errno: 0
              Last_SQL_Error:
 Replicate_Ignore_Server_Ids:
            Master_Server_Id: 1
                 Master_UUID: a22b3fb2-5e70-11e5-b55a-0800279d00c5
            Master_Info_File: /mysql/data/master.info
                   SQL_Delay: 0
         SQL_Remaining_Delay: NULL
     Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
          Master_Retry_Count: 86400
                 Master_Bind:
     Last_IO_Error_Timestamp:
    Last_SQL_Error_Timestamp:
              Master_SSL_Crl:
          Master_SSL_Crlpath:
          Retrieved_Gtid_Set:
           Executed_Gtid_Set:
               Auto_Position: 0
1 row in set (0.00 sec)

mysql> select * from m_test.t1;
+----+---------------------+
| id | insert_time         |
+----+---------------------+
| 17 | 2016-03-20 14:59:41 |
| 19 | 2016-03-20 15:09:14 |
+----+---------------------+
2 rows in set (0.00 sec)

 

B) Best practice for setting up the DB-level filters

Use at most one of the two options, --replicate-do-db or --replicate-ignore-db. Never use both at the same time.

If you use binlog_format='STATEMENT' or 'MIXED' and set --replicate-do-db or --replicate-ignore-db on the slaves, never change tables outside the default database (for example, by qualifying a table name with another database). Otherwise, expect data discrepancies on the slaves.
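
To illustrate why cross-database changes are risky, here is a minimal Python sketch of the database-level check. It is a simplified model, not MySQL code: it assumes that a statement-logged event is tested against the default database and a row-logged event against the database of the changed table (under MIXED, pass whichever format the event was actually logged in). The function name and the database name test are made up for the example.

```python
def db_filter_allows(binlog_format, default_db, changed_db, do_db=(), ignore_db=()):
    """Return True if an event passes the slave's database-level filters.

    Simplified model: statement-logged events are checked against the
    default database (the one chosen with USE); row-logged events are
    checked against the database of the table that was changed.
    """
    checked = default_db if binlog_format == "statement" else changed_db
    if do_db:
        return checked in do_db
    if ignore_db:
        return checked not in ignore_db
    return True


# Slave configured with --replicate-do-db=m_test, as in the status output above.
# Hypothetical change: "USE test; INSERT INTO m_test.t1 ..." modifies m_test.t1
# while the default database is test.
print(db_filter_allows("statement", "test", "m_test", do_db=("m_test",)))
print(db_filter_allows("row", "test", "m_test", do_db=("m_test",)))
```

In this model the statement-logged insert is filtered out even though it changes a table in m_test, while the row-logged one is replicated. That difference is the source of the discrepancies mentioned above.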

 

3) Table-level filters

There are four options for setting table-level filters: --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table, and --replicate-wild-ignore-table. MySQL evaluates the options in that order. You can find the process in the chart at http://dev.mysql.com/doc/refman/5.6/en/replication-rules-table-options.html

That chart shows that MySQL first checks --replicate-do-table: the tables listed there are replicated and are not affected by later options such as --replicate-ignore-table or --replicate-wild-ignore-table. Then MySQL checks --replicate-ignore-table: the tables listed there are ignored even if they also match the later option --replicate-wild-do-table. The lowest priority is --replicate-wild-ignore-table.
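
The evaluation order described above can be sketched in a few lines of Python. This is a simplified model for a single table, not MySQL's implementation: the helper names are made up, backslash escapes in wildcard patterns are not handled, and the table names m_test.t2 and other.t1 are hypothetical.

```python
import re


def like_match(pattern, name):
    """Match name against a LIKE-style pattern (% = any run, _ = one character)."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch) for ch in pattern
    )
    return re.fullmatch(regex, name) is not None


def table_filter_allows(table, do=(), ignore=(), wild_do=(), wild_ignore=()):
    """Return True if db.table is replicated under the four table-level options."""
    if table in do:
        return True
    if table in ignore:
        return False
    if any(like_match(p, table) for p in wild_do):
        return True
    if any(like_match(p, table) for p in wild_ignore):
        return False
    # Nothing matched: if any "do" list exists, only listed tables are replicated.
    return not (do or wild_do)


rules = dict(ignore=("m_test.t2",), wild_do=("m_test.%",))
for name in ("m_test.t1", "m_test.t2", "other.t1"):
    print(name, table_filter_allows(name, **rules))
```

With these rules, m_test.t1 is replicated, m_test.t2 is ignored because the ignore list is checked before the wildcard do list, and other.t1 is ignored because a do list exists and it matches nothing.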

B) Best practice for setting up the table-level filters

Because of the priorities among the four table-level options, we suggest using only one of them to avoid confusion and conflicts, or using just these two together: --replicate-ignore-table and --replicate-wild-do-table. Then it is clear that the tables in --replicate-ignore-table will be ignored and the tables matching --replicate-wild-do-table will be replicated.

 

Categories: DBA Blogs

What Are Your Options For Migrating Enterprise Applications to the Cloud?

Fri, 2016-04-01 08:16

Migrating your enterprise applications from on-premises infrastructure to the public cloud is attractive for a number of reasons. It eliminates the costs and complexities of provisioning hardware and managing servers, storage devices, and network infrastructure; it gives you more compute capacity per dollar without upfront capital investment; and you gain opportunities for innovation through easier access to new technologies, such as advanced analytical capabilities.

So how do you get there?

You have a few options. At one end of the spectrum, you could simply wait and rationalize, making continuous incremental changes to gain efficiencies. This is obviously a “slow burn” approach. In the middle is a “lift-and-shift” from your current environment into the public cloud. And at the far extreme, you could plunge right in and re-architect your applications—a costly and probably highly complex task.

 

In fact, a true migration “strategy” will involve elements of each of these. For example, you could perform short-term optimizations and migrations on a subset of applications that are ready for the cloud, while transforming the rest of your application stack over the longer term.

 

What to expect from the major public cloud platforms

There are three leading public cloud platforms: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). As Google doesn’t seem to be driving customers to lift-and-shift their applications to GCP, I’m going to focus on AWS and Azure as potential cloud destinations and, for specificity, take Oracle enterprise databases as the use case.

 

Amazon Web Services

You have two options for migrating Oracle databases to the AWS cloud: infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS).

 

Deploying Oracle applications in AWS IaaS is much like deploying them on your in-house infrastructure. You don’t get flexible licensing options, but you do have the ability to easily allocate more or less capacity as needed for CPU, memory, and storage. However, because AWS IaaS is virtualized infrastructure, you may experience slower performance due to suboptimal CPU core allocation or processor caches. You’ll also have less flexibility with instance sizes, network topology, storage performance tiers, and the like.

 

AWS Relational Database Service (RDS) for Oracle is a managed PaaS offering where, in addition to giving you the benefits of IaaS, Amazon takes on major DBA and system administrator tasks including provisioning, upgrades, backups, and multi-availability zone replication. This significantly simplifies your operations—but also results in less control over areas such as configuration, patching, and maintenance windows. AWS RDS for Oracle can also be used with a pay-as-you-go licensing model included in the hourly rate.

 

Microsoft Azure

Azure does not have a managed offering for Oracle databases, so the only way to run Oracle Database on Azure is through its IaaS platform. The benefits are very similar to AWS IaaS, but Azure offers additional licensing options (with Windows-based license-included images) and its instances are billed by the minute rather than by the hour. What’s important to keep in mind is that Azure is not as broadly adopted as AWS and offers less flexibility for storage performance tiers and instance sizes. Oracle Database software running on Windows is also not as common as running on Linux.

 

For more in-depth technical details on these options, I encourage you to read our white paper, Migrating Oracle Databases to Cloud. My next blog in this series will look at one other option not discussed here: migrating to Oracle Cloud.


Categories: DBA Blogs

5 Phases for Migrating to a Cloud Platform

Thu, 2016-03-31 13:11

Businesses today are increasingly looking to migrate to the cloud to realize lower costs and increase software velocity. They are now asking themselves “when” they should migrate rather than “if” they should, and with many vendors and solutions in the market, it can be difficult to take the first steps in creating a cloud strategy.

In our latest on-demand webinar, Chris Presley, Solution Architect at Pythian, and Jim Bowyer, Solution Architect at Azure-Microsoft Canada, discuss a five phase framework for cloud transformations, and the benefits of migrating to the cloud with Microsoft Azure.

The five phase framework helps businesses understand the journey to successfully migrate current applications to a cloud platform. Here is a snapshot of the five phases:

 

1. Assessment: Analysis and Planning

A majority of the time investment should be upfront in assessment and preparation because it sets the stage for the actual development and migration, resulting in faster projects, lower costs, and less risk.

In this phase, businesses want to begin understanding the performance and user characteristics of their applications, and any other additional information that will be important during the transformation, such as regulatory, compliance, and legal requirements.

 

2. Preparation: POC, Validation and Final Road Map

The preparation phase is meant to help understand what the rest of the migration is going to look like.

While beneficial in any project, proofs of concept (POCs) are increasingly simple to create and are a great strength when leveraging the cloud. POCs are used to show some functionality and advantage early so you can get everyone – especially business owners – excited about the migration.

 

3. Build: Construct Infrastructure

Once the expectations around the final migration road map are developed, the infrastructure can be built. Jim notes that it is important to start thinking about automation during this phase, and Chris agrees, particularly about developing an automated test bed to help smooth out the migration.

 

4. Migration: Execute Transformation

The migration activity for cloud environments is very short. By this stage, if the planning and preparation has been done properly, “flicking the light switch” to the new environment should be seamless and feel like the easiest part.

Chris talks about creating detailed success criteria and rollback criteria, both of which are crucial in the migration phase. Jim mentions that Microsoft Azure provides a variety of tools to help make rollbacks easier and safer.

 

5. Optimization: IaaS Enhancements

Continually transforming and enhancing after the migration is complete is important for increasing software velocity, which is why businesses migrate to the cloud in the first place. While a piece of functionality may not be available today, it may be available tomorrow.

By going back to iterate and take advantage of new functionalities, businesses are able to squeeze out more improvements and create opportunities for differentiation.

 

Learn More

To learn about these five cloud transformation phases in more depth, and how to leverage the cloud with Microsoft Azure, download our free on-demand webinar.


Categories: DBA Blogs

GoldenGate 12.2 Big Data Adapters: part 3 – Kafka

Thu, 2016-03-31 09:39

This post continues my review of the GoldenGate Big Data adapters, which started with reviews of the HDFS and Flume adapters. Here is a list of all posts in the series:

  1. GoldenGate 12.2 Big Data Adapters: part 1 – HDFS
  2. GoldenGate 12.2 Big Data Adapters: part 2 – Flume
  3. GoldenGate 12.2 Big Data Adapters: part 3 – Kafka

In this article I will try the Kafka adapter and see how it works. First, it may be worth reminding readers what Kafka is. Kafka is a streaming publish-subscribe system. One can ask how it is different from Flume, and I asked myself that question when I first heard about Kafka. I think one of the best comparisons between Flume and Kafka was made by Gwen Shapira & Jeff Holoman in the blog post Apache Kafka for Beginners. In essence, Kafka is a general-purpose system where most of the control and consumer functionality relies on consumer programs you build yourself, whereas Flume gives you pre-built sources and sinks and lets you use interceptors to change data. So, with Kafka you get on the destination exactly what you put in at the source. Kafka and Flume work together pretty well, and in this article I am going to use them both.

Let’s recall what we have in our configuration. We have an Oracle database running as a source, and Oracle GoldenGate for Oracle capturing changes for one schema in this database. We have OGG 12.2 and an integrated extract on the source. The replication goes directly to trail files on the destination side, where we have OGG for Big Data installed on a Linux box. You can get more details about the installation on source and target from the first post in the series. I’ve made the configuration as simple as possible, dedicating most of the attention to the Big Data adapter functionality, which is after all the main point of the article.

Having installed OGG for Big Data, we need to set up the Kafka adapter. As with the other adapters, we copy the configuration files from the $OGG_HOME/AdapterExamples/big-data directory.

bash$ cp $OGG_HOME/AdapterExamples/big-data/kafka/* $OGG_HOME/dirprm/

We need to adjust our kafka.props file to define the Kafka topics for data and schema changes (the TopicName and SchemaTopicName parameters), and the gg.classpath for the Kafka and Avro Java classes. I left the rest of the parameters at their defaults, including the format for the changes, which is defined as “avro_op” in the example.

[oracle@sandbox oggbd]$ cat dirprm/kafka.props

gg.handlerlist = kafkahandler
gg.handler.kafkahandler.type = kafka
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties
gg.handler.kafkahandler.TopicName =oggtopic
gg.handler.kafkahandler.format =avro_op
gg.handler.kafkahandler.SchemaTopicName=mySchemaTopic
gg.handler.kafkahandler.BlockingSend =false
gg.handler.kafkahandler.includeTokens=false

gg.handler.kafkahandler.mode =tx
#gg.handler.kafkahandler.maxGroupSize =100, 1Mb
#gg.handler.kafkahandler.minGroupSize =50, 500Kb


goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE

gg.log=log4j
gg.log.level=INFO

gg.report.time=30sec

gg.classpath=dirprm/:/u01/kafka/libs/*:/usr/lib/avro/*:

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

[oracle@sandbox oggbd]$
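
Because the topic names in kafka.props have to match the topics used on the Kafka side, it can help to read them back from the file. The following is a minimal Python sketch under simple assumptions: it handles only plain key=value lines and # comments, and the function names are made up for the example. The sample text is a trimmed copy of the file above.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def handler_topics(props, handler="kafkahandler"):
    """Return (data topic, schema topic) configured for the given handler."""
    prefix = "gg.handler." + handler + "."
    return props.get(prefix + "TopicName"), props.get(prefix + "SchemaTopicName")


sample = """
gg.handler.kafkahandler.TopicName =oggtopic
gg.handler.kafkahandler.SchemaTopicName=mySchemaTopic
#gg.handler.kafkahandler.maxGroupSize =100, 1Mb
"""
print(handler_topics(parse_properties(sample)))
```

For the real file you could pass the contents of dirprm/kafka.props instead of the sample string.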

The next file we have to correct is custom_kafka_producer.properties, which contains information about our running Kafka server and defines some additional parameters such as compression. I left all the parameters unchanged except “bootstrap.servers”, where I put information about my Kafka service.

[oracle@sandbox oggbd]$ cat dirprm/custom_kafka_producer.properties
bootstrap.servers=sandbox:9092
acks=1
compression.type=gzip
reconnect.backoff.ms=1000

value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
# 100KB per partition
batch.size=102400
linger.ms=10000
[oracle@sandbox oggbd]$

If we plan an initial load through Kafka, we can use something like this parameter file, which I prepared for a passive replicat:

[oracle@sandbox oggbd]$ cat dirprm/irkafka.prm
-- Trail file for this example is located in "dirdat" directory
-- Command to run passive REPLICAT
-- ./replicat paramfile dirprm/irkafka.prm reportfile dirrpt/irkafka.rpt
SPECIALRUN
END RUNTIME
EXTFILE /u01/oggbd/dirdat/initld
--
TARGETDB LIBFILE libggjava.so SET property=dirprm/kafka.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP ggtest.*, TARGET bdtest.*;
[oracle@sandbox oggbd]$

Before starting any replicat we need to prepare our system to receive the data. Since Kafka itself is a pure streaming system, it cannot write files to HDFS without another program or connector. In this first case Kafka will pass the data to Flume, and Flume will use its HDFS sink to write it. Please be aware that you need ZooKeeper to manage topics for Kafka. I am not going to discuss setting up ZooKeeper in this article; just assume that we already have it up and running on port 2181.

I used Kafka version 0.9.0.1, downloaded from http://kafka.apache.org/downloads.html. After downloading the archive I unpacked it, slightly corrected the configuration, and started it in standalone mode.

[root@sandbox u01]# wget http://apache.parentingamerica.com/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz
--2016-03-15 15:22:09--  http://apache.parentingamerica.com/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz
Resolving apache.parentingamerica.com... 70.38.15.129
Connecting to apache.parentingamerica.com|70.38.15.129|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 35650542 (34M) [application/x-gzip]
Saving to: `kafka_2.11-0.9.0.1.tgz'

100%[=========================================================================================================================================>] 35,650,542  2.95M/s   in 16s

2016-03-15 15:22:26 (2.10 MB/s) - `kafka_2.11-0.9.0.1.tgz' saved [35650542/35650542]

[root@sandbox u01]# tar xfz kafka_2.11-0.9.0.1.tgz

[root@sandbox u01]# ln -s kafka_2.11-0.9.0.1 kafka

[root@sandbox u01]# cd kafka

[root@sandbox kafka]# vi config/server.properties
[root@sandbox kafka]# grep -v '^$\|^\s*\#' config/server.properties
broker.id=0
listeners=PLAINTEXT://:9092
num.network.threads=3

num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
delete.topic.enable=true
[root@sandbox kafka]#
[root@sandbox kafka]# nohup bin/kafka-server-start.sh config/server.properties > /var/log/kafka/server.log &
[1] 30669
[root@sandbox kafka]# nohup: ignoring input and redirecting stderr to stdout

Now we need to prepare our two topics for the data received from GoldenGate. As you remember, we defined the topic “oggtopic” for our data flow using the parameter gg.handler.kafkahandler.TopicName in our kafka.props file, and the topic “mySchemaTopic” for schema changes. So, let’s create the data topic using the scripts supplied with Kafka:

[root@sandbox kafka]# bin/kafka-topics.sh --zookeeper sandbox:2181 --create --topic oggtopic --partitions 1 --replication-factor 1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/kafka_2.11-0.9.0.1/libs/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Created topic "oggtopic".
[root@sandbox kafka]# bin/kafka-topics.sh --zookeeper sandbox:2181 --list
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/kafka_2.11-0.9.0.1/libs/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
oggtopic
[root@sandbox kafka]#
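
Since the configuration names two topics, the same create command can be generated for each of them. Here is a minimal Python sketch that only builds and prints the command lines, using the same flags as the command above. The function name is made up, and actually running the commands is left to you.

```python
def create_topic_command(topic, zookeeper="sandbox:2181", partitions=1, replication_factor=1):
    """Build the argument list for bin/kafka-topics.sh --create, as used above."""
    return [
        "bin/kafka-topics.sh",
        "--zookeeper", zookeeper,
        "--create",
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


for topic in ("oggtopic", "mySchemaTopic"):
    # Print only; to execute, pass the list to subprocess.run() from the Kafka home.
    print(" ".join(create_topic_command(topic)))
```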

As a matter of fact, all the necessary topics will also be created automatically when you start your GoldenGate replicat. You need to create a topic explicitly only if you want to use custom parameters for it. You also have the option to alter a topic’s configuration parameters later on.
Here are the descriptions of our two topics, one created manually and the other created automatically by the replicat process.

[root@sandbox kafka]# bin/kafka-topics.sh --zookeeper sandbox:2181 --describe --topic oggtopic
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/kafka_2.11-0.9.0.1/libs/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Topic:oggtopic	PartitionCount:1	ReplicationFactor:1	Configs:
	Topic: oggtopic	Partition: 0	Leader: 0	Replicas: 0	Isr: 0
[root@sandbox kafka]# bin/kafka-topics.sh --zookeeper sandbox:2181 --describe --topic mySchemaTopic
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u01/kafka_2.11-0.9.0.1/libs/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Topic:mySchemaTopic	PartitionCount:1	ReplicationFactor:1	Configs:
	Topic: mySchemaTopic	Partition: 0	Leader: 0	Replicas: 0	Isr: 0
[root@sandbox kafka]#

In our configuration we have only one server and the simplest possible Kafka setup. In a real business case it can be far more complex. Our replicat is going to post data changes to oggtopic, and all schema changes and definitions to mySchemaTopic. We’ve already mentioned that we are going to use Flume to write to HDFS. I’ve prepared Flume with two sources and two sinks to write the changes to the /user/oracle/ggflume HDFS directory. We could also split data and schema changes into different directories if we wished. Here is my configuration for Flume:

[root@sandbox ~]# cat /etc/flume-ng/conf/flume.conf
# Name/aliases for the components on this agent
agent.sources = ogg1 ogg2
agent.sinks = hdfs1 hdfs2
agent.channels = ch1 ch2

#Kafka source
agent.sources.ogg1.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.ogg1.zookeeperConnect = localhost:2181
agent.sources.ogg1.topic = oggtopic
agent.sources.ogg1.groupId = flume
agent.sources.ogg1.kafka.consumer.timeout.ms = 100

agent.sources.ogg2.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.ogg2.zookeeperConnect = localhost:2181
agent.sources.ogg2.topic = mySchemaTopic
agent.sources.ogg2.groupId = flume
agent.sources.ogg2.kafka.consumer.timeout.ms = 100

# Describe the sink
agent.sinks.hdfs1.type = hdfs
agent.sinks.hdfs1.hdfs.path = hdfs://sandbox/user/oracle/ggflume
agent.sinks.hdfs2.type = hdfs
agent.sinks.hdfs2.hdfs.path = hdfs://sandbox/user/oracle/ggflume
#agent.sinks.hdfs1.type = logger

# Use a channel which buffers events in memory
agent.channels.ch1.type = memory
agent.channels.ch1.capacity = 1001
agent.channels.ch1.transactionCapacity = 1000
agent.channels.ch2.type = memory
agent.channels.ch2.capacity = 1001
agent.channels.ch2.transactionCapacity = 1000

# Bind the source and sink to the channel
agent.sources.ogg1.channels = ch1
agent.sources.ogg2.channels = ch2
agent.sinks.hdfs1.channel = ch1
agent.sinks.hdfs2.channel = ch2
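
A Flume agent only moves data if every source and sink is bound to a declared channel, so the bindings at the end of the file are worth double-checking. Below is a minimal Python sketch of such a check. It is not part of Flume: the function names are made up, and the sample is a trimmed copy of the configuration above with only the wiring lines.

```python
def parse_properties(text):
    """Parse simple key = value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent="agent"):
    """Return a list of problems found in the source/channel/sink bindings."""
    def names(kind):
        return props.get(f"{agent}.{kind}", "").split()

    channels = set(names("channels"))
    problems = []
    for source in names("sources"):
        bound = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not bound:
            problems.append(f"source {source} has no channel")
        for channel in bound:
            if channel not in channels:
                problems.append(f"source {source} uses undeclared channel {channel}")
    for sink in names("sinks"):
        bound = props.get(f"{agent}.sinks.{sink}.channel")
        if bound is None:
            problems.append(f"sink {sink} has no channel")
        elif bound not in channels:
            problems.append(f"sink {sink} uses undeclared channel {bound}")
    return problems


conf = """
agent.sources = ogg1 ogg2
agent.sinks = hdfs1 hdfs2
agent.channels = ch1 ch2
agent.sources.ogg1.channels = ch1
agent.sources.ogg2.channels = ch2
agent.sinks.hdfs1.channel = ch1
agent.sinks.hdfs2.channel = ch2
"""
print(check_wiring(parse_properties(conf)))
```

An empty list means every source and sink points at a declared channel.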

As you can see, we have a separate source for each of our Kafka topics, and two sinks pointing to the same HDFS location. The data is going to be written in Avro format.
All preparations are complete: the Kafka server is running with two topics, and Flume is ready to write data to HDFS. Our HDFS directory is still empty.

[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
[oracle@sandbox oggbd]$

Let’s run the passive replicat with our initial data load trail file:

[oracle@sandbox oggbd]$ cd /u01/oggbd
[oracle@sandbox oggbd]$ ./replicat paramfile dirprm/irkafka.prm reportfile dirrpt/irkafka.rpt
[oracle@sandbox oggbd]$

Now we can have a look at the results. We got 3 files on HDFS: the first two describe the structure of TEST_TAB_1 and TEST_TAB_2 respectively, and the third contains the data changes, or perhaps better to say, the initial data for those tables. You can see that the schema definitions were written to separate files, whereas the data changes were all posted together to one file.

[oracle@sandbox ~]$ hadoop fs -ls /user/oracle/ggflume/
Found 3 items
-rw-r--r--   1 flume oracle       1833 2016-03-23 12:14 /user/oracle/ggflume/FlumeData.1458749691685
-rw-r--r--   1 flume oracle       1473 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691686
-rw-r--r--   1 flume oracle        981 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691718
[oracle@sandbox ~]$

[oracle@sandbox ~]$ hadoop fs -cat  /user/oracle/ggflume/FlumeData.1458749691685
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?????k?\??????S?A?%?{
  "type" : "record",
  "name" : "TEST_TAB_1",
  "namespace" : "BDTEST",
  "fields" : [ {
    "name" : "table",
    "type" : "string"
.........................


[oracle@sandbox ~]$ hadoop fs -cat  /user/oracle/ggflume/FlumeData.1458749691686
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?*
?e????xS?A?%N{
  "type" : "record",
  "name" : "TEST_TAB_2",
  "namespace" : "BDTEST",
  "fields" : [ {
    "name" : "table",
    "type" : "string"
  }, {


...............................

[oracle@sandbox ~]$ hadoop fs -cat  /user/oracle/ggflume/FlumeData.1458749691718
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable??????c?C n??S?A?b"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-23T12:14:35.373000(00000000-10000002012
PK_ID1371O62FX&2014-01-24:19:09:20RJ68QYM5&2014-01-22:12:14:30"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-23T12:14:35.405000(00000000-10000002155
PK_ID2371O62FX&2014-01-24:19:09:20HW82LI73&2014-05-11:05:23:23"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-23T12:14:35.405001(00000000-10000002298
PK_ID3RXZT5VUN&2013-09-04:23:32:56RJ68QYM5&2014-01-22:12:14:30"BDTEST.TEST_TAB_1I42016-02-16 19:17:40.74669942016-03-23T12:14:35.405002(00000000-10000002441
PK_ID4RXZT5VUN&2013-09-04:23:32:56HW82LI73&2014-05-11:05:23:23"BDTEST.TEST_TAB_2I42016-02-16 19:17:40.76289942016-03-23T12:14:35.408000(00000000-10000002926
PK_IDRND_STR_1ACC_DATE7IJWQRO7T&2013-07-07:08:13:52[oracle@sandbox ~]$

Now we need to create our ongoing replication. Our extract was set up the same way as described in the first post of the series. It is up and running, passing changes to the ./dirdat directory on the replicat side.

GGSCI (sandbox.localdomain) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     GGEXT       00:00:09      00:00:03


[oracle@sandbox oggbd]$ ls -l dirdat/
total 240
-rw-r-----. 1 oracle oinstall   3028 Feb 16 14:17 initld
-rw-r-----. 1 oracle oinstall 190395 Mar 14 13:00 or000041
-rw-r-----. 1 oracle oinstall   1794 Mar 15 12:02 or000042
-rw-r-----. 1 oracle oinstall  43222 Mar 17 11:53 or000043
[oracle@sandbox oggbd]$

I’ve prepared a parameter file for the Kafka replicat:

[oracle@sandbox oggbd]$ cat dirprm/rkafka.prm
REPLICAT rkafka
-- Trail file for this example is located in "AdapterExamples/trail" directory
-- Command to add REPLICAT
-- add replicat rkafka, exttrail dirdat/or, begin now
TARGETDB LIBFILE libggjava.so SET property=dirprm/kafka.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP GGTEST.*, TARGET BDTEST.*;

[oracle@sandbox oggbd]$

We only need to add and start our rkafka replicat in GoldenGate for Big Data.

GGSCI (sandbox.localdomain) 1> add replicat rkafka, exttrail dirdat/or, begin now
REPLICAT added.


GGSCI (sandbox.localdomain) 2> start replicat rkafka

Sending START request to MANAGER ...
REPLICAT RKAFKA starting


GGSCI (sandbox.localdomain) 3> info rkafka

REPLICAT   RKAFKA    Last Started 2016-03-24 11:53   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:06 ago)
Process ID           21041
Log Read Checkpoint  File dirdat/or000000000
                     2016-03-24 11:53:17.388078  RBA 0

You may remember that we don’t have a dirdat/or000000000 file in our dirdat directory. So our replicat has to be slightly corrected to work with the proper trail files. I am altering the sequence for my replicat to reflect the actual sequence number of my last trail file.

GGSCI (sandbox.localdomain) 10> stop replicat rkafka

Sending STOP request to REPLICAT RKAFKA ...
Request processed.


GGSCI (sandbox.localdomain) 11> alter replicat rkafka EXTSEQNO 43

2016-03-24 12:03:27  INFO    OGG-06594  Replicat RKAFKA has been altered through GGSCI. Even the start up position might be updated, duplicate suppression remains active in next startup. To override duplicate suppression, start RKAFKA with NOFILTERDUPTRANSACTIONS option.

REPLICAT altered.


GGSCI (sandbox.localdomain) 12> start replicat rkafka

Sending START request to MANAGER ...
REPLICAT RKAFKA starting


GGSCI (sandbox.localdomain) 13> info rkafka

REPLICAT   RKAFKA    Last Started 2016-03-24 12:03   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:12 ago)
Process ID           21412
Log Read Checkpoint  File dirdat/or000000043
                     First Record  RBA 0


GGSCI (sandbox.localdomain) 14>
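
The sequence number passed to EXTSEQNO above was read off the directory listing by eye. As an illustration, here is a minimal Python sketch that derives it from the trail file names. It assumes the names are the trail prefix followed only by digits, as in the ls output earlier, and the function name is made up.

```python
import re


def latest_trail_seqno(filenames, prefix="or"):
    """Return the highest sequence number among files named <prefix><digits>."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)")
    seqnos = []
    for name in filenames:
        match = pattern.fullmatch(name)
        if match:
            seqnos.append(int(match.group(1)))
    return max(seqnos) if seqnos else None


# File names copied from the dirdat listing shown earlier.
files = ["initld", "or000041", "or000042", "or000043"]
print(latest_trail_seqno(files))
```

For the listing in this post the result is 43, the value used in the alter replicat command.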

Let’s change some data:

orclbd> select * from test_tab_2;

           PK_ID RND_STR_1  ACC_DATE
---------------- ---------- ---------------------------
               7 IJWQRO7T   07/07/13 08:13:52


orclbd> insert into test_tab_2 values (8,'TEST_INS1',sysdate);

1 row inserted.

orclbd> commit;

Commit complete.

orclbd>
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
Found 5 items
-rw-r--r--   1 flume oracle       1833 2016-03-23 12:14 /user/oracle/ggflume/FlumeData.1458749691685
-rw-r--r--   1 flume oracle       1473 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691686
-rw-r--r--   1 flume oracle        981 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691718
-rw-r--r--   1 flume oracle        278 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268086
-rw-r--r--   1 flume oracle       1473 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268130
[oracle@sandbox oggbd]$

[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458836268086
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?Q???n?y?1?R#S?j???"BDTEST.TEST_TAB_2I42016-03-24 16:17:29.00033642016-03-24T12:17:31.733000(00000000430000043889
PK_IDRND_STR_1ACC_DATE8TEST_INS1&2016-03-24:12:17:26[oracle@sandbox oggbd]$
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458836268130
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?6F!?Z?-?ZA8r^S?j?oN{
  "type" : "record",
  "name" : "TEST_TAB_2",
  "namespace" : "BDTEST",

We got our schema definition file and a file with data changes.

orclbd> update test_tab_2 set RND_STR_1='TEST_UPD1' where pk_id=8;

1 row updated.

orclbd> commit;

Commit complete.

orclbd>

[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
Found 6 items
-rw-r--r--   1 flume oracle       1833 2016-03-23 12:14 /user/oracle/ggflume/FlumeData.1458749691685
-rw-r--r--   1 flume oracle       1473 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691686
-rw-r--r--   1 flume oracle        981 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691718
-rw-r--r--   1 flume oracle        278 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268086
-rw-r--r--   1 flume oracle       1473 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268130
-rw-r--r--   1 flume oracle        316 2016-03-24 12:28 /user/oracle/ggflume/FlumeData.1458836877420
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458836877420
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable]??u????????qS?t,??"BDTEST.TEST_TAB_2U42016-03-24 16:27:39.00035642016-03-24T12:27:42.177000(00000000430000044052
PK_IDRND_STR_1ACC_DATE8TEST_INS1&2016-03-24:12:17:268TEST_UPD1&2016-03-24:12:17:26[oracle@sandbox oggbd]$

You can see that we only got a file with data changes, since no DDL changes were made. The transactions are grouped into files according to our Flume parameters, as we discussed in the previous blog post.

You can also see both the old value and the new value for the updated record. Using that information we can reconstruct the changes, but we need to apply certain logic to decode them.

For a delete operation we get the operation flag “F” and the values of the deleted record. Again, there is no schema definition file since no changes were made.
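
As a rough illustration of the logic needed to reconstruct the changes, here is a minimal Python sketch that replays decoded change records into an in-memory copy of a table. The record layout is an assumption for the example: each record is a dict with optional "before" and "after" row images, and a delete is recognized as a record that carries only a before image rather than by its operation flag. Decoding the Avro payload itself is not shown.

```python
def apply_change(rows, record, key="PK_ID"):
    """Apply one change record to rows, a dict keyed by primary-key value."""
    before, after = record.get("before"), record.get("after")
    if before is not None and after is None:
        # Only a before image: the row was deleted.
        rows.pop(before[key], None)
    elif after is not None:
        if before is not None and before[key] != after[key]:
            # The update changed the key itself: drop the old entry first.
            rows.pop(before[key], None)
        rows[after[key]] = dict(after)
    return rows


# Hypothetical decoded records mirroring the insert and update shown above.
changes = [
    {"op_type": "I", "before": None,
     "after": {"PK_ID": 8, "RND_STR_1": "TEST_INS1"}},
    {"op_type": "U", "before": {"PK_ID": 8, "RND_STR_1": "TEST_INS1"},
     "after": {"PK_ID": 8, "RND_STR_1": "TEST_UPD1"}},
]
table = {}
for change in changes:
    apply_change(table, change)
print(table)
```

After both records are applied, the table holds a single row for key 8 with the updated value.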

Let’s try some DDL.

orclbd> truncate table test_tab_2;

Table TEST_TAB_2 truncated.

orclbd>
GGSCI (sandbox.localdomain) 4> info rkafka

REPLICAT   RKAFKA    Last Started 2016-03-24 12:10   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:02 ago)
Process ID           21803
Log Read Checkpoint  File dirdat/or000043
                     2016-03-24 12:40:05.000303  RBA 45760


GGSCI (sandbox.localdomain) 5>

No new files on HDFS.

orclbd> insert into test_tab_2 select * from test_tab_3;

1 row inserted.

orclbd> commit;

Commit complete.

orclbd>
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
Found 8 items
-rw-r--r--   1 flume oracle       1833 2016-03-23 12:14 /user/oracle/ggflume/FlumeData.1458749691685
-rw-r--r--   1 flume oracle       1473 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691686
-rw-r--r--   1 flume oracle        981 2016-03-23 12:15 /user/oracle/ggflume/FlumeData.1458749691718
-rw-r--r--   1 flume oracle        278 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268086
-rw-r--r--   1 flume oracle       1473 2016-03-24 12:18 /user/oracle/ggflume/FlumeData.1458836268130
-rw-r--r--   1 flume oracle        316 2016-03-24 12:28 /user/oracle/ggflume/FlumeData.1458836877420
-rw-r--r--   1 flume oracle        278 2016-03-24 12:35 /user/oracle/ggflume/FlumeData.1458837310570
-rw-r--r--   1 flume oracle        277 2016-03-24 12:42 /user/oracle/ggflume/FlumeData.1458837743709
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458837743709
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable*?2??????>iS??\??"BDTEST.TEST_TAB_2I42016-03-24 16:42:04.00020042016-03-24T12:42:06.774000(00000000430000045760
PK_IDRND_STR_1ACC_DATE7IJWQRO7T&2013-07-07:08:13:52[oracle@sandbox oggbd]$

Again, we got only a file with data changes. I tried to compare the file we got for the previous insert with the one for the insert after the truncate, but couldn’t find any difference except in the binary part of the Avro file. It will require additional investigation and maybe clarification from Oracle. In the current state it looks like it is easy to miss a truncate command for a table on the destination side.

Let us change the table and add a column there.

orclbd> alter table test_tab_2 add test_col varchar2(10);
Table TEST_TAB_2 altered.

orclbd>

We do not get any new files with the new table definition until we run some DML on the table. Both files (with the new schema definition and the data changes) appear after we insert, delete or update any rows there.

orclbd> insert into test_tab_2 values (8,'TEST_INS1',sysdate,'TEST_ALTER');

1 row inserted.

orclbd> commit;

Commit complete.

orclbd>
[oracle@sandbox oggbd]$ hadoop fs -ls /user/oracle/ggflume/
Found 10 items
...................................................
-rw-r--r--   1 flume oracle       1654 2016-03-24 12:56 /user/oracle/ggflume/FlumeData.1458838582020
-rw-r--r--   1 flume oracle        300 2016-03-24 12:56 /user/oracle/ggflume/FlumeData.1458838584891
[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458838582020
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable-??ip??/?w?S??/{
  "type" : "record",
  "name" : "TEST_TAB_2",
  "namespace" : "BDTEST",
................
        "name" : "TEST_COL",
        "type" : [ "null", "string" ],
        "default" : null
.................

[oracle@sandbox oggbd]$ hadoop fs -cat /user/oracle/ggflume/FlumeData.1458838584891
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritabletr?V?_$???:2??S??/w?"BDTEST.TEST_TAB_2I42016-03-24 16:56:04.00026042016-03-24T12:56:08.370000(00000000430000047682
PK_IDRND_STR_1ACC_DATETEST_COL8TEST_INS1&2016-03-24:12:56:01TEST_ALTER

I used JMeter to generate some load, and the setup could easily replicate 225 transactions per second (30% inserts, 80% updates) with almost no delay. It was not a test of Kafka or Flume, which could sustain far more load, but rather of the combination of GoldenGate with the Big Data infrastructure. It was stable, without any errors. I do understand that the current test is very far from any potential production workflow, which may include Oracle Database (or any other RDBMS) + GoldenGate + Kafka + Storm and more. And maybe the final data format will be completely different. So far the adapters are looking good and doing the job. In the next post I will look at the HBase adapter. Stay tuned.

Categories: DBA Blogs

Log Buffer #467: A Carnival of the Vanities for DBAs

Thu, 2016-03-31 08:40

This Log Buffer Edition brings some top of the list blog posts from Oracle, SQL Server and MySQL.

Oracle:

An Exadata quarter rack has two database servers and three storage cells. In a typical setup, such a system would have three ASM disk groups, say DATA, RECO and DBFS_DG. Usually the disk group DATA would be high redundancy and the other two disk groups would be normal redundancy.

Best practice for calling web services from Oracle Process Cloud Service

2 Min Tech Tips at Oracle OpenWorld: Are You Ready for Your Close-Up?

Are your SQL Plus scripts going to ‘ell ?

New ways of input still on the verge of the enterprise

SQL Server:

Why Every SQL Server Installation Should Be a Cluster

When AUTO_UPDATE_STATISTICS Doesn’t Happen

Fixing Maintenance Plan Error code 0x534

SQL Server Table Smells

Some companies have been slow to acquire big data applications. They discovered that modern hardware platforms and database management systems were more than adequate for most of their business analytics needs.

MySQL:

Galera Cluster and Docker Swarm

MariaDB 10.1.13 and Connector/J 1.3.7 now available

Why an App-Centric View Isn’t Enough

How to Install and Configure MySQL Cluster on CentOS 7

Invalid datetime when converting to timestamp

Categories: DBA Blogs

In Depth: MySQL 5.6+ DDL

Tue, 2016-03-29 09:07
Overview

DDL (Data Definition Language) statements create, alter, and remove database objects. These types of changes can be a very dangerous action to take on such a critical piece of your infrastructure. You want to make sure that the command that you are executing has been given proper thought and testing.

In this post I go through multiple versions of MySQL and verify the best course of action to take when executing DDL statements. There are many things that you have to consider when making these types of changes, such as disk space, load on the database server, slave replication, the type of DDL statement you are executing, and whether it will lock the table.

Because of these risks, there are tools that can be used to help mitigate some of the dangers. But unless you have tested and verified their functionality, these tools can themselves cause trouble. Whenever in doubt, take the time to test and verify any changes that you will make. In my testing I will be using pt-online-schema-change in particular, since it is a very popular tool and I have used it many times. It was created primarily because MySQL did not yet offer online DDL changes. In some cases, depending on your environment, the best course of action may be removing the database server from being accessed, by failing over to a slave or taking a cluster node offline.

I will be focusing on the most common DDL statements as I want to keep this post to a reasonable size. By default, many MySQL DDL statements use the INPLACE algorithm where possible, which is only available in MySQL 5.6 or later. Earlier versions (5.5, and 5.1 with the InnoDB plugin) had fast index creation, but all other table alters were blocking. Online DDL with the INPLACE algorithm allows MySQL to make a copy of the table in the background, copy the data to this table, make your table alters, and then swap the tables, all without locking the table. Some DDL statements can be done instantaneously, such as dropping an index or renaming a column.

When MySQL isn’t able to use the INPLACE algorithm, it has to fall back to the COPY algorithm, which in turn locks the table. An example of this is changing a column definition from VARCHAR to BLOB. Whenever you expect an alter to run INPLACE, specify the algorithm in your command. This protects you in case MySQL is unable to do an INPLACE alter: MySQL will return an error rather than running the command with the COPY algorithm.


ALTER TABLE employee_test ALGORITHM=INPLACE, CHANGE COLUMN first_name first_name BLOB NULL;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.

All of my testing was done without specifying the algorithm, allowing MySQL to determine the best algorithm to use. I will not be going into foreign keys; for those, and for any other DDL statements that you want more information on, please refer to the documentation for the release of MySQL that you are using.

The Setup

All of my testing was done in virtual machines (VMs) on my laptop. I have a VM running mysqlslap to perform remote DML statements such as SELECT, UPDATE, DELETE and INSERT, causing load on the database server. This allows me to see any potential table locks or performance impact. Here is the setup of the MySQL machine and its components. I created the table shown below and imported 10 million rows. While mysqlslap was running I performed each of the DDL statements and checked that the DML statements were being executed with no table locks. I then recorded the time as they completed.

MySQL Server Stats
  • CPU : 4x CPUs at 2.6 GHz Intel Core i7
  • Memory allocated to VM : 2 Gig
  • Memory allocated to MySQL Innodb buffer pool: 1 Gig
  • Flash Storage
  • Table has 10 Million Rows.
  • DML (Data Manipulation Language) statements such as select, insert, update, and delete, that will be executed against the table during DDL statements
Table Structure
CREATE TABLE `employee_test` (
`emp_no` int(11) NOT NULL AUTO_INCREMENT,
`birth_date` date NOT NULL,
`first_name` varchar(14) NOT NULL,
`last_name` varchar(16) NOT NULL,
`gender` enum('M','F') NOT NULL,
`hire_date` date NOT NULL,
PRIMARY KEY (`emp_no`),
KEY `ix_lastname` (`last_name`),
KEY `ix_firstname` (`first_name`)
) ENGINE=InnoDB AUTO_INCREMENT=10968502 DEFAULT CHARSET=latin1
MySQL DDL Commands
CREATE INDEX ix_hire_date ON employee_test (hire_date); -- CREATE INDEX
CREATE FULLTEXT INDEX ix_lastname_fulltext ON employee_test(last_name); -- CREATE FULLTEXT INDEX
DROP INDEX ix_hire_date ON employee_test; -- DROP INDEX
OPTIMIZE TABLE employee_test; -- OPTIMIZE TABLE
ALTER TABLE employee_test ADD COLUMN test_column INT NULL; -- ADD COLUMN
ALTER TABLE employee_test DROP COLUMN f_name; -- DROP COLUMN
ALTER TABLE employee_test CHANGE first_name f_name varchar(14) NOT NULL; -- RENAME COLUMN
ALTER TABLE employee_test MODIFY COLUMN emp_no BIGINT AUTO_INCREMENT NOT NULL; -- CHANGE COLUMN TYPE
pt-online-schema-change DDL Commands
pt-online-schema-change --execute --alter 'ADD FULLTEXT INDEX ix_lastname_fulltext (last_name)' D=employees,t=employee_test
pt-online-schema-change --execute --alter 'ENGINE=INNODB' D=employees,t=employee_test
pt-online-schema-change --execute --alter 'ADD COLUMN test_column3 INT NULL' D=employees,t=employee_test
pt-online-schema-change --execute --alter 'MODIFY COLUMN gender BLOB NULL' D=employees,t=employee_test
Results

This matrix is a representation of the testing that I performed and how long the commands took to execute. Be careful with Fulltext indexes on your tables, since they can cause additional locking: MySQL has to create the necessary supporting infrastructure in the background, and this requirement causes a great deal of locking on the table. Please see MySQL Innodb Fulltext Indexes for more details.

DDL Matrix

pt-online-schema-change

For the DDL statements that cause locking of the table we wanted to look at incorporating pt-online-schema-change, to help us overcome this obstacle.

pt-online-schema-change results

pt-online-schema-change allowed us to perform, with no locking, the operations that previously locked the table. pt-online-schema-change also has many other features, such as helping with the impact on slave replication and handling foreign keys. But it also has its limitations, such as not being able to run on a table that already has triggers, or complications with foreign keys. There are also impacts on your environment if it is not properly tested and verified. One such example: every time I ran pt-online-schema-change in my test, it caused a deadlock that made mysqlslap die and stop performing any further statements.

mysqlslap: Cannot run query UPDATE employee_test SET first_name = ‘BigPurpleDog’ WHERE last_name = ‘SmallGreenCat’; ERROR : Deadlock found when trying to get lock; try restarting transaction

This is why it is very important to try and determine the impact if any that pt-online-schema-change may have on your environment before starting to use it. I did not encounter this behavior with any of the MySQL DDL statements that I ran.
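
mysqlslap simply stops when it hits this error, but the message itself suggests restarting the transaction. As a minimal sketch of what an application could do instead, assuming a driver that surfaces the MySQL error number (1213 is the deadlock error code), a retry wrapper might look like the following. The DatabaseError class and run_with_retry helper are hypothetical stand-ins, not part of any real driver.

```python
import time

ER_LOCK_DEADLOCK = 1213  # MySQL error code for "Deadlock found when trying to get lock"


class DatabaseError(Exception):
    """Stand-in for a driver exception; real drivers expose the code differently."""

    def __init__(self, errno, msg=""):
        super().__init__(msg)
        self.errno = errno


def run_with_retry(txn, attempts=3, delay=0.1):
    """Call txn(); if it fails with a deadlock, wait briefly and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except DatabaseError as exc:
            # Re-raise anything that is not a deadlock, or the final failure.
            if exc.errno != ER_LOCK_DEADLOCK or attempt == attempts:
                raise
            time.sleep(delay * attempt)  # simple linear backoff
```

Whether retrying is appropriate depends on the application; the point is only that a deadlock during the table swap does not have to be fatal for the client.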

Performance Impact

While performing the changes there were consistent increases in CPU load, disk I/O, and disk usage as the new tables were being created for the table alters. We have to remember that when certain DDL statements are executed, a full copy of the table is made, so you will want to make sure you have enough disk space to complete the change. This is why it is very important to take into consideration the size of the table you are altering and the load on the MySQL server while performing DDL statements. It is preferable to run any DDL statements that cause table copies during off hours, to avoid any delays or outages for the application that is using the data.
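
A minimal sketch of such a disk space check is shown below. It assumes you already know the table size in bytes (for example data_length + index_length from information_schema.tables) and the path of the MySQL data directory; the 1.2 headroom factor and both function names are arbitrary assumptions for illustration, not recommendations from my tests.

```python
import shutil


def has_room_for_copy(table_bytes, free_bytes, headroom=1.2):
    """Return True if free space covers a full table copy plus a safety margin."""
    return free_bytes >= table_bytes * headroom


def check_datadir(datadir, table_bytes, headroom=1.2):
    """Compare the table size against the free space on the datadir's filesystem."""
    free_bytes = shutil.disk_usage(datadir).free
    return has_room_for_copy(table_bytes, free_bytes, headroom)
```

For example, check_datadir("/var/lib/mysql", table_bytes) would be run on the database host before starting the alter, with the path adjusted to your own data directory.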

Query Execution Impact

Query Execution Baseline

Server Performance Impact

MySQL Alter Load

Conclusion

As I have observed in performing these tests, there are many things to consider when performing DDL statements to avoid potential pitfalls. Here is a summary of the recommendations for executing DDL statements or using pt-online-schema-change. Before considering any of this, determine whether the statement you are going to perform will copy the table, and if it will, make sure you have enough disk space.

Without Fulltext
With Fulltext

If you are going to make changes to your production servers, make sure that you run your DDL statements during off hours, when the server is at its lowest utilization for both CPU and disk.

As an added safety measure, when you are performing any of the MySQL DDL statements that you expect to be executed INPLACE without locking the table, make sure you specify ALGORITHM=INPLACE in your statement. If MySQL is unable to execute the command in place, it will just return an error instead of executing the statement with the COPY algorithm, which would lock the table. Here are samples of the DDL statements that you should be able to run INPLACE without causing any locking of your table.

ALTER TABLE employee_test ALGORITHM=INPLACE, ADD INDEX ix_hire_date (hire_date); -- CREATE INDEX
ALTER TABLE employee_test ALGORITHM=INPLACE, DROP INDEX ix_firstname; -- DROP INDEX
ALTER TABLE employee_test ALGORITHM=INPLACE, ENGINE=INNODB; -- OPTIMIZE TABLE
ALTER TABLE employee_test ALGORITHM=INPLACE, ADD COLUMN test_column INT NULL; -- ADD COLUMN
ALTER TABLE employee_test ALGORITHM=INPLACE, DROP COLUMN f_name; -- DROP COLUMN
ALTER TABLE employee_test ALGORITHM=INPLACE, CHANGE first_name f_name varchar(14) NOT NULL; -- RENAME COLUMN
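
If you generate your DDL from scripts, one way to make the safety clause hard to forget is to build every statement through a small helper. The sketch below is only an illustration: inplace_alter and is_inplace_unsupported are hypothetical names, the helper does no identifier quoting or validation, and 1846 is the error number MySQL returned in the example earlier in this post.

```python
ER_ALTER_OPERATION_NOT_SUPPORTED = 1846  # "ALGORITHM=INPLACE is not supported"


def inplace_alter(table, clause):
    """Build an ALTER TABLE statement that must run in place or fail."""
    return f"ALTER TABLE {table} ALGORITHM=INPLACE, {clause};"


def is_inplace_unsupported(errno):
    """True when MySQL refused the alter because it would need a table copy."""
    return errno == ER_ALTER_OPERATION_NOT_SUPPORTED


print(inplace_alter("employee_test", "ADD COLUMN test_column INT NULL"))
```

When is_inplace_unsupported returns True for a failed statement, that is the signal to plan the change differently (off hours, a failover, or a tool such as pt-online-schema-change) rather than to rerun it with the COPY algorithm blindly.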

 

 


Categories: DBA Blogs

DataStax OpsCenter upgrade (4.1 to 5.1) for Cassandra – issue and resolution

Tue, 2016-03-29 08:42

For the Apache Cassandra cluster (version C* 1.2.16) that I’ve supported, the monitoring of the cluster is through DataStax OpsCenter, version 4.1.2. As part of the effort to improve the monitoring capability for this cluster, my team decided first to upgrade OpsCenter to version 5.1.4, the latest available version of OpsCenter that is compatible with Cassandra 1.2.16. The same OpsCenter is also used to monitor another cluster of DataStax Enterprise (DSE) 4.5.2 (it corresponds to Apache Cassandra version 2.0.10).

During the upgrade we ran into an issue, and we couldn’t find a similar problem on Google. We’d like to use this post to document the problems we faced, as well as the solutions and findings from the upgrade.

 

Problem Overview

The OpsCenter upgrade procedure itself is as straightforward as described in the DataStax OpsCenter documentation. After the OpsCenter upgrade, the OpsCenter web portal detected mismatched versions of datastax-agent on all nodes. We chose the “FixMe” option in the portal to upgrade datastax-agent to version 5.1.4 on all nodes being monitored. After the datastax-agent upgrade, we addressed some datastax-agent configuration issues in the “address.yaml” file to reflect the changes brought by the new version of OpsCenter/datastax-agent.

After all this was done, we double checked the log files for OpsCenter and datastax-agent. The OpsCenter log file was mostly clear, and the datastax-agent log for the DSE 4.5.2 cluster was also clear, but the datastax-agent log for Cassandra 1.2.16 was NOT. The corresponding OpsCenter web portal was not able to display Cassandra metrics for the C* 1.2.16 cluster.

In each of the datastax-agent log files in the C* 1.2.16 cluster, we saw a lot of repeating errors like the ones below:

          ERROR [async-dispatch-3] 2016-02-19 12:57:52,750 There was an error when attempting to load stored rollups.
          com.datastax.driver.core.exceptions.InvalidQueryException: Undefined name key in where clause (‘key EQ ‘<… …>”)
          at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
          at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:291)
          at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:205)
          at clojurewerkz.cassaforte.client$execute.invoke(client.clj:289)
          … …
          ERROR [cassandra-processor-1] 2016-02-19 13:00:02,295 Error when proccessing cassandra call
          com.datastax.driver.core.exceptions.InvalidQueryException: Unknown identifier key

 

Problem Analysis and Solution

The fact that the error showed up in the datastax-agent log file gave me a hint that the error might be related to datastax-agent failing to write collected metrics into OpsCenter tables. So as the first step of the analysis, I compared the schema of the “OpsCenter” keyspace between the two monitored clusters. Below is a comparison of two OpsCenter table definitions between the two clusters.

C* 1.2.16 Cluster

CREATE TABLE events (
  "KEY" blob,
  column1 blob,
  value blob,
  PRIMARY KEY ("KEY", column1)
)

CREATE TABLE events_timeline (
  "KEY" blob,
  column1 bigint,
  value blob,
  PRIMARY KEY ("KEY", column1)
)

DSE 4.5.2 Cluster

CREATE TABLE events (
  key text,
  action bigint,
  level bigint,
  success boolean,
  time bigint,
  PRIMARY KEY ((key))
)

CREATE TABLE events_timeline (
  key text,
  column1 bigint,
  value blob,
  PRIMARY KEY ((key), column1)
)

 

From this comparison, we can clearly see that the upgrade of OpsCenter and datastax-agent to version 5.1.4 somehow doesn’t migrate the OpsCenter schema properly for the C* 1.2.16 cluster. The theory for the error is that the upgraded datastax-agent in the C* 1.2.16 cluster is trying to query or update Cassandra metrics in OpsCenter tables in a fashion that matches the OpsCenter schema of the DSE 4.5.2 cluster. But the actual OpsCenter schema in C* 1.2.16 still has the old definition, thus causing the invalid query exception seen in the log file.
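
The comparison can also be done mechanically. Here is a minimal Python sketch, assuming the column definitions have already been extracted into dictionaries (for example by hand from DESCRIBE output); diff_columns is a hypothetical helper, and the two dictionaries simply restate the “events” table definitions from each cluster.

```python
def diff_columns(old, new):
    """Return columns that are missing, extra, or typed differently between two tables."""
    return {
        "only_in_old": sorted(set(old) - set(new)),
        "only_in_new": sorted(set(new) - set(old)),
        "type_changed": sorted(c for c in set(old) & set(new) if old[c] != new[c]),
    }


# Column definitions of the "events" table on each cluster, as compared above.
events_c1216 = {"KEY": "blob", "column1": "blob", "value": "blob"}
events_dse = {"key": "text", "action": "bigint", "level": "bigint",
              "success": "boolean", "time": "bigint"}

print(diff_columns(events_c1216, events_dse))
```

The comparison is deliberately case-sensitive: the quoted "KEY" column in the old schema is not the same identifier as the lowercase key that the new agent queries, which is consistent with the “Undefined name key” error in the log.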

Once the problem is clear, the solution is straightforward. The steps are summarized below:

In C* 1.2.16 cluster,

  1. Take a snapshot of the OpsCenter keyspace on all nodes.
  2. Stop the DataStax agents on all nodes, so they won’t try to write metrics into OpsCenter tables.
  3. Use CQL to drop the OpsCenter tables and re-create them, matching the OpsCenter schema of the DSE 4.5.2 cluster. Make sure that all table properties are the same.
  4. Once the OpsCenter schema is recreated, start the DataStax agents on all nodes.
  5. Verify in the agent log files that the error message is gone.
  6. Restart the OpsCenter service.

 

After these steps, we double checked the log files for all datastax-agents, and for OpsCenter, and we can confirm that there were no errors. The OpsCenter web portal was also able to display the Cassandra metrics properly.

Categories: DBA Blogs

Amazon Database Migration Service – first try

Mon, 2016-03-28 08:04

Recently, while checking Amazon Web Services, I stumbled upon a service I hadn’t tested before. It was the Database Migration Service (DMS). I read the documentation and checked other resources. I found a good, fresh blog post, AWS Database Migration Service, written by Jeff Barr. It was really interesting and I decided to give it a try and test the service.

I created an Oracle RDS instance on AWS as a target and an Oracle Linux box on Azure with Oracle 12c EE as a source database for the migration. The source database SID was “test” and the destination was “orcl”. I created tablespaces and users on both sides with the name “testmig” and created a table on the source database. Initially I loaded 100,000 records into the table and created an index. The schema on the destination database was empty. I also enabled archivelog mode on the source database.

Creating user and table on the source:

test> create user testmig identified by welcome1 default tablespace testmig temporary tablespace temp;

User created.

test> grant connect,resource to testmig;

Grant succeeded.

test> conn test

test> create table test_tab_1 (pk_id number, rnd_str_1 varchar2(15),use_date date,rnd_str_2 varchar2(15), acc_date date);

Table created.

test>

Loading the data:

[oracle@oradb1 patchdepot]$ head test_tab_1.dat
340,MLBO07LV,10/30/13 15:58:04,NABCFVAQ,12/08/17 18:22:48
341,M48R4107,12/09/13 12:30:41,ACA79WO8,12/15/16 08:13:40
342,KARMF0ZQ,04/21/14 08:53:33,JE6SOE0K,06/18/17 07:12:29
343,8NTSYDIS,11/09/14 23:41:48,FBJXWQNX,08/28/15 20:47:39
344,0LVKBJ8T,09/28/12 06:52:05,VBX3FWQG,10/28/15 06:10:42
345,Z22W1QKW,06/06/13 11:14:32,26BCTA9L,08/21/17 08:35:15
346,CGGQO9AL,08/27/14 02:37:41,15SRXZSJ,11/09/17 19:58:58
347,WKHINIUK,07/02/13 14:31:53,65WSGVDG,08/02/15 10:45:50
348,HAO9X6IC,11/17/12 12:08:18,MUQ98ESS,12/03/15 20:37:20
349,D613XT63,01/24/15 16:49:11,3ELW98N2,07/03/16 11:03:40
[oracle@oradb1 patchdepot]$ export NLS_DATE_FORMAT="MM/DD/YY HH24:MI:SS"
[oracle@oradb1 patchdepot]$ sqlldr userid=testmig table=test_tab_1
Password:

SQL*Loader: Release 12.1.0.1.0 - Production on Wed Mar 16 13:07:50 2016

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: TEST_TAB_1
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table TEST_TAB_1:
  100000 Rows successfully loaded.

Check the log files:
  test_tab_1.log
  test_tab_1_%p.log_xt
for more information about the load.
[oracle@oradb1 patchdepot]$

On the target system:

rdsorcl> create tablespace testmig;

Tablespace TESTMIG created.

rdsorcl> create user testmig identified by welcome1 default tablespace testmig;

User TESTMIG created.

rdsorcl>

In the blog post mentioned, the migration was done without replication, and I was curious to test it with some kind of ongoing DML activity on the source database. I set up a Linux box with JMeter and started my load at a pace of about 15 transactions per second. The transactions were inserts and updates on the created table.

Everything was working fine so far, and I switched to the Database Migration Service on AWS. The service has a pretty easy and clear workflow. You just need to push the “Create migration” button and it will guide you through the process. In general, you need to create a replication instance, endpoints for the source and target, and a task to start the initial load and replication.

I created a replication instance and, while it was being created (it took some time), was asked to set up endpoints for the source and target. I hit the first issue when I tried to use a DNS name for my Azure instance. The test connection was failing with a timeout and it was not clear where the problem was. It could have been either a connection or a DNS problem. The issue was solved by providing an IP address instead of a domain name for my Azure instance.
The test for the target endpoint failed with the same timeout, but the reason was totally different. It was not DNS, but rather a connection issue. At first, I couldn’t figure that out because I was able to connect to my RDS instance from my laptop using the server name and port, but the endpoint test in DMS was not working. Eventually I figured out that the problem was in the security groups for the endpoint in RDS. By default the AWS RDS instance was created with a security group allowing connections from outside but somehow restricting connections from DMS. I changed the security group for AWS RDS to “default” and was able to successfully test the endpoint in DMS.
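
Both failures showed up as the same timeout, so it helps to test name resolution and the TCP connection separately. Below is a minimal Python sketch of such a check; diagnose is a hypothetical helper, and it only tells you what is reachable from the machine where you run it, which is not necessarily what the DMS replication instance sees.

```python
import socket


def diagnose(host, port, timeout=5.0):
    """Return 'dns', 'connect', or 'ok' depending on which step fails first."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"  # the name could not be resolved at all
    for family, socktype, proto, _canon, addr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(addr)
                return "ok"
        except OSError:
            continue  # try the next resolved address, if any
    return "connect"  # resolved, but no address accepted the connection
```

A result of "dns" points at the hostname, while "connect" points at firewalls or security groups, which were the two separate causes in my case.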

The next step was to create a task. I created a task with an initial load and ongoing replication for my testmig schema. The task was supposed to drop any tables on the target (you can choose truncate instead if you want), create the objects, move the data and keep replicating until the cutover day, when you will be able to switch your applications to the new database. It will tell you that you need to set up supplemental logging for replication. Unfortunately it doesn’t tell you what kind of supplemental logging you have to set up.

So, I enabled minimal supplemental logging on my Azure test instance.

test> alter database add supplemental log data;
Database add SUPPLEMENTAL altered.

test> exec dbms_capture_adm.prepare_table_instantiation('testmig.test_tab_1','keys')

PL/SQL procedure successfully completed.

test>

It was not enough and I got an error. By default you do not get logging for your task, only configuration and statistics about replicated and loaded objects. As a result, if you get an error, it is not clear where to look. I enabled supplemental logging for the primary key on my replicated table and recreated the task with the logging checkbox checked. I got an error again, but this time I had a log and was able to see what was causing the issue.

2016-03-16T19:41:11 [SOURCE_CAPTURE  ]I:  Oracle compatibility version is 12.1.0.0.0  (oracle_endpoint_conn.c:86)
2016-03-16T19:41:11 [SOURCE_CAPTURE  ]I:  Oracle capture start time: now  (oracle_endpoint_capture.c:701)
2016-03-16T19:41:12 [SOURCE_CAPTURE  ]I:  New Log Miner boundaries in thread '1' : First REDO Sequence is '4', Last REDO Sequence is '4'  (oracdc_reader.c:589)
2016-03-16T19:41:18 [SOURCE_UNLOAD   ]W:  Supplemental logging is not defined for table with no key 'TESTMIG.TEST_TAB_1'  (oracle_endpoint_utils.c:831)
2016-03-16T19:41:18 [SOURCE_UNLOAD   ]E:  Supplemental logging for table 'TESTMIG.TEST_TAB_1' is not enabled properly [122310] Supplemental logging is not correct (oracle_endpoint_unload.c:245)
2016-03-16T19:41:18 [SOURCE_UNLOAD   ]I:  Unload finished for table 'TESTMIG'.'TEST_TAB_1' (Id = 1). 0 rows sent.  (streamcomponent.c:2567)
2016-03-16T19:41:18 [SOURCE_UNLOAD   ]E:  Failed to init unloading table 'TESTMIG'.'TEST_TAB_1' [122310] Supplemental logging is not correct (oracle_endpoint_unload.c:441)

It looked like my supplemental logging was not enough. So, I added supplemental logging for all columns and for the entire testmig schema. I recreated the task and started it again.

test> exec dbms_capture_adm.prepare_table_instantiation('testmig.test_tab_1','all');
PL/SQL procedure successfully completed.

test> exec dbms_capture_adm.prepare_schema_instantiation('testmig');
PL/SQL procedure successfully completed.

test>

This time it worked fine and was able to perform the initial load.

2016-03-16T19:49:19 [SOURCE_CAPTURE  ]I:  Oracle capture start time: now  (oracle_endpoint_capture.c:701)
2016-03-16T19:49:20 [SOURCE_CAPTURE  ]I:  New Log Miner boundaries in thread '1' : First REDO Sequence is '4', Last REDO Sequence is '4'  (oracdc_reader.c:589)
2016-03-16T19:49:31 [SOURCE_UNLOAD   ]I:  Unload finished for table 'TESTMIG'.'TEST_TAB_1' (Id = 1). 100723 rows sent.  (streamcomponent.c:2567)
2016-03-16T19:49:31 [TARGET_LOAD     ]I:  Load finished for table 'TESTMIG'.'TEST_TAB_1' (Id = 1). 100723 rows received. 0 rows skipped. Volume transfered 45929688  (streamcomponent.c:2787)

What about ongoing changes? Yes, it kept the replication on and the tables were in sync. Replication lag in my case was minimal, but we need to note that it was just one table with a low transaction rate. At the end I switched my load to the AWS RDS database, then stopped and deleted the DMS task. The migration was completed. I compared the data in the tables by running a couple of simple checks on counts and rows, and also by running one table “minus” the other. Everything was fine.

rdsorcl> select max(pk_id) from testmig.test_tab_1;

      MAX(PK_ID)
----------------
         1000843

rdsorcl> select * from testmig.test_tab_1 where pk_id=1000843;

           PK_ID RND_STR_1       USE_DATE                    RND_STR_2       ACC_DATE
---------------- --------------- --------------------------- --------------- ---------------------------
         1000843 OUHRTHQ8        02/11/13 07:27:44           NFIAODAU        05/07/15 03:49:29

rdsorcl>

----------------

test> select max(pk_id) from testmig.test_tab_1;

      MAX(PK_ID)
----------------
         1000843

test> select * from testmig.test_tab_1 where pk_id=1000843;

           PK_ID RND_STR_1       USE_DATE                    RND_STR_2       ACC_DATE
---------------- --------------- --------------------------- --------------- ---------------------------
         1000843 OUHRTHQ8        02/11/13 07:27:44           NFIAODAU        05/07/15 03:49:29

test>

test> select count(*) from (select * from test_tab_1 minus select * from test_tab_1@rdsorcl);

        COUNT(*)
----------------
               0

test>
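
Note that the query above only checks one direction (source minus target); rows that exist only on the target would not show up. As a minimal sketch of a two-way comparison, assuming both tables are small enough to fetch into memory as lists of tuples, something like the following could be used. compare_rows is a hypothetical helper and the sample rows are made up for illustration.

```python
from collections import Counter


def compare_rows(source_rows, target_rows):
    """Return rows missing from the target and rows that exist only on the target."""
    src, tgt = Counter(source_rows), Counter(target_rows)
    # Counter subtraction keeps only positive counts, so duplicates are respected.
    return list((src - tgt).elements()), list((tgt - src).elements())


source = [(1, "a"), (2, "b")]
target = [(1, "a"), (3, "c")]
missing, extra = compare_rows(source, target)
print(missing, extra)
```

Unlike SQL MINUS, which works on distinct rows, the Counter-based comparison also notices when a duplicated row appears a different number of times on each side.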

A summary of DMS:

    • You may need to adjust security groups for the target RDS or EC2 systems, since they may prevent connections from DMS.
    • It is better to use an IP address for source endpoints since DNS may not be reliable.
    • Enable logging when you create the task.
    • If you enable replication from an Oracle database, you have to set up full supplemental logging for the replicated schemas on your source system.
    • You need basic knowledge of replication and how it works to understand and fix the errors.

Next time I will try heterogeneous replication from MySQL to Oracle and then the other way around.

Categories: DBA Blogs

MySQL Memory Usage Docs Get a FaceLift

Mon, 2016-03-28 07:59

The MySQL Documentation team recently gave these docs on how MySQL uses memory a much needed face-lift. The new page provides a much clearer overview on how MySQL allocates memory, and provides many helpful links to be able to dig deeper.

For instance, if you weren’t aware of how Performance Schema memory utilization changed in 5.7, there is this helpful paragraph (emphasis mine):

The MySQL Performance Schema is a feature for monitoring MySQL server execution at a low level. As of MySQL 5.7, the Performance Schema dynamically allocates memory incrementally, scaling its memory use to actual server load, instead of allocating required memory during server startup. Once memory is allocated, it is not freed until the server is restarted. For more information, see Section 22.14, “The Performance Schema Memory-Allocation Model”.

Therefore, if you are starting a new project on MySQL 5.7, or upgrading an existing environment, and you have Performance Schema enabled, you might see your memory footprint rising inexplicably. According to the linked Performance Schema Memory-Allocation Model documentation, one reason might be the auto-scaling Performance Schema variables:


performance_schema_accounts_size
performance_schema_hosts_size
performance_schema_max_cond_instances
performance_schema_max_file_instances
performance_schema_max_index_stat
performance_schema_max_metadata_locks
performance_schema_max_mutex_instances
performance_schema_max_prepared_statements_instances
performance_schema_max_program_instances
performance_schema_max_rwlock_instances
performance_schema_max_socket_instances
performance_schema_max_table_handles
performance_schema_max_table_instances
performance_schema_max_table_lock_stat
performance_schema_max_thread_instances
performance_schema_users_size

Of course, you can limit each variable by supplying a value to prevent autoscaling beyond a point.

There might be some areas missing, such as explicit MEMORY tables, but by and large it is a vast improvement.

Other honorable mentions where I’ve seen updates in the documentation include Limiting memory utilization of range optimizations and Configuring innodb_buffer_pool_size.

Happy reading!

Categories: DBA Blogs

Log Buffer #466: A Carnival of the Vanities for DBAs

Mon, 2016-03-28 07:47

This Log Buffer Edition covers the weekly roundup of blog posts from Oracle, SQL Server and MySQL.

Oracle:

The Universal Theme introduced with APEX 5 is immensely good looking and comes with a powerful Theme Roller to customize it.

The implementation of Ksplice has been greatly simplified. Now you just need to register your system(s) with Unbreakable Linux Network (ULN), subscribe to the appropriate Ksplice channel, use the yum command to install the uptrack package, and perform any custom configuration. Your systems will be automatically updated with the latest kernel and user space patches.

Every business book you read talks about delegation. It’s a core requirement for successful managers: surround yourself with good people, delegate authority and responsibility to them, and get out of their way.

Accelerating SQL Queries that Span Hadoop and Oracle Database

Oracle Big Data SQL 3.0 adds support for Hortonworks Data Platform and commodity clusters

SQL Server:

Instant File Initialization : Impact During Setup

Enumerate Windows Group Members

How to execute an SSIS package from the command line or a batch file

When AUTO_UPDATE_STATISTICS Doesn’t Happen

SQL Server Table Smells

MySQL:

MySQL replication primer with pt-table-checksum / pt-table-sync, part 2

How do you dig down into the JSON data, say like in comments on a blog post?

Percona XtraBackup 2.3.4 is now available

Connection timeout parameters in MySQL

What have we learnt in two decades of MySQL?

Categories: DBA Blogs

To apply or not to apply that Cumulative Update (CU)

Thu, 2016-03-24 11:56

Today the SQL Server Engineering team posted an important shift in their recommendations regarding applying Cumulative Updates (often referred to as CUs) on their blog. You can find it here.

About 4 months ago we had an internal debate regarding the best patch strategy and I noticed that our SQL Server DBAs were divided on the best approach.

Some insisted that installing CUs as they were released was the best practice, while others insisted that you should only patch if you needed the Hotfix. I don’t know what spurred it but I saw other discussions on the subject pop up in the community a few days later.

Throughout my career, I’ve been torn on the best strategy myself. I like to keep my systems up to date but I’d always taken the approach that if you needed the hotfix, then with proper testing you should apply a CU. The release of a CU has never been the trigger for me to patch all my SQL Servers, except when a lot of time had passed between Service Packs, which did happen.

I think that strategy of waiting a long time before applying a service pack is a flawed one and I don’t recommend it. I don’t think it’s a good idea to be “one release behind” or wait a year. That said, as a career DBA I don’t think I’ll rush out and apply the CU unless it’s fixing something. If I have the cycles, I may test it early on, but I’ll probably wait a month or two and see what the community has to say about it before applying it to production.

I predict that you’ll see Microsoft (and other vendors) move away from large service packs as they move into a more agile approach to their own software. I suspect it won’t be long before we see a formal cancellation of large releases. This is all the more reason for us to ensure we have automation in place for testing and deployment so that the release of an update isn’t a significant topic in our systems-planning meetings.

Do you apply CUs right away or delay? What’s your patch strategy?

Categories: DBA Blogs

Three essential practices for security compliance

Wed, 2016-03-23 13:40

No IT or business person needs to be told twice that a major security breach can have a devastating impact on a business. Yet enterprises routinely find themselves non-compliant with security best practices and even their own policies.

Why? First, there’s a lot of complexity to manage. And second, with IT teams constantly putting out fires, background functions like security tend to get shortchanged.

That said, there are a few simple things you can do to strengthen how you protect your data and your business.

1. Stay patched and monitor for unauthorized changes

You really aren’t safe without up-to-date security patches for your vulnerable systems, which means most of them. Any software that faces out or touches the Internet is definitely at risk. But internal personnel can pose threats as well, meaning even “inside” systems can be vulnerable.

The problem with patching is scale. If you’re a bank with 300 branches across the country, all with their own IT systems, you don’t have the time or the people to manually patch every system in a centralized, whole-enterprise way. Automation is essential: a mechanism for pushing patches out across all your departments and locations — and verifying successful installation.

Patching is essential, but it’s not enough. If you’re breached, the intruders will try to downgrade or otherwise weaken your defenses. So you need an automated auditing platform that: a) looks for unauthorized changes that could weaken your software systems; and b) reverts compromised systems back to the authorized version of software.
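
One possible way to look for unauthorized changes is to compare file fingerprints against an approved baseline. The sketch below is only an illustration of that idea, not a description of any particular auditing platform: the function names are hypothetical, SHA-256 hashing is a choice made for the example, and the revert step is left out.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def audit(baseline, current):
    """Compare current file contents against an approved baseline.

    baseline: {path: approved SHA-256 hex digest}
    current:  {path: file contents as bytes}
    """
    report = {"modified": [], "missing": [], "unexpected": []}
    for path, approved in baseline.items():
        if path not in current:
            report["missing"].append(path)
        elif fingerprint(current[path]) != approved:
            report["modified"].append(path)
    for path in current:
        if path not in baseline:
            # A file nobody approved is also a change worth reviewing.
            report["unexpected"].append(path)
    return report
```

Anything listed under "modified", "missing" or "unexpected" would then be a candidate for review and for reverting to the authorized version.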

2. Only allow access that’s strictly necessary

Mindset is a big part of security. When it comes to controlling access to system resources, data and applications, your default should be that no one has access to anything. “Permissions” then become very deliberate enablement of specific apps and services to specific users based on specific needs. People should only ever have access to the data and systems they need to do their immediate jobs.

Access rights should be linked to your provisioning systems so that when a person changes jobs or leaves your company, their old rights are immediately removed.
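
The default-deny mindset can be sketched in a few lines. The class and method names here are hypothetical and the in-memory store is an assumption made for the example. The point is only that access exists when it has been granted explicitly, and that a job change or departure removes every old grant at once.

```python
class AccessPolicy:
    """Default deny: nothing is allowed unless it was explicitly granted."""

    def __init__(self):
        self._grants = {}  # user -> set of resources

    def grant(self, user, resource):
        self._grants.setdefault(user, set()).add(resource)

    def revoke_all(self, user):
        # Called by the provisioning system on a job change or departure.
        self._grants.pop(user, None)

    def is_allowed(self, user, resource):
        return resource in self._grants.get(user, set())


if __name__ == "__main__":
    policy = AccessPolicy()
    policy.grant("alice", "billing-db")
    print(policy.is_allowed("alice", "billing-db"))
    print(policy.is_allowed("alice", "hr-db"))
    policy.revoke_all("alice")
    print(policy.is_allowed("alice", "billing-db"))
```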

In general, security should match risk to systems, with levels of increasing verification when an employee’s behavior is unusual. For example, if someone has never logged in from a particular location but appears to be doing so now, serve them up an additional verifying question. If they’ve never logged into a system before, get them to verify their location and identity.

Users should also be prompted to confirm or deny unusual behavior. Did you just log in from a new computer? Did you just change your password? These kinds of security health checks are being integrated into applications, periodically forcing users to review their settings and ensure their identity and security information is up to date.

What happens when you don’t have stringent rules like these? Weakly enforced access rules were at the root of a recent, headline-grabbing security breach at a major U.S. retailer. The intruders had access to one hacked device, but by exploiting weak permissions were able to access many other devices — and make off with 40 million credit card numbers.

3. Assume you’ve been hacked

It’s easy to have a defensive mindset about security: “We’ll stop the bad guys from getting in.” But the reality is they may already be in. The strongest security position comes from assuming you’ve already been hacked. Keep a vigilant watch for evidence of it.

This starts by imposing tight controls on systems that are key to your business operations. Audit all planned changes daily, recording these with approvals in a change log accessible only through off-site logging not connected to systems. If an intruder makes changes, the change log will be your first line of defense — it will be impossible for the hacker to cover their tracks because you will have a forensic change record in a protected location.

Security needs to be a priority in every area of your business. Business units should test the security of their operational practices as part of quarterly business continuity planning. You should regularly test your company-wide systems internally to identify vulnerabilities, and consider hiring professional security experts to attack or socially engineer access to your systems. When it comes to enterprise security, offense really is the best defense.

Categories: DBA Blogs

A Tale of Three Cities: Perspectives on innovation from New York, San Francisco and Sydney

Tue, 2016-03-22 11:29

Recently, Pythian hosted a number of Velocity of Innovation (Velocity) events. I moderated two of these: one last June in New York, and one in November in San Francisco. Another event in Sydney, Australia was moderated by Tom McCann, senior customer experience analyst with Forrester.

Our Velocity events have given us unique insights into what IT professionals in various regions see as their top priorities or concerns. And although we always framed our discussions with similar questions, it was interesting to see the different directions they took in each location — especially when it came to the topic of innovation.

So what makes a particular region fertile ground for innovation? And can you measure it?

The Global Innovation Index (GII) ranks countries based on a multitude of indicators of innovation. The United States ranks number 2 on the GII, behind Switzerland, while Australia is number 17, out of 141 countries. According to the GII website, the index aims to capture the multi-dimensional facets of innovation and provide the tools to assist in tailoring policies to promote long-term output growth, improved productivity and job growth.

The ideas discussed in the US and Australian locations seemed to align with the GII results, with US panelists expressing more positive attitudes and concrete ideas on how companies can improve agility and become more innovative. And while being at the forefront of technology in the Asia-Pacific region, the Australian panelists and audience members described more cautious approaches to achieving innovation.

Sydney: Cautiously moving forward

Early in the Sydney panel discussion, Chris Mendez, executive consultant big data and analytics from Industrie IT, sparked a lively discussion about innovation by asserting that innovation is lacking in that region.

“I actually don’t think there’s enough innovation in Australia, in particular. There’s a lot of talk about it, people are doing a lot of experiments, and there are some companies who’ve set up business purely based on tool sets that use data to innovate. But there are a few things that seem to be working against innovation, and I think one of those things is that it doesn’t stand on its own,” Mendez said.

According to Francisco Alvarez, vice president, APAC at Pythian, the risks associated with innovation might be holding companies back in Australia. “The main problem for most companies is that innovation equals risk,” Alvarez said.

Alvarez also commented on what it takes to make innovation work. “If you take a step back and look at the companies that are doing well in the market, you can see that there is one factor that differentiates them: they were not afraid to try to innovate. And because of that innovation they are getting their share of the market and gaining ground. Just look at the financial market. CBA was considered crazy a few years ago for all the investment they were making in technology, social media, apps and so on. They got ahead. And now everybody is trying to do the same,” he said.

Mendez thinks that innovation needs to start from the top. “I think there’s also a very big misunderstanding at board levels about innovation because boards are there to actually stop you changing your business. The fundamental tenet is: ‘We’ve got a great business model here, it’s running well, we’ve got to make sure that any change to it doesn’t damage that.’ There’s a natural caution at board levels and it’s totally understandable,” he said.

While cautious, the Sydney panelists expressed that they thought there is hope for more innovation in the future. They expressed a need to proceed slowly, watching what works for innovation leaders.

“The key is to have a balance,” Alvarez said.

New York: Early adopters

If you were to put our New York panelists on Geoffrey Moore’s Technology Adoption Lifecycle (https://en.wikipedia.org/wiki/Geoffrey_Moore), you might classify them as early adopters, rather than true innovators. Not surprising, since New York’s competitive industries such as banking and publishing rely on innovative technologies, but they don’t create them.

According to New York panelist, Forrester Analyst Gene Leganza, what makes an enterprise agile is the ability to sense what’s going on in the marketplace and to quickly respond to it. But, he said that innovation comes at a cost. “The flip side of agility is innovation. An interesting aspect of innovation is getting really hot talent into your environment. Getting the right talent and doing smart things and being leading edge are challenges. You have to figure out what level to drop in on, where you are in the industry. You need to determine if you are a startup or a state organization that needs to be a fast follower,” Leganza said.

Otto Toth, CTO at Huffington Post warned that innovating quickly is not always in the best interest of the business, or it may not be the way to do it properly. He asserted that quick innovation can actually work against the business, and that instead of making your business faster, being very agile can slow everything down.

“Too many decision-makers just slow down the process. It’s better to have a few people or a core team who make the decisions and come up with new features,” he added.

Leganza went on to describe what it takes at various levels of the organization. He said that there’s a notion at the engineer level that agility means bureaucracy won’t get in their way. Then there’s agility at the enterprise level, which is about reducing risk and understanding how soon change can be in production.

“The higher up you go, the more people are going to be receptive to what improves the whole portfolio rather than one project. This is where architects come in. They have been hands-on, but have the credibility and knowledge to guide the organization more strategically,” Leganza said.

San Francisco: The innovators

In San Francisco the narratives on innovation were quite different. Although cities don’t have their own GII ranking, you might assume that the West Coast IT leaders are the innovators. And judging by the discussion at the San Francisco event, this assumption seemed to be true.

Cory Isaacson, CTO at RMS was one of our San Francisco panelists. His company runs catastrophe models for some of the world’s largest insurance companies, like scenarios that will tell what a disaster like an earthquake or hurricane might cost them. Isaacson has been working on bringing big data and scalable systems together to create a new cloud-based platform.

“At my company some of the things that we’re trying to do are, honestly, more advanced than most other things I’ve ever seen in my career. But when you’re doing innovation, it is risky. There’s no way around it. There is a lot to evaluate: from different algorithms to the risk models and the catastrophe models,” said Isaacson.

Sean Rich, director of IT at Mozilla added to the San Francisco discussion by talking about some of the concrete innovations his company is working on. They’re taking a partnership approach to enable agility.

“Innovation is doing something new. In an effort toward achieving agility, one of the things that we’re doing is enabling the agility of our business partners, by changing our own operating model. Instead of traditional IT where we run all the services and infrastructure necessary to drive the business, we’re taking more of an enabler or partnership approach,” Rich said.

“We’re now doing things like encouraging shadow IT, encouraging the use of SaaS applications and helping them really do that better through different service offerings like vendor management or change management of user adoption for certain platforms and data integration,” he added.

“Overall, we’re looking at ourselves differently, and asking what new capabilities we need to develop, and what processes, tools and skills we need to enable agility for our marketing group or our product lines, as an example,” Rich said.

Aaron Lee, the Chief Data Officer at Pythian, runs a team that specializes in helping clients harness technology to deliver real outcomes. Usually they involve things like big data, DevOps, cloud, advanced analytics — he’s involved in some of the most leading edge initiatives for Pythian customers. He takes a practical approach to innovation with clients, and said that companies could improve innovation by looking at the root of the motivation for it.

“They need to ask: Why are we going down this path, trying to innovate something and what is the value of that thing we’re trying to innovate?

“If the shared goals around innovation opportunities aren’t defined in a way that actually lead to success over time, then the business is just like any other organism: it starts to get more risk averse. Then it becomes harder and harder to execute any kind of change agenda. Planning in a way that is likely to have a good long-term outcome, even at the outset of any sort of initiative, is one key success criteria that we put in place to help ourselves and our customers get to a good place,” Lee said.

Isaacson added that companies like Google have been known to allow an engineer to take a day a week or a day every two weeks to just look at things. “I think though, the challenge is you have to get your organization up to the point where this is an economically viable thing to do. Once we get more ahead of the curve, I think we could do that kind of thing,” he said.

Interested in being a part of a discussion like this? VELOCITY OF INNOVATION is a series of thought-leadership events for senior IT management hosted by Pythian. Pythian invites leading IT innovators to participate in discussions about today’s disruptive technologies: big data, cloud, advanced analytics, DevOps, and more. These events are by invitation only.

If you are interested in attending an upcoming Velocity of Innovation event in a city near you, please contact events@pythian.com. To view our schedule of upcoming events visit our Velocity of Innovation page.

Categories: DBA Blogs

Apache Cassandra 2.1 Incremental Repair

Mon, 2016-03-21 14:05

The “incremental repair” feature has been around since Cassandra 2.1. Conceptually, the idea behind incremental repair is straightforward, but it can get complicated in practice. The official Datastax document describes the procedure for migrating to incremental repair, but in my opinion, it doesn’t give a full picture. This post aims to fill in this gap by summarizing and consolidating the information on Cassandra incremental repair.

Note: this post assumes the reader has a basic understanding of Apache Cassandra, especially the “repair” concept within Cassandra.

 

1. Introduction

The idea of incremental repair is to mark SSTables that are already repaired with a flag (a timestamp called repairedAt indicating when it was repaired) and when the next run of repair operation begins, only previously unrepaired SSTables are scanned for repair. The goal of an “incremental repair” is two-fold:

1) It aims to reduce the big expense that is involved in a repair operation that sets out to calculate the “merkle tree” on all SSTables of a node;

2) It also makes repair network efficient because only rows that are marked as “inconsistent” will be sent across the network.
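
The selection rule described above can be sketched as follows. This is a toy model rather than Cassandra's implementation: the SSTable class and both function names are hypothetical, and the sketch assumes that a repairedAt value of 0 stands for "never repaired".

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    name: str
    repaired_at: int = 0  # assumption: 0 means the SSTable is unrepaired


def select_for_incremental_repair(sstables):
    """Return only the SSTables that have not been marked as repaired."""
    return [s for s in sstables if s.repaired_at == 0]


def mark_repaired(sstables, timestamp_ms):
    """Stamp the given SSTables with the time the repair completed."""
    for s in sstables:
        s.repaired_at = timestamp_ms


if __name__ == "__main__":
    tables = [SSTable("ks-t-1", 1458568800000), SSTable("ks-t-2"), SSTable("ks-t-3")]
    todo = select_for_incremental_repair(tables)
    print([s.name for s in todo])
    mark_repaired(todo, 1458655200000)
    print(select_for_incremental_repair(tables))
```

Only the unrepaired subset takes part in the next run, which is why the Merkle tree work shrinks as more data is marked as repaired.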

2. Impact on Compaction

“Incremental repair” relies on an operation called anticompaction to fulfill its purpose. Basically, anticompaction means splitting an SSTable into two: one contains repaired data and the other contains non-repaired data. With the separation of the two sets of SSTables, the compaction strategy used by Cassandra also needs to be adjusted accordingly. This is because a repaired SSTable cannot be merged/compacted with an unrepaired SSTable. Otherwise, we lose the repaired states.

Please note that when an SSTable is fully covered by a repaired range, no anticompaction will occur. It will just rewrite the repairedAt field in SSTable metadata.
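
A toy model of anticompaction may make the split clearer. It is a sketch under simplifying assumptions, not Cassandra's code: an SSTable is represented as a list of (token, value) pairs, the repaired range is a simple (start, end] interval with no wrap-around, the input SSTable is assumed to be unrepaired, and the function name is hypothetical.

```python
def anticompact(rows, repaired_range, repaired_at):
    """Split one SSTable's rows into repaired and unrepaired parts.

    rows: list of (token, value) pairs standing in for an SSTable.
    repaired_range: (start, end], the token range that was just repaired.
    Returns (repaired, unrepaired); either part is None when it is empty.
    """
    start, end = repaired_range
    inside = [r for r in rows if start < r[0] <= end]
    outside = [r for r in rows if not (start < r[0] <= end)]
    # When nothing falls outside the range, no split happens: the whole
    # SSTable simply receives the new repairedAt value.
    repaired = {"repaired_at": repaired_at, "rows": inside} if inside else None
    unrepaired = {"repaired_at": 0, "rows": outside} if outside else None
    return repaired, unrepaired


if __name__ == "__main__":
    sstable = [(10, "a"), (20, "b"), (30, "c")]
    print(anticompact(sstable, (0, 20), 1458568800000))
    print(anticompact(sstable, (0, 100), 1458568800000))
```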

The SizeTiered compaction strategy takes a simple approach: as a result of the anticompaction operation, SizeTiered compaction is executed independently on the two sets of SSTables (repaired and unrepaired).

For the Leveled compaction strategy, leveled compaction is executed as usual on the repaired set of SSTables, but SizeTiered compaction is executed on the unrepaired set.

For DateTiered compaction strategy, “incremental repair” should NOT be used.
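
The three cases can be summarized as a small lookup. The function name and the short strategy labels are made up for this sketch. It only restates the rules above and does not call any Cassandra API.

```python
def compaction_for(strategy, repaired):
    """Return which compaction runs on a set of SSTables under incremental repair.

    strategy: the table's configured strategy ("SizeTiered", "Leveled", "DateTiered").
    repaired: True for the repaired set of SSTables, False for the unrepaired set.
    """
    if strategy == "SizeTiered":
        # Both sets are compacted independently with the same strategy.
        return "SizeTiered"
    if strategy == "Leveled":
        # Only the repaired set stays leveled.
        return "Leveled" if repaired else "SizeTiered"
    if strategy == "DateTiered":
        raise ValueError("incremental repair should not be used with DateTiered")
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for repaired in (True, False):
        print("Leveled", repaired, "->", compaction_for("Leveled", repaired))
```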

3. Migrating to Incremental Repair

By default, “nodetool repair” of Cassandra 2.1 does a full, sequential repair. We can use “nodetool repair” with “-inc” option to enable incremental repair.

For the Leveled compaction strategy, incremental repair actually changes the compaction strategy to SizeTiered for unrepaired SSTables. If an incremental repair is executed for the first time on a table using the Leveled compaction strategy, Cassandra will do SizeTiered compaction on all SSTables, because until the first incremental repair is done it doesn’t know the repaired states. This is a very expensive operation, so it is recommended to migrate to incremental repair one node at a time, using the following procedure:

  1. Disable compaction on the node using nodetool disableautocompaction
  2. Run the default full, sequential repair.
  3. Stop the node.
  4. Use the tool sstablerepairedset to mark all the SSTables that were created before you disabled compaction.
  5. Restart Cassandra

3.1 Tools for managing SSTable repaired/unrepaired state

Cassandra offers two utilities for SSTable repaired/unrepaired state management:

  • sstablemetadata is used to check repaired/unrepaired state of an SSTable. The syntax is as below:

             sstablemetadata <sstable filenames>

  • sstablerepairedset is used to manually mark if an SSTable is repaired or unrepaired. The syntax is as below. Note that this tool has to be used when Cassandra is stopped.

             sstablerepairedset [--is-repaired | --is-unrepaired] [-f <sstable-list> | <sstables>]

Please note that with the sstablerepairedset utility, you can also stop incremental repair on Leveled compaction and restore the data to be leveled again with the “--is-unrepaired” option. Similarly, the node needs to be stopped first.
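
The migration procedure from section 3 can be laid out as an ordered plan. The sketch below only builds the command lines and does not execute anything. The list-file path is a placeholder, the function name is hypothetical, and the stop and restart steps carry no command because they depend on how Cassandra is installed. Some Cassandra versions may expect an additional confirmation flag for sstablerepairedset that the syntax above does not show, so the extra_flags parameter is left open for whatever the tool's own usage message asks for.

```python
def migration_plan(sstable_list_file, extra_flags=()):
    """Return the per-node migration steps as (description, command) pairs.

    A command of None marks a step that depends on the local installation.
    """
    mark_cmd = ["sstablerepairedset", *extra_flags,
                "--is-repaired", "-f", sstable_list_file]
    return [
        ("Disable autocompaction on the node", ["nodetool", "disableautocompaction"]),
        ("Run the default full, sequential repair", ["nodetool", "repair"]),
        ("Stop the node", None),
        ("Mark the pre-existing SSTables as repaired", mark_cmd),
        ("Restart Cassandra", None),
    ]


if __name__ == "__main__":
    # "/tmp/sstables.txt" is a placeholder for a file listing SSTable paths.
    for number, (description, command) in enumerate(migration_plan("/tmp/sstables.txt"), 1):
        shown = " ".join(command) if command else "(installation-specific)"
        print(f"{number}. {description}: {shown}")
```

Keeping the plan as data makes it easy to review before running it by hand on one node at a time.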

4. Other Considerations with Incremental Repair

There are some other things to consider when using incremental repair.

  • For Leveled compaction, once incremental repair is in use, it should be run regularly. Otherwise, only SizeTiered compaction will be executed. It is recommended to run incremental repair daily and run full repairs weekly to monthly.
  • Recovering from missing data or corrupted SSTables requires a non-incremental full repair.
  • The “nodetool repair” -local option should be used only with full repair, not with incremental repair.
  • In C* 2.1, sequential repair and incremental repair do NOT work together.
  • With an SSTable’s repaired state being tracked via its metadata, some Cassandra tools can impact the repaired state:
    1. Bulk loading will make loaded SSTables unrepaired, even if they were repaired in a different cluster.
    2. If scrubbing causes dropped rows, new SSTables will be marked as unrepaired. Otherwise, SSTables will keep their original repaired state.
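
Several of these considerations are simple compatibility rules, so they can be written down as a pre-flight check. The sketch below is a hypothetical helper, not part of nodetool, and it only encodes the rules listed above for C* 2.1.

```python
def check_repair_options(incremental, sequential=False, local_dc=False,
                         recovering_lost_data=False):
    """Return a list of problems with a planned repair; empty means OK."""
    problems = []
    if incremental and sequential:
        problems.append("sequential and incremental repair do not work together in 2.1")
    if incremental and local_dc:
        problems.append("the -local option should only be used with full repair")
    if incremental and recovering_lost_data:
        problems.append("missing or corrupted SSTables require a full repair")
    return problems


if __name__ == "__main__":
    for problem in check_repair_options(incremental=True, local_dc=True):
        print("WARNING:", problem)
```

A full repair passes every check, which matches the advice to fall back to full repair whenever one of these conditions applies.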
Categories: DBA Blogs

Pythian at Collaborate 16

Mon, 2016-03-21 13:27

Collaborate is a conference for Oracle power users and IT leaders to discuss and find solutions and strategies based on Oracle technologies. This many Oracle experts in one place only happens once per year, and Pythian is excited to be attending. If you are attending this year, make sure to register for some of the sessions featuring Pythian’s speakers, listed below.

Collaborate 16 is on April 10-14, 2016 at the Mandalay Bay Resort and Casino in Las Vegas, Nevada, US.

 

Pythian Collaborate 16 Speaker List:

 

Michael Abbey | Consulting Manager | Oracle ACE

Communications – the Good, the Bad, and the Best

Tues April 12 | 9:15 a.m. – 10:15 a.m. | North Convention, Room South Pacific D

Traditional DB to PDB: The Options

Tues April 12 | 2:15 p.m. – 3:15 p.m. | Room Jasmine A

Documentation – A Love/Hate Relationship (For Now)

Wed April 13 | 8:00 a.m. – 9:00 a.m. | Room Palm A

 

Nelson Calero | Database Consultant | Oracle ACE

Exadata Maintenance Tasks 101

Tues April 12 | 10:45 a.m. – 11:45 a.m. | Room Palm C

Evolution of Performance Management: Oracle 12c Adaptive Optimization

Tues April 12 | 3:30 p.m. – 4:30 p.m. | Room Jasmine A

 

Subhajit Das Chaudhuri | Team Manager

Deep Dive Into SSL Implementation Scenarios for Oracle Application E-Business Suite

Wed April 13 | 8:00 a.m. – 9:00 a.m. | Room Breakers E

 

Alex Gorbachev | CTO | Oracle ACE Director

Oaktable World: TED Talks

Wed April 13 | 12:00 p.m. – 12:30 p.m. | Room Mandalay Bay Ballroom

Oaktable World: Back of a Napkin Guide to Oracle Database in the Cloud

Wed April 13 | 4:15 p.m. – 5:15 p.m. | Room Mandalay Bay Ballroom

 

Gleb Otochkin | Principal Consultant

Two towers or story about data migration. Story about moving data and upgrading databases.

Mon April 11 | 4:30 p.m. – 5:30 p.m. | Room Jasmine A

 

Simon Pane | ATCG Principal Consultant | Oracle Certified Expert

Oracle Database Security: Top 10 Things You Could & Should Be Doing Differently

Mon April 11 | 2 p.m. – 3 p.m. | Room Palm A

Time to get Scheduling: Modernizing your DBA scripts with the Oracle Scheduler (goodbye CRON)

Tues April 12 | 10:45 a.m. – 11:45 a.m. | Room Palm A

 

Roopesh Ramklass | Principal Consultant

Oracle Certification Master Exam Prep Workshop

Sun April 10 | 9:00 a.m. – 3:00 p.m. | Room Jasmine C

Fast Track Your Oracle Database 12c Certification

Wed April 13 | 8:00 a.m. – 9:00 a.m. | Room Jasmine A

 

Categories: DBA Blogs

The 5 Best Things That Will Happen to DBAs When SQL Server Moves to Linux

Fri, 2016-03-18 13:50

 

In the second half of 2017, SQL Server will start calling Linux its second home. Azure Data Lake for Ubuntu was the sign that Microsoft was serious about going Linux. A private preview of SQL Server on Linux is already available. This scribe is not part of it, but let me wildly guess what would warm the hearts of those DBAs who have played with Oracle on Linux and SQL Server on Windows:

 

  1. Cleanup won’t require sifting through registry entries and cleaning them up. The uninstall would be quick, simple, and a breeze.
  2. No confusion about Windows authentication or SQL Server authentication.
  3. Much, much better system utilities for monitoring and root cause analysis.
  4. Boasting and bragging rights about managing an enterprise database on Linux.

 

Probably the most serious thing that will happen with SQL Server on Linux is that more and more Oracle DBAs who prefer to use Linux will start taking an interest in managing SQL Server.

Pythian is perfectly poised to leverage this change in technology from Microsoft. We have world class SQL Server DBAs, Linux gurus, and some magnificent Oracle DBAs. Existing clients as well as new clients can count on these resources to get their SQL Server databases supported or migrated to Linux.

And of course, it doesn’t really matter whether these databases are in the cloud or not come 2017, because Pythian has already covered that too.

Categories: DBA Blogs