
DBA Blogs

Hands-On Programming with R by Garrett Grolemund

Surachart Opun - Wed, 2014-08-27 02:42
R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
The R language is useful for becoming a data scientist as well as a computer scientist. Here I mention a book about data science with R: Hands-On Programming with R: Write Your Own Functions and Simulations by Garrett Grolemund. It shows how to solve the logistical problems of data science and how to write your own functions and simulations with R. Readers learn through practical data analysis projects (Weighted Dice, Playing Cards, Slot Machine) and gain a deeper understanding of R. In addition, Appendixes A-E help with installing and updating R and R packages, as well as loading data and debugging R code.
Garrett Grolemund maintains shiny.rstudio.com, the development center for the Shiny R package.
Free Sampler. Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

SQL Server Replication Quick Tips

Pythian Group - Tue, 2014-08-26 07:56

There comes a time in every SQL Server DBA's career when an email arrives with a “replication is not working, please check” message. This article provides quick tips on how to handle common replication errors and performance problems in a one-way transactional replication topology.

Oh boy, there is a data problem:


You check Replication Monitor and get a:

“Transaction sequence number: 0x0003BB0E000001DF000600000000, Command ID: 1”

The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)

Note the transaction sequence number, as it will be used in the following scripts. The Command ID is also important to note, because not necessarily the whole sequence number has issues; the problem might be tied to just one command.

Go to the distribution database and run the following command to get the list of articles involved in this issue:

select * from dbo.MSarticles
where article_id in (
select article_id from MSrepl_commands
where xact_seqno = 0x0003BB0E000001DF000600000000)

To get the whole list of commands, you can run the query below:

exec sp_browsereplcmds
@xact_seqno_start = '0x0003BB0E000001DF000600000000',
@xact_seqno_end = '0x0003BB0E000001DF000600000000'

With this last query you can get to the exact command that is failing (by searching for the command number in the command ID column).

You will notice that transactional replication will typically (depending on setup) use insert, delete, and update stored procedures to replicate the data, so the command you see here will look something like:

{CALL [sp_MSdel_dboMyArticle] (118)}

That is the stored procedure generated to process delete statements over the dbo.MyArticle table, and in this case it is trying to delete ID 118. Based on the error reported, you will now realize that replication is trying to delete the row with ID 118 from MyArticle and it is not there, so it is trying to delete a non-existent record.

Options:

  1. You can check the publisher for this record and manually insert it in the subscriber; this will cause the replication command to succeed and will fix the issue (see the sketch after this list).
  2. You can skip the command by removing it from the MSrepl_commands table. For this specific example you can skip it, as there is no need to delete something that has already been deleted. (Beware: only do this when you know what you are doing; manually removing transactions can result in an unstable replication.) In this example you would use something like:
    Delete from MSrepl_commands
    where xact_seqno = 0x0003BB0E000001DF000600000000 and command_id = 1
  3. Reinitialize. This option is the least popular; you should try to fix the issue before resorting to it. However, if after skipping the command you still get new errors everywhere, something definitely went wrong and there is no easy way to guarantee that your subscription is up to date and stable. This can be an indicator that someone or something messed around with the data: some modification was made at the subscription and it is now causing issues with the replication. Remember that a one-way transactional replication is usually intended to maintain a copy of the data so it can be queried; no modification should be made to the data at the subscriber, as it won't replicate back to the publisher.
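
For option 1, here is a minimal T-SQL sketch of the idea. The table dbo.MyArticle and key 118 are just the example values from the error above; substitute your real table, key, and column list:

-- At the publisher: check whether the row still exists
select * from dbo.MyArticle where ID = 118;

-- If it does, recreate it at the subscriber so the pending replicated command can succeed
insert into dbo.MyArticle (ID /*, other columns */)
values (118 /*, other values */);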

Query timeouts:

After checking Replication Monitor you get a message like:

Query timeout expired
The process is running and is waiting for a response from the server
Initializing…

and then terminating with this error…
Agent ‘MyAgent’ is retrying after an error, YY retries attempted

This can be due to several reasons:

  • Your transaction is taking a long time and needs some tuning. If your transaction is touching too much data or is using a bad query plan, it can result in a long-running query; check your T-SQL and see if the execution plan is optimal.
  • There is a problem with the network. If you normally don't have this issue and it just happened out of the blue, you can check the network; sometimes a network failure or a saturated endpoint can increase transfer times and affect your replication.
  • Server performance. Either the publisher or the subscriber can have a performance problem; excessive CPU or memory usage can eventually impact a replication transaction, causing it to time out.
  • The query just needs some more time to complete. If this is the case you can tweak the timeout setting to give the transaction some more time so it can process properly. To do this:
  1. Right click the Replication folder
  2. Click Distributor Properties and select General
  3. Click ‘Profile Defaults’
  4. Choose ‘Distribution Agents’ on left
  5. Click ‘New’ to create a new default agent profile
  6. Choose ‘Default Agent Profile’ from the list displayed, (to copy this)
  7. Pick a name for your new profile and update the QueryTimeout value in the right column
  8. Save
  9. Choose whether to use this profile across all your replication sets; however, I would recommend applying it only to the agent that requires this change
  10. To individually assign the profile, open Replication Monitor and then in the left pane click your replication set
  11. In the right pane, select your desired agent, right click and change the profile to the new one you just created

Mini Hack on expired subscriptions

When a replication is marked as expired, it will tell you that you need to reinitialize.

To activate it “under the hood”, check the last error in Replication Monitor; it will show you the last sequence number it tried to process. Then run this command (using the corresponding seq_no):

update MSsubscriptions
set status=2
where subscription_seqno=0x0002AADE00005030000100000002

The status column means:

0 = Inactive.

1 = Subscribed.

2 = Active.
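
Before changing anything, you can take a quick look at what is there now. This is just a hedged sketch against the distribution database; status and subscription_seqno are the columns used in the update above, while the other column names may vary by version, so verify them on your build:

select publisher_db, subscriber_db, status, subscription_seqno
from dbo.MSsubscriptions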

You can change it to Active and it will try to process again. Why would you use this? If the subscription expired but your distribution cleanup job hasn't run yet, it can try to reprocess everything again. If the issue was related to a network timeout and your network is now back up, you can try this as it will start from the last sequence number. You can also do this to reproduce the last error reported: it will fail and eventually expire again, but you will have a better idea of why it failed in the first place.

Multi threading or “Streams”

A slow replication, by which I mean one where you know commands are experiencing a delay going from the distributor to the subscriber, can be confirmed with performance counters or by quickly inserting a tracer token (http://technet.microsoft.com/en-us/library/ms151846%28v=sql.105%29.aspx).
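
If you have not used tracer tokens before, a hedged sketch looks like the following; the publication name is a placeholder, and the tracer id shown is only an example value that you would take from the output of sp_helptracertokens:

-- Run at the publisher, in the publication database
exec sys.sp_posttracertoken @publication = N'MyPublication';

-- List the tokens, then check latency for one of them
exec sys.sp_helptracertokens @publication = N'MyPublication';
exec sys.sp_helptracertokenhistory @publication = N'MyPublication', @tracer_id = -2147483641;  -- use an id returned above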

You can improve performance by adding streams. By default, replication transactions are written sequentially, one by one; with streams you can add more threads, so if you specify 4 streams you will be processing 4 transactions at a time, meaning a faster turnaround. This can work beautifully, but it can also generate deadlocks and inconsistencies, so I would recommend starting low, adding 1 stream at a time, and stopping when you start seeing problems. Do not go crazy and treat this as a turbo button by adding 30 streams, and like most features, test it in QA first!

To Enable this option follow these steps:

  1. Open Replication Monitor, expand the Publisher and select the Publication in the left pane.
  2. In the right pane, under “All Subscriptions”, you will see a list of all the Subscribers.
  3. Right click the Subscriber you want to modify and click “View Details”. A new window will appear with the distribution agent session details.
  4. Now click “Action” in the menu bar at the top and select “Distribution Agent Job Properties”; this will open the corresponding job properties.
  5. Go to “Steps” in the left pane, highlight “Run Agent” in the right pane, and click Edit.
  6. A new window will pop up; scroll to the right end of the command section and append this parameter: “ -SubscriptionStreams 2”
  7. Save the settings and restart the Distribution Agent job.

You might encounter some issues when implementing this; you can read this KB article for further info:

http://support.microsoft.com/kb/953199

Conclusion

There are many tips on how to fix a replication. Sometimes it is easier to just reinitialize, but that is not always an option when critical systems depend on the subscription being up to date, or when your database is so huge that reinitializing would take days to complete. When possible, try to troubleshoot instead of just restarting the replication from scratch, as it will give you a lot more insight into what is going on.

Categories: DBA Blogs

12.1.0.2 Introduction to Attribute Clustering (The Division Bell)

Richard Foote - Tue, 2014-08-26 00:03
One of the really cool new features introduced in 12.1.0.2 is Attribute Clustering. This new table based attribute allows you to very easily cluster data in close physical proximity based on the content of specific columns. As I’ve discussed many times, indexes love table data that is physically clustered in a similar manner to the index […]
Categories: DBA Blogs

Arizona Oracle User Group (AZORA)

Bobby Durrett's DBA Blog - Mon, 2014-08-25 11:10

I saw this fun blog post about the Arizona Oracle User Group getting organized: blog post

I’m definitely interested in being involved.  Please pass this on to any Oracle DBAs, developers, etc. that you know in the Phoenix area.

– Bobby

Categories: DBA Blogs

Integrated Application Heartbeat for Oracle GoldenGate

DBASolved - Mon, 2014-08-25 06:51

Over the last two weeks I've been working on a heartbeat monitoring solution for a client.  This is not the normal heartbeat solution provided by Oracle as described in note 1299679.1, yet it is very similar.  The approach that I configured is similar to a traditional heartbeat setup but uses the same extracts and replicats already being used by the database/application.  For simplicity, I like to call this approach an Integrated Application Heartbeat.

In order to set up this style of heartbeat monitoring, the following items are needed:

1.  Identify the heartbeat table in the source database.  In this configuration, let's call this table SETTINGS (keep in mind nothing will change with the table; no DDL changes).
2.  Identify and generate DDL for the target heartbeat table.  Let's call this table GG_STATUS_HB.
3.  Identify and generate DDL for the target heartbeat history table.  Let's call this table GG_STATUS_HB_HIST.
4.  Two triggers for updating information in the target heartbeat tables (one example below).
5.  A few macros, one for each of the processes in the configuration.
6.  Either a crontab job or a DBMS_JOBS process.

Now that the components have been identified, let's take a look at what needs to be done.

The source side heartbeat table has already been identified.  In the database, the table name is SETTINGS.  The SETTINGS table has a single column for a primary key.  This table is updated using a crontab job that runs the following SQL.  The <schema> variable is there because this table (SETTINGS) exists in more than one schema.

update <schema>.settings set id=id where rownum < 2;
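
If you prefer the DBMS_JOBS/DBMS_SCHEDULER route mentioned in item 6 above, a minimal sketch might look like the following; the job name, the myschema owner, and the one-minute interval are assumptions, so adjust them to your environment:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'GG_HB_SETTINGS_UPDATE',   -- example name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN UPDATE myschema.settings SET id = id WHERE ROWNUM < 2; COMMIT; END;',
    repeat_interval => 'FREQ=MINUTELY; INTERVAL=1',
    enabled         => TRUE);
END;
/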

On the target side, the target heartbeat and heartbeat history tables need to be created.  These tables will not match the columns in the SETTINGS table.  This means that all the data replicated from the SETTINGS table still has to be replicated along with its use as a heartbeat table.  The DDL to create these two tables is as follows:

CREATE TABLE <schema>.GG_STATUS_HB
(
SITE_ID                    NUMBER(10),
DB_SCHEMA                  VARCHAR2(30),
CSN_TS                     TIMESTAMP(6),
CSN_NB                     NUMBER(18),
EXT_GROUP                  VARCHAR2(10),
EXT_TIME                   TIMESTAMP(6),
EXT_LAG                    NUMBER,
PMP_GROUP                  VARCHAR2(10),
PMP_TIME                   TIMESTAMP(6),
PMP_LAG                    NUMBER,
TGT_DB_NAME                VARCHAR2(15),
REP_GROUP                  VARCHAR2(10),
REP_TIME                   TIMESTAMP(6),
REP_LAG                    NUMBER,
TOTAL_LAG                  NUMBER,
TOTAL_LAG_MIN              NUMBER,
UPDATE_TS                  TIMESTAMP(6),
CONSTRAINT GG_STATUS_HB_PK PRIMARY KEY (SITE_ID) ENABLE
);
CREATE TABLE <schema>.GG_STATUS_HB_HIST
(
SITE_ID                    NUMBER(10),
DB_SCHEMA                  VARCHAR2(30),
CSN_TS                     TIMESTAMP(6),
CSN_NB                     NUMBER(18),
EXT_GROUP                  VARCHAR2(10),
EXT_TIME                   TIMESTAMP(6),
EXT_LAG                    NUMBER,
PMP_GROUP                  VARCHAR2(10),
PMP_TIME                   TIMESTAMP(6),
PMP_LAG                    NUMBER,
TGT_DB_NAME                VARCHAR2(15),
REP_GROUP                  VARCHAR2(10),
REP_TIME                   TIMESTAMP(6),
REP_LAG                    NUMBER,
TOTAL_LAG                  NUMBER,
TOTAL_LAG_MIN              NUMBER,
UPDATE_TS                  TIMESTAMP(6),
CONSTRAINT GG_STATUS_HB_HIST_PK PRIMARY KEY (SITE_ID, DB_SCHEMA, CSN_TS, CSN_NB, UPDATE_TS) ENABLE
);

In mapping these tables, I found it easier to create macros to handle the mappings.  Macros are especially handy since the SETTINGS table is in multiple schemas and they need to be mapped through existing GoldenGate processes.  For more on macros and passing parameters, I’ve written this post to highlight macros (here).

Examples of the macros that are used in this configuration are as follows:

Macro for Extract process:
This macro is used to capture and map the extract name, time the transaction was extracted, the timestamp of the commit and the System Change Number.

--Heartbeat Extract Macro Library - Version 1.01
--Initial write - 08-13-2014 - BLC
-- ***************************************
-- Common macros for this library
-- called internally
-- ***************************************
MACRO #hb_ext_details
BEGIN
EXT_GROUP = @GETENV("GGENVIRONMENT","GROUPNAME"), &
EXT_TIME = @DATE("YYYY-MM-DD HH:MI:SS.FFFFFF","JTS",@GETENV("JULIANTIMESTAMP")), &
CSN_TS = @GETENV("GGHEADER","COMMITTIMESTAMP"), &
TRANS_CSN = @GETENV("TRANSACTION","CSN")
END;
-- *****************************************
-- Application Specific Macros for Heartbeat
-- *****************************************
MACRO #opshb_info
BEGIN
       #hb_ext_details()
END;

Macro for the Pump process:
The macro captures the pump name and time of transaction passed through the pump.

--Heartbeat Pump Macro Library - Version 1.01
--Initial write - 08-13-2014 - BLC
-- ***************************************
-- Common macros for this library
-- called internally
-- ***************************************

MACRO #hb_pmp_details
BEGIN
PMP_GROUP=@GETENV("GGENVIRONMENT","GROUPNAME"),
PMP_TIME=@DATE("YYYY-MM-DD HH:MI:SS.FFFFFF","JTS",@GETENV("JULIANTIMESTAMP"))
END;

-- *****************************************
-- Application Specific Macros for Heartbeat
-- *****************************************

MACRO #opshb_info
BEGIN
        #hb_pmp_details()
END;

Macro for the Replicat Process:
This macro does the mapping of all the tokens that have been passed from the other GoldenGate processes to the target tables.

--Heartbeat Replicat Macro Library - Version 1.01
--Initial write - 08-13-2014 - BLC
-- ***************************************
-- Common macros for this library
-- called internally
-- ***************************************
MACRO #opshb_rep_details
BEGIN
SITE_ID=@TOKEN("NHIN_STORE_ID"), &
DB_SCHEMA=@TOKEN("SRC_DB_SCHEMA"), &
HOST_NAME=@GETENV("GGFILEHEADER", "HOSTNAME"), &
CSN_TS=@TOKEN("SRC_CSN_TS"), &
CSN_NB=@TOKEN("SRC_TRANS_CSN"), &
EXT_NAME=@TOKEN("EXT_GROUP"), &
EXT_TIME=@TOKEN("EXT_TIME"), &
PMP_GROUP=@TOKEN("PMP_GROUP"), &
PMP_TIME=@TOKEN("PMP_TIME"), &
TGT_DB_NAME=@GETENV("DBENVIRONMENT","DBNAME"), &
REP_GROUP=@GETENV ("GGENVIRONMENT", "GROUPNAME"), &
REP_TIME=@DATE("YYYY-MM-DD HH:MI:SS.FFFFFF","JTS",@GETENV ("JULIANTIMESTAMP")), &
UPDATE_TS=@DATENOW()
END;
-- *****************************************
-- Application Specific Macros for Heartbeat
-- *****************************************
MACRO #hb_info
PARAMS (#src_schema)
BEGIN
MAP #src_schema.SETTINGS,TARGET <target schema>.GG_STATUS_HB, &
KEYCOLS (SITE_ID), &
INSERTMISSINGUPDATES, &
COLMAP ( #opshb_rep_details() );
MAP #src_schema.SETTINGS,TARGET <target schema>.GG_STATUS_HB_HIST, &
KEYCOLS (SITE_ID, DB_SCHEMA, CSN_TS, CSN_NB, UPDATE_TS), &
INSERTALLRECORDS, HANDLECOLLISIONS, &
COLMAP ( #opshb_rep_details() );
END;

For each of the target heartbeat tables, a trigger is needed to calculate the lag and timestamp information.  The triggers that I used/created are based on the Oracle-supplied examples in note 1299679.1.  To keep this blog somewhat short, I'm just going to include one sample of the triggers used.

Trigger Example:


CREATE OR REPLACE TRIGGER <target schema>.GG_STATUS_HB_TRIG
BEFORE INSERT OR UPDATE ON <target schema>.GG_STATUS_HB
FOR EACH ROW
BEGIN
SELECT TRUNC(TO_NUMBER(SUBSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS),1, INSTR(:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS,' ')))) * 86400
+ TO_NUMBER(SUBSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS), INSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS),' ')+1,2)) * 3600
+ TO_NUMBER(SUBSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS), INSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS),' ')+4,2) ) * 60
+ TO_NUMBER(SUBSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS), INSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS),' ')+7,2))
+ TO_NUMBER(SUBSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS), INSTR((:NEW.SRC_EXT_TIME - :NEW.SRC_CSN_TS),' ')+10,6)) / 1000000
INTO :NEW.EXT_LAG
FROM DUAL;
SELECT TRUNC(TO_NUMBER(SUBSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME),1, INSTR(:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME,' ')))) * 86400
+ TO_NUMBER(SUBSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME), INSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME),' ')+1,2)) * 3600
+ TO_NUMBER(SUBSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME), INSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME),' ')+4,2) ) * 60
+ TO_NUMBER(SUBSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME), INSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME),' ')+7,2))
+ TO_NUMBER(SUBSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME), INSTR((:NEW.SRC_PMP_TIME - :NEW.SRC_EXT_TIME),' ')+10,6)) / 1000000
INTO :NEW.PMP_LAG
FROM DUAL;
SELECT TRUNC(TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME),1, INSTR(:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME,' ')))) * 86400
+ TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME), INSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME),' ')+1,2)) * 3600
+ TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME), INSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME),' ')+4,2) ) * 60
+ TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME), INSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME),' ')+7,2))
+ TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME), INSTR((:NEW.TGT_REP_TIME - :NEW.SRC_PMP_TIME),' ')+10,6)) / 1000000
INTO :NEW.REP_LAG
FROM DUAL;
SELECT TRUNC(TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS),1, INSTR(:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS,' ')))) * 86400
+ TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS), INSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS),' ')+1,2)) * 3600
+ TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS), INSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS),' ')+4,2) ) * 60
+ TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS), INSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS),' ')+7,2))
+ TO_NUMBER(SUBSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS), INSTR((:NEW.TGT_REP_TIME - :NEW.SRC_CSN_TS),' ')+10,6)) / 1000000
INTO :NEW.TOTAL_LAG
FROM DUAL;
SELECT round((:NEW.TOTAL_LAG/60),1) INTO :NEW.TOTAL_LAG_MIN FROM DUAL;
SELECT SYSTIMESTAMP INTO :NEW.UPDATE_TS FROM DUAL;
END;
/

After all these items are in place, the next thing to do is update the parameter files.  As with any macros, the parameter files have to reference the macro library via an INCLUDE statement.  Then the mappings for the SETTINGS table have to be added to each parameter file before restarting the process; a sketch of what the extract side might look like is shown below.  More information on macros can be found here.
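
As a hedged example of what that looks like on the extract side (the macro library path is an assumption, and <schema> is the same placeholder used throughout this post):

-- In the extract parameter file
INCLUDE ./dirprm/hb_extract_macro.mac

TABLE <schema>.SETTINGS, TOKENS ( #opshb_info() );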

The biggest change that has to be made to existing processes comes in the pump process.  The pump has to be configured to be a PASSTHRU for all tables except the SETTINGS table.  In order to assign tokens to the SETTINGS table, the pump needs to be put in NOPASSTHRU mode.  In order to do this, the pump parameter file needs to be updated with something similar to this approach:

NODYNAMICRESOLUTION
WILDCARDRESOLVE IMMEDIATE
NOPASSTHRU
TABLE <schema>.SETTINGS, TOKENS ( #opshb_info() );

DYNAMICRESOLUTION
WILDCARDRESOLVE DYNAMIC
PASSTHRU
TABLEEXCLUDE <schema>.SETTINGS;

TABLE <schema>.*;

With everything in place and the processes restarted, the crontab job should be updating the heartbeat on a scheduled interval.  Reviewing the GG_STATUS_HB table gives you a quick view of the application heartbeat, and looking at the GG_STATUS_HB_HIST table gives you a sense of how much lag is occurring within your GoldenGate environment over time, without the additional overhead of a separate heartbeat process.
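
A quick way to eyeball the heartbeat is a simple query against the status table; this is just a sketch using the columns from the DDL above, with <target schema> as the same placeholder used throughout:

SELECT site_id, ext_lag, pmp_lag, rep_lag, total_lag_min, update_ts
FROM <target schema>.GG_STATUS_HB
ORDER BY site_id;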

Enjoy!

about.me: http://about.me/dbasolved

Filed under: Golden Gate
Categories: DBA Blogs

Don’t go directly to Maximum Protection!

The Oracle Instructor - Mon, 2014-08-25 04:14

With a Data Guard Configuration in Maximum Performance protection mode, don’t go to Maximum Protection directly, because that leads to a restart of the primary database:

 Attention!

DGMGRL> show configuration;

Configuration - myconf

  Protection Mode: MaxPerformance
  Databases:
  prima  - Primary database
    physt  - Physical standby database
      physt2 - Physical standby database (receiving current redo)

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

DGMGRL> edit configuration set protection mode as maxprotection;
Operation requires shutdown of instance "prima" on database "prima"
Shutting down instance "prima"...
Database closed.
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "prima" on database "prima"
Starting instance "prima"...
ORACLE instance started.
Database mounted.
Database opened.

Instead, go to Maximum Availability first and then to Maximum Protection:

DGMGRL> edit configuration set protection mode as maxperformance;
Succeeded.
DGMGRL> edit configuration set protection mode as maxavailability;
Succeeded.
DGMGRL> edit configuration set protection mode as maxprotection;
Succeeded.

The demo was done with 12c, involving a cascading standby database, but the behavior is the same in 11g already. The odd thing about it is that DGMGRL will restart the primary without warning. I have wanted to share that with the Oracle community for years but never got around to it somehow.


Tagged: Data Guard, High Availability
Categories: DBA Blogs

ASM Commands : 2 -- Migrating a DiskGroup to New Disk(s)

Hemant K Chitale - Sun, 2014-08-24 08:52
In 11gR2 Grid Infrastructure and RAC

After the previous demonstration of adding a new DiskGroup, I now demonstrate migrating the DiskGroup to a new pair of disks.

First, I create a table in the Tablespace on that DiskGroup.

[root@node1 ~]# su - oracle
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sun Aug 24 22:17:28 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: hemant/hemant

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> create table new_tbs_tbl
2 tablespace new_tbs
3 as select * from dba_objects
4 /

Table created.

SQL> select segment_name, bytes/1048576
2 from user_segments
3 where tablespace_name = 'NEW_TBS'
4 /

SEGMENT_NAME
--------------------------------------------------------------------------------
BYTES/1048576
-------------
NEW_TBS_TBL
9


SQL> select file_name, bytes/1048576
2 from dba_data_files
3 where tablespace_name = 'NEW_TBS'
4 /

FILE_NAME
--------------------------------------------------------------------------------
BYTES/1048576
-------------
+DATA3/racdb/datafile/new_tbs.256.855792859
100


SQL>


Next, I verify that the DiskGroup is currently on disk asmdisk.7 and that the two new disks that I plan to migrate the DiskGroup to are available as asmdisk.8 and asmdisk.9  (yes, unfortunately, they are on /fra, instead of /data1 or /data2 because I have run out of disk space in /data1 and /data2).
This I do from node1 :

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
-sh-3.2$ exit
logout

[root@node1 ~]#
[root@node1 ~]# su - grid
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sun Aug 24 22:22:32 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: / as sysasm

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> l
1 select d.name, d.path
2 from v$asm_disk d, v$asm_diskgroup g
3 where d.group_number=g.group_number
4* and g.name = 'DATA3'
SQL> /

NAME
------------------------------
PATH
--------------------------------------------------------------------------------
DATA3_0000
/data1/asmdisk.7


SQL>
SQL> !sh
sh-3.2$ ls -l /fra/asmdisk*
-rwxrwxr-x 1 grid oinstall 1024000000 Aug 24 22:06 /fra/asmdisk.8
-rwxrwxr-x 1 grid oinstall 1024000000 Aug 24 22:07 /fra/asmdisk.9
sh-3.2$


Note how the ownership and permissions are set for the two new disks (see my previous post).

I now add the two new disks.

sh-3.2$ exit
exit

SQL> show parameter power

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
asm_power_limit integer 1
SQL> alter diskgroup data3 add disk '/fra/asmdisk.8', '/fra/asmdisk.9';

Diskgroup altered.

SQL>
SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
EST_MINUTES ERROR_CODE
----------- --------------------------------------------
3 REBAL RUN 1 1 1 101 60
1


SQL>


With ASM_POWER_LIMIT set to 1, Oracle ASM automatically starts a REBALANCE operation.  However, since I did *not* drop the existing asmdisk.7, Oracle will still continue to use it.

After a while, I confirm that the REBALANCE has completed.  I can now drop asmdisk.7.  Unfortunately, this will trigger a new REBALANCE !

SQL> l
1* select * from v$asm_operation
SQL> /

no rows selected

SQL>
SQL> l
1 select d.name, d.path
2 from v$asm_disk d, v$asm_diskgroup g
3 where d.group_number=g.group_number
4* and g.name = 'DATA3'
SQL> /

NAME
------------------------------
PATH
--------------------------------------------------------------------------------
DATA3_0000
/data1/asmdisk.7

DATA3_0002
/fra/asmdisk.9

DATA3_0001
/fra/asmdisk.8


SQL>
SQL> alter diskgroup data3 drop disk '/data1/asmdisk.7';
alter diskgroup data3 drop disk '/data1/asmdisk.7'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15054: disk "/DATA1/ASMDISK.7" does not exist in diskgroup "DATA3"


SQL> alter diskgroup data3 drop disk 'DATA3_0000';

Diskgroup altered.

SQL>
SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
EST_MINUTES ERROR_CODE
----------- --------------------------------------------
3 REBAL RUN 1 1 2 102 120
0


SQL>
SQL> l
1* select * from v$asm_operation
SQL>
SQL> /

GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
EST_MINUTES ERROR_CODE
----------- --------------------------------------------
3 REBAL RUN 1 1 47 101 95
0


SQL> /

no rows selected

SQL>


NOTE : Note how I must specify the Disk NAME (not the PATH) for the DROP. When I added disks asmdisk.8 and asmdisk.9, I could have given them meaningful names as well. Oracle has automatically named them.

Ideally, what I should have done is use the ADD and DROP in a single command.  That way, only a single REBALANCE pass would have been required (see the sketch below).
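
A hedged sketch of what that combined statement might have looked like, using the disk names from this demo (the REBALANCE POWER clause is optional):

alter diskgroup data3
  add disk '/fra/asmdisk.8', '/fra/asmdisk.9'
  drop disk DATA3_0000
  rebalance power 1;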

After a while, I run my validation queries on node2.


[root@node2 ~]# su - grid
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sun Aug 24 22:42:39 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: / as sysasm

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> select d.name, d.path
from v$asm_disk d, v$asm_diskgroup g
where d.group_number=g.group_number
and g.name = 'DATA3' 2 3 4
5
SQL> l
1 select d.name, d.path
2 from v$asm_disk d, v$asm_diskgroup g
3 where d.group_number=g.group_number
4* and g.name = 'DATA3'
SQL> /

NAME
------------------------------
PATH
--------------------------------------------------------------------------------
DATA3_0002
/fra/asmdisk.9

DATA3_0001
/fra/asmdisk.8


SQL>
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options
-sh-3.2$ exit
logout

[root@node2 ~]# su - oracle
-sh-3.2$
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sun Aug 24 22:44:10 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: hemant/hemant

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select count(*) from new_tbs_tbl;

COUNT(*)
----------
72460

SQL>


I have now accessed the table, tablespace, diskgroup and disks from node2 successfully. Disk asmdisk.7 is no longer part of the DiskGroup.

I can physically remove disk asmdisk7 from the storage.


SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options
-sh-3.2$ exit
logout

[root@node1 ~]# cd /data1
[root@node1 data1]# ls -l asmdisk.7
-rwxrwxr-x 1 grid oinstall 2048000000 Aug 24 22:39 asmdisk.7
[root@node1 data1]# rm asmdisk.7
rm: remove regular file `asmdisk.7'? y
[root@node1 data1]#
[root@node1 data1]# su - grid
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sun Aug 24 22:50:18 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: / as sysasm

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> set pages 60
SQL> col name format a15
SQL> col path format a20
SQL> select group_number, name, path
2 from v$asm_disk
3 order by 1,2;

GROUP_NUMBER NAME PATH
------------ --------------- --------------------
0 /crs/voting.disk
0 /data1/votedisk.1
0 /data2/votedisk.2
0 /fra/votedisk.3
1 DATA1_0000 /data1/asmdisk.1
1 DATA1_0001 /data2/asmdisk.4
2 DATA2_0000 /data1/asmdisk.2
2 DATA2_0001 /data2/asmdisk.5
2 DATA2_0002 /data2/asmdisk.6
3 DATA3_0001 /fra/asmdisk.8
3 DATA3_0002 /fra/asmdisk.9
4 DATA_0000 /crs/ocr.configurati
on

5 FRA_0000 /fra/fradisk.3
5 FRA_0001 /fra/fradisk.2
5 FRA_0002 /fra/fradisk.1
5 FRA_0003 /fra/fradisk.4

16 rows selected.

SQL>

The disk asmdisk.7 is no longer part of the storage. (Remember : All my disks here are on NFS).
.
.
.
Categories: DBA Blogs

Commit Puzzle

Bobby Durrett's DBA Blog - Fri, 2014-08-22 11:54

This graph represents commit time compared to CPU utilization and redo log write time.  I’ve included only the hourly intervals with more than 1,000,000 commits.  At these peaks the number of commits ranges from 1 to 1.6 million commits per hour, so each point on the graph represents roughly the same commit rate.  I’m puzzled by why the commit time bounces around, peaking above 5 milliseconds, when I can’t see any peaks in I/O or CPU that correspond to the commit time peaks.

[Chart: commit time compared to CPU utilization and redo log write time]

I derived CPU% from DBA_HIST_OSSTAT.  I got the other values by getting wait events from DBA_HIST_SYSTEM_EVENT.  Commit time is log file sync wait time.  Redo write time is log file parallel write wait time.  I converted the wait times to milliseconds so they fit nicely on the chart with CPU%.

I thought I would pass this along as a puzzle that I haven’t figured out.

Here is a zip of the script I used to get the data, its raw output, and the spreadsheet I used to make the chart: zip

– Bobby

P.S.  This is on HP-UX 11.31, Itanium, Oracle 11.2.0.3

P.P.S  Did some more work on this today.  Looks like the high commit time periods have short spikes of long redo log writes even though the average over the hour is still low.  I’m looking at DBA_HIST_SYSTEM_EVENT to get a histogram of the log file parallel write waits and there are a number in the 1024 bucket when the log file sync time is high on average.

END_INTERVAL_TIME   LFPW_MILLI LFPW_COUNT AVG_COMMIT_MS AVG_WRITE_MS
------------------- ---------- ---------- ------------- ------------
21-AUG-14 11.00 AM           1     268136    9.14914833   2.45438987
21-AUG-14 11.00 AM           2     453913    9.14914833   2.45438987
21-AUG-14 11.00 AM           4     168370    9.14914833   2.45438987
21-AUG-14 11.00 AM           8      24436    9.14914833   2.45438987
21-AUG-14 11.00 AM          16       5675    9.14914833   2.45438987
21-AUG-14 11.00 AM          32       6122    9.14914833   2.45438987
21-AUG-14 11.00 AM          64       3369    9.14914833   2.45438987
21-AUG-14 11.00 AM         128       2198    9.14914833   2.45438987
21-AUG-14 11.00 AM         256       1009    9.14914833   2.45438987
21-AUG-14 11.00 AM         512        236    9.14914833   2.45438987
21-AUG-14 11.00 AM        1024         19    9.14914833   2.45438987
21-AUG-14 11.00 AM        2048          0    9.14914833   2.45438987
21-AUG-14 02.00 PM           1     522165    2.97787777   1.64840599
21-AUG-14 02.00 PM           2     462917    2.97787777   1.64840599
21-AUG-14 02.00 PM           4     142612    2.97787777   1.64840599
21-AUG-14 02.00 PM           8      17014    2.97787777   1.64840599
21-AUG-14 02.00 PM          16       4656    2.97787777   1.64840599
21-AUG-14 02.00 PM          32       5241    2.97787777   1.64840599
21-AUG-14 02.00 PM          64       1882    2.97787777   1.64840599
21-AUG-14 02.00 PM         128        820    2.97787777   1.64840599
21-AUG-14 02.00 PM         256        149    2.97787777   1.64840599
21-AUG-14 02.00 PM         512         10    2.97787777   1.64840599
21-AUG-14 02.00 PM        1024          2    2.97787777   1.64840599
21-AUG-14 02.00 PM        2048          0    2.97787777   1.64840599

There were 19 waits over half a second in the first hour and only 2 in the second hour.  Maybe all the log file sync waits pile up waiting for those long writes.  Here is a graph that compares the number of waits over half a second – the 1024 ms bucket – to the average log file sync and log file parallel write times for the hour:

[Chart: count of redo writes over half a second vs. average log file sync and log file parallel write times]

You can see that the average redo write time goes up a little but the commit time goes up more.  Maybe commit time is more affected by a few long spikes than by a lot of slightly longer write times.

Found a cool blog post that seems to explain exactly what we are seeing: blog post

 

Categories: DBA Blogs

Log Buffer #385, A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2014-08-22 08:00

This Log Buffer edition combs through top notch blog posts from Oracle, MySQL and SQL Server postings around the globe.

Oracle:

You want to test the Oracle Java Cloud? You can get your own 30 day test account & instance. Or you can get a Java account with our permanent test system.

Some ramblings about Oracle R Distribution 3.1.1.

Scott is demystifying Oracle Unpivot.

Java 8 for Tablets, Pis, and Legos at Silicon Valley JUG – 8/20/2014

A new version of Oracle BPM Suite 11.1.1.7 with Adaptive Case Management (ACM) is now available.

SQL Server:

Data Mining: Part 14 Export DMX results with Integration Services

Should you be planning to move from Exchange to Office 365? If so, why?

Stairway to T-SQL DML Level 12: Using the MERGE Statement

From SQL Server Management Studio it’s hard to look through the first few rows of a whole lot of tables in a database.

Special Characters can lead to many problems. Identifying and reporting on them can save a lot of headache down the road.

MySQL:

MariaDB Galera Cluster 5.5.39 now available

A closer look at the MySQL ibdata1 disk space issue and big tables

How-To: Guide to Database Migration from MS Access using MySQL Workbench

Using resource monitoring to avoid user service overload

How to use MySQL Global Transaction IDs (GTIDs) in production

Categories: DBA Blogs

Partner Webcast - Oracle Data Integration Competency Center (DICC): A Niche Market for services

Market success now depends on data integration speed. This is why we collected all best practices from the most advanced IT leaders, simply to prove that a Data Integration competency center should...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Microsoft PowerBI: News from WPC

Pythian Group - Wed, 2014-08-20 14:24

During the Worldwide Partner Conference (WPC) that happened last month, Microsoft made public some of the new functionalities that we can expect to see soon on Microsoft PowerBI.

If you were on another planet in the last few months and didn't hear about PowerBI for Office 365, I've included a list of the tools that comprise the Microsoft BI stack in the cloud. Keep in mind that they are different tools and not a single product, each tool serving a different purpose. At the time of writing, the tools that comprise the entire solution are:

  • Power Pivot
  • Power View
  • Power Query
  • Power Map
  • Power Q&A
  • Power BI Windows Store App

Some of those have been available for a long time as Excel add-ins or built into the Excel product and have now been repackaged in this cloud solution, while others, like Power Q&A, are cloud-only.

So, what is the big news from WPC 2014? During the conference we watched a demonstration of what to expect in the coming months, so today I'll discuss the announcements I believe were the most important. If you want to see the entire presentation you can find it here.

[Screenshot from the WPC demo: new Power BI visualizations]

New visualizations!

One of the key issues we were facing with the current version of Power BI was the lack of some visualizations customers always ask for, the main one being the gauge. It is incredible how popular those little gauges have become with time – decision makers love them, so it's really good to see they are finally available in Power BI.

Besides the gauge, we can see in the image above, taken from the WPC demonstration, other data visualizations like the radar chart and the treemap.

Edition capabilities in the browser

Another important announcement was the ability to edit the dashboard and change the data visualizations in the browser, without using Excel. It doesn't seem like much, but this can be very important, and it indicates a shift in Microsoft's behaviour. If you look at all the Power BI functionalities, and in fact at the entire BI stack, Excel was always the central tool; you needed Excel for basically everything. Now we are seeing some nice features that you can manage and control inside the browser. Let's wait for the next steps.

It is important to mention that everything is HTML5 instead of Silverlight, meaning we can have the same experience on basically any device.

Partner Solution Pack

If I were asked to name a single announcement that could drastically change the market, it would be this one.

The Partner Solution Pack gives Microsoft partners the ability to create a bundled BI solution including the data, connectivity to the data sources, and all the interactive reports. That means that, as users, we could buy a solution pack from Salesforce, for instance, connect it to our Salesforce account, and it would automatically create a fully interactive dashboard with our own data.

Now, imagine it the other way around: you, as a Microsoft Partner, now have the ability to create a complete BI solution for your customers and make it available in the cloud. Your customers can buy this package, connect it to their data, and make use of the solution in a matter of seconds.

In my opinion the Partner Solution Pack will create a huge market for all Microsoft partners and provide us, the users, with tons of good packages and the ability to have a BI solution for much less than it would cost to create everything from scratch.

[Screenshots: Partner Solution Pack dashboards]

But you may tell me that we have other tools in the market that can do this (connect to partner applications and build a dashboard), so what would be the advantage of using Power BI over the existing tools?

The biggest advantage is PowerBI Q&A. As you can see in the screenshot, on every screen we have a simple search box at the top of the page, which allows the user to ask simple natural-language questions to query the data. So, if the user wants to know the “opportunity size by month and by industry”, all you have to do is ask, and PowerBI will find the data and choose the best visualization method for you. After that, you can just pin this report to the dashboard and that's it; now you can keep track of this important information on a daily basis. All without requiring a single line of code, and without filing a change request with the IT department and joining a huge queue of requests that would take months to be addressed.

[Screenshot: Power BI Q&A search box and pinned report]

I hope that in this article I was able to show you the potential these new functionalities can bring to your company. If you need more information about PowerBI, or if you're as excited as I am and want to start using it right away, just contact us and our team will be glad to work with your company to either develop a BI solution that consumes your data, or plan the development of your own Partner Solution Pack so you can offer your customers a complete BI solution using PowerBI for Office 365.

Click here to watch the entire WPC announcement and see the PowerBI reports in action. The PowerBI demonstration starts at 21:10 minutes.

 

Categories: DBA Blogs

Commit scalability

Bobby Durrett's DBA Blog - Wed, 2014-08-20 13:57

I am learning about how well commits scale on Oracle 11.2 and so far they seem to scale surprisingly well.

I’ve looked at two waits – log file parallel write and log file sync.  Based on documents I’ve read on Oracle’s support site log file parallel write represents the time it takes to do one write to the redo logs.  For mirrored redo logs the log file parallel write time includes the time to write to both of the copies.  Log file sync represents the time it takes for a session to complete a commit and should include all the time measured for the write to the redo logs and added CPU time to process the commit.  So, the log file sync time should equal or exceed the log file parallel write time.
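
This is not from an AWR report, but a quick, hedged way to eyeball the same two events on a running instance is a simple query like the following (cumulative values since instance startup):

select event, total_waits,
       round(time_waited_micro/1000/nullif(total_waits,0),3) avg_wait_ms
from v$system_event
where event in ('log file sync','log file parallel write');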

Looking at AWR data I found that at peak times one of our databases had 1 millisecond log file parallel write waits and about 1.2 million waits per hour.  Since there are 3.6 million milliseconds in an hour it seemed to me that during this peak hour the redo logs were about 33% utilized because writes to the redo logs were occurring during 1.2 million of the available 3.6 million milliseconds.  I decided to look at a simple queuing theory model that I had read about in Craig Shallahamer’s Forecasting Oracle Performance book to get a basic idea of how queuing might impact redo log write time as the utilization of the redo log grew closer to 100%.

[Chart: queueing model prediction of redo log write time vs. writes per hour]

This model predicts that the redo log write time will go through the roof as the number of writes per hour approaches 3.6 million, assuming a constant 1 millisecond write time.
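
The post does not show the model's formula, but a basic single-server queueing approximation, response time = service time / (1 - utilization), reproduces the shape of the curve. Here is a purely illustrative sketch assuming a 1 millisecond service time and utilization = writes per hour / 3,600,000:

select writes_per_hour,
       round(1/(1 - writes_per_hour/3600000),2) predicted_write_ms
from (select level*300000 writes_per_hour
      from dual connect by level <= 11);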

To attempt to confirm the predictions made by this graph I decided to build some scripts that will run a bunch of commits on a test database and attempt to max out the writes to the redo log so I could graph the results and compare it to the theoretical model.  In the test I had twenty tables named TEST1, TEST2,…, TEST20.  These tables have one row and one column.  I ran a test of 1, 2, 5, 10, and 20 updating processes that I designed to generate a bunch of commits quickly.  Each process ran 100,000 updates and commits like these:

update test1 set a=a+1;
commit;
update test1 set a=a+1;
commit;
update test1 set a=a+1;
commit;

Each process had its own table – i.e. process 15 acted on table TEST15.

My hope was that as I ran tests with increasing numbers of processes running in parallel eventually I would max out the writes to the redo log and see increasing log file parallel write wait times.  But, surprisingly, as the commit rate increased the redo log write rate actually went down.

[Chart: redo log writes per hour vs. commits per hour]

This is just one test, but it makes me wonder if I can max out the writes to the redo log.  I believe that as the commit rate increases the database batches the commits together in some efficient way which makes commits more scalable than I realized.

I think that an Oracle database must have some limit to commit scalability that relates more to the CPU used to process the commits instead of the writes to the redo logs.  In these same tests the log file sync or commit time did increase slowly as the number of commits ramped up.

[Chart: average commit time (log file sync) vs. commits per hour]

It started around half a millisecond at 3 million commits per hour and slowly grew to almost 2 milliseconds at 10 million commits per hour.  So, commit time grew, but nothing like the original response time graph which went through the roof at around 3.6 million writes to the redo logs per hour.

Here is a zip of the scripts I used to generate the data, the spreadsheets I used to make the charts, and the results from the tests: zip

To run the tests yourself enter your own username and password at the top of setup.sql, commits.sql, and report.sql.  Run one of the five reports – run1.sh, run2.sh, run5.sh, run10.sh or run20.sh like this:

./run5.sh

Wait for the last process to finish outputting then run the report:

./report.sh

Results look like this:

WAIT_EVENT              ELAPSED_SECS WAITS_PER_HOUR WAITS_PER_SECOND AVG_WAIT_MS AVG_WAIT_SEC
----------------------- ------------ -------------- ---------------- ----------- ------------
log file parallel write          542     3008922.51       835.811808  .648076577   .000648077
log file sync                    542     5306207.38       1473.94649   1.1727226   .001172723

I’m really just getting started understanding how commits scale, but it was surprising to me how hard it was to get the rate of redo log writes high enough to cause the write time to increase due to queuing.  I assume this is because the database batches commits together more efficiently than I expected, which makes commits more scalable than I realized.

– Bobby

P.S.  This is on HP-UX 11.31, Itanium, Oracle 11.2.0.3.0

Categories: DBA Blogs

SSAS Database Doesn’t Show Up in SharePoint 2013 Dashboard Designer

Pythian Group - Wed, 2014-08-20 07:43

Howdy everyone,

Just a quick tip for everyone that is struggling to configure SharePoint Server 2013 PerformancePoint  to connect to a SQL Analysis Services 2012 or 2014 cube.

After a new SharePoint Server 2013 installation, I tried to create a new connection to my Analysis Services cube through the SharePoint Dashboard Designer, but no matter what, the Database option always showed up as empty and I could not select my Analysis Services database.

[Screenshot: Dashboard Designer with an empty Database drop-down]

 

In the Windows Server event log I could find the following message:

The data source provider for data sources of type ‘ADOMD.NET’ is not registered. Please contact an administrator.

PerformancePoint Services error code 10115.

The reason you receive this error message, believe it or not, is that even if you are using SQL Server 2012 or newer with SharePoint Server 2013, it will try to load the SQL Server 2008 version of the ADOMD.NET DLL.

If you install the SQL Server 2008 R2 ADOMD.NET component, which you can download from the following location: http://www.microsoft.com/en-us/download/details.aspx?id=16978 , and restart IIS, you will fix this issue and will be able to successfully connect to your SQL Server 2012/2014 Analysis Services database.

Hope this helps.

 

Categories: DBA Blogs

SQL Server Error: 18056, Severity: 20, State: 29

Pythian Group - Wed, 2014-08-20 07:40

Howdy everyone,

One of the most frequent error messages I come across when dealing with SQL Server, particularly SQL Server 2008 R2, is Error 18056, Severity: 20, State: 29. I constantly see this error message not only in the SQL logs, but also in the Microsoft community forums. I often see inquiries like, “I have SQL Server 2008 R2 SP2 installed and I see the error below. I have the most recent Service Pack installed, but the problem remains.”

Message
Error: 18056, Severity: 20, State: 29.
Message
The client was unable to reuse a session with SPID XXXX, which had been reset for connection pooling. The failure ID is 29. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.

Now, this error message can be triggered by many things: for instance, lack of user permissions in a database, or in the worst case, lack of resources on the server that makes it stop accepting new connection requests. However, in any of those cases you would probably have a lot of angry users calling you and complaining that they can't do their work. If that's not the case, you are probably just seeing a generic, internal error message that shouldn't be in the SQL error log in the first place.

The first step in addressing this issue, or simply investigating whether it's something more serious, is to update to the latest Cumulative Update.

Microsoft released a fix that makes SQL Server stop logging this generic error message in some cases, when it’s not important for you to investigate the issue. This fix is documented in the following Knowledge Base article: http://support.microsoft.com/kb/2543687?wa=wsignin1.0

I hope this helps you get rid of this error message.

 

Categories: DBA Blogs

On the Road with Laerte

Pythian Group - Mon, 2014-08-18 09:21

For the month of October, Microsoft PowerShell MVP, Laerte Junior will be touring Brazil and Europe for various SQL Server-related speaking engagements.

“Thankfully, I am working at a company that fully supports their employees to speak and participate in community events.” Laerte says. “I can travel to Europe for 5 SQL Server conferences, and then go to the USA to attend the MVP Global Summit and SQL PASS Summit.”

While most European speaking sessions have been confirmed, we’ll be updating the schedule as the topics become available. You can follow Laerte on his personal blog at shellyourexperience.com.

Date | Location | Event | Topic
September 27, 2014 | São Paulo, Brazil | SQL Saturday #325 | Criando suas próprias soluções usando PowerShell (see speaking schedule)
October 1, 2014 | Schelle, Belgium | SQL Server Days | Mastering PowerShell for SQL Server (see speaking schedule)
October 2, 2014 | Utrecht, Holland | SQL Saturday #336 | Full-day pre-conference training session: Mastering PowerShell for SQL Server
October 3/4, 2014 | Utrecht, Holland | SQL Saturday #336 | TBD (see speaking schedule)
October 11, 2014 | Sofia, Bulgaria | SQL Saturday #311 | Writing Your Solutions Using PowerShell (see speaking schedule)
October 18, 2014 | Oporto, Portugal | SQL Saturday #341 | Criando suas próprias soluções usando PowerShell (see speaking schedule)
October 24, 2014 | Barcelona, Spain | SQL Saturday #338 | TBD (see speaking schedule)

Will you be attending any of these sessions? If so, which ones?

Categories: DBA Blogs

What Is Oracle Elapsed Time And Wall Time With A Parallelism Twist


In this post I'm focusing on Oracle Database SQL elapsed time, adding parallelism into the mix and then revisiting wall time. What initially seems simple can take some very interesting twists!

If you are into tuning Oracle Database systems, you care about time. And if you care about time, then you need to understand the most important time parameters: what they are, their differences, how they relate to each other and how to use them in your performance tuning work.

A couple weeks ago I wrote about Oracle DB Time, non-idle wait time, and server process CPU consumption (DB CPU) time. If you have not read that posting, HERE is the link. It must be a good read because it quickly became my most viewed post ever! In this posting, the focus is SQL elapsed time, parallelism, and again wall time. Enjoy!

Quick Review
In my previous related post, I covered non-idle wait time, DB CPU, and DB Time. Here is a very quick summary of each.

Non-Idle Wait Time occurs when an Oracle process is not consuming CPU, the session pauses (i.e., waits) and Oracle considers the wait time important for performance tuning. An example of a non-idle wait event is direct path read temp. An example of an idle wait event is SQL*Net message from client or pmon timer.

DB CPU is Oracle server/foreground/shadow process CPU consumption. This does not include Oracle background process CPU consumption.

DB Time is DB CPU plus Non-Idle Wait Time. Remember that DB Time does not include background process CPU consumption and Oracle Corporation determines which wait events are considered idle.

Elapsed Time
Elapsed Time (ET) is all DB Time related to a defined task. A "defined task" could be a SQL statement, group of SQL statements, pl/sql procedure, batch job, etc. It is whatever makes sense in your tuning situation.

The elapsed time for a SQL_ID can be found in v$sql. But be careful because this elapsed time is related to "all" the SQL_ID executions. Thankfully, there is an "executions" column in v$sql.
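
A minimal sketch of pulling this straight from v$sql (this is not one of the OSM scripts mentioned later in this post): the ELAPSED_TIME column is in microseconds, and a SQL_ID can have multiple child cursors, so they are summed here.

select sql_id,
       sum(executions) executions,
       round(sum(elapsed_time)/1000000,1) total_elapsed_sec,
       round(sum(elapsed_time)/1000000/nullif(sum(executions),0),3) avg_elapsed_sec_per_exec
from v$sql
where sql_id = '&sql_id'
group by sql_id;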


Elapsed time is displayed in a number of areas within an Oracle Database AWR and Statspack report. Looking at the above screen shot, the "top" elapsed time SQL has an elapsed time of 268561 seconds. This means that over the AWR report's snapshot interval, for all this SQL's executions, its total DB Time is 268561 seconds. Said another way, if we were to add up all this SQL's DB CPU and non-idle wait time for all its executions within the snapshot interval, the value should be 268561.

There is a lot of great information provided in the AWR and Statspack SQL reports. For example, because the elapsed time and the CPU time (DB CPU) is shown above, we can calculate the non-idle wait time for the "top" elapsed time SQL ID.

non idle wait time = elapsed time - cpu time
268465 = 268561 - 96

For the "top" elapsed SQL, its elapsed time is 268561 and its DB CPU is 96, therefore its non-idle wait time is 268465. Wow! This statement has tons of associated wait time compared to CPU consumption time.

But it gets even better! Because the total Elapsed Time and the total number of Executions over the snapshot interval are displayed, we can determine the average elapsed time!

average elapsed time = total elapsed time / executions
7460.03 = 268561 / 36

Do not be deceived! The average elapsed time is unlikely to be what the user is experiencing. Two possible causes of this deception are skewed elapsed times and parallelism.

For most DBAs this is unexpected. It also causes performance perception problems, yet solutions are available to understand what's really going on.

I've spent so much time researching this topic, and seen how it increases my consulting value, that I've posted a number of blog entries on the subject. Plus I created an OraPub Online Institute seminar focused specifically on this subject. It's called Using Skewed Performance Data To Your Advantage. Check it out. I'm really proud of how it turned out. I also have a couple of OSM scripts dedicated to this topic, sqlelget[11].sql.

Revisiting Wall Time With A Parallelism Twist
Now it's time to put this all together.

DB CPU is the Oracle server process CPU consumption.

Non-Idle Wait Time (NIWT) is the time when an Oracle process cannot consume CPU and must pause, and it is wait time we care about for tuning.

DB Time is the Oracle server process CPU consumption and all non-idle wait time.

Elapsed Time (ET) is the sum of (i.e., all) the DB Time related to a task, such as a SQL_ID.

Wall Time is what we hope the user experiences. I'll assume there is no time gap between Oracle and the user, therefore the wall time will equal the user's experience.

Effective Parallelism is the effective number of Oracle parallel slaves or some other form of parallelism, such as designed-in application parallelism. (For simplicity, I'm only going to mention Oracle parallel query.) If Oracle parallel query is not involved, then the effective parallelism is one. If two parallel query slaves are involved, then the effective parallelism will be a little less than 2.

Parallelism can reduce wall time because we can simultaneously "burn time" in multiple places. For example, 60 seconds of elapsed time with a process running serially results in a wall time of 60 seconds. But if we have two parallel query slaves, while the elapsed time (i.e., all the DB Time) is still 60 seconds (plus some overhead time), the wall time will be around 30 seconds (plus some overhead time).

The math is really simple...that is until you factor in scalability (i.e., the overhead), which I won't. If you're interested, read the last chapter of my book, Forecasting Oracle Performance.

Let's simplify this by using some mathematical notation.

DB Time = DB CPU + NIWT

Elapsed Time = Sum of DB Time

Wall Time = Elapsed Time / Effective Parallelism
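
To make the last formula a little more concrete, here is a rough sketch that estimates effective parallelism from v$sql (px_servers_executions divided by executions) and derives an approximate wall time per execution. Treat it as an approximation only; it ignores parallelization overhead and skew.

-- Approximate wall time per execution for statements that ran in parallel.
select sql_id,
       executions,
       px_servers_executions,
       round(elapsed_time/1000000/nullif(executions,0),1)            avg_elapsed_sec,
       round((elapsed_time/1000000/nullif(executions,0))
             / greatest(px_servers_executions
                        / nullif(executions,0), 1), 1)               approx_wall_sec
from   v$sql
where  px_servers_executions > 0
and    executions > 0;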

Pretty straightforward, eh? Below is a short video clip summarizing the key time parameters, taken from the OraPub Online Institute seminar, Tuning Oracle Using an AWR Report - Part 2. If you can't see the video, click HERE to watch it on YouTube.



Test Your Knowledge
True or False? If the total elapsed time is 60 seconds and parallel query is not involved, the total wall time will also be 60 seconds. True

True or False? If the elapsed time per execution is 60 seconds and the wall time is 30 seconds, then parallel query is involved. True

True or False? Bonus question yet very important to understand: If the elapsed time per execution is 60 seconds and two PQ slaves are involved, then the wall time will be 30 seconds.

The last question is false because there is overhead when parallelizing. Parallelism is not free. Because of this, the wall time will hopefully drop to perhaps 35 seconds. That 5 seconds is the parallelization overhead.

Coming Up Next: Video Proof!
While the above may seem correct, I ran some SQL statements and captured the relevant time statistics. There is quite a bit of detail and I ran two different tests, so I'll post that in a week or two.

Thanks for reading,

Craig.
Categories: DBA Blogs

Partner Webcast - Oracle SOA Suite 12c: Connect 4 Cloud, Mobile, IoT with On-premise

The pace of new business projects continues to grow from increasing customer self-service to seamlessly connecting all your back office and in-the-field applications. At the same time increased...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Closer look at the SOA 12c Feature: Oracle Managed File Transfer

The rapid growth of cloud-based applications in the enterprise, combined with organizations' desire to integrate applications with mobile technologies, is dramatically increasing application...

We share our skills to maximize your revenue!
Categories: DBA Blogs

ASM Commands : 1 -- Adding and Using a new DiskGroup for RAC

Hemant K Chitale - Sat, 2014-08-16 10:22
In 11gR2 Grid Infrastructure and RAC

On node1, I discover and add a disk to ASM.  NFS "devices" asmdisk.1 to asmdisk.6 are present as ASM Disks. asmdisk.7 has been added on NFS mount point /data1. (Disks asmdisk.3 to asmdisk.6 are on /data2)
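
Before creating anything on the new device, a quick way to confirm that ASM can actually see it is to query v$asm_disk from the ASM instance. A small sketch; disks that are visible through asm_diskstring but not yet in any disk group typically show a header_status of CANDIDATE (or FORMER if they were used before).

-- Disks visible to ASM that are not yet part of a disk group.
select path, header_status, os_mb
from   v$asm_disk
where  header_status in ('CANDIDATE','PROVISIONED','FORMER');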

I start on node1 in my Cluster

[root@node1 ~]# su - grid
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sat Aug 16 23:42:02 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: / as sysasm

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> show parameter asm_diskstring

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring string /crs/*, /data1/*, /data2/*, /f
ra/*
SQL> !ls -l /data1/asm*
-rwxrwxr-x 1 grid oinstall 2048000000 Aug 16 23:42 /data1/asmdisk.1
-rwxrwxr-x 1 grid oinstall 2048000000 Aug 16 23:42 /data1/asmdisk.2
-rw-r--r-- 1 grid oinstall 2048000000 Aug 16 23:33 /data1/asmdisk.7

SQL> create diskgroup DATA3 disk '/data1/asmdisk.7';
create diskgroup DATA3 disk '/data1/asmdisk.7'
*
ERROR at line 1:
ORA-15018: diskgroup cannot be created
ORA-15072: command requires at least 2 regular failure groups, discovered only
1


SQL> create diskgroup DATA3 external redundancy disk '/data1/asmdisk.7';

Diskgroup created.

SQL>
SQL> select group_number, name, total_mb
2 from v$asm_diskgroup
3 where name = 'DATA3'
4 /

GROUP_NUMBER NAME TOTAL_MB
------------ ------------------------------ ----------
5 DATA3 1953

SQL>

I now have a new DiskGroup using External Redundancy with a single disk.  Is it visible at node2 ?

[root@node2 ~]# su - grid
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sat Aug 16 23:47:45 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: / as sysasm

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> select group_number, name, total_mb
2 from v$asm_diskgroup
3 where name = 'DATA3'
4 /

GROUP_NUMBER NAME TOTAL_MB
------------ ------------------------------ ----------
0 DATA3 0

SQL>

Why is the size not visible yet ?  Because, although the CREATE from node1 also MOUNTed the Disk Group on node1, it hasn't been mounted on node2 yet.

SQL> alter diskgroup DATA3 mount;

Diskgroup altered.

SQL> select group_number, name, total_mb
2 from v$asm_diskgroup
3 where name = 'DATA3'
4 /

GROUP_NUMBER NAME TOTAL_MB
------------ ------------------------------ ----------
5 DATA3 1953

SQL>
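
As an aside, instead of connecting to each node in turn, a cluster-wide check via the GV$ view would show the mount state on every ASM instance at once. A quick sketch, assuming the standard gv$asm_diskgroup view:

-- Mount state of DATA3 on every ASM instance in the cluster.
select inst_id, group_number, name, state, total_mb
from   gv$asm_diskgroup
where  name = 'DATA3'
order  by inst_id;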

Can I confirm the underlying disk ?

SQL> select group_number, disk_number, header_status, state, total_mb
2 from v$asm_disk
3 where group_number = 5;

GROUP_NUMBER DISK_NUMBER HEADER_STATU STATE TOTAL_MB
------------ ----------- ------------ -------- ----------
5 0 MEMBER NORMAL 1953

SQL>


What happens when I create a tablespace/datafile in this DiskGroup, from the instance on node1 ?

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options
-sh-3.2$ su - oracle
Password:
-sh-3.2$ sqlplus '/ as sysdba'

SQL*Plus: Release 11.2.0.2.0 Production on Sun Aug 17 00:08:31 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> create tablespace NEW_TBS datafile '+DATA3';
create tablespace NEW_TBS datafile '+DATA3'
*
ERROR at line 1:
ORA-01119: error in creating database file '+DATA3'
ORA-15045: ASM file name '+DATA3' is not in reference form
ORA-17502: ksfdcre:5 Failed to create file +DATA3
ORA-15081: failed to submit an I/O operation to a disk


SQL>

Why do I get this error ? I could create a DiskGroup on the ASM Disk, but I couldn't add a datafile to it. Let me check the permissions.

SQL> !sh
sh-3.2$ cd /data1
sh-3.2$ ls -l asmd*
-rwxrwxr-x 1 grid oinstall 2048000000 Aug 17 00:11 asmdisk.1
-rwxrwxr-x 1 grid oinstall 2048000000 Aug 17 00:11 asmdisk.2
-rw-r--r-- 1 grid oinstall 2048000000 Aug 17 00:11 asmdisk.7
sh-3.2$ su grid
Password:
sh-3.2$ pwd
/data1
sh-3.2$ ls -l asmd*
-rwxrwxr-x 1 grid oinstall 2048000000 Aug 17 00:12 asmdisk.1
-rwxrwxr-x 1 grid oinstall 2048000000 Aug 17 00:12 asmdisk.2
-rw-r--r-- 1 grid oinstall 2048000000 Aug 17 00:12 asmdisk.7
sh-3.2$ chmod 775 asmdisk.7
sh-3.2$ ls -l asmdisk.7
-rwxrwxr-x 1 grid oinstall 2048000000 Aug 17 00:12 asmdisk.7
sh-3.2$

The oinstall group that is used by "oracle" did not have write permissions. Let me go back to Oracle now after having granted the permissions.

sh-3.2$ exit
exit
sh-3.2$ exit
exit

SQL> l
1* create tablespace NEW_TBS datafile '+DATA3'
SQL> /

Tablespace created.

SQL>

The CREATE TABLESPACE has succeeded.  I can verify the datafile and the ASM file from node2 now.

-sh-3.2$ id
uid=500(grid) gid=1001(oinstall) groups=1001(oinstall),1011(asmdba)
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sun Aug 17 00:17:19 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: / as sysasm

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> select group_number, file_number, bytes/1048576, type, redundancy
2 from v$asm_file
3 where group_number=5;

GROUP_NUMBER FILE_NUMBER BYTES/1048576
------------ ----------- -------------
TYPE REDUND
---------------------------------------------------------------- ------
5 256 100.007813
DATAFILE UNPROT


SQL>
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Real Application Clusters and Automatic Storage Management options
-sh-3.2$
-sh-3.2$ su - oracle
Password:
-sh-3.2$ sqlplus

SQL*Plus: Release 11.2.0.2.0 Production on Sun Aug 17 00:19:34 2014

Copyright (c) 1982, 2010, Oracle. All rights reserved.

Enter user-name: / as sysdba

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select file_name, bytes/1048576 from dba_data_files
2 where tablespace_name = 'NEW_TBS';

FILE_NAME
--------------------------------------------------------------------------------
BYTES/1048576
-------------
+DATA3/racdb/datafile/new_tbs.256.855792859
100


SQL>

Now, I have the new DataFile visible in ASM and the Database on the New DiskGroup.
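
If you want to tie that fully qualified file name back to the ASM file number seen earlier, the alias view can do it. A sketch, run as sysasm, using the group and file numbers from above:

-- Map ASM file number 256 in disk group 5 back to its alias / datafile name.
select a.name alias_name, f.file_number, round(f.bytes/1048576) size_mb, f.type
from   v$asm_alias a, v$asm_file f
where  a.group_number = f.group_number
and    a.file_number  = f.file_number
and    f.group_number = 5
and    f.file_number  = 256;
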
.
.
.

Categories: DBA Blogs