Pythian Group

Love Your Data

The End of an Era – Preparing for a SQL Server 2005 Upgrade

Wed, 2015-12-09 07:02


With the impending End of Support for SQL Server 2005 coming up in April 2016, most companies know that they need to find an upgrade path in order to benefit from the new features.

There are many upgrade paths and targets to take you out of SQL Server 2005 and into 2014, or even straight into the cloud with Azure SQL Database or Azure SQL Data Warehouse.

Here at Pythian we have created a handy 45-minute webinar covering upgrade scenarios and upgrade tools, and we even share our own process for approaching SQL Server upgrade projects. We also include demos of upgrading to 2014 and to Azure SQL Database.

As much as we’ve enjoyed working with SQL Server 2005, it’s time to say goodbye and jump 10 years ahead.


And just for fun, let’s go down memory lane and analyze what an amazing milestone SQL Server 2005 was for Microsoft data professionals.

SSIS was introduced. And it is still going strong after 10 years and countless improvements.

DMVs and DMFs were introduced. How did we ever do engine troubleshooting before!? They were a huge step up from SQL 2000 and the foundation of the enormous amount of instrumentation built into the product today.

CLR integration was introduced, bringing .NET into the engine and opening a whole new world of programming capabilities.

Row multi-versioning was introduced in the engine, opening up new levels of concurrency and closing the gap on a major piece of functionality that Oracle had from day one.

Table partitioning was introduced. Another major feature needed to take SQL Server into the VLDB and warehousing space.

Database mirroring was introduced: shared-nothing, easy high availability without the need for shared storage. Obviously the foundation of the current AlwaysOn Availability Groups.

DDL triggers, XML support, ranking functions, recursive CTEs, separation of user and schema. Not much to add here.

And that’s not a complete list. Truly, the 2005 release finally established SQL Server as a real enterprise-grade relational database system. No wonder it took 5 years to build: to this day, SQL 2005 is the foundation that has taken us all the way to 2016 and the massive world of cloud relational databases on Azure. It also kick-started many careers (mine included) as more and more companies saw the potential of the product and adopted Microsoft’s data platform.

If you’re considering upgrading from SQL Server 2005, let us know. We’ll connect you with one of our SQL Server experts to review your requirements and discuss how we can work with your team to ensure the success of your upgrade.

Categories: DBA Blogs

Locks, Blocks and Deadlocks – What’s the Difference?

Fri, 2015-12-04 07:42


We are often paged by development teams about locks, blocks, or deadlocks, and the terms are sometimes used incorrectly.

There is a big difference between the three, and each will be explained at a high level in this post:


A lock is acquired when a process accesses a piece of data that another concurrent process may need at the same time. By locking the piece of data, we ensure that we are able to act on that data the way we expect.

For example, if we read the data, we usually like to ensure that we read the latest data. If we update the data, we need to ensure no other process is updating it at the same time, etc.
Locking is the mechanism that SQL Server uses in order to protect data integrity during transactions.
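To see this mechanism in action, the locks currently held on an instance can be inspected through the sys.dm_tran_locks DMV; a minimal sketch (the column selection is illustrative):

```sql
-- List current locks: who holds or waits for them, and what they protect
SELECT  request_session_id,     -- session holding or waiting for the lock
        resource_type,          -- OBJECT, PAGE, KEY, RID, ...
        resource_database_id,
        request_mode,           -- S, X, IX, U, ...
        request_status          -- GRANT or WAIT
FROM    sys.dm_tran_locks
ORDER BY request_session_id;
```

Rows with a request_status of WAIT are sessions queued behind a lock someone else holds, which is exactly the blocking scenario described next.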


A block (or blocking lock) occurs when two processes need access to the same piece of data concurrently: one process locks the data and the other must wait for the first to complete and release the lock. As soon as the first process completes, the blocked process resumes. A blocking chain works like a queue: once the blocking process completes, the next processes can continue. In a normal server environment, infrequent blocking locks are acceptable. But if blocking locks are common rather than infrequent, there is probably a design or query implementation problem, and the blocking may be causing performance issues.
A block can be described like this:


A blocking situation may NOT resolve by itself (for example, if the blocking process never completes its transaction properly) or may take a long time to clear. In these extreme situations, the blocking process may need to be killed and/or redesigned.
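To find out who is blocking whom before taking that step, the sys.dm_exec_requests DMV exposes the blocking chain directly. A minimal sketch (the session id in the KILL statement is hypothetical):

```sql
-- Find sessions that are currently blocked and the session blocking them
SELECT  session_id,
        blocking_session_id,   -- 0 means the session is not blocked
        wait_type,
        wait_time              -- milliseconds spent waiting so far
FROM    sys.dm_exec_requests
WHERE   blocking_session_id <> 0;

-- As a last resort, terminate the head blocker (53 is a hypothetical session id)
KILL 53;
```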


A deadlock occurs when one process is blocked, waiting for a second process to complete its work and release its locks, while the second process is at the same time blocked, waiting for the first process to release its lock.

In a simplified way, the deadlock would look like this:


In a deadlock situation, the processes are already blocking each other so there needs to be an external intervention to resolve the deadlock. For that reason, SQL Server has a deadlock detection and resolution mechanism where one process needs to be chosen as the “deadlock victim” and killed so that the other process can continue working. The victim process receives a very specific error message indicating that it was chosen as a deadlock victim and therefore the process can be restarted via code based on that error message.
This article explains how SQL Server detects and resolves deadlocks. Deadlocks are considered a critical situation in the database world because processes are automatically killed; they can and should be prevented. Unlike blocking, deadlocks are resolved by SQL Server itself and do not need manual intervention.
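Because the victim receives a well-known error number (1205), application code can catch it and retry the transaction. A minimal T-SQL sketch of that retry pattern (the table and column names are hypothetical):

```sql
DECLARE @retries int = 3;
WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        UPDATE dbo.Table1 SET Col1 = Col1 + 1 WHERE Id = 1;  -- hypothetical work
        COMMIT TRANSACTION;
        BREAK;  -- success, leave the retry loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
        IF ERROR_NUMBER() = 1205     -- chosen as the deadlock victim
            SET @retries -= 1;       -- retry the transaction
        ELSE
            THROW;                   -- any other error: re-raise it
    END CATCH;
END;
```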


Lock-Avoiding Design Strategies

Some of the strategies are described here:

“There are a few design strategies that can reduce the occurrence of blocking locks and deadlocks:

  • Use clustered indexes on high-usage tables.
  • Avoid high row count SQL statements that can cause a table lock. For example, instead of inserting all rows from one table to another all at once, put a single INSERT statement in a loop and insert one row at a time.
  • Break long transactions up into many shorter transactions. With SQL Server, you can use “bound connections” to control the execution sequence of the shorter transactions.
  • Make sure that UPDATE and DELETE statements use an existing index.
  • If you use nested transactions, be sure there are no commit or rollback conflicts.”

My additions:

  • Always access objects in the same order (e.g. always update Table1, Table2, and Table3 in that order, rather than sometimes Table2 first).
  • Don’t schedule long data updating processes to run concurrently, if possible.
  • Keep transactions as short as possible.
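The first of these additions can be sketched with a pair of transactions touching the same tables: as long as every transaction updates Table1 before Table2, neither can end up holding the lock the other needs (table and column names are hypothetical):

```sql
-- Every transaction in the application follows the same order: Table1, then Table2
BEGIN TRANSACTION;
    UPDATE dbo.Table1 SET Col1 = 'A' WHERE Id = 1;
    UPDATE dbo.Table2 SET Col1 = 'B' WHERE Id = 1;
COMMIT TRANSACTION;

-- A second transaction doing the reverse order (Table2 first, then Table1)
-- could deadlock with the one above; keeping the order identical turns a
-- potential deadlock into, at worst, a short blocking wait.
```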


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

Log Buffer #452: A Carnival of the Vanities for DBAs

Fri, 2015-12-04 07:28

This Log Buffer Edition covers top Oracle, SQL Server and MySQL blog posts of the week.


Oracle:

  • In Oracle EBS 12.0 and 12.1, the Workflow notification system was not able to send e-mail notifications to users or roles who happened to have multiple e-mail addresses associated with them.
  • Just how can a SQL Developer user quickly build out a SQL script for a database user that will include ALL of their privileges, roles, and system grants?
  • Oracle BI 12c has been released for some time now. There are a few changes in the way it is installed compared to the previous 11g releases. This post is about installing and configuring OBIEE 12c with detailed step-by-step instructions (Linux x86-64 in this case).
  • In today’s digital economy, customers want effortless engagements and answers to their questions regardless of how they connect with a brand.
  • Upgrade to Oracle Database 12c and Avoid Query Regression.

SQL Server:

  • Continuous integration (CI) is the process of ensuring that all code and related resources in a development project are integrated regularly and tested by an automated build system.
  • SSIS Issues after Master DB Corruption – “Please Recreate Master Key” When Running Package.
  • Check FileSize and LogUsage for all DBs.
  • Other Users Cannot Execute SSIS Packages after migration.
  • How to Get Started Using SQL Server in Azure.


MySQL:

  • Amazon Aurora in sysbench benchmarks.
  • “Data” and “Performance” is where MySQL Cluster’s heart is. In-memory performance and always-up drives our agenda. The Percona Live Data Performance Conference is coming up with two submitted sessions about Cluster.
  • Fixing errant transactions with mysqlslavetrx prior to a GTID failover.
  • MariaDB CONNECT storage engine handles access to JSON files through standard SQL. It comes with a set of UDFs (user defined functions) to manipulate the JSON format. This JSON content can be stored in a normal text column.
  • Become a ClusterControl DBA: Managing your Database Configurations.


Learn more about Pythian’s expertise in Oracle, SQL Server & MySQL.

Categories: DBA Blogs

How to Troubleshoot an ORA-28030 Error

Fri, 2015-12-04 07:22

ORA-28030: Server encountered problems accessing LDAP directory service.
Cause: Unable to access LDAP directory service.
Action: Please contact your system administrator.


There are many possible causes of this error when you are trying to log in to the database with Oracle Internet Directory (OID) authentication. A sample of the error is shown below:

SQL> conn howie@dbtest
Enter password:
ORA-28030: Server encountered problems accessing LDAP directory service

Warning: You are no longer connected to ORACLE.


Here is how I usually troubleshoot this kind of issue, with two examples.

First of all, you need to enable the trace to dump the actual errors in the database:

SQL> alter system set events '28033 trace name context forever, level 9';


Second, reproduce the error:

SQL> conn howie@dbtest
Enter password:
ORA-28030: Server encountered problems accessing LDAP directory service


Third, disable the trace:

SQL> alter system set events '28033 trace name context off';

After checking the trace files, I found errors like the following, which are related to the DNS configuration of the OID server lnx-ldap. Check /etc/hosts or DNS to make sure the OID server lnx-ldap is reachable on port 3131:

KZLD_ERR: failed to open connection to lnx-ldap:3131
KZLD_ERR: 28030
KZLD_ERR: failed from kzldob_open_bind.

Or you may see an error like this, which means the wallet files were corrupted. Recreate the wallet, and make sure the wallet path is defined properly:

kzld_discover received ldaptype: OID
KZLD_ERR: failed to get cred from wallet
KZLD_ERR: Failed to bind to LDAP server. Err=28032
KZLD_ERR: 28032
KZLD is doing LDAP unbind
KZLD_ERR: found err from kzldini.

There are many possible causes of ORA-28030; in this post I have simply given you hints for identifying the root cause.

Hope it helps!


Discover more about our expertise in the world of Oracle.

Categories: DBA Blogs

Step-by-Step Upgrades to Cloudera Manager and CDH

Fri, 2015-12-04 07:03



Lately, several of our security-conscious clients have expressed a desire to install and/or upgrade their Hadoop distribution on cluster nodes that do not have access to the internet. In such cases the installation needs to be performed using local repositories. Since I could not find a step-by-step procedure to accomplish this, I thought I would publish one myself.

The following step-by-step procedure has been implemented using the following configuration and specifications:

Cloudera Manager Node : m3.large EC2 Instance running Centos 6.5 (CentOS-6.5-GA-03.3-f4325b48-37b0-405a-9847-236c64622e3e-ami-6be4dc02.2 (ami-8997afe0))
Name Node: m3.large EC2 Instance running Centos 6.5 (CentOS-6.5-GA-03.3-f4325b48-37b0-405a-9847-236c64622e3e-ami-6be4dc02.2 (ami-8997afe0))
Data Nodes (3): m3.large EC2 Instance running Centos 6.5 (CentOS-6.5-GA-03.3-f4325b48-37b0-405a-9847-236c64622e3e-ami-6be4dc02.2 (ami-8997afe0))

Existing Version of Cloudera Manager: 5.4.3
Existing Version of CDH: 5.4.2

Upgrade to Version of Cloudera Manager: 5.5.0
Upgrade to Version of CDH: 5.5.0

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.6 (Santiago)

# cat /proc/version
Linux version 2.6.32-504.16.2.el6.x86_64 (gcc version 4.4.7 20120313 (Red Hat 4.4.7-9) (GCC)) #1 SMP Tue Mar 10 17:01:00 EDT 2015


Upgrade Steps

We will be completing the upgrade of the cluster in two steps. In the first step only Cloudera Manager will be upgraded to version 5.5. Once the cluster has been verified to be functional with Cloudera Manager 5.5 then we will upgrade CDH to version 5.5.


1. Upgrade Cloudera Manager


1. Let’s start by creating the local repository for Cloudera Manager. Download the latest version of Cloudera Manager from the link below on the Local Repository Host:

# wget -r --no-parent --reject "index.html*" ""
# wget ""

2. Copy the downloaded files to the pub/repos/cloudera-manager directory on the Local Repository Host. Then start a local web server with pub/repos as its root directory. You may use any web server, including Python's SimpleHTTPServer or Apache. The following steps use SimpleHTTPServer:

# cd pub/repos
# nohup python -m SimpleHTTPServer 8000 &

Expected output for http://Local Repository Host:8000/pub/repos/cloudera-manager:

Expected output for http://Local Repository Host:8000/pub/repos/cloudera-manager/RPMS/x86_64


3. Make sure the local repository for Cloudera Manager is configured on all nodes:

$ cat /etc/yum.repos.d/cloudera-manager.repo
[cloudera-manager]
name=Cloudera Manager package mirror
baseurl=http://Local Repository Host:8000/pub/repos/cloudera-manager
gpgkey=http://Local Repository Host:8000/pub/repos/cloudera-manager/RPM-GPG-KEY-cloudera
gpgcheck=1

4. Log on to Cloudera Manager. Stop the Cloudera Management Service:


5. Make sure all services are stopped. Sample screens after stopping below:


6. Stop the Hadoop Cluster:

7. SSH to the Cloudera Manager server. Stop the Cloudera Manager service:

# sudo service cloudera-scm-server status
cloudera-scm-server (pid 6963) is running...
# sudo service cloudera-scm-server stop
Stopping cloudera-scm-server: [ OK ]
# sudo service cloudera-scm-server status
cloudera-scm-server is stopped

8. Before proceeding with the upgrade, make sure you back up the Cloudera Manager databases and the databases used by CDH services like Hive Metastore, Oozie, Sentry etc.

9. When you are ready, issue the command to upgrade Cloudera Manager:

# yum upgrade cloudera-manager-server cloudera-manager-daemons

Make sure the Upgrade Version for Cloudera Manager is as below:


10. To verify that the upgrade was successful, issue the following command:

# rpm -qa 'cloudera-manager-*'
cloudera-manager-server-5.5.0-1.cm550.p0.61.el6.x86_64

11. Start Cloudera Manager:

# service cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]

12. Monitor the Cloudera Manager server log for errors. The Cloudera Manager console is ready for use once you see the “Started Jetty Server” message in the log:

# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

13. Log on to Cloudera Manager. You should now see the following screen. Note the running version:


14. Choose Option as below to upgrade Cloudera Manager Agents. Press Continue:


15. Choose Custom Repository:

In first box add: http://Local Repository Host:8000/pub/repos/cloudera-manager
In second box add: http://Local Repository Host:8000/pub/repos/cloudera-manager/RPM-GPG-KEY-cloudera


Press Continue.


16. Check JDK/Java options as below and press Continue:


17. Provide SSH credentials and Press Continue:


18. Cloudera Manager will now upgrade the Agents:


19. Verify Completion. Press Continue:


20. Inspect Hosts for Correctness. Press Continue:


21. You should now see a Confirmation Screen as below:


22. Upgrade Cloudera Management Service. Press Continue:


23. Confirm Restart of Cloudera Management Service:


24. Verify Cloudera Management Service restarted. Press Finish:


25. On the Cloudera Manager Home Screen. Choose Deploy Client Configuration:


26. Verify Client Configurations Deployed:


27. Start the Cluster:


28. Verify Services on the Cluster are Active:


29. Verify Cloudera Manager Version:


30. Verify Agents Upgraded. Issue the following commands on all nodes:

# rpm -qa ‘cloudera-manager-*’

31. Congratulations. Upgrade of Cloudera Manager was successful:


2. Upgrade Cloudera Distribution

Now that Cloudera Manager has been upgraded, let’s upgrade CDH to version 5.5.

1. Download the latest version of CDH from the link below on the Local Repository Host:

wget
wget

2. Create/refresh the local repository for CDH by copying the downloaded files to the pub/repos/cloudera-cdh5/ directory on the Local Repository Host.

Expected output for http://Local Repository Host:8000/pub/repos/cloudera-cdh5/:


3. Back up HDFS metadata using the following command:

$ whoami
$ hdfs dfsadmin -fetchImage ~
15/11/27 19:23:58 INFO namenode.TransferFsImage: Opening connection to http://ip-10-169-250-118.ec2.internal:50070/imagetransfer?getimage=1&txid=latest
15/11/27 19:23:58 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
15/11/27 19:23:58 INFO namenode.TransferFsImage: Transfer took 0.09s at 2715.91 KB/s
$ ls -l
total 244
-rw-rw-r--. 1 hdfs hdfs 244838 Nov 27 19:23 fsimage_0000000000000015418

4. Back up the databases used by the various CDH services. The following screen shows the database details used for services like Oozie, HUE, Sentry etc.:


5. Log on to Cloudera Manager.

6. Verify the parcel download setting is pointing to the local repository for CDH. Press the Parcels icon on the Cloudera Manager Home Page. Press Edit Settings:


7. Choose the following Option to start upgrade of CDH:


8. Choose version 5.5:


9. Make sure you have backed up all databases:


10. The following screen indicates that we are all set to proceed. Press Continue:


11. CDH Version 5.5 parcels will now be downloaded, distributed to all nodes and unpacked. Press Continue:


12. Hosts will be inspected for correctness. Press Continue:


13. Verify that no one is using the HH-TEST cluster. Choose Full Cluster Restart. Press Continue:


14. The HH-TEST cluster will now be stopped, upgraded, and restarted. Press Continue:


15. The confirmation screen should now show the upgraded version of CDH. Press Continue:


16. Review additional post-upgrade instructions. Press Finish.

17. Verify the CDH version on the Cloudera Manager Home Page:


18. Verify the CDH version on the back end. SSH to any node in the cluster:

$ hadoop version
Hadoop 2.6.0-cdh5.5.0
Subversion -r fd21232cef7b8c1f536965897ce20f50b83ee7b2
Compiled by jenkins on 2015-11-09T20:37Z
Compiled with protoc 2.5.0
From source with checksum 98e07176d1787150a6a9c087627562c
This command was run using /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/hadoop-common-2.6.0-cdh5.5.0.jar
$ hadoop fs -ls /
Found 3 items
drwxrwxr-x   - solr solr                0 2015-11-26 20:58 /solr
drwxrwxrwt   - hdfs supergroup          0 2015-11-27 02:29 /tmp
drwxr-xr-x   - hdfs supergroup          0 2015-11-27 02:29 /user

19. This completes the upgrade of CDH:



Discover more about our expertise in Hadoop.

Categories: DBA Blogs


Fri, 2015-12-04 06:32


In my previous post I talked about how to create and use a policy against your SQL Server instance (don’t forget to check out part 1 in this series too). Now we will talk about the Data Collector and how to configure it in your environment.

What is the Data Collector?

The Data Collector (DC) is one of the main components in the set of data collection tools provided by SQL Server. With the DC we can define a centralized storage point for all metrics collected from the SQL Server instances in your infrastructure, and these metrics can come from multiple sources, not just performance counters.

To get the most out of the collected metrics, you must tune the DC for each environment in your infrastructure (development, staging, production). The DC stores all the collected information in a Management Data Warehouse (MDW) and allows you to set different retention periods for each metric collected.

As the DC has a programming interface (API), we can customize collections for any other type of desired metric. However, in this article we will focus only on three of the DC's system collection sets: Disk Usage, Query Statistics and Server Activity. Figure 28 shows how the DC fits into the strategy for collecting and managing data in a SQL Server database.


Figure 28. Data collection strategy.


The Data Collector Architecture

Before starting the implementation of the DC, it is necessary to understand which components are part of this feature. They are:

  • Target: An instance of SQL Server that supports the process of collecting metrics using the DC.
  • Target Type: Defines the type of target from which metrics will be collected. For example, an instance of SQL Server has different metrics than a SQL Server database itself.
  • Data Provider: A data source that provides metrics to a collector type.
  • Collector Type: A wrapper around the SQL Server Integration Services (SSIS) packages that provide the mechanism for collecting the metrics and storing them in the MDW.
  • Collection Item: Defines which metrics will be collected, how often they will be gathered, and how long the stored metrics will be retained.
  • Collector Set: A set of collection items.
  • Collection Mode: How the metrics are collected and stored in the MDW. Metrics can be collected continuously (cached mode) or on a schedule (non-cached mode).
  • Management Data Warehouse (MDW): The relational database used to store all the collected metrics.

Note: In SQL Server 2014 we have the following collector types: Generic T-SQL Query, Generic SQL Trace, Performance Counters, and Query Activity.

Figure 29 shows the dependencies and relationships between the components of the DC.


Figure 29. Relationship between the components of the DC.

The data provider is an external component in the DC architecture which, by definition, has an implicit relationship with the target. A data provider is specific to a particular target and supplies metrics, through views, performance counters, and Windows Management Instrumentation (WMI) components, that are consumed by the DC.

We can see from Figure 29 that a collector type is associated with a particular target, and that this relationship defines both how the metrics are collected and the schema in which they are stored. The collector type also provides the location of the MDW, which can be on the server running the collection or on a centralized server.

A collection item has a collection frequency and can only be created within a collector set. The collector set, in turn, is created on the SQL Server instance that will be monitored through the DC and consists of one or more collection items. The collection of the metrics defined in the collector set is carried out by jobs executed by the SQL Server Agent service, and the collected metrics are uploaded to the MDW on predefined schedules.
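The collector sets and collection items defined on an instance, along with their frequencies, can be inspected through the Data Collector catalog views in msdb; a sketch (the column selection is illustrative):

```sql
-- Collector sets defined on this instance, their mode, state, and items
SELECT  cs.name           AS collector_set,
        cs.collection_mode,      -- 0 = cached, 1 = non-cached
        cs.is_running,
        ci.name           AS collection_item,
        ci.frequency      AS frequency_seconds
FROM    msdb.dbo.syscollector_collection_sets  AS cs
JOIN    msdb.dbo.syscollector_collection_items AS ci
        ON ci.collection_set_id = cs.collection_set_id;
```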

Figure 30 shows the system collector set called Disk Usage, configured with the collection mode set to non-cached and using two collection items of type Generic T-SQL Query Collector Type; the metrics are collected every 60 seconds and retained in the MDW for 730 days.


Figure 30. Definition of the Disk Usage system collector set.

It is important to note that the DC is fully integrated with the SQL Server Agent service and Integration Services, using both intensively. After the DC is configured, the process of collecting and recording metrics is carried out by a set of SQL Server Agent jobs created and started automatically.


Management Data Warehouse (MDW)

Before we can collect metrics through the DC, we must first configure the MDW, the relational database responsible for storing all the metrics collected by the collector sets.

For this we can use an existing relational database and configure it as an MDW. However, it is recommended that you create a new database, because during the MDW configuration process several schemas and tables related to the DC will be created. The schemas generated automatically after configuring the DC are core and snapshot. A third schema, custom_snapshots, will be created when a custom collector set is defined by the database administrator.

The main schema of the MDW is core, as it holds the tables, stored procedures, and views available to all collector types, which are also used to organize and identify the collected metrics. To ensure the integrity and security of the MDW, all database objects belonging to the core schema can only be changed by members of the db_owner and mdw_admin database roles.

Table 2 lists all the tables in the core schema and their respective descriptions.


Table 2. Core schema tables.


The snapshot schema, in turn, contains the objects required for storing the metrics collected by the system collection sets. The tables in this schema can only be changed by members of the mdw_admin database role.

Table 3 lists the tables used by the Server Activity and Query Statistics system collection sets, created after the DC is configured.

Table 3. Tables used by collection sets.


Finally, the custom_snapshots schema holds the tables and views created when a custom collection set is configured. Any custom collection set that needs a new table to store its collected metrics can create it in this schema; tables can be added by any member of the mdw_writer database role.


Configuring the Data Collector

To demonstrate metric collection using the DC, we have the instance VITADB\SQLCMS, responsible for hosting the MDW database, and the instances VITADB\SQLINSTANCE1 and VITADB\SQLINSTANCE2, which will have their metrics collected through the collector sets. The first step in setting up the DC is creating the MDW on the VITADB\SQLCMS instance, as follows:

1) Through Object Explorer, select the Management folder.

2) Right-click on Data Collection -> Tasks -> Configure Management Data Warehouse.

3) In the dialog box that appears (see Figure 31), select the VITADB\SQLCMS instance and create the MDW database via the New button.

4) Select which logins have access to the MDW database (Figure 32), and then click Finish.


Figure 31. Creation of the MDW.


Figure 32. Definition of permissions to the MDW.


Note: Members of the mdw_admin database role have SELECT, INSERT, UPDATE, and DELETE permissions, can change any MDW schema, and can run the DC maintenance jobs. Members of the mdw_writer role have permission to upload collected metrics to the MDW. Members of the mdw_reader role have only SELECT permission on the MDW.

After the creation and configuration of the MDW on the VITADB\SQLCMS instance, you must start the process of collecting metrics on the instances VITADB\SQLINSTANCE1 and VITADB\SQLINSTANCE2 by configuring the system collector sets on each instance and directing the collected metrics to the MDW database.

To configure the system collector sets, follow these steps:

1) Through Object Explorer, select the Management folder.

2) Right-click on Data Collection -> Tasks -> Configure Data Collection.

3) In the dialog box that appears (Figure 33), connect to the VITADB\SQLCMS instance, select the MDW database and the collector sets that you want, and then click Finish. For this example we will use the System Data Collection Sets, which will be created automatically after setup.


Figure 33. Collector set definition.
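The same setup can also be driven from T-SQL on each monitored instance using the msdb Data Collector stored procedures; a hedged sketch (the instance and database names follow this example, and the collection set id shown is illustrative):

```sql
-- Point this instance's Data Collector at the central MDW
EXEC msdb.dbo.sp_syscollector_set_warehouse_instance_name N'VITADB\SQLCMS';
EXEC msdb.dbo.sp_syscollector_set_warehouse_database_name N'MDW';

-- Enable the collector, then start one of the system collection sets
-- (a collection_set_id of 1 is illustrative; look it up in
-- msdb.dbo.syscollector_collection_sets on your instance)
EXEC msdb.dbo.sp_syscollector_enable_collector;
EXEC msdb.dbo.sp_syscollector_start_collection_set @collection_set_id = 1;
```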

With the configuration complete, three system collector sets are created: Disk Usage, Query Statistics and Server Activity. The Disk Usage collector set gathers metrics about the growth of the data files (.mdf and .ndf) and log files (.ldf) of the user and system databases on the instance monitored by the DC. With this information it is possible to see the daily growth trend, in MB, of the files examined.

Table 4 shows the properties of the Disk Usage system collector set.


Table 4. Properties of the Disk Usage system collector set.

In turn, the Server Activity system collector set gathers server activity metrics: performance statistics, blocking chains, and general information about memory, CPU, and network.

Table 5 shows the properties of the Server Activity system collector set.


Table 5. Properties of the Server Activity system collector set.

Finally, the Query Statistics system collector set gathers metrics for queries executed against the databases monitored by the DC: statistics, execution plans, the queries most costly in disk, CPU, and memory usage, and the queries that took the longest to complete.

Table 6 shows the properties of the Query Statistics system collector set.


Table 6. Properties of the Query Statistics system collector set.


Viewing the Collected Metrics

The metrics collected by the DC can be accessed directly with T-SQL queries. However, after the system collector sets are configured, some standardized reports also become available. To access them, right-click on Data Collection -> Reports -> Management Data Warehouse. After the collector sets are configured, three reports are available; we will look at each of them in the following subtopics.
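As an example of direct T-SQL access, the raw samples behind the Disk Usage report can be read from the MDW database itself (a sketch; the snapshots.disk_usage table matches the system set, though exact column names may vary by version):

```sql
-- Run against the MDW database: latest disk-usage samples
-- stored by the Disk Usage system collector set
SELECT TOP (10)
        database_name,
        dbsize,            -- data file size (in pages)
        logsize,           -- log file size (in pages)
        collection_time
FROM    snapshots.disk_usage
ORDER BY collection_time DESC;
```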


Server Activity History

All the information available in this report relates to the use of resources on the database server, such as the total CPU or memory allocated, the biggest wait types in SQL Server, IOPS values, and more. All of this is extremely useful for troubleshooting and tuning.

Figure 34 shows the top of the Server Activity History report, extracted from the VITADB\SQLINSTANCE2 instance.


Figure 34. The top of the report Server Activity History.


At the top of the report we can see which SQL Server instance's metrics are displayed and the date and time the report was requested. Below this information you can select the time period of collected metrics to load into the report. In each of the graphs, there is information about the operating system (green lines) and about SQL Server (blue lines).


Figure 35 shows the bottom of the Server Activity History report, extracted from the VITADB\SQLINSTANCE2 instance.


Figure 35. The bottom of the Server Activity History report.


These reports are also extremely useful for performance analysis and troubleshooting, because they display the biggest wait types and the main types of activity occurring on the instance. From any of these reports we can see more detail by selecting one of the lines or data bars and drilling down on the desired information.

Note: Immediately after configuring the DC there will be little information to load into the reports; the more metrics collected and stored in the MDW, the greater the detail available through the reports.


Disk Usage Summary

This report lists the size of the databases monitored by the DC and their average growth over a period of time. The metrics displayed by the report are separated into the data files and log files of the monitored databases. As shown in Figure 36, each data file and log file shows its initial size, current size, and average growth per day in MB.


Figure 36. Disk Usage Summary Report.


Query Statistics History

The most common cause of performance issues in SQL Server is inefficiently written T-SQL commands. Therefore, collecting performance metrics for these queries is an essential part of the tuning process. By default, you can view the 10 queries that consume the most CPU, but you can change this filter to view the queries that performed the most I/O operations, ran the longest, performed the most physical reads, or performed the most logical writes.


Figure 37 shows the Query Statistics History report, extracted from the VITADB\SQLINSTANCE2 instance.


Figure 37. Query Statistics History Report.
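For comparison, the report's default view, the 10 most CPU-hungry queries, can be approximated directly against the plan cache with the sys.dm_exec_query_stats DMV. This is a live snapshot of cached plans, not the MDW history the report uses:

```sql
-- Top 10 cached query statements by total CPU time (worker time is in microseconds).
SELECT TOP (10)
       qs.total_worker_time / 1000 AS total_cpu_ms,
       qs.execution_count,
       qs.total_logical_reads,
       qs.total_physical_reads,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```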


Recommendations for Configuration of DC

To ensure that there is minimal impact while the DC monitors a SQL Server database environment, adopt the following recommendations:

  • Use a centralized server for the MDW; it ensures there is a single location for running and viewing reports.
  • All SQL Server database servers that will be monitored by the DC should be part of the same domain.
  • When creating a custom collector set using the Generic SQL Trace collector type, define a set of filters so that only the metrics that are really necessary are collected; this way the MDW doesn’t store unnecessary information.
  • Before you create a custom collector set using performance counters, make sure the Server Activity system collector set is not already collecting that metric.
  • If several metric collections across multiple T-SQL queries run at the same frequency, combine them into a single collector set. Doing this reduces the amount of memory used by the DC executable (DCEXEC.exe) while gathering metrics. Similarly, combine multiple collection items of type Performance Counters into a single collection item whenever possible.
  • Combine multiple collection items into a single collector set whenever possible. The only reason to create separate collector sets is if they need different retention periods or different collection schedules.
  • A collector set using the Cached collection mode always keeps a collection process running. If metrics are collected frequently, this is more efficient than starting and stopping the collection process each time new metrics must be collected. In contrast, the Non-Cached collection mode does not keep a collection process running most of the time: a new collection process is started according to the predefined schedule and then stopped again, avoiding excessive use of server hardware resources. So, if metric collection happens rarely, the Non-Cached mode is more efficient than leaving a collection process idle most of the time. As a general rule, if a metric needs to be collected every five minutes or more often, configure the collector set with the Cached collection mode; if it can be collected less often than every five minutes, configure it with the Non-Cached collection mode.
  • The higher the collection frequency, the greater the overhead on the database server. Always configure the lowest frequency that meets the collection requirements.
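As a sketch of the "combine related queries into one collector set" recommendation, the msdb stored procedures below are the documented entry points; the set name, item name, query, and the parameters XML are illustrative placeholders:

```sql
-- Hypothetical example: a custom T-SQL collector set using Cached mode
-- (collection_mode = 0) because the collection frequency is every 5 minutes.
USE msdb;

DECLARE @set_id int, @set_uid uniqueidentifier, @type_uid uniqueidentifier;

EXEC dbo.sp_syscollector_create_collection_set
     @name = N'Custom Session Metrics',              -- assumed name
     @collection_mode = 0,                           -- 0 = Cached, 1 = Non-Cached
     @days_until_expiration = 14,                    -- MDW retention
     @schedule_name = N'CollectorSchedule_Every_5min',
     @collection_set_id = @set_id OUTPUT,
     @collection_set_uid = @set_uid OUTPUT;

SELECT @type_uid = collector_type_uid
FROM dbo.syscollector_collector_types
WHERE name = N'Generic T-SQL Query Collector Type';

-- Additional related queries at the same frequency should be added as further
-- collection items in this same set rather than as new collector sets.
EXEC dbo.sp_syscollector_create_collection_item
     @collection_set_id = @set_id,
     @collector_type_uid = @type_uid,
     @name = N'Session counts',                      -- assumed name
     @frequency = 300,                               -- seconds between collections
     @parameters = N'<ns:TSQLQueryCollector xmlns:ns="DataCollectorType">
       <Query>
         <Value>SELECT COUNT(*) AS session_count FROM sys.dm_exec_sessions</Value>
         <OutputTable>session_counts</OutputTable>
       </Query>
     </ns:TSQLQueryCollector>';
```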



As described earlier, beginning with SQL Server 2008 we have two tools that facilitate the management of a SQL Server environment consisting of multiple instances: Policy-Based Management and the Data Collector.

Using PBM you can create policies that evaluate certain conditions on objects existing in a SQL Server database instance. These policies can be designed manually or imported from the XML files made available with the installation of the database engine. Policies can be evaluated manually (OnDemand), on a predefined schedule (OnSchedule), or at the moment a particular property of a SQL Server object is changed (OnChange), and they can also be evaluated on multiple SQL Server instances at once through the Central Management Server functionality.

With the DC we have a feature that collects metrics from all SQL Server instances and stores them in a centralized database called the Management Data Warehouse (MDW). Configuring the DC and the MDW creates three system collector sets. They collect metrics about server hardware resource utilization (CPU, memory, disk, and network), the growth of the data and log files of monitored databases, and the most costly T-SQL queries executed on the database server; the collection itself is carried out by jobs defined in SQL Server Agent.

It is worth noting that the DC also provides a wide range of reports, so that the metrics collected by the collector sets can be evaluated during troubleshooting and tuning. Finally, note that the DC is a complete monitoring tool, and it needs to be configured carefully to avoid placing a heavy load on the database servers.

Until next time!


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

AWS Redshift Cluster Sizing

Thu, 2015-12-03 15:50


Sizing a database is one of the primary DBA functions no matter what the technology. Though Redshift is highly managed for us, we must still address this task.

The first thing to note is that in sizing a cluster we start from an estimated storage capacity need, since the amount of storage available per node of the cluster is fixed. While you get the disk space you pay for, AWS guidelines and user experience show that performance can suffer when space becomes tight (>80% used). So when sizing the cluster for a specific capacity, an extra 20% needs to be tacked onto your calculations.

AWS currently offers two types of instances for Redshift clusters, dense compute (dc1.*) or dense storage (ds2.*) servers. The dc1 series offers about 6x the CPU and 6x the memory per terabyte of storage. These are good for use cases where there is a regular load of heavy analytics querying involving multiple joins. The ds2 series is more cost effective when the tables are highly denormalized and the analytics can be offloaded into a BI tool such as Microstrategy. It is possible to migrate to a new node type through snapshots, but the process will involve some downtime.

To address a few points of the Redshift architecture, note that only the compute nodes hold storage, so the leader node is not considered (nor do you have to pay for it). Each storage disk is replicated to two others for redundancy, but these additional disks are not part of the calculations or specifications used for sizing (though they do affect the disk monitoring calculations that we will see later). Here we examine example clusters on a budget of around $10k using yearly contracted instances (saving 30+% over on-demand instances):

(7) dc1.large with 160GB SSD @ $1,380/year = 1,120GB @ $9,660/year.
(2) ds2.xlarge with 2TB HDD @ $4,295/year = 4,000GB @ $8,590/year.

The dc1 instances in this case are around 4x as expensive per terabyte (though still quite the bargain compared to hosting the cluster in-house) and give a 30-80% performance gain (depending on the benchmarks, (example)). And while you can always add nodes to accommodate data growth, you can only add nodes of the same instance type, which could become quite expensive if you’re using the small-disk-capacity dc1 instances.

Once your cluster is up, it is vital to monitor disk and CPU utilization so you know when to add new nodes. It is highly advisable to watch the graphs under the Performance tab in the Redshift Console as you add new load to the cluster.

There are built-in CloudWatch alarms for disk usage, and these should be configured to alert above 70%. I like to know well in advance when usage is getting there, so I regularly use Period = 5 minutes, Statistic = Average, over 1 consecutive period; but since loads and vacuums can create usage spikes, you might want to configure the alert over more or longer periods. While CloudWatch is great for this monitoring, it is convenient to also be able to compute capacity in SQL. There are several ways to query disk usage that render subtly different results, unfortunately none of which will yield the stats given by CloudWatch. Here’s an example for a 6-node 12TB cluster that currently shows disk space as 32% used on each node in the Console yet displays as 23%:

select host
,sum(capacity)/3 as total
,sum(used)/3 as used
,sum(capacity)/3 - sum(used)/3 as free
,(((sum(used)/3)/(sum(capacity)/3.00)) * 100.00) as pct_used
from stv_partitions
group by host


The point here is to be wary of querying disk space yourself and to rely on CloudWatch and the Console.

Compression encoding on your tables can save substantial space (50-75% on average, depending on the encoding) and can improve performance by 100+%. Encoding can also increase CPU usage, so we want to monitor CPU as we implement encoding. A cluster can be brought to a standstill by just one node hitting 100% CPU utilization, so we also need to set up CloudWatch alarms to make sure we average < 70% and don’t hit spikes > 90% for more than 10 minutes. Once we bump up against those metrics, it’s time to add another node.
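Redshift can suggest encodings for you. A minimal sketch using the built-in ANALYZE COMPRESSION command (the table name is a placeholder):

```sql
-- Ask Redshift to recommend a compression encoding per column of an existing table.
-- 'my_table' is a placeholder; COMPROWS limits how many rows are sampled.
ANALYZE COMPRESSION my_table COMPROWS 100000;
-- Encodings cannot be changed in place: to apply the suggestions, create a new
-- table with ENCODE clauses on its columns, deep-copy the data, and swap tables.
```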


Discover more about our expertise in the Cloud.


Syncing Inconsistent MySQL Slaves

Wed, 2015-12-02 11:52


Checksums are a standard practice among DBAs to verify data consistency across replicated nodes. In this post we’re going to review the options for syncing an inconsistent MySQL slave of a Galera cluster node.

Here we’re assuming a setup of regular replication to a standalone MySQL instance from one of the Galera cluster nodes.
In the usual MySQL replication setup, standard practice is to use the pt-table-checksum tool to identify discrepancies and pt-table-sync to bring the nodes back in sync. The checksum tool, pt-table-checksum, can run against a Galera cluster node to verify data consistency and confirm whether the MySQL slave is consistent with a chosen primary node.

What happens if this Galera cluster’s regular MySQL slave sees data inconsistency? Will pt-table-sync work there? The answer depends…
pt-table-sync, when used with --sync-to-master, takes locks on the master, but Galera doesn’t like those lock attempts.

You may ask, why locks on a master?

Because pt-table-sync treats the master as the source and the slave as the destination. It will lock the table (--lock=1) on the master and apply changes on the master, which will eventually be replicated to the slave, thus causing the sync.


Respective snippet of code from pt-table-sync script V.2.2.15:

	lock_server(src => $src, dst => $dst, %args);
	$exit_status |= sync_a_table(
	   src   => $src,
	   dst   => $dst,
	   where => 1,  # prevents --where from being used
	   diff  => $diff,
	   %args,
	);
	unlock_server(src => $src, dst => $dst, %args);

Again… coming back to our point: pt-table-sync wouldn’t work well on Galera with --sync-to-master. Let’s make an attempt:

PTDEBUG=1 ./pt-table-sync --verbose --execute --replicate=pyth.checksum --sync-to-master --tables slave_repl.tmp h=localhost,u=root,p=$pw > slave_repl.tmp.sql

If you run the above command (on the slave) with debug enabled, you will note the following error:

	# TableSyncer:6114 4650 Committing DBI::db=HASH(0x177cf18)
	Deadlock found when trying to get lock; try restarting transaction at line 6115 while doing slave_repl.tmp on localhost

(Without PTDEBUG you won’t see much on the slave, except that it will report nothing changed!)

Great, so why the error? Let’s check the code snippet again (pt-table-sync V.2.2.15):

	sub lock_table {
	   my ( $self, $dbh, $where, $db_tbl, $mode ) = @_;
	   my $query = "LOCK TABLES $db_tbl $mode";
	   PTDEBUG && _d($query);

As you can see, it calls LOCK TABLES, and Galera, as we know, doesn’t support explicit locking because of the conflict with multi-master replication. That’s the reason for the error above.

Okay, let’s continue… Upon executing pt-table-sync on the slave, the “master”‘s (Galera node’s) error log will show the following error:

	2015-08-27 14:45:07 6988 [Warning] WSREP: SQL statement was ineffective, THD: 17, buf: 1399
	QUERY: commit
	 => Skipping replication


We already have a bug report in place and if it affects you, go ahead and mark it so.

So how would you fix this?

Easy answer: do a complete rebuild of the slave from a fresh data backup of a cluster node.
Desync the cluster node, take the backup, restore it on the slave machine and set up replication.

But let’s think about an alternate method other than a complete restore…

“fixing using pt-slave-restart?”

But pt-table-sync is “not Galera Ready” as they say!!

Even then, pt-table-sync can help us understand the differences, and that’s the long answer :)


It can still work and prepare the SQL for you using the --print --verbose options.

./pt-table-sync --verbose --print --replicate=pyth.checksum --sync-to-master --tables slave_repl.tmp h=localhost,u=root,p=$pw > slave_repl.tmp.sql

So, all you need to do is run that SQL against the slave to fix the discrepancies. You may choose to desync the node and run pt-table-sync to generate the differential SQL. You’ll still need to confirm the slave is in sync by re-running pt-table-checksum and verifying the discrepancies are resolved.

Our steps to resync a Percona XtraDB Cluster’s slave using pt-table-sync are as follows:

(Note: It’s advisable to stop writes on the cluster while fixing the discrepancies on the slave. “Why?” Explained later.)

– Desync master node:

set global wsrep_desync=ON;

– Generate differential SQL:

./pt-table-sync --verbose --print --replicate=pyth.checksum --sync-to-master --tables slave_repl.tmp h=localhost,u=root,p=$pw > slave_repl.tmp.sql

– Review the generated SQL and execute it on the slave.

– Once slave is synced, you can:

set global wsrep_desync=OFF;

– Finally, rerun pt-table-checksum to verify the discrepancies are resolved.

That concludes our solution.


“Wait, but why desync?”
hmm… Well, wsrep_desync is a dynamic variable that controls whether the node can participate in Flow Control.


“hold on!! Flow control??”
Galera has synchronous replication, wherein a node provides feedback to the rest of the group: fellas-I’m-late-hold-on OR okay-let’s-continue-the-job. This communication feedback is flow control. (You should read the Galera documentation & Jay’s post.)

When we set wsrep_desync=ON on the master, it continues to replicate writesets in and out as usual, but flow control no longer takes the desynced node into account. So, the other nodes of the group won’t worry about the desynced node lagging behind. Thus, by desyncing we make sure that our operations on one node do not affect the whole cluster. This should answer why writes need to be stopped before starting the sync.
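Desync and flow control can both be observed from SQL using the standard wsrep status variables:

```sql
-- 'Synced' vs 'Donor/Desynced' shows whether wsrep_desync took effect on this node:
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';

-- Flow-control counters; a growing wsrep_flow_control_sent means this node
-- has been asking the rest of the cluster to pause for it:
SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';
```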

Hope this helps.


Discover more about our expertise in MySQL.


Data Collector and Policy-Based Management: Part 2

Wed, 2015-12-02 11:20


In my previous post we talked about the architecture of Policy-Based Management. In this post we will create and run a policy against some SQL Server Instances.

Creating a Policy

In SQL Server 2014 you can create a policy manually via T-SQL, export existing policies from an instance of SQL Server, or import the policies made available after the installation of the database engine. When designing a policy we can use advanced conditions that allow an almost unlimited range of validations on the database objects existing in an instance.

To make it easier to understand all the components used in a policy and how they interact with each other, we will create a policy from scratch. To do this, you must first define a condition and then the policy that uses this condition. With the policy created, it is possible to categorize it and define which targets should be evaluated.

To illustrate this in a practical way, we will use PBM to define a policy that evaluates whether all databases in a given SQL Server instance have the AutoShrink property disabled.

Creating a Condition

The first step in creating the policy is defining the condition, which can be done through SQL Server Management Studio (SSMS) or with T-SQL commands.

To create a condition with SSMS, we have the following steps:

1)  Using Object Explorer, open the Management folder and then the Policy Management folder (see Figure 8):

Figure 8. The Policy Management folder.


2)  Within that folder, right-click the Conditions folder and select New Condition.

3)  This will open the window for creating the new condition, as shown in Figure 9.



Figure 9. Condition creation window.


According to this figure, the condition was created with the following options:

  • Name: the condition was created with the name AutoShrink False.
  • Facet: the facet used was Database.
  • Expression: the @AutoShrink property was evaluated, and its value must be equal to False.

Additionally, you can include a detailed description of the condition through the Description page, as shown in Figure 10.


Figure 10. Condition description window.

The second option for creating the condition is the system stored procedure sp_syspolicy_add_condition. Listing 1 shows the creation of the condition AutoShrink False with the same options set in SSMS.

Listing 1. Creating the condition with T-SQL.

Declare @condition_id int

EXEC msdb.dbo.sp_syspolicy_add_condition @name=N'AutoShrink False', @description=N'',
  @facet=N'Database', @expression=N'<Operator>
  <TypeClass>Bool</TypeClass>
  <OpType>EQ</OpType>
  <Count>2</Count>
  <Attribute>
    <TypeClass>Bool</TypeClass>
    <Name>AutoShrink</Name>
  </Attribute>
  <Function>
    <TypeClass>Bool</TypeClass>
    <FunctionType>False</FunctionType>
    <ReturnType>Bool</ReturnType>
    <Count>0</Count>
  </Function>
</Operator>', @is_name_condition=0, @obj_name=N'', @condition_id=@condition_id OUTPUT

Select @condition_id



Creating a Policy

Now that we have a condition, we can define a policy that uses it. Like the condition, the policy can be configured either through SQL Server Management Studio or through T-SQL.

First we create a policy using the SSMS:

1)  Using Object Explorer, open the Management folder and then the Policy Management folder (see Figure 8).

2)  Within that folder, right-click on the Policies folder and select New Policy.

3) This will open the window for creating the new policy, as shown in Figure 11.


Figure 11. Window for policy creation.


According to Figure 11, the policy was created with the following options:

  • Name: the policy was created with the name Checks AutoShrink.
  • Check Condition: the condition evaluated by the policy is AutoShrink False.
  • Against Targets: the policy should be evaluated against every existing database on the SQL Server instance.
  • Evaluation Mode: the evaluation mode selected was OnDemand, which means this policy must be evaluated manually.
  • Server Restriction: for this policy there are no conditions restricting the database servers.

Note: the target won’t always be a database. The targets change based on the facet used in the condition. For example, if we create a policy to standardize the names of new tables using the Tables facet, the Against Targets option will display All Tables.

The evaluation modes available in the list depend on the facet used in the condition. All facets support OnDemand and OnSchedule, but OnChange: Prevent depends on the ability to use DDL triggers to roll back the transaction, and OnChange: Log Only depends on the changes to the facet being captured by an event.

To facilitate the maintenance and management of the configured policies, we can add more details through the Description page, as in Figure 12.


Figure 12. Policy description window.


In addition, you can also create policies using the system stored procedures sp_syspolicy_add_object_set, sp_syspolicy_add_target_set, sp_syspolicy_add_target_set_level and sp_syspolicy_add_policy.

Listing 2 shows the creation of the policy Checks AutoShrink with the same options set in SSMS.

Listing 2. Creation of the policy with T-SQL.

Declare @object_set_id int

EXEC msdb.dbo.sp_syspolicy_add_object_set @object_set_name=N'Checks AutoShrink_ObjectSet_1',
  @facet=N'Database', @object_set_id=@object_set_id OUTPUT


Declare @target_set_id int

EXEC msdb.dbo.sp_syspolicy_add_target_set @object_set_name=N'Checks AutoShrink_ObjectSet_1',
  @type_skeleton=N'Server/Database', @type=N'DATABASE', @enabled=True,
  @target_set_id=@target_set_id OUTPUT


EXEC msdb.dbo.sp_syspolicy_add_target_set_level @target_set_id=@target_set_id,
  @type_skeleton=N'Server/Database', @level_name=N'Database', @condition_name=N'',
  @target_set_level_id=0


Declare @policy_id int

EXEC msdb.dbo.sp_syspolicy_add_policy @name=N'Checks AutoShrink', @condition_name=N'AutoShrink False',
  @execution_mode=0, @policy_id=@policy_id OUTPUT, @root_condition_name=N'',
  @object_set=N'Checks AutoShrink_ObjectSet_1'


Note: all created policies are stored in the msdb system database. After we create our policies, we must make sure msdb is part of the environment’s backup strategy, because that way it will be possible to rebuild all the policies in case of a migration or a disaster in the current environment.
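A minimal sketch of such a backup (the disk path is just a placeholder; adjust it to your environment):

```sql
-- Back up the msdb system database, which stores all PBM policies and conditions.
-- N'C:\Backup\msdb.bak' is a placeholder path.
BACKUP DATABASE msdb
TO DISK = N'C:\Backup\msdb.bak'
WITH COMPRESSION, CHECKSUM, INIT;
```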


Importing Policies

To increase the efficiency of database environment administration, we can import policies made available by Microsoft during the SQL Server installation process. These policies are stored in XML files located in the Tools directory where SQL Server was installed.

Importing predefined policies has some advantages, because all the conditions required for them to work correctly are created along with the policies. Besides the predefined policies, we can import an XML file generated from customized policies already created by the database administrator using PBM.

To run the import of a policy we must perform the following steps:

1)  Using the Object Explorer, open the Management folder (see Figure 8).

2)  Within that folder, right-click the Policies folder and select the option Import Policy.

3)  This will open the window to import the XML file with policy settings, as shown in the Figure 13.


Figure 13. Importing a policy.


To avoid duplication during the import, select Replace duplicates with items imported to overwrite any policy and condition that has the same name as the policy being imported. When an existing policy is overwritten, the evaluation history already recorded for it is not lost.

We can also preserve the default state of the policy being imported, or change it after importing the policy, as Figure 14 shows.


Figure 14 . Definition of the status of the policy.


After importing the XML file containing the policy definitions, we can see the policy in the Policies folder. As shown in Figure 15, the new policy was created with the name Data and Log File Location.


Figure 15. Policy created through the import process.

It is interesting to note that, unlike the previously created policy, the policy created through the import process has a server restriction that limits evaluation to SQL Server instances using the Enterprise or Standard editions.

Figure 16 shows the general information that was automatically filled in when the policy was imported from the XML file. The category, description and hyperlinks were populated, facilitating the documentation of why this policy was implemented.


Figure 16. Description of the policy created through the import process.


Exporting Policies

Just as we can import policies from XML files, we can export existing policies to files in this format. From these files we can then import the same policies on other SQL Server servers.

To perform the export policy procedure in XML format, there are two options:

  1. Exporting an existing policy.
  2. Exporting the current state of a facet.

To export an existing policy to an XML file, simply execute the following steps in SQL Server Management Studio:

1)  Using the Object Explorer, open the Management folder (Figure 8).

2)  Within that folder, open the folder Policy Management -> Policies to list existing policies (see Figure 17).

3)  Right-click the policy that you want to export and select Export Policy option (see Figure 18).

4)  As shown in Figure 19, select the desired location for the XML file and its name.


Figure 17. List of existing policies.



Figure 18. Export the policy.



Figure 19. Selecting policy storage location.


You can also export a policy from the current property settings of a facet. For example, after customizing the properties of the facet called Surface Area Configuration, you can export these settings to an XML file as follows:

1)  Through the Object Explorer, right-click the SQL Server instance and select Facets (see Figure 20).

2)  Select the facet you want, and then click Export Current State as Policy (Figure 21).

3)  As shown in Figure 22, fill in the name of the policy, the name of the condition and the export target of the policy.


Figure 20. Listing the facets.


Figure 21. Exporting the current state of the facet as a policy.


Figure 22. Setting the policy storage location.


Policy Evaluation

Policy evaluation is the process in which we execute a policy against a determined target and review the results. PBM allows a policy to be evaluated on a single instance or on a group of instances using the CMS. As the purpose of this article is the management of multiple instances, we will use the CMS to evaluate the policy Checks AutoShrink, created earlier, on two SQL Server database instances.

Table 1 shows the SQL Server instances that will be used to evaluate the policy defined with PBM.

Name of the Instance   | Version         | Edition    | Build
VITADB\SQLCMS          | SQL Server 2014 | Enterprise | 12.0.2000
VITADB\SQLINSTANCE1    | SQL Server 2014 | Enterprise | 12.0.2000
VITADB\SQLINSTANCE2    | SQL Server 2014 | Enterprise | 12.0.2000

Table 1. Instances used by PBM


To use the VITADB\SQLCMS instance as our management server, we must perform the following steps:

1) Right-click the Central Management Servers option and select Register Central Management Server (see Figure 23).

2) In the dialog box New Server Registration, fill in the connection information, in accordance with the Figure 24.

3) Right-click the VITADB\SQLCMS instance and select New Server Registration (Figure 25).

4) Repeat the procedure described in step 3 and register the instances VITADB\SQLINSTANCE1 and VITADB\SQLINSTANCE2.

5) Right-click the VITADB\SQLCMS instance and select Evaluate Policies.

6) In the dialog box, select the policies to be evaluated and, as shown in Figure 26, click the Evaluate button to start the validation of the selected rule.


Figure 23. Starting the creation of a central management instance.




Figure 24. Connection properties.



Figure 25. Registering a new server in the CMS.



Figure 26. Selecting the policy to be evaluated.


After the evaluation of the policy Checks AutoShrink we can see, as in Figure 27, that the database DBTeste1 on the VITADB\SQLINSTANCE1 instance and the database DBTeste2 on the VITADB\SQLINSTANCE2 instance are out of compliance with the policy. That is, both have the AutoShrink property enabled, contrary to the previously set condition, which requires the property to be disabled.


Figure 27. Result of the policy Checks AutoShrink.

As you can see, with PBM we managed to create and evaluate policies on one or more database instances. In this way we have a simpler and more efficient way to manage an environment composed of multiple instances.

In the last part of this series, we will take a look at the Data Collector tool and how to use it for centralized monitoring of SQL Server instances.


Discover more about our expertise in SQL Server.


SQL 2016 – Improve Performance with the Query Store feature: Part 1

Mon, 2015-11-30 15:19


After playing with this great new feature for some time and speaking about it during a SQL Saturday in Montreal, I can say that I am very excited about Query Store in SQL 2016!
Performance tuning is our bread and butter here at Pythian, and I think almost every DBA will agree that, unless you have some 3rd party tool (and even if you do), it’s not always straightforward to tune T-SQL code. It’s usually a tedious process, and it’s a science in itself that requires a good background and an understanding of how things work “behind the scenes”, as well as of how to help the optimizer make the right decisions.

Here are some great links to start with:

If you don’t have a dev/testing environment, you can even start with Azure SQL Database, which already has this feature. Just note that this feature is currently a CTP version, meaning that it is not yet supported. Azure DB is cheap and affordable. Another option is a dev/test Azure Virtual Machine with SQL Server 2016 on it.


Good luck, and feel free to leave us questions or comments!


Discover more about our expertise in SQL Server.


Data Collector and Policy-Based Management: Part 1

Fri, 2015-11-27 14:23


This article will address the main techniques for efficiently managing multiple instances of SQL Server 2014 using the Data Collector and Policy-Based Management tools. In addition, we will demonstrate in a practical way how to configure each of the tools, as well as how to extract and analyze the collected metrics.

With the exponential growth of the amount of data generated by applications comes increased complexity in managing database environments. This growth, combined with the low cost of storage, created a scenario in which the database administrator went from administering dozens of databases to administering hundreds of them.

Since the main responsibilities of a database administrator are to ensure the integrity, performance and stability of all SQL Server instances under their administration, the greater the number of instances and databases used within an enterprise, the greater the difficulty in monitoring and managing such an environment in a proactive and automated fashion.

For this type of scenario, SQL Server makes it possible to centralize both the execution of an administrator’s routine tasks and the collection of performance metrics from all existing instances and databases, through the Data Collector (DC) and Policy-Based Management (PBM) tools. For example, suppose all databases with the recovery model set to Full must have their log file backed up every hour. Instead of this policy existing only as a concept, requiring you to check every database server manually, you can use PBM to create the policy “physically” and ensure that it is applied at once to all SQL Server instances.

To make the management of multiple instances easier to follow, the tools will be presented in the following order: first we will look at Policy-Based Management, and then we will get to know the Data Collector.


What is the Policy-Based Management?

Policy-Based Management (PBM) is a feature available starting with SQL Server 2008 that enables the creation and application of policies on your SQL Server instances. PBM works similarly to the group policies created through Active Directory.

Note: group policies provide centralized management of applications and users, by means of rules created by system administrators that can be applied at various levels of the directory structure defined in Active Directory.

PBM applies a policy to a particular target (for example, a database, a table, a view or a stored procedure) and then the target is checked to see whether it complies with the rules of that policy. If the target does not comply, it is possible either to enforce the rules of the policy or to raise an alert so the database administrator knows of the violation.

One of the great advantages of PBM is the ability to apply a policy to multiple SQL Server instances at once, facilitating the administration and management of the corporation's entire database infrastructure.

Many features of SQL Server 2014, such as Resource Governor, Data Compression and In-Memory OLTP, require Enterprise or Developer Edition. This is not the case for PBM, which is available in all editions of SQL Server, including Express (although with the Express Edition it is not possible to create a Central Management Server).

As soon as a SQL Server 2014 instance is installed, it is possible to create and evaluate policies against any existing SQL Server in your environment, including versions prior to 2014.


Policy-Based Management Components

The PBM is composed of three main components: policies, conditions and facets, as shown in Figure 1. These components are arranged in a hierarchical order of use: a facet is required to create a condition, and a condition is necessary to create a policy. The policies, in turn, are applied to specific targets.

Figure 1. Policy Management in SQL Server 2014.



The targets are the objects managed by a particular policy and can be of various types: servers, databases, instances, stored procedures, etc. An important detail is that you can use more than one target at the same time in a policy. For example, we can have a policy which states that only object names starting with the db_ prefix are correct, and perform a validation of this rule on the tables, functions and stored procedures of one or more instances at the same time.



A facet is a group of properties that relate to a particular target. SQL Server 2014 has 86 facets, each containing several different properties. This allows the use of hundreds of properties in the creation of a policy.

You can view the properties of a facet by expanding the Facets folder and double-clicking any of the options. For example, the Data File facet has several properties, such as the maximum size of the data file, the number of reads and writes, and whether the data file is online, as shown in Figure 2.

Note: The facets are read-only, i.e. it is not possible to create custom facets or modify existing ones. Currently, new facets can be included only by Microsoft, through service packs or version upgrades of SQL Server.

Figure 2. Facet Properties Data File.



A condition can be described as a rule against which a policy is evaluated. Basically, the rule checks a target and, if the target is not in accordance with the policy, it fails. It is worth mentioning that a policy can evaluate only one condition; however, a condition can have different rules for different properties. A condition can be viewed by expanding the Conditions folder and double-clicking any one of the available options, as shown in Figure 3.

Note: In a newly installed SQL Server instance there will be no custom conditions unless they are imported or created manually; initially, only the system conditions exist.

Figure 3. Condition created on Facet Database.



The policies are complete packages that include conditions, facets, targets, evaluation modes and server restrictions (the evaluation modes and server restrictions are discussed in the next topic).
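To make the relationship between facets, conditions, policies and targets concrete, here is a small Python sketch of the hierarchy described above. It is purely illustrative: all names are invented for the example, and this is not the real SMO/DMF API that PBM uses internally.

```python
# Toy model of the PBM hierarchy: a facet exposes a property of a target,
# a condition is a rule over a facet property, and a policy evaluates
# a condition against many targets.

def recovery_model_facet(database):
    """Facet: expose the 'recovery model' property of a database target."""
    return database["recovery_model"]

def full_recovery_condition(database):
    """Condition: the recovery model property must equal 'FULL'."""
    return recovery_model_facet(database) == "FULL"

def evaluate_policy(condition, targets):
    """Policy: run the condition against every target, report violations."""
    return [t["name"] for t in targets if not condition(t)]

databases = [
    {"name": "Sales",   "recovery_model": "FULL"},
    {"name": "Staging", "recovery_model": "SIMPLE"},
    {"name": "Payroll", "recovery_model": "FULL"},
]

violations = evaluate_policy(full_recovery_condition, databases)
print(violations)  # ['Staging']
```

In PBM terms, what you do with the On Change: Prevent evaluation mode corresponds to acting on the violation list (rolling the change back), while On Change: Log Only corresponds to just recording it.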

When created, policies are stored in the msdb system database, but you can export them to an XML format. This portability allows database administrators to share and compare the created policies with ease. To view a policy, expand the Policies folder and double-click any one of the options, as shown in Figure 4.

Figure 4. Details of a policy created

Note: In a newly installed SQL Server instance there will be no custom policies unless they are imported or created manually; initially, only the system policies exist.


Policy Evaluation Modes

The PBM has four distinct modes of evaluating a policy, which determine how the evaluation will occur against a predefined target. The following evaluation modes may be available, depending on the facet used in the policy:

  • On Demand: This evaluation mode specifies that the evaluation must be run manually. By default, any policy with this evaluation mode is disabled automatically after it is created. However, even though disabled, it can still be evaluated at any time.
  • On Schedule: By selecting this mode you can schedule the policy to be evaluated at any time. You can select a schedule already created or create a new schedule that meets your needs. Creating a schedule allows you to set options such as the recurrence of execution, the execution frequency per day and per hour, and for how long a policy should be executed. For example, you could run a particular policy for the next two weeks.
  • On Change: Log Only: When you select this mode, the policy will be evaluated only when a change is made to the specified target. If the change violates the policy, the event is logged and the results of the violation are stored in the event log and in the msdb system database. This evaluation mode helps the database administrator without affecting the performance of the environment.
  • On Change: Prevent: This evaluation mode is very similar to On Change: Log Only, that is, the evaluation happens the moment an event performs any change on the target. But unlike the Log Only option, Prevent rolls back any change that violates the policy.

Figure 5 shows an example of a policy and the evaluation modes available for it.

Figure 5. Evaluation Modes.


Server Restrictions

In conjunction with targets and facets, server restrictions are another way to control how a policy is evaluated. A server restriction is nothing more than a condition, built on the Server facet, used to exclude particular servers from a policy.

Using PBM, you can create a server restriction to limit the evaluation of a policy to SQL Server instances running the Standard or Enterprise editions. When this policy is applied, it will not be evaluated by instances that do not run these specific editions.


Management of Policies

SQL Server 2014 has some features that facilitate the management and evaluation of the created policies. One of these is the ability to create categories to group similar policies, and to use the Central Management Server (CMS) to execute policies throughout the database environment.



The categories are logical groups of one or more policies that assist in their management and execution. For example, you can create a group of policies that will be evaluated only in test or development environments. When a policy is created, you can specify a category in the Description options, as shown in Figure 6.


Figure 6. Category definition


Central Management Server (CMS)

CMS functionality is not part of the PBM architecture, but it has become extremely important when using policies in a SQL Server environment consisting of multiple servers.

Through the CMS you can designate a database instance to act as a central management server and store a list of registered instances, which can be organized into one or more groups, as shown in Figure 7.

Figure 7. Central Management Server (CMS)



Once the policies are configured and implemented, there is no need to constantly check the servers to make sure that they are in accordance with the conditions set out in the policies. Instead, we can use the SQL Server Agent alerts to receive notifications automatically when a policy is violated.

In the next post of this 3-part series we will learn how to create a policy and how to use it.


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

SQL On The Edge #5 – Custom Visualizations in PowerBi

Fri, 2015-11-27 13:39


Have you ever used Microsoft's PowerBi service? If you have, have you used it within the last year? The reason I ask is that the current PowerBi service is so radically different from the initial release that pretty much the only thing they have in common is the name. Today I'm going to do a short summary of how we got here, where the service is, where it's going and the actual topic for my video, the new awesome custom visualization functionality!

A Short History Lesson

A few years ago, Microsoft went on the direction of empowering business analysts on what is one of the most common business applications in the world: Excel. With this push, they started releasing some amazing plugins for Excel: PowerPivot, PowerQuery, PowerView, PowerMap. Suddenly we could import millions of rows from all kinds of sources into Excel! And transform them! And visualize them! Then with the release of Office 365 and Sharepoint Online, a service was created to make it easy to share and consume all these Excel-born reports. And thus PowerBi was born but it required all these other tools and subscriptions. It didn’t catch on.

Fast Forward To Today

This initial offering of PowerBi had too many external dependencies. You needed Excel with all the bells and whistles to do the reports and then all these other satellite cloud services. Thankfully someone saw and executed on a clearer vision: one service, no dependencies, no cost of entry, fully mobile friendly. It’s been a very interesting journey from a market perspective to see how MS released something, course corrected and then went head-on with their improved vision.

The PowerBi desktop designer is free. The mobile apps (all major mobile OSes AND Windows Phone) are free as well. The service itself also has a free, fully functional tier; it's only limited by the amount of data and the enterprise sharing capabilities. Add the ease of use of the tools and the natural language query capabilities, and this is now a strong tool that can finally become a contender with the Tableaus and Qlikviews of the world.

No, it’s not perfect but it is growing and an insane amount of new features are getting added constantly. New vendors are adding content packs and now custom visualizations have made an appearance.

Community Involvement

The idea behind the first batch of custom visuals was great. MS opened a contest for people to submit their best custom visuals and the community responded with amazing creativity and participation. Not only do these immediately provide more value to every current user of PowerBi but they also serve as examples for further development of more custom visuals.

The full gallery of custom visuals can be found in the PowerBi Visual Gallery.

And if you don’t have your PowerBi account, try it out, it’s free! Head over to the official PowerBi site.

Demo Time

For the demo of working with a PowerBi project on the desktop, online and importing a custom visual, let’s check out the video! Enjoy!



Discover more about our expertise in SQL Server.

Categories: DBA Blogs

Log Buffer #451: A Carnival of the Vanities for DBAs

Fri, 2015-11-27 12:59


The show goes on. This Log Buffer Edition picks some blogs which are discussing new and old features of Oracle, SQL Server and MySQL.

Oracle:
  • Directory Usage Parameters (ldap.ora) list the host names and port number of the primary and alternate LDAP directory servers.
  • Data Visualization Cloud Service (DVCS) is a new Oracle Cloud Service. It is a subset offering of the currently supported Business Intelligence Cloud Service (BICS).
  • ORA-24247: network access denied by access control list (ACL).
  • Latches are low-level serialization mechanisms which protect memory areas inside the SGA. They are lightweight and less sophisticated than enqueues, and can be acquired and released very quickly.
  • handling disks for ASM – when DB, Linux and Storage admins work together.

SQL Server:

  • How to use the Performance Counter to measure performance and activity in Microsoft Data Mining.
  • Phil Factor demonstrates a PowerShell-based technique taking the tedium out of testing SQL DML.
  • Sandeep Mittal provides an introduction to the COALESCE function and shows us how to use it.
  • Hadoop many flavors of SQL.
  • Installing and Getting Started With Semantic Search.

MySQL:
  • Support for storing and querying JSON within SQL is progressing for the ANSI/ISO SQL Standard, and for MySQL 5.7.
  • Loss-less failover using MySQL semi-synchronous replication and MySQL Fabric!
  • Memory consumption: the binary format of the JSON data type should consume more memory.
  • This post compares a B-Tree and LSM for read, write and space amplification. The comparison is done in theory and practice so expect some handwaving mixed with data from iostat and vmstat collected while running the Linkbench workload.
  • If you do not have reliable network access (i.e. in some remote places) or need something really small to store your data, you can now use Intel Edison.


Learn more about Pythian’s expertise in Oracle, SQL Server & MySQL.

Categories: DBA Blogs

Delayed Durability and Impact on Mirroring

Fri, 2015-11-27 12:39

The goal is to test the delayed durability feature with mirroring in high performance and high safety modes, to confirm what the performance improvement of transactions is and whether the feature has any benefit on high-latency networks.


Test Scenario

We have two databases configured with mirroring in high safety mode, which requires that commits happen at the secondary first. One database, called “DalayedTEST”, has delayed durability enabled with the FORCED mode. I have configured performance counters to check the latency and performance of each database every second, adding mirroring, transaction, lock and physical disk counters so we can compare performance with and without delayed durability in a mirrored environment. Then we will switch mirroring to high performance mode and observe its behaviour as well. No witness is configured.




Using the SQLQueryStress tool, I am loading 5,000 transactions into a log table with 50 threads enabled. This is equivalent to 50 people loading 5,000 records into the same table at the same time. It should be a good test to analyze the behaviour of the databases.

Normal Database Test (Full Transactions) High Safety Mode









Locks and Transactions



Physical Disk




A failover happens 30 seconds after starting the test. The database is able to failover and record count 98850.




Delayed Database Test (Delayed Transactions) High Safety Mode









Locks and Transactions



Physical Disk





A failover happens 30 seconds after starting the test. The database is able to failover and record count 165757.


Conclusion – High Safety Mode

Mirroring High Safety Mode



Having delayed transactions enabled in a database with high safety mirroring improves performance under high contention scenarios. In this case, having lots of transactions running at the same time and contending for the same table proved to be better, as the execution time was faster when using delayed transactions.

When checking the mirroring behaviour, one can see that the mirroring log queue is bigger with delayed transactions. The difference is that with full transactions the queue is processed sequentially, which decreases the queue as each transaction completes. As a result, you will see a high spike at the beginning of the stress test, which is the full queue, and then it decreases over time. When using delayed transactions, one can see spikes spread evenly every 20 seconds or so; this means that one batch is processed, then another batch is processed, and this carries on until the queue is empty.

So does sequential mirror queue processing vs. batch processing make a difference in latency? It is clear that transaction delay is 3 times worse with delayed transactions, as it processes batches in a cyclic way, which saturates the endpoint more on every iteration and also keeps the transaction waiting for a commit acknowledgement from the mirroring partner. So, with delayed transactions under high contention, processing is faster locally as it has fewer logical reads/writes, but the mirroring communication slows down as it works in batches.

But total execution time was better with delayed transactions?

Having delayed transactions improves execution time but increases mirroring delay, so it is simply a matter of doing the math. The local processing was so much better that it compensated for the slower mirroring transaction delay, and in general terms it is faster in this scenario (I used a local network). However, if you add other variables to the stress test, such as a high-latency network for the mirroring, the end result might not be compensated and the total execution time can end up being around the same or worse.
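The trade-off can be sketched with a little arithmetic. The numbers below are made up purely for illustration (they are not the measured values from this test); the point is only that a saving in local processing can be wiped out once the per-batch mirroring delay is multiplied by network latency.

```python
# Toy model: total commit time = local processing + mirroring commit delay.
# All numbers are illustrative, not measured.

def total_time(local_ms, mirror_delay_ms):
    return local_ms + mirror_delay_ms

# Low-latency network: delayed durability wins overall even though its
# mirroring delay is ~3x worse, because local work is much cheaper.
full_local, full_mirror = 100.0, 10.0        # full durability
delayed_local, delayed_mirror = 40.0, 30.0   # delayed durability

assert total_time(delayed_local, delayed_mirror) < total_time(full_local, full_mirror)

# High-latency network: scale the mirroring delay up 5x on both sides and
# the saving in local processing no longer compensates.
assert total_time(delayed_local, delayed_mirror * 5) > total_time(full_local, full_mirror * 5)
```

This is exactly the "do the math" point: the same feature can be a net win or a net loss depending on how large the mirroring delay is relative to the local saving.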


Now, let's see the numbers with high performance mode:

Normal Database Test (Full Transactions) High Performance Mode









Locks and Transactions



Physical Disk



Delayed Database Test (Delayed Transactions) High Performance Mode









Locks and Transactions



Physical Disk



Conclusion High Performance Mode

Mirroring High Performance Mode



With delayed transactions in a database with high performance mirroring, performance is not improving, despite the fact that the indicators show less locking and fewer logical reads. It seems the mirroring queue was not being handled properly. The mirroring queue is considerably bigger when using delayed transactions, and despite working in high performance mode, the general execution time is the worst in this test environment. After I realized this, I ran other tests to see if the results were the same; generally speaking the total execution time is almost the same, with a difference of 4-10 seconds at most.


General Conclusion

In all the tests we can conclude that with delayed transactions, the mirroring queue and the amount of data transferred are consistently and considerably larger than those produced by full transactions.

When using high safety mode, delayed transactions seem to have better disk performance, which compensates for the higher mirroring/network values and results in a faster execution time. This situation can change if the network has high latency, producing the same or worse performance.

When using high performance mode, delayed transactions seem to have OK to good disk performance, but it is not good enough to compensate for the higher mirroring/network values, and this results in the same or slower execution time.

I would recommend enabling delayed transactions only to fix high contention rates. Also avoid using it with high-latency or very busy networks when mirroring in high performance mode. These recommendations are based on a small test in a controlled environment and should not necessarily extend to all environments out there. The rule of thumb is that using delayed transactions does impact mirroring performance, which by itself is a technology known to introduce certain performance issues depending on the system, so do not take the decision to use delayed transactions lightly; test first in a staging environment before rolling into production.


Discover more about our expertise in SQL Server.

Categories: DBA Blogs


Fri, 2015-11-27 11:39


This blog post is a continuation to an earlier post about my musings on Oracle EBS support for virtual host names.

Actually, most parts of Oracle E-Business Suite work with virtual host names without any problem. The only components that don't work when using virtual host names are the Concurrent Managers. Concurrent Managers expect the node name defined in the Concurrent Manager definition screen to match the host name the FNDLIBR executable reads at the server level. Having the virtual host name as an alias in the hosts file on the server doesn't cut it for the FNDLIBR executable: FNDLIBR reads the host name of the server using the Unix gethostname system call.

This behaviour of FNDLIBR can be worked around by overriding the Unix gethostname system call using the LD_PRELOAD functionality. There is already a prebuilt program on GitHub to achieve this, called fakehostname. I have tested this and verified that it works with Oracle EBS 11i, R12.0 and R12.1 without any problem.

Here is a demo:

$ hostname
$ export LD_PRELOAD=/home/oracle/fakehost/
$ export MYHOSTNAME=ebsfakehost
$ hostname
$ export MYHOSTNAME=newebshost
$ hostname


This utility makes the concurrent managers think they are running on the virtual host by overriding the gethostname system call. This method of getting EBS to work with virtual host names no longer works with EBS R12.2. The reason is that EBS R12.2 is shipped as a mix of 32-bit and 64-bit components, whereas earlier releases of EBS like 11i, 12.0 and 12.1 are 32-bit only, even though they run on 64-bit platforms. We can get EBS R12.2 working by having both 32-bit and 64-bit versions of the fakehostname library in LD_PRELOAD, but then EBS emits too many warning messages about not being able to load 32-bit/64-bit libraries, which defeats the whole purpose of having a simple solution.

I am working on another way of getting virtual host names working in EBS R12.2. I will post that in my next blog post. Stay tuned!


Discover more about our expertise in the world of Oracle.

Categories: DBA Blogs

Advanced Compression Option Caveat in Oracle 12c

Tue, 2015-11-24 12:18


Oracle 12c introduced a new capability to move a partition online, without any interruptions to DML happening at the same time. But, there’s a catch. So far we’ve been able to use basic table compression without having to worry about any extra licensing – it was just a plain EE feature.

If you are planning to use the online partition move functionality, carefully check if you’re not using basic compression anywhere. For example:

create tablespace data datafile '+DATA' size 1g;

create user foo identified by bar
default tablespace data
quota unlimited on data;

grant create session, create table to foo;

connect foo/bar

create table test (x int, y varchar2(20))
partition by range (x)
(partition p1 values less than (100) tablespace data compress,
partition p2 values less than (200) tablespace data,
partition p3 values less than (300) tablespace data);

So we now have this, and our licensing is still as we know it:

select partition_name, compression, compress_for from user_tab_partitions;

We can use the new feature on partition p3:

alter table test move partition p3 online;

Or, we can use the traditional means to compress the partition p2:

alter table test move partition p2 compress;

But as soon as we do this move “online”, we are required to purchase the Advanced Compression Option:

alter table test move partition p2 compress online;

And, even sneakier:
alter table test move partition p1 online;

Notice how partition p1 – which was previously compressed – was also moved online into a compressed format:

select partition_name, compression, compress_for from user_tab_partitions;


And that, therefore, required the Advanced Compression Option.

Also note that the usage of this feature is not caught by dba_feature_usage_statistics in my tests:

select name, currently_used from dba_feature_usage_statistics where lower(name) like '%compress%';

—————————————————————- —–
Oracle Advanced Network Compression Service FALSE
Backup ZLIB Compression FALSE
Backup BZIP2 Compression FALSE
Backup BASIC Compression FALSE
Backup LOW Compression FALSE
Backup MEDIUM Compression FALSE
Backup HIGH Compression FALSE
Segment Maintenance Online Compress FALSE
Compression Advisor FALSE
SecureFile Compression (user) FALSE
SecureFile Compression (system) FALSE
HeapCompression FALSE
Advanced Index Compression FALSE
Hybrid Columnar Compression FALSE
Hybrid Columnar Compression Row Level Locking FALSE

15 rows selected.

I also tried to bounce the database and the data wasn’t updated in my tests. I would’ve expected this to show up under “Segment Maintenance Online Compress”, but in my tests, it did not.

This feature restriction isn’t documented anywhere in the official product documentation – at least not that I could find. The only place where I could find this information was in this Oracle document.


Discover more about our experience in the world of Oracle.

Categories: DBA Blogs

Implementing Fuzzy Search in SQL Server – Part 2: Levenshtein Distance

Mon, 2015-11-23 15:18


The Levenshtein Distance, as discussed in my last post, is a way to measure how far apart two strings are. There are several T-SQL implementations of this functionality, as well as many compiled versions. In addition, the MDS library in SQL Server has a Similarity function which uses the Levenshtein Distance to find how similar two words are.

In this post, I’ve done a simple comparison of performance using a C# CLR implementation of Levenshtein Distance (The code is from the Wiki), and a well written T-SQL implementation from Arnold Fribble.

As many of you might expect, the C# implementation is much quicker, needing only 2,504 ms to run through a dictionary table of 203,118 words. The T-SQL implementation took 42,718 ms for the same work.

A comparison of two ways to implement the Levenshtein Distance

Levenshtein Distance Comparison


Levenshtein Distance CLR

To implement the Levenshtein Distance CLR, run this SQL Script


Levenshtein Distance T-SQL

To implement the Levenshtein Distance in T-SQL, run the code below. Please note that this function has a cut-off value (@d) at which it simply gives up and returns -1.


CREATE FUNCTION edit_distance_within(@s nvarchar(4000), @t nvarchar(4000), @d int)
RETURNS int
AS
BEGIN
  DECLARE @sl int, @tl int, @i int, @j int, @sc nchar, @c int, @c1 int,
    @cv0 nvarchar(4000), @cv1 nvarchar(4000), @cmin int
  SELECT @sl = LEN(@s), @tl = LEN(@t), @cv1 = '', @j = 1, @i = 1, @c = 0
  WHILE @j <= @tl
    SELECT @cv1 = @cv1 + NCHAR(@j), @j = @j + 1
  WHILE @i <= @sl
  BEGIN
    SELECT @sc = SUBSTRING(@s, @i, 1), @c1 = @i, @c = @i, @cv0 = '', @j = 1, @cmin = 4000
    WHILE @j <= @tl
    BEGIN
      SET @c = @c + 1
      SET @c1 = @c1 - CASE WHEN @sc = SUBSTRING(@t, @j, 1) THEN 1 ELSE 0 END
      IF @c > @c1 SET @c = @c1
      SET @c1 = UNICODE(SUBSTRING(@cv1, @j, 1)) + 1
      IF @c > @c1 SET @c = @c1
      IF @c < @cmin SET @cmin = @c
      SELECT @cv0 = @cv0 + NCHAR(@c), @j = @j + 1
    END
    IF @cmin > @d BREAK
    SELECT @cv1 = @cv0, @i = @i + 1
  END
  RETURN CASE WHEN @cmin <= @d AND @c <= @d THEN @c ELSE -1 END
END
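For readers who want to sanity-check results outside SQL Server, here is a minimal plain-Python sketch of the same dynamic-programming algorithm, including the cut-off behaviour described above: once every cost in a row exceeds the threshold d, it gives up and returns -1. This is a reference implementation for verification, not tuned for speed.

```python
# Levenshtein distance with a cut-off, mirroring the T-SQL semantics:
# return the edit distance if it is <= d, otherwise -1.

def edit_distance_within(s, t, d):
    prev = list(range(len(t) + 1))            # cost row for the empty prefix of s
    for i, sc in enumerate(s, start=1):
        cur = [i]                             # cost of deleting i chars of s
        for j, tc in enumerate(t, start=1):
            cost = 0 if sc == tc else 1
            cur.append(min(cur[j - 1] + 1,    # insertion
                           prev[j] + 1,       # deletion
                           prev[j - 1] + cost))  # substitution / match
        if min(cur) > d:                      # whole row over the threshold: give up
            return -1
        prev = cur
    return prev[-1] if prev[-1] <= d else -1

print(edit_distance_within("kitten", "sitting", 5))  # 3
print(edit_distance_within("kitten", "sitting", 2))  # -1
```

The early exit is what makes the cut-off worthwhile when scanning a large dictionary: rows stop being computed as soon as no candidate path can stay under the threshold.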


Discover more about our expertise in SQL Server

Categories: DBA Blogs

ORA-10173: Dynamic Sampling Time-Out Error

Mon, 2015-11-23 15:06


Recently, on a heavily used and freshly upgraded warehouse-type database, we started seeing lots of ORA-10173 errors dumped into the alert log. The information out there on this error is somewhat sparse, and it is often linked to Tuning Advisor functionality. Since we're not running that advisor on this database, a little digging was in order.

What I always do as a first step, when I am confronted with an error and am not certain why and exactly where it is raised, is to set up an error trap. In Oracle, this can be done by setting an errorstack event, like this:

alter session set events '10173 trace name errorstack level 1';

Please note: whenever possible you want to set this at the smallest scope possible: starting at the session level, then in a logon trigger for multiple sessions, and only when all that fails, at the system level using the alter system statement.

Once you have trapped an occurrence or two, you can disable it again by running:
alter session set events '10173 trace name errorstack off';

Upon activating this event, the next time this exception is raised Oracle will write a trace file to the diagnostics repository. The two most prevalent pieces of information in the trace file are the current SQL query:

----- Current SQL Statement for this session (sql_id=anbp9r5n8ysu6) -----
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) */ SUM(C1) FROM (SELECT /*+ qb_name("innerQuery") */ 1 AS C1 FROM (SELECT "INV"."CASE_UPC" "ITEM_1","INV"."WHS_NBR" "ITEM_2",SUBSTR("INV"."CASE_UPC",3,11) "ITEM_3" FROM "XX"."XX_WHS_INV" "INV","XX"."XX_CASE_DIM" "CD2","XX"."XX_WHS_DIM" "WD" WHERE "WD"."WHS_DESC" LIKE '%PEY%' AND "CD2"."WHS_NBR"="INV"."WHS_NBR" AND "CD2"."CASE_UPC"="INV"."CASE_UPC" AND "INV"."WHS_NBR"="WD"."WHS_NBR" GROUP BY "INV"."CASE_UPC","INV"."WHS_NBR",SUBSTR("INV"."CASE_UPC",3,11)) "VW_DIS_13") innerQuery

And the stack trace, which looks something like this:

skdstdst()+29 call kgdsdst() 7FFFA592F860 ? 000000000 ?
7FFFA5912120 ? 7FFFA5912238 ?
7FFFA592FF18 ? 7FFFA592F810 ?
ksedst()+112 call skdstdst() 7FFFA592F860 ? 000000000 ?
7FFFA5912120 ? 7FFFA5912238 ?
7FFFA592FF18 ? 7FFFA592F810 ?
dbkedDefDump()+1153 call ksedst() 000000000 ? 000000000 ?
7FFFA5912120 ? 7FFFA5912238 ?
7FFFA592FF18 ? 7FFFA592F810 ?
ksedmp()+26 call dbkedDefDump() 000000001 ? 000000000 ?
7FFFA5912120 ? 7FFFA5912238 ?
7FFFA592FF18 ? 7FFFA592F810 ?
dbkdaKsdActDriver() call ksedmp() 000000001 ? 000000000 ?
+880 7FFFA5912120 ? 7FFFA5912238 ?
7FFFA592FF18 ? 7FFFA592F810 ?
dbgdaExecuteAction( call dbkdaKsdActDriver() 7F482FDE8618 ? 7FFFA5932830 ?
)+319 7FFFA5912120 ? 7FFFA5912238 ?
7FFFA592FF18 ? 7FFFA592F810 ?
... ( rest omitted for clarity and we don't need the stack for the purpose of this post )

For my case, the interesting bit was the query that caused the error to be raised. This query provides us with some hints about what’s going on:

– We can see the DS_SVC comment at the beginning of the query
– We can see lots of hints, the more interesting ones being dynamic_sampling(0) and result_cache(snapshot=3600)

A little bit of research led to a new feature introduced in 12c: Dynamic Statistics (or Adaptive Statistics). This is an enhancement to dynamic sampling, where the optimizer can automatically choose to gather more statistics as part of the parse phase. The sampling is controlled internally with a time out, and if that time out is hit, ORA-10173 is raised to the alert log.

This means that these errors are generally safe to ignore, as they are raised purely internally and your application won't see these exceptions. Your query didn't fail, and won't fail. However, your query may well be spending more time parsing and sampling data than makes sense. My colleague Slava has already blogged about such a case here: Performance Problems with Dynamic Statistics in Oracle 12c, and there are other documented cases.

The feature can be controlled through the optimizer_dynamic_sampling parameter. If it’s set to the default value of 2, the optimizer may choose to raise that temporarily to 11 which enables these new dynamic features. You can tell by looking at an execution plan of a query using dbms_xplan.display_cursor for example, and looking at the notes section:

- dynamic statistics used: dynamic sampling (level=AUTO)

As always, to determine whether or not this feature is helpful or harmful to your applications: benchmark it. You probably don’t want to turn it off system-wide, but it may make a lot of sense to disable it for certain sessions, or certain queries in your application. The ORA-10173 you may see in the alert log however is no reason to panic, if your application is not negatively impacted. If you are mostly running queries that take minutes or longer to execute, spending a couple seconds gathering more statistics may not impact you. If on the other hand your queries are fast, and Oracle spends more time parsing and sampling than it actually takes to execute the query, or you are seeing loads and loads of ORA-10173, you may want to take action. Alternatively you may also resort to setting a _fix_control as mentioned in Slava’s post – but as always, double-check with Oracle support if setting it has the desired effect on your exact version of Oracle.

In essence, ORA-10173 indicates wasted work: Oracle spends time gathering data, hits the timeout, and then throws the partial data away because it is incomplete. This certainly isn’t optimal, but again, your mileage may vary as to whether it really impacts your applications. In our case, we are seeing thousands of them, and we have opened a case with Oracle, since the optimizer evidently keeps making the wrong choice somewhere. The bug is currently still with BDE, but I shall keep you posted if anything develops.

Another thing worth noting is that Oracle stores the results of these dynamic sampling queries in the result cache and marks them for expiration after a certain time has elapsed. The timeout that we are seeing in the result_cache(snapshot=3600) hint comes from the hidden parameter _optimizer_ads_result_cache_life, which defaults to 3600. Increasing the lifetime for which these results remain cached may also help in reducing the frequency of the dynamic sampling queries. But as always, before changing any hidden parameters, please consult with your local Oracle support representative!
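Purely as an illustration — and only after clearing it with Oracle Support — raising that lifetime could look like this:

```sql
-- Hypothetical example: extend the dynamic statistics result-cache
-- lifetime from the default 3600 seconds to 2 hours.
-- Underscore parameters are unsupported; check with Oracle Support first.
ALTER SYSTEM SET "_optimizer_ads_result_cache_life" = 7200 SCOPE = BOTH;
```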


Discover more about our expertise in the world of Oracle.

Categories: DBA Blogs

Implementing Fuzzy Search in SQL Server – Part 2: Levenshtein Distance

Mon, 2015-11-23 14:40

The Levenshtein Distance, as discussed in my last post, is a way to measure how far two strings are located from each other. There are several T-SQL implementations of this functionality, as well as many compiled versions. In addition, the MDS library in SQL Server has a Similarity function which uses the Levenshtein Distance to find how similar two words are.

In this post, I’ve done a simple performance comparison between a C# CLR implementation of the Levenshtein Distance (the code is from the Wiki) and a well-written T-SQL implementation from Arnold Fribble.

As many of you might expect, the C# implementation is much quicker, needing only 2,504 ms to run through a dictionary table of 203,118 words. The T-SQL implementation took 42,718 ms for the same work.

A comparison of two ways to implement the Levenshtein Distance



Levenshtein Distance CLR

To implement the Levenshtein Distance CLR function, run this SQL script.
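If you build the C# code yourself instead, deploying it follows the usual CLR registration pattern. This is only a sketch: the assembly path, class and method names below are assumptions, not the ones from the linked script.

```sql
-- Enable CLR integration (requires appropriate server permissions)
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO

-- Register the compiled assembly and bind a T-SQL function to it.
-- The path, assembly, class and method names are hypothetical.
CREATE ASSEMBLY LevenshteinClr FROM 'C:\clr\LevenshteinClr.dll'
WITH PERMISSION_SET = SAFE;
GO

CREATE FUNCTION dbo.clr_levenshtein(@s nvarchar(4000), @t nvarchar(4000))
RETURNS int
AS EXTERNAL NAME LevenshteinClr.[UserDefinedFunctions].clr_levenshtein;
GO
```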


Levenshtein Distance T-SQL

To implement the Levenshtein Distance in T-SQL, run the code below. Please note that this function takes a cut-off value (@d); once the distance is known to exceed it, the function simply gives up and returns -1.

CREATE FUNCTION edit_distance_within(@s nvarchar(4000), @t nvarchar(4000), @d int)
RETURNS int
AS
BEGIN
  DECLARE @sl int, @tl int, @i int, @j int, @sc nchar(1), @c int, @c1 int,
    @cv0 nvarchar(4000), @cv1 nvarchar(4000), @cmin int
  SELECT @sl = LEN(@s), @tl = LEN(@t), @cv1 = '', @j = 1, @i = 1, @c = 0
  -- Seed the previous row of the distance matrix, encoded as character codes
  WHILE @j <= @tl
    SELECT @cv1 = @cv1 + NCHAR(@j), @j = @j + 1
  WHILE @i <= @sl
  BEGIN
    SELECT @sc = SUBSTRING(@s, @i, 1), @c1 = @i, @c = @i, @cv0 = '', @j = 1, @cmin = 4000
    WHILE @j <= @tl
    BEGIN
      SET @c = @c + 1
      SET @c1 = @c1 - CASE WHEN @sc = SUBSTRING(@t, @j, 1) THEN 1 ELSE 0 END
      IF @c > @c1 SET @c = @c1
      SET @c1 = UNICODE(SUBSTRING(@cv1, @j, 1)) + 1
      IF @c > @c1 SET @c = @c1
      IF @c < @cmin SET @cmin = @c
      SELECT @cv0 = @cv0 + NCHAR(@c), @j = @j + 1
    END
    -- Give up early once every cell in the row exceeds the cut-off
    IF @cmin > @d BREAK
    SELECT @cv1 = @cv0, @i = @i + 1
  END
  RETURN CASE WHEN @cmin <= @d AND @c <= @d THEN @c ELSE -1 END
END
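Once created, the function can be called like any scalar UDF. The Levenshtein distance between “kitten” and “sitting” is 3 (two substitutions and one insertion), so a cut-off of 3 or more yields the distance, while a smaller cut-off triggers the early exit:

```sql
SELECT dbo.edit_distance_within(N'kitten', N'sitting', 5);  -- returns 3
SELECT dbo.edit_distance_within(N'kitten', N'sitting', 2);  -- cut-off exceeded: -1
```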

Discover more about our expertise in SQL Server

Categories: DBA Blogs

What’s the Ratio of Your Marketing Budget to Cyber Security Budget?

Sun, 2015-11-22 19:45

This was a question asked over Twitter by @DanBarker to TalkTalk, the major British telco, after it managed to lose a significant portion of its customers’ details, apparently through an SQL injection attack carried out by a 15-year-old.
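For readers wondering what that class of attack looks like, here is a minimal T-SQL sketch — the table, column and input values are invented for illustration:

```sql
-- Vulnerable pattern: user input concatenated straight into dynamic SQL.
-- An input such as  ' OR 1=1 --  changes the meaning of the statement.
DECLARE @input nvarchar(100) = N''' OR 1=1 --';
DECLARE @bad nvarchar(400) =
    N'SELECT * FROM customers WHERE name = ''' + @input + N'''';
-- EXEC (@bad);  -- would return every row, not just the matching customer

-- Safer pattern: parameterized dynamic SQL via sp_executesql,
-- so the input is treated as a value, never as SQL text.
DECLARE @sql nvarchar(400) = N'SELECT * FROM customers WHERE name = @name';
EXEC sp_executesql @sql, N'@name nvarchar(100)', @name = @input;
```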

The question wasn’t answered, but the sooner companies realise that a security incident can wipe out a significant part of a brand’s goodwill, the sooner this ratio will adjust.

Here are three top tips to ensure you’re investing wisely in cyber security, and protecting your brand’s good name:

1. Keep everything patched and up to date – old databases have security holes that can lead to an attack. A new client of ours was running SQL Server 2000 and failed a pen test in 30 minutes. But a system doesn’t need to be that old to fail.

2. Audit and document everything. What data is where? Who or what has access? Do they need it? Are they still with the company? Not knowing what data might be lost was the major problem at the NSA post-Snowden. And within hours of the TalkTalk hack, the CEO said “I don’t know today whether all four million customers’ details have been stolen” (it was about 150,000 in the end, but by then the brand damage was done).

3. Check how employees and third party suppliers access your production environment, to make sure it’s from a safe, virus-free place. Can you see what they see? Do you know what they’re looking at?

Overall, to use Pythian’s tagline, just learn to “love your data”.

If your in-house team doesn’t have the skills and expertise to take care of these tasks, then find a company or contractor that does. The cost will be far less than a major security incident. And probably lower than your marketing budget too.

Categories: DBA Blogs