
Yann Neuhaus

All blog entries from http://www.dbi-services.com/

AOUG - Real World Performance Tour

Thu, 2015-03-26 13:26

This week, Tom Kyte, Graham Wood and Andrew Holdsworth were in Europe for several dates. One of the events was organised by the Austrian Oracle User Group (AOUG) in collaboration with the German and Swiss user groups (DOAG and SOUG), and I had the chance to attend one session of the Real World Performance Tour in Vienna.

Standard Edition on Oracle Database Appliance

Wed, 2015-03-25 11:15

The Oracle Database Appliance is really interesting for small enterprises. It offers very good hardware for a very good price, with capacity-on-demand licensing for Enterprise Edition. But small companies usually go for Standard Edition for cost reasons.

So does it make sense to propose only Enterprise Edition to the small companies that are interested in the ODA?

QlikView Tips & Tricks - part II

Thu, 2015-03-19 15:00

QlikView.png
In a previous blog entry, I started to share some tips & tricks about the installation and configuration of some QlikView components. I will try to complete this list with some other interesting things to know. In this blog entry, I will explain how to enable the Task Performance Summary, how to debug (or at least enable) the communication with Salesforce, and how to install the QlikView Management Console Connector (QlikView Management API).


For the whole blog entry, let's define the following values:

  • %SYS_PROFILE% = C:/Windows/System32/config/systemprofile
  • %QV_SRV_HOME% = D:/Apps/QlikView Server (defined during the QlikView Server installation)
  • %DS_DATA_HOME% = D:/QlikView/DistributionService (defined in the QMC: System > Setup > Distribution Services > QDS@hostname > General > Settings for QDS > Application Data Folder)
  • %PROXY_HOST% = your-proxy-url.domain.com
  • %PROXY_PORT% = 11210
  • %PROXY_PORT_S% = 11211
  • %HOST_FQDN% = your-qlikview-hostname.domain.com
  • %HOST_IP% = 160.160.20.20
  • %HOST_ALIAS% = qlikview-alias.domain.com
  • %QV_GRPS% = QV_GRP_SUP + QV_GRP_DEV + QV_GRP_ADM (some groups for QlikView, all under the domain "DOMAIN")
  • %QV_SYS_USER% = QV_SYS_USER (the "DOMAIN" user under which QlikView is running)


Each time you see one of these parameters or values in the text below, don't forget to replace it with YOUR OWN value. I only list them as an example and to help you find a match in your environment.

I. Task Performance Summary


Introduced in QlikView 11.2 SR7 and disabled by default, the Task Performance Summary is a feature of QlikView to analyse the performance of the Tasks (I'm sure you already understood that from the title!). This feature simply launches a little benchmark during the execution of all tasks to record some useful data such as:

  • Name of the process that ran the task with its PID
  • CPU used by the process (Average, Peak)
  • CPU used by the Windows Server (Peak)
  • Virtual RAM used by the process (Average, Peak)
  • Virtual RAM used by the Windows Server (Peak)
  • Physical RAM used by the process (Average, Peak)
  • Physical RAM used by the Windows Server (Peak)
  • Duration of the reload task


For debugging, performance analysis and performance improvements, it's pretty useful to have this kind of information. Enabling the Task Performance Summary is quite simple:

  1. Login to the Windows Server with any Administrator account
  2. Open the file: %SYS_PROFILE%/AppData/Roaming/QlikTech/QlikViewBatch/Settings.ini
    1. Add at the end: EnableQVBProcessSummary=1
    2. Add an empty line at the end of the file (VERY IMPORTANT: the last line MUST be an empty line)
  3. Open a command prompt as Administrator and execute the command: "%QV_SRV_HOME%/Distribution Service/qvb.exe"
  4. In the command prompt, execute another command: services.msc
  5. Restart all QlikView Services
  6. Open the folder: %DS_DATA_HOME%/TaskResults/
  7. Refresh the folder's content until there is a new file created (if nothing new appears, force the execution of a QlikView Task)
  8. Open the last created xml file and check that it contains a line which includes CPU and RAM consumption

Task1.png - Modification of the Settings.ini file to enable the Task Performance Summary

Task2.png - Line added in the TaskResults' xml files by the Task Performance Summary


That's it, the Task Performance Summary is now enabled and will record the performance of the future task's executions.

II. Communication with Salesforce


If it's your first time with the QlikView Salesforce Connector, then there is one essential thing to understand: the QlikView Salesforce Connector requires internet access to work. To be more precise, the connector must be able to access the Salesforce website to retrieve some data models that will be used by QlikView for any communication with a Salesforce data source. Therefore, if your enterprise network uses a proxy or anything else that prevents or restricts access to the internet, this will need to be addressed.


SalesForceIssueNW_P.png - Log file generated during a task execution when the proxy configuration isn't done

SalesForceIssueW_P.png - Log file generated during a task execution when the proxy configuration is properly done

In this section, I will describe how to configure your Windows Server to allow communication with the Salesforce website by configuring the proxy settings:

  1. The QlikView Salesforce Connector must be properly installed
  2. Login to the Windows Server with the account under which QlikView is running
  3. Open: Internet Explorer > Internet Options > Connections > LAN settings
    1. Click on: Use a proxy server for your LAN
    2. Click on: Bypass proxy server for local addresses
    3. Click on: Advanced
    4. HTTP: Address = %PROXY_HOST%
    5. HTTP: Port = %PROXY_PORT%
    6. Secure: Address = %PROXY_HOST%
    7. Secure: Port = %PROXY_PORT_S%
    8. Exceptions = 127.0.0.1;localhost;%HOST_FQDN%;%HOST_IP%;%HOST_ALIAS%
  4. Click on: OK (3 times)

SalesForce1.png - Configuration of the proxy in the Internet Explorer options


After this modification, the communication and therefore the reload of QlikView Documents using Salesforce as a Data Source should be successful.

III. QMC Connector


In QlikView, license management through the QlikView Management Console is by default quite difficult... That's why the QlikView Community is really active on this topic and some solutions have been developed to manage the QlikView licenses easily. The most common solution is to use the QMC Connector. This connector uses the QlikView Management API to manage a lot of things directly from a QlikView Document. The installation of this connector is quite easy:

  1. Download the QlikView Connector: QVSManager.zip
  2. Login to the Windows Server with any Administrator account
  3. Extract the QVSManager zip file into: C:/Program Files/Common Files/QlikTech/Custom Data/. This will create the folder "QVSManager", which should contain 4 files
  4. Create a Windows local group:
    1. Open a command prompt as Administrator and execute: compmgmt.msc
    2. Open: System Tools > Local Users and Groups > Groups
    3. Create a new group with the following entries:
      1. Name = QlikView Management API
      2. Description = QlikView Management API
      3. Members = %QV_GRPS% + %QV_SYS_USER%
    4. Click on: Create
  5. Restart all QlikView Services
  6. Deploy the QlikView Document: QVSCALManager.qvw (I put it in a zip file for convenience)

QMCConnector1.png - Deployment of the QVSManager.zip file in the "Custom Data" directory of QlikView

QMCConnector2.png - Creation and configuration of the Windows Server's local group for the QVSManager

Once this new QlikView Document is ready (document deployed, new task created and scheduled, and so on), you should be able to see it in the QlikView AccessPoint and manage the QlikView licenses directly from this new application. Believe me, it will greatly facilitate your work!


Okay, that was my second blog entry about QlikView, I guess that's enough for now :). I hope some of you found these tips useful and if needed, don't hesitate to leave me a little comment below and I'll do my best to help you. See you soon!

SQL Server: Change Data Capture for Oracle

Thu, 2015-03-19 02:55

This feature was introduced in SQL Server 2012 and requires Enterprise Edition. It uses a Windows service which scans the Oracle logs and tracks DML changes from Oracle tables into SQL Server change tables.
In other words, Change Data Capture records the Insert, Update and Delete activity applied to Oracle tables.
Let's see how it works.

Prerequisites

The database where you want to capture changes must be in ARCHIVELOG mode and OPEN.
The user which will be used to connect to the Oracle database must have DBA privileges.
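As a quick sanity check, both prerequisites can be verified from the Oracle side. This is only a sketch; cdc_user is a hypothetical account name standing in for the login the CDC service will use.

-- Check that the source database is OPEN and running in ARCHIVELOG mode
SELECT log_mode, open_mode FROM v$database;

-- Hypothetical example: give the CDC connection account DBA privileges
GRANT DBA TO cdc_user;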

Change Data Capture services installation

Installation is done via msi packages. Those packages are not installed automatically with SQL Server 2012 or 2014. You will find them on your installation media under Tools\AttunityCDCOracle\x64\1033.
There are two msi packages, one for the CDC Service Configuration named AttunityOracleCdcService.msi and another for CDC Designer named AttunityOracleCdcDesigner.msi.

Double click on AttunityOracleCdcService.msi and install the package, same for AttunityOracleCdcDesigner.msi.

b2ap3_thumbnail_OracleCDCSercice.jpg

b2ap3_thumbnail_OracleCDCDesigner.jpg

When both installations are done, go to the Application Panel and launch “Oracle CDC Service Configuration”.

b2ap3_thumbnail_OracleCDCSercice2.jpg

We now have to prepare our SQL Server instance. Each Oracle CDC Service instance is associated with a single SQL Server instance which will be used to manage it.
To create this instance click on "Prepare SQL Server" or right click on Local CDC Service and select “Prepare SQL Server”:

b2ap3_thumbnail_CDCSQLServerPrepare.jpg

Once it is done select the SQL Server instance where you want to install the MSXDBCDC database and click on the Run button:

b2ap3_thumbnail_CDCSQLServerPrepare2.jpg

The database is created:

b2ap3_thumbnail_CDCSQLServerPrepare3.jpg

Let's check from SQL Server Management Studio the newly created database:

b2ap3_thumbnail_CDCSQLServerPrepare4.jpg

At this point we can create a new CDC service:

b2ap3_thumbnail_OracleCDCSercice3.jpg

To create the Windows service, we have to provide the following information:

  • The service name
  • The service account (here, a local system account)

In addition, we must provide the following information for the associated SQL Server instance:

  • Server name
  • Authentication and login

The next step consists in creating a master password for the CDC service, which will be used to create a symmetric key.

b2ap3_thumbnail_OracleCDCSercice4.jpg

After that we will create an instance from the CDC Designer.
When I open it, I have to enter the SQL Server instance I created before, which is associated with the CDC service. Credentials are required to connect to the concerned SQL Server instance.

b2ap3_thumbnail_OracleCDCSercice6.jpg

After connecting to the SQL Server I can see all related Oracle CDC Services. Next, after selecting the OracleCDCService1, I will create a new instance for this service:

b2ap3_thumbnail_OracleCDCInstance1.jpg

Provide a name for the future CDC instance and then create the associated change database:

b2ap3_thumbnail_OracleCDCInstance2.jpg

After clicking on the “Create Database” button, the wizard assists us in creating the new Oracle CDC instance and the change database.

b2ap3_thumbnail_OracleCDCSercice7.jpg

Click Next.
We now have to fill out the form with the following information:

  • the Oracle connection string to our listener
  • user name
  • password

b2ap3_thumbnail_OracleCDCSercice8.jpg

Check if the connection to the source is successful:

b2ap3_thumbnail_OracleCDCSercice9.jpg

Click Next and then click on the Add button to select tables and columns for capturing changes.

b2ap3_thumbnail_OracleCDCSercice10.jpg

Select the schema and click on the Search button.
Select the SCOTT.EMP table and click on the Add button.

b2ap3_thumbnail_OracleCDCSercice11.jpg

A message indicates that the SCOTT.EMP table has been added to the list:

b2ap3_thumbnail_OracleCDCSercice12.jpg

Now, you can see the table in the list:

b2ap3_thumbnail_OracleCDCSercice13.jpg

The Capture instance column, here SCOTT_EMP, will be used to name the capture instance specific to each table object in my SQL Server database.

At this point, no CDC gating role is specified so it means that no gating role will be used to limit access to the change data.

By default, all the columns of the SCOTT.EMP table are selected for CDC. If I want to select only some columns, I can click on the Edit button and choose the columns I want in CDC.
Click on OK when your selection is finished and then click on Next:

b2ap3_thumbnail_OracleCDCSercice17.jpg

To be able to capture Oracle changes, supplemental logging has to be set up for the Oracle database tables. For this purpose, a script has been generated automatically; it can be run immediately or later, but be aware that changes will not be captured until the script is executed.
Click on the “Run Script” button:

b2ap3_thumbnail_OracleCDCSercice14.jpg

In order to run the script, some credentials must be provided; then click on the Run button:

b2ap3_thumbnail_OracleCDCSercice15.jpg

The script has been executed with success:

b2ap3_thumbnail_OracleCDCSercice16.jpg

Click on Next and in the last screen click on the “Run” button to finalize the CDC process:

b2ap3_thumbnail_OracleCDCSercice18.jpg

It looks like I have some errors...

b2ap3_thumbnail_OracleCDCSercice19.jpg

Let’s click on Details:

b2ap3_thumbnail_OracleCDCSercice20.jpg

Apparently I forgot to set up my Oracle database for supplemental logging, let’s do it:

b2ap3_thumbnail_OracleCDCSercice21.jpg

Now, we can click on the Next button:

b2ap3_thumbnail_OracleCDCSercice22.jpg

We have successfully created an Oracle CDC instance. Let’s check on my SQL Server:

b2ap3_thumbnail_OracleCDCSercice23.jpg

Conceptually, the mirror table will always remain empty. The generated deployment script denies all DML permissions on the mirror table.

I have a new database named OracleCDCInstance1 with, for the moment, an empty table named SCOTT.EMP, which is the table I selected earlier from the wizard.
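If you want to peek at the captured changes directly with T-SQL, the change table created for the capture instance can be queried. This is a sketch that assumes the standard CDC naming convention (cdc.<capture_instance>_CT), so here cdc.SCOTT_EMP_CT:

USE OracleCDCInstance1;
GO
-- List the change rows captured so far for the SCOTT_EMP capture instance
SELECT * FROM cdc.SCOTT_EMP_CT;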

Back to my CDC Designer, I see that I have now an OracleCDCInstance1 which is for the moment not started:

b2ap3_thumbnail_OracleCDCSercice24.jpg

Now let's start the instance:

b2ap3_thumbnail_OracleCDCSercice25.jpg

The Detailed status has changed from Initial to IDLE:

b2ap3_thumbnail_OracleCDCSercice26.jpg

It may take a few minutes (one or two) for the change capture process to start. You may notice the detailed status change from IDLE to PROCESSING. Likewise, the Processing and Counters areas will also change as the Oracle logs are read.
Here I performed an update of the salary column of my table SCOTT.EMP in my Oracle database:

b2ap3_thumbnail_OracleCDCSercice28.jpg

We can see that the twelve operations are reflected in the Counters area as twelve reads:

b2ap3_thumbnail_OracleCDCSercice27.jpg

If the Detailed Status changes to LOGGER instead of PROCESSING, it means either that there is a temporary delay in mining the Oracle logs, in which case PROCESSING will come back quickly, or that there is a real problem mining the Oracle logs, in which case you should check the dbo.xdbcdc_trace table in the MSXDBCDC database for errors.
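For example, a quick way to inspect that trace is shown below (a sketch; the exact column list depends on your version, hence the SELECT *):

USE MSXDBCDC;
GO
-- Look for recent status and error records reported by the Oracle CDC service
SELECT TOP (100) * FROM dbo.xdbcdc_trace;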

We are also able to check information from the log traces by clicking on the “Collect diagnostics” link, which generates diagnostics data from both the Oracle environment and the dbo.xdbcdc_trace table into a trace file:

b2ap3_thumbnail_OracleCDCSercice29.jpg

This feature gives us the opportunity to use Change Data Capture in a SQL Server database with an Oracle database as a source. A major advantage of using CDC for Oracle is certainly the reduction of data-level latency and, at the same time, Oracle ETL knowledge is no longer required. It is also a good way to feed a SQL Server database with Oracle data.
Hope this article will help you, enjoy ;-)

A funny story with recompile hint at statement level

Wed, 2015-03-18 09:31

Last night, I had an interesting discussion with a French MVP friend of mine who faced a weird situation: a query using a statement-level RECOMPILE hint produced an execution plan that disappointed him. He told me we can simulate the same situation by using the AdventureWorks database and the Sales.SalesOrderHeader table.

First, we have to add a nonclustered index on the orderdate column as follows:

 

CREATE NONCLUSTERED INDEX [idx_sales_salesorderheader_orderdate]
ON [Sales].[SalesOrderHeader] ( [OrderDate] ASC )

 

Now, let me show you the query. In fact we will compare two queries and two potential behaviours. The first query uses a local variable and an inequality operator, while the second query is pretty much the same except that we add the RECOMPILE hint option.

 

declare @date datetime = getdate()

select top 1 SalesOrderID
from sales.salesorderheader
where orderdate >= @date
order by SalesOrderID
go

declare @date datetime = getdate()

select top 1 SalesOrderID
from sales.salesorderheader
where orderdate >= @date
order by SalesOrderID
option(recompile)

 

Let’s compare the estimated execution plans of both queries. In fact, the estimated plan produced by the query optimizer is the same in both cases.

 

blog_35_-_1_-_estimated_execution_plan

 

Next, let’s compare their actual execution plans.

--> Concerning the first query:

blog_35_-_2_-_real_execution_plan_query_1

 

--> Concerning the second query (with recompile hint):


blog_35_-_3_-_real_execution_plan_query_2

 

Interesting, isn’t it? For the same query, except for the recompile option, the query optimizer has decided to use an index scan operator in the first case and an index seek in the second case. At this point, of course, we suppose that the recompile hint affects the query optimizer's decision, but how far? Well, the answer lies in the way the query optimizer handles the variable @date.

In the first query, the query optimizer is not aware of the @date variable's value at compile time.

 

blog_35_-_4_-_real_execution_plan_query_1_-_predicate

 

 

It means that it will not be able to use the density information, and instead it will use the standard guess of 30% selectivity for inequality comparisons. If we take a look at the statistics object of the primary key, we can see that 30% of the total rows is 31465 * 0.3 ≈ 9439.

 

dbcc show_statistics ('sales.salesorderheader', 'PK_SalesOrderHeader_SalesOrderID') with stat_header;

 

blog_35_-_6_-_statistic_primary_key

 

But at this point, we may wonder why SQL Server is using the primary key rather than, for example, the index on the orderdate column. In fact, the primary key is a good candidate because this query includes an ORDER BY clause (order by SalesOrderID). It means that the data is already ordered by SalesOrderID, and the query optimizer doesn’t need to perform an extra step that consists in sorting the data before applying the top operator, as shown below:

 

declare @date datetime = getdate()

select top 1 SalesOrderID
from sales.salesorderheader
where orderdate >= @date
order by SalesOrderID

select top 1 SalesOrderID
from sales.salesorderheader with (index (idx_sales_salesorderheader_orderdate))
where orderdate >= @date
order by SalesOrderID
go

 

blog_35_-_7_-_query_1_-_sort

 

Let’s continue with the second query and notice how SQL Server handles the @date variable this time: the local variable value is turned into a parameter, as shown below:

 

blog_35_-_8_-_real_execution_plan_query_2_-_predicate

 

Hmm… does it mean that SQL Server is aware of the local variable value when using the statement-level RECOMPILE hint? In fact, yes it is, and this detail changes the story completely, because in this case SQL Server is able to use the histogram of the index idx_sales_salesorderheader_orderdate. However, we are in a situation where the predicate value is not represented in the histogram, and SQL Server will use an estimate of 1 row, which implies an index seek operator. Likewise, we may wonder why SQL Server has decided to seek the index idx_sales_salesorderheader_orderdate with a TopN Sort operator here. Once again, the query optimizer is cost-based and this strategy is surely the less costly one. Let’s demonstrate it by comparing both strategies:

 

declare @date datetime = getdate()

select top 1 SalesOrderID
from sales.salesorderheader with (index (PK_SalesOrderHeader_SalesOrderID))
where orderdate >= @date
order by SalesOrderID
option(recompile)

select top 1 SalesOrderID
from sales.salesorderheader
where orderdate >= @date
order by SalesOrderID
option(recompile)

 

blog_35_-_9_-_real_execution_plan_query__-_force_using_pk

 

Let’s take a look at the CPU and IO statistics of both queries.

--> Concerning the first query (clustered index scan operator)

Table 'SalesOrderHeader'. Scan count 1, logical reads 689, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:
   CPU time = 16 ms, elapsed time = 2 ms.
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 1 ms.

 

--> Concerning the second query (index seek operator). Yes, in this case you only have to read 2 pages (the index root page and one data page at the leaf level)

Table 'SalesOrderHeader'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:
   CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

 

The bottom line: in certain cases we may be surprised by the choices made by the query optimizer, but most of the time we can trust it, and this is what I wanted to point out in this blog post! The truth often lies elsewhere :-)

Happy optimization!


Use a wallet to encrypt Oracle client passwords

Tue, 2015-03-17 06:53

In our more and more connected world, security is a big concern. Storing passwords is a very important topic, because if a machine is compromised, we can assume that any password stored in clear text is also compromised. I had a question from a customer: how can we encrypt the passwords in our application and monitoring scripts? In this blog we will discuss one solution: using the Oracle client wallet.

12c: shutdown abort a PDB?

Sun, 2015-03-15 23:56

Can we shutdown abort a PDB? Let's try:

SQL> show con_id

CON_ID
------------------------------
3

SQL> shutdown abort;
Pluggable Database closed.


But is it really a shutdown abort?

Index on trunc(date) - do you still need old index?

Sun, 2015-03-15 15:16

Sometimes we have to index on ( trunc(date) ) because a SQL statement uses a predicate on it instead of giving a range from midnight to midnight. When you do that, you probably keep the index on the column itself. That's two indexes to maintain for DML. Do we still need both?
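To make the question concrete, here is a sketch; the table and column names are made up for illustration:

-- Existing index on the raw column, typically used for range predicates
CREATE INDEX demo_orders_date_ix ON demo_orders (order_date);

-- Function-based index required when the application filters on trunc(order_date)
CREATE INDEX demo_orders_trunc_date_ix ON demo_orders (trunc(order_date));

-- This predicate can only use the function-based index
SELECT COUNT(*) FROM demo_orders WHERE trunc(order_date) = DATE '2015-03-15';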

ODA 12.1.X.X.X - add a multiplexed control file under ACFS

Thu, 2015-03-12 07:43

Since version 12, the ODA stores databases on ACFS volumes instead of directly in ASM. This slightly changed the way the files are managed and administered. This article presents how to multiplex your control files on ACFS.

Venturing into bulk insert a SQL Server error log and data order

Thu, 2015-03-12 02:21

Have you ever attempted to bulk import a SQL Server error log in order to use the information inside a report, for example? If yes, you have probably wondered how to keep the data in the correct order in a query, because you cannot refer to any column of the table. In such a case you may notice that you can have many records with the same date. Of course, some workarounds exist, but this is not the subject of this blog. Instead, I would like to share with you an interesting discussion I had with a forum member about the guarantee of getting the SQL Server error log data in order with a simple SELECT statement without an ORDER BY clause.

Let’s begin with the following script, which bulk imports data from a SQL Server error log file into a table:

CREATE TABLE ErrLog ( LogCol NVARCHAR(max) NULL )

 

BULK INSERT ErrLog
FROM 'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Log\ERRORLOG.1'
WITH ( FIRSTROW = 6, DATAFILETYPE = 'widechar' )


You may notice that we use the FIRSTROW argument to begin at the 6th line and skip the header information shown below:

2015-03-07 13:57:41.43 Server     Microsoft SQL Server 2014 - 12.0.2370.0 (X64) Jun 21 2014 15:21:00 Copyright (c) Microsoft Corporation Enterprise Edition (64-bit) on Windows NT 6.3 (Build 9600: ) (Hypervisor)

 

In addition, using DATAFILETYPE = 'widechar' is mandatory in order to bulk import Unicode data.

Let’s continue: after bulk importing the data, let’s take a look at the data itself inside the table. You will probably get the same kind of sample information as follows:

 

SELECT        LogCol FROM ErrLog;

 

blog_34_-_1_-_bulk_result

 

Comparing the record order between the SQL Server error log file and the table tends to show that the order is the same. At this point, we may wonder how to number the records inside the table without affecting the table order. Indeed, numbering the records in the table allows us to control the order of the data by using the ORDER BY clause. So let's go ahead and use the ROW_NUMBER() function to meet our requirement. You may notice that I use an “artificial” ORDER BY clause inside the window function to avoid interfering with the original order in which the data is retrieved from my table.

 

SELECT
       ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS numline,
       LogCol
FROM ErrLog;

 

blog_34_-_2_-_bulk_result_with_row_number

 

At this point, the forum member told me that we cannot guarantee the order of data without using an ORDER BY clause, and once again it seems that we get the same order as with the previous query, but can I trust it? I completely agree with this forum member and I tend to give the same advice. However, in this specific case the order seems to be correct, but why?

If you take a look at the first script, the first operation consisted in creating a heap table. This detail is very important. Then, the bulk insert operation reads the file sequentially and inserts the data in the allocation order of the pages. It means that the data insertion order is tied to the allocation order of the pages specified in the IAM page. So when we perform a table scan operation to number each row in the table (see the execution plan below), this operation is in fact performed by scanning the IAM page to find the extents that are holding the pages.

 

blog_34_-_21_-_execution_plan

 

As you know, the IAM represents extents in the same order as they exist in the data files, so in our case the table scan is performed using the same allocation page order. We can check it by catching the call stack of our query. We can see that the query optimizer uses a particular function called AllocationOrderPageScanner for our heap table.

 

blog_34_-_3_-_allocation_order_scan

 

So, to be clear, I don’t claim that in all cases we can blindly trust the data order without specifying the ORDER BY clause. In fact, I’m sure you will have to proceed differently most of the time, depending on your context and on different factors that do not guarantee the order of data without specifying the ORDER BY clause (bulk importing data from multiple files in parallel, reading data from a table with a b-tree structure under particular conditions, etc.).

Enjoy!

Analyzing easily the blocked process report

Sun, 2015-03-08 13:06

Which DBA has not yet faced a performance problem caused by several blocked processes? In reality, I’m sure very few of them. Troubleshooting a blocking scenario is not always easy and may require some useful tools to simplify this hard task. A couple of months ago, I had to deal with this scenario at one of my customers. During some specific periods of the business day, he noticed that his application slowed down and he asked me how to solve this issue.

Fortunately, SQL Server provides a useful feature to catch blocked processes: we just have to configure the “blocked process threshold (s)” server option. There are plenty of blogs that explain how to play with this parameter, so I'll let you perform your own investigation with your favourite search engine; a minimal configuration example is shown below.
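For reference, enabling it boils down to something like this (the 10-second threshold is an arbitrary value for the example):

-- Report any process blocked for more than 10 seconds
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'blocked process threshold (s)', 10;
RECONFIGURE;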

Having a blocked process report is useful, but often in such a situation there are a lot of processes that block each other, and we have to find the real culprit among this can of worms. So, my main concern was the following: how to extract information from the blocked process report and how to correlate all blocked processes together. After some investigation I found a useful script written by Michael J S Swart here. Usually I prefer to write my own script, but I didn't have the time and I have to admit this script met my needs perfectly. The original version provides the blocking hierarchy and the XML view of the issue. That's not bad, because we have all the information needed to troubleshoot our issue. However, my modification consists in replacing this XML view with useful information in tabular format, to make the final result easier to read. Here is the modified version of the script:

 

CREATE PROCEDURE [dbo].[sp_blocked_process_report_viewer_dbi] (        @Trace nvarchar(max),        @Type varchar(10) = 'FILE' )   AS   SET NOCOUNT ON   -- Validate @Type IF (@Type NOT IN('FILE', 'TABLE', 'XMLFILE'))        RAISERROR ('The @Type parameter must be ''FILE'', ''TABLE'' or ''XMLFILE''', 11, 1)   IF (@Trace LIKE '%.trc' AND @Type <> 'FILE')        RAISERROR ('Warning: You specified a .trc trace. You should also specify @Type = ''FILE''', 10, 1)   IF (@Trace LIKE '%.xml' AND @Type <> 'XMLFILE')        RAISERROR ('Warning: You specified a .xml trace. You should also specify @Type = ''XMLFILE''', 10, 1)          CREATE TABLE #ReportsXML (        monitorloop nvarchar(100) NOT NULL,        endTime datetime NULL,        blocking_spid INT NOT NULL,        blocking_ecid INT NOT NULL,        blocking_bfinput NVARCHAR(MAX),        blocked_spid INT NOT NULL,        blocked_ecid INT NOT NULL,        blocked_bfinput NVARCHAR(MAX),        blocked_waitime BIGINT,        blocked_hierarchy_string as CAST(blocked_spid as varchar(20)) + '.' + CAST(blocked_ecid as varchar(20)) + '/',        blocking_hierarchy_string as CAST(blocking_spid as varchar(20)) + '.' + CAST(blocking_ecid as varchar(20)) + '/',        bpReportXml xml not null,        primary key clustered (monitorloop, blocked_spid, blocked_ecid),        unique nonclustered (monitorloop, blocking_spid, blocking_ecid, blocked_spid, blocked_ecid) )   DECLARE @SQL NVARCHAR(max); DECLARE @TableSource nvarchar(max);   -- define source for table IF (@Type = 'TABLE') BEGIN        -- everything input by users get quoted        SET @TableSource = ISNULL(QUOTENAME(PARSENAME(@Trace,4)) + N'.', '')              + ISNULL(QUOTENAME(PARSENAME(@Trace,3)) + N'.', '')              + ISNULL(QUOTENAME(PARSENAME(@Trace,2)) + N'.', '')              + QUOTENAME(PARSENAME(@Trace,1)); END   -- define source for trc file IF (@Type = 'FILE') BEGIN         SET @TableSource = N'sys.fn_trace_gettable(N' + QUOTENAME(@Trace, '''') + ', -1)'; END   -- load table or file IF (@Type IN('TABLE', 'FILE')) BEGIN        SET @SQL = N'                    INSERT #ReportsXML(blocked_ecid, blocked_spid, blocked_bfinput , blocking_ecid, blocking_spid,                                 blocking_bfinput, blocked_waitime, monitorloop, bpReportXml,endTime)              SELECT                     blocked_ecid,                     blocked_spid,                     blocked_inputbuffer,                     blocking_ecid,                     blocking_spid,                     blocking_inputbuffer,                 blocked_waitime,                     COALESCE(monitorloop, CONVERT(nvarchar(100), endTime, 120), ''unknown''),                     bpReportXml,                     EndTime              FROM ' + @TableSource + N'              CROSS APPLY (                     SELECT CAST(TextData as xml)                     ) AS bpReports(bpReportXml)              CROSS APPLY (                     SELECT monitorloop = bpReportXml.value(''(//@monitorLoop)[1]'', ''nvarchar(100)''), blocked_spid = bpReportXml.value(''(/blocked-process-report/blocked-process/process/@spid)[1]'', ''int''), blocked_ecid = bpReportXml.value(''(/blocked-process-report/blocked-process/process/@ecid)[1]'', ''int''),                            blocked_inputbuffer = bpReportXml.value(''(/blocked-process-report/blocked-process/process/inputbuf/text())[1]'', ''nvarchar(max)''), blocking_spid = bpReportXml.value(''(/blocked-process-report/blocking-process/process/@spid)[1]'', ''int''), blocking_ecid = 
bpReportXml.value(''(/blocked-process-report/blocking-process/process/@ecid)[1]'', ''int''),                            blocking_inputbuffer = bpReportXml.value(''(/blocked-process-report/blocking-process/process/inputbuf/text())[1]'', ''nvarchar(max)''), blocked_waitime = bpReportXml.value(''(/blocked-process-report/blocked-process/process/@waittime)[1]'', ''bigint'')                     ) AS bpShredded              WHERE EventClass = 137';                     EXEC (@SQL); END   IF (@Type = 'XMLFILE') BEGIN        CREATE TABLE #TraceXML(              id int identity primary key,              ReportXML xml NOT NULL            )               SET @SQL = N'              INSERT #TraceXML(ReportXML)              SELECT col FROM OPENROWSET (                            BULK ' + QUOTENAME(@Trace, '''') + N', SINGLE_BLOB                     ) as xmldata(col)';          EXEC (@SQL);               CREATE PRIMARY XML INDEX PXML_TraceXML ON #TraceXML(ReportXML);          WITH XMLNAMESPACES        (              'http://tempuri.org/TracePersistence.xsd' AS MY        ),        ShreddedWheat AS        (              SELECT                     bpShredded.blocked_ecid,                     bpShredded.blocked_spid,                     bpShredded.blocked_inputbuffer,                     bpShredded.blocked_waitime,                     bpShredded.blocking_ecid,                     bpShredded.blocking_spid,                     bpShredded.blocking_inputbuffer,                     bpShredded.monitorloop,                     bpReports.bpReportXml,                     bpReports.bpReportEndTime              FROM #TraceXML              CROSS APPLY                     ReportXML.nodes('/MY:TraceData/MY:Events/MY:Event[@name="Blocked process report"]')                     AS eventNodes(eventNode)              CROSS APPLY                     eventNode.nodes('./MY:Column[@name="EndTime"]')                     AS endTimeNodes(endTimeNode)              CROSS APPLY                     eventNode.nodes('./MY:Column[@name="TextData"]')                     AS bpNodes(bpNode)              CROSS APPLY(                     SELECT CAST(bpNode.value('(./text())[1]', 'nvarchar(max)') as xml),                            CAST(LEFT(endTimeNode.value('(./text())[1]', 'varchar(max)'), 19) as datetime)              ) AS bpReports(bpReportXml, bpReportEndTime)              CROSS APPLY(                     SELECT                            monitorloop = bpReportXml.value('(//@monitorLoop)[1]', 'nvarchar(100)'),                            blocked_spid = bpReportXml.value('(/blocked-process-report/blocked-process/process/@spid)[1]', 'int'),                            blocked_ecid = bpReportXml.value('(/blocked-process-report/blocked-process/process/@ecid)[1]', 'int'),                            blocked_inputbuffer = bpReportXml.value('(/blocked-process-report/blocked-process/process/inputbuf/text())[1]', 'nvarchar(max)'),                            blocking_spid = bpReportXml.value('(/blocked-process-report/blocking-process/process/@spid)[1]', 'int'),                            blocking_ecid = bpReportXml.value('(/blocked-process-report/blocking-process/process/@ecid)[1]', 'int'),                            blocking_inputbuffer = bpReportXml.value('(/blocked-process-report/blocking-process/process/inputbuf/text())[1]', 'nvarchar(max)'),                            blocked_waitime = bpReportXml.value('(/blocked-process-report/blocked-process/process/@waittime)[1]', 'bigint')              ) AS bpShredded        )        INSERT 
#ReportsXML(blocked_ecid,blocked_spid,blocking_ecid,blocking_spid,              monitorloop,bpReportXml,endTime)        SELECT blocked_ecid,blocked_spid,blocking_ecid,blocking_spid,              COALESCE(monitorloop, CONVERT(nvarchar(100), bpReportEndTime, 120), 'unknown'),              bpReportXml,bpReportEndTime        FROM ShreddedWheat;               DROP TABLE #TraceXML   END   -- Organize and select blocked process reports ;WITH Blockheads AS (        SELECT blocking_spid, blocking_ecid, monitorloop, blocking_hierarchy_string        FROM #ReportsXML        EXCEPT        SELECT blocked_spid, blocked_ecid, monitorloop, blocked_hierarchy_string        FROM #ReportsXML ), Hierarchy AS (        SELECT monitorloop, blocking_spid as spid, blocking_ecid as ecid,              cast('/' + blocking_hierarchy_string as varchar(max)) as chain,              0 as level        FROM Blockheads               UNION ALL               SELECT irx.monitorloop, irx.blocked_spid, irx.blocked_ecid,              cast(h.chain + irx.blocked_hierarchy_string as varchar(max)),              h.level+1        FROM #ReportsXML irx        JOIN Hierarchy h              ON irx.monitorloop = h.monitorloop              AND irx.blocking_spid = h.spid              AND irx.blocking_ecid = h.ecid ) SELECT        ISNULL(CONVERT(nvarchar(30), irx.endTime, 120),              'Lead') as traceTime,        SPACE(4 * h.level)              + CAST(h.spid as varchar(20))              + CASE h.ecid                     WHEN 0 THEN ''                     ELSE '(' + CAST(h.ecid as varchar(20)) + ')'              END AS blockingTree,        irx.blocked_waitime,        bdp.last_trans_started as blocked_last_trans_started,        bdp.wait_resource AS blocked_wait_resource,        bgp.wait_resource AS blocking_wait_resource,        bgp.[status] AS blocked_status,        bdp.[status] AS blocking_status,        bdp.lock_mode AS blocked_lock_mode,        bdp.isolation_level as blocked_isolation_level,        bgp.isolation_level as blocking_isolation_level,        bdp.app AS blocked_app,        DB_NAME(bdp.current_db) AS blocked_db,        '-----> blocked statement' AS blocked_section,        CAST('' + irx.blocked_bfinput + '' AS XML) AS blocked_input_buffer,        CASE              WHEN bdp.frame_blocked_process_xml IS NULL THEN CAST('' + irx.blocked_bfinput + '' AS XML)              ELSE bdp.frame_blocked_process_xml        END AS frame_blocked_process_xml,        DB_NAME(bgp.current_db) AS blocking_db,        bgp.app AS blocking_app,        'blocking statement ----->' AS blocking_section,        CAST('' + irx.blocking_bfinput + '' AS XML) AS blocking_input_buffer,        CASE              WHEN bgp.frame_blocking_process_xml IS NULL THEN CAST('' + irx.blocking_bfinput + '' AS XML)              ELSE bgp.frame_blocking_process_xml        END AS frame_blocking_process_xml,        irx.bpReportXml from Hierarchy h left join #ReportsXML irx        on irx.monitorloop = h.monitorloop        and irx.blocked_spid = h.spid        and irx.blocked_ecid = h.ecid outer apply (        select              T.x.value('(./process/@waitresource)[1]', 'nvarchar(256)') AS wait_resource,              T.x.value('(./process/@lasttranstarted)[1]', 'datetime') as last_trans_started,              T.x.value('(./process/@lockMode)[1]', 'nvarchar(60)') as lock_mode,              T.x.value('(./process/@status)[1]', 'nvarchar(60)') as [status],              T.x.value('(./process/@isolationlevel)[1]', 'nvarchar(60)') as isolation_level,              
T.x.value('(./process/@currentdb)[1]', 'int') as current_db,              T.x.value('(./process/@clientapp)[1]', 'nvarchar(200)') as app,              cast(              (select SUBSTRING(txt.text,(ISNULL(T.x.value('./@stmtstart', 'int'), 0) / 2) + 1,                            ((CASE ISNULL(T.x.value('./@stmtend', 'int'), -1)                                   WHEN -1 THEN DATALENGTH(txt.text)                                   ELSE T.x.value('./@stmtend', 'int')                               END - ISNULL(T.x.value('./@stmtstart', 'int'), 0)) / 2) + 1) + CHAR(13) AS statement_txt                        from bpReportXml.nodes('//blocked-process/process/executionStack/frame') AS T(x)                        cross apply sys.dm_exec_sql_text(T.x.value('xs:hexBinary(substring((./@sqlhandle), 3))', 'varbinary(max)')) AS txt                        for XML path('')) as xml) AS frame_blocked_process_xml          from bpReportXml.nodes('//blocked-process') AS T(x) ) AS bdp outer apply (        select              T.x.value('(./process/@waitresource)[1]', 'nvarchar(256)') AS wait_resource,              T.x.value('(./process/@status)[1]', 'nvarchar(60)') as [status],              T.x.value('(./process/@isolationlevel)[1]', 'nvarchar(60)') as isolation_level,              T.x.value('(./process/@currentdb)[1]', 'int') as current_db,              T.x.value('(./process/@clientapp)[1]', 'nvarchar(200)') as app,              cast(              (select SUBSTRING(txt.text,(ISNULL(T.x.value('./@stmtstart', 'int'), 0) / 2) + 1,                            ((CASE ISNULL(T.x.value('./@stmtend', 'int'), -1)                                   WHEN -1 THEN DATALENGTH(txt.text)                                   ELSE T.x.value('./@stmtend', 'int')                               END - ISNULL(T.x.value('./@stmtstart', 'int'), 0)) / 2) + 1) + CHAR(13) AS statement_txt                        from bpReportXml.nodes('//blocking-process/process/executionStack/frame') AS T(x)                        cross apply sys.dm_exec_sql_text(T.x.value('xs:hexBinary(substring((./@sqlhandle), 3))', 'varbinary(max)')) AS txt                        for XML path('')) as xml) AS frame_blocking_process_xml               from bpReportXml.nodes('//blocking-process') AS T(x) ) AS bgp order by h.monitorloop, h.chain   DROP TABLE #ReportsXML

 

Unfortunately I can’t show my customer's context, so I will only show a sample from my own test to explain how we can use this script. The generated result set is split into three main sections.

First section: Hierarchy blocked tree, lock resources and transaction isolation level

 

blog_33_-_1_-_result_lock_section

 

Let’s begin with the first category. You can see here the hierarchy tree and the blocking interactions that exist between the different processes. The above picture shows that the process id = 72 is blocking the process id = 73. In turn, the process id = 73 is blocking other sessions (id = 75, 77). Furthermore, the process id = 74 is at the same level as the process id = 73 and is blocked by the process id = 72. Finally, the process id = 76 is blocked by the process id = 74. A real can of worms, isn’t it?

Displaying the blocking hierarchy tree is very useful in this case. In addition, I added the transaction isolation level used by all processes, the status of the processes, the locks and the resources related to the issue. As a reminder, these information are already in the blocked processes report and my task consisted in extracting these information in tabular format. We will use all of them later in this blog post. For the moment, let’s focus on the first hierarchy branch: 72 -> 73 -> 75 -> 77 and the resource that all concerned processes are hitting:

KEY: 6:72057594045595648 (089241b7b846), which we can split into three main parts:

6 : the database id = 6 => AdventureWorks2012 (a quick check is shown below)
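SELECT DB_NAME(6) AS database_name;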

72057594045595648 : the hobt id of the partition, which gives us the schema, table and index as follows:

select
       s.name as [schema_name],
       o.name as table_name,
       i.name as index_name
from sys.partitions as p
join sys.objects as o
       on p.object_id = o.object_id
join sys.indexes as i
       on i.object_id = p.object_id
       and i.index_id = p.index_id
join sys.schemas as s
       on s.schema_id = o.schema_id
where p.hobt_id = 72057594045595648

 

blog_33_-_2_-_partition

 

Person.Person.PK_Person_BusinessEntityID is a clustered index that includes the BusinessEntityID column.

 

(089241b7b846) :

This is the lock resource value that identifies the index key in the table Person.Person locked by the process id = 72. We can use the undocumented function %%lockres%% to locate the corresponding row in the table as follows:

 

select BusinessEntityID
from Person.Person
where %%lockres%% = '(089241b7b846)'

 

blog_33_-_3_-_lockres

 

At this point we know that the blocking process has started a transaction in the repeatable read isolation level and has not yet released the lock on the index key with value 14. This is why the session id = 73 is still pending: it attempts to access the same resource by putting an S lock on it.
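As a side note, while the blocking is still ongoing you can also get a live picture of the waiting sessions with a simple DMV query (a sketch, independent of the blocked process report itself):

SELECT r.session_id, r.blocking_session_id, r.wait_type, r.wait_resource, r.status
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;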

Let’s continue with the next sections of the result set:

 

Second section: blocking and blocked input buffers and their related frames

This second part provides detailed information about the blocked statements, including the concerned application and database as well.

 

blog_33_-_4_-_blocked_session_section

 

Likewise, the last part provides the same kind of information but for the blocking statement(s):

 

blog_33_-_5_-_blocking_session_section

 

We can now correlate the information from the above sections. For example, if we take a look directly at the blocking input buffer of the process id = 72, we discover the culprit, which is the following stored procedure:

 

<blockingInputBuffer>EXEC TestUpdatePersonNameStyle @NameStyle</blockingInputBuffer>

 

Next, the blocking frame identifies exactly the portion of code inside the stored procedure where the blocking issue has occurred:


WAITFOR DELAY '00:02:00';

 

Ok, it seems that the stored procedure has started an explicit transaction with the repeatable read isolation level and includes a WAITFOR DELAY command with a duration of 2 minutes. During this time, the different resources are still held by the transaction, because there is no commit or rollback and we are in the repeatable read isolation level. Let’s take a look at the stored procedure code:

 

ALTER PROCEDURE [dbo].[TestUpdatePersonNameStyle]
(
       @NameStyle BIT,
       @BusinessEntityID INT
)
AS

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRAN

SELECT PhoneNumber
FROM Person.PersonPhone
WHERE BusinessEntityID = @BusinessEntityID;

UPDATE Person.Person
SET NameStyle = @NameStyle
WHERE BusinessEntityID = @BusinessEntityID + 100;

WAITFOR DELAY '00:02:00';

ROLLBACK TRAN;

 

We can confirm what we found in the first section: the repeatable read isolation level used by the blocking session. In fact, two different resources are held by the above transaction: the first (index key = 14) and the second (index key = 14 + 100).

Now let’s switch to the blocked statement part. A quick look at the input buffer tells us that the session id = 73 is trying to access the same resource as the UPDATE part of the blocking process. It confirms what we saw in the first section: the process id = 73 is in the suspended state because it is trying to put an S lock on the concerned resource, which is not compatible with the X lock taken by the UPDATE statement of the process id = 72.


SELECT * FROM Person.Person WHERE BusinessEntityID = 114;  

 

I will not repeat the same demonstration for all the lines in the result set, but let’s finish with the process id = 74. Let’s go back to the first section. We can see that the session id = 74 is trying to put an X lock on the following resource:

KEY: 6:72057594045726720 (58e9f9de4ab6)

Let’s apply the same rule as earlier and we easily find the corresponding index key, this time in the table Person.PersonPhone.

 

blog_33_-_6_-_lockres

 

Now let’s continue to the next sections and let’s take a look at the blocking frame:

 


WAITFOR DELAY '00:02:00';

 

The same thing as in the first case… Finally, let’s take a look at the blocking input buffer:

 

BEGIN TRAN;

IF EXISTS(SELECT 1 FROM Person.Person WHERE BusinessEntityID = 14)
       DELETE FROM Person.PersonPhone WHERE BusinessEntityID = 14;

ROLLBACK TRAN;

 

This time, it concerns an explicit transaction but with a different isolation level: read committed. You can correlate this with the first section by yourself. The blocking point concerns only the second part of the above query, as indicated by the blocked_lock column in the first section: the process id = 74 is trying to put an X lock on a resource that is still held by the process id = 72 (the SELECT statement in the repeatable read isolation level).

The issue that I faced with my customer was pretty similar. In fact you just have to replace the WAITFOR DELAY command with a series of other pieces of code which drastically deferred the transaction commit. In this case, having a precise idea of the blocking tree and the other information readable directly in tabular format helped us save a lot of time in resolving this issue.

Happy troubleshooting!

Oracle compression, availability and licensing

Fri, 2015-03-06 15:03

Various methods of table compression have been introduced over the releases. Some require a specific storage system, some require specific options, and some are only for static data. And it's not always very clear, for the simple reason that their names have changed.

Names have changed for technical reasons (the ROW/COLUMN STORE precision when columnar compression was introduced) or for marketing reasons (COMPRESS FOR OLTP gave the idea that the other - Exadata - compression levels may not be suited for OLTP).

Of course that brings a lot of ambiguity such as:

  • HCC is called 'COLUMN STORE' even if it has nothing to do with the In-Memory columns store
  • COMPRESS ADVANCED is only one part of Advanced Compression Option
  • EHCC (Exadata Hybrid Columnar Compression) is not only for Exadata
  • COMPRESS FOR OLTP is not called like that anymore, but is still the only compression suitable for OLTP
  • HCC Row-Level Locking is not for ROW STORE but for COLUMN STORE. It's suited for DML operation but is different than FOR OLTP. Anyway COLUMN STORE compression can be transformed to ROW STORE compression during updates. And that locking feature is licenced with the Advanced Compression Option, and available in Exadata only... 
  • When do you need ACO (Advanced Compression Option) or not?

Let's make it clear here.

Generic query for multicriteria search - part II: BIND_AWARE (Adaptive Cursor Sharing)

Tue, 2015-03-03 14:52

In the previous post I explained the performance issue encountered when using a generic query to deal with optional search criteria on multiple columns. The statement was shared by all executions and was marked as bind sensitive, but it never became bind aware. Let's use the BIND_AWARE hint.
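Applied to the generic query from part I, the hint would look like this (a sketch):

SELECT /*+ BIND_AWARE */ *
FROM employees
WHERE (job_id = NVL(:job_id, job_id))
AND (department_id = NVL(:department_id, department_id))
AND (manager_id = NVL(:manager_id, manager_id))
AND (employee_id = NVL(:employee_id, employee_id));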

Generic query for multicriteria search - part I: USE_CONCAT (OR Expansion)

Tue, 2015-03-03 14:47

You have a multicriteria search screen on the EMPLOYEE table where you can enter an employee id, a department id, a manager id or a job id. Either you put the value you want to filter on, or you leave it null when you don't want to filter on it. How will you code that? You can build the query on the fly with dynamic SQL or use a generic query like this one:

       SELECT *
       FROM employees
       WHERE (job_id = NVL(:job_id, job_id))
       AND (department_id = NVL(:department_id, department_id))
       AND (manager_id = NVL(:manager_id, manager_id))
       AND (employee_id = NVL(:employee_id, employee_id))
This is good for code maintainability, but having a one-size-fits-all query will not be optimal for each case. Markus Winand (every database developer should read his book) describes the danger of that on his website: Use The Index, Luke
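As the title suggests, one alternative explored in this post is the USE_CONCAT hint, which asks the optimizer for OR expansion of the query. A hedged sketch of how it would be applied to the query above:

SELECT /*+ USE_CONCAT */ *
FROM employees
WHERE (job_id = NVL(:job_id, job_id))
AND (department_id = NVL(:department_id, department_id))
AND (manager_id = NVL(:manager_id, manager_id))
AND (employee_id = NVL(:employee_id, employee_id));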

RAC Attack! next month 12c in Las Vegas

Tue, 2015-03-03 14:39

 

b2ap3_thumbnail_5---Banner---Official-COLLABORATE-Speaker.jpg

 

RAC is the most complex installation you can have for an Oracle database. A RAC DBA is involved not only with the database, but with storage, network, and system as well. It also involves the application, to be sure that the application service can follow the database service's high availability. It also brings every database skill to the highest level: a small contention on a single-instance database can become a big bottleneck in RAC.

But RAC is also fascinating. It's the highest service availability: when correctly configured you can stop a node without any impact on your users. It's the highest scalability: you are not limited by the number of CPUs or the amount of memory that you can put in a server; just add a node. RAC is not new. Oracle 6 was already able to open the same database from several instances. It was called Parallel Server.

Do you think it's impossible to learn and practice that kind of infrastructure when you don't already have one in your data center? No. You can install and practice RAC on your laptop. This is what RAC Attack! is about: at various events, experienced RAC Attack volunteers (ninjas) help you address any related issues and guide you through the setup process, and you end up with a RAC on your laptop. Next month in Las Vegas is the IOUG event: COLLABORATE15. I'll be there as a speaker and I'm also very happy to help as a RAC Attack! Ninja.

Here you can find all information about it:

http://collaborate.ioug.org/precon#rac

Hope to see you there.  

 

 

Query the Enterprise Manager collected metrics

Mon, 2015-03-02 08:10

Enterprise Manager (Cloud Control for example) gathers a lot of metrics. You can display them from the GUI, but you can also query the SYSMAN views directly. Today, I wanted to get the history of free space in an ASM disk group for the previous week. Here is how I got it.
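A sketch of the kind of query involved is shown below. MGMT$METRIC_HOURLY is a documented SYSMAN repository view, but the target_type, metric_name and metric_column values used here for ASM disk group free space are assumptions to verify against your own repository:

-- Sketch: free space history of ASM disk groups over the last 7 days
-- (metric_name / metric_column / target_type values are assumptions to check,
--  e.g. against sysman.mgmt$metric_current)
SELECT rollup_timestamp, key_value AS disk_group, average AS free_mb
FROM   sysman.mgmt$metric_hourly
WHERE  target_type   = 'osm_instance'
AND    metric_name   = 'DiskGroup_Usage'
AND    metric_column = 'free_mb'
AND    rollup_timestamp > SYSDATE - 7
ORDER  BY key_value, rollup_timestamp;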

Using Windows 2012 R2 & dynamic witness feature with minimal configurations

Mon, 2015-02-23 15:30

Have you ever seen the following message while trying to validate your cluster configuration for your availability groups or FCIs with Windows Server 2012 R2?


blog_32_-_0_-_cluster_validation


 

Microsoft recommends adding a witness even if you have only two cluster members with dynamic weights. This recommendation may make sense regarding the new witness capabilities. Indeed, Windows 2012 R2 improves quorum resiliency with the new dynamic witness behavior. However, we need to be careful with it, and I would like to say at this point that I'm reluctant to recommend meeting this requirement on a minimal cluster configuration with only 2 nodes. In my case, it's very common to implement SQL Server AlwaysOn availability groups or FCI architectures with only two cluster nodes at customer sites. Let's talk about the reason in this blog post.

 

First of all, let’s demonstrate why I don't advise my customers to implement a witness by following the Microsoft recommendation. In my case it consists in adding a file share witness to my existing lab environment with two cluster nodes that use the dynamic weight behavior:


blog_32_-_1_-_cluster_2_nodes_configuration_nodeweight


 

Now let’s introduce a file share witness (\DC2WINCLUST-01) as follows:


blog_32_-_2_-_adding_FSW

 

We may notice after introducing the FSW that the node weight configuration has changed:

 

blog_32_-_3_-_cluster_new_configuration

 

 

blog_32_-_4_-_cluster_fsw_config

 

The total number of votes equals 3 here because we are in a situation where we have an even number of cluster members plus the witness. As a reminder, we are supposed to benefit from the dynamic witness feature, according to the Microsoft documentation here.

 

In Windows Server 2012 R2, if the cluster is configured to use dynamic quorum (the default), the witness vote is also dynamically adjusted based on the number of voting nodes in current cluster membership. If there is an odd number of votes, the quorum witness does not have a vote. If there is an even number of votes, the quorum witness has a vote.

 

The quorum witness vote is also dynamically adjusted based on the state of the witness resource. If the witness resource is offline or failed, the cluster sets the witness vote to "0."

 

The last sentence draws my attention, so now let’s introduce a failure of the FSW. In my case I will just turn off the share used by my WSFC as follows:

 

[Figure: disabling the file share]


 

As expected, the resource control manager has changed the file share witness state from online to failed:

 

[Figure: file share witness in failed state]

 

At this point, according to the Microsoft documentation, we may expect the cluster to change the WitnessDynamicWeight property, but to my surprise this was not the case:

 

[Figure: file share witness configuration, WitnessDynamicWeight unchanged]


 

In addition, after taking a look at the cluster log, I noticed this sample among the log records:

000014d4.000026a8::2015/02/20-12:45:43.594 ERR   [RCM] Arbitrating resource 'File Share Witness' returned error 67
000014d4.000026a8::2015/02/20-12:45:43.594 INFO  [RCM] Res File Share Witness: OnlineCallIssued -> ProcessingFailure( StateUnknown )
000014d4.000026a8::2015/02/20-12:45:43.594 INFO  [RCM] TransitionToState(File Share Witness) OnlineCallIssued-->ProcessingFailure.
000014d4.00001ea0::2015/02/20-12:45:43.594 INFO  [GEM] Node 1: Sending 1 messages as a batched GEM message
000014d4.000026a8::2015/02/20-12:45:43.594 ERR   [RCM] rcm::RcmResource::HandleFailure: (File Share Witness)
000014d4.000026a8::2015/02/20-12:45:43.594 INFO  [QUORUM] Node 1: PostRelease for ac9e0522-c273-4da8-99f5-3800637db4f4
000014d4.000026a8::2015/02/20-12:45:43.594 INFO  [GEM] Node 1: Sending 1 messages as a batched GEM message
000014d4.000026a8::2015/02/20-12:45:43.594 INFO  [QUORUM] Node 1: quorum is not owned by anyone
000014d4.000026a8::2015/02/20-12:45:43.594 INFO  [RCM] resource File Share Witness: failure count: 0, restartAction: 0 persistentState: 1.
000014d4.00001e20::2015/02/20-12:45:43.594 INFO  [GUM] Node 1: executing request locally, gumId:281, my action: qm/set-node-weight, # of updates: 1
000014d4.000026a8::2015/02/20-12:45:43.594 INFO  [RCM] numDependents is zero, auto-returning true
000014d4.00001e20::2015/02/20-12:45:43.594 WARN  [QUORUM] Node 1: weight adjustment not performed. Cannot go below weight count 3 in a hybrid configuration with 2+ nodes

 

The last line is the most important one. I guess that “hybrid configuration” here means my environment includes 2 cluster nodes and one witness (regarding its type). An interesting thing to notice is a potential limitation of the dynamic witness behavior: in this hybrid configuration, the weight adjustment is not performed because the total weight count cannot go below 3. Unfortunately, I didn't find any documentation from Microsoft about this message. Is it a bug, just a missing entry in the documentation, or have I overlooked something concerning the cluster behavior? At this point I can't tell, and I hope to get a response from Microsoft soon. The only thing I can claim at this point is that if I lose a cluster node, the cluster availability will be compromised. This issue is not specific to my lab environment; I have faced the same behavior several times at customer sites.

Let’s demonstrate by issuing a shutdown of one of my cluster node. After a couple of seconds, connection with my Windows failover cluster is lost and here what I found by looking at the Windows event log:


[Figure: event log entry showing the quorum was lost]

 

As I said earlier, with a minimal configuration of two cluster nodes, I always recommend my customers to skip this warning. After all, having only two cluster members with the dynamic quorum behavior is sufficient to get good quorum resiliency. Indeed, according to the Microsoft documentation, for the system to re-calculate the quorum correctly, a simultaneous failure of a majority of voting members must not occur (in other words, the failure of cluster members must be sequential), and with two cluster nodes we can only lose one node at a time in all cases.

What about more complex environments? Let's say an FCI with 4 nodes (two cluster nodes in each datacenter) and a file share witness in the first datacenter. In contrast, in this case, if the file share witness fails, the cluster will correctly adjust the overall node weight configuration, both on the cluster nodes and on the witness. This is completely consistent with the message found above: "Cannot go below weight count 3".


[Figure: quorum adjustment with 4 nodes]



 

The bottom line is that the dynamic witness feature is very useful, but you have to take care about its behavior with minimal configurations of only two cluster nodes, which may introduce unexpected results in some cases.

 

Happy cluster configuration!




12c online statistics gathering and startup restrict

Sat, 2015-02-21 16:15

I've written about 12c online statistics gathering in a UKOUG OracleScene article. My opinion is clear about it: you still need to gather stale stats afterwards or you will have missing, stale and inconsistent object statistics. This post is about cases where online statistics gathering does not occur (and that are not documented) - which is another reason why we can't rely on it.

The case where it works

You can check the article to see how online statistics gathering works (or come to our 12c new features workshop where we cover and practice all 12c optimizer new features).
In order to show something different here, I'm tracing it by activating the 0x10000 trace flag for dbms_stats:

SQL> connect demo/demo@//localhost/PDB1
Connected.
SQL> set serveroutput on
SQL> exec dbms_stats.set_global_prefs('TRACE',1+65536);
PL/SQL procedure successfully completed.

SQL> drop table DEMO;
Table dropped.

SQL> create table DEMO ( n number ) pctfree 99;
Table created.

SQL> insert /*+ append */ into DEMO select rownum from dual connect by 1000>=level;
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 1         1         1         to_char(count("N"))                        100
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 2         1         1         substrb(dump(min("N"),16,0,64),1,240)      9
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 3         1         1         substrb(dump(max("N"),16,0,64),1,240)      17
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 1         1         1         to_char(count("N"))                        100
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 2         1         1         substrb(dump(min("N"),16,0,64),1,240)      9
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 3         1         1         substrb(dump(max("N"),16,0,64),1,240)      17
DBMS_STATS: postprocess online optimizer stats gathering for DEMO.DEMO: save statis
DBMS_STATS: RAWIDX    SELMAPPOS RES                            NNV       NDV       
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 1         1          1000      1000      0         2891      1000
DBMS_STATS: RAWIDX    SELMAPPOS RES                            NNV       NDV       
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 2         2         Typ=2 Len=2: c1,2              NULL      NULL      
DBMS_STATS: RAWIDX    SELMAPPOS RES                            NNV       NDV       
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 3         3         Typ=2 Len=2: c2,b              NULL      NULL      
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 1         1         1         to_char(count("N"))                        100
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 2         1         1         substrb(dump(min("N"),16,0,64),1,240)      9
DBMS_STATS: SELMAPPOS CLISTIDX  INTCOL    SELITEM                                    GATHFLG
DBMS_STATS: ------------------------------------------------------------------------
DBMS_STATS: 3         1         1         substrb(dump(max("N"),16,0,64),1,240)      17

1000 rows created.

From the trace, online statistics gathering occurred for that direct-path load.
We can also see it in the execution plan:

SQL> select * from table(dbms_xplan.display_cursor('1k2r9n41c7xba'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
SQL_ID  1k2r9n41c7xba, child number 0
-------------------------------------
insert /*+ append */ into DEMO select rownum from dual connect by 1000>=level

Plan hash value: 1600317434

---------------------------------------------------------------------------------
| Id  | Operation                        | Name | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |      |       |     2 (100)|          |
|   1 |  LOAD AS SELECT                  |      |       |            |          |
|   2 |   OPTIMIZER STATISTICS GATHERING |      |     1 |     2   (0)| 00:00:01 |
|   3 |    COUNT                         |      |       |            |          |
|   4 |     CONNECT BY WITHOUT FILTERING |      |       |            |          |
|   5 |      FAST DUAL                   |      |     1 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------

and statistics are there:

SQL> select last_analyzed,num_rows,blocks from user_tables where table_name='DEMO';

LAST_ANAL   NUM_ROWS     BLOCKS
--------- ---------- ----------
21-FEB-15       1000        179
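
The same pass also computed basic column statistics (the min/max and NDV visible in the trace above). As a quick check, not part of the original output, you can query user_tab_col_statistics:

SQL> select column_name,num_distinct,num_nulls,last_analyzed from user_tab_col_statistics where table_name='DEMO';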

Don't forget to set the trace off:

SQL> exec dbms_stats.set_global_prefs('TRACE',0);
PL/SQL procedure successfully completed.

Ok. That is the known case. Table statistics are there.

 

startup restrict

When you want to do some maintenance while being sure that the application is not connected, you start the database in restricted mode.

SQL> alter system enable restricted session;
System altered.
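
As a side note, not in the original demo, you can verify that logins are now restricted before running the load:

SQL> select logins from v$instance;

LOGINS
----------
RESTRICTED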

Then you can do your imports, reorgs, bulk loads, etc. and be sure that nobody will write to or read from the table you are working on. Imagine you have tested the previous load and observed that the online gathered statistics are sufficient. Now you run the same thing in production in restricted mode.

SQL> connect demo/demo@//localhost/PDB1
Connected.
SQL> set serveroutput on
SQL> exec dbms_stats.set_global_prefs('TRACE',1+65536);
PL/SQL procedure successfully completed.

SQL> drop table DEMO;
Table dropped.

SQL> create table DEMO ( n number ) pctfree 99;
Table created.

SQL> insert /*+ append */ into DEMO select rownum from dual connect by 1000>=level;
1000 rows created.

No trace related to online statistics gathering.

SQL> select * from table(dbms_xplan.display_cursor('1k2r9n41c7xba'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
SQL_ID  1k2r9n41c7xba, child number 0
-------------------------------------
insert /*+ append */ into DEMO select rownum from dual connect by 1000>=level

Plan hash value: 1600317434

-------------------------------------------------------------------------------
| Id  | Operation                      | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | INSERT STATEMENT               |      |       |     2 (100)|          |
|   1 |  LOAD AS SELECT                |      |       |            |          |
|   2 |   COUNT                        |      |       |            |          |
|   3 |    CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   4 |     FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

no STATISTICS GATHERING operation.

SQL> select last_analyzed,num_rows,blocks from user_tables where table_name='DEMO';

LAST_ANAL   NUM_ROWS     BLOCKS
--------- ---------- ----------

and no statistics.

 

10053 trace

Because we can't see the STATISTICS GATHERING operation in the execution plan, I know that it's an optimizer decision made at compile time. I dumped the 10053 trace and got the following lines:

ONLINEST: Checking validity of online stats gathering
ONLINEST: Failed validity check: database not open, in restricted/migrate mode, suspended, readonly, instance not open or OCI not available.

So we have a few cases where online statistics gathering does not occur that are not documented under "Restrictions for Online Statistics Gathering for Bulk Loads", and restricted mode is one of them.

 

Thin JDBC

Because the preceding line mentions OCI, I wanted to be sure that online statistics gathering occurs even when connected through thin JDBC, and I used the sqlcl beta from SQL Developer 4.1 Early Adopter. Note that I'm not in restricted session anymore.

sql.bat demo/demo@//192.168.78.113/pdb1

SQLcl: Release 4.1.0 Beta on Sat Feb 21 21:10:59 2015

Copyright (c) 1982, 2015, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

SQL> show jdbc
-- Database Info --
Database Product Name: Oracle
Database Product Version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
Database Major Version: 12
Database Minor Version: 1
-- Driver Info --
Driver Name: Oracle JDBC driver
Driver Version: 12.1.0.2.0
Driver Major Version: 12
Driver Minor Version: 1
Driver URL: jdbc:oracle:thin:@//192.168.78.113/pdb1

SQL> create table DEMO ( n number ) pctfree 99;

Table DEMO created.

SQL> insert /*+ append */ into DEMO select rownum from dual connect by level<=1000;

1000 rows created.

SQL> select last_analyzed,num_rows,blocks from user_tables where table_name='DEMO';

LAST_ANALYZED                 NUM_ROWS     BLOCKS
--------------------------- ---------- ----------
21.02.15                          1000        100


Ok, no problem. I don't know exactly what that 'OCI not available' check is, but online statistics gathering works even through JDBC Thin.

 

Conclusion

As I already said for other reasons, don't rely on online statistics gathering and always gather stale stats afterwards. It's good to have it, as it saves some of the work dbms_stats has to do later. There are cases where it is better than no statistics at all (when combined with GTT private statistics, for example), but don't rely on it.
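
As a concrete example of what I mean by gathering stale stats afterwards (a sketch only, adapt the scope to your own schema):

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'DEMO');
SQL> -- or, at schema level, only what is stale or missing:
SQL> exec dbms_stats.gather_schema_stats(ownname=>user, options=>'GATHER AUTO');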

How to set NLS for SQL Developer

Sat, 2015-02-21 14:24

I've been using Oracle SQL Developer 4.1 Early Adopter for a while and I like it. That version comes with a command line (in beta) whose goal is to be fully compatible with sqlplus, but running in Java and with a lot more features.

Because it connects with the thin Java driver by default, it doesn't use NLS_LANG. It's Java. It's Unicode. So here is how to set the language and characterset with the Java options.
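
As an illustration only (the file location and values below are assumptions to adapt to your installation), the idea is to force the JVM locale properties from which the thin driver derives its NLS session settings:

# in sqldeveloper/bin/sqldeveloper.conf, add:
AddVMOption -Duser.language=en
AddVMOption -Duser.country=US

# for the sqlcl command line, the same properties can be passed through the JVM options, e.g. on Windows:
set JAVA_TOOL_OPTIONS=-Duser.language=en -Duser.country=US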

QlikView Tips & Tricks

Thu, 2015-02-19 08:00

For several months now, I've been working on some QlikView projects, which has been quite an interesting discovery for me. Generally, these projects are limited to the management of QlikView at the administrator level (installation, upgrade, configuration of the QlikView Management Console, and so on) but I was still able to accumulate some knowledge that I want to share with you today. In this blog entry, I will try to explain how to debug the QlikView Offline Service, how to properly configure the access to remote Shared Folders and how to enable Single Sign-On between QlikView and a third party software. I will try to describe the required steps as best I can to avoid any problems.


I. QlikView Offline Service for QlikView 11.2 SR7 or below


In a complete QlikView environment that is using SSL (I don't know if it can happen without SSL), if you try to set up the QlikView Offline Service, you may face an issue where the Offline Service doesn't work at all. This happens even if the component was installed successfully and even if there are no errors in the QlikView log files. The issue comes from the fact that, by default, QlikView enforces FIPS compliance when using the Offline Service, which can cause some problems depending on your enterprise network restrictions. After feedback on that point to the QlikView Support Team, they confirmed that it was a bug and fixed it in the next QlikView version (11.2 SR8 and above). A simple workaround for this issue can be set up by following these steps:

  1. SSL must be properly configured
  2. The QlikView Offline Service must be properly installed
  3. Login to the Windows Server with any Administrator account
  4. Open the file: C:/Windows/Microsoft.NET/Framework64/v4.0.30319/Config/machine.config
    1. Find the line with: ˂runtime /˃
    2. Replace this line with:
                ˂runtime˃
                         ˂enforceFIPSPolicy enabled="false" /˃
                ˂/runtime˃
  5. Save the file
  6. Open a command prompt as Administrator and execute the command: services.msc
  7. Restart all QlikView Services

Modification of the machine.config file to disable the FIPS enforcement


After doing so, you should be able to access your QlikView documents from a smartphone or a tablet to work offline.


II. Access to remote Shared Folders


As before, depending on your Windows Server GPOs, you may face some issues regarding the access to files stored on a remote Shared Folder (access via the user who runs QlikView). By remote I mean another city, country, continent or whatever. This tip can help to solve some Shared Folder access issues even if you aren't using QlikView; it's more of a Windows Server tip ;). Regarding QlikView, this issue can be easily found in the log file because you will see something like this during a task execution:

[Screenshot: shared folder access error in the QlikView task log]


The configuration I will show you below worked for me but, depending on your network restrictions, it may not work as-is. The important thing here is to understand each parameter and the consequences of this configuration:

  1. Login to the Windows Server with any Administrator account
  2. Open a command prompt as Administrator and execute the command: regedit
  3. Open: HKLM ˃ SYSTEM ˃ CurrentControlSet ˃ Services ˃ LanmanServer ˃ Parameters
    1. Set "enablesecuritysignature" to 1
    2. Set "requiresecuritysignature" to 1
  4. Open: HKLM ˃ SYSTEM ˃ CurrentControlSet ˃ Services ˃ LanmanWorkstation ˃ Parameters
    1. Set "EnableSecuritySignature" to 1
    2. Set "RequireSecuritySignature" to 0
  5. Reboot the Windows Server

Configuration of the LanmanServer registry keys to 1-1

Configuration of the LanmanWorkstation registry keys to 1-0


As you can see, there are two different sections named "LanmanServer" and "LanmanWorkstation":

  • LanmanServer controls the parameters of the current Windows Server when it acts as a Server
  • LanmanWorkstation controls the parameters of the current Windows Server when it acts as a Client


For example, if you access a remote Shared Folder from the QlikView Windows Server, then you are acting as a Client and therefore, with this configuration, you can access everything whatever the LanmanServer configuration of the Shared Folder's Windows Server is. Indeed, the local SecuritySignature is enabled but not required (Enable=1, Required=0, which I will shorten as "1-0"), so it's the most generic case, which covers all possible LanmanServer configurations of the remote host (3 combinations: 0-0, 1-0 or 1-1).


In the same way, if a user tries to access a Shared Folder on the QlikView Server, then the QlikView Server will act as a LanmanServer and therefore the configuration taken into account is 1-1. This configuration can be changed, but if the LanmanWorkstation configuration of the user's laptop is 1-1, then the LanmanServer configuration will need to be 1-1, otherwise the user will not be able to access the Shared Folder of the QlikView Server. The 1-1 configuration is of course the most secure and therefore it's often (always?) chosen on the user's workstation. That's why it's generally a good idea to set the LanmanServer of the QlikView Server to 1-1 too.


It's really hard to explain this kind of stuff but I hope I was clear enough!


III. SSO between QlikView and "X"


Again, this configuration isn't something only related to QlikView, but it can be useful if you need, for example, to allow QlikView to automatically store some documents into another system "X", which can be a Document Management System (Alfresco, SharePoint, Documentum, and so on) or something else. You may not need to do this because it can be allowed by default in your enterprise, but it's generally a good practice to restrict the SSO features on Windows Servers and therefore this kind of configuration is often required. For this configuration, let's define X as a third party software and https://x.domain.com as the URL related to it.


From the Windows Server, if you try to access the real URL of your third party software (e.g. for Alfresco Share it would be: https://x.domain.com/share) and you get a pop-up window asking you for credentials, then follow the steps below:

  1. Ensure that your Active Directory is properly configured for SSO (this is a very large topic and I will not describe it here)
  2. Login to the Windows Server with the account under which QlikView is running
  3. Open: Internet Explorer ˃ Internet Options ˃ Security ˃ Trusted Sites ˃ Sites
    1. Write: https://x.domain.com
    2. Click on: Add
    3. Write: about:blank (this step may not be mandatory)
    4. Click on: Add (this step may not be mandatory)
    5. Close the window
  4. Click on: Custom Level...
    1. Scroll down and find: User Authentication
    2. Set "Logon" to "Automatic logon with current username and password"
    3. Click on: OK
    4. Click on: Apply
    5. Click on: OK
  5. Restart Internet Explorer
  6. Ensure that the account under which QlikView is running has the proper permissions on the third party software

Trusted sites configuration with the list of URLs for all "X"

Enable the automatic logon with current user name and password for the Trusted Sites


After that configuration, if you try again to access the real URL of your third party software, the pop-up window should not be displayed and the login should be successful. Of course, the last step is important because the user running QlikView must have access to the third party software, otherwise the SSO is useless...


This concludes this first set of QlikView Tips & Tricks. I'm sure that more will come soon, but I will need to find some time to share them with you. I hope you will find this blog entry useful, and don't hesitate to give me your feedback using the comments below!