
Yann Neuhaus

All blog entries from http://www.dbi-services.com/

How to import SQL Plan Directives

Thu, 2015-05-21 10:01

Today I presented SQL Plan Directives at the SOUG Romandie event. I had a question about the import/export of directives with Data Pump. The idea is that a lot of testing has been done on QA in order to validate the upgrade to 12c. A few directives had bad consequences (see Ludovico Caldara's blog post for an example), so those directives have been disabled. When going to production, they want to start with those directives disabled. Yes, they can be imported. We have to pack them into a staging table, import that table, and unpack them. It's similar to what we do with SQL Plan Baselines.

testcase: a PERMANENT SPD

I'm using the state of the demo in my presentation where I have a PERMANENT directive that I have disabled because I don't want to do too much dynamic sampling.
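The disable command itself is not shown in the demo output; for reference, a directive can be disabled with DBMS_SPD.ALTER_SQL_PLAN_DIRECTIVE, as in this minimal sketch (using the directive ID from my demo):

SQL> exec dbms_spd.alter_sql_plan_directive(9456295843023884415,'ENABLED','NO');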

SQL> show user
USER is "DEMO"
SQL> select table_name,num_rows,last_analyzed from user_tables;

TABLE_NAME             NUM_ROWS LAST_ANA
-------------------- ---------- --------
STGTAB
DEMO_TABLE                 1000 17:35:51

SQL> select table_name,column_name,num_distinct,last_analyzed from user_tab_col_statistics;

TABLE_NAME           COLUMN_NAME  NUM_DISTINCT LAST_ANA
-------------------- ------------ ------------ --------
DEMO_TABLE           A                       2 17:35:51
DEMO_TABLE           B                       2 17:35:51
DEMO_TABLE           C                       2 17:35:51
DEMO_TABLE           D                       2 17:35:51
DEMO_TABLE           SYS_STSPJNMI            2 17:35:51

SQL> select directive_id,state,created,last_modified,last_used,enabled,extract(notes,'//internal_state/text()') internal_state from dba_sql_plan_directives where directive_id in( select directive_id from dba_sql_plan_dir_objects where owner=user ) order by created;

           DIRECTIVE_ID STATE      CREATED  LAST_MOD LAST_USE ENABLED INTERNAL_
----------------------- ---------- -------- -------- -------- ------- ---------
    9456295843023884415 USABLE     17:35:45 18:07:16 18:07:16 NO      PERMANENT

Pack directive

Data Pump can export/import the table, the extended statistics and the statistics, but the SQL Plan Directives are not included. We have to pack them into a staging table, export/import that table, and unpack it at the destination site.

SQL> exec dbms_spd.create_stgtab_directive('STGTAB');

PL/SQL procedure successfully completed.

SQL> select dbms_spd.pack_stgtab_directive('STGTAB',user,9456295843023884415) from dual;

DBMS_SPD.PACK_STGTAB_DIRECTIVE('STGTAB',USER,9456295843023884415)
-----------------------------------------------------------------
                                                                1

DataPump

Here I'll import the DEMO_TABLE and the STGTAB (which contains the packed directives) through a loopback database link, into another schema: DEMO_SITE2

SQL> create database link LOOPBACK connect to demo identified by demo using '&_CONNECT_IDENTIFIER';
Database link created.

SQL> host impdp demo/demo@&_CONNECT_IDENTIFIER network_link=LOOPBACK tables="DEMO.DEMO_TABLE,DEMO.STGTAB" remap_schema='DEMO:DEMO_&&1';

Import: Release 12.1.0.1.0 - Production on Thu May 21 18:07:42 2015

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
Starting "DEMO"."SYS_IMPORT_TABLE_01":  demo/********@//192.168.78.114/DEMO14 network_link=LOOPBACK tables=DEMO.DEMO_TABLE,DEMO.STGTAB remap_schema='DEMO:DEMO
_SITE2'
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 256 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . imported "DEMO_SITE2"."STGTAB"                            6 rows
. . imported "DEMO_SITE2"."DEMO_TABLE"                     1000 rows
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
Job "DEMO"."SYS_IMPORT_TABLE_01" successfully completed at Thu May 21 18:08:18 2015 elapsed 0 00:00:37

Check what is imported:

SQL> show user
USER is "DEMO_SITE2"
SQL> select table_name,num_rows,last_analyzed from user_tables;

TABLE_NAME             NUM_ROWS LAST_ANA
-------------------- ---------- --------
DEMO_TABLE                 1000 17:35:51
STGTAB

SQL> select table_name,column_name,num_distinct,last_analyzed from user_tab_col_statistics;

TABLE_NAME           COLUMN_NAME  NUM_DISTINCT LAST_ANA
-------------------- ------------ ------------ --------
DEMO_TABLE           A                       2 17:35:51
DEMO_TABLE           B                       2 17:35:51
DEMO_TABLE           C                       2 17:35:51
DEMO_TABLE           D                       2 17:35:51
DEMO_TABLE           SYS_STSPJNMI            2 17:35:51

SQL> select directive_id,type,state,created,last_modified,last_used,enabled,notes from dba_sql_plan_directives where directive_id in( select directive_id from dba_sql_plan_dir_objects where owner=user ) order by created;

no rows selected

I have my statistics (column statistics and the extended statistics for the column group) but no directives. If I stop here, a new query will probably create a new SQL Plan Directive, which will become PERMANENT and will trigger Adaptive Dynamic Sampling. I want to avoid that and get my disabled directive from the previous environment.

Unpack directive

SQL> select dbms_spd.unpack_stgtab_directive('STGTAB',user,9456295843023884415) from dual;

DBMS_SPD.UNPACK_STGTAB_DIRECTIVE('STGTAB',USER,9456295843023884415)
-------------------------------------------------------------------
                                                                  0
Look at the returned number: no directive has been unpacked. Because I'm now in another schema, I have to update the owner. I don't know if there is a better way to do it, but here I update the STGTAB directly:
SQL> update stgtab set c1='DEMO_SITE2';

6 rows updated.

SQL> commit;

Commit complete.

Don't forget to commit: the unpack doesn't work if you don't.
SQL> select dbms_spd.unpack_stgtab_directive('STGTAB',user,9456295843023884415) from dual;

DBMS_SPD.UNPACK_STGTAB_DIRECTIVE('STGTAB',USER,9456295843023884415)
-------------------------------------------------------------------
                                                                  1

SQL> select directive_id,state,created,last_modified,last_used,enabled,extract(notes,'//internal_state/text()') internal_state from dba_sql_plan_directives where directive_id in( select directive_id from dba_sql_plan_dir_objects where owner=user ) order by created;

           DIRECTIVE_ID STATE      CREATED  LAST_MOD LAST_USE ENABLED INTERNAL_
----------------------- ---------- -------- -------- -------- ------- ---------
   18397758907897483632 USABLE     18:09:12                   NO      PERMANENT

The directive is there. It's disabled, which is what I wanted. The creation date and last-used timestamp have been reset. If you see a directive with a state different than NEW and without a LAST_USED, then it's an imported one. Note that the directive ID has changed.

Conclusion

When your directives are in HAS_STATS, then because Data Pump imports the statistics by default (including extended statistics), you probably don't need to import the SQL Plan Directives. But if you have disabled some directives and want the same in another environment, you have to pack/import/unpack them.

This morning's event was a great one. Thanks Ludovico for the picture.

Live demos are best! @FranckPachot pic.twitter.com/aJfBO9cX5C

— Ludovico Caldara (@ludodba) May 21, 2015

DBA_SQL_PLAN_DIRECTIVE.LAST_USED

Thu, 2015-05-21 02:30

If you have read Matching SQL Plan Directives and queries using it then you know how to use the '+metrics' format of dbms_xplan.

21:49:54 SQL> explain plan for select distinct * from DEMO_TABLE 
              where a=0 and b=0 and c=0 and d=0;
Explained.

21:50:01 SQL> select * 
              from table(dbms_xplan.display(format=>'basic +rows +note +metrics'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------
Plan hash value: 3823449216

-------------------------------------------------
| Id  | Operation          | Name       | Rows  |
-------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |
|   1 |  SORT UNIQUE NOSORT|            |     1 |
|   2 |   TABLE ACCESS FULL| DEMO_TABLE |   500 |
-------------------------------------------------

Sql Plan Directive information:
-------------------------------
  Used directive ids:
    2183573658076085153
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement
So, when I run this query with predicates on columns A,B,C,D I'm using the directive id 2183573658076085153. 'Using' means doing dynamic sampling in order to get good estimations, because the directive tells us that there is a misestimate when using only the static statistics. Then we can look at that directive:
21:50:11 SQL> select directive_id,state,created,last_modified,last_used 
              from dba_sql_plan_directives where directive_id=2183573658076085153;

           DIRECTIVE_ID STATE      CREATED  LAST_MODIFIED LAST_USED
----------------------- ---------- -------- ------------- -------------
    2183573658076085153 USABLE     21:41:50 21:41:55.0000 21:41:55.0000
Look at the timestamps. I'm at 21:50, and the directive, which was created 9 minutes ago at 21:41:50, was used 5 seconds later at 21:41:55, and that is its last usage.

Let's run the query now. I know that it will use the directive:

21:50:40 SQL> select distinct * from DEMO_TABLE where a=0 and b=0 and c=0 and d=0;

         A          B          C          D
---------- ---------- ---------- ----------
         0          0          0          0
Then I check to be sure that the directive has been used:
21:50:55 SQL> select * from table(dbms_xplan.display_cursor(format=>'basic +rows +note'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select distinct * from DEMO_TABLE where a=0 and b=0 and c=0 and d=0

Plan hash value: 3823449216

-------------------------------------------------
| Id  | Operation          | Name       | Rows  |
-------------------------------------------------
|   0 | SELECT STATEMENT   |            |       |
|   1 |  SORT UNIQUE NOSORT|            |     1 |
|   2 |   TABLE ACCESS FULL| DEMO_TABLE |   500 |
-------------------------------------------------

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement

Yes: SPD used, we have dynamic sampling and accurate cardinality estimation.
The SPD has been used by the query that I've run at 21:50:40

So I will check the LAST_USED timestamp, after making sure that what has been modified in memory is written to the dictionary:

21:50:59 SQL> exec dbms_spd.flush_sql_plan_directive;
PL/SQL procedure successfully completed.

21:51:07 SQL> select directive_id,state,created,last_modified,last_used 
              from dba_sql_plan_directives where directive_id=2183573658076085153;

           DIRECTIVE_ID STATE      CREATED  LAST_MODIFIED LAST_USED
----------------------- ---------- -------- ------------- -------------
    2183573658076085153 USABLE     21:41:50 21:41:55.0000 21:41:55.0000
The LAST_USED has not been updated.

LAST_USED

This is not a bug. Updating the dictionary for all directives used during the previous 15 minutes (the automatic flush frequency) would be too expensive, so it is not done at each flush. The LAST_USED goal is to manage SPD retention - drop those that have not been used for a while (53 weeks by default) - and the retention is defined as weeks. So the LAST_USED is updated only if it is one week higher than the previous LAST_USED. Not exactly one week but 6.5 days.
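A related point: the 53-week retention mentioned above is exposed as the SPD_RETENTION_WEEKS preference of DBMS_SPD. A minimal sketch of how to check or change it (the value 53 is the default):

SQL> select dbms_spd.get_prefs('SPD_RETENTION_WEEKS') from dual;
SQL> exec dbms_spd.set_prefs('SPD_RETENTION_WEEKS','53');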

6.5 days

In fact, when LAST_USED is updated, it adds 6.5 days to the current timestamp. We can see that in the underlying table:

SQL> alter session set nls_date_format='dd-mon-yyyy hh24:mi:ss';
Session altered.

SQL> select type,state,created,last_modified,last_used from sys.opt_directive$ where dir_id=2183573658076085153;
      TYPE      STATE CREATE  LAST_MODIFI LAST_USED
---------- ---------- ------- ----------- --------------------
         1          3 19-may- 19-may-2015 26-may-2015 09:41:26
The last used displayed by DBA_SQL_PLAN_DIRECTIVES is 19-may-2015 at 21:41:26, but the internal table OPT_DIRECTIVE$ stores 26-may-2015 09:41:26, which is 6.5 days later. The view subtracts 6.5 days from it in order to expose it to us. But we must keep in mind that this timestamp will not be updated until 26-may-2015.

Conclusion: We cannot rely on LAST_USED when we want to know if a directive has been used recently. We must use explain plan for that. Note that this implementation is not new: it's exactly the same as the LAST_EXECUTED column in DBA_SQL_PLAN_BASELINES. It's an optimization, because tracking each usage would be too expensive.

what that in-memory term is about, and what not

Thu, 2015-05-21 02:10

Everybody is talking about in-memory databases these days. And everybody is talking about columnar storage for sets of data because it can be a benefit for analytic queries. And a lot of people start mixing these terms, not realizing that they are not interchangeable.

 

Alfresco: video thumbnails and transformations

Thu, 2015-05-21 01:00


Alfresco supports, by default, the in-browser preview of some video formats like mp4, but doesn't support some others like mkv, avi or wmv. Even if Google Chrome can read mkv files, for example, if you try to use the action "View in Browser" (this is not the same thing as the preview), Chrome will not try to play the mkv file but will download it instead. That's why if you upload an mp4 video into Alfresco, you will certainly be able to watch this video directly in your browser. In this blog, I will try to explain how to configure an external tool to take care of the thumbnail creation and transformations for your videos. To achieve that, we will install and configure ffmpeg. In addition to that, you can also very easily configure Alfresco to embed an external video player like "FlowPlayer" that would take care of playing all video formats directly by streaming from Alfresco, not using your browser. Basically, this is done by replacing the preview page for some MIME types, but I will not describe it in detail here.


For this blog post, I will use /opt/alfresco-4.2.c as the $ALF_HOME environment variable: the folder where Alfresco has been installed. The current release of ffmpeg is version 2.6.3. This is at least working for Alfresco v4.2.x and v5.x.


I. Installation of ffmpeg


The ffmpeg binaries/executables are available on the ffmpeg website. From this website, download the "Static" build for Windows, Linux or Mac depending on the Operating System on which Alfresco has been installed. For this example, I will use a Linux host (RedHat 6.x, 64-bit), but I can assure you that it's working properly on Windows too. I have already installed ffmpeg with several versions of Alfresco from v4.2.x to 5.x.

[alfresco ~]$ cd /opt
[alfresco /opt]$ wget http://johnvansick...elease-64bit-static.tar.xz
[alfresco /opt]$ tar -xJf ffmpeg-release-64bit-static.tar.xz
[alfresco /opt]$ chmod -R 755 ffmpeg-2.6.3-64bit-static
[alfresco /opt]$ chown -R alfresco:alfresco ffmpeg-2.6.3-64bit-static
[alfresco /opt]$ mv ffmpeg-2.6.3-64bit-static /opt/alfresco-4.2.c/ffmpeg


These commands will download, extract, change the permissions/ownership and rename the folder to something more friendly (at a more proper location too).


II. Configuration of Alfresco


Once ffmpeg has been installed, the next step is to configure Alfresco to use it. The first thing to do is to add some parameters in the well-known alfresco-global.properties file. Don't hesitate to customize these parameters, remove the lines that aren't needed, and so on.

[alfresco /opt]$ cd /opt/alfresco-4.2.c/tomcat/shared/classes

[alfresco /classes]$ cat alfresco-global.properties
### Begin of the file with your custom parameters ###
### E.g.: dir.root, db.driver, db.username, aso... ###

### FFMPEG executable location ###
ffmpeg.exe=/opt/alfresco-4.2.c/ffmpeg/ffmpeg

### Video Thumbnails parameters ###
# ffmpeg.thumbnail
content.transformer.ffmpeg.thumbnail.priority=50
content.transformer.ffmpeg.thumbnail.extensions.3g2.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.3gp.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.asf.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.avi.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.avx.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.flv.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.mov.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.movie.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.mp4.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.mpeg2.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.mpg.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.ogv.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.wmv.jpg.supported=true

### Video Transformations parameters ###
# ffmpeg.flv
content.transformer.ffmpeg.flv.priority=50
content.transformer.ffmpeg.flv.extensions.3g2.flv.supported=true
content.transformer.ffmpeg.flv.extensions.3gp.flv.supported=true
content.transformer.ffmpeg.flv.extensions.asf.flv.supported=true
content.transformer.ffmpeg.flv.extensions.avi.flv.supported=true
content.transformer.ffmpeg.flv.extensions.avx.flv.supported=true
content.transformer.ffmpeg.flv.extensions.mov.flv.supported=true
content.transformer.ffmpeg.flv.extensions.movie.flv.supported=true
content.transformer.ffmpeg.flv.extensions.mp4.flv.supported=true
content.transformer.ffmpeg.flv.extensions.mpeg2.flv.supported=true
content.transformer.ffmpeg.flv.extensions.mpg.flv.supported=true
content.transformer.ffmpeg.flv.extensions.ogv.flv.supported=true
content.transformer.ffmpeg.flv.extensions.wmv.flv.supported=true
# ffmpeg.mp4
content.transformer.ffmpeg.mp4.priority=50
content.transformer.ffmpeg.mp4.extensions.3g2.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.3gp.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.asf.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.avx.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.mov.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.movie.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.mpeg2.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.mpg.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.wmv.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.avi.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.ogv.mp4.supported=true
# ffmpeg.mp3
content.transformer.ffmpeg.mp3.priority=50
content.transformer.ffmpeg.mp3.extensions.aiff.mp3.supported=true
content.transformer.ffmpeg.mp3.extensions.au.mp3.supported=true
content.transformer.ffmpeg.mp3.extensions.m4a.mp3.supported=true
content.transformer.ffmpeg.mp3.extensions.oga.mp3.supported=true
content.transformer.ffmpeg.mp3.extensions.wav.mp3.supported=true


In the above configuration, the parameter "ffmpeg.exe=/opt/alfresco-4.2.c/ffmpeg/ffmpeg" is the location of the binary file named ffmpeg. If you install Alfresco and ffmpeg on a Windows host, then this parameter looks like: "ffmpeg.exe=C:/Alfresco-4.2.c/ffmpeg/bin/ffmpeg.exe"


Once this is done, you need to enable the thumbnail & transformation extensions in Alfresco and this is done using the following steps:

[alfresco /classes]$ cd /opt/alfresco-4.2.c/tomcat/shared/classes/alfresco/extension
[alfresco /extension]$ mv video-thumbnail-context.xml.sample video-thumb-context.xml
[alfresco /extension]$ mv video-transformation-context.xml.sample video-transf-context.xml


If you want, you can parse these two files to understand what Alfresco will do with ffmpeg but basically these files define the commands and options that will be used by Alfresco.


And that's it: Alfresco has been configured to use ffmpeg for thumbnail creation and transformations of your videos. For Alfresco to take these changes into account, simply restart your Application Server using the Alfresco service:

[alfresco ~]$ service alfresco restart


Or using Alfresco default scripts:

[alfresco ~]$ /opt/alfresco-4.2.c/alfresco.sh stop
[alfresco ~]$ /opt/alfresco-4.2.c/alfresco.sh start



To check if ffmpeg is working properly, simply open Alfresco Share, browse your repository to find a video with an avi or wmv format and a thumbnail should now be present (if not, just refresh the page...). You can also try to upload a video and if a thumbnail is created for this video, then ffmpeg is working!



Row Store vs Column Store in SAP HANA

Wed, 2015-05-20 00:00

The SAP HANA database allows you to create your tables in Row or Column Store mode. In this blog, I will demonstrate that each method has its advantages and disadvantages and should be used for specific cases.

Through two kinds of tests, I will show you that the Row Store mode should be used for simple SELECT SQL queries, without aggregation, and that the Column Store mode should be used for complex SELECT queries containing aggregations.
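For reference, the store mode is chosen at table creation time. A minimal sketch (the table and column names here are only illustrative, not the ones used in the tests):

CREATE ROW TABLE SALES_DEMO_ROW (
   CUST_ID   INTEGER,
   AMOUNT    DECIMAL(15,2)
);

CREATE COLUMN TABLE SALES_DEMO_COLUMN (
   CUST_ID   INTEGER,
   AMOUNT    DECIMAL(15,2)
);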

If you want more information regarding the Column Store or in-memory technologies, don't hesitate to attend the next dbi services event:

http://www.dbi-services.com/index.php/newsroom-e/events/event-l-in-memory-r-boost-your-it-performance

Test 1: Simple SELECT query

Goal of the tests

This test will show you the difference of performance using a Row Store and a Column Store table in a simple SQL query.

Description of the test

A SELECT query will be sent to the database and we will check the server response time.

SQL Query

Using a Row Store table

The SQL is the following:

1_SQL_ROW_STORE.PNG

Using a Column Store table

The SQL is the following:

2_SQL_COLUMN_STORE.PNG

Tables

Row Store Table

You can find here information regarding the Row Store table used in the test.

Name:                 SALES_ROW

Table type:          Row Store

Row count:         10 309 873

Index:                1

Partition:            0 (SAP HANA doesn't allow partitioning of Row Store tables)

3_TABLE_ROW_STORE_1.PNG

 

4_TABLE_ROW_STORE_2.PNG

Column Store Table

You can find here information regarding the Column Store table used in the test.

Name:                  SALES_COLUMN

Table type:           Column Store

Row count:          10 309 873

Index:                 0 (SAP HANA automatically applies an index if needed)

Partition:             1 RANGE partition on CUST_ID

6_TABLE_COLUMN_STORE_2.PNG
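As a side note, the RANGE partition on CUST_ID mentioned above is declared at creation time. A minimal sketch with a hypothetical column list and boundary (not the exact DDL of SALES_COLUMN):

CREATE COLUMN TABLE SALES_COLUMN_DEMO (
   CUST_ID   INTEGER,
   AMOUNT    DECIMAL(15,2)
)
PARTITION BY RANGE (CUST_ID)
(
   PARTITION 1 <= VALUES < 5000000,
   PARTITION OTHERS
);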

Result of the test

Using the Row Store table

8_SQL_Q1_SELECT_ROW_RESULT.PNG

Using the Column Store table

9_SQL_Q1_SELECT_COLUMN_RESULT.PNG

Test 2: Complex SELECT query

Goal of the tests

This test will show you the difference of performance using a Row Store and a Column Store table in a complex SQL query.

Description of the test

A SELECT query will be sent to the database and we will check the server response time.

SQL Query

Using a Row Store table

The SQL is the following:

10_SQL_ROW_STORE.PNG

Using a Column Store table

The SQL is the following:

11_SQL_COLUMN_STORE.PNG

Tables

Row Store Fact Table

You can find here information regarding the Row Store table used in the test.

Name:                  SALES_ROW

Table type:          Row Store

Row count:         10 309 873

Index:                   2

Partition:             0 (SAP HANA doesn't allow partitioning of Row Store tables)

Column Store Fact Table

You can find here information regarding the Column Store table used in the test.

Name:                  SALES_COLUMN

Table type:          Column Store

Row count:         10 309 873

Index:                   0 (SAP HANA automatically applies an index if needed)

Partition:             1 RANGE partition on CUST_ID

Result of the test

Using the Row Store tables

12_SQL_Q2_SELECT_ROW_RESULT.PNG

Using the Column Store tables

13_SQL_Q2_SELECT_COLUMN_RESULT.PNG

Conclusion

Row and Column store modes in SAP HANA should be used in two different contexts:

  • Tables in Row Store mode must be used in SELECT queries WITHOUT any aggregation functions

  • Tables in Column Store mode are powerful when they are used for analytical queries or views using aggregation functions (GROUP BY, …), as in the sketch below
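As an illustration of the second point, this is the kind of aggregate query that benefits from the Column Store (the AMOUNT column is hypothetical; the real test queries are shown in the screenshots above):

SELECT CUST_ID, SUM(AMOUNT) AS TOTAL_AMOUNT
FROM SALES_COLUMN
GROUP BY CUST_ID
ORDER BY TOTAL_AMOUNT DESC;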

The performance can be highly optimized if the tables selected in the queries have the right store mode.

 

 

 

List all RMAN backups that are needed to recover

Tue, 2015-05-19 09:49

This blog post is something I had in draft, and Laurent Schneider's blog post reminded me to publish it. With the right RMAN configuration you should not have to manage backup files yourself. The RMAN catalog knows them and RMAN should be able to access them. If you want to keep a backup for a long time, you just tell RMAN to keep it.
But sometimes, RMAN is not connected to your tape backup software, or the backups are not shared on all sites, and you have to restore or copy the set of files that is needed for a restore database or a duplicate database.

A customer was in that case, identifying the required files from their names because they are all timestamped with the beginning of the backup job. It's our DMK default. In order to rely on that, 'backup database plus archivelog' was run. And in order to be sure to have all archived logs in those backup sets, any concurrent RMAN jobs are blocked during that database backup. Because if a concurrent job is doing some archivelog backups, they will be timestamped differently.

RPO and availability

I don't like that. I don't want anything to be able to block the backup of archived logs.
They are critical for two reasons:

  • The Recovery Point Objective is not fulfilled if some archivelog backups are delayed
  • The frequency of archivelog backup is also defined to prevent a full FRA
But if we allow concurrent backups of archived logs, we need something else to be able to identify the whole set of files that is needed to restore the database at that point in time. So my suggestion was to generate the list of those files after each database backup, and keep that list. When we need to restore that backup, we can send the list to the backup team and ask them to restore those files.

The script

Here is my script; I'll explain it later:

echo "restore controlfile preview; restore database preview;" | rman target / | awk '
/Finished restore at /{timestamp=$4}
/Recovery must be done beyond SCN /{if ($7>scn) scn=$7 }
/^ *(Piece )Name: / { sub(/^ *(Piece )Name: /,"") ; files[$0]=1 }
END{ for (i in files) print i > "files-"timestamp"-SCN-"scn".txt" }
' 
This script generates the following file:
files-20150519013910-SCN-47682382860.txt
which lists the files needed to do a RESTORE/RECOVER UNTIL SCN 47682382860.

the content of the file is:

oracle@dbzhorap01:/home/oracle/ [DB01PP1] sort files-20150519019910-SCN-47682382860.txt
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168278_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168279_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168280_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168281_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168282_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168283_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168284_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169462_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169463_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169464_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169465_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169466_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169467_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169468_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169469_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169470_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169471_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169472_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169481_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169482_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169473_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169474_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169475_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169476_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169477_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169478_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169479_p1.bck
and lists the backup pieces for the incremental 0, incremental 1 and archivelog backups needed to recover to a consistent state that can be opened. The script lists only backup sets, so we are supposed to have backed up the latest archived logs (with 'backup database plus archivelog' for example).

You can add an 'until scn' clause, but my primary goal was to run it just after a database backup in order to know which files have to be restored to get that backup (restore or duplicate).

Restore preview

The idea is to rely on RMAN to find the files that are needed to restore and recover rather than doing it ourselves from the recovery catalog. RMAN provides the PREVIEW restore for that:

RMAN> restore database preview
Starting restore at 20150501390436
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
166388  Incr 0  10.53G     DISK        00:52:56     20150516031010
        BP Key: 166388   Status: AVAILABLE  Compressed: YES  Tag: WEEKLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168278_p1.bck
  List of Datafiles in backup set 166388
  File LV Type Ckp SCN    Ckp Time       Name
  ---- -- ---- ---------- -------------- ----
  1    0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/system.329.835812499
  2    0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/undotbs1.525.835803187
  10   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.676.835815153
  17   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.347.835815677
  23   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.277.835814327
  25   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.342.835811161
...
BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
167586  Incr 1  216.09M    DISK        00:01:34     20150519012830
        BP Key: 167586   Status: AVAILABLE  Compressed: YES  Tag: DAYLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169479_p1.bck
  List of Datafiles in backup set 167586
  File LV Type Ckp SCN    Ckp Time       Name
  ---- -- ---- ---------- -------------- ----
  43   1  Incr 47681921440 20150519012700 +U01/DB01Pp/datafile/cpy_idx.346.835815097

List of Backup Sets
===================

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
167594  105.34M    DISK        00:00:23     20150519015400
        BP Key: 167594   Status: AVAILABLE  Compressed: YES  Tag: DAYLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169481_p1.bck

  List of Archived Logs in backup set 167594
  Thrd Seq     Low SCN    Low Time       Next SCN   Next Time
  ---- ------- ---------- -------------- ---------- ---------
  3    59406   47681333097 20150519010239 47682617820 20150519014652
  4    46800   47681333143 20150519010240 47682617836 20150519014652
  1    76382   47681333188 20150519010240 47682618254 20150519014655
  2    60967   47681333315 20150519010242 47682385651 20150519013711
...

Media recovery start SCN is 47681637369
Recovery must be done beyond SCN 47682382860 to clear datafile fuzziness
Finished restore at 20150501390440
You see the list of datafile backup sets and archivelog backup sets, and at the end you have information about SCNs. Let me explain what those SCNs are.

Recovery SCN

Because it is an online backup, the datafiles are fuzzy. We need to apply the redo generated during the backup.

The 'media recovery start SCN' is the beginning of the archived logs to be applied:

SQL> select scn_to_timestamp(47681637369) from dual;

SCN_TO_TIMESTAMP(47681637369)
---------------------------------------------------------------------------
19-MAY-15 01.10.38.000000000 AM

The 'recovery must be done beyond SCN' is the last redo that must be applied to have datafiles consistent:

SQL> select scn_to_timestamp(47682382860) from dual;

SCN_TO_TIMESTAMP(47682382860)
---------------------------------------------------------------------------
19-MAY-15 01.35.58.000000000 AM

In my example, the backup (incremental level 1 + archivelog) started at 01:00:00 and was completed at 01:35:00

Conclusion

And I have a file with the list of backups that are needed to restore or duplicate the database at that point in time. Why do I need that when RMAN is supposed to be able to retrieve them itself? Because sometimes we back up to disk and the disk is backed up to tape without RMAN knowing it. Of course RMAN can connect directly to the tape backup software, but that is not for free. Or we want to duplicate to another site where backups are not shared. We need to know which files we have to bring there. And that sometimes requires a request to another team, so it's better to have the list of all the files we need.

As usual, don't hesitate to comment if you see something to improve in my small script.

SQL Server 2014: First Service Pack (SP1) is available

Tue, 2015-05-19 01:48

On May 14th, Microsoft released the first Service Pack (SP1) for SQL Server 2014, more than thirteen months after the RTM version.
SQL Server 2014 Service Pack 1 includes all of the Cumulative Updates from 1 to 5.

Which issues are fixed in this SP1

There are 29 hotfixes:

  • 19 for the Engine
  • 6 for SSRS
  • 3 for SSAS
  • 1 for SSIS

 

b2ap3_thumbnail_PieSQL2014SP1.jpg

Some improvements are:

  • Performance improvement of Column store with batch mode operators and a new Extended Event
  • Buffer pool extension improvement
  • New cardinality estimator to boost query performance

History of SQL Server 2014

The build version of SQL Server 2014 SP1 is 12.0.4100.1.
Here is a quick overview of SQL Server 2014 builds since CTP1:

Date            SQL Server 2014 version                  Build
--------------  ---------------------------------------  ----------
June 2013       Community Technology Preview 1 (CTP1)    11.00.9120
October 2013    Community Technology Preview 2 (CTP2)    12.00.1524
April 2014      RTM                                      12.00.2000
April 2014      Cumulative Update 1 (CU1)                12.00.2342
June 2014       Cumulative Update 2 (CU2)                12.00.2370
August 2014     Cumulative Update 3 (CU3)                12.00.2402
October 2014    Cumulative Update 4 (CU4)                12.00.2430
December 2014   Cumulative Update 5 (CU5)                12.00.2456
May 2015        Service Pack 1 (SP1)                     12.00.4100
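To check which build a given instance is actually running (for example after applying SP1), SERVERPROPERTY can be queried; a minimal sketch:

SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion,
       SERVERPROPERTY('ProductLevel')   AS ProductLevel,
       SERVERPROPERTY('Edition')        AS Edition;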

If you need more information about SQL Server 2014 SP1 or to download it, click here.

As a reminder, Service Packs are very important from a bug-fixing and product-upgrade point of view, so take care to install it quickly ;-)
See you.

SQL Saturday Lisbon: from Francesinha to Bacalhau

Mon, 2015-05-18 23:45

Last week-end, I was at SQL Saturday 369, held in Lisbon. If you take a look at the agenda, you'll probably see that there are a lot of interesting sessions with a lot of famous speakers. Unfortunately, I was not able to attend all sessions, so I decided to focus only on those that have a direct correlation with my work.

First, 2 "headache" sessions given by Paul White (aka @SQL_Kiwi) about the query optimizer and some internal stuffs. The QO is definitely a very interesting topic and I'm always willing to discover more and more with guys like Paul to improve my skills.

Then, two sessions about In-Memory features with SQL Server 2016. In fact, I'm already aware of potential new features in the next SQL Server version, but attending a session given by Niko Neugebauer about columnstore and discussing upcoming features always adds a certain value for sure. Thanks Niko and Murilo Miranda for your sessions!

Finally, another "headache" session to finish the day, about batch mode and CPU architectures, given by Chris Adkin. We had a very deep-dive explanation of batch mode and how it improves performance with CPU savings.

 

Moreover, it was also the opportunity to meet some of my SQL Server MVP friends like Jean-Pierre Riehl and Florian Eiden ...

 

blog_45_-_1_-_french_team_sqlsat369

 

... and have a good dinner with the SQL Saturday staff and other speakers. A lot of countries represented here: Portugal, Germany, UK, New Zealand, France and probably others.

 

blog_45_-_2_-_dinner_sqlsat369

 

A beautiful city, good weather, a lot of very good speakers and a very good staff ... maybe the secret sauce of a successful SQL Server event!

I'm pretty sure it will be the same for the next SQL Saturday in Paris, and I will be there (maybe as a speaker this time).

Variations on 1M rows insert (3): using SQL Server In-Memory features

Mon, 2015-05-18 11:00

Let's continue this series about inserting 1M rows with a new variation that uses the SQL Server In-Memory features. For this blog post, I will still use a minimal configuration that consists of only 1 Hyper-V virtual machine with 1 processor and 512MB of memory. In addition, my storage includes VHDx disks placed on 2 separate SSDs (one INTEL SSDC2BW180A3L and one Samsung SSD 840 EVO). No special configuration has been performed on Hyper-V.

Let's begin with the creation script of my database DEMO:

 

CREATE DATABASE [demo]
ON PRIMARY
( NAME = N'demo_data', FILENAME = N'E:\SQLSERVER\demo_data.mdf' , SIZE = 1048576KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
FILEGROUP [demo_hk_grp] CONTAINS MEMORY_OPTIMIZED_DATA DEFAULT
( NAME = N'demo_hk', FILENAME = N'E:\SQLSERVER\HK' , MAXSIZE = UNLIMITED)
LOG ON
( NAME = N'demo_log', FILENAME = N'F:\SQLSERVER\demo_hk_log.ldf' , SIZE = 1395776KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO

ALTER DATABASE [demo] SET COMPATIBILITY_LEVEL = 120
GO

ALTER DATABASE [demo] SET RECOVERY SIMPLE;
GO

 

Next, the creation script of all user objects, which includes:

  • 2 disk-based tables: DEMO_DB_PK (with a clustered primary key) and DEMO_DB_HP (a heap table)
  • 2 In-Memory optimized tables: DEMO_HK_SCH_DATA (data are persisted) and DEMO_HK_SCH (only the schema is persisted)

 

CREATE TABLE [dbo].[DEMO_DB_PK]
(
       [id] [int] NOT NULL primary key,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL
)

CREATE TABLE [dbo].[DEMO_DB_HP]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL
)

CREATE TABLE [dbo].[DEMO_HK_SCH_DATA]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL,

       PRIMARY KEY NONCLUSTERED HASH ( [id] ) WITH ( BUCKET_COUNT = 2097152 )
) WITH ( MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_AND_DATA )
GO

CREATE TABLE [dbo].[DEMO_HK_SCH]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL,

       PRIMARY KEY NONCLUSTERED HASH ( [id] ) WITH ( BUCKET_COUNT = 2097152 )
) WITH ( MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_ONLY )
GO

 

Finally, the last script creates 7 stored procedures in order to test different cases:

  • sp_demo_insert_demo_db_hp: insert 1M rows into a disk-based heap table
  • sp_demo_insert_demo_db_pk: insert 1M rows into a disk-based clustered table
  • sp_demo_insert_demo_hk_sch_data: insert 1M rows into an In-Memory optimized table in INTEROP mode, with data persisted on disk
  • sp_demo_insert_demo_hk_sch: insert 1M rows into an In-Memory optimized table in INTEROP mode, with only the schema persisted on disk
  • sp_demo_insert_demo_hk_sch_data_cp: insert 1M rows into an In-Memory optimized table in NATIVE mode, with data persisted on disk and full durability
  • sp_demo_insert_demo_hk_sch_data_cp_d: insert 1M rows into an In-Memory optimized table in NATIVE mode, with data persisted on disk and delayed durability enabled
  • sp_demo_insert_demo_hk_sch_cp: insert 1M rows into an In-Memory optimized table in NATIVE mode, with only the schema persisted on disk

Just as a reminder, INTEROP procedures allow using both disk-based and In-Memory optimized tables, whereas NATIVE (natively compiled) procedures don't support disk-based tables. However, the latter are very interesting from a performance perspective because they drastically improve the execution time.

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_db_hp]
AS

  SET NOCOUNT ON;

  DECLARE @i INT = 1;

  WHILE @i <= 1000000
  BEGIN

         INSERT INTO dbo.DEMO_DB_HP
         VALUES (@i,
                 CASE ROUND(RAND() * 10, 0)
                      WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                      WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                      ELSE 'Tracy'
                 END,
                 RAND() * 10000);

         SET @i += 1;

  END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_db_pk]
AS

  SET NOCOUNT ON;

  DECLARE @i INT = 1;

  WHILE @i <= 1000000
  BEGIN

         INSERT INTO dbo.DEMO_DB_PK
         VALUES (@i,
                 CASE ROUND(RAND() * 10, 0)
                      WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                      WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                      ELSE 'Tracy'
                 END,
                 RAND() * 10000);

         SET @i += 1;

  END
GO

 

create procedure [dbo].[sp_demo_insert_demo_hk_sch_data]
AS

  SET NOCOUNT ON;

  DECLARE @i INT = 1;

  WHILE @i <= 1000000
  BEGIN

         INSERT INTO dbo.DEMO_HK_SCH_DATA
         VALUES (@i,
                 CASE ROUND(RAND() * 10, 0)
                      WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                      WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                      ELSE 'Tracy'
                 END,
                 RAND() * 10000);

         SET @i += 1;

  END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_hk_sch]
AS

  SET NOCOUNT ON;

  DECLARE @i INT = 1;

  WHILE @i <= 1000000
  BEGIN

         INSERT INTO dbo.DEMO_HK_SCH
         VALUES (@i,
                 CASE ROUND(RAND() * 10, 0)
                      WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                      WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                      ELSE 'Tracy'
                 END,
                 RAND() * 10000);

         SET @i += 1;

  END
GO

 

Note that for the following natively compiled stored procedures, I rewrote one portion of the code because the CASE statement is not supported in natively compiled procedures with SQL Server 2014.

 

create procedure [dbo].[sp_demo_insert_demo_hk_sch_data_cp]
with native_compilation, schemabinding, execute as owner
as
begin atomic with ( transaction isolation level=snapshot, language=N'us_english')

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1
             SET @name = 'Marc'
       ELSE IF @test_case = 2
             SET @name = 'Bill'
       ELSE IF @test_case = 3
             SET @name = 'George'
       ELSE IF @test_case = 4
             SET @name = 'Eliot'
       ELSE IF @test_case = 5
             SET @name = 'Matt'
       ELSE IF @test_case = 6
             SET @name = 'Trey'
       ELSE
             SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
              INSERT INTO [dbo].[DEMO_HK_SCH_DATA] VALUES (@i, @name, RAND() * 10000);
              SET @i += 1;
       END
end
go

 

create procedure [dbo].[sp_demo_insert_demo_hk_sch_cp]
with native_compilation, schemabinding, execute as owner
as
begin atomic with ( transaction isolation level=snapshot, language=N'us_english')

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1
             SET @name = 'Marc'
       ELSE IF @test_case = 2
             SET @name = 'Bill'
       ELSE IF @test_case = 3
             SET @name = 'George'
       ELSE IF @test_case = 4
             SET @name = 'Eliot'
       ELSE IF @test_case = 5
             SET @name = 'Matt'
       ELSE IF @test_case = 6
             SET @name = 'Trey'
       ELSE
             SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
              INSERT INTO [dbo].[DEMO_HK_SCH] VALUES (@i, @name, RAND() * 10000);
              SET @i += 1;
       END
end
go

 

create procedure [dbo].[sp_demo_insert_demo_hk_sch_data_cp_d]
with native_compilation, schemabinding, execute as owner
as
begin atomic with ( transaction isolation level=snapshot, language=N'us_english', delayed_durability = on)

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1
             SET @name = 'Marc'
       ELSE IF @test_case = 2
             SET @name = 'Bill'
       ELSE IF @test_case = 3
             SET @name = 'George'
       ELSE IF @test_case = 4
             SET @name = 'Eliot'
       ELSE IF @test_case = 5
             SET @name = 'Matt'
       ELSE IF @test_case = 6
             SET @name = 'Trey'
       ELSE
             SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
              INSERT INTO [dbo].[DEMO_HK_SCH_DATA] VALUES (@i, @name, RAND() * 10000);
              SET @i += 1;
       END
end
GO
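The exact way each test was launched and timed is not shown in the post; a minimal sketch of how one scenario could be measured from a query window (using one of the procedures above) is:

SET STATISTICS TIME ON;
EXEC dbo.sp_demo_insert_demo_hk_sch_data_cp;
SET STATISTICS TIME OFF;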

 

Ok it's time to run the different test scenarios. You will see below the different results and their related wait statistics that I found on my environment:

 

blog_44_-_1_-_results_

 

blog_44_-_2_-_wait_stats_

 

First of all, let's notice that using In-Memory optimized tables in INTEROP mode doesn't seem to improve the overall performance in all cases. Indeed, we still deal with pretty much the same duration and the same CPU time when writing to the transaction log and checkpoint files for the In-Memory optimized table with data persisted on disk (SCHEMA_AND_DATA).

Next, using In-Memory tables with only schema persisted on disk – SCHEMA_ONLY - contributes to better results (performance gain x 2) as we may expect. Indeed, the WRITELOG wait type has completely disappeared because data are not persisted in this case, so we minimize considerably the amount of records inside the Tlog.
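One possible way to observe this on your own environment is to compare the cumulated WRITELOG waits before and after a run (a sketch; this is not necessarily the exact query behind the figures above):

SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'WRITELOG';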

Finally let’s have a look at the tests concerning natively compiled stored procedures. If we refer to the first tests (either for disk-based tables or for INTEROP), we can see that we reduce drastically the CPU consumption by using natively compiled stored procedures (roughly 97% in the best case). So, inserting 1M rows is very fast in this case.

 

Moreover, if we focus only on the results for the In-Memory optimized tables with different durability settings (SCHEMA_AND_DATA and SCHEMA_ONLY), we may notice that using transaction delayed durability helps. Once again, persisting data by writing into the TLog and checkpoint files seems to slow down the insert process.
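As a reminder, the DELAYED_DURABILITY = ON option of the atomic block only takes effect if delayed durability is allowed (or forced) at the database level; if the database option is left at DISABLED (the default), it is ignored. A minimal sketch:

ALTER DATABASE [demo] SET DELAYED_DURABILITY = ALLOWED;
GO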

Some wait types still remain, such as PREEMPTIVE_OS_CREATEFILE, PREEMPTIVE_OS_CLOSEHANDLE and PREEMPTIVE_OS_FILEOPS, and I will probably focus on them later. At this point, I would suspect a misconfigured storage or maybe my Hyper-V settings, but I have to verify this point. I already double-checked that I enabled instant file initialization according to the Microsoft documentation, and also disabled some other features like 8.3 names, file indexing and last-modification-date tracking. So I will come back soon when I have more information.

But anyway, for the moment we get a new time reference here: 2''59' for disk-based tables against 778 ms for In-Memory optimized tables + natively compiled stored procedures in the best scenario, which tends to show that with In-Memory optimized tables we may get a huge performance improvement, depending on the scenario.

So stay connected and see you soon for the next story!

 

Sharepoint and Nintex Workflow: automate your Business Processes

Mon, 2015-05-18 03:58
 

Nintex_workflow_logo

 

What is a WORKFLOW?

A workflow is the representation of a sequence of operations, declared as the work of a person or group, an organization of staff, or one or more simple or complex mechanisms. These mechanisms can be automated using specific tools.

In other words, we could compare a "workflow" to a "recipe".
Imagine you have to cook something sweet, first you choose the kind, then open the book, and you have to follow steps, one by one with validation, to finally get a beautiful apple pie, Business Processes are the same. Step by step, each validation counts, in order to get a final "product". By NOT USING PROCESSES ... processnogood   ... following a WORKFLOW   precessgood Nintex Workflow

Nintex is an independent software vendor and workflow company with customers in over 90 countries. Nintex was founded in 2004; the company's products include Nintex Workflow (recognized by Forrester analysts as the most utilized third-party tool for Microsoft SharePoint), Nintex Forms, Nintex Workflow for Project Server, Nintex Live and the Nintex Mobile apps.

SharePoint and Nintex Workflow are the best combination for advanced workflows

SharePoint already has an adequate integrated function that enables you to use workflows.
As soon as users start to use it, they become familiar with the tool and, as usual, they want more and more ... and here the advanced needs start.


Nintex is the recommended tool. It can be integrated into the SharePoint user interface, so it does not require the use of Visual Studio or SharePoint Designer. Nintex Workflow provides an intuitive visual interface and allows users and IT pros to create and maintain workflows more easily. Nintex Workflow 2007 and 2010 add a 'drag and drop' workflow designer, connectivity and advanced workflow features to the Microsoft SharePoint 2010 platform.

Why automation?

Workflow automation ensures a perfect interaction between documents, users and management applications.


The advantages of Nintex Workflow

  • It is intuitive and easy to use.
  • You can view the real-time status of workflows.
  • It improves your business performance.
  • It offers reusable workflows.
  • It is easy to use and manage; no client software is needed.
CONCLUSION

Benefits of workflow automation

  • Optimum productivity and competitiveness through continuous workflow processing.
  • Shorter response times and better customer service through faster processing and full availability of information.
  • Optimized cash flow through a better customer invoice processing cycle.
  • Management of financial and legal risks through audit processes and data retention over several years.
  • Reliability and consistency of the process to ensure compliance with procedures.
  • Transparent workflows with continuous tracking of documents, identifying bottlenecks and speeding up decision making.
  • A complete history of all changes made by users, available at any time.
  • Quick adaptability to environmental changes or customers' business needs, providing a decisive competitive advantage.

 

Using Workflow Automation with Nintex on SharePoint makes life easier.

Variations on 1M rows insert (6): CPU Flame Graph

Sun, 2015-05-17 12:21

If you followed the 1M rows insert variations, you have seen how something simple brought me into an investigation with the latest tools available to see where the time is spent. When the time is CPU time, there is no wait event to give the detail of DB Time. I've used perf events in the previous post and I'll now visualize them with a CPU Flame Graph. My goal is to understand why my test case is 2x longer in 11.2.0.4 vs 11.2.0.3.

The idea to use Flame Graphs comes from Luca Canali:

Variations on 1M rows insert (5): using 'perf'

Sat, 2015-05-16 13:13

In the previous post I explained that I observed 2x CPU usage when running my insert test case in 11.2.0.4 vs. 11.2.0.2. I discussed that with @martinberx @OracleSK @TanelPoder @xtner @Ycolin @fritshoogland @lleturgez @LucaCanaliDB on Twitter and it became clear that:

  • 'perf' is the tool to use when investigating CPU usage.
  • we must use the hardware event 'cpu-cycles' and not 'cpu-clock', and hardware events are not available in my VirtualBox VM
So I installed a VMware VM with OEL7 and the following two versions of Oracle: 11.2.0.3 and 11.2.0.4

Elapsed Time

On my VirtualBox VMs, my test case elapsed time was 77 seconds on 12.1.0.2 instead of 35 seconds on 11.2.0.2. I've tested it in different situations: 32-bit and 64-bit Linux, 11.2.0.1, 11.2.0.2, 11.2.0.4, and it was clear that the CPU time used to execute the test case is consistently about 35 seconds in version 11.2.0.2 and before, and double that on versions 11.2.0.4 and 12c.

I tried perf, but it wasn't possible to get cpu-cycles samples even when setting SSE4 passthrough. So I installed a VMware environment (with 'Virtualize CPU performance counters' enabled in the VM settings) and tried the bulk insert version.

Here are a few runs on 11.2.0.4

Elapsed: 00:00:18.01
Elapsed: 00:00:16.40
Elapsed: 00:00:15.61
Elapsed: 00:00:15.54
Here is the elapsed time on 11.2.0.3
Elapsed: 00:00:08.34
Elapsed: 00:00:08.35
Elapsed: 00:00:09.87
Elapsed: 00:00:09.25
So I have two conclusions for the moment:
  • It's not the same time in VMware and VirtualBox
  • I've reproduced my issue where the elapsed time is 2x in 11.2.0.4

cpuinfo

The performance difference probably comes from the CPU features that are enabled in the VMware VM:

[oracle@vmware home]$ grep ^flags /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c rdrand hypervisor lahf_lm ida arat epb xsaveopt pln pts dtherm fsgsbase smep
and that are not in the VBox VM:
[oracle@vbox home]$ grep ^flags /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl pni monitor ssse3 lahf_lm

perf

In order to use perf, I've read Stefan Koehler's blog about it.

First, I get the process id for my session:

SQL> select spid from v$process 
where addr=(select paddr from v$session where sid=sys_context('userenv','sid'));
SPID
------------------------
55597
Then I record the sampling of cpu-cycles:
perf record -e cpu-cycles -o /tmp/perf.out -g -p 55597
Then I run my PL/SQL block inserting 1M rows, bulk insert, commit only at the end. And I stop the sampling with ^c.

I show the report with:

perf report -i /tmp/perf.out -g none -n --stdio

Perf report for run on 11.2.0.3

# ========
# captured on: Sat May 16 21:59:53 2015
# hostname : localhost.localdomain
# os release : 3.8.13-35.3.1.el7uek.x86_64
# perf version : 3.8.13-35.3.1.el7uek.x86_64
# arch : x86_64
# nrcpus online : 1
# nrcpus avail : 1
# cpudesc : Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
# cpuid : GenuineIntel,6,58,9
# total memory : 1780608 kB
# cmdline : /usr/libexec/perf.3.8.13-35.3.1.el7uek.x86_64 record -e cpu-cycles -o /tmp/perf.out -g -p 55256
# event : name = cpu-cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1, precise_ip = 0, id = { 10 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
# ========
#
# Samples: 28K of event 'cpu-cycles'
# Event count (approx.): 18991352213
#
# Overhead      Samples  Command       Shared Object                                       Symbol
# ........  ...........  .......  ..................  ...........................................
#
    13.11%         3763   oracle  oracle              [.] kdiins0
     3.43%          958   oracle  oracle              [.] lnxadd
     2.73%          761   oracle  oracle              [.] pfrfd1_init_locals
     2.47%          688   oracle  oracle              [.] lnxmul
     2.23%          635   oracle  oracle              [.] __intel_new_memcpy
     2.18%          608   oracle  oracle              [.] pevm_ENTER
     1.90%          529   oracle  oracle              [.] lnxsub
     1.84%          519   oracle  oracle              [.] pmusgel_Get_Element
     1.75%          500   oracle  oracle              [.] kdkcmp1
     1.68%          467   oracle  oracle              [.] pfrrun_no_tool
     1.58%          440   oracle  oracle              [.] pfrust
     1.54%          429   oracle  oracle              [.] lnxmin
     1.22%          349   oracle  oracle              [.] kauxsin
     1.07%          297   oracle  oracle              [.] pfrinstr_INMDH
     0.94%          269   oracle  [kernel.kallsyms]   [k] native_apic_mem_write
     0.88%          244   oracle  oracle              [.] pevm_RET
     0.87%          248   oracle  oracle              [.] pfsabrc
     0.85%          237   oracle  oracle              [.] pfrinstr_ADDN
     0.83%          231   oracle  oracle              [.] pfrxca
     0.78%          217   oracle  oracle              [.] pfrinstr_INHFA1
     0.75%          209   oracle  oracle              [.] pfrinstr_SUBN
     0.73%          204   oracle  oracle              [.] kgiPinObject
     0.70%          200   oracle  oracle              [.] pfrb2_convert_var2aad
     0.69%          191   oracle  oracle              [.] pmusise_Insert_Element
     0.67%          187   oracle  oracle              [.] pfrinstr_INFR
     0.64%          179   oracle  libc-2.17.so        [.] __memmove_ssse3_back

Perf report for run on 11.2.0.4

# ========
# captured on: Sat May 16 19:46:51 2015
# hostname : localhost.localdomain
# os release : 3.8.13-35.3.1.el7uek.x86_64
# perf version : 3.8.13-35.3.1.el7uek.x86_64
# arch : x86_64
# nrcpus online : 1
# nrcpus avail : 1
# cpudesc : Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
# cpuid : GenuineIntel,6,58,9
# total memory : 1780608 kB
# cmdline : /usr/libexec/perf.3.8.13-35.3.1.el7uek.x86_64 record -e cpu-cycles -o /tmp/perf.out -g -p 35344
# event : name = cpu-cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1, precise_ip = 0, id = { 5 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
# ========
#
# Samples: 25K of event 'cpu-cycles'
# Event count (approx.): 17296175886
#
# Overhead      Samples  Command      Shared Object                                       Symbol
# ........  ...........  .......  .................  ...........................................
#
    14.20%         3646   oracle  oracle             [.] kdiins0
     3.61%          918   oracle  oracle             [.] lnxadd
     3.20%          814   oracle  oracle             [.] pfrfd1_init_locals
     2.49%          637   oracle  oracle             [.] __intel_new_memcpy
     2.43%          618   oracle  oracle             [.] lnxmul
     2.34%          596   oracle  oracle             [.] pevm_ENTER
     2.05%          527   oracle  oracle             [.] kdkcmp1
     2.01%          513   oracle  oracle             [.] pmusgel_Get_Element
     2.00%          507   oracle  oracle             [.] lnxsub
     1.99%          505   oracle  oracle             [.] pfrrun_no_tool
     1.74%          443   oracle  oracle             [.] lnxmin
     1.74%          441   oracle  oracle             [.] pfrust
     1.72%          434   oracle  libc-2.17.so       [.] __memmove_ssse3_back
     1.25%          321   oracle  oracle             [.] kauxsin
     1.22%          311   oracle  oracle             [.] pfrfd_init_frame
     1.17%          299   oracle  oracle             [.] pfrinstr_INMDH
     1.09%          276   oracle  oracle             [.] kglpnp
     0.96%          243   oracle  oracle             [.] pfrinstr_ADDN
     0.94%          239   oracle  oracle             [.] pfrxca
     0.93%          239   oracle  oracle             [.] pfsabrc
     0.91%          230   oracle  oracle             [.] pmusise_Insert_Element
     0.90%          228   oracle  oracle             [.] __PGOSF347_pfrinstr_INHFA1
     0.81%          206   oracle  oracle             [.] kss_get_owner
     0.80%          204   oracle  oracle             [.] pfrinstr_XCAL
     0.72%          182   oracle  oracle             [.] pevm_RET
     0.70%          177   oracle  oracle             [.] pevm_ARGEIBBI
     0.70%          178   oracle  oracle             [.] pfrb2_convert_var2aad

I don't see many differences here, so it does not seem that an additional code path is being run.

So the question is still open... But 'perf' is really a great tool to get detail about the database time which is not accounted in wait events. There are a lot of references about it at the end of Stefan Koehler's blog.

Update May 17th: It was not clear when I posted that yesterday, but all tests on that VMware environment are done using the bulk version with forall but no append_values hint.

Optimized Row Columnar (ORC) format in PostgreSQL

Sat, 2015-05-16 06:28

Nowadays everybody is talking about columnar storage formats. What can PostgreSQL do in this area? There is no native support for that in PostgreSQL, but thanks to the fact that PostgreSQL is highly extensible there is a foreign data wrapper called cstore_fdw. Let's take a look at what it can do.

Variations on 1M rows insert (4): IN MEMORY

Fri, 2015-05-15 14:09

The Oracle In-Memory option is for reporting. Oracle has chosen a hybrid approach: maintain the In-Memory Column Store in addition to the buffer cache. Don't forget our In-Memory event, but for the moment, let's try the 1M rows insert when the target table is IN MEMORY.

12c

I've done the previous variations on 11g because I wanted to use the Developer Days VM that also has TimesTen installed. But now I want to test some 12c features, so I run the same as the 1st variation: the one on a table with one index (primary key) and conventional inserts committed only at the end, which takes 35 seconds on 11.2.0.2.

Here is the same on an equivalent VM with 12.1.0.2:

SQL> set timing on
SQL> declare
  2   type people_array is varray(12) of varchar(15);
  3   people people_array := people_array( 'Marc', 'Bill', 'George', 'Eliot', 'Matt', 'Trey', 'Tracy','Greg', 'Steve', 'Kristina', 'Katie', 'Jeff' );
  4   people_count number :=people.COUNT;
  5   n number;
  6  begin
  7   for i in 0..1e6 loop
  8    n:=trunc(dbms_random.value(0,10000));
  9    insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
 10    -- 
 11   end loop;
 12   commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:17.03
Yes, it's 2 times longer and I don't know why. I tried with different versions and for the moment it seems that the CPU consumption in 11.2.0.4 or 12c is 2x when compared to 11.2.0.3. I didn't investigate further and I'll probably come back to that after attending Luca Canali's session next week about modern Linux tools for Oracle troubleshooting.

Please don't draw any conclusion from it now. In real life, 12c is not slower than 11g. This is just one testcase in one specific context and there is nothing to say about it before understanding what happens. This test is just there to set the baseline at 00:01:17 on that 12c database.
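To keep an eye on where that time goes, I check the session statistics before and after the run. Here is a minimal sketch (standard statistic names from v$statname; values are in centiseconds in v$mystat):

-- Minimal sketch: session CPU and DB time, queried from the session
-- that runs the test, before and after the run
select n.name, round(s.value/100,1) seconds
from v$statname n
join v$mystat s on s.statistic# = n.statistic#
where n.name in ('CPU used by this session','DB time');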

In-Memory

The question of the day is: how does this 1 million rows insert behave on an IN MEMORY table? We know that there is an overhead to maintain both the buffer cache and the In-Memory Column Store. And we know that this is probably not a problem because In-Memory often lets us get rid of a few indexes, and the gain in index maintenance compensates for the overhead.

SQL> create table DEMO ("id" number primary key, "text" varchar2(15), "number" number) 
INMEMORY PRIORITY CRITICAL;
Table created.
I've created the table with the INMEMORY attribute (PRIORITY CRITICAL). The load duration is not much higher than without In-Memory:
PL/SQL procedure successfully completed.
Elapsed: 00:01:23.35	
However, the in-memory journal has been updated:
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
IM space private journal bytes allocated                           25100288
IM space private journal bytes freed                               25100288
IM transactions rows journaled                                       394895
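For reference, those counters come from the session statistics and can be queried like this (a minimal sketch, run from the inserting session):

-- Minimal sketch: In-Memory journal counters for the current session
select n.name, s.value
from v$statname n
join v$mystat s on s.statistic# = n.statistic#
where n.name like 'IM space private journal%'
   or n.name like 'IM transactions%';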
The 25MB is the size of my 1M rows but not all rows have been populated in memory:
SQL> select segment_name,inmemory_size,bytes,bytes_not_populated from v$im_segments;

SEGMENT_NA    INMEMORY_SIZE            BYTES      BYTES_NOT_POPULATED
---------- ---------------- ---------------- ------------------------
DEMO              1,179,648       23,068,672               11,354,112
If I query the table we still read some blocks from buffer cache:
SQL> set autotrace on 
SQL> select count(*) from demo;

  COUNT(*)
----------
   1000002

Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

----------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    28   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |      |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| DEMO |  1025K|    28   (0)| 00:00:01 |
----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
         13  recursive calls
          1  db block gets
       4681  consistent gets
       1795  physical reads
     145188  redo size
        545  bytes sent via SQL*Net to client
        551  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

And now the whole table is populated in memory:
SQL> select segment_name,inmemory_size,bytes,bytes_not_populated from v$im_segments;

SEGMENT_NA    INMEMORY_SIZE            BYTES      BYTES_NOT_POPULATED
---------- ---------------- ---------------- ------------------------
DEMO              8,585,216       23,068,672                        0
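To follow that population progress without waiting for a full scan, a minimal sketch using the POPULATE_STATUS column of v$im_segments:

-- Minimal sketch: population progress per in-memory segment
select segment_name, populate_status, inmemory_size, bytes_not_populated
from v$im_segments;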

But even then, we need to read some blocks from buffer cache:
SQL> set autotrace on
SQL> select count(*) from demo;

  COUNT(*)
----------
   1000002

Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

----------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    28   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |      |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| DEMO |  1025K|    28   (0)| 00:00:01 |
----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1381  consistent gets
          0  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        551  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Conclusion

The update of In-Memory is not very well documented. From this simple test, it seems that updating the In-Memory Column Store has very limited overhead when storing the modifications into the in-memory transactional log. However, reading the rows just after the insert seems not to be very optimal: not all rows have been journaled into the transaction log, and even when everything is populated, the buffer cache is still read. If you want to know a bit more, you can check Mahmoud Hatem's investigations about that, done with deep tracing.

In this post there are more questions than answers, but the fact is that maintaining the In-Memory Column Store is not a big overhead, which makes it possible on our OLTP databases.

Migration of QlikView Services

Fri, 2015-05-15 12:00


QlikView is a product of the software company Qlik (previously known as QlikTech) designed to provide business intelligence & visualization capabilities using reports or dashboards. Just like a Documentum environment (Content Server, WebServer, JMS, and so on), the QlikView solution is composed of several components. QlikView has three main components:

  • QlikView Server: the core of the QlikView solution which provides access to the QlikView Documents for the users
  • QlikView Publisher: an entity used to automatically reload and more efficiently manage the QlikView Documents
  • QlikView Desktop: a development tool that can be used to build your reports/dashboards


Based on these descriptions, the only element that is needed on all environments is the QlikView Server. Of course at a certain point of your project, you may also need a QlikView Desktop to build your reports/dashboards, but once they are done, you don't really need it anymore. The QlikView Publisher isn't necessary but it will definitely make your life easier.

 

I. Considerations


To be more precise, QlikView doesn't just provide components, it provides Services. In fact, each Service is represented by a Windows Service and can be seen as a specific role with dedicated features. In the QlikView world, an upgrade and a migration are almost the same thing. The main difference is that some elements may change between two main releases of QlikView: the path of a folder, the .NET Framework used, and so on. So if you plan to upgrade or migrate your installation (or a part of it), the most important thing to understand is probably that you need to take care of each Service, one Service at a time.


To improve performance, the QlikView Server is designed to mainly use RAM to store QlikView Documents, because access to memory is much faster than access to the hard drive. For example, when a user opens a 1GB QlikView Document (using a browser), 4GB of RAM are used to store the document in memory. Each additional user that accesses this QlikView Document will increase this amount by 40% of the document's size (+400MB of RAM per user), so ten concurrent users on that 1GB document need roughly 4GB + 9 x 0.4GB = 7.6GB of RAM. On the other hand, the QlikView Publisher is designed to use CPUs for its calculations.


When using QlikView in a small company or with a small number of users, installing all QlikView Services on one Windows Server is often sufficient. A Windows Server with 64, 128 or 256GB of RAM and 16, 32 or 64 CPUs is something quite "normal" for QlikView. However, if your QlikView environment starts to show some weaknesses (job failures, locked tasks, QMC not responding, and so on) then it's probably time to do something... Because the QlikView Server and Publisher handle memory and CPU consumption in very different ways, a best practice is always to separate them, but for small companies it may not be necessary.


II. Migration of Services - Theory


Because of this last consideration, in the two remaining parts of this post I will try to explain how to separate the QlikView Services as it is recommended. So what are the Services provided by QlikView?

  • QlikView Server
  • QlikView WebServer (when using QlikView's own WebServer)
  • QlikView Settings Service (when using IIS)
  • QlikView Distribution Service
  • QlikView Management Service
  • QlikView Directory Service Connector


You can also have some additional Services according to what has been installed on the QlikView environment, like the QlikView Offline Service (offline access via a mobile/tablet) or the QlikView SAP Network Server. The best practice is generally to do the following:

Server 1 - focus on RAM

  • QlikView Server
  • QlikView WebServer


Server 2 - focus on CPU

  • QlikView Management Service
  • QlikView Directory Service Connector
  • QlikView Distribution Service


III. Migration of Services


The general procedure to migrate a QlikView Service from Server 1 to Server 2 is always the same, but some steps differ a little bit for a specific Service. Remember that the best thing to do is always to migrate one Service at a time and to check that QlikView is still working properly between each migration. So an overview of this procedure would be:

  1. Installation of the QlikView Service on Server 2
  2. Configuration
  3. Uninstallation of the QlikView Service on Server 1



Installation of the QlikView Service on Server 2


The installation of a Service on a separate server during a migration is quite a simple step:

  1. Stop the QlikView Service on Server 1
  2. Run the QlikView Server installer: QlikViewServer_x64Setup.exe (64-bit Windows Server 2008) or QlikViewServer_Win2012andUp.exe (64-bit Windows Server 2012)
  3. On the screen to specify what should be installed, always choose the "Custom" proposal and then check which QlikView Service should be installed.
  4. Reboot the Server 2



Configuration


The configuration part is quite simple since there is only one task that should be executed: change a URL. This task is the same for all QlikView Services except one: the QlikView Management Service. For all other QlikView Services, here is what should be done:

  1. Open the QlikView Management Console using a browser. The default URL is something like: http://##Server_1_Hostname##:4780/qmc/SystemSetup.htm
  2. Expand the folder that corresponds to the QlikView Service (e.g. "Distribution Services" for the QlikView Distribution Service)
  3. Click on the element inside this folder (e.g. "QDS@##Server_1_Hostname##" for the QlikView Distribution Service)
  4. Click on the "General" tab
  5. Change the URL value replacing ##Server_1_Hostname## with ##Server_2_Hostname##
  6. Click on "Apply" to save your changes


And that should be sufficient for QlikView to know that you installed the QlikView Service on another server.



So as said above, the configuration part is different for the QlikView Management Service. This Service is the one that takes care of the configuration on the QlikView Management Console. That's why it doesn't make much sense to change something on the QlikView Management Console that was just installed on another server (the installer knows on which server it was executed)...


So what should be done in case this QlikView Service has been installed on Server 2? Well it's also quite simple: almost all configurations of QlikView are stored in something called the "QVPR Database" (QVPR for QlikView Repository). By default this QVPR Database is just an XML Repository but it can also be a SQL Server database. There is a setting on the QlikView Management Console (System > Setup > Management Service > Repository tab) that controls whether the QVPR Database should be backed up or not (never, daily, every X minutes...) and there is even a button to "Backup Now" the configuration. The location of these backups is defined on this page too, but if you want to open the real location of the XML Repository, you should take a look at "C:/ProgramData/QlikTech/ManagementService/QVPR/". So the configuration part for the QlikView Management Service consists of:

  1. Stop the QlikView Management Service on Server 2
  2. Copy the QVPR backup from Server 1 to Server 2
  3. Restore the QVPR backup to the Server 2 (replace existing files with the ones from the backup)
  4. Start the QlikView Management Service on Server 2
  5. Check if all configurations are OK



Uninstallation of the QlikView Service on Server 1


The uninstallation step is quite simple too... Everything is simple with QlikView, isn't it? ;)

  1. Run the QlikView Server installer: QlikViewServer_x64Setup.exe (64-bit Windows Server 2008) or QlikViewServer_Win2012andUp.exe (64-bit Windows Server 2012)
  2. QlikView will detect that some components are already installed
  3. On the second screen, select "Modify"
  4. On the next screen, click on the QlikView Service and disable it ("This feature will not be available")
  5. Complete the "installation" process to uninstall the QlikView Service

 

Once the three QlikView Services have been migrated from Server 1 to Server 2, you should be able to see an improvement in performance or at least fewer issues with the QlikView Server & Publisher! ;)

 

 

tmux - an alternative to screen

Fri, 2015-05-15 00:37

You may already use screen for multiplexing a terminal. This is especially useful when you want to start long-running processes on a server and do not want to lose the connection because of an ssh connection timeout, firewall settings or other reasons. With screen, the session keeps running even if you get disconnected somehow and you may re-attach to the screen session at any point later.

Matching SQL Plan Directives and queries using it

Thu, 2015-05-14 13:45

This is another blog post I'm writing while reviewing the presentation I'm doing next week for SOUG and next month for DOAG. You have SQL Plan Directives used by several queries that have the same kind of predicates, and queries that can use several SPDs. Can you match them?

When a query uses a SPD (meaning that the SPD is in usable state: NEW, MISSING_STATS or PERMANENT internal state) the execution plan shows it as:

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement
but you don't have information about which directive(s).

Unfortunately that information is not stored in V$SQL_PLAN. There are two ways to get it:

  • Parse it and trace it with set events 'trace [SQL_Plan_Directive.*]' but that's for another post.
  • Do an EXPLAIN PLAN and info is in PLAN_TABLE.OTHER_XML

example

Here are the SQL Plan Directives I have:

SQL> select directive_id,type,state,reason,notes,created,last_modified,last_used from dba_sql_plan_directives where directive_id in( select directive_id from dba_sql_plan_dir_objects where owner='DEMO' ) order by created;

           DIRECTIVE_ID TYPE             STATE      REASON
----------------------- ---------------- ---------- ------------------------------------
NOTES                                                                                      CREATED
------------------------------------------------------------------------------------------ --------
LAST_MOD LAST_USE
-------- --------
   11092019653200552215 DYNAMIC_SAMPLING SUPERSEDED SINGLE TABLE CARDINALITY MISESTIMATE
.spd_note.                                                                                 21:21:58
  .internal_state.HAS_STATS./internal_state.
  .redundant.NO./redundant.
  .spd_text.{EC(DEMO.DEMO_TABLE)[A, B, C, D]}./spd_text.
./spd_note.
21:30:09 21:30:09

    9695481911885124390 DYNAMIC_SAMPLING USABLE     SINGLE TABLE CARDINALITY MISESTIMATE
.spd_note.                                                                                 21:35:45
  .internal_state.NEW./internal_state.
  .redundant.NO./redundant.
  .spd_text.{E(DEMO.DEMO_TABLE)[A, B, C, D]}./spd_text.
./spd_note.
(I changed the xml tag because our current blog platform is a bit creative with them... fortunately we are migrating soon to wordpress)

+metrics

So in order to have more information, you have to re-parse the statement with EXPLAIN PLAN FOR... and show it with DBMS_XPLAN.DISPLAY with the format '+METRICS'.

SQL> explain plan for select * from DEMO_TABLE where a+b=c+d;

Explained.
This query will use the {E(DEMO.DEMO_TABLE)[A, B, C, D]} directive but not the {EC(DEMO.DEMO_TABLE)[A, B, C, D]} one because the predicates are not on simple columns.
Let's get the execution plan from PLAN_TABLE with the +METRICS format:
SQL> select * from table(dbms_xplan.display(null,null,'+metrics'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 4063024151

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |  1000 | 12000 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| DEMO_TABLE |  1000 | 12000 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"+"B"="C"+"D")

Sql Plan Directive information:
-------------------------------

  Used directive ids:
    9695481911885124390

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement

As you can see, in addition to the number of SPDs used, you get the DIRECTIVE_ID.
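The directive information can also be retrieved directly from the PLAN_TABLE after the EXPLAIN PLAN, since it is stored in OTHER_XML. A minimal sketch that simply dumps it (I don't rely on the exact XML structure here; just search the output for the directive id shown by '+METRICS'):

-- Minimal sketch: dump OTHER_XML from the PLAN_TABLE rows of the last EXPLAIN PLAN
select other_xml
from plan_table
where other_xml is not null;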

conclusion

It's not easy to match all queries that can use a SQL Plan Directive, but you can do it the other way around: do an explain plan for each query you suspect and check the notes. If you are ready to parse a lot of queries, you can also do it automatically, as in the sketch below.
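Here is a minimal sketch of that automation. It assumes the candidate statements are stored in a table called MY_CANDIDATE_QUERIES (a hypothetical name) with a SQL_TEXT column, and it simply greps the '+METRICS' output for the directive section:

set serveroutput on
begin
  for q in (select sql_text from my_candidate_queries) loop  -- hypothetical table holding the statements to check
    execute immediate 'delete from plan_table';              -- keep only the plan of the current statement
    execute immediate 'explain plan for ' || q.sql_text;
    for l in (select plan_table_output
              from table(dbms_xplan.display(null, null, '+metrics'))) loop
      -- the '+METRICS' format prints the "Used directive ids:" section when a SPD is used
      if lower(l.plan_table_output) like '%directive%' then
        dbms_output.put_line(substr(q.sql_text,1,60) || ' : ' || l.plan_table_output);
      end if;
    end loop;
  end loop;
end;
/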

Kerberos configuration for DFS 6.7 SP1

Thu, 2015-05-14 13:30


In my last post, I explained how to configure Kerberos for a CS 6.7 SP1. Unfortunately if you only configure the Content Server, it will almost be useless... If you want this configuration to be useful, then you will also have to configure the Kerberos SSO for the Documentum Foundation Services (DFS). That's why in this blog post I will describe step by step what needs to be done for that purpose.


So what are the pre-requisites to set up the Kerberos SSO for the Documentum Foundation Services? Well of course you will need an application server for your DFS, a Content Server that is already installed and an Active Directory to generate the keytab(s). Just to let you know, I used (for the DFS) a Tomcat application server that is on the Content Server's machine and an Active Directory on a Windows Server 2008 R2. Let's define the following properties:

  • Active Directory - user = dfskrb
  • Active Directory - password = ##dfskrb_pwd##
  • Active Directory - domain = DOMAIN.COM
  • Active Directory - hostname1 = adsrv1.domain.com
  • Active Directory - hostname2 = adsrv2.domain.com
  • Alias of the DFS' host = csdfs.domain.com (can be a Load Balancer alias)
  • $CATALINA_HOME = /opt/tomcat


I. Active Directory prerequisites


As always when working with Kerberos on an Active Directory, the first thing to do is to create a user. So let's create this user with the following properties:

  • User name: dfskrb
  • Supports AES 128-bit encryption
  • This account MUST NOT support AES 256-bit encryption. I set it that way because the Content Server doesn't support AES 256-bit encryption, so I disabled it for the DFS part too.
  • Trust for Delegation to any service (Kerberos Only)
  • Password never expires
  • Account never expires
  • Account not locked


Once the user has been created, you can proceed with the keytab creation using the command prompt on the Active Directory host:

[screenshot: keytab creation command for the dfs user (dfs_keytab.png)]


For the Content Server part, the name of the "princ" (SPN) has to be "CS/##repository_name##". For the DFS part, the EMC documentation asks you to generate a keytab with an SPN that is "DFS/##dfs_url##:##dfs_port##". In fact, if you are going to use only one DFS URL/port, then you don't need to add the port to the SPN of the DFS.


Regarding the name of the keytab, for the Content Server part, it has to be "##repository_name##.keytab" for the Content Server to be able to automatically recognize it during the server re-initialization. For the DFS part, the name of the keytab isn't important because you will have to configure it manually.


II. Configuration of the Documentum Foundation Services side


So let's start the configuration of the Kerberos SSO for the DFS. The first thing to do is of course to transfer the keytab created previously (dfs.keytab) from the Active Directory to the host of the DFS (a Linux host in my case). There is no specific location for this keytab, so you just have to put it somewhere and remember this location. For this example, I will create a folder that will contain all required elements. Please make sure that the keytab belongs to the Documentum installation owner (user and group) on the file system, with the appropriate permissions (640).

[dmadmin ~]$ echo $CATALINA_HOME
/opt/tomcat
[dmadmin ~]$ mkdir /opt/kerberos
[dmadmin ~]$ mv ~/dfs.keytab /opt/kerberos
[dmadmin ~]$ chmod 640 /opt/kerberos/dfs.keytab


Create the file "/opt/kerberos/jaasDfs.conf" with the following content:

[dmadmin ~]$ cat /opt/kerberos/jaasDfs.conf
DFS-csdfs-domain-com {
  com.sun.security.auth.module.Krb5LoginModule required
  debug=true
principal="DFS/document.write(['csdfs.domain.com','DOMAIN.COM'].join('@'))"
realm="DOMAIN.COM"
  refreshKrb5Config=true
  noTGT=true
  useKeyTab=true
  storeKey=true
  doNotPrompt=true
  useTicketCache=false
  isInitiator=false
  keyTab="/opt/kerberos/dfs.keytab";
};


The first line of this file jaasDfs.conf is the name of the "module". This name is derived from the SPN (or principal) of the DFS: take the SPN, keep the uppercase/lowercase characters, remove the REALM (everything from the at-sign, included) and replace all special characters (slash, backslash, dot, colon, and so on) with a simple dash "-". For example, DFS/csdfs.domain.com@DOMAIN.COM becomes DFS-csdfs-domain-com.


The next thing to do is to modify the DFS war file. So let's create a backup of this file and prepare its modification:

[dmadmin ~]$ cd $CATALINA_HOME/webapps/
[dmadmin ~]$ cp emc-dfs.war emc-dfs.war.bck_$(date +%Y%m%d)
[dmadmin ~]$ cp emc-dfs.war /tmp/
[dmadmin ~]$ cd /tmp
[dmadmin ~]$ unzip emc-dfs.war -d /tmp/emc-dfs


The setup of the Kerberos SSO requires some jar files that aren't necessarily present on a default installation. For that purpose, you can copy these jar files from the Content Server to the exploded DFS:

[dmadmin ~]$ cp $DOCUMENTUM/product/dfc6.7/dfc/jcifs-krb5-1.3.1.jar /tmp/emc-dfs/WEB-INF/lib/
[dmadmin ~]$ cp $DOCUMENTUM/product/dfc6.7/dfc/krbutil.jar /tmp/emc-dfs/WEB-INF/lib/
[dmadmin ~]$ cp $DOCUMENTUM/product/dfc6.7/dfc/vsj-license.jar /tmp/emc-dfs/WEB-INF/lib/
[dmadmin ~]$ cp $DOCUMENTUM/product/dfc6.7/dfc/vsj-standard-3.3.jar /tmp/emc-dfs/WEB-INF/lib/
[dmadmin ~]$ cp $DOCUMENTUM/product/dfc6.7/dfc/questFixForJDK7.jar /tmp/emc-dfs/WEB-INF/lib/


Once done, a Kerberos handler must be added to the DFS. For that purpose, open the file authorized-service-handler-chain.xml, locate the XML comment that starts with "Any handler using ContextFactory" and add the following lines just before this comment:

[screenshot: Kerberos handler added to authorized-service-handler-chain.xml (authorization_chain.png)]


Then, some Kerberos specific configurations must be added to the web.xml file. For that purpose, open this file and add the following lines at the end, just before the web-app end tag (before the last line):

[screenshot: Kerberos env-entry sections added to web.xml (web_xml.png)]


In the above configuration, only the "env-entry-value" for each "env-entry" section should be changed to match your environment. As you can see, the krb5.conf file referenced here is in /opt/kerberos. You can use the same krb5.conf file as the one used for the Content Server or you can specify a separate file. As this file can be the same for the Content Server and the DFS, I will not describe it here; just check my last post to get more information about that.


So the configuration is now over and you can just repackage and re-deploy the new DFS:

[dmadmin ~]$ cd /tmp/emc-dfs/
[dmadmin ~]$ jar -cvf emc-dfs.war *
[dmadmin ~]$ $CATALINA_HOME/bin/shutdown.sh
[dmadmin ~]$ mv emc-dfs.war $CATALINA_HOME/webapps/
[dmadmin ~]$ cd $CATALINA_HOME/webapps/
[dmadmin ~]$ rm -Rf emc-dfs/
[dmadmin ~]$ $CATALINA_HOME/bin/startup.sh


Once done, the Tomcat application server should have been started and the new version of the DFS WAR file should have been deployed. If the Content Server and the DFS are properly set up to use the Kerberos SSO, then you should be able to execute .NET or Java code to get a Kerberos ticket for the DFS and work with the DFS features.


does impdp into a compressed table really compress data?

Thu, 2015-05-14 00:29

Today at a customer we discussed the following scenario: to refresh a test database, a Data Pump export and import was implemented. To save space, the idea came up to compress the data on the test system. When we checked the documentation we came across the following statement:

Matching SQL Plan Directives and extended stats

Wed, 2015-05-13 11:33

This year is the year of migration to 12c. Each Oracle version has its CBO feature that makes it challenging. The most famous was bind variable peeking in 9iR2. Cardinality feedback in 11g also came with some surprises. 12c comes with SPD in any edition, accompanied by Adaptive Dynamic Sampling. If you want to know more about them, the next date is in Switzerland: http://www.soug.ch/events/sig-150521-agenda.html

SQL Plan Directives in USABLE/MISSING_STATS state can create column groups and extended stats on them at the next dbms_stats gathering. When the next usage of the SPD validates that static statistics are sufficient to get good cardinality estimates, the SPD goes into the SUPERSEDED/HAS_STATS state. If an execution still sees misestimates on them, then the state will go to USABLE/PERMANENT and dynamic sampling will be used forever. Note that disabled SPDs can still trigger the creation of extended statistics, but not the dynamic sampling.
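To follow that lifecycle, it can be handy to count the directives per internal state for a schema. A minimal sketch, using the same extract() on the notes as in the queries below (schema name DEMO as an example):

-- Minimal sketch: number of directives per internal state for one schema
select extract(d.notes,'/spd_note/internal_state/text()').getStringVal() internal_state,
       count(*) directives
from dba_sql_plan_directives d
where d.directive_id in (
        select o.directive_id
        from dba_sql_plan_dir_objects o
        where o.owner='DEMO'
      )
group by extract(d.notes,'/spd_note/internal_state/text()').getStringVal();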

Query

If you want to match the directives (from DBA_SQL_PLAN_DIRECTIVES) with the extended statistics (from DBA_STAT_EXTENSIONS) there is no direct link. Both list the columns, but not in the same order and not in the same format:

SQL> select extract(notes,'/spd_note/spd_text/text()').getStringVal() from dba_sql_plan_directives where directive_id in ('11620983915867293627','16006171197187894917');

EXTRACT(NOTES,'/SPD_NOTE/SPD_TEXT/TEXT()').GETSTRINGVAL()
--------------------------------------------------------------------------------
{ECJ(STOPSYS.EDGE)[CHILDID, CHILDTYPE, EDGETYPE]}
{EC(STOPSYS.EDGE)[CHILDID, CHILDTYPE, EDGETYPE]}

Those SPDs have been responsible for the creation of the following column group:
SQL> select owner,table_name,extension from dba_stat_extensions where extension_name='SYS_STSDXN5VXXKAWUPN9AEO8$$W$J';

OWNER    TABLE_NA EXTENSION
-------- -------- ------------------------------------------------------------
STOPSYS  EDGE     ("CHILDTYPE","CHILDID","EDGETYPE")

So I've made the following query to match both:

SQL> column owner format a8
SQL> column table_name format a30
SQL> column columns format a40 trunc
SQL> column extension_name format a20
SQL> column internal_state format a9
SQL>
SQL> select * from (
    select owner,table_name,listagg(column_name,',')within group(order by column_name) columns
     , extension_name
    from dba_tab_columns join dba_stat_extensions using(owner,table_name)
    where extension like '%"'||column_name||'"%'
    group by owner,table_name,extension_name
    order by owner,table_name,columns
    ) full outer join (
    select owner,object_name table_name,listagg(subobject_name,',')within group(order by subobject_name) columns
     , directive_id,max(extract(dba_sql_plan_directives.notes,'/spd_note/internal_state/text()').getStringVal()) internal_state
    from dba_sql_plan_dir_objects join dba_sql_plan_directives using(directive_id)
    where object_type='COLUMN' and directive_id in (
        select directive_id
        from dba_sql_plan_dir_objects
        where extract(notes,'/obj_note/equality_predicates_only/text()').getStringVal()='YES'
          and extract(notes,'/obj_note/simple_column_predicates_only/text()').getStringVal()='YES'
        and object_type='TABLE'
    )
    group by owner,object_name,directive_id
    ) using (owner,table_name,columns)
   order by owner,table_name,columns
  ;
This is just a first draft. I'll probably improve it when needed, and your comments on this blog will help.

Example

Here is an example of the output:

OWNER  TABLE_NAME                COLUMNS             EXTENSION_ DIRECTIVE_ID INTERNAL_
------ ------------------------- ------------------- ---------- ------------ ---------
STE1SY AUTOMANAGE_STATS          TYPE                             1.7943E+18 NEW
STE1SY CHANGELOG                 NODEID,NODETYPE                  2.2440E+18 PERMANENT
...
SYS    AUX_STATS$                SNAME                            9.2865E+17 HAS_STATS
SYS    CDEF$                     OBJ#                             1.7472E+19 HAS_STATS
SYS    COL$                      NAME                             5.6834E+18 HAS_STATS
SYS    DBFS$_MOUNTS              S_MOUNT,S_OWNER     SYS_NC0000
SYS    ICOL$                     OBJ#                             6.1931E+18 HAS_STATS
SYS    METANAMETRANS$            NAME                             1.4285E+19 MISSING_S
SYS    OBJ$                      NAME,SPARE3                      1.4696E+19 NEW
SYS    OBJ$                      OBJ#                             1.6336E+19 HAS_STATS
SYS    OBJ$                      OWNER#                           6.3211E+18 PERMANENT
SYS    OBJ$                      TYPE#                            1.5774E+19 PERMANENT
SYS    PROFILE$                  PROFILE#                         1.7989E+19 HAS_STATS
SYS    SCHEDULER$_JOB            JOB_STATUS          SYS_NC0006
SYS    SCHEDULER$_JOB            NEXT_RUN_DATE       SYS_NC0005
SYS    SCHEDULER$_WINDOW         NEXT_START_DATE     SYS_NC0002
SYS    SYN$                      OBJ#                             1.4900E+19 HAS_STATS
SYS    SYN$                      OWNER                            1.5782E+18 HAS_STATS
SYS    SYSAUTH$                  GRANTEE#                         8.1545E+18 PERMANENT
SYS    TRIGGER$                  BASEOBJECT                       6.0759E+18 HAS_STATS
SYS    USER$                     NAME                             1.1100E+19 HAS_STATS
SYS    WRI$_ADV_EXECUTIONS       TASK_ID                          1.5494E+18 PERMANENT
SYS    WRI$_ADV_FINDINGS         TYPE                             1.4982E+19 HAS_STATS
SYS    WRI$_OPTSTAT_AUX_HISTORY  SAVTIME             SYS_NC0001
SYS    WRI$_OPTSTAT_HISTGRM_HIST SAVTIME             SYS_NC0001

Conclusion

Because SPDs are quite new, I'll conclude with a list of questions:

  • Do you still need extended stats when a SPD is in PERMANENT state?
  • Do you send developers the list of extended stats for which the SPD is in HAS_STATS, so that they integrate them in their data model? Then, do you drop the SPD when the new version is released, or wait for the retention?
  • When you disable a SPD and an extended statistic is created, do you re-enable the SPD in order to have it in HAS_STATS?
  • Having too many extended statistics has an overhead during statistics gathering (especially with histograms on them), but it helps to get better estimations. Do you think that having a lot of HAS_STATS is a good thing or not?
  • Having too many usable (MISSING_STATS or PERMANENT) SPDs has an overhead during optimization (dynamic sampling), but it helps to get better estimations. Do you think that having a lot of PERMANENT is a good thing or not?
  • Do you think that only bad data models have a lot of SPDs? Then why is SYS (the oldest data model, optimized at each release) the schema with the most SPDs?
  • Do you keep your SQL Profiles when upgrading, or do you think that SPDs can replace most of them?

Don't ignore them. SQL Plan Directives are a great feature, but you have to manage them.
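Managing them typically means disabling or dropping a directive by its id, which you get from the queries above. A minimal sketch with the DBMS_SPD calls (the directive id is just taken from an earlier example in this post series; adapt it to your own):

-- Minimal sketch: disable a directive, or drop it explicitly
exec dbms_spd.alter_sql_plan_directive(9695481911885124390,'ENABLED','NO');
exec dbms_spd.drop_sql_plan_directive(9695481911885124390);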