
Feed aggregator

When tempdb database may cause indirect issues

Yann Neuhaus - Tue, 2015-05-05 13:23

A couple of weeks ago, I had an interesting discussion with one of my friends who faced a weird issue with a SQL Server instance that lacked worker threads. Because I can't use his information directly, I decided to reproduce the same issue in order to share some interesting findings with you. So the next part of this blog post refers exclusively to my own test, but it represents exactly the real issue my friend encountered.

Let’s set the scene: in the SQL Server error log we found the following records related to our issue:

 

2015-04-16 14:06:16.54 Server     New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 300 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 0%. System Idle: 94%.

2015-04-16 14:11:16.74 Server     New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 600 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 0%. System Idle: 92%.

 

At the same time, we found an event related to the generation of a SQL Server dump file triggered by the deadlock schedulers monitor, as follows:

 

2015-04-16 14:06:04.66 Server     **Dump thread - spid = 0, EC = 0x0000000000000000
2015-04-16 14:06:04.83 Server     ***Stack Dump being sent to C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Log
2015-04-16 14:06:04.83 Server     * *******************************************************************************
2015-04-16 14:06:04.83 Server     *
2015-04-16 14:06:04.83 Server     * BEGIN STACK DUMP:
2015-04-16 14:06:04.83 Server     *   16/04/15 14:06:04 spid 6392
2015-04-16 14:06:04.83 Server     *
2015-04-16 14:06:04.83 Server     * Deadlocked Schedulers
2015-04-16 14:06:04.83 Server     *
2015-04-16 14:06:04.83 Server     * *******************************************************************************
2015-03-27 01:05:02.83 Server     * -------------------------------------------------------------------------------
2015-03-27 01:05:02.83 Server     * Short Stack Dump

 

I began my investigation by having a first look at the SQL Server error log file, but I didn't find any relevant information. So I decided to move on to the analysis of the dump file.

My first idea was to look at the runaway threads at the moment of the dump file generation (!runaway command). In fact, I first suspected a high-CPU workload because I didn't get any other information from my friend. Here is what I found:

 

blog_39_-_1_-_runaway

 

Next, I looked at the number of threads in use at the moment of the dump file generation:

 

blog_39_-_2_-_nb_of_running_threads

 

557 threads were in use at the moment of the dump file generation. According to the Microsoft documentation here, and given the concerned SQL Server infrastructure (4 logical processors and max worker threads option = 0), the number of running threads is greater than the maximum number of configured SQL Server worker threads, which is 512 in this case.

 

SQL Server detected 2 sockets with 2 cores per socket and 2 logical processors per socket, 4 total logical processors;
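
If you want to check this limit on your own instance, the worker thread maximum that SQL Server computed at start-up is exposed by a DMV, and the number of workers currently allocated can be counted as well. A minimal sketch:

-- Maximum worker threads computed by SQL Server (512 for 4 logical CPUs with max worker threads = 0)
SELECT max_workers_count FROM sys.dm_os_sys_info;

-- Number of workers currently allocated
SELECT COUNT(*) AS workers_allocated FROM sys.dm_os_workers;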

 

We found a lot of thread patterns that seemed to be blocked by a database lock.

 

blog_39_-_3_-_call_stacks

 

Note in this case that SQL Server has decided to switch the “tied” thread off the scheduler, but it is still waiting to be signaled that the database lock is available before it can continue …

Another piece of interesting call stack is the following:

 

blog_39_-_4_-_call_stacks_2

 

 

The call stack above seems to concern the SQL Server start-up process. Let's take a look at functions like sqlmin!DBMgr::OpenUnstartedDatabase, sqlmin!DBMgr::StartupDB or sqlmin!IMEDUtil::SynchronzeWithResourceDB. We can also notice the use of the function sqlmin!StartUp::WaitForTempdbReady, which seems to be related to tempdb database readiness. After that, the concerned thread is also waiting for the database lock to become available, and we don't see any function indicating that tempdb is ready at the moment of the dump file generation.

 

Well, to summarize, we are facing a general locking issue inside the SQL Server process that has consumed all of the available threads, and we may think that tempdb readiness has something to do with our blocking issue.

So now, let's go back to the SQL Server error log and have a deeper look at the related events. At this point, I remembered the famous Bob Ward session at Summit 2011 about the tempdb database, in which he explained the importance of tempdb during the SQL Server start-up process: even after SQL Server is ready for user connections, those connections have to wait until tempdb has really started. It makes sense, because we can wonder what happens if the tempdb database cannot start correctly: do we allow user connections to perform their work? As a reminder, the unavailability of tempdb is considered a SQL Server shutdown event.

In my context, we found that the tempdb database was ready only 40 minutes after SQL Server was ready for user connections, as shown below:

 

2015-04-16 13:52:38.43 spid5s     Starting up database 'master'.
2015-04-16 13:53:05.76 spid9s     Starting up database 'model'.
2015-04-16 13:53:25.02 spid9s     Clearing tempdb database.
2015-04-16 13:51:48.13 spid13s     SQL Server is now ready for client connections. This is an informational message; no user action is required.
2015-04-16 14:32:12.28 spid9s     Starting up database 'tempdb'.
2015-04-16 14:32:15.71 spid9s     Recovery completed for database tempdb (database ID 2) in 2 second(s)
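
A quick way to find these timestamps, without scrolling through the whole file, is to query the error log directly. This is just a hedged example using the (undocumented but widely used) xp_readerrorlog procedure with its search-string parameters:

-- Search the current error log (0) of type SQL Server (1) for the tempdb startup message
EXEC sys.xp_readerrorlog 0, 1, N'Starting up database', N'tempdb';

-- And for the moment the instance was ready for client connections
EXEC sys.xp_readerrorlog 0, 1, N'SQL Server is now ready for client connections';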

 

The $100 question: why did tempdb start so slowly? Well, I didn't have all the information, but my friend told me that they had storage maintenance going on at the time … maybe we may correlate these two events, but without further investigation it is not possible to answer definitively.

Anyway, it was very interesting to see how the tempdb database may trigger an escalation of issues in this case. There are probably a lot of ways to reproduce this issue.

Finally, let me finish this blog post by explaining my test scenario. To simulate the same issue as my friend, I first created a very big user table inside the model database to considerably slow down the start-up of my tempdb database. Then I configured tempdb with large database files without instant file initialization (zeroing is then mandatory for all database files). At the same time, I used the ostress tool to simulate roughly 600 running threads. Finally, I let the magic happen ...
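
For reference, the thread-pressure part can be produced with the ostress utility from the RML Utilities package. The command below is only a sketch with hypothetical server, database and query values; the point is the -n parameter, which opens roughly 600 concurrent connections that each run the query several times:

ostress.exe -S"MYSERVER" -E -d"UserDb" -Q"SELECT COUNT(*) FROM dbo.big_table;" -n600 -r50 -q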

I hope this blog will help you in your database administrator daily work.

Cloud Content Sharing and Collaboration Use Cases in an Enterprise

WebCenter Team - Tue, 2015-05-05 12:24

We are gearing up for the big executive video webcast on Documents Cloud Service featuring Oracle Chief Information Officer Mark Sunday, the IDC Program Vice President for Content and Digital Media Technologies, and Oracle product management executives, including our webcast host, Vice President for Oracle Middleware Scott Howley, interviewing David Le Strat, who drives product management for WebCenter and BPM, and Alexander Hooshmand, Vice President for Oracle Marketing Cloud. Don't forget to register for the live event, "Introducing Oracle Documents Cloud Service," scheduled for Wednesday, May 13 at 10 a.m. PT / 1 p.m. ET.

As we do that, I thought I would share a recent feature published in the April edition of the Oracle Fusion Middleware Newsletter that delves into common use cases we come across, across industries and lines of business, around cloud document and file sharing and cloud collaboration. Take a look and give us your thoughts on how these align with what you are seeing within your enterprise, whether you are in IT or in marketing, sales, services, HR, finance or another division.

Cloud Computing Use Cases: Oracle Documents Cloud Service

Cloud computing—software as a service, platform as a service, and infrastructure as a service—is an undisputed game-changer in the corporate world. But with so many different cloud options, it can be complicated to align them with possible real-life scenarios. Here’s where use cases come in handy. In this series of articles, the Oracle Fusion Middleware Newsletter will illustrate how various aspects of cloud computing work within a business setting. For this issue, we’re looking at Oracle Documents Cloud Service.

“One of the reasons that cloud-based file-sharing services are so popular is that they address a universal challenge. Every employee, regardless of their role, needs a secure way to share files for work collaboration,” says Oracle Vice President of Oracle Fusion Middleware Product Management Scott Howley. “In addition, today’s digital workplace requires a 24/7 access to work content from anywhere on any device.”

Use Case 1: Coordinating with Business Partners
Today, internal business functions work with a variety of outside partners—think, for example, of a project in which marketing collaborates with an outside agency on deliverables. “You need to be able to collaborate and share files in real time, but without jeopardizing confidential information such as a press release that could impact stock prices,” says Howley. With Oracle Documents Cloud Service, you have the convenience of collaborative access without the risk associated with consumer products. Enterprise encryption, auditing, tracking, permission controls, and automatic backups keep your information safe.

Use Case 2: Fostering Sales Collaboration
Sales teams often work under deadline, such as when creating a detailed customer request for proposal, a process that involves quickly changing versions that must be instantly shared with both internal and external parties. Additionally, the final output likely needs to be tied back to an opportunity ID in the organization’s CRM system. With Oracle Documents Cloud Service, automatic versioning and syncing ensures version integrity, and its ability to integrate with other Oracle Cloud services such as Oracle Sales Cloud means CRM opportunities won’t get lost.

Use Case 3: Support for Mobile Field Workers
Companies must ensure that mobile or field-site workers can easily and safely access documents—even those with sensitive company information—via a wide variety of devices, from smart phones to tablets and laptops. Oracle Documents Cloud Service provides that flexibility with native support for iPhones, iPads, and Android mobile devices, along with desktop sync for both Mac and Windows computers. Moreover, the ability to embed a user interface in applications, portals, or sites ensures that content can have limitless expressions as work dictates.

“As these use cases show, smart organizations want more than just a standalone file sync and share solution,” says Howley. “They want enterprise-grade security, control, and integration for cloud content sharing.”

To learn more about Oracle Documents Cloud Service, register today for the Oracle Documents Cloud Service executive webcast featuring Oracle CIO Mark Sunday and visit cloud.oracle.com/documents for more information.


Understand the Lifecycle of Alfresco Nodes

Yann Neuhaus - Tue, 2015-05-05 12:19


I guess you all already know what a lifecycle is. We are born, we live and we die... In fact, it's exactly the same for Alfresco Nodes! Well, at least from an end user point of view. But what is really going on behind that? This is what I will try to explain in this post.

First of all, what is an Alfresco Node? For most people, a Node is just a document stored in Alfresco, but in reality it's much more than that: everything in Alfresco is a Node! A Node has a type that defines its properties, and it also has associations with other Nodes. This is a very short and simplified description, but we don't need to understand exactly what a Node is to understand the lifecycle process. So as said above, Alfresco Nodes have their own lifecycle, but it's a little bit more complicated than just three simple steps.

Please note that in this post, I will use $ALF_HOME as a reference to the location where alfresco has been installed (e.g. /opt/alfresco-4.2.c) and $ALF_DATA as a reference to the alf_data location. By default the alf_data folder is $ALF_HOME/alf_data/.


I. Creation


In this post I will use a document as an Alfresco Node to make the process easy to follow. So this lifecycle starts with the creation of a new document named "Test_Lifecycle.docx" with the following creation date: February the 1st, 2015 at 16:45.

When a document is created in Alfresco, three things are done:

  • File System: the content of this file is stored in the Alfresco "Content Store". The Content Store is by default under $ALF_DATA/contentstore. The file is actually put somewhere under this folder, in a path that depends on the creation time, and an ID is given to it. For our file, it would be: $ALF_DATA/contentstore/2015/2/1/16/45/408a6980-237e-4315-88cd-6955053787c3.bin.
  • Database: the metadata of this file is stored in the Alfresco Database. In fact, in the DB this document is mainly referenced using its NodeRef or NodeID. This NodeRef is something we can see in the Alfresco Web Interface from the document details page (web preview of a document): http://HOSTNAME:PORT/share/page/document-details?nodeRef=workspace://SpacesStore/09a8bd9f-0246-47a8-9701-29436c7d29a6. Please be aware that the NodeRef contains a UUID, but it's not the same as the ID on the Content Store side... Moreover, there is a property in the DB that links the NodeRef to the Content Store ID, so that Alfresco is able to retrieve the content of a file (see the query sketch after this list).
  • Index: an index is created for this file in the search engine (Lucene for older versions of Alfresco, Solr for newer versions). This index is in the "workspace" store.
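
If you want to see how a NodeRef maps to rows in the database, a query along the following lines can help. This is only a sketch: it joins alf_node to its store table (alf_store, which is not described above), and column names can vary slightly between Alfresco versions:

SELECT n.id, s.protocol, s.identifier, n.uuid
FROM alf_node n
JOIN alf_store s ON s.id = n.store_id
WHERE n.uuid = '09a8bd9f-0246-47a8-9701-29436c7d29a6';

The protocol column is what switches from "workspace" to "archive" when the document is sent to the global trashcan (see section III below).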



II. Update, Review, Approve, Publish, aso...


Once the document is created, its life really begins. You can update/review/approve/publish it manually or automatically with different processes. All these actions are part of the active life of a document. From an administration point of view, there isn't that much to say here.


III. Deletion - User level

 
When a document isn't needed anymore, for any reason, a user with sufficient permissions is able to delete it. For our example, let's say that a user deleted our file "Test_Lifecycle.docx" using the Alfresco Share Web Interface on February the 20th, 2015 at 15:30 (19 days after creation). When using the Web Interface or Web Services to delete a document, the "nodeService.deleteNode" method is called. So what happened to our "three things"?

  • FS: nothing changed on the Content Store. The file content is still here.
  • DB: on the DB side, the NodeRef changed from workspace://SpacesStore/09a8bd9f-0246-47a8-9701-29436c7d29a6 to archive://SpacesStore/09a8bd9f-0246-47a8-9701-29436c7d29a6 (the "store_id" field changed on the "alf_node" table).
  • Index: same thing for the search index: the index is moved from the "workspace" store to the "archive" store.


Actually, when a user deletes a document from an Alfresco Web Interface, the document is just moved to a "global trashcan". By default, all users have access to this global trashcan in Alfresco Explorer to restore the documents they may have deleted by mistake. Of course they can't see all documents, only the ones related to them. In Alfresco Share, access to this global trashcan is configured in the share-config.xml file, and by default on most versions only administrators have access to it.

The only way to avoid this global trashcan is to programmatically delete the document by applying the aspect "cm:temporary" to it and then calling "nodeService.deleteNode" on it. That way, the document is removed from the UI and isn't put in the global trashcan.
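
Since the post refers to the nodeService.deleteNode call, here is a minimal Java sketch of that approach using the foundation services. The class and variable names are hypothetical, and the NodeService is assumed to be injected by Spring as in any repository module:

import org.alfresco.model.ContentModel;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;

public class HardDeleteHelper
{
    private NodeService nodeService; // injected by Spring in a real module

    public void hardDelete(NodeRef nodeRef)
    {
        // cm:temporary tells Alfresco not to archive the node on deletion...
        nodeService.addAspect(nodeRef, ContentModel.ASPECT_TEMPORARY, null);
        // ...so this delete bypasses the archive:// store (the global trashcan)
        nodeService.deleteNode(nodeRef);
    }
}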


 IV. Deletion - Administration level

 
What I describe here as the "Administration level" is the second level of deletion that happens by default. This level is the deletion of the document from the global trashcan. As long as the document is still in the global trashcan, administrators (or users if you are using Alfresco Explorer) can still restore it. If the document is "un-deleted", then it will return exactly where it was before and, of course, the metadata & index of this document will be moved from the "archive" store back to the "workspace" store to return to an active life.

On April the 1st, 2015 at 08:05 (40 days after deletion at user level), an administrator decides to remove "Test_Lifecycle.docx" from the global trashcan. This can be done manually or programmatically. Moreover, there are also some existing add-ons that can be configured to automatically delete elements in the trashcan older than XX days. This time, the "NodeArchiveService.purgeArchiveNode" method is called (archiveService.purge in some older Alfresco versions). So what happened to our "three things" this time?

  • FS: still nothing changed on the Content Store. The file content is still here.
  • DB: on the DB side, the document is still there, but some references/fields (not all) are removed. All references in the "alf_content_data" table are removed, while only some fields are emptied in the "alf_node" table. For Alfresco 4.0 and below, the "node_deleted" field in the "alf_node" table is changed from 0 to 1. In newer versions of Alfresco, "node_deleted" doesn't exist anymore, but the QNAME of the node (field "type_qname_id") in the "alf_node" table is changed from 51 ("content") to 140 ("deleted"). So the Node is now deleted from the global trashcan and Alfresco knows that this node can be safely removed, but this will not be done right away... Once "node_deleted" or "type_qname_id" is set, the "orphan_time" field in the "alf_content_url" table for this document is also changed from NULL to the current unix timestamp (+ GMT offset). In our case it will be orphan_time=1427875500540.
  • Index: the search index for this Node is removed.


As you can see, there are still some remaining elements on the File System and in the DB. That's why there is a last step in our lifecycle...
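
If you want to check which contents are currently flagged as orphaned in the database, a query like the following can be used. It is only a sketch based on the alf_content_url table and the orphan_time column described above:

SELECT id, content_url, orphan_time
FROM alf_content_url
WHERE orphan_time IS NOT NULL;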


V. Deletion - One more step.


As you saw before, the document "Test_Lifecycle.docx" is now considered an "orphaned" Node. In Alfresco, by default, all orphaned Nodes are protected for 14 days. That means that during this period the orphaned Nodes will NOT be touched at all. Of course this value can easily be changed in the Alfresco configuration files. So what happens after 14 days? Well, in fact, every day at 4am (again, by default... this can be changed), a scheduled job (the "contentStoreCleaner") scans Alfresco for orphaned Nodes older than 14 days. Therefore on April the 15th, 2015 at 04:00, the scheduled job runs and here is what it does:

  • FS: the content file is moved from $ALF_DATA/contentstore/ to $ALF_DATA/contentstore.deleted/
  • DB: on the DB side, the document is still here but the line related to this document on the "alf_content_url" table (this table contains the orphan_time and reference to the FS location) is removed.
  • Index: nothing to do, the search index was already removed.


You can avoid the step where documents are put in the "contentstore.deleted" folder by setting "system.content.eagerOrphanCleanup=true" in the alfresco-global.properties configuration file. If you do so, after 14 days the document on the File System is not moved but deleted instead.
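
For reference, both settings live in alfresco-global.properties. The eagerOrphanCleanup property is the one mentioned above; the 14-day protection window is, to the best of my knowledge, controlled by system.content.orphanProtectDays, so please check the documentation of your Alfresco version before relying on it:

# Number of days an orphaned content URL is protected before the cleaner touches it (default 14)
system.content.orphanProtectDays=14
# Delete orphaned content directly instead of moving it to contentstore.deleted
system.content.eagerOrphanCleanup=true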


VI. Deletion - Still one more step...!


As said before, there are still some references to the "Test_Lifecycle.docx" document in the Alfresco Database (especially in the "alf_node" table). Another scheduled job, the nodeServiceCleanup, runs every day at 21:00 to clean everything related to Nodes that have been deleted (orphaned nodes) for more than 30 days. So here is the result:

  • FS: the content file is still on the $ALF_DATA/contentstore.deleted/ folder
  • DB: the DB is finally clean!
  • Index: nothing to do, the search index was already removed.



VII. Deletion - Oh you must be kidding me!?


So many steps, aren't there? As seen before, the only remaining thing to do is to remove the content file from the $ALF_DATA/contentstore.deleted/ folder. You probably think that there is also a job that does that for you after XX days, but it's not the case: there is nothing in Alfresco that deletes the content file from this location. Consequently, if you want to clean the File System, you will have to do it yourself.

On Unix for example, you can simply create a crontab entry:

50 23 * * * $ALF_HOME/scripts/cleanContentStoreDeleted.sh


Then create this file with the following content:

#!/bin/sh

CS_DELETED="$ALF_DATA/contentstore.deleted"

# Safety check: never run the cleanup if the folder does not exist
[ -d "$CS_DELETED" ] || exit 1

# Remove all files from contentstore.deleted older than 30 days
find "$CS_DELETED" -type f -mtime +30 -print0 | xargs -0 rm 2> /dev/null

# Remove all empty folders from contentstore.deleted older than 60 days
find "$CS_DELETED" -type d -mtime +60 -empty -print0 | xargs -0 rm -r 2> /dev/null


Please be aware that you must be sure that the folder "$ALF_DATA/contentstore.deleted" exists... And when I say that, I mean YOU MUST ABSOLUTELY BE SURE that it exists (hence the check at the top of the script). Please also never remove anything under the "contentstore" folder, and never remove the "contentstore.deleted" folder itself!

You may wonder why the DB isn't cleaned automatically when the trashcan is cleaned, and why the file content is also kept for 14 days... Well, I can assure you that there are several reasons, and the principal one is backup/restore performance. I will not explain it in detail, but basically, as the file content isn't touched for 14 days by default, you can restore your database up to 14 days in the past and it will still be consistent with the File System, without having to restore the FS! Of course, if you just do that, you will lose the documents uploaded/changed in the last 14 days because your database wasn't aware of these files 14 days ago. But you can backup/restore the content files created in the last 14 days with an incremental backup.

E.g.: today (05-May-2015), I want to backup the FS (only the last 14 days). I will have to backup all folders inside $ALF_DATA/contentstore/2015/5 AND all folders inside $ALF_DATA/contentstore/2015/4 with a folder name greater than or equal to 30 (days in Apr) + 5 (days in May) - 14 = 21.


I hope this post was clear enough, because it's true that it can be hard to understand everything regarding the lifecycle of Alfresco Nodes and to deal with it properly.

Drone, Data X Conference – Trip Report

Oracle AppsLab - Tue, 2015-05-05 09:39

Let me start by saying that there were kids in the audience of the Drones, Data X Conference held this past weekend in Santa Cruz, something I have not seen at any other tech conference. I thought it was pretty cool. At the end, I found there was a reason for kids to be there.

From the very first presentation it became clear that this is not really about the drones. The drones were almost harshly referred to as “hardware.”

Dr. Ernest Earon from Precision Hawk, the company that flies its drones for agriculture, oil & gas, and such, said “Farmers are not interested in buying hardware, or buy pictures of their crop. They need answers to their questions, and solutions to their problems.”

Similarly, Will Sauer, the speaker from Skycatch, the mapping company, said that this is about “how to get from raw images to making better decisions.”

Along the same lines, Mike Winn(@mikewinn) from DroneDeploy (@DroneDeploy) said that “there is only a fraction of business questions that could be answered by images and models.”

All that made me feel right at home. Nothing disrupting about drones to our day-to-day, just another tool in a toolbox of different ways to help the users with what they need.

20150501_170353

I truly appreciated a talk by Andreas Raptopoulos from Matternet. It is a great example of the product being built around a need of people rather than a need of the technology to find an application.

Matternet makes small drones that can deliver packages under 1kg. The drones were built to deliver medicine to rural areas with bad or no roads with projects pilots in Bhutan, Haiti, and Papua New Guinea.

matternet

On a subject of rural areas, apparently there is much work to be done for the drones to go urban. There is a lot of drone hype and make-believe, and the reality is still far from it.

In reality, drones are not autonomous yet. Making self-driving cars is child's play compared to the complexity of making a self-flying vehicle. Drones have no roads to follow; drones need to see not only what's in front of them but all around; unlike cars, drones don't have much room for all the processors and sensors; and unlike cars, drones must be cheap.

Philip McNamara (@McNamara_Philip), the conference organizer, entertained us with this Cirque du Soleil lampshade movie to illustrate how precise autonomous flight control algorithms are becoming.

Autonomy, infrastructure, and regulations are the road blocks that will delay the appearance of Amazon Prime Air and its likes for some time.

NASA has been tasked to build a traffic management infrastructure for low altitudes. I visualized what the presenter, Dr. Parimal Kopardekar, said as invisible highways, bike lanes, traffic stops, etc. in the sky, built with geofencing. Unlike their down-to-earth counterparts, these roads could be reconfigured and adjusted in real time from the Internet.

While the sensor technology for autonomous flying and the traffic infrastructure are maturing, the regulations are here to hold the enthusiasm back. In the US, out-of-sight flying is not allowed; any commercial flying needs to come with the appropriate licenses and certificates, and recreational flying is only allowed in the open away from people.

Jim Williams from the Federal Aviation Administration said that we are facing an "out of box aviator" phenomenon, where anyone can buy a drone and enter the skies without any awareness of the rules and responsibilities that come with that. "This is where the most regulated sector, aviation, meets the least regulated sector, tech," he said.

In the meantime, Matternet will be trying the drone delivery project in Switzerland, where the infrastructure is most mature and the regulations are most permissive. With the Swiss project, Matternet will be trying to address the "last mile delivery" problem.

lastmile

Very exciting.

Here is my laundry list of all the applications that I’ve heard at the conference: agriculture, emergency response, construction, mining, oil and gas, forestry, ocean and lakes, insurance, transportation, surveying, delivery, wireless connectivity (this one refers to Facebook Wi-Fi drones), and renewable energy (this one refers to wind-harvesting drones to replace wind towers).

And what about those kids in the audience? Apparently, some of them were with a STEM (Science, Technology, Engineering and Math) first person view (FPV) drone racing program. To the spectators, a drone race looks like a miniaturized and slow version of Quidditch.

20150430_174854

But not so for the pilots! Did you know that you race a drone with the VR-like goggles on?

The pilot sees what the drone sees. So as a pilot, you are basically getting the ultimate Xbox-like flying experience. As Scot Refsland (@srefsland), the presenter from Flying Grounds, said, the experience is very "sticky." He suggests that it is much healthier for kids to be out in a field racing drones than in front of a box. He believes being involved with drones also teaches technology and, yes, even secures kids' futures.

20150501_172317

Status Of My SlideShare Material

Hemant K Chitale - Tue, 2015-05-05 09:30
My  SlideShare  Material has had 7,390 views to date.


.
.
.
Categories: DBA Blogs

REST Data Services and SQL Developer

Kris Rice - Tue, 2015-05-05 07:48
The database tools team released 3 new GA releases and an update to our SQLCL. Official releases are here:

SQL Developer, Modeler, and Data Miner:
https://blogs.oracle.com/otn/entry/news_oracle_updates_development_tools
https://blogs.oracle.com/datamining/entry/oracle_data_miner_4_1

REST Data Services now with SODA:
https://blogs.oracle.com/otn/entry/

In-Memory OLTP: Hash and Range indexes

Yann Neuhaus - Tue, 2015-05-05 02:04

Since SQL Server 2014 CTP2, Microsoft has introduced a new kind of index: the Range index. We now have two ways of indexing an In-Memory table: using either a Hash index or a Range index.

These two indexes are slightly different.
In fact, a Hash index is a set of buckets, 8-byte memory pointers, each pointing to the actual row data or row chain. SQL Server 2014 uses a hash function to map values to buckets. To minimize hash collisions, the number of buckets should be twice the number of index values. A hash collision occurs when the hash function returns the same bucket for two different data values, for example "Audi" and "Mercedes". Hash indexes are optimized for equality predicates.
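
Choosing a reasonable bucket_count is therefore important. Once the table is filled, bucket usage and row chain lengths can be checked with the sys.dm_db_xtp_hash_index_stats DMV; very few empty buckets or long average chains are a sign of an undersized bucket count. A minimal sketch:

SELECT object_name(hs.object_id) AS table_name,
       i.name                    AS index_name,
       hs.total_bucket_count,
       hs.empty_bucket_count,
       hs.avg_chain_length,
       hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS hs
JOIN sys.indexes AS i
     ON i.object_id = hs.object_id AND i.index_id = hs.index_id;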

In contrast, for Range indexes, the leaf level of a Bw-Tree gives you a pointer to the first data row in the row chain. As with hash indexes, rows with the same indexed value form a row chain. This kind of index is particularly efficient with inequality predicates like < or >. If you want to know more about Range indexes, please read the excellent blog of my colleague David Barbarin.

To compare these two kinds of indexes, I will create a new database with two tables: an In-Memory table with a hash index on a DateTime column, and a disk-based table with a non-clustered index on the same column. I will fill these tables with one million records. Here is the script I used:

 

-- database creation

use master

go


if exists(select * from sys.databases where name = 'Test_InMemoryOLTP_Optimized')

   drop database Test_InMemoryOLTP_Optimized

go


create database Test_InMemoryOLTP_Optimized

go


--create a disk table

use Test_InMemoryOLTP_Optimized

go


if exists(select * from sys.objects where name = 'Person_OnDisk')

   drop table dbo.Person_OnDisk

go


create table dbo.Person_OnDisk

(

Person_OnDisk_ID int not null primary key,

Person_OnDisk_Name nvarchar(200) not null,

Person_OnDisk_Date datetime not null,

index Person_OnDisk_Person_OnDisk_Date nonclustered (Person_OnDisk_Date)

)


--enable database for memory optimized tables

-- add memory_optimized_data filegroup

alter database Test_InMemoryOLTP_Optimized

   add filegroup Test_InMemory_mod contains MEMORY_OPTIMIZED_DATA

go


--add container to the filegroup

alter database Test_InMemoryOLTP_Optimized

   add file (name='InMemory_mod', filename='d:\InMemory\InMemory_mod')

   to filegroup Test_InMemory_mod

go


use Test_InMemoryOLTP_Optimized

go


if exists(select * from sys.objects where name = 'Person_InMemory')

   drop table dbo.Person_InMemory

go


create table dbo.Person_InMemory

(

Person_InMemory_ID int not null primary key

       nonclustered hash with (bucket_count = 2000000),

Person_InMemory_Name nvarchar(200) COLLATE Latin1_General_100_BIN2 NOT NULL,

Person_InMemory_Date datetime not null index Person_InMemory_Person_InMemory_Date

nonclustered hash (Person_InMemory_Date) with (bucket_count = 2000000)

)

with (memory_optimized = on, durability = schema_and_data)


--fill in disk table

set nocount on

go

begin tran

declare @cpt int = 0

While @cpt < 1000000

begin

   insert into dbo.Person_OnDisk values

       (@cpt, 'Name_'+cast(@cpt as varchar(10)), dateadd(dd,round(@cpt/10000,0,1),'2014-01-01'))

       set @cpt += 1

end

commit


--fill in In-Memory table

set nocount on

go

begin tran

declare @cpt int = 0

While @cpt < 1000000

begin

   insert into dbo.Person_InMemory values

       (@cpt, 'Name_'+cast(@cpt as varchar(10)), dateadd(dd,round(@cpt/10000,0,1),'2014-01-01'))

       set @cpt += 1

end

commit

 

First remark: filling the In-Memory table is three times faster than filling the disk-based table, respectively 5 seconds for the former against 15 seconds for the latter on my machine. You can also see that I decided to persist my In-Memory table data by using the durability option schema_and_data instead of schema_only; with schema_only, the data would not survive a restart of my SQL Server service and would be definitively lost.

 

Now, I will search for records where the date equals 2014-01-14 and compare the results between my In-Memory and my disk-based tables. I will use the IO and TIME statistics options to get a better view of each query's performance:

 

-- use equality operator with disk table and In-Memory table with Hash index

USE Test_InMemoryOLTP_Optimized

GO

 

SET NOCOUNT ON

SET STATISTICS IO ON

SET STATISTICS TIME ON

 

SELECT *

FROM dbo.Person_OnDisk

WHERE Person_OnDisk_Date = '2014-01-14'

 

SELECT *

FROM dbo.Person_InMemory

WHERE Person_InMemory_Date = '2014-01-14'

 

The result of those queries is:

 

Table on disk:

Table 'Person_OnDisk'. Scan count 1, logical reads 5812, physical reads 3, read-ahead reads 5805, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

 

SQL Server Execution Times:

   CPU time = 31 ms, elapsed time = 1631 ms.

 

 

Table In-Memory:

SQL Server Execution Times:

   CPU time = 0 ms, elapsed time = 128 ms.

 

The disk-based table performed many logical reads (5812), three physical reads and 5805 read-aheads, whereas the In-Memory table, as it is in memory, did not perform any disk I/O activity.
Concerning the execution time, my disk-based table consumed 31 ms of CPU and ran in 1631 ms, against no measurable CPU and 128 ms for my In-Memory table...
The In-Memory table is definitively faster than my disk-based table.

 

Let's see now what happens if I use an inequality predicate. So I am running the same queries with a date range (> and <):

 

SET NOCOUNT ON

SET STATISTICS IO ON

SET STATISTICS TIME ON

 

SELECT *

FROM dbo.Person_OnDisk

WHERE Person_OnDisk_Date > '2014-01-14' and Person_OnDisk_Date < '2014-02-20'

 

 

SELECT *

FROM dbo.Person_InMemory

WHERE Person_InMemory_Date > '2014-01-14' and Person_InMemory_Date < '2014-02-20'

 

 

The result is:

 

Disk table:

Table 'Person_OnDisk'. Scan count 1, logical reads 5812, physical reads 3, read-ahead reads 5805, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

 

SQL Server Execution Times:

   CPU time = 171 ms, elapsed time = 3150 ms.

 

 

In-Memory table:

SQL Server Execution Times:

CPU time = 204 ms, elapsed time = 2153 ms.

 

I/Os for both tables did not change, but concerning the In-Memory table we can see that the gain is not very interesting compared to the equality predicate seen before. The Hash index is not tuned for inequality predicates. I will now change the Hash index of my In-Memory table to a Range index. To do it, I will have to drop and recreate my table, as it is not possible to alter an In-Memory table, and then reload the data.

 

--Change the Hash index by a Range index

if exists(select * from sys.objects where name = 'Person_InMemory')

   drop table dbo.Person_InMemory

go

 

create table dbo.Person_InMemory

(

Person_InMemory_ID int not null primary key

       nonclustered hash with (bucket_count = 2000000),

Person_InMemory_Name nvarchar(200) COLLATE Latin1_General_100_BIN2 NOT NULL,

Person_InMemory_Date datetime not null index Person_InMemory_Person_InMemory_Date

nonclustered (Person_InMemory_Date)

)

with (memory_optimized = on, durability = schema_and_data)

 

 

In-Memory table:

 

SQL Server Execution Times:

   CPU time = 47 ms, elapsed time = 2155 ms.

 

The CPU time is now more than four times lower with a Range index compared to a Hash index.
If I take a look at the execution plan, I can see that the optimizer is using the range index defined in my script with an index seek operation, instead of the table scan we got when using a hash index with this inequality predicate.

 

Before creating an In-Memory table, remember that you will have to think about which kinds of predicates will be used in queries, in order to choose the right index: Hash, Range or even both. In addition, don't forget that an In-Memory table accepts a maximum of eight indexes. Finally, if you need to create another index on an In-Memory table, you will have to drop and recreate the table.
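
To review which indexes already exist on your memory-optimized tables, and of which kind, a simple catalog query does the job, for example:

-- NONCLUSTERED HASH = hash index, NONCLUSTERED = range index
SELECT t.name AS table_name, i.name AS index_name, i.type_desc
FROM sys.indexes AS i
JOIN sys.tables AS t ON t.object_id = i.object_id
WHERE t.is_memory_optimized = 1;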
I hope this blog will help. If you want to know more about SQL Server 2014 In-Memory OLTP, please join us at our next event in June; details and registration here.

Links for 2015-05-04 [del.icio.us]

Categories: DBA Blogs

Variations on 1M rows insert (3): TimesTen

Yann Neuhaus - Mon, 2015-05-04 23:00

In the previous blog post I measured how inserting 1 million rows takes 5 minutes instead of 45 seconds when we commit at each row inserted. In Oracle we can improve that down to 2 minutes if we can accept (and manage) the loss of transactions following an instance crash.
Let's go further. For OLTP that need to face a high rate of transactions, Oracle has TimesTen which is a Database In-Memory - meaning that it is optimized for in-memory structures. My testcase has a 1 million rows table that can fit in memory, so let's try it.

Test case

I am still using the Oracle Developer Day 11g VM which has TimesTen installed.

I've configured my database in /home/oracle/app/oracle/product/TimesTen/tt1122/info/sys.odbc.ini

[sampledb_1122]
Driver=/home/oracle/app/oracle/product/TimesTen/tt1122/lib/libtten.so
DataStore=/home/oracle/app/oracle/product/TimesTen/tt1122/info/DemoDataStore/sampledb_1122
PermSize=400
TempSize=320
PLSQL=1
DatabaseCharacterSet=US7ASCII
I used the sample one but I have increased the size of memory to 400MB and 320MB for PERM_ALLOCATED_SIZE and TEMP_ALLOCATED_SIZE.

ttIsql

So I run the TimesTen command line interpreter:

[oracle@localhost ~]$ ttisql

Copyright (c) 1996-2011, Oracle.  All rights reserved.
Type ? or "help" for help, type "exit" to quit ttIsql.
Check the version:

Command> version
TimesTen Release 11.2.2.2.0

Connect to my database:

Command> connect "dsn=sampledb_1122";
Connection successful: DSN=sampledb_1122;UID=oracle;DataStore=/home/oracle/app/oracle/product/TimesTen/tt1122/info/DemoDataStore/sampledb_1122;DatabaseCharacterSet=US7ASCII;ConnectionCharacterSet=US7ASCII;DRIVER=/home/oracle/app/oracle/product/TimesTen/tt1122/lib/libtten.so;PermSize=40;TempSize=32;TypeMode=0;
(Default setting AutoCommit=1)

I want to manage commits myself:

Command> set autocommit 0;

Then I'm doing the same as in previous posts:

Command> set echo on;

Command> create table DEMO ("id" number primary key, "text" varchar2(15), "number" number);

Command> set timing on;

There is something that is very nice about TimesTen: it supports PL/SQL. I'm running exactly the same as I did in Oracle:
Command> declare
 type people_array is varray(12) of varchar(15);
 people people_array := people_array( 'Marc', 'Bill', 'George', 'Eliot', 'Matt', 'Trey', 'Tracy','Greg', 'Steve', 'Kristina', 'Katie', 'Jeff' );
 people_count number :=people.COUNT;
 n number;
begin
 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit;
 end loop;
 commit;
end;
/

PL/SQL procedure successfully completed.

Execution time (SQLExecute) = 21.010554 seconds.
21 seconds. Yes, that's fast...

Durable commits

Do you remember how we improved the commits in the last post? When accepting the risk of non-durable commits, the response time was faster on the Oracle database because we don't have to wait for the persistence of commits.

Here in TimesTen, it's an In-Memory database and by default transactions are not durable. Transactions are logged for rollback, but not for recovery, which means that nothing goes to disk. If the instance crashes, data is lost. If I want the durability of the ACID properties, then we have to write the transaction log to disk.
This is activated by the following property in the sys.odbc.ini:

DurableCommits=1;

I connect:
Command> connect "dsn=sampledb_1122";
Connection successful: DSN=sampledb_1122;UID=oracle;DataStore=/home/oracle/app/oracle/product/TimesTen/tt1122/info/DemoDataStore/sampledb_1122;DatabaseCharacterSet=US7ASCII;ConnectionCharacterSet=US7ASCII;DurableCommits=1;DRIVER=/home/oracle/app/oracle/product/TimesTen/tt1122/lib/libtten.so;PermSize=40;TempSize=32;TypeMode=0;
(Default setting AutoCommit=1)
and run exactly the same as before:
PL/SQL procedure successfully completed.

Execution time (SQLExecute) = 31.316902 seconds.
It's longer but still faster than the Oracle database. TimesTen is optimized for that.

Conclusion

Note that we have durability here, thanks to the persistence of the transaction log, as long as we set DurableCommits=1. However, in case of an instance crash, the recovery may be long because a lot of transactions must be redone. It's an In-Memory database: it's fast when it's up, but you don't have the same availability as a database that is optimized to store data on disk. You have to tune the checkpoints in order to balance performance and availability.

TimesTen In-Memory Database can be a good solution to accelerate some use cases. Its compatibility with Oracle SQL and PL/SQL is very good. You can install it standalone or as a cache for Oracle Database. Licensing is 50% of Oracle Enterprise Edition.

TimesTen is also called 'Database In-Memory', which is a bit misleading because Oracle also has the 12c In-Memory option, which is very different. Among competitors, TimesTen can be compared to SQL Server In-Memory OLTP. Because it's a unique event - a cross-technology one (MSSQL, ORA, HANA) - I don't hesitate to link again to our free event about In-Memory: boost your IT performance! In Switzerland (Lausanne, Basel and Zürich).

Talking about cross-technology expertise, don't miss Daniel's variation on 1M rows insert into PostgreSQL and with synchronous commit.

EMC World Las Vegas – Momentum 2015 first day

Yann Neuhaus - Mon, 2015-05-04 18:34

Before starting to talk about the EMC event, I would just like to share my feeling about the city where it takes place. It is Las Vegas: what an amazing, impressive place! On top of that, we are in one of the most beautiful hotels on the strip! Congratulations to EMC and thanks for this attention.

 

For this first day of conferences, I decided to attend a first session about the xCP platform. I usually manage content with D2, and wanted to get some background on its process-oriented brother, xCP. Then I will explain a bit further the news and forecasts about the underlying content management platform, Documentum itself.

 

xCP is one of the leading applications EMC wants to promote, along with D2, within its enterprise content management offering. In summary, xCP is described as a "rapid application development platform". It means it helps developers and business analysts build applications while lowering as much as possible the real development work, by providing an "extensive platform for building case management and business process solutions".

 

In fact, it aims to graphically build up applications by putting together several items, like forms and operational bricks, organized through processes, in order to provide the functionality business teams are looking for. Such an approach also aims to reduce development costs and ease maintainability over time for applications that are more complex than simple faceted records management.

 

Meanwhile, EMC released the Documentum 7.2 platform with its several clients: D2 4.5, xCP 2.2 and Webtop 6.8.

 

In this release we can see several improvement areas, for example security. With Documentum 7.2, we can now store and transmit content using AES 256-bit encryption, considered by the US NSA good enough for Top Secret protection for the moment.

 

This version also provides enhanced capabilities through REST web services for searches, facet navigation, batch transactions and, indeed, further integration extensibility for third-party software.

 

xPlore 1.5 also provides its own growing set of functionalities, like cross-repository subscriptions, privacy options, enhanced language word splitting and improved warm-up. It is also good to know that it keeps backwards compatibility down to Documentum 6.7 SP2.

 

Then, for the next upgrades, EMC also provides higher-level migration tools, like one for automatic encryption upgrade in Documentum.

 

This day was also very rich in terms of EMC corporate communication, covering overall product policies and strategies for the coming years and releases.

 

I hope you enjoyed reading this short summary of the first day at EMC World - Momentum 2015, and I would like to thank you for your attention.

Variations on 1M rows insert(2): commit write - PostgreSQL

Yann Neuhaus - Mon, 2015-05-04 16:04
Franck was quite fast with his second post, Variations on 1M rows insert(2): commit write. Let's see what we can do on the PostgreSQL side.

OEM 12c Silent Installation

Pythian Group - Mon, 2015-05-04 13:17

“What’s for lunch today?”, said the newly born ready to run Red Hat 6.4 server.

“Well, I have an outstanding 3-course meal of OEM12c installation.
For the appetizer, a light and crispy ASM 12c,
DB 12c with patching for the main and desert, and to cover everything up, OEM 12c setup and configuration”, replied  the DBA who was really happy to prepare such a great meal for his new friend.

“Ok, let’s start cooking, it won’t take long”, said the DBA and took all his cookware (software), prepared ingredients (disk devices) and got the grid infrastructure cooked:

./runInstaller -silent \
-responseFile /home/oracle/install/grid/response/grid_install.rsp -showProgress \
INVENTORY_LOCATION=/u01/app/oraInventory \
SELECTED_LANGUAGES=en \
oracle.install.option=HA_CONFIG \
ORACLE_BASE=/u01/app/oracle \
ORACLE_HOME=/u01/app/oracle/product/12.1.0/grid \
oracle.install.asm.OSDBA=dba \
oracle.install.asm.OSASM=dba \
oracle.install.crs.config.storageOption=LOCAL_ASM_STORAGE \
oracle.install.asm.SYSASMPassword=sys_pwd \
oracle.install.asm.diskGroup.name=DATA \
oracle.install.asm.diskGroup.redundancy=EXTERNAL \
oracle.install.asm.diskGroup.AUSize=4 \
oracle.install.asm.diskGroup.disks=/dev/asm-disk1 \
oracle.install.asm.diskGroup.diskDiscoveryString=/dev/asm* \
oracle.install.asm.monitorPassword=sys_pwd \
oracle.install.config.managementOption=NONE

And added some crumbs:

/u01/app/oracle/product/12.1.0/grid/cfgtoollogs/configToolAllCommands RESPONSE_FILE=/tmp/asm.rsp
where /tmp/asm.rsp had:
oracle.assistants.asm|S_ASMPASSWORD=sys_pwd
oracle.assistants.asm|S_ASMMONITORPASSWORD=sys_pwd

“It was a great starter”, said the server finishing the first dish,

“I am getting even more hungry. What’s for the main?”.

“Oh, you will love it! It is Database 12c. It is one of these new meals and it is already very popular”, answered the DBA enthusiastically and continued cooking.

“Looking forward to trying it”, the server decided to have a nap until the dish was ready.

“You asked, you got it”, and the DBA gave the server the dish he never tried:

./runInstaller -silent -showProgress \
-responseFile /home/oracle/install/database/response/db_install.rsp \
oracle.install.option=INSTALL_DB_SWONLY \
ORACLE_BASE=/u01/app/oracle \
ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 \
oracle.install.db.InstallEdition=EE oracle.install.db.DBA_GROUP=dba \
oracle.install.db.BACKUPDBA_GROUP=dba \
oracle.install.db.DGDBA_GROUP=dba \
oracle.install.db.KMDBA_GROUP=dba \
SECURITY_UPDATES_VIA_MYORACLESUPPORT=false \
DECLINE_SECURITY_UPDATES=true

The topping ingredient was of course a brand new database:

./dbca -silent -createDatabase -gdbName em12 \
-templateName General_Purpose.dbc \
-emConfiguration none \
-sysPassword sys_pwd \
-systemPassword sys_pwd \
-storageType ASM \
-asmsnmpPassword sys_pwd \
-diskGroupName DATA \
-redoLogFileSize 100 \
-initParams log_buffer=10485760,processes=500,\
session_cached_cursors=300,db_securefile=PERMITTED \
-totalMemory 2048

“Delicious! That’s what I dreamt of! Where did you find it?”, the server could not hide his admiration.

“Well, you have not tried dessert yet. When you have it, you will forget all those dishes that you had before.”

“Hmm, you intrigue me. Definitely I will have it!”

“Anything for you, my friend”, and the DBA cooked his famous, rich and delicious dessert:

./runInstaller -silent \
-responseFile /home/oracle/install/em/response/new_install.rsp \
-staticPortsIniFile /tmp/ports.ini \
SECURITY_UPDATES_VIA_MYORACLESUPPORT=false \
DECLINE_SECURITY_UPDATES=true \
ORACLE_MIDDLEWARE_HOME_LOCATION=/u01/em12 \
AGENT_BASE_DIR=/u01/agent12c \
WLS_ADMIN_SERVER_USERNAME=weblogic \
WLS_ADMIN_SERVER_PASSWORD=Sun03day03 \
WLS_ADMIN_SERVER_CONFIRM_PASSWORD=Sun03day03 \
NODE_MANAGER_PASSWORD=Sun03day03 \
NODE_MANAGER_CONFIRM_PASSWORD=Sun03day03 \
ORACLE_INSTANCE_HOME_LOCATION=/u01/gc_inst \
CONFIGURE_ORACLE_SOFTWARE_LIBRARY=true \
SOFTWARE_LIBRARY_LOCATION=/u01/sw_lib \
DATABASE_HOSTNAME=oem12c.home \
LISTENER_PORT=1521 \
SERVICENAME_OR_SID=em12 \
SYS_PASSWORD=sys_pwd \
SYSMAN_PASSWORD=Sun03day03 \
SYSMAN_CONFIRM_PASSWORD=Sun03day03 \
DEPLOYMENT_SIZE="SMALL" \
MANAGEMENT_TABLESPACE_LOCATION="+DATA" \
CONFIGURATION_DATA_TABLESPACE_LOCATION="+DATA" \
JVM_DIAGNOSTICS_TABLESPACE_LOCATION="+DATA" \
AGENT_REGISTRATION_PASSWORD=Sun03day03 \
AGENT_REGISTRATION_CONFIRM_PASSWORD=Sun03day03

“You made my day!” exclaimed the server when nothing was left on his plate.

“Anytime my friend!” smiled DBA in response.

He was as happy as any chef that the cooking went the way it was planned and the final product was just as the recipe had said.

Have a good day!

Categories: DBA Blogs

Variations on 1M rows insert (1): bulk insert - PostgreSQL

Yann Neuhaus - Mon, 2015-05-04 11:30

When I read Franck's post about Variations on 1M rows insert (1): bulk insert, I thought doing quite the same in PostgreSQL might be interesting. Let's start by using the same test tables, one with a primary key and the other one without a primary key:

create table DEMO ("id" int , "text" varchar(15), "number" int);

create table DEMO_PK ("id" int , "text" varchar(15), "number" int,
                  constraint demo_pk_pk primary key (id) ) ;

postgres=# \d demo*
            Table "public.demo"
 Column |         Type          | Modifiers 
--------+-----------------------+-----------
 id     | integer               | 
 text   | character varying(15) | 
 number | integer               | 

           Table "public.demo_pk"
 Column |         Type          | Modifiers 
--------+-----------------------+-----------
 id     | integer               | not null
 text   | character varying(15) | 
 number | integer               | 
Indexes:
    "demo_pk_pk" PRIMARY KEY, btree (id)

   Index "public.demo_pk_pk"
 Column |  Type   | Definition 
--------+---------+------------
 id     | integer | id
primary key, btree, for table "public.demo_pk"

I am using a Virtual Box VM with Oracle Linux 7.1 (64bit) using 512mb of memory and one vCPU for the Tests:

cat /etc/oracle-release 
Oracle Linux Server release 7.1
cat /proc/meminfo | head -1
MemTotal:         502612 kB
cat /proc/cpuinfo | grep proc
processor	: 0



Similar to Franck's plsql block I'll use a plpgsql for the first tests:

\timing on
show autocommit;
truncate table DEMO;
DO $$DECLARE
  l_people_array varchar(15)[12] := '{"Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy","Greg", "Steve", "Kristina", "Katie", "Jeff"}';
  n int;
BEGIN
  for i in 0..1e6 loop
     n:=trunc(random()*1000+1);
     insert into DEMO values( i , l_people_array[floor((random()*11))+1::int] , n );
  end loop;
END$$;

(The code for the demo_pk table is exactly the same, except for the table name.) It does not do exactly the same as Franck's plsql block, but it does comparable things, I believe :)

The PostgreSQL version I'll use is 9.4.1:

postgres=# select version();
                                                         version                                                          
--------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
(1 row)


Lets start by executing the test without a primary key and thus no index:

postgres=# \i 1mio_rows_no_pk.sql
Timing is on.
 autocommit 
------------
 on
(1 row)

Time: 0.183 ms
TRUNCATE TABLE
Time: 71.339 ms
DO
Time: 6861.547 ms

Not so bad. Doing the same with a primary key and thus an index to maintain:

postgres-# \i 1mio_rows_pk.sql
Timing is on.
 autocommit 
------------
 on
(1 row)

Time: 0.494 ms
TRUNCATE TABLE
Time: 37.314 ms
DO
Time: 10900.631 ms

Around 3 seconds more, but again: not so bad. As there is no "bulk collect" or "bulk insert" in PostgreSQL, I cannot do the same tests as Franck. But PostgreSQL has a COPY command, so let's look at that. First I'll unload the table to a flat file so I have something to load:

postgres=# copy demo to '/home/postgres/demo.txt';
COPY 1000001
Time: 239.161 ms

Believe it or not: it took 240 ms to unload 1000001 rows from a table to a flat file. Now let's do the opposite and load the data back into the table using the copy command. The script used for doing this looks like this for both variations (with and without primary key):

\timing on
show autocommit;
truncate table demo;
copy DEMO from '/home/postgres/demo.txt';

First one without the primary key:

postgres=# \i 1mio_rows_no_pk_copy.sql
Timing is on.
 autocommit 
------------
 on
(1 row)

Time: 0.663 ms
TRUNCATE TABLE
Time: 32.026 ms
COPY 1000001
Time: 1877.480 ms

The second one with the primary key:

postgres=# \i 1mio_rows_pk_copy.sql
Timing is on.
TRUNCATE TABLE
Time: 33.370 ms
COPY 1000001
Time: 6227.844 ms

This makes a big difference: 1.8 seconds compared to 6.2 seconds. But still amazingly fast. For those who know PostgreSQL a little bit and might wonder about fsync:

postgres=# show fsync;
 fsync 
-------
 on
(1 row)

If I turn this off, without the primary key:

postgres=# alter system set fsync='off';
ALTER SYSTEM
Time: 62.066 ms
postgres=# select pg_reload_conf();
LOG:  received SIGHUP, reloading configuration files
LOG:  parameter "fsync" changed to "off"
 pg_reload_conf 
----------------
 t
(1 row)

Time: 7.253 ms
postgres=# show fsync;
 fsync 
-------
 off
(1 row)

Time: 0.601 ms
postgres=# \i 1mio_rows_no_pk_copy.sql
Timing is on.
 autocommit 
------------
 on
(1 row)

Time: 0.277 ms
TRUNCATE TABLE
Time: 10.064 ms
COPY 1000001
Time: 611.455 ms
postgres=# \i 1mio_rows_pk_copy.sql
Timing is on.
TRUNCATE TABLE
Time: 11.768 ms
COPY 1000001
Time: 4674.273 ms

Around 0.6 seconds without the primary key and around 4 seconds with the primary key. But remember that you might lose data if fsync is off and the server crashes during the load.
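
If you need faster commits without risking the data files, a gentler alternative to fsync=off is to relax synchronous_commit for the loading session only: a crash may then lose the most recently committed transactions, but it cannot corrupt the cluster the way fsync=off can.

-- only for the session doing the bulk load
SET synchronous_commit = off;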

Let's wait and see what Franck will do in the next post of his series.

Log Buffer #421: A Carnival of the Vanities for DBAs

Pythian Group - Mon, 2015-05-04 11:29

As always, this fresh Log Buffer Edition shares some of the unusual yet innovative and information-rich blog posts from across the realms of Oracle, SQL Server and MySQL.

Oracle:

  • A developer reported problems when running a CREATE OR REPLACE TYPE statement in a development database. It was failing with an ORA-00604 followed by an ORA-00001. These messages could be seen again and again in the alert log.

  • Few Random Solaris Commands : intrstat, croinfo, dlstat, fmstat for Oracle DBA
  • When to use Oracle Database In-Memory?
  • Oracle Linux and Oracle VM at EMCWorld 2015
  • SQLcl connections – Lazy mans SQL*Net completion

SQL Server:

  • SQL Server expert Wayne Sheffield looks into the new T-SQL analytic functions coming in SQL Server 2012.
  • The difference between the CONCAT function and the STUFF function lies in the fact that CONCAT allows you to append a string value at the end of another string value, whereas STUFF allows you insert or replace a string value into or in between another string value.
  • After examining the SQLServerCentral servers using the sp_Blitz™ script, Steve Jones now looks at how we will use the script moving forward.
  • Big data applications are not usually considered mission-critical: while they support sales and marketing decisions, they do not significantly affect core operations such as customer accounts, orders, inventory, and shipping. Why, then, are major IT organizations moving quickly to incorporate big data into their disaster recovery plans?
  • There are no more excuses for not having baseline data. This article introduces a comprehensive Free Baseline Collector Solution.

MySQL:

  • MariaDB 5.5.43 now available
  • Testing MySQL with “read-only” filesystem
  • There are tools like pt-kill from the Percona Toolkit that can print or kill long-running transactions on MariaDB, MySQL or Percona Server instances, but a lot of backup scripts are just a few simple bash lines.
  • Optimizer hints in MySQL 5.7.7 – The missed manual
  • Going beyond 1.3 MILLION SQL Queries/second
Categories: DBA Blogs

Quick Tip : Oracle User Ulimit Doesn’t Reflect Value on /etc/security/limits.conf

Pythian Group - Mon, 2015-05-04 11:21

So the other day I was trying to do a fresh installation of a new Oracle EM12cR4 in a local VM, and as I was doing it with DB 12c, I decided to use the Oracle preinstall RPM to ease the installation of the OMS repository database. I was also doing both the repository and the EM12c OMS install in the same VM, which is important to know.

[root@em12cr4 ~]# yum install oracle-rdbms-server-12cR1-preinstall -y

I was able to install the DB without any issues, but when I was trying to do the installation of EM12cR4, an error in the pre-requisites popped up:

WARNING: Limit of open file descriptors is found to be 1024.

For proper functioning of OMS, please set “ulimit -n” to be at least 4096.

And when I checked the soft limit for open file descriptors, it was indeed set to 1024:

oracle@em12cr4.localdomain [emrep] ulimit -n
1024

If you have been working with Oracle databases for a while, you know that this has to be checked and modified in /etc/security/limits.conf, but to my surprise the limit had already been set correctly for the oracle user, to at least 4096:

[root@em12cr4 ~]# cat /etc/security/limits.conf | grep -v "#" | grep  nofile
oracle   soft   nofile   4096
oracle   hard   nofile   65536

My next train of thought was to verify the user's bash profile settings, as ulimits set there can override limits.conf. But again, to my surprise, there was nothing in there, and that is where I was perplexed:

[oracle@em12cr4 ~]# cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
	. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

So what I did next was open a root terminal and trace the login of the oracle user:

[root@em12cr4 ~]# strace -o loglimit su - oracle

In another terminal I verified what the user was reading regarding the user limits, and this is where I hit the jackpot. I was able to see that it was reading pam_limits.so and /etc/security/limits.conf as it should, but it was also reading another configuration file called oracle-rdbms-server-12cR1-preinstall.conf (does this look familiar to you? :) ), and as you can see, RLIMIT_NOFILE was being set to 1024:

[root@em12cr4 ~]# grep "limit" loglimit
getrlimit(RLIMIT_STACK, {rlim_cur=10240*1024, rlim_max=32768*1024}) = 0
open("/lib64/security/pam_limits.so", O_RDONLY) = 6
...
open("/etc/security/limits.conf", O_RDONLY) = 3
read(3, "# /etc/security/limits.conf\n#\n#E"..., 4096) = 2011
open("/etc/security/limits.d", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
open("/etc/security/limits.d/90-nproc.conf", O_RDONLY) = 3
read(3, "# Default limit for number of us"..., 4096) = 208
open("/etc/security/limits.d/oracle-rdbms-server-12cR1-preinstall.conf", O_RDONLY) = 3
setrlimit(RLIMIT_STACK, {rlim_cur=10240*1024, rlim_max=32768*1024}) = 0
setrlimit(RLIMIT_NPROC, {rlim_cur=16*1024, rlim_max=16*1024}) = 0
setrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=64*1024}) = 0

So I went ahead and checked the file /etc/security/limits.d/oracle-rdbms-server-12cR1-preinstall.conf and evidently, that is where the limit was set to 1024, so the only thing I did was change the value there to 4096:

[root@em12cr4 ~]# cat /etc/security/limits.d/oracle-rdbms-server-12cR1-preinstall.conf | grep -v "#" | grep nofile
oracle   soft   nofile    1024
oracle   hard   nofile    65536
[root@em12cr4 ~]# vi /etc/security/limits.d/oracle-rdbms-server-12cR1-preinstall.conf
[root@em12cr4 ~]# cat /etc/security/limits.d/oracle-rdbms-server-12cR1-preinstall.conf | grep -v "#" | grep nofile
oracle   soft   nofile    4096
oracle   hard   nofile    65536

Once I had made that change and logged out and back in, I could see the values that I had set in the first place in /etc/security/limits.conf, and I was able to proceed with the installation of EM12cR4:

oracle@em12cr4.localdomain [emrep] ulimit -n
4096

Conclusion

So when you install the oracle-rdbms-server-12cR1-preinstall RPM, be aware that when you change user limits there might be another configuration file under /etc/security/limits.d setting values other than the ones you want and have set in /etc/security/limits.conf.

Note: this was originally published on rene-ace.com

Categories: DBA Blogs

Variations on 1M rows insert(2): commit write

Yann Neuhaus - Mon, 2015-05-04 10:39

In the previous blog post I measured the time it takes to insert 1 million rows: about 45 seconds when done row-by-row, which goes down to 7 seconds when inserted in bulk. In both cases, the commit was done only at the end. But in some cases you need to commit at each row, in order to make it visible to other sessions as soon as it is inserted (think of several sessions inserting concurrently). Commit also makes the change durable - available even after an instance crash - and that supposes that the change is written to disk. What's the cost of that?

Test case

Of course, I cannot do a bulk insert for that as I want to commit for each row. I take the same test as in the previous post: a table with a primary key (so only one index):

create table DEMO ("id" number , "text" varchar2(15), "number" number) ;

And the same PL/SQL except that I've added a COMMIT in the loop:

declare
 type people_array is varray(12) of varchar(15);
 people people_array := people_array( 'Marc', 'Bill', 'George', 'Eliot', 'Matt', 'Trey', 'Tracy','Greg', 'Steve', 'Kristina', 'Katie', 'Jeff' );
 people_count number :=people.COUNT;
 n number;
begin
 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit;
 end loop;
 commit;
end;
/

My test environment is still the Oracle Developer Day 11g VM with 1 CPU allocated by VirtualBox.

 

simple commit in PL/SQL

Yes, this is a special case. I'm running this in a PL/SQL block, and PL/SQL has a special optimization here: because the 'commit successful' message is given to the user only at the end, the intermediate commits are lighter than commits that return to the client. However, look at that 1 million insert + commit execution:

PL/SQL procedure successfully completed.
Elapsed: 00:02:16.34

We are far from the 45 seconds of row-by-row inserts that were committed only at the end. Commit has a big overhead.

 

commit write wait immediate

If you run the inserts from a client, doing a user call for each row, the special optimization is not there. In that case the commit waits for the log writer and returns only when the log writer has acknowledged the write to disk (and to a SYNC standby database if you are in Data Guard). We can do the same from PL/SQL if we issue a COMMIT WRITE WAIT IMMEDIATE instead of a simple commit, so that's a good way to compare the time without introducing additional roundtrip latency.

 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit write wait immediate;
 end loop;

When executing my procedure with the loop above, here is the time it takes:

PL/SQL procedure successfully completed.
Elapsed: 00:05:06.57

Time has doubled. This is the most expensive insert you can do. But it's also the only way if you want to:

  • Give commit feedback to the user at each commit
  • Ensure that the commit is done as quickly as possible (this is the IMMEDIATE write), favoring response time over throughput.
  • Ensure that the commit is durable (the D in ACID), meaning that we wait until the changes are on persistent storage (this is the WAIT write)

As we can see here, those requirements have a cost.

 

commit write wait batch

We can favor throughput with BATCH commit logging instead of IMMEDIATE.

 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit write wait batch;
 end loop;

The gain is not very high here, where I have only one session doing all those inserts:

PL/SQL procedure successfully completed.
Elapsed: 00:04:28.97

but the redo size is reduced. I plan to show more statistics and wait events about it in a future post. The time spent here in the WAIT commits is mostly in those orange 'log file sync' waits we don't like to see on the Enterprise Manager screen. It is the only point where the session process waits for work that is expected to be done asynchronously in the background: at commit, the session waits for the redo to be persisted.

 

commit write nowait immediate

If you are ok with losing a transaction even when the recent commit was successful, then you don't have to wait. That may seem heretical to ACID aficionados, but it is totally acceptable for supporters of 'eventual consistency'. I prefer to think about it according to business requirements.

When something external to the database occurs once the commit is successful, then you must trust in ACID and you have to wait. When your customer withdraws money at the ATM, then once they have their money you must be sure that the operation is committed in the database. A crash of the database should not give the money back on their account, because they will not give back the bills.

However, when you insert events coming from sensors (think Big Data), you probably want to keep them for a while in the database but you can accept losing a few of them in the rare case of a server crash, especially if that considerably improves the performance and scalability of the process. Maybe you can accept losing some transactions, or you are able to load them back again after a crash.
Think of it like NOLOGGING operations: it's not what you want by default, but if you are sure about the business requirements and the recovery procedure, then you have new ways to improve performance and scalability.

 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit write nowait immediate;
 end loop;

And the improvement is obvious:

PL/SQL procedure successfully completed.
Elapsed: 00:02:08.75

When you don't wait, obviously, you improve the response time and the scalability.

 

commit write nowait batch

And we can batch the redo record, as we did above, in addition to nowait:

 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit write nowait batch;
 end loop;

And the improvement is obvious:

PL/SQL procedure successfully completed.
Elapsed: 00:01:45.33

Let the log writer do its job in batch, without waiting for its acknowledgement, and the row-by-row commit overhead is minimized. And it's ok as long as you know what you are doing:

  • You can manage the loss of your transactions in case of a server crash, either by re-executing them or because the missing changes have no consequence.
  • You do a WAIT commit when you return a commit status to the user, because they can do something you don't manage and which depends on that commit status.

statement level, session level, instance level

I show only the statement-level way to manage commit wait and commit logging because it's a decision that must be made by developers. You can choose the same behavior with the commit_wait and commit_logging session parameters, but that's probably not the right option. As I said before, the commit behavior probably depends on where it is in the code: the final commit before returning to the user is probably managed differently.
For the same reason, you probably don't change those parameters at PDB or instance level. But there are always exceptions. If you are doing an application migration that uses a row-by-row API to insert data into the new system, then maybe you can do all that in NOWAIT BATCH by setting instance parameters. If it fails, you restart from the beginning (flashback database helps here).

This kind of optimization also exists in SQL Server: David Bararin has previously blogged about MSSQL delayed durability. Want to meet our experts from all technologies in one day? Come to our event 'In-Memory: boost your IT performance!', where we talk about SQL Server, Oracle and SAP HANA.

And as we are talking about cross technology, stay tuned to see 1M row insert variations on other database systems...

My Weekend with Apple Watch

Oracle AppsLab - Mon, 2015-05-04 08:33

Editor’s note: Here’s our first Apple Watch review from Thao (@thaobnguyen), our head of research and design. Look for a companion review taken from Noel (@noelportugal) later this week.

Thanks, Apple! I got an Apple Watch (38mm) delivered on Friday, April 24.

[Photo: unboxing]

The Watch packaging, quite different from the Watch Sport packaging.

Full disclosure: I’m a long-time Apple user. My household runs on Apple, so many would say I’m an Apple fangirl. Even so, I’m amazed by the excitement Apple generates and surprised by my own excitement over the Watch.

I’ve used other smart watches for years, but why was I so eager to get my hands on an Apple Watch? Perhaps it was all the buzz about limited supply and high demand, and the chance to be among the “few” [thousands] to get the Watch first. Whatever the reason, I feel pretty lucky to be among the first customers to receive an Apple Watch.

After spending the weekend with the watch, I would say I like it, but I’m not in love with it. It hasn’t reached the status of being invaluable. For now, I view it as a pretty awesome iPhone accessory and wearable. I feel constantly in touch and reachable: I don’t think I will ever miss a text message or email again.

[Image: glance]

Many apps have been updated to support Apple Watch. The Watch apps have a much simpler interaction than iPhone apps, which I’m starting to explore and get used to. With those that do it well, it feels cool to be able to do it on the watch. With those that don’t (for example, by not providing enough information), I’m sad that I need to reach for my iPhone.

Lastly, Apple Watch consolidates the features of my wearables into one, so I am giving up my other wearables for now. I used to wear a smart watch that was primarily a golf range finder (and I look forward to trying Apple Watch on the golf course) and a separate fitness tracker.

I will just be wearing the one Apple Watch now. I’m curious to see how my behavior and device usage pattern changes over time.

Will I become dependent upon and attached to Apple Watch as I am with my iPhone?

Finally, let me answer a few common questions I’ve received so far:

  • The watch does last all day. I start the day with 100% battery and end the day with between 20% and 40% left. However, my iPhone battery seems to be taking a hit.
  • Yes, I can make and take a phone call on the watch. The sound quality is good and the mic is good. Caveat: I did not attempt to have a conversation in a loud setting like a busy restaurant.
  • No fat finger issues. The buttons and app icons are seemingly small, but I pretty much tap and select things on the watch without error.
  • Pairing between Apple Watch and iPhone was easy, and the range is good. I could be wearing the watch in one room of my house while the iPhone was in another room and had no problems with them being in range of each other.
  • Cool factor – surprisingly no! Only one person asked me if I was wearing an Apple Watch. Contrast that with other smart watches I have worn, where I would always be asked “what is that?” I’m guessing it is because the Apple Watch fits me and looks like a normal watch. It doesn’t draw attention as being oversized for my wrist.

[Image: home screen]

Please leave comments about your own use, or tips and tricks on getting the most out of smart watches.

Getting started with Sales Cloud (Updated)

Angelo Santagata - Mon, 2015-05-04 08:24
Hey all, I've just revised the Getting Started with Oracle Sales Cloud Integrations blog entry with a few more links.

EMC World 2015 - Day 1 at Momentum

Yann Neuhaus - Mon, 2015-05-04 08:07

This was the first day of my first EMC World conference, and especially of Momentum, which covers the Enterprise Content Division (ECD) products, solutions, strategies and so on. The start was great: being in Las Vegas, where you have the feeling you are on another planet, I had the same feeling during the General Session and the ECD keynote; each time good explanations coupled with good shows.

The information I got was interesting, and some questions came to my mind, questions that I hope can be answered over the next days.

InfoArchive

Before attending the General Session I went to another one about EMC InfoArchive. Today I work mainly with the Documentum Content Server and products around it like xPlore, ADTS, D2 and so on.

To be prepared for future customer requests and challenges, I wanted to see what is behind InfoArchive. Let me give some points:

- One main goal of using InfoArchive is to reduce the cost of storage while keeping the assets.

- Once legacy applications are shut down, you can archive their data into InfoArchive. You can also use it to archive data from active applications, where you can build rules to define which data will be moved to InfoArchive. This can be done for flat and complex records as well as, of course, for document records.

- When the data is saved into InfoArchive, you can use XQuery and XForms to retrieve the data and display it the way the user wants to see it.

That's it for the general overview. From a technical point of view, here is some information:

- The Archive Service is built using a Data Service (xDB data server) and/or a Content Server. In case you have to archive only metadata, the xDB service is sufficient.

- The storage to be used is obviously EMC storage, but other storage can also be used, meaning this solution can be implemented in more types of infrastructure.

- To the question of what is archived, the answer is SIPs (Submission Information Packages). You have a SIP descriptor and SIP data (metadata and/or content).

- LWSO objects are stored to use less storage.

- The search is done first against the AIP (Archive Info Package) and, once the object is found, against the AIU (Archive Info Unit). There is no fulltext search available at the InfoArchive layer; the reason is that an archive system generally does not use it.

- RPS can be used to manage the retention.

Open questions

So much for the "facts". Now there are some other open points which could be raised in case InfoArchive is used. You can save your data in normal XML formats, but you can also define how the data is saved and how you want to search it. In this case, who will manage that: the Records & Archive team, or do you first need a business analyst? Can the defined model easily be changed for the currently archived information? These are technical questions, but I think the organization has to be defined first to have a successful implementation of InfoArchive.

Again, some questions come to my mind. And again, let's see if I can get some answers in ... the next days.

Get the main Administration Information from SQL Server with PowerShell

Yann Neuhaus - Mon, 2015-05-04 07:48

In my previous blog, Automate SQL Server Administration with PowerShell – How to retrieve SQL Server services?, I presented the first step of automating SQL Server administration.

This blog is a follow-up to the previous one, and it will focus on retrieving information about a SQL Server instance with PowerShell.

 

Disclaimer: I am not a developer but a SQL Server DBA. If you find errors or ways to improve this, I will be glad to read your comments!

 

List all SQL Server instances

To be able to proceed for all instances, you can easily get all your instance names with a function like this:

[Screenshot: Get-SQLInstances function]
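
The function itself is only visible in the screenshot above, so here is a minimal sketch of what such a Get-SQLInstances function could look like. The function name and the registry-based implementation are my assumptions, not necessarily the original code:

function Get-SQLInstances
{
    # The installed instance names are listed in the SQL Server registry key
    $key = 'HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server'
    (Get-ItemProperty -Path $key -Name InstalledInstances).InstalledInstances
}

# Example call: returns the instance names, e.g. MSSQLSERVER, INST01, ...
Get-SQLInstances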

 

Retrieve SQL Server instance information

In my example, I execute my script on the machine hosting my SQL Server instance.

I use SMO objects to access the instance information, but you need the full instance name, as follows:

[Screenshot: full instance name]

I only give the instance name as a parameter because I execute my script on the local server; otherwise I would also need to give the server name as a parameter.

 

First, I initialize the SMO object of my instance like this:

[Screenshot: instance SMO object]
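
The exact code is in the screenshot; as a sketch, and assuming the SMO assemblies are installed with the SQL Server management tools, the initialization could look like this (server and instance names are placeholders):

# Load the SMO assembly (LoadWithPartialName is deprecated but is the classic idiom)
[System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.Smo') | Out-Null

# Build the full instance name: <server>\<instance> (placeholders)
$serverName       = $env:COMPUTERNAME
$instanceName     = 'INST01'
$fullInstanceName = "$serverName\$instanceName"

# Create the SMO Server object for that instance
$smoServer = New-Object Microsoft.SqlServer.Management.Smo.Server $fullInstanceName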

 

This SMO object contains the main information about the SQL Server instance. To list all its properties and methods, proceed as follows:

[Screenshot: SMO object properties and methods]
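
As a hedge for the screenshot, listing the members of the SMO object can be done with Get-Member, for example:

# $smoServer is the SMO Server object created above
# List all properties of the SMO Server object
$smoServer | Get-Member -MemberType Property

# List all its methods
$smoServer | Get-Member -MemberType Method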

 

To list the general information of the instance, you can proceed like this:

[Screenshot: instance general information]
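
Something similar to the screenshot could be achieved by selecting a few well-known Server properties (the exact property list shown in the original capture may differ):

# $smoServer is the SMO Server object created above
# General information about the instance
$smoServer | Select-Object NetName, Edition, VersionString, ProductLevel, Collation, LoginMode, IsClustered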

To list the directory paths related to the instance, here is an example:

[Screenshot: instance directories]
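
Again as a sketch (the original capture may show other properties), the main directory paths are exposed directly on the Server object:

# $smoServer is the SMO Server object created above
# Directory paths of the instance
$smoServer | Select-Object RootDirectory, DefaultFile, DefaultLog, BackupDirectory, MasterDBPath, MasterDBLogPath, ErrorLogPath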

To list important instance configuration settings, here is an example:

[Screenshot: instance configuration]
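
The instance configuration is available through the Configuration property of the Server object. Here is a minimal sketch; the settings shown are examples, not necessarily those of the original screenshot:

# $smoServer is the SMO Server object created above
# A few well-known settings, with their configured and running values
$config = $smoServer.Configuration
$config.MaxServerMemory.ConfigValue
$config.MaxDegreeOfParallelism.ConfigValue
$config.CostThresholdForParallelism.ConfigValue

# Or dump every configuration option at once
$config.Properties | Select-Object DisplayName, ConfigValue, RunValue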

 

By formatting the information you retrieve from the instance SMO object, you can generate reports, audit your environment, or whatever you need!
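
For example (a sketch with a hypothetical output path), the selected properties can be exported to a CSV file to build a small report:

# $smoServer is the SMO Server object created above; the path is a placeholder
$smoServer |
    Select-Object NetName, Edition, VersionString, ProductLevel, Collation |
    Export-Csv -Path 'C:\temp\instance_report.csv' -NoTypeInformation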

The following capture is an existing dashboard from our Database Management Kit (DMK).

[Screenshot: DMK instance information dashboard]

 

Next steps

The SMO object for the SQL Server instance has a limited number of properties and methods. Sometimes you need information that is not present in the object. In this case, you must use the "sqlcmd" command and retrieve your information by using T-SQL.

Here is the way to proceed:

[Screenshot: Invoke-Sqlcmd command]
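
The screenshot presumably shows the Invoke-Sqlcmd cmdlet from the SQLPS module; a minimal sketch could be (the query is just an example):

# Invoke-Sqlcmd is shipped with the SQL Server client tools (SQLPS module)
Import-Module SQLPS -DisableNameChecking

# $fullInstanceName is the <server>\<instance> name built above
# Run any T-SQL against the instance and get the result back as objects
Invoke-Sqlcmd -ServerInstance $fullInstanceName -Query "SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion, SERVERPROPERTY('Edition') AS Edition;"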

To retrieve any SQL Server instance information, the "sqlcmd" command will always work. You can also use it to modify the instance configuration.

 

I hope this blog will help you in your work. In my next blog, I will show you how to access your database information with PowerShell.