
Feed aggregator

Twitter Oracle Security Open Chat Thursday 6th March

Pete Finnigan - Wed, 2015-05-20 01:05

I will be co-chairing/hosting a Twitter chat on Thursday 6th March at 7pm UK time with Confio. The details are here. The chat is done over Twitter so it is a little like the Oracle security round table sessions....[Read More]

Posted by Pete On 05/03/14 At 10:17 AM

Categories: Security Blogs

PFCLScan Reseller Program

Pete Finnigan - Wed, 2015-05-20 01:05

We are going to start a reseller program for PFCLScan and we have started the planning and recruitment process for this program. I have just posted a short blog on the PFCLScan website titled "PFCLScan Reseller Program". If....[Read More]

Posted by Pete On 29/10/13 At 01:05 PM

Categories: Security Blogs

PFCLScan Version 1.3 Released

Pete Finnigan - Wed, 2015-05-20 01:05

We released version 1.3 of PFCLScan, our enterprise database security scanner for Oracle, a week ago. I have just posted a blog entry on the PFCLScan product site blog that describes some of the highlights of the over 220 new....[Read More]

Posted by Pete On 18/10/13 At 02:36 PM

Categories: Security Blogs

PFCLScan Updated and Powerful features

Pete Finnigan - Wed, 2015-05-20 01:05

We have just updated PFCLScan, our company's database security scanner for Oracle databases, to version 1.2 and added some new features, new content and more. We are working to release another service update also in the next couple....[Read More]

Posted by Pete On 04/09/13 At 02:45 PM

Categories: Security Blogs

Oracle Security Training, 12c, PFCLScan, Magazines, UKOUG, Oracle Security Books and Much More

Pete Finnigan - Wed, 2015-05-20 01:05

It has been a few weeks since my last blog post but don't worry, I am still interested in blogging about Oracle 12c database security and indeed have nearly 700 pages of notes in MS Word related to 12c security....[Read More]

Posted by Pete On 28/08/13 At 05:04 PM

Categories: Security Blogs

Row Store vs Column Store in SAP HANA

Yann Neuhaus - Wed, 2015-05-20 00:00

The SAP HANA database allows you to create your tables in Row or Column Store mode. In this blog, I will demonstrate that each method has its advantages and disadvantages and should be used for specific cases.

Using two kinds of tests, I will show you that the Row Store mode should be used for simple SELECT SQL queries without aggregation, and that the Column Store mode should be used for complex SELECT queries containing aggregations.
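
For reference, the store mode is chosen when the table is created. Below is a minimal sketch of the DDL for both modes; the table and column definitions are purely illustrative and are not the actual SALES tables used in the tests.

CREATE ROW TABLE sales_row_demo (       -- illustrative table, not the test table
  cust_id  INTEGER,
  amount   DECIMAL(15,2)
);

CREATE COLUMN TABLE sales_column_demo ( -- illustrative table, not the test table
  cust_id  INTEGER,
  amount   DECIMAL(15,2)
);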

If you want to have more information regarding the Column Store or the In-Memory technologies, don't hesitate to attend the next dbi services event:

http://www.dbi-services.com/index.php/newsroom-e/events/event-l-in-memory-r-boost-your-it-performance

Test 1: Simple SELECT query

Goal of the tests

This test will show you the difference in performance between a Row Store and a Column Store table for a simple SQL query.

Description of the test

A SELECT query will be sent to the database and we will check the server response time.

SQL Query

Using a Row Store table

The SQL is the following:

[Screenshot: SQL query against the Row Store table (1_SQL_ROW_STORE.PNG)]

Using a Column Store table

The SQL is the following:

[Screenshot: SQL query against the Column Store table (2_SQL_COLUMN_STORE.PNG)]
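
The queries themselves are only available as screenshots. Purely as an illustration of the kind of statement used in this test (the predicate value and the column list are my assumptions, not taken from the screenshots), a simple query might look like:

SELECT * FROM SALES_ROW    WHERE CUST_ID = 100042;  -- CUST_ID value is illustrative
SELECT * FROM SALES_COLUMN WHERE CUST_ID = 100042;  -- same query against the Column Store table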

Tables

Row Store Table

You can find here information regarding the Row Store table used in the test.

Name:                 SALES_ROW

Table type:          Row Store

Row count:         10 309 873

Index:                1

Partition:            0 (SAP HANA does not allow partitioning of Row Store tables)

[Screenshot: Row Store table details (3_TABLE_ROW_STORE_1.PNG)]

 

[Screenshot: Row Store table details (4_TABLE_ROW_STORE_2.PNG)]

Column Store Table

You can find here information regarding the Column Store table used in the test.

Name:                  SALES_COLUMN

Table type:           Column Store

Row count:          10 309 873

Index:                 0 (SAP HANA automatically applies an index if it is needed)

Partition:             1 RANGE partition on CUST_ID

[Screenshot: Column Store table details (6_TABLE_COLUMN_STORE_2.PNG)]

Result of the test

Using the Row Store table

[Screenshot: query result and server time using the Row Store table (8_SQL_Q1_SELECT_ROW_RESULT.PNG)]

Using the Column Store table

[Screenshot: query result and server time using the Column Store table (9_SQL_Q1_SELECT_COLUMN_RESULT.PNG)]

Test 2: Complex SELECT query

Goal of the tests

This test will show you the difference in performance between a Row Store and a Column Store table for a complex SQL query.

Description of the test

A SELECT query will be sent to the database and we will check the server response time.

SQL Query

Using a Row Store table

The SQL is the following:

[Screenshot: SQL query against the Row Store table (10_SQL_ROW_STORE.PNG)]

Using a Column Store table

The SQL is the following:

[Screenshot: SQL query against the Column Store table (11_SQL_COLUMN_STORE.PNG)]
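
Again, the query is only available as a screenshot. As an illustration only (the AMOUNT column is an assumption, not taken from the post), an aggregation query of the kind described might look like:

SELECT   CUST_ID,
         SUM(AMOUNT) AS TOTAL_AMOUNT,  -- AMOUNT is an assumed column name
         COUNT(*)    AS NB_SALES
FROM     SALES_COLUMN                  -- or SALES_ROW for the Row Store run
GROUP BY CUST_ID;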

Tables

Row Store Fact Table

You can find here information regarding the Row Store table used in the test.

Name:                  SALES_ROW

Table type:          Row Store

Row count:         10 309 873

Index:                   2

Partition:             0 (SAP HANA does not allow partitioning of Row Store tables)

Column Store Fact Table

You can find here information regarding the Column Store table used in the test.

Name:                  SALES_COLUMN

Table type:          Column Store

Row count:         10 309 873

Index:                   0 (SAP HANA automatically applies an index if it is needed)

Partition:             1 RANGE partition on CUST_ID

Result of the test

Using the Row Store tables

[Screenshot: query result and server time using the Row Store tables (12_SQL_Q2_SELECT_ROW_RESULT.PNG)]

Using the Column Store tables

[Screenshot: query result and server time using the Column Store tables (13_SQL_Q2_SELECT_COLUMN_RESULT.PNG)]

Conclusion

Row and Column store modes in SAP HANA should be used in two different contexts:

 - Tables in Row Store mode must be used in SELECT queries WITHOUT any aggregation functions

 - Tables in Column Store mode are powerful when they are used in analytical queries or views, using aggregation functions (GROUP BY, …)

Performance can be significantly improved if the tables used in the queries have the right store mode.


2 minute Tech Tip: Working with JSON in APEX

Dimitri Gielis - Tue, 2015-05-19 16:30
On Monday Bob Rhubart did a video call with me in his series of 2MTT (2 Minute Tech Tip) on YouTube. You can find my 2MTT here.

I talked about using JSON and APEX and gave two examples where we use it.
In previous blog posts I gave more details on those techniques. Here's a quick overview:
Categories: Development

Using HBase and Impala to Add Update and Delete Capability to Hive DW Tables, and Improve Query Response Times

Rittman Mead Consulting - Tue, 2015-05-19 16:21

One of our customers is looking to offload part of their data warehouse platform to Hadoop, extracting data out of a source system and loading it into Apache Hive tables for subsequent querying using OBIEE11g. One of the challenges that the project faces though is how to handle updates to dimensions (and in their case, fact table records) when HDFS and Hive are typically append-only filesystems; ideally writes to fact tables should only require INSERTs and filesystem appends but in this case they wanted to use an accumulating fact snapshot table, whilst the dimension tables all used SCD1-type attributes that had their values overwritten when updates to those values came through from the source system.

The obvious answer then was to use Apache HBase as part of the design, a NoSQL database that sits over HDFS but allows updates and deletes to individual rows of data rather than restricting you just to append/inserts. I covered HBase briefly on the blog a few months ago when we used it to store webserver log entries brought into Hadoop via Flume, but in this case it makes an ideal landing point for data coming into our Hadoop system as we can maintain a current-state record of the data brought into the source system updating and overwriting values if we need to. What was also interesting to me though was how well we could integrate this HBase data into our mainly SQL-style data processing; how much Java I’d have to use to work with HBase, and whether we could get OBIEE to connect to the HBase tables and query them directly (with a reasonable response time). In particular, could we use the Hive-on-HBase feature to create Hive tables over the HBase ones, and then query those efficiently using OBIEE, so that the data flow looked like this?

[Diagram: proposed data flow – source system into HBase, with Hive-on-HBase tables queried by OBIEE]

To test this idea out, I took the Flight Delays dataset from the OBIEE11g SampleApp & Exalytics demo data [PDF] and created four HBase tables to hold the data from them, using the BigDataLite 4.1 VM and the HBase Shell. This dataset has four tables:

  • FLIGHT_DELAYS – around 220m US flight records listing the origin airport, destination airport, carrier, year and a bunch of metrics (flights, late minutes, distance etc)
  • GEOG_ORIGIN – a list of all the airports in the US along with their city, state, name and so on
  • GEOG_DEST – a copy of the GEOG_ORIGIN table, used for filtering and aggregating on both origin and destination 
  • CARRIERS – a list of all the airlines associated with flights in the FLIGHT_DELAYS table

HBase is a NoSQL, key/value-store database where individual rows have a key, and then one or more column families made up of one or more columns. When you define a HBase table you only define the column families, and the data load itself creates the columns within them in a similar way to how the Endeca Server holds “jagged” data – individual rows might have different columns to each other and like MongoDB you can define a new column just by loading it into the database.

Using the HBase Shell CLI on the BigDataLite VM I therefore create the HBase tables using just these high-level column family definitions, with the individual columns within the column families to be defined later when I load data into them.

hbase shell
 
create 'carriers','details'
create 'geog_origin','origin'
create 'geog_dest','dest'
create 'flight_delays','dims','measures'

To get data into HBase tables there’s a variety of methods you can use. Most probably for the full project we’ll write a Java application that uses the HBase client to read, write, update and delete rows that are read in from the source application (see this previous blog post for an example where we use Flume as the source), or to set up some example data we can use the HBase Shell and enter the HBase row/cell values directly, like this for the geog_dest table:

put 'geog_dest','LAX','dest:airport_name','Los Angeles, CA: Los Angeles'
put 'geog_dest','LAX','dest:city','Los Angeles, CA'
put 'geog_dest','LAX','dest:state','California'
put 'geog_dest','LAX','dest:id','12892'

and you can then use the “scan” command from the HBase shell to see those values stored in HBase’s key/value store, keyed on LAX as the key.

hbase(main):015:0> scan 'geog_dest'
ROW                                    COLUMN+CELL                                                                                                     
 LAX                                   column=dest:airport_name, timestamp=1432067861347, value=Los Angeles, CA: Los Angeles                           
 LAX                                   column=dest:city, timestamp=1432067861375, value=Los Angeles, CA                                                
 LAX                                   column=dest:id, timestamp=1432067862018, value=12892                                                            
 LAX                                   column=dest:state, timestamp=1432067861404, value=California                                                    
1 row(s) in 0.0240 seconds

For testing purposes though we need a large volume of rows and entering them all in by-hand isn’t practical, so this is where we start to use the Hive integration that now comes with HBase. For the BigDataLite 4.1 VM all you need to do to get this working is install the hive-hbase package using yum (after first installing the Cloudera CDH5 repo into /etc/yum.repos.d), load the relevant JAR files when starting your Hive shell session, and then create a Hive table over the HBase table mapping Hive columns to the relevant HBase ones, like this:

hive
 
ADD JAR /usr/lib/hive/lib/zookeeper.jar;
ADD JAR /usr/lib/hive/lib/hive-hbase-handler.jar;
ADD JAR /usr/lib/hive/lib/guava-11.0.2.jar;
ADD JAR /usr/lib/hive/lib/hbase-client.jar;
ADD JAR /usr/lib/hive/lib/hbase-common.jar;
ADD JAR /usr/lib/hive/lib/hbase-hadoop-compat.jar;
ADD JAR /usr/lib/hive/lib/hbase-hadoop2-compat.jar;
ADD JAR /usr/lib/hive/lib/hbase-protocol.jar;
ADD JAR /usr/lib/hive/lib/hbase-server.jar;
ADD JAR /usr/lib/hive/lib/htrace-core.jar;
 
CREATE EXTERNAL TABLE hbase_carriers
 (key string,
  carrier_desc string
) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES 
("hbase.columns.mapping" = ":key,details:carrier_desc")
TBLPROPERTIES ("hbase.table.name" = "carriers");
 
CREATE EXTERNAL TABLE hbase_geog_origin
 (key string,
  origin_airport_name string,
  origin_city string,
  origin_state string,
  origin_id string
) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES 
("hbase.columns.mapping" = ":key,origin:airport_name,origin:city,origin:state,origin:id")
TBLPROPERTIES ("hbase.table.name" = "geog_origin");
 
CREATE EXTERNAL TABLE hbase_geog_dest
 (key string,
  dest_airport_name string,
  dest_city string,
  dest_state string,
  dest_id string
) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES 
("hbase.columns.mapping" = ":key,dest:airport_name,dest:city,dest:state,dest:id")
TBLPROPERTIES ("hbase.table.name" = "geog_dest");
 
CREATE EXTERNAL TABLE hbase_flight_delays
 (key string,
  year string,
  carrier string,
  orig string,
  dest string,
  flights tinyint,
  late   tinyint,
  cancelled bigint,
  distance smallint
) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES 
("hbase.columns.mapping" = ":key,dims:year,dims:carrier,dims:orig,dims:dest,measures:flights,measures:late,measures:cancelled,measures:distance")
TBLPROPERTIES ("hbase.table.name" = "flight_delays");

Bulk loading data into these Hive-on-HBase tables is then just a matter of loading the source data into a regular Hive table, and then running INSERT INTO TABLE … SELECT commands to copy the regular Hive rows into the HBase tables via their Hive metadata overlays:

insert into table hbase_carriers                           
select carrier, carrier_desc from carriers;
 
insert into table hbase_geog_origin
select * from geog_origin;
 
insert into table hbase_geog_dest
select * from geog_dest;
 
insert into table hbase_flight_delays
select row_number() over (), * from flight_delays;

Note that I had to create a synthetic sequence number key for the fact table, as the source data for that table doesn’t have a unique key for each row – something fairly common for data warehouse fact table datasets. In fact, storing fact table data in an HBase table is not a very good idea for a number of reasons that we’ll see in a moment, and bear in mind that HBase is designed for sparse datasets and low-latency inserts and row retrievals, so don’t read too much into this approach yet.

So going back to the original reason for using HBase to store these tables, updating rows within them is pretty straightforward. Taking the geog_origin HBase table, if we get the row for SFO using a Hive query over the HBase table, it looks like this:

hive> select * from hbase_geog_origin where key = 'SFO'; 
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
...
SFO   San Francisco, CA: San Francisco   San Francisco, CA   California   14771
Time taken: 29.126 seconds, Fetched: 1 row(s)

To update that row and others, I can load a new data file into the Hive table using HiveQL’s LOAD DATA command, or INSERT INTO TABLE … SELECT from another Hive table containing the updates, like this:

insert into table hbase_geog_origin    
select * from origin_updates;

To check that the value has in-fact updated I can either run the same SELECT query against the Hive table over the HBase one, or drop into the HBase shell and check it there:

hbase(main):001:0> get 'geog_origin','SFO'
COLUMN                                 CELL                                                                                                           
 origin:airport_name                   timestamp=1432050681685, value=San Francisco, CA: San Francisco International                                  
 origin:city                           timestamp=1432050681685, value=San Francisco, CA                                                               
 origin:id                             timestamp=1432050681685, value=14771                                                                           
 origin:state                          timestamp=1432050681685, value=California                                                                      
4 row(s) in 0.2740 seconds

In this case the update file/Hive table changed the SFO airport name from “San Francisco” to “San Francisco International”. I can change it back again using the HBase Shell like this, if I want:

put 'geog_origin','SFO','origin:airport_name','San Francisco, CA: San Francisco'

and then checking it again using the HBase Shell’s GET command on that key value shows it’s back to the old value – HBase actually stores X number of versions of each cell with a timestamp for each version, but by default it shows you the current one:

hbase(main):003:0> get 'geog_origin','SFO'
COLUMN                                 CELL                                                                                                           
 origin:airport_name                   timestamp=1432064747843, value=San Francisco, CA: San Francisco                                                
 origin:city                           timestamp=1432050681685, value=San Francisco, CA                                                               
 origin:id                             timestamp=1432050681685, value=14771                                                                           
 origin:state                          timestamp=1432050681685, value=California                                                                      
4 row(s) in 0.0130 seconds

So, so far so good. We’ve got a way of storing data in Hive-type tables on Hadoop and a way of updating and amending records within them by using HBase as the underlying storage, but what are these tables like to query? Hive-on-HBase tables with just a handful of HBase rows return data almost immediately, for example when I create a copy of the geog_dest HBase table and put just a single row entry into it, then query it using a Hive table over it:

hive> select * from hbase_geog_dest2;
OK
LAXLos Angeles, CA: Los AngelesLos Angeles, CACalifornia12892
Time taken: 0.257 seconds, Fetched: 1 row(s)

Hive in this case even with a single row would normally take 30 seconds or more to return just that row; but when we move up to larger datasets such as the flight delays fact table itself, running a simple row count on the Hive table and then comparing that to the same query running against the Hive-on-HBase version shows a significant time-penalty for the HBase version:

hive> select sum(cast(flights as bigint)) as flight_count from flight_delays;
Total jobs = 1
Launching Job 1 out of 1
...
Total MapReduce CPU Time Spent: 7 seconds 670 msec
OK
29483653
Time taken: 37.327 seconds, Fetched: 1 row(s)

compared to the Hive-on-HBase version of the fact table:

hive> select sum(cast(flights as bigint)) as flight_count from hbase_flight_delays;
Total jobs = 1
Launching Job 1 out of 1
...
Total MapReduce CPU Time Spent: 1 minutes 19 seconds 240 msec
OK
21473738
Time taken: 99.154 seconds, Fetched: 1 row(s)

And that’s to be expected; as I said earlier, HBase is aimed at low-latency single-row operations rather than full table scan, aggregation-type queries, so it’s not unexpected that HBase performs badly here, but the response time is even worse if I try and join the HBase-stored Hive fact table to one or more of the dimension tables also stored in HBase.
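
For completeness, the all-HBase version of that join is simply the earlier dimension-join query with the regular Hive fact table swapped for its Hive-on-HBase overlay – a sketch of what such a query looks like (the post doesn't show the statement or its timing):

select sum(cast(f.flights as bigint)) as flight_count, o.origin_airport_name
from hbase_flight_delays f
join hbase_geog_origin o on f.orig = o.key
and o.origin_state = 'California'
group by o.origin_airport_name;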

In our particular customer example though these HBase tables were only going to be loaded once-a-day, so what if we copy the current version of each HBase table row into a snapshot Hive table stored in regular HDFS storage, so that our data loading process looks like this:

[Diagram: data loading flow – HBase tables copied daily into snapshot Hive tables on HDFS]

and then OBIEE queries the snapshot of the Hive-on-HBase table joined to the dimension table still stored in HBase, so that the query side looks like this:

[Diagram: query flow – OBIEE querying the Hive snapshot fact table joined to dimension tables still stored in HBase]
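
The snapshot step itself can be as simple as a Hive CTAS over the Hive-on-HBase fact table, run as part of the daily load. A minimal sketch is below – the snapshot table name and the choice of ORC as the storage format are my assumptions, not details taken from the project:

create table flight_delays_snapshot   -- hypothetical snapshot table name
stored as orc                         -- any regular HDFS-backed format would do
as
select * from hbase_flight_delays;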

Let’s try it out by taking the original Hive table I used earlier on to load the hbase_flight_delays table, and join that to one of the Hive-on-HBase dimension tables; I’ll start first by creating a baseline response time by joining that source Hive fact table to the source Hive dimension table (also used earlier to load the corresponding Hive-on-HBase table):

select sum(cast(f.flights as bigint)) as flight_count, o.origin_airport_name from flight_delays f 
join geog_origin o on f.orig = o.origin                                                             
and o.origin_state = 'California'                                                                       
group by o.origin_airport_name; 
...
OK
17638Arcata/Eureka, CA: Arcata
9146Bakersfield, CA: Meadows Field
125433Burbank, CA: Bob Hope
...
1653Santa Maria, CA: Santa Maria Public/Capt. G. Allan Hancock Field
Time taken: 43.896 seconds, Fetched: 27 row(s)

So that’s just under 44 seconds to do the query entirely using regular Hive tables. So what if I swap-out the regular Hive dimension table for the Hive-on-HBase version, how does that affect the response time?

hive> select sum(cast(f.flights as bigint)) as flight_count, o.origin_airport_name from flight_delays f       
    > join hbase_geog_origin o on f.orig = o.key                                                        
    > and o.origin_state = 'California'                                                                 
    > group by o.origin_airport_name;
...
OK
17638Arcata/Eureka, CA: Arcata
9146Bakersfield, CA: Meadows Field
125433Burbank, CA: Bob Hope
...
1653Santa Maria, CA: Santa Maria Public/Capt. G. Allan Hancock Field
Time taken: 51.757 seconds, Fetched: 27 row(s)

That’s interesting – even though we used the (updatable) Hive-on-HBase dimension table in the query, the response time only went up a few seconds to 51, compared to the 44 when we used just regular Hive tables. Taking it one step further though, what if we used Cloudera Impala as our query engine and copied the Hive-on-HBase fact table into a Parquet-stored Impala table, so that our inward data flow looked like this:

[Diagram: data flow – Hive-on-HBase fact table copied into a Parquet-stored Impala table]

By using the Impala MPP engine – running on Hadoop but directly reading the underlying data files, rather than going through MapReduce as Hive does – and in addition storing its data in column-store, query-orientated Parquet storage, we can take advantage of OBIEE 11.1.1.9’s new support for Impala and potentially bring the query response time down even further. Let’s go into the Impala Shell on the BigDataLite 4.1 VM, update Impala’s view of the Hive Metastore table data dictionary, and then create the corresponding Impala snapshot fact table using a CREATE TABLE … AS SELECT Impala SQL command:

[oracle@bigdatalite ~]$ impala-shell
 
[bigdatalite.localdomain:21000] > invalidate metadata;
 
[bigdatalite.localdomain:21000] > create table impala_flight_delays
                                > stored as parquet
                                > as select * from hbase_flight_delays;

Now let’s use the Impala Shell to join the Impala version of the flight delays table with data stored in Parquet files, to the Hive-on-HBase dimension table created earlier within our Hive environment:

[bigdatalite.localdomain:21000] > select sum(cast(f.flights as bigint)) as flight_count, o.origin_airport_name from impala_flight_delays f
                                > join hbase_geog_origin o on f.orig = o.key
                                > and o.origin_state = 'California'  
                                > group by o.origin_airport_name;
Query: select sum(cast(f.flights as bigint)) as flight_count, o.origin_airport_name from impala_flight_delays f
join hbase_geog_origin o on f.orig = o.key
and o.origin_state = 'California'
group by o.origin_airport_name
+--------------+------------------------------------------------------------------+
| flight_count | origin_airport_name                                              |
+--------------+------------------------------------------------------------------+
| 31907        | Fresno, CA: Fresno Yosemite International                        |
| 125433       | Burbank, CA: Bob Hope                                            |
...
| 1653         | Santa Maria, CA: Santa Maria Public/Capt. G. Allan Hancock Field |
+--------------+------------------------------------------------------------------+
Fetched 27 row(s) in 2.16s

Blimey – 2.16 seconds, compared to the best time of 44 seconds we got earlier when we just used regular Hive tables, let alone joining to the dimension table stored in HBase. Let’s crank it up a bit and join another dimension table in, filtering on both origin and destination values:

[bigdatalite.localdomain:21000] > select sum(cast(f.flights as bigint)) as flight_count, o.origin_airport_name from impala_flight_delays f
                                > join hbase_geog_origin o on f.orig = o.key
                                > join hbase_geog_dest d on f.dest = d.key
                                > and o.origin_state = 'California'  
                                > and d.dest_state = 'New York'
                                > group by o.origin_airport_name;
Query: select sum(cast(f.flights as bigint)) as flight_count, o.origin_airport_name from impala_flight_delays f
join hbase_geog_origin o on f.orig = o.key
join hbase_geog_dest d on f.dest = d.key
and o.origin_state = 'California'
and d.dest_state = 'New York'
group by o.origin_airport_name
+--------------+-------------------------------------------------------+
| flight_count | origin_airport_name                                   |
+--------------+-------------------------------------------------------+
| 947          | Sacramento, CA: Sacramento International              |
| 3880         | San Diego, CA: San Diego International                |
| 4030         | Burbank, CA: Bob Hope                                 |
| 41909        | San Francisco, CA: San Francisco International        |
| 3489         | Oakland, CA: Metropolitan Oakland International       |
| 937          | San Jose, CA: Norman Y. Mineta San Jose International |
| 41407        | Los Angeles, CA: Los Angeles International            |
| 794          | Ontario, CA: Ontario International                    |
| 4176         | Long Beach, CA: Long Beach Airport                    |
+--------------+-------------------------------------------------------+
Fetched 9 row(s) in 1.48s

Even faster. So that’s what we’ll be going with as our initial approach for the data loading and querying; load data into HBase tables as planned at the start, taking advantage of HBase’s CRUD capabilities but bulk-loading and initially reading the data using Hive tables over the HBase ones; but then, before we make the data available for querying by OBIEE, we copy the current state of the HBase fact table into a Parquet-stored Impala table, using Impala’s ability to work with Hive tables and metadata and create joins across both Impala and Hive tables, even when one of the Hive tables uses HBase as its underlying storage.

Categories: BI & Warehousing

List all RMAN backups that are needed to recover

Yann Neuhaus - Tue, 2015-05-19 09:49

This blog post is something I had in draft, and Laurent Schneider's blog post reminded me to publish it. With the right RMAN configuration you should not have to manage backup files yourself. The RMAN catalog knows them and RMAN should be able to access them. If you want to keep a backup for a long time, you just tell RMAN to keep it.
But sometimes, RMAN is not connected to your tape backup software, or the backups are not shared on all sites, and you have to restore or copy the set of files that is needed for a restore database or a duplicate database.

A customer was in that case, identifying the required files from their names because they are all timestamped with the beginning of the backup job. It's our DMK default. In order to rely on that, 'backup database plus archivelog' was run. And in order to be sure to have all archived logs in those backup sets, any concurrent RMAN jobs are blocked during that database backup, because if a concurrent job is doing some archivelog backups, they will be timestamped differently.

RPO and availability

I don't like that. I don't want anything to be able to block the backup of archived logs.
They are critical for two reasons:

  • The Recovery Point Objective is not fulfilled if some archivelog backups are delayed
  • The frequency of archivelog backup is also defined to prevent a full FRA
But if we allow concurrent backups of archived logs, we need something else to identify the whole set of files that are needed to restore the database at that point in time. So my suggestion was to generate the list of those files after each database backup, and keep that list. When we need to restore that backup, we can send the list to the backup team and ask them to restore those files.

The script

Here is my script; I'll explain it later:

echo "restore controlfile preview; restore database preview;" | rman target / | awk '
/Finished restore at /{timestamp=$4}
/Recovery must be done beyond SCN /{if ($7>scn) scn=$7 }
/^ *(Piece )Name: / { sub(/^ *(Piece )Name: /,"") ; files[$0]=1 }
END{ for (i in files) print i > "files-"timestamp"-SCN-"scn".txt" }
' 
This script generates the following file:
files-20150519013910-SCN-47682382860.txt
which lists the files needed to do a RESTORE/RECOVER UNTIL SCN 47682382860.

the content of the file is:

oracle@dbzhorap01:/home/oracle/ [DB01PP1] sort files-20150519013910-SCN-47682382860.txt
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168278_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168279_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168280_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168281_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168282_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168283_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168284_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169462_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169463_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169464_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169465_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169466_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169467_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169468_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169469_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169470_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169471_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169472_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169481_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169482_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169473_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169474_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169475_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169476_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169477_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169478_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169479_p1.bck
and lists the backup pieces for the incremental 0, incremental 1 and archivelog backups needed to recover to a consistent state that can be opened. The script lists only backup sets, so we are supposed to have backed up the latest archived logs (with 'backup database plus archivelog' for example).

You can add an 'until scn' clause, but my primary goal was to run it just after a database backup in order to know which files have to be restored to get back to that backup (restore or duplicate).
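
Once those backup pieces are available again (catalogued by RMAN if they were restored to a new location), using the SCN from the file name is a standard point-in-time restore. A minimal sketch, not taken from the original post:

run {
  set until scn 47682382860;   -- the SCN recorded in the file name
  restore database;
  recover database;
}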

Restore preview

The idea is to rely on RMAN to find the files that are needed to restore and recover rather than doing it ourselves from the recovery catalog. RMAN provides the PREVIEW restore for that:

RMAN> restore database preview
Starting restore at 20150501390436
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
166388  Incr 0  10.53G     DISK        00:52:56     20150516031010
        BP Key: 166388   Status: AVAILABLE  Compressed: YES  Tag: WEEKLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168278_p1.bck
  List of Datafiles in backup set 166388
  File LV Type Ckp SCN    Ckp Time       Name
  ---- -- ---- ---------- -------------- ----
  1    0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/system.329.835812499
  2    0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/undotbs1.525.835803187
  10   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.676.835815153
  17   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.347.835815677
  23   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.277.835814327
  25   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.342.835811161
...
BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
167586  Incr 1  216.09M    DISK        00:01:34     20150519012830
        BP Key: 167586   Status: AVAILABLE  Compressed: YES  Tag: DAYLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169479_p1.bck
  List of Datafiles in backup set 167586
  File LV Type Ckp SCN    Ckp Time       Name
  ---- -- ---- ---------- -------------- ----
  43   1  Incr 47681921440 20150519012700 +U01/DB01Pp/datafile/cpy_idx.346.835815097

List of Backup Sets
===================

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
167594  105.34M    DISK        00:00:23     20150519015400
        BP Key: 167594   Status: AVAILABLE  Compressed: YES  Tag: DAYLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169481_p1.bck

  List of Archived Logs in backup set 167594
  Thrd Seq     Low SCN    Low Time       Next SCN   Next Time
  ---- ------- ---------- -------------- ---------- ---------
  3    59406   47681333097 20150519010239 47682617820 20150519014652
  4    46800   47681333143 20150519010240 47682617836 20150519014652
  1    76382   47681333188 20150519010240 47682618254 20150519014655
  2    60967   47681333315 20150519010242 47682385651 20150519013711
...

Media recovery start SCN is 47681637369
Recovery must be done beyond SCN 47682382860 to clear datafile fuzziness
Finished restore at 20150501390440
You see the list of datafile backup sets and archivelog backup sets, and at the end you have information about SCNs. Let me explain what those SCNs are.

Recovery SCN

Because it is an online backup, the datafiles are fuzzy. We need to apply the redo generated during the backup.

The 'media recovery start SCN' is the beginning of the archived logs to be applied:

SQL> select scn_to_timestamp(47681637369) from dual;

SCN_TO_TIMESTAMP(47681637369)
---------------------------------------------------------------------------
19-MAY-15 01.10.38.000000000 AM

The 'recovery must be done beyond SCN' is the last redo that must be applied to make the datafiles consistent:

SQL> select scn_to_timestamp(47682382860) from dual;

SCN_TO_TIMESTAMP(47682382860)
---------------------------------------------------------------------------
19-MAY-15 01.35.58.000000000 AM

In my example, the backup (incremental level 1 + archivelog) started at 01:00:00 and completed at 01:35:00.

Conclusion

And I have a file with the list of backups that are needed to restore or duplicate the database at that point in time. Why do I need that when RMAN is supposed to be able to retrieve them itself? Because sometimes we back up to disk and the disk is backed up to tape without RMAN knowing it. Of course RMAN can connect directly to the tape backup software, but that is not free. Or we want to duplicate to another site where backups are not shared, and we need to know which files to bring there. That sometimes requires a request to another team, so it's better to have the list of all the files we need.

As usual, don't hesitate to comment if you see something to improve in my small script.

Maker Faire 2015

Oracle AppsLab - Tue, 2015-05-19 09:17

This weekend the 10th Annual Maker Faire Bay Area took place in my backyard, and rather than fighting traffic for 2 days with the 130,000+ attendees I decided, as I have for the last 9 years, to join them.

Unlike last year, Oracle had no presence at the Maker Faire itself, so I had plenty of time to walk around the grounds and attend sessions.  This post is an overview of what I saw and experienced in the 2 day madness that is called the Maker Faire.

For those of you who have never been to the Maker Faire, the easiest way to describe it is as a mix of Burning Man and a completely out of control hobbyist’s garage, where the hobbyist’s hobbies include, but are not limited to: everything tech related, everything food related, everything engineering related and everything art related, all wrapped up in a family friendly atmosphere, my kids love the Maker Faire.

You can find the tech giants of the world next to the one-person startup, beer brewers next to crazy knitting contraptions, bus-sized, fire-breathing rhinos next to giant cardboard robots, etc. And nobody takes themselves too seriously, e.g. Google was handing out Google Glasses to everybody … Google Safety Glasses that is :-)

Google Safety Goggles

My new Google Glasses :-)

The first thing I noticed was that the Faire expanded . . . again. A huge tent, housing the Make:Labs, was erected on what was a parking lot last year. I didn’t actually get to spend any time in there, but it contained an exploratorium, startup stuff and a section for Young Makers.

Which brings me to the first trend I observed, makers are getting younger and younger and the faire is doubling down on these young folk.

Don’t get me wrong, the faire has always attracted young kids, and some of them were making stuff, but there seem to be more and more of them, the projects they bring are getting more and more impressive and the faire’s expansions all seem to be to cater to these younger makers.

One of the sessions I attended was called “Meet Some Amazing Young Makers” where a 14 year old girl showed off a semi-autonomous robot that could map the inside of caves. She was showing us the second iteration; she built the first version . . . when she was 8! Another young man, 13, built a contraption that solved a Rubik’s cube in under 90 seconds. It wasn’t just that they built these things, they gave solid presentations to a majority adult audience talking about their builds and future plans.

Another trend that was hard to ignore is that the Internet of Things (IoT) is getting huge and it’s definitely here to stay.  There weren’t just many, many vendors promoting their brand of IoT hardware, but a whole ecosystem is developing around them.

From tools that let you visualize all the data collected by your “things” to remote configuration and customization. This trend will not just Cross the Chasm, it’s going to rocket right past it.

I attended a panel discussion with Dominic Pajak (Director IoT Segments, ARM), Paul Rothman (Director of R&D at littleBits Electronics), Andrew Witte (CTO, Pebble), Alasdair Allan (scientist, tinkerer) and Pierre Roux (Atmel) about the current state of IoT and the challenges that lay ahead.

One of the interesting points raised during the discussions is that there currently is no such thing as the Internet of Things!  All these “things” have to be tethered to a phone or other internet capable device (typically using BLE), they cannot connect to the internet directly.

Furthermore, they cannot communicate with each other directly. So it’s not really an IoT, but rather the regular “human internet” with regular computers/phones connecting to it, which in turn happen to have some sensors attached to them that use the internet as a communication vehicle, but that doesn’t really roll off the tongue that well.

There is no interoperability standard at the moment so you can’t really have one device talk to a random other device. This is one of the challenges the panel felt has to be solved in the short term. This could happen with the adoption of IP in BLE or some other mechanism like Fog Computing.

Another challenge brought up was securing IoT devices, especially given that some of the devices could be broadcasting extremely personal information.  This will have to be solved at the manufacturing level as well as at the application level.

Finally, they also mentioned that lowering power consumption needs to be a top priority for these devices.  Even though they have already come a long way, there still is a lot of work to be done.  The ultimate goal would be self sufficient devices that need no external power at all but can harvest the energy they need from their environment.

One such example mentioned is a button/switch that when pressed, uses the energy you put in to press it to generate enough power to send a on/off signal to another device.

Massimo Banzi, co-founder of the Arduino Project, also gave a talk (as he does every year) about the State of Arduino.  It seems that a lot of that state is in legal limbo at the moment as there are now seemingly 2 arduino companies (arduino.cc and arduino.org) with different views of the future of the project.

As part of his vision, Massimo introduced a partnership with Adafruit to let them produce Arduinos in the USA. Also as a result of the legal issues with the Arduino brand name, he introduced a new “sister” brand called Genuino (Get it? Genuine Arduino) which will allow them to keep producing, at least in the US.

Other announcements included the release of the Arduino Gemma, the smallest Arduino ever, the Modulino, an Arduino-like product designed and produced in their Bangalore, India, office, and a focus on online tools to manage and program Arduinos.

I also attended a few sessions that talked about the BeagleBone board. I am interested in this board because it bridges the gap between the Raspberry Pi and the Arduino: on the one hand it has a Linux OS, but on the other hand it also has real-time GPIO pins, making it interesting for IoT projects that require this.

It can also be easily programmed using JavaScript (it comes with a node server built in), which is something I am currently working with. I’ll probably write up another blog post about my findings with that board when I get some time to play with it (yes, I got one at the Maker Faire :-).

And finally, some other things you can find at the Maker Faire:

Game of Drones:

Fire and Art:


Robots that solve Rubik’s cubes:

Cheers,

Mark.

Writing tips

Amardeep Sidhu - Tue, 2015-05-19 03:14

Tim Hall has written some brilliant posts about getting going with writing (blogs, whitepapers etc.). This post is the result of inspiration from there. Tim says to just get started with whatever ;-)

If you are into blogging but not so active, or even if you aren’t, you may want to take a look at all the posts to get some inspiration to document the knowledge you gain on a day-to-day basis.

Here is an index of all the posts by Tim so far:

http://oracle-base.com/blog/2015/05/11/writing-tips-why-should-i-bother/

http://oracle-base.com/blog/2015/05/12/writing-tips-how-do-i-start/

http://oracle-base.com/blog/2015/05/13/writing-tips-writing-style/

http://oracle-base.com/blog/2015/05/14/writing-tips-how-do-i-stay-motivated/

http://oracle-base.com/blog/2015/05/15/writing-tips-dealing-with-comments-and-criticism/

http://oracle-base.com/blog/2015/05/18/writing-tips-should-i-go-back-and-rewrite-revise-remove-old-posts/

http://oracle-base.com/blog/2015/05/19/writing-tips-how-often-should-i-write/

Enjoy !

Categories: BI & Warehousing

SQL Server 2014: First Service Pack (SP1) is available

Yann Neuhaus - Tue, 2015-05-19 01:48

On May 14th, Microsoft released the first Service Pack (SP1) for SQL Server 2014, more than thirteen months after the RTM version.
SQL Server 2014 Service Pack 1 includes all of the Cumulative Updates from CU1 to CU5.

Which issues are fixed in this SP1

There are 29 hotfixes:

  • 19 for the Engine
  • 6 for SSRS
  • 3 for SSAS
  • 1 for SSIS

 

[Chart: breakdown of the SQL Server 2014 SP1 hotfixes by component (b2ap3_thumbnail_PieSQL2014SP1.jpg)]

Some improvements are:

  • Performance improvement of Column store with batch mode operators and a new Extended Event
  • Buffer pool extension improvement
  • New cardinality estimator to boost query performance (see the note below)
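
As a side note (not from the original post), the new cardinality estimator is controlled by the database compatibility level, so on a SQL Server 2014 instance you would typically enable it per database with something like the following:

ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 120;  -- MyDatabase is a placeholder name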

History of SQL Server 2014

The build version of SQL Server 2014 SP1 is 12.0.4100.1.
Here is a quick overview of SQL Server 2014 builds since CTP1:

Date            SQL Server 2014 version                   Build
June 2013       Community Technology Preview 1 (CTP1)     11.00.9120
October 2013    Community Technology Preview 2 (CTP2)     12.00.1524
April 2014      RTM                                       12.00.2000
April 2014      Cumulative Update 1 (CU1)                 12.00.2342
June 2014       Cumulative Update 2 (CU2)                 12.00.2370
August 2014     Cumulative Update 3 (CU3)                 12.00.2402
October 2014    Cumulative Update 4 (CU4)                 12.00.2430
December 2014   Cumulative Update 5 (CU5)                 12.00.2456
May 2015        Service Pack 1 (SP1)                      12.00.4100
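
To check which build and servicing level an instance is actually running, a quick query against SERVERPROPERTY does the job; the values in the comments are what you would expect to see after applying SP1:

SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion,  -- e.g. 12.0.4100.1
       SERVERPROPERTY('ProductLevel')   AS ProductLevel;    -- e.g. SP1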

If you need more information about SQL Server 2014 SP1 or to download it, click here.

As a reminder, Service Packs are critical from a bug-fixing and product-upgrade point of view, so take care to install it quickly ;-)
See you.

Change first day of week in APEX 5.0 Calendar

Dimitri Gielis - Tue, 2015-05-19 00:52
APEX 5.0 comes with a new calendar region, which is way nicer than the previous calendar in APEX. It has more features, looks better and is also responsive. Behind the scenes you'll see the calendar region is based on Full Calendar.


In Belgium we use Monday as the first day of the week, whereas in the US they seem to use Sunday as the start of the week in the calendar overview. I've integrated Full Calendar before, so I knew that library had an option to set the first day of the week. You could either specify an option called firstDay and set that to 1, or you could change the language, and depending on the language it would adjust the start day of the week.

In APEX 5.0 I looked for that option, but there's not a specific attribute to set the first day of the week; instead it's based on the language of your application. If you go to Shared Components > Globalization Attributes, by default it's set to en, which has Sunday as the start of the week. If you set it to en-gb, it will have Monday as the start of the week.


It took me some time to find out how to do it, so hopefully this post will help others find it more easily. Thanks to Patrick for sharing the way it was implemented.

Categories: Development

SQL Saturday Lisbon: from Francesinha to Bacalhau

Yann Neuhaus - Mon, 2015-05-18 23:45

Last weekend, I was at SQL Saturday 369, which was held in Lisbon. If you take a look at the agenda, you'll probably see that there are a lot of interesting sessions with a lot of famous speakers. Unfortunately, I was not able to attend all the sessions, so I decided to focus only on those that have a direct correlation with my work.

First, 2 "headache" sessions given by Paul White (aka @SQL_Kiwi) about the query optimizer and some internal stuff. The QO is definitely a very interesting topic and I'm always willing to discover more and more with guys like Paul to improve my skills.

Then, 2 sessions about In-Memory features with SQL Server 2016. In fact, I'm already aware of potential new features in the next SQL Server version, but attending a session given by Niko Neugebauer about columnstore and discussing upcoming features always adds a certain value for sure. Thanks Niko and Murilo Miranda for your sessions!

Finally, another "headache" session to finish the day, about batch mode and CPU architectures, given by Chris Adkin. We had a very deep-dive explanation about batch mode and how it improves performance with CPU savings.

 

Moreover, it was also the opportunity to meet some of my SQL Server MVP friends like Jean-Pierre Riehl and Florian Eiden ...

 

[Photo: the French team at SQL Saturday 369 (blog_45_-_1_-_french_team_sqlsat369)]

 

... and have a good dinner with the SQL Saturday staff and other speakers. A lot of countries represented here: Portugal, Germany, UK, New Zealand, France and probably others.

 

[Photo: dinner with the SQL Saturday 369 staff and speakers (blog_45_-_2_-_dinner_sqlsat369)]

 

A beautiful city, a good weather, a lot of very good speakers and a very good staff ... maybe the secret sauce of a successful SQL Server event!

I'm pretty sure that it will be the same for the next SQL Saturday in Paris, and I will be there (maybe as a speaker this time).

Indexing and Transparent Data Encryption Part I (The Secret Life of Arabia)

Richard Foote - Mon, 2015-05-18 23:42
Database security has been a really hot topic recently so I thought I might write a few posts in relation to indexing and Transparent Data Encryption (TDE) which is available as part of the Oracle Advanced Security option. To protect the database from unauthorized “back-door” accesses, the data within the actual database files can be encrypted. […]
Categories: DBA Blogs

First Impression for Evodesk Desktop Unboxing

Michael Dinh - Mon, 2015-05-18 18:53

Disclaimer: I am not being paid by anyone to write positive or negative review.

Opinions are my own based on my limited engineering background.

First, the packaging is somewhat poor and could be much better for a desk costing close to $1,000 ($886 for my configuration).

Tape coming off.

[Photo: tape coming off the packaging (badpackaging2)]

I hope my desktop is okay.

[Photo: damaged packaging (badpackaging1)]

Taking a look inside. Is that a tiny scratch I see?

[Photo: a look inside the packaging (badpackaging3)]

After opening the desktop, this is the torn location – not enough foam.

[Photo: torn foam inside the packaging (badpackaging4)]

Look at how much love I give it.

Desktop should be shipped in bubble wrap to prevent damage and scratch.

Cable Pass Through is way too small for 30” x 72”.

[Photo: the cable pass through (smallpass)]

Most standing desks I was looking at are 1 inch thick.

By no means is this best in class as Evodesk Desktop is 3/4 inch thin.

You won’t find this information anywhere in the Evodesk technical specification.

http://www.evodesk.com/media/desktop-diagrams.pdf

[Photo: desktop edge showing the 3/4 inch thickness (thin)]

This is the programmable controller.

Opened ziplock bag – was this a returned and repackaged unit?

[Photo: the controller (controller)]

My picture does not look as good as Evodesk’s – http://www.evodesk.com/standing-desks#posi-loc

I do like the Posi-Loc, and it was the final selling point.

Hope this is secure and does not spin.

[Photo: the Posi-Loc (posi-loc)]

Update:

It looks like Evodesk has updated the information for the desktop. Either that or I was blind as a bat the first go round.

Renew™ Desktops
  • 100% reclaimed/recycled wood composite desktop
  • EvoGuard™ durable & stylish non-VOC seamless coating
  • Soft comfort edges eliminate nerve compression and pressure fatigue
  • Corners are slightly rounded for improved safety and style
  • Oversized 3” x 6” Cable Pass Through
  • Pre-drilled for quick and easy setup
  • Available sizes: 48″ (30” x 48″ x .75”), 60″ (30” x 60” x .75”), 72″ (30” x 72” x .75”)
  • Meets California Air Resources Board’s (CARB 2) stringent emission standard
  • Backed by a no-nonsense 2-year limited warranty

Migrated rows

Jonathan Lewis - Mon, 2015-05-18 11:43

I received an email recently describing a problem with a query which was running a full tablescan but: “almost all the waits are on ‘db file sequential read’ and the disk read is 10 times the table blocks”.  Some further information supplied was that the tablespace was using ASSM and 16KB block size; the table had 272 columns (ouch!) and the Oracle version was 11.2.0.4.
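
As a quick aside (not part of the original email), the statistic that counts these extra row-piece fetches is 'table fetch continued row', so a first sanity check in a case like this is something along these lines:

select name, value
from   v$sysstat
where  name = 'table fetch continued row';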

In his researches he had read my article on wide rows, and had picked out of one of the comments the line: “the very bad thing about chained rows and direct reads that is that finding the rest of row by ‘db file sequential read’ is never cached”, but he wasn’t sure that this was the problem he was seeing so, very sensibly, he had re-run the query with extended tracing available, and dumped (and formatted/edited) a couple of blocks from the table.

He then sent me the trace file and block dump. Generally this is a mistake – especially when the trace file is several megabytes – but he had prepared the ground well and had linked it back to one of my blog notes, and I thought there might be an opportunity for publishing a few more comments, so I took a look. Here’s a carefully edited subset of the block dump – showing all the pertinent information:


Start dump data blocks tsn: 99 file#:100 minblk 2513181 maxblk 2513181

Block header dump:  0x1926591d
 Object id on Block? Y
 seg/obj: 0x1652a7  csc: 0x53.891880b8  itc: 12  flg: E  typ: 1 - DATA
     brn: 1  bdba: 0x1965b70c ver: 0x01 opc: 0
     inc: 84  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0010.01d.0000dff9  0x2b442286.3469.09  C---    0  scn 0x0053.891880b1
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x04   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x05   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x06   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x07   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x08   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x09   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x0a   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x0b   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
0x0c   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn 0x0000.00000000
bdba: 0x1926591d

data_block_dump,data header at 0x11083f154
===============
tsiz: 0x1ea8
hsiz: 0x26
pbl: 0x11083f154
     76543210
flag=--------
ntab=1
nrow=10
frre=-1
fsbo=0x26
fseo=0x4c5
avsp=0x49f
tosp=0x49f
0xe:pti[0]	nrow=10	offs=0
0x12:pri[0]	offs=0x1c15
0x14:pri[1]	offs=0x197b
0x16:pri[2]	offs=0x16e1
0x18:pri[3]	offs=0x1448
0x1a:pri[4]	offs=0x11b8
0x1c:pri[5]	offs=0xf1f
0x1e:pri[6]	offs=0xc85
0x20:pri[7]	offs=0x9ec
0x22:pri[8]	offs=0x752
0x24:pri[9]	offs=0x4c5
block_row_dump:
tab 0, row 0, @0x1c15
tl: 659 fb: -----L-- lb: 0x0  cc: 255
tab 0, row 1, @0x197b
tl: 666 fb: -----L-- lb: 0x0  cc: 255
tab 0, row 2, @0x16e1
tl: 666 fb: -----L-- lb: 0x0  cc: 255
tab 0, row 3, @0x1448
tl: 665 fb: -----L-- lb: 0x0  cc: 255
tab 0, row 4, @0x11b8
tl: 656 fb: -----L-- lb: 0x0  cc: 255
tab 0, row 5, @0xf1f
tl: 665 fb: -----L-- lb: 0x0  cc: 255
tab 0, row 7, @0x9ec
tl: 665 fb: -----L-- lb: 0x0  cc: 255
tab 0, row 8, @0x752
tl: 666 fb: -----L-- lb: 0x0  cc: 255
tab 0, row 9, @0x4c5
tl: 653 fb: -----L-- lb: 0x0  cc: 255

In the ITL you can see 10 entries with the flag set to “C—-” (committed) with no XID or SCN – that’s consistent with 10 rows migrating into the block in a single transaction. In the row directory you can see the block holds 10 rows, and in the body of the block you can see the header for each of those 10 rows with 255 columns (presumably the 2nd section of each row of 272 columns), and the flag bytes set to “—–L–” (the Last piece of a chained – as opposed to simply migrated – row).

So the block dump is consistent with the possibility that a direct path read of a block somewhere else (holding 10 head pieces) resulted in this block being read 10 times shortly afterwards. Can we find further corroboration in the trace file? The block dump was for block 0x1926591d = 421943581 decimal
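
If you want to double-check that arithmetic, here's a small sketch (my own cross-check, not part of the OP's evidence) that decodes the dba with dbms_utility – it should come back with relative file 100, block 2513181:

select
        dbms_utility.data_block_address_file(to_number('1926591D','XXXXXXXX'))  rel_file#,
        dbms_utility.data_block_address_block(to_number('1926591D','XXXXXXXX')) block#
from    dual;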


PARSE #4573135368:c=29,e=48,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4116693033,tim=79008343283418
EXEC #4573135368:c=53,e=93,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4116693033,tim=79008343283607
WAIT #4573135368: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=15477650 tim=79008343283636
WAIT #4573135368: nam='Disk file operations I/O' ela= 38 FileOperation=2 fileno=101 filetype=2 obj#=1462951 tim=79008343283973
WAIT #4573135368: nam='direct path read' ela= 8991 file number=100 first dba=947580 block cnt=13 obj#=1462951 tim=79008343293041

WAIT #4573135368: nam='db file sequential read' ela= 4934 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343298032
WAIT #4573135368: nam='db file sequential read' ela= 155 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343298216
WAIT #4573135368: nam='db file sequential read' ela= 127 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343298378
WAIT #4573135368: nam='db file sequential read' ela= 125 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343298526
WAIT #4573135368: nam='db file sequential read' ela= 128 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343298677
WAIT #4573135368: nam='db file sequential read' ela= 123 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343298826
WAIT #4573135368: nam='db file sequential read' ela= 134 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343298983
WAIT #4573135368: nam='db file sequential read' ela= 129 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343299135
WAIT #4573135368: nam='db file sequential read' ela= 180 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343299341
WAIT #4573135368: nam='db file sequential read' ela= 133 file#=100 block#=2513181 blocks=1 obj#=1462951 tim=79008343299497

WAIT #4573135368: nam='db file sequential read' ela= 11039 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343310565
WAIT #4573135368: nam='db file sequential read' ela= 133 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343310730
WAIT #4573135368: nam='db file sequential read' ela= 139 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343310895
WAIT #4573135368: nam='db file sequential read' ela= 124 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343311045
WAIT #4573135368: nam='db file sequential read' ela= 122 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343311190
WAIT #4573135368: nam='db file sequential read' ela= 127 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343311339
WAIT #4573135368: nam='db file sequential read' ela= 125 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343311490
WAIT #4573135368: nam='db file sequential read' ela= 134 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343311647
WAIT #4573135368: nam='db file sequential read' ela= 128 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343311797
WAIT #4573135368: nam='db file sequential read' ela= 124 file#=100 block#=2513245 blocks=1 obj#=1462951 tim=79008343311947

WAIT #4573135368: nam='db file sequential read' ela= 10592 file#=100 block#=2513309 blocks=1 obj#=1462951 tim=79008343322564
WAIT #4573135368: nam='db file sequential read' ela= 142 file#=100 block#=2513309 blocks=1 obj#=1462951 tim=79008343322740
WAIT #4573135368: nam='db file sequential read' ela= 126 file#=100 block#=2513309 blocks=1 obj#=1462951 tim=79008343322889

There are a couple of interesting details in this trace file.

First we note (as the OP said) there are very few direct path reads – but direct path reads can be asynchronous with several running concurrently, which means that we may report one direct path read while the data returned from others records no time. (You’ll have to take my word for the sparseness of direct path reads – there were 5 reading a total of 58 blocks from the object, compared to 50,000 db file sequential reads)

Then you can see that although each block that was subject to “db file sequential read” is reported 10 times, the first read is much slower than the subsequent ones – a fairly good indication that the later reads are coming from a cache somewhere. (The 50,000 reads consisted of roughly 5,300 blocks being read 10 times, 1,400 blocks being read 9 times, 460 blocks being read 8 times, and a few blocks being read 7 or fewer times.)

You might also notice the “coincidental” jump of 64 blocks between the sets of 10 reads – this appears fairly frequently, and it’s the type of pattern you might expect to see when a serial process is allocating blocks for use in a clean ASSM tablespace after the extent sizes have become fairly large (possibly the 64MB size that eventually appears with system managed extent sizes). There’s a “pseudo-random” choice of block within extent dictated by the process id, which spreads the work done by a single process steadily through the extent. Having filled 2513181, 2513245, 2513309 and so on for 16 steps the trace file comes back to 2513182, 2513246, 2513310 and so on.

It’s interesting (and time-consuming) to check the patterns but what we really need next, and don’t have, to check the theory is the set of 13 blocks dictated by the first direct path read:

WAIT #4573135368: nam='direct path read' ela= 8991 file number=100 first dba=947580 block cnt=13 obj#=1462951 tim=79008343293041

It’s likely that somewhere in the 13 blocks in the range 947580 onwards we would find the 10 row head pieces pointing to block 2513181; then the 10 row head pieces pointing to block 2513245, and so on – and I’d hope that we might see a pattern of many consecutive (or near-consecutive) rows in each originating block pointing to the same “next block”. In fact, with a few blocks in the early range, we might even get some idea of how the application was loading and updating data and be able to make some suggestions for changing the strategy to avoid row chaining.
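
If the OP can afford a full scan of the table, there's an old-fashioned way to get much of that picture without decoding any more raw blocks – a sketch only, with the hypothetical table name t1 standing in for the real segment (obj# 1462951), and the chained_rows repository created by utlchain.sql:

@?/rdbms/admin/utlchain.sql

analyze table t1 list chained rows into chained_rows;

select
        dbms_rowid.rowid_relative_fno(head_rowid)  rel_file#,
        dbms_rowid.rowid_block_number(head_rowid)  block#,
        count(*)                                   chained_heads
from    chained_rows
where   table_name = 'T1'
group by
        dbms_rowid.rowid_relative_fno(head_rowid),
        dbms_rowid.rowid_block_number(head_rowid)
order by
        chained_heads desc
;

If the counts show many chained head pieces per originating block that would be consistent with the pattern described above, and the head rowids themselves would tell us which blocks to dump next.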

Footnote

The OP also had a follow-up question which was: “One question for the block dump is why there is no hrid in it since the row pieces are the second row pieces and the flag bit is ‘-----L--’?”  It would be nice to see this, of course – then we wouldn’t need to see the 947580-947592 range to see what had been happening to the data – but that’s not the way Oracle works, as I’ve pointed out above; since the answer was in another posting of mine I simply emailed the relevant URL to the OP.


Contributions by Angela Golla,

Oracle Infogram - Mon, 2015-05-18 11:17
Contributions by Angela Golla, Infogram Deputy Editor

Profit Magazine for May
The May issue of Profit Magazine is available. It has a special 'Digital Disruption' report that includes 'The 4 Steps to a Successful Big Data Project' and more.   

Can you restore from a full online backup ?

Laurent Schneider - Mon, 2015-05-18 11:07

The question is not HOW TO DO IT but WHETHER YOU CAN DO IT !

A typical backup script would contain something like BACKUP DATABASE PLUS ARCHIVELOG:


backup database format
'/u99/backup/DB01/20150518/full_0_%d_s%s_p%p'
plus archivelog format
'/u99/backup/DB01/20150518/arc_%d_s%s_p%p';

Starting backup at 2015-05-18_18:27:55
current log archived
input archived log thread=1 sequence=469
...
piece handle=
/u99/backup/DB01/20150518/arc_DB01_s86_p1
Finished backup at 2015-05-18_18:27:58

Starting backup at 2015-05-18_18:27:58
input datafile file number=00002 name=
/u02/oradata/DB01/undots01.dbf
...
including current control file in backup set
including current SPFILE in backup set
piece handle=
/u99/backup/DB01/20150518/full_0_DB01_s88_p1
Finished backup at 2015-05-18_18:28:16

Starting backup at 2015-05-18_18:28:16
current log archived
input archived log thread=1 sequence=20
piece handle=
/u99/backup/DB01/20150518/arc_DB01_s89_p1
Finished backup at 2015-05-18_18:28:17

This sounds pretty safe, but what happens if you keep this full backup forever? Do you have everything you need in it to restore?

It depends. There is a chance that you can restore. To annoy the paranoiacs, here is a counter-example.

  1. you start your full backup at 6pm.
    backing up sequences 21,22,23,24
    backing up datafiles…

  2. at 7pm a cron job issues backup archivelog all delete input;
    backing up and deleting sequences 21,22,23,24,25,26,27,28,29,30

  3. at 8pm your backup is about to finish
    backing up sequence 31

Well, where is my sequence 27 ?

Let’s try


rm /tmp/arch_DB01*
startup force mount;
crosscheck backup;restore database;


Starting restore at 2015-05-18_18:47:45
channel ORA_DISK_1: restore complete, elapsed time: 00:02:05
Finished restore at 2015-05-18_18:49:51

Ok, the restore was fine. Now what?


RMAN> recover database;

Starting recover at 2015-05-18_18:50:35
using channel ORA_DISK_1

starting media recovery

RMAN-00571: =================================================
RMAN-00569: ========== ERROR MESSAGE STACK FOLLOWS ==========
RMAN-00571: =================================================
RMAN-03002: failure of recover command at 05/18/2015 18:50:36
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of archived log for thread 1 with sequence 30
RMAN-06025: no backup of archived log for thread 1 with sequence 29
RMAN-06025: no backup of archived log for thread 1 with sequence 27

RMAN> alter database open;

RMAN-00571: =================================================
RMAN-00569: ========== ERROR MESSAGE STACK FOLLOWS ==========
RMAN-00571: =================================================
RMAN-03002: failure of alter db command at 05/18/2015 18:51:29
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/u02/oradata/DB01/system01.dbf'

You have lost your database. Even though you did a full online backup plus archivelog, it did not contain all the archivelogs – and RMAN did not tell you.

There is more than one way to reduce this annoyance :

1) don’t back up archivelogs during the full backup
2) don’t delete archivelog backups done during a full backup
3) report any archivelog backups that run during a full backup, either by looking in the RMAN dictionary or in your own logfiles
4) do offline backups for long-term archives
5) don’t trust RMAN

And as well :
Test your backups 😀
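
And a sketch of one such test you can run without restoring anything (the sequence numbers are the ones from the example above – adapt them to your case): ask RMAN what it would actually use. RESTORE … PREVIEW only reads the repository, it does not touch the backup pieces, and comparing the archived log sequences it lists against the recovery window makes a gap like the missing sequence 27 visible before you need it.

RMAN> restore database preview summary;
RMAN> list backup of archivelog sequence between 21 and 31;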

Variations on 1M rows insert (3): using SQL Server In-Memory features

Yann Neuhaus - Mon, 2015-05-18 11:00

Let’s continue with this series about inserting 1M rows and perform the same test with a new variation: using SQL Server In-Memory features. For this blog post, I will still use a minimal configuration that consists of only 1 virtual Hyper-V machine with 1 processor and 512MB of memory. In addition, my storage includes VHDx disks placed on 2 separate SSDs (one INTEL SSDC2BW180A3L and one Samsung SSD 840 EVO). No special configuration has been performed on Hyper-V.

Let's begin with the creation script of my database DEMO:

 

CREATE DATABASE [demo]
ON PRIMARY
( NAME = N'demo_data', FILENAME = N'E:\SQLSERVER\demo_data.mdf', SIZE = 1048576KB, MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
FILEGROUP [demo_hk_grp] CONTAINS MEMORY_OPTIMIZED_DATA DEFAULT
( NAME = N'demo_hk', FILENAME = N'E:\SQLSERVER\HK', MAXSIZE = UNLIMITED )
LOG ON
( NAME = N'demo_log', FILENAME = N'F:\SQLSERVER\demo_hk_log.ldf', SIZE = 1395776KB, MAXSIZE = 2048GB, FILEGROWTH = 10% )
GO

ALTER DATABASE [demo] SET COMPATIBILITY_LEVEL = 120
GO

ALTER DATABASE [demo] SET RECOVERY SIMPLE;
GO

 

Next, the creation script of all user objects, which includes:

- 2 disk-based tables: DEMO_DB_PK (with a clustered primary key) and DEMO_DB_HP (a heap table)
- 2 In-Memory optimized tables: DEMO_HK_SCH_DATA (data are persisted) and DEMO_HK_SCH (only the schema is persisted)

 

CREATE TABLE [dbo].[DEMO_DB_PK]
(
       [id] [int] NOT NULL primary key,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL
)

CREATE TABLE [dbo].[DEMO_DB_HP]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL
)

CREATE TABLE [dbo].[DEMO_HK_SCH_DATA]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL,

       PRIMARY KEY NONCLUSTERED HASH ( [id] ) WITH ( BUCKET_COUNT = 2097152 )
) WITH ( MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_AND_DATA )
GO

CREATE TABLE [dbo].[DEMO_HK_SCH]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL,

       PRIMARY KEY NONCLUSTERED HASH ( [id] ) WITH ( BUCKET_COUNT = 2097152 )
) WITH ( MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_ONLY )
GO

 

Finally, the last script creates 7 stored procedures in order to test the different cases:

- sp_demo_insert_demo_db_hp: insert 1M rows inside a disk-based heap table
- sp_demo_insert_demo_db_pk: insert 1M rows inside a disk-based clustered table
- sp_demo_insert_demo_hk_sch_data: insert 1M rows inside an In-Memory optimized table in INTEROP with data persisted on disk
- sp_demo_insert_demo_hk_sch: insert 1M rows inside an In-Memory optimized table in INTEROP with only the schema persisted on disk
- sp_demo_insert_demo_hk_sch_data_cp: insert 1M rows inside an In-Memory optimized table in NATIVE with data persisted on disk and full durability
- sp_demo_insert_demo_hk_sch_data_cp_d: insert 1M rows inside an In-Memory optimized table in NATIVE with data persisted on disk and delayed durability enabled
- sp_demo_insert_demo_hk_sch_cp: insert 1M rows inside an In-Memory optimized table in NATIVE with only the schema persisted on disk

Just as a reminder, INTEROP procedures allow using both disk-based and In-Memory optimized tables, whereas NATIVE (natively compiled) procedures don't support disk-based tables. However, using the latter is very interesting from a performance perspective, because it drastically improves execution time.

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_db_hp]
AS

SET NOCOUNT ON;

DECLARE @i INT = 1;

WHILE @i <= 1000000
BEGIN
       INSERT INTO dbo.DEMO_DB_HP
       VALUES (@i,
               CASE ROUND(RAND() * 10, 0)
                    WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                    WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                    ELSE 'Tracy'
               END,
               RAND() * 10000);

       SET @i += 1;
END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_db_pk]
AS

SET NOCOUNT ON;

DECLARE @i INT = 1;

WHILE @i <= 1000000
BEGIN
       INSERT INTO dbo.DEMO_DB_PK
       VALUES (@i,
               CASE ROUND(RAND() * 10, 0)
                    WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                    WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                    ELSE 'Tracy'
               END,
               RAND() * 10000);

       SET @i += 1;
END
GO

 

create procedure [dbo].[sp_demo_insert_demo_hk_sch_data]
AS

SET NOCOUNT ON;

DECLARE @i INT = 1;

WHILE @i <= 1000000
BEGIN
       INSERT INTO dbo.DEMO_HK_SCH_DATA
       VALUES (@i,
               CASE ROUND(RAND() * 10, 0)
                    WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                    WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                    ELSE 'Tracy'
               END,
               RAND() * 10000);

       SET @i += 1;
END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_hk_sch]
AS

SET NOCOUNT ON;

DECLARE @i INT = 1;

WHILE @i <= 1000000
BEGIN
       INSERT INTO dbo.DEMO_HK_SCH
       VALUES (@i,
               CASE ROUND(RAND() * 10, 0)
                    WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                    WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                    ELSE 'Tracy'
               END,
               RAND() * 10000);

       SET @i += 1;
END
GO

 

Note that for the following natively compiled stored procedures, I rewrote one portion of the code, because the CASE statement is not supported inside natively compiled stored procedures with SQL Server 2014.

 

create procedure [dbo].[sp_demo_insert_demo_hk_sch_data_cp]
with native_compilation, schemabinding, execute as owner
as
begin atomic with ( transaction isolation level = snapshot, language = N'us_english' )

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1
             SET @name = 'Marc'
       ELSE IF @test_case = 2
             SET @name = 'Bill'
       ELSE IF @test_case = 3
             SET @name = 'George'
       ELSE IF @test_case = 4
             SET @name = 'Eliot'
       ELSE IF @test_case = 5
             SET @name = 'Matt'
       ELSE IF @test_case = 6
             SET @name = 'Trey'
       ELSE
             SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
             INSERT INTO [dbo].[DEMO_HK_SCH_DATA] VALUES (@i, @name, RAND() * 10000);
             SET @i += 1;
       END
end
go

 

create procedure [dbo].[sp_demo_insert_demo_hk_sch_cp]
with native_compilation, schemabinding, execute as owner
as
begin atomic with ( transaction isolation level = snapshot, language = N'us_english' )

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1
             SET @name = 'Marc'
       ELSE IF @test_case = 2
             SET @name = 'Bill'
       ELSE IF @test_case = 3
             SET @name = 'George'
       ELSE IF @test_case = 4
             SET @name = 'Eliot'
       ELSE IF @test_case = 5
             SET @name = 'Matt'
       ELSE IF @test_case = 6
             SET @name = 'Trey'
       ELSE
             SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
             INSERT INTO [dbo].[DEMO_HK_SCH] VALUES (@i, @name, RAND() * 10000);
             SET @i += 1;
       END
end
go

 

create procedure [dbo].[sp_demo_insert_demo_hk_sch_data_cp_d]
with native_compilation, schemabinding, execute as owner
as
begin atomic with ( transaction isolation level = snapshot, language = N'us_english', delayed_durability = on )

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1
             SET @name = 'Marc'
       ELSE IF @test_case = 2
             SET @name = 'Bill'
       ELSE IF @test_case = 3
             SET @name = 'George'
       ELSE IF @test_case = 4
             SET @name = 'Eliot'
       ELSE IF @test_case = 5
             SET @name = 'Matt'
       ELSE IF @test_case = 6
             SET @name = 'Trey'
       ELSE
             SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
             INSERT INTO [dbo].[DEMO_HK_SCH_DATA] VALUES (@i, @name, RAND() * 10000);
             SET @i += 1;
       END
end
GO

 

OK, it's time to run the different test scenarios. Below are the results and the related wait statistics I found on my environment:

 

blog_44_-_1_-_results_

 

blog_44_-_2_-_wait_stats_
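
For context, here is a minimal sketch of the kind of harness that could produce such numbers – an assumption on my part, not necessarily the exact method used for the figures above. It snapshots the instance-wide wait statistics around a single run (so it is only meaningful on an otherwise idle instance) and lets SET STATISTICS TIME report the CPU and elapsed time:

SELECT wait_type, waiting_tasks_count, wait_time_ms
INTO #waits_before
FROM sys.dm_os_wait_stats;

SET STATISTICS TIME ON;
EXEC dbo.sp_demo_insert_demo_hk_sch_data_cp;
SET STATISTICS TIME OFF;

-- waits accumulated during the run (instance-wide, hence the idle-instance caveat)
SELECT w.wait_type,
       w.waiting_tasks_count - b.waiting_tasks_count AS delta_tasks,
       w.wait_time_ms - b.wait_time_ms AS delta_wait_ms
FROM sys.dm_os_wait_stats AS w
JOIN #waits_before AS b ON b.wait_type = w.wait_type
WHERE w.wait_time_ms - b.wait_time_ms > 0
ORDER BY delta_wait_ms DESC;

DROP TABLE #waits_before;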

 

First of all, let's notice that using In-Memory optimized tables in INTEROP mode does not seem to improve the overall performance in all cases. Indeed, we still see pretty much the same duration and the same CPU time when writing to the transaction log and checkpoint files for the In-Memory optimized tables with data persisted on disk – SCHEMA_AND_DATA.

Next, using In-Memory tables with only the schema persisted on disk – SCHEMA_ONLY – contributes to better results (a performance gain of roughly 2x), as we might expect. Indeed, the WRITELOG wait type has completely disappeared because data are not persisted in this case, which considerably reduces the number of records written to the transaction log.

Finally, let's have a look at the tests concerning natively compiled stored procedures. Compared to the first tests (either for disk-based tables or for INTEROP), we can see that natively compiled stored procedures drastically reduce CPU consumption (by roughly 97% in the best case). So inserting 1M rows is very fast in this case.

 

Moreover, if we focus only on the results for In-Memory optimized tables with different durability (SCHEMA_AND_DATA and SCHEMA_ONLY), we may notice that using delayed transaction durability may help. Once again, persisting data by writing to the transaction log and checkpoint files seems to slow down the insert process.
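
As a side note – this is an assumption about the test environment on my part, it is not shown in the scripts above – the delayed_durability = on option in the atomic block only takes effect if the database permits it: with the default DISABLED setting it is silently ignored, and with FORCED every commit is delayed regardless of what the procedure asks for. A minimal sketch:

ALTER DATABASE [demo] SET DELAYED_DURABILITY = ALLOWED;
GO

-- flush any log records still buffered by delayed durability on demand
EXEC sys.sp_flush_log;
GO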

Some wait types still remain, such as PREEMPTIVE_OS_CREATEFILE, PREEMPTIVE_OS_CLOSEHANDLE and PREEMPTIVE_OS_FILEOPS, and I will probably focus on them later. At this point, I would suspect misconfigured storage or maybe my Hyper-V settings, but I have to verify this point. I already double-checked that I enabled instant file initialization according to the Microsoft documentation and also disabled some other features like 8.3 names, file indexing and last-modification-date tracking. So I will come back soon when I have more information.

But anyway, for the moment we get a new time reference here: 2'59'' for disk-based tables against 778 ms for In-Memory optimized tables + natively compiled stored procedures in the best scenario, which suggests that with In-Memory optimized tables we may get a huge performance improvement, depending on the scenario.

So stay connected and see you soon for the next story!