
Yann Neuhaus


SQL Server 2016: In-Memory OLTP enhancement

Fri, 2015-07-17 05:19

The CTP 2.1 has been released for a few weeks now, so it is time to check which interesting enhancements have already been shipped for the In-Memory OLTP feature.
I will first have a look at memory-optimized tables.

Memory Optimized table

I have a Memory Optimized table named command_queue_HK and I will add a column to this table. The table is saved on my disk in an XTP folder with its corresponding files:

b2ap3_thumbnail_In-memory1.jpg

I execute the following command:

b2ap3_thumbnail_In-memory2.jpg
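The screenshot shows the ALTER TABLE statement I ran; as a hedged illustration (the column name and type below are hypothetical, not the ones from the screenshot), it is simply something like:

-- hypothetical sketch: add a column to the memory-optimized table
alter table dbo.command_queue_HK
    add processed_date datetime2 null;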

If I have a look at my XTP folder, I see that I have six new files:

b2ap3_thumbnail_In-memory3.jpg

Does it mean that each time I run an ALTER TABLE I get six new files? Let's try another ALTER TABLE:

b2ap3_thumbnail_In-memory4.jpg

Go back to my XTP container:

b2ap3_thumbnail_In-memory5.jpg

I have six new files for my last ALTER. Each time you run an ALTER TABLE, you create six new files corresponding to the new structure of your table… which could be really disk consuming if you alter your tables a lot… something that hopefully should not happen too often.
For the time being I cannot find a process which cleans up the obsolete files, but when you restart your SQL Server engine this cleanup runs: it deletes all the files corresponding to your tables and recreates just six for the current structure:

b2ap3_thumbnail_In-memory6.jpg

This ALTER TABLE is an offline process and needs twice the size of the table in memory… don't forget it.

It is also possible now to use Non-BIN2 collations in index key columns:

b2ap3_thumbnail_In-memory7.jpg
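As a minimal sketch of what this enhancement allows (table, column and index names below are mine, not the ones from the screenshot), an index key column can now use a regular non-BIN2 collation:

-- hypothetical example: non-BIN2 collation on an index key column of a memory-optimized table
create table dbo.demo_collation_HK
(
    id   int not null primary key nonclustered hash with (bucket_count = 1024),
    name nvarchar(50) collate Latin1_General_CI_AS not null,
    index ix_name nonclustered (name)
)
with (memory_optimized = on, durability = schema_and_data);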

Some functionalities are not yet available (non-exhaustive list).

Foreign Keys for Memory Optimized table:

b2ap3_thumbnail_In-memory8.jpg

Check constraint:

b2ap3_thumbnail_In-memory9.jpg

Natively Compiled Stored Procedure

For Natively Compiled Stored Procedures, we also have some enhancements and, as for Memory Optimized tables, the first one is the possibility to alter them.
If I check my container folder, I see the six files for my Natively Compiled Stored Procedure:

b2ap3_thumbnail_In-memory10.jpg

If I right-click on my stored procedure in Management Studio, I see that I have two possibilities to update it:

  • select Modify
  • go through “Script Stored Procedure as”, “ALTER to…”, select where to script it

Once my Stored Procedure is scripted, I can make the modifications I need and execute the code:

b2ap3_thumbnail_In-memory11.jpg
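As a hedged sketch of what the executed script looks like (the procedure name and body below are illustrative, not the ones from the screenshot), the point is simply that ALTER PROCEDURE now works with NATIVE_COMPILATION:

-- hypothetical sketch: altering a natively compiled stored procedure
alter procedure dbo.usp_command_queue_process
with native_compilation, schemabinding
as
begin atomic with (transaction isolation level = snapshot, language = N'us_english')
    declare @cnt int;
    select @cnt = count(*) from dbo.command_queue_HK;
end;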


Once the code is executed, I don't get six additional files as for tables, but directly six new files, and just six:

b2ap3_thumbnail_In-memory12.jpg

Executing a function in a Natively Compiled Stored Procedure is also available, but only with natively compiled modules:

b2ap3_thumbnail_In-memory13.jpg
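Here is a minimal, hedged sketch of the pattern (names and logic are mine, not from the screenshot): a natively compiled scalar function called from a natively compiled procedure:

-- hypothetical example: natively compiled function called from a natively compiled procedure
create function dbo.udf_double (@value int)
returns int
with native_compilation, schemabinding
as
begin atomic with (transaction isolation level = snapshot, language = N'us_english')
    return @value * 2;
end;
go

create procedure dbo.usp_use_udf
with native_compilation, schemabinding
as
begin atomic with (transaction isolation level = snapshot, language = N'us_english')
    declare @result int;
    select @result = dbo.udf_double(21);
end;
go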

It does not work if we try to call a nested stored procedure.

As for Memory Optimized tables, some functionalities are not yet available in this CTP 2.1.

Subqueries are also not available for the time being:

b2ap3_thumbnail_In-memory14.jpg

There are interesting enhancements for In-Memory OLTP already available with this CTP 2.1, like ALTER TABLE or ALTER for Natively Compiled Stored Procedures, but some important ones, like foreign keys or check constraints for In-Memory tables, are still missing. We will have to wait for the RTM version to know which caveats remain.

Dealing with SSRS subscription schedules in AlwaysOn environment

Thu, 2015-07-16 11:00

A couple of days ago, I had the opportunity to work with an AlwaysOn 2014 infrastructure that included report servers (SSRS) in native and scale-out mode. As you know, dealing with SSRS and availability groups is not an easy task, depending on the features used. This is particularly true when schedules and subscriptions are used by customers.

Here’s a brief description of the concerned architecture:

 

blog_57-_0_-_infra_aag_ssrs

 

2 SSRS servers in scale-out mode and an availability group with 2 replicas. The corresponding availability group configuration is shown below:

 

blog_57-_01_-_config_aagjpg

 

First of all, schedules are tied to SQL Server agent jobs and according to the BOL, there are some precautions to take in this case:

  • A schedule is created on the primary replica and is not automatically replicated to the other secondaries. In the same manner, a deleted schedule will not be replicated to other secondaries
  • After the database failover completes and after the report server service is re-started, SQL Server Agent jobs will be re-created automatically. Until the SQL agent jobs are recreated, any background executions associated with SQL Server Agent jobs will not be processed. This includes Reporting Services subscriptions, schedules, and snapshots

These two points introduce some challenges in the management stuff for the following reasons:

We must detect an availability group failover event in order to take the corresponding actions:

  • Disable the concerned SQL Server agent jobs on the secondaries to avoid monitoring false errors
  • Enable the SSRS agents jobs on the primary
  • Ensure that all of the SSRS jobs are synchronized between replicas when they are used as primary (schedules can be created or deleted between failovers)


How can we proceed in this case?

In fact there are different ways to meet these requirements, and you can probably find some of them on the internet. For my part, I decided to choose a solution based on our DMK monitoring tool for AlwaysOn that consists of:

  • A stored procedure that detects the availability group failover events. It will run on each replica on scheduled basis and will return the availability groups related to a failover event and also their new primary replica.
  • A SQL Server agent job that gets information from the above stored procedure and performs the necessary actions described below:
  • Disable the SSRS scheduling jobs if the replica’s new role is secondary
  • Drop the SSRS scheduling jobs and then restart one of the SSRS Server in the farm in order to resynchronize all the schedule jobs on the new replica (PowerShell script)

At this point, I just want to point out that the executed scheduling jobs may fail in a transitioning state according to BOL. Take a look at the following points:

  • The execution of background processing may occur more than once due to retry logic and the inability of the report server to mark scheduled work as completed during the failover period
  • The execution of background processing that would have normally been triggered to run during the period of the failover will not occur because SQL Server Agent will not be able to write data into the report server database and this data will not be synchronized to the new primary replica.


The DMK stored procedure

This stored procedure is part of our DMK AlwaysOn monitoring toolkit. When a failover is triggered for an availability group, it gives as output the concerned availability group and its new primary replica:

 

 

:connect sql161
exec dbi_tools.[dbo].[dbi_alwayson_failover_detect]
go

:connect sql162
exec dbi_tools.[dbo].[dbi_alwayson_failover_detect]
go

:connect sql163
exec dbi_tools.[dbo].[dbi_alwayson_failover_detect]

 

blog_57-_1_-_dmk_monitoring_aag_script


 

After the first detection of an availability group failover, the stored procedure provides no output on subsequent executions, in order to avoid false positives:

 

blog_57-_2_-_dmk_monitoring_aag_script_2

 

PowerShell script:

The PowerShell script automatically gets the availability group the ReportServer database is a member of. Then it detects whether a failover has occurred for the concerned availability group and performs the extra steps needed to update the reporting environment so that it correctly uses the new primary replica.

Moreover, note that we get the report server names of the scale-out infrastructure from the dbo.Keys table in the ReportServer database. Then we try to restart the first report server in the farm and we stop if it succeeds (we only need to restart one of the report servers in order to recreate all of the SSRS jobs). If the restart fails, we move on to the next server and try to restart it, and so on…

 

<#
.SYNOPSIS
    Restart a SSRS report server in the farm if a failover is detected for the availability group that concerns the ReportServer database
.DESCRIPTION
    You must configure two parameters
    $replica : replica name where the script is running
.EXAMPLE
#>

$replica = "replica-name";

# Get the availability group that includes the ReportServer database
$aag_ssrs = Invoke-Sqlcmd "SELECT DISTINCT
                                  g.name as aag_name
                           FROM sys.availability_groups as g
                           JOIN sys.dm_hadr_database_replica_states as rs
                                  ON g.group_id = rs.group_id
                           JOIN sys.databases AS d
                                  ON rs.database_id = d.database_id
                           WHERE d.name = N'ReportServer';" -ServerInstance $replica;

$restart_state = 0;

# Detect if a failover has occured
$aags_failover = Invoke-Sqlcmd "EXEC dbi_tools.dbo.dbi_alwayson_failover_detect" -ServerInstance $replica;

Foreach ($aag in $aags_failover)
{
    If (![string]::IsNullOrEmpty($aag.group_name) -and ![string]::IsNullOrEmpty($aag_ssrs.aag_name))
    {
        If ($aag_ssrs.aag_name -eq $aag.group_name)
        {
            #Write-Host "SSRS availability group failover detected";

            # If the concerned replica is the new primary --> Get SSRS report servers in the farm (Scale-Out)
            $ssrs_servers = Invoke-Sqlcmd "IF (SELECT sys.fn_hadr_is_primary_replica(N'ReportServer')) = 1 SELECT MachineName FROM [ReportServer].[dbo].[Keys] WHERE MachineName IS NOT NULL" -ServerInstance $replica;

            If ($ssrs_servers.ItemArray.Count -gt 0)
            {
                # Delete all SSRS agent jobs (deleted subscriptions are not automatically replicated on the secondaries)
                Invoke-Sqlcmd "DECLARE @sql NVARCHAR(MAX) = N'';
                               SELECT @sql = @sql + 'EXEC msdb.dbo.sp_delete_job @job_name = N''' + j.name + '''' + CHAR(13)
                               FROM msdb.dbo.sysjobs AS j
                               JOIN msdb.dbo.syscategories AS c
                                      ON j.category_id = c.category_id
                               WHERE c.name = N'Report Server';
                               EXEC sp_executesql @sql;" -ServerInstance $replica;

                Foreach ($ssrs in $ssrs_servers)
                {
                    try
                    {
                        #Write-Host "Restart ssrs server : " $ssrs.MachineName;

                        # Restart one SSRS report server in the farm (scale-out) in order to recreate subscription jobs
                        Get-Service -Computer $ssrs.MachineName -Name "ReportServer" | Restart-Service;
                        Break;
                    }
                    catch [Microsoft.PowerShell.Commands.ServiceCommandException]
                    {
                        $pass = 0;
                        #Write-Host "Error during the service restart of the report server " $ssrs.MachineName;
                    }
                }
            }
        }
    }
}

 

The T-SQL script for disabling / enabling SSRS subscription jobs

This T-SQL script will be used to either enable or disable the SSRS jobs, depending on the replica context. It uses the sys.fn_hadr_is_primary_replica() DMF to determine whether the current replica is the primary for the concerned database.

 

DECLARE @sql NVARCHAR(MAX) = '';
DECLARE @state BIT = COALESCE((SELECT sys.fn_hadr_is_primary_replica(N'ReportServer')), 0);

SELECT @sql = @sql + 'IF NOT EXISTS (SELECT 1 FROM msdb.dbo.sysjobs WHERE name = N''' + j.name + ''' AND enabled = ' + CAST(@state AS CHAR(1)) + ')
                          EXEC sp_update_job @job_name = N''' + j.name + ''', @enabled = ' + CAST(@state AS CHAR(1)) + '' + CHAR(13)
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.syscategories AS c
       ON j.category_id = c.category_id
WHERE c.name = N'Report Server'

--PRINT @sql;

EXEC sp_executesql @sql;

 

Let me show you the scenario:

1- First, we have two schedules on the SSRS report server as follows …

 

blog_57-_30_-_ssrs_schedules

 

… with 2 associated SQL Server agent jobs on the primary replica (SQL161). You may also notice that the jobs have not been replicated on the secondary (SQL162) yet.

From the SQL Server side, I created the ReportServerSubscriptionControl job, which includes two steps with the above scripts. It's up to you to configure the schedule.
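For reference, such a job could be created with plain msdb procedures; the sketch below is only an assumption of the setup (names and step commands are placeholders to be replaced by the scripts shown in this post):

USE msdb;
GO

EXEC dbo.sp_add_job @job_name = N'ReportServerSubscriptionControl';

-- step 1: enable / disable the SSRS subscription jobs depending on the replica role
EXEC dbo.sp_add_jobstep
    @job_name  = N'ReportServerSubscriptionControl',
    @step_name = N'Enable or disable SSRS jobs',
    @subsystem = N'TSQL',
    @command   = N'/* T-SQL script from this post */',
    @on_success_action = 3;   -- go to the next step

-- step 2: restart one SSRS server of the farm when a failover is detected
EXEC dbo.sp_add_jobstep
    @job_name  = N'ReportServerSubscriptionControl',
    @step_name = N'Restart SSRS on failover',
    @subsystem = N'PowerShell',
    @command   = N'# PowerShell script from this post';

EXEC dbo.sp_add_jobserver @job_name = N'ReportServerSubscriptionControl';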

 

blog_57-_4_-_config_sql_job

 

We begin with an initial configuration where SQL161 is the primary replica and SQL162 the secondary

 

blog_57-_31_-_aag_case_1

 

2- A failover has occurred and the availability group has switched from SQL161 to SQL162, which is the new primary. After the ReportServerSubscriptionControl job has been executed, the configuration has changed as shown in the next picture:

 

blog_57-_32_-_aag_case_2

 

 

SQL162 is the new primary replica and all of the SSRS jobs are enabled and synchronized on it. Likewise, SQL161 becomes the new secondary and all of its SSRS jobs are now disabled.

 

Note that in this blog post I voluntarily bypassed the security configuration, because it depends mainly on the customer context and the security rules in place. For instance, in my scenario, I used a proxy account for executing some steps of the ReportServerSubscriptionControl job with the PowerShell subsystem. This proxy account is tied to a Windows credential with the minimum privileges required to both read the necessary information from the SQL Server instance itself and remotely and safely restart the SSRS service.

Hope this helps, and please feel free to contact me if you have any questions.

 

 

 

 

 


How to solve accent sensitive problems?

Thu, 2015-07-16 02:45

A few days ago, one of my customers complained that searching for “Muller” didn't return “Mueller” and “Müller” as well!
This is typically an issue due to the collation of the SQL Server database, but how can we resolve this problem?
The collation of the database is Latin1_General_CI_AS, so a case-insensitive and accent-sensitive collation.
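To check this on your side, a quick way (assuming the table used below is dbo.Person, as in this example) is to look at the database and column collations:

-- check the collation at database and column level
select databasepropertyex(db_name(), 'Collation') as database_collation;

select name, collation_name
from sys.columns
where object_id = object_id('dbo.Person');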

If I run the following query:

Select name from Person where name like 'muller'

 



I get only “Muller”, which is normal as I use an accent-sensitive collation, so u is not the same as ü.
Next, if I execute the same query by using an accent insensitive collation:

Select name from Person where name like 'muller' collate Latin1_GENERAL_CI_AI

I have as result:


This time, “Muller” and “Müller” are retrieved as my collation is fully insensitive. For a Latin1_General AI (Accent Insensitive) collation, u = ü, o = ö, a = ä…
But I still don't get “Mueller”, which is the way to write “Müller” without a ü in German.
So I decided to use a German collation to see if it could solve my issue by returning my three forms of “Muller”. In this phonebook collation, ü is sorted like ue, ä like ae…

Select name from Person where name like 'muller' collate German_PhoneBook_CI_AI

 



As expected, I got just “Muller”, which is quite normal as, in German, “Muller” is not the same name as “Müller”…
Let’s try with:

Select name from Person where name like 'müller' collate German_PhoneBook_CI_AI

 



This result is consistent with German usage, where “Mueller” and “Müller” are the same name. But I still cannot get my three forms of “Muller”…

Getting the result expected by my customer seems an impossible task by just changing the column collation.

Another possibility is to use the SOUNDEX string function. This function converts an alphanumeric string to a four-character code based on how the string sounds when spoken. So, let’s try with those queries:

select * from Person where soundex(Name) like soundex('muller')

select soundex('muller'), soundex('müller'), soundex('mueller')

 



This string function was able to retrieve all forms of “Muller” without any collation change: all versions of “Muller” are converted to the same SOUNDEX code. The only problem is that index usage is not guaranteed with this function.
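One possible workaround for that last point, which is not part of the original test and is only a sketch, is to persist the SOUNDEX code in a computed column and index it, so the search can seek instead of scanning:

-- hypothetical workaround: index a persisted SOUNDEX computed column
alter table dbo.Person add name_soundex as soundex(name) persisted;

create nonclustered index ix_Person_name_soundex on dbo.Person (name_soundex);

select name
from dbo.Person
where name_soundex = soundex('muller');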

Finally, I took a look at the FullText catalog feature, which can be accent insensitive, combined with a FullText index using the German language:

CREATE FULLTEXT CATALOG FTSCatalog WITH ACCENT_SENSITIVITY=OFF AS DEFAULT

CREATE FULLTEXT INDEX ON Person(name LANGUAGE German) KEY INDEX PK_Person

GO

 

Then I used the following queries, based on the CONTAINS clause and the FORMSOF predicate with the INFLECTIONAL option, for my different forms of Muller:

Select name from Person where contains(name,'FORMSOF(INFLECTIONAL,"muller")')

Select name from Person where contains(name,'FORMSOF(INFLECTIONAL,"müller")')

Select name from Person where contains(name,'FORMSOF(INFLECTIONAL,"mueller")')

 



As expected, the result is consistent with the previous ones: we don't get all the forms when searching for “muller”. On the contrary, searching for “müller” or “Mueller” gives me all the results.

In conclusion, the FullText capabilities of SQL Server are certainly the best solution: they will also be faster with a huge number of rows and they avoid changing the collation, which can sometimes be a real nightmare. But we have to search for “Müller” or “Mueller” instead of “muller” to retrieve all the expected results.

New search capabilities in SharePoint Server 2013-2016

Wed, 2015-07-15 07:32


For the ones who have not yet migrated their environment to SharePoint 2013, or for the novices, here is an article in which we will discover the new search capabilities of SharePoint Server 2013.
We will have an overview of:

  • Search User interface
  • Crawling
  • Structure
  • Index & Search Schema
  • Health reports
  • Search architecture

 

What is the SP Search tool?

SharePoint contains a search technology which combines advanced search and analytics features. This feature is highly customizable. The content of documents (including PDFs) is searched.

FUNCTIONAL DIAGRAM

search

 

Search user interface

Users can quickly identify useful results in ways such as the following:

  • Users can just move the pointer over a result to preview it.

  • Results can be distinguished based on their type. The application icon is placed in front of the title of the search result. Lync availability and the person's picture are shown in the results.

  • Certain types of related results are displayed in groups called result blocks. A result block contains a small subset of results that are related in a particular way. For example, results that are Excel documents appear in a result block when searching for terms like the word "budget".

The search tool helps users return to a previous search because the system keeps a search history.

Site collection administrators and site owners can specify display templates that determine how result types appear.

alt

Crawling improvements

In SharePoint Server 2013, you can configure crawl schedules for SharePoint content sources so that crawls are performed continuously. Setting this option eliminates the need to schedule incremental crawls and automatically starts crawls as necessary to keep the search index fresh. Administrators should still configure full crawls as necessary.

For more information, see on TechNet site: Manage continuous crawls in SharePoint Server 2013.

Index and Search schema

By defining crawled properties, managed properties, and the mappings between them, the search schema determines how the properties of crawled content are saved to the search index.
The search index stores the contents of the managed properties. The attributes of the managed properties determine the search index structure.

SharePoint Server 2013 introduces new attributes that you can apply to managed properties, such as sortable and refinable. The sortable attribute reduces the time that is required to return large search result sets by sorting results before they are returned. The refinable attribute enables you to create a refiner based on a particular managed property.

In SharePoint Server 2013, you can have multiple search schemas. The main search schema is defined at the Search service application level. Site collection administrators can create customized search schemas for different site collections.

For more information, see on TechNet site: Manage the search schema in SharePoint Server 2013.

Health reports

SharePoint Server 2013 provides many query health reports and crawl health reports. In SharePoint Server 2010 and FAST Search Server 2010 for SharePoint, similar reports were called Search Administration Reports. For more information, see View search diagnostics in SharePoint Server 2013.

Search architecture

SharePoint Server 2013 introduces a new search architecture that includes significant changes and additions to the search components and databases. For examples and more information, see the Search technical diagrams in Technical diagrams for SharePoint 2013.


All the SharePoint 2013 improvements are kept in SharePoint 2016; actually, the first information given by Microsoft about search in SharePoint 2016 concerns Delve:

Search with Delve

SharePoint 2016 will include search with the Delve app.

What is Delve?

Delve is a new way of searching and presenting content based on the user's interests. Delve's roots go back to Office 365.

Delve can present information from Exchange, OneDrive for Business, SharePoint Online and Yammer based on user’s interactions.
For more information, please have a look on this reference: https://support.office.com/en-us/article/What-is-Office-Delve--1315665a-c6af-4409-a28d-49f8916878ca?ui=en-US&rs=en-US&ad=US

Conclusion

Search can sometimes easily become a nightmare, so please use the best practices related to any search, whether for people, objects, information, or needles in a haystack... With the right information and the right settings, we always seem to find what we are looking for.

Image-1_4

 

Best practices for organizing content for search in SharePoint Server 2013: https://technet.microsoft.com/en-us/library/jj683124.aspx

Plan search in SharePoint Server 2013: https://technet.microsoft.com/en-us/library/cc263400.aspx

 

 

 


Don't use the default database file autogrow settings !

Mon, 2015-07-13 13:37

I often notice default values for the database file autogrow settings during my audits. You don't trust me? Let me tell you a funny story about an incredible (and extreme) situation concerning a transaction log file and its default settings.

First of all, during my intervention, my customer complained about some performance issues, so I decided to take a look at different areas of their instance.

I found some interesting records inside the SQL Server error log file as follows:

 

blog_56_-_2_-_timeout_latch_log_manager_2_dump

 

A timeout occurred while waiting for the latch class LOG_MANAGER… Ouch, it was the first time I faced this issue, and after taking a look at the latch stats, it was confirmed that an issue existed in this area:

 

blog_56_-_3_-_latch_stats
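The kind of query behind this output is probably something like the following sketch based on sys.dm_os_latch_stats (this is an assumption; the exact query used is not shown in the post):

-- top latch classes by cumulated wait time
select top (10)
       latch_class,
       waiting_requests_count,
       wait_time_ms,
       max_wait_time_ms
from sys.dm_os_latch_stats
order by wait_time_ms desc;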

 

Unfortunately, the Microsoft documentation is very poor concerning the description of this latch type.

 

blog_56_-_4_-_LOG_MANAGER_desc

 

But Paul Randal gives some clues on his very useful blog here. This latch may appear when the transaction log file is growing (whatever the reason). After some investigation, I effectively found that the growth issue was due to an incorrect recovery model configuration (FULL with no log backups, in this context).

But that's not all, and this is where the funny part of this story comes in. To be honest, I had never seen a latch timeout caused by the growth of a transaction log file before this situation. I first suspected the storage here, but after taking a look at the transaction log file size and the associated growth parameter, I definitely ruled out this suspicion (at least the real problem was not there).

 

blog_56_-_5_-_log_space_used

...

blog_56_-_6_-_database_file_options

 

 

I replaced my customer's real database name by "User DB" here. So what do you notice? No, you're not dreaming… the transaction log file size is 1.2 TB with the default auto growth configuration of 10%! So I imagine that you begin to understand where the problem is located…

So, just for curiosity let’s have a look at the generated dump file:

 

blog_56_-_7_-_dump_reading

 

So, it seems that an update query generated log records after updating a B-tree structure (sqlmin!BTreeRow::UpdateRecord()). In turn, the log manager tried to expand the transaction log file (sqlmin!SQLServerLogMgr::GrowAFile() method), without success, after SQL Server had generated log records (sqlmin!XdesRMReadWrite::LogBeginXact() --> sqlmin!XdesRMReadWrite::GenerateLogRec() methods). In this context, I guess that the expand operation took a very long time, because 10% of the transaction log file represents a large chunk of disk space (roughly 100 GB).

Finally let's have a look at the wait stats:

 

blog_56_-_1_-_wait_stats

 

We can also correlate the PREEMPTIVE_OS_WRITEFILEGATHER wait type to our issue here, with its high average value. Zeroing the new portion of the transaction log file may take a very long time in our context.
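Again, as an assumption of the kind of query used here, the average wait time per task can be derived from sys.dm_os_wait_stats:

-- average wait time for the PREEMPTIVE_OS_WRITEFILEGATHER wait type
select
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / nullif(waiting_tasks_count, 0) as avg_wait_time_ms
from sys.dm_os_wait_stats
where wait_type = 'PREEMPTIVE_OS_WRITEFILEGATHER';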

My intention was not to blame my customer here, but it was a good opportunity to point out the potential issue you may face with the default database file auto growth settings!
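As a final, hedged sketch (the database and file names below are illustrative, and 512 MB is just an example value), checking and fixing such a setting is straightforward:

-- check the current autogrow settings of the log files
select db_name(database_id) as database_name,
       name as logical_file_name,
       size * 8 / 1024 as size_mb,
       case when is_percent_growth = 1
            then cast(growth as varchar(10)) + ' %'
            else cast(growth * 8 / 1024 as varchar(10)) + ' MB'
       end as growth_setting
from sys.master_files
where type_desc = 'LOG';

-- replace the default 10% growth by a fixed increment
alter database [UserDB]
modify file (name = N'UserDB_log', filegrowth = 512MB);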

See you

 

 

 

 

 

Clustered columnstore index and memory management

Fri, 2015-07-10 10:46

A few weeks ago, I had the opportunity to give a session about the clustered columnstore index (CCI) feature at our In-Memory event dedicated to In-Memory technologies for Microsoft SQL Server, Oracle and SAP HANA. During this session, I explained the improvements made by Microsoft in SQL Server 2014 with the introduction of the new clustered columnstore index (CCI).

The CCI includes a new structure that allows update operations: the delta store. Indeed, insert operations go directly into the delta store, delete operations are logical and go into the deleted bitmap of the delta store, and update operations are split into two basic operations: a DELETE followed by an INSERT. I was very interested in how SQL Server deals with both structures (delta store and columnstore) and with memory in different scenarios. This blog post is a result of my studies and will probably interest those who like internal stuff. In fact, I discussed with one of my (Oracle) friends and he asked me some interesting questions about the CCI and memory management.

First of all, let's begin with the storage concept: the delta store is a traditional row-based storage structure, unlike the columnstore index which is column-oriented storage. The two structures are managed differently by SQL Server and they have their own memory space: respectively the CACHESTORE_COLUMNSTOREOBJECTPOOL for the columnstore structure and the traditional buffer pool (BP) for the row store structure. When columnstore data is fetched from disk to memory, it comes first into the BP and then into the columnstore memory pool. We may get information about the columnstore memory pool by using the following query:

 

select
       type,
       name,
       memory_node_id,
       pages_kb,
       page_size_in_bytes
from sys.dm_os_memory_clerks
where type = 'CACHESTORE_COLUMNSTOREOBJECTPOOL';
go

 

blog_55_-_2-_CCI_memory_management_memory_clerk-_

 

First scenario

We’ll see how SQL Server behaves by reading data exclusively from the delta store. Let’s begin with a pretty simple table:

 

-- create table test_CCI
if object_id('test_CCI', 'U') is not null
       drop table test_CCI;
go

create table test_CCI
(
       id int not null identity(1,1),
       col1 char(10) not null default 'col_'
);
go

 

Next, let’s create a CCI that will include 1 compressed row group and 1 delta store (open state):

 

set nocount on;

-- insert 1000 rows
insert test_CCI default values;
go 1000

-- create CCI
create clustered columnstore index [PK__test_CCI__3213E83F3A6FE3AC] on test_CCI
go

-- insert 1 row in order to create a delta store (OPEN state)
insert test_CCI default values;
go 1

 

Let’s have a look at the CCI row group’s information:

 

select
       object_name(object_id) as table_name,
       index_id,
       row_group_id,
       delta_store_hobt_id,
       state_description as [state],
       total_rows,
       deleted_rows,
       size_in_bytes
from sys.column_store_row_groups
where object_id = object_id('test_CCI');
go

 

blog_55_-_2-_CCI_configuration-_

 

Let's execute a first query that will fetch the record from the delta store:

 

dbcc dropcleanbuffers;

select
       max(id)
from dbo.test_CCI
where id = 1001

 

Let’s have a look at the memory cache entries related to the CCI memory pool:

 

select
       name,
       in_use_count,
       is_dirty,
       pages_kb,
       entry_data,
       pool_id
from sys.dm_os_memory_cache_entries
where type = 'CACHESTORE_COLUMNSTOREOBJECTPOOL';
go

 

blog_55_-_3-_CCI_memory_management_cache_entries_empty-_

No entries, and this is what I expected, because the data comes only from the delta store and the buffer pool is the only cache concerned in this scenario. Another important point: segments are eliminated directly from disk. In order to prove it, I created an extended event to get segment elimination information, as follows:

 

CREATE EVENT SESSION [cci_segment_elimination] ON SERVER
ADD EVENT sqlserver.column_store_segment_eliminate
(
    WHERE ([sqlserver].[database_name]=N'db_test')
)
ADD TARGET package0.event_file
(
       SET filename= N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Log\cci_segment_elimination'
)
WITH
(
       MAX_MEMORY=4096 KB,
       EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
       MAX_DISPATCH_LATENCY=30 SECONDS,
       MAX_EVENT_SIZE=0 KB,
       MEMORY_PARTITION_MODE=NONE,
       TRACK_CAUSALITY=OFF,
       STARTUP_STATE=OFF
)
GO

 

And after looking at the extended event file, I noticed that the segment was eliminated by SQL Server as expected.

 

blog_55_-_4-_CCI_segment_elimination

 

The hobt_id value relates to the compressed segment in the columnstore index:

 

select
       partition_id,
       hobt_id,
       column_id,
       segment_id,
       row_count,
       min_data_id,
       max_data_id
from sys.column_store_segments
where hobt_id = 72057594041925632

 

blog_55_-_5-_CCI_segment

 

Second scenario

This scenario consists of reading data directly from the columnstore segment.

 

dbcc dropcleanbuffers
go

select
       max(id)
from dbo.test_CCI
where id = 1

 

With the previous script that uses the sys.dm_os_memory_cache_entries DMV, we can see two cached entries from the columnstore object pool this time:

 

blog_55_-_1-_CCI_memory_management_-_cache_entries

 

 

I would like to thank Sunil Agarwal (Principal Program Manager in the SQL Server Storage Engine team) for his kindness and the information he gave me in order to read the above entry_data column values. [Object type] is very useful information here, with the following meaning:

0x0 = Un-initialized object
0x1 = Column segment
0x2 = Primary dictionary for a column
0x4 = Secondary dictionary for a column segment
0x5 = The primary dictionary with reverse HT initialized, used for bulk insert
0x6 = Delete bitmap - used temporarily when reading from disk

So let's correlate this with the retrieved entry_data column values. SQL Server fetched the concerned segment (object_type = 1) from disk into the columnstore object memory. However, let's have a look at the column_id value (= 2 here). In fact, I expected to get the value 1, which would relate to the id column of the table dbo.test_CCI. I performed some other tests, and they let me think that the column_id exposed in the entry_data value is simply offset by one from the column_id of the concerned table (the table column_id equals the entry's column_id minus 1), but I will check this point in the near future.

Moreover, according to Sunil's information, the deleted bitmap (object_type = 6) was also fetched by SQL Server. I can imagine that SQL Server needs to read it to filter out deleted records. My feeling is that these delete operations are fully logical and that SQL Server has no way to identify a deleted record in a segment without reading the deleted bitmap.

We can assume that the columnstore memory pool is the room for storing the columnstore segments, and that segments are stored in LOB pages. Does SQL Server read data directly from the columnstore memory pool?

Let's go back to the previous test. As a reminder, we want to get the max(id) from the columnstore segment related to the id column. So SQL Server needs to read the related segment in this case. Let's see if we can retrieve a corresponding page in the buffer pool by using the following script:

select
       page_type,
       count(*) as nb_pages
from sys.dm_os_buffer_descriptors
where database_id = db_id('db_test')
group by page_type
order by page_type

 

blog_55_-_7-_CCI_buffer_pool

 

Ok, there are plenty of pages in the buffer pool related to the db_test database. Let's focus first on data pages by using the following script, which retrieves data pages only for the dbo.test_CCI table:

 

if object_id('tempdb..#buffer_descriptor_pages') is not null
       drop table #buffer_descriptor_pages;
go

create table #buffer_descriptor_pages
(
       num INT null,
       ParentObject varchar(100) not null,
       [Object] varchar(100) not null,
       Field varchar(100) not null,
       VALUE varchar(100) not null
);

declare @sql varchar(max) = '';
declare @database_id int;
declare @file_id int;
declare @page_id int;
declare @i int = 0;

declare c_pages cursor fast_forward for
select
       database_id, file_id, page_id
from sys.dm_os_buffer_descriptors
where database_id = db_id('db_test')
       and page_type = 'DATA_PAGE';

open c_pages;

fetch next from c_pages into @database_id, @file_id, @page_id;

while @@fetch_status = 0
begin

       set @sql = 'dbcc traceon (3604); dbcc page(' + cast(@database_id as varchar(10))
                                              + ', ' + cast(@file_id as varchar(10))
                                              + ', ' + cast(@page_id as varchar(10))
                                              + ', 3) with tableresults';

       insert into #buffer_descriptor_pages(ParentObject, [Object], Field, VALUE)
       exec(@sql);

       update #buffer_descriptor_pages
       set num = @i
       where num is null;

       set @i = @i + 1;

       fetch next from c_pages into @database_id, @file_id, @page_id;

end

close c_pages;
deallocate c_pages;

select *
from #buffer_descriptor_pages
where num in (select num
              from #buffer_descriptor_pages
              where Field = 'Metadata: ObjectId'
                     and VALUE = object_id('dbo.test_CCI'));

 

In my case, I retrieved only one page with the following detail:

 

blog_55_-_7-_CCI_page_compressed_details

 

We get a compressed data page and, to be more precise, a data page that comes from the delta store (id = 1001). Remember that the segment elimination is not performed for the delta store. This is why I got this page in my case.

Next, let’s have a look at the LOB pages (our segments)

 

select
       database_id,
       file_id,
       page_id,
       allocation_unit_id,
       row_count
from sys.dm_os_buffer_descriptors
where database_id = db_id('db_test')
       and page_type = 'TEXT_MIX_PAGE'
       and allocation_unit_id in (select au.allocation_unit_id
                                  from sys.allocation_units as au
                                  join sys.partitions as p
                                         on p.hobt_id = au.container_id
                                  where p.object_id = object_id('dbo.test_CCI'));

 

blog_55_-_8-_CCI_LOB_pages

 

We have one LOB page (TEXT_MIX_PAGE type) but it seems to be empty, and I admit that I have no idea what this page is. I will update this blog post later if I get an answer.

So, to summarize and according to my tests, it seems that SQL Server reads LOB pages directly from the columnstore object pool and doesn't need to use the BP in this case.

 

Third scenario

This scenario consists of updating data in the columnstore index and understanding how SQL Server behaves in this case.

 

alter index [PK__test_CCI__3213E83F3A6FE3AC] on [dbo].[test_CCI] rebuild

...

select
       object_name(object_id) as table_name,
       index_id,
       row_group_id,
       delta_store_hobt_id,
       state_description as [state],
       total_rows,
       deleted_rows,
       size_in_bytes
from sys.column_store_row_groups
where object_id = object_id('test_CCI');

 

blog_55_-_9-_CCI_without_deltastore

 

Next, let’s update the columnstore index by using the following query:

checkpoint;
go

dbcc dropcleanbuffers;
go

update dbo.test_CCI set col1 = 'toto'

 

At this point, a delta store page is created by SQL Server and we have to think differently because the storage has changed from columnar to row store. So let’s have a look at the modified pages related to the columnstore index.

 

if object_id('tempdb..#buffer_descriptor_pages') is not null
       drop table #buffer_descriptor_pages;
go

create table #buffer_descriptor_pages
(
       num INT null,
       ParentObject varchar(400) not null,
       [Object] varchar(400) not null,
       Field varchar(400) not null,
       VALUE varchar(400) not null
);

declare @sql varchar(max) = '';
declare @database_id int;
declare @file_id int;
declare @page_id int;
declare @i int = 0;

declare c_pages cursor fast_forward for
select
       database_id, file_id, page_id
from sys.dm_os_buffer_descriptors
where database_id = db_id('db_test')
       and page_type = 'DATA_PAGE'
       and is_modified = 1;

open c_pages;

fetch next from c_pages into @database_id, @file_id, @page_id;

while @@fetch_status = 0
begin

       set @sql = 'dbcc traceon (3604); dbcc page(' + cast(@database_id as varchar(10))
                                              + ', ' + cast(@file_id as varchar(10))
                                              + ', ' + cast(@page_id as varchar(10))
                                              + ', 3) with tableresults';

       insert into #buffer_descriptor_pages(ParentObject, [Object], Field, VALUE)
       exec(@sql);

       update #buffer_descriptor_pages
       set num = @i
       where num is null;

       set @i = @i + 1;

       fetch next from c_pages into @database_id, @file_id, @page_id;

end

close c_pages;
deallocate c_pages;

select *
from #buffer_descriptor_pages
where num in (select num
              from #buffer_descriptor_pages
              where (Field = 'Metadata: ObjectId')
                     and VALUE = object_id('dbo.test_CCI'))
       and (Field = 'm_pageId' or Field = 'Record Type' or Field = 'CD array entry' or Field = 'id' or Field = 'col1');

 

blog_55_-_10-_CCI_after_modifying_data

 

Note that this time I only focused on the modified / dirty pages in my result, and I noticed that there are two data pages. The second page (1:94) in this result set is pretty obvious, because it concerns the record with id = 1 and col1 = toto (the modified data). However, I'm not sure exactly what the first page is, but I can again speculate: we performed an update operation and we know that this operation is split into two basic operations, DELETE + INSERT. So my feeling here is that this page relates to the deleted bitmap. Let's have a look at the sys.column_store_row_groups DMV:

 

select
       object_name(object_id) as table_name,
       index_id,
       row_group_id,
       delta_store_hobt_id,
       state_description as [state],
       total_rows,
       deleted_rows,
       size_in_bytes
from sys.column_store_row_groups
where object_id = object_id('test_CCI');

 

blog_55_-_11-_CCI_rowgroup_after_modifying_data

 

And as expected, we can notice a logically deleted record in the row group, along with a new open delta store (and its deleted bitmap). So let's perform a checkpoint and clear the buffer pool.

 

checkpoint;
go

dbcc dropcleanbuffers;
go

 

Now, we can wonder how SQL Server will retrieve the data for id = 1. Indeed, we have a deleted record in the row group on one side and the new version of the data in the delta store on the other side. So, we can guess that SQL Server will need to fetch both the data pages from the delta store and the deleted bitmap to get the correct record.

Let’s verify by performing this test and taking a look first at the memory cache entries related to the columnstore index.

 

blog_55_-_12-_CCI_memory_management_cache_entries_3

 

SQL Server has fetched the corresponding segment (object_type = 1) and the deleted bitmap (object_type = 6) as well. Note that segment elimination is not performed for the concerned segment, because SQL Server is not able to eliminate segments that contain logical deletions.

Finally let’s retrieve the data pages in the buffer pool related to the columnstore index:

 

blog_55_-_13_-_CCI_page_compressed_details

 

Ok, we retrieved the same clean pages (is_modified = 0), and performing the same test after rebuilding the CCI yielded an empty result. In the latter case, this is the expected behaviour, because rebuilding the columnstore index gets rid of the deleted records inside the segments. Thus, SQL Server doesn't need the deleted bitmap anymore.

I didn’t cover all the scenarios in this blog post and some questions are not answered yet. My intention was just to introduce some interesting internal stuff done by the CCI. This is definitely a very interesting topic that I want to cover in the near future. Please feel free to share your thoughts about this exciting feature!

ACFS 12.1.0.2 on Oracle Linux 7.1

Fri, 2015-07-10 03:56
Recently we wanted to create an ACFS filesystem on a brand new 12.1.0.2 GI installation on Oracle Linux 7.1. According to the documentation this should not be an issue as "Oracle Linux 7 with the Unbreakable Enterprise kernel: 3.8.13-33.el7uek.x86_64 or later" is supported.

Kerberos SSO with Liferay 6.1 v2

Mon, 2015-07-06 23:00


A little less than one year ago, I wrote a blog about how to set up Kerberos SSO on Liferay using Apache httpd as a front-end, Tomcat as a back-end, and mod_auth_kerb and mod_jk to transfer the information between these two components. Unfortunately, after some months, it seems that something changed either on the Liferay side or on the Apache side, because the exact same configuration wasn't working anymore. I solved this issue (or at least applied a workaround) two or three months ago because I needed a working environment for some tests, but until now I didn't have the time to share my findings.


That's why in this blog I will try to explain what was needed in my case to work around the issue with the configuration I shared in my previous blog. I decided not to update my previous blog but rather to create a new one, because the previous configuration may still work as expected with certain versions of Liferay!


I. The issue


In my case, after some months where everything was fine, the SSO suddenly stopped working. The access to the Liferay welcome page took a looooooong time or even never ended (or was rejected by Apache httpd, depending on the configuration).


With the proper logging information or debugging steps, if you face the same issue, you should be able to see that Liferay no longer understands the information that Apache httpd provides, and the result is an almost infinite loop in the login process.


From a debugging point of view, my previous blog post prevented an administrator from accessing Liferay without a valid Kerberos ticket. That's why in this solution I also incorporated some lines of code to handle such cases and to allow any user to still access Liferay even without a valid ticket. This can easily be removed for strict access requirements. Please be aware that this guest access is, by default, only valid for requests addressed directly to Tomcat (not the ones coming from Apache httpd)!


II. A possible solution


So the first thing that changed is the "KerberosAutoLogin" itself. I changed a few lines to get something cleaner and I added some others to handle the "guest" access in case a valid Kerberos ticket isn't found (lines 64 to 77). The result is the following:

KerberosAutoLogin1.pngKerberosAutoLogin2.png


Please find at the end of this blog the skeleton of the HOOK I built for my dev environment. This is basically the same as in my previous blog, except that I already changed the code for you.


In this Java class, I used the "LogFactoryUtil" for the logging to get something more standard, but that requires you to configure the log level that you want. If you just want these logs for the debugging phase, then you may want to replace all "logger.debug" or "logger.warn" calls with a simple "System.out.println". That should redirect all the elements that you want to log to the default catalina.out log file.



The second thing to do is to modify the Java class named "AutoLoginFilter". This class is a kind of dispatcher for the login requests that are coming to Liferay. From what I saw, there is a little issue with this class that prevents our KerberosAutoLogin code from being executed properly, or at all... Indeed, in our case the "REMOTE_USER" is set by Apache, and when this method is executed, our KerberosAutoLogin, which is one of the "_autoLogins" (see the code below), isn't called because the "remoteUser" variable isn't null!


To enable our KerberosAutoLogin to be executed even if the remoteUser isn't null, you can modify the default code. You can find the Java class at the following URL: the code is on grepcode. Please be aware that the code of this Java class is version dependent, which means that the code for Liferay 6.0 may not be exactly the same as for Liferay 6.1... So be careful ;).


I highlighted the lines I added for this specific case. If you take a look at the code, I also added a variable "REALM" to compare the Kerberos REALM that came from Apache httpd to the one that should be used (your Active Directory or KDC). This check can be used to prevent a user from a different REALM from logging in to your Liferay. If you don't want or don't need this additional security, you can just remove this variable and also remove the "if" test that uses it (remove lines 25, 65 and 88).

AutoLoginFilter1.pngAutoLoginFilter2.png


Please find at the end of this blog the skeleton of the EXT I built for my dev environment. Please also be aware that the KerberosAutoLogin is defined in a hook (custom-hook if you followed my previous blog), but you can't do the same for the AutoLoginFilter. Indeed, you will have to create your own EXT using the SDK, add this class to it and then deploy it to be able to see the updated version of the class loaded.


I only printed here the method "processFilter" because it's the only thing that needs to be modified, at least in my Liferay version 6.1.1! If you compare this method to the one on grepcode you will see that I compacted the code quite a lot to take less space but I didn't change the code that isn't highlighted.


Once these two Java classes are modified, you should be able to log in properly to Liferay using Kerberos tickets. If not, then enable the logging and that should help you find out where your issue is. Of course, you can still post a comment below and I will do my best to help you. I have also already seen some issues related to the new version of Apache... You may want to try the same thing with Apache 2.2 instead of 2.4! From my side, this was the first thing I tried to get the SSO working again and I still use Apache httpd 2.2 (instead of 2.4 as described in my previous blog). Another important thing to check is that your browser is properly set up for the SSO!


Skeleton of the EXT
Skeleton of the HOOK

 

RESULT_CACHE hint expiration options

Sun, 2015-07-05 14:22

The result cache is very nice when the base tables are static: you get the result without any buffer gets, joins or sorts. But as soon as any DML occurs on one of the tables the result is based on, the result cache is invalidated. And cache misses are quite expensive. You can think of it as a materialized view in memory. But with materialized views, you can accept seeing stale results in order to avoid frequent refreshes. Let's see how we can control stale results with undocumented options.

Test case

I'm creating a basic table.

22:30:44 SQL> create table DEMO as select rownum id from xmltable('1 to 100000');

Table created.

22:30:44 SQL> exec if dbms_result_cache.flush then dbms_output.put_line('Flushed.'); end if;

PL/SQL procedure successfully completed.

RESULT_CACHE

On that table, I'll do a simple select with the RESULT_CACHE hint.

22:30:44 SQL> set autotrace on
22:30:44 SQL> select /*+ result_cache */ count(*) from DEMO;

  COUNT(*)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

------------------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |     1 |    46   (0)| 00:00:01 |
|   1 |  RESULT CACHE       | 2x5f91pfn5p6882f6szxj50jwf |       |            |          |
|   2 |   SORT AGGREGATE    |                            |     1 |            |          |
|   3 |    TABLE ACCESS FULL| DEMO                       |   100K|    46   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Result Cache Information (identified by operation id):
------------------------------------------------------

   1 - column-count=1; dependencies=(DEMO.DEMO); attributes=(single-row); name="select /*+ result_cache */ count(*) from DEMO"


Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
        159  consistent gets
        153  physical reads
          0  redo size
        358  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

We have read the 150 blocks of the DEMO table. The result cache has been loaded. Here are the objects and dependency:
22:30:44 SQL> set autotrace off
22:30:44 SQL> select * from v$result_cache_dependency;

 RESULT_ID  DEPEND_ID  OBJECT_NO     CON_ID
---------- ---------- ---------- ----------
         1          0     112463          0

22:30:44 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Dependency Published DEMO.DEMO                      DEMO.DEMO                      0
  1 Result     Published select /*+ result_cache */ cou 2x5f91pfn5p6882f6szxj50jwf     0

We can read it as: the query result (id=1) depends on the table (id=0).

If I run it another time:

22:30:45 SQL> select /*+ result_cache */ count(*) from DEMO;

  COUNT(*)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

------------------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |     1 |    46   (0)| 00:00:01 |
|   1 |  RESULT CACHE       | 2x5f91pfn5p6882f6szxj50jwf |       |            |          |
|   2 |   SORT AGGREGATE    |                            |     1 |            |          |
|   3 |    TABLE ACCESS FULL| DEMO                       |   100K|    46   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Result Cache Information (identified by operation id):
------------------------------------------------------

   1 - column-count=1; dependencies=(DEMO.DEMO); attributes=(single-row); name="select /*+ result_cache */ count(*) from DEMO"


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
        358  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

No block get at all. The result was in cache.

dependencies

If we do any kind of DML on the tables the result cache depends on:

22:30:45 SQL> delete from DEMO where null is not null;

0 rows deleted.

22:30:45 SQL> commit;

Commit complete.

Then the cache is invalidated:
22:30:45 SQL> select * from v$result_cache_dependency;

no rows selected

22:30:45 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Dependency Published DEMO.DEMO                      DEMO.DEMO                      1
  1 Result     Invalid   select /*+ result_cache */ cou 2x5f91pfn5p6882f6szxj50jwf     0

and another run will have to load the cache again:
22:30:45 SQL> set autotrace on
22:30:45 SQL> select /*+ result_cache */ count(*) from DEMO;

  COUNT(*)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

------------------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |     1 |    46   (0)| 00:00:01 |
|   1 |  RESULT CACHE       | 2x5f91pfn5p6882f6szxj50jwf |       |            |          |
|   2 |   SORT AGGREGATE    |                            |     1 |            |          |
|   3 |    TABLE ACCESS FULL| DEMO                       |   100K|    46   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Result Cache Information (identified by operation id):
------------------------------------------------------

   1 - column-count=1; dependencies=(DEMO.DEMO); attributes=(single-row); name="select /*+ result_cache */ count(*) from DEMO"


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        157  consistent gets
          0  physical reads
          0  redo size
        358  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

22:30:46 SQL> set autotrace off
22:30:46 SQL> select * from v$result_cache_dependency;

 RESULT_ID  DEPEND_ID  OBJECT_NO     CON_ID
---------- ---------- ---------- ----------
         2          0     112463          0

22:30:46 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Dependency Published DEMO.DEMO                      DEMO.DEMO                      1
  1 Result     Invalid   select /*+ result_cache */ cou 2x5f91pfn5p6882f6szxj50jwf     0
  2 Result     Published select /*+ result_cache */ cou 2x5f91pfn5p6882f6szxj50jwf     0

This is the only documented behaviour: the cache is invalidated if and only if there has been some DML on the tables it depends on.

RESULT_CACHE(SHELFLIFE=seconds)


I flush the cache and run the same with the undocumented result cache hint option: SHELFLIFE=10 seconds:

22:30:46 SQL> exec if dbms_result_cache.flush then dbms_output.put_line('Flushed.'); end if;

PL/SQL procedure successfully completed.

22:30:46 SQL>
22:30:46 SQL> set autotrace on
22:30:46 SQL> select /*+ result_cache(shelflife=10) */ count(*) from DEMO;

  COUNT(*)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

------------------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |     1 |    46   (0)| 00:00:01 |
|   1 |  RESULT CACHE       | 2x5f91pfn5p6882f6szxj50jwf |       |            |          |
|   2 |   SORT AGGREGATE    |                            |     1 |            |          |
|   3 |    TABLE ACCESS FULL| DEMO                       |   100K|    46   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Result Cache Information (identified by operation id):
------------------------------------------------------

   1 - column-count=1; dependencies=(DEMO.DEMO); attributes=(single-row, shelflife=10); name="select /*+ result_cache(shelflife=10) */ count(*) from DEMO"
We see the 'shelflife' attribute, but the dependency is the same as without the option:
22:30:46 SQL> set autotrace off
22:30:46 SQL> select * from v$result_cache_dependency;

 RESULT_ID  DEPEND_ID  OBJECT_NO     CON_ID
---------- ---------- ---------- ----------
         1          0     112463          0

22:30:46 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Dependency Published DEMO.DEMO                      DEMO.DEMO                      0
  1 Result     Published select /*+ result_cache(shelfl 2x5f91pfn5p6882f6szxj50jwf     0
I run the query again 5 seconds later:
22:30:51 SQL> set autotrace on
22:30:51 SQL> select /*+ result_cache(shelflife=10) */ count(*) from DEMO;

  COUNT(*)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

------------------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |     1 |    46   (0)| 00:00:01 |
|   1 |  RESULT CACHE       | 2x5f91pfn5p6882f6szxj50jwf |       |            |          |
|   2 |   SORT AGGREGATE    |                            |     1 |            |          |
|   3 |    TABLE ACCESS FULL| DEMO                       |   100K|    46   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Result Cache Information (identified by operation id):
------------------------------------------------------

   1 - column-count=1; dependencies=(DEMO.DEMO); attributes=(single-row, shelflife=10); name="select /*+ result_cache(shelflife=10) */ count(*) from DEMO"


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
        358  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

22:30:51 SQL> set autotrace off
22:30:51 SQL> select * from v$result_cache_dependency;

 RESULT_ID  DEPEND_ID  OBJECT_NO     CON_ID
---------- ---------- ---------- ----------
         1          0     112463          0

22:30:51 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Dependency Published DEMO.DEMO                      DEMO.DEMO                      0
  1 Result     Published select /*+ result_cache(shelfl 2x5f91pfn5p6882f6szxj50jwf     0

Nothing special here. No DML occurred, so the result is still valid ('Published').

expiration


But let's wait 5 more seconds and run it again:

22:30:56 SQL> select * from v$result_cache_dependency;

 RESULT_ID  DEPEND_ID  OBJECT_NO     CON_ID
---------- ---------- ---------- ----------
         1          0     112463          0

22:30:56 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Dependency Published DEMO.DEMO                      DEMO.DEMO                      0
  1 Result     Expired   select /*+ result_cache(shelfl 2x5f91pfn5p6882f6szxj50jwf     0
The status has changed: it's now EXPIRED because the 10-second shelflife has passed since the cache was loaded.
22:30:56 SQL>
22:30:56 SQL> set autotrace on
22:30:56 SQL> select /*+ result_cache(shelflife=10) */ count(*) from DEMO;

  COUNT(*)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

------------------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |     1 |    46   (0)| 00:00:01 |
|   1 |  RESULT CACHE       | 2x5f91pfn5p6882f6szxj50jwf |       |            |          |
|   2 |   SORT AGGREGATE    |                            |     1 |            |          |
|   3 |    TABLE ACCESS FULL| DEMO                       |   100K|    46   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Result Cache Information (identified by operation id):
------------------------------------------------------

   1 - column-count=1; dependencies=(DEMO.DEMO); attributes=(single-row, shelflife=10); name="select /*+ result_cache(shelflife=10) */ count(*) from DEMO"


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        157  consistent gets
          0  physical reads
          0  redo size
        358  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Here I have a cache miss even though no DML occurred on the tables it depends on.
22:30:56 SQL> select * from v$result_cache_dependency;

 RESULT_ID  DEPEND_ID  OBJECT_NO     CON_ID
---------- ---------- ---------- ----------
         2          0     112463          0

22:30:56 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Dependency Published DEMO.DEMO                      DEMO.DEMO                      0
  1 Result     Invalid   select /*+ result_cache(shelfl 2x5f91pfn5p6882f6szxj50jwf     0
  2 Result     Published select /*+ result_cache(shelfl 2x5f91pfn5p6882f6szxj50jwf     0

When I run a query using an expired result cache, that cache becomes invalid, the query is fully run (157 consistent gets) and a new result cache is populated.

Note that in addition to the shelflife expiration, the dependency on DML is still invalidating the result cache:

22:30:56 SQL> delete from DEMO where null is not null;

0 rows deleted.

22:30:56 SQL> commit;

Commit complete.

22:30:56 SQL> select * from v$result_cache_dependency;

no rows selected

22:30:56 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Dependency Published DEMO.DEMO                      DEMO.DEMO                      1
  1 Result     Invalid   select /*+ result_cache(shelfl 2x5f91pfn5p6882f6szxj50jwf     0
  2 Result     Invalid   select /*+ result_cache(shelfl 2x5f91pfn5p6882f6szxj50jwf     0
So with SHELFLIFE we have two reasons for invalidations: dependency and expiration.

RESULT_CACHE(SNAPSHOT=seconds)


There is another undocumented option for the result cache hint: SNAPSHOT, which also takes a number of seconds. Let's try it:

22:30:56 SQL> exec if dbms_result_cache.flush then dbms_output.put_line('Flushed.'); end if;

PL/SQL procedure successfully completed.

22:30:56 SQL>
22:30:56 SQL> set autotrace on
22:30:56 SQL> select /*+ result_cache(snapshot=10) */ count(*) from DEMO;

  COUNT(*)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

------------------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |     1 |    46   (0)| 00:00:01 |
|   1 |  RESULT CACHE       | 2x5f91pfn5p6882f6szxj50jwf |       |            |          |
|   2 |   SORT AGGREGATE    |                            |     1 |            |          |
|   3 |    TABLE ACCESS FULL| DEMO                       |   100K|    46   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Result Cache Information (identified by operation id):
------------------------------------------------------

   1 - column-count=1; attributes=(single-row, snapshot=10); name="select /*+ result_cache(snapshot=10) */ count(*) from DEMO"


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        157  consistent gets
          0  physical reads
          0  redo size
        358  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

22:30:56 SQL> set autotrace off
22:30:56 SQL> select * from v$result_cache_dependency;

no rows selected

22:30:56 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Result     Published select /*+ result_cache(snapsh 2x5f91pfn5p6882f6szxj50jwf     0

The cache has been populated here, but there is no dependency. Let's try DML on base table:
22:30:56 SQL> delete from DEMO where null is not null;

0 rows deleted.

22:30:56 SQL> commit;

Commit complete.

22:30:56 SQL> select * from v$result_cache_dependency;

no rows selected

22:30:56 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Result     Published select /*+ result_cache(snapsh 2x5f91pfn5p6882f6szxj50jwf     0

A SNAPSHOT result cache is not invalidated by DML on base tables.

But if we wait 10 seconds:

22:31:06 SQL> select id,type,status,name,cache_id,invalidations from v$result_cache_objects order by id;

 ID TYPE       STATUS    NAME                           CACHE_ID                     INV
--- ---------- --------- ------------------------------ --------------------------- ----
  0 Result     Expired   select /*+ result_cache(snapsh 2x5f91pfn5p6882f6szxj50jwf     0

The status is expired, as with the shelflife. Next execution will invalidate it.

Conclusion

There are two reasons to invalidate a result cache: DML on a dependency, or expiration after a number of seconds. And we can use any combination of them with undocumented hints. Oracle itself uses them internally. SNAPSHOT is used by Active Dynamic Sampling: stale results are acceptable for one hour. SHELFLIFE is used on dictionary views based on X$ tables where dependencies cannot be tracked.

I hope it will get documented in future releases. There are some cases where we can accept stale results in order to get better performance. We already do that with materialized views, so why not do it with result cache?

Testing the just released PostgreSQL 9.5 Alpha in a docker container

Fri, 2015-07-03 00:15

On the 2nd of July the PostgreSQL Global Development Group released an alpha version of the upcoming PostgreSQL 9.5. The same day, Josh Berkus, one of the PostgreSQL core team members, released a docker image for testing this alpha release. It has never been that easy to get started with PostgreSQL or to test new features.

BEEZY: Social Network for SharePoint 2013

Wed, 2015-07-01 13:32




Social networking... Everybody is "connected" nowadays: professional networks, private social networks... There are so many solutions around today. Which one should I use? What are the differences?
Regarding the use of a social network at work, we have already seen YAMMER; what about BEEZY?

What is Beezy?

Beezy is a social network built inside SharePoint.
Beezy comes in two flavors: on premises behind the firewall on SharePoint Server 2010 and in the cloud on Office365.


Beezy Features

 

  • Collaboration tools: with a click, sharing files, events, tasks, images, videos and links is possible! Yes it is!
  • Groups: Beezy allows you to create groups to structure corporate information; the setup is user friendly and even if a group is shut down, its information is kept.
  • Microblogging: this is a good way to foster collaboration and team spirit; you share ideas and get feedback in real time. As with Twitter, you can use tags like hashtags (#) and replies (@), and embed videos from YouTube!
  • Follows: knowledge management is also about effectively filtering information. By following and replying, users are notified when a change is made to anything they are following, whether conversations or documents.
  • Profiles: a unique employee profile gathering professional data and latest activity is available. You can also link your past activities with LinkedIn, and synchronize employee data with Active Directory.

Here is a video link about Beezy.

Beezy or Yammer?

 

The biggest difference between both tools is the integration.
Beezy is SharePoint integrated, whereas Yammer only gets a link in the top menu and a web part that does not allow uploading files to the microblog.

Beezy works within the SharePoint framework: all of your permissions, storage, compliance policies and procedures remain the same, unlike a hybrid solution using Yammer/Office 365 where the level of access is more limited by comparison, requiring additional management overhead.


Only a good user experience drives real adoption
As we have already seen in other articles, only the user experience can drive real adoption. The simpler, faster and more intuitive the tools you put in place, the more your employees will jump in.

Conclusion


Beezy offers a completely integrated collaboration tool for SharePoint 2013 / Office 365, easily deployed on SharePoint Server 2013 and easy to use.
In order to make the right choice, take time to analyze your business needs, try solutions with small groups, get feedback from users and then make a decision.


Source: www.beezy.net



Multitenant vs. schema based consolidation

Tue, 2015-06-30 11:12

If you want install multiple instances of a software, for example you host the ERP for several companies or subsidiaries, you have 3 solutions:

  • have one database and multiple schema
  • have multiple databases
  • have one database and multiple pluggable databases

Of course, this is exactly the reason for pluggable databases: multitenant. You have good isolation but still share resources. A lot of reasons have been given why multiple schemas - or schema-based consolidation - are not a good solution. I don't agree with most of them. But there is one very good reason that I'll show later, and it's about cursor sharing.

schema based consolidation

Let's take the Oracle white paper presenting multitenancy.

Name collision might prevent schema-based consolidation

Yes some applications have a fixed schema name. If your ERP must be installed in SYSERP schema, then you cannot install several ones in the same database.

However, you should challenge your application provider about that before changing all your infrastructure and buying expensive options. Maybe I'm too optimistic here, but I think it's something from the past. I remember a telco billing software I installed 15 years ago. The schema was 'PB'. It had nothing to do with the software name or the vendor name. But when I asked if I could change it, the answer was no. That schema name was hard-coded everywhere. I got it when the main developer came to visit us... his name was Pierre B.

About public synonyms, and public database links... please just avoid them.


Schema-based consolidation brings weak security

Same idea. If your application requires a 'SELECT ANY ...' privilege, then don't do it. In 12c you have privilege analysis that can help you identify the minimal rights you need to grant.
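As a minimal sketch of privilege analysis (the capture name APP_CAPTURE is made up, and the feature has its own licensing and privilege requirements):

-- capture all privileges used database-wide while the application runs
exec dbms_privilege_capture.create_capture(name=>'APP_CAPTURE', type=>dbms_privilege_capture.g_database)
exec dbms_privilege_capture.enable_capture(name=>'APP_CAPTURE')
-- ... run a representative application workload ...
exec dbms_privilege_capture.disable_capture(name=>'APP_CAPTURE')
exec dbms_privilege_capture.generate_result(name=>'APP_CAPTURE')
-- compare what was used with what was granted but never used
select * from dba_used_privs   where capture='APP_CAPTURE';
select * from dba_unused_privs where capture='APP_CAPTURE';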

 

Per application backend point-in-time recovery is prohibitively difficult

I don't see the point. Currently multitenant does not give us more options, because neither pluggable database point-in-time recovery nor flashback pluggable database is currently possible in-place. But I know it's planned for the future. You can already read about it at http://www.google.com/patents/US20140095452

Of course, when using schema-based consolidation you should use different tablespaces, and then you have TSPITR, as sketched below.
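A minimal RMAN sketch of TSPITR for one application backend (the tablespace name ERP1_DATA, the target time and the auxiliary destination are made up):

# restore and recover only this application's tablespace to a past point in time,
# using an automatically managed auxiliary instance
recover tablespace ERP1_DATA
  until time "to_date('2015-06-30 12:00:00','yyyy-mm-dd hh24:mi:ss')"
  auxiliary destination '/u01/aux';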

 

Resource management between application backends is difficult

Well, you don't need pluggable databases to use services. Multitenant is just an easy way to force the application to use specific services.

 

Patching the Oracle version for a single application backend is not possible

Yes, plugging a PDB into a different version CDB can be faster for those applications that have a lot of objects. But it is not as easy as the doc says. The PDB dictionary must be patched. It's still a good thing when the system metadata is a lot smaller than the application metadata.

 

Cloning a single application backend is difficult

Cloning a PDB is easy. Right. 

Finally, multitenant is nice because of pluggable databases. Do you know that all occurrences of 'multitenant' in 12c code or documentation were 'pluggable database' one month before the release?

But wait a minute, I'm not talking about test environments here. I'm talking about consolidating similar production databases. And plug/unplug has the same problem as transportable tablespaces: the source must be made read-only.

 

Cursor sharing in schema based consolidation

Time to show you what is the big advantage of multitenant.

10 years ago I worked on a database that had 3000 schemas. Well we had 5 databases like that. You can think of them as specialized datamarts: same code, same data model, but different data, used by application services provided to different customers. A total of 45TB was quite nice at that time.

That was growing very fast and we had 3 issues.

Issue one was capacity planning. The growth was difficult to predict. We had to move those schemas from one database to another, from one storage system to another... It was 10g - no online datafile move at that time. Transportable tablespaces were there, but see the next point.

The second issue was the number of files. At first, each datamart had its own set of tablespaces. But >5000 datafiles in a database was too much for several reasons. One of the reasons was RMAN. I remember a duplicate with skip tablespace that took 2 days to initialize...

Then we consolidated several datamarts into the same tablespaces. When I think about it, the multitenant database we can have today (12c) would not have been an easy solution. Lots of pluggable databases mean lots of datafiles. I hope those RMAN issues have been fixed. But there are other ones. Did you ever try to query DBA_EXTENTS on a database with >5000 datafiles? I had to when we had some block corruption on the SAN (you know, because of issue 1 we did a lot of online reorg of the filesystems, and the SAN software had a bug). This is where I made my alternative to DBA_EXTENTS.

Then the third issue was cursor sharing.

Let me give you an example

I create the same table in two schemas (DEMO1 and DEMO2) of same database.

SQL> connect demo1/demo@//192.168.78.113/PDB1
Connected.
SQL> create table DEMO as select * from dual;

Table created.

SQL> select * from DEMO;

D
-
X

SQL> select prev_sql_id from v$session where sid=sys_context('userenv','sid');

PREV_SQL_ID
-------------
0m8kbvzchkytt

SQL>
SQL> connect demo2/demo@//192.168.78.113/PDB1
Connected.
SQL> create table DEMO as select * from dual;

Table created.

SQL> select * from DEMO;

D
-
X

I'm in multitenant here because of the second test I'll do, but it's the same pluggable database PDB1.

 You see that I've executed exactly the same statement - SELECT * FROM DEMO - in both connections. Same statement but on different tables. Let's look at the cursors:

b2ap3_thumbnail_CaptureShared1.JPG

The optimizer tried to share the same cursor. The parent cursor is the same because the SQL text is the same. Then it follows the child list in order to see if a child can be shared. But the semantic verification sees that it's not the same 'DEMO' table, and it had to hard parse.
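Since the screenshot only shows the result, here is the kind of query behind it (a sketch; exactly which mismatch columns are set is an assumption and can vary by version, but for the same text on different tables you typically see 'Y' in AUTH_CHECK_MISMATCH and TRANSLATION_MISMATCH):

select sql_id, child_number, auth_check_mismatch, translation_mismatch
  from v$sql_shared_cursor
 where sql_id='0m8kbvzchkytt';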

The problem is not the hard parse. It's not the same table, so it's another cursor. Only the name is the same.

Imagine what happened on my database where I had 3000 identical queries on different schemas. We didn't have 'perf flame graphs' at that time, or we would have seen a large flame over kkscsSearchChildList.

Looking at thousands of child cursors in the hope of finding one that can be shared is very expensive. And because it's the same parent cursor, there is high contention on the latch protecting the parent.

The solution at that time was to add a comment into the SQL statements with the name of the datamart, so that each one has a different SQL text - a different parent cursor. But that was a big change of code with dynamic SQL.
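As a hypothetical illustration (DATAMART042 and DATAMART043 are made-up schema names), the idea was simply this:

-- the embedded comment makes the SQL text unique per datamart, so each schema
-- gets its own parent cursor instead of piling up children under a single parent
select /* DATAMART042 */ * from DEMO;
select /* DATAMART043 */ * from DEMO;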

Cursor sharing in multitenant consolidation

So, in 12c, I run the same query on different pluggable databases. After the previous test where I had two child cursors in PDB1 (CON_ID=5), I ran the same in PDB2 (CON_ID=4) and here is the view of parent and child cursors from the CDB:

b2ap3_thumbnail_CaptureShared2.JPG

We have the two child cursors from the previous test and we have a new child for CON_ID=4

The child number may be misleading, but the search for a shareable cursor is done only within the current container, so the same query run from another pluggable database did not try to share a previous cursor. We can see that because there is no additional 'reason' in V$SQL_SHARED_CURSOR.

SQL> select con_id,sql_id,version_count from v$sqlarea where sql_id='0m8kbvzchkytt';

    CON_ID SQL_ID        VERSION_COUNT
---------- ------------- -------------
         5 0m8kbvzchkytt             3
         4 0m8kbvzchkytt             3

The V$SQLAREA is also misleading because VERSION_COUNT aggregates the versions across containers.

But the real behavior is visible in V$SQL_SHARED_CURSOR above, and if you run that with a lot of child cursors you will see the difference in CPU time, latching activity, etc.

Conclusion

I'm not talking about pluggable databases here. Pluggable databases do not need the multitenant option, as you can plug/unplug databases in single-tenant. Pluggable databases are a nice evolution of transportable databases.

When it comes to multitenant - having several pluggable databases in the same container, in order to have several 'instances' of your software without multiplying the instances of your RDBMS - then here is the big point: consolidation scalability.

You can add new pluggable databases, and run same application code on them, without increasing contention, because most of the instance resources are isolated to one container. 

ODA - VMs possibilities & performances

Tue, 2015-06-30 02:56

As you know, it is possible to install the ODA in virtualized mode and to take advantage of all the cores not licensed with Enterprise Edition for additional VMs.

The question is what we could do with it and what performance we could expect...

Indexing for like/similarity operations

Mon, 2015-06-29 04:05

Indexing queries for like/similarity conditions is not that easy with the usual index types. The only option you have with btree indexes (especially if the wildcard is at the beginning of the filter) is to create a partial index on that column for a very specific query.
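A minimal sketch of that workaround in PostgreSQL (table t, column val and the pattern are made up; only the one query whose predicate matches the index predicate can benefit):

create table t ( val text );
-- only rows matching the predicate are indexed, so the index stays small and the
-- planner can scan it even though a leading wildcard cannot use a plain btree lookup
create index t_val_like_idx on t ( val ) where val like '%pattern';
select * from t where val like '%pattern';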

Oracle Log Writer and Write-Ahead-Logging

Sun, 2015-06-28 11:29

I posted a tweet with a link to a very old document - 20 years old - about 'internals of recovery'. It's a gem. All the complexity of the ACID mechanisms of Oracle is explained in a very simple way. It was written for Oracle 7.2 but it's incredible to see how much the basic things are still relevant today. Of course, there is a reason for that: the mechanisms of recovery are critical and must be stable. There is one more reason in my opinion: the Oracle RDBMS software was very well designed, so the basic structures designed 20 years ago are still able to cope with new features, and to scale with very large databases, through the versions and the years.

It's 20 years old but it's still the best written document I've read about how Oracle works http://t.co/4CAI4Q5MIm http://t.co/mmgA50JzMQ

— Franck Pachot (@FranckPachot) June 26, 2015

If you check the conversation that followed, a doubt has been raised about the following sentence:

According to write-ahead log protocol, before DBWR can write out a cache buffer containing a modified datablock, LGWR must write out the redo log buffer containing redo records describing changes to that datablock.

There are 2 ways to clear out that kind of doubt: read and test. And we need both of them because:

  • documentation may have bugs
  • software may have bugs

so you can be sure about a behaviour only when both documentation and test validate your assumption.

Documentation

The first documentation I find about it is another gem describing how Oracle works: Jonathan Lewis's 'Oracle Core' (Apress). And it's clearly stated that:

One of the most important features of the Oracle code is that the database writer will not write a changed block to disk before the log writer has written the redo that describes how the block was changed. This write-ahead logging strategy is critical to the whole recovery mechanism.

Then there is of course the Oracle Documentation:

Before DBW can write a dirty buffer, the database must write to disk the redo records associated with changes to the buffer (the write-ahead protocol). If DBW discovers that some redo records have not been written, it signals LGWR to write the records to disk, and waits for LGWR to complete before writing the data buffers to disk.

Test case

Ok, that should be enough. But I want to do a simple testcase in order to see if anything has changed in the latest version (12.1.0.2). My idea is to check two things:

  • whether a checkpoint requires some work to be done by the log writer
  • whether a change is written to the redo log after a checkpoint, without waiting for the usual 3-second log writer write

I create a table:

19:07:21 SQL> create table DEMO as select '--VAL--1--'||to_char(current_timestamp,'hh24missffff') val from dual;

Table created.

19:07:21 SQL> select * from DEMO;

VAL
----------------------------------
--VAL--1--190721367902367902

 

I start with a new logfile:

19:07:21 SQL> alter system switch logfile;
System altered.

And I retrieve the log writer process id for future use:

19:07:21 SQL> column spid new_value pid
19:07:21 SQL> select spid,pname from v$process where pname='LGWR';

SPID PNAME
------------------------ -----
12402 LGWR

19:07:21 SQL> host ps -fp &pid
UID PID PPID C STIME TTY TIME CMD
oracle 12402 1 0 Jun25 ? 00:00:46 ora_lgwr_DEMO14

update and commit

Here is a scenario where I update and commit:

19:07:21 SQL> update DEMO set val='--VAL--2--'||to_char(current_timestamp,'hh24missffff');

1 row updated.

19:07:21 SQL> select * from DEMO;

VAL
----------------------------------
--VAL--2--190721443102443102

19:07:21 SQL> commit;

Commit complete.

I want to see if a checkpoint has to wait for anything from the log writer, so I freeze the log writer:

19:07:21 SQL> host kill -sigstop &pid

and I checkpoint:

19:07:21 SQL> alter system checkpoint;

System altered.

No problem. The checkpoint did not require anything from the log writer in that case. Note that the redo related to the dirty buffers had already been written to disk at commit (and the log writer was running at that time).

I un-freeze it for the next test:

19:07:21 SQL> host kill -sigcont &pid

update without commit

Now I'm doing the same but without commit. My goal is to see if uncommitted dirty blocks need their redo to be written to disk.

19:07:51 SQL> select * from DEMO;

VAL
----------------------------------
--VAL--2--190721443102443102

19:07:51 SQL> host kill -sigstop &pid

19:07:51 SQL> update DEMO set val='--VAL--3--'||to_char(current_timestamp,'hh24missffff');

1 row updated.

19:07:51 SQL> alter system checkpoint;

Here it hangs. Look at the wait events:

b2ap3_thumbnail_CaptureLGWR.JPG

My checkpoint is waiting on 'rdbms ipc reply' until the log writer is woken up. 


$ kill -sigcont 12402

System altered.

19:09:37 SQL> select * from DEMO;

VAL
----------------------------------
--VAL--3--190751477395477395

The checkpoint is done.

 

Note that if I run the same but wait 3 seconds after the update (because I know that the log writer writes redo at least every 3 seconds, even when not asked to do it):

21:33:35 SQL> update DEMO set val='--VAL--3--'||to_char(current_timestamp,'hh24missffff');

1 row updated.

21:33:35 SQL> host sleep 3

21:33:38 SQL> host kill -sigstop &pid

21:33:38 SQL> alter system checkpoint;

System altered.

21:33:38 SQL>

The checkpoint is not waiting because all the redo that covers the dirty buffers is already written.

I've also checked that immediately after the checkpoint (without stopping the log writer here) the uncommitted change is written to the redo log files:

21:56:38 SQL> select group#,v$log.status,member from v$log join v$logfile using(group#) where v$log.status='CURRENT';

GROUP# STATUS MEMBER
---------- ---------------- ----------------------------------------
2 CURRENT /u01/DEMO/oradata/DEMO14/redo02.log


21:56:38 SQL> update DEMO set val='--VAL--2--'||to_char(current_timestamp,'hh24missffff');

1 row updated.

21:56:38 SQL> select * from DEMO;

VAL
----------------------------------
--VAL--2--215638557183557183

21:56:38 SQL> alter system checkpoint;

System altered.

21:56:38 SQL> host strings &redo | grep "VAL--"
--VAL--1--215638376899376899
--VAL--2--2156385571

A simple grep reveals that the redo has been written (I have no other activity in the database - so no concurrent commits here).

Conclusion

Even if some mechanisms have been improved for performance (see Jonathan Lewis's book for them), the fundamentals have not changed.

I've said that there are two ways to validate an assumption: documentation and test.
But there is a third one: understanding.

When you think about it, if you write uncommitted changes to the datafiles, then you must be able to roll them back in case of recovery. Where is the rollback information? In the undo blocks. Are the undo blocks written to disk when the data blocks are written to disk? You don't know. Then where do you find the undo information in case of recovery? The redo generated by the transaction contains change vectors for the data blocks and for the undo blocks. So if you are sure that all the redo is written before the block containing uncommitted changes, then you are sure to be able to roll back those uncommitted changes.

Note that this occurs only for modifications through the buffer cache. Direct-path inserts do not need to be covered by redo to be undone. It's the change of the high water mark that will be undone, and that one is done in the buffer cache, protected by redo.

Oracle Database Cloud Service - My first trial

Sat, 2015-06-27 12:31

The cloud has been announced, and I want to try it.

From the cloud.oracle.com/database website, there is a trial only for the 'Database Schema Service', so I asked for it, received an e-mail with connection info, and it works:

b2ap3_thumbnail_CaptureCloud001.JPG

 

Good. The password was temporary, so I have to change it, and set answers to 3 of the 4 security questions in case I forget my password.

Java error: my user does not exist:

b2ap3_thumbnail_CaptureCloud002.JPG

Ok, I was too quick after receiving the e-mail... Let's wait 5 minutes and come back:

 

b2ap3_thumbnail_CaptureCloud003.JPG

 

What? Same as the old password?

let's try another one:

 

b2ap3_thumbnail_CaptureCloud004.JPG

 

Ok. Now I understand. When my user did not exist, the password had been changed anyway... Good thing I remember it.

I put the first password I tried as old password:

 

b2ap3_thumbnail_CaptureCloud005.JPG

Ok. But I know my previous password, so let's keep it.

I come back to the url and connect:

 

b2ap3_thumbnail_CaptureCloud006.JPG

 

It seems that I don't remember it... let's go through the 'Forgot Password' screen. I entered my mother's maiden name, my pet's name, my city of birth... well, all that top secret information that everybody has on the first page of their Facebook profile ;)

And enter a new password:

 

b2ap3_thumbnail_CaptureCloud007.JPG

Bad luck again...

Going back to the e-mail I see something strange: there are two spaces in front of my username:

 

b2ap3_thumbnail_CaptureCloud007b.JPG

Ok, I try everything: my username with and without the spaces in front, the 3 passwords I tried to change previously... same error.

Let's start from the beginning. Click on the first url in the mail.

Bingo. I don't know how, but I'm logged in. Here is the first screen of my first database in the cloud:

b2ap3_thumbnail_CaptureCloud008.JPG

 

Wait a minute... is this new? An online APEX workspace that has been available for years is now 'The' Oracle Cloud Database Schema Service, available as a 30-day trial?

Without any hesitation, I'll do my first Hello World in a database in the Cloud:

b2ap3_thumbnail_CaptureCloud009.JPG

I hope we will be able to get a trial account for the 'real' Database as a Service in the Cloud. I always loved to be able to download and try Oracle products for free. I don't think we are ready to pay to test it. Prices here.

Quickly create a hundred databases and users

Fri, 2015-06-26 13:00
Do you need a hundred databases and users for training etc. in PostgreSQL?

Swiss Postgres Conference 2015

Fri, 2015-06-26 11:00

On the 26th of June I had the chance to attend the second Swiss Postgres Conference at the HSR Rapperswil. It was packed with interesting sessions.

RAC buffer states: XCUR, SCUR, PI, CI

Thu, 2015-06-25 13:43

In RAC, blocks are copied across instances by the Global Cache Service. In single instance, we have only two statuses: CR for consistent read clones where undo is applied, and CUR for the current version that can be modified (then being a dirty block). It's a bit more complex in RAC. Here is a brief example to show the buffer statuses in the Global Cache.

SCUR: shared current

I connect to one instance (I have a few singleton services: service ONE is on instance 3 and service TWO is on instance 1)

SQL> connect demo/demo@//192.168.78.252/ONE.racattack
Connected.
and I query a row by ROWID in order to read only one block
SQL> select rowid,DEMO.* from DEMO where rowid='&rowid1';
old   1: select rowid,DEMO.* from DEMO where rowid='&rowid1'
new   1: select rowid,DEMO.* from DEMO where rowid='AAAXqxAALAAACUkAAD'

ROWID                      ID          N
------------------ ---------- ----------
AAAXqxAALAAACUkAAD         10         10
Here is the status of the buffer in the buffer cache:
SQL> select inst_id,class#,status,lock_element_addr,dirty,temp,ping,stale,direct,new from gv$bh where objd=(select data_object_id from dba_objects where owner='DEMO' and object_name='DEMO') and status!='free' order by inst_id;

   INST_ID     CLASS# STATUS     LOCK_ELEMENT_ADD D T P S D N
---------- ---------- ---------- ---------------- - - - - - -
         3          1 scur       00000000B9FEA060 N N N N N N
The block has been read from disk by my instance. Without modification it is in SCUR status: it's the current version of the block and can be shared.

SCUR copies

Now connecting to another instance

SQL> connect demo/demo@//192.168.78.252/TWO.racattack
Connected.
and reading the same block
SQL> select rowid,DEMO.* from DEMO where rowid='&rowid1';
old   1: select rowid,DEMO.* from DEMO where rowid='&rowid1'
new   1: select rowid,DEMO.* from DEMO where rowid='AAAXqxAALAAACUkAAD'

ROWID                      ID          N
------------------ ---------- ----------
AAAXqxAALAAACUkAAD         10         10
let's see what I have in my Global Cache:
SQL> select inst_id,class#,status,lock_element_addr,dirty,temp,ping,stale,direct,new from gv$bh where objd=(select data_object_id from dba_objects where owner='DEMO' and object_name='DEMO') and status!='free' order by inst_id,lock_element_addr;

   INST_ID     CLASS# STATUS     LOCK_ELEMENT_ADD D T P S D N
---------- ---------- ---------- ---------------- - - - - - -
         1          1 scur       00000000B0FAADC0 N N N N N N
         3          1 scur       00000000B9FEA060 N N N N N N
Non-modified blocks can be shared: I have a copy on each instance.

XCUR: exclusive current

I'll start a new case, I flush the buffer cache

connecting to the first instance

SQL> connect demo/demo@//192.168.78.252/ONE.racattack
Connected.
I'm now doing a modification with a select for update (which writes the lock in the block, so it's a modification)
SQL> select rowid,DEMO.* from DEMO where rowid='&rowid1' for update;
old   1: select rowid,DEMO.* from DEMO where rowid='&rowid1' for update
new   1: select rowid,DEMO.* from DEMO where rowid='AAAXqxAALAAACUkAAD' for update

ROWID                      ID          N
------------------ ---------- ----------
AAAXqxAALAAACUkAAD         10         10
now the status in buffer cache is different:
SQL> select inst_id,class#,status,lock_element_addr,dirty,temp,ping,stale,direct,new from gv$bh where objd=(select data_object_id from dba_objects where owner='DEMO' and object_name='DEMO') and status!='free' order by inst_id,lock_element_addr;

   INST_ID     CLASS# STATUS     LOCK_ELEMENT_ADD D T P S D N
---------- ---------- ---------- ---------------- - - - - - -
         3          1 cr         00               N N N N N N
         3          1 xcur       00000000B9FEA060 Y N N N N N
So I have two buffers for the same block. The buffer that had been read will not be current anymore because it contains the rows as they were before the modification; it stays in consistent read (CR) status. The modified one is then the current one but cannot be shared: it's the XCUR buffer where modifications will be done.

CR consistent read

Now I'll read it from the second instance

SQL> connect demo/demo@//192.168.78.252/TWO.racattack
Connected.
SQL> select rowid,DEMO.* from DEMO where rowid='&rowid1';
old   1: select rowid,DEMO.* from DEMO where rowid='&rowid1'
new   1: select rowid,DEMO.* from DEMO where rowid='AAAXqxAALAAACUkAAD'

ROWID                      ID          N
------------------ ---------- ----------
AAAXqxAALAAACUkAAD         10         10
The block is read and I have another CR buffer:
SQL> select inst_id,class#,status,lock_element_addr,dirty,temp,ping,stale,direct,new from gv$bh where objd=(select data_object_id from dba_objects where owner='DEMO' and object_name='DEMO') and status!='free' order by inst_id,lock_element_addr;

   INST_ID     CLASS# STATUS     LOCK_ELEMENT_ADD D T P S D N
---------- ---------- ---------- ---------------- - - - - - -
         1          1 cr         00               N N N N N N
         3          1 cr         00               N N N N N N
         3          1 xcur       00000000B9FEA060 Y N N N N N
The CR buffer is at another SCN. A block can have several CR buffers (by default up to 6 per instance).

PI: past image

Let's do a modification from the other instance

SQL> connect demo/demo@//192.168.78.252/TWO.racattack
Connected.
SQL> select rowid,DEMO.* from DEMO where rowid='&rowid1' for update;
old   1: select rowid,DEMO.* from DEMO where rowid='&rowid1' for update
new   1: select rowid,DEMO.* from DEMO where rowid='AAAXqxAALAAACUkAAD' for update

ROWID                      ID          N
------------------ ---------- ----------
AAAXqxAALAAACUkAAD         10         10
My modification must be done on the current version, which must be shipped to my instance
SQL> select inst_id,class#,status,lock_element_addr,dirty,temp,ping,stale,direct,new from gv$bh where objd=(select data_object_id from dba_objects where owner='DEMO' and object_name='DEMO') and status!='free' order by inst_id,lock_element_addr;

   INST_ID     CLASS# STATUS     LOCK_ELEMENT_ADD D T P S D N
---------- ---------- ---------- ---------------- - - - - - -
         1          1 cr         00               N N N N N N
         1          1 cr         00               N N N N N N
         1          1 xcur       00000000B0FAADC0 Y N N N N N
         3          1 cr         00               N N N N N N
         3          1 pi         00000000B9FEA060 Y N N N N N
and the previous current version remains as a PI - past image. It cannot be used for consistent reads, but it is kept for recovery: if the current block is lost, redo can be applied to the past image to recover it. See Jonathan Lewis's explanation.

Checkpoint

As the past images are there in case of recovery, they are not needed once an instance has checkpointed the current block.

SQL> connect sys/oracle@//192.168.78.252/ONE.racattack as sysdba
Connected.
SQL> alter system checkpoint;
System altered.
After the checkpoint on the instance that has the XCUR, there is no dirty buffer in any instance:
SQL> select inst_id,class#,status,lock_element_addr,dirty,temp,ping,stale,direct,new from gv$bh where objd=(select data_object_id from dba_objects where owner='DEMO' and object_name='DEMO') and status!='free' order by inst_id,lock_element_addr;

   INST_ID     CLASS# STATUS     LOCK_ELEMENT_ADD D T P S D N
---------- ---------- ---------- ---------------- - - - - - -
         1          1 cr         00               N N N N N N
         1          1 cr         00               N N N N N N
         1          1 xcur       00000000B0FAADC0 N N N N N N
         3          1 cr         00               N N N N N N
         3          1 cr         00               N N N N N N
The PI became a consistent read (CR) buffer.

Summary

Here are the states we have seen here:

XCUR: current version of the block - holding an exclusive lock for it

SCUR: current version of the block that can be shared because no modifications were done

CR: only valid for consistent read, after applying the necessary undo to get it back to the required SCN

PI: past image of a modified current block, kept until the latest version is checkpointed

and the other possible states:

FREE: The buffer is not currently in use.

READ: when the block is being read from disk

MREC: when the block is being recovered for media recovery

IREC: when the block is being recovered for crash recovery
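To get a quick overview of these states for one object, you can count the buffers per instance and status in gv$bh (same DEMO table as in the examples above):

select inst_id, status, count(*)
  from gv$bh
 where objd=(select data_object_id from dba_objects where owner='DEMO' and object_name='DEMO')
 group by inst_id, status
 order by inst_id, status;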

another way to list invalid objects

Thu, 2015-06-25 01:00

How often did I type a query like this to list the invalid objects in a database?

select count(*)
  from dba_objects
 where status <> 'VALID';
    -- and owner in/not in ...

Today I learned another way to do the same.