Yann Neuhaus

dbi services technical blog

Generate Azure VM with Resource Manager deployment in PowerShell

Wed, 2016-08-17 09:04

Recently, a new way to manage the Azure infrastructure appeared: Resource Manager. It brings many advantages over the classic deployment model.
The differences between these two deployment models will not be covered in this blog because that is not its goal, and a very good Microsoft article on the subject already exists.

In this blog, we will generate a new Azure Windows virtual machine using Resource Manager deployment, with PowerShell run from on-premises.

Remember, only RM objects can be listed with RM cmdlets! Conversely, only Classic objects can be listed with Classic cmdlets!

We can connect automatically to Azure Account with this command:
Select-AzureRmProfile -Path "C:\temp\AzureCert.json"

But to generate this profile file in the first place, we need to connect manually to our Azure account at least once, as follows:
Add-AzureRmAccount -SubscriptionId "<YourSubscriptionID>"

Enter your personal credentials and then run the following command:
Save-AzureRmProfile -Path "C:\temp\AzureCert.json"

If you want to navigate through your different attached Azure subscriptions, use the Get-AzureRmSubscription and Select-AzureRmSubscription cmdlets.

To obtain the different existing Azure Locations:
Get-AzureRmLocation | Select DisplayName

For the rest of this blog, we will work in this specific Azure location:
$location = "West Europe"

Hardware Profile

To list all the available resource groups:
Get-AzureRmResourceGroup | Select ResourceGroupName, Location

And select your specific Azure Resource Group:
$resourceGroupName = (Get-AzureRmResourceGroup).ResourceGroupName[0]

To choose the correct VM size, list all the available Azure VM sizes:
Get-AzureRmVMSize -location $location | Select Name, NumberOfCores, MemoryInMB
$vmSize = "Standard_A3"

And initialize the VM object to build:
$vm = New-AzureRMVMConfig -Name $vmname -VMSize $vmsize

Image Profile

Now we want to select a specific image available from a publisher in Azure. In this case, we will choose the latest SQL Server 2016 Enterprise edition image.
The following steps describe how to find all the elements needed to select the correct available image.

Select all publishers from a specific Azure Location:
Get-AzureRmVMImagePublisher -Location $location | Select PublisherName
$publisher = "MicrosoftSQLServer"

Now select all offers from a specific Azure Publisher:
Get-AzureRmVMImageOffer -Location $location -PublisherName $publisher | Select Offer
$offer = "SQL2016-WS2012R2"

Then select all Skus from a specific Azure Offer:
Get-AzureRmVMImageSku -Location $location -PublisherName $publisher -Offer $offer | Select Skus
$skus = "Enterprise"

Finally choose your version:
(Get-AzureRmVMImage -Location $location -PublisherName $publisher -Offer $offer -Skus $skus).version

To obtain the latest version of the image:
$version = (Get-AzureRmVMImage -Location $location -PublisherName $publisher -Offer $offer -Skus $skus | Sort-Object -Property Version -Descending).Version[0]

Add the image profile to the existing VM object:
$vm = Set-AzureRmVMSourceImage -VM $vm -PublisherName $publisher -Offer $offer -Skus $skus -Version $version

OS Profile

According to the Image Profile, the Virtual Machine will be a Windows Server. So enter the specifications as follows:
$username = "dbi"
$password = ConvertTo-SecureString 'B3stPa$$w0rd3v3r' -AsPlainText -Force
$cred = New-Object System.Management.Automation.PSCredential ($username, $password)
$vm = Set-AzureRmVMOperatingSystem -VM $VM -ComputerName "Artanis" -Windows -Credential $cred -ProvisionVMAgent

Disk Profile

As the VM will be created from an Azure Image, we need to specify a location and a name for the OS disk.

To list all your available Azure Storage Accounts, run this command:
Get-AzureRmStorageAccount | Select StorageAccountName, Location

To list the different containers available in your Azure Storage:
(Get-AzureRmStorageAccount | Get-AzureStorageContainer).CloudBlobContainer

And now add a disk profile to the existing VM:
$diskLocation = "https://<accountStorageName>.blob.core.windows.net/vhds/"
$vm = Set-AzureRmVMOSDisk -VM $vm -Name "artanisVHDOS.vhd" -VhdUri ($diskLocation+"artanisVHDOS.vhd") -CreateOption FromImage

IP Profile

Here is an example of Network configuration:
$subnet = New-AzureRmVirtualNetworkSubnetConfig -Name "CloudSubnet" -AddressPrefix "10.0.64.0/24"
$ain = New-AzureRmVirtualNetwork -Name "VirtualNetwork" -ResourceGroupName $resourceGroupName -Location $location -AddressPrefix "10.0.0.0/16" -Subnet $subnet
$pip = New-AzureRmPublicIpAddress -Name "AzurePublicIP" -ResourceGroupName $resourceGroupName -AllocationMethod Dynamic -Location $location
$nic = New-AzureRMNetworkInterface -Name "AzureNetInterface" -ResourceGroupName $resourceGroupName -Location $location -SubnetId $ain.Subnets[0].Id -PublicIpAddressId $pip.Id
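One step appears to be missing here before the deployment: attaching the network interface to the VM configuration. A minimal sketch, assuming the $vm and $nic objects built above:
$vm = Add-AzureRmVMNetworkInterface -VM $vm -Id $nic.Id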

Conclusion: VM generation

Now that we have entered all the different profiles required, we can generate the new Azure VM:
$azurevm = New-AzureRmVM -ResourceGroupName $resourceGroupName -Location $location -VM $vm

Use the “Get-AzureRmVM” cmdlet to list all available VMs.

To download the remote desktop file to connect to this new virtual machine, use the following command:
Get-AzureRmRemoteDesktopFile -ResourceGroupName $resourceGroupName -Name $vmName -LocalPath "C:\Temp\Artanis.rdp"

With all these commands, you can see how simple it is to automate the generation of a new virtual machine in Azure. Moreover, you have probably noticed that the construction of the VM object (with its different profiles) is similar to the Hyper-V structure.

I hope it helps you ;-)

 


Unplugged pluggable databases

Mon, 2016-08-15 08:35

When Oracle Multitenant came out in 12c, with pluggable databases, it was easy to draw them as USB sticks that you can plug and unplug to/from your Container Database (CDB). I don’t like this because it gives the false idea that an unplugged database is detached from the container.

Containers

In the Oracle documentation, the Concepts book, the description of the multitenant architecture starts with an illustration of a CDB with two PDBs, whose text description begins like this:
This graphic depicts a CDB as a cylinder. Inside the CDB is a box labeled “Root (CDB$ROOT).” Plugged into the box is the Seed (CDB$SEED), hrpdb, and salespdb.

Let me list what I don’t like with this description:

  1. There are 5 containers here but 3 ways to draw them. The CDB itself is a container (CON_ID=0) and is a cylinder. The CDB$ROOT (CON_ID=1) is a container and is a box. The PDB$SEED and the user PDBs are cylinders with a USB plug.
  2. The CDB$ROOT does not look like a database (cylinder). However, physically it's the same: SYSTEM, SYSAUX, UNDO, TEMP tablespaces.
  3. The PDB$SEED (CON_ID=2) looks like it is pluggable (USB stick) but you never unplug the PDB$SEED.
  4. The USB plug is plugged inside the CDB$ROOT. That's wrong. All containers inside a CDB are at the same level and are ‘plugged’ into the CDB (CON_ID=0), not into the CDB$ROOT (CON_ID=1). They are contained by the CDB and, if they are plugged somewhere, it's in the CDB controlfile. The root is a root for metadata and object links, not for the whole PDBs.

If I had to show pluggable databases as USB sticks it would be like that:
(Illustration: pluggable databases drawn as USB sticks)

Here CDB$ROOT is a container like the pluggable databases, except that you cannot unplug it. PDB$SEED is a pluggable database, but one that you never unplug. The CDB is a container but does not look like a database: it's the controlfile and the instance, and there are no datafiles directly attached to the CDB.

Unplugged

However with this illustration, we can think that an unplugged pluggable database is detached from the CDB, which is wrong.

SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
4 PDB READ WRITE NO
SQL> alter pluggable database PDB close;
Pluggable database altered.
 
SQL> alter pluggable database PDB unplug into '/tmp/PDB.xml';
Pluggable database altered.
 
SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
4 PDB MOUNTED

Here PDB is unplugged, but it still belongs to the CDB.

The CDB controlfile still addresses all the PDB datafiles:

RMAN> report schema;
 
using target database control file instead of recovery catalog
Report of database schema for database with db_unique_name CDB
 
List of Permanent Datafiles
===========================
File Size(MB) Tablespace RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1 829 SYSTEM YES /u02/app/oracle/oradata/CDB/system01.dbf
3 1390 SYSAUX NO /u02/app/oracle/oradata/CDB/sysaux01.dbf
4 4771 UNDOTBS1 YES /u02/app/oracle/oradata/CDB/undotbs01.dbf
5 270 PDB$SEED:SYSTEM NO /u02/app/oracle/oradata/CDB/pdbseed/system01.dbf
6 2 USERS NO /u02/app/oracle/oradata/CDB/users01.dbf
7 540 PDB$SEED:SYSAUX NO /u02/app/oracle/oradata/CDB/pdbseed/sysaux01.dbf
12 280 PDB:SYSTEM NO /u02/app/oracle/oradata/CDB/3969397A986337DCE053B6CDC40AC61C/datafile/o1_mf_system_ctcxz29m_.dbf
13 570 PDB:SYSAUX NO /u02/app/oracle/oradata/CDB/3969397A986337DCE053B6CDC40AC61C/datafile/o1_mf_sysaux_ctcxz2bb_.dbf

The datafiles 12 and 13 are the ones from my unplugged PDB, still known and managed by the CDB.

Backup

An unplugged PDB has data, and data should have backups. Who is responsible for the unplugged PDB backups? It’s still the CDB:

RMAN> backup database;
 
...
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00013 name=/u02/app/oracle/oradata/CDB/3969397A986337DCE053B6CDC40AC61C/datafile/o1_mf_sysaux_ctcxz2bb_.dbf
input datafile file number=00012 name=/u02/app/oracle/oradata/CDB/3969397A986337DCE053B6CDC40AC61C/datafile/o1_mf_system_ctcxz29m_.dbf
...

The unplugged PDB is not detached at all: the CDB still references its files and is responsible for them. This is very different from an unplugged USB stick, which has no link anymore with the host it was plugged into before.

Backup optimization

If you wonder whether it's a good idea to back up an unplugged PDB with each CDB backup, don't worry. RMAN knows that it is in a state where it cannot be modified (like read-only tablespaces) and does not back it up each time. Of course, you need to have BACKUP OPTIMIZATION configured to ON.
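If it is not already the case, it can be enabled with a single persistent RMAN configuration command:

RMAN> CONFIGURE BACKUP OPTIMIZATION ON;

With that in place, a second backup of the same CDB skips the datafiles of the unplugged PDB: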

RMAN> backup database;
 
Starting backup at 15-AUG-16
using channel ORA_DISK_1
skipping datafile 12; already backed up 2 time(s)
skipping datafile 13; already backed up 2 time(s)
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set

Unplug and DROP

From what we have seen, an unplugged PDB is like a closed PDB. There's a difference though: an unplugged PDB is closed forever. You cannot open it again:

SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
4 PDB MOUNTED
SQL> alter pluggable database PDB open;
alter pluggable database PDB open
*
ERROR at line 1:
ORA-65086: cannot open/close the pluggable database
 
SQL> host oerr ora 65086
65086, 00000, "cannot open/close the pluggable database"
// *Cause: The pluggable database has been unplugged.
// *Action: The pluggable database can only be dropped.

So, if you want to keep the USB stick analogy, unplugged does not mean ‘physically unplugged’ but something like what you should do before removing a USB stick:
(Screenshot: safely ejecting a USB stick before removing it)

In summary:

ALTER PLUGGABLE DATABASE … UNPLUG is like the logical ‘eject’ you do to be sure that what you will remove physically was closed forever.
DROP PLUGGABLE DATABASE … KEEP DATAFILES is the physical removal from the CDB

Because DROP is the only thing that can be done on an unplugged PDB, SQL Developer does both when you click on ‘unplug':
(Screenshot: SQL Developer unplug dialog, with the drop option)

The idea of dropping it just after the unplug is probably there to prevent the risk of dropping it ‘including datafiles’ after it has been plugged into another CDB, because then it would be lost.
However, keep in mind that once unplugged and dropped, nobody will back up the PDB datafiles until it is plugged into a new CDB.

Read-Only

There’s a last one more difference. A USB stick can be read-only. A plugged PDB cannot. You may want to share a database from a read-only filesystem, like you can do with transportable tablespaces. but you can’t:

SQL> drop pluggable database PDB keep datafiles;
Pluggable database dropped.
 
SQL> create pluggable database PDB using '/tmp/PDB.xml';
Pluggable database created.
 
SQL> alter pluggable database PDB open read only;
alter pluggable database PDB open read only
*
ERROR at line 1:
ORA-65085: cannot open pluggable database in read-only mode

The plugged PDB must be opened in read/write mode at least once:
SQL> host oerr ora 65085
65085, 00000, "cannot open pluggable database in read-only mode"
// *Cause: The pluggable database has been created and not opened.
// *Action: The pluggable database needs to be opened in read/write or
// restricted mode first.

And only then, it can be opened read-only:

SQL> alter pluggable database PDB open;
Pluggable database altered.
 
SQL> alter pluggable database PDB close;
Pluggable database altered.
 
SQL> alter pluggable database PDB open read only;
Pluggable database altered.

So what…

Here is the way I visualize pluggable databases:

(Illustration: the containers and their CON_IDs)

Just a bunch of tablespaces, referenced by the CDB controlfile and grouped by self-contained containers. CDB$ROOT cannot be cloned nor unplugged. PDB$SEED can be cloned but not unplugged (but it’s a PDB). Other PDBs can be cloned and unplugged.
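If you want to see that list of containers with their CON_IDs, a simple query from the root does it (shown here only as an illustration, run on 12c):

SQL> select con_id, name, open_mode from v$containers order by con_id;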

I’ll talk about multitenant at Oracle Open World, DOAG Conference and UKOUG TECH16.
There's also a book coming, probably early 2017 (depending on 12.2 availability).

 


Windows: RMAN backup to a shared disk

Mon, 2016-08-08 11:09

In this blog, I will explain how to export RMAN backups to a shared disk belonging to a domain.

Ensuring data safety is one of the administrator's main tasks:

  • Implementing protection for the database's sensitive files:
    • Control file
    • Redo log files
  • Setting up a backup/recovery strategy:
    • Adapted to the company's constraints
    • Tested and documented.

To read up on the different backup and restore techniques, I suggest taking a look at our Workshop Oracle Backup Recovery page.

Many of you certainly use Windows servers to administer Oracle databases; however, administering them in a Windows environment is not always as straightforward as on Linux.
That is why I propose a backup solution that exports your backups to a shared disk or a storage server, from which the backups are then copied daily to disk or tape.

Here are the steps to follow:

  • Check the (read/write) permissions on the shared disk
  • Configure the Oracle service and the listener in "services.msc" with the service user
  • Make sure the service account password never expires and that the account is never locked or deleted
  • Restart the services (Oracle and listener)
  • Test the backups with RMAN

Go to "services.msc" and change the "OracleService_[instance_name]" service as well as the listener service so that they run as the service user that runs your databases.

Right-click and open "Properties", go to the "Log On" tab, then select "This account".
Then click "Browse", type the name of the service user, and finally click "Check Names" to find the user in Active Directory.

(Screenshots: service properties, "Log On" tab and account selection)

Restart the Oracle and listener services.
WARNING: Do not restart the services if the database is in production!
(Screenshot: restarting the Oracle and listener services)

Test the RMAN backups on the shared drive:
(Screenshot: RMAN backup test on the shared drive)
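For reference, such a test can be as simple as pointing the RMAN FORMAT clause to the UNC path; the share name below is of course hypothetical and the service user must have read/write access to it:

RMAN> backup database format '\\fileserver\oracle_backups\%d_%U.bkp';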

Of course, it is preferable to script the backups via the Task Scheduler so that they run automatically. I will cover this next step in a second blog post.

 


Multitenant internals: object links on fixed tables

Sun, 2016-08-07 11:13

The previous post partly answered the original question (why an object link to V$SESSION is refreshed only every 30 seconds): recursive queries on sharing=object views. Now let's see what is different with fixed tables.

Disclaimer: this is research only. Don’t do that anywhere else than a lab. This is implicit when the title is ‘internals’.

Result cache

When a query on a sharing=object view is executed from a PDB, the session switches to CDB$ROOT to run a recursive query that gets the rows. This query uses the result cache, by adding the following hint:
/*+ RESULT_CACHE (SYSOBJ=TRUE) */

This enables the result cache for the rows fetched by this query, even for system objects. The ‘SYSOBJ=TRUE’ is there because the "_rc_sys_obj_enabled" parameter defaults to true.
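If you want to check this default in your own lab (an undocumented parameter, so research only), one common way is to query the underlying X$ tables as SYS:

SQL> select a.ksppinm, b.ksppstvl from x$ksppi a, x$ksppcv b
  2  where a.indx = b.indx and a.ksppinm = '_rc_sys_obj_enabled';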

Here is the result cache from the previous post (I flushed the result cache just before the second run because in 12c hard parsing also makes heavy use of the result cache):


16:34:00 SQL> select con_id,sql_id,rows_processed,plan_hash_value,executions,parse_calls,fetches,buffer_gets,sql_text from v$sql where plan_hash_value in (3598352655,3551671056) order by last_active_time;
 
CON_ID SQL_ID ROWS_PROCESSED PLAN_HASH_VALUE EXECUTIONS PARSE_CALLS FETCHES BUFFER_GETS SQL_TEXT
---------- ------------- -------------- --------------- ---------- ----------- ---------- ----------- --------------------------------------------------------------------------------
1 350gg6247sfa6 200 3598352655 2 2 2 26 SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM NO_OBJECT_LINK("SYS"."DEMOV
3 bynmh7xm4bf54 0 3598352655 0 5 0 51 SELECT * FROM NO_OBJECT_LINK("SYS"."DEMOV")
3 duy45bn72jr35 200 3551671056 2 2 16 269 select id from DEMOV where num<=100
16:34:00 SQL> select type,status,name,row_count from v$result_cache_objects order by row_count desc fetch first 10 rows only;
 
TYPE STATUS NAME ROW_COUNT
---------- --------- ------------------------------------------------------------------------------------------------------------------------ ----------
Result Published SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM NO_OBJECT_LINK("SYS"."DEMOV") "DEMOV" WHERE "DEMOV"."NUM"<=100 100
Dependency Published SYS.DEMOT 0
Dependency Published SYS.DEMOV 0

As with regular result cache, there is dependency tracking: as soon as the underlying table has some modification, the cache will be invalidated. So this query is always guaranteed to get fresh results.

Invalidation

I did the same when deleting half of the rows before the second execution in order to invalidate the result cache:


16:43:46 SQL> select con_id,sql_id,rows_processed,plan_hash_value,executions,parse_calls,fetches,buffer_gets,sql_text from v$sql where plan_hash_value in (3598352655,3551671056) order by last_active_time;
 
CON_ID SQL_ID ROWS_PROCESSED PLAN_HASH_VALUE EXECUTIONS PARSE_CALLS FETCHES BUFFER_GETS SQL_TEXT
---------- ------------- -------------- --------------- ---------- ----------- ---------- ----------- --------------------------------------------------------------------------------
1 350gg6247sfa6 150 3598352655 2 2 2 26 SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM NO_OBJECT_LINK("SYS"."DEMOV
3 bynmh7xm4bf54 0 3598352655 0 5 0 51 SELECT * FROM NO_OBJECT_LINK("SYS"."DEMOV")
3 duy45bn72jr35 150 3551671056 2 2 13 269 select id from DEMOV where num<=100
16:43:46 SQL> select type,status,name,row_count from v$result_cache_objects order by row_count desc fetch first 10 rows only;
 
TYPE STATUS NAME ROW_COUNT
---------- --------- ------------------------------------------------------------------------------------------------------------------------ ----------
Result Invalid SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM NO_OBJECT_LINK("SYS"."DEMOV") "DEMOV" WHERE "DEMOV"."NUM"<=100 100
Result Published SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM NO_OBJECT_LINK("SYS"."DEMOV") "DEMOV" WHERE "DEMOV"."NUM"<=100 50
Dependency Published SYS.DEMOT 0
Dependency Published SYS.DEMOV 0

I have 100 rows from the first run, now invalidated, and then 50 rows from the second one.

Note that I get the same result when I set "_disable_cdb_view_rc_invalidation"=true. Sometimes the behavior of undocumented parameters cannot be guessed from their name alone.

Fixed tables

I’ve run the same testcase but with the following definition of DEMOV:

create view DEMOV sharing=object as select saddr id, rownum num from V$SESSION;

Here is the trace of the recursive query run in CDB$ROOT, at first execution:

PARSING IN CURSOR #140436732146672 len=112 dep=1 uid=0 oct=3 lid=0 tim=769208810641 hv=3298783355 ad='108ee92f0' sqlid=' 10qf9kb29yw3v'
SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE SHELFLIFE=30) */ ID,NUM FROM "SYS"."DEMOV" "DEMOV" WHERE "DEMOV"."NUM"<=100
END OF STMT
PARSE #140436732146672:c=4000,e=10614,p=0,cr=6,cu=0,mis=1,r=0,dep=1,og=4,plh=350654732,tim=769208810640
EXEC #140436732146672:c=0,e=17,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=350654732,tim=769208810701
FETCH #140436732146672:c=1000,e=961,p=0,cr=0,cu=0,mis=0,r=53,dep=1,og=4,plh=350654732,tim=769208811687
STAT #140436732146672 id=1 cnt=53 pid=0 pos=1 obj=0 op='RESULT CACHE byq3fbkawmkm34gtfk1csvwv52 (cr=0 pr=0 pw=0 time=877 us)'
STAT #140436732146672 id=2 cnt=53 pid=1 pos=1 obj=98258 op='VIEW DEMOV (cr=0 pr=0 pw=0 time=655 us cost=0 size=171 card=9)'
STAT #140436732146672 id=3 cnt=53 pid=2 pos=1 obj=0 op='COUNT (cr=0 pr=0 pw=0 time=652 us)'
STAT #140436732146672 id=4 cnt=53 pid=3 pos=1 obj=0 op='NESTED LOOPS (cr=0 pr=0 pw=0 time=597 us cost=0 size=306 card=9)'
STAT #140436732146672 id=5 cnt=53 pid=4 pos=1 obj=0 op='NESTED LOOPS (cr=0 pr=0 pw=0 time=436 us cost=0 size=270 card=9)'
STAT #140436732146672 id=6 cnt=53 pid=5 pos=1 obj=0 op='FIXED TABLE FULL X$KSLWT (cr=0 pr=0 pw=0 time=219 us cost=0 size=352 card=44)'
STAT #140436732146672 id=7 cnt=53 pid=5 pos=2 obj=0 op='FIXED TABLE FIXED INDEX X$KSUSE (ind:1) (cr=0 pr=0 pw=0 time=101 us cost=0 size=22 card=1)'
STAT #140436732146672 id=8 cnt=53 pid=4 pos=2 obj=0 op='FIXED TABLE FIXED INDEX X$KSLED (ind:2) (cr=0 pr=0 pw=0 time=70 us cost=0 size=4 card=1)'

The difference here is that SHELFLIFE=30 has been added to the generated result cache hint.

The second run has very short parse time (c=0) because it’s a soft parse but you also see very short fetch time (c=0) because it’s a cache hit:

PARSING IN CURSOR #140436733602136 len=112 dep=1 uid=0 oct=3 lid=0 tim=769208821904 hv=3298783355 ad='108ee92f0' sqlid=' 10qf9kb29yw3v'
SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE SHELFLIFE=30) */ ID,NUM FROM "SYS"."DEMOV" "DEMOV" WHERE "DEMOV"."NUM"<=100
END OF STMT
PARSE #140436733602136:c=0,e=29,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=350654732,tim=769208821904
EXEC #140436733602136:c=0,e=13,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=350654732,tim=769208821955
FETCH #140436733602136:c=0,e=18,p=0,cr=0,cu=0,mis=0,r=53,dep=1,og=4,plh=350654732,tim=769208821990

When I look at the result cache, there were no invalidations:
TYPE STATUS NAME ROW_COUNT
---------- --------- ------------------------------------------------------------------------------------------------------------------------ ----------
Result Published SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE SHELFLIFE=30) */ ID,NUM FROM "SYS"."DEMOV" "DEMOV" WHERE "DEMOV"."NUM"<=100 55
Dependency Published SYS.DEMOV 0

When SHELFLIFE is set in a result cache hint, there is no dependency tracking. I’ve described RESULT_CACHE hint expiration options in a previous post.

The V$ views are based on fixed tables, structures in memory, where there is no dependency tracking possibility. This is probably why the recursive query for sharing=object views uses a SHELFLIFE instead.

This means that if you create an object link view on a fixed table, the query will show the same result for subsequent executions during 30 seconds.

“_cdb_view_rc_shelflife”

I've tested a sharing=object view on V$SESSION as an answer to a previous blog comment. My query selects MAX(LAST_CALL_ET), which is supposed to increase every second for inactive sessions. And we see that the result changes only every 30 seconds.

Those 30 seconds are controlled by "_cdb_view_rc_shelflife". Here is the same test where I set "_cdb_view_rc_shelflife" to 5 seconds:


15:31:48 SQL> alter session set "_cdb_view_rc_shelflife"=5;
Session altered.
 
15:31:48 SQL> set serveroutput on
15:31:48 SQL> declare
15:31:48 2 x varchar2(100);
15:31:48 3 begin
15:31:48 4 for i in 1..60 loop
15:31:48 5 dbms_lock.sleep(1);
15:31:48 6 select to_char(current_timestamp)||' --> '||max(last_call_et) into x from DEMOV;
15:31:48 7 dbms_output.put_line(x);
15:31:48 8 end loop;
15:31:48 9 end;
15:31:48 10 /
 
07-AUG-16 03.31.49.852081 PM +00:00 --> 775144
07-AUG-16 03.31.50.863742 PM +00:00 --> 775144
07-AUG-16 03.31.51.863753 PM +00:00 --> 775144
07-AUG-16 03.31.52.864697 PM +00:00 --> 775144
07-AUG-16 03.31.53.864706 PM +00:00 --> 775144
07-AUG-16 03.31.54.864726 PM +00:00 --> 775144
07-AUG-16 03.31.55.864669 PM +00:00 --> 775150
07-AUG-16 03.31.56.864711 PM +00:00 --> 775150
07-AUG-16 03.31.57.864754 PM +00:00 --> 775150
07-AUG-16 03.31.58.864702 PM +00:00 --> 775150
07-AUG-16 03.31.59.864711 PM +00:00 --> 775150
07-AUG-16 03.32.00.864779 PM +00:00 --> 775150
07-AUG-16 03.32.01.865710 PM +00:00 --> 775156
07-AUG-16 03.32.02.866738 PM +00:00 --> 775156
07-AUG-16 03.32.03.866719 PM +00:00 --> 775156
07-AUG-16 03.32.04.866787 PM +00:00 --> 775156
07-AUG-16 03.32.05.866758 PM +00:00 --> 775156
07-AUG-16 03.32.06.866805 PM +00:00 --> 775156
07-AUG-16 03.32.07.867738 PM +00:00 --> 775162
07-AUG-16 03.32.08.868743 PM +00:00 --> 775162
07-AUG-16 03.32.09.868727 PM +00:00 --> 775162
07-AUG-16 03.32.10.868724 PM +00:00 --> 775162
07-AUG-16 03.32.11.868758 PM +00:00 --> 775162
07-AUG-16 03.32.12.869763 PM +00:00 --> 775167
07-AUG-16 03.32.13.870741 PM +00:00 --> 775167
07-AUG-16 03.32.14.870742 PM +00:00 --> 775167
07-AUG-16 03.32.15.870721 PM +00:00 --> 775167
07-AUG-16 03.32.16.870734 PM +00:00 --> 775167
07-AUG-16 03.32.17.870883 PM +00:00 --> 775167
07-AUG-16 03.32.18.872741 PM +00:00 --> 775173
07-AUG-16 03.32.19.873837 PM +00:00 --> 775173

And here is the same test after setting:

SQL> exec dbms_result_cache.bypass(true);

I've not tested it, but I expect the same behavior in Standard Edition, where the result cache is disabled:


07-AUG-16 03.43.32.158741 PM +00:00 --> 775846
07-AUG-16 03.43.33.185793 PM +00:00 --> 775847
07-AUG-16 03.43.34.186633 PM +00:00 --> 775848
07-AUG-16 03.43.35.186738 PM +00:00 --> 775849
07-AUG-16 03.43.36.187696 PM +00:00 --> 775850
07-AUG-16 03.43.37.188684 PM +00:00 --> 775851
07-AUG-16 03.43.38.188692 PM +00:00 --> 775852
07-AUG-16 03.43.39.189755 PM +00:00 --> 775853
07-AUG-16 03.43.40.190697 PM +00:00 --> 775854
07-AUG-16 03.43.41.191763 PM +00:00 --> 775855
07-AUG-16 03.43.42.192706 PM +00:00 --> 775856
07-AUG-16 03.43.43.193736 PM +00:00 --> 775857

Conclusion

Don't be afraid. There are very few sharing=object views in the dictionary, and only a few of them have dependencies on fixed tables:

SQL> select owner,name,referenced_name from dba_dependencies
where (referenced_owner,referenced_name) in (select 'SYS',view_name from v$fixed_view_definition union select 'SYS',name from v$fixed_table)
and (owner,name,type) in (select owner,object_name,object_type from dba_objects where sharing='OBJECT LINK')
;
 
OWNER NAME REFERENCED_NAME
------------------------------ ------------------------------ ------------------------------
SYS INT$DBA_HIST_SQLSTAT X$MODACT_LENGTH
SYS INT$DBA_HIST_ACT_SESS_HISTORY X$MODACT_LENGTH
SYS INT$DBA_OUTSTANDING_ALERTS X$KELTGSD
SYS INT$DBA_OUTSTANDING_ALERTS X$KELTSD
SYS INT$DBA_OUTSTANDING_ALERTS X$KELTOSD
SYS INT$DBA_ALERT_HISTORY X$KELTGSD
SYS INT$DBA_ALERT_HISTORY X$KELTSD
SYS INT$DBA_ALERT_HISTORY X$KELTOSD
SYS INT$DBA_ALERT_HISTORY_DETAIL X$KELTGSD
SYS INT$DBA_ALERT_HISTORY_DETAIL X$KELTSD
SYS INT$DBA_ALERT_HISTORY_DETAIL X$KELTOSD
SYS DEMOV GV$SESSION
 
6 rows selected.

I’ve described how AWR views are stacked onto each other in a previous post.

And don’t worry, you don’t need to have a fresh view of those X$ tables. As an example, behind DBA_HIST_ACTIVE_SES_HISTORY the fixed table X$MODACT_LENGTH holds only the length of module and action strings:
SQL> select * from X$MODACT_LENGTH;
 
ADDR INDX INST_ID CON_ID KSUMODLEN KSUACTLEN
---------------- ---------- ---------- ---------- ---------- ----------
00007FF2EF280920 0 1 0 48 32

And the others (X$KELTSD, X$KELTGSD, X$KELTOSD) are the structures behind V$ALERT_TYPES that are not supposed to change.

So don’t panic. The multitenant architecture has some strange implementation stuff, but mostly harmless…

 


Multitenant internals: how object links are parsed/executed

Sun, 2016-08-07 05:33

I had a comment on object links internals when creating a sharing=object view on GV$SESSION. Before posting about this specific case, I realized that I’ve never explained how a query on an object link is run on the CDB$ROOT container.

Data link testcase

Here is how I create the DEMOT table and DEMOV view as object link.

14:48:58 SQL> connect / as sysdba
Connected.
14:48:59 SQL> alter session set container=CDB$ROOT;
Session altered.
14:48:59 SQL> alter session set "_oracle_script"=true;
Session altered.
14:48:59 SQL> create table DEMOT as select rownum id, rownum num from xmltable('10 to 1000000');
Table created.
SQL> exec dbms_stats.gather_table_stats('','DEMOT');
14:49:00 SQL> create view DEMOV sharing=object as select * from DEMOT;
View created.
14:49:00 SQL> alter session set container=PDB;
Session altered.
14:49:00 SQL> create table DEMOT as select rownum id, rownum num from xmltable('1 to 1');
Table created.
SQL> exec dbms_stats.gather_table_stats('','DEMOT');
14:49:01 SQL> create view DEMOV sharing=object as select * from DEMOT;
View created.
14:49:01 SQL> alter session set "_oracle_script"=false;
Session altered.

And I run the following query twice (it is easier to look at the trace without hard parsing):

14:49:02 SQL> select id from DEMOV where num<=100;
...
100 rows selected.

SQL_TRACE

I’ll detail the sql_trace of the last run.

First, the query is parsed in our PDB:

PARSING IN CURSOR #140360238365672 len=35 dep=0 uid=0 oct=3 lid=0 tim=687080512770 hv=237558885 ad='10cf55ae8' sqlid=' duy45bn72jr35'
select id from DEMOV where num<=100
END OF STMT
PARSE #140360238365672:c=0,e=86,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3055399777,tim=687080512769
EXEC #140360238365672:c=0,e=61,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3055399777,tim=687080512872

There was no hard parse for this second parse call (mis=0 means no ‘Misses in library cache during parse’). Execution occurred but no fetch yet.

At that point, the session switches to CDB$ROOT container (you have to trust me as there is no information about it in the trace file in 12.1)

PARSING IN CURSOR #140360238643592 len=99 dep=1 uid=0 oct=3 lid=0 tim=687080513015 hv=2967959178 ad='107be5590' sqlid=' 3b9x1rasffxna'
SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM "SYS"."DEMOV" "DEMOV" WHERE "DEMOV"."NUM"<=100
END OF STMT
PARSE #140360238643592:c=0,e=33,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=3598352655,tim=687080513014
EXEC #140360238643592:c=0,e=16,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=3598352655,tim=687080513084
FETCH #140360238643592:c=0,e=32,p=0,cr=0,cu=0,mis=0,r=100,dep=1,og=4,plh=3598352655,tim=687080513137
STAT #140360238643592 id=1 cnt=100 pid=0 pos=1 obj=0 op='RESULT CACHE cgn1rxw6ycznac8fyzfursq2z6 (cr=0 pr=0 pw=0 time=12 us)'
STAT #140360238643592 id=2 cnt=0 pid=1 pos=1 obj=98422 op='TABLE ACCESS FULL DEMOT (cr=0 pr=0 pw=0 time=0 us)'

We have here a recursive query (dep=1) that is run on the view in CDB$ROOT. It's not the same query as ours: the FROM clause is our sharing=object view, the WHERE clause contains the predicates that apply to it, and the SELECT clause the columns that we need (ID was in my SELECT clause and NUM was in my WHERE clause). The query is parsed, executed, the 100 rows are fetched and, interestingly, the result goes to the result cache. Yes, if you query V$RESULT_CACHE_OBJECTS in a CDB you will see a lot of objects:

#multitenant object links uses result cache internally. AWR views are object links. pic.twitter.com/V5IMW2qQx2

— Franck Pachot (@FranckPachot) August 7, 2016

If you look at the FETCH line above, you see that the second execution was a result cache hit (cr=0)

So, the rows we require from the object link are fetched, then the execution of our query continues in our PDB:


FETCH #140360238365672:c=0,e=235,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=3055399777,tim=687080513194
FETCH #140360238365672:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080513375
FETCH #140360238365672:c=0,e=10,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080513586
FETCH #140360238365672:c=0,e=9,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080513776
FETCH #140360238365672:c=0,e=9,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080513983
FETCH #140360238365672:c=0,e=9,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080514188
FETCH #140360238365672:c=0,e=8,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=3055399777,tim=687080514375

Up there 91 rows were fetched. We can see in the trace that the recursive cursor is closed there (session switches to CDB$ROOT for that):

CLOSE #140360238643592:c=0,e=3,dep=1,type=0,tim=687080514584

And our session is back on PDB container where the remaining rows are fetched and our cursor closed:

FETCH #140360238365672:c=0,e=40,p=0,cr=0,cu=0,mis=0,r=9,dep=0,og=1,plh=3055399777,tim=687080514610
STAT #140360238365672 id=1 cnt=100 pid=0 pos=1 obj=0 op='FIXED TABLE FULL X$OBLNK$ (cr=0 pr=0 pw=0 time=263 us cost=0 size=13000 card=500)'
CLOSE #140360238365672:c=0,e=3,dep=0,type=0,tim=687080516173

As I’ve explained at DOAGDB16 and SOUG Day (next session is planned for UKOUG TECH16), following metadata and object links is done by the session switching from PDB to CDB$ROOT.

TKPROF

Here is the tkprof of the full trace with two executions

Our query on PDB


SQL ID: duy45bn72jr35 Plan Hash: 3055399777
 
select id
from
DEMOV where num<=100
 
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 16 0.00 0.00 0 0 0 200
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 20 0.00 0.00 0 0 0 200
 
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Number of plan statistics captured: 2
 
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
100 100 100 FIXED TABLE FULL X$OBLNK$ (cr=1040 pr=1034 pw=0 time=42636 us cost=0 size=13000 card=500)

The number of executions and the row count are correct, but there are no logical reads here because all block reads occurred through the recursive query. The execution plan shows a full table scan on X$OBLNK$, which is how object link access paths are displayed in 12.1.

Query on CDB$ROOT


SQL ID: 3b9x1rasffxna Plan Hash: 3598352655
 
SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM
FROM
"SYS"."DEMOV" "DEMOV" WHERE "DEMOV"."NUM"<=100
 
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2 0.03 0.08 2068 2074 0 200
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 6 0.03 0.08 2068 2074 0 200
 
Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: SYS (recursive depth: 1)
Number of plan statistics captured: 2
 
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
100 100 100 RESULT CACHE cgn1rxw6ycznac8fyzfursq2z6 (cr=1037 pr=1034 pw=0 time=662 us)
100 50 100 TABLE ACCESS FULL DEMOT (cr=1037 pr=1034 pw=0 time=522 us)

Here is where the work to get the rows from the view is done, in CDB$ROOT. Only two fetches there (one per execution). As we have seen in the raw trace, all rows from the object link were fetched before we issued any fetch call from our query. I did the same test with more rows selected and it seems that the fetch size is 200 rows: when 200 rows have been fetched from CDB$ROOT, the session switches back to the PDB to fetch those rows (15 by 15 with the default sqlplus arraysize) and then goes back to CDB$ROOT for the next 200 rows. This means that they are probably buffered.

Actually there’s a hidden parameter to define that: “_cdb_view_prefetch_batch_size” is set to 200 by default.

Note that the 2000 logical reads are from the first execution only because the second one found the result in result cache.

V$SQL

From SQL_TRACE, the work done in the other container is not included in the statement statistics. This makes tuning more difficult, as we are used to seeing recursive work cumulated in the top-level statement.

From CDB$ROOT here is what we can see from the shared pool (V$SQL) about the queries I’ve seen in the SQL_TRACE. I query V$SQL with the PLAN_HASH_VALUE (‘phv’ in the SQL_TRACE dump).


14:49:02 SQL> select con_id,sql_id,rows_processed,plan_hash_value,executions,parse_calls,fetches,buffer_gets,sql_text from v$sql where plan_hash_value in (3598352655,3055399777) order by last_active_time;
 
CON_ID SQL_ID ROWS_PROCESSED PLAN_HASH_VALUE EXECUTIONS PARSE_CALLS FETCHES BUFFER_GETS SQL_TEXT
---------- ------------- -------------- --------------- ---------- ----------- ---------- ----------- --------------------------------------------------------------------------------
4 duy45bn72jr35 200 3055399777 2 2 16 2721 select id from DEMOV where num<=100
4 bynmh7xm4bf54 0 3598352655 0 1 0 61 SELECT * FROM NO_OBJECT_LINK("SYS"."DEMOV")
1 3b9x1rasffxna 200 3598352655 2 2 2 2080 SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM "SYS"."DEMOV" "DEMOV" WHERE

For the two queries we have seen above, the number of executions and fetches matches what we have seen in the trace. However, buffer_gets from the user query includes the logical reads from the recursive query.

But there's more here. A statement is there with the same PLAN_HASH_VALUE as the internal query. It's the query on the sharing=object view, with the undocumented NO_OBJECT_LINK() function. It is parsed but not executed. This parse occurred in the PDB just before switching to CDB$ROOT, and only once, when our query was hard parsed. It has the same plan hash value as the internal query because the plan is the same: a full table scan on the table.

My understanding of that is that when hard parsing our query and executing the recursive query on CDB$ROOT, the optimizer checks the view definition in the current container (the PDB) by parsing it without following object links (reason for the NO_OBJECT_LINK).

NO_OBJECT_LINK

Here is the parsing of that query with NO_OBJECT_LINK that occurs in the PDB:


PARSING IN CURSOR #140360238422472 len=43 dep=1 uid=0 oct=3 lid=0 tim=687080413554 hv=1715845284 ad='108fc0230' sqlid=' bynmh7xm4bf54'
SELECT * FROM NO_OBJECT_LINK("SYS"."DEMOV")
END OF STMT
PARSE #140360238422472:c=3000,e=2948,p=0,cr=61,cu=0,mis=1,r=0,dep=1,og=4,plh=3598352655,tim=687080413553
CLOSE #140360238422472:c=0,e=3,dep=1,type=0,tim=687080413614

There is no where clause here. I guess that the goal is just to validate the view in the PDB before executing the full query on CDB$ROOT.

Note that the query in CDB$ROOT does not use NO_OBJECT_LINK here in 12.1, but could have used it to ensure that there are no further links.

Execution plans

With all those recursive queries, how are the cardinalities estimated? DEMOT has no rows in the PDB and 1,000,000 rows in CDB$ROOT. Statistics are gathered and I query only 100 rows (they are evenly distributed between the low and high values).

The query that is only parsed in PDB:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID bynmh7xm4bf54, child number 0
-------------------------------------
SELECT * FROM NO_OBJECT_LINK("SYS"."DEMOV")
Plan hash value: 3598352655
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | TABLE ACCESS FULL| DEMOT | 1 | 6 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------

The query that is run in CDB$ROOT:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 3b9x1rasffxna, child number 0
-------------------------------------
SELECT /*+ RESULT_CACHE (SYSOBJ=TRUE) */ ID,NUM FROM "SYS"."DEMOV"
"DEMOV" WHERE "DEMOV"."NUM"<=100
Plan hash value: 3598352655
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 568 (100)| |
| 1 | RESULT CACHE | 9cv1sbwyz16651fgh17234v67g | | | | |
|* 2 | TABLE ACCESS FULL| DEMOT | 100 | 1000 | 568 (2)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("NUM"<=100)

Estimation is ok here.

And my user query:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID duy45bn72jr35, child number 0
-------------------------------------
select id from DEMOV where num<=100
Plan hash value: 3055399777
------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 (100)|
|* 1 | FIXED TABLE FULL| X$OBLNK$ | 500 | 13000 | 0 (0)|
------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NUM"<=100)

Here the estimation is not accurate at all; it seems to be a hardcoded value of 500.

Conclusion

Lots of interesting things here. Object links (which you can call data links as well) are processed in a very special way. But don't worry: remember that you are not allowed to create them yourself, and there are only very few Oracle-maintained object links: some PDB information that must be available when PDBs are not opened, AWR information that is consolidated in the root, some audit reference tables, …

It is interesting to see (and to think about all the consequences) that the result cache is used here for internal optimization. Even when you don't use the result cache for your application, you should have a look at it and maybe size it differently from the default. In a future post I'll create a sharing=object view on V$ fixed views, and the result cache will be even more fun.

 


List usernames instead of uids with the ps command for long usernames

Fri, 2016-08-05 08:27

Have you ever faced this situation? You have usernames in your /etc/passwd file with more than 8 characters. This is no problem for Linux at all (usernames may be up to 32 characters long), but your ps output might look a little scrambled.

It shows you the uid instead of the username like in the following example:

$ id

uid=20001(longuser01) gid=10002(oinstall) groups=10002(oinstall)

$ sleep 1000000 &

$ ps -ef | grep sleep | grep -v grep

20001    14069 11739  0 14:11 pts/0    00:00:00 sleep 1000000

 

But you want to see the username instead of the uid. The workaround is:

  • Don't use more than eight characters for your usernames :-)
  • Or … format your ps output the right way

You could use the following alias to get the job done.

$ alias psx='export PS_FORMAT="user:12,pid,%cpu,%mem,vsz,rss,tty,stat,start,time,command"; ps ax'

$ psx | grep sleep | grep -v grep

longuser01 14069  0.0  58940 520 pts/0 S 14:11:50 sleep 1000000

 

Now it looks better.
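If you prefer not to define an alias, roughly the same output can be obtained ad hoc with ps -eo and an explicit format list (just a sketch; adjust the column list to your needs):

$ ps -eo user:12,pid,pcpu,pmem,vsz,rss,tty,stat,start,time,cmd | grep '[s]leep'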

Cheers, William

 


How to do a Filesystem Resize (ext3/ext4) on Redhat running on VMware

Fri, 2016-08-05 07:43

A filesystem resize can be done in several ways: online, offline, with LVM2 or without LVM2. However, this blog will describe how to do an online resize of ext3/ext4 filesystems where a virtual disk (vmdk) is added online to a VMware Red Hat guest OS.

So let's start with the online resize of ext3/4 filesystems on the Red Hat guest OS. A new virtual disk (preferably eager zeroed thick on a VM running Oracle) was added as a prerequisite. Adding a new virtual disk is an online operation and no downtime is required to do it.

The whole procedure in this document is described using the command line only. There is also a graphical user interface, `system-config-lvm`, that can do the job, but that tool is out of scope for this document.

Online resize of an ext3/4 filesystem

There are several steps that have to be done. These are in general:

  1. Scan for new LUNs
  2. Partition the new LUNs and run partprobe
  3. Create the physical volumes
  4. Extend the volume group and the logical volume
  5. Extend the filesystem online
Rescan for new LUNs

Depending on the number of virtual SCSI controllers, you have to scan for your new LUNs on each of them. If you know on which controller the disk was added, then of course you only need to scan that one.

Rescan for new LUNs on the first SCSI controller (LSI Logic Parallel)
# echo "- - -"  > /sys/class/scsi_host/host0/scan
Rescan for new LUNs on the second SCSI controller (Paravirtualized)
# echo "- - -"  > /sys/class/scsi_host/host1/scan
Create a Primary Partition on the new devices
# fdisk /dev/sdx??

# fdisk /dev/sdy??
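Inside fdisk, a typical dialogue for creating a new LVM partition looks roughly like this (just a sketch; answer the size prompts according to your disk layout):

Command (m for help): n          <- new primary partition, number 1, full size
Command (m for help): t          <- change the partition type
Hex code (type L to list codes): 8e    <- Linux LVM
Command (m for help): w          <- write the partition table and exit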
Partprobe the new devices

Partprobe is a program that informs the operating system kernel of partition table changes, by requesting that the operating system re-read the partition table.

# partprobe /dev/sdx??

# partprobe /dev/sdy??
Create the Physical Volumes
# pvcreate /dev/sdx??

Physical volume "/dev/sdx??" successfully created
# pvcreate /dev/sdy??

Physical volume "/dev/sdy??" successfully created
Extend the Volume Group
# vgextend VGOracle /dev/sdx??

Volume group "VGOracle" successfully extended
# vgextend VGOracle /dev/sdy??

Volume group "VGOracle" successfully extended
Extend the Logical Volume
# lvextend -L 72G /dev/VGOracle/LVOracleu??

Extending logical volume LVOracleu?? to 72.00 GB

Logical volume LVOracleu01 successfully resized
Online Resize the ext3/ext4 Filesystem

After the logical volume is resized successfully, you can resize any filesystem that is online-resizable. The following are examples for the ext3/ext4 filesystems. The syntax for ext3 and ext4 differs only slightly: for ext3 you use `resize2fs` (even though it is ext3 and not ext2), and for ext4 you use `resize4fs`, where the command name is more logical.

ext3
# resize2fs /dev/VGOracle/LVOracleu??
ext4
# resize4fs /dev/VGOracle/LVOracleu??
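Afterwards, a quick verification never hurts; something along these lines (the mount point is just an assumption, adapt it to yours):

# lvs VGOracle
# df -h /u01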

 

That’s it. Now have fun with the bigger filesystem.

Cheers,

William

 

 


SQL Server AlwaysOn: troubleshooting improvements with new service packs

Tue, 2016-08-02 09:42

As promised in my latest blog, I will talk about improvements in the detection of availability group replication latency. As a reminder, replication latency between highly available replicas can so far be monitored in different ways.

Firstly, in my experience I had the opportunity to use different performance counters, such as “Database Replica:Transaction Delay” and “Database Replica:Mirrored Write Transactions/sec”, in order to get a good picture of the latency over time after implementing an availability group. I have also experienced performance issues at customer sites where referring to the HADR_SYNC_COMMIT wait type was very valuable. However, in both cases a high transaction delay only means that the primary is waiting for a response from the remote replica confirming that the log block has been hardened. This is relevant information for sure, but it is only the symptom. Indeed, at this point we are not aware of where exactly the issue is occurring, and we have to figure out what the root cause of this latency is. Is the network involved, or maybe the log flushing activity on the secondary, or both? Further investigation is necessary to locate and get rid of the root cause.
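As a side note, those counters can also be read directly with a query instead of Performance Monitor; a minimal example (counter names as exposed by the Database Replica object, instance_name being the availability database):

SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Transaction Delay', 'Mirrored Write Transactions/sec');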

When I discovered the new diagnostic features in the recent SQL Server service packs, I was pleased to see that new extended events were added that make it easier to answer the previous questions. Before going further into the study of these new extended events, let me say that I learned a lot by attending the very interesting web meeting about AlwaysOn troubleshooting improvements given by Amit Banerjee (Senior PM at Microsoft for the Tiger Team). During this web meeting, he provided a lot of useful information about new ways to diagnose AlwaysOn architectures, including the new extended events.

You may find an exhaustive list of the new/enhanced extended events and their corresponding performance counters in Microsoft KB3173156, but let me summarize the main benefit: we now have a complete picture of the replication stack and of the processing time at each level of the architecture, as you may see below:

(Figure: AG 2014 SP2 extended events and their new columns)

For an easier understanding, I tried to draw a picture of the AlwaysOn replication process by placing the extended events at their correct location in the replication stack. Please feel free to comment if you find out any mistakes.

 

(Figure: the AlwaysOn replication process with the extended events placed in the stack)

As said earlier, these new extended events will probably make some troubleshooting tasks easier, and in this blog post I would like to focus on two latency issues I faced at customer sites where I noticed high HADR_SYNC_COMMIT wait values but where the root cause was not the same.

I decided to simulate these two issues on my lab environment (a simple architecture that includes one availability group with two replicas in synchronous replication, running SQL Server 2014 SP2). My main goal was to check whether the new extended events make it possible to quickly locate the root cause in each case.

So the first thing to do is to implement and enable a customized AlwaysOn extended event session that includes all the events displayed in the picture above, on each side (both primary and secondary replicas). I directly used the extended event session definition shared by Amit Banerjee on GitHub.

CREATE EVENT SESSION [AlwaysOn_Data_Movement_Tracing] 
ON SERVER 
ADD EVENT sqlserver.hadr_apply_log_block,
ADD EVENT sqlserver.hadr_capture_filestream_wait,
ADD EVENT sqlserver.hadr_capture_log_block,
ADD EVENT sqlserver.hadr_capture_vlfheader,
ADD EVENT sqlserver.hadr_database_flow_control_action,
ADD EVENT sqlserver.hadr_db_commit_mgr_harden,
ADD EVENT sqlserver.hadr_log_block_compression,
ADD EVENT sqlserver.hadr_log_block_decompression,
ADD EVENT sqlserver.hadr_log_block_group_commit,
ADD EVENT sqlserver.hadr_log_block_send_complete,
ADD EVENT sqlserver.hadr_lsn_send_complete,
ADD EVENT sqlserver.hadr_receive_harden_lsn_message,
ADD EVENT sqlserver.hadr_send_harden_lsn_message,
ADD EVENT sqlserver.hadr_transport_flow_control_action,
ADD EVENT sqlserver.hadr_transport_receive_log_block_message,
ADD EVENT sqlserver.log_block_pushed_to_logpool,
ADD EVENT sqlserver.log_flush_complete,
ADD EVENT sqlserver.log_flush_start,
ADD EVENT sqlserver.recovery_unit_harden_log_timestamps,
ADD EVENT ucs.ucs_connection_send_msg
(
    WHERE ([correlation_id]<>(0))
) 
ADD TARGET package0.event_file
(
	SET filename=N'AlwaysOn_Data_Movement_Tracing.xel',max_rollover_files=(100)
)
WITH 
(
	MAX_MEMORY=4096 KB,
	EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
	MAX_DISPATCH_LATENCY=30 SECONDS,
	MAX_EVENT_SIZE=0 KB,
	MEMORY_PARTITION_MODE=NONE,
	TRACK_CAUSALITY=OFF,
	STARTUP_STATE=ON
)
GO

Firstly, I have to warn about the use of such a trace, because it may generate a large amount of data depending mainly on the workload activity. In my test scenario, the extended event session generated up to 4GB of events in a 20-minute interval. So we will probably have to think about some customizations, depending on the context, in order to minimize the overall performance impact as well as the disk usage.

Then, after collecting the extended event data, I experienced a very long execution time, as well as high tempdb usage, with the initial extraction script provided by Amit Banerjee on GitHub. In my case, the script took approximately 9 hours to finish on my lab environment with 4 VCPUs and 4GB of RAM. This is definitely not a good way to quickly perform further analysis on the collected data. My first feeling was that the script was designed for demonstration purposes, with little data. So, in order to handle larger collected data sets, I decided to modify the initial version of the script to reduce the extraction time (it took less than 2 hours for the same amount of data in my case). You will also notice that I reduced the extracted data to the minimum in order to meet my own requirements (I only have two replicas). But let's say that I'm still in the process of improving the performance of the extraction script to cope with more realistic production scenarios. So here is my own version of the script used in this blog post:

SET NOCOUNT ON

IF OBJECT_ID('tempdb..#xe_AlwaysOn_Data_Movement_Tracing', 'U') IS NOT NULL
	DROP TABLE #xe_AlwaysOn_Data_Movement_Tracing;


SELECT
	object_name as event_name,
	CONVERT(XML,Event_data) AS event_data
INTO #xe_AlwaysOn_Data_Movement_Tracing
FROM sys.fn_xe_file_target_read_file(
			'C:\Program Files\Microsoft SQL Server\MSSQL12.SQL14\MSSQL\Log\AlwaysOn_Data_Movement_Tracing_0_*.xel', 
			null, null, null) as xe
where object_name in ('hadr_log_block_group_commit',
						'log_block_pushed_to_logpool',
						'log_flush_start',
						'log_flush_complete',
						'hadr_log_block_compression',
						'hadr_capture_log_block',
						'hadr_capture_filestream_wait',
						'hadr_log_block_send_complete',
						'hadr_receive_harden_lsn_message',
						'hadr_db_commit_mgr_harden',
						'recovery_unit_harden_log_timestamps',
						'hadr_capture_vlfheader',
						'hadr_log_block_decompression',
						'hadr_apply_log_block',
						'hadr_send_harden_lsn_message',
						'hadr_log_block_decompression',
						'hadr_lsn_send_complete',
						'hadr_transport_receive_log_block_message');


IF OBJECT_ID('dbo.DMReplicaEvents_primary', 'U') IS NOT NULL
	DROP TABLE dbo.DMReplicaEvents_primary;

CREATE TABLE [dbo].[DMReplicaEvents_primary](
	[server_name] [nvarchar](128) NULL,
	[event_name] [nvarchar](60) NOT NULL,
	[log_block_id] [bigint] NULL,
	[database_id] [int] NULL,
	[processing_time] [bigint] NULL,
	[start_timestamp] [bigint] NULL,
	[publish_timestamp] [datetimeoffset](7) NULL,
	[log_block_size] [int] NULL,
	[target_availability_replica_id] [uniqueidentifier] NULL,
	[mode] [bigint] NULL
) ON [PRIMARY]


INSERT INTO DMReplicaEvents_Primary
SELECT 
	@@SERVERNAME as server_name,
	event_name,
	xe.event_data.value('(/event/data[@name="log_block_id"]/value)[1]','bigint') AS log_block_id,
	xe.event_data.value('(/event/data[@name="database_id"]/value)[1]','int') AS database_id,
	CASE event_name 
		WHEN 'hadr_db_commit_mgr_harden' THEN xe.event_data.value('(/event/data[@name="time_to_commit"]/value)[1]','bigint')
		WHEN 'hadr_apply_log_block' THEN xe.event_data.value('(/event/data[@name="total_processing_time"]/value)[1]','bigint')
		WHEN 'hadr_log_block_send_complete' THEN xe.event_data.value('(/event/data[@name="total_processing_time"]/value)[1]','bigint')
		WHEN 'hadr_lsn_send_complete' THEN xe.event_data.value('(/event/data[@name="total_processing_time"]/value)[1]','bigint')
		WHEN 'log_flush_complete' THEN xe.event_data.value('(/event/data[@name="duration"]/value)[1]','bigint') * 1000
		ELSE xe.event_data.value('(/event/data[@name="processing_time"]/value)[1]','bigint') 
	END AS processing_time,
	xe.event_data.value('(/event/data[@name="start_timestamp"]/value)[1]','bigint') AS start_timestamp,
	xe.event_data.value('(/event/@timestamp)[1]','DATETIMEOFFSET') AS publish_timestamp,
	CASE event_name
		WHEN 'hadr_log_block_compression' THEN xe.event_data.value('(/event/data[@name="uncompressed_size"]/value)[1]','int')
		WHEN 'hadr_log_block_decompression' THEN xe.event_data.value('(/event/data[@name="uncompressed_size"]/value)[1]','int')
		WHEN 'hadr_capture_log_block' THEN xe.event_data.value('(/event/data[@name="log_block_size"]/value)[1]','int')
		ELSE NULL 
	END AS log_block_size,
	CASE event_name
		WHEN 'hadr_db_commit_mgr_harden' THEN xe.event_data.value('(/event/data[@name="replica_id"]/value)[1]','uniqueidentifier')
		WHEN 'hadr_log_block_compression' THEN xe.event_data.value('(/event/data[@name="availability_replica_id"]/value)[1]','uniqueidentifier')
		WHEN 'hadr_log_block_decompression' THEN xe.event_data.value('(/event/data[@name="availability_replica_id"]/value)[1]','uniqueidentifier')
		WHEN 'hadr_capture_log_block' THEN xe.event_data.value('(/event/data[@name="availability_replica_id"]/value)[1]','uniqueidentifier')
		WHEN 'hadr_capture_filestream_wait' THEN xe.event_data.value('(/event/data[@name="availability_replica_id"]/value)[1]','uniqueidentifier')
		WHEN 'hadr_receive_harden_lsn_message' THEN xe.event_data.value('(/event/data[@name="target_availability_replica_id"]/value)[1]','uniqueidentifier')
		WHEN 'hadr_transport_receive_log_block_message' THEN xe.event_data.value('(/event/data[@name="target_availability_replica_id"]/value)[1]','uniqueidentifier')
		WHEN 'hadr_capture_vlfheader' THEN xe.event_data.value('(/event/data[@name="availability_replica_id"]/value)[1]','uniqueidentifier')
		WHEN 'hadr_send_harden_lsn_message' THEN xe.event_data.value('(/event/data[@name="availability_replica_id"]/value)[1]','uniqueidentifier')
		ELSE NULL 
	END AS target_availability_replica_id,
	xe.event_data.value('(/event/data[@name="mode"]/value)[1]','bigint') AS mode
--INTO DMReplicaEvents
FROM #xe_AlwaysOn_Data_Movement_Tracing xe

Having talked about the data collector aspect, let’s continue by using it against my two latency issue scenarios:

 

First scenario (network bandwidth issue)

My first test consisted of simulating a network bandwidth issue by limiting the maximum bandwidth of the public network card as much as possible in order to trigger a noticeable increase of the availability group replication latency. This is what you may see in the following picture between 19:37 and 19:42. The average transaction delay (or latency) increases accordingly, up to 17ms against 2ms during normal processing.

 

blog 102 - AG 2014 SP2 - first scenario - perfmon

If we take a look at the wait statistics, we may find the HADR_SYNC_COMMIT wait type with an average value of 20ms.

blog 102 - AG 2014 SP2 - first scenario - waitstats

But at this point we only know that we are experiencing latency issues; we don’t have any additional information to prove that the network is the culprit. This is where the new extended events come into play. I used Excel to consume the collected data and to produce charts for a better analysis. Here is the picture of the first test scenario (total processing time per event):
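For reference, here is a minimal sketch of the kind of aggregation that can feed such charts from the DMReplicaEvents_primary table populated above (the per-minute granularity and the microsecond-to-millisecond conversion are assumptions on my side):

-- Total processing time per event and per minute
SELECT
	event_name,
	DATEADD(MINUTE, DATEDIFF(MINUTE, 0, CAST(publish_timestamp AS datetime)), 0) AS minute_slice,
	SUM(processing_time) / 1000. AS total_processing_time_ms
FROM dbo.DMReplicaEvents_primary
GROUP BY event_name, DATEADD(MINUTE, DATEDIFF(MINUTE, 0, CAST(publish_timestamp AS datetime)), 0)
ORDER BY minute_slice, event_name;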

blog 102 - AG 2014 SP2 - first scenario - xe primary

blog 102 - AG 2014 SP2 - first scenario - xe secondary

From the above charts, we may easily notice, between 19:37 and 19:43, an increase of the hadr_db_commit_mgr_harden processing time on the primary. The other events stay constant over time. Referring to the first replication diagram, we know this event type represents the remote response from the secondary, and it increases during the concerned time interval. Furthermore, from the secondary, we see constant processing time values for all events, meaning that the root cause doesn’t seem to concern the secondary itself but rather an external item between the primary and the secondary, so in this case probably the network stack. By using the new extended events, we were able to quickly pinpoint the issue and perform further investigations on the network side.

 

Second scenario (storage performance issue)

Let’s continue with the second scenario. At first glance, we see the same kind of wait type after the second test, meaning that we’re facing the same latency issue as in the previous scenario.

blog 102 - AG 2014 SP2 - second scenario - waitstats

But this time the root cause is not the same, as we will see very soon by looking at the following extended event charts.

 

blog 102 - AG 2014 SP2 - second scenario - xe primary

blog 102 - AG 2014 SP2 - second scenario - xe secondary

This time, we may notice a pretty obvious correlation between the two charts (primary and secondary replicas). Indeed, each time the remote response time (in other words the latency) increases on the primary, we see the same increase of the log_flush_complete processing time on the secondary, and with the same order of magnitude. In this case, we may quickly focus on the database file (and storage) response time on the secondary replica. In my case I deliberately used a very slow storage to host the killerdb transaction log file, as you may see below in the database file I/O statistics.

blog 102 - AG 2014 SP2 - second scenario - file io stats

Bottom line

In this blog post, we have seen how useful the new extended events shipped with the latest SQL Server service packs can be, and we have probably just scratched the surface. But I think we have to keep in mind that using these new extended events has a cost, and for the moment I must admit I am not confident enough to blindly use the initial collection script in production, for the reasons cited above. I still have to investigate a better way to collect and extract the data in a real-life situation.

Happy troubleshooting!

Cet article SQL Server AlwaysOn: troubleshooting improvements with new service packs est apparu en premier sur Blog dbi services.

Exadata X-5 Bare Metal vs. OVM load testing

Sun, 2016-07-31 02:49

In a previous post I tried to compare a single-thread workload between Exadata X5 Bare Metal and Virtualized. The conclusions were that there are no huge differences, and that this kind of comparison is not easy.

Regarding why the comparison is not easy, some reasons have been nicely detailed by Lonny Niederstadt in this Twitter thread.

Besides the single thread tests, I did a test with 50 sessions doing updates on a small data set. It’s a 50 session SLOB with SCALE=100M WORK_UNIT=64 UPDATE_PCT=100 and a small buffer cache.

Here are the load profiles, side by side:

Bare Metal Virtualized
 
Load Profile Per Second Per Transaction Per Second Per Transaction
~~~~~~~~~~~~~~~ --------------- --------------- --------------- --------------- -
DB Time(s): 42.9 0.0 48.6 0.1
DB CPU(s): 5.6 0.0 6.8 0.0
Background CPU(s): 3.4 0.0 3.8 0.0
Redo size (bytes): 58,364,739.1 52,693.1 52,350,760.2 52,555.7
Logical read (blocks): 83,306.4 75.2 73,826.5 74.1
Block changes: 145,360.7 131.2 130,416.6 130.9
Physical read (blocks): 66,038.6 59.6 60,512.7 60.8
Physical write (blocks): 69,121.8 62.4 62,962.0 63.2
Read IO requests: 65,944.8 59.5 60,448.0 60.7
Write IO requests: 59,618.4 53.8 55,883.5 56.1
Read IO (MB): 515.9 0.5 472.8 0.5
Write IO (MB): 540.0 0.5 491.9 0.5
Executes (SQL): 1,170.7 1.1 1,045.6 1.1
Rollbacks: 0.0 0.0 0.0 0.0
Transactions: 1,107.6 996.1

and I/O profile:
Bare Metal Virtualized
 
IO Profile Read+Write/Second Read/Second Write/Second Read+Write/Second Read/Second Write/Second
~~~~~~~~~~ ----------------- --------------- --------------- ----------------- --------------- ---------------
Total Requests: 126,471.0 65,950.0 60,521.0 117,014.4 60,452.8 56,561.6
Database Requests: 125,563.2 65,944.8 59,618.4 116,331.5 60,448.0 55,883.5
Optimized Requests: 125,543.0 65,941.1 59,601.9 116,130.7 60,439.9 55,690.7
Redo Requests: 902.2 0.1 902.1 677.1 0.1 677.0
Total (MB): 1,114.0 516.0 598.0 1,016.5 472.8 543.6
Database (MB): 1,055.9 515.9 540.0 964.7 472.8 491.9
Optimized Total (MB): 1,043.1 515.9 527.2 942.6 472.7 469.8
Redo (MB): 57.7 0.0 57.7 51.7 0.0 51.7
Database (blocks): 135,160.4 66,038.6 69,121.8 123,474.7 60,512.7 62,962.0
Via Buffer Cache (blocks): 135,159.8 66,038.5 69,121.3 123,474.0 60,512.6 62,961.4
Direct (blocks): 0.6 0.2 0.5 0.7 0.1 0.5

This is roughly what you can expect from an OLTP workload: a small data set that fits in the flash cache, and a high redo rate. Of course in OLTP you will have a larger buffer cache, but this is not what I wanted to measure here. It seems that the I/O performance is slightly better in bare metal. This is also what we see on the averages:

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Total Wait Wait % DB Wait
Event Waits Time (sec) Avg(ms) time Class
------------------------------ ----------- ---------- ---------- ------ --------
cell single block physical rea 9,272,548 4182.4 0.45 69.3 User I/O
free buffer waits 152,043 1511.5 9.94 25.1 Configur
DB CPU 791.4 13.1
 
Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Total Wait Wait % DB Wait
Event Waits Time (sec) Avg(ms) time Class
------------------------------ ----------- ---------- ---------- ------ --------
cell single block physical rea 7,486,727 3836.2 0.51 63.7 User I/O
free buffer waits 208,658 1840.9 8.82 30.6 Configur
DB CPU 845.9 14.1

It’s interesting to see that even on an I/O bound system there are no significant waits on log file sync.
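As a quick sanity check, here is a minimal sketch of the kind of query that can be run on each environment to compare those events (system-wide averages since instance startup):

select event, total_waits,
       round(time_waited_micro / nullif(total_waits, 0)) as avg_wait_us
from v$system_event
where event in ('log file sync', 'log file parallel write')
order by event;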

I’ll focus on ‘log file parallel write':

Bare Metal
EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
log file parallel write 1 0 1 microsecond
log file parallel write 2 0 2 microseconds
log file parallel write 4 0 4 microseconds
log file parallel write 8 0 8 microseconds
log file parallel write 16 0 16 microseconds
log file parallel write 32 0 32 microseconds
log file parallel write 64 0 64 microseconds
log file parallel write 128 0 128 microseconds
log file parallel write 256 8244 256 microseconds
log file parallel write 512 102771 512 microseconds
log file parallel write 1024 14812 1 millisecond
log file parallel write 2048 444 2 milliseconds
log file parallel write 4096 42 4 milliseconds
log file parallel write 8192 11 8 milliseconds
log file parallel write 16384 3 16 milliseconds
log file parallel write 32768 1 32 milliseconds

Virtualized
EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
log file parallel write 1 0 1 microsecond
log file parallel write 2 0 2 microseconds
log file parallel write 4 0 4 microseconds
log file parallel write 8 0 8 microseconds
log file parallel write 16 0 16 microseconds
log file parallel write 32 0 32 microseconds
log file parallel write 64 0 64 microseconds
log file parallel write 128 0 128 microseconds
log file parallel write 256 723 256 microseconds
log file parallel write 512 33847 512 microseconds
log file parallel write 1024 41262 1 millisecond
log file parallel write 2048 6483 2 milliseconds
log file parallel write 4096 805 4 milliseconds
log file parallel write 8192 341 8 milliseconds
log file parallel write 16384 70 16 milliseconds
log file parallel write 32768 10 32 milliseconds

As I’ve seen in previous tests, most of the writes were below 512 microseconds in bare metal, and above in virtualized.

And here are the histograms for the single block reads:

Bare Metal
EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
cell single block physical read 1 0 1 microsecond
cell single block physical read 2 0 2 microseconds
cell single block physical read 4 0 4 microseconds
cell single block physical read 8 0 8 microseconds
cell single block physical read 16 0 16 microseconds
cell single block physical read 32 0 32 microseconds
cell single block physical read 64 0 64 microseconds
cell single block physical read 128 432 128 microseconds
cell single block physical read 256 2569835 256 microseconds
cell single block physical read 512 5275814 512 microseconds
cell single block physical read 1024 837402 1 millisecond
cell single block physical read 2048 275112 2 milliseconds
cell single block physical read 4096 297320 4 milliseconds
cell single block physical read 8192 4550 8 milliseconds
cell single block physical read 16384 1485 16 milliseconds
cell single block physical read 32768 99 32 milliseconds
cell single block physical read 65536 24 65 milliseconds
cell single block physical read 131072 11 131 milliseconds
cell single block physical read 262144 14 262 milliseconds
cell single block physical read 524288 7 524 milliseconds
cell single block physical read 1048576 4 1 second
cell single block physical read 2097152 1 2 seconds

Virtualized

EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
cell single block physical read 1 0 1 microsecond
cell single block physical read 2 0 2 microseconds
cell single block physical read 4 0 4 microseconds
cell single block physical read 8 0 8 microseconds
cell single block physical read 16 0 16 microseconds
cell single block physical read 32 0 32 microseconds
cell single block physical read 64 0 64 microseconds
cell single block physical read 128 0 128 microseconds
cell single block physical read 256 518447 256 microseconds
cell single block physical read 512 5371496 512 microseconds
cell single block physical read 1024 1063689 1 millisecond
cell single block physical read 2048 284640 2 milliseconds
cell single block physical read 4096 226581 4 milliseconds
cell single block physical read 8192 16292 8 milliseconds
cell single block physical read 16384 3191 16 milliseconds
cell single block physical read 32768 474 32 milliseconds
cell single block physical read 65536 62 65 milliseconds
cell single block physical read 131072 2 131 milliseconds

Same conclusion here: the ‘less than 256 microseconds’ reads occur more frequently in bare metal than in virtualized.

For reference, those tests were done on a similar configuration (except for virtualization): X5-2L High Capacity with 3 storage cells, version cell-12.1.2.3.1_LINUX.X64_160411-1.x86_64, flash cache in writeback. The whole system was started before the test, and this test was the only thing running on the whole database machine.

 

Cet article Exadata X-5 Bare Metal vs. OVM load testing est apparu en premier sur Blog dbi services.

Oracle serializable is not serializable

Sat, 2016-07-30 17:17

Did you know that when you set the isolation level to SERIALIZABLE, it is not serializable but SNAPSHOT? This isolation level is lower than serializable. I had never thought about it until I read Markus Winand’s slides about transactions. I recommend that every developer or DBA read those slides. This post illustrates write skew in Oracle.

Let’s show an example on the SCOTT.EMP table. Let’s say there’s an HR directive to increase the salary of one of the department 10 employees so that the total salary for the department is 9000.
Now let’s imagine that two HR users receive the directive at the same time.

User A checks the salaries:

23:18:33 SID=365> select ename,sal from EMP where deptno=10;
 
ENAME SAL
---------- ----------
CLARK 2450
KING 5000
MILLER 1300

The sum is 8750, so User A decides to increase MILLER’s salary by an additional 250.

However, to be sure that he is the only one to do that, he starts a transaction in SERIALIZABLE isolation level, checks the sum again, and does the update:

23:18:40 SID=365> set transaction isolation level serializable;
Transaction set.
 
23:18:41 SID=365> select sum(sal) from EMP where deptno=10;
 
SUM(SAL)
----------
8750
 
23:18:44 SID=365> update EMP set sal=sal+250 where ename='MILLER';
1 row updated.

Now at the same time, User B is doing the same but chose to increase CLARK’s salary:


23:18:30 SID=12> set transaction isolation level serializable;
Transaction set.
 
23:18:51 SID=12> select sum(sal) from EMP where deptno=10;
 
SUM(SAL)
----------
8750
 
23:18:53 SID=12> update EMP set sal=sal+250 where ename='CLARK';
1 row updated.

Note that there is no “ORA-08177: can’t serialize access for this transaction” here because the updates occur on different rows.

User A checks the sum again and then commits his transaction:


23:18:46 SID=365> select sum(sal) from EMP where deptno=10;
 
SUM(SAL)
----------
9000
 
23:19:04 SID=365> commit;
Commit complete.

And so does User B:


23:18:55 SID=12> select sum(sal) from EMP where deptno=10;
 
SUM(SAL)
----------
9000
 
23:19:08 SID=12> commit;
Commit complete.

However, once you commit, the result is different:


23:19:09 SID=12> select sum(sal) from EMP where deptno=10;
 
SUM(SAL)
----------
9250

Actually, what Oracle calls SERIALIZABLE here is only the SNAPSHOT isolation level. You see the data without the concurrent changes that have been committed after the beginning of your transaction. And you cannot modify a row that has been modified by another session. However, nothing prevents what you have read from being modified by another session. You don’t see those modifications, but they can be committed.

The definition of serializability requires that the result is the same as when the transactions occur one after the other. Here, if User A had committed before the start of User B’s transaction, the latter would have seen that the total was already at 9000.

In this example, if you want to prevent write skew you need to lock the table in Share mode. Locking the rows (with SELECT FOR UPDATE) is sufficient to prevent concurrent updates, but then another user can still insert a new employee, which raises the total salary. In addition to that, row locks are exclusive and you don’t want readers to block readers. Locking a range (DEPTNO=10) is not possible in Oracle. So the solution is to lock the table, as sketched below.
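As a minimal sketch, each HR transaction would then look like this (the Share-mode lock is released at commit, and a concurrent transaction doing the same will either wait or get an error, such as a deadlock, instead of silently introducing the write skew):

set transaction isolation level serializable;
lock table EMP in share mode;
select sum(sal) from EMP where deptno=10;
update EMP set sal=sal+250 where ename='MILLER';
commit;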

It seems that only PostgreSQL (version >= 9.1) is able to guarantee true serializability without locking.

 

Cet article Oracle serializable is not serializable est apparu en premier sur Blog dbi services.

DBA essentials workshop with release Oracle 12c is Available

Fri, 2016-07-29 03:23

We are pleased to announce that dbi services has upgraded the infrastructure for the Oracle DBA essentials workshop.
We have migrated from Oracle 11g R2 to the latest Oracle 12c release and installed the latest PSU (April 2016).

During this course you will cover different topics:

  • Understand the Oracle database architecture
  • Install and Manage Oracle database
  • Administer Oracle databases with a dbi expert, certified OCM or ACE Director :-)

For more information about the workshop, please click on the link

Oracle Database release:

SQL> select product, version, status from product_component_version where product like '%Database%';


PRODUCT                                    VERSION              STATUS
------------------------------------------ -------------------- ------------------------------
Oracle Database 12c Enterprise Edition     12.1.0.2.0           64bit Production

Last PSU installed:

SQL> select patch_id, version, status, description from dba_registry_sqlpatch;


PATCH_ID   VERSION              STATUS          DESCRIPTION
---------- -------------------- --------------- ------------------------------------------------------
22291127   12.1.0.2             SUCCESS         Database Patch Set Update : 12.1.0.2.160419 (22291127)

020_ORA-DBA-Essential-OracleArchitecture

 

Cet article DBA essentials workshop with release Oracle 12c is Available est apparu en premier sur Blog dbi services.

Shrinking Oracle VM VirtualBox with Zerofree

Fri, 2016-07-29 03:17

In this blog I would like to talk about shrinking a virtual machine on which Oracle databases and MySQL are installed.
Unfortunately, whilst VirtualBox will dynamically expand the hard drive as required, it won’t dynamically shrink it again if you free up space in the VM. The good news is that you can shrink a dynamic VirtualBox disk image and so reduce the size of your VM infrastructure.

I have used the Zerofree utility, which scans the free blocks in an ext2 or ext3 file system and fills any non-zero blocks with zeroes. Source

Step by step:
  • Convert .vmdk to .vdi
  • Mount the .vdi to another VM
  • Stop processes Oracle and MySQL
  • Mount file system on read only
  • Use zerofree
  • Shutdown VM
  • Vboxmanage compact
  • Convert .vdi back to .vmdk if you find any reason not to keep the .vdi

My VirtualBox disk uses the “.vmdk” format. The first step is to clone the virtual disk to the “.vdi” format.

root@computer:/media/pio/Elements1/Workshop Oracle DBA1 11g/vmreforadba01# vboxmanage clonehd "vmrefdba01.vmdk" "vmrefdba01.vdi" --format vdi
0%...10%...20%...30%...40%...50%...60%...70%...80%...

When the clone is finished, start the VM with the new “.vdi” disk:
shrink

Install Zerofree on your VM:

[root@vmrefdba01 zerofree-1.0.4]# yum install e2fsprogs-devel -y
[root@vmrefdba01 zerofree-1.0.4]# wget http://frippery.org/uml/zerofree-1.0.3.tgz
[root@vmrefdba01 zerofree-1.0.4]# tar -zxf zerofree-1.0.3.tgz
[root@vmrefdba01 zerofree-1.0.4]# cd zerofree-1.0.3
[root@vmrefdba01 zerofree-1.0.3]# make
[root@vmrefdba01 zerofree-1.0.3]# cp zerofree /usr/bin

Stop all Oracle and MySQL processes in order to unmount the file system:

mysql@vmrefdba01:/home/mysql/ [DUMMY] mysqld_multi stop
oracle@vmrefdba01:/home/oracle/ [WSDBA2] sqh
SQL*Plus: Release 12.1.0.2.0 Production on Thu Jul 28 10:27:29 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> shutdown;
Database closed.
Database dismounted.
ORACLE instance shut down.

Remount the file system in read-only mode:

[root@vmrefdba01 ~]# mount -o ro /dev/mapper/vgdata-lvdata /oracle/
[root@vmrefdba01 ~]# mount -l
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,size=3G)
/dev/sda1 on /boot type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
gvfs-fuse-daemon on /root/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev)
/dev/mapper/vgdata-lvdata on /oracle type ext3 (ro)

Use the Zerofree utility to zero the free blocks:

[root@vmrefdba01 ~]# zerofree /dev/mapper/vgdata-lvdata

Shutdown the VM and compact the disk:

[root@vmrefdba01 ~]# shutdown -h now
root@computer:/media/pio/Elements1/Workshop Oracle DBA1 11g/vmreforadba01# vboxmanage modifyhd vmrefdba01.vdi --compact
0%...10%...20%...30%...40%...50%...

Check the size of the new virtual disk and you can see that we have gained 20GB of space with the shrink :D

root@computer:/media/pio/Elements1/Workshop Oracle DBA1 11g/vmreforadba01# du -sh *
740K Logs
44M Snapshots
34G vmrefdba01.vdi --> New disk
54G vmrefdba01.vmdk --> Old disk

Optional: if you want, you can clone the disk back to the “.vmdk” format.

root@computer:/media/pio/Elements1/Workshop Oracle DBA1 11g/vmreforadba01# vboxmanage clonehd "vmrefdba01.vdi" "vmrefdba01_v1.vmdk" --format vmdk
0%...10%...20%...30%...40%...

 

Cet article Shrinking Oracle VM VirtualBox with Zerofree est apparu en premier sur Blog dbi services.

Exadata X-5 Bare Metal vs. OVM performance

Fri, 2016-07-29 02:54

The Exadata X5 can be installed Bare Metal or Virtualized. The latter, Oracle VM, allows you to create multiple clusters on one machine, but is more complex for installation and for storage capacity planning. On the other hand, it allows some flexibility for option licensing. Those are the real challenges behind the choice. However, when we talk about virtualization, most of the questions are about the overhead. Last week, we did some tests on the same machine with both configurations, thanks to the Arrow Oracle Authorized Solution Center.

Comparison is not easy. Bare Metal has all the resources. Virtualized has to distribute the resources. And this test is very specific: all I/O hits the ‘extreme’ flash cache, because I don’t expect any virtualization overhead to be in milliseconds. So don’t expect universal conclusions from those tests. And don’t hesitate to comment about my way of reading those numbers.

CPU

Do not expect a benchmark that shows the maximum capacity of the machine here. I’m comparing a bare metal node with 36 cores to a VM with 4 vCPUs. So I’ll compare a one-thread workload only: SLOB with one session and SCALE=100M UPDATE_PCT=0 RUN_TIME=120 WORK_UNIT=64

Bare Metal load profile:

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 1.0 30.5 0.00 2.91
DB CPU(s): 1.0 29.0 0.00 2.76
Background CPU(s): 0.0 0.2 0.00 0.00
Redo size (bytes): 14,172.4 432,594.0
Logical read (blocks): 810,244.4 24,731,696.3
Block changes: 41.7 1,271.3
Physical read (blocks): 111.6 3,407.8
Physical write (blocks): 0.0 0.3
Read IO requests: 111.3 3,397.3
Write IO requests: 0.0 0.3
Read IO (MB): 0.9 26.6
Write IO (MB): 0.0 0.0
Executes (SQL): 12,285.1 374,988.5

Virtualized load profile:

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 1.0 30.6 0.00 4.37
DB CPU(s): 1.0 29.8 0.00 4.26
Background CPU(s): 0.0 0.2 0.00 0.00
Redo size (bytes): 13,316.5 410,404.0
Logical read (blocks): 848,095.1 26,137,653.8
Block changes: 41.1 1,266.3
Physical read (blocks): 109.1 3,361.3
Physical write (blocks): 0.0 0.3
Read IO requests: 103.8 3,198.5
Write IO requests: 0.0 0.3
Read IO (MB): 0.9 26.3
Write IO (MB): 0.0 0.0
Executes (SQL): 13,051.2 402,228.0

We can say that CPU and RAM performance is similar.

I/O

Now about IOPS on the storage cell flash cache.
I’ll compare SLOB with one session and SCALE=100000M UPDATE_PCT=100 RUN_TIME=120 WORK_UNIT=64

Bare Metal load profile:

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 1.0 0.0 0.02 4.06
DB CPU(s): 0.1 0.0 0.00 0.49
Background CPU(s): 0.1 0.0 0.00 0.00
Redo size (bytes): 1,652,624.9 51,700.6
Logical read (blocks): 2,582.2 80.8
Block changes: 4,214.5 131.9
Physical read (blocks): 2,060.6 64.5
Physical write (blocks): 1,818.0 56.9
Read IO requests: 2,051.0 64.2
Write IO requests: 1,738.6 54.4
Read IO (MB): 16.1 0.5
Write IO (MB): 14.2 0.4
Executes (SQL): 66.3 2.1
Rollbacks: 0.0 0.0
Transactions: 32.0

Virtualized load profile:

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 1.0 0.0 0.01 3.49
DB CPU(s): 0.3 0.0 0.00 1.01
Background CPU(s): 0.2 0.0 0.00 0.00
Redo size (bytes): 2,796,963.3 51,713.3
Logical read (blocks): 4,226.0 78.1
Block changes: 7,107.0 131.4
Physical read (blocks): 3,470.6 64.2
Physical write (blocks): 3,278.7 60.6
Read IO requests: 3,462.0 64.0
Write IO requests: 3,132.0 57.9
Read IO (MB): 27.1 0.5
Write IO (MB): 25.6 0.5
Executes (SQL): 86.9 1.6
Rollbacks: 0.0 0.0
Transactions: 54.1

In two minutes we did more work here. The timed events show statistics about ‘cell single block physical read’, which is nothing other than ‘db file sequential read’ renamed to look more ‘Exadata’. No Smart Scan happens here, as the blocks go to the buffer cache and we cannot do any filtering for blocks that will be shared with other sessions.

Bare Metal:
Total Wait Wait % DB Wait
Event Waits Time (sec) Avg(ms) time Class
------------------------------ ----------- ---------- ---------- ------ --------
cell single block physical rea 249,854 115.7 0.46 94.9 User I/O
DB CPU 14.6 12.0

Virtualized:
Total Wait Wait % DB Wait
Event Waits Time (sec) Avg(ms) time Class
------------------------------ ----------- ---------- ---------- ------ --------
cell single block physical rea 425,071 109.3 0.26 89.4 User I/O
DB CPU 35.2 28.8

Lower latency here on average, which explains why we did more work. But no conclusion before we know where the latency comes from. Averages hide the details, and it’s the same with the ‘IO Profile’ section:

Bare Metal
IO Profile Read+Write/Second Read/Second Write/Second
~~~~~~~~~~ ----------------- --------------- ---------------
Total Requests: 3,826.6 2,055.1 1,771.5
Database Requests: 3,789.5 2,051.0 1,738.6
Optimized Requests: 3,720.7 1,985.1 1,735.6
Redo Requests: 32.5 0.0 32.5
Total (MB): 32.0 16.2 15.9
Database (MB): 30.3 16.1 14.2
Optimized Total (MB): 29.3 15.6 13.7
Redo (MB): 1.7 0.0 1.7
Database (blocks): 3,878.6 2,060.6 1,818.0
Via Buffer Cache (blocks): 3,878.6 2,060.6 1,818.0
Direct (blocks): 0.0 0.0 0.0

Virtualized
IO Profile Read+Write/Second Read/Second Write/Second
~~~~~~~~~~ ----------------- --------------- ---------------
Total Requests: 6,652.2 3,467.0 3,185.2
Database Requests: 6,594.0 3,462.0 3,132.0
Optimized Requests: 6,582.7 3,461.2 3,121.5
Redo Requests: 54.7 0.0 54.7
Total (MB): 55.6 27.2 28.4
Database (MB): 52.7 27.1 25.6
Optimized Total (MB): 51.8 27.1 24.6
Redo (MB): 2.8 0.0 2.8
Database (blocks): 6,749.3 3,470.6 3,278.7
Via Buffer Cache (blocks): 6,749.3 3,470.6 3,278.7
Direct (blocks): 0.0 0.0 0.0

And here are the I/O statistics:
Bare Metal:
Reads: Reqs Data Writes: Reqs Data Waits: Avg
Function Name Data per sec per sec Data per sec per sec Count Tm(ms)
--------------- ------- ------- ------- ------- ------- ------- ------- -------
Buffer Cache Re 1.9G 2050.9 16.093M 0M 0.0 0M 250.2K 0.5
DBWR 0M 0.0 0M 1.7G 1740.5 14.216M 0 N/A
LGWR 0M 0.0 0M 201M 32.5 1.648M 3914 0.3
Others 8M 4.1 .066M 1M 0.5 .008M 560 0.0
TOTAL: 1.9G 2055.0 16.159M 1.9G 1773.4 15.872M 254.6K 0.5

Virtualized:
Reads: Reqs Data Writes: Reqs Data Waits: Avg
Function Name Data per sec per sec Data per sec per sec Count Tm(ms)
--------------- ------- ------- ------- ------- ------- ------- ------- -------
Buffer Cache Re 3.3G 3462.7 27.12M 0M 0.0 0M 425.6K 0.3
DBWR 0M 0.0 0M 3.1G 3133.9 25.639M 0 N/A
LGWR 0M 0.0 0M 341M 54.7 2.775M 6665 0.3
Others 10M 5.0 .081M 1M 0.5 .008M 514 0.3
TOTAL: 3.3G 3467.7 27.202M 3.4G 3189.0 28.422M 432.7K 0.3

I’ve put the physical read statistics side-by-side to compare:


BARE METAL VIRTUALIZED
 
Statistic Total per Trans Total per Trans
-------------------------------- ------------------ ------------- ------------------ -------------
cell flash cache read hits 242,142 62.1 425,365 64.0
cell logical write IO requests 5,032 1.3 8,351 1.3
cell overwrites in flash cache 200,897 51.5 937,973 141.1
cell physical IO interconnect by 8,145,832,448 2,089,210.7 14,331,230,720 2,156,044.9
cell writes to flash cache 638,514 163.8 1,149,990 173.0
physical read IO requests 250,168 64.2 425,473 64.0
physical read bytes 2,059,042,816 528,095.1 3,494,084,608 525,663.4
physical read partial requests 4 0.0 0 0.0
physical read requests optimized 242,136 62.1 425,365 64.0
physical read total IO requests 250,671 64.3 426,089 64.1
physical read total bytes 2,067,243,008 530,198.3 3,504,136,192 527,175.6
physical read total bytes optimi 1,993,089,024 511,179.5 3,497,918,464 526,240.2
physical read total multi block 0 0.0 0 0.0
physical reads 251,348 64.5 426,524 64.2
physical reads cache 251,348 64.5 426,524 64.2
physical reads cache prefetch 1,180 0.3 1,051 0.2
physical reads direct 0 0.0 0 0.0
physical reads direct (lob) 0 0.0 0 0.0
physical reads prefetch warmup 1,165 0.3 1,016 0.2
physical write IO requests 212,061 54.4 384,909 57.9
physical write bytes 1,816,551,424 465,901.9 3,300,933,632 496,605.0
physical write requests optimize 211,699 54.3 383,624 57.7
physical write total IO requests 216,077 55.4 391,445 58.9
physical write total bytes 2,026,819,072 519,830.5 3,656,793,600 550,142.0
physical write total bytes optim 1,755,620,352 450,274.5 3,171,875,328 477,189.0
physical write total multi block 531 0.1 942 0.1
physical writes 221,747 56.9 402,946 60.6
physical writes direct 0 0.0 0 0.0
physical writes direct (lob) 0 0.0 0 0.0
physical writes from cache 221,747 56.9 402,946 60.6
physical writes non checkpoint 221,694 56.9 402,922 60.6

We already know that there was more work on the OVM run, but comparing the ‘per transaction’ statistics shows similar things, with a bit more ‘flash cache’ ‘optimized’ I/O in the second run.
Of course, even if it’s the same machine, it has been re-imaged and the database re-created, with different volumes and capacity. So maybe I hit the cell flash cache more on the second run than on the first one, and more reads from spinning disks can explain the difference in single block read latency.

We need to get beyond the averages with the wait event histograms. They don’t show buckets lower than one millisecond in the AWR report (I’ve opened an enhancement request for 12.2 about that), but I collected them from V$EVENT_HISTOGRAM_MICRO.
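As a sketch, a query like the following can be run on each environment to get the microsecond buckets:

select event, wait_time_micro, wait_count, wait_time_format
from v$event_histogram_micro
where event = 'cell single block physical read'
order by wait_time_micro;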

Bare Metal:
EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
cell single block physical read 1 0 1 microsecond
cell single block physical read 2 0 2 microseconds
cell single block physical read 4 0 4 microseconds
cell single block physical read 8 0 8 microseconds
cell single block physical read 16 0 16 microseconds
cell single block physical read 32 0 32 microseconds
cell single block physical read 64 0 64 microseconds
cell single block physical read 128 533 128 microseconds
cell single block physical read 256 240142 256 microseconds
cell single block physical read 512 7818 512 microseconds
cell single block physical read 1024 949 1 millisecond
cell single block physical read 2048 491 2 milliseconds
cell single block physical read 4096 1885 4 milliseconds
cell single block physical read 8192 3681 8 milliseconds
cell single block physical read 16384 2562 16 milliseconds
cell single block physical read 32768 257 32 milliseconds
cell single block physical read 65536 52 65 milliseconds
cell single block physical read 131072 3 131 milliseconds
cell single block physical read 262144 0 262 milliseconds
cell single block physical read 524288 1 524 milliseconds

Virtualized:
EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
cell single block physical read 1 0 1 microsecond
cell single block physical read 2 0 2 microseconds
cell single block physical read 4 0 4 microseconds
cell single block physical read 8 0 8 microseconds
cell single block physical read 16 0 16 microseconds
cell single block physical read 32 0 32 microseconds
cell single block physical read 64 0 64 microseconds
cell single block physical read 128 1 128 microseconds
cell single block physical read 256 322113 256 microseconds
cell single block physical read 512 105055 512 microseconds
cell single block physical read 1024 1822 1 millisecond
cell single block physical read 2048 813 2 milliseconds
cell single block physical read 4096 681 4 milliseconds
cell single block physical read 8192 283 8 milliseconds
cell single block physical read 16384 231 16 milliseconds
cell single block physical read 32768 64 32 milliseconds
cell single block physical read 65536 11 65 milliseconds
cell single block physical read 131072 3 131 milliseconds

In the first run we see more reads around 8ms, which confirms the previous guess that we had more flash cache hits on the second run.
The waits between 128 and 512 microseconds are served by the cell flash storage, and this is where I want to see if virtualization has an overhead.
I’ve put them in a colored chart where it’s easier to visualize that most of the reads are in the 128-256 microsecond range: Bare Metal in blue, OVM in orange.

CaptureX5BMVM

In Bare Metal, most of the reads are faster than 256 microseconds. In Virtualized, a significant number of reads are above that. This may be caused by virtualization, but anyway it’s not a big difference. I don’t think that virtualization overhead is an important criterion when choosing how to install your Exadata. Storage capacity planning is the major criterion: consolidate all storage into two disk groups (DATA and RECO) for all databases, or partition them for each cluster. The choice is about manageability and agility in provisioning vs. licence optimization.

 

Cet article Exadata X-5 Bare Metal vs. OVM performance est apparu en premier sur Blog dbi services.

How to manage your containers in Oracle Storage Cloud Service ?

Thu, 2016-07-28 07:57

Oracle Storage Cloud Service provides secure, scalable and consistent data storage, accessible from any environment.
To get more flexibility for your backups, I suggest using the CloudBerry application with the Oracle Public Cloud.

This application can be used for Cloud Backup and Recovery in PaaS and IaaS infrastructures.
We are going to use it for our DBaaS in a PaaS infrastructure.
CloudBerry Explorer for OpenStack can also be used as a user-friendly third-party tool to manage Storage Containers.

Before starting with CloudBerry it is necessary to subscribe to a trial access or to buy a subscription with hourly or monthly billing. In both cases, we have the same functionalities.

Let’s go !

Follow all the steps to subscribe to your free trial access in the PaaS / DBaaS category on https://cloud.oracle.com/home.
Once your account is registered, you will receive an e-mail from Oracle with subscription details.

This e-mail lists a lot of services, but we will only use the Oracle Database Backup Service. We can notice that our Backup Service has a data region (UK) and a link to the associated documentation (same for the other services).

Oracle Database Backup Service
Data Region: UK001
Get Started: http://docs.oracle.com/cloud/latest/dbbackup_gs/index.html

In the next steps, we will have a look at the different types of Backup Services. The access to the Backup Services is linked to your account.

You need :
Your email address
Your password
Your Identity Domain

Start with CloudBerry by downloading it from the following link and installing it: http://www.cloudberrylab.com

Step 1 – Creating an Oracle connection

File > New Oracle Account

Process_to_connection

The account location depends on the datacenter where you have subscribed. This information is given in the confirmation e-mail you received. Note that your storage space is linked to your identity domain and that all services, depending on your subscription, will be stored in this area.

Once the connection is established we are going to create a container for the backup storage area (process below).

Process_to_connection_Crea_allt

This page shows other containers as well as our “SandBox” container we just created. The “_apaas” belongs to an Application Container Cloud I created as a service on my Oracle Public Cloud.

We will use our SandBox container to create our Oracle Backup Service. There are two kinds of containers: Standard and Archive.

Standard : Containers are similar to a directory structure but with a key distinction: unlike directories, containers cannot be nested.
By default, all containers are of the standard storage class (as opposed to the archive storage class).

Archive : You can use Archive containers to store data that won’t be accessed for a while or that will be accessed infrequently.

For example :

Storage_quota

In the Cloud console, we can monitor the storage capacity of our Oracle Database Backup Service.

billing

Step 2 – Connection in your Oracle Public Cloud

Process_to_connection

Once the connection to your Oracle Public Cloud is established, we shall create a new database with a backup storage area. I am not going to describe all the steps to create a database in the Cloud, but rather the steps related to the backup storage, with a summary before we create it.

Step 3 – Creating a Cloud Database with Backup and Recovery configuration

Here is the summary of our DBaaS creation:

summary

As you will notice in the backup destination section, we have chosen a backup location option for the database in our service and, depending on this choice, provided information regarding the Oracle Storage Cloud Service container where the cloud backups have to be stored.

Backup Destination: we are going to store our backup in the Cloud and/or in local storage.

Capture20

For configuration, remember your :

Cloud Storage Container : <store service name>-<identity domain name>/<container name>

Step 4 – Backup Tests

First, we shall execute a backup of the database with an RMAN command, then we will be able to check the results.

Rman_backup

The backup has been correctly performed with the RMAN command. Now let’s see if our backup pieces are in the container.

Backup_container

 As shown in the picture above, we find the backups in our “SandBox” container.

Step 5 – How to check my backup parameters ?

When you run the “show all” command in RMAN you can check how your channel is configured for backups. There you should have a library parameter set up in RMAN as follows:

Show_all

We can notice that there are “SBT_TAPE” channels for the device, with their library (libopc.so), and that the configuration is limited to a maximum piece size of 2 GB.

We see a parameter file called opcSDBX.or which contains the settings to access the Backup Cloud Service container. You can check it by opening it with a text editor.

Config_cloud

Here we find the parameters entered previously when we created our access to the Oracle Cloud backup service. We can observe one important thing: a wallet is used to secure backups and recoveries in the Cloud.

Conclusion

Easy to use, simple to set up, just try it !

 

Cet article How to manage your containers in Oracle Storage Cloud Service ? est apparu en premier sur Blog dbi services.

Large Pages and MEMORY_TARGET on Windows

Wed, 2016-07-27 16:57

In a previous post about enabling large pages on Windows Server, I explained that it’s possible to use large pages with Automatic Memory Management (AMM, where sizing of SGA and PGA is automatic with the MEMORY_TARGET setting), but possible does not mean that it is recommended. We feel that it’s not a good idea, but are there reasons for it, or is it just our Linux background that brings this opinion?

SGA_MAX_SIZE=4G, MEMORY_MAX_TARGET=6G

What is the size allocated as large pages? If it is set, SGA_MAX_SIZE is the size allocated as large pages when ORA_LPENABLE=1, even if you’ve set a larger MEMORY_MAX_TARGET:
VirtualBox_Windows 2012 Server_27_07_2016_17_12_02

I took screenshots with all the information: the init.ora settings and the regedit entry on the top left, the latest sqlplus commands on the top right (here the screenshot was taken just after the startup), and the memory allocation given by Sysinternals RamMap.

So, when SGA_MAX_SIZE is set, it’s the maximum size of SGA that can be allocated, and this is what is allocated at startup from physical memory until the end of the instance. You can reduce SGA_TARGET dynamically, but I don’t see any reason for that, as the memory allocated in large pages will not be released, nor swapped, nor usable for PGA.

SGA_MAX_SIZE unset, MEMORY_MAX_TARGET=6G

If we don’t set SGA_MAX_SIZE, then the SGA can grow up to MEMORY_MAX_TARGET, and this is what is allocated at startup when ORA_LPENABLE=1:

VirtualBox_Windows 2012 Server_27_07_2016_17_16_44

Physical memory must be free

When ORA_LPENABLE=1 and not enough memory is available in physical memory, you get an error (ORA-27102 OS 1450) at startup:
VirtualBox_Windows 2012 Server_27_07_2016_17_23_17

What I did here was running 10 instances of SQLDeveloper to use 2GB on my 8GB VM.

Fragmentation

Not only do we need the SGA to be allocated from physical memory, but it also needs to be contiguous. Here is a screenshot I took some time later, with those SQL Developer instances closed but after a lot of activity on the VM:
VirtualBox_Windows 2012 Server_27_07_2016_21_32_16
As you see here, there is enough RAM (7GB) but it is not contiguous.

The recommendation when using large pages on Windows is to start all instances immediately after the server restart, and if you have to restart an instance you may have to reboot the server. Note that the major advantage of large pages is in virtualized environments, and then you should not have more than one instance per server. If you are convinced that with Windows it’s always good to restart the server, here you have a reason: fragmentation breaks large pages.

Mixed-mode

Let’s go back to the state where I had 10 SQL Developer instances opened. I changed ORA_LPENABLE to 2 instead of 1 to be able to start the instance even if there is not enough contiguous RAM for the SGA (here for the MEMORY_TARGET, as no SGA_MAX_SIZE is defined).

Now, I’m able to start the instance (but it took several minutes here as physical memory is exhausted):
VirtualBox_Windows 2012 Server_27_07_2016_20_56_25
The instance is started, but not all of the 6GB has been allocated as large pages. In this case, where SGA_TARGET is 3GB, I presume that all of the SGA uses large pages unless we increase SGA_TARGET beyond the 5GB allocated, but this is only my guess.

MEMORY_MAX_TARGET and ORA_LPENABLE=0

So, now that we know how it works, let’s see the danger of running AMM with large pages.

Here is a database in AMM with MEMORY_TARGET=5GB, after running a query that needs several GB of buffer cache (thanks to a CACHE hint and “_serial_direct_read”=never) and several GB of PGA (thanks to a manual workarea size policy). First, without large pages:
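For reference, a minimal sketch of the kind of session settings involved (the table name and the sizes are placeholders, not the actual test case):

-- fill the buffer cache with a full scan kept in cache
alter session set "_serial_direct_read"=never;
select /*+ full(t) cache(t) */ count(*) from big_table t;
-- then force a large sort workarea in the PGA
alter session set workarea_size_policy=manual;
alter session set sort_area_size=1000000000;
select * from big_table order by 1, 2;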

VirtualBox_Windows 2012 Server_27_07_2016_21_56_24

The RAM allocation is all private process memory (on Windows, Oracle processes are actually threads of one single process). And AMM achieves its goal: the target is MEMORY_TARGET=5GB and this is what is allocated for the instance.

MEMORY_MAX_TARGET and ORA_LPENABLE=2

Now doing the same with large pages:
VirtualBox_Windows 2012 Server_27_07_2016_22_09_10

MEMORY_MAX_TARGET is supposed to cover PGA+SGA, and a large use of PGA should reduce the SGA. But that cannot happen here because the SGA is in large pages and PGA cannot be allocated from large pages. This means that AMM with large pages does not achieve its goal: MEMORY_MAX_TARGET acts like having set SGA_MAX_SIZE to that value with a very small PGA_AGGREGATE_TARGET. In this example (an artificial one, as I used a manual workarea size policy, but the same happens with auto and several sessions) physical memory is exhausted.

From that, I think we have a reason to recommend ASMM (Automatic Shared Memory Management) with large pages, as on Linux. With Oracle on Windows, because of the threaded architecture, it’s not a requirement, but in my opinion it’s still a good idea to differentiate those memory areas that are so different:

  • SGA: one area allocated at startup, preferentially from large pages
  • PGA: variable size areas allocated and de-allocated by sessions

If you have more than a few GB on your server, you should size SGA and PGA independently and benefit from large pages for the SGA. Do not set MEMORY_MAX_TARGET then. Don’t set MEMORY_TARGET either, as MEMORY_MAX_TARGET defaults to it when not set explicitly.
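As a minimal sketch (the sizes are examples only, to be adapted to your server), the ASMM settings would look like this:

-- switch from AMM to ASMM: size SGA and PGA separately, then restart the instance
-- so that the SGA is allocated from large pages (ORA_LPENABLE=1)
alter system set memory_target=0 scope=spfile;
alter system set memory_max_target=0 scope=spfile;
alter system set sga_max_size=4G scope=spfile;
alter system set sga_target=4G scope=spfile;
alter system set pga_aggregate_target=2G scope=spfile;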

 

Cet article Large Pages and MEMORY_TARGET on Windows est apparu en premier sur Blog dbi services.

Elasticsearch, Kibana, Logstash and Filebeat – Centralize all your database logs (and even more)

Wed, 2016-07-27 09:02

When it comes to centralizing logs from various sources (operating systems, databases, webservers, etc.) the ELK stack is becoming more and more popular in the open source world. ELK stands for Elasticsearch, Logstash and Kibana. Elasticsearch is based on Apache Lucene and its primary goal is to provide distributed search and analytic functions. Logstash is responsible for collecting logs from a variety of systems and is able to forward these to Elasticsearch. Kibana is the data visualization platform on top of Elasticsearch. Nowadays the term ELK does not seem to be used anymore and people speak about the Elastic Stack. In this post I’ll look at how you can use these tools to centralize your PostgreSQL log file(s) into the Elastic Stack.

As usual my VMs are running CentOS, although that should not be very important for the following (except for the yum commands). As Elasticsearch, Kibana and Logstash are all based on Java you’ll need to install Java before starting. There are yum and apt repositories available but I’ll use the manual way for getting the pieces up and running.

The goal is to have one VM running Elasticsearch, Logstash and Kibana and another VM which will run the PostgreSQL instance and Filebeat. Let’s start with the first one by installing Java and setting up a dedicated user for running the stack:

[root@elk ~]# yum install -y java-1.8.0-openjdk
[root@elk ~]# groupadd elk
[root@elk ~]# useradd -g elk elk
[root@elk ~]# passwd elk
[root@elk ~]# mkdir -p /opt/elk
[root@elk ~]# chown elk:elk /opt/elk
[root@elk ~]# su - elk
[elk@elk ~]$ cd /opt/elk

The first of the products we’re going to install is Elasticsearch, which is quite easy (we’ll not set up a distributed mode, only a single node for the scope of this post). All we need to do is to download the tar file, extract it and start it:

[elk@elk ~]$ wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.4/elasticsearch-2.3.4.tar.gz
[elk@elk elk]$ tar -axf elasticsearch-2.3.4.tar.gz
[elk@elk elk]$ ls -l
total 26908
drwxrwxr-x. 6 elk elk     4096 Jul 27 09:17 elasticsearch-2.3.4
-rw-rw-r--. 1 elk elk 27547169 Jul  7 15:05 elasticsearch-2.3.4.tar.gz
[elk@elk elk]$ cd elasticsearch-2.3.4
[elk@elk elasticsearch-2.3.4]$ bin/elasticsearch &

This will start up Elasticsearch and print some messages to the screen:

[2016-07-27 09:20:10,529][INFO ][node                     ] [Shinchuko Lotus] version[2.3.4], pid[10112], build[e455fd0/2016-06-30T11:24:31Z]
[2016-07-27 09:20:10,534][INFO ][node                     ] [Shinchuko Lotus] initializing ...
[2016-07-27 09:20:11,090][INFO ][plugins                  ] [Shinchuko Lotus] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
[2016-07-27 09:20:11,114][INFO ][env                      ] [Shinchuko Lotus] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [46.8gb], net total_space [48.4gb], spins? [unknown], types [rootfs]
[2016-07-27 09:20:11,115][INFO ][env                      ] [Shinchuko Lotus] heap size [1015.6mb], compressed ordinary object pointers [true]
[2016-07-27 09:20:11,115][WARN ][env                      ] [Shinchuko Lotus] max file descriptors [4096] for elasticsearch process likely too low, consider increasing to at least [65536]
[2016-07-27 09:20:12,637][INFO ][node                     ] [Shinchuko Lotus] initialized
[2016-07-27 09:20:12,637][INFO ][node                     ] [Shinchuko Lotus] starting ...
[2016-07-27 09:20:12,686][INFO ][transport                ] [Shinchuko Lotus] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2016-07-27 09:20:12,690][INFO ][discovery                ] [Shinchuko Lotus] elasticsearch/zc26XSa5SA-f_Kvm_jfthA
[2016-07-27 09:20:15,769][INFO ][cluster.service          ] [Shinchuko Lotus] new_master {Shinchuko Lotus}{zc26XSa5SA-f_Kvm_jfthA}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-07-27 09:20:15,800][INFO ][gateway                  ] [Shinchuko Lotus] recovered [0] indices into cluster_state
[2016-07-27 09:20:15,803][INFO ][http                     ] [Shinchuko Lotus] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2016-07-27 09:20:15,803][INFO ][node                     ] [Shinchuko Lotus] started

The default port is 9200 and you should now be able to talk to Elasticsearch:

[elk@elk elasticsearch-2.3.4]$ curl -X GET http://localhost:9200/
{
  "name" : "Shinchuko Lotus",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.3.4",
    "build_hash" : "e455fd0c13dceca8dbbdbb1665d068ae55dabe3f",
    "build_timestamp" : "2016-06-30T11:24:31Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}

Looks good. The next product we’ll need to install is Kibana. The setup itself is as easy as setting up Elasticsearch:

[elk@elk elasticsearch-2.3.4]$ cd /opt/elk/
[elk@elk elk]$ wget https://download.elastic.co/kibana/kibana/kibana-4.5.3-linux-x64.tar.gz
[elk@elk elk]$ tar -axf kibana-4.5.3-linux-x64.tar.gz
[elk@elk elk]$ cd kibana-4.5.3-linux-x64
[elk@elk kibana-4.5.3-linux-x64]$ grep elasticsearch.url config/kibana.yml 
elasticsearch.url: "http://localhost:9200"
[elk@elk kibana-4.5.3-linux-x64]$ bin/kibana &

Similar to Elasticsearch the startup messages are written to the screen:

  log   [09:27:30.208] [info][status][plugin:kibana] Status changed from uninitialized to green - Ready
  log   [09:27:30.237] [info][status][plugin:elasticsearch] Status changed from uninitialized to yellow - Waiting for Elasticsearch
  log   [09:27:30.239] [info][status][plugin:kbn_vislib_vis_types] Status changed from uninitialized to green - Ready
  log   [09:27:30.242] [info][status][plugin:markdown_vis] Status changed from uninitialized to green - Ready
  log   [09:27:30.253] [info][status][plugin:metric_vis] Status changed from uninitialized to green - Ready
  log   [09:27:30.257] [info][status][plugin:spyModes] Status changed from uninitialized to green - Ready
  log   [09:27:30.261] [info][status][plugin:statusPage] Status changed from uninitialized to green - Ready
  log   [09:27:30.263] [info][status][plugin:table_vis] Status changed from uninitialized to green - Ready
  log   [09:27:30.270] [info][listening] Server running at http://0.0.0.0:5601
  log   [09:27:35.320] [info][status][plugin:elasticsearch] Status changed from yellow to yellow - No existing Kibana index found
[2016-07-27 09:27:35,513][INFO ][cluster.metadata         ] [Shinchuko Lotus] [.kibana] creating index, cause [api], templates [], shards [1]/[1], mappings [config]
[2016-07-27 09:27:35,938][INFO ][cluster.routing.allocation] [Shinchuko Lotus] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[.kibana][0]] ...]).
  log   [09:27:38.746] [info][status][plugin:elasticsearch] Status changed from yellow to green - Kibana index ready

To check if Kibana is really working point your browser to http://[hostname]:5601 (192.168.22.173 in my case):
kibana_01

The third product we’ll need is Logstash. It is almost the same procedure to get it up and running:

[elk@elk kibana-4.5.3-linux-x64]$ cd /opt/elk/
[elk@elk elk]$ wget https://download.elastic.co/logstash/logstash/logstash-all-plugins-2.3.4.tar.gz
[elk@elk elk]$ tar -axf logstash-all-plugins-2.3.4.tar.gz
[elk@elk elk]$ cd logstash-2.3.4

To test if Logstash is running fine start a very simple pipeline:

[elk@elk logstash-2.3.4]$ bin/logstash -e 'input { stdin { } } output { stdout {} }'
Settings: Default pipeline workers: 1
Pipeline main started

Once this is up type something on the command line to check if Logstash is responding:

yipphea
2016-07-27T07:52:43.607Z elk yipphea

Looks good as well. For now we can stop Logstash again with “Control-C”:

^CSIGINT received. Shutting down the agent. {:level=>:warn}
stopping pipeline {:id=>"main"}
Received shutdown signal, but pipeline is still waiting for in-flight events
to be processed. Sending another ^C will force quit Logstash, but this may cause data loss. {:level=>:warn}

Pipeline main has been shutdown

Now we need to do some configuration to prepare Logstash for receiving our PostgreSQL log file(s) through Filebeat. Filebeat will be responsible for forwarding the PostgreSQL log file(s) to Logstash.

[elk@elk logstash-2.3.4]$ pwd
/opt/elk/logstash-2.3.4
[elk@elk logstash-2.3.4]$ mkdir conf.d
[elk@elk logstash-2.3.4]$ cat conf.d/02-beats-input.conf
input {
  beats {
    port => 5044
    ssl => false
  }
}

What this does is tell Logstash to create a new input of type beats and to listen on port 5044 for incoming data. In addition to this input plugin Logstash will need an output plugin to know what it shall do with the data coming in. As we want to send all the data to Elasticsearch we need to specify this:

[elk@elk logstash-2.3.4]$ cat conf.d/10-elasticsearch-output.conf
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Let’s test if the configuration is fine (Logstash will read all configuration files in order):

[elk@elk logstash-2.3.4]$ bin/logstash --config /opt/elk/logstash-2.3.4/conf.d/ --configtest
Configuration OK

As all seems fine we can start Logstash with our new configuration:

[elk@elk logstash-2.3.4]$ bin/logstash --config /opt/elk/logstash-2.3.4/conf.d/ &
Settings: Default pipeline workers: 1
Pipeline main started

To be able to easily use the Filebeat index patterns in Kibana we’ll load the template dashboards provided by Elastic:

[elk@elk logstash-2.3.4]$ cd /opt/elk/
[elk@elk elk]$ wget http://download.elastic.co/beats/dashboards/beats-dashboards-1.2.3.zip
[elk@elk elk]$ unzip beats-dashboards-1.2.3.zip
[elk@elk elk]$ cd beats-dashboards-1.2.3
[elk@elk beats-dashboards-1.2.3]$ ./load.sh

Time to switch to the PostgreSQL VM to install Filebeat:

postgres@centos7:/u01/app/postgres/product/ [PG1] pwd
/u01/app/postgres/product
postgres@centos7:/u01/app/postgres/product/ [PG1] wget https://download.elastic.co/beats/filebeat/filebeat-1.2.3-x86_64.tar.gz
postgres@centos7:/u01/app/postgres/product/ [PG1] tar -axf filebeat-1.2.3-x86_64.tar.gz

Filebeat comes with an index template for Elasticsearch which we now need to transfer to the host where Elasticsearch runs so that we can load it:

postgres@centos7:/u01/app/postgres/product/filebeat-1.2.3-x86_64/ [PG1] ls *template*
filebeat.template.json
postgres@centos7:/u01/app/postgres/product/filebeat-1.2.3-x86_64/ [PG1] scp filebeat.template.json elk@192.168.22.173:/var/tmp/
elk@192.168.22.173's password: 
filebeat.template.json                                                      100%  814     0.8KB/s   00:00  

On the host where Elasticsearch runs we can now load the template into Elasticsearch:

[elk@elk elk]$ curl -XPUT 'http://localhost:9200/_template/filebeat' -d@/var/tmp/filebeat.template.json
{"acknowledged":true}
[elk@elk elk]$ 

Back to the PostgreSQL VM we need to configure Filebeat itself by adapting the filebeat.yml configuration file:

postgres@centos7:/u01/app/postgres/product/filebeat-1.2.3-x86_64/ [PG1] ls
filebeat  filebeat.template.json  filebeat.yml

There are only a few important points to configure. The first one is to tell Filebeat where to look for the PostgreSQL log files:

filebeat:
  # List of prospectors to fetch data.
  prospectors:
    # Each - is a prospector. Below are the prospector specific configurations
    -
      # Paths that should be crawled and fetched. Glob based paths.
      # To fetch all ".log" files from a specific level of subdirectories
      # /var/log/*/*.log can be used.
      # For each file found under this path, a harvester is started.
      # Make sure not file is defined twice as this can lead to unexpected behaviour.
      paths:
        - /u02/pgdata/PG1/pg_log/*.log

Afterwards make sure you disable (comment out) the Elasticsearch output plugin under the “output” section:

  ### Elasticsearch as output
  #elasticsearch:
    # Array of hosts to connect to.
    # Scheme and port can be left out and will be set to the default (http and 9200)
    # In case you specify and additional path, the scheme is required: http://localhost:9200/path
    # IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
    #hosts: ["localhost:9200"]

Finally enable the Logstash output plugin in the same section (provide the host and port where your Logstash is running):

       ### Logstash as output
  logstash:
    # The Logstash hosts
    hosts: ["192.168.22.173:5044"]

The host and port specified here must match what we specified in 02-beats-input.conf when we configured Logstash above. This should be sufficient to start up Filebeat:

postgres@centos7:/u01/app/postgres/product/filebeat-1.2.3-x86_64/ [PG1] ./filebeat &
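
If Filebeat does not seem to pick up or ship anything, a quick way to troubleshoot is to run it in the foreground and log to the console (the -e, -c and -d flags below are the standard flags of the 1.x beats; adjust if your version differs):

postgres@centos7:/u01/app/postgres/product/filebeat-1.2.3-x86_64/ [PG1] ./filebeat -e -c filebeat.yml -d "publish"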

If everything is working fine we should now be able to ask Elasticsearch for our data from the PostgreSQL log file(s):

[elk@elk elk]$ curl -XGET 'http://localhost:9200/filebeat-*/_search?pretty'
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 971,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1F",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 06:57:34.030 CEST - 2 - 20831 -  - @ LOG:  MultiXact member wraparound protections are now enabled",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "type" : "log",
        "input_type" : "log",
        "fields" : null,
        "count" : 1,
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "offset" : 112,
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1M",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 07:20:46.060 CEST - 2 - 20835 -  - @ LOG:  autovacuum launcher shutting down",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "fields" : null,
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "offset" : 948,
        "type" : "log",
        "count" : 1,
        "input_type" : "log",
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1P",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 07:20:46.919 CEST - 5 - 20832 -  - @ LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.678 s; sync files=0, longest=0.000 s, average=0.000 s; distance=10908 kB, estimate=10908 kB",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "input_type" : "log",
        "count" : 1,
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "type" : "log",
        "fields" : null,
        "offset" : 1198,
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1R",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 09:36:34.600 CEST - 1 - 2878 -  - @ LOG:  database system was shut down at 2016-05-15 07:20:46 CEST",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "offset" : 1565,
        "input_type" : "log",
        "count" : 1,
        "fields" : null,
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "type" : "log",
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1X",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 09:39:21.313 CEST - 3 - 3048 - [local] - postgres@postgres STATEMENT:  insert into t1 generate_series(1,1000000);",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "offset" : 2216,
        "type" : "log",
        "input_type" : "log",
        "count" : 1,
        "fields" : null,
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1e",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 09:43:24.366 CEST - 3 - 3397 - [local] - postgres@postgres CONTEXT:  while updating tuple (0,1) in relation \"t1\"",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "offset" : 3165,
        "type" : "log",
        "input_type" : "log",
        "count" : 1,
        "fields" : null,
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1l",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 09:43:46.776 CEST - 10 - 3397 - [local] - postgres@postgres CONTEXT:  while updating tuple (0,1) in relation \"t1\"",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "offset" : 4045,
        "type" : "log",
        "count" : 1,
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "input_type" : "log",
        "fields" : null,
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1r",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 09:45:39.837 CEST - 9 - 3048 - [local] - postgres@postgres ERROR:  type \"b\" does not exist at character 28",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "input_type" : "log",
        "count" : 1,
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "offset" : 4799,
        "type" : "log",
        "fields" : null,
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1w",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 09:45:49.843 CEST - 14 - 3048 - [local] - postgres@postgres ERROR:  current transaction is aborted, commands ignored until end of transaction block",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "type" : "log",
        "fields" : null,
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "offset" : 5400,
        "input_type" : "log",
        "count" : 1,
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    }, {
      "_index" : "filebeat-2016.07.27",
      "_type" : "log",
      "_id" : "AVYrey_1IfCoJOBMaP1x",
      "_score" : 1.0,
      "_source" : {
        "message" : "2016-05-15 09:45:49.843 CEST - 15 - 3048 - [local] - postgres@postgres STATEMENT:  alter table t1 add column b int;",
        "@version" : "1",
        "@timestamp" : "2016-07-27T08:31:44.940Z",
        "offset" : 5559,
        "count" : 1,
        "fields" : null,
        "beat" : {
          "hostname" : "centos7.local",
          "name" : "centos7.local"
        },
        "source" : "/u02/pgdata/PG1/pg_log/postgresql-Sun.log",
        "type" : "log",
        "input_type" : "log",
        "host" : "centos7.local",
        "tags" : [ "beats_input_codec_plain_applied" ]
      }
    } ]
  }
}

Quite a lot of information, so it is really working :) If we can ask Elasticsearch we should be able to use Kibana on the same data, too, shouldn’t we? Fire up your browser and point it to your Kibana URL (192.168.22.173:5601 in my case). You should see the “filebeat-*” index pattern in the upper left:

kibana_02

Select the “filebeat-*” index pattern:
kibana_03

To make this index the default one click on the green star:
kibana_04

Time to discover our data by using the “Discover” menu on the top:
kibana_05

The result of that should be that you can see all the PostgreSQL log messages on the screen:
kibana_06

Try to search something, e.g. “checkpoint complete”:
kibana_07

Not much is happening on my instance, so let’s do some checkpoints:

postgres@centos7:/u01/app/postgres/product/filebeat-1.2.3-x86_64/ [PG1] sqh
psql (9.6beta1 dbi services build)
Type "help" for help.

(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.403 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.379 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.364 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.321 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.370 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.282 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.411 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 101.166 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.392 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.322 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.367 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.320 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.328 ms
(postgres@[local]:5432) [postgres] > checkpoint;
CHECKPOINT
Time: 100.285 ms

What is the picture now:
kibana_08

Isn’t that great? All the information in near real time, just by using a browser. In my case the logs are coming from a single PostgreSQL instance, but they could be coming from hundreds of instances. Logs could also be coming from web servers, application servers, operating systems, the network, … All centralized in one place, ready to analyze. Even better, you could use Watcher to alert on changes in your data.
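
Watcher is a commercial plugin from Elastic and not part of the setup described here, but just to give an idea: a minimal watch which checks the filebeat index for PostgreSQL errors once a minute and writes a log entry when it finds some could look roughly like this (a sketch only; the watch name and the query are made up for illustration):

[elk@elk elk]$ curl -XPUT 'http://localhost:9200/_watcher/watch/pg_error_watch' -d '{
  "trigger":   { "schedule": { "interval": "1m" } },
  "input":     { "search": { "request": {
                   "indices": [ "filebeat-*" ],
                   "body": { "query": { "bool": { "must": [
                     { "match": { "message": "ERROR" } },
                     { "range": { "@timestamp": { "gte": "now-1m" } } }
                   ] } } } } } },
  "condition": { "compare": { "ctx.payload.hits.total": { "gt": 0 } } },
  "actions":   { "log_error": { "logging": { "text": "{{ctx.payload.hits.total}} PostgreSQL errors in the last minute" } } }
}'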

Ah, I can already hear it: But I need to see my performance metrics as well. No problem, there is the jdbc input plugin for Logstash. What can you do with that? Once configured you can query whatever you want from your database. Let’s do a little demo.

As we downloaded Logstash with all plugins included, the jdbc input plugin is already there:

[elk@elk logstash-2.3.4]$ pwd
/opt/elk/logstash-2.3.4
[elk@elk logstash-2.3.4]$ find . -name *jdbc*
./vendor/bundle/jruby/1.9/gems/jdbc-sqlite3-3.8.11.2
./vendor/bundle/jruby/1.9/gems/jdbc-sqlite3-3.8.11.2/jdbc-sqlite3.gemspec
./vendor/bundle/jruby/1.9/gems/jdbc-sqlite3-3.8.11.2/lib/jdbc
./vendor/bundle/jruby/1.9/gems/jdbc-sqlite3-3.8.11.2/lib/sqlite-jdbc-3.8.11.2.jar
./vendor/bundle/jruby/1.9/gems/logstash-input-jdbc-3.1.0
./vendor/bundle/jruby/1.9/gems/logstash-input-jdbc-3.1.0/lib/logstash/inputs/jdbc.rb
./vendor/bundle/jruby/1.9/gems/logstash-input-jdbc-3.1.0/lib/logstash/plugin_mixins/jdbc.rb
./vendor/bundle/jruby/1.9/gems/logstash-input-jdbc-3.1.0/logstash-input-jdbc.gemspec
./vendor/bundle/jruby/1.9/gems/sequel-4.36.0/lib/sequel/adapters/jdbc
./vendor/bundle/jruby/1.9/gems/sequel-4.36.0/lib/sequel/adapters/jdbc/jdbcprogress.rb
./vendor/bundle/jruby/1.9/gems/sequel-4.36.0/lib/sequel/adapters/jdbc.rb
./vendor/bundle/jruby/1.9/specifications/jdbc-sqlite3-3.8.11.2.gemspec
./vendor/bundle/jruby/1.9/specifications/logstash-input-jdbc-3.1.0.gemspec

What you’ll need to provide in addition is the jdbc driver for the database you want to connect to, in my case for PostgreSQL:

[elk@elk logstash-2.3.4]$ cd /opt/elk/
[elk@elk elk]$ mkdir jdbc
[elk@elk elk]$ cd jdbc/
[elk@elk jdbc]$ wget https://jdbc.postgresql.org/download/postgresql-9.4.1209.jar
[elk@elk jdbc]$ ls
postgresql-9.4.1209.jar

All we need to do from here on is to configure another input and output plugin for Logstash:

[elk@elk conf.d]$ pwd
/opt/elk/logstash-2.3.4/conf.d
[elk@elk conf.d]$ cat 03-jdbc-postgres-input.conf
input {
    jdbc {
        jdbc_connection_string => "jdbc:postgresql://192.168.22.99:5432/postgres"
        jdbc_user => "postgres"
        jdbc_password => "postgres"
        jdbc_validate_connection => true
        jdbc_driver_library => "/opt/elk/jdbc/postgresql-9.4.1209.jar"
        jdbc_driver_class => "org.postgresql.Driver"
        statement => "SELECT * from pg_stat_activity"
    }
}
output {
    stdout { codec => json_lines }
}

Re-test if the configuration is fine:

[elk@elk logstash-2.3.4]$ bin/logstash --config /opt/elk/logstash-2.3.4/conf.d/ --configtest
Configuration OK

And then kill and restart Logstash:

[elk@elk logstash-2.3.4]$ bin/logstash --config /opt/elk/logstash-2.3.4/conf.d/

You should see data from pg_stat_activity right on the screen:

[elk@elk logstash-2.3.4]$ bin/logstash --config /opt/elk/logstash-2.3.4/conf.d/ 
Settings: Default pipeline workers: 1
Pipeline main started
{"datid":13322,"datname":"postgres","pid":3887,"usesysid":10,"usename":"postgres","application_name":"","client_addr":{"type":"inet","value":"192.168.22.173"},"client_hostname":null,"client_port":58092,"backend_start":"2016-07-27T13:15:25.421Z","xact_start":"2016-07-27T13:15:25.716Z","query_start":"2016-07-27T13:15:25.718Z","state_change":"2016-07-27T13:15:25.718Z","wait_event_type":null,"wait_event":null,"state":"active","backend_xid":null,"backend_xmin":{"type":"xid","value":"1984"},"query":"SELECT * from pg_stat_activity","@version":"1","@timestamp":"2016-07-27T13:15:26.712Z"}
{"message":"2016-07-27 15:15:25.422 CEST - 1 - 3887 - 192.168.22.173 - [unknown]@[unknown] LOG:  connection received: host=192.168.22.173 port=58092","@version":"1","@timestamp":"2016-07-27T13:15:32.054Z","source":"/u02/pgdata/PG1/pg_log/postgresql-Wed.log","offset":84795,"type":"log","input_type":"log","count":1,"beat":{"hostname":"centos7.local","name":"centos7.local"},"fields":null,"host":"centos7.local","tags":["beats_input_codec_plain_applied"]}
{"message":"2016-07-27 15:15:25.454 CEST - 2 - 3887 - 192.168.22.173 - postgres@postgres LOG:  connection authorized: user=postgres database=postgres","@version":"1","@timestamp":"2016-07-27T13:15:32.054Z","source":"/u02/pgdata/PG1/pg_log/postgresql-Wed.log","fields":null,"beat":{"hostname":"centos7.local","name":"centos7.local"},"count":1,"offset":84932,"type":"log","input_type":"log","host":"centos7.local","tags":["beats_input_codec_plain_applied"]}

As we want to have this data in Elasticsearch and analyze it with Kibana, adjust the configuration to look like this (the schedule parameter uses cron syntax, so the statement is executed every minute) and then restart Logstash:

input {
    jdbc {
        jdbc_connection_string => "jdbc:postgresql://192.168.22.99:5432/postgres"
        jdbc_user => "postgres"
        jdbc_password => "postgres"
        jdbc_validate_connection => true
        jdbc_driver_library => "/opt/elk/jdbc/postgresql-9.4.1209.jar"
        jdbc_driver_class => "org.postgresql.Driver"
        statement => "SELECT * from pg_stat_activity"
        schedule => "* * * * *"
    }
}
output {
    elasticsearch {
        index => "pg_stat_activity"
        document_type => "pg_stat_activity"
        document_id => "%{uid}"
        hosts => ["localhost:9200"]
    }
}
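
Before switching to Kibana you can also quickly verify on the command line that the new index gets created and populated (just an additional check, not strictly required):

[elk@elk elk]$ curl -XGET 'http://localhost:9200/_cat/indices?v'
[elk@elk elk]$ curl -XGET 'http://localhost:9200/pg_stat_activity/_search?pretty&size=1'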

Once you have restarted Logstash head over to Kibana and create a new index pattern:
kibana_09

When you “Discover” you should see the data from pg_stat_activity:
kibana_10

Have fun with your data …

 

The article Elasticsearch, Kibana, Logstash and Filebeat – Centralize all your database logs (and even more) appeared first on Blog dbi services.

Getting started with Ansible – Creating the PostgreSQL instance

Tue, 2016-07-26 08:20

In the last three posts we did the initial Ansible setup, installed the operating system packages, created the PostgreSQL group and user and downloaded, compiled and installed the PostgreSQL binaries from source. In this post we’ll look at how we can use Ansible to create our first PostgreSQL instance.

As a reminder this is our current playbook:

[ansible@ansiblecontrol ansible]$ pwd
/opt/ansible
[ansible@ansiblecontrol ansible]$ cat roles/postgresqldbserver/tasks/main.yml 
---
- name: Install PostgreSQL dependencies
  yum: name={{item}} state=present
  with_items:
   - gcc
   - openldap-devel
   - python-devel
   - readline-devel
   - openssl-devel
   - redhat-lsb
   - bison
   - flex
   - perl-ExtUtils-Embed
   - zlib-devel
   - crypto-utils
   - openssl-devel
   - pam-devel
   - libxml2-devel
   - libxslt-devel
   - tcl
   - tcl-devel
   - openssh-clients
   - bzip2
   - net-tools
   - wget
   - screen
   - ksh

- name: Add PostgreSQL operating system group
  group: name=postgressomegroup state=present

- name: Add PostgreSQL operating system user
  user: name=postgres comment="PostgreSQL binaries owner" group=postgres

- name: Download the PostgreSQL 9.5.3 sources
  get_url: url=https://ftp.postgresql.org/pub/source/v9.5.3/postgresql-9.5.3.tar.bz2 dest=/var/tmp mode=0755

- name: Copy PostgreSQL install script to targets
  copy: src=install_pg953.sh dest=/var/tmp/install_pg953.sh owner=root group=root mode="u=rwx"

- name: Compile and install PostgreSQL
  shell: /var/tmp/install_pg953.sh >> /var/tmp/install_pg_log
  args:
    executable: /usr/bin/bash

What we need now is an additional task which creates the PostgreSQL cluster on disk. As with the installation from source we’ll need to do some scripting for this. Here is a very basic script for doing this:

#!/usr/bin/bash

PGUSER="postgres"
PGGROUP="postgres"
DATADIRECTORY="/u02/pgdata/PG1"
XLOGLOCATION="/u03/pgdata/PG1"
PGHOME="/u01/app/postgres/product/95/db_3/"
POSTGRESDBPASSWORD="postgres"

mkdir -p ${DATADIRECTORY}
mkdir -p ${XLOGLOCATION}
chown ${PGUSER}:${PGGROUP} ${DATADIRECTORY}
chown ${PGUSER}:${PGGROUP} ${XLOGLOCATION}

su - ${PGUSER} -c "echo ${POSTGRESDBPASSWORD} > /var/tmp/tmp_pwd"
su - ${PGUSER} -c "${PGHOME}/bin/initdb -D ${DATADIRECTORY} --pwfile=/var/tmp/tmp_pwd -X ${XLOGLOCATION} -k"

rm -f /var/tmp/tmp_pwd

su - ${PGUSER} -c "${PGHOME}/bin/pg_ctl -D ${DATADIRECTORY} start"

As with the PostgreSQL installation script we’ll put this into the “files” directory of our “postgresqldbserver” role:

[ansible@ansiblecontrol ansible]$ ls roles/postgresqldbserver/files/
create_pg953_cluster.sh  install_pg953.sh

We’ll use the same logic again and add two new tasks to our existing playbook:

- name: Copy the PostgreSQL cluster creation script to the targets
  copy: src=create_pg953_cluster.sh dest=/var/tmp/create_pg953_cluster.sh owner=root group=root mode="u=rwx"

- name: Create and start the PostgreSQL instance
  shell: /var/tmp/create_pg953_cluster.sh >> /var/tmp/create_pg_cluster.log
  args:
    executable: /usr/bin/bash

Once the playbook is executed again:

[ansible@ansiblecontrol ansible]$ ansible-playbook -i /opt/ansible/development /opt/ansible/roles/postgresqldbserver/site.yml

PLAY [postgresql-servers] ******************************************************

TASK [setup] *******************************************************************
ok: [192.168.22.171]
ok: [192.168.22.172]

TASK [postgresqldbserver : Install PostgreSQL dependencies] ********************
ok: [192.168.22.172] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])
ok: [192.168.22.171] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])

TASK [postgresqldbserver : Add PostgreSQL operating system group] **************
ok: [192.168.22.171]
ok: [192.168.22.172]

TASK [postgresqldbserver : Add PostgreSQL operating system user] ***************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Download the PostgreSQL 9.5.3 sources] **************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Copy PostgreSQL install script to targets] **********
changed: [192.168.22.171]
changed: [192.168.22.172]

TASK [postgresqldbserver : Compile and install PostgreSQL] *********************
changed: [192.168.22.172]
changed: [192.168.22.171]

TASK [postgresqldbserver : Copy the PostgreSQL cluster creation script to the targets] ***
ok: [192.168.22.171]
changed: [192.168.22.172]

TASK [postgresqldbserver : Create and start the PostgreSQL instance] ***********
changed: [192.168.22.172]
changed: [192.168.22.171]

PLAY RECAP *********************************************************************
192.168.22.171             : ok=9    changed=3    unreachable=0    failed=0   
192.168.22.172             : ok=9    changed=4    unreachable=0    failed=0   

… we should have a running PostgreSQL instance on both nodes, let’s check:

[root@ansiblepg1 tmp]# ps -ef | grep postgres
postgres 17284     1  0 08:47 ?        00:00:00 /u01/app/postgres/product/95/db_3/bin/postgres -D /u02/pgdata/PG1
postgres 17288 17284  0 08:47 ?        00:00:00 postgres: checkpointer process   
postgres 17289 17284  0 08:47 ?        00:00:00 postgres: writer process   
postgres 17290 17284  0 08:47 ?        00:00:00 postgres: wal writer process   
postgres 17291 17284  0 08:47 ?        00:00:00 postgres: autovacuum launcher process   
postgres 17292 17284  0 08:47 ?        00:00:00 postgres: stats collector process   
root     17294 10223  0 08:47 pts/1    00:00:00 grep --color=auto postgres
[root@ansiblepg2 ~]# ps -ef | grep postgres
postgres 16951     1  0 08:47 ?        00:00:00 /u01/app/postgres/product/95/db_3/bin/postgres -D /u02/pgdata/PG1
postgres 16955 16951  0 08:47 ?        00:00:00 postgres: checkpointer process   
postgres 16956 16951  0 08:47 ?        00:00:00 postgres: writer process   
postgres 16957 16951  0 08:47 ?        00:00:00 postgres: wal writer process   
postgres 16958 16951  0 08:47 ?        00:00:00 postgres: autovacuum launcher process   
postgres 16959 16951  0 08:47 ?        00:00:00 postgres: stats collector process   
root     16985 16964  0 08:48 pts/0    00:00:00 grep --color=auto postgres

Isn’t that cool? Sure, there is a lot of hard coding in here which needs to be replaced by variables. This is a topic for another post.
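
Just to sketch the idea (the variable names are made up for illustration and not part of the playbook above): the values could go into the vars directory of the role, which Ansible loads automatically, and then be referenced in the tasks, e.g.:

[ansible@ansiblecontrol ansible]$ cat roles/postgresqldbserver/vars/main.yml
---
pg_version: "9.5.3"
pg_user: postgres
pg_group: postgres

[ansible@ansiblecontrol ansible]$ grep -A1 "Download" roles/postgresqldbserver/tasks/main.yml
- name: Download the PostgreSQL {{ pg_version }} sources
  get_url: url=https://ftp.postgresql.org/pub/source/v{{ pg_version }}/postgresql-{{ pg_version }}.tar.bz2 dest=/var/tmp mode=0755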

For your reference here is the complete playbook:

[ansible@ansiblecontrol ansible]$ cat /opt/ansible/roles/postgresqldbserver/tasks/main.yml 
---
- name: Install PostgreSQL dependencies
  yum: name={{item}} state=present
  with_items:
   - gcc
   - openldap-devel
   - python-devel
   - readline-devel
   - openssl-devel
   - redhat-lsb
   - bison
   - flex
   - perl-ExtUtils-Embed
   - zlib-devel
   - crypto-utils
   - openssl-devel
   - pam-devel
   - libxml2-devel
   - libxslt-devel
   - tcl
   - tcl-devel
   - openssh-clients
   - bzip2
   - net-tools
   - wget
   - screen
   - ksh

- name: Add PostgreSQL operating system group
  group: name=postgressomegroup state=present

- name: Add PostgreSQL operating system user
  user: name=postgres comment="PostgreSQL binaries owner" group=postgres

- name: Download the PostgreSQL 9.5.3 sources
  get_url: url=https://ftp.postgresql.org/pub/source/v9.5.3/postgresql-9.5.3.tar.bz2 dest=/var/tmp mode=0755

- name: Copy PostgreSQL install script to targets
  copy: src=install_pg953.sh dest=/var/tmp/install_pg953.sh owner=root group=root mode="u=rwx"

- name: Compile and install PostgreSQL
  shell: /var/tmp/install_pg953.sh >> /var/tmp/install_pg_log
  args:
    executable: /usr/bin/bash

- name: Copy the PostgreSQL cluster creation script to the targets
  copy: src=create_pg953_cluster.sh dest=/var/tmp/create_pg953_cluster.sh owner=root group=root mode="u=rwx"

- name: Create and start the PostgreSQL instance
  shell: /var/tmp/create_pg953_cluster.sh >> /var/tmp/create_pg_cluster.log
  args:
    executable: /usr/bin/bash
 

The article Getting started with Ansible – Creating the PostgreSQL instance appeared first on Blog dbi services.

Getting started with Ansible – Download the PostgreSQL sources, compile and install

Mon, 2016-07-25 23:21

In the last post in this series we looked at how you can instruct Ansible to install packages on the operating system using the yum module and how you can create groups and users by using the group and the user modules. In this post we’ll look at how you can download the PostgreSQL sources, compile and then finally install the binaries by extending our existing playbook.

If you remember the last post our current Ansible playbook looks like this:

[ansible@ansiblecontrol ansible]$ pwd
/opt/ansible
[ansible@ansiblecontrol ansible]$ cat roles/postgresqldbserver/tasks/main.yml 
---
- name: Install PostgreSQL dependencies
  yum: name={{item}} state=present
  with_items:
   - gcc
   - openldap-devel
   - python-devel
   - readline-devel
   - openssl-devel
   - redhat-lsb
   - bison
   - flex
   - perl-ExtUtils-Embed
   - zlib-devel
   - crypto-utils
   - openssl-devel
   - pam-devel
   - libxml2-devel
   - libxslt-devel
   - tcl
   - tcl-devel
   - openssh-clients
   - bzip2
   - net-tools
   - wget
   - screen
   - ksh

- name: Add PostgreSQL operating system group
  group: name=postgressomegroup state=present

- name: Add PostgreSQL operating system user
  user: name=postgres comment="PostgreSQL binaries owner" group=postgres

Basically we have three tasks:

  • Install the required operating system packages
  • Create the PostgreSQL operating system group
  • Create the PostgreSQL operating system user

The next task we want Ansible to do is to download the PostgreSQL sources. In this example we’ll download from the official PostgreSQL servers, but if you do not have connectivity to the outside world this can be a local server in your company’s network as well. The module we’ll use for that is the get_url module. All we need to do is to add the following lines to our playbook:

- name: Download the PostgreSQL 9.5.3 sources
  get_url: url=https://ftp.postgresql.org/pub/source/v9.5.3/postgresql-9.5.3.tar.bz2 dest=/var/tmp mode=0755

Once we execute the playbook again:

[ansible@ansiblecontrol ~]$ ansible-playbook -i /opt/ansible/development /opt/ansible/roles/postgresqldbserver/site.yml

PLAY [postgresql-servers] ******************************************************

TASK [setup] *******************************************************************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Install PostgreSQL dependencies] ********************
ok: [192.168.22.171] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])
ok: [192.168.22.172] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])

TASK [postgresqldbserver : Add PostgreSQL operating system group] **************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Add PostgreSQL operating system user] ***************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Download the PostgreSQL 9.5.3 sources] **************
changed: [192.168.22.171]
changed: [192.168.22.172]

PLAY RECAP *********************************************************************
192.168.22.171             : ok=5    changed=1    unreachable=0    failed=0   
192.168.22.172             : ok=5    changed=1    unreachable=0    failed=0   

The sources are available on the PostgreSQL hosts:

[root@ansiblepg1 ~] ls /var/tmp/post*
/var/tmp/postgresql-9.5.3.tar.bz2

Now we need to do some scripting as there is no pre-defined module for installing PostgreSQL from source. Here is a very basic script to do that:

#!/usr/bin/bash
PGSOURCE="/var/tmp/postgresql-9.5.3.tar.bz2"
PGUSER="postgres"
PGGROUP="postgres"
PGHOME="/u01/app/postgres/product/95/db_3"
SEGSIZE=2
BLOCKSIZE=8
mkdir -p /u01/app
chown ${PGUSER}:${PGGROUP} /u01/app
su - ${PGUSER} -c "cd /var/tmp/; tar -axf ${PGSOURCE}"
su - ${PGUSER} -c "cd /var/tmp/postgresql-9.5.3; ./configure --prefix=${PGHOME} \
            --exec-prefix=${PGHOME} \
            --bindir=${PGHOME}/bin \
            --libdir=${PGHOME}/lib \
            --sysconfdir=${PGHOME}/etc \
            --includedir=${PGHOME}/include \
            --datarootdir=${PGHOME}/share \
            --datadir=${PGHOME}/share \
            --with-pgport=5432 \
            --with-perl \
            --with-python \
            --with-tcl \
            --with-openssl \
            --with-pam \
            --with-ldap \
            --with-libxml \
            --with-libxslt \
            --with-segsize=${SEGSIZE} \
            --with-blocksize=${BLOCKSIZE} \
            --with-wal-segsize=16  \
            --with-extra-version=\" dbi services build\""
su - ${PGUSER} -c "cd /var/tmp/postgresql-9.5.3; make world"
su - ${PGUSER} -c "cd /var/tmp/postgresql-9.5.3; make install"
su - ${PGUSER} -c "cd /var/tmp/postgresql-9.5.3/contrib; make install"
rm -rf /var/tmp/postgresql*

Place a file with this content under the files directory of our role:

roles/postgresqldbserver/files/install_pg953.sh

There is another Ansible module called copy which we can now use to copy the file from our role’s files directory to the target server. All we need to do is to add the following lines to our playbook:

- name: Copy PostgreSQL install script to targets
  copy: src=install_pg953.sh dest=/var/tmp/install_pg953.sh owner=postgres group=postgres mode="u=rwx"

Once we execute the playbook the file is distributed to all targets (here the check for the first node):

[root@ansiblepg1 ~] ls /var/tmp/install*
/var/tmp/install_pg953.sh

The only thing we need to do from now on to get PostgreSQL installed on the target system is to execute this file. How can we do that? Very easily, by using the shell module. Add these lines to the playbook:

- name: Compile and install PostgreSQL
  shell: /var/tmp/install_pg953.sh >> /var/tmp/install_pg_log
  args:
    executable: /usr/bin/bash

… re-execute the playbook:

[ansible@ansiblecontrol ansible]$ ansible-playbook -i /opt/ansible/development /opt/ansible/roles/postgresqldbserver/site.yml

PLAY [postgresql-servers] ******************************************************

TASK [setup] *******************************************************************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Install PostgreSQL dependencies] ********************
ok: [192.168.22.171] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])
ok: [192.168.22.172] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])

TASK [postgresqldbserver : Add PostgreSQL operating system group] **************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Add PostgreSQL operating system user] ***************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Download the PostgreSQL 9.5.3 sources] **************
changed: [192.168.22.172]
changed: [192.168.22.171]

TASK [postgresqldbserver : Copy PostgreSQL install script to targets] **********
ok: [192.168.22.171]
ok: [192.168.22.172]

TASK [postgresqldbserver : Compile and install PostgreSQL] *********************
changed: [192.168.22.171]
changed: [192.168.22.172]

PLAY RECAP *********************************************************************
192.168.22.171             : ok=7    changed=2    unreachable=0    failed=0   
192.168.22.172             : ok=7    changed=2    unreachable=0    failed=0   

… and we are done. Just to prove it:

[root@ansiblepg1 ~] /u01/app/postgres/product/95/db_3/bin/psql --version
psql (PostgreSQL) 9.5.3 dbi services build
[root@ansiblepg2 ~] /u01/app/postgres/product/95/db_3/bin/psql --version
psql (PostgreSQL) 9.5.3 dbi services build

PostgreSQL is installed and ready to create a new database cluster (which is the topic for the next post).

Btw: The two steps of copying the script to the targets and then executing it can be combined into one step by using the script module (see the sketch below).
Btw2: Of course you might do the same with Oracle, MySQL, MongoDB, Cassandra, …
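
To illustrate the first remark, a task using the script module could look roughly like this (just a sketch, not what was used in this post; the script is looked up in the files directory of the role, transferred to the target and executed there):

- name: Compile and install PostgreSQL (copy and execute in one step)
  script: install_pg953.sh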

 

The article Getting started with Ansible – Download the PostgreSQL sources, compile and install appeared first on Blog dbi services.

Getting started with Ansible – Installing OS packages, creating groups and users

Mon, 2016-07-25 08:13

It has been quite a while since the first post in this series: “Getting started with Ansible – Preparations“. If you recall from the initial post, Ansible was running on the control host and this simple Ansible command:

ansible postgres-servers -a "/bin/echo 11" -f 5

… was successfully executed against the “postgres-servers” group. So far, so good. Getting Ansible up and running just for this would not be very useful, so let’s see where we might go from here.

When you start to think about what you want to automate you should also start to think about how you want to organize your Ansible stuff. The documentation provides some guidelines which might or might not fit your needs. For the scope of this series let’s stick to what the documentation is recommending as one possible way to go. The directory layout on the control host will then be:

[ansible@ansiblecontrol ~]$ sudo mkdir /opt/ansible
[ansible@ansiblecontrol ~]$ sudo chown ansible:ansible /opt/ansible
[ansible@ansiblecontrol ~]$ touch /opt/ansible/development                  # the inventory file for the development hosts      
[ansible@ansiblecontrol ~]$ touch /opt/ansible/staging                      # the inventory file for the staging hosts
[ansible@ansiblecontrol ~]$ touch /opt/ansible/production                   # the inventory file for the production hosts
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/common                 # a role valid for "common" stuff
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/common/tasks
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/common/handlers
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/common/templates
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/common/files
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/common/vars
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/common/meta
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/postgresqldbserver    # a role valid for the PostgreSQL stuff
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/postgresqldbserver/tasks
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/postgresqldbserver/handlers
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/postgresqldbserver/templates
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/postgresqldbserver/files
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/postgresqldbserver/vars
[ansible@ansiblecontrol ~]$ mkdir /opt/ansible/roles/postgresqldbserver/meta

The concept of roles is explained in the documentation and you should definitely read that. We’ll come back to this later.

For now let’s place our two PostgreSQL hosts into the “development” inventory:

[ansible@ansiblecontrol ~]$ echo "[postgresql-servers]" >> /opt/ansible/development
[ansible@ansiblecontrol ~]$ echo "192.168.22.171" >> /opt/ansible/development
[ansible@ansiblecontrol ~]$ echo "192.168.22.172" >> /opt/ansible/development

Passing our new inventory file to Ansible we should be able to perform the same simple task as in the first post:

[ansible@ansiblecontrol ~]$ ansible -i /opt/ansible/development postgresql-servers -a "/bin/echo 11"
192.168.22.172 | SUCCESS | rc=0 >>
11

192.168.22.171 | SUCCESS | rc=0 >>
11

Ok, fine, this still works. When it comes to PostgreSQL, one of the first steps when we want to install from source is to install all the required operating system packages. How could we do that with Ansible?

The initial step is to tell Ansible where to look for our roles. This is done by specifying the “roles_path” configuration parameter in the ansible.cfg configuration file:

[ansible@ansiblecontrol ~]$ cat /etc/ansible/ansible.cfg | grep roles | grep -v "#"
roles_path    = /opt/ansible/roles

From here on we need to setup our role by creating an initial “site.yml” file:

[ansible@ansiblecontrol ansible]$ cat roles/postgresqldbserver/site.yml 
---
# This playbook deploys a single PostgreSQL instance from the source code

- hosts: postgresql-servers
  become: true
  become_user: root

  roles:
    - postgresqldbserver

You can see from the above that the “postgresql-servers” group is referenced. Additionally notice the “become” and the “become_user” flags. As we’re going to use yum to install the packages we need a way to become root on the target system and this is how you can instruct Ansible to do so.
Time to specify how we want to install the packages. This is quite easy as well:

[ansible@ansiblecontrol ansible]$ cat roles/postgresqldbserver/tasks/main.yml 
---
- name: Install PostgreSQL dependencies
  yum: name={{item}} state=present
  with_items:
   - gcc
   - openldap-devel
   - python-devel
   - readline-devel
   - openssl-devel
   - redhat-lsb
   - bison
   - flex
   - perl-ExtUtils-Embed
   - zlib-devel
   - crypto-utils
   - openssl-devel
   - pam-devel
   - libxml2-devel
   - libxslt-devel
   - tcl
   - tcl-devel
   - openssh-clients
   - bzip2
   - net-tools
   - wget
   - screen
   - ksh

What did we do here? We created our first task. We tell Ansible to use “yum” to install our “items” (which are the packages we want to install). You can check the documentation for more information on the yum module.
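
By the way, the module documentation is also available right from the command line via ansible-doc, which can be handy while writing playbooks:

[ansible@ansiblecontrol ~]$ ansible-doc yum
[ansible@ansiblecontrol ~]$ ansible-doc -l | grep -i postgres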

Let’s see if it works and execute our first task on both PostgreSQL nodes:

[ansible@ansiblecontrol ~]$ ansible-playbook -i /opt/ansible/development /opt/ansible/roles/postgresqldbserver/site.yml

PLAY [postgresql-servers] ******************************************************

TASK [setup] *******************************************************************
ok: [192.168.22.171]
ok: [192.168.22.172]

TASK [postgresqldbserver : Install PostgreSQL dependencies] ********************
changed: [192.168.22.172] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])
changed: [192.168.22.171] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])

PLAY RECAP *********************************************************************
192.168.22.171             : ok=2    changed=1    unreachable=0    failed=0   
192.168.22.172             : ok=2    changed=1    unreachable=0    failed=0   

Cool, we just installed all the dependencies on both nodes with one Ansible command. We additionally want an operating system group for our PostgreSQL deployment so we add the following lines to the playbook:

- name: Add PostgreSQL operating system group
  group: name=postgressomegroup state=present

Execute the playbook again:

[ansible@ansiblecontrol ~]$ ansible-playbook -i /opt/ansible/development /opt/ansible/roles/postgresqldbserver/site.yml

PLAY [postgresql-servers] ******************************************************

TASK [setup] *******************************************************************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Install PostgreSQL dependencies] ********************
ok: [192.168.22.172] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])
ok: [192.168.22.171] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])

TASK [postgresqldbserver : Add PostgreSQL operating system group] **************
changed: [192.168.22.171]
changed: [192.168.22.172]

PLAY RECAP *********************************************************************
192.168.22.171             : ok=3    changed=1    unreachable=0    failed=0   
192.168.22.172             : ok=3    changed=1    unreachable=0    failed=0   

We did not change any of the packages but added the group. Let’s add the PostgreSQL operating system user by adding these lines to the playbook:

- name: Add PostgreSQL operating system user
  user: name=postgres comment="PostgreSQL binaries owner" group=postgres

Execute again:

[ansible@ansiblecontrol ~]$ ansible-playbook -i /opt/ansible/development /opt/ansible/roles/postgresqldbserver/site.yml

PLAY [postgresql-servers] ******************************************************

TASK [setup] *******************************************************************
ok: [192.168.22.172]
ok: [192.168.22.171]

TASK [postgresqldbserver : Install PostgreSQL dependencies] ********************
ok: [192.168.22.171] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])
ok: [192.168.22.172] => (item=[u'gcc', u'openldap-devel', u'python-devel', u'readline-devel', u'openssl-devel', u'redhat-lsb', u'bison', u'flex', u'perl-ExtUtils-Embed', u'zlib-devel', u'crypto-utils', u'openssl-devel', u'pam-devel', u'libxml2-devel', u'libxslt-devel', u'tcl', u'tcl-devel', u'openssh-clients', u'bzip2', u'net-tools', u'wget', u'screen', u'ksh'])

TASK [postgresqldbserver : Add PostgreSQL operating system group] **************
changed: [192.168.22.171]
changed: [192.168.22.172]

TASK [postgresqldbserver : Add PostgreSQL operating system user] ***************
changed: [192.168.22.171]
changed: [192.168.22.172]

PLAY RECAP *********************************************************************
192.168.22.171             : ok=4    changed=2    unreachable=0    failed=0   
192.168.22.172             : ok=4    changed=2    unreachable=0    failed=0   

Really cool and simple. Just to prove it, let’s connect to one of the nodes and check if the postgres user really is there:

[root@ansiblepg2 ~] id -a postgres
uid=1001(postgres) gid=1001(postgres) groups=1001(postgres)
[root@ansiblepg2 ~] 

Perfect. In the next post we’ll install the PostgreSQL binaries.

For your reference this is the playbook as it looks now:

[ansible@ansiblecontrol ansible]$ cat roles/postgresqldbserver/tasks/main.yml
---
- name: Install PostgreSQL dependencies
  yum: name={{item}} state=present
  with_items:
   - gcc
   - openldap-devel
   - python-devel
   - readline-devel
   - openssl-devel
   - redhat-lsb
   - bison
   - flex
   - perl-ExtUtils-Embed
   - zlib-devel
   - crypto-utils
   - openssl-devel
   - pam-devel
   - libxml2-devel
   - libxslt-devel
   - tcl
   - tcl-devel
   - openssh-clients
   - bzip2
   - net-tools
   - wget
   - screen
   - ksh

- name: Add PostgreSQL operating system group
  group: name=postgressomegroup state=present

- name: Add PostgreSQL operating system user
  user: name=postgres comment="PostgreSQL binaries owner" group=postgres
 

The article Getting started with Ansible – Installing OS packages, creating groups and users appeared first on Blog dbi services.

Redo log block size on ODA X6 all flash

Fri, 2016-07-22 16:43

On the Oracle Database Appliance, the redo logs are on Flash storage (and with X6 everything is on Flash storage) so you may wonder if we can benefit from 4k redo blocksize. Here are some tests about it on an ODA X6-2M.

I’ll compare the same workload (heavy inserts) with 512 byte and 4k redo block size. However, by default we can’t create a log group with a block size other than 512 bytes:

ORA-01378: The logical block size (4096) of file
/u03/app/oracle/redo/LABDB1/onlinelog/o1_mf_999_%u_.log is not compatible with
the disk sector size (media sector size is 512 and host sector size is 512)

This is because the flash storage is exposed with 512 bytes sector size:

ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED NORMAL N 512 4096 4194304 4894016 4500068 2441888 1023992 0 Y DATA/
MOUNTED NORMAL N 512 4096 4194304 1231176 221172 610468 -199762 0 N RECO/

Then, in order to be able to create new redo log groups with a higher block size, you need to set “_disk_sector_size_override” to TRUE:
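
Just for reference, this is roughly how it looks (a sketch only; group number, diskgroup and size follow the existing groups listed below):

SQL> alter system set "_disk_sector_size_override"=true scope=both;
SQL> alter database add logfile group 13 ('+RECO') size 51200M blocksize 4096;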

I have 3 log groups with 512 bytes block size, and 3 groups with 4k:


LOGFILE
GROUP 10 '+RECO/LABDB1/ONLINELOG/group_10.264.917867333' SIZE 51200M BLOCKSIZE 512,
GROUP 11 '+RECO/LABDB1/ONLINELOG/group_11.265.917867489' SIZE 51200M BLOCKSIZE 512,
GROUP 12 '+RECO/LABDB1/ONLINELOG/group_12.266.917867645' SIZE 51200M BLOCKSIZE 512,
GROUP 13 '+RECO/LABDB1/ONLINELOG/group_13.267.917867795' SIZE 51200M BLOCKSIZE 4096,
GROUP 14 '+RECO/LABDB1/ONLINELOG/group_14.268.917867913' SIZE 51200M BLOCKSIZE 4096,
GROUP 15 '+RECO/LABDB1/ONLINELOG/group_15.269.917868013' SIZE 51200M BLOCKSIZE 4096

In 12c the database files should be on ACFS and not directly on the diskgroup. We did this on purpose in order to check if there is any overhead with ACFS and we have seen exactly the same performance in both cases. There is something I dislike here however: redo log files are not multiplexed with multiple log members, but rely on the diskgroup redundancy. I agree with that in ASM because you are not supposed to manage the files and thus risk deleting one of them. But in ACFS you see only one file, and if you drop it by mistake, both mirrors are lost, with the latest transactions.

On an insert intensive workload I take AWR snapshots between two log switches:

ODAX6REDO
The switch between blocksize 512 and blocksize 4096 happened at 12:35

Don’t be nervous about those orange ‘log file sync’ waits: we had to run 10000 transactions per second in order to get some contention here.

We have to go into the details in order to compare, from an AWR Diff report:

Workload Comparison
~~~~~~~~~~~~~~~~~~~ 1st Per Sec 2nd Per Sec %Diff
--------------- --------------- ------
DB time: 37.9 37.3 -1.4
CPU time: 19.0 24.4 28.4
Background CPU time: 0.8 1.0 23.2
Redo size (bytes): 61,829,138.5 76,420,493.9 23.6
Logical read (blocks): 1,181,178.7 1,458,915.9 23.5
Block changes: 360,883.0 445,770.8 23.5
Physical read (blocks): 0.4 1.1 164.3
Physical write (blocks): 14,451.2 16,092.4 11.4
Read IO requests: 0.4 1.1 164.3
Write IO requests: 9,829.4 10,352.3 5.3
Read IO (MB): 0.0 0.0 100.0
Write IO (MB): 112.9 125.7 11.4
IM scan rows: 0.0 0.0 0.0
Session Logical Read IM:
User calls: 8,376.0 10,341.2 23.5
Parses (SQL): 5,056.0 6,247.8 23.6
Hard parses (SQL): 0.0 0.0 0.0
SQL Work Area (MB): 3.1 3.2 3.5
Logons: 0.4 0.3 -37.2
Executes (SQL): 225,554.2 278,329.3 23.4
Transactions: 10,911.0 13,486.4 23.6

The second workload, when redo blocksize was 4k, was able to handle 23% more activity.

‘log file sync’ average time is 1.3 milliseconds instead of 2.4:

Top Timed Events First DB/Inst: LABDB1/labdb1 Snaps: 155-156 (Elapsed time: 301.556 sec DB time: 11417.12 sec), Second DB/Inst: LABDB1/labdb1 Snaps: 157-158 (Elapsed time: 301.927 sec DB time: 11269.1 sec)
-> Events with a "-" did not make the Top list in this set of snapshots, but are displayed for comparison purposes
 
1st 2nd
------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------
Event Wait Class Waits Time(s) Avg Time(ms) %DB time Event Wait Class Waits Time(s) Avg Time(ms) %DB time
------------------------------ ------------- ------------ ------------ ------------- ----------- ------------------------------ ------------- ------------ ------------ ------------- -----------
CPU time N/A 5,722.8 N/A 50.1 CPU time N/A 7,358.4 N/A 65.3
log file sync Commit 2,288,655 5,412.1 2.4 47.4 log file sync Commit 2,808,036 3,535.5 1.3 31.4
target log write size Other 363,206 283.7 0.8 2.5 target log write size Other 644,287 278.2 0.4 2.5
log file parallel write System I/O 368,063 225.1 0.6 2.0 enq: TX - row lock contention Application 171,485 170.2 1.0 1.5
db file parallel write System I/O 12,399 160.2 12.9 1.4 db file parallel write System I/O 12,131 150.4 12.4 1.3
enq: TX - row lock contention Application 144,822 133.2 0.9 1.2 log file parallel write System I/O 649,501 148.1 0.2 1.3
library cache: mutex X Concurrency 130,800 120.8 0.9 1.1 library cache: mutex X Concurrency 86,632 128.1 1.5 1.1
log file sequential read System I/O 7,433 27.5 3.7 0.2 LGWR wait for redo copy Other 478,350 45.1 0.1 0.4
LGWR wait for redo copy Other 228,643 20.8 0.1 0.2 log file sequential read System I/O 6,577 21.7 3.3 0.2
buffer busy waits Concurrency 261,348 15.8 0.1 0.1 buffer busy waits Concurrency 295,880 20.1 0.1 0.2
--------------------------------------------------------------------------------------------------------------------

We see that this difference comes from lower latency in ‘log file parallel write':

Wait Events First DB/Inst: LABDB1/labdb1 Snaps: 155-156 (Elapsed time: 301.556 sec DB time: 11417.12 sec), Second DB/Inst: LABDB1/labdb1 Snaps: 157-158 (Elapsed time: 301.927 sec DB time: 11269.1 sec)
-> Ordered by absolute value of 'Diff' column of '% of DB time' descending (idle events last)
 
# Waits/sec (Elapsed Time) Total Wait Time (sec) Avg Wait Time (ms)
---------------------------------------- ---------------------------------------- -------------------------------------------
Event Wait Class 1st 2nd %Diff 1st 2nd %Diff 1st 2nd %Diff
------------------------------ ------------- -------------- -------------- ---------- -------------- -------------- ---------- --------------- --------------- -----------
log file sync Commit 7,589.5 9,300.4 22.5 5,412.1 3,535.5 -34.7 2.36 1.26 -46.61
log file parallel write System I/O 1,220.5 2,151.2 76.2 225.1 148.1 -34.2 0.61 0.23 -62.30
enq: TX - row lock contention Application 480.2 568.0 18.3 133.2 170.2 27.8 0.92 0.99 7.61
LGWR wait for redo copy Other 758.2 1,584.3 109.0 20.8 45.1 117.1 0.09 0.09 0.00
library cache: mutex X Concurrency 433.8 286.9 -33.8 120.8 128.1 6.0 0.92 1.48 60.87
db file parallel write System I/O 41.1 40.2 -2.3 160.2 150.4 -6.2 12.92 12.40 -4.02
cursor: pin S Concurrency 29.7 46.0 55.0 9.9 16.6 67.0 1.11 1.19 7.21
cursor: mutex X Concurrency 7.0 10.8 54.2 13.6 19.7 45.0 6.39 6.01 -5.95
latch: In memory undo latch Concurrency 585.3 749.0 28.0 10.8 16.3 50.8 0.06 0.07 16.67

To go into more detail, here is the wait event histogram for the 512-byte redo blocksize:

% of Waits
-----------------------------------------------
Total
Event Waits <1ms <2ms <4ms <8ms <16ms <32ms <=1s >1s
------------------------- ------ ----- ----- ----- ----- ----- ----- ----- -----
LGWR all worker groups 41 48.8 12.2 14.6 14.6 4.9 2.4 2.4
LGWR any worker group 259 6.2 5.4 8.9 13.9 18.1 18.1 29.3
LGWR wait for redo copy 228.9K 99.1 .9 .0
LGWR worker group orderin 442 44.6 9.7 4.5 5.0 9.3 10.6 16.3
log file parallel write 368.5K 85.3 7.5 4.7 1.4 .9 .2 .0
log file sequential read 7358 6.5 13.1 59.0 17.2 3.0 1.1 .2
log file sync 2.3M 48.9 23.1 17.0 5.7 2.7 2.3 .3

and for the 4096-byte blocksize:

% of Waits
-----------------------------------------------
Total
Event Waits <1ms <2ms <4ms <8ms <16ms <32ms <=1s >1s
------------------------- ------ ----- ----- ----- ----- ----- ----- ----- -----
LGWR all worker groups 20 45.0 5.0 15.0 10.0 5.0 20.0
LGWR any worker group 235 7.2 3.0 5.5 7.7 14.5 25.1 37.0
LGWR wait for redo copy 478.7K 98.9 1.0 .1 .0
LGWR worker group orderin 517 51.3 9.7 2.3 2.9 7.2 11.6 15.1
log file parallel write 649.9K 97.7 1.3 .3 .3 .4 .0 .0
log file sequential read 6464 5.7 8.2 73.5 11.0 1.2 .3 .1
log file sync 2.8M 78.2 15.6 2.3 .8 1.6 1.2 .
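These histograms come from the AWR ‘Wait Event Histogram’ section. A similar breakdown, cumulative since startup, can be taken directly from V$EVENT_HISTOGRAM; a minimal sketch:

column event format a30
select event, wait_time_milli, wait_count
from v$event_histogram
where event in ('log file sync','log file parallel write')
order by event, wait_time_milli;

12c also exposes V$EVENT_HISTOGRAM_MICRO if you need microsecond buckets at this latency level.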

A few milliseconds are not perceived by the end user at commit time, unless the application design is so bad that hundreds of commits are issued for each user interaction. Even if both runs are really fast, the log writer took more than 1ms for only about 2% of its writes with the 4k blocksize, versus about 15% with the default blocksize.

This lower latency shows up in the I/O statistics as well:

Reads: Reqs Data Writes: Reqs Data Waits: Avg
Function Name Data per sec per sec Data per sec per sec Count Tm(ms)
--------------- ------- ------- ------- ------- ------- ------- ------- -------

BLOCKSIZE 512:
LGWR 0M 0.0 0M 18.1G 2420.4 61.528M 368.9K 0.6
BLOCKSIZE 4096:
LGWR 0M 0.0 0M 24.1G 4263.5 81.689M 649.5K 0.2
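The same ‘IOStat by Function’ figures can be checked live in V$IOSTAT_FUNCTION. A minimal sketch, keeping only the LGWR line (as far as I know, WAIT_TIME is the cumulative wait time in milliseconds, so the average has to be derived from NUMBER_OF_WAITS):

select function_name,
       small_write_reqs + large_write_reqs as write_reqs,
       small_write_megabytes + large_write_megabytes as write_mb,
       number_of_waits,
       round(wait_time/nullif(number_of_waits,0),2) as avg_ms
from v$iostat_function
where function_name = 'LGWR';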

To be comprehensive, here are the redo-related statistics, thanks to the many detailed statistics available in 12c:

Value per Second (Elapsed Time)
------------------------------------------- ---------------------------------------
Statistic 1st 2nd %Diff 1st 2nd %Diff
------------------------------ ---------------- ---------------- --------- -------------- -------------- ---------
 
redo KB read 16,319,609 15,783,576 -3.3 54,118.0 52,276.1 -3.4
redo blocks checksummed by FG 26,587,090 1,000,267 -96.2 88,166.3 3,312.9 -96.2
redo blocks written 37,974,499 6,318,372 -83.4 125,928.5 20,926.8 -83.4
redo blocks written (group 0) 37,256,502 6,257,861 -83.2 123,547.5 20,726.4 -83.2
redo blocks written (group 1) 717,997 60,511 -91.6 2,381.0 200.4 -91.6
redo entries 24,023,503 30,214,386 25.8 79,665.1 100,071.8 25.6
redo size 18,644,947,688 23,073,410,468 23.8 61,829,138.5 76,420,493.9 23.6
redo synch long waits 343 4,890 1,325.7 1.1 16.2 1,321.1
redo synch time 541,804 354,625 -34.5 1,796.7 1,174.5 -34.6
redo synch time (usec) 5,418,056,862 3,546,209,390 -34.5 17,967,000.7 11,745,254.3 -34.6
redo synch time overhead (usec) 145,664,759 197,925,281 35.9 483,043.8 655,540.2 35.7
redo synch time overhead count ( 2ms) 2,295,847 2,821,726 22.9 7,613.3 9,345.7 22.8
redo synch time overhead count ( 8ms) 443 3,704 736.1 1.5 12.3 734.7
redo synch time overhead count ( 32ms) 2 9 350.0 0.0 0.0 200.0
redo synch writes 2,305,502 2,849,645 23.6 7,645.4 9,438.2 23.5
redo wastage 179,073,264 2,703,864,280 1,409.9 593,830.9 8,955,357.7 1,408.1
redo write finish time 291,094,266 277,884,591 -4.5 965,307.5 920,370.1 -4.7
redo write gather time 63,237,013 125,066,420 97.8 209,702.4 414,227.3 97.5
redo write info find 2,296,292 2,825,439 23.0 7,614.8 9,358.0 22.9
redo write schedule time 63,679,682 125,819,850 97.6 211,170.3 416,722.8 97.3
redo write size count ( 4KB) 12,220 0 40.5 0
redo write size count ( 8KB) 26,420 2,246 -91.5 87.6 7.4 -91.5
redo write size count ( 16KB) 69,674 94,557 35.7 231.0 313.2 35.5
redo write size count ( 32KB) 108,676 268,794 147.3 360.4 890.3 147.0
redo write size count ( 128KB) 106,651 253,669 137.8 353.7 840.2 137.6
redo write size count ( 256KB) 37,332 28,076 -24.8 123.8 93.0 -24.9
redo write size count ( 512KB) 7,328 2,382 -67.5 24.3 7.9 -67.5
redo write size count (1024KB) 28 28 0.0 0.1 0.1 0.0
redo write time 29,126 27,817 -4.5 96.6 92.1 -4.6
redo write time (usec) 291,261,420 278,162,437 -4.5 965,861.8 921,290.4 -4.6
redo write total time 306,213,383 298,786,696 -2.4 1,015,444.5 989,599.1 -2.5
redo write worker delay (usec) 38,246,633 73,452,748 92.1 126,830.9 243,279.8 91.8
redo writes 368,330 649,751 76.4 1,221.4 2,152.0 76.2
redo writes (group 0) 366,492 648,430 76.9 1,215.3 2,147.6 76.7
redo writes (group 1) 1,838 1,321 -28.1 6.1 4.4 -28.2
redo writes adaptive all 368,330 649,752 76.4 1,221.4 2,152.0 76.2
redo writes adaptive worker 368,330 649,752 76.4 1,221.4 2,152.0 76.2

I have added a few lines that were masked by the AWR Diff Report. The number of writes smaller than 4k is zero in the second pair of snapshots because that is the blocksize: a redo write can never be smaller than one redo block.
It is interesting to see that the redo size is higher with the 4k blocksize: a full 4k block is written even when there is less redo to write, and this padding is measured by ‘redo wastage’.
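The padding can be followed with the cumulative instance statistics as well: roughly, redo blocks written multiplied by the blocksize is the sum of ‘redo size’ and ‘redo wastage’. A minimal sketch computing the share of the written volume that is wastage:

select s.value as redo_size, w.value as redo_wastage,
       round(100 * w.value / (s.value + w.value), 1) as wastage_pct
from v$sysstat s, v$sysstat w
where s.name = 'redo size'
  and w.name = 'redo wastage';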

So, a larger blocksize lowers the latency but increases the redo volume. Here, where NVMe optimizes the bandwidth to the flash storage, this may not be a problem.

So what?

Keep in mind that this workload, with lots of small transactions and no other waits, is a special case built for this test. If you are not in this extreme situation, then the default blocksize is probably sufficient for latency and keeps the redo size lower. However, if ‘log file sync’ latency is your bottleneck, you may consider increasing the blocksize.
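If you want to test it on your own system, the current redo blocksize is visible in V$LOG and new log groups can be created with the BLOCKSIZE clause. A minimal sketch (the ‘+REDO’ diskgroup, sizes and group numbers are only examples; on storage that does not advertise 4k sectors, this is reported to require the hidden parameter _disk_sector_size_override, so validate with Oracle Support before any production use):

-- current blocksize of each redo log group
select group#, thread#, bytes/1024/1024 as size_mb, blocksize from v$log;

-- add new groups with a 4k blocksize, then switch and drop the old ones once inactive
alter database add logfile group 4 ('+REDO') size 1G blocksize 4096;
alter database add logfile group 5 ('+REDO') size 1G blocksize 4096;
alter system switch logfile;
-- alter database drop logfile group 1;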

Thanks to

Arrow Electronics, Oracle Authorized Solution Center, Switzerland.
These tests were done on an ODA X6-2M at the Arrow OASC. Arrow has a wide range of Engineered Systems available for Oracle partners, like dbi services, and for customers to run Proofs of Concept, demos, learning, benchmarks, etc.


The article Redo log block size on ODA X6 all flash appeared first on Blog dbi services.
