Yann Neuhaus

dbi services technical blog

Power BI Report Server Kerberos Setup

Mon, 2018-11-19 10:46
In case you have the following configuration and requirements:

Your Power BI, paginated, mobile and KPI reports are published on your on-premises Power BI Report Server (named SRV-PBIRS in this example), their data source is an Analysis Services instance located on another server (named SRV-SSASTAB\INST01 in this example, INST01 being the named instance), and you want to track/monitor who is accessing the data on Analysis Services, or you have row-level security constraints.

In such a case, if you have configured your Analysis Services connection using Windows integrated authentication, you have to set up Kerberos delegation from the Power BI Report Server to the Analysis Services server. If you don't, your users will face the famous "double-hop" issue: they won't be able to access the Analysis Services data, and you won't be able to identify who is consuming your data on the Analysis Services side.

To set up the Kerberos delegation, you can follow the steps below:

1- Be sure to be a Domain Admin or to have sufficient permissions to create SPNs and change the service account and/or computer settings in Active Directory.

2- On your Power BI Report Server, get the service account running the Power BI Report Server service.

(i.e. SvcAcc_PBIRS)

pic1

Note: If you are not using a domain service account, you will have to use the server name instead in the following steps.
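If you prefer not to open the Services console, the service logon account can also be read from PowerShell (a minimal sketch; the service name PowerBIReportServer is assumed for a default installation):

# Show which account runs the Power BI Report Server service
Get-CimInstance Win32_Service -Filter "Name = 'PowerBIReportServer'" |
    Select-Object Name, StartName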

While you are on the server, first make a backup of the rsreportserver.config configuration file and then edit it (for a default installation it is located here: C:\Program Files\Microsoft Power BI Report Server\PBIRS\ReportServer). Add the parameter <RSWindowsNegotiate/> to the <AuthenticationTypes> XML node.

pic2
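For reference, the relevant portion of the file should then look similar to the snippet below (a hedged example; existing entries such as <RSWindowsNTLM/> can stay in place):

<Authentication>
  <AuthenticationTypes>
    <RSWindowsNegotiate/>
    <RSWindowsNTLM/>
  </AuthenticationTypes>
  ...
</Authentication>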

Save and close the file.

3- On your Analysis Services server, get the service account running the Analysis Services service

(i.e. SvcAcc_SSASTab)

pic3

Note: If you are not using a domain service account, you will have to use the server name instead in the following steps.

4- Open a PowerShell console on any domain computer with your domain admin user.

Execute the following command to get the SPNs associated with your Power BI Report Server service account:

Setspn -l SvcAcc_PBIRS

If you do not see the following entries:

HTTP/SRV-PBIRS.Domain
HTTP/SRV-PBIRS

Execute the following commands to register the HTTP SPNs for your server FQDN and NetBIOS names:

SetSpn -a http/SRV-PBIRS.Domain SvcAcc_PBIRS
SetSpn -a http/SRV-PBIRS SvcAcc_PBIRS

Note that you have to replace SRV-PBIRS.Domain with the host name of your Power BI Report Server site URL (without the virtual directory) in case you defined a custom URL or an HTTPS URL with a certificate.

Afterwards, check again that the SPNs are correctly registered.

5- In your PowerShell session, execute the following command to get the SPNs registered for your Analysis Services service account:
SetSpn -l SvcAcc_SSASTab

You should see the following entries, meaning your Analysis Services SPNs have been registered:

MSOLAPSVC.3/SRV-SSASTAB:INST01
MSOLAPSVC.3/SRV-SSASTAB.domain:INST01

If not, run the following commands:

SetSpn -a MSOLAPSVC.3/SRV-SSASTAB:INST01 SvcAcc_SSASTab
SetSpn -a MSOLAPSVC.3/SRV-SSASTAB.domain:INST01 SvcAcc_SSASTab

Furthermore, in case you installed Analysis Services with a named instance (in my example INST01), check if SPNs have been registered for the SQL Server Browser service (the server name is used in that case because the SQL Server Browser is started with a local service account):

SetSpn -l SRV-SSASTAB

You should see the following entries:

MSOLAPDisco.3/SRV-SSASTAB
MSOLAPDisco.3/SRV-SSASTAB.domain

If not, run the following commands:

SetSpn -a MSOLAPDisco.3/SRV-SSASTAB SRV-SSASTAB
SetSpn -a MSOLAPDisco.3/SRV-SSASTAB.domain SRV-SSASTAB

 

6- For the next step, open the Active Directory administration console.

Open the properties of your Power BI Report Server service account. In the Account tab, uncheck the "Account is sensitive and cannot be delegated" option.

pic4

Then, in the Delegation tab, select "Trust this user for delegation to any service". If you have security constraints on the delegation, it is recommended to use the third option (constrained delegation) and to select only the services you defined in step 5.

pic5
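If you prefer scripting these two settings instead of clicking through the GUI, they can also be toggled with the ActiveDirectory PowerShell module (a sketch for the unconstrained delegation variant; run it with domain admin rights):

Import-Module ActiveDirectory
# Allow the service account to be delegated and trust it for Kerberos delegation to any service
Set-ADAccountControl -Identity SvcAcc_PBIRS -AccountNotDelegated $false -TrustedForDelegation $true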

7- Finally, restart your Power BI Report Server service.

The article Power BI Report Server Kerberos Setup first appeared on Blog dbi services.

Is there too much memory for my SQL Server instance?

Mon, 2018-11-19 02:56

Is there too much memory for my SQL Server instance? This is definitely an uncommon question I had to deal with at one of my customers a couple of weeks ago. Usually DBAs complain when they don't have enough memory for the environments they have to manage, and the fact is SQL Server (like other RDBMSs) provides plenty of tools for memory pressure troubleshooting. But what about the opposite? This question was raised in the context of an environment that includes a lot of virtual database servers (> 100) on top of VMware, where my customer was asked to lower the SQL Server instance memory reservations where possible in order to free memory on the ESX hosts.

blog 147 - 0 - banner

Let's start with sys.dm_os_sys_memory. This is the first one my customer wanted to dig into. This DMV may be helpful to get a picture of the overall system state, including external memory conditions at the operating system level and the physical limits of the underlying hardware.

select 
	available_physical_memory_kb / 1024 AS available_physical_memory_MB,
	total_physical_memory_kb / 1024 AS total_physical_memory_MB,
	total_page_file_kb / 1024 AS total_page_file_MB,
	available_page_file_kb / 1024 AS available_page_file_MB,
	system_cache_kb / 1024 AS system_cache_MB
from sys.dm_os_sys_memory;

 

blog 147 - 1 - sys.dm_os_memory

 

But in the context of my customer, it only partially helped to figure out the memory consumption of the SQL Server instances, because we didn't really face any environment under memory pressure here.

However, another interesting DMV we may rely on is sys.dm_os_sys_info. We may also use the counterparts of its columns, the perfmon counters \Memory Manager\Target Server Memory (KB) and \Memory Manager\Total Server Memory (KB), as shown below:

select 
	physical_memory_kb / 1024 AS physical_memory_MB,
	visible_target_KB,
	committed_kb,
	committed_target_kb
from sys.dm_os_sys_info;

 

blog 147 - 9 - sys.dm_os_sys_info

The concepts of committed memory and target committed memory are important here to figure out how SQL Server deals with its memory space. The committed memory represents the physical memory allocated by the SQL Server process, whereas the target memory is the amount of memory SQL Server tries to maintain as committed memory, depending on different factors described in the BOL. From my experience, chances are the latter is close to the max server memory value in most scenarios.
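If you prefer reading the two perfmon counters directly from T-SQL, sys.dm_os_performance_counters exposes them as well (a quick sketch; the object_name prefix differs on a named instance):

SELECT [object_name], counter_name, cntr_value / 1024 AS value_MB
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Target Server Memory (KB)', N'Total Server Memory (KB)');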

But relying blindly on the committed memory may lead to misinterpretation of what SQL Server is really consuming over a specific period of time. Indeed, let's say my SQL Server instance is capped at 2GB; here are the corresponding figures after the daily business workload. The values in the context of my customer were of a different order of magnitude, but this demo will help to illustrate the issue that motivated this write-up:

select 
	physical_memory_kb / 1024 AS physical_memory_MB,
	visible_target_KB,
	committed_kb,
	committed_target_kb
from sys.dm_os_sys_info;

 

blog 147 - 10 - sys.dm_os_sys_info_after_daily_workload

The committed memory is about 365MB, which is far below the configured max server memory value of 2GB. But now let's say the database maintenance kicks in. Usually this is a nightly job, running on a daily or weekly basis, that generally includes an index rebuild task which reads nearly all the data structures to get external fragmentation values through the DMF sys.dm_db_index_physical_stats(). This operation can touch structures that are not used during daily business and may have a huge impact on the buffer pool. In my case, here is the new memory state after executing this maintenance task:

blog 147 - 11 - sys.dm_os_sys_info_after_maintenance
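For reference, the kind of fragmentation check such a maintenance job runs typically looks like the sketch below (LIMITED scan mode assumed; the actual maintenance script may differ):

-- reads index pages of every index in the current database to compute fragmentation,
-- touching structures the daily workload may never use
SELECT OBJECT_NAME(ips.[object_id]) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON i.[object_id] = ips.[object_id] AND i.index_id = ips.index_id
WHERE ips.page_count > 1000;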

The game has changed here because SQL Server has committed all the memory up to the max server memory value. This time we may go through the sys.dm_os_memory_clerks DMV to get details about the different memory clerks of my SQL Server instance. The pages_kb column is used because the instances run SQL Server 2014.

SELECT
	[type],
	pages_kb / 1024 AS size_MB
FROM sys.dm_os_memory_clerks
WHERE memory_node_id = 0
ORDER BY size_MB DESC

 

blog 147 - 12 - sys.dm_os_memory_clerks

So, from now on, the committed memory has a good chance of staying close to the max server memory value, while in fact the daily business workload probably won't need all the memory allocated to the SQL Server process. This is exactly why my customer asked me for a way to get a more realistic picture of the memory consumption of its SQL Server instances during daily business, excluding the nightly database maintenance workload.

We went through a solution that consisted in freeing up the committed memory before the business day starts and letting the memory grow gradually until it reaches its maximum usage. It is worth noting that there is no easy way, as far as I know, to free up the committed memory: SQL Server may decrease it only if the corresponding target server memory value is lower. From my experience this situation is more an exception than the rule, and therefore it is difficult to rely on it.

One potential workaround might be to restart the SQL Server instance(s), but in the case of my customer restarting the database servers was not an option, so we looked into a solution that forces SQL Server to make room by setting the max server memory close to the min server memory value. Don't get me wrong, I don't consider this a best practice but rather an emergency procedure because, like restarting a SQL Server instance, it may lead to a temporary but potentially large impact, especially if the workload performance is directly tied to the state of the buffer cache (warm vs cold).

In addition, I would say that sizing the max server memory value for the daily business workload only may be controversial in many ways, and in fact we have to consider some tradeoffs here. In the context of my customer, the main goal was to release "unused memory" from the SQL Server instances during daily business in order to free up memory on the VMware ESX hosts, but there is no free lunch. For instance, the nightly workload may suddenly take longer if there is less room to work in memory. Another direct side effect of working with less memory might be an increase of I/O operations against the storage layer. In a nutshell, there is no black or white solution and we have to go with what we consider the best solution for the specific context.
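For reference, forcing SQL Server to release committed memory that way boils down to a couple of sp_configure calls (a sketch with hypothetical values; lower the setting temporarily and put the original value back afterwards):

-- temporarily cap max server memory close to min server memory to force SQL Server
-- to release committed memory (hypothetical values)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 1024;
RECONFIGURE;
-- ... once the memory has been released, restore the original setting
EXEC sp_configure 'max server memory (MB)', 2048;
RECONFIGURE;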

See you!

 

 

 

The article Is there too much memory for my SQL Server instance? first appeared on Blog dbi services.

Patching a virtualized ODA to patch 12.2.1.4.0

Tue, 2018-11-13 02:24

This article describes patching a virtualized Oracle Database Appliance (ODA) containing only an ODA_BASE virtual machine.

Do this patching first on test machines, because it cannot be guaranteed that all causes of failures of single-VM ODAs are covered in this article. My experience is that the precheck for ODA patches does not detect some failure conditions, which may lead to an unusable ODA.

Overview:
Patch first to 12.1.2.12.0
After that patch to 12.2.1.4.0

Procedure for both patches:

Preparation:

Apply all files of the patch to the repository on all nodes as user root:

oakcli unpack -package /directory_name/file_name

Verify the patch and the components to be patched on all servers:

[root@xx1 ~]# oakcli update -patch 12.2.1.4.0 --verify
INFO: 2018-09-24 08:32:52: Reading the metadata file now...
Component Name Installed Version Proposed Patch Version
--------------- ------------------ -----------------
Controller_INT 4.650.00-7176 Up-to-date
Controller_EXT 13.00.00.00 Up-to-date
Expander 0291 0306
SSD_SHARED {
[ c1d20,c1d21,c1d22, A29A Up-to-date
c1d23 ] [ c1d0,c1d1,c1d2,c1d A29A Up-to-date
3,c1d4,c1d5,c1d6,c1d
7,c1d8,c1d9,c1d10,c1
d11,c1d12,c1d13,c1d1
4,c1d15,c1d16,c1d17,
c1d18,c1d19 ] }
SSD_LOCAL 0R3Q Up-to-date
ILOM 3.2.9.23 r116695 4.0.2.26.a r123797
BIOS 38070200 38100300
IPMI 1.8.12.4 Up-to-date
HMP 2.3.5.2.8 2.4.1.0.11
OAK 12.1.2.12.0 12.2.1.4.0
OL 6.8 6.9
OVM 3.4.3 3.4.4
GI_HOME 12.1.0.2.170814(2660 12.2.0.1.180417(2767
9783,26609945) 4384,27464465)
DB_HOME {
[ OraDb12102_home1 ] 12.1.0.2.170814(2660 12.1.0.2.180417(2733
9783,26609945) 8029,27338020)
[ OraDb11204_home2 ] 11.2.0.4.170418(2473 11.2.0.4.180417(2733
2075,23054319) 8049,27441052)
}

Validate the whole ODA (not during peak load):

oakcli validate -a

Show versions of all installed components (example is after patching):

[root@xx1 ~]# oakcli show version -detail
Reading the metadata. It takes a while...
System Version Component Name Installed Version Supported Version
-------------- --------------- ------------------ -----------------
12.2.1.4.0
Controller_INT 4.650.00-7176 Up-to-date
Controller_EXT 13.00.00.00 Up-to-date
Expander 0306 Up-to-date
SSD_SHARED {
[ c1d20,c1d21,c1d22, A29A Up-to-date
c1d23 ] [ c1d0,c1d1,c1d2,c1d A29A Up-to-date
3,c1d4,c1d5,c1d6,c1d
7,c1d8,c1d9,c1d10,c1
d11,c1d12,c1d13,c1d1
4,c1d15,c1d16,c1d17,
c1d18,c1d19 ] }
SSD_LOCAL 0R3Q Up-to-date
ILOM 4.0.2.26.a r123797 Up-to-date
BIOS 38100300 Up-to-date
IPMI 1.8.12.4 Up-to-date
HMP 2.4.1.0.11 Up-to-date
OAK 12.2.1.4.0 Up-to-date
OL 6.9 Up-to-date
OVM 3.4.4 Up-to-date
GI_HOME 12.2.0.1.180417(2767 Up-to-date
4384,27464465)
DB_HOME 11.2.0.4.170418(2473 11.2.0.4.180417(2733
2075,23054319) 8049,27441052)

To do a dry run of the OS patch (this does not work for any component other than ospatch):

[root@xx1 ~]# oakcli validate -c ospatch -ver 12.2.1.4.0
INFO: Validating the OS patch for the version 12.2.1.4.0
INFO: 2018-09-25 08:34:28: Performing a dry run for OS patching
INFO: 2018-09-25 08:34:52: There are no conflicts. OS upgrade could be successful

All packages which are reported as incompatible must be removed before patching. Somebody who is able to install and configure compatible versions of these packages properly after patching should be available, and compatible versions of these packages should be prepared beforehand.
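A minimal sketch of that cleanup could look like this (the package names are hypothetical; keep track of the exact versions so they can be reinstalled after patching):

# record the exact versions of the packages before removing them (hypothetical names)
rpm -q some-incompatible-package another-incompatible-package > /root/removed_packages.txt
yum remove -y some-incompatible-package another-incompatible-package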

Before applying patch:
In Data Guard installations, set the state to APPLY-OFF for all standby databases
Disable all jobs which use Grid Infrastructure or databases
Set all ACFS replications to “pause”.
Unmount all ACFS filesystems
Stop all agents on all ODA nodes
Remove all resources from Grid Infrastructure which depend on ACFS filesystems (srvctl remove)
These resources can be determined with:

crsctl stat res -dependency | grep -i acfs

Remove all packages which were found incompatible to patch.

Note:
The scripts of both patches cannot unmount ACFS filesystems (at least filesystems mounted via the registry), and usage of Grid Infrastructure files by mounted ACFS filesystems causes both patches to fail. The check scripts of both patches do not seem to test for this condition. In Grid Infrastructure, all resources on which other resources have dependencies must exist; otherwise their configuration must be saved and the resources must be removed from GI.
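A quick way to see what is registered and currently mounted before starting (a sketch; run it as root on both nodes):

# list ACFS filesystems known to the ACFS registry
/sbin/acfsutil registry
# list ACFS filesystems currently mounted
mount -t acfs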

Use the UNIX tool screen for applying the patch, because any network interruption causes the patch to fail.

Patching:
Only the server and storage components should be patched with the oakcli script; databases should be patched manually. At least 10 GB of available disk space must exist in the / filesystem and at least 15 GB in /u01.
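A quick check before launching oakcli (plain df, nothing ODA specific):

# verify the required free space in / and /u01
df -h / /u01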

All commands have to be executed on the primary ODA node as user root. The HTTP server error at the end of the server patching can be ignored.


[root@xx1 ~]# screen
[root@xx1 ~]# oakcli update -patch 12.2.1.4.0 --server
INFO: DB, ASM, Clusterware may be stopped during the patch if required
INFO: Both Nodes may get rebooted automatically during the patch if required
Do you want to continue: [Y/N]?: Y
INFO: User has confirmed for the reboot
INFO: Patch bundle must be unpacked on the second Node also before applying the patch
Did you unpack the patch bundle on the second Node? : [Y/N]? : Y
INFO: All the VMs except the ODABASE will be shutdown forcefully if needed
Do you want to continue : [Y/N]? : Y
INFO: Running pre-install scripts
INFO: Running prepatching on node 0
INFO: Running prepatching on node 1
INFO: Completed pre-install scripts
INFO: Patching server component (rolling)
INFO: Stopping VMs, repos and OAKD on both nodes...
INFO: Stopped Oakd
...
INFO: Patching the server on node: xx2
INFO: it may take upto 60 minutes. Please wait
INFO: Infrastructure patching summary on node: xx1
INFO: Infrastructure patching summary on node: xx2
SUCCESS: 2018-09-25 09:42:24: Successfully upgraded the HMP
SUCCESS: 2018-09-25 09:42:24: Successfully updated the OAK
SUCCESS: 2018-09-25 09:42:24: Successfully updated the JDK
INFO: 2018-09-25 09:42:24: IPMI is already upgraded
SUCCESS: 2018-09-25 09:42:24: Successfully upgraded the OS
SUCCESS: 2018-09-25 09:42:24: Successfully updated the device OVM
SUCCESS: 2018-09-25 09:42:24: Successfully upgraded the HMP on Dom0
INFO: 2018-09-25 09:42:24: Local storage patching summary on Dom0...
SUCCESS: 2018-09-25 09:42:24: Successfully upgraded the local storage
SUCCESS: 2018-09-25 09:42:24: Successfully updated the device Ilom
SUCCESS: 2018-09-25 09:42:24: Successfully updated the device BIOS
INFO: 2018-09-25 09:42:24: Some of the components patched on node
INFO: 2018-09-25 09:42:24: require node reboot. Rebooting the node
INFO: 2018-09-25 09:42:24: rebooting xx2 via /tmp/dom0reboot...
..........
INFO: 2018-09-25 09:48:03: xx2 is rebooting...
INFO: 2018-09-25 09:48:03: Waiting for xx2 to reboot...
........
INFO: 2018-09-25 09:55:24: xx2 has rebooted...
INFO: 2018-09-25 09:55:24: Waiting for processes on xx2 to start...
..
INFO: Patching server component on node: xx1
INFO: 2018-09-25 09:59:31: Patching ODABASE Server Components (including Grid software)
INFO: 2018-09-25 09:59:31: ------------------Patching HMP-------------------------
SUCCESS: 2018-09-25 10:00:26: Successfully upgraded the HMP
INFO: 2018-09-25 10:00:26: creating /usr/lib64/sun-ssm symlink
INFO: 2018-09-25 10:00:27: ----------------------Patching OAK---------------------
SUCCESS: 2018-09-25 10:00:59: Successfully upgraded OAK
INFO: 2018-09-25 10:01:02: ----------------------Patching JDK---------------------
SUCCESS: 2018-09-25 10:01:12: Successfully upgraded JDK
INFO: 2018-09-25 10:01:12: ----------------------Patching IPMI---------------------
INFO: 2018-09-25 10:01:12: IPMI is already upgraded or running with the latest version
INFO: 2018-09-25 10:01:13: ------------------Patching OS-------------------------
INFO: 2018-09-25 10:01:36: Removed kernel-uek-firmware-4.1.12-61.44.1.el6uek.noarch
INFO: 2018-09-25 10:01:52: Removed kernel-uek-4.1.12-61.44.1.el6uek.x86_64
INFO: 2018-09-25 10:02:03: Clusterware is running on local node
INFO: 2018-09-25 10:02:03: Attempting to stop clusterware and its resources locally
SUCCESS: 2018-09-25 10:03:22: Successfully stopped the clusterware on local node
SUCCESS: 2018-09-25 10:07:36: Successfully upgraded the OS
INFO: 2018-09-25 10:07:40: ------------------Patching Grid-------------------------
INFO: 2018-09-25 10:07:45: Checking for available free space on /, /tmp, /u01
INFO: 2018-09-25 10:07:50: Attempting to upgrade grid.
INFO: 2018-09-25 10:07:50: Executing /opt/oracle/oak/pkgrepos/System/12.2.1.4.0/bin/GridUpgrade.pl...
SUCCESS: 2018-09-25 10:55:07: Grid software has been updated.
INFO: 2018-09-25 10:55:07: Patching DOM0 Server Components
INFO: 2018-09-25 10:55:07: Attempting to patch OS on Dom0...
INFO: 2018-09-25 10:55:16: Clusterware is running on local node
INFO: 2018-09-25 10:55:16: Attempting to stop clusterware and its resources locally
SUCCESS: 2018-09-25 10:56:45: Successfully stopped the clusterware on local node
SUCCESS: 2018-09-25 11:02:19: Successfully updated the device OVM to 3.4.4
INFO: 2018-09-25 11:02:19: Attempting to patch the HMP on Dom0...
SUCCESS: 2018-09-25 11:02:26: Successfully updated the device HMP to the version 2.4.1.0.11 on Dom0
INFO: 2018-09-25 11:02:26: Attempting to patch the IPMI on Dom0...
INFO: 2018-09-25 11:02:27: Successfully updated the IPMI on Dom0
INFO: 2018-09-25 11:02:30: Attempting to patch the local storage on Dom0...
INFO: 2018-09-25 11:02:30: Stopping clusterware on local node...
INFO: 2018-09-25 11:02:37: Disk : c0d0 is already running with MS4SC2JH2ORA480G 0R3Q
INFO: 2018-09-25 11:02:38: Disk : c0d1 is already running with MS4SC2JH2ORA480G 0R3Q
INFO: 2018-09-25 11:02:40: Controller : c0 is already running with 0x005d 4.650.00-7176
INFO: 2018-09-25 11:02:41: Attempting to patch the ILOM on Dom0...
SUCCESS: 2018-09-25 11:27:49: Successfully updated the device Ilom to 4.0.2.26.a r123797
SUCCESS: 2018-09-25 11:27:49: Successfully updated the device BIOS to 38100300
INFO: Infrastructure patching summary on node: xxxx1
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the HMP
SUCCESS: 2018-09-25 11:27:54: Successfully updated the OAK
SUCCESS: 2018-09-25 11:27:54: Successfully updated the JDK
INFO: 2018-09-25 11:27:54: IPMI is already upgraded
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the OS
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded GI
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device OVM
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the HMP on Dom0
INFO: 2018-09-25 11:27:54: Local storage patching summary on Dom0...
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the local storage
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device Ilom
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device BIOS
INFO: Infrastructure patching summary on node: xxxx2
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the HMP
SUCCESS: 2018-09-25 11:27:54: Successfully updated the OAK
SUCCESS: 2018-09-25 11:27:54: Successfully updated the JDK
INFO: 2018-09-25 11:27:54: IPMI is already upgraded
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the OS
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device OVM
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the HMP on Dom0
INFO: 2018-09-25 11:27:54: Local storage patching summary on Dom0...
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded the local storage
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device Ilom
SUCCESS: 2018-09-25 11:27:54: Successfully updated the device BIOS
SUCCESS: 2018-09-25 11:27:54: Successfully upgraded GI
INFO: Running post-install scripts
INFO: Running postpatch on node 1...
INFO: Running postpatch on node 0...
...
...
INFO: Started Oakd
INFO: 2018-09-25 11:32:26: Some of the components patched on node
INFO: 2018-09-25 11:32:26: require node reboot. Rebooting the node
INFO: Rebooting Dom0 on node 0
INFO: 2018-09-25 11:32:26: Running /tmp/dom0reboot on node 0
INFO: 2018-09-25 11:33:10: Clusterware is running on local node
INFO: 2018-09-25 11:33:10: Attempting to stop clusterware and its resources locally
SUCCESS: 2018-09-25 11:35:52: Successfully stopped the clusterware on local node
INFO: 2018-09-25 11:38:54: RPC::XML::Client::send_request: HTTP server error: read timeout
[root@xx1 ~]#
Broadcast message from root@xx1
(unknown) at 11:39 ...
The system is going down for power off NOW!

[root@xx1 ~]# oakcli update -patch 12.2.1.4.0 --storage
INFO: DB, ASM, Clusterware may be stopped during the patch if required
INFO: Both Nodes may get rebooted automatically during the patch if required
Do you want to continue: [Y/N]?: Y
INFO: User has confirmed for the reboot
INFO: Running pre-install scripts
INFO: Running prepatching on node 0
INFO: Running prepatching on node 1
INFO: Completed pre-install scripts
INFO: Shared Storage components need to be patched
INFO: Stopping OAKD on both nodes...
INFO: Stopped Oakd
INFO: Attempting to shutdown clusterware (if required)..
INFO: 2018-09-25 12:07:13: Clusterware is running on one or more nodes of the cluster
INFO: 2018-09-25 12:07:13: Attempting to stop clusterware and its resources across the cluster
SUCCESS: 2018-09-25 12:07:59: Successfully stopped the clusterware
INFO: Patching storage on node xx2
INFO: Patching storage on node xx1
INFO: 2018-09-25 12:08:23: ----------------Patching Storage-------------------
INFO: 2018-09-25 12:08:23: ....................Patching Shared SSDs...............
INFO: 2018-09-25 12:08:23: Disk : d0 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:23: Disk : d1 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:23: Disk : d2 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:24: Disk : d3 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:24: Disk : d4 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:24: Disk : d5 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:25: Disk : d6 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:25: Disk : d7 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:25: Disk : d8 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:26: Disk : d9 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:26: Disk : d10 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:26: Disk : d11 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:27: Disk : d12 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:27: Disk : d13 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:27: Disk : d14 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:28: Disk : d15 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:28: Disk : d16 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:28: Disk : d17 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:29: Disk : d18 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:29: Disk : d19 is already running with : HSCAC2DA2SUN1.6T A29A
INFO: 2018-09-25 12:08:30: Disk : d20 is already running with : HSCAC2DA6SUN200G A29A
INFO: 2018-09-25 12:08:30: Disk : d21 is already running with : HSCAC2DA6SUN200G A29A
INFO: 2018-09-25 12:08:30: Disk : d22 is already running with : HSCAC2DA6SUN200G A29A
INFO: 2018-09-25 12:08:31: Disk : d23 is already running with : HSCAC2DA6SUN200G A29A
INFO: 2018-09-25 12:08:31: ....................Patching Shared HDDs...............
INFO: 2018-09-25 12:08:31: ....................Patching Expanders...............
INFO: 2018-09-25 12:08:31: Updating the Expander : c0x0 with the Firmware : DE3-24C 0306
SUCCESS: 2018-09-25 12:09:24: Successfully updated the Firmware on Expander : c0x0 to DE3-24C 0306
INFO: 2018-09-25 12:09:24: Updating the Expander : c1x0 with the Firmware : DE3-24C 0306
SUCCESS: 2018-09-25 12:10:16: Successfully updated the Firmware on Expander : c1x0 to DE3-24C 0306
INFO: 2018-09-25 12:10:16: ..............Patching Shared Controllers...............
INFO: 2018-09-25 12:10:16: Controller : c0 is already running with : 0x0097 13.00.00.00
INFO: 2018-09-25 12:10:17: Controller : c1 is already running with : 0x0097 13.00.00.00
INFO: 2018-09-25 12:10:17: ------------ Completed Storage Patching------------
INFO: 2018-09-25 12:10:17: Completed patching of shared_storage
INFO: Patching completed for component Storage
INFO: Running post-install scripts
INFO: Running postpatch on node 1...
INFO: Running postpatch on node 0...
INFO: 2018-09-25 12:10:28: Some of the components patched on node
INFO: 2018-09-25 12:10:28: require node reboot. Rebooting the node
INFO: 2018-09-25 12:10:28: Running /tmp/pending_actions on node 1
INFO: Node will reboot now.
INFO: Please check reboot progress via ILOM interface
INFO: This session may appear to hang, press ENTER after reboot
INFO: 2018-09-25 12:12:53: Rebooting Dom1 on node 0
INFO: Running /tmp/pending_actions on node 0
Broadcast message from oracle@xx1
(/dev/pts/0) at 12:13 ...
The system is going down for reboot NOW!

After successful patching:

Install and configure compatible versions of all previously removed packages
Mount all ACFS filesystems
Recreate all previously removed Grid Infrastructure resources and start them
Re-enable all jobs that were disabled before
Resume all ACFS replications
Set the state of all Data Guard standby databases to APPLY-ON
Check the ACFS replications
Check the Data Guard status
Check whether everything works as before

The article Patching a virtualized ODA to patch 12.2.1.4.0 first appeared on Blog dbi services.

DC/OS: Install Marathon-LB with keepalived

Tue, 2018-11-13 01:32

After the minimal setup of DC/OS in my previous articles, I wanted to extend my DC/OS and add a load balancer.
There are two options for load balancing in DC/OS:
1. Marathon-LB (a layer 7 load balancer based on HAProxy, used for external requests)
2. Named VIPs (a layer 4 load balancer used for internal TCP traffic)

In this article we will use Marathon-LB. In case you want to read more about the VIP solution, just visit the DC/OS Documentation.

I also want to configure keepalived, which will automatically provide a unicast-based failover for high availability.

Preparation

To use a load balancer, I had to extend the DC/OS cluster I built before in Deploy DC/OS using Ansible (Part 1). The new DC/OS has the following structure:
POWERPNT_2018-11-07_10-26-14

Implementation of marathon-lb

I used the DC/OS CLI, but you can also install the marathon-lb package using the catalog on the web interface.

[root@dcos-master ~]# dcos package install marathon-lb
By Deploying, you agree to the Terms and Conditions https://mesosphere.com/catalog-terms-conditions/#community-services
We recommend at least 2 CPUs and 1GiB of RAM for each Marathon-LB instance.

*NOTE*: For additional ```Enterprise Edition``` DC/OS instructions, see https://docs.mesosphere.com/administration/id-and-access-mgt/service-auth/mlb-auth/
Continue installing? [yes/no] yes
Installing Marathon app for package [marathon-lb] version [1.12.3]
Marathon-lb DC/OS Service has been successfully installed!
See https://github.com/mesosphere/marathon-lb for documentation.
Create a keepalived configuration

To implement keepalived on both public agents, create two JSON files (you can find the GitHub guidance here), one for the master and one for the backup.
Adapt the IPs to your environment and be sure you do not mix up the IPs for the master and the backup. You also have to adapt the KEEPALIVED_VIRTUAL_IPADDRESS_1.

[root@dcos-master ~]# cd /etc/
[root@dcos-master etc]# mkdir keepalived
[root@dcos-master etc]# cat keepalived-master.json
{
  "id": "/keepalived-master",
  "acceptedResourceRoles": [
    "slave_public"
  ],
  "constraints": [
    [
      "hostname",
      "LIKE",
      "192.168.22.104"
    ]
  ],
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "arcts/keepalived",
      "forcePullImage": false,
      "privileged": false,
      "parameters": [
        {
          "key": "cap-add",
          "value": "NET_ADMIN"
        }
      ]
    }
  },
  "cpus": 0.5,
  "disk": 0,
  "env": {
    "KEEPALIVED_AUTOCONF": "true",
    "KEEPALIVED_VIRTUAL_IPADDRESS_1": "192.168.22.150/24",
    "KEEPALIVED_STATE": "MASTER",
    "KEEPALIVED_UNICAST_PEER_0": "192.168.22.106",
    "KEEPALIVED_INTERFACE": "en0ps8",
    "KEEPALIVED_UNICAST_SRC_IP": "192.168.22.104"
  },
  "instances": 1,
  "maxLaunchDelaySeconds": 3600,
  "mem": 100,
  "gpus": 0,
  "networks": [
    {
      "mode": "host"
    }
  ],
  "portDefinitions": [],
  "requirePorts": true,
  "upgradeStrategy": {
    "maximumOverCapacity": 1,
    "minimumHealthCapacity": 1
  },
  "killSelection": "YOUNGEST_FIRST",
  "unreachableStrategy": {
    "inactiveAfterSeconds": 0,
    "expungeAfterSeconds": 0
  },
  "healthChecks": [],
  "fetch": []
}
[root@dcos-master keepalived]# cat keepalived-backup.json
{
  "id": "/keepalived-backup",
  "acceptedResourceRoles": [
    "slave_public"
  ],
  "constraints": [
    [
      "hostname",
      "LIKE",
      "192.168.22.106"
    ]
  ],
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "arcts/keepalived",
      "forcePullImage": false,
      "privileged": false,
      "parameters": [
        {
          "key": "cap-add",
          "value": "NET_ADMIN"
        }
      ]
    }
  },
  "cpus": 0.5,
  "disk": 0,
  "env": {
    "KEEPALIVED_AUTOCONF": "true",
    "KEEPALIVED_VIRTUAL_IPADDRESS_1": "192.168.22.150/24",
    "KEEPALIVED_STATE": "BACKUP",
    "KEEPALIVED_UNICAST_PEER_0": "192.168.22.104",
    "KEEPALIVED_INTERFACE": "en0ps8",
    "KEEPALIVED_UNICAST_SRC_IP": "192.168.22.106"
  },
  "instances": 1,
  "maxLaunchDelaySeconds": 3600,
  "mem": 124,
  "gpus": 0,
  "networks": [
    {
      "mode": "host"
    }
  ],
  "portDefinitions": [],
  "requirePorts": true,
  "upgradeStrategy": {
    "maximumOverCapacity": 1,
    "minimumHealthCapacity": 1
  },
  "killSelection": "YOUNGEST_FIRST",
  "unreachableStrategy": {
    "inactiveAfterSeconds": 0,
    "expungeAfterSeconds": 0
  },
  "healthChecks": [],
  "fetch": []
}
Add the keepalived apps to DC/OS
[root@dcos-master keepalived]# dcos marathon app add keepalived-master.json
Created deployment b41b4526-86ae-4b70-a254-429c3c212ce3
[root@dcos-master keepalived]# dcos marathon app add keepalived-backup.json
Created deployment f854a621-f402-4e72-9f46-150b11c6a7c8
Does everything work as expected?

You can easily check if everything works as expected by either using the CLI

[root@dcos-master keepalived]# dcos marathon app list
ID                  MEM  CPUS  TASKS  HEALTH  DEPLOYMENT  WAITING  CONTAINER  CMD
/keepalived-backup  124  0.5    1/1    N/A       ---      False      DOCKER   N/A
/keepalived-master  100  0.5    1/1    N/A       ---      False      DOCKER   N/A
/marathon-lb        800   1     2/2    2/2       ---      False      DOCKER   N/A

Or the web interface. In this case I prefer the web interface: I think it offers a great overview. You can check the health of the services, the IP addresses, and so on.
firefox_2018-11-01_14-21-00

Select the keepalived-master to verify its state
firefox_2018-11-07_11-10-25

Select the log tab and make sure the master is in “MASTER STATE”
firefox_2018-11-07_11-11-18

Do the same for the keepalived-backup. The log should show the backup in “BACKUP STATE”
firefox_2018-11-07_11-14-28
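If you prefer staying on the command line, the state can also be checked by tailing the task logs (a sketch; assumes a DC/OS CLI version that provides the "dcos task log" subcommand):

# the keepalived container logs the negotiated state (MASTER/BACKUP) at startup
dcos task log keepalived-master | grep -i state
dcos task log keepalived-backup | grep -i state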

The article DC/OS: Install Marathon-LB with keepalived first appeared on Blog dbi services.

EDB BART 2.2, parallel full backups without using pg_basebackup

Mon, 2018-11-12 13:45

Some days ago EnterpriseDB released the latest version of its backup and recovery tool for PostgreSQL and EDB Postgres Advanced Server (release notes here). The main new features, at least for me, are backups using multiple cores and parallel incremental restores. Besides that, BART no longer requires pg_basebackup for taking full backups of a PostgreSQL instance. The downside of that could be that you cannot easily restore the backups without using EDB BART. Let's see how all of that works.

I will not describe how BART can be installed, as that is just a "yum install ..." from the EDB repositories. Once that is done, BART should report version 2.2.0:

[postgres@edbbart ~]$ bart --version
bart (EnterpriseDB) 2.2.0
BART22-REL-2_2_0-GA2-0-g60d64ca 
Tue Nov 6 08:32:55 UTC 2018 

The help output of BART already gives an idea about the latest changes:

[postgres@edbbart ~]$ bart backup --help
bart: backup and recovery tool

Usage:
 bart BACKUP [OPTION]...

Options:
  -h, --help           Show this help message and exit
  -s, --server         Name of the server or 'all' (full backups only) to specify all servers
  -F, --format=p|t     Backup output format (tar (default) or plain)
  -z, --gzip           Enables gzip compression of tar files
  -c, --compress-level Specifies the compression level (1 through 9, 9 being best compression)

  --backup-name        Specify a friendly name for the current backup
  --parent             Specify parent backup for incremental backup
  --thread-count       Specify number of worker thread(s) to take backup
  --check              Verify checksum of required mbm files
  --with-pg_basebackup Use pg_basebackup to take the backup, valid only for full backup
  --no-pg_basebackup   Don't use pg_basebackup to take the backup, valid only for full backup

Parallel processing came in, as well as the possibility to avoid using pg_basebackup. Let's do a first test without specifying "--with-pg_basebackup" or "--no-pg_basebackup", just to see what the default is:

[postgres@edbbart ~]$ /usr/edb/bart/bin/bart init -s PG2 -o
INFO:  setting archive_command for server 'pg2'
WARNING: archive_command is set. server restart is required
[postgres@edbbart ~]$ bart backup -s PG2
INFO:  DebugTarget - getVar(checkDiskSpace.bytesAvailable)
INFO:  new backup identifier generated 1541853901964
INFO:  creating 2 harvester threads
/u90/pg2/1541853901964
/u90/pg2/1541853901964
INFO:  backup completed successfully
INFO:  
BART VERSION: 2.2.0
BACKUP DETAILS:
BACKUP STATUS: active
BACKUP IDENTIFIER: 1541853901964
BACKUP NAME: PG2_2018-11-10T13:45
BACKUP PARENT: none
BACKUP LOCATION: /u90/pg2/1541853901964
BACKUP SIZE: 81.08 MB
BACKUP FORMAT: tar
BACKUP TIMEZONE: Europe/Zurich
XLOG METHOD: stream
BACKUP CHECKSUM(s): 0
TABLESPACE(s): 0
START WAL LOCATION: 000000010000000000000005
STOP WAL LOCATION: 000000010000000000000006
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2018-11-01 15:39:25 CET
STOP TIME: 2018-11-01 15:39:28 CET
TOTAL DURATION: 3 sec(s)

We get two threads creating the backup, because I requested that in my configuration:

[postgres@edbbart ~]$ cat /usr/edb/bart/etc/bart.cfg
[BART]              
bart_host = postgres@edbbart  
backup_path = /u90           
pg_basebackup_path = /usr/edb/as11/bin/pg_basebackup   
xlog_method = stream
retention_policy= 2 DAYS
logfile = /tmp/bart.log 
scanner_logfile = /tmp/scanner.log	   
thread_count = 2

[PG2]
backup_name = PG2_%year-%month-%dayT%hour:%minute
host = 192.168.22.60   
user = bart
port = 5433   
cluster_owner = postgres 
remote_host = postgres@192.168.22.60 
allow_incremental_backups = enabled
thread_count = 2      
description = PG1 backups    

The question is now if pg_basebackup was used in the background or not (the answer is already known though, as pg_basebackup has no parallel option).

[postgres@edbbart ~]$ cd /u90/pg2/1541853901964/
[postgres@edbbart 1541853901964]$ ls -l
total 83040
-rw-rw-r--. 1 postgres postgres      644 Nov 10 13:45 backupinfo
-rw-------. 1 postgres postgres      211 Nov 10 13:45 backup_label
drwxrwxr-x. 4 postgres postgres       37 Nov 10 13:45 base
-rw-rw-r--. 1 postgres postgres 26151424 Nov 10 13:45 base-1.tar
-rw-rw-r--. 1 postgres postgres 25309696 Nov 10 13:45 base-2.tar
-rw-rw-r--. 1 postgres postgres 33557504 Nov 10 13:45 base.tar

This is not what you get when you do a plain pg_basebackup. Could we restore that without using BART?

[postgres@edbbart aa]$ tar -axf /u90/pg2/1541853901964/base.tar 
[postgres@edbbart aa]$ ls -l
total 60
drwx------. 6 postgres postgres    54 Nov 10 13:45 base
-rw-------. 1 postgres postgres    33 Nov 10 13:45 current_logfiles
drwx------. 2 postgres postgres  4096 Nov 10 13:45 global
-rw-------. 1 postgres postgres    25 Nov 10 13:45 __payloadChecksum
drwx------. 2 postgres postgres     6 Nov 10 13:45 pg_commit_ts
drwx------. 2 postgres postgres     6 Nov 10 13:45 pg_dynshmem
-rw-------. 1 postgres postgres  4653 Nov 10 13:45 pg_hba.conf
-rw-------. 1 postgres postgres  1636 Nov 10 13:45 pg_ident.conf
drwx------. 2 postgres postgres    58 Nov 10 13:55 pg_log
drwx------. 4 postgres postgres    68 Nov 10 13:45 pg_logical
drwx------. 4 postgres postgres    36 Nov 10 13:45 pg_multixact
drwx------. 2 postgres postgres    18 Nov 10 13:55 pg_notify
drwx------. 2 postgres postgres     6 Nov 10 13:45 pg_replslot
drwx------. 2 postgres postgres     6 Nov 10 13:45 pg_serial
drwx------. 2 postgres postgres     6 Nov 10 13:45 pg_snapshots
drwx------. 2 postgres postgres     6 Nov 10 13:45 pg_stat
drwx------. 2 postgres postgres    88 Nov 10 13:55 pg_stat_tmp
drwx------. 2 postgres postgres    18 Nov 10 13:45 pg_subtrans
drwx------. 2 postgres postgres     6 Nov 10 13:45 pg_tblspc
drwx------. 2 postgres postgres     6 Nov 10 13:45 pg_twophase
-rw-------. 1 postgres postgres     3 Nov 10 13:45 PG_VERSION
drwx------. 2 postgres postgres    70 Nov 10 13:55 pg_wal
drwx------. 2 postgres postgres    18 Nov 10 13:55 pg_xact
-rw-------. 1 postgres postgres  1184 Nov 10 13:45 postgresql.auto.conf
-rw-------. 1 postgres postgres 27807 Nov 10 13:45 postgresql.conf

Looks like we can just start that, so let's try:

[postgres@edbbart aa]$ chmod 700 /var/tmp/aa
[postgres@edbbart aa]$ /usr/edb/as11/bin/pg_ctl -D /var/tmp/aa/ start
waiting for server to start....2018-11-10 13:57:29.027 CET - 1 - 14772 -  - @ LOG:  listening on IPv4 address "0.0.0.0", port 5433
2018-11-10 13:57:29.027 CET - 2 - 14772 -  - @ LOG:  listening on IPv6 address "::", port 5433
2018-11-10 13:57:29.034 CET - 3 - 14772 -  - @ LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
2018-11-10 13:57:29.075 CET - 4 - 14772 -  - @ LOG:  redirecting log output to logging collector process
2018-11-10 13:57:29.075 CET - 5 - 14772 -  - @ HINT:  Future log output will appear in directory "pg_log".
 done
server started
[postgres@edbbart aa]$ /usr/edb/as11/bin/psql -p 5433
psql.bin (11.0.3, server 11.0.3)
Type "help" for help.
postgres=# \q
[postgres@edbbart aa]$ /usr/edb/as11/bin/pg_ctl -D /var/tmp/aa/ stop -m fast
waiting for server to shut down....... done
server stopped
[postgres@edbbart aa]$ 

Good news: even when not using pg_basebackup there is no lock-in, because you can just combine the tar files and start up the instance.
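For completeness, a fully manual restore that combines all the tar files could look like this (a minimal sketch using the paths from this demo; the supported way remains the bart restore command):

# extract the full backup and the per-thread archives into one data directory
mkdir /var/tmp/manual_restore
for f in /u90/pg2/1541853901964/base*.tar; do
  tar -xf "$f" -C /var/tmp/manual_restore
done
chmod 700 /var/tmp/manual_restore
/usr/edb/as11/bin/pg_ctl -D /var/tmp/manual_restore start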

Having a closer look at what is inside the base*.tar files:

[postgres@edbbart ~]$ rm -rf /var/tmp/aa
[postgres@edbbart ~]$ mkdir /var/tmp/aa1
[postgres@edbbart ~]$ mkdir /var/tmp/aa2
[postgres@edbbart ~]$ cd /var/tmp/aa1
[postgres@edbbart aa1]$ tar -axf /u90/pg2/1541853901964/base-1.tar 
[postgres@edbbart aa1]$ cd ../aa2
[postgres@edbbart aa2]$ tar -axf /u90/pg2/1541853901964/base-2.tar 
[postgres@edbbart aa2]$ ls -la ../aa1
total 28
drwxrwxr-x. 15 postgres postgres 4096 Nov 10 14:09 .
drwxrwxrwt.  6 root     root      134 Nov 10 14:08 ..
drwxrwxr-x.  6 postgres postgres   54 Nov 10 14:09 base
-rw-------.  1 postgres postgres   33 Nov 10 13:45 current_logfiles
drwxrwxr-x.  2 postgres postgres 4096 Nov 10 14:09 global
-rw-------.  1 postgres postgres   25 Nov 10 13:45 __payloadChecksum
drwx------.  2 postgres postgres    6 Nov 10 13:45 pg_commit_ts
-rw-------.  1 postgres postgres 4653 Nov 10 13:45 pg_hba.conf
drwx------.  2 postgres postgres   32 Nov 10 13:45 pg_log
drwxrwxr-x.  4 postgres postgres   39 Nov 10 14:09 pg_logical
drwxrwxr-x.  4 postgres postgres   36 Nov 10 14:09 pg_multixact
drwx------.  2 postgres postgres    6 Nov 10 13:45 pg_notify
drwx------.  2 postgres postgres    6 Nov 10 13:45 pg_replslot
drwx------.  2 postgres postgres    6 Nov 10 13:45 pg_snapshots
drwx------.  2 postgres postgres   52 Nov 10 13:45 pg_stat_tmp
drwxrwxr-x.  2 postgres postgres   18 Nov 10 14:09 pg_subtrans
drwx------.  2 postgres postgres    6 Nov 10 13:45 pg_twophase
drwx------.  2 postgres postgres    6 Nov 10 13:45 pg_xact
-rw-------.  1 postgres postgres 1184 Nov 10 13:45 postgresql.auto.conf
[postgres@edbbart aa2]$ ls -la ../aa2
total 48
drwxrwxr-x. 16 postgres postgres  4096 Nov 10 14:09 .
drwxrwxrwt.  6 root     root       134 Nov 10 14:08 ..
drwx------.  6 postgres postgres    54 Nov 10 13:45 base
drwx------.  2 postgres postgres  4096 Nov 10 13:45 global
-rw-------.  1 postgres postgres    25 Nov 10 13:45 __payloadChecksum
drwx------.  2 postgres postgres     6 Nov 10 13:45 pg_dynshmem
-rw-------.  1 postgres postgres  1636 Nov 10 13:45 pg_ident.conf
drwxrwxr-x.  2 postgres postgres    32 Nov 10 14:09 pg_log
drwx------.  2 postgres postgres    35 Nov 10 13:45 pg_logical
drwx------.  4 postgres postgres    36 Nov 10 13:45 pg_multixact
drwxrwxr-x.  2 postgres postgres    18 Nov 10 14:09 pg_notify
drwx------.  2 postgres postgres     6 Nov 10 13:45 pg_serial
drwx------.  2 postgres postgres     6 Nov 10 13:45 pg_stat
drwxrwxr-x.  2 postgres postgres    42 Nov 10 14:09 pg_stat_tmp
drwx------.  2 postgres postgres     6 Nov 10 13:45 pg_subtrans
drwx------.  2 postgres postgres     6 Nov 10 13:45 pg_tblspc
-rw-------.  1 postgres postgres     3 Nov 10 13:45 PG_VERSION
drwx------.  2 postgres postgres     6 Nov 10 13:45 pg_wal
drwxrwxr-x.  2 postgres postgres    18 Nov 10 14:09 pg_xact
-rw-------.  1 postgres postgres 27807 Nov 10 13:45 postgresql.conf

The main benefit comes from splitting the data files of the databases ($PGDATA/base/[OIDs]) into separate tar files:

[postgres@edbbart aa2]$ ls -la ../aa2/base
total 36
drwx------.  6 postgres postgres   54 Nov 10 13:45 .
drwxrwxr-x. 16 postgres postgres 4096 Nov 10 14:09 ..
drwxrwxr-x.  2 postgres postgres 4096 Nov 10 14:09 1
drwxrwxr-x.  2 postgres postgres 4096 Nov 10 14:09 15709
drwx------.  2 postgres postgres 4096 Nov 10 13:45 15710
drwx------.  2 postgres postgres 4096 Nov 10 13:45 15711
[postgres@edbbart aa2]$ ls -la ../aa1/base
total 36
drwxrwxr-x.  6 postgres postgres   54 Nov 10 14:09 .
drwxrwxr-x. 15 postgres postgres 4096 Nov 10 14:09 ..
drwx------.  2 postgres postgres 4096 Nov 10 13:45 1
drwx------.  2 postgres postgres 4096 Nov 10 13:45 15709
drwxrwxr-x.  2 postgres postgres 4096 Nov 10 14:09 15710
drwxrwxr-x.  2 postgres postgres 4096 Nov 10 14:09 15711
[postgres@edbbart aa2]$ ls -la ../aa1/base/15711
total 5828
drwxrwxr-x. 2 postgres postgres    4096 Nov 10 14:09 .
drwxrwxr-x. 6 postgres postgres      54 Nov 10 14:09 ..
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 112
-rw-------. 1 postgres postgres       0 Nov 10 13:45 1220
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 1222
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 1227
-rw-------. 1 postgres postgres  139264 Nov 10 13:45 1247
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 1247_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 1249_fsm
-rw-------. 1 postgres postgres 1196032 Nov 10 13:45 1255
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 1255_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 1259_fsm
-rw-------. 1 postgres postgres   57344 Nov 10 13:45 13795
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13795_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13799
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 13800_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 13802
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13805
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13805_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13809
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 13810_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 13812
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13815
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13815_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13819
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 13820_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 13822
-rw-------. 1 postgres postgres       0 Nov 10 13:45 13825
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13829
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 13956_fsm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13966
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13966_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 1417_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 1418_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 14539
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 14542
-rw-------. 1 postgres postgres       0 Nov 10 13:45 14547
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 14550
-rw-------. 1 postgres postgres       0 Nov 10 13:45 14555
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15609
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15614
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15617
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15622
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15625
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15629
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15632
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15636
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15639
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15644
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15649
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15654
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15659
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15664
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 175
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2224
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2224_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2328_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2336
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2337
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2600
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2600_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2601_fsm
-rw-------. 1 postgres postgres   57344 Nov 10 13:45 2602
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2602_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2603_fsm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2604
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2604_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2605_fsm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2606
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2606_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2607_fsm
-rw-------. 1 postgres postgres  663552 Nov 10 13:45 2608
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2608_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2609_fsm
-rw-------. 1 postgres postgres   40960 Nov 10 13:45 2610
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2610_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2611_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2612_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2613
-rw-------. 1 postgres postgres   49152 Nov 10 13:45 2615
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2615_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2616_fsm
-rw-------. 1 postgres postgres  122880 Nov 10 13:45 2617
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2617_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2618_fsm
-rw-------. 1 postgres postgres  163840 Nov 10 13:45 2619
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2619_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2620_vm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2651
-rw-------. 1 postgres postgres   40960 Nov 10 13:45 2653
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 2655
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2657
-rw-------. 1 postgres postgres  139264 Nov 10 13:45 2659
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2661
-rw-------. 1 postgres postgres   65536 Nov 10 13:45 2663
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2665
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2667
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2669
-rw-------. 1 postgres postgres  524288 Nov 10 13:45 2673
-rw-------. 1 postgres postgres  196608 Nov 10 13:45 2675
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2679
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2681
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2683
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2685
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2687
-rw-------. 1 postgres postgres   49152 Nov 10 13:45 2689
-rw-------. 1 postgres postgres  434176 Nov 10 13:45 2691
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2693
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2699
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2702
-rw-------. 1 postgres postgres   65536 Nov 10 13:45 2704
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2753_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2754
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 2756
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2830
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2831
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2832_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2834
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2835
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2836_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2837
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2838_fsm
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 2839
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2840_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2841
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2995_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3079
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3079_vm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3081
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3118
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3119
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3256
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3257
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3350
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3351
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3380
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3381_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3394_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3395
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3439_vm
-rw-------. 1 postgres postgres   49152 Nov 10 13:45 3455
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3456_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3466
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3467
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3501
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3501_vm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3503
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3541
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3541_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3574
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3576
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3596
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3597
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3598_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3600
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3600_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3601_fsm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3602
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3602_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3603_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3604
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3606
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3608
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3712
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3764_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3766
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3997
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 4943
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 4951
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 4953
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 5002
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 548
-rw-------. 1 postgres postgres       0 Nov 10 13:45 6102
-rw-------. 1 postgres postgres       0 Nov 10 13:45 6104
-rw-------. 1 postgres postgres       0 Nov 10 13:45 6106
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 6110
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 6112
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 6117
-rw-------. 1 postgres postgres       0 Nov 10 13:45 7200_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 826
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 827
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 8889
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 8890_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 8891
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 8895_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 8896
-rw-------. 1 postgres postgres       0 Nov 10 13:45 8899
-rw-------. 1 postgres postgres       0 Nov 10 13:45 8900
-rw-------. 1 postgres postgres       0 Nov 10 13:45 9970
-rw-------. 1 postgres postgres       0 Nov 10 13:45 9971
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 9972
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 9972_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 9973_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 9984
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 9986
-rw-------. 1 postgres postgres       0 Nov 10 13:45 9988
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 9991
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 9993
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 9995
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 9997
-rw-------. 1 postgres postgres       3 Nov 10 13:45 PG_VERSION
[postgres@edbbart aa2]$ ls -la ../aa2/base/15711
total 6028
drwx------. 2 postgres postgres    4096 Nov 10 13:45 .
drwx------. 6 postgres postgres      54 Nov 10 13:45 ..
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 113
-rw-------. 1 postgres postgres       0 Nov 10 13:45 1221
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 1226
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 1228
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 1247_fsm
-rw-------. 1 postgres postgres  761856 Nov 10 13:45 1249
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 1249_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 1255_fsm
-rw-------. 1 postgres postgres  139264 Nov 10 13:45 1259
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 1259_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 13795_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 13797
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13800
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13800_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13804
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 13805_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 13807
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13810
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13810_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13814
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 13815_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 13817
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13820
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13820_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13824
-rw-------. 1 postgres postgres       0 Nov 10 13:45 13827
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13956
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 13956_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 13966_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 1417
-rw-------. 1 postgres postgres       0 Nov 10 13:45 1418
-rw-------. 1 postgres postgres       0 Nov 10 13:45 14536
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 14541
-rw-------. 1 postgres postgres       0 Nov 10 13:45 14544
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 14549
-rw-------. 1 postgres postgres       0 Nov 10 13:45 14552
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 14557
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15611
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15616
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15619
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15624
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15626
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15631
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15633
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 15638
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15641
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15646
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15651
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15656
-rw-------. 1 postgres postgres       0 Nov 10 13:45 15661
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 174
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2187
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2224_fsm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2328
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2328_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2336_vm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2579
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2600_fsm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2601
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2601_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2602_fsm
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 2603
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2603_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2604_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2605
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2605_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2606_fsm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2607
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2607_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2608_fsm
-rw-------. 1 postgres postgres  360448 Nov 10 13:45 2609
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2609_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2610_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2611
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2612
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2612_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2613_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2615_fsm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2616
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2616_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2617_fsm
-rw-------. 1 postgres postgres  172032 Nov 10 13:45 2618
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2618_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 2619_fsm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2620
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2650
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2652
-rw-------. 1 postgres postgres   40960 Nov 10 13:45 2654
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2656
-rw-------. 1 postgres postgres  204800 Nov 10 13:45 2658
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2660
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 2662
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2664
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2666
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2668
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2670
-rw-------. 1 postgres postgres  557056 Nov 10 13:45 2674
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2678
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2680
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2682
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2684
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2686
-rw-------. 1 postgres postgres   40960 Nov 10 13:45 2688
-rw-------. 1 postgres postgres  114688 Nov 10 13:45 2690
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2692
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 2696
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2701
-rw-------. 1 postgres postgres   40960 Nov 10 13:45 2703
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2753
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2753_vm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 2755
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 2757
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2830_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2832
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2833
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2834_vm
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 2836
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2836_vm
-rw-------. 1 postgres postgres 1040384 Nov 10 13:45 2838
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2838_vm
-rw-------. 1 postgres postgres   49152 Nov 10 13:45 2840
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2840_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 2995
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 2996
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3079_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3080
-rw-------. 1 postgres postgres   57344 Nov 10 13:45 3085
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3118_vm
-rw-------. 1 postgres postgres   90112 Nov 10 13:45 3164
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3256_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3258
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3350_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3379
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3381
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3394
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3394_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3439
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3440
-rw-------. 1 postgres postgres  507904 Nov 10 13:45 3456
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3456_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3466_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3468
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3501_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3502
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3534
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3541_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3542
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3575
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3576_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3596_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 3598
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3599
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3600_fsm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3601
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3601_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 3602_fsm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3603
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3603_vm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3605
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3607
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 3609
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3764
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 3764_vm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 3767
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 4942
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 4950
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 4952
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 4954
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 5007
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 549
-rw-------. 1 postgres postgres       0 Nov 10 13:45 6102_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 6104_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 6106_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 6111
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 6113
-rw-------. 1 postgres postgres       0 Nov 10 13:45 7200
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 7201
-rw-------. 1 postgres postgres       0 Nov 10 13:45 826_vm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 828
-rw-------. 1 postgres postgres   49152 Nov 10 13:45 8890
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 8890_vm
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 8895
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 8895_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 8896_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 8899_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 8900_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 9970_vm
-rw-------. 1 postgres postgres       0 Nov 10 13:45 9971_vm
-rw-------. 1 postgres postgres   24576 Nov 10 13:45 9972_fsm
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 9973
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 9973_vm
-rw-------. 1 postgres postgres   32768 Nov 10 13:45 9985
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 9987
-rw-------. 1 postgres postgres       0 Nov 10 13:45 9988_vm
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 9992
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 9994
-rw-------. 1 postgres postgres   16384 Nov 10 13:45 9996
-rw-------. 1 postgres postgres    8192 Nov 10 13:45 9998
-rw-------. 1 postgres postgres     512 Nov 10 13:45 pg_filenode.map

When you reduce the “thread_count” to 1 you will get a normal pg_basebackup:

[postgres@edbbart aa2]$ cat /usr/edb/bart/etc/bart.cfg
[BART]              
bart_host = postgres@edbbart  
backup_path = /u90           
pg_basebackup_path = /usr/edb/as11/bin/pg_basebackup   
xlog_method = stream
retention_policy= 2 DAYS
logfile = /tmp/bart.log 
scanner_logfile = /tmp/scanner.log	   
thread_count = 2

[PG2]
backup_name = PG2_%year-%month-%dayT%hour:%minute
host = 192.168.22.60   
user = bart
port = 5433   
cluster_owner = postgres 
remote_host = postgres@192.168.22.60 
allow_incremental_backups = enabled
thread_count = 1
description = PG1 backuos 

[postgres@edbbart aa2]$ bart backup -s PG2
INFO:  DebugTarget - getVar(checkDiskSpace.bytesAvailable)
INFO:  creating full backup using pg_basebackup for server 'pg2'
INFO:  creating backup for server 'pg2'
INFO:  backup identifier: '1541856034356'
INFO:  backup completed successfully
INFO:  
BART VERSION: 2.2.0
BACKUP DETAILS:
BACKUP STATUS: active
BACKUP IDENTIFIER: 1541856034356
BACKUP NAME: PG2_2018-11-10T14:20
BACKUP PARENT: none
BACKUP LOCATION: /u90/pg2/1541856034356
BACKUP SIZE: 64.68 MB
BACKUP FORMAT: tar
BACKUP TIMEZONE: Europe/Zurich
XLOG METHOD: stream
BACKUP CHECKSUM(s): 0
TABLESPACE(s): 0
START WAL LOCATION: 000000010000000000000008
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2018-11-01 16:14:57 CET
STOP TIME: 2018-11-10 14:20:34 CET
TOTAL DURATION: 214.094 hour(s)

In the next post we’ll look at how we can do parallel incremental restores.

Cet article EDB BART 2.2, parallel full backups without using pg_basebackup est apparu en premier sur Blog dbi services.

Recovering from failed patch on virtualized ODA

Mon, 2018-11-12 05:17

When a patch fails on a virtualized Oracle Database Appliance (ODA), the ODA is often unusable because Linux and OAKD are already patched to the new release while Grid Infrastructure is still on the old version. OAKD cannot be restarted in default mode because in this mode the active Grid Infrastructure version is checked, which fails due to the old version. Grid Infrastructure cannot be started either, because OAKD controls access to the shared hardware on the ODA and, if OAKD does not run, the shared hardware cannot be accessed.

One way to resolve this problem is to reimage the ODA, which is time consuming and means that all databases and VMs have to be restored.

A workaround for this chicken-and-egg problem (I cannot guarantee that it is supported), as a last try before reimaging the ODA, could be to start OAKD in non-cluster mode. This poorly documented mode does not check the active Grid Infrastructure version but still gives access to the shared hardware. Additional VMs cannot be started because there is no master OAKD. In this mode, manual patching/upgrade of Grid Infrastructure is possible.

The non-cluster mode can be entered as follows (on every ODA node):


cp /opt/oracle/oak/install/oakdrun /opt/oracle/oak/install/oakdrun_orig
echo "non-cluster" > /opt/oracle/oak/install/oakdrun
cd /etc/init.d
./init.oak start

[root@xx init.d]# ps -ef | grep oakd
root 49697 49658 11 11:05 ? 00:00:02 /opt/oracle/oak/bin/oakd -non-cluster
root 50511 42821 0 11:05 pts/0 00:00:00 grep oakd

Now Grid Infrastructure patching or upgrade can be done.

If only an ODA_BASE VM exists and the timeframe for manual patching/upgrade is too short, another option is to start Grid Infrastructure on one ODA node and then start the services. Patching or reimaging has to be done in the next suitable timeframe.

Once Grid Infrastructure runs on the new version, you can try to start OAKD in default mode:


echo "start" > /opt/oracle/oak/install/oakdrun
cd /etc/init.d
./init.oak start

[root@xx init.d]# ps -ef | grep oakd
root 30187 30117 13 10:18 ? 00:00:02 /opt/oracle/oak/bin/oakd foreground
root 31902 7569 0 10:18 pts/1 00:00:00 grep oakd

Manual patching/upgrade of other components may have to be done afterwards.

After patching/upgrading, ODA has to be checked with:


oakcli show version -detail
oakcli validate -a

Cet article Recovering from failed patch on virtualized ODA est apparu en premier sur Blog dbi services.

An index only scan in PostgreSQL is not always index only

Sat, 2018-11-10 05:19

PostgreSQL has supported index only scans since version 9.2, which was released in September 2012. The purpose of an index only scan is to fetch all the required values entirely from the index, without visiting the table (the heap) at all. Of course that can speed up a query: avoiding the heap means reading less data, and reading less data is obviously faster than reading more data. So index only scans are a good thing, but unfortunately they do not always mean that the heap is not touched.

As always, let's start by creating a sample table and populating it with some data:

postgres=# create table t1 ( a int, b int, c int );
CREATE TABLE
postgres=# insert into t1 select a.*,a.*,a.* from generate_series(1,1000000) a;
INSERT 0 1000000
postgres=# \d+ t1
                                    Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           |          |         | plain   |              | 
 b      | integer |           |          |         | plain   |              | 
 c      | integer |           |          |         | plain   |              | 

Without any index, a query like the following one needs to read the whole table to get the result:

postgres=# explain (analyze,buffers,costs off) select a from t1 where b = 5;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Gather (actual time=2.187..158.023 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=5406
   ->  Parallel Seq Scan on t1 (actual time=68.645..119.828 rows=0 loops=3)
         Filter: (b = 5)
         Rows Removed by Filter: 333333
         Buffers: shared hit=5406
 Planning time: 0.209 ms
 Execution time: 158.079 ms
(10 rows)

In this case PostgreSQL decides to do a parallel sequential scan which is fine. The only other option would be to do a serial sequential scan as we do not have any indexes on that table. What people usually do in such cases is to create an index like this one:

postgres=# create index i1 on t1(b);
CREATE INDEX
postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 
 b      | integer |           |          | 
 c      | integer |           |          | 
Indexes:
    "i1" btree (b)

Having that index in place PostgreSQL can use it to return the results faster:

postgres=# explain (analyze,buffers,costs off) select a from t1 where b = 5;
                             QUERY PLAN                              
---------------------------------------------------------------------
 Index Scan using i1 on t1 (actual time=0.035..0.037 rows=1 loops=1)
   Index Cond: (b = 5)
   Buffers: shared hit=4
 Planning time: 0.174 ms
 Execution time: 0.081 ms
(5 rows)

As you can see above, the index is used but PostgreSQL still has to visit the heap to get the value of “a”. We can improve that even further by creating an index that contains all the information we need to satisfy the query:

postgres=# create index i2 on t1 (b,a);
CREATE INDEX
postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 
 b      | integer |           |          | 
 c      | integer |           |          | 
Indexes:
    "i1" btree (b)
    "i2" btree (b, a)

What happens now is that PostgreSQL switches to an index only scan:

postgres=# explain (analyze,buffers,costs off) select a from t1 where b = 5;
                                QUERY PLAN                                
--------------------------------------------------------------------------
 Index Only Scan using i2 on t1 (actual time=0.111..0.113 rows=1 loops=1)
   Index Cond: (b = 5)
   Heap Fetches: 1
   Buffers: shared hit=1 read=3
 Planning time: 0.515 ms
 Execution time: 0.161 ms
(6 rows)

But: there is still a fetch from the heap. Why is that? To answer that, let's list the files on disk for that table:

postgres=# select pg_relation_filepath('t1');
 pg_relation_filepath 
----------------------
 base/34013/34082
(1 row)

postgres=# \! ls -l $PGDATA/base/34013/34082*
-rw-------. 1 postgres postgres 44285952 Nov  8 04:51 /u02/pgdata/10/PG103/base/34013/34082
-rw-------. 1 postgres postgres    32768 Nov  8 04:46 /u02/pgdata/10/PG103/base/34013/34082_fsm

… and here we go: the table has a free space map but the visibility map is not yet there. Without the visibility map PostgreSQL cannot know if all the rows in a page are visible to all current transactions and therefore has to visit the heap to get that information. As soon as we create the visibility map:

postgres=# vacuum t1;
VACUUM
postgres=# \! ls -l $PGDATA/base/34013/34082*
-rw-------. 1 postgres postgres 44285952 Nov  8 04:51 /u02/pgdata/10/PG103/base/34013/34082
-rw-------. 1 postgres postgres    32768 Nov  8 04:46 /u02/pgdata/10/PG103/base/34013/34082_fsm
-rw-------. 1 postgres postgres     8192 Nov  8 07:18 /u02/pgdata/10/PG103/base/34013/34082_vm
postgres=# explain (analyze,buffers,costs off) select a from t1 where b = 5;
                                QUERY PLAN                                
--------------------------------------------------------------------------
 Index Only Scan using i2 on t1 (actual time=0.052..0.054 rows=1 loops=1)
   Index Cond: (b = 5)
   Heap Fetches: 0
   Buffers: shared hit=4
 Planning time: 0.446 ms
 Execution time: 0.106 ms
(6 rows)

… the fetch from the heap is gone and we have a real index only scan (although the visibility map is always scanned). To demonstrate that in more detail, let's get the physical location of the row we want to read:

postgres=# select ctid,* from t1 where b=5;
 ctid  | a | b | c 
-------+---+---+---
 (0,5) | 5 | 5 | 5
(1 row)

Now we know that the row is in block 0 and it is the 5th row in that block. Let’s check, for that block, if all rows are visible to all current transactions:

postgres=# create extension pg_visibility;
CREATE EXTENSION
postgres=# select pg_visibility_map('t1'::regclass, 0);
 pg_visibility_map 
-------------------
 (t,f)
(1 row)

Yes, they are (the first “t”, which is true, means all visible). What happens when we update the row in a second session?

postgres=# update t1 set a=8 where b=5;
UPDATE 1

Do we still get a “true” when we ask if all rows in that block are visible to all transactions?

postgres=# select pg_visibility_map('t1'::regclass, 0);
 pg_visibility_map 
-------------------
 (f,f)
(1 row)

No, and that means two things: first of all, a modification to a page clears the bit in the visibility map. The second consequence is that our index only scan will need to visit the heap again:

postgres=# explain (analyze,buffers,costs off) select a from t1 where b = 5;
                                QUERY PLAN                                
--------------------------------------------------------------------------
 Index Only Scan using i2 on t1 (actual time=0.263..0.267 rows=1 loops=1)
   Index Cond: (b = 5)
   Heap Fetches: 2
   Buffers: shared hit=6 dirtied=3
 Planning time: 0.205 ms
 Execution time: 0.328 ms
(6 rows)

The question now is: why two heap fetches? First of all, every update in PostgreSQL creates a new row version:

postgres=# select ctid,* from t1 where b=5;
   ctid    | a | b | c 
-----------+---+---+---
 (5405,76) | 8 | 5 | 5
(1 row)

Our row is now in a new block (and even if it were in the same block it would be at another location within the block) and that of course also affects the index entry which points to that row. The index still points to the old version of the row, and there is a pointer to the current version, which means two heap fetches (when you update a column that is not part of the index, that is called a HOT update, more on that in another post). For the next execution we see one heap fetch again:

postgres=# explain (analyze,buffers,costs off) select a from t1 where b = 5;
                                QUERY PLAN                                
--------------------------------------------------------------------------
 Index Only Scan using i2 on t1 (actual time=0.022..0.023 rows=1 loops=1)
   Index Cond: (b = 5)
   Heap Fetches: 1
   Buffers: shared hit=5
 Planning time: 0.093 ms
 Execution time: 0.047 ms
(6 rows)

Not sure why only one, at the moment, but I’ll update this blog once I have more information.

What you need to remember is that an index only scan is not always index only. Depending on how many modifications are happening on that table, it might well be that PostgreSQL must visit the heap quite a lot of times, which of course slows things down. For tables where most of the blocks are static, an index only scan is great.
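
If you want a quick overview of how much of a table is currently all-visible (and therefore a good candidate for real index only scans), the pg_visibility extension we created above also provides a summary function. A minimal check could look like this (just a sketch, the result obviously depends on your vacuum activity):

-- number of all-visible and all-frozen pages according to the visibility map of t1
select * from pg_visibility_map_summary('t1'::regclass);

The closer the all_visible count is to the total number of pages of the table, the fewer heap fetches an index only scan will need.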

Cet article An index only scan in PostgreSQL is not always index only est apparu en premier sur Blog dbi services.

SQL Pass Summit 2018: SQL Server 2019 features coming soon!

Fri, 2018-11-09 15:13

It’s the 20th anniversary of the event and the keynote today was amazing (see the blog from Christophe) for different reasons, but the most important one is that the audience gets better every time: not only DBAs but also people using their applications connected to the database.
Today, one of my focuses will be the next version, SQL Server 2019, with the session of Bob Ward, Asad Khan & Amit Banerjee from Microsoft.
sql2019_01
I already tested some new features and wrote about them, but it’s really good to see a session about SQL Server 2019 and to discover new features that can help us.
Before writing about the session, let me come back to the few-minute keynote demo done by Bob Ward & Connor Cunningham; I share it with you because it is not available in the CTP today, it will come in a future release of SQL Server. The problem: you have a huge workload and tempdb is heavily solicited. You notice in tempdb that only sys tables are the cause… What can we do? We do not have access to these tables, they are internal… And Connor comes with an update of SQL Server and it was fixed! AMAZING… It’s really a dream for us!!! Connor explained that the SQL Server development team built a solution using in-memory technology and put these sys tables in-memory to avoid this problem. This is illustrated in the picture as “tempdb: It Just Runs Faster” and is part of Intelligent Query Processing, like Memory Grant Feedback or Adaptive Joins, already available in SQL Server 2017 in the Adaptive Query Processing family.
The name is also changing from Adaptive to Intelligent! :lol:
sql2019_02
After this upcoming feature, let’s focus on the roadmap for SQL Server: Inside SQL Server 2019.
I will not write about features already in the CTP but about features coming soon.
On the security side, I’m very curious to test “Static Data Masking”.
We will be able to mask data even for users with admin access.
It can be useful when doing a copy from prod to dev for developers having the db_owner role… they won’t see the data in dev! Perfect for us!
sql2019_03
On the High Availability topic, they announced that system databases can finally be part of Availability Groups. You cannot imagine the joy this announcement brings to all of us!
The other news is Accelerated Database Recovery. This new feature helps a rollback of a large query to return to the previous state much faster. I attended a session explaining how it works; it was very interesting and I’m excited to test it on-premises.
sql2019_4
During the demo at the keynote, Bob used a new function that I find very useful: sys.fn_PageResCracker, combined with the DMV sys.dm_db_page_info, to troubleshoot page resources.
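
For those who want to try it on the CTP, the typical pattern (a minimal sketch of what was demonstrated, not official guidance, and the CTP syntax may still evolve) is to take the page_resource value of a waiting request, crack it into database/file/page ids and feed those into the DMV:

-- resolve the page a request is waiting on into database/file/page information
SELECT r.session_id, r.wait_type, page_info.*
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.fn_PageResCracker(r.page_resource) AS prc
CROSS APPLY sys.dm_db_page_info(prc.db_id, prc.file_id, prc.page_id, 'DETAILED') AS page_info
WHERE r.page_resource IS NOT NULL;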

sql2019_5
And to finish, the last slide of the presentation shows the next direction of SQL Server.
SQL2019_6
Dbi services SQL Server Team will be happy to test all these new features and blog about them!

Cet article SQL Pass Summit 2018: SQL Server 2019 features coming soon! est apparu en premier sur Blog dbi services.

Pass Summit – dbi visit day 4 (2)

Fri, 2018-11-09 12:38
Power BI 5 top custom visuals

The development of Power BI itself is fast, but so are the means to represent data: there are constantly new Power BI custom visuals developed by the community and by Microsoft.
You can develop your own custom visuals and publish them on the marketplace for your organization or the whole community.
Tip: check whether the visuals are certified, meaning they have been tested against the different Power BI capabilities.
You can find the gallery of custom visuals here: https://Appsource.microsoft.com/en-us/marketplace

Here are 5 different ones presented during the PASS Summit session:

1- SandDance
Used to display every data element of your data set on an XY axis. It also enables zooming into the data to isolate a plotted area and creating scenes to show your data with animations.

IMG_1071

2- Synoptic Panel
Connects areas in a custom image with attributes in your data model using a designer.
Use Https://Synoptic.design to define the areas in your image that you will map to your data set.

IMG_1076

3- Infographic designer
Customize the representation of your data to your specific needs or business, even with your own images.

IMG_1077

4- Timeline Storyteller
Multiple representations and layouts of data based on a time axis that you can organize in order to tell a story with your data.

IMG_1079

5- PowerApps
Integrates Line of Business applications into Power BI. This can also be used to have writebacks in Power BI.

IMG_1080

Development of your PowerApp must be done on the website Https://Powerapps.com if you integrate it using Power BI Desktop. The experience is different in the service version: there, including the PowerApps visual will allow you to create the PowerApp automatically with the fields of your data set.

Again an interesting data analytic day…stay tuned..

Cet article Pass Summit – dbi visit day 4 (2) est apparu en premier sur Blog dbi services.

Pass summit – day 4

Fri, 2018-11-09 12:01
Optimizing Multi-Billion Row Tables in Tabular in 2018

 

I wanted to attend the session moderated by Marco Russo to see his approach to optimizing performance in Tabular models.
The first thing to understand is how the data is stored and organized in a Tabular model. It uses the xVelocity in-memory capabilities with the VertiPaq column storage engine, which organizes the data, compressing it by column, in combination with a dictionary encoding mechanism.
Therefore the most important aspects to take care of in a model are:

  • Reducing the number of distinct values in columns
  • Focusing on columns with many distinct values

What also impacts the performance of data retrieval are the relationships in the model, which define the cardinality. The higher the cardinality, the slower the query will be.

In order to measure these criteria, the following tools are available:

  • VertiPaq Analyzer: it can help you get insight into your tabular model and obtain information on the number of items in your columns and their size.
  • DAX Studio: an open source tool that you can download by visiting https://Daxstudio.org

DAX Studio will help you analyze how your queries are performing, and you will be able to see how the performance is related to the number of cores, the size of the columns queried and, as already said above, the cardinality involved.

What is recommended for a Tabular model is the fastest possible CPU speed (>= 3 GHz) and large CPU L2/L3 caches; a large CPU cache is essential to avoid running into performance problems quickly, but having too many CPUs can be counterproductive, and having more sockets can negatively impact your performance. Fast RAM is also recommended (>= 3200). You have to be aware of the latency of memory and storage access and the huge impact it has on performance.

Marco delivered best practices using the available Azure tiers.
IMG_1081

The way you can deal with a too high cardinality between fact and dimension is to reduce the dimension with sub-categories, but also to reduce your fact table by splitting it and building aggregates. You have to re-shape your data model.

The great thing is that the new “dual storage mode and aggregations” option in Power BI Premium enables defining aggregation tables for big fact tables that do not fit in memory within Power BI.
IMG_1082

Again an interesting data analytic day…stay tuned..

Cet article Pass summit – day 4 est apparu en premier sur Blog dbi services.

Pass summit – dbi visit day 3

Thu, 2018-11-08 10:52
IMG_1050
The Microsoft data platform is evolving

The third day began with the keynote held by Rohan Kumar, Corporate Vice President of Azure Data at Microsoft.

The main message was “Hybrid data platform is the way for the future” and Microsoft is working in this direction.

The use of AI and analytics to transform the customer business is also a key driver for the Microsoft data platform, which is built to enable this easily.

Rohan reminded us of the customer priorities:
– Modernizing on-premises
– Modernizing the cloud
– Building cloud-native applications
– Unlocking insights

To support these priorities we got an overview of the latest developments in the data platform products and services:
Modernizing the on-premises solution with SQL Server 2019
○ More language support
○ Best security and performance
○ Power BI Report Server
○ Mission critical availability and performance
○ Security and compliance
○ Management and development
○ Big data and analytics

And Microsoft is announcing SQL Server 2019 CTP 2.1

One big evolution in terms of performance is that some system tables will be moved to in-memory technology.
SQL Server 2019 is becoming a hub for integrating data and also enables data ingestion in big data clusters combining Spark and SQL Server

Azure Data Studio is the new tool to manage all the Azure data services in order to see them as a unified data platform

Modernizing the cloud with the new services or their new capabilities or functionalities
Azure database migration service
Azure SQL database managed instance with a business critical option available
Azure SQL Database Hyper scale (up to 100TB)
Accelerated databases recovery and machine Learning services in Azure SQL Database
Building cloud-native apps with Azure Cosmos DB
Multi-Master and Cassandra API

Unlocking insights
New features of Azure SQL Data Warehouse, like workload management
Partnership Informatica, tableau connectors
HDInsight 4.0
Real-time analytics
Azure event hubs for Kafka
Azure Data Explorer
Power BI Dataflow
Power BI SSRS Reports in Power BI
Azure Databricks new feature

The SQL Server product is evolving constantly, and the Azure world even faster.

Again an interesting data analytic day…stay tuned..

Cet article Pass summit – dbi visit day 3 est apparu en premier sur Blog dbi services.

SQL server security pre-conference @SQL Pass Summit 2018 in Seattle

Wed, 2018-11-07 07:53

After long hours of flights, Christophe Cosme and I finally arrived in Seattle at the SQL PASS Summit 2018. We picked up our badge and bag and now we are ready to begin…

welcome2

This first day for me started with the pre-conference about SQL Server Security with Denny Cherry.
Many topics were discussed like:

  • Network Designs (such as public IP vs. Private IP)
  • Firewall configurations (such as Network Design between Public network, Internal network and DMZ)
  • Physical Security Options (such as Cloud security)
  • Data Encryption Options (such as Always Encrypted)
  • Password Protection Options (such as rules to set a sql login password)
  • Contained Databases (such as principle of contained Databases)
  • Always On Availability Groups Security (such as new roles and sysadmin role)
  • Internet-facing Applications (such as SQL CLR risks)
  • SQL injections (Ascii characters that aren’t actually ascii)
  • Backups (such as backups encryption using TDE or Backup encryption)
  • Antivirus (such as Best Practice for the Antivirus)
  • SAN Security Designs (such as disk encryption option)
  • SQL Server Auditing Options (such as setting up auditing)

It would be tough to go through all these subjects, and all of them are interesting…

 preconf2

I preferred to focus on the Antivirus topic and give you a little update on that.
First, we know that Antivirus can cause issues (CPU, memory and IO) on SQL Server.
As I always tell my customers, it is better not to install an antivirus on a server with SQL Server.
Why? Because performance can be impacted, but also because it can corrupt your databases.
If you have an antivirus installed on a SQL Server, Denny gave us a good way to verify whether it injects itself into SQL Server:

SELECT * FROM sys.dm_os_loaded_modules where company != 'Microsoft Corporation'

loaded_modules
This is also a good way to check for other providers and to spot a security breach!
Besides this, do not forget to exclude the SQL Server files, folders and processes from the antivirus.
The list of exclusions is on the TechNet website here
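
If you need a starting point for that exclusion list, a simple helper query (just a sketch; the official list also covers backup directories, error logs, trace files and the SQL Server binaries) returns the physical location of all data and log files known to the instance:

-- physical paths of all data and log files of this instance, candidates for antivirus exclusions
SELECT DISTINCT physical_name FROM sys.master_files ORDER BY physical_name;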

A point I was really not aware of is that SQL Server 2016 and later come with Windows Defender included and enabled, but you can get very poor performance if it is not set up correctly…

antivirus

As a conclusion, I will just write that antivirus is good, but not for SQL Server! 8-)

Cet article SQL server security pre-conference @SQL Pass Summit 2018 in Seattle est apparu en premier sur Blog dbi services.

Pass summit – dbi visit day 2

Tue, 2018-11-06 19:19
IMG_1046
The Art and Science of Designing Smart Data Story

This pre-conference session was moderated by Mico Yuk, CEO and founder of BI Brainz.

Mico drove us through the path of BI data visual storytelling and the importance of presenting the data in an attractive way.
She warned that the session would not be technical at all; I was a bit disappointed about this point, but curious how she would get us there.

To start, she made us aware of the impact that what you see has on your brain, especially colors and words.
In order to create good storytelling, the first thing is to raise the right questions, but avoid pitfalls like asking:
What do you want to see?
How do you want it to look?
What do you want to measure?

She drew a parallel with Hollywood and film scenarios, which are most of the time built on the same template.
Why can’t we do the same with data?
Part 1: goal
Goals need to be defined clearly. They are the core of your data story, and what is significant is that the goal is quantified.
Part 2: Snapshot or KPI
KPIs are the figures that measure your goal.
Part 3: Trends
The way the KPI behaves to support your goal.
Part 4: Actions
There are 3 kinds of actions that can be considered to fix the issues:
Reality – what we want to happen
Fallback – if reality does not work, what else do you do
Wish list – what do you wish for to fix the issue

To succeed in creating the different storyboard parts you have to raise the right questions:
For the goal part : What is your goal for this project? What does success look like?
For the KPI part: What KPI do you need see to hit your goal?
For the trends: If your KPI is not on target why?

A goal formulation must follow the WHW pattern; a sample goal would be “Hit consumer sales target of 1% by 2018”:
What = hit
How much = 1% sales
When = 2018

Finally Mico showed how to display the information of the story on a dashboard following the visual storyboard tile with the following elements:
0 Goal/KPI name
1 Where are you now
2 How did you get here
3 Where will you end up
4 What is the impact

The trick is always to prototype your storyboard outside a BI tool.

Along the day we also did some gym exercises to relax and wake up our brains.

 

It was a hands-on session with practical exercises, and we worked in groups on a concrete use case to apply the storytelling methodology throughout the day.

IMG_1048

Again an interesting data analytic day…stay tuned..

Cet article Pass summit – dbi visit day 2 est apparu en premier sur Blog dbi services.

Pass summit – dbi visit day 1

Mon, 2018-11-05 18:16
IMG_1022
Designing Modern Data and Analytic Solution in Azure

After having explained the pros and cons of Azure and the decision drivers for going to an Azure architecture, some interesting messages were delivered, like the decoupling of storage and compute on Azure, even if some services still combine both. Another message that we all know, but that is essential to repeat on a regular basis, is that cost control is an important aspect: developers and architects can hugely influence the cost of a cloud based platform.

Because Azure is evolving a lot, new services are offered and their features are changing, you have to be constantly informed about these new or deprecated capabilities. You are sometimes faced with migration work to adapt to these evolutions. Proofs of concept must become a habit, a routine, due to this constantly moving environment.

Organizing and configuring your resources in Azure is also something to consider seriously, in order not to get in trouble and lost with all the resources you deployed within your organization. Some notions become important to organize your Azure platform, like:

  • Subscriptions
  • Resource groups
  • Tags
  • Policies
  • Locations of your services
  • Naming convention

Using tags on Azure can help a lot for reporting purposes.
A proper naming convention (i.e. purpose + type of service + environment) can help you deploy your different environments more easily, but a trick is also not to use special characters because they cannot be used consistently across Azure services. Using policies will help you control that the services are created the way your governance wants them.

A data analytics architecture was presented, including the following services in order to manage structured data:

  • Azure data factory
  • Azure Blob Storage
  • Azure data bricks
  • Azure SQL Data Warehouse
  • Azure Analysis Services
  • Power BI

The following services are additionally suggested to add real-time data analytics:

  • Azure HDInsight Kafka
  • Azure Databricks Spark
  • Azure Cosmos DB

They then presented the different layers of the architecture and the possible Azure services: data acquisition, data storage, data computing and data visualization.
Having done on-premises data warehouse projects for 15 years, I always mentioned that I prefer ELT over ETL, so I was glad to hear that ELT is now really the mostly used terminology and that Data Factory was presented that way. The new version now also enables running SSIS packages. The different data storage possibilities, like Blob Storage, Data Lake, SQL DWH, SQL DB and Cosmos DB, have been reviewed. Choosing the right data storage is always about finding the right balance between schema on read and schema on write. But in a modern Azure BI platform you will find a so-called polyglot data storage solution, combining several types of storage services.

A component often emerging in this tool constellation is Databricks, running Apache Spark for data engineering and data science. The advantage is that it is closer to the open source world, supports several languages (SQL, Python, R, Scala) and can be used for use cases like batch data processing, interactive analytics, machine learning, stream event processing,…

In such an environment, data virtualization is also an interesting capability of the architecture. It is possible using, for instance, PolyBase, which allows querying the disparately stored data without replicating or duplicating it.

The pre-conference day finished with a presentation of Power BI and Azure Analysis Services and some automation techniques with ARM and PowerShell.

Again an interesting data analytic day…stay tuned…

Cet article Pass summit – dbi visit day 1 est apparu en premier sur Blog dbi services.

A Graphical Overview of a Repository

Mon, 2018-11-05 09:00

As the saying goes, “A Picture Is Worth A Thousand Words”. I’d add “And A Simple Graph Is Worth A Long, Abstruse List of Numbers”. And of words too, so let’s show off a little bit:
Screenshot from 2018-11-03 19-39-57
Interested ? Then, please read on.
It happens not so infrequently that we wish we could quickly plot a few numbers to have a look at their overall trend. The usual procedure is to output the numbers into a csv file, transfer it to a desktop machine, subsequently import it into a spreadsheet program and interactively set up a chart, a tedious manual procedure at best, especially if it has to be repeated several times.
What if we could query a few relevant values from a system and plot them in one go to visualize the produced graphs from within a browser ? What if we could generalize that procedure to any extracted tabular data ? I’m not talking of a sophisticated interface to some enterprise-class graphing tool but just of a simple way to get a quick visual overall feeling of some variables with as little installed software as possible.
I’ll show here how to do that for a Documentum repository but the target system can be anything, a database, an ldap, an O/S, just adapt the queries and the script as needed.
To simplify, I assume that we generally work on a server machine, likely a Linux headless VM, and that the browser runs remotely on any GUI-based desktop, quite a common configuration for Documentum.

The Data

As an administrator, I often have to connect to docbases I never visited before, and I find it useful to run the following queries to help me make acquaintance with those new beasts. In the examples below, the target docbase is an out of the box one with no activity in it, which explains the low numbers and lack of custom doctypes:

— 1. what distinct document types are there ?

select r_object_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by r_object_type order by 1
r_object_type count(*) tot_size
-------------------------------- ---------------------- ----------------------
dm_document 1295 6384130
dm_esign_template 1 46255
dm_format_preferences 1 109
dm_menu_system 2 352034
dm_plugin 2 212586
dm_xml_config 1 534
dmc_jar 236 50451386
dmc_preset_package 2 11951
dmc_tcf_activity_template 10 12162
(9 rows affected)

A reminder of some definitions: the result table above is a dataset. The r_object_type is the category, typically shown on the X-axis. The count(*) and tot_size columns are the “variables” to be plotted, typically as bars, lines or pie slices. They are also named “traces” in some graphing tools. Some of the datasets here have 1, 2 or 3 variables to be plotted. Datasets with 1 variable can be plotted as bar, line or pie charts. Datasets with 2 variables can be plotted as grouped or stacked bars, or as 2 distinct graphs of one variable. There are many possibilities and combinations, and the choice depends on which representation offers the best visual clarity. Sometimes, the plotting library even lets one edit the programmatically generated graph and interactively choose the best type of graph with no coding needed!

— 2. how does their population vary over time ?

select r_object_type, datefloor(month, r_creation_date) as "creation_month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by r_object_type, datefloor(month, r_creation_date) order by 1, 2;
r_object_type creation_month count(*) tot_size
-------------------------------- ------------------------- ------------ ------------
dm_document 11/1/2017 01:00:00 448 3055062
dm_document 12/1/2017 01:00:00 120 323552
dm_document 3/1/2018 01:00:00 38 66288
dm_document 4/1/2018 02:00:00 469 1427865
dm_document 5/1/2018 02:00:00 86 584453
dm_document 6/1/2018 02:00:00 20 150464
dm_document 7/1/2018 02:00:00 40 301341
dm_document 8/1/2018 02:00:00 32 151333
dm_document 9/1/2018 02:00:00 46 356386
dm_esign_template 11/1/2017 01:00:00 1 46255
dm_format_preferences 11/1/2017 01:00:00 1 109
dm_menu_system 11/1/2017 01:00:00 2 352034
dm_plugin 11/1/2017 01:00:00 2 212586
dm_xml_config 11/1/2017 01:00:00 1 534
dmc_jar 11/1/2017 01:00:00 236 50451386
dmc_preset_package 11/1/2017 01:00:00 2 11951
dmc_tcf_activity_template 11/1/2017 01:00:00 10 12162
(17 rows affected)

This query tells how heavily used the repository is. Also, a broad spectrum of custom document types tends to indicate that the repository is used in the context of applications.

— 3. how changing are those documents ?

select r_object_type, datefloor(month, r_modify_date) as "modification_month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) where r_creation_date < r_modify_date group by r_object_type, datefloor(month, r_modify_date) order by 1, 2;
r_object_type modification_month count(*) tot_size
-------------------------------- ------------------------- ------------ ------------
dm_document 11/1/2017 01:00:00 127 485791
dm_document 12/1/2017 01:00:00 33 122863
dm_document 3/1/2018 01:00:00 16 34310
dm_document 4/1/2018 02:00:00 209 749370
dm_document 5/1/2018 02:00:00 42 311211
dm_document 6/1/2018 02:00:00 10 79803
dm_document 7/1/2018 02:00:00 20 160100
dm_document 8/1/2018 02:00:00 12 81982
dm_document 9/1/2018 02:00:00 23 172299
dm_esign_template 8/1/2018 02:00:00 1 46255
dmc_jar 11/1/2017 01:00:00 14 1616218
dmc_preset_package 11/1/2017 01:00:00 2 11951
dmc_tcf_activity_template 11/1/2017 01:00:00 10 12162
(13 rows affected)

This query shows if a repository is used interactively rather than for archiving. If there are lots of edits, the docbase is a lively one; on the contrary, if documents are rarely or never edited, the docbase is mostly used for archiving. The document ownership can tell too: technical accounts vs. real people.

— same thing but without distinction of document type;
— 4. new documents;

select datefloor(month, r_creation_date) as "creation_month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by datefloor(month, r_creation_date) order by 1;
creation_month count(*) tot_size
------------------------- ---------------------- ----------------------
11/1/2017 01:00:00 703 54142079
12/1/2017 01:00:00 120 323552
3/1/2018 01:00:00 38 66288
4/1/2018 02:00:00 469 1427865
5/1/2018 02:00:00 86 584453
6/1/2018 02:00:00 20 150464
7/1/2018 02:00:00 40 301341
8/1/2018 02:00:00 32 151333
9/1/2018 02:00:00 42 323772
(9 rows affected)

— 5. modified documents;

select datefloor(month, r_modify_date) as "modification_month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) where r_creation_date < r_modify_date group by datefloor(month, r_modify_date) order by 1;
modification_month count(*) tot_size
------------------------- ---------------------- ----------------------
11/1/2017 01:00:00 153 2126122
12/1/2017 01:00:00 33 122863
3/1/2018 01:00:00 16 34310
4/1/2018 02:00:00 209 749370
5/1/2018 02:00:00 42 311211
6/1/2018 02:00:00 10 79803
7/1/2018 02:00:00 20 160100
8/1/2018 02:00:00 13 128237
9/1/2018 02:00:00 20 140354
(9 rows affected)

— 6. what content types are used in the repository ?

select a_content_type, count(*) as "count_content_type", sum(r_full_content_size) as "tot_content_size" from dm_document(all) group by a_content_type order by 1
col a_content_type 20
a_content_type count_content_type tot_content_size
-------------------- ------------------ ----------------
11 0
amipro 1 4558
crtext 43 389681
dtd 5 163392
excel12bbook 1 8546
excel12book 1 7848
excel12mebook 1 7867
excel12metemplate 1 7871
excel12template 1 7853
excel5book 1 15360
excel8book 1 13824
excel8template 1 13824
ibmshrlib 2 212586
jar 233 48856783
java 3 1594603
maker55 5 117760
mdoc55 9 780288
ms_access7 1 83968
ms_access8 1 59392
ms_access8_mde 1 61440
msw12 1 11280
msw12me 1 10009
msw12metemplate 1 10004
msw12template 1 9993
msw6 1 11776
msw8 1 19456
msw8template 1 27136
pdf 2 50214
powerpoint 1 14848
ppt12 1 29956
ppt12me 1 29944
ppt12meslideshow 1 29943
ppt12metemplate 1 29941
ppt12slideshow 1 29897
ppt12template 1 29914
ppt8 1 7680
ppt8_template 1 9728
text 1175 4236008
ustn 1 10240
vrf 1 158993
wp6 1 941
wp7 1 1362
wp8 1 1362
xml 28 37953
zip 1 255125
(45 rows affected)

The content type may indicate the kind of activity the repository is used for. While pdfs are generally final documents suitable for archiving, a predominance of illustrator/pagemaker/QuarkXPress vs photoshop vs multimedia vs Office documents can give a hint at the docbase’s general function.
Since most content types are unused (who uses Word Perfect or Ami Pro any more ?), a constraint on the count can be introduced, e.g. “having count(*) > 10″, to filter out those obsolete formats.

— 7. and how do they evolve over time ?

select datefloor(month, r_creation_date) as "creation_month", a_content_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by datefloor(month, r_creation_date), a_content_type order by 1, 2
creation_month a_content_type count(*) tot_size
------------------------- -------------------- ------------ ------------
11/1/2017 01:00:00 1 0
11/1/2017 01:00:00 amipro 1 4558
11/1/2017 01:00:00 crtext 43 389681
11/1/2017 01:00:00 dtd 5 163392
11/1/2017 01:00:00 excel12bbook 1 8546
11/1/2017 01:00:00 excel12book 1 7848
11/1/2017 01:00:00 excel12mebook 1 7867
11/1/2017 01:00:00 excel12metemplate 1 7871
11/1/2017 01:00:00 excel12template 1 7853
11/1/2017 01:00:00 excel5book 1 15360
11/1/2017 01:00:00 excel8book 1 13824
11/1/2017 01:00:00 excel8template 1 13824
11/1/2017 01:00:00 ibmshrlib 2 212586
11/1/2017 01:00:00 jar 233 48856783
11/1/2017 01:00:00 java 3 1594603
11/1/2017 01:00:00 maker55 5 117760
11/1/2017 01:00:00 mdoc55 9 780288
11/1/2017 01:00:00 ms_access7 1 83968
11/1/2017 01:00:00 ms_access8 1 59392
11/1/2017 01:00:00 ms_access8_mde 1 61440
11/1/2017 01:00:00 msw12 1 11280
11/1/2017 01:00:00 msw12me 1 10009
11/1/2017 01:00:00 msw12metemplate 1 10004
11/1/2017 01:00:00 msw12template 1 9993
11/1/2017 01:00:00 msw6 1 11776
11/1/2017 01:00:00 msw8 1 19456
11/1/2017 01:00:00 msw8template 1 27136
11/1/2017 01:00:00 pdf 2 50214
11/1/2017 01:00:00 powerpoint 1 14848
11/1/2017 01:00:00 ppt12 1 29956
11/1/2017 01:00:00 ppt12me 1 29944
11/1/2017 01:00:00 ppt12meslideshow 1 29943
11/1/2017 01:00:00 ppt12metemplate 1 29941
11/1/2017 01:00:00 ppt12slideshow 1 29897
11/1/2017 01:00:00 ppt12template 1 29914
11/1/2017 01:00:00 ppt8 1 7680
11/1/2017 01:00:00 ppt8_template 1 9728
11/1/2017 01:00:00 text 338 906940
11/1/2017 01:00:00 ustn 1 10240
11/1/2017 01:00:00 vrf 1 158993
11/1/2017 01:00:00 wp6 1 941
11/1/2017 01:00:00 wp7 1 1362
11/1/2017 01:00:00 wp8 1 1362
11/1/2017 01:00:00 xml 28 37953
11/1/2017 01:00:00 zip 1 255125
12/1/2017 01:00:00 text 120 323552
3/1/2018 01:00:00 text 38 66288
4/1/2018 02:00:00 text 469 1427865
5/1/2018 02:00:00 text 86 584453
6/1/2018 02:00:00 text 20 150464
7/1/2018 02:00:00 text 40 301341
8/1/2018 02:00:00 10 0
8/1/2018 02:00:00 text 22 151333
9/1/2018 02:00:00 text 42 323772
(54 rows affected)

— 8. where are those contents stored ?

select a_storage_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by a_storage_type order by a_storage_type
col a_storage_type 20
a_storage_type count(*) tot_size
-------------------- ------------ ------------
11 0
filestore_01 1539 57471147
(2 rows affected)

Filestores are conceptually quite similar to Oracle RDBMS tablespaces. It is a good practice to separate filestores by application/type of documents and/or even by time if the volume of documents is important. This query tells if this is in place and, if so, what the criteria were.

— 9. dig by document type;

select r_object_type, a_storage_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by r_object_type, a_storage_type order by r_object_type, a_storage_type
r_object_type a_storage_type count(*) tot_size
-------------------------------- -------------------- ------------ ------------
dm_document 11 0
dm_document filestore_01 1284 6384130
dm_esign_template filestore_01 1 46255
dm_format_preferences filestore_01 1 109
dm_menu_system filestore_01 2 352034
dm_plugin filestore_01 2 212586
dm_xml_config filestore_01 1 534
dmc_jar filestore_01 236 50451386
dmc_preset_package filestore_01 2 11951
dmc_tcf_activity_template filestore_01 10 12162
(10 rows affected)

— 10. same but by content format;

select a_content_type, a_storage_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by a_content_type, a_storage_type having count(*) > 10 order by a_content_type, a_storage_type
a_content_type a_storage_type count(*) tot_size
-------------------------------- --------------- ---------------------- ----------------------
crtext filestore_01 43 389681
jar filestore_01 233 48856783
text filestore_01 1204 4432200
xml filestore_01 28 37953
(4 rows affected)

— 11. what ACLs do exist and who created them ?

select owner_name, count(*) from dm_acl group by owner_name order by 2 desc
col owner_name 30
owner_name count(*)
------------------------------ ------------
dmadmin 179
dmtest 44
dm_bof_registry 6
dmc_wdk_preferences_owner 2
dmc_wdk_presets_owner 2
dm_mediaserver 1
dm_audit_user 1
dm_autorender_mac 1
dm_fulltext_index_user 1
dm_autorender_win31 1
dm_report_user 1
(11 rows affected)

Spoiler alert: attentive readers have probably noticed the command “col” preceding several queries, like in “col name 20”; this is not a standard idql command but part of an extended idql which will be presented in a future blog.
ACLs are part of the security model in place, if any. Usually, the owner is a technical account, sometimes a different one per application, per application functionality or per business line, so this query can show whether the repository is under the control of applications or simply used as a replacement for a shared drive. Overall, it tells how the repository’s security is managed, if at all.

— 12. ACLs most in use;

select acl_name, count(*) from dm_document(all) where acl_name not like 'dm_%' group by acl_name having count(*) >= 10 order by 1
acl_name count(*)
-------------------------------- ----------------------
BOF_acl 233
(1 row affected)

Here too, a usage filter can be introduced so that rarely used ACLs are not reported, e.g. “having count(*) >= 10” as in the query above.

— 13. queue items;

select name, count(*) from dmi_queue_item group by name order by 2 desc
name count(*)
-------------------- ------------
dm_autorender_win31 115
dmadmin 101
dmtest 54
(3 rows affected)

This query is mainly useful to show if external systems are used for generating renditions, thumbnails or any other document transformation.

— non quantifiable queries;

select object_name from dm_acl where object_name not like 'dm_%' order by 1
object_name
--------------------------------
BOF_acl
BOF_acl2
BPM Process Variable ACL
BatchPromoteAcl
Default Preset Permission Set
Global User Default ACL
WebPublishingAcl
Work Queue User Default ACL
dce_world_write
desktop_client_acl1
replica_acl_default
(11 rows affected)
 
col name 20
col root 30
col file_system_path 70
select f.name, f.root, l.file_system_path from dm_filestore f, dm_location l where f.root = l.object_name
name root file_system_path
-------------------- ------------------------------ ----------------------------------------------------------------------
filestore_01 storage_01 /home/dmadmin/documentum/data/dmtest/content_storage_01
thumbnail_store_01 thumbnail_storage_01 /home/dmadmin/documentum/data/dmtest/thumbnail_storage_01
streaming_store_01 streaming_storage_01 /home/dmadmin/documentum/data/dmtest/streaming_storage_01
replicate_temp_store replicate_location /home/dmadmin/documentum/data/dmtest/replicate_temp_store
replica_filestore_01 replica_storage_01 /home/dmadmin/documentum/data/dmtest/replica_content_storage_01
(5 rows affected)

No numbers here, just plain text information.

There is of course much more information to query (e.g. what about lifecycles and workflows, users and groups?) depending on what one needs to look at in the repositories, but those are enough examples for our demonstration’s purpose. Readers can always add queries or refine the existing ones to suit their needs.
At this point, we have a lot of hard-to-digest numbers. Plotting them gives a pleasant two-dimensional view of the data and makes them easy to compare and analyze, especially if the generated plot offers some interactivity.

The Graphing Library

To do this, we need a graphing library, and since we want to look at the graphs from within a browser, for platform independence and zero installation on the desktop, we need one that lets us generate an HTML page containing the charts. This is possible in two ways:
o   the library is a javascript one; we programmatically build up json literals, pass them to the library’s plotting function and wrap everything into an html page which is saved on disk for later viewing;
o   the library is usable from several scripting languages and includes a function to generate an HTML page containing the graph; we would pick a python one since this language is ubiquitous and we now have a binding for Documentum (see my blog here);
Now, there are tons of such javascript libraries available on-line (see for example here for a sample), but our choice criteria will be simple:
o   the library must be rich enough but at the same time easy to use;
o   it must do its work entirely locally, as servers are generally not allowed to access the Internet;
o   it must be free, so as to avoid licensing complexity, which would be overkill for such a simple and confined usage;
As we must pick one, let’s choose Plotly since it is very capable and largely covers our needs. Besides, it works either as a javascript or as a python library, but for more generality let’s use it as a javascript library and generate the html pages ourselves. As an additional bonus, Plotly allows one to interactively edit a plotted graph so it can be tweaked at will. This is very convenient because it lessens the effort needed to optimize the graphs’ readability, as this can be done later by the users themselves while viewing the graphs; for example, they can zoom, unselect variables, move legends around and much more.
To install it, simply right-click here and save the linked plotly-latest.min.js file to the default graph location in the project’s directory, e.g. /home/dmadmin/graph-stats/graphs.

pwd
/home/dmadmin/graph-stats/graphs
ls -l
...
-rw-rw-r-- 1 dmadmin dmadmin 2814564 Oct 11 11:40 plotly-latest.min.js
...

That’s less than 2.7 MiB of powerful, compact js source code. Since it is a plain text file executed inside the browser’s sandbox, copying it onto a server shouldn’t raise any security concern.
The library needs to be in the same directory as the generated html files. If those files are copied to a desktop for direct viewing from a browser with File/Open, don’t forget to copy over the Plotly.js library too; then open the html file in an editor, go to line 6 below and set the src attribute to the path of the copied library:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
   <head>
      <title>Graphical Overview of Repository SubWay</title>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <script src="...../plotly-latest.min.js"></script>
   </head>
Viewing the html pages

Since those pages must be accessible from any machine on the network (typically, a desktop with a browser), even though they may be created directly on the server that hosts the repositories of interest, we need a web server. It may happen that a disk volume is shared between desktops and servers, but this is far from being the general case. Fortunately, python is very helpful here and saves us from the tedious task of setting up a full-fledged web server such as Apache. The one-liner below starts a server that exposes the files in the current directory and its sub-directories:

python3 -m http.server [port] # for python >= v3
python -m SimpleHTTPServer [port] # for python 2.x

It can also be sent to the background to free the command line; to stop it, bring it back to the foreground and type ctrl-C. Its default port is 8000.
This mini web server now waits for requests on the given port and the server’s IP address. To determine that IP address, use the ifconfig command and pick an address that is on the same network as the desktop running the browser. Then load the URL, e.g. http://192.168.56.10:8000/, and you’ll be greeted by a familiar point-and-click interface: a directory listing from which to select the html files containing the generated graphs.
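On Python 3.7 or newer, the same thing can be done programmatically, which also allows pinning the served directory regardless of where the server is started from; below is a minimal sketch (the directory path, the 0.0.0.0 bind address and the port are assumptions to adapt to your own layout):

#!/usr/bin/env python3
# minimal sketch: serve the graphs directory over HTTP so any browser on the network can fetch the pages;
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler

# the directory= parameter of SimpleHTTPRequestHandler is available since python 3.7;
handler = partial(SimpleHTTPRequestHandler, directory = "/home/dmadmin/graph-stats/graphs")
# bind to all interfaces so the desktop's browser can reach the pages on port 8000;
HTTPServer(("0.0.0.0", 8000), handler).serve_forever()

Stopping it is a simple ctrl-C, just like with the one-liners above.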

An Example

Here is a complete example, from the query to the graph.
The 4th DQL query and its output:

select datefloor(month, r_creation_date) as "creation_month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by datefloor(month, r_creation_date) order by 1;
creation_month count(*) tot_size
------------------------- ---------------------- ----------------------
11/1/2017 01:00:00 703 54142079
12/1/2017 01:00:00 120 323552
3/1/2018 01:00:00 38 66288
4/1/2018 02:00:00 469 1427865
5/1/2018 02:00:00 86 584453
6/1/2018 02:00:00 20 150464
7/1/2018 02:00:00 40 301341
8/1/2018 02:00:00 32 151333
9/1/2018 02:00:00 42 323772
(9 rows affected)

The command to generate the graph:

pwd
/home/dmadmin/graph-stats
./graph-stats.py --docbase dmtest

A simplified generated html page showing only the count(*) variable:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
      <head>
         <title>Statistical Graphs for Repository dmtest</title>
         <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
         <script src="http://192.168.56.10:8000/plotly-latest.min.js"></script>
      </head>
      <body>
      <div id="TheChart" style="width:400px;height:400px;"></div>
      <script>
         var ctx = document.getElementById("TheChart");
         Plotly.newPlot(ctx, [{
                             type: "bar",
                             name: "new docs",
                             x: ['2017/11', '2017/12', '2018/03', '2018/04', '2018/05', '2018/06', '2018/07', '2018/08', '2018/09'],
                             y: ['703', '120', '38', '469', '86', '20', '40', '32', '42'],
                             marker: {
                                        color: '#4e6bed',
                                        line: {
                                                 width: 2.5
                                              }
                                     }
                            }],
                            {
                               title: "New Documents",
                               font: {size: 18}
                            },
                        {responsive: true});
      </script>
      </body>
</html>

Its resulting file:

ll graphs
total 2756
-rw-rw-r-- 1 dmadmin dmadmin 2814564 Oct 12 14:18 plotly-latest.min.js
-rw-rw-r-- 1 dmadmin dmadmin 1713 Oct 12 14:18 dmtest-20181012-141811.html

Same but exposed by the mini-web server and viewed from a browser:
Screenshot from 2018-11-03 15-02-56
When clicking on one of the html file, a graph such as the one below is displayed:
Screenshot from 2018-11-03 15-08-52

The python script

The following python script runs the 13 queries above and produces one large HTML page containing the graphs of all the datasets. It can also be used to query a list of local docbases and generate either one single, huge html report or one report per docbase. It can even output the produced html code directly to stdout for further in-line processing, e.g. to somewhat compact the javascript code:

./graph-stats.py --docbase dmtest -o stdout | gawk '{printf $0}' > graphs/dmtest-gibberish.html
 
wc -l graphs/dmtest-gibberish.html
0 graphs/dmtest-gibberish.html
 
less graphs/dmtest-gibberish.html
Screenshot from 2018-11-03 15-26-11

So, here is the script:

#!/usr/bin/env python

# 10/2018, C. Cervini, dbi-services;
 
import sys
import getopt
from datetime import datetime
import json
import DctmAPI

def Usage():
   print("""
Usage:
Connects as dmadmin/xxxx to a local repository and generates an HTML page containing the plots of several DQL queries result;
Usage:
       ./graph-stats.py -h|--help | -d|--docbase <docbase>{,<docbase>} \[-o|--output_file <output_file>\]
<docbase> can be one repository or a comma-separated list of repositories;
if <output_file> is omitted, the html page is output to ./graphs/<docbase>-$(date +"%Y%m%d-%H%M%S").html;
<output_file> can be "stdout", which is useful for CGI programming;
Example:
       ./graph-stats.py -d dmtest,mail_archive,doc_engineering
will query the docbases dmtest, mail_archive and doc_engineering and output the graphs to the files <docbase>-$(date +"%Y%m%d-%H%M%S").html, one file per docbase;
       ./graph-stats.py -d mail_archive,doc_engineering -o all_docbase_current_status
will query the docbases mail_archive and doc_engineering and output all the graphs to the unique file all_docbase_current_status;
       ./graph-stats.py -d mail-archive
will query docbase mail-archive and output the graphs to the file mail-archive-$(date +"%Y%m%d-%H%M%S").html;
       ./graph-stats.py --docbase dmtest --output dmtest.html
will query docbase dmtest and output the graphs to the file dmtest.html;
       ./graph-stats.py -d dmtest --output stdout
will query docbase dmtest and output the graphs to stdout;
""")

def Plot2HTML(div, graph_title, data, data_labels, bLineGraph = False, mode = None):
   global html_output
   if None != html_output:
      sys.stdout = open(html_output, "a")

   # start the html page;
   if "b" == mode:
      global server, page_title
      print('''
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
   <head>
      <title>''' + page_title + '''</title>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <script src="http://''' + server + '''/plotly-latest.min.js"></script>
   </head>
   <body>
   <center><h3>''' + page_title + '''</h3></center>
''')

   # append to the body of the html page;
   if len(data_labels) == 4:
      stack_labels = {}
      for point in data:
         stack_labels[point[data_labels[0]]] = 0
      print('''
      <div style="width:1500px;height:600px;">
          <div id="''' + div + '-' + data_labels[2] + '''" style="width:50%; float:left;"></div>
          <div id="''' + div + '-' + data_labels[3] + '''" style="width:50%; float:left;"></div>
      </div>
      <script>
         var ctx = document.getElementById("''' + div + '-' + data_labels[2] + '''");
      ''')
      variables = ""
      for stack in stack_labels:
         x = [point[data_labels[1]] for i, point in enumerate(data) if point[data_labels[0]] == stack]
         y1 = [point[data_labels[2]] for i, point in enumerate(data) if point[data_labels[0]] == stack]
         variables +=  ("" if not variables else ", ") + "data_" + stack 
         if bLineGraph:
            vars()['data_' + stack] = {
                                        'name': stack,
                                        'type': "scatter",
                                        'mode': "lines",
                                        'x': x,
                                        'y': y1,
                                      };
         else:
            vars()['data_' + stack] = {
                                        'name': stack,
                                        'type': "bar",
                                        'x': x,
                                        'y': y1,
                                        'width': 0.25,
                                        'marker': {
                                                     #'color': '#009933',
                                                     'line': {
                                                                'width': 1.0
                                                             }
                                                  }
                                      };
         print('data_' + stack + ' = ' + json.dumps(vars()['data_' + stack]) + ';')
         layout = {
                     'title': '<b>' + graph_title + '<br>' + data_labels[2] + '</b>',
                     'legend': {'x': -.1, 'y': 1.2, 'font': {'size': 8}},
                     'font': {'size': 12},
                     'width': 750,
                     'height': 600,
                     'xaxis': {
                                 'title': '<b>' + data_labels[1] + '</b>',
                                 'titlefont': {'size': 10},
                                 'tickangle': -45,
                                 'tickfont': {'size': 8},
                                 'zeroline': True,
                                 'showline': True,
                                 'categoryorder': "category ascending",
                                 'type': "category"
                              },
                     'yaxis': {
                                 'title': '<b>' + data_labels[2] + '</b>',
                                 'titlefont': {'size': 10},
                                 'zeroline': True,
                                 'showline': True,
                                 'tickfont': {'size': 8},
                                 'showgrid': False
                              }
                  }
         if not bLineGraph:
            layout.update({'barmode': "stack", 'bargap': 0.15, 'bargroupgap': 0.5})
         interaction = {'responsive': True, 'scrollZoom': True, 'editable': True}
      print('''Plotly.newPlot(ctx,
                              [''' + variables + '], ' +
                              json.dumps(layout) + ',' +
                              json.dumps(interaction) + ''');''')
      for stack in stack_labels:
         x = [point[data_labels[1]] for i, point in enumerate(data) if point[data_labels[0]] == stack]
         y2 = [point[data_labels[3]] for i, point in enumerate(data) if point[data_labels[0]] == stack]
         if bLineGraph:
            vars()['data_' + stack] = {
                                         'name': stack,
                                         'type': "scatter",
                                         'mode': "lines",
                                         'x': x,
                                         'y': y2
                                      };
         else:
            vars()['data_' + stack] = {
                                         'name': stack,
                                         'type': "bar",
                                         'x': x,
                                         'y': y2,
                                         'width': 0.25,
                                         'marker': {
                                                       #'color': '#009933',
                                                      'line': {
                                                                 'width': 1.0
                                                              }
                                                   }
                                      };
         print('data_' + stack + ' = ' + json.dumps(vars()['data_' + stack]) + ';')
      layout = {
                  'title': '<b>' + graph_title + '<br>' + data_labels[3] + '</b>',
                  'legend': {'x': -.1, 'y': 1.2, 'font': {'size': 8}},
                  'font': {
                             'size': 12
                          },
                  'width': 750,
                  'height': 600,
                  'xaxis': {
                            'title': '<b>' + data_labels[1] + '</b>',
                            'titlefont': {'size': 10},
                            'tickangle': -45,
                            'tickfont': {'size': 8},
                            'zeroline': True,
                            'showline': True,
                            'categoryorder': "category ascending",
                            'type': "category"
                         },
                  'yaxis': {
                    'title': '<b>' + data_labels[3] + '</b>',
                    'titlefont': {'size': 10},
                    'zeroline': True,
                    'showline': True,
                    'tickfont': {'size': 8},
                    'showgrid': False
                  }
               }
      if not bLineGraph:
         layout.update({'barmode': "stack", 'bargap': 0.15, 'bargroupgap': 0.5})
      interaction = {'responsive': True, 'scrollZoom': True, 'editable': True}
      print('''
      var ctx = document.getElementById("''' + div + '-' + data_labels[3] + '''");
      Plotly.newPlot(ctx,
                     [''' + variables + '],' + 
                     json.dumps(layout) + ''',
                     ''' + json.dumps(interaction) + ''');
      </script>
''')
   elif len(data_labels) == 3:
      print('''
      <div style="width:1200px;height:600px;">
          <div id="''' + div + '''" style="width:75%; float:left;"></div>
          <div id="''' + div + '''-pie" style="width:25%; float:left;"></div>
      </div>
      <script>
         var ctx = document.getElementById("''' + div + '''");
''')
      traces = []
      if not bLineGraph:
         traces = [
                     {
                        'name': data_labels[1],
                        'type': "bar",
                        'x': [point[data_labels[0]] for i, point in enumerate(data)],
                        'y': [point[data_labels[1]] for i, point in enumerate(data)],
                        'width': 0.25,
                        'marker': {
                                     'color': '#009933',
                                     'line': {
                                              'width': 1.0
                                             }
                                  },
                     },
                     # work around for bug "Grouped bar charts do not work with multiple Y axes";
                     # see https://github.com/plotly/plotly.js/issues/78;
                     # must be inserted here;
                     # invisible second trace in the first group
                     {
                        'x': [point[data_labels[0]] for i, point in enumerate(data)],
                        'barmode': "overlay",
                        'y': [0], 'type': 'bar', 'hoverinfo': 'none', 'showlegend': False
                     },
                     # invisible first trace in the second group
                     {
                        'x': [point[data_labels[0]] for i, point in enumerate(data)],
                        'y': [0], 'type': 'bar', 'yaxis': 'y2', 'hoverinfo': 'none', 'showlegend': False
                     },
                     {
                        'name': data_labels[2],
                        'type': "bar",
                        'x': [point[data_labels[0]] for i, point in enumerate(data)],
                        'y': [point[data_labels[2]] for i, point in enumerate(data)],
                        'width': 0.25,
                        'yaxis': "y2",
                        'marker': {
                                     'color': '#4e6bed',
                                     'line': {
                                                'width': 1.0
                                             }
                                  }
                     }
                   ]
      else:
         traces = [
                     {
                        'name': data_labels[1],
                        'type': "scatter",
                        'mode': "lines",
                        'x': [point[data_labels[0]] for i, point in enumerate(data)],
                        'y': [point[data_labels[1]] for i, point in enumerate(data)],
                        'width': 0.25,
                     },
                     {
                        'name': data_labels[2],
                        'type': "scatter",
                        'mode': "lines",
                        'x': [point[data_labels[0]] for i, point in enumerate(data)],
                        'y': [point[data_labels[2]] for i, point in enumerate(data)],
                        'width': 0.25,
                        'yaxis': "y2",
                     }
                  ]
      layout = {
                  'title': '<b>' + graph_title + '</b>',
                  'legend': {'x': -.1, 'y': 1.2, 'font': {'size': 8}},
                  'font': {'size': 12},
                  'width': 800,
                  'height': 600,
                  'xaxis': {
                              'title': '<b>' + data_labels[0] + '</b>',
                              'titlefont': {'size': 10},
                              'tickangle': -45,
                              'tickfont': {'size': 8},
                              'zeroline': True,
                              'showline': True
                           },
                  'yaxis': {
                              'title': '<b>' + data_labels[1] + '</b>',
                              'titlefont': {'size': 10},
                              'zeroline': True,
                              'showline': True,
                              'tickfont': {'size': 8},
                              'showgrid': False,
                           },
                  'yaxis2': {
                               'title': '<b>' + data_labels[2] + '</b>',
                               'titlefont': {'size': 10},
                               'tickfont': {'size': 8},
                               'zeroline': True,
                               'showline': True,
                               'overlaying': "y",
                               'side': "right",
                               'showgrid': False,
                            }
               }
      if bLineGraph:
         layout['yaxis'].update({'type': "linear"})
         layout['yaxis2'].update({'type': "linear"})
         pass
      else:
         layout.update({'barmode': "group", 'bargap': 0.15, 'bargroupgap': 0.5})
      interaction = {'responsive': True, 'scrollZoom': True, 'editable': True}
      print('''
         Plotly.newPlot(ctx,
                        ''' + json.dumps(traces) + ''',
                        ''' + json.dumps(layout) + ''',
                        ''' + json.dumps(interaction) + ''');
      </script>

''')
      if not bLineGraph:
         traces =  [
                      {
                         'name': data_labels[1],
                         'hole': .4,
                         'type': "pie",
                         'labels': [point[data_labels[0]] for i, point in enumerate(data)],
                         'values': [point[data_labels[1]] for i, point in enumerate(data)],
                         'domain': {
                            'row': 0,
                            'column': 0
                         },
                         'outsidetextfont': {'size': 8},
                         'insidetextfont': {'size': 8},
                         'legend': {'font': {'size': 8}},
                         'textfont': {'size': 8},
                         'font': {'size': 8},
                         'hoverinfo': 'label+percent+name',
                         'hoverlabel': {'font': {'size': 8}},
                         'textinfo': 'none'
                      },
                      {
                         'name': data_labels[2],
                         'hole': .4,
                         'type': "pie",
                         'labels': [point[data_labels[0]] for i, point in enumerate(data)],
                         'values': [point[data_labels[2]] for i, point in enumerate(data)],
                         'domain': {
                            'row': 1,
                            'column': 0
                         },
                         'outsidetextfont': {'size': 8},
                         'insidetextfont': {'size': 8},
                         'legend': {'font': {'size': 8}},
                         'textfont': {'size': 8},
                         'font': {'size': 8},
                         'hoverinfo': 'label+percent+name',
                         'hoverlabel': {'font': {'size': 8}},
                         'textinfo': 'none'
                      },
                   ]
         layout = {
                     'title': '<b>' + graph_title + '</b>',
                     'annotations': [
                                       {
                                          'font': {
                                                    'size': 8
                                                  },
                                          'showarrow': False,
                                          'text': '<b>' + data_labels[1] + '</b>',
                                          'x': 0.5,
                                          'y': 0.8
                                       },
                                       {
                                          'font': {
                                                     'size': 8
                                                  },
                                          'showarrow': False,
                                          'text': '<b>' + data_labels[2] + '</b>',
                                          'x': 0.5,
                                          'y': 0.25
                                       }
                                    ],
                     'height': 600,
                     'width': 600,
                     'grid': {'rows': 2, 'columns': 1},
                     'legend': {'font': {'size': 10}}
                  }

         print('''
      <script>
         var ctx = document.getElementById("''' + div + '''-pie");
         Plotly.newPlot(ctx,
                        ''' + json.dumps(traces) + ''',
                        ''' + json.dumps(layout) + '''
);
      </script>
''')
   elif len(data_labels) == 2:
      trace = [
                 {
                    'name': data_labels[1],
                    'type': "bar",
                    'x': [point[data_labels[0]] for i, point in enumerate(data)],
                    'y': [point[data_labels[1]] for i, point in enumerate(data)],
                    'width': 0.25,
                    'marker': {
                                 'color': '#009933',
                                 'line': {
                                            'width': 1.0
                                         }
                              }
                 }
              ]
      layout = {
                  'title': '<b>' + graph_title + '</b>',
                  'legend': {'x': -.1, 'y': 1.2, 'font': {'size': 8}},
                  'font': {'size': 12},
                  'height': 600,
                  'width': 800,
                  'xaxis': {
                            'title': '<b>' + data_labels[0] + '</b>',
                            'titlefont': {'size': 10},
                            'tickangle': -45,
                            'tickfont': {'size': 8},
                            'zeroline': True,
                          },
                  'yaxis': {
                            'title': '<b>' + data_labels[1] + '</b>',
                            'titlefont': {'size': 10},
                            'zeroline': True,
                            'showline': True,
                            'tickfont': {'size': 8},
                            'showgrid': False
                          },
                  'barmode': "group",
                  'bargap': 0.15,
                  'bargroupgap': 0.5
               }
      interaction = {'responsive': True, 'scrollZoom': True, 'editable': True};
      print('''        
      <div style="width:1200px;height:600px;">
          <div id="''' + div + '''" style="width:75%; float:left;"></div> 
          <div id="''' + div + '''-pie" style="width:25%; float:left;"></div>
      </div>
      <script>
         var ctx = document.getElementById("''' + div + '''");
         Plotly.newPlot(ctx,
                        ''' + json.dumps(trace) + ''',
                        ''' + json.dumps(layout) + ''',
                        ''' + json.dumps(interaction) + ''');
      </script>
''')
      trace =  [
                  {
                     'name': data_labels[1],
                     'hole': .4,
                     'type': "pie",
                     'labels': [point[data_labels[0]] for i, point in enumerate(data)],
                     'values': [point[data_labels[1]] for i, point in enumerate(data)],
                     'domain': {
                                  'row': 0,
                                  'column': 0
                               },
                     'outsidetextfont': {'size': 8},
                     'insidetextfont': {'size': 8},
                     'legend': {'font': {'size': 8}},
                     'textfont': {'size': 8},
                     'font': {'size': 8},
                     'hoverinfo': 'label+percent+name',
                     'hoverlabel': {'font': {'size': 8}},
                     'textinfo': 'none'
                  },
               ]
      layout = {
                  'title': '<b>' + graph_title + '</b>',
                  'annotations': [
                                    {
                                       'font': {
                                                  'size': 8
                                               },
                                       'showarrow': False,
                                       'text': '<b>' + data_labels[1] + '</b>',
                                       'x': 0.5,
                                       'y': 0.5
                                    },
                                 ],
                  'height': 600,
                  'width': 600,
                  'grid': {'rows': 1, 'columns': 1},
                  'legend': {'font': {'size': 10}}
               }
      print('''        
      <script>
         var ctx = document.getElementById("''' + div + '''-pie");
         Plotly.newPlot(ctx,
                        ''' + json.dumps(trace) + ''',
                        ''' + json.dumps(layout) + ''');
      </script>
''')
   else:
      print("illegal data_label value: " + repr(data_label))

   # closes the html page;
   if "e" == mode:
      print('''
   </body>
</html>
''')

   # restores default output stream;
   if None != html_output:
      sys.stdout = sys.__stdout__

def cumulAndExtend2():
   """
   for 2 variables in resultset;
   """
   global rawdata
   sum_count = 0
   sum_size = 0
   new_month = str(datetime.now().year) + "/" + str(datetime.now().month)
   for ind, point in enumerate(rawdata):
      sum_count += int(point['count(*)'])
      point['count(*)'] = sum_count
      sum_size += int(point['tot_size'])
      point['tot_size'] = sum_size
   if rawdata[ind]['month'] < new_month:
      new_point = dict(rawdata[ind])
      new_point['month'] = new_month
      rawdata.append(new_point)
   DctmAPI.show(rawdata)

def cumulAndExtend3(key):
   """
   for 3 variables in resultset;
   """
   global rawdata
   prec_doc = ""
   sum_count = 0
   sum_size = 0
   new_month = str(datetime.now().year) + "/" + str(datetime.now().month)
   for ind, point in enumerate(rawdata):
      if "" == prec_doc:
         prec_doc = point[key]
      if point[key] != prec_doc:
         # duplicate the last point so the line graph shows a flat line and not a single dot when there is a unique point for the document type;
         if rawdata[ind - 1]['month'] < new_month:
            new_point = dict(rawdata[ind - 1])
            new_point['month'] = new_month
            rawdata.insert(ind, new_point)
            prec_doc = point[key]
            sum_count = 0
            sum_size = 0
      else:
         sum_count += int(point['count(*)'])
         point['count(*)'] = sum_count
         sum_size += int(point['tot_size'])
         point['tot_size'] = sum_size
   if rawdata[ind]['month'] < new_month:
      new_point = dict(rawdata[ind])
      new_point['month'] = new_month
      rawdata.append(new_point)
   DctmAPI.show(rawdata)

# -----------------
# main;
if __name__ == "__main__":
   DctmAPI.logLevel = 0
 
   # parse the command-line parameters;
   # old-style for I don't need more flexibility here;
   repository = None
   output_file = None
   try:
      (opts, args) = getopt.getopt(sys.argv[1:], "hd:o:", ["help", "docbase=", "output="])
   except getopt.GetoptError:
      print("Illegal option")
      print("./graph-stats.py -h|--help | -d|--docbase <docbase>{,<docbase>} [-o|--output_file <output_file>]")
      sys.exit(1)
   for opt, arg in opts:
      if opt in ("-h", "--help"):
         Usage()
         sys.exit()
      elif opt in ("-d", "--docbase"):
         repository = arg
      elif opt in ("-o", "--output"):
         output_file = arg
   if None == repository:
      print("at least one repository must be specified")
      Usage()
      sys.exit()
   DctmAPI.show("Will connect to docbase(s): " + repository + " and output to " + ("stdout" if "stdout" == output_file else "one single file " + output_file if output_file is not None else "one file per docbase"))
 
   # needed to locally import the js library Plotly;
   server = "192.168.56.10:8000"

   docbase_done = set()
   status = DctmAPI.dmInit()
   for pointer, docbase in enumerate(repository.split(",")):
      # graphe_1;
      if docbase in docbase_done:
         print("Warning: docbase {:s} was already processed and won't be again, skipping ...".format(docbase))
         continue
      docbase_done.add(docbase)
      session = DctmAPI.connect(docbase = docbase, user_name = "dmadmin", password = "dmadmin")
      if session is None:
         print("no session opened, exiting ...")
         exit(1)

      page_title = "Graphical Overview of Repository " + docbase
      if None == output_file:
         html_output = "./graphs/" + docbase + "-" + datetime.today().strftime("%Y%m%d-%H%M%S") + ".html"
      elif "stdout" == output_file:
         html_output = None
      else:
         html_output = output_file

      graph_name = "graphe_1"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select r_object_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by r_object_type order by 1"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q1. Present Count & Size Per Document Type",
                data = rawdata,
                data_labels = attr_name,
                mode = ("b" if (("stdout" == output_file or output_file is not None) and 0 == pointer) or output_file is None else None))

      # graphe_2;
      graph_name = "graphe_2"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select r_object_type, datetostring(r_creation_date, 'yyyy/mm') as "month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by r_object_type, datetostring(r_creation_date, 'yyyy/mm') order by 1, 2"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q2. Monthly New Documents per Type",
                data = rawdata,
                data_labels = attr_name)
      cumulAndExtend3('r_object_type')
      Plot2HTML(div = graph_name + "l",
                graph_title = "Q2l. Cumulated Monthly Documents per Type",
                data = rawdata,
                data_labels = attr_name,
                bLineGraph = True)
   
      # graphe_3;
      graph_name = "graphe_3"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select r_object_type, datetostring(r_modify_date, 'yyyy/mm') as "month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) where r_creation_date < r_modify_date group by r_object_type, datetostring(r_modify_date, 'yyyy/mm') order by 1, 2"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q3. Monthly Modified Documents per Type",
                data = rawdata,
                data_labels = attr_name)
      cumulAndExtend3('r_object_type')
      Plot2HTML(div = graph_name + "l",
                graph_title = "Q3l. Cumulated Monthly Modified Documents per Type",
                data = rawdata,
                data_labels = attr_name,
                bLineGraph = True)

      # graphe_4;
      graph_name = "graphe_4"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select datetostring(r_creation_date, 'yyyy/mm') as "month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by datetostring(r_creation_date, 'yyyy/mm') order by 1"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q4. Monthly New Documents",
                data = rawdata,
                data_labels = attr_name )
      cumulAndExtend2()
      Plot2HTML(div = graph_name + "l",
                graph_title = "Q4l. Cumulated Monthly Documents",
                data = rawdata,
                data_labels = attr_name,
                bLineGraph = True)

      # graphe_5;
      graph_name = "graphe_5"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select datetostring(r_modify_date, 'yyyy/mm') as "month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) where r_creation_date != r_modify_date group by datetostring(r_modify_date, 'yyyy/mm') order by 1;
        """
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q5. Monthly Modified Documents",
                data = rawdata,
                data_labels = attr_name)
      cumulAndExtend2()
      Plot2HTML(div = graph_name + "l",
                graph_title = "Q5l. Cumulated Monthly Modified Documents",
                data = rawdata,
                data_labels = attr_name,
                bLineGraph = True,
                mode = 'e')

      # graphe_6;
      graph_name = "graphe_6"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select a_content_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by a_content_type order by 1"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q6. Count & Size Per Content Format",
                data = rawdata,
                data_labels = attr_name)

      # graphe_7;
      graph_name = "graphe_7"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select a_content_type, datetostring(r_creation_date, 'yyyy-mm') as "month", count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by a_content_type, datetostring(r_creation_date, 'yyyy-mm') having count(*) > 10 order by 1, 2"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q7. Monthly Created Documents Per Content Format with count(*) > 10",
                data = rawdata,
                data_labels = attr_name)
      cumulAndExtend3('a_content_type')
      DctmAPI.show(rawdata)
      Plot2HTML(div = graph_name + "l",
                graph_title = "Q7l. Cumulated Monthly Created Documents Per Content Format with count(*) > 10",
                data = rawdata,
                data_labels = attr_name,
                bLineGraph = True)

      # graphe_8;
      graph_name = "graphe_8"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select a_storage_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by a_storage_type order by a_storage_type"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q8. Count & Size Per Filestore",
                data = rawdata,
                data_labels = attr_name)

      # graphe_9;
      graph_name = "graphe_9"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select r_object_type, a_storage_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by r_object_type, a_storage_type order by r_object_type, a_storage_type"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q9. Count & Size Per Document Type & Filestore",
                data = rawdata,
                data_labels = attr_name)

      # graphe_10;
      graph_name = "graphe_10"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select a_content_type, a_storage_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by a_content_type, a_storage_type having count(*) > 10 order by a_content_type, a_storage_type"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q10. Count & Size Per Content Format & Filestore with count(*) > 10",
                data = rawdata,
                data_labels = attr_name)

      # graphe_11;
      graph_name = "graphe_11"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select owner_name, count(*) from dm_acl group by owner_name order by 2 desc"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q11. ACLs per owner",
                data = rawdata,
                data_labels = attr_name)

      # graphe_12;
      graph_name = "graphe_12"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select acl_name, count(*) from dm_document(all) where acl_name not like 'dm_%' group by acl_name having count(*) >= 10 order by 1"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q12. External ACLs in Use by >= 10 documents",
                data = rawdata,
                data_labels = attr_name)

      # graphe_13;
      graph_name = "graphe_13"
      DctmAPI.show(graph_name, beg_sep = True)
      stmt = """select name, count(*) from dmi_queue_item group by name order by 2 desc"""
      rawdata = []
      attr_name = []
      status = DctmAPI.select2dict(session, stmt, rawdata, attr_name)
      DctmAPI.show(rawdata)
      if not status:
         print("select [" + stmt + "] was not successful")
         exit(1)
      Plot2HTML(div = graph_name,
                graph_title = "Q13. Queue Items by Name",
                data = rawdata,
                data_labels = attr_name,
                mode = ('e' if (("stdout" == output_file or output_file is not None) and len(repository.split(",")) - 1 == pointer) or output_file is None else None))

   status = DctmAPI.disconnect(session)
   if not status:
      print("error while  disconnecting")

A pdf of the script is available here too (sorry, uploading python scripts is not allowed on this site): graph-stats.
Here is DctmAPI.py as a pdf too: DctmAPI.
The script invokes select2dict() from the module DctmAPI (as said before, it is also presented and accessible here) with each query and receives the result back as an array of dictionaries, i.e. each entry of the array is one row of the query’s resultset. It then goes on creating the HTML page by invoking the function Plot2HTML(). On line 47, the js library is imported. Each graph is plotted in its own DIV section by Plotly’s js function newPlot(). All the required parameters are set up in python dictionaries and passed as json literals to newPlot(); the conversion from python to javascript is conveniently done by the function dumps() from the json module. The HTML and js parts are produced by multi-line python print() statements.
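As a rough illustration of that mechanism, and outside of the script itself, the round trip from a select2dict()-style resultset to the javascript literal could be sketched as follows (the sample rows mirror query 4’s output above; only json.dumps() from the standard library is involved):

import json

# what DctmAPI.select2dict() returns: one dictionary per row of the resultset;
rawdata = [{"month": "2017/11", "count(*)": "703", "tot_size": "54142079"},
           {"month": "2017/12", "count(*)": "120", "tot_size": "323552"}]

# build the Plotly trace as a plain python dictionary ...
trace = {"type": "bar",
         "name": "new docs",
         "x": [row["month"] for row in rawdata],
         "y": [row["count(*)"] for row in rawdata]}

# ... and let json.dumps() emit the javascript literal to be printed inside the page's <script> tag;
# python's True/False/None are converted to true/false/null along the way;
print("Plotly.newPlot(ctx, [" + json.dumps(trace) + "], " + json.dumps({"title": "New Documents"}) + ");")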

Invoking the script

The script is invoked from the command-line through the following syntax:

./graph-stats.py -h|--help | -d|--docbase <docbase>{,<docbase>} [-o|--output_file <output_file>]

A Help option is available, which outputs the text below:

./graph-stats.py --help
Usage:
Connects as dmadmin/xxxx to a repository and generates an HTML page containing the plots of several DQL queries result;
Usage:
./graph-stats.py -h|--help | -d|--docbase <docbase>{,<docbase>} [-o|--output_file <output_file>]
<docbase> can be one repository or a comma-separated list of repositories;
if <output_file> is omitted, the html page is output to ./graphs/<docbase>-$(date +"%Y%m%d-%H%M%S").html;
<output_file> can be "stdout", which is useful for CGI programming;
Example:
./graph-stats.py -d dmtest,mail_archive,doc_engineering
will query the docbases dmtest, mail_archive and doc_engineering and output the graphs to the files <docbase>-$(date +"%Y%m%d-%H%M%S").html, one file per docbase;
./graph-stats.py -d mail_archive,doc_engineering -o all_docbase_current_status
will query the docbases mail_archive and doc_engineering and output all the graphs to the unique file all_docbase_current_status;
./graph-stats.py -d mail-archive
will query docbase mail-archive and output the graphs to the file mail-archive-$(date +"%Y%m%d-%H%M%S").html;
./graph-stats.py --docbase dmtest --output dmtest.html
will query docbase dmtest and output the graphs to the file dmtest.html;
./graph-stats.py -d dmtest --output stdout
will query docbase dmtest and output the graphs to stdout;

As the account used is the trusted, password-less dmadmin one, the script must run locally on the repository server machine, but it is relatively easy to adapt it to remote docbases, with a password prompt or without one if using public key authentication (and still password-less inside the remote docbases if connecting as dmadmin).
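As an illustration of the remote variant, and assuming password-less ssh as dmadmin to a hypothetical host named content-server, something along these lines could wrap the existing script (a sketch only, not part of graph-stats.py):

#!/usr/bin/env python3
# sketch: run graph-stats.py remotely on the content server and save its stdout output locally;
# the host name, remote path and docbase below are assumptions;
# capture_output and text require python 3.7 or newer;
import subprocess

remote_cmd = "cd /home/dmadmin/graph-stats && ./graph-stats.py --docbase dmtest --output stdout"
result = subprocess.run(["ssh", "dmadmin@content-server", remote_cmd],
                        capture_output = True, text = True, check = True)
with open("graphs/dmtest-remote.html", "w") as f:
   f.write(result.stdout)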

Examples of charts

Now that we’ve seen the bits and pieces, let’s generate an HTML page with the graphs of all the above queries.
Here is an example where the queries were run against an out-of-the-box repository named dmtest (all the graphs in one pdf file): dmtest.pdf
For more complex examples, just run the script against one of your docbases and see the results for yourself, as I can upload neither html files nor compressed files here.
As shown, some queries are charted using 2 to 4 graphs, the objective being to offer the clearest view. E.g. query 1 computes counts and total sizes; they are plotted on the same graph as grouped bars of 2 independent variables and on two distinct pie charts, one for each variable. Query 2 shows the count(*) and the total sizes by document type as 2 stacked bar graphs plus 2 line graphs of the cumulated monthly created documents by document type. The line graphs show the total number of documents created by the end of each month, not the total number of documents existing at the end of each month; to have those numbers (a historical view of query 1), one needs to query the docbase monthly and store the results somewhere. Query 1 returns this status, but only at the time it was run.
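For the record, one cheap way to build such a history is to schedule a monthly snapshot of query 1 and append it, timestamped, to a flat file; here is a sketch reusing the DctmAPI module from the script above (the csv file name and the credentials are assumptions):

#!/usr/bin/env python
# sketch: append a timestamped snapshot of query 1 to a csv file, to be scheduled e.g. monthly via cron;
import csv
from datetime import datetime
import DctmAPI

DctmAPI.dmInit()
session = DctmAPI.connect(docbase = "dmtest", user_name = "dmadmin", password = "dmadmin")
rawdata = []
attr_name = []
DctmAPI.select2dict(session, """select r_object_type, count(*), sum(r_full_content_size) as "tot_size" from dm_document(all) group by r_object_type order by 1""", rawdata, attr_name)
DctmAPI.disconnect(session)

stamp = datetime.today().strftime("%Y-%m")
with open("history-dmtest.csv", "a", newline = "") as f:
   writer = csv.writer(f)
   for row in rawdata:
      writer.writerow([stamp, row["r_object_type"], row["count(*)"], row["tot_size"]])

Plotted over time, that file then gives the historical view of query 1 mentioned above.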
Discussing Plotly is out of the scope of this article, but suffice it to say that it allows one to work on the chart interactively, e.g. removing variables (aka traces in Plotly’s parlance) so that the remaining ones are re-scaled for better readability, zooming into parts of the charts, and scrolling the graphs horizontally or vertically. A nice tooltip on hover is also available to show the exact values of a point, e.g.:
Screenshot from 2018-11-03 19-07-28
Plotly even lets one completely edit the chart from within a js application hosted at Plotly, which may not suit everybody if confidentiality is required; moreover, the graph can then only be saved in a public area on their site, and subsequently downloaded from there, unless a yearly paying plan is subscribed to. Nonetheless, the free features are more than enough for most needs.

Leveraging Plot2HTML()

The script’s Plot2HTML() function can be leveraged to plot data from any source, even from a text file, so that the query is separated from the graphical representation of its result, which is useful if direct access to the system to query is not possible. To do this, an easy-to-parse file format is necessary, e.g.:

# this is a commented line;
# line_no   variable1    variable2 .....
0          type of graph                                   # use B for bars, S for stacked bars, L for lines, P for pie or any combination of them (BL, LB, BP, PB, LP, PL, BPL);
                                                           # one graph for each combination will be put on the html page in a nx2 cell grid, i.e. graphs  1 and 2 horizontally and graph 3 below,
                                                           # in the same order as the graph type letters;
1          graph title                                     # can be on several lines separated by html <br> tags for sub-titles;
2          Xaxis_name Yaxis1_name Yaxis2_name              # names of the axes, one X and up to 2 Y;
3          Xaxis_values1 value1 value2                     # first line of (x,y) pairs or (x,y1,y2) triplets if a second axis was specified;
4          Xaxis_values2 value1 value2                     # second line of data;
...                                                        # the other lines;
n          Xaxis_valuesn valuen valuen                     # last line of data;

And that script could be invoked like this:

Usage:
       ./graph-it.py -h|--help | -f|--input_file <filename>{,<filename>} [-o|--output_file <output_file>]

The script itself just needs to parse and load the data from the input file(s) into an array of dictionaries and pass it to the plotting function Plot2HTML(). Of course, if the data is already in json format, the parsing step will be greatly simplified.
Time and mood permitting, I’ll do it in a future article.
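For the impatient, the parsing step could start from something like the sketch below, under the assumptions of the layout above (whitespace-separated columns, '#' comments, the line number as first token):

def load_graph_file(filename):
   """
   sketch: parse the ad hoc format above into (graph_type, title, labels, data),
   where data is an array of dictionaries as expected by Plot2HTML();
   """
   lines = []
   with open(filename) as f:
      for raw in f:
         raw = raw.split("#")[0].strip()        # drop comments and blank lines;
         if raw:
            lines.append(raw.split())
   graph_type = lines[0][1]                     # line 0: the type of graph, e.g. BL;
   title = " ".join(lines[1][1:])               # line 1: the graph title;
   labels = lines[2][1:]                        # line 2: the X axis name and up to 2 Y axis names;
   data = [dict(zip(labels, fields[1:])) for fields in lines[3:]]   # lines 3..n: the data points;
   return graph_type, title, labels, data

The rest is then essentially a call to Plot2HTML() with those values.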

Useful tools

The colors of the many chart options (e.g. on line 87) can be specified through an rgb triplet or an HTML color code. There are many color pickers on the web, e.g. this one. Here, I let Plotly choose the colors as it sees fit, hence the not-so-attractive results.
If you need to paste HTML code into an HTML page without it being interpreted by the browser, e.g. the code generated by graph-stats.py, some transformations are required: all the angle brackets need to be replaced by their HTML entity representations. If that code does not contain confidential information, an on-line service can be used for that, such as this one, but it shouldn’t be too difficult to write a list of sed expressions that do the substitutions (e.g. this gawk one-liner takes care of the characters < and >:

gawk '{gsub(/&/, "\\&amp;"); gsub(/</, "\\&lt;"); gsub(/>/, "\\&gt;"); print}' myfile.html).
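
For instance, a sed equivalent of the gawk one-liner could be (the ampersand is substituted first so that the inserted entities are not escaped a second time):

sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' myfile.html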

Other replacements can be added as the need arises.

Conclusion

All the above confirms yet again that it is possible to create cheap but useful custom tools with almost no installation, except, here, a small javascript library. All the rest is already available on any administrator’s decent system: command-line and python interpreters, and a browser, which is becoming more and more important as an application tier thanks to the wide spread of javascript libraries that execute locally and the ever increasing performance of javascript engines. By the way, chromium is highly recommended here for visualizing those complex charts from real production repositories.
As the source code is provided, it is possible to modify it in order to produce better looking graphs. This is the really arduous and time-consuming part in my humble opinion, as it requires some artistic inclination and a lot of experimentation. If you find a nicer color scheme, please feel free to contribute it in the comments.

The article A Graphical Overview of a Repository first appeared on Blog dbi services.

Some more zheap testing

Sat, 2018-11-03 08:59

Hervé already did some tests with zheap and documented his results yesterday. After some more discussions with Amit, who did the session about zheap at the conference here in Lisbon (you can find the slides here), I thought it might be a good idea to do some more testing of that probably upcoming feature. Let’s go.

If you want to test it on your own, here is a simple script that clones the repository, compiles and installs from source and then starts the PostgreSQL instance:

postgres@pgbox:/home/postgres/ [ZHEAP] cat refresh_zheap.sh 
#!/bin/bash

rm -rf zheap
git clone https://github.com/EnterpriseDB/zheap
cd zheap
PGHOME=/u01/app/postgres/product/zheap/db_1/
SEGSIZE=2
BLOCKSIZE=8
./configure --prefix=${PGHOME} \
            --exec-prefix=${PGHOME} \
            --bindir=${PGHOME}/bin \
            --libdir=${PGHOME}/lib \
            --sysconfdir=${PGHOME}/etc \
            --includedir=${PGHOME}/include \
            --datarootdir=${PGHOME}/share \
            --datadir=${PGHOME}/share \
            --with-pgport=5432 \
            --with-perl \
            --with-python \
            --with-openssl \
            --with-pam \
            --with-ldap \
            --with-libxml \
            --with-libxslt \
            --with-segsize=${SEGSIZE} \
            --with-blocksize=${BLOCKSIZE} \
	    --with-systemd
make all
make install
cd contrib
make install
rm -rf /u02/pgdata/zheap
/u01/app/postgres/product/zheap/db_1/bin/initdb -D /u02/pgdata/zheap
pg_ctl -D /u02/pgdata/zheap start
psql -c "alter system set logging_collector='on'" postgres
psql -c "alter system set log_truncate_on_rotation='on'" postgres
psql -c "alter system set log_filename='postgresql-%a.log'" postgres
psql -c "alter system set log_line_prefix='%m - %l - %p - %h - %u@%d '" postgres
psql -c "alter system set log_directory='pg_log'" postgres
pg_ctl -D /u02/pgdata/zheap restart -m fast

First of all, when you start up PostgreSQL you’ll get two new background worker processes:

postgres@pgbox:/home/postgres/ [ZHEAP] ps -ef | egrep "discard|undo"
postgres  1483  1475  0 14:40 ?        00:00:00 postgres: discard worker   
postgres  1484  1475  0 14:40 ?        00:00:01 postgres: undo worker launcher   
postgres  1566  1070  0 14:51 pts/0    00:00:00 grep -E --color=auto discard|undo

The “discard worker” is responsible for getting rid of all the undo segments that are not required anymore, and the “undo worker launcher” is, obviously, responsible for launching the undo worker processes that perform the rollbacks.

There is a new parameter which controls the default storage engine (at least the parameter is there as of now, maybe that will change in the future), so let’s change it to zheap before we populate a sample database (“heap” is the default value):

postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "alter system set storage_engine='zheap'" postgres
ALTER SYSTEM
Time: 12.722 ms
postgres@pgbox:/home/postgres/ [ZHEAP] pg_ctl -D $PGDATA restart -m fast
postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "show storage_engine" postgres
 storage_engine 
----------------
 zheap
(1 row)

Let’s use pgbench to create the sample data:

postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "create database zheap" postgres
CREATE DATABASE
Time: 763.284 ms
postgres@pgbox:/home/postgres/ [ZHEAP] time pgbench -i -s 100 zheap
...
done.

real	0m23.375s
user	0m2.293s
sys	0m0.772s

That should have created the tables using the zheap storage engine:

postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "\d+ pgbench_accounts" zheap
                                  Table "public.pgbench_accounts"
  Column  |     Type      | Collation | Nullable | Default | Storage  | Stats target | Description 
----------+---------------+-----------+----------+---------+----------+--------------+-------------
 aid      | integer       |           | not null |         | plain    |              | 
 bid      | integer       |           |          |         | plain    |              | 
 abalance | integer       |           |          |         | plain    |              | 
 filler   | character(84) |           |          |         | extended |              | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Options: storage_engine=zheap, fillfactor=100

When we do the same using the “heap” storage format, how long does that take?

postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "alter system set storage_engine='heap'" postgres
ALTER SYSTEM
Time: 8.790 ms
postgres@pgbox:/home/postgres/ [ZHEAP] pg_ctl -D $PGDATA restart -m fast
postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "create database heap" postgres
CREATE DATABASE
Time: 889.847 ms
postgres@pgbox:/home/postgres/ [ZHEAP] time pgbench -i -s 100 heap
...

real	0m30.471s
user	0m2.355s
sys	0m0.419s
postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "\d+ pgbench_accounts" heap
                                  Table "public.pgbench_accounts"
  Column  |     Type      | Collation | Nullable | Default | Storage  | Stats target | Description 
----------+---------------+-----------+----------+---------+----------+--------------+-------------
 aid      | integer       |           | not null |         | plain    |              | 
 bid      | integer       |           |          |         | plain    |              | 
 abalance | integer       |           |          |         | plain    |              | 
 filler   | character(84) |           |          |         | extended |              | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Options: fillfactor=100

postgres@pgbox:/home/postgres/ [ZHEAP] 

I ran that test several times and the difference of about 5 to 6 seconds is consistent. zheap is faster here, but the difference comes from the vacuum that pgbench runs at the end of the initialization. When you run the same test again but skip the vacuum (the “-n” option of pgbench), heap is faster:

postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "create database heap" postgres
CREATE DATABASE
Time: 562.155 ms
postgres@pgbox:/home/postgres/ [ZHEAP] time pgbench -i -n -s 100 heap
done.

real	0m21.650s
user	0m2.316s
sys	0m0.225s

But anyway: as zheap has to create undo segments, more data needs to go to disk initially, while heap needs to run vacuum, not immediately but for sure some time later. When you compare a pure insert-only workload, without vacuum, heap is faster. The great thing is that you can decide what you want to use at the table level: some tables might be better created with the zheap storage engine, others with heap. The important bit is that you have full control.
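
As a sketch (assuming the storage_engine reloption visible in the \d+ output above keeps its current syntax), that per-table choice could look like this:

-- hypothetical example: override the instance-wide default per table via the reloption
create table t_zheap ( id int primary key, payload text ) with (storage_engine='zheap');
create table t_heap  ( id int primary key, payload text ) with (storage_engine='heap');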

Hervé already compared the size of his tables in the last post. Do we see the same here when we compare the size of the entire databases?

postgres@pgbox:/home/postgres/ [ZHEAP] vacuumdb heap
postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "\l+" postgres
                                                                   List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   |  Size   | Tablespace |                Description         
-----------+----------+----------+------------+------------+-----------------------+---------+------------+------------------------------------
 heap      | postgres | UTF8     | en_US.utf8 | en_US.utf8 |                       | 1503 MB | pg_default | 
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 |                       | 7867 kB | pg_default | default administrative connection d
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +| 7721 kB | pg_default | unmodifiable empty database
           |          |          |            |            | postgres=CTc/postgres |         |            | 
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +| 7721 kB | pg_default | default template for new databases
           |          |          |            |            | postgres=CTc/postgres |         |            | 
 zheap     | postgres | UTF8     | en_US.utf8 | en_US.utf8 |                       | 1250 MB | pg_default | 
(5 rows)

Yes, zheap is 253MB smaller. That difference should even get bigger once we populate the “filler” column of the pgbench_accounts table, which is currently NULL:

postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "update pgbench_accounts set filler = 'aaaaaa'" zheap
UPDATE 10000000
Time: 55768.488 ms (00:55.768)
postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "update pgbench_accounts set filler = 'aaaaaa'" heap
UPDATE 10000000
Time: 52598.782 ms (00:52.599)
postgres@pgbox:/home/postgres/ [ZHEAP] vacuumdb heap
vacuumdb: vacuuming database "heap"
postgres@pgbox:/home/postgres/ [ZHEAP] psql -c "\l+" postgres
                                                                   List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   |  Size   | Tablespace |                Description              
-----------+----------+----------+------------+------------+-----------------------+---------+------------+-----------------------------------------
 heap      | postgres | UTF8     | en_US.utf8 | en_US.utf8 |                       | 3213 MB | pg_default | 
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 |                       | 7867 kB | pg_default | default administrative connection databa
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +| 7721 kB | pg_default | unmodifiable empty database
           |          |          |            |            | postgres=CTc/postgres |         |            | 
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +| 7721 kB | pg_default | default template for new databases
           |          |          |            |            | postgres=CTc/postgres |         |            | 
 zheap     | postgres | UTF8     | en_US.utf8 | en_US.utf8 |                       | 1250 MB | pg_default | 

As expected, and consistent with what Hervé has seen in his tests. The update against the heap table was a bit faster (around 3 seconds), but again: zheap has to create undo segments and that causes additional writes to disk. Three seconds against a 10 million row table is not that huge, by the way, and how often do you update the complete table?

Now let’s run a standard pgbench workload against these databases and check what we can see there. For the zheap database, with 1 connection for 60 seconds, this is the best result I got after ten runs:

postgres@pgbox:/home/postgres/ [ZHEAP] pgbench -c 1 -T 60 zheap
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 29265
latency average = 2.050 ms
tps = 487.726916 (including connections establishing)
tps = 487.786025 (excluding connections establishing)

The same against the heap:

postgres@pgbox:/home/postgres/ [ZHEAP] pgbench -c 1 -T 60 heap
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 24992
latency average = 2.401 ms
tps = 416.485499 (including connections establishing)
tps = 416.516805 (excluding connections establishing)

The numbers changed a bit with every execution but zheap was always better than heap (be aware that I am on a little VM here), so at least there is no regression in performance but rather an improvement for this workload.
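
As a side note, a small loop like the one below (the same pattern as used for the select-only test further down) can be used to repeat the runs and collect the tps lines to pick the best result:

for db in zheap heap; do
  for i in {1..10}; do
    pgbench -c 1 -T 60 ${db} | grep "excluding"    # keep only the tps line of each run
  done
done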

For the select only workload (the “-S” option) this is the best result for heap:

postgres@pgbox:/home/postgres/ [ZHEAP] for i in {1..10}; do pgbench -c 1 -S -T 60 heap; done
...
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 64954
latency average = 0.924 ms
tps = 1082.514439 (including connections establishing)
tps = 1082.578288 (excluding connections establishing)
...

And this is the best result for zheap:

postgres@pgbox:/home/postgres/ [ZHEAP] for i in {1..10}; do pgbench -c 1 -S -T 60 zheap; done
...
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 109023
latency average = 0.550 ms
tps = 1816.787280 (including connections establishing)
tps = 1817.485717 (excluding connections establishing)
...

With this workload the difference is even clearer: zheap clearly wins.

As noted before: all these tests have been done locally on a little VM, so be careful with these numbers. We should have access to a great storage system and some good servers soon, and once we have that I’ll do some more tests and publish the results.

For now it is somewhat clear that zheap is an improvement for several types of workloads while heap is still better for others. In the next post I’ll try to do some tests to help the developers, meaning: can we break it?

The article Some more zheap testing first appeared on Blog dbi services.

5 mistakes you should avoid with Oracle Database Appliance

Fri, 2018-11-02 18:00
Introduction

I have been working on ODA for 5 years now and I have a quite good overview of the pros and cons of such a solution. On one hand, ODA can greatly streamline your Oracle Database environment if you understand the purpose of this appliance. On the other hand, ODA can be your nightmare if you’re not using it as it is supposed to be used. Let’s discover the common mistakes to avoid from the very beginning.

1st mistake – Consider the ODA as an appliance

Appliances are very popular these days: you need one feature, you buy a box that handles this feature and nothing else, you plug it in and it works straight away. Unfortunately, you cannot do that with an ODA. First of all, the level of embedded software is not clever enough to bundle all the features needed to simplify the DBA’s job. No doubt it will help you with faster deployment compared to home-made environments, but all the tasks the DBA did before still exist. The appliance part is limited to basic operations. With an ODA, you will have to connect in terminal mode, with the root, grid and oracle users, to do a lot of operations. And you will need Linux skills to manage your ODA: it can be a problem if you are a Windows-only user. ODA provides a graphical interface, but it’s not something you will use very often. There is still a lot of work for Oracle to do before the ODA truly earns the appliance moniker.

2nd mistake – Consider the ODA as a normal server

Second mistake is to consider the ODA as a normal server. Because it looks like it’s a normal server.

On the software side, if you ask a DBA to connect to your Oracle environment on ODA, he probably won’t notice that it’s actually an ODA, unless the server name contains the word oda :-) The only tool that differs from a normal server is the oakcli/odacli appliance manager, a command-line tool created to manage several features like database creation/deletion. What can be dangerous is that you will have system-level access on the server, with all the advantages it comes with. But if you make changes on your system, for example by installing a new package, manually changing the network configuration or tuning some Linux parameters, it can later prevent you from applying the next patch. The DBA should also refrain from patching a database with the classic patches available for Linux systems. Doing that will make your dbhome and related databases no longer manageable by the appliance manager. Wait for the ODA-dedicated quarterly patch bundle if you want to patch.

On the hardware side, yes, an ODA looks like a normal server, with free disk slots in the front. But you always have to order extensions from the ODA catalog, and you cannot do whatever you want. You need to change disks for bigger ones? It’s not possible. You want to add 2 new disks? Not possible either: disks are sold as a 3-disk pack and are supposed to be installed together. ODAs have limitations. The small one cannot be extended at all. And the other ones support limited extensions (in number and in time).

Please keep your ODA away from Linux gurus and hardware geeks. And stick to the recommended configurations.

3rd mistake – Compare the ODA hardware to other systems

When you consider ODA, it’s quite common to compare the hardware to what other brands can propose for the same amount of money. But it’s clearly not a good comparison. You’ll probably get more for your money from other brands. You should consider the ODA as a hardware and software bundle made to last longer than a normal system. As an example, I deployed my first ODA X4-2 in May 2014, and the current software bundle is still compatible with this ODA. Support for this ODA will end in February 2020: nearly 6 years of updates for the whole stack, for a server that is able to run 11.2.0.4, 12.1.0.2 and 12.2.0.1 databases. Do the other brands propose that? I don’t think so.

What you may not immediately realize is how fast the adoption of ODA can be. Most of the ODA projects I did went into full production within 1 year, starting from the initial ODA consideration. On a classic project, choosing the server/storage/software takes longer, deployment lasts longer because multiple people are involved, you sometimes get stuck with hardware/software compatibility problems, and you have no guarantee about the performance even if you choose the best components. ODA reduces the duration and the cost of a project for sure.

4th mistake – Buy only one ODA

If you consider buying just one ODA, you probably need to think twice. Unless you do not plan to patch regularly, a single ODA is probably not a good solution. Patching is not a zero-downtime operation, and it’s not reversible. Even if ODA patch bundles simplify patching, it’s still a complex operation, especially when the patch updates the operating system and the Grid Infrastructure components. Remember that one of the big advantages of an ODA is the availability of a new patch every 3 months to update the whole stack: firmwares, BIOS, ILOM, operating system, grid infrastructure, oracle homes, oracle databases, … So if you want to secure the patching process, you’d better go for 2 ODAs, for example one for production databases and one for dev/test databases. And it makes no sense to move only a part of your databases to ODA, leaving the other databases on classic systems.

Another advantage of 2+ ODAs, if you’re lucky enough to use Enterprise Edition, is the free use of Data Guard (without Active mode – the standby database will stay mounted only). Most often, thinking about ODA also means thinking about disaster recovery solutions. And both are better together.

5th mistake – Manage your Enterprise licenses as you always did

One of the key features of the ODA is the ability to scale the Enterprise licenses, starting from 1 PROC on a single ODA (or 25 named users). 1 license covers only 2 cores on ODA. Does it make sense to run this platform with a limited number of licenses? The answer is yes and yes. Oracle recommends at least one core per database, but it’s not a problem to deploy 5 or even 10 (small) databases with just 1 license, there is no limit for that. Apart from the CPU limitation (applying the license will limit the available cores), ODA has quite a big amount of RAM (please tune your SGA accordingly) and fast I/O that makes reads from disk not so expensive. CPU utilisation will be optimized.

What I mean is that you probably need fewer licenses on ODA than you need on a normal system. You can better spread these licenses over more ODAs and/or decrease the number of licenses you need. The ODA hardware is sometimes self-financed by the savings on licenses. Keep in mind that 1 Oracle Enterprise PROC license costs more than the medium-size ODA. And you can always increase the number of licenses if needed (on-demand capacity).

Buying ODA hardware can be cheaper than you thought.

Conclusion

ODA is a great piece of hardware, and knowing what it is designed for and how it works will help you better manage your Oracle Database environment.

The article 5 mistakes you should avoid with Oracle Database Appliance first appeared on Blog dbi services.

Foglight: Monitoring solution for databases [Part 02]

Fri, 2018-11-02 10:30

Foglight is a powerful, all-in-one monitoring solution that monitors various IT infrastructure assets like databases, servers, applications and so on. Whenever you need to monitor a specific asset, like an Oracle database, a Windows server, a Tomcat server or any other component, you can do so by adding a Cartridge. A Cartridge is like a plugin for a specific technology. As we have already seen how to install Foglight in a previous article here, in this one we are going to test the monitoring of an Oracle database.

Configuration

The configuration is done in the web console, so we first need to log in: https://192.168.56.100:8443
My user/password is the default foglight/foglight and here I am:

Capture

Email for notifications

Dashboards > Administration > Setup > Email Configuration

screen2

Users and external authentication

screen3

It is possible to use an external authentication method by configuring one of the following providers:

  • Microsoft® Active Directory®
  • Oracle® Directory Server Enterprise Edition
  • OpenLDAP®
  • Novell® eDirectory™

More information here.

Installing and Upgrading Cartridges

Each Foglight cartridge contains extensions for monitoring a specific environment, such as applications, operating systems, or database management systems. Cartridges are installed on the server. A cartridge can contain one or more agents that are used to collect data from monitored environments. To manage them we need to go to the Cartridges area:

screen1

Many cartridges for databases are available here: https://support.quest.com/foglight-for-databases/5.9.2/download-new-releases

Deploying an agent
  • Agents collect data from monitored environments and send it to the Management Server
  • Each agent type can monitor a specific part of your environment, such as an operating system, application, or server
  • Foglight cartridges that you install on the server include one or more agent types
  • A server installation includes an embedded Foglight Agent Manager and it starts up and stops with the Foglight Management Server
  • Agents can be installed in silent mode here

To deploy an agent we need to use the Agent Manager in Dashboards > Administration > Agents > Agent Managers:

screen4

Then, log in to the target server to be monitored and install the agent:


[root@oracle-server ~]# mkdir /opt/foglight
[root@oracle-server ~]# chown foglight. /opt/foglight
[root@oracle-server ~]# mv FglAM-5_9_2-linux-x86_64.bin /opt/foglight/
[root@oracle-server ~]# chown foglight. /opt/foglight/FglAM-5_9_2-linux-x86_64.bin
[root@oracle-server ~]# su - foglight
Last login: Tue Oct 2 00:13:43 CEST 2018 on pts/0
[foglight@oracle-server ~]$ cd /opt/foglight/
[foglight@oracle-server foglight]$ ./FglAM-5_9_2-linux-x86_64.bin ## --allow-unsecured
2018-10-02 15:45:19.000 INFO [native] Extracting to a temporary directory
2018-10-02 15:45:20.000 INFO [native] Foglight Agent Manager version 5.9.2 (build 5.9.2-201712050154-RELEASE-101) starting ...
2018-10-02 15:45:23.000 INFO [native] Extraction complete, configuring
[0/2]: -Dinstaller.executable.name=FglAM-5_9_2-linux-x86_64.bin
[1/2]: -Dinstaller.executable.path=/opt/foglight
2018-10-02 15:45:24.278 INFO Foglight Agent Manager: 5.9.2 (build 5.9.2-201712
050154-RELEASE-101)

License Agreement
===============================================================================
You must accept the following license agreement to install and use Foglight Agent Manager.

Do you accept the terms of the license agreement? [Y/N]: Y

Choose Install Location
===============================================================================
Where would you like to install the Foglight Agent Manager application?

Install directory (default: /opt/quest/foglightagentmanager):
/opt/foglight/agent

The directory "/opt/foglight/agent" does not exist. Would you like to create it? [Y/N] (default: Y):

Host Display Name
===============================================================================
Foglight Agent Manager identifies itself using the detected host name for the
computer it has been installed on. This can be overridden below to provide an
alternate name if the hostname is not stable or is already in use by another
computer.

Detected Host Name: [default: oracle-server]:

Log a warning if the detected host name changes? [Y/N] (default: Y):

Server URLs
===============================================================================
Configure the URLs that the Foglight Agent Manager process will use to communicate with the management server.
For fail-over purposes you can configure multiple URLs.

You have the following options:
1) Add a new management server URL
2) Add a new SSL Certificate CA
3) Test connectivity
4) Search for additional HA servers
5) List configured management server URLs
6) List SSL Certificate CAs
7) Delete a configured management server URL
8) Delete an SSL Certificate CA
0) Continue with the next step

What would you like to do? 1

Enter the URL data to add. The management server URL parameters are specified as comma separated name=value pairs. For example:
url=https://localhost:8443,address=127.0.0.1,proxy=http://proxy.server,ssl-allow-self-signed=false,ssl-cert-common-name=name.com

Available parameters are:
url: Required. The URL that Foglight Agent Manager will connect to.
proxy: Optional. The URL of the proxy to use when connecting.
proxy-user: Optional. The username sent to the proxy.
proxy-pass: Optional. The password sent to the proxy.
proxy-ntlm-domain: Optional. The NTLM domain sent to the proxy.
address: Optional. The local network address from which connections to the management server will be made.
ssl-allow-self-signed: Optional. True to allow self-signed certificates to be accepted; false (default) otherwise.
ssl-cert-common-name: Optional. The common name contained in the management servers certificate.
compressed: Optional. True (default) to use GZIP compression when sending and receiving data.

URL: https://192.168.56.100:8443,ssl-allow-self-signed=true
The URL has been added.

You have the following options:
1) Add a new management server URL
2) Add a new SSL Certificate CA
3) Test connectivity
4) Search for additional HA servers
5) List configured management server URLs
6) List SSL Certificate CAs
7) Delete a configured management server URL
8) Delete an SSL Certificate CA
0) Continue with the next step

What would you like to do? 3
Testing connectivity...
0%... 100% finished

You have the following options:
1) Add a new management server URL
2) Add a new SSL Certificate CA
3) Test connectivity
4) Search for additional HA servers
5) List configured management server URLs
6) List SSL Certificate CAs
7) Delete a configured management server URL
8) Delete an SSL Certificate CA
0) Continue with the next step

What would you like to do? 0

You have some untested/broken management server URLs configured. Are you sure you want to continue? [Y/N] (default: Y):

Downstream Connection Configuration
===============================================================================
Foglight Agent Manager can accept incoming connections and be configured as a concentrator that acts as an intermediary connection that aggregates multiple downstream Foglight Agent Manager clients. A concentrator configuration can provide a single connection through either a firewall or proxy for all downstream clients, or as an aggregated connection directly to the server.

Enabling this install as a concentrator will pre-configure queue and heap sizes to support accepting and transferring data from one or more downstream connections.

Enable Concentrator support? [Y/N] (default: N):

Secure Launcher
===============================================================================
Some agents require elevated permissions in order to gather the required system
metrics. These agents are launched using an external loader to give them the required access.

Please see the Foglight Agent Manager documentation for more information on agent security settings.

Secure launcher (default: /bin/sudo):

UNIX init.d Script
===============================================================================
Foglight Agent Manager will be configured to start when this host is rebooted by adding an init.d-style script.

Would you like to customize the start-up script? [Y/N] (default: N):

Summary
===============================================================================
Foglight Agent Manager has been configured and will be copied into its final location.

Press <Enter> to continue.

Beginning work ...
===============================================================================
Calculating install size...
0%... 100% finished

Copying files...
0%... 10%... 20%... 30%... 40%... 50%... 60%... 70%... 80%... 90%... 100% finished

Created init.d installer script:

/opt/foglight/agent/state/default/fglam-init-script-installer.sh

Since this installation is not being performed by a user
with root privileges, the init.d script will not be installed.
A copy of the script and the script installer have been saved
in the state directory for later use.

The Foglight Agent Manager process will be started once this configuration process exits.

Then we can see that the agent is running:

[foglight@oracle-server foglight]$ /opt/foglight/agent/bin/fglam --status
0
#0: Process running normally
#1: Process not running, but PID file exists
#3: Process not running
#4: Process status unknown

I hope this helps and please do not hesitate to contact us should you need more details.

The article Foglight: Monitoring solution for databases [Part 02] first appeared on Blog dbi services.

How to fix OUI-10022 error on an ODA

Fri, 2018-11-02 05:00

When manually upgrading Grid Infrastructure on an ODA according to MOS note 2379389.1, the upgrade has to be done as the grid user. This can fail with an OUI-10022 error, which indicates that the Oracle inventory is corrupt.
But “opatch lsinventory” succeeded both as the oracle and as the grid user, so the inventory seemed to be fine.
It turned out that the locks subdirectory of the Oracle inventory was not writable for the grid user. After making it writable for the grid user, the upgrade ran fine.
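
A quick sketch of the check and fix (the inventory path below is just an example, take the real one from oraInst.loc on your ODA):

# locate the central inventory (example path)
cat /etc/oraInst.loc
# inventory_loc=/u01/app/oraInventory
# inst_group=oinstall

# check the permissions of the locks subdirectory and make it writable for the grid user
ls -ld /u01/app/oraInventory/locks
chmod g+w /u01/app/oraInventory/locks    # assuming grid is a member of the oinstall group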

The article How to fix OUI-10022 error on an ODA first appeared on Blog dbi services.

Password Verification Policy in MySQL 8.0.13

Mon, 2018-10-29 12:08

The new release 8.0.13 of MySQL has been available since last week.
Concerning security, this comes with a new feature already announced: the Password Verification Policy.
Let’s have a look…

The aim of this feature is to secure attempts to change a password by requiring the old one to be specified.
It is turned off by default:

mysql> show variables like 'password_require_current';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| password_require_current | OFF   |
+--------------------------+-------+

and we can activate it in several ways (as for some other password features):
1. Globally, at the server level:

mysql> set persist password_require_current='ON';
mysql> show variables like 'password_require_current';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| password_require_current | ON    |
+--------------------------+-------+

2. On a per-account basis, if we want to force the verification of the old password:

mysql> create user olivier@localhost identified by 'MySQLisPowerful' PASSWORD REQUIRE CURRENT;

3. On a per-account basis, if we want to make the verification of the old password optional:

mysql> create user gregory@localhost identified by 'SecurityIsImportant' PASSWORD REQUIRE CURRENT OPTIONAL;
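
As a side note, it should also be possible to switch an existing account back to following the global setting with the DEFAULT keyword (syntax to be double-checked against the 8.0.13 documentation):

mysql> alter user gregory@localhost password require current default;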

Suppose that we have activated it at the server level; now let’s create a user account:

mysql> create user elisa@localhost identified by 'manager';

If we try to change the password for this user, we can do that without specifying the old password:

mysql> alter user elisa@localhost identified by 'WhatsTheProblem';

Why? Because we are connected as the root account. Actually, accounts which have the CREATE USER privilege or UPDATE on the mysql system schema are not affected by this policy.
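
To quickly list the accounts that are exempt because of the CREATE USER privilege, a query like the following can help (it does not cover the UPDATE on mysql.* case):

mysql> select grantee from information_schema.user_privileges where privilege_type = 'CREATE USER';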

So if we try to connect as our user ‘elisa’ and to change our password:

mysql> select user();
+-----------------+
| user()          |
+-----------------+
| elisa@localhost |
+-----------------+
mysql> alter user elisa@localhost identified by 'GoodVibes';
ERROR 13226 (HY000): Current password needs to be specified in the REPLACE clause in order to change it.

that is not possible. We can only do that if we specify our old password in the ‘ALTER USER’ statement through the ‘REPLACE’ clause:

mysql> alter user elisa@localhost identified by 'GoodVibes' replace 'WhatsTheProblem';
Query OK, 0 rows affected (0.12 sec)

Simple, isn’t it?
As a best practice in terms of security, I suggest you activate this functionality in your MySQL environment.
For more information concerning the new security features in MySQL 8.0, check the MySQL Documentation and come to my session “MySQL 8.0 Community: Ready for GDPR?” at the DOAG.

The article Password Verification Policy in MySQL 8.0.13 first appeared on Blog dbi services.
