Yann Neuhaus


PostgreSQL Cluster using repmgr

Thu, 2022-08-11 10:53

In this blog I describe the installation of a PostgreSQL cluster using repmgr instead of Patroni. Repmgr was originally developed by 2ndQuadrant, which is now part of EDB, and EDB has decided to carry the 2ndQuadrant tools forward: Barman survives as the backup solution from 2ndQuadrant, while BART has been discontinued by EDB.

First I set up three virtual machines with a Rocky Linux 8.6 minimal installation, including the EPEL repository for htop. Then I make sure that networking between these three VMs works by adapting /etc/hosts. Without a functional network, a cluster won't work.

[root@repmgr-01 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
192.168.198.130 repmgr-01 repmgr-01.localdomain
192.168.198.131	repmgr-02 repmgr-02.localdomain
192.168.198.132	repmgr-03 repmgr-03.localdomain
[root@repmgr-01 ~]# ping repmgr-01
PING repmgr-01 (192.168.198.130) 56(84) bytes of data.
64 bytes from repmgr-01 (192.168.198.130): icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from repmgr-01 (192.168.198.130): icmp_seq=2 ttl=64 time=0.057 ms
64 bytes from repmgr-01 (192.168.198.130): icmp_seq=3 ttl=64 time=0.183 ms
--- repmgr-01 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2074ms
rtt min/avg/max/mdev = 0.032/0.090/0.183/0.066 ms

[root@repmgr-01 ~]# ping repmgr-02
PING repmgr-02 (192.168.198.131) 56(84) bytes of data.
64 bytes from repmgr-02 (192.168.198.131): icmp_seq=1 ttl=64 time=0.550 ms
64 bytes from repmgr-02 (192.168.198.131): icmp_seq=2 ttl=64 time=0.757 ms
64 bytes from repmgr-02 (192.168.198.131): icmp_seq=3 ttl=64 time=0.838 ms
--- repmgr-02 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2074ms
rtt min/avg/max/mdev = 0.550/0.715/0.838/0.121 ms

[root@repmgr-01 ~]# ping repmgr-03
PING repmgr-03 (192.168.198.132) 56(84) bytes of data.
64 bytes from repmgr-03 (192.168.198.132): icmp_seq=1 ttl=64 time=0.541 ms
64 bytes from repmgr-03 (192.168.198.132): icmp_seq=2 ttl=64 time=0.479 ms
64 bytes from repmgr-03 (192.168.198.132): icmp_seq=3 ttl=64 time=0.439 ms
--- repmgr-03 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2058ms
rtt min/avg/max/mdev = 0.439/0.486/0.541/0.045 ms
[root@repmgr-01 ~]# 

The PostgreSQL installation follows the steps I described in my article at heise.de:

https://www.heise.de/ratgeber/PostgreSQL-installieren-mit-den-Community-Paketen-4877556.html

But with additional packages such as repmgr and barman, and of course the latest PostgreSQL 14.

The installation starts with adding the postgresql.org repository and disabling the OS PostgreSQL module on the two database nodes; the third machine is planned as a witness for automatic failover.

[root@repmgr-01 ~]# dnf install https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
Last metadata expiration check: 0:20:49 ago on Thu 11 Aug 2022 11:06:40 AM CEST.
pgdg-redhat-repo-latest.noarch.rpm                                                                                                                                                                                                                         62 kB/s |  13 kB     00:00    
Dependencies resolved.
==========================================================================================================================================================================================================================================================================================
 Package                                                                   Architecture                                                    Version                                                            Repository                                                             Size
==========================================================================================================================================================================================================================================================================================
Installing:
 pgdg-redhat-repo                                                          noarch                                                          42.0-24                                                            @commandline                                                           13 k

Transaction Summary
==========================================================================================================================================================================================================================================================================================
Install  1 Package

Total size: 13 k
Installed size: 12 k
Is this ok [y/N]: y
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                                                                                                                                                  1/1 
  Installing       : pgdg-redhat-repo-42.0-24.noarch                                                                                                                                                                                                                                  1/1 
  Verifying        : pgdg-redhat-repo-42.0-24.noarch                                                                                                                                                                                                                                  1/1 

Installed:
  pgdg-redhat-repo-42.0-24.noarch                                                                                                                                                                                                                                                         

Complete!
[root@repmgr-01 ~]# dnf -qy module disable postgresql
Importing GPG key 0x442DF0F8:
 Userid     : "PostgreSQL RPM Building Project <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 68C9 E2B9 1A37 D136 FE74 D176 1F16 D2E1 442D F0F8
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Importing GPG key 0x442DF0F8:
 Userid     : "PostgreSQL RPM Building Project <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 68C9 E2B9 1A37 D136 FE74 D176 1F16 D2E1 442D F0F8
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Importing GPG key 0x442DF0F8:
 Userid     : "PostgreSQL RPM Building Project <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 68C9 E2B9 1A37 D136 FE74 D176 1F16 D2E1 442D F0F8
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Importing GPG key 0x442DF0F8:
 Userid     : "PostgreSQL RPM Building Project <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 68C9 E2B9 1A37 D136 FE74 D176 1F16 D2E1 442D F0F8
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Importing GPG key 0x442DF0F8:
 Userid     : "PostgreSQL RPM Building Project <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 68C9 E2B9 1A37 D136 FE74 D176 1F16 D2E1 442D F0F8
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
Importing GPG key 0x442DF0F8:
 Userid     : "PostgreSQL RPM Building Project <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 68C9 E2B9 1A37 D136 FE74 D176 1F16 D2E1 442D F0F8
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
[root@repmgr-01 ~]# dnf install -y postgresql14 postgresql14-server postgresql14-contrib postgresql14-libs repmgr_14 barman

I personally disable all PostgreSQL releases in the repository file that I don't want to use before installation; this speeds up the whole installation process.

The next step is adapting the service file for a non-standard PGDATA. For that I use systemctl edit to create an override.conf for the postgresql-14.service unit.

[root@repmgr-03 ~]# systemctl edit postgresql-14.service
[Service]
Environment=PGDATA=/pgdata/14/data

Initialization should be done on the planned leader/master node only; the replica node is created later using pg_basebackup.

[root@repmgr-01 ~]# /usr/pgsql-14/bin/postgresql-14-setup initdb
Initializing database ... OK

For recurring jobs I have a set of shell scripts, one of them for setting CPU and memory parameters, including SSL with a self-created server certificate and key. It requires a directory /pgdata/ssl owned by the postgres user.

It configures PostgreSQL using ALTER SYSTEM SET commands, which write to postgresql.auto.conf.

[postgres@repmgr-01 pgsql]$ cat config.sh 
#!/bin/bash

##############################################
# PostgreSQL configuration                   #
# by Karsten Lenz dbi services sa 04.29.2022 #
##############################################

progName=$(basename $0)

echo "PostgreSQL Configuration"
echo ""

function printHelp {
  printf "Usage:\n"
  printf "${progName} [OPTION]\n\n"
  printf "Options:\n"
  printf "\t -c <Count of CPU used>\t\t\tAmount of CPU used by this system (required)\n"
  printf "\t -m <Amount of Memory>\t\t\tAmount of Memory of this system (required)\n"
  printf "\t -o <Max Connections>\t\t\tAmount of Connections of this system (default = 100))\n"
  printf "\t -v <PostgreSQL Version>\t\tMajor Release of Postgresql (default = 14)\n"
  printf "\t -h <Help>\t\t\t\tprints this help\n"
}

while getopts c:m:o:v:h option 2>/dev/null
do
  case "${option}"
  in
  c) CPU=${OPTARG};;
  m) RAM=${OPTARG};;
  o) CONNECTIONS=${OPTARG:=100};;
  v) VERSION=${OPTARG:=14};;
  h) printHelp; exit 2;;
  *) printf "Unsupported option or parameter value missing '$*'\n";
     printf "Run ${progName} -h to print help\n"; exit 1;;
  esac
done

# create ssl certificate and ssl key
openssl req -new -newkey rsa:4096 -nodes -x509 -subj "/C=CH/ST=DBAAS/L=ZUERICH/O=Dis/CN=www.dbi-services.com" -keyout /pgdata/ssl/pgsql.key -out /pgdata/ssl/pgsql.crt

# define parameters
rootdir=/opt/pgsql/config
cd ${rootdir}

# connections
max_connections=$CONNECTIONS
echo max_connections : $max_connections

# memory
let shared_buffers=($RAM/4)
echo shared_buffers : $shared_buffers
let effective_cache_size=($RAM-$shared_buffers)
echo effective_cache_size : $effective_cache_size
let work_mem=($RAM*256/$CONNECTIONS)
echo work_mem : $work_mem
let maintenance_work_mem=($RAM*256/8)
echo  maintenance_work_mem : $maintenance_work_mem

# cpu
let max_worker_processes=($CPU)
echo max_worker_processes : $max_worker_processes
let max_parallel_workers=($CPU)
echo  max_parallel_workers : $max_parallel_workers
let max_parallel_workers_per_gather=($CPU/2)
echo max_parallel_workers_per_gather : $max_parallel_workers_per_gather
let max_parallel_maintenance_workers=($CPU/2)
echo max_parallel_maintenance_workers : $max_parallel_maintenance_workers

# cpu and memory configuration
psql -c "alter system set listen_addresses = '*';"
psql -c "alter system set max_connections = '$max_connections';"
psql -c "alter system set effective_cache_size = '$effective_cache_size GB';"
psql -c "alter system set shared_buffers = '$shared_buffers GB';"
psql -c "alter system set work_mem = '$work_mem MB';"
psql -c "alter system set maintenance_work_mem = '$maintenance_work_mem MB';"
psql -c "alter system set max_worker_processes = '$max_worker_processes';"
psql -c "alter system set max_parallel_workers = '$max_parallel_workers';"
psql -c "alter system set max_parallel_workers_per_gather = '$max_parallel_workers_per_gather';"
psql -c "alter system set max_parallel_maintenance_workers = '$max_parallel_maintenance_workers';"
psql -c "alter system set ssl_cert_file = '/pgdata/ssl/pgsql.crt';"
psql -c "alter system set ssl_key_file = '/pgdata/ssl/pgsql.key';"
psql -c "alter system set ssl = on;"
psql -c "alter system set ssl_ciphers = 'HIGH';"
psql -c "alter system set ssl_min_protocol_version = 'TLSv1.2';"
psql -c "alter system set shared_preload_libraries = pg_stat_statements;"
sudo service postgresql-$VERSION restart
exit 
[postgres@repmgr-01 pgsql]$ sh config.sh -h
PostgreSQL Configuration

Usage:
config.sh [OPTION]

Options:
	 -c <Count of CPU used>			Amount of CPU used by this system (required)
	 -m <Amount of Memory>			Amount of Memory of this system (required)
	 -o <Max Connections>			Amount of Connections of this system (default = 100))
	 -v <PostgreSQL Version>		Major Release of Postgresql (default = 14)
	 -h <Help>				prints this help

[postgres@repmgr-01 pgsql]$ sh config.sh -c 2 -m 4 -o 100 -v 14
PostgreSQL Configuration

Generating a RSA private key
..........................................................++++
.................................++++
writing new private key to '/pgdata/ssl/pgsql.key'
-----
max_connections : 100
shared_buffers : 1
effective_cache_size : 3
work_mem : 10
maintenance_work_mem : 128
max_worker_processes : 2
max_parallel_workers : 2
max_parallel_workers_per_gather : 1
max_parallel_maintenance_workers : 1
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
Redirecting to /bin/systemctl restart postgresql-14.service

[postgres@repmgr-01 pgsql]$ cat /pgdata/14/data/postgresql.auto.conf 
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
listen_addresses = '*'
max_connections = '100'
effective_cache_size = '3 GB'
shared_buffers = '1 GB'
work_mem = '10 MB'
maintenance_work_mem = '128 MB'
max_worker_processes = '2'
max_parallel_workers = '2'
max_parallel_workers_per_gather = '1'
max_parallel_maintenance_workers = '1'
ssl_cert_file = '/pgdata/ssl/pgsql.crt'
ssl_key_file = '/pgdata/ssl/pgsql.key'
ssl = 'on'
ssl_ciphers = 'HIGH'
ssl_min_protocol_version = 'TLSv1.2'
shared_preload_libraries = 'pg_stat_statements'
[postgres@repmgr-01 pgsql]$ 

The system has 2 vCPUs and 4 GB RAM, as visible in this configuration.
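The sizing arithmetic config.sh performs can be sketched in a few lines of Python (a minimal illustration of the same integer divisions the shell script uses; the function name is mine, not part of the script):

```python
# Hedged sketch of the sizing rules in config.sh, using integer
# division exactly like the shell "let" statements do.
def pg_sizing(cpu: int, ram_gb: int, connections: int = 100) -> dict:
    shared_buffers = ram_gb // 4                      # RAM/4
    return {
        "shared_buffers_gb": shared_buffers,
        "effective_cache_size_gb": ram_gb - shared_buffers,   # RAM - shared_buffers
        "work_mem_mb": ram_gb * 256 // connections,           # RAM*256/connections
        "maintenance_work_mem_mb": ram_gb * 256 // 8,         # RAM*256/8
        "max_worker_processes": cpu,
        "max_parallel_workers": cpu,
        "max_parallel_workers_per_gather": cpu // 2,
        "max_parallel_maintenance_workers": cpu // 2,
    }

# The 2 vCPU / 4 GB test VM from the run shown earlier:
print(pg_sizing(2, 4))
```

For cpu=2, ram_gb=4 and the default 100 connections this reproduces the values the script printed: shared_buffers 1 GB, effective_cache_size 3 GB, work_mem 10 MB, maintenance_work_mem 128 MB.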

For repmgr I would like to use passwordless authentication via .pgpass. For that I have also written a small shell script, again with -h for help.

[postgres@repmgr-01 /]$ cat /opt/pgsql/config/pgpass.sh 
#!/bin/sh

########################################
#                                      #
#  pgpass setup script                 #
#                                      #
#  Author: Karsten Lenz / 2020.05.28   #
#                                      #
########################################

progName=$(basename $0)
# postgresVersion=12
domain=localdomain
# pgData=/pgdata/$postgresVersion/data
# postgresConf=/pgdata/$postgresVersion/data/postgresql.conf
postgresHome=/var/lib/pgsql
# postgresBin=/usr/pgsql-$postgresVersion/bin
pgpass=$postgresHome/.pgpass
password=PutYourPasswordHere

function printHelp() {
  printf "Usage:\n"
  printf "${progName} [OPTION]\n\n"
  printf "Options:\n"
  printf "\t -p <Primary Server>\t\t\tserver where the primary host is running on (required)\n"
  printf "\t -s <Secondary Server>\t\t\tserver where the secondary host is running on (required)\n"
  printf "\t -h <Help>\t\t\t\tprints this help\n"
}

while getopts p:s:h option 2>/dev/null
do
  case "${option}"
  in
  p) primServer=${OPTARG};; 
  s) secdServer=${OPTARG};;
  h) printHelp; exit 2;;
  *) printf "Unsupported option or parameter value missing '$*'\n"; 
     printf "Run ${progName} -h to print help\n"; exit 1;;
  esac
done

############ Log function ############

logFile=/tmp/pgpass_install.log

function log() {
  echo "$(date +%Y.%m.%d-%H:%M:%S) [$$]$*" | tee -a $logFile
}

if [ ! -f $logFile ]; then
  touch $logFile
  chmod 774 $logFile
  sleep 2
fi

#clean .pgpass
rm -f $pgpass

#set values in .pgpass
log "INFO: #host:port:database:user:password in $pgpass"
echo "#host:port:database:user:password" | tee -a $pgpass
log "INFO: Setting localhost in $pgpass"
echo "localhost:5432:*:repmgr:$password" | tee -a $pgpass
log "INFO: Setting 127.0.0.1 in $pgpass"
echo "127.0.0.1:5432:*:repmgr:$password" | tee -a $pgpass
log "INFO: Setting Primary $primServer in $pgpass"
echo "$primServer.$domain:5432:*:repmgr:$password" | tee -a $pgpass
log "INFO: Setting Primary $secdServer in $pgpass"
echo "$secdServer.$domain:5432:*:repmgr:$password" | tee -a $pgpass

#set .pgpass 0600
chmod 0600 $pgpass

#export PGPASSFILE
export PGPASSFILE='/var/lib/pgsql/.pgpass'
[postgres@repmgr-01 /]$ 

[postgres@repmgr-01 /]$ sh /opt/pgsql/config/pgpass.sh -h
Usage:
pgpass.sh [OPTION]

Options:
	 -p <Primary Server>			server where the primary host is running on (required)
	 -s <Secondary Server>			server where the secondary host is running on (required)
	 -h <Help>				prints this help
[postgres@repmgr-01 /]$ 
[postgres@repmgr-01 /]$ sh /opt/pgsql/config/pgpass.sh -p 192.168.198.130 -s 192.168.198.131
2022.08.11-14:50:52 [10902]INFO: #host:port:database:user:password in /var/lib/pgsql/.pgpass
#host:port:database:user:password
2022.08.11-14:50:52 [10902]INFO: Setting localhost in 
localhost:5432:*:repmgr:PutYourPasswordHere
2022.08.11-14:50:52 [10902]INFO: Setting 127.0.0.1 in /var/lib/pgsql/.pgpass
127.0.0.1:5432:*:repmgr:PutYourPasswordHere
2022.08.11-14:50:52 [10902]INFO: Setting Primary 192.168.198.130 in /var/lib/pgsql/.pgpass
192.168.198.130.localdomain:5432:*:repmgr:PutYourPasswordHere
2022.08.11-14:50:52 [10902]INFO: Setting Primary 192.168.198.131 in /var/lib/pgsql/.pgpass
192.168.198.131.localdomain:5432:*:repmgr:PutYourPasswordHere
[postgres@repmgr-01 /]$ 

[postgres@repmgr-01 /]$ cat /var/lib/pgsql/.pgpass
#host:port:database:user:password
localhost:5432:*:repmgr:PutYourPasswordHere
127.0.0.1:5432:*:repmgr:PutYourPasswordHere
192.168.198.130.localdomain:5432:*:repmgr:PutYourPasswordHere
192.168.198.131.localdomain:5432:*:repmgr:PutYourPasswordHere
[postgres@repmgr-01 /]$ 
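The entries the script writes follow the host:port:database:user:password layout that libpq expects in .pgpass. A small Python sketch of the same logic (hypothetical helper, placeholder hosts and password):

```python
# Hedged sketch: build .pgpass entries the way pgpass.sh does.
# Domain, user and password defaults mirror the script's placeholders.
def pgpass_lines(primary: str, secondary: str,
                 domain: str = "localdomain",
                 user: str = "repmgr",
                 password: str = "PutYourPasswordHere") -> list[str]:
    hosts = ["localhost", "127.0.0.1",
             f"{primary}.{domain}", f"{secondary}.{domain}"]
    # libpq format: host:port:database:user:password ('*' matches any database)
    return ["#host:port:database:user:password"] + [
        f"{h}:5432:*:{user}:{password}" for h in hosts
    ]

for line in pgpass_lines("repmgr-01", "repmgr-02"):
    print(line)
```

Note that libpq only honors the file when it has mode 0600, which is why the script ends with chmod 0600 on it.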

Now the setup of repmgr itself. For that, too, I have written shell scripts for recurring operations; originally the scripts were written for a customer project, a DBaaS environment within a private cloud.

Setting up the Leader / Master node:

[postgres@repmgr-01 /]$ cat /opt/pgsql/config/repMgrMasterSetup.sh 
#!/bin/sh

########################################
#  RepMgr setup script                 #
#  Rework: Karsten Lenz / 2022.08.11   #
########################################

progName=$(basename $0)
# postgresVersion=14
domain=localdomain
# repmgr_conf=/etc/repmgr/$postgresVersion/repmgr.conf
# pgData=/pgdata/$postgresVersion/data
# postgresConf=/pgdata/$postgresVersion/data/postgresql.conf
# postgresHome=/var/lib/pgsql/$postgresVersion
# postgresBin=/usr/pgsql-$postgresVersion/bin
password=PutYourPasswordHere

function printHelp() {
  printf "Usage:\n"
  printf "${progName} [OPTION]\n\n"
  printf "Options:\n"
  printf "\t -p <Primary Server>\t\t\thost where the primary server is running on (required)\n"
  printf "\t -s <Standby Server>\t\t\thost where the standby server is running on (required)\n"
  printf "\t -v <PostgreSQL Major Release>\t\tMajor Release Number default 14 (required)\n"
  printf "\t -h <Help>\t\t\t\tprints this help\n"
}

while getopts p:s:v:h option 2>/dev/null
do
  case "${option}"
  in
  p) primServer=${OPTARG};;
  s) secdServer=${OPTARG};;
  v) postgresVersion=${OPTARG:=14};;
  h) printHelp; exit 2;;
  *) printf "Unsupported option or parameter value missing '$*'\n"; 
     printf "Run ${progName} -h to print help\n"; exit 1;;
  esac
done

### Building Variables according to inputs ###
repmgr_conf=/etc/repmgr/$postgresVersion/repmgr.conf
pgData=/pgdata/$postgresVersion/data
postgresConf=/pgdata/$postgresVersion/data/postgresql.conf
postgresHome=/var/lib/pgsql/$postgresVersion
postgresBin=/usr/pgsql-$postgresVersion/bin

rootDir=/opt/pgsql

############ Log function ############

logFile=/tmp/repMaster_install.log

function log() {
  echo "$(date +%Y.%m.%d-%H:%M:%S) [$$]$*" | tee -a $logFile
}

if [ ! -f $logFile ]; then
  touch $logFile
  chmod 774 $logFile
  sleep 2
fi

############ MAIN ############
psql -c "alter system set max_replication_slots = 10;"
psql -c "alter system set archive_mode = 'on';"
psql -c "alter system set archive_command = '/bin/true';"
psql -c "alter system set wal_level = 'replica';"
psql -c "alter system set max_wal_senders = 2;"
psql -c "create user repmgr with superuser"
log "INFO: create user repmgr with superuser"
psql -c "alter user repmgr with password '$password'"
log "INFO: alter user repmgr set password"

$postgresBin/createdb repmgrdb -O repmgr
log "INFO: Create database repmgrdb with owner repmgr"

$postgresBin/pg_ctl reload -D $pgData -W -s
rc=$?
if [ $rc -eq 0 ]; then
  log "INFO: Reloading postgres returned $rc"
else
  log "ERROR: Reloading postgres returned $rc"
  exit 8
fi

> $repmgr_conf
#log "INFO: Setting cluster=$repCluster in $repmgr_conf"
#echo "cluster=$repCluster" | tee -a $repmgr_conf
log "INFO: Setting node_id=1 in $repmgr_conf"
echo "node_id=1" | tee -a $repmgr_conf
log "INFO: Setting node_name=$primServer in $repmgr_conf"
echo "node_name=$primServer" | tee -a $repmgr_conf
log "INFO: Setting conninfo='host=$primServer.$domain user=repmgr dbname=repmgrdb' in $repmgr_conf"
echo "conninfo='host=$primServer.$domain user=repmgr dbname=repmgrdb'" | tee -a $repmgr_conf
log "INFO: Setting use_replication_slots=true"
echo "use_replication_slots=true" | tee -a $repmgr_conf
log "INFO: Setting data_directory='$pgData' in $repmgr_conf"
echo "data_directory='$pgData'" | tee -a $repmgr_conf

#/usr/psql-14/bin repmgrdb repmgr <<EOF

psql -c "ALTER USER repmgr SET search_path TO repmgr, public;"
log "INFO: ALTER USER repmgr SET search_path TO repmgr, public;"

$postgresBin/repmgr -f $repmgr_conf -F master register
rc=$?
if [ $rc -eq 0 ]; then
  log "INFO: Registering master returned $rc"
else
  log "ERROR: Registering master returned $rc"
  exit 8
fi

echo "setup of primary successfully completed"
[postgres@repmgr-01 /]$ 

[postgres@repmgr-01 /]$ sh /opt/pgsql/config/repMgrMasterSetup.sh -p repmgr-01 -s repmgr-02 -v 14
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
ALTER SYSTEM
2022.08.11-16:06:28 [14145]INFO: create user repmgr with superuser
ALTER ROLE
2022.08.11-16:06:28 [14145]INFO: alter user repmgr set password
2022.08.11-16:06:28 [14145]INFO: Create database repmgrdb with owner repmgr
2022.08.11-16:06:28 [14145]INFO: Reloading postgres returned 0
2022.08.11-16:06:28 [14145]INFO: Setting node_id=1 in /etc/repmgr/14/repmgr.conf
node_id=1
2022.08.11-16:06:28 [14145]INFO: Setting node_name=repmgr-01 in /etc/repmgr/14/repmgr.conf
node_name=repmgr-01
2022.08.11-16:06:28 [14145]INFO: Setting conninfo='host=repmgr-01.localdomain user=repmgr dbname=repmgrdb' in /etc/repmgr/14/repmgr.conf
conninfo='host=repmgr-01.localdomain user=repmgr dbname=repmgrdb'
2022.08.11-16:06:28 [14145]INFO: Setting use_replication_slots=true
use_replication_slots=true
2022.08.11-16:06:28 [14145]INFO: Setting data_directory='/pgdata/14/data' in /etc/repmgr/14/repmgr.conf
data_directory='/pgdata/14/data'
ALTER ROLE
2022.08.11-16:06:28 [14145]INFO: ALTER USER repmgr SET search_path TO repmgr, public;
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered
2022.08.11-16:06:28 [14145]INFO: Registering master returned 0
setup of primary successfully completed
[postgres@repmgr-01 /]$

And now the master is running.

[postgres@repmgr-01 /]$ /usr/pgsql-14/bin/repmgr cluster show
 ID | Name      | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-----------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | repmgr-01 | primary | * running |          | default  | 100      | 1        | host=repmgr-01.localdomain user=repmgr dbname=repmgrdb
[postgres@repmgr-01 /]$

For the replica we need to copy pgsql.crt and pgsql.key via scp to /pgdata/ssl on the replica. After that we can use the repMgrStanbySetup.sh script to attach the replica to the leader.

[postgres@repmgr-02 /]$ cat /opt/pgsql/config/repMgrStanbySetup.sh 
#!/bin/sh

#################################################
#  RepMgr Standby setup script                  #
#  Author: Karsten Lenz dbi-services 2022.08.11 #
#################################################

progName=$(basename $0)
# postgresVersion=14
domain=localdomain
# repmgr_conf=/etc/repmgr/$postgresVersion/repmgr.conf
# pgData=/pgdata/$postgresVersion/data
# postgresConf=/pgdata/$postgresVersion/data/postgresql.conf
# postgresHome=/var/lib/pgsql/$postgresVersion
# postgresBin=/usr/pgsql-$postgresVersion/bin
password=PutYourPasswordHere

function printHelp() {
  printf "Usage:\n"
  printf "${progName} [OPTION]\n\n"
  printf "Options:\n"
  printf "\t -c <Container Name>\t\t\tname of the container/cluster (required)\n"
  printf "\t -p <Primary Server>\t\t\thost where the primary server is running on (required)\n"
  printf "\t -s <Standby Server>\t\t\thost where the standby server is running on (required)\n"
  printf "\t -v <PostgreSQL Major Release>\t\tMajor Release Number 14 default (required)\n"
  printf "\t -h <Help>\t\t\t\tprints this help\n"
}

while getopts c:p:s:v:h option 2>/dev/null
do
  case "${option}"
  in
  c) container=${OPTARG};;
  p) primServer=${OPTARG};;
  s) secdServer=${OPTARG};;
  v) postgresVersion=${OPTARG:=14};;
  h) printHelp; exit 2;;
  *) printf "Unsupported option or parameter value missing '$*'\n"; 
     printf "Run ${progName} -h to print help\n"; exit 1;;
  esac
done

### Building Definitions according to inputs ###
repmgr_conf=/etc/repmgr/$postgresVersion/repmgr.conf
pgData=/pgdata/$postgresVersion/data
postgresConf=/pgdata/$postgresVersion/data/postgresql.conf
postgresHome=/var/lib/pgsql/$postgresVersion
postgresBin=/usr/pgsql-$postgresVersion/bin

rootDir=/opt/pgsql

############ Log function ############

logFile=/tmp/repSecondary_install.log

function log() {
  echo "$(date +%Y.%m.%d-%H:%M:%S) [$$]$*" | tee -a $logFile
}

if [ ! -f $logFile ]; then
  touch $logFile
  chmod 774 $logFile
  sleep 2
fi

############ MAIN ############
# change cert and key file via alter system set command
# not necessary - will be copied with base dump??
#psql -c "alter system set ssl_cert_file = '/pgdata/security/ssl/${container}.pem'; "
#psql -c "alter system set ssl_key_file = '/pgdata/security/ssl/${container}.key'; "

>$repmgr_conf

log "INFO: Setting node_id=2 in $repmgr_conf"
echo "node_id=2" | tee -a $repmgr_conf
log "INFO: Setting node_name=$secdServer in $repmgr_conf"
echo "node_name=$secdServer" | tee -a $repmgr_conf
log "INFO: Setting conninfo='host=$secdServer.$domain user=repmgr dbname=repmgrdb' in $repmgr_conf"
echo "conninfo='host=$secdServer.$domain user=repmgr dbname=repmgrdb'" | tee -a $repmgr_conf
log "Info: Setting 'use_replication_slots=true'  in $repmgr_conf"
echo "use_replication_slots=true"  | tee -a $repmgr_conf
log "INFO: Setting data_directory='$pgData' in $repmgr_conf"
echo "data_directory='$pgData'" | tee -a $repmgr_conf

#/usr/psql-14/bin repmgrdb repmgr <<EOF

$postgresBin/repmgr -h $primServer.$domain -U repmgr -d repmgrdb -F standby clone
rc=$?
if [ $rc -eq 0 ]; then
  log "INFO: Registering standby returned $rc"
else
  log "ERROR: Registering standby returned $rc"
  exit 8
fi
#start postgresql
sudo systemctl start postgresql-${postgresVersion}.service

## # set path
## psql -c "ALTER USER repmgr SET search_path TO repmgr, public;"
## log "INFO: ALTER USER repmgr SET search_path TO repmgr, public;"

#register standby
$postgresBin/repmgr standby register

echo "setup of standby successfully completed"

[postgres@repmgr-02 /]$ 

The script has a help option, -h, that explains how it is used.

[postgres@repmgr-02 /]$ sh /opt/pgsql/config/repMgrStanbySetup.sh -h
Usage:
repMgrStanbySetup.sh [OPTION]

Options:
	 -c <Container Name>			name of the container/cluster (required)
	 -p <Primary Server>			host where the primary server is running on (required)
	 -s <Standby Server>			host where the standby server is running on (required)
	 -v <PostgreSQL Major Release>		Major Release Number 14 default (required)
	 -h <Help>				prints this help
[postgres@repmgr-02 /]$ 

OK, let's run it.

[postgres@repmgr-02 data]$ sh /opt/pgsql/config/repMgrStanbySetup.sh -c Cluster-01 -p repmgr-01 -s repmgr-02 -v 14
2022.08.11-17:39:53 [13124]INFO: Setting node_id=2 in /etc/repmgr/14/repmgr.conf
node_id=2
2022.08.11-17:39:53 [13124]INFO: Setting node_name=repmgr-02 in /etc/repmgr/14/repmgr.conf
node_name=repmgr-02
2022.08.11-17:39:53 [13124]INFO: Setting conninfo='host=repmgr-02.localdomain user=repmgr dbname=repmgrdb' in /etc/repmgr/14/repmgr.conf
conninfo='host=repmgr-02.localdomain user=repmgr dbname=repmgrdb'
2022.08.11-17:39:53 [13124]Info: Setting 'use_replication_slots=true'  in /etc/repmgr/14/repmgr.conf
use_replication_slots=true
2022.08.11-17:39:53 [13124]INFO: Setting data_directory='/pgdata/14/data' in /etc/repmgr/14/repmgr.conf
data_directory='/pgdata/14/data'
NOTICE: destination directory "/pgdata/14/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=repmgr-01.localdomain user=repmgr dbname=repmgrdb
DETAIL: current installation size is 34 MB
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
WARNING: data checksums are not enabled and "wal_log_hints" is "off"
DETAIL: pg_rewind requires "wal_log_hints" to be enabled
WARNING: directory "/pgdata/14/data" exists but is not empty
NOTICE: -F/--force provided - deleting existing data directory "/pgdata/14/data"
NOTICE: starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
  pg_basebackup -l "repmgr base backup"  -D /pgdata/14/data -h repmgr-01.localdomain -p 5432 -U repmgr -X stream -S repmgr_slot_2 
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /pgdata/14/data start
HINT: after starting the server, you need to register this standby with "repmgr standby register"
2022.08.11-17:39:53 [13124]INFO: Registering standby returned 0
INFO: connecting to local node "repmgr-02" (ID: 2)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1)
INFO: standby registration complete
NOTICE: standby node "repmgr-02" (ID: 2) successfully registered
setup of standby successfully completed
[postgres@repmgr-02 data]$ 

The Cluster is up and running now.

[postgres@repmgr-01 /]$ /usr/pgsql-14/bin/repmgr cluster show
 ID | Name      | Role    | Status    | Upstream  | Location | Priority | Timeline | Connection string                                     
----+-----------+---------+-----------+-----------+----------+----------+----------+--------------------------------------------------------
 1  | repmgr-01 | primary | * running |           | default  | 100      | 1        | host=repmgr-01.localdomain user=repmgr dbname=repmgrdb
 2  | repmgr-02 | standby |   running | repmgr-01 | default  | 100      | 1        | host=repmgr-02.localdomain user=repmgr dbname=repmgrdb
[postgres@repmgr-01 /]$ 
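For the automatic failover the witness node repmgr-03 was planned for, the repmgrd daemon would additionally be needed on each node, together with failover settings in repmgr.conf. A hedged sketch of the extra lines (parameter names come from the repmgr documentation; paths and values here are illustrative for this PostgreSQL 14 layout):

```
failover=automatic
promote_command='/usr/pgsql-14/bin/repmgr standby promote -f /etc/repmgr/14/repmgr.conf --log-to-file'
follow_command='/usr/pgsql-14/bin/repmgr standby follow -f /etc/repmgr/14/repmgr.conf --log-to-file --upstream-node-id=%n'
```

The witness itself would be registered with "repmgr witness register" against the primary; that part is beyond the scope of this post.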


Performance problems on a ZyWALL USG

Wed, 2022-08-10 16:06
ZyWALL USG 50

Are you experiencing performance problems with your Zyxel ZyWALL USG firewall? You will find in this blog an example of a real case I faced and how I solved it.

Introduction

From one day to the next, network performance deteriorated. Users first reported sporadic access problems, which then became permanent. The performance problems resulted in very slow access to a remote system. As shown below, pings sometimes took more than a hundred milliseconds.

Pinging 8.8.8.8 with 32 bytes of data
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=63 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=562 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=519 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=211 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=31 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=617 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=100 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=19 ms TTL=116
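
Incidentally, the jitter in such a capture can be quantified with a small awk one-liner (the reply times below are copied from the output above):

```shell
# Compute min/avg/max RTT from the captured reply times (ms)
printf '%s\n' 12 63 12 562 519 211 31 617 100 19 | awk '
  { sum += $1; if (min == "" || $1 < min) min = $1; if ($1 > max) max = $1 }
  END { printf "min/avg/max = %d/%.1f/%d ms\n", min, sum/NR, max }'
# -> min/avg/max = 12/214.6/617 ms
```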

Thanks to UDITIS, we were able to quickly isolate the problem and understand that it came from the firewall. In this case it was a ZyWALL USG 50 firewall.

Logging into the ZyWALL USG GUI, we quickly saw that the processor resources were being used at over 90%. This had an impact on packet processing.

Processor consumption at nearly 100%

Problem analysis

Unfortunately the ZyWALL graphical interface does not give more details about resource usage and does not let you see what the processor or memory resources are used for. To analyze the problem, it is therefore necessary to connect to the ZyWALL via SSH.

As explained on the Zyxel support webpage, we can use the command line “debug system show cpu status” to see the CPU usage details. For instance, if CPU time had been spent on “softirq”, it would have meant that the CPU was busy with traffic load. In our case the CPU is occupied by system tasks (currently 51%) and by user processes (currently 47%).

login as: admin
Using keyboard-interactive authentication.
Password:
Bad terminal type: "xterm". Will assume vt100.

Router> debug system show cpu status
CPU utilization: 99 % (system: 51 %, user: 47 %, irq: 0 %, softirq 1 %)
CPU utilization (1 minute): 98 % (system: 48 %, user: 43 %, irq: 4 %, softirq 3 %)
CPU utilization (5 minute): 98 % (system: 47 %, user: 44 %, irq: 4 %, softirq 3 %)

To understand why the CPU is used by the system, we can use the command “debug system ps“. This command lists the processes and their resource usage, as shown in the extract below.

Router> debug system ps
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0    208   100 ?        Ss   Aug05   0:02 ini
root         2  0.0  0.0      0     0 ?        S<   Aug05   1:01 [kthreadd]
root       816  0.3  0.5   8464  1240 ?        Ss   Aug05   5:20 /usr/sbin/zylogd
root       820  0.0  0.1   1104   444 ?        Ss   Aug05   0:05 /usr/sbin/syslog-ng -f /var/zyxel/syslog-ng/syslog-ng.conf
root      1527  0.0  0.1   3760   404 ?        S    Aug05   0:00 /usr/sbin/pcap_monitor
root      1533  0.0  0.1   8920   340 ?        Ss   Aug05   0:00 /usr/sbin/vsd
root      1536  0.0  0.1   2588   276 ?        S<   Aug05   0:00 /bin/zyshd_wd
root      1540  0.3  4.8  54304 11708 ?        S    Aug05   5:00 /bin/zyshd
root      1696  0.0  0.2   7068   664 ?        Ss   Aug05   0:00 /usr/sbin/xinetd -filelog /tmp/xinetd.log -stayalive -reuse -pidfile /var/run/xinetd.pid
root      1807  0.0  1.0  50108  2456 ?        Ss   Aug05   0:00 /usr/sbin/radiusd -d /var/zyxel/raddb
root      2297  0.0  0.1   2704   408 ?        S<   Aug05   0:00 /sbin/resd
root      2303  0.0  0.8  35436  1944 ?        Ss   Aug05   0:04 /usr/sbin/contfltd
root      2382  0.0  0.0   2572   196 ?        Ss   Aug05   0:00 /sbin/lavd
root      2383  0.0  1.1  15836  2768 ?        S<   Aug05   0:00 /sbin/decomp_server
root      2397  0.0  0.2  16212   632 ?        Ss   Aug05   0:00 /usr/sbin/fauthd
root      2412  0.0  0.1   2772   320 ?        S<   Aug05   0:00 /sbin/wdtd
...
...
nobody   12843  0.0  1.5  41860  3804 ?        S    Aug05   0:00 /usr/local/apache/bin/httpd -f /usr/local/zyxel-gui/httpd.conf -k start -DSSL
nobody   12851  0.0  1.5  41860  3688 ?        S    Aug05   0:00 /usr/local/apache/bin/httpd -f /usr/local/zyxel-gui/httpd.conf -k start -DSSL
nobody   12852  0.0  1.6  44004  4108 ?        S    Aug05   0:00 /usr/local/apache/bin/httpd -f /usr/local/zyxel-gui/httpd.conf -k start -DSSL
root     20680 91.9  1.2  26976  3076 ?        R    10:18   4:31 /usr/local/sbin/snmpd udp:161,udp6:161 -c /var/zyxel/snmpd.conf -p /var/run/snmpd.pid
...
...

As you can see, the SNMP process is using 91.9% of the CPU. SNMP monitoring can be used to collect information from your USG; if you do not need it, you can simply deactivate it through the graphical interface. Deactivating it is good practice on the USG when the feature is not in use. To deactivate SNMP, go to the Configuration section, then System > SNMP as shown below:
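
The ZyWALL's debug shell cannot sort this list for you, but for comparison, on a regular Linux host the same hunt for the top CPU consumer is a one-liner (GNU userland assumed):

```shell
# List the 5 processes with the highest %CPU (column 3 of "ps aux"), highest first
ps aux | sort -rn -k3 | head -5
```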

How to deactivate SNMP

Once SNMP is deactivated, processor usage is freed up.

Router> debug system show cpu status
CPU utilization: 12 % (system: 5 %, user: 2 %, irq: 3 %, softirq 2 %)
CPU utilization (1 minute): 78 % (system: 39 %, user: 33 %, irq: 3 %, softirq 3 %)
CPU utilization (5 minute): 92 % (system: 45 %, user: 41 %, irq: 3 %, softirq 3 %)

The pings are now much more stable and there are no more spikes as we saw at the beginning.

Pinging 8.8.8.8 with 32 bytes of data
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=11 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=11 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
...
...
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Reply from 8.8.8.8: Bytes=32 time=12 ms TTL=116
Conclusion

In the event of a performance problem on a ZyWALL USG, the command-line tools provided by Zyxel allow you to understand what is monopolizing the resources. Once you have identified the culprit, in our case the SNMP process, you can either stop it or dig further into the debugging process.

L’article Performance problems on a ZyWALL USG est apparu en premier sur dbi Blog.

SQL Server Deadlock on UPDATE with Serializable isolation level

Wed, 2022-08-10 00:00

Recently I spent some time on a recurring Deadlock problem on a customer’s site in a high concurrency environment. It was only after finding the solution on my own that I discovered that this particular problem is already well documented if you search for the right keyword. ¯\_(ツ)_/¯

Anyway, in this blog, I will give feedback with some useful information about this specific Deadlock case.

The Deadlock Deadlock Graph

Here is a Deadlock graph similar to the real one the customer had in Production:

Deadlock Graph

Looking at the XML properties we can notice the isolation level of both transactions is Serializable and that it’s the same object (stored procedure) causing the deadlock.

Stored Procedure

Here is what the stored procedure looks like.

CREATE PROCEDURE dbo.[bigTransactionHistory]
	@TransactionID		bigint
	, @ProductID		int
	, @TransactionDate	datetime
	, @Quantity			int
	, @ActualCost		money
AS
BEGIN
	SET NOCOUNT ON;

	SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

	BEGIN TRAN
		IF EXISTS(SELECT 1 FROM [bigTransactionHistory] WHERE TransactionID = @TransactionID)
		BEGIN
			UPDATE dbo.[bigTransactionHistory]
			SET ProductID = @ProductID
				, TransactionDate =	@TransactionDate
				, Quantity =		@Quantity
				, ActualCost =		@ActualCost
			WHERE TransactionID = @TransactionID
		END
		ELSE
		BEGIN
			INSERT INTO [bigTransactionHistory](
				[TransactionID]
				, [ProductID]
				, [TransactionDate]
				, [Quantity]
				, [ActualCost])
			VALUES (
				@TransactionID
				, @ProductID
				, @TransactionDate
				, @Quantity
				, @ActualCost
			)
		END
	COMMIT
END

The stored procedure is doing an UPDATE or an INSERT based on the existence of the row inside the table.
The existence check is performed with the “IF EXISTS(SELECT…)”.

Serializable isolation level

The documentation mentions the following behavior for this level of isolation:

No other transactions can modify data that has been read by the current transaction until the current transaction completes.

Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.

It’s the “IF EXISTS” which reads the row and holds a RangeS-S lock for the duration of the transaction.
If the Stored Procedure is run at the same time for the same key in two concurrent sessions the deadlock can occur.

StoredProcedureAndLocks

RangeS-S and RangeX-X are incompatible locks.

Range Locks compatibility matrix
Solution

To solve this deadlock I changed the stored procedure, removing the IF EXISTS(SELECT..).

	SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

	BEGIN TRAN
			UPDATE dbo.[bigTransactionHistory]
			SET ProductID = @ProductID
				, TransactionDate =	@TransactionDate
				, Quantity =		@Quantity
				, ActualCost =		@ActualCost
			WHERE TransactionID = @TransactionID
		IF @@ROWCOUNT = 0
		BEGIN
			INSERT INTO [bigTransactionHistory](
				[TransactionID]
				, [ProductID]
				, [TransactionDate]
				, [Quantity]
				, [ActualCost])
			VALUES (
				@TransactionID
				, @ProductID
				, @TransactionDate
				, @Quantity
				, @ActualCost
			)
		END
	COMMIT

The UPDATE is always performed. If no row was updated (checked with @@ROWCOUNT), a new row is inserted, removing the need for the initial RangeS-S lock and any deadlock possibility.

UPSERT

This type of scenario is a classic and is called UPSERT.
You can find a great article about this from Michael J. Swart: SQL Server UPSERT Patterns and Antipatterns

UPSERT scenario, from Michael J. Swart

Interesting fact, some SQL Database Management Systems like CockroachDB implement an UPSERT statement.

This UPSERT article adds an UPDLOCK hint to the UPDATE statement.
The idea is to protect against a potential conversion deadlock. I have not encountered this type of deadlock, but adding UPDLOCK is a good way to prevent any deadlock (and the associated retry logic). It forces other transactions to wait for the requested U lock.
The hint should be added to the UPDATE statement like this:

UPDATE dbo.[bigTransactionHistory] WITH (UPDLOCK)
SET ProductID = @ProductID
	, TransactionDate =	@TransactionDate
	, Quantity =		@Quantity
	, ActualCost =		@ActualCost
WHERE TransactionID = @TransactionID
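
As a further variant (my sketch, not part of the original fix): the whole upsert can also be expressed as a single MERGE statement with a HOLDLOCK hint, which takes the key-range lock once instead of in two steps. Note that MERGE has its own concurrency caveats, so test it carefully before relying on it:

```sql
MERGE dbo.bigTransactionHistory WITH (HOLDLOCK) AS target
USING (SELECT @TransactionID AS TransactionID) AS source
      ON target.TransactionID = source.TransactionID
WHEN MATCHED THEN
    UPDATE SET ProductID = @ProductID,
               TransactionDate = @TransactionDate,
               Quantity = @Quantity,
               ActualCost = @ActualCost
WHEN NOT MATCHED THEN
    INSERT (TransactionID, ProductID, TransactionDate, Quantity, ActualCost)
    VALUES (@TransactionID, @ProductID, @TransactionDate, @Quantity, @ActualCost);
```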
Conclusion

The UPSERT pattern is very common but is not trivial to write correctly in SQL. I hope this can help you understand and solve this kind of deadlock.

L’article SQL Server Deadlock on UPDATE with Serializable isolation level est apparu en premier sur dbi Blog.

SQL Server Security: Check if the guest is active on all user-databases through the CMS

Mon, 2022-08-08 08:30

Today, a customer asked me to have a look at each SQL Server instance to check whether the guest database user is active. The easiest way is to go through the CMS (Central Management Servers) with a query, but which query…

The CIS (Center for Internet Security) provides guidance to secure SQL Server.

One of the points is to ensure that the CONNECT permission for the GUEST user is revoked in all SQL Server databases, excluding master, msdb and tempdb.

CIS provides a query to verify the status of this permission via sys.database_permissions.

I suggest using a simpler query based on the hasdbaccess value from the system view sys.sysusers.

The result is the same, because hasdbaccess is 1 if the user has the CONNECT permission and 0 if not. I find it more readable this way.

I use sp_MSforeachdb to loop through all databases and a temporary table to hold the result, which I then simply filter on hasdbaccess.

Create table #temp
(
	database_name sysname,
	name sysname,
	hasdbaccess int
)

insert into #Temp
exec sp_MSforeachdb N'select ''[?]'' as database_name, name, hasdbaccess from [?].sys.sysusers WHERE name = ''guest'''

SELECT * from #temp where hasdbaccess=1

SELECT * from #temp where hasdbaccess=0

drop table #temp

Example:

The most important thing is that model and every user database show 0.

Leave the system databases master, tempdb and msdb with the CONNECT permission granted to the guest user. Revoking it from master or msdb can cause issues.
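
For completeness, if the check reports hasdbaccess = 1 on a user database, the remediation is a simple REVOKE executed in that database (sketch; the database name below is a placeholder to replace with yours):

```sql
USE [YourUserDatabase];   -- hypothetical database name
REVOKE CONNECT FROM GUEST;
```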

L’article SQL Server Security: Check if the guest is active on all user-databases through the CMS est apparu en premier sur dbi Blog.

Running Intel x86-64 VMs with an Oracle DB on Apple Silicon (ARM)

Fri, 2022-08-05 17:28

For testing purposes many IT people traditionally use virtual machines (VMs) on their laptops (often on Oracle Virtualbox). Since Apple moved to its own processor type (Apple Silicon, i.e. ARM architecture), it is no longer possible to run VMs based on Intel x86-64 on Apple ARM (e.g. on Apple MacBook Pros with M1 or M2 processors). As the Oracle database does not run on ARM, a workaround is necessary, e.g. for a consultant with an Apple laptop who wants to run tests against Oracle databases on that laptop.

REMARK: Oracle has announced that it will port the DB software to ARM in the future. See e.g. this video at 26:40 – 27:22.

What are the alternatives?

  • running the DB-server VM in the cloud (requires access to the cloud resources over the internet)
  • running the DB-server on separate hardware (requires access to separate hardware. So you need to carry the hardware with you or provide remote access to it, e.g. over VPN)

Another potential alternative would be the emulation of x86-64 on ARM. I.e. on Mac that means running QEMU. The product UTM makes that easy as it adds a GUI on top of QEMU.

According to lots of feedback on the internet, the emulation is VERY SLOW. I wanted to check for myself what “very slow” means in terms of running an Oracle database in an emulated environment.

Here’s the environment I’m in:

  • MacBook Air (M1, 2020), 8GB RAM
  • UTM version 3.2.4 from Turing Software LLC (can be downloaded from the App Store for $10)

For testing purposes I created 2 VMs:

  • Ubuntu Linux with Oracle XE on Docker
  • Oracle Linux 8.x with Oracle EE 19.16.

First of all, the good news: I could install and run the two environments without any crash. Everything runs very stably as long as the UTM system setting “Force Multicore” is not enabled; in that case UTM can only use one core of the Mac, but it is rock solid. With “Force Multicore” enabled, the VM can use more than one core of the Mac. In my case that was fast, but running Java in the VM no longer worked. I then had to disable cores to make Java work (see the summary below for how I did that).

Now to the main question: What is the performance of the DB-server running in the emulated UTM-VM?

It takes considerably more time to install the environment and it of course feels much slower than running a type 2 hypervisor (like Virtualbox on native x86-64). But I wanted to know exact numbers, so I did a couple of tests:

1.) Running datapatch against a CDB with 1 PDB (part of dbca after installing 19.16. software only):

This test was performed with “Force Multicore” disabled.

1 Core for the VM on the emulated environment:

  • Patch CDB: 2h, 37min, 23sec
  • Patch PDB$SEED and PDB1: 1h, 41min, 30sec

I ran the same on Virtualbox on a MacBook Pro with a 2.6 GHz Core i7:

2 Cores for VM on type 2 hypervisor on native x86_64:

  • Patch CDB: 8min, 15sec
  • Patch PDB$SEED and PDB1: 5min, 32sec

Result: The emulated VM with half the cores is approximately a factor of 18 slower than the native machine.

REMARK: All tests performed for this Blog were done on those 2 machines (MacBook Air 2020 M1 with emulated UTM-VM versus MacBook Pro 2019 Intel with Virtualbox).

2.) A pure DB-CPU-Load test

To put CPU load on a DB server I usually run the following cpu_load.sql script in sqlplus:

set lines 120
set timing on time on
with t as ( select rownum from dual connect by level <= 60)
select /*+ ALL_ROWS */ count(*) from t,t,t,t,t
/

I.e. the dedicated server process runs at 100% CPU for a couple of seconds. This is a single-thread performance test and is not related to the number of active cores, as the test runs on one core only. The results were:

UTM (emulated): 3min, 33sec
Virtualbox (native): 15.5 secs

–> The emulated VM is roughly 13.7 times slower

3.) CPUSPEED in the Oracle system statistics

The CPUSPEED value of the system stats in aux_stats$ shows 342 for the emulated machine (342 million instructions per second). I gathered system statistics while running a Swingbench test:

The performance cores of the M1 run between 600 – 3204 MHz. I.e. the emulated environment is approximately 3204/342 = 9.4 times slower from an Oracle perspective.

4.) Swingbench

I loaded the Swingbench data with the oewizard, leaving everything at its defaults except the load factor and the parallelism. I took a load factor of 0.7 and a parallelism of 2. The total runtime for the load was 17min 52sec and it inserted 14’569 rows/sec:

Doing the same against Virtualbox on a native Intel environment looked as follows:

I.e. 14’569 rows inserted per second on the emulated environment compared to 75’966 rows per second on Virtualbox (factor 5.2). But as I ran with a parallelism of only 2, I probably did not use the available resources effectively.

Running charbench for 1 min on the emulated environment with 8 users (8 sessions brought the best TPS rate) showed approx. 55 TPS (on the VM configured with 2 cores and “Force Multicore”):

On the native Intel environment, also configured with 2 cores, I could achieve 770 TPS with 55 sessions:

So that’s a factor of 14 higher TPS on a 2 cores VM.

Summary: It’s possible to run an Oracle DB in a VM on an Apple Silicon machine, but you have to use hardware emulation, which makes the processing on the DB slower by a factor of 10 – 14 compared to a Type 2 hypervisor (e.g. Virtualbox). This is OK for doing functional tests, but is usually too slow for heavy testcases or regular installations or upgrades. For best performance enable “Force Multicore”, but then Java did not work anymore in my tests (e.g. a command “$ORACLE_HOME/OPatch/opatch lspatches” did not finish). To workaround the Java issue I disabled all cores except 1, ran the Java program and enabled all cores again. E.g. on a VM with 2 cores enabled (cpu0 and cpu1):

Disable cpu1 as root temporarily:

# echo 0 > /sys/devices/system/cpu/cpu1/online

Run your Java program:

[oracle@utm-ora8 ~] $ORACLE_HOME/OPatch/opatch lspatches

Enable cpu1 as root again:

# echo 1 > /sys/devices/system/cpu/cpu1/online
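
The on/off toggling can be wrapped in a small helper script (my sketch; the name single_core_run.sh is hypothetical, and it assumes it is run as root inside the VM with non-boot cores named cpu1, cpu2, …):

```shell
#!/bin/sh
# single_core_run.sh (hypothetical): run a command with only cpu0 online.
# The trap brings all cores back online on exit, even if the command fails.
trap 'for c in /sys/devices/system/cpu/cpu[1-9]*; do echo 1 > "$c/online"; done' EXIT
for c in /sys/devices/system/cpu/cpu[1-9]*; do echo 0 > "$c/online"; done
"$@"
```

E.g.: ./single_core_run.sh su - oracle -c "$ORACLE_HOME/OPatch/opatch lspatches"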

The DB itself was not negatively affected during my tests with “Force Multicore” enabled. I recommend enabling “Force Multicore” as it makes a huge difference:

REMARK: After writing this blog I found another potential way to run an Oracle DB on ARM. This video shows the possibility of running Windows for ARM in Parallels (it could probably run on UTM as well). You can install “Oracle for Windows x86-64” on Windows ARM; the OS detects the x86-64 code and emulates x86-64 automatically.

L’article Running Intel x86-64 VMs with an Oracle DB on Apple Silicon (ARM) est apparu en premier sur dbi Blog.

OCI connected to your personal network – quick&easy example with open source firewall ‘IPFire’

Fri, 2022-08-05 09:04
Introduction & why attaching your personal network to OCI (Oracle Cloud Infrastructure)

You want to use your existing tools and infrastructure, but you need high internet performance for your Oracle environment – the combination of on-premises and OCI may make sense for you. Or if you simply want to explore what OCI can offer your business, you can gain some experience with a mix of both worlds.

Maybe you are interested in Oracle Cloud Infrastructure but you don’t want to move all of your IT to the cloud? Or you have some tools or data in your on-premises IT you want to keep using? If you don’t plan to move terabytes of data, you can do a quick and easy test within a short time with the open source firewall from IPFire (https://www.ipfire.org) to connect your infrastructure to OCI via VPN. If you have an existing IPFire firewall connected to the internet (in my case without NAT), configuring the two required IPSec tunnels takes less than 5 minutes.

Why open source ‘IPFire’ Firewall?

I’m using IPFire-firewall since years without problems and even if you’re not very experienced command-line user you can install, configure and maintain the firewall by GUI easily too.

In addition, regarding our goal of connecting to OCI, the IPSec setup is also very simple to configure via the GUI.

Example configuration overview

CPE (Customer-Premises Equipment) is nothing other than your personal IT entry point. So please don’t use a public IP address beginning with something like 94.16.xxx.yyy unless you want to run into trouble with my provider Quickline (unless you happen to have the same provider; only then would it look similar).

/!\ Please check first which IP address is assigned by your ISP (Internet Service Provider) before you begin to set up the CPE in OCI. The OCI CPE is configured together with your personal firewall.

IPSec has to be defined on both sides – in OCI and in your personal network (in my example, in the IPFire firewall’s IPSec configuration).

The DRG “Dynamic Routing Gateway” coordinates routing between your on-premises network and the VCNs/subnets in OCI.

VCN “Virtual Cloud Network” is the network you attach other components to and where you can define subnets, routing tables, security lists.

Don’t forget to attach the Routing Table (RT) and Security List (SL) definitions to your subnets later; I was wondering why it didn’t work on my first attempt.

VCN Subnet – here you place your compute instances and apply your SL (Security List ~firewall rules) and RT (Routing Table).

Prerequisites

On OCI side

  • your OCI tenancy
  • VCN (Virtual Cloud Network)
  • VCN subnet
  • CPE (Customer-Premises Equipment)
  • IPSec connection
  • 2x IPSec tunnel
  • Dynamic Routing Gateway (DRG)
  • Security List (SL)
  • Routing Table (RT)
  • At least 1 instance

On customer side

  • IPFire firewall
  • Your external IP-Address
  • 1 LINUX OS instance on your premise IT

Optional

  • DDNS for your personal network
  • DDNS  domain name or your own DNS
Setup OCI

VCN
  • Name: myVCN
  • CIDR: 10.0.5.0/24

VCN subnet
  • Name: mySubnet
  • CIDR: 10.0.5.0/24
  • RT: RouteTableMySubnet
      Destination: 192.168.0.0/24 [customer’s internal CIDR]
      Target Type: Dynamic Routing Gateway
      Target: myDRG
  • DRG attachment: myDRG_Attachment_Subnet
      Attachment Name: myDRG_Attachment_mySubnet
      Lifecycle State: Attached
      DRG: myDRG
      VCN RT: –
      Cross-Tenancy: No
  • SL: SecurityListMySubnet
      Ingress Rule (1):
        Stateless: No
        Source: 192.168.0.0/24 [customer’s internal CIDR]
        IP Protocol: TCP
        Source Port Range: All
        Destination Port Range: 22
        Type & Code: –
        Allows: TCP traffic for port 22 (SSH Remote Login Protocol)
        Description: myCPE(ingress)

DRG
  • Name: myDRG
  • Lifecycle State: Available (when at least one tunnel is up)
  • Oracle Redundancy: Redundant (when all tunnels are up and running)
  • VCN attachments (2):
      (1/2) Attachment Name: DRG_Attachment_for_IPSec_Tunnel: myIPSecTunnel1
        Lifecycle State: Attached
        IPSec Tunnel: myIPSecTunnel1
        DRG RT: Autogenerated DRG RT for RPC, VC, and IPSec attachments
        CPE: myCPE
        CPE IKE Id: test.myddns.com (if you changed the IPSec ‘IP-Connection’ to an FQDN)
      (2/2) Attachment Name: DRG_Attachment_for_IPSec_Tunnel: myIPSecTunnel2
        Lifecycle State: Attached
        IPSec Tunnel: myIPSecTunnel2
        DRG RT: Autogenerated DRG RT for RPC, VC, and IPSec attachments
        CPE: myCPE
        CPE IKE Id: test.myddns.com (if you changed the IPSec ‘IP-Connection’ to an FQDN)

Shortly after the setup I saw all ‘ok/status green’, but after some minutes one tunnel went down. Don’t worry: if the active tunnel has problems, the other tunnel becomes active.

CPE
  • Name: myCPE
  • Public IP: 94.16.100.100 (you must replace this with your own public IP address!)

The CPE accepts an IP address only, and that is the reason why you have to recreate it every day if you use a private, dynamic IP address.

In addition – the whole IPSec configuration depends on this. You then have to recreate the IPSec connection and both tunnels too, even though you could keep the same parameters with your own DNS/URL.

IPSec
  • Name: myIPSec
  • Lifecycle State: Available
  • DRG: myDRG

/!\ This is the point where you can change from an IP address to an FQDN if you have your own DDNS or your own domain.

Tunnel
  Tunnel 1:
  • Name: myIPSecTunnel1
  • Lifecycle State: Available (when the configuration succeeded)
  • IPSec Status: Up (when the configuration succeeded)
  • IPv4 BGP Status: –
  • IPv6 BGP Status: –
  • Oracle VPN IP: 111.111.111.111 (the address is provided when created)
  • Routing Type: Static Routing

  Tunnel 2:
  • Name: myIPSecTunnel2
  • Lifecycle State: Available (when the configuration succeeded)
  • IPSec Status: Up (when the configuration succeeded)
  • IPv4 BGP Status: –
  • IPv6 BGP Status: –
  • Oracle VPN IP: 222.222.222.222 (the address is provided when created)
  • Routing Type: Static Routing

Here are the OCI screenshots. I omitted the second tunnel as it is configured the same way.

Security List (SL)
  • SL: SecurityListMySubnet
  • Ingress Rule (1):
      Stateless: No
      Source: 192.168.0.0/24 [customer’s internal CIDR]
      IP Protocol: TCP
      Source Port Range: All
      Destination Port Range: 22
      Type & Code: –
      Allows: TCP traffic for port 22 (SSH Remote Login Protocol)
      Description: myCPE(ingress)

Note that I used (a copy of) the default security list instead of creating the above-mentioned ‘SecurityListMySubnet’.

Routing Table (RT)
  • RT: RouteTableMySubnet
  • Destination: 192.168.0.0/24 [customer’s internal CIDR]
  • Target Type: Dynamic Routing Gateway
  • Target: myDRG

Note that I used (a copy of) the default route table instead of creating the above-mentioned ‘RouteTableMySubnet’.

Instance

Create an instance in your OCI VCN-subnet

Setup IPFire firewall

On the ‘Main page’ you will find your external IP if you connect your IPFire firewall directly to the internet. Otherwise you have to do the instructions provided by Oracle for NAT-configuration.

In this example it is the 94.16.100.100 IP-address which you have to replace with your external IP.

Main Page IPSec

Creating your Certificate Authorities and keys is self-explanatory; I filled my Host Certificate CN with my FQDN.

Don’t worry if one tunnel is displayed ‘down’ after some time. OCI and the firewall are automatically selecting one tunnel and take offline the other one if not configured with BGP.

Tunnel 1

The parameters for IPSec tunnel 1 on the customer side look like this:

It is very important to fill ‘Local ID’ with ‘@<your FQDN>’. You won’t get both tunnels running in parallel if you don’t use the same naming as in the OCI IPSec definition.

Tunnel 2

Same as tunnel 1: it is very important to fill ‘Local ID’ with ‘@<your FQDN>’. You won’t get both tunnels running in parallel if you don’t use the same naming as in the OCI IPSec definition.

IPSec Tunnel Advanced

For the advanced IPSec features I set a custom configuration on the OCI side and on my (customer) side, but it ran with the defaults too. The advanced settings are identical for tunnel 1 and tunnel 2.

/!\ Don’t forget to set your IPFire firewall to allow traffic to both tunnels and drop traffic from tunnels if you don’t want that someone/something from OCI can access your private network.

By default the traffic is allowed in both directions.

Your personal firewall

Last but not least – don’t forget to block traffic from OCI to your personal network if you don’t want the ‘whole world’ in your personal network.

IPFire can handle the tunnels very easily. Just allow traffic from your network to the tunnels and block (drop) traffic from the tunnels to your network.

Faced issues

Most of the configuration was straightforward and the IPSec tunnels showed up as working after a short time, but …

One single IPSec-tunnel ‘up’ works but the IPSec-tunnels don’t run in parallel

The most time-consuming issue was this: even though each IPSec tunnel was connected and ‘up’ to OCI, no traffic got through.

As a first try, just disable one of the IPSec tunnels. My configuration worked with one tunnel up and the other down, and it didn’t matter which one was up and which was down. I just had to avoid using both in parallel.

What solved the issue?

The problem was gone when I used the FQDN on the OCI side and entered that FQDN in IPFire in the tunnel settings in the ‘Local ID’ field (preceded by an ‘@’ character).

What was the behavior then? With the FQDN in place, OCI and my firewall negotiated between themselves which tunnel is used. One tunnel is up and one is down; if the active tunnel goes down, the other one comes up automatically.

Remark from Oracle:

If your CPE supports having two IPSec tunnels up/active to the same destination, configure the second tunnel to also be up/active. Oracle recommends configuring both tunnels to use BGP dynamic routing.

All other issues were mainly about ensuring the correct order of collecting and entering the data. If you use OCI’s ‘Wizard’ you are on the safe side already.

Conclusion

For a ‘one day valid test’ (if you are using dynamic IP-address) it takes 3min reconfiguring of your environment every day after you’ve received a new external IP-address to do testing with OCI plus on premises IT with no financial impact. What you need is your curiosity and time setting things up initially.

L’article OCI connected to your personal network – quick&easy example with open source firewall ‘IPFire’ est apparu en premier sur dbi Blog.

Run 2 specific GitLab Runners in local containers for CI/CD Pipeline

Fri, 2022-08-05 01:21

While testing GitLab CI/CD pipelines on my free SaaS GitLab account, I explored setting up specific GitLab Runners in containers on my local machine. For those who don’t want to use GitLab’s shared runners (it’s free but you still have to share your credit card details with GitLab!) or who just want to do some tests and keep full control of the Runners, here is a quick setup that should cover several test scenarios.

GitLab gives all the details for creating a Runner in a container here. If you are in a hurry (who isn’t!) and want a quick procedure to follow, or if you want to create a second Runner in order to test jobs run by two different Runners, read on.

First Runner in a local container

Let’s hit our terminal straight away to run our first gitlab-runner container:

enb@DBI-LT-ENB project1 % docker run -d --name gitlab-runner --restart always \
  -v /Users/Shared/gitlab-runner/config:/etc/gitlab-runner \
  gitlab/gitlab-runner:latest

As I’m on MacOS, I’m using /Users/Shared instead of /srv as per GitLab documentation. This volume mounted will keep the Runner configuration persistent after a container restart. I didn’t use the option -v /var/run/docker.sock:/var/run/docker.sock as I’m going to use Shell Executor.

Now let’s register our Runner:

enb@DBI-LT-ENB project1 % docker run --rm -it -v /Users/Shared/gitlab-runner/config:/etc/gitlab-runner gitlab/gitlab-runner register

Provide all the required registration information (you could alternatively have passed all those parameters as options of the register command above):

Runtime platform                                    arch=arm64 os=linux pid=7 revision=32fc1585 version=15.2.1
Running in system-mode.

Enter the GitLab instance URL (for example, https://gitlab.com/):
https://gitlab.com/
Enter the registration token:
GZ2358921LfDn-pbKiaL6exvSk0B2
Enter a description for the runner:
[21e5bba348f9]: gitlab-runner1
Enter tags for the runner (comma-separated):
r1
Enter optional maintenance note for the runner:

Registering runner... succeeded                     runner=GZ2358921LfDn-pb
Enter an executor: custom, docker-ssh, parallels, docker-ssh+machine, kubernetes, docker, shell, ssh, virtualbox, docker+machine:
shell
Runner registered successfully. Feel free to start it, but if it's running already the config should be automatically reloaded!

Configuration (with the authentication token) was saved in "/etc/gitlab-runner/config.toml"

The GitLab instance URL and registration token are shown in your GitLab project (Settings -> CI/CD -> Runners). Enter a unique tag (or tags) for this Runner, as this will be used to identify it in your CI/CD pipeline (by default, untagged jobs are not run). This is also where I define that this Runner uses the Shell executor. Note that this configuration is saved in the volume mounted in the previous step.
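
For reference, the non-interactive variant mentioned above would look roughly like this (same values as in the interactive session; the flags come from the gitlab-runner register options):

```shell
docker run --rm -v /Users/Shared/gitlab-runner/config:/etc/gitlab-runner \
  gitlab/gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --registration-token "GZ2358921LfDn-pbKiaL6exvSk0B2" \
  --description "gitlab-runner1" \
  --tag-list "r1" \
  --executor "shell"
```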

Second Runner in another local container

Repeat the two commands above for the second Runner:

enb@DBI-LT-ENB project1 % docker run -d --name gitlab-runner2 --restart always \
  -v /Users/Shared/gitlab-runner/config2:/etc/gitlab-runner \
  gitlab/gitlab-runner:latest

Just change the name used for this Runner as well as the local folder for the volume mount.

enb@DBI-LT-ENB project1 % docker run --rm -it -v /Users/Shared/gitlab-runner/config2:/etc/gitlab-runner gitlab/gitlab-runner register
Runtime platform                                    arch=arm64 os=linux pid=7 revision=32fc1585 version=15.2.1
Running in system-mode.

Enter the GitLab instance URL (for example, https://gitlab.com/):
https://gitlab.com/
Enter the registration token:
GZ2358921LfDn-pbKiaL6exvSk0B2
Enter a description for the runner:
[c29d03644a73]: gitlab-runner2
Enter tags for the runner (comma-separated):
r2
Enter optional maintenance note for the runner:

Registering runner... succeeded                     runner=GZ2358921LfDn-pb
Enter an executor: custom, shell, ssh, docker+machine, docker-ssh+machine, kubernetes, docker, docker-ssh, parallels, virtualbox:
shell
Runner registered successfully. Feel free to start it, but if it's running already the config should be automatically reloaded!

Configuration (with the authentication token) was saved in "/etc/gitlab-runner/config.toml"

We provide a different description for this Runner (the name used in GitLab to identify it) as well as a different tag (to be used in our pipeline).

Checking our setup

Both our Runner containers are up and running on my local machine:

enb@DBI-LT-ENB project1 % docker ps
CONTAINER ID   IMAGE                         COMMAND                  CREATED         STATUS         PORTS     NAMES
3ce44058ff1c   gitlab/gitlab-runner:latest   "/usr/bin/dumb-init …"   2 minutes ago   Up 2 minutes             gitlab-runner2
5d44c65d768e   gitlab/gitlab-runner:latest   "/usr/bin/dumb-init …"   4 minutes ago   Up 4 minutes             gitlab-runner

They have also been successfully registered in our GitLab project:

GitLab CI/CD Pipeline

The last step is to use both of those Runners in our CI/CD pipeline with the simple script below (use the editor of SaaS GitLab to edit the pipeline file .gitlab-ci.yml):

build1:
  tags:
    - r1
  stage: build
  script:
    - echo "Do your build here"

test:
  tags:
    - r2
  stage: test
  script:
    - echo "Do a test here"
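
The build and test stages don’t need to be declared here because they are part of GitLab’s default stages; if you introduce custom stage names, declare them explicitly at the top of the file, for example:

```yaml
# Only required for non-default stage names; build and test exist by default
stages:
  - build
  - test
```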

The “build1” job uses the Runner with the tag r1 and the “test” job the one with the tag r2. Let’s run it and see the results:

We can see that each job ran on its own Runner using the Shell Executor, as configured. You can now perform more advanced tests with your Runners and your GitLab project.

If you want to bring your Docker skills to the next level, check out our Training course given by our Docker Guru!

The article Run 2 specific GitLab Runners in local containers for CI/CD Pipeline first appeared on dbi Blog.

Documentum – RED Warning on D2 after enabling Tomcat HTTP Security Headers

Sun, 2022-07-31 04:34

Security is and will always be a very important aspect of IT. Enterprise Content Management (or Document Management Systems or Content Services Platform or whateveryoucallit) is obviously not an exception to that. One of our main goals as consultants is to improve our customers’ installations for stability, security, and ease of management. However, it happens that not everything goes as planned… In this blog, I will discuss a small but still annoying issue that appeared on D2 20.2 after enabling the Tomcat HTTP Security Headers at a customer.

D2 provides some capabilities to add some of the standard HTTP Security Headers via its own internal configuration. It already supports HTTP Strict Transport Security (HSTS), and inside the “settings.properties” file, you can define parameters like the HSTS max age (hsts.maxage=xxx). In this file, you can also configure D2 for anti-clickjacking (allowed.frame.origins=xxx). However, that doesn’t cover all the HTTP Security Headers, and you are probably using other applications besides D2 that are deployed on Tomcat in your company. Therefore, you might start looking at putting the configuration not on the application but on the Tomcat layer instead, to have a common, enterprise-wide configuration.

The Tomcat HTTP Security Headers (I’m only talking about the “httpHeaderSecurity” filter here) don’t cover everything, but they are still a good base, so it might make sense to configure the headers at that layer. From the list of “common” Security Headers, what is missing in the Tomcat filter is mainly Cache-Control, Content-Security-Policy/CSP (here is a blog related to the CSP configuration for D2 WSCTF, if needed) and Cross-Origin Resource Sharing/CORS, but all of these can be configured through other filters (either OOTB filters or custom ones that you need to deploy). By default, the Tomcat HTTP Security Headers are disabled (commented out):

[tomcat@d2-0 ~]$ web_xml="$TOMCAT_HOME/conf/web.xml"
[tomcat@d2-0 ~]$
[tomcat@d2-0 ~]$ grep -B2 -A4 'httpHeaderSecurity' ${web_xml}
<!--
    <filter>
        <filter-name>httpHeaderSecurity</filter-name>
        <filter-class>org.apache.catalina.filters.HttpHeaderSecurityFilter</filter-class>
        <async-supported>true</async-supported>
    </filter>
-->
--
<!--
    <filter-mapping>
        <filter-name>httpHeaderSecurity</filter-name>
        <url-pattern>/*</url-pattern>
        <dispatcher>REQUEST</dispatcher>
    </filter-mapping>
-->
[tomcat@d2-0 ~]$

To enable it, you just need to uncomment the two sections (remove ‘<!--’ before the XML tag and ‘-->’ after) and you can of course start filling it with what you need. Here is a possible configuration:

[tomcat@d2-0 ~]$ cat ${web_xml}
...

    <filter>
        <filter-name>httpHeaderSecurity</filter-name>
        <filter-class>org.apache.catalina.filters.HttpHeaderSecurityFilter</filter-class>
        <async-supported>true</async-supported>
        <init-param>
            <param-name>hstsEnabled</param-name>
            <param-value>true</param-value>
        </init-param>
        <init-param>
            <param-name>hstsMaxAgeSeconds</param-name>
            <param-value>63072000</param-value>
        </init-param>
        <init-param>
            <param-name>hstsIncludeSubDomains</param-name>
            <param-value>true</param-value>
        </init-param>
        <init-param>
            <param-name>hstsPreload</param-name>
            <param-value>false</param-value>
        </init-param>
        <init-param>
            <param-name>antiClickJackingEnabled</param-name>
            <param-value>true</param-value>
        </init-param>
        <init-param>
            <param-name>antiClickJackingOption</param-name>
            <param-value>SAMEORIGIN</param-value>
        </init-param>
        <init-param>
            <param-name>blockContentTypeSniffingEnabled</param-name>
            <param-value>true</param-value>
        </init-param>
        <init-param>
            <param-name>xssProtectionEnabled</param-name>
            <param-value>true</param-value>
        </init-param>
    </filter>

...

    <filter-mapping>
        <filter-name>httpHeaderSecurity</filter-name>
        <url-pattern>/*</url-pattern>
        <dispatcher>REQUEST</dispatcher>
    </filter-mapping>

...
[tomcat@d2-0 ~]$

The above is just an example, but that’s how you enable the Tomcat HTTP Security Headers. Now coming to the issue I wanted to talk about in this blog… With these new headers in place, all applications (D2, D2-Config, D2-Smartview, D2-REST, DA, …) appeared to be working fine and no issues were reported by the tests, until someone clicked on a link received by mail a few days earlier and got this beautiful screen:

Tomcat Red Warning

When that appeared, the following message was also printed on the Tomcat logs:

2022-03-03 13:56:52,342 UTC SEVERE [https-jsse-nio-8080-exec-27] org.apache.catalina.core.StandardWrapperValve.invoke Servlet.service() for servlet [jsp] in context with path [/D2] threw exception [Unable to add HTTP headers since response is already committed on entry to the HTTP header security Filter] with root cause
	javax.servlet.ServletException: Unable to add HTTP headers since response is already committed on entry to the HTTP header security Filter
		at org.apache.catalina.filters.HttpHeaderSecurityFilter.doFilter(HttpHeaderSecurityFilter.java:101)
		at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
		at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
		at com.emc.x3.portal.server.filters.HttpHeaderFilter.doFilter(HttpHeaderFilter.java:87)
		at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
		at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
		at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61)
		at org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
		at com.emc.x3.portal.server.filters.authc.X3SAMLHttpAuthenticationFilter.executeChain(X3SAMLHttpAuthenticationFilter.java:356)
		at org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
		at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
		at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
		at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449)
		at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365)
		at org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90)
		at org.apache.shiro.subject.support.SubjectCallable.call(SubjectCallable.java:83)
		at org.apache.shiro.subject.support.DelegatingSubject.execute(DelegatingSubject.java:387)
		at org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:362)
		at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
		at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:121)
		at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:133)
		at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
		at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
		at com.emc.x3.portal.server.filters.X3SessionTimeoutFilter.doFilter(X3SessionTimeoutFilter.java:40)
		at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
		at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
		at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:197)
		at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
		at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:540)
		at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)
		at org.apache.catalina.valves.StuckThreadDetectionValve.invoke(StuckThreadDetectionValve.java:206)
		at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
		at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:687)
		at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
		at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:357)
		at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:382)
		at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
		at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:895)
		at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1722)
		at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
		at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
		at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
		at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
		at java.base/java.lang.Thread.run(Thread.java:829)

As you can see, it seems to be linked to what was just configured (“HttpHeaderSecurityFilter” on the first line of the stack trace), and there are also quite a few “shiro” references as well as an “X3SAMLHttpAuthenticationFilter“. This indicates that the issue is most probably linked to the Single Sign-On in relation with the Tomcat HTTP Security Headers. Since Tomcat is unable to add the required HTTP Security Headers, it fails the request and prints the red warning page.

What happened is that the link in the mail contained URL parameters (i.e., the “query” part), so it was something like “https://dns/D2?docbase=xxx&locateId=xxx“. D2 itself was working properly, except when someone tried to directly open such a link without opening D2 first, which shouldn’t be a problem under normal circumstances… When that happened, the user was redirected to the Single Sign-On (AzureAD SAML2, configured in shiro.ini) and then back to D2, which was supposed to handle the user’s request but failed because of the above Tomcat error. In addition to the issue (and the nice red warning screen), the URL changed to “https://dns/D2” (i.e., the URL parameters were lost in the process).

However, clicking a second time on the link from the mail worked properly and displayed the document correctly (using the “locateId” parameter)! The easiest way to reproduce the issue was therefore to clear the cookies and access any URL with parameters. It appeared that the issue only shows up if there are no SAML cookies (SAMLNameID, SAMLSessionIndex). After disabling the SSO (i.e., renaming/removing the “shiro.ini” file and restarting Tomcat), the issue wasn’t reproducible anymore. This confirmed my assumption that there is a conflict between how D2 handles the SAML response and how Tomcat adds the HTTP Security Headers (“response is already committed“).

With that in mind, I opened the OpenText SR#5056234 to see if it was a D2 bug or something else. After some more investigation, I got the information from OpenText that this apparently happens because of the open-source third-party library that is used for SSO (i.e., Apache Shiro). Without Apache Shiro, the HTTP Security Headers are properly added/modified by the doFilter() function before the servlet action is done. However, with Apache Shiro (which chains additional filters), the filter execution order appears to change and, in this case, the response is already committed before Tomcat has a chance to add the necessary HTTP Security Headers, which triggers this error. As far as I know, this issue doesn’t happen with OTDS SSO, so I assume different libraries are in play there.

Since this issue can be seen as soon as the Tomcat HTTP Security Headers are enabled (even if you don’t specify any configuration for the filter), it makes it difficult to set them up at the Tomcat layer. D2 20.2 uses Apache Shiro 1.4.2 (Nov-2019) while the latest version, as of today, is 1.9.1 (Jun-2022). I didn’t check, but this issue might have been solved in a later version. Therefore, potential solutions could be to update the libraries (knowing that this could bring issues as well…) or simply to NOT configure the HTTP Security Headers on Tomcat, but instead put them one level above. If you are using Tomcat, there is a good chance that you have a front-end on top of it like Apache HTTPD or Nginx. The issue being related to Tomcat servlet handling, configuring/adding the HTTP Security Headers upfront will prevent the issue from appearing altogether, so that would most probably be the preferred solution.
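
If you go that way, a sketch with Apache httpd (mod_headers) in front of Tomcat could look like the fragment below; the values are examples only and should match your own security policy:

```apache
# Example only: set the security headers at the reverse-proxy layer
Header always set Strict-Transport-Security "max-age=63072000; includeSubDomains"
Header always set X-Frame-Options "SAMEORIGIN"
Header always set X-Content-Type-Options "nosniff"
```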

The article Documentum – RED Warning on D2 after enabling Tomcat HTTP Security Headers first appeared on dbi Blog.

A few words about the good to know MAX_IDLE_BLOCKER_TIME

Sat, 2022-07-30 04:42

The increasing popularity of tools like SQL Developer and Toad sometimes comes hand in hand with an increase in stuck sessions.
This is especially seen in cases where:

  • User A does a change (neither committed nor rolled back). Then, switches to another task, or even leaves for lunch.
  • Then, User B updates the same row and stays there, stuck for hours.

In such cases, DBAs can obviously kill the initial blocker session.
In the past, Resource Manager could be set up to fix these automatically. According to the Oracle 12.1 manual, the procedure DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE() has the two parameters below:

  • max_idle_time: indicates the maximum session idle time. Default is NULL, which means unlimited.
  • max_idle_blocker_time: maximum amount of time in seconds that a session can be idle while blocking another session’s acquisition of a resource.

Nowadays, in 21c (checked in 21.6.0.0.220419), we can achieve the same by setting a single init.ora/spfile parameter.
Actually, although often marked as a 21c new feature, this already works in 19c (checked in 19.10.0.0.210119).

Let’s have a closer look at this “good to know” parameter.

What’s on by default?
SQL> show parameter idle
NAME				     TYPE	 VALUE
------------------------------------ ----------- --------------------
max_idle_blocker_time		     integer	 0
max_idle_time			     integer	 0

By default, both parameters are set to 0, meaning there is no limit set.

According to the Oracle 19c documentation, we can set MAX_IDLE_BLOCKER_TIME to a number of minutes that a session holding needed resources can be idle before it is automatically terminated.

So, shall we set MAX_IDLE_TIME or MAX_IDLE_BLOCKER_TIME?
By giving a value to MAX_IDLE_TIME, we limit all idle sessions,
whereas by setting MAX_IDLE_BLOCKER_TIME, we only limit idle sessions that are blocking resources.

Setting MAX_IDLE_TIME can be an issue with connection pools:
in that case, we could continuously re-create sessions automatically terminated by this parameter.

Set MAX_IDLE_BLOCKER_TIME to 5 minutes
alter system set max_idle_blocker_time=5;

This parameter is modifiable at PDB-level.
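
Since the parameter is PDB-modifiable, a quick sketch of scoping it to a single PDB (using the PDB name of this demo) would be:

```sql
-- Set the limit for one PDB only (PDB name taken from this demo)
ALTER SESSION SET CONTAINER = PDB1T;
ALTER SYSTEM SET max_idle_blocker_time = 5;
```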

While connected to CDB$ROOT, we can easily double-check the change by querying the pdb_spfile$ system view:

select ps.db_uniq_name, ps.pdb_uid, p.name as pdb_name, ps.name, ps.value$
from pdb_spfile$ ps
 join v$pdbs p on ps.pdb_uid = p.con_uid
 order by 1, 2, 3;
 
 DB_UNIQ_NA    PDB_UID PDB_NAME		   NAME 		     VALUE$
---------- ---------- -------------------- ------------------------- ----------
*	   3067207640 PDB1T		   max_idle_blocker_time     5


In a 1st session, let’s now update the table (without commit or rollback)
SQL> select * from products;

	ID NAME
---------- ----------------------------------------
	 1 bread
	 2 chocolate

SQL> show auto
autocommit OFF

SQL> set time on
11:08:42 SQL> update products set name='cheese' where id=2;

1 row updated.

Here, let’s now leave everything as is (without commit or rollback)

In a 2nd session, updating the same row will hang
show auto
autocommit OFF
set time on

SQL> update products set name='bier' where id=2;

… Wait …

Optionally, in a 3rd session, show locks
SQL> set time on
11:10:02 SQL> @qinalocksess.sql

    SID Lock Type		       Lock Mode				Request Block Owner			       Table Name			 Wait s.
------- ------------------------------ ---------------------------------------- ------- ----- -------------------------------- -------------------------------- --------
    105 DML Lock (TM)		       ROW-X (SX)				      0     0 USER1			       PRODUCTS 			      75
     67 DML Lock (TM)		       ROW-X (SX)				      0     0 USER1			       PRODUCTS 			      35

2 rows selected.

.../...

11:12:33 SQL> @qinalocksess.sql

    SID Lock Type		       Lock Mode				Request Block Owner			       Table Name			 Wait s.
------- ------------------------------ ---------------------------------------- ------- ----- -------------------------------- -------------------------------- --------
    105 DML Lock (TM)		       ROW-X (SX)				      0     0 USER1			       PRODUCTS 			     230
     67 DML Lock (TM)		       ROW-X (SX)				      0     0 USER1			       PRODUCTS 			     190

2 rows selected.
11:14:35 SQL> @qinalocksess.sql

Thu Jul 28																							 page	 1
													 Sessions Blocking Other Sessions Report

    SID Lock Type		       Lock Mode				Request Block Owner			       Table Name			 Wait s.
------- ------------------------------ ---------------------------------------- ------- ----- -------------------------------- -------------------------------- --------
     67 DML Lock (TM)		       ROW-X (SX)				      0     0 USER1			       PRODUCTS 			      48

1 row selected.
Let’s now go back to session 1

After the 5 minutes, we can verify the 1st session has been automatically terminated with an ORA-03113.

11:08:52 SQL> /
update products set name='cheese' where id=2
       *
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 6451
Session ID: 105 Serial number: 36338

Usually, seeing ORA-03113 is not expected, as it often reveals some kind of Oracle bug (ORA-00600 or ORA-07445).
In the present case, it is nice to see such an error.

The update from session 2 now completes, as the initial row-level lock is gone
SQL> set time on
11:09:21 SQL> update products set name='bier' where id=2;

1 row updated.

11:13:58 SQL> commit;

Commit complete.

11:16:25 SQL>
Look at the leftover logs and trace files
adrci
adrci> set homepath diag/rdbms/cdb1t_site1/CDB1T
adrci> show alert -tail

The automatic session closure is written in black and white in the alert log:

2022-07-28 11:13:58.032000 +02:00
KILL SESSION for sid=(105, 36338):
  Reason = max_idle_blocker_time parameter, idle time = 5 mins, currently waiting on 'SQL*Net message from
  Mode = KILL HARD SAFE -/-/NO_REPLAY
  Requestor = PMON (orapid = 2, ospid = 2187, inst = 1)
  Owner = Process: USER (orapid = 64, ospid = 6451)
  User = oracle
  Program = sqlplus@l8-bsl.localdomain (TNS V1-V3)
  Result = ORA-0
  

If need be, we can get more details by looking at the trace files generated by the Diag background process:

oracle@l8-bsl:/u01/app/oracle/diag/rdbms/cdb1t_site1/CDB1T/trace/ [CDB1T(PDB1T)] ls -ltr
...
-rw-r-----. 1 oracle oinstall   37932 Jul 28 11:19 CDB1T_dia0_2241_base_1.trc
cat CDB1T_dia0_2241_base_1.trc
.....
*** 2022-07-28T11:10:28.504611+02:00 (CDB$ROOT(1))
HM: Session with ID 105 serial # 36338 (FG)
    on single instance 1 in container PDB1T is hung
    and is waiting on 'SQL*Net message from client' for 96 seconds.
    Session was previously waiting on 'SQL*Net message to client'.
    Session ID 105 is blocking 1 session
.....
*** 2022-07-28T11:11:09.462853+02:00 (CDB$ROOT(1))
HM: Session with ID 67 serial # 14207 (FG)
    on single instance 1 in container PDB1T is hung
    and is waiting on 'enq: TX - row lock contention' for 97 seconds.
    Session was previously waiting on 'PGA memory operation'.
    Final Blocker is Session ID 105 serial# 36338 on instance 1
     which is waiting on 'SQL*Net message from client' for 135 seconds, wait id 48
     p1: 'driver id'=0x54435000, p2: '#bytes'=0x1, p3: ''=0x0
.....
*** 2022-07-28T11:11:11.513636+02:00 (CDB$ROOT(1))
All Current Hang Statistics

                      current number of hangs 1
    hangs:current number of impacted sessions 2
                  current number of deadlocks 0
deadlocks:current number of impacted sessions 0
           number of locally blocked sessions 1
  local contention - locally blocked sessions 33.33%
          number of remotely blocked sessions 0
remote contention - remotely blocked sessions  0.00%
                 current number of singletons 0
      current number of local active sessions 3
        current number of local hung sessions 1

Suspected Hangs in the System
and possibly Rebuilt Hangs
                     Root       Chain Total               Hang
  Hang Hang          Inst Root  #hung #hung  Hang   Hang  Resolution
    ID Type Status   Num  Sess   Sess  Sess  Conf   Span  Action
 ----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
     2 HANG    VALID    1   105     2     2    LOW  LOCAL Terminate Process

  Inst  Sess   Ser             Proc  Wait    Wait
   Num    ID    Num      OSPID  Name Time(s) Event
  ----- ------ ----- --------- ----- ------- -----
        PDBID PDBNm
        ----- ---------------
      1     67 14207      6691    FG      99 enq: TX - row lock contention
            3 PDB1T
      1    105 36338      6451    FG     139 SQL*Net message from client
            3 PDB1T
.....
HM: current SQL: update products set name='bier' where id=2


                                                     IO
 Total  Self-         Total  Total  Outlr  Outlr  Outlr
  Hung  Rslvd  Rslvd   Wait WaitTm   Wait WaitTm   Wait
  Sess  Hangs  Hangs  Count   Secs  Count   Secs  Count Wait Event
------ ------ ------ ------ ------ ------ ------ ------ -----------
     2      0      0      1    192      1    192      0 enq: TX - row lock contention

 

The article A few words about the good to know MAX_IDLE_BLOCKER_TIME first appeared on dbi Blog.

Documentum – ACS not starting because of ActiveMQ on the MethodServer

Fri, 2022-07-29 13:42

Documentum is an inexhaustible source of issues, and that’s great because I can continuously write blogs about it; I always have material for that. In this blog, I will talk about the ACS not being able to start on the MethodServer because of an ActiveMQ error. As you might know, WildFly has been shipping with an integrated ActiveMQ (Artemis) for quite some time now, and therefore, in case something goes wrong with ActiveMQ, it might impact your Documentum applications deployed on WildFly.

The issue I will talk about today happened after a simple restart of the MethodServer, done because some Documentum jars were updated. Unfortunately, the startup failed with the following errors:

[dmadmin@cs-0 ~]$ cd $JMS_HOME/server/DctmServer_MethodServer/log
[dmadmin@cs-0 log]$ cat server.log
...
2022-05-02 06:54:23,181 UTC INFO  [org.jboss.modules] (main) JBoss Modules version 1.9.1.Final
2022-05-02 06:54:23,635 UTC INFO  [org.jboss.msc] (main) JBoss MSC version 1.4.8.Final
2022-05-02 06:54:23,644 UTC INFO  [org.jboss.threads] (main) JBoss Threads version 2.3.3.Final
2022-05-02 06:54:23,801 UTC INFO  [org.jboss.as] (MSC service thread 1-1) WFLYSRV0049: WildFly Full 17.0.1.Final (WildFly Core 9.0.2.Final) starting
2022-05-02 06:54:24,701 UTC INFO  [org.wildfly.security] (ServerService Thread Pool -- 25) ELY00001: WildFly Elytron version 1.9.1.Final
...
2022-05-02 06:54:31,365 UTC INFO  [org.jboss.as.patching] (MSC service thread 1-4) WFLYPAT0050: WildFly Full cumulative patch ID is: base, one-off patches include: none
2022-05-02 06:54:31,403 UTC INFO  [org.jboss.as.server.deployment.scanner] (MSC service thread 1-3) WFLYDS0013: Started FileSystemDeploymentService for directory $DOCUMENTUM/wildfly17.0.1/server/DctmServer_MethodServer/deployments
2022-05-02 06:54:31,406 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-2) WFLYSRV0027: Starting deployment of "ServerApps.ear" (runtime-name: "ServerApps.ear")
2022-05-02 06:54:31,406 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-7) WFLYSRV0027: Starting deployment of "error.war" (runtime-name: "error.war")
2022-05-02 06:54:31,406 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-1) WFLYSRV0027: Starting deployment of "acs.ear" (runtime-name: "acs.ear")
2022-05-02 06:54:31,430 UTC INFO  [org.jboss.as.remoting] (MSC service thread 1-7) WFLYRMT0001: Listening on 127.0.0.1:9084
2022-05-02 06:54:31,447 UTC ERROR [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ224097: Failed to start server: java.io.IOException: Input/output error
        at java.base/sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
        at java.base/sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:54)
        at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:274)
        at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:233)
        at java.base/sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:811)
        at java.base/sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:796)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.NodeManager.createNodeId(NodeManager.java:209)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.NodeManager.setUpServerLockFile(NodeManager.java:195)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.impl.FileLockNodeManager.start(FileLockNodeManager.java:76)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.internalStart(ActiveMQServerImpl.java:576)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.start(ActiveMQServerImpl.java:522)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.jms.server.impl.JMSServerManagerImpl.start(JMSServerManagerImpl.java:373)
        at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService.doStart(JMSService.java:206)
        at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService.access$000(JMSService.java:65)
        at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService$1.run(JMSService.java:100)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1348)
        at java.base/java.lang.Thread.run(Thread.java:829)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.JBossThread.run(JBossThread.java:485)

2022-05-02 06:54:31,561 UTC INFO  [org.wildfly.extension.undertow] (MSC service thread 1-6) WFLYUT0006: Undertow HTTPS listener default-https listening on 0.0.0.0:9082
2022-05-02 06:54:31,650 UTC INFO  [org.jboss.ws.common.management] (MSC service thread 1-7) JBWS022052: Starting JBossWS 5.3.0.Final (Apache CXF 3.3.2)
2022-05-02 06:54:32,837 UTC INFO  [org.infinispan.factories.GlobalComponentRegistry] (MSC service thread 1-3) ISPN000128: Infinispan version: Infinispan 'Infinity Minus ONE +2' 9.4.14.Final
2022-05-02 06:54:33,191 UTC INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 62) WFLYCLINF0002: Started client-mappings cache from ejb container
2022-05-02 06:54:33,317 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 63) WFLYUT0021: Registered web context: '/' for server 'default-server'
...
2022-05-02 06:55:48,364 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-7) WFLYSRV0207: Starting subdeployment (runtime-name: "bocs.war")
2022-05-02 06:55:48,364 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-4) WFLYSRV0207: Starting subdeployment (runtime-name: "documentum-bocs-ws.war")
2022-05-02 06:55:49,289 UTC WARN  [org.jboss.as.dependency.private] (MSC service thread 1-3) WFLYSRV0018: Deployment "deployment.acs.ear" is using a private module ("org.jboss.as.jmx") which may be changed or removed in future versions without notice.
...
2022-05-02 06:55:50,516 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 70) WFLYUT0021: Registered web context: '/bocs-ws' for server 'default-server'
2022-05-02 06:55:58,622 UTC INFO  [org.jboss.as.server] (Thread-1) WFLYSRV0220: Server shutdown has been requested via an OS signal
2022-05-02 06:55:58,656 UTC INFO  [org.jboss.as.connector.deployers.jdbc] (MSC service thread 1-7) WFLYJCA0019: Stopped Driver service with driver-name = h2
2022-05-02 06:55:58,666 UTC INFO  [org.jboss.as.mail.extension] (MSC service thread 1-8) WFLYMAIL0002: Unbound mail session [java:jboss/mail/Default]
2022-05-02 06:55:58,675 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 71) WFLYUT0022: Unregistered web context: '/bocs-ws' from server 'default-server'
2022-05-02 06:55:58,675 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 69) WFLYUT0022: Unregistered web context: '/' from server 'default-server'
2022-05-02 06:55:58,691 UTC ERROR [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool -- 70) WFLYMSGAMQ0003: Exception while stopping JMS server: java.lang.NullPointerException
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.freezeConnections(ActiveMQServerImpl.java:1302)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1104)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1027)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:872)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:866)
        at org.apache.activemq.artemis@2.8.1//org.apache.activemq.artemis.jms.server.impl.JMSServerManagerImpl.stop(JMSServerManagerImpl.java:393)
        at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService.doStop(JMSService.java:218)
        at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService.access$100(JMSService.java:65)
        at org.wildfly.extension.messaging-activemq//org.wildfly.extension.messaging.activemq.jms.JMSService$2.run(JMSService.java:122)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
        at java.base/java.lang.Thread.run(Thread.java:829)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.JBossThread.run(JBossThread.java:485)

2022-05-02 06:55:58,700 UTC INFO  [javax.enterprise.resource.webservices.jaxws.servlet.http] (ServerService Thread Pool -- 71) WSSERVLET15: JAX-WS servlet destroyed
2022-05-02 06:55:58,703 UTC INFO  [javax.enterprise.resource.webservices.jaxws.server.http] (ServerService Thread Pool -- 71) WSSERVLET13: JAX-WS context listener destroyed
2022-05-02 06:55:59,288 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-4) WFLYSRV0028: Stopped deployment error.war (runtime-name: error.war) in 645ms
...
2022-05-02 06:55:59,502 UTC INFO  [stdout] (FelixStartLevel) ERROR: Error starting file:$DOCUMENTUM/wildfly17.0.1/server/DctmServer_MethodServer/deployments/acs.ear/lib/Web.jar (org.osgi.framework.BundleException: Activator start error in bundle web [4].)
2022-05-02 06:55:59,502 UTC ERROR [stderr] (FelixStartLevel) java.lang.ExceptionInInitializerError
2022-05-02 06:55:59,503 UTC ERROR [stderr] (FelixStartLevel)    at com.documentum.acs.sdi.servlet.AcsHelper.<clinit>(AcsHelper.java:161)
2022-05-02 06:55:59,503 UTC ERROR [stderr] (FelixStartLevel)    at com.documentum.acs.sdi.servlet.ACS.<clinit>(ACS.java:404)
2022-05-02 06:55:59,503 UTC ERROR [stderr] (FelixStartLevel)    at com.documentum.acs.sdi.osgi.ServletActivator$HttpTrackerCustomizer.addingService(ServletActivator.java:48)
2022-05-02 06:55:59,503 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.osgi.util.tracker.ServiceTracker$Tracked.trackAdding(ServiceTracker.java:1030)
2022-05-02 06:55:59,503 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.osgi.util.tracker.ServiceTracker$Tracked.trackInitialServices(ServiceTracker.java:891)
2022-05-02 06:55:59,504 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.osgi.util.tracker.ServiceTracker.open(ServiceTracker.java:296)
2022-05-02 06:55:59,504 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.osgi.util.tracker.ServiceTracker.open(ServiceTracker.java:235)
2022-05-02 06:55:59,504 UTC ERROR [stderr] (FelixStartLevel)    at com.documentum.acs.sdi.osgi.ServletActivator.start(ServletActivator.java:25)
2022-05-02 06:55:59,504 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:589)
2022-05-02 06:55:59,504 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.apache.felix.framework.Felix._startBundle(Felix.java:1671)
2022-05-02 06:55:59,505 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.apache.felix.framework.Felix.startBundle(Felix.java:1588)
2022-05-02 06:55:59,505 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.apache.felix.framework.Felix.setFrameworkStartLevel(Felix.java:1180)
2022-05-02 06:55:59,505 UTC ERROR [stderr] (FelixStartLevel)    at deployment.acs.ear//org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:265)
2022-05-02 06:55:59,505 UTC ERROR [stderr] (FelixStartLevel)    at java.base/java.lang.Thread.run(Thread.java:829)
2022-05-02 06:55:59,513 UTC ERROR [stderr] (FelixStartLevel) Caused by: com.google.inject.CreationException: Guice creation errors:
...
2022-05-02 06:55:59,677 UTC INFO  [org.wildfly.naming] (FelixStartLevel) WildFly Naming version 1.0.10.Final
2022-05-02 06:55:59,838 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 68) WFLYUT0021: Registered web context: '/ACS' for server 'default-server'
2022-05-02 06:55:59,838 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 68) WFLYUT0022: Unregistered web context: '/ACS' from server 'default-server'
2022-05-02 06:55:59,908 UTC INFO  [org.wildfly.extension.undertow] (MSC service thread 1-3) WFLYUT0019: Host default-host stopping
2022-05-02 06:55:59,925 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-3) WFLYSRV0208: Stopped subdeployment (runtime-name: documentum-bocs-ws.war) in 1285ms
2022-05-02 06:55:59,928 UTC INFO  [org.wildfly.extension.undertow] (MSC service thread 1-3) WFLYUT0008: Undertow HTTPS listener default-https suspending
2022-05-02 06:55:59,928 UTC INFO  [org.wildfly.extension.undertow] (MSC service thread 1-7) WFLYUT0008: Undertow HTTP listener default suspending
2022-05-02 06:55:59,929 UTC INFO  [org.wildfly.extension.undertow] (MSC service thread 1-3) WFLYUT0007: Undertow HTTPS listener default-https stopped, was bound to 0.0.0.0:9082
2022-05-02 06:55:59,929 UTC INFO  [org.wildfly.extension.undertow] (MSC service thread 1-7) WFLYUT0007: Undertow HTTP listener default stopped, was bound to 0.0.0.0:9080
2022-05-02 06:55:59,930 UTC INFO  [org.wildfly.extension.undertow] (MSC service thread 1-3) WFLYUT0004: Undertow 2.0.21.Final stopping
2022-05-02 06:55:59,930 UTC INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 68) WFLYCLINF0003: Stopped client-mappings cache from ejb container
2022-05-02 06:56:00,101 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-4) WFLYSRV0208: Stopped subdeployment (runtime-name: bocs.war) in 1457ms
[dmadmin@cs-0 log]$

As you can see in the above log, the issue appears to be in the ACS application (the ServerApps URL responds but the ACS one doesn't) and it is definitely caused by ActiveMQ not starting properly: "AMQ224097: Failed to start server: java.io.IOException: Input/output error". Just to make sure, I checked the previous logs, but there really were no issues during the previous startup or during the previous runtime. There were no errors at all in the JMS log files, so the issue really started with this startup. The only thing that changed, and the reason for this restart as I mentioned previously, was the update of a few libraries. However, these jar files were Documentum-specific, nothing related to WildFly or ActiveMQ. According to the error stack, it seemed to me like it could be caused by some kind of NFS problem: maybe a disconnection of the NFS share at the wrong moment, or something similar that could cause ActiveMQ to lose a lock file or its status. So, I looked into the ActiveMQ-specific data that WildFly creates when it starts:

[dmadmin@cs-0 log]$ ls -l ../data/
total 0
drwxrwx--- 2 dmadmin dmadmin 152 Jun 12  2021 MethodServer
drwxrwxr-x 5 dmadmin dmadmin 152 Jun 23  2021 activemq
drwxrwxr-x 3 dmadmin dmadmin 152 May  2 07:05 content
drwxrwxr-x 2 dmadmin dmadmin 152 Jun 23  2021 kernel
drwxrwxr-x 2 dmadmin dmadmin 152 Jun 23  2021 timer-service-data
drwxrwxr-x 3 dmadmin dmadmin 152 Jun 23  2021 tx-object-store
[dmadmin@cs-0 log]$
[dmadmin@cs-0 log]$ ls -l ../data/activemq/
total 16
drwxrwxr-x 2 dmadmin dmadmin 8192 Jun 23  2021 bindings
drwxrwxr-x 2 dmadmin dmadmin 8192 Apr 22 18:33 journal
drwxrwxr-x 2 dmadmin dmadmin  152 Jun 23  2021 largemessages
[dmadmin@cs-0 log]$
[dmadmin@cs-0 log]$ ls -l ../data/activemq/*
activemq/bindings:
total 4128
-rw-rw-r-- 1 dmadmin dmadmin 1048576 Apr 22 18:34 activemq-bindings-1.bindings
-rw-rw-r-- 1 dmadmin dmadmin 1048576 Apr 22 18:33 activemq-bindings-2.bindings
-rw-rw-r-- 1 dmadmin dmadmin 1048576 Apr 22 18:33 activemq-jms-1.jms
-rw-rw-r-- 1 dmadmin dmadmin 1048576 Apr 22 18:33 activemq-jms-2.jms

activemq/journal:
total 20536
-rw-rw-r-- 1 dmadmin dmadmin 10485760 Apr 22 18:33 activemq-data-1.amq
-rw-rw-r-- 1 dmadmin dmadmin 10485760 Apr 22 18:33 activemq-data-2.amq
-rw-rw-r-- 1 dmadmin dmadmin       19 Apr 22 18:34 server.lock

activemq/largemessages:
total 0
[dmadmin@cs-0 log]$
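
The journal's server.lock seen in the listing above is exactly the kind of file that can end up stale after an NFS hiccup. Here is a small hedged check for that situation — the `fuser`-based detection is my own assumption (not something from the original troubleshooting), and the path layout is the one of this environment:

```shell
#!/bin/sh
# Hedged sketch: does the ActiveMQ server.lock exist, and does any
# process still hold it open? Paths follow the DctmServer_MethodServer
# layout shown in this post; adapt DOCUMENTUM for your environment.
DATA_DIR="${DOCUMENTUM:-/tmp/demo}/wildfly17.0.1/server/DctmServer_MethodServer/data/activemq"
LOCK_FILE="$DATA_DIR/journal/server.lock"

if [ -f "$LOCK_FILE" ]; then
    echo "lock file present: $LOCK_FILE"
    # fuser exits non-zero when no process has the file open, which can
    # indicate a lock left behind by a lost NFS client
    if command -v fuser >/dev/null 2>&1 && fuser "$LOCK_FILE" >/dev/null 2>&1; then
        echo "lock is held by a running process"
    else
        echo "no process holds the lock (possibly stale)"
    fi
else
    echo "no lock file found under $DATA_DIR"
fi
```

A lock that exists but is held by no process is only a hint, not proof; with NFS involved, the lock semantics depend on the server and mount options.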

All the files appeared to be present, and it looked like nothing had changed them in the previous days. However, since I assumed the problem was coming from the ActiveMQ data, I still tried to shut down WildFly, back up this data and then start it again to let it regenerate the files automatically:

[dmadmin@cs-0 log]$ mv ../data/activemq ../data/activemq.old
[dmadmin@cs-0 log]$
[dmadmin@cs-0 log]$ ls -l ../data/
total 0
drwxrwx--- 2 dmadmin dmadmin 152 Jun 12  2021 MethodServer
drwxrwxr-x 5 dmadmin dmadmin 152 Jun 23  2021 activemq.old
drwxrwxr-x 3 dmadmin dmadmin 152 May  2 07:05 content
drwxrwxr-x 2 dmadmin dmadmin 152 Jun 23  2021 kernel
drwxrwxr-x 2 dmadmin dmadmin 152 Jun 23  2021 timer-service-data
drwxrwxr-x 3 dmadmin dmadmin 152 Jun 23  2021 tx-object-store
[dmadmin@cs-0 log]$
[dmadmin@cs-0 log]$ $JMS_HOME/server/startJMSCustom.sh
Starting the Java MethodServer...
The Java MethodServer has been started.
[dmadmin@cs-0 log]$
[dmadmin@cs-0 log]$ ls -l ../data/
total 0
drwxrwx--- 2 dmadmin dmadmin 152 Jun 12  2021 MethodServer
drwxr-x--- 5 dmadmin dmadmin 152 May  2 07:17 activemq
drwxrwxr-x 5 dmadmin dmadmin 152 Jun 23  2021 activemq.old
drwxrwxr-x 3 dmadmin dmadmin 152 May  2 07:16 content
drwxrwxr-x 2 dmadmin dmadmin 152 Jun 23  2021 kernel
drwxrwxr-x 2 dmadmin dmadmin 152 Jun 23  2021 timer-service-data
drwxrwxr-x 3 dmadmin dmadmin 152 Jun 23  2021 tx-object-store
[dmadmin@cs-0 log]$
[dmadmin@cs-0 log]$ ls -l ../data/activemq/*
DctmServer_MethodServer/data/activemq/bindings:
total 4128
-rw-r----- 1 dmadmin dmadmin 1048576 May  2 07:17 activemq-bindings-1.bindings
-rw-r----- 1 dmadmin dmadmin 1048576 May  2 07:17 activemq-bindings-2.bindings
-rw-r----- 1 dmadmin dmadmin 1048576 May  2 07:17 activemq-jms-1.jms
-rw-r----- 1 dmadmin dmadmin 1048576 May  2 07:17 activemq-jms-2.jms

DctmServer_MethodServer/data/activemq/journal:
total 20536
-rw-r----- 1 dmadmin dmadmin 10485760 May  2 07:17 activemq-data-1.amq
-rw-r----- 1 dmadmin dmadmin 10485760 May  2 07:17 activemq-data-2.amq
-rw-r----- 1 dmadmin dmadmin       19 May  2 07:17 server.lock

DctmServer_MethodServer/data/activemq/largemessages:
total 0
[dmadmin@cs-0 log]$

As you can see above, all the files were properly re-created. I therefore checked the JMS log file again, as well as its URLs:

[dmadmin@cs-0 log]$ cat server.log
...
2022-05-02 07:16:56,318 UTC INFO  [org.jboss.modules] (main) JBoss Modules version 1.9.1.Final
2022-05-02 07:16:56,782 UTC INFO  [org.jboss.msc] (main) JBoss MSC version 1.4.8.Final
2022-05-02 07:16:56,792 UTC INFO  [org.jboss.threads] (main) JBoss Threads version 2.3.3.Final
2022-05-02 07:16:56,958 UTC INFO  [org.jboss.as] (MSC service thread 1-1) WFLYSRV0049: WildFly Full 17.0.1.Final (WildFly Core 9.0.2.Final) starting
2022-05-02 07:16:57,870 UTC INFO  [org.wildfly.security] (ServerService Thread Pool -- 25) ELY00001: WildFly Elytron version 1.9.1.Final
...
2022-05-02 07:16:59,886 UTC INFO  [org.jboss.as.patching] (MSC service thread 1-1) WFLYPAT0050: WildFly Full cumulative patch ID is: base, one-off patches include: none
2022-05-02 07:16:59,911 UTC INFO  [org.jboss.as.server.deployment.scanner] (MSC service thread 1-3) WFLYDS0013: Started FileSystemDeploymentService for directory $DOCUMENTUM/wildfly17.0.1/server/DctmServer_MethodServer/deployments
2022-05-02 07:16:59,917 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-5) WFLYSRV0027: Starting deployment of "ServerApps.ear" (runtime-name: "ServerApps.ear")
2022-05-02 07:16:59,919 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-7) WFLYSRV0027: Starting deployment of "error.war" (runtime-name: "error.war")
2022-05-02 07:16:59,918 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-1) WFLYSRV0027: Starting deployment of "acs.ear" (runtime-name: "acs.ear")
2022-05-02 07:16:59,986 UTC INFO  [org.jboss.as.remoting] (MSC service thread 1-1) WFLYRMT0001: Listening on 127.0.0.1:9084
2022-05-02 07:17:00,027 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221000: live Message Broker is starting with configuration Broker Configuration (clustered=false,journalDirectory=$DOCUMENTUM/wildfly17.0.1/server/DctmServer_MethodServer/data/activemq/journal,bindingsDirectory=$DOCUMENTUM/wildfly17.0.1/server/DctmServer_MethodServer/data/activemq/bindings,largeMessagesDirectory=$DOCUMENTUM/wildfly17.0.1/server/DctmServer_MethodServer/data/activemq/largemessages,pagingDirectory=$DOCUMENTUM/wildfly17.0.1/server/DctmServer_MethodServer/data/activemq/paging)
2022-05-02 07:17:00,107 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221012: Using AIO Journal
2022-05-02 07:17:00,136 UTC INFO  [org.wildfly.extension.undertow] (MSC service thread 1-8) WFLYUT0006: Undertow HTTPS listener default-https listening on 0.0.0.0:9082
2022-05-02 07:17:00,213 UTC INFO  [org.jboss.ws.common.management] (MSC service thread 1-8) JBWS022052: Starting JBossWS 5.3.0.Final (Apache CXF 3.3.2)
2022-05-02 07:17:00,249 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221043: Protocol module found: [artemis-server]. Adding protocol support for: CORE
2022-05-02 07:17:00,250 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221043: Protocol module found: [artemis-amqp-protocol]. Adding protocol support for: AMQP
2022-05-02 07:17:00,254 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221043: Protocol module found: [artemis-hornetq-protocol]. Adding protocol support for: HORNETQ
2022-05-02 07:17:00,255 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221043: Protocol module found: [artemis-stomp-protocol]. Adding protocol support for: STOMP
2022-05-02 07:17:00,337 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221034: Waiting indefinitely to obtain live lock
2022-05-02 07:17:00,337 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221035: Live Server Obtained live lock
2022-05-02 07:17:00,952 UTC INFO  [org.infinispan.factories.GlobalComponentRegistry] (MSC service thread 1-8) ISPN000128: Infinispan version: Infinispan 'Infinity Minus ONE +2' 9.4.14.Final
2022-05-02 07:17:00,991 UTC INFO  [org.wildfly.extension.messaging-activemq] (MSC service thread 1-6) WFLYMSGAMQ0016: Registered HTTP upgrade for activemq-remoting protocol handled by http-acceptor acceptor
2022-05-02 07:17:00,991 UTC INFO  [org.wildfly.extension.messaging-activemq] (MSC service thread 1-5) WFLYMSGAMQ0016: Registered HTTP upgrade for activemq-remoting protocol handled by http-acceptor-throughput acceptor
2022-05-02 07:17:00,991 UTC INFO  [org.wildfly.extension.messaging-activemq] (MSC service thread 1-1) WFLYMSGAMQ0016: Registered HTTP upgrade for activemq-remoting protocol handled by http-acceptor acceptor
2022-05-02 07:17:00,991 UTC INFO  [org.wildfly.extension.messaging-activemq] (MSC service thread 1-4) WFLYMSGAMQ0016: Registered HTTP upgrade for activemq-remoting protocol handled by http-acceptor-throughput acceptor
2022-05-02 07:17:01,264 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221007: Server is now live
2022-05-02 07:17:01,265 UTC INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 62) AMQ221001: Apache ActiveMQ Artemis Message Broker version 2.8.1 [default, nodeID=db920184-c9e7-11ec-8ebb-223c2a795152]
2022-05-02 07:17:01,395 UTC INFO  [org.jboss.as.connector.deployment] (MSC service thread 1-8) WFLYJCA0007: Registered connection factory java:/JmsXA
2022-05-02 07:17:01,396 UTC INFO  [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool -- 66) WFLYMSGAMQ0002: Bound messaging object to jndi name java:jboss/exported/jms/RemoteConnectionFactory
2022-05-02 07:17:01,401 UTC INFO  [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool -- 67) WFLYMSGAMQ0002: Bound messaging object to jndi name java:/TouchRpcQueueConnectionFactory
2022-05-02 07:17:01,410 UTC INFO  [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool -- 67) WFLYMSGAMQ0002: Bound messaging object to jndi name java:jboss/exported/jms/TouchRpcQueueConnectionFactory
2022-05-02 07:17:01,411 UTC INFO  [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool -- 68) WFLYMSGAMQ0002: Bound messaging object to jndi name java:/ConnectionFactory
2022-05-02 07:17:01,411 UTC INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 63) WFLYCLINF0002: Started client-mappings cache from ejb container
2022-05-02 07:17:01,455 UTC INFO  [org.apache.activemq.artemis.ra] (MSC service thread 1-8) AMQ151007: Resource adaptor started
2022-05-02 07:17:01,456 UTC INFO  [org.jboss.as.connector.services.resourceadapters.ResourceAdapterActivatorService$ResourceAdapterActivator] (MSC service thread 1-8) IJ020002: Deployed: file://RaActivatoractivemq-ra
2022-05-02 07:17:01,467 UTC INFO  [org.jboss.as.connector.deployment] (MSC service thread 1-3) WFLYJCA0002: Bound JCA ConnectionFactory [java:/JmsXA]
2022-05-02 07:17:01,467 UTC INFO  [org.jboss.as.connector.deployment] (MSC service thread 1-8) WFLYJCA0118: Binding connection factory named java:/JmsXA to alias java:jboss/DefaultJMSConnectionFactory
2022-05-02 07:17:01,583 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 64) WFLYUT0021: Registered web context: '/' for server 'default-server'
...
2022-05-02 07:17:21,632 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "acs.ear" (runtime-name : "acs.ear")
2022-05-02 07:17:21,632 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "ServerApps.ear" (runtime-name : "ServerApps.ear")
2022-05-02 07:17:21,633 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "error.war" (runtime-name : "error.war")
2022-05-02 07:17:21,758 UTC INFO  [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
2022-05-02 07:17:21,762 UTC INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 17.0.1.Final (WildFly Core 9.0.2.Final) started in 25839ms - Started 977 of 1235 services (451 services are lazy, passive or on-demand)
[dmadmin@cs-0 log]$
[dmadmin@cs-0 log]$ curl -kI https://`hostname -f`:9082/DmMethods/servlet/DoMethod
HTTP/1.1 200 OK
Connection: keep-alive
Server: MethodServer
Content-Length: 144
Date: Mon, 2 May 2022 07:18:59 GMT

[dmadmin@cs-0 log]$
[dmadmin@cs-0 log]$ curl -kI https://`hostname -f`:9082/ACS/servlet/ACS
HTTP/1.1 200 OK
Connection: keep-alive
Server: MethodServer
Content-Type: text/plain
Content-Length: 48
Date: Mon, 2 May 2022 07:19:07 GMT

[dmadmin@cs-0 log]$

This seems to have properly fixed the issue, since both URLs are now responding correctly. As far as I know, Documentum isn't using the ActiveMQ instance embedded in WildFly, and therefore cleaning its data shouldn't be a problem. No issues were found after that on this environment, so the ticket was closed as is. No real root cause was identified either, so I will most probably continue to believe it was linked to a network and/or NFS issue. If you are facing the same behavior, don't hesitate to share!
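
Since no root cause was found, this reset might be needed again one day. The procedure can be captured in a small hedged script; the function name is mine, and WildFly/JMS must be stopped before running it:

```shell
#!/bin/sh
# Hedged sketch of the ActiveMQ data reset performed above. WildFly must
# be stopped before moving the directory; it recreates bindings/,
# journal/ and largemessages/ on the next start.
reset_activemq() {
    data_dir="$1"
    stamp=$(date +%Y%m%d_%H%M%S)
    if [ -d "$data_dir/activemq" ]; then
        # keep a timestamped backup instead of overwriting a previous .old copy
        mv "$data_dir/activemq" "$data_dir/activemq.$stamp"
        echo "moved activemq data to activemq.$stamp"
    else
        echo "no activemq directory under $data_dir"
    fi
}

# demonstration against a scratch directory
mkdir -p /tmp/jms_demo/data/activemq
reset_activemq /tmp/jms_demo/data
```

In the real case you would point it at the WildFly data directory (`$DOCUMENTUM/wildfly17.0.1/server/DctmServer_MethodServer/data` in this environment) between the JMS stop and start.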


Why do we still have HEIGHT BALANCED Histograms and how to get rid of them?

Fri, 2022-07-29 06:59

A customer running Oracle 19c recently asked me why he still has Height Balanced histograms in his database. E.g.:

SQL> select histogram, count(*) from dba_tab_columns where histogram <> 'NONE' group by histogram order by 2;

HISTOGRAM         COUNT(*)
--------------- ----------
TOP-FREQUENCY            4
HEIGHT BALANCED          5
HYBRID                  39
FREQUENCY              492

SQL> 

In 12.1, Oracle introduced Top Frequency and Hybrid histograms, which were meant to replace Height Balanced histograms. There are two main reasons why Height Balanced histograms may still be created:

1. Top Frequency and/or Hybrid histograms are disabled

By setting the preferences ENABLE_HYBRID_HISTOGRAMS and ENABLE_TOP_FREQ_HISTOGRAMS to 0 (globally or at table level) you can disable the new histogram types. The default value is 3, which enables both new histogram types:

SQL> select dbms_stats.get_prefs('ENABLE_HYBRID_HISTOGRAMS') hist_enabled from dual;

HIST_ENABLED
------------
3

SQL> select dbms_stats.get_prefs('ENABLE_TOP_FREQ_HISTOGRAMS') hist_enabled from dual;

HIST_ENABLED
------------
3

SQL> 
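
If you ever need to toggle these preferences globally, the calls look like the following. This is only a sketch that prints the SQL (pipe it into SQL*Plus as a privileged user to apply it), and since these are not ordinary documented preferences, whether set_global_prefs accepts them may depend on your version — test before relying on it:

```shell
#!/bin/sh
# Print the dbms_stats calls that disable (0) or re-enable (3) the new
# histogram types discussed above. Sketch only: the output is meant to
# be reviewed and then run in sqlplus, not executed blindly.
STATE="${HIST_STATE:-3}"   # 3 = enabled (the default), 0 = disabled

cat <<EOF
exec dbms_stats.set_global_prefs('ENABLE_HYBRID_HISTOGRAMS','$STATE');
exec dbms_stats.set_global_prefs('ENABLE_TOP_FREQ_HISTOGRAMS','$STATE');
EOF
```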

2. A non-default value for ESTIMATE_PERCENT is used when gathering statistics (the default is DBMS_STATS.AUTO_SAMPLE_SIZE). In that case, Height Balanced histograms are created instead of the new histogram types.

The question is how to find out what caused the HEIGHT BALANCED histograms to be created.

First, let's check which table columns have Height Balanced histograms and when they were created:

SQL> select table_name, column_name, last_analyzed from dba_tab_columns where histogram='HEIGHT BALANCED';

TABLE_NAME                       COLUMN_NAME                      LAST_ANALYZED
-------------------------------- -------------------------------- -------------------
T1                               TIMESTAMP                        27.07.2022 13:11:31
T1                               LAST_DDL_TIME                    27.07.2022 13:11:31
T1                               CREATED                          27.07.2022 13:11:31
T1                               OBJECT_ID                        27.07.2022 13:11:31
T1                               OBJECT_NAME                      27.07.2022 13:11:31

SQL> 

OK, I only have a test table T1 (a copy of ALL_OBJECTS) where 5 columns have a HEIGHT BALANCED histogram. What caused them to be created?

Are the new histogram types disabled?

SQL> select dbms_stats.get_prefs('ENABLE_HYBRID_HISTOGRAMS') hist_enabled from dual;

HIST_ENABLED
------------
3

SQL> select dbms_stats.get_prefs('ENABLE_TOP_FREQ_HISTOGRAMS') hist_enabled from dual;

HIST_ENABLED
------------
3

SQL> 

REMARK: Alternatively, you can run this query to get the global preferences:

SQL> select sname, nvl(to_char(sval1),spare4) value
  2  from sys.optstat_hist_control$
  3  where sname like '%HISTOGRAMS';

SNAME                          VALUE
------------------------------ --------------------------------
ENABLE_TOP_FREQ_HISTOGRAMS     3
ENABLE_HYBRID_HISTOGRAMS       3

SQL> 

So globally the new histogram types are enabled. I can also check if something specific has been set at table level:

SQL> select dbms_stats.get_prefs('ENABLE_HYBRID_HISTOGRAMS',ownname=>'CBLEILE',tabname=>'T1') hist_enabled from dual;

HIST_ENABLED
------------
3

SQL> select dbms_stats.get_prefs('ENABLE_TOP_FREQ_HISTOGRAMS',ownname=>'CBLEILE',tabname=>'T1') hist_enabled from dual;

HIST_ENABLED
------------
3

SQL> 

I.e. the new histogram types are enabled at table level too, so that's not the reason I still have Height Balanced histograms. To find out how statistics were gathered on table T1, the NOTES column of DBA_OPTSTAT_OPERATIONS is very useful. I split this into two SQL statements to improve readability:

SQL> select target, end_time, operation
  2  from dba_optstat_operations
  3  where end_time between to_date('27-JUL-2022 13:11:00','dd-mon-yyyy hh24:mi:ss')
  4  and to_date('27-JUL-2022 13:12:00','dd-mon-yyyy hh24:mi:ss');

TARGET          END_TIME                            OPERATION
--------------- ----------------------------------- ------------------------
"CBLEILE"."T1"  27-JUL-22 01.11.31.849682 PM +01:00 gather_table_stats

SQL> select notes
  2  from dba_optstat_operations
  3  where end_time between to_date('27-JUL-2022 13:11:00','dd-mon-yyyy hh24:mi:ss')
  4  and to_date('27-JUL-2022 13:12:00','dd-mon-yyyy hh24:mi:ss');

NOTES
--------------------------------------------------
<params><param name="block_sample" val="FALSE"/><p
aram name="cascade" val="NULL"/><param name="concu
rrent" val="FALSE"/><param name="degree" val="NULL
"/><param name="estimate_percent" val="50"/><param
 name="force" val="FALSE"/><param name="granularit
y" val="AUTO"/><param name="method_opt" val="FOR A
LL COLUMNS SIZE 254"/><param name="no_invalidate"
val="NULL"/><param name="ownname" val="CBLEILE"/><
param name="partname" val=""/><param name="reporti
ng_mode" val="FALSE"/><param name="statid" val=""/
><param name="statown" val=""/><param name="statta
b" val=""/><param name="stattype" val="DATA"/><par
am name="tabname" val="T1"/></params>

So statistics were gathered with dbms_stats.gather_table_stats and ESTIMATE_PERCENT => 50 was used (see the NOTES column). Hence HEIGHT BALANCED histograms were created.
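
Reading parameters out of that wrapped XML by eye is tedious. A hedged little shell helper can pull a given parameter out of the NOTES content — the XML below is the relevant fragment from this example, reassembled on one line; in practice you would spool the column to a file first:

```shell
#!/bin/sh
# Hedged helper: extract one parameter from the <params> XML stored in
# the NOTES column of DBA_OPTSTAT_OPERATIONS.
NOTES='<params><param name="estimate_percent" val="50"/><param name="method_opt" val="FOR ALL COLUMNS SIZE 254"/></params>'

PARAM="estimate_percent"
# capture the val="..." that follows name="$PARAM"
VAL=$(printf '%s' "$NOTES" | sed -n "s/.*name=\"$PARAM\" val=\"\([^\"]*\)\".*/\1/p")
echo "$PARAM used: $VAL"
```

With the NOTES content of this example, it reports the estimate_percent of 50 that explains the Height Balanced histograms.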

To fix this there are two possibilities:

  1. Ask the developer or DBA who gathers the statistics to use the default for ESTIMATE_PERCENT, i.e. to fix the code accordingly.
  2. Ignore non-default settings when gathering statistics: the preference PREFERENCE_OVERRIDES_PARAMETER makes Oracle ignore the parameters provided in the dbms_stats call and use the preferences on the table instead. E.g.
SQL> select dbms_stats.get_prefs('ESTIMATE_PERCENT','CBLEILE','T1') from dual;

DBMS_STATS.GET_PREFS('ESTIMATE_PERCENT','CBLEILE','T1')
------------------------------------------------------------------------------
DBMS_STATS.AUTO_SAMPLE_SIZE

SQL> exec dbms_stats.set_table_prefs('CBLEILE','T1','PREFERENCE_OVERRIDES_PARAMETER','TRUE');
SQL> exec dbms_stats.gather_table_stats('CBLEILE','T1',ESTIMATE_PERCENT=>50,options=>'GATHER',method_opt=>'FOR ALL COLUMNS SIZE 254');
SQL> select column_name, histogram from user_tab_columns
  2  where table_name='T1' and histogram <> 'NONE';

COLUMN_NAME                      HISTOGRAM
-------------------------------- ---------------
OBJECT_TYPE                      FREQUENCY

SQL>

Why do I only have one frequency-based histogram and no other histograms anymore?

The reason is that the setting method_opt=>'FOR ALL COLUMNS SIZE 254' has also been overridden by the preference on the table:

SQL> select dbms_stats.get_prefs('METHOD_OPT','CBLEILE','T1') t1_prefs from dual;

T1_PREFS
----------------------------
FOR ALL COLUMNS SIZE AUTO

I.e. to use 'FOR ALL COLUMNS SIZE 254' I have to set the preference as well (here for test purposes):

SQL> exec dbms_stats.set_table_prefs('CBLEILE','T1','METHOD_OPT','FOR ALL COLUMNS SIZE 254');

SQL> exec dbms_stats.gather_table_stats('CBLEILE','T1',ESTIMATE_PERCENT=>50,options=>'GATHER',method_opt=>'FOR ALL COLUMNS SIZE 1');

REMARK: I used method_opt=>'FOR ALL COLUMNS SIZE 1' on purpose, which means "disable histograms".

SQL> select column_name, histogram from user_tab_columns
  2  where table_name='T1' and histogram <> 'NONE';

COLUMN_NAME                      HISTOGRAM
-------------------------------- ---------------
OWNER                            FREQUENCY
OBJECT_NAME                      HYBRID
SUBOBJECT_NAME                   HYBRID
OBJECT_ID                        HYBRID
DATA_OBJECT_ID                   HYBRID
OBJECT_TYPE                      FREQUENCY
CREATED                          HYBRID
LAST_DDL_TIME                    HYBRID
TIMESTAMP                        HYBRID
STATUS                           FREQUENCY
TEMPORARY                        FREQUENCY
GENERATED                        FREQUENCY
SECONDARY                        FREQUENCY
NAMESPACE                        FREQUENCY
SHARING                          FREQUENCY
EDITIONABLE                      FREQUENCY
ORACLE_MAINTAINED                FREQUENCY
APPLICATION                      FREQUENCY
DEFAULT_COLLATION                FREQUENCY
DUPLICATED                       FREQUENCY
SHARDED                          FREQUENCY

21 rows selected.

SQL> 

I.e. with the preference PREFERENCE_OVERRIDES_PARAMETER = TRUE I can "force" the use of all preferences set on the table (or the global preferences if no table preferences have been set). So my manual settings ESTIMATE_PERCENT=>50 and method_opt=>'FOR ALL COLUMNS SIZE 1' were overridden here.

REMARK: Check if statistics have actually been gathered, because options=>'GATHER' may also have been overridden by the preference on the table.

Be careful with PREFERENCE_OVERRIDES_PARAMETER = TRUE because it may have unwanted side-effects (as seen above) or statistics gathering in the application may take longer than before.

Summary: If you still see HEIGHT BALANCED histograms in your database, then you probably use a non-default ESTIMATE_PERCENT when gathering statistics. It's recommended to fix that and gather statistics with ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE. There have been lots of improvements since 11g to speed up stats gathering (e.g. the use of approx_for_count_distinct = TRUE), so settings like ESTIMATE_PERCENT => 1 to speed up stats gathering on huge tables are usually no longer necessary today.


Using Docker containers for Ansible testing

Fri, 2022-07-29 02:00

I’d like to share in this post a quick setup using Docker containers for Ansible testing.

I wanted to create a small lab for testing my Ansible scripts. The natural option that came to mind was to create some virtual machines with VirtualBox, using Vagrant to automate that process. However, as I'm using a MacBook with an Apple M1 chip, VirtualBox is not supported on that hardware. I therefore decided to explore using containers instead of virtual machines.

The first step is to look for an existing Docker image that already has Ansible and OpenSSH installed. There is no need to reinvent the wheel in the age of space travel, so my colleague Jean-Philippe Clapot (our Docker guru!) quickly spotted the perfect image for my needs: https://hub.docker.com/r/jcpowermac/alpine-ansible-ssh

The launch

As Docker and Docker Desktop are already installed on my laptop, I just have to pull that image:

% docker pull jcpowermac/alpine-ansible-ssh

Then I can run some containers:

% docker run --name=controller --platform linux/amd64 -d jcpowermac/alpine-ansible-ssh
% docker run --name=target1 --platform linux/amd64 -d jcpowermac/alpine-ansible-ssh
% docker run --name=target2 --platform linux/amd64 -d jcpowermac/alpine-ansible-ssh

The idea for now is to have one container named “controller” on which I’ll write and run my Ansible scripts. The two target containers named “target1” and “target2” will be the target hosts of those scripts. Using containers instead of virtual machines gives me plenty of spare resources to run many more of those bad boys when doing more advanced Ansible testing. Note the parameter --platform linux/amd64, which specifies the platform to use on my Apple M1 chip. Without this parameter you get a warning, but the container is properly created anyway.

All three containers are now up and running:

% docker ps
CONTAINER ID   IMAGE                           COMMAND                  CREATED          STATUS          PORTS     NAMES
786309da3987   jcpowermac/alpine-ansible-ssh   "/bin/ash -c '/usr/s…"   5 seconds ago    Up 4 seconds              target2
abe685b1f1ab   jcpowermac/alpine-ansible-ssh   "/bin/ash -c '/usr/s…"   28 seconds ago   Up 27 seconds             target1
e063d4e9267d   jcpowermac/alpine-ansible-ssh   "/bin/ash -c '/usr/s…"   44 seconds ago   Up 43 seconds             controller
The network

As this setup is only for temporary tests, I didn’t create a dedicated network for those containers. They all use the default bridge network in the default range 172.17.0.0/16.

The first container, “controller”, gets the first free IP address in this range, which is 172.17.0.2/16 (172.17.0.1/16 is taken by the bridge interface). Each target then gets the next free IP address. We can check the IP addresses assigned to the containers with the one-liner below:

% for i in $(docker ps|awk '{print $1}'|tail -n +2); do docker exec $i ip a|grep 172.17;done
    inet 172.17.0.4/16 brd 172.17.255.255 scope global eth0
    inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0

We can now connect to the controller container and check the Ansible version installed in this image:

% docker exec -it controller /bin/sh
/ # id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)

/ # ansible --version
ansible 2.7.2
  config file = None
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.15 (default, Aug 16 2018, 14:17:09) [GCC 6.4.0]

This Docker image comes with the user ansible already created, so let’s use it:

/ # su ansible
~/projects $ pwd
/home/ansible/projects
~/projects $ id
uid=1000(ansible) gid=1000(ansible) groups=1000(ansible)
~/projects $

We can first check the connectivity:

~/projects $ ssh 172.17.0.3
The authenticity of host '172.17.0.3 (172.17.0.3)' can't be established.
ECDSA key fingerprint is SHA256:ntBTgrAxi9bUSIb47U31BFzD4rE5ktFnwRxztqXFICE.
Are you sure you want to continue connecting (yes/no)? yes

~/projects $ ssh 172.17.0.4
The authenticity of host '172.17.0.4 (172.17.0.4)' can't be established.
ECDSA key fingerprint is SHA256:ntBTgrAxi9bUSIb47U31BFzD4rE5ktFnwRxztqXFICE.
Are you sure you want to continue connecting (yes/no)? yes

~/projects $

From the controller I can SSH to target1 and target2; my setup is now complete and I can start playing with Ansible.

The Ansible test

The first step is to create an inventory file in the home folder:

~/projects $ cat <<EOF > inventory.txt
target1 ansible_host=172.17.0.3
target2 ansible_host=172.17.0.4
EOF
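If more hosts join the lab later, a host group makes them addressable together. Here is a small sketch of that variant; the [targets] group name and the inventory-groups.txt file name are my own choices, not part of the original setup:

```shell
# Hypothetical variant of the inventory: same hosts, grouped under [targets]
cat <<'EOF' > inventory-groups.txt
[targets]
target1 ansible_host=172.17.0.3
target2 ansible_host=172.17.0.4
EOF

# The group could then be used as: ansible targets -m ping -i inventory-groups.txt
grep -c 'ansible_host' inventory-groups.txt   # -> 2
```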

We can then run a basic Ansible command against both of our targets:

~/projects $ ansible target* -m ping -i inventory.txt
target2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
target1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

All is working as expected, our test platform is now ready for more advanced Ansible testing.
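As a possible next step (an assumption on my part, not from the setup above), a minimal playbook could exercise both targets; the file name site.yml and the marker path are hypothetical:

```shell
# A minimal hypothetical playbook dropping a marker file on every inventory host
cat <<'EOF' > site.yml
- hosts: all
  tasks:
    - name: Drop a marker file on each target
      copy:
        content: "managed by ansible\n"
        dest: /tmp/ansible-marker
EOF

# It would be run from the controller with:
#   ansible-playbook -i inventory.txt site.yml
```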

Have you already used Ansible with containers? Please share your experience in the comment section below.

Learn more on Ansible with our Training course: https://www.dbi-services.com/en/courses/ansible-basics/

L’article Using Docker containers for Ansible testing est apparu en premier sur dbi Blog.

Vim Tips & Tricks for the Kubernetes CKA exam

Thu, 2022-07-28 08:39

After being a regular reader of this blog for months, I’m thrilled to publish my first post!

Vim is a very powerful editing tool to know on Linux and is sometimes the only text editor available when connected to a machine through a command line interface.
For the CKA exam, it is the text editor you are expected to use, so knowing a few tricks can save you precious time when completing the various tasks.
This post assumes you already have a basic knowledge of Vi/Vim and don’t have a panic attack because you don’t know how to exit Vim ;-). For beginners, the Vim package comes with a tutorial where you can learn the basics: just enter the command vimtutor in your terminal and follow the instructions.
Let’s start with some easy tips first before pushing it a bit harder.

Move to a pattern and modify text between double quotes

In the pod yaml file below we would like to change the mountPath value to /var/www/html

apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage

Let’s for example go straight to the mountPath key by typing /mountPath, then press W (capital w) to reach the opening double quote.

Now, to delete everything between the double quotes and enter insert mode, type ci”

apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: ""
          name: task-pv-storage

All the content between the quotes has been deleted, and you can now just enter the new desired path, /var/www/html, and save the file.
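For scripted, non-interactive edits outside the exam, the same in-quotes replacement can be sketched with sed; the file name is hypothetical and GNU sed is assumed:

```shell
# Sample line to edit (hypothetical file name)
printf 'mountPath: "/usr/share/nginx/html"\n' > line.yaml

# Replace whatever sits between the double quotes, analogous to ci" plus typing
sed -i 's|"[^"]*"|"/var/www/html"|' line.yaml

cat line.yaml   # -> mountPath: "/var/www/html"
```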

Split the screen

It is sometimes convenient to stay in Vim and do other system operations at the same time. Here are some examples and ideas for doing so.

When editing a file with Vim you can split the screen vertically or horizontally with :vsp or :sp

apiVersion: v1                          |apiVersion: v1
kind: Pod                               |kind: Pod
metadata:                               |metadata:
  name: task-pv-pod                     |  name: task-pv-pod
spec:                                   |spec:
  volumes:                              |  volumes:
    - name: task-pv-storage             |    - name: task-pv-storage
      persistentVolumeClaim:            |      persistentVolumeClaim:
        claimName: task-pv-claim        |        claimName: task-pv-claim
  containers:                           |  containers:
    - name: task-pv-container           |    - name: task-pv-container
      image: nginx                      |      image: nginx
      ports:                            |      ports:
        - containerPort: 80             |        - containerPort: 80
          name: "http-server"           |          name: "http-server"
      volumeMounts:                     |      volumeMounts:
        - mountPath: "/usr/share/nginx/h|        - mountPath: "/usr/share/nginx/
tml"                                    |html"
          name: task-pv-storage         |          name: task-pv-storage
~                                       |~
~                                       |~
~                                       |~
~                                       |~
pod1.yaml                                pod1.yaml

By default the second area on the right (a vertical split is used in this example) shows the same file as the first one. Each area can be split further, vertically or horizontally, if required. Use CTRL+ww to switch between the areas and :q in an area to close it.

Now we can look for a local file to open by switching to the shell (without exiting Vim) with :sh (note that this action will be full screen and is not contained in a particular area). Here you can launch any system command, as well as kubectl commands, which is very handy for quickly getting a piece of information. Look for example for a file to edit, copy its name, and leave the shell with exit.

Now open that new file in the second area with :e <file_name>. In my example I’ve opened pod2.yaml, which has the same content as pod1.yaml for now but will be modified to evolve into a version 2 of that file.

apiVersion: v1                          |apiVersion: v1
kind: Pod                               |kind: Pod
metadata:                               |metadata:
  name: task-pv-pod                     |  name: task-pv-pod
spec:                                   |spec:
  volumes:                              |  volumes:
    - name: task-pv-storage             |    - name: task-pv-storage
      persistentVolumeClaim:            |      persistentVolumeClaim:
        claimName: task-pv-claim        |        claimName: task-pv-claim
  containers:                           |  containers:
    - name: task-pv-container           |    - name: task-pv-container
      image: nginx                      |      image: nginx
      ports:                            |      ports:
        - containerPort: 80             |        - containerPort: 80
          name: "http-server"           |          name: "http-server"
      volumeMounts:                     |      volumeMounts:
        - mountPath: "/usr/share/nginx/h|        - mountPath: "/var/www/html"
tml"                                    |          name: task-pv-storage
          name: task-pv-storage         |~
~                                       |~
~                                       |~
~                                       |~
~                                       |~
pod1.yaml                                pod2.yaml

You can now compare those files, copy/paste some text from one area to the other, and so on.

To copy/paste you can trivially use the mouse or CTRL+C/CTRL+V between the areas, but for the Vim purists you can use “+y and “+p to copy and paste to and from the system clipboard register. From my experience it doesn’t work well on all systems (especially over remote access), so I let you experiment with it.

Finally, if you want to replace the name nginx by web in the second area only, just use the search and replace command in that area with :%s/nginx/web/gc

apiVersion: v1                          |apiVersion: v1
kind: Pod                               |kind: Pod
metadata:                               |metadata:
  name: task-pv-pod                     |  name: task-pv-pod
spec:                                   |spec:
  volumes:                              |  volumes:
    - name: task-pv-storage             |    - name: task-pv-storage
      persistentVolumeClaim:            |      persistentVolumeClaim:
        claimName: task-pv-claim        |        claimName: task-pv-claim
  containers:                           |  containers:
    - name: task-pv-container           |    - name: task-pv-container
      image: nginx                      |      image: web
      ports:                            |      ports:
        - containerPort: 80             |        - containerPort: 80
          name: "http-server"           |          name: "http-server"
      volumeMounts:                     |      volumeMounts:
        - mountPath: "/usr/share/nginx/h|        - mountPath: "/var/www/html"
tml"                                    |          name: task-pv-storage
          name: task-pv-storage         |~
~                                       |~
~                                       |~
~                                       |~
~                                       |~
pod1.yaml                                pod2.yaml [+]
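The batch equivalent of that substitution, for when you are outside Vim, can be sketched with sed (file name and content are hypothetical, GNU sed assumed):

```shell
# Sample file with two occurrences of the name to replace
printf 'image: nginx\ncontainer: nginx\n' > pod2-demo.yaml

# Same effect as :%s/nginx/web/g, applied to the whole file at once
sed -i 's/nginx/web/g' pod2-demo.yaml

grep -c web pod2-demo.yaml   # -> 2
```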
Copy and insert a block and the use of the registers

Kubernetes manifest files are written in YAML, and it is often useful to be able to yank a block such as a container definition and paste it in order to create a second one.

There are several ways of doing this; let’s explore the one that is, in my opinion, the most straightforward:

First display the line numbers in vim with :set nu

  1 apiVersion: v1
  2 kind: Pod
  3 metadata:
  4   name: task-pv-pod
  5 spec:
  6   volumes:
  7     - name: task-pv-storage
  8       persistentVolumeClaim:
  9         claimName: task-pv-claim
 10   containers:
 11     - name: task-pv-container
 12       image: nginx
 13       ports:
 14         - containerPort: 80
 15           name: "http-server"
 16       volumeMounts:
 17         - mountPath: "/usr/share/nginx/html"
 18           name: task-pv-storage

If we want to copy the container from line 11 to line 15 included, then type :11,15y (y stands for yank, which means copy in Vim terminology; note also that using d instead of y would have deleted that block instead)

Then go to line 15 (as you want to insert the copied block just after that line) with 15gg and paste the copied block with p (p stands for paste)

  1 apiVersion: v1
  2 kind: Pod
  3 metadata:
  4   name: task-pv-pod
  5 spec:
  6   volumes:
  7     - name: task-pv-storage
  8       persistentVolumeClaim:
  9         claimName: task-pv-claim
 10   containers:
 11     - name: task-pv-container
 12       image: nginx
 13       ports:
 14         - containerPort: 80
 15           name: "http-server"
 16     - name: task-pv-container
 17       image: nginx
 18       ports:
 19         - containerPort: 80
 20           name: "http-server"
 21       volumeMounts:
 22         - mountPath: "/usr/share/nginx/html"
 23           name: task-pv-storage

The second container is now properly inserted and can be modified further to match the requirements for it.

Note that when you yank or delete something (a character, a word, a line, …) in Vim, it is stored in an unnamed register that is replaced after each yank or delete operation. Vim also offers 26 additional registers for users, corresponding to the 26 letters of the alphabet. It is then possible to copy something into different registers and paste the desired one by calling its register letter.

Let’s see an example of this where I want to copy the line with the name of the container into register a and the line with the image of the container into register b.

To do so type

11gg then “ayy           => This will copy line 11 into the register a

12gg then “byy           => This will copy line 12 into the register b

Then go to line 20 with 20gg as you will paste your copied lines after it and type:

“ap”bp in order to paste the content of register a then register b with the result shown below:

  1 apiVersion: v1
  2 kind: Pod
  3 metadata:
  4   name: task-pv-pod
  5 spec:
  6   volumes:
  7     - name: task-pv-storage
  8       persistentVolumeClaim:
  9         claimName: task-pv-claim
 10   containers:
 11     - name: task-pv-container
 12       image: nginx
 13       ports:
 14         - containerPort: 80
 15           name: "http-server"
 16     - name: task-pv-container
 17       image: nginx
 18       ports:
 19         - containerPort: 80
 20           name: "http-server"
 21     - name: task-pv-container
 22       image: nginx
 23       volumeMounts:
 24         - mountPath: "/usr/share/nginx/html"

That will create a third container in this pod to work with.

Note that both lines could have been copied at once as shown previously, but this example was for educational purposes only.

Note also that if you just use p, you still have in the unnamed register the block you copied at the beginning.

Use Visual Block to shift blocks at once

Now let’s push it a bit further and use Visual Block to shift blocks to the left or to the right. The YAML format is strict on the indentation of elements, and it is often required to move a block to the right or to the left to match those requirements. When there are several lines to move, we can do much better than using x to delete a character or i and space to insert one. Let’s see how with a common scenario: you’ve copied some lines from the Kubernetes documentation and pasted them into your yaml file, but the pasting didn’t keep the formatting very well, as shown below:

  1 apiVersion: v1
  2 kind: Pod
  3 metadata:
  4   name: cpu-demo
  5   namespace: cpu-example
  6 spec:
  7   containers:
  8      - name: cpu-demo-ctr
  9        image: vish/stress
 10        resources:
 11          limits:
 12            cpu: "1"
 13          requests:
 14            cpu: "0.5"
 15        args:
 16        - -cpus
 17        - "2"

I’ve copied the container elements from lines 8 to 17, and we can see that they are not properly aligned: the – before name at line 8 should be exactly below the c of containers above. However, the block itself is consistently indented, so we just need to shift the whole block 3 characters to the left. To do so, go to line 8 with 8gg, then type 0 to go to the beginning of the line and enter Visual Block mode with CTRL+v, then use j (or the down arrow key) to go down to the last line, then press l (or the right arrow key) 3 times.

Then press d to delete that whole block of characters and see the text now properly positioned:

  1 apiVersion: v1
  2 kind: Pod
  3 metadata:
  4   name: cpu-demo
  5   namespace: cpu-example
  6 spec:
  7   containers:
  8   - name: cpu-demo-ctr
  9     image: vish/stress
 10     resources:
 11       limits:
 12         cpu: "1"
 13       requests:
 14         cpu: "0.5"
 15     args:
 16     - -cpus
 17     - "2"
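For comparison, the same un-indent can be scripted outside the editor; a sketch with sed on a line range (file name hypothetical, GNU sed assumed):

```shell
# Two mis-indented lines, 3 spaces too far to the right (hypothetical file)
printf '   - name: cpu-demo-ctr\n     image: vish/stress\n' > blk.yaml

# Remove exactly 3 leading spaces from lines 1-2, like the Visual Block delete
sed -i '1,2s/^   //' blk.yaml

head -n1 blk.yaml   # -> - name: cpu-demo-ctr
```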

Now let’s have a look at the opposite scenario where the block needs to be pushed to the right:

  1 apiVersion: v1
  2 kind: Pod
  3 metadata:
  4   name: cpu-demo
  5   namespace: cpu-example
  6 spec:
  7   containers:
  8  - name: cpu-demo-ctr
  9    image: vish/stress
 10    resources:
 11      limits:
 12        cpu: "1"
 13      requests:
 14        cpu: "0.5"
 15    args:
 16    - -cpus
 17    - "2"

We can now see that the same container block needs to be pushed one space character to the right in order to match the YAML requirements.

Go to the beginning of line 8 as shown previously and enter Visual Block mode again with CTRL+v, then use j (or the down arrow key) to go down to the last line (to select the block you want to work with), then type SHIFT+i followed by SPACE

  1 apiVersion: v1
  2 kind: Pod
  3 metadata:
  4   name: cpu-demo
  5   namespace: cpu-example
  6 spec:
  7   containers:
  8   - name: cpu-demo-ctr
  9    image: vish/stress
 10    resources:
 11      limits:
 12        cpu: "1"
 13      requests:
 14        cpu: "0.5"
 15    args:
 16    - -cpus
 17    - "2"

Line 8 is now properly positioned, so press ESC twice and, almost magically, all the lines of the selected block will also move one space to the right automatically:

  1 apiVersion: v1
  2 kind: Pod
  3 metadata:
  4   name: cpu-demo
  5   namespace: cpu-example
  6 spec:
  7   containers:
  8   - name: cpu-demo-ctr
  9     image: vish/stress
 10     resources:
 11       limits:
 12         cpu: "1"
 13       requests:
 14         cpu: "0.5"
 15     args:
 16     - -cpus
 17     - "2"

The whole block is now properly aligned!

Last but not least

If the editing goes wrong at some point, don’t panic: press ESC then u to undo the last operation, and keep pressing u to undo further. If you went too far in the undo, press CTRL+r to redo the last undone action, and repeat that to redo further.

Of course those tips & tricks are not limited to Kubernetes and the CKA and can be used in any text editing context.

There is obviously much more, as Vim’s possibilities look limitless, but I hope you’ve learned something that can help you be more efficient with text editing in Vim. This can be a game changer when taking the CKA exam: I successfully passed it a few days ago, and applying those tips & tricks allowed me to be quick and have time at the end to review my work.

And you, what are your favorite tips & tricks with Vim? Please share them in the comment section below.

L’article Vim Tips & Tricks for the Kubernetes CKA exam est apparu en premier sur dbi Blog.

PostgreSQL on 80 core ARM Server

Mon, 2022-07-25 04:20

This blog is about installing and testing the latest PostgreSQL 14 on an ARM-based 80-core server running Rocky Linux 8.6.

Many thanks to Happyware for providing the machine used for this Blog.

https://happyware.com/

The machine used for this blog is a Gigabyte R272-P30 with an Ampere Altra Q80-30 CPU: 80 cores at 3 GHz.

https://happyware.com/gigabyte/arm-server-ampere-altra-r272-p30-q80-30/6nr272p30mr-00

Very nice IPMI:

IPMI Settings

CPU inventory within the IPMI:

CPU inventory

All memory channels are used.

Memory inventory

The CPU itself is very interesting: 80 cores, 1 MByte of cache per core, 8 memory channels, and 128 PCIe 4.0 lanes.

The OS used for this Blog is Rocky Linux 8.6 for ARM64:

[root@localhost ~]# cat /etc/os-release
NAME="Rocky Linux"
VERSION="8.6 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.6 (Green Obsidian)"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky Linux"
ROCKY_SUPPORT_PRODUCT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8"
[root@localhost ~]#

The output of lscpu on this machine is interesting:

[root@localhost ~]# lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              80
On-line CPU(s) list: 0-79
Thread(s) per core:  1
Core(s) per socket:  80
Socket(s):           1
NUMA node(s):        1
Vendor ID:           ARM
BIOS Vendor ID:      Ampere(R)
Model:               1
Model name:          Neoverse-N1
BIOS Model name:     Ampere(R) Altra(R) Processor
Stepping:            r3p1
CPU max MHz:         3300.0000
CPU min MHz:         1000.0000
BogoMIPS:            50.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            1024K
NUMA node0 CPU(s):   0-79
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
[root@localhost ~]#

The storage in the background is two Samsung SSD 980 Pro drives of 500 GB each, so nothing very special.

[root@localhost ~]# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme1n1     259:0    0 465.8G  0 disk
└─nvme1n1p1 259:2    0 465.8G  0 part
  └─rl-home 253:2    0   856G  0 lvm  /home
nvme0n1     259:1    0 465.8G  0 disk
├─nvme0n1p1 259:3    0   600M  0 part /boot/efi
├─nvme0n1p2 259:4    0     1G  0 part /boot
└─nvme0n1p3 259:5    0 464.2G  0 part
  ├─rl-root 253:0    0    70G  0 lvm  /
  ├─rl-swap 253:1    0     4G  0 lvm  [SWAP]
  └─rl-home 253:2    0   856G  0 lvm  /home
[root@localhost ~]#

The installation of PostgreSQL follows the documentation I described in my article at heise.de (in German).

https://www.heise.de/ratgeber/PostgreSQL-installieren-mit-den-Community-Paketen-4877556.html

That means installing the postgresql.org repository and disabling the Rocky Linux postgresql module.

[root@localhost ~]# dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-aarch64/pgdg-redhat-repo-latest.noarch.rpm
Last metadata expiration check: 1:36:40 ago on Thu 21 Jul 2022 07:15:54 AM EDT.
pgdg-redhat-repo-latest.noarch.rpm                                                                                                                                                 4.5 kB/s |  11 kB     00:02
Dependencies resolved.
===================================================================================================================================================================================================================
 Package                                                 Architecture                                  Version                                           Repository                                           Size
===================================================================================================================================================================================================================
Installing:
 pgdg-redhat-repo                                        noarch                                        42.0-26                                           @commandline                                         11 k

Transaction Summary
===================================================================================================================================================================================================================
Install  1 Package

Total size: 11 k
Installed size: 13 k
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                                                                           1/1
  Installing       : pgdg-redhat-repo-42.0-26.noarch                                                                                                                                                           1/1
  Verifying        : pgdg-redhat-repo-42.0-26.noarch                                                                                                                                                           1/1

Installed:
  pgdg-redhat-repo-42.0-26.noarch

Complete!
[root@localhost ~]# dnf -qy module disable postgresql
Importing GPG key 0x6D960B89:
 Userid     : "PostgreSQL RPM Repository <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 33EC A7E4 0671 479E 2279 EA81 A8AC 42ED 6D96 0B89
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG-AARCH64
Importing GPG key 0x6D960B89:
 Userid     : "PostgreSQL RPM Repository <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 33EC A7E4 0671 479E 2279 EA81 A8AC 42ED 6D96 0B89
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG-AARCH64
Importing GPG key 0x6D960B89:
 Userid     : "PostgreSQL RPM Repository <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 33EC A7E4 0671 479E 2279 EA81 A8AC 42ED 6D96 0B89
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG-AARCH64
Importing GPG key 0x6D960B89:
 Userid     : "PostgreSQL RPM Repository <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 33EC A7E4 0671 479E 2279 EA81 A8AC 42ED 6D96 0B89
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG-AARCH64
Importing GPG key 0x6D960B89:
 Userid     : "PostgreSQL RPM Repository <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 33EC A7E4 0671 479E 2279 EA81 A8AC 42ED 6D96 0B89
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG-AARCH64
Importing GPG key 0x6D960B89:
 Userid     : "PostgreSQL RPM Repository <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 33EC A7E4 0671 479E 2279 EA81 A8AC 42ED 6D96 0B89
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG-AARCH64
Importing GPG key 0x6D960B89:
 Userid     : "PostgreSQL RPM Repository <pgsql-pkg-yum@postgresql.org>"
 Fingerprint: 33EC A7E4 0671 479E 2279 EA81 A8AC 42ED 6D96 0B89
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG-AARCH64
[root@localhost ~]#

Within the pgdg-redhat-all.repo file I have disabled all PostgreSQL versions except 14, which is the one I want to use. The installation itself is the same as on other servers using Intel or AMD:

dnf install postgresql14 postgresql14-server postgresql14-contrib

On this system a stripe of two NVMe SSDs is used for /home, so I changed PGDATA for the service by creating an override.conf.

[root@localhost ~]# systemctl edit postgresql-14.service

And add:

[Service]
Environment=PGDATA=/home/PG14/data

Now we can run initdb.

[root@localhost ~]# /usr/pgsql-14/bin/postgresql-14-setup initdb
Initializing database ... OK

[root@localhost ~]#

Starting PostgreSQL and enabling the service.

[root@localhost opt]# systemctl start postgresql-14.service
[root@localhost opt]# systemctl enable postgresql-14.service
Created symlink /etc/systemd/system/multi-user.target.wants/postgresql-14.service → /usr/lib/systemd/system/postgresql-14.service.
[root@localhost opt]#

PostgreSQL 14 is up and running on this ARM-based host.

[postgres@localhost ~]$ psql
psql (14.4)
Type "help" for help.

postgres=#

The configuration of PostgreSQL follows best practices.

[root@localhost ~]# cat /home/PG14/data/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
listen_addresses = '*'
max_connections = '1000'
effective_cache_size = '192 GB'
shared_buffers = '64 GB'
work_mem = '64 MB'
maintenance_work_mem = '8000 MB'
max_worker_processes = '80'
max_parallel_workers = '80'
max_parallel_workers_per_gather = '40'
max_parallel_maintenance_workers = '40'
shared_preload_libraries = 'pg_stat_statements'
checkpoint_timeout  = '15 min'
checkpoint_completion_target = 0.9
min_wal_size = '1024 MB'
max_wal_size = '16384 MB'
wal_buffers = '512MB'
wal_compression = on
bgwriter_delay = 200ms
bgwriter_lru_maxpages = 100
bgwriter_lru_multiplier = 2.0
bgwriter_flush_after = 0
bgwriter_delay = 200ms
bgwriter_lru_maxpages = 100
bgwriter_lru_multiplier = 2.0
bgwriter_flush_after = 0
enable_partitionwise_join = on
enable_partitionwise_aggregate = on
jit = on
[root@localhost ~]#
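These values suggest a machine with roughly 256 GB of RAM (my assumption; the post does not state the memory size). The common 25%-of-RAM rule of thumb for shared_buffers would then match the 64 GB above:

```shell
# Rough shared_buffers sizing sketch: total RAM divided by 4
# (the 256 GB figure is an assumption, not stated in the post)
ram_gb=256
echo "$(( ram_gb / 4 ))GB"   # -> 64GB
```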

I’m using the tuned profile for PostgreSQL from the dbi DMK package:

#
# dbi services tuned profile for PostgreSQL servers
#

[main]
summary=dbi services tuned profile for PostgreSQL servers
include=throughput-performance

[bootloader]
cmdline = "transparent_hugepage=never"

[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100
# Explicitly disable deep c-states to reduce latency on OLTP workloads.
force_latency=1

[disk]
readahead=>4096

[sysctl]
kernel.sched_min_granularity_ns = 10000000
kernel.sched_wakeup_granularity_ns = 15000000
kernel.sched_migration_cost_ns=50000000
# this one is for pgpool
## http://www.pgpool.net/docs/latest/en/html/runtime-config-connection.html => num_init_children
net.core.somaxconn=256
net.ipv4.tcp_timestamps=0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 250
vm.overcommit_memory=2
vm.overcommit_ratio=75
vm.swappiness=1
vm.dirty_ratio=2
vm.dirty_background_ratio=1
#vm.nr_hugepages=1200

[vm]
transparent_hugepages=never

I activated that profile and, after a reboot, started testing with pgbench.

I created a pgbenchdb database with 1 billion records; the DB itself is about 146 GB.

[postgres@localhost ~]$ /usr/pgsql-14/bin/pgbench -i -s 10000 pgbenchdb
generating data (client-side)...
1000000000 of 1000000000 tuples (100%) done (elapsed 740.31 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 1568.00 s (drop tables 0.00 s, create tables 0.03 s, client-side generate 742.72 s, vacuum 113.22 s, primary keys 712.03 s).
[postgres@localhost ~]$
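The row count follows directly from the scale factor: pgbench populates pgbench_accounts with 100,000 rows per scale-factor unit, so -s 10000 gives the 1 billion tuples seen above. A quick sanity check:

```shell
# pgbench creates 100,000 pgbench_accounts rows per scale-factor unit,
# so -s 10000 yields the 1 billion rows reported by the initialization above.
scale=10000
awk -v s="$scale" 'BEGIN { printf "%d rows\n", s * 100000 }'
```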
[postgres@localhost ~]$ psql
psql (14.4)
Type "help" for help.

postgres=# \l+
                                                                    List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description
-----------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
 pgbenchdb | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 146 GB  | pg_default |
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 8857 kB | pg_default | default administrative connection database
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 8705 kB | pg_default | unmodifiable empty database
           |          |          |             |             | postgres=CTc/postgres |         |            |
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 8705 kB | pg_default | default template for new databases
           |          |          |             |             | postgres=CTc/postgres |         |            |
(4 rows)

postgres=#

I ran pgbench simulating 10, 25, 50, 100, 250, 500 and 1000 clients; pgbench itself was used with 10 threads. For every client, 1000 transactions were simulated.

[postgres@localhost ~]$ /usr/pgsql-14/bin/pgbench -c 1000 -j 10 -t 1000 pgbenchdb
pgbench (14.4)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10000
query mode: simple
number of clients: 1000
number of threads: 10
number of transactions per client: 1000
number of transactions actually processed: 1000000/1000000
latency average = 22.659 ms
initial connection time = 436.926 ms
tps = 44132.846931 (without initial connection time)
[postgres@localhost ~]$

A very interesting result: over 44000 tps. Also interesting was how the system scaled with the number of clients.
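The reported numbers are self-consistent: with 1000 concurrent clients, throughput is roughly the client count divided by the average latency. A quick check using the values from the run above:

```shell
# tps ~= clients / average latency (in seconds):
# 1000 clients at ~22.659 ms per transaction gives roughly the reported tps.
clients=1000
latency_ms=22.659
awk -v c="$clients" -v l="$latency_ms" 'BEGIN { printf "%.0f tps\n", c / (l / 1000) }'
```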

I’m very satisfied with this performance, which I did not expect; the system scales very well.

The article PostgreSQL on 80 core ARM Server first appeared on dbi Blog.

ODA: How to use Data Preserving Reprovisioning?

Fri, 2022-07-22 11:13
Introduction

Oracle Database Appliance has a global patch available each quarter, so all the components can be patched together on a regular basis. Most of the time, applying the patch on your ODA will be fine, but sometimes you will struggle for multiple reasons: if you don’t apply patches frequently, for example, or if your ODA is heavily tuned, or if your configuration was partially done manually. For those no longer able to apply the patch correctly, the only option used to be a complete reimaging of the ODA, but that is time consuming as all databases need to be restored.

19.15 brought a new feature called “Data Preserving Reprovisioning”: like patching, it does not need a restore of all your databases, and like reimaging, your system is reinstalled from scratch. This sounds like the best option for most of us.

Limits

Data Preserving Reprovisioning (DPR) is limited to ODAs running rather old versions: 12.1.2.12, 12.2.1.4, 18.3, 18.5, 18.7, and 18.8. Compatibility may be extended in future versions.

How does it work?

Reimaging normally implies wiping out the data disks first, but this is no longer mandatory. You can keep your databases on disk, extract the metadata of your ODA setup, reimage your appliance and then remap the databases to your fresh install. It’s a kind of unplug/plug of your ODA configuration.

My test environment

I’m lucky enough to have an X8-2M I can play with, and the first thing I did was reimage it to 18.8. The X8 was released alongside 19c, but it also supports 18.8.

Of course, I had to do a cleanup deploy before reimaging (this ODA was already running 19.15), and I struggled with the reimaging because the partition map prevented it on my first tries (probably because those partitions were created for Linux 7, and I was putting back a Linux 6, so the partition manager called by Anaconda didn’t recognize them). I solved the problem by starting a normal Linux 7 setup (with an Oracle Linux ISO not dedicated to ODA) and removing all partitions from the disks.

On this 18.8 environment, I deployed 2 DB homes, 18.8 and 12.1, and created 3 databases.

odacli describe-component
System Version
---------------
18.8.0.0.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                       18.8.0.0.0            up-to-date
DB {
[ OraDB18000_home1 ]                      18.8.0.0.191015       up-to-date
[ OraDB12102_home1 ]                      12.1.0.2.191015       up-to-date
}
DCSAGENT                                  18.8.0.0.0            up-to-date
ILOM                                      5.0.2.24.r141466      4.0.4.38.a.r132148
BIOS                                      52050300              52020500
OS                                        6.10                  up-to-date
FIRMWARECONTROLLER                        VDV1RL04              vdv1rl02
FIRMWAREDISK                              1132                  1102

odacli list-databases

ID                                       DB Name    DB Type  DB Version           CDB        Class    Shape    Storage    Status        DbHomeID
---------------------------------------- ---------- -------- -------------------- ---------- -------- -------- ---------- ------------ ----------------------------------------
0681df5a-498e-4bb3-8b1e-a5464e891a56     TSTNOC18   Si       18.8.0.0.191015      false      Oltp     Odb6     Acfs       Configured   3eac70f5-6510-4c7c-bf5b-ed938dc60aa1
59ada2fe-cf7d-4fe6-b34b-496f611ab13e     TSTC18     Si       18.8.0.0.191015      true       Oltp     Odb2     Asm        Configured   3eac70f5-6510-4c7c-bf5b-ed938dc60aa1
cc44fae8-e508-4b6d-87ee-257b4c48bed8     TSTNOC12   Si       12.1.0.2.191015      false      Oltp     Odb4     Acfs       Configured   8dbba7de-4348-4c61-a171-f7413d30bc14

I didn’t do further changes on this system.

DPR from 18.8 to 19.15: detach the node

The first step is to detach your configuration; this is like unplugging a PDB. You will need this additional tool from MOS:

Patch 33594115: DATA PRESERVING REPROVISIONING: UPLOAD ODA UPGRADE UTILITY ZIP TO ARU

Unzip the files and run the precheck:

cd /opt/dbi/
unzip p33594115_1915000_Linux-x86-64.zip
unzip -d /opt/oracle odaupgradeutil_220530.zip
cd /opt/oracle/odaupgradeutil/
./odaupgradeutil run-prechecks

Initializing...
########################## ODAUPGRADEUTIL - INIT - BEGIN ##########################
Please check /opt/oracle/oak/restore/log/odaupgradeutil_init_22-07-2022_11:06:55.log for details.
Get System Version...BEGIN
System Version is: 18.8.0.0.0
Get System Version...DONE
Get Hardware Info...BEGIN
Hardware Model: X8-2, Hardware Platform: M
Get Hardware Info...DONE
Get Grid home...BEGIN
Grid Home is: /u01/app/18.0.0.0/grid
Get Grid home...DONE
Get system configuration details...BEGIN
Grid user is: grid
Oracle user is: oracle
Get system configuration details...DONE
########################## ODAUPGRADEUTIL - INIT - END ##########################
*********
IMPORTANT
*********
odaupgradeutil will bring down the databases and grid services on the system.
The files that belong to the databases, which are stored on ASM or ACFS,
are left intact on the storage. The databases will be started up back after
re-imaging the ODA system using 'odacli restore-node' commands.
As a good precautionary measure, please backup all the databases on the
system before you start this process. Do not store the backup on this ODA
machine since the local file system will be wiped out as part of the re-image.
*********
IMPORTANT
*********
########################## ODAUPGRADEUTIL - PRECHECKS - BEGIN ##########################
Please check /opt/oracle/oak/restore/log/odaupgradeutil_prechecks_22-07-2022_11:07:15.log for details.
System version precheck...BEGIN
System version precheck...PASSED
System config precheck...BEGIN
System config precheck...PASSED
Required Files precheck...BEGIN
Required Files precheck...PASSED
Need to discover DB homes
Get Database homes...BEGIN
Get Database homes...SUCCESS
Disk space precheck...BEGIN
Get Quorum Disks...BEGIN
Get Quorum Disks...SUCCESS
Disk space precheck...PASSED
DCS Agent status precheck...BEGIN
DCS Agent status precheck...PASSED
OAK precheck...BEGIN
OAK precheck...PASSED
ASM precheck...BEGIN
ASM precheck...PASSED
Database precheck...BEGIN
Get databases...BEGIN
  Database Name: TSTNOC18
  Oracle Home: /u01/app/oracle/product/18.0.0.0/dbhome_1
  Database Name: TSTC18
  Oracle Home: /u01/app/oracle/product/18.0.0.0/dbhome_1
  Database Name: TSTNOC12
  Oracle Home: /u01/app/oracle/product/12.1.0.2/dbhome_1
Get databases...SUCCESS
Database precheck...PASSED
Audit Files precheck...BEGIN
Audit Files precheck...WARNING
Custom RPMs precheck...BEGIN
Custom RPMs precheck...PASSED
########################## ODAUPGRADEUTIL - PRECHECKS - END ##########################
Use 'odaupgradeutil describe-precheck-report [-j]' to view the precheck report.

If something is wrong, you can try to solve the problem and run the prechecks again.

Once the prechecks are OK, you can detach:

./odaupgradeutil detach-node
*********
IMPORTANT
*********
odaupgradeutil will bring down the databases and grid services on the system.
The files that belong to the databases, which are stored on ASM or ACFS,
are left intact on the storage. The databases will be started up back after
re-imaging the ODA system using 'odacli restore-node' commands.
As a good precautionary measure, please backup all the databases on the
system before you start this process. Do not store the backup on this ODA
machine since the local file system will be wiped out as part of the re-image.
*********
IMPORTANT
*********
Do you want to continue? [y/n]: y
########################## ODAUPGRADEUTIL - SAVECONF - BEGIN ##########################
Please check /opt/oracle/oak/restore/log/odaupgradeutil_saveconf_22-07-2022_11:13:47.log for details.
Backup files to /opt/oracle/oak/restore/bkp...BEGIN
Backup files to /opt/oracle/oak/restore/bkp...SUCCESS
Get provision instance...BEGIN
Get provision instance...SUCCESS
Get network configuration...BEGIN
Get network configuration...SUCCESS
Get databases...BEGIN
  Database Name: TSTNOC18
  Oracle Home: /u01/app/oracle/product/18.0.0.0/dbhome_1
  Database Name: TSTC18
  Oracle Home: /u01/app/oracle/product/18.0.0.0/dbhome_1
  Database Name: TSTNOC12
  Oracle Home: /u01/app/oracle/product/12.1.0.2/dbhome_1
Get databases...SUCCESS
Get Database homes...BEGIN
  Checking Unified Auditing for dbhome '/u01/app/oracle/product/18.0.0.0/dbhome_1'
  Unified Auditing is set to TRUE
  Checking Unified Auditing for dbhome '/u01/app/oracle/product/12.1.0.2/dbhome_1'
  Unified Auditing is set to TRUE
Get Database homes...SUCCESS
Get Database storages...BEGIN
  Database Name: TSTNOC18
    DATA destination: /u02/app/oracle/oradata/TSTNOC18/
    RECO destination: /u03/app/oracle/fast_recovery_area/
    REDO destination: /u04/app/oracle/redo/TSTNOC18/
    Flash Cache destination:
  Database Name: TSTC18
    DATA destination: +DATA
    RECO destination: +RECO
    REDO destination: +RECO
    Flash Cache destination:
  Database Name: TSTNOC12
    DATA destination: /u02/app/oracle/oradata/TSTNOC12
    RECO destination: /u03/app/oracle/fast_recovery_area/
    REDO destination: /u04/app/oracle/redo/TSTNOC12/
    Flash Cache destination:
Get Database storages...SUCCESS
Get Volumes...BEGIN
Get Volumes...SUCCESS
Get Filesystems...BEGIN
Get Filesystems...SUCCESS
Get Quorum Disks...BEGIN
Get Quorum Disks...SUCCESS
SAVECONF: SUCCESS
########################## ODAUPGRADEUTIL - SAVECONF - END ##########################
########################## ODAUPGRADEUTIL - DETACHNODE - BEGIN ##########################
Please check /opt/oracle/oak/restore/log/odaupgradeutil_detachnode_22-07-2022_11:15:01.log for details.
Deconfigure databases...BEGIN
  Database Name: TSTNOC18
  Local Instance: TSTNOC18
  Local Instance Status: RUNNING
  Stopping database 'TSTNOC18'...
  Removing database 'TSTNOC18' from CRS...
  Database Name: TSTC18
  Local Instance: TSTC18
  Local Instance Status: RUNNING
  Stopping database 'TSTC18'...
  Removing database 'TSTC18' from CRS...
  Database Name: TSTNOC12
  Local Instance: TSTNOC12
  Local Instance Status: RUNNING
  Stopping database 'TSTNOC12'...
  Removing database 'TSTNOC12' from CRS...
Deconfigure databases...SUCCESS
Get DB backup metadata...BEGIN
No backupconfigs found
No backupreports found
Quorum Disks were found
  Quorum disk '/dev/SSD_QRMDSK_p1' is 1024 MB in size, no resizing needed
  Quorum disk '/dev/SSD_QRMDSK_p2' is 1024 MB in size, no resizing needed
Deconfigure Grid Infrastructure...BEGIN
Deconfigure Grid Infrastructure...SUCCESS
Backup quorum disks...
  Backing up quorum disk '/dev/SSD_QRMDSK_p1'
  Backing up quorum disk '/dev/SSD_QRMDSK_p2'
Create serverarchives...BEGIN
  Serverarchive '/opt/oracle/oak/restore/out/serverarchive_dbi-oda-x8.zip' created
  Size = 365741 bytes
  SHA256 checksum = a272187c6dfd248d4c03a6e793154e3d0ba3a2b002430e9ad0b7adf4b2c19598
Create serverarchives...DONE
DETACHNODE: SUCCESS
[CRITICAL] Server data archive file(s) generated at /opt/oracle/oak/restore/out . Please ensure the file(s) are copied outside the ODA system and preserved.
########################## ODAUPGRADEUTIL - DETACHNODE - END ##########################

Back up the generated files to a secure place outside the appliance. There is 1 file for a lite ODA and 3 files for an HA ODA.

ls -lrt /opt/oracle/oak/restore/out/
total 368
-rw------- 1 root root     65 Jul 22 11:21 serverarchive_dbi-oda-x8.zip.sha256
-rw-r--r-- 1 root root 365741 Jul 22 11:21 serverarchive_dbi-oda-x8.zip

cp /opt/oracle/oak/restore/out/serverarchive* /backup/oda-dpr
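The .sha256 companion file appears to contain just the raw digest (it is 65 bytes, the size of a hex digest plus a newline), so verifying the copy is a string comparison rather than a `sha256sum -c` call. A self-contained sketch of that check, using a small stand-in file instead of the real serverarchive zip:

```shell
# Stand-in for serverarchive_dbi-oda-x8.zip: create a demo file and a
# bare-digest checksum file, then verify the copy the same way you would
# verify the real archive after copying it off the ODA.
echo "demo archive content" > /tmp/serverarchive_demo.zip
sha256sum /tmp/serverarchive_demo.zip | awk '{print $1}' > /tmp/serverarchive_demo.zip.sha256
stored=$(cat /tmp/serverarchive_demo.zip.sha256)
actual=$(sha256sum /tmp/serverarchive_demo.zip | awk '{print $1}')
[ "$stored" = "$actual" ] && echo "checksum OK"
```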
DPR from 18.8 to 19.15: reimage the node

Unlike a normal reimaging, you won’t run cleanup.pl prior to reimaging.

Reimaging is very easy: first connect the 19.15 ISO from the ILOM console, then choose CDROM for the next boot, then do a power cycle. You will need to do that on each node of an HA ODA.

Reimaging takes about 50 minutes to complete.

Once done, your ODA is no longer reachable on the network, because no network configuration exists on the system yet.

DPR from 18.8 to 19.15: configure firstnet

After reimaging is finished, you can connect through the ILOM console with root/welcome1. The first step after a normal reimaging is configure-firstnet, and it’s the same here. If you don’t remember your network settings, you can find them in the backup files:

unzip serverarchive_dbi-oda-x8.zip
cd restore/
ls
bkp			configure-firstnet.rsp	init.params		log			metadata		prechecks
cat configure-firstnet.rsp
# To be used for configure-firstnet post reimage
HOSTNAME=dbi-oda-x8
INTERFACE_NAME=btbond1
VLAN=NO
IP_ADDR=10.36.0.241
SUBNET_MASK=255.255.255.0
GATEWAY=10.36.0.1
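Since the .rsp file is plain KEY=VALUE pairs, it can be sourced directly in a shell to pull the settings back into variables, which is handy when scripting the firstnet setup. A sketch reusing the values above (the /tmp path is just for the demo):

```shell
# Recreate the configure-firstnet.rsp content from above and source it to
# recover the network settings as shell variables.
cat > /tmp/configure-firstnet.rsp <<'EOF'
HOSTNAME=dbi-oda-x8
INTERFACE_NAME=btbond1
VLAN=NO
IP_ADDR=10.36.0.241
SUBNET_MASK=255.255.255.0
GATEWAY=10.36.0.1
EOF
. /tmp/configure-firstnet.rsp
echo "$HOSTNAME: $IP_ADDR/$SUBNET_MASK gw $GATEWAY on $INTERFACE_NAME"
```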
DPR from 18.8 to 19.15: patch the microcodes

This is not the case in my particular example, but if you come from 18.8 or older versions, your microcodes are probably old and have never been updated. Do the check, and if some microcodes need an update, apply the server and/or storage patch on top of your fresh reimaging. Download the 19.15 patch, register it, and apply it on both server and storage:

Patch 34069644: ORACLE DATABASE APPLIANCE 19.15.0.0.0 SERVER PATCH FOR ALL ODACLI/DCS STACK

odacli describe-component | grep -v ^$
...
unzip p34069644_1915000_Linux-x86-64.zip
odacli update-repository -f /opt/dbi/oda-sm-19.15.0.0.0-220530-server.zip

odacli create-prepatchreport -v 19.15.0.0.0 -s
odacli update-server -v 19.15.0.0.0
odacli update-storage -v 19.15.0.0.0
odacli describe-component | grep -v ^$
...
DPR from 18.8 to 19.15: reprovision the appliance

Now comes the interesting part. Put the backup files back and register them in the ODA repository, unzip and register the 19.15 GI clone, and do a restore-node with the -g (grid) option:

odacli update-repository -f /opt/dbi/serverarchive_dbi-oda-x8.zip
cd /opt/dbi
unzip p30403673_1915000_Linux-x86-64.zip
odacli update-repository -f /opt/dbi/odacli-dcs-19.15.0.0.0-220425-GI-19.15.0.0.zip

odacli list-jobs

ID                                       Description                                                                 Created                             Status
---------------------------------------- --------------------------------------------------------------------------- ----------------------------------- ----------
82782a81-d8a2-4266-9074-9160f3f1b7bb     Repository Update                                                           July 22, 2022 12:00:43 PM UTC       Success
8e60a4ef-04d2-4985-a69b-1ef768da16f9     Repository Update                                                           July 22, 2022 12:10:43 PM UTC       Success

odacli restore-node -g
Enter new system password:
Retype new system password:
Enter an initial password for Web Console account (oda-admin):
Retype the password for Web Console account (oda-admin):
User 'oda-admin' created successfully...
{
  "jobId" : "2f7a5cdc-9de7-412a-8107-552fd4980909",
  "status" : "Created",
  "message" : "The system will reboot, if required, to enable the licensed number of CPU cores",
  "reports" : [ ],
  "createTimestamp" : "July 22, 2022 12:14:19 PM UTC",
  "resourceList" : [ ],
  "description" : "Restore node service - GI",
  "updatedTime" : "July 22, 2022 12:14:19 PM UTC"
}

odacli describe-job -i 2f7a5cdc-9de7-412a-8107-552fd4980909

Job details
----------------------------------------------------------------
                     ID:  2f7a5cdc-9de7-412a-8107-552fd4980909
            Description:  Restore node service - GI
                 Status:  Success
                Created:  July 22, 2022 12:14:19 PM CEST
                Message:  The system will reboot, if required, to enable the licensed number of CPU cores

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Restore node service creation            July 22, 2022 12:14:28 PM CEST      July 22, 2022 12:38:26 PM CEST      Success
Setting up Network                       July 22, 2022 12:14:29 PM CEST      July 22, 2022 12:14:29 PM CEST      Success
Setting up Vlan                          July 22, 2022 12:14:48 PM CEST      July 22, 2022 12:14:48 PM CEST      Success
Setting up Network                       July 22, 2022 12:15:08 PM CEST      July 22, 2022 12:15:08 PM CEST      Success
network update                           July 22, 2022 12:15:32 PM CEST      July 22, 2022 12:15:51 PM CEST      Success
updating network                         July 22, 2022 12:15:32 PM CEST      July 22, 2022 12:15:51 PM CEST      Success
Setting up Network                       July 22, 2022 12:15:32 PM CEST      July 22, 2022 12:15:32 PM CEST      Success
OS usergroup 'asmdba'creation            July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
OS usergroup 'asmoper'creation           July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
OS usergroup 'asmadmin'creation          July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
OS usergroup 'dba'creation               July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
OS usergroup 'dbaoper'creation           July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
OS usergroup 'oinstall'creation          July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
OS user 'grid'creation                   July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
OS user 'oracle'creation                 July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
Default backup policy creation           July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
Backup config metadata persist           July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
Grant permission to RHP files            July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
Add SYSNAME in Env                       July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:15:52 PM CEST      Success
Install oracle-ahf                       July 22, 2022 12:15:52 PM CEST      July 22, 2022 12:16:53 PM CEST      Success
Stop DCS Admin                           July 22, 2022 12:16:55 PM CEST      July 22, 2022 12:16:55 PM CEST      Success
Generate mTLS certificates               July 22, 2022 12:16:55 PM CEST      July 22, 2022 12:16:56 PM CEST      Success
Exporting Public Keys                    July 22, 2022 12:16:56 PM CEST      July 22, 2022 12:16:58 PM CEST      Success
Creating Trust Store                     July 22, 2022 12:16:58 PM CEST      July 22, 2022 12:16:59 PM CEST      Success
Update config files                      July 22, 2022 12:16:59 PM CEST      July 22, 2022 12:17:00 PM CEST      Success
Restart DCS Admin                        July 22, 2022 12:17:00 PM CEST      July 22, 2022 12:17:20 PM CEST      Success
Unzipping storage configuration files    July 22, 2022 12:17:20 PM CEST      July 22, 2022 12:17:20 PM CEST      Success
Reloading multipath devices              July 22, 2022 12:17:20 PM CEST      July 22, 2022 12:17:20 PM CEST      Success
restart oakd                             July 22, 2022 12:17:20 PM CEST      July 22, 2022 12:17:31 PM CEST      Success
Reloading multipath devices              July 22, 2022 12:18:32 PM CEST      July 22, 2022 12:18:32 PM CEST      Success
restart oakd                             July 22, 2022 12:18:32 PM CEST      July 22, 2022 12:18:42 PM CEST      Success
Restore Quorum Disks                     July 22, 2022 12:18:42 PM CEST      July 22, 2022 12:18:43 PM CEST      Success
Creating GI home directories             July 22, 2022 12:18:43 PM CEST      July 22, 2022 12:18:43 PM CEST      Success
Extract GI clone                         July 22, 2022 12:18:43 PM CEST      July 22, 2022 12:20:15 PM CEST      Success
Creating wallet for Root User            July 22, 2022 12:20:15 PM CEST      July 22, 2022 12:20:20 PM CEST      Success
Creating wallet for ASM Client           July 22, 2022 12:20:20 PM CEST      July 22, 2022 12:20:23 PM CEST      Success
Grid stack creation                      July 22, 2022 12:20:24 PM CEST      July 22, 2022 12:31:53 PM CEST      Success
GI Restore with RHP                      July 22, 2022 12:20:24 PM CEST      July 22, 2022 12:28:32 PM CEST      Success
Updating GIHome version                  July 22, 2022 12:28:33 PM CEST      July 22, 2022 12:28:36 PM CEST      Success
Post cluster OAKD configuration          July 22, 2022 12:31:53 PM CEST      July 22, 2022 12:34:16 PM CEST      Success
Mounting disk group DATA                 July 22, 2022 12:34:16 PM CEST      July 22, 2022 12:34:17 PM CEST      Success
Mounting disk group RECO                 July 22, 2022 12:34:25 PM CEST      July 22, 2022 12:34:33 PM CEST      Success
Setting ACL for disk groups              July 22, 2022 12:34:40 PM CEST      July 22, 2022 12:34:44 PM CEST      Success
Register Scan and Vips to Public Network July 22, 2022 12:34:44 PM CEST      July 22, 2022 12:34:46 PM CEST      Success
Configure export clones resource         July 22, 2022 12:35:40 PM CEST      July 22, 2022 12:35:41 PM CEST      Success
Adding Volume COMMONSTORE to Clusterware July 22, 2022 12:35:43 PM CEST      July 22, 2022 12:35:47 PM CEST      Success
Adding Volume DATTSTNOC12 to Clusterware July 22, 2022 12:35:47 PM CEST      July 22, 2022 12:35:50 PM CEST      Success
Adding Volume DATTSTNOC18 to Clusterware July 22, 2022 12:35:50 PM CEST      July 22, 2022 12:35:54 PM CEST      Success
Adding Volume RDOTSTNOC12 to Clusterware July 22, 2022 12:35:54 PM CEST      July 22, 2022 12:35:57 PM CEST      Success
Adding Volume RDOTSTNOC18 to Clusterware July 22, 2022 12:35:57 PM CEST      July 22, 2022 12:36:01 PM CEST      Success
Adding Volume RECO to Clusterware        July 22, 2022 12:36:01 PM CEST      July 22, 2022 12:36:04 PM CEST      Success
Enabling Volume(s)                       July 22, 2022 12:36:04 PM CEST      July 22, 2022 12:37:16 PM CEST      Success
Provisioning service creation            July 22, 2022 12:38:25 PM CEST      July 22, 2022 12:38:25 PM CEST      Success
persist new agent state entry            July 22, 2022 12:38:25 PM CEST      July 22, 2022 12:38:25 PM CEST      Success
persist new agent state entry            July 22, 2022 12:38:25 PM CEST      July 22, 2022 12:38:25 PM CEST      Success
Restart Zookeeper and DCS Agent          July 22, 2022 12:38:25 PM CEST      July 22, 2022 12:38:26 PM CEST      Success

df -h /dev/asm/*
Filesystem                Size  Used Avail Use% Mounted on
/dev/asm/acfsclone-207    150G  7.0G  144G   5% /opt/oracle/oak/pkgrepos/orapkgs/clones
/dev/asm/commonstore-207  5.0G  319M  4.7G   7% /opt/oracle/dcs/commonstore
/dev/asm/dattstnoc12-207  100G  2.2G   98G   3% /u02/app/oracle/oradata/TSTNOC12
/dev/asm/dattstnoc18-207  100G  2.1G   98G   3% /u02/app/oracle/oradata/TSTNOC18
/dev/asm/rdotstnoc12-158   14G   13G  1.7G  88% /u04/app/oracle/redo/TSTNOC12
/dev/asm/rdotstnoc18-158   29G   25G  4.7G  84% /u04/app/oracle/redo/TSTNOC18
/dev/asm/reco-158         297G  976M  297G   1% /u03/app/oracle

su - grid
asmcmd ls +data/TSTC18/datafile/*
SYSAUX.271.1110706283
SYSTEM.270.1110706247
UNDOTBS1.272.1110706299
USERS.283.1110707033

exit

My ACFS and ASM files are there.

DPR from 18.8 to 19.15: reprovision the DB homes and remap the databases

I now need the DB clone files to be deployed. If you remember, I had 2 18.8 databases and 1 12.1 database. The required clone files can be found in a list:

cat /opt/oracle/oak/restore/metadata/dbVersions.list
# List of all db versions found, to be used for downloading required clones before DB restore
18.8.0.0.191015
12.1.0.2.191015

I will use the DB clones from the 18.8 release.

I first need to create the dbhome storage:

odacli configure-dbhome-storage -dg DATA -s 80

And add the DB clones to the repository:

unzip p23494992_188000_Linux-x86-64.zip
odacli update-repository -f /opt/dbi/odacli-dcs-18.8.0.0.0-191226-DB-12.1.0.2.zip

unzip p27604558_188000_Linux-x86-64.zip
odacli update-repository -f /opt/dbi/odacli-dcs-18.8.0.0.0-191226-DB-18.0.0.0.zip

Now it’s possible to remap the databases to the DB homes with restore-node -d (database):

odacli restore-node -d

odacli describe-job -i 4b579397-2c93-420a-846b-5a2aebe71d32

Job details
----------------------------------------------------------------
                     ID:  4b579397-2c93-420a-846b-5a2aebe71d32
            Description:  Restore node service - Database
                 Status:  Success
                Created:  July 22, 2022 4:36:29 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting up ssh equivalance               July 22, 2022 4:36:49 PM CEST       July 22, 2022 4:36:49 PM CEST       Success
DB home creation: OraDB18000_home1       July 22, 2022 4:36:50 PM CEST       July 22, 2022 4:39:31 PM CEST       Success
Validating dbHome available space        July 22, 2022 4:36:50 PM CEST       July 22, 2022 4:36:50 PM CEST       Success
Creating DbHome Directory                July 22, 2022 4:36:50 PM CEST       July 22, 2022 4:36:50 PM CEST       Success
Create required directories              July 22, 2022 4:36:50 PM CEST       July 22, 2022 4:36:50 PM CEST       Success
Extract DB clone                         July 22, 2022 4:36:50 PM CEST       July 22, 2022 4:38:15 PM CEST       Success
ProvDbHome by using RHP                  July 22, 2022 4:38:15 PM CEST       July 22, 2022 4:39:12 PM CEST       Success
Enable DB options                        July 22, 2022 4:39:13 PM CEST       July 22, 2022 4:39:29 PM CEST       Success
Creating wallet for DB Client            July 22, 2022 4:39:31 PM CEST       July 22, 2022 4:39:31 PM CEST       Success
DB home creation: OraDB12102_home1       July 22, 2022 4:39:31 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
Validating dbHome available space        July 22, 2022 4:39:31 PM CEST       July 22, 2022 4:39:31 PM CEST       Success
Creating DbHome Directory                July 22, 2022 4:39:31 PM CEST       July 22, 2022 4:39:31 PM CEST       Success
Create required directories              July 22, 2022 4:39:31 PM CEST       July 22, 2022 4:39:31 PM CEST       Success
Extract DB clone                         July 22, 2022 4:39:31 PM CEST       July 22, 2022 4:41:02 PM CEST       Success
ProvDbHome by using RHP                  July 22, 2022 4:41:02 PM CEST       July 22, 2022 4:42:30 PM CEST       Success
Enable DB options                        July 22, 2022 4:42:30 PM CEST       July 22, 2022 4:42:40 PM CEST       Success
Creating wallet for DB Client            July 22, 2022 4:42:46 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
Persist database storage locations       July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
  Save metadata for TSTNOC18             July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
  Save metadata for TSTC18               July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
  Save metadata for TSTNOC12             July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
Persist database storages                July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
  Save metadata for TSTNOC18             July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
  Save metadata for TSTC18               July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
  Save metadata for TSTNOC12             July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:47 PM CEST       Success
Restore database: TSTNOC18               July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:44:11 PM CEST       Success
  Adding database TSTNOC18 to GI         July 22, 2022 4:42:47 PM CEST       July 22, 2022 4:42:49 PM CEST       Success
  Adding database instance(s) to GI      July 22, 2022 4:42:49 PM CEST       July 22, 2022 4:42:49 PM CEST       Success
  Modifying SPFILE for database          July 22, 2022 4:42:49 PM CEST       July 22, 2022 4:43:33 PM CEST       Success
  Restore password file for database     July 22, 2022 4:43:33 PM CEST       July 22, 2022 4:43:34 PM CEST       Success
  Start instance(s) for database         July 22, 2022 4:43:34 PM CEST       July 22, 2022 4:43:52 PM CEST       Success
  Persist metadata for database          July 22, 2022 4:43:52 PM CEST       July 22, 2022 4:43:52 PM CEST       Success
  Clear all listeners from Database      July 22, 2022 4:43:52 PM CEST       July 22, 2022 4:43:53 PM CEST       Success
  Run SqlPatch                           July 22, 2022 4:43:56 PM CEST       July 22, 2022 4:44:11 PM CEST       Success
Restore database: TSTC18                 July 22, 2022 4:44:11 PM CEST       July 22, 2022 4:46:05 PM CEST       Success
  Adding database TSTC18 to GI           July 22, 2022 4:44:11 PM CEST       July 22, 2022 4:44:13 PM CEST       Success
  Adding database instance(s) to GI      July 22, 2022 4:44:13 PM CEST       July 22, 2022 4:44:13 PM CEST       Success
  Modifying SPFILE for database          July 22, 2022 4:44:13 PM CEST       July 22, 2022 4:45:05 PM CEST       Success
  Restore password file for database     July 22, 2022 4:45:05 PM CEST       July 22, 2022 4:45:05 PM CEST       Success
  Start instance(s) for database         July 22, 2022 4:45:05 PM CEST       July 22, 2022 4:45:28 PM CEST       Success
  Persist metadata for database          July 22, 2022 4:45:28 PM CEST       July 22, 2022 4:45:28 PM CEST       Success
  Clear all listeners from Database      July 22, 2022 4:45:28 PM CEST       July 22, 2022 4:45:29 PM CEST       Success
  Run SqlPatch                           July 22, 2022 4:45:32 PM CEST       July 22, 2022 4:46:05 PM CEST       Success
Restore database: TSTNOC12               July 22, 2022 4:46:05 PM CEST       July 22, 2022 4:47:38 PM CEST       Success
  Adding database TSTNOC12 to GI         July 22, 2022 4:46:05 PM CEST       July 22, 2022 4:46:07 PM CEST       Success
  Adding database instance(s) to GI      July 22, 2022 4:46:07 PM CEST       July 22, 2022 4:46:07 PM CEST       Success
  Modifying SPFILE for database          July 22, 2022 4:46:07 PM CEST       July 22, 2022 4:46:40 PM CEST       Success
  Restore password file for database     July 22, 2022 4:46:40 PM CEST       July 22, 2022 4:46:41 PM CEST       Success
  Start instance(s) for database         July 22, 2022 4:46:41 PM CEST       July 22, 2022 4:46:59 PM CEST       Success
  Persist metadata for database          July 22, 2022 4:46:59 PM CEST       July 22, 2022 4:46:59 PM CEST       Success
  Clear all listeners from Database      July 22, 2022 4:46:59 PM CEST       July 22, 2022 4:47:00 PM CEST       Success
  Run SqlPatch                           July 22, 2022 4:47:03 PM CEST       July 22, 2022 4:47:38 PM CEST       Success
Restore Object Stores                    July 22, 2022 4:47:38 PM CEST       July 22, 2022 4:47:38 PM CEST       Success
Remount NFS backups                      July 22, 2022 4:47:38 PM CEST       July 22, 2022 4:47:38 PM CEST       Success
Restore BackupConfigs                    July 22, 2022 4:47:38 PM CEST       July 22, 2022 4:47:38 PM CEST       Success
Reattach backupconfigs to DBs            July 22, 2022 4:47:38 PM CEST       July 22, 2022 4:47:39 PM CEST       Success
Restore backup reports                   July 22, 2022 4:47:39 PM CEST       July 22, 2022 4:47:39 PM CEST       Success

Everything looks OK.

Let’s check databases and pmons:

odacli list-databases

ID                                       DB Name    DB Type  DB Version           CDB        Class    Shape    Storage    Status        DbHomeID
---------------------------------------- ---------- -------- -------------------- ---------- -------- -------- ---------- ------------ ----------------------------------------
0a06b4a6-29a4-41e9-bd38-f1f2984f3590     TSTNOC18   SI       18.8.0.0.191015      false      OLTP     odb6     ACFS       CONFIGURED   3514b78c-a1c8-42dd-a9b7-ff13d5a496fe
35895ff7-f485-43d9-8877-bdc0a872a4bf     TSTC18     SI       18.8.0.0.191015      true       OLTP     odb2     ASM        CONFIGURED   3514b78c-a1c8-42dd-a9b7-ff13d5a496fe
9c18400f-403c-4a6b-b0f7-7ec8bd7d4c87     TSTNOC12   SI       12.1.0.2.191015      false      OLTP     odb4     ACFS       CONFIGURED   3b00a7ce-bdb4-4d4c-8153-9dac8dff6e61

ps -ef | grep pmon | grep -v grep
oracle     547     1  0 16:45 ?        00:00:00 ora_pmon_TSTC18
grid      6986     1  0 14:30 ?        00:00:00 asm_pmon_+ASM1
oracle   13404     1  0 16:46 ?        00:00:00 ora_pmon_TSTNOC12
grid     31490     1  0 14:35 ?        00:00:00 apx_pmon_+APX1
oracle   87126     1  0 16:43 ?        00:00:00 ora_pmon_TSTNOC18

Perfect!

I would recommend deploying the newest DB homes and migrating your databases to get rid of these old versions.

Conclusion

This feature is really convenient, and it works. It’s definitely cleaner than patching, and faster than a complete reimaging. I just hope this will be possible for all starting versions in the next releases. For now, I have no plan to use it because most of the ODAs I work on are already running 19.x. Those using old versions are also running on old hardware.

The article ODA: How to use Data Preserving Reprovisioning? first appeared on the dbi Blog.

Setup a Rocky Linux Repository server

Mon, 2022-07-18 14:22

This blog describes the setup of a Rocky Linux repository server.

Many people would argue that it is the same as RHEL 8, but it is not: there are some differences between AlmaLinux, Rocky Linux, Oracle Linux and RHEL 8.

The base is a Rocky Linux 8.6 minimal installation.

[root@rockylinux-8-repo ~]# cat /etc/os-release 
NAME="Rocky Linux"
VERSION="8.6 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.6 (Green Obsidian)"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky Linux"
ROCKY_SUPPORT_PRODUCT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8"
[root@rockylinux-8-repo ~]# 

The installation of EPEL is similar to AlmaLinux; in short, it is the same.

[root@rockylinux-8-repo ~]# dnf config-manager --set-enabled powertools
[root@rockylinux-8-repo ~]# dnf install epel-release

RPMFusion is set up exactly the same way as on AlmaLinux.

[root@rockylinux-8-repo ~]# dnf install --nogpgcheck https://mirrors.rpmfusion.org/free/el/rpmfusion-free-release-$(rpm -E %rhel).noarch.rpm https://mirrors.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-$(rpm -E %rhel).noarch.rpm

The next step is the installation of the required RPMs. I’m also using nginx to reuse as much as possible from my RHEL 8 repository blog. The subscription-manager package is needed to switch between the different Rocky Linux 8 releases we want to host on our own repository server.

[root@rockylinux-8-repo ~]# dnf install nginx yum-utils createrepo_c subscription-manager

It is interesting to check the active repositories.

[root@rockylinux-8-repo ~]# dnf repolist
repo id                                                                                                                        repo name
appstream                                                                                                                      Rocky Linux 8 - AppStream
baseos                                                                                                                         Rocky Linux 8 - BaseOS
epel                                                                                                                           Extra Packages for Enterprise Linux 8 - x86_64
epel-modular                                                                                                                   Extra Packages for Enterprise Linux Modular 8 - x86_64
extras                                                                                                                         Rocky Linux 8 - Extras
powertools                                                                                                                     Rocky Linux 8 - PowerTools
rpmfusion-free-updates                                                                                                         RPM Fusion for EL 8 - Free - Updates
rpmfusion-nonfree-updates                                                                                                      RPM Fusion for EL 8 - Nonfree - Updates
[root@rockylinux-8-repo ~]# 

The repo IDs are exactly the same as on AlmaLinux, which makes it easy: the script for synchronizing the repositories will be identical.

[root@rockylinux-8-repo /]# cat /opt/reposync/reposync.sh 
#!/bin/bash

################################################
# Synchronisation of RockyLinux 8 Repositories #
# By Karsten Lenz dbi-services sa 2022.06.29   #
################################################

progName=$(basename "$0")

echo "Synchronisation of Rocky Linux Repositories"
echo ""

# Help
function printHelp() {
  printf "Usage:\n"
  printf "${progName} [OPTION]\n\n"
  printf "Options:\n"
  printf "\t -v <Rocky Linux Version>\t\tRocky Linux Release (required)\n"
  printf "\t -h <Help>\t\t\tprints this help\n"
}

# Select Options
while getopts v:h option 2>/dev/null
do
  case "${option}"
  in
  v) VERSION=${OPTARG};;
  h) printHelp; exit 2;;
  *) printf "Unsupported option or parameter value missing '$*'\n";
     printf "Run ${progName} -h to print help\n"; exit 1;;
  esac
done
# Extract Major Release
MAJOR=${VERSION:0:1}

# Set Rocky Linux release to sync
printf "Set Release to sync"
subscription-manager release --set=$VERSION && rm -rf /var/cache/dnf

# SYNC BaseOS, AppStream, Extras, PowerTools, EPEL and RPMFusion
if [ $MAJOR == '8' ]
then
    	printf "Sync Base OS "
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=baseos
        printf "Sync Appstream "
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=appstream
        printf "Sync Extras " 
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=extras
        printf "Sync Powertools " 
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=powertools
        printf "Sync EPEL 8"
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=epel
        printf "Sync EPEL 8 Modular "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=epel-modular
        printf "Sync rpmfusion-free "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=rpmfusion-free-updates
        printf "Sync rpmfusion-nonfree "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=rpmfusion-nonfree-updates
fi

[root@rockylinux-8-repo /]# 

With this script it is possible to store different Rocky Linux 8 releases on your own repository server, e.g. sh reposync.sh -v 8.6 for release 8.6.
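To keep the mirror current, the sync can also be scheduled. A possible system crontab entry; the schedule, script path and log file here are only suggestions:

```shell
# /etc/crontab — sync release 8.6 every Sunday at 02:00 (example schedule)
0 2 * * 0 root /bin/sh /opt/reposync/reposync.sh -v 8.6 >> /var/log/reposync.log 2>&1
```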

[root@rockylinux-8-repo 8]# du -sh *
15G	8.4
15G	8.5
15G	8.6
13G	epel
1.1G	epel-modular
297M	rpmfusion-free-updates
1.1G	rpmfusion-nonfree-updates
[root@rockylinux-8-repo 8]# 

Now it is time to configure nginx. The configuration is similar to the one used for the RHEL 8 and AlmaLinux 8 repository servers.

[root@rockylinux-8-repo /]# cat /etc/nginx/nginx.conf
# For more information on configuration, see:
#   * Official English Documentation: http://nginx.org/en/docs/
#   * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

    server {
	listen       80 default_server;
        listen       [::]:80 default_server;
        server_name  _;
        root         /usr/share/nginx/html/;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        location / {
                allow all;
                sendfile on;
                sendfile_max_chunk 1m;
                autoindex on;
                autoindex_exact_size off;
                autoindex_format html;
                autoindex_localtime on;
        }
          error_page 404 /404.html;
            location = /40x.html {
        }

          error_page 500 502 503 504 /50x.html;
            location = /50x.html {
        }
    }

#    server {
#        listen       80 default_server;
#        listen       [::]:80 default_server;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        location / {
#        }
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }
#
# Settings for a TLS enabled server.
#
#    server {
#        listen       443 ssl http2 default_server;
#        listen       [::]:443 ssl http2 default_server;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        ssl_certificate "/etc/pki/nginx/server.crt";
#        ssl_certificate_key "/etc/pki/nginx/private/server.key";
#        ssl_session_cache shared:SSL:1m;
#        ssl_session_timeout  10m;
#        ssl_ciphers PROFILE=SYSTEM;
#        ssl_prefer_server_ciphers on;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        location / {
#        }
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }

}

If SELinux is set to enforcing, we need to make some adaptations.

[root@rockylinux-8-repo /]# getenforce
Enforcing
[root@rockylinux-8-repo /]# setfacl -R -m u:nginx:rwx /usr/share/nginx/html/8/
[root@rockylinux-8-repo /]# chcon -Rt httpd_sys_content_t /usr/share/nginx/html/8/
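Note that chcon changes do not survive a full filesystem relabel. To make the context persistent, the file-context rule can instead be stored in the SELinux policy (this assumes the policycoreutils-python-utils package, which provides semanage, is installed):

```shell
# persist the label in the policy, then apply it to the existing files
semanage fcontext -a -t httpd_sys_content_t "/usr/share/nginx/html(/.*)?"
restorecon -Rv /usr/share/nginx/html/
```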

The firewall needs to be opened for http (and https, if required), and nginx restarted.

[root@rockylinux-8-repo /]# firewall-cmd --zone=public --permanent --add-service=http
success
[root@rockylinux-8-repo /]# firewall-cmd --zone=public --permanent --add-service=https
success
[root@rockylinux-8-repo /]# firewall-cmd --reload
success
[root@rockylinux-8-repo /]# systemctl restart nginx

The repository server is up and working.
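On a client, the new server can be consumed with a simple repo file. The hostname and directory layout below are assumptions derived from the reposync paths used above and may need adapting:

```ini
# /etc/yum.repos.d/local-mirror.repo (example, hostname assumed)
[local-baseos]
name=Rocky Linux 8.6 - BaseOS (local mirror)
baseurl=http://rockylinux-8-repo/8/8.6/baseos/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-rockyofficial

[local-appstream]
name=Rocky Linux 8.6 - AppStream (local mirror)
baseurl=http://rockylinux-8-repo/8/8.6/appstream/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-rockyofficial
```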

The article Setup a Rocky Linux Repository server first appeared on the dbi Blog.

Setup a AlmaLinux 8 Repository Server

Mon, 2022-07-18 01:28

This blog shows how to set up an AlmaLinux 8 repository server including EPEL and RPMFusion.

Many people would argue that it is the same as RHEL 8, but it is not: there are some differences between AlmaLinux, Rocky Linux, Oracle Linux and RHEL 8.

Base is a AlmaLinux 8.6 minimal installation.

[root@almalinux-8-repo ~]# cat /etc/os-release 
NAME="AlmaLinux"
VERSION="8.6 (Sky Tiger)"
ID="almalinux"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="AlmaLinux 8.6 (Sky Tiger)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:almalinux:almalinux:8::baseos"
HOME_URL="https://almalinux.org/"
DOCUMENTATION_URL="https://wiki.almalinux.org/"
BUG_REPORT_URL="https://bugs.almalinux.org/"

ALMALINUX_MANTISBT_PROJECT="AlmaLinux-8"
ALMALINUX_MANTISBT_PROJECT_VERSION="8.6"

[root@almalinux-8-repo ~]# 

The installation of EPEL is a bit different from RHEL 8.6: there is no need for CodeReady Builder; just enable powertools first.

[root@almalinux-8-repo ~]# dnf config-manager --set-enabled powertools
[root@almalinux-8-repo ~]# dnf install epel-release

Adding RPMFusion is a one-line command. We have enabled powertools and added EPEL before, both of which are mandatory for RPMFusion.

[root@almalinux-8-repo ~]# dnf install --nogpgcheck https://mirrors.rpmfusion.org/free/el/rpmfusion-free-release-$(rpm -E %rhel).noarch.rpm https://mirrors.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-$(rpm -E %rhel).noarch.rpm

The next step is the installation of the required RPMs. I’m also using nginx to reuse as much as possible from my RHEL 8 repository blog. The subscription-manager package is needed to switch between the different AlmaLinux 8 releases we want to host on our own repository server.

[root@almalinux-8-repo ~]# dnf install nginx yum-utils createrepo_c subscription-manager

Now it is time to script the reposync function to sync the required repositories. For that we need to know the repo IDs in order to adapt the script originally written for RHEL 8.

[root@almalinux-8-repo reposync]# dnf repolist
repo id                                                                                                                        repo name
appstream                                                                                                                      AlmaLinux 8 - AppStream
baseos                                                                                                                         AlmaLinux 8 - BaseOS
epel                                                                                                                           Extra Packages for Enterprise Linux 8 - x86_64
epel-modular                                                                                                                   Extra Packages for Enterprise Linux Modular 8 - x86_64
extras                                                                                                                         AlmaLinux 8 - Extras
powertools                                                                                                                     AlmaLinux 8 - PowerTools
rpmfusion-free-updates                                                                                                         RPM Fusion for EL 8 - Free - Updates
rpmfusion-nonfree-updates                                                                                                      RPM Fusion for EL 8 - Nonfree - Updates
[root@almalinux-8-repo reposync]# 

Here the adapted script for AlmaLinux 8.

[root@almalinux-8-repo /]# cat /opt/reposync/reposync.sh 
#!/bin/bash

###############################################
# Synchronisation of AlmaLinux 8 Repositories #
# By Karsten Lenz dbi-services sa 2022.06.29  #
###############################################

progName=$(basename "$0")

echo "Synchronisation of AlmaLinux Repositories"
echo ""

# Help
function printHelp() {
  printf "Usage:\n"
  printf "${progName} [OPTION]\n\n"
  printf "Options:\n"
  printf "\t -v <AlmaLinux Version>\t\tAlmaLinux Release (required)\n"
  printf "\t -h <Help>\t\t\tprints this help\n"
}

# Select Options
while getopts v:h option 2>/dev/null
do
  case "${option}"
  in
  v) VERSION=${OPTARG};;
  h) printHelp; exit 2;;
  *) printf "Unsupported option or parameter value missing '$*'\n";
     printf "Run ${progName} -h to print help\n"; exit 1;;
  esac
done
# Extract Major Release
MAJOR=${VERSION:0:1}

# Set AlmaLinux release to sync
printf "Set Release to sync"
subscription-manager release --set=$VERSION && rm -rf /var/cache/dnf

# SYNC BaseOS, AppStream, Extras, PowerTools, EPEL and RPMFusion
if [ $MAJOR == '8' ]
then
    	printf "Sync Base OS "
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=baseos
        printf "Sync Appstream "
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=appstream
        printf "Sync Extras " 
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=extras
        printf "Sync Powertools " 
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=powertools
        printf "Sync EPEL 8"
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=epel
        printf "Sync EPEL 8 Modular "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=epel-modular
        printf "Sync rpmfusion-free "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=rpmfusion-free-updates
        printf "Sync rpmfusion-nonfree "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=rpmfusion-nonfree-updates
fi

With this script it is possible to store different AlmaLinux 8 releases on your own repository server.

[root@almalinux-8-repo 8]# du -sh *
17G	8.4
17G	8.5
17G	8.6
13G	epel
1.1G	epel-modular
297M	rpmfusion-free-updates
1.1G	rpmfusion-nonfree-updates
[root@almalinux-8-repo 8]# 

Now it is time to configure nginx. The configuration is similar to the one used for the RHEL 8 repository server.

[root@almalinux-8-repo /]# cat /etc/nginx/nginx.conf
# For more information on configuration, see:
#   * Official English Documentation: http://nginx.org/en/docs/
#   * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

    server {
	listen       80 default_server;
        listen       [::]:80 default_server;
        server_name  _;
        root         /usr/share/nginx/html/;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        location / {
                allow all;
                sendfile on;
                sendfile_max_chunk 1m;
                autoindex on;
                autoindex_exact_size off;
                autoindex_format html;
                autoindex_localtime on;
        }
          error_page 404 /404.html;
            location = /40x.html {
        }

          error_page 500 502 503 504 /50x.html;
            location = /50x.html {
        }
    }

#    server {
#        listen       80 default_server;
#        listen       [::]:80 default_server;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        location / {
#        }
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }
#
# Settings for a TLS enabled server.
#
#    server {
#        listen       443 ssl http2 default_server;
#        listen       [::]:443 ssl http2 default_server;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        ssl_certificate "/etc/pki/nginx/server.crt";
#        ssl_certificate_key "/etc/pki/nginx/private/server.key";
#        ssl_session_cache shared:SSL:1m;
#        ssl_session_timeout  10m;
#        ssl_ciphers PROFILE=SYSTEM;
#        ssl_prefer_server_ciphers on;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        location / {
#        }
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }

}

[root@almalinux-8-repo /]# 

If SELinux is set to enforcing, we need to make some adaptations.

[root@almalinux-8-repo /]# getenforce
Enforcing
[root@almalinux-8-repo /]# setfacl -R -m u:nginx:rwx /usr/share/nginx/html/8/
[root@almalinux-8-repo /]# chcon -Rt httpd_sys_content_t /usr/share/nginx/html/8/

The firewall needs to be opened for http (and https, if required), and nginx restarted.

[root@almalinux-8-repo /]# firewall-cmd --zone=public --permanent --add-service=http
success
[root@almalinux-8-repo /]# firewall-cmd --zone=public --permanent --add-service=https
success
[root@almalinux-8-repo /]# firewall-cmd --reload
success
[root@almalinux-8-repo /]# systemctl restart nginx

Now the repository server is up and running.
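A quick way to point a client at the new server is dnf config-manager; the hostname and paths below are assumptions based on the reposync directory layout used above:

```shell
# on an AlmaLinux 8 client (hostname and layout assumed)
dnf config-manager --add-repo http://almalinux-8-repo/8/8.6/baseos/
dnf config-manager --add-repo http://almalinux-8-repo/8/8.6/appstream/
dnf repolist
```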

The article Setup a AlmaLinux 8 Repository Server first appeared on the dbi Blog.

Setup a RHEL 8 Repository Server

Mon, 2022-07-18 01:27

This blog is about building your own repository server for RHEL 8; blogs on repository servers for AlmaLinux, Oracle Linux, Rocky Linux, openSUSE Leap, SLES 15 and Debian 11 will follow.

The base is a minimal installation.
I have added EPEL and RPMFusion, since desktop systems or systems with packages from EPEL may also need to be updated.

[root@rhel-8-repo ~]# cat /etc/os-release 
NAME="Red Hat Enterprise Linux"
VERSION="8.6 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.6 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.6"
[root@rhel-8-repo ~]# 

As with any RHEL system, it is mandatory that the system is attached to a subscription for this to work.

[root@rhel-8-repo ~]# subscription-manager register
Registering to: subscription.rhsm.redhat.com:443/subscription
Username: XXXXXXXXXXXXXXXXXX
Password: 
The system has been registered with ID: XXXXXXXXXXX
The registered system name is: rhel-8-repo.localdomain
[root@rhel-8-repo ~]# 

Now we can add EPEL and RPMFusion. EPEL comes first, since it is required for RPMFusion.
For EPEL we need to enable codeready-builder first.

[root@rhel-8-repo ~]# subscription-manager repos --enable codeready-builder-for-rhel-8-$(arch)-rpms
Repository 'codeready-builder-for-rhel-8-x86_64-rpms' is enabled for this system.
[root@rhel-8-repo ~]# dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

Now we can add RPMFusion.

[root@rhel-8-repo ~]# dnf install --nogpgcheck https://dl.fedoraproject.org/pub/epel/epel-release-latest-$(rpm -E %rhel).noarch.rpm
[root@rhel-8-repo ~]# dnf install --nogpgcheck https://mirrors.rpmfusion.org/free/el/rpmfusion-free-release-$(rpm -E %rhel).noarch.rpm https://mirrors.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-$(rpm -E %rhel).noarch.rpm

I’m using nginx as the HTTP service, and we need yum-utils and createrepo_c for the sync functionality, so let’s install them.

[root@rhel-8-repo ~]# dnf install nginx yum-utils createrepo_c

For synchronizing the repositories I have written a script where the release number can be given as a parameter.
The reason is that it is sometimes required to have more than one release available. We also need to know which repositories are active.

[root@rhel-8-repo /]# dnf repolist
Updating Subscription Management repositories.
repo id                                                                                                                              repo name
codeready-builder-for-rhel-8-x86_64-rpms                                                                                             Red Hat CodeReady Linux Builder for RHEL 8 x86_64 (RPMs)
epel                                                                                                                                 Extra Packages for Enterprise Linux 8 - x86_64
epel-modular                                                                                                                         Extra Packages for Enterprise Linux Modular 8 - x86_64
rhel-8-for-x86_64-appstream-rpms                                                                                                     Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs)
rhel-8-for-x86_64-baseos-rpms                                                                                                        Red Hat Enterprise Linux 8 for x86_64 - BaseOS (RPMs)
rpmfusion-free-updates                                                                                                               RPM Fusion for EL 8 - Free - Updates
rpmfusion-nonfree-updates                                                                                                            RPM Fusion for EL 8 - Nonfree - Updates
[root@rhel-8-repo /]#

With this information we can write a sync script.

[root@rhel-8-repo /]# cat /opt/reposync/reposync.sh 
#!/bin/bash

##############################################
# Synchronisation of RHEL Repositories       #
# By Karsten Lenz dbi-services sa 2022.06.29 #
##############################################

progName=$(basename "$0")

echo "Synchronisation of RHEL Repositories"
echo ""

# Help
function printHelp() {
  printf "Usage:\n"
  printf "${progName} [OPTION]\n\n"
  printf "Options:\n"
  printf "\t -v <RHEL Version>\t\tRHEL Release (required)\n"
  printf "\t -h <Help>\t\t\tprints this help\n"
}

# Select Options
while getopts v:h option 2>/dev/null
do
  case "${option}"
  in
  v) VERSION=${OPTARG};;
  h) printHelp; exit 2;;
  *) printf "Unsupported option or parameter value missing '$*'\n";
     printf "Run ${progName} -h to print help\n"; exit 1;;
  esac
done
# Extract Major Release
MAJOR=${VERSION:0:1}

# Set RHEL RELEASE to sync
printf "Set Release to sync"
subscription-manager release --set=$VERSION && rm -rf /var/cache/dnf

# SYNC BASE-OS, APPSTREAM, Codeready, EPEL and rpmfusion
if [ $MAJOR == '8' ]
then
    	printf "Sync Base OS "
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=rhel-8-for-x86_64-baseos-rpms 
        printf "Sync Appstream "
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=rhel-8-for-x86_64-appstream-rpms
        printf "Sync Codeready "
        reposync -p /usr/share/nginx/html/$MAJOR/$VERSION --download-metadata --newest-only --delete --repoid=codeready-builder-for-rhel-8-x86_64-rpms
        printf "Sync EPEL 8 "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=epel
        printf "Sync EPEL 8 Modular "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=epel-modular 
        printf "Sync rpmfusion-free "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=rpmfusion-free-updates
        printf "Sync rpmfusion-nonfree "
        reposync -p /usr/share/nginx/html/$MAJOR --download-metadata --newest-only --delete --repoid=rpmfusion-nonfree-updates
fi
[root@rhel-8-repo /]# 

EPEL and RPMFusion do not have minor releases, so there is only one repository for all RHEL 8 versions.
That is why these repositories are stored only under the extracted major release number.

Some explanations: --newest-only downloads only the newest packages per repo, and --delete removes local packages that are no longer present in the upstream repository.
The switch -v sets the version for which the local repository should be created (for example 8.5 or 8.6), so a local repository for each release in use can be maintained with a single script.
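One caveat in the script: MAJOR=${VERSION:0:1} keeps only the first character of the version, which works for 8.x but would break for a two-digit major release. A small sketch of a more robust extraction, stripping everything from the first dot:

```shell
#!/bin/bash
# ${VERSION:0:1} keeps only the first character: fine for 8.x,
# but a hypothetical release "10.0" would yield major "1".
# Removing the shortest suffix starting at "." is more robust:
VERSION="8.6"
MAJOR=${VERSION%%.*}
echo "$MAJOR"            # prints 8

VERSION="10.0"           # hypothetical two-digit major release
echo "${VERSION%%.*}"    # prints 10
```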

The shell script has a help function.

[root@rhel-8-repo reposync]# sh reposync.sh -h
Synchronisation of RHEL Repositores

Usage:
 [OPTION]

Options:
	 -v <RHEL Version>		RHEL Release (required)
	 -h <Help>			prints this help
[root@rhel-8-repo reposync]# 

Running it for RHEL 8.6, the subscription-manager release command sets the release to sync.

[root@rhel-8-repo reposync]# sh reposync.sh -v 8.6

With the parameters –newest-only and –delete, the sync requires about 66GB: 21GB for RHEL 8.6, 18GB for RHEL 8.5, 13GB for RHEL 8.4, 13GB for EPEL and 1.4GB for RPMFusion.

[root@rhel-8-repo 8]# du -sh *
13G	8.4
18G	8.5
21G	8.6
13G	epel
1.1G	epel-modular
297M	rpmfusion-free-updates
1.1G	rpmfusion-nonfree-updates
[root@rhel-8-repo 8]# 

Before configuring NGINX we need to open firewall ports 80 and 443 (http and https).

[root@rhel-8-repo /]# firewall-cmd --zone=public --permanent --add-service=http
success
[root@rhel-8-repo /]# firewall-cmd --zone=public --permanent --add-service=https
success
[root@rhel-8-repo /]# firewall-cmd --reload
success
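After the reload, it can be verified that both services are actually open in the public zone:

```shell
# List the services allowed in the public zone; after the commands above
# the list should include http and https
firewall-cmd --zone=public --list-services
```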

I adapted the original nginx.conf and replaced the server section with my settings:

[root@rhel-8-repo ~]# cat /etc/nginx/nginx.conf
# For more information on configuration, see:
#   * Official English Documentation: http://nginx.org/en/docs/
#   * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

    server {
	listen       80 default_server;
        listen       [::]:80 default_server;
        server_name  _;
        root         /usr/share/nginx/html/;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        location / {
                allow all;
                sendfile on;
                sendfile_max_chunk 1m;
                autoindex on;
                autoindex_exact_size off;
                autoindex_format html;
                autoindex_localtime on;
        }
	  error_page 404 /404.html;
            location = /40x.html {
        }

	  error_page 500 502 503 504 /50x.html;
            location = /50x.html {
        }
    }


#    server {
#        listen       80 default_server;
#        listen       [::]:80 default_server;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        location / {
#        }
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }

# Settings for a TLS enabled server.
#
#    server {
#        listen       443 ssl http2 default_server;
#        listen       [::]:443 ssl http2 default_server;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        ssl_certificate "/etc/pki/nginx/server.crt";
#        ssl_certificate_key "/etc/pki/nginx/private/server.key";
#        ssl_session_cache shared:SSL:1m;
#        ssl_session_timeout  10m;
#        ssl_ciphers PROFILE=SYSTEM;
#        ssl_prefer_server_ciphers on;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        location / {
#        }
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }

}

[root@rhel-8-repo ~]# 

The next topic is SELinux: when it is set to enforcing, we need to set the file context and ACLs and restart nginx.

[root@rhel-8-repo /]# chcon -Rt httpd_sys_content_t /usr/share/nginx/html/repo/
[root@rhel-8-repo /]# setfacl -R -m u:nginx:rwx /usr/share/nginx/html/repo/
[root@rhel-8-repo /]# systemctl restart nginx
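Note that chcon changes do not survive a filesystem relabel. A persistent alternative (assuming the policycoreutils-python-utils package, which provides semanage, is installed) would be:

```shell
# Record the context rule in the local SELinux policy...
semanage fcontext -a -t httpd_sys_content_t "/usr/share/nginx/html/repo(/.*)?"
# ...and apply it recursively; a future relabel will keep this context
restorecon -Rv /usr/share/nginx/html/repo/
systemctl restart nginx
```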

That’s it: a repository server for three RHEL 8 releases, including EPEL and RPMFusion.
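To keep the mirror current, the script can be scheduled. A hypothetical weekly cron entry (the script path and log file are assumptions) could look like this:

```shell
# /etc/cron.d/reposync (hypothetical): sync RHEL 8.6 every Sunday at 02:00
0 2 * * 0 root /root/reposync/reposync.sh -v 8.6 >> /var/log/reposync.log 2>&1
```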

L’article Setup a RHEL 8 Repository Server est apparu en premier sur dbi Blog.

Documentum – Encrypt BOF passwords with a minimal jar file

Sun, 2022-07-17 10:25

Nowadays, with DevOps and container approaches, you will often see components or software that try to go on a diet, to reduce the size needed to deploy them and therefore the images/containers associated with them. I was recently asked to do something similar for a very specific feature that the Documentum DFC jar files provide: extract and minimize the jar file(s) needed for the encryption of a BOF password. In this blog, I will therefore go through the steps I took to achieve this.

As you might know, Documentum DFC jar files provide a lot of capabilities, including encryption and decryption of passwords, as long as you know which class needs to be called for that. As mentioned, I was recently asked to strip down the needed jar files to only allow the encryption of BOF passwords and nothing more than what is strictly necessary for it. Since encryption and decryption use two different classes, it also means that this new jar file will not be able to decrypt passwords, only encrypt them. Of course, it might technically be possible to decompile the Documentum java classes and create your own custom class using the (decompiled) source code from OpenText, but let me remind you that it is proprietary software, so that’s not something I will risk doing…

I worked on a Documentum 20.2 environment, but I assume the outcome would be very similar on other versions. The approach I used is basically a die-and-retry one: I know which class is used to encrypt BOF passwords (“com.documentum.fc.tools.RegistryPasswordUtils“) and I know it is inside the dfc.jar file. Therefore, I can start from there and try to use only this class file to encrypt a password. This will most likely fail a few times, complaining about a “NoClassDefFoundError“, which means there are dependencies that need to be included in our custom jar file to be able to encrypt passwords. Then simply repeat this process until you successfully get an encrypted password (which you can decrypt using the default dfc.jar to verify that it is indeed a correctly encrypted password). These dependencies might not necessarily be needed for the encryption itself, but as mentioned, since we don’t want to decompile the OpenText class files, we have no other choice than to include any and all classes that are referenced directly or indirectly, starting from the central point (RegistryPasswordUtils.class).

In short, I started doing something like the following (I’m using a sub-folder for the encryption classpath to make sure it’s not using the local folder jar files):

[tomcat@d2-0 enc]$ workspace="./workspace"
[tomcat@d2-0 enc]$ classpath="${workspace}/*"
[tomcat@d2-0 enc]$ mkdir ${workspace}
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ ls -l
total 15548
-rw-r----- 1 tomcat tomcat 15913953 Jul 17 13:45 dfc.jar
drwxr-x--- 2 tomcat tomcat     4096 Jul 17 13:46 workspace
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -tvf dfc.jar | grep "com/documentum/fc/tools/RegistryPasswordUtils"
  7646 Tue Mar 08 19:53:12 UTC 2022 com/documentum/fc/tools/RegistryPasswordUtils.class
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -xf dfc.jar com/documentum/fc/tools/RegistryPasswordUtils.class
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ ls -l
total 15552
drwxr-x--- 3 tomcat tomcat     4096 Jul 17 13:47 com
-rw-r----- 1 tomcat tomcat 15913953 Jul 17 13:45 dfc.jar
drwxr-x--- 2 tomcat tomcat     4096 Jul 17 13:46 workspace
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ tree com
com
└── documentum
    └── fc
        └── tools
            └── RegistryPasswordUtils.class

3 directories, 1 file
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ echo ${classpath}
./workspace/*
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -cf ${workspace}/encrypt.jar com
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ echo ${classpath}
./workspace/encrypt.jar
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ java -classpath "${classpath}" com.documentum.fc.tools.RegistryPasswordUtils "T3stP4ssw0rd"
Error: Unable to initialize main class com.documentum.fc.tools.RegistryPasswordUtils
Caused by: java.lang.NoClassDefFoundError: com/documentum/fc/common/DfException
[tomcat@d2-0 enc]$

As you can see above, the “NoClassDefFoundError” is for another class (DfException.class), so then I simply continued from there on, finding this class, adding it into the “encrypt.jar” file and trying again:

[tomcat@d2-0 enc]$ jar -tvf dfc.jar | grep "com/documentum/fc/common/DfException"
 19265 Tue Mar 08 19:53:08 UTC 2022 com/documentum/fc/common/DfException.class
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -xf dfc.jar com/documentum/fc/common/DfException.class
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -cf ${workspace}/encrypt.jar com
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ java -classpath "${classpath}" com.documentum.fc.tools.RegistryPasswordUtils "T3stP4ssw0rd"
Error: Unable to initialize main class com.documentum.fc.tools.RegistryPasswordUtils
Caused by: java.lang.NoClassDefFoundError: com/documentum/fc/common/IDfException
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -tvf dfc.jar | grep "com/documentum/fc/common/IDfException"
  8453 Tue Mar 08 19:53:16 UTC 2022 com/documentum/fc/common/IDfException.class
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -xf dfc.jar com/documentum/fc/common/IDfException.class
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -cf ${workspace}/encrypt.jar com
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ java -classpath "${classpath}" com.documentum.fc.tools.RegistryPasswordUtils "T3stP4ssw0rd"
Error: Unable to initialize main class com.documentum.fc.tools.RegistryPasswordUtils
Caused by: java.lang.NoClassDefFoundError: org/aspectj/lang/Signature
[tomcat@d2-0 enc]$

I guess you got the gist of it. As you can see, this last error is caused by a class (Signature.class) that doesn’t belong to a Documentum package but to the “org.aspectj” one. There is actually a second jar file required to continue: aspectjrt.jar. A big part of the classes we need to include to encrypt Documentum passwords (again because of these dependencies, since we don’t want to touch the OpenText class files) comes from this second jar.
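The extract/rebuild/retry cycle is mechanical enough to script. Below is a rough sketch of how it could be automated, not production-ready code: the helper that pulls the missing class name out of the JVM error line is the interesting part, the loop itself simply repeats the manual steps shown above (paths and names are assumptions matching this walkthrough):

```shell
# missing_class: pull "com/foo/Bar" out of a JVM NoClassDefFoundError line
missing_class() {
    sed -n 's|.*NoClassDefFoundError: \([^ ]*\).*|\1|p'
}

# Hypothetical automation of the cycle above: run the encryption, extract
# the class the JVM complains about from dfc.jar (falling back to
# aspectjrt.jar), rebuild encrypt.jar and try again until it succeeds.
build_encrypt_jar() {
    local out cls
    mkdir -p ./workspace
    while true; do
        out=$(java -classpath "./workspace/*" \
                   com.documentum.fc.tools.RegistryPasswordUtils "T3stP4ssw0rd" 2>&1) && break
        cls=$(printf '%s\n' "${out}" | missing_class | head -1)
        [ -n "${cls}" ] || { printf 'unexpected error:\n%s\n' "${out}"; return 1; }
        jar -xf dfc.jar "${cls}.class"
        [ -f "${cls}.class" ] || jar -xf aspectjrt.jar "${cls}.class"
        jar -cf ./workspace/encrypt.jar com org 2>/dev/null || \
            jar -cf ./workspace/encrypt.jar com
    done
    printf 'encrypted: %s\n' "${out}"
}
```

Wildcard entries (e.g. inner classes like DfTypedObject$1.class) would still need a manual pass, which is why I ended up with the explicit lists below.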

I went through to the end of it and obtained the list of classes required from dfc.jar and aspectjrt.jar (including their inner classes, hence the wildcards “*” below). I then cleaned everything up and redid it from scratch, to make sure there was no mistake in the steps:

[tomcat@d2-0 enc]$ workspace="./workspace"
[tomcat@d2-0 enc]$ classpath="${workspace}/*"
[tomcat@d2-0 enc]$ mkdir ${workspace}
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ ls -l
total 15664
-rw-r----- 1 tomcat tomcat   118762 Jul 17 14:07 aspectjrt.jar
-rw-r----- 1 tomcat tomcat 15913953 Jul 17 13:45 dfc.jar
drwxr-x--- 2 tomcat tomcat     4096 Jul 17 14:08 workspace
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ jar -xf dfc.jar com/
[tomcat@d2-0 enc]$ jar -xf aspectjrt.jar org/
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ du -sh *
116K    aspectjrt.jar
56M     com
16M     dfc.jar
4.0K    workspace
660K    org
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ echo 'com/documentum/com/IDfClientX.class
com/documentum/fc/client/DfServiceInstantiationException.class
com/documentum/fc/client/DfServiceException.class
com/documentum/fc/client/DfTypedObject*.class
com/documentum/fc/client/IDfSession*.class
com/documentum/fc/client/IDfTypedObject.class
com/documentum/fc/client/IDfTypedObjectInternal.class
com/documentum/fc/client/IDfGlobalModuleRegistry.class
com/documentum/fc/client/IDfModuleRegistry.class
com/documentum/fc/client/impl/bof/classmgmt/IClassLoader.class
com/documentum/fc/client/impl/bof/classmgmt/IModuleManager.class
com/documentum/fc/client/impl/bof/classmgmt/ModuleManager*.class
com/documentum/fc/client/impl/bof/classmgmt/URLClassLoaderEx.class
com/documentum/fc/client/impl/bof/registry/IModuleMetadata.class
com/documentum/fc/client/impl/ITypedObject.class
com/documentum/fc/client/internal/IShutdownListener.class
com/documentum/fc/client/internal/ITypedObjectInternal.class
com/documentum/fc/common/DfException.class
com/documentum/fc/common/DfObject.class
com/documentum/fc/common/DfPreferences*.class
com/documentum/fc/common/DfRuntimeException.class
com/documentum/fc/common/IDfLoginInfo.class
com/documentum/fc/common/IDfException.class
com/documentum/fc/common/impl/preferences/IPreferencesObserver.class
com/documentum/fc/common/impl/preferences/TypedPreferences*.class
com/documentum/fc/impl/util/PBEUtils.class
com/documentum/fc/tools/RegistryPasswordUtils.class
com/documentum/fc/tracing/impl/aspects/BaseTracingAspect.class
com/documentum/fc/tracing/impl/Tracing*.class
com/documentum/fc/tracing/IUserIdentifyingObject.class
com/documentum/operations/common/DfBase64Encoder.class
com/documentum/operations/common/DfBase64FormatException.class' > list_dfc.txt
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ while read line; do
  for file in $(ls ${line}); do
    dest_folder="${workspace}/$(dirname ${file})"
    mkdir -p "${dest_folder}"
    cp "${file}" "${dest_folder}"
  done
done < <(cat list_dfc.txt)
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ echo 'org/aspectj/lang/JoinPoint*
org/aspectj/lang/reflect/AdviceSignature.class
org/aspectj/lang/reflect/CatchClauseSignature.class
org/aspectj/lang/reflect/CodeSignature.class
org/aspectj/lang/reflect/ConstructorSignature.class
org/aspectj/lang/reflect/FieldSignature.class
org/aspectj/lang/reflect/InitializerSignature.class
org/aspectj/lang/reflect/LockSignature.class
org/aspectj/lang/reflect/MemberSignature.class
org/aspectj/lang/reflect/MethodSignature.class
org/aspectj/lang/reflect/SourceLocation.class
org/aspectj/lang/reflect/UnlockSignature.class
org/aspectj/lang/Signature.class
org/aspectj/runtime/reflect/AdviceSignatureImpl.class
org/aspectj/runtime/reflect/CatchClauseSignatureImpl.class
org/aspectj/runtime/reflect/CodeSignatureImpl.class
org/aspectj/runtime/reflect/ConstructorSignatureImpl.class
org/aspectj/runtime/reflect/Factory.class
org/aspectj/runtime/reflect/FieldSignatureImpl.class
org/aspectj/runtime/reflect/InitializerSignatureImpl.class
org/aspectj/runtime/reflect/JoinPointImpl*
org/aspectj/runtime/reflect/LockSignatureImpl.class
org/aspectj/runtime/reflect/MemberSignatureImpl.class
org/aspectj/runtime/reflect/MethodSignatureImpl.class
org/aspectj/runtime/reflect/SignatureImpl*.class
org/aspectj/runtime/reflect/SourceLocationImpl.class
org/aspectj/runtime/reflect/UnlockSignatureImpl.class' > list_aspectjrt.txt
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ while read line; do
  for file in $(ls ${line}); do
    dest_folder="${workspace}/$(dirname ${file})"
    mkdir -p "${dest_folder}"
    cp "${file}" "${dest_folder}"
  done
done < <(cat list_aspectjrt.txt)
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ du -sh *
116K    aspectjrt.jar
56M     com
16M     dfc.jar
1.4M    workspace
4.0K    list_aspectjrt.txt
4.0K    list_dfc.txt
660K    org
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ cd ${workspace}/
[tomcat@d2-0 workspace]$
[tomcat@d2-0 workspace]$ du -sh *
1.2M    com
176K    org
[tomcat@d2-0 workspace]$
[tomcat@d2-0 workspace]$ jar -cf encrypt.jar com org
[tomcat@d2-0 workspace]$
[tomcat@d2-0 workspace]$ rm -rf com/ org/
[tomcat@d2-0 workspace]$
[tomcat@d2-0 workspace]$ ls -l
total 324
-rw-r----- 1 tomcat tomcat 328470 Jul 17 14:11 encrypt.jar
[tomcat@d2-0 workspace]$
[tomcat@d2-0 workspace]$ java -classpath "./encrypt.jar" com.documentum.fc.tools.RegistryPasswordUtils "T3stP4ssw0rd"
AAAAEGSo3bu4EisUf3nIQkN0geKBTCWJePigsKNqozw+XVIz
[tomcat@d2-0 workspace]$

And voila, we have an encrypted password. To validate that it’s properly done, you can try to decrypt it. I won’t put the command used to decrypt passwords in this blog because I don’t really think that’s something that should be public, but if you really want to know, all I can say is that if you look at the official resources provided by Documentum, you will probably find what you are looking for… In any case, the decryption works properly using the default dfc.jar (it doesn’t need aspectjrt.jar to decrypt), but it is of course not working with our newly created encrypt.jar:

[tomcat@d2-0 workspace]$ java -classpath "./encrypt.jar" ***something*** AAAAEGSo3bu4EisUf3nIQkN0geKBTCWJePigsKNqozw+XVIz
Error: Could not find or load main class ***something***
Caused by: java.lang.ClassNotFoundException: ***something***
[tomcat@d2-0 workspace]$
[tomcat@d2-0 workspace]$ cd ..
[tomcat@d2-0 enc]$
[tomcat@d2-0 enc]$ java -classpath "./dfc.jar" ***something*** AAAAEGSo3bu4EisUf3nIQkN0geKBTCWJePigsKNqozw+XVIz
The decrypted password is: T3stP4ssw0rd
[tomcat@d2-0 enc]$

As you can see, it’s the same password that we had initially, so the encryption works properly. Now, the main purpose of this activity was to decrease the size of the jar file needed to encrypt passwords, so what’s the status there? The default “com” (from dfc.jar) and “org” (from aspectjrt.jar) folders are roughly 56MB, while the minimal ones used to create the encrypt.jar file are only 1.2MB! To encrypt passwords, you would normally need both dfc.jar and aspectjrt.jar, for a total size of 16032715 bytes (~15.3MB). On the other hand, the encrypt.jar file we just created is only 328470 bytes (~0.31MB, i.e., 48.8 times smaller), so that’s good. It would obviously be possible to add the decrypt capability to our encrypt.jar, but the goal here was only to be able to encrypt passwords, so that you can use this jar file anywhere to quickly encrypt a password for immediate use.
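Since the minimal jar is self-contained, it can be dropped anywhere and wrapped in a tiny helper. A sketch (the jar location and the function name are assumptions):

```shell
# encrypt_bof_password: hypothetical wrapper around the minimal encrypt.jar;
# adapt the classpath to wherever the jar was copied
encrypt_bof_password() {
    java -classpath "/opt/tools/encrypt.jar" \
         com.documentum.fc.tools.RegistryPasswordUtils "$1"
}

# usage: encrypt_bof_password "T3stP4ssw0rd"
```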

L’article Documentum – Encrypt BOF passwords with a minimal jar file est apparu en premier sur dbi Blog.

Documentum – xPlore not able to deploy “dsearch.war”

Fri, 2022-07-15 14:05

Recently at a customer, I faced a new issue I hadn’t seen before on the xPlore Dsearch component. Basically, the processes appeared to be starting and the “dsearchadmin.war” was deployed properly, but the deployment of “dsearch.war” always failed upon startup of WildFly. However, it was possible to “deploy” it once WildFly was running… This issue wasn’t too complex to debug because some interesting hints were displayed in the logs, but it’s an interesting behavior because you could think that everything is running while it’s not. The initial issue was the following one:

[xplore@ds-0 ~]$ cd $JBOSS_HOME/server
[xplore@ds-0 server]$
[xplore@ds-0 server]$ $STARTSTOP start
  **
  **  The PrimaryDsearch is shutdown
  **
INFO - Starting the PrimaryDsearch...
  **
  **  The PrimaryDsearch is running with PID: 9178
  **
[xplore@ds-0 server]$
[xplore@ds-0 server]$ ps uxf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
xplore    9075  0.0  0.0  13428  1552 pts/1    S    12:06   0:00 /bin/sh ./startPrimaryDsearch.sh
xplore    9077  0.0  0.0  13448  1744 pts/1    S    12:06   0:00  \_ /bin/sh $XPLORE_HOME/wildfly17.0.1/bin/standalone.sh
xplore    9178 83.2  1.1 10193500 1580936 pts/1 Sl  12:06   1:32      \_ $XPLORE_HOME/java64/JAVA_LINK/bin/java -D[Standalone] -server -Xms8g -Xmx8g -XX:MaxMetaspaceSize=512m -XX:+UseG1GC -XX:+UseStringDeduplication -XX:G1HeapRegionSize=4m -Djbo
xplore    1401  0.0  0.0  13568  2260 pts/2    Ss   11:50   0:00 bash -l
xplore    9940  0.0  0.0  53336  1864 pts/2    R+   12:07   0:00  \_ ps uxf
[xplore@ds-0 server]$
[xplore@ds-0 server]$ cat DctmServer_PrimaryDsearch/log/server.log
...
...
2022-07-13 12:06:30,311 UTC INFO  [org.jboss.ws.cxf.deployment] (MSC service thread 1-2) JBWS024074: WSDL published to: file:$XPLORE_HOME/wildfly17.0.1/server/DctmServer_PrimaryDsearch/data/wsdl/dsearch.war/ESSAdminWebServiceService.wsdl
2022-07-13 12:06:30,356 UTC WARN  [org.jboss.as.server.deployment] (MSC service thread 1-6) WFLYSRV0274: Excluded dependency org.slf4j.impl via jboss-deployment-structure.xml does not exist.
2022-07-13 12:06:30,356 UTC INFO  [org.jboss.as.webservices] (MSC service thread 1-1) WFLYWS0003: Starting service jboss.ws.endpoint."dsearch.war"."com.emc.documentum.core.fulltext.indexserver.admin.controller.ESSAdminWebService"
2022-07-13 12:06:30,421 UTC WARN  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0101: Duplicate servlet mapping /ESSAdminWebService found
2022-07-13 12:06:34,847 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 67) WFLYUT0021: Registered web context: '/dsearchadmin' for server 'default-server'
2022-07-13 12:06:34,996 UTC INFO  [javax.enterprise.resource.webcontainer.jsf.config] (ServerService Thread Pool -- 74) Initializing Mojarra 2.3.9.SP02 for context '/dsearch'
2022-07-13 12:06:40,012 UTC ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 74) MSC000001: Failed to start service jboss.deployment.unit."dsearch.war".undertow-deployment: org.jboss.msc.service.StartException in service jboss.deployment.unit."dsearch.war".undertow-deployment: java.lang.ExceptionInInitializerError
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.deployment.UndertowDeploymentService$1.run(UndertowDeploymentService.java:81)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
        at java.base/java.lang.Thread.run(Thread.java:829)
        at org.jboss.threads@2.3.3.Final//org.jboss.threads.JBossThread.run(JBossThread.java:485)
Caused by: java.lang.ExceptionInInitializerError
        at deployment.dsearch.war//com.emc.cma.cps.management.CPSManager.shutdownManager(CPSManager.java:806)
        at deployment.dsearch.war//com.emc.cma.cps.management.CPSManager.terminate(CPSManager.java:160)
        at deployment.dsearch.war//com.emc.cma.cps.management.CPSManager.init(CPSManager.java:1019)
        at deployment.dsearch.war//com.emc.cma.cps.management.CPSManager.getInstance(CPSManager.java:114)
        at deployment.dsearch.war//com.emc.cma.cps.services.CPSServiceImpl.getVersion(CPSServiceImpl.java:46)
        at deployment.dsearch.war//com.emc.cma.cps.rt.ContentProcessingServiceLocalStub.getVersion(ContentProcessingServiceLocalStub.java:56)
        at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.cps.CPSSubmitter.connectCPS(CPSSubmitter.java:301)
        at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.cps.CPSSubmitter.<init>(CPSSubmitter.java:159)
        at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.cps.CPSRouter.addCPS(CPSRouter.java:108)
        at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.cps.CPSRouter.<init>(CPSRouter.java:85)
        at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.ESSContext.initialize(ESSContext.java:240)
        at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.ESSNode.startUp(ESSNode.java:67)
        at deployment.dsearch.war//com.emc.documentum.core.fulltext.webapp.IndexServerServlet.init(IndexServerServlet.java:48)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.LifecyleInterceptorInvocation.proceed(LifecyleInterceptorInvocation.java:117)
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.security.RunAsLifecycleInterceptor.init(RunAsLifecycleInterceptor.java:78)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.LifecyleInterceptorInvocation.proceed(LifecyleInterceptorInvocation.java:103)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.ManagedServlet$DefaultInstanceStrategy.start(ManagedServlet.java:303)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.ManagedServlet.createServlet(ManagedServlet.java:143)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.DeploymentManagerImpl$2.call(DeploymentManagerImpl.java:583)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.DeploymentManagerImpl$2.call(DeploymentManagerImpl.java:554)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:42)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
        at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.DeploymentManagerImpl.start(DeploymentManagerImpl.java:596)
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.deployment.UndertowDeploymentService.startContext(UndertowDeploymentService.java:97)
        at org.wildfly.extension.undertow@17.0.1.Final//org.wildfly.extension.undertow.deployment.UndertowDeploymentService$1.run(UndertowDeploymentService.java:78)
        ... 8 more
Caused by: java.lang.NullPointerException
        at java.base/sun.nio.fs.UnixPath.normalizeAndCheck(UnixPath.java:75)
        at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:69)
        at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279)
        at java.base/java.nio.file.Path.of(Path.java:147)
        at java.base/java.nio.file.Paths.get(Paths.java:69)
        at deployment.dsearch.war//com.emc.cma.cps.util.FileUtils.isAbsoluteDirectory(FileUtils.java:8)
        at deployment.dsearch.war//com.emc.cma.cps.management.CPSConfiguration.getEffectiveTempDirectory(CPSConfiguration.java:394)
        at deployment.dsearch.war//com.emc.cma.cps.processor.common.CPSContentBufferManager.<clinit>(CPSContentBufferManager.java:166)
        ... 38 more

2022-07-13 12:06:40,020 UTC ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("deploy") failed - address: ([("deployment" => "dsearch.war")]) - failure description: {"WFLYCTL0080: Failed services" => {"jboss.deployment.unit.\"dsearch.war\".undertow-deployment" => "java.lang.ExceptionInInitializerError
    Caused by: java.lang.ExceptionInInitializerError
    Caused by: java.lang.NullPointerException"}}
2022-07-13 12:06:41,370 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "dsearchadmin.war" (runtime-name : "dsearchadmin.war")
2022-07-13 12:06:41,370 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "dsearch.war" (runtime-name : "dsearch.war")
2022-07-13 12:06:41,373 UTC INFO  [org.jboss.as.controller] (Controller Boot Thread) WFLYCTL0183: Service status report
WFLYCTL0186:   Services which failed to start:      service jboss.deployment.unit."dsearch.war".undertow-deployment: java.lang.ExceptionInInitializerError

2022-07-13 12:06:41,462 UTC INFO  [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
2022-07-13 12:06:41,468 UTC ERROR [org.jboss.as] (Controller Boot Thread) WFLYSRV0026: WildFly Full 17.0.1.Final (WildFly Core 9.0.2.Final) started (with errors) in 36174ms - Started 1074 of 1323 services (2 services failed or missing dependencies, 400 services are lazy, passive or on-demand)
[xplore@ds-0 server]$

As you can see above, there are no cps_daemon child processes for the PrimaryDsearch, which is a good indication that even if the WildFly process is running, the xPlore PrimaryDsearch isn’t fully up & running and something went wrong. In the logs, and especially in the stack trace above, the issue appears to be in the initialization/startup of the PrimaryDsearch, and more specifically around its included CPS. Accessing the URL obviously fails since the application has been switched to the failed state, so the first thing I tried was simply a redeploy of the application. The custom $STARTSTOP script we use (c.f. above) already takes care of forcing a redeployment if one of the applications is in the “undeployed” or “failed” state before WildFly starts, but I still tried to do it manually at runtime, to see if it would work:

[xplore@ds-0 server]$ curl -kI http://`hostname -f`:9300/dsearch
HTTP/1.1 404 Not Found
Connection: keep-alive
Server: FT1
Content-Length: 74
Content-Type: text/html
Date: Wed, 13 Jul 2022 12:07:36 GMT
[xplore@ds-0 server]$
[xplore@ds-0 server]$ cd DctmServer_PrimaryDsearch/deployments/
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ ls -l
total 24
drwxrwx--- 11 xplore xplore 4096 Jul 13 07:17 dsearchadmin.war
-rw-r-----  1 xplore xplore   16 Jul 13 07:17 dsearchadmin.war.deployed
drwxrwx---  4 xplore xplore 4096 Jul 13 07:17 dsearch.war
-rw-r-----  1 xplore xplore  251 Jul 13 12:06 dsearch.war.failed
-rw-rw----  1 xplore xplore 9078 Jul 13 07:17 README.txt
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ mv dsearch.war.failed dsearch.war.dodeploy
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ ls -l
total 24
drwxrwx--- 11 xplore xplore 4096 Jul 13 07:17 dsearchadmin.war
-rw-r-----  1 xplore xplore   16 Jul 13 07:17 dsearchadmin.war.deployed
drwxrwx---  4 xplore xplore 4096 Jul 13 07:17 dsearch.war
-rw-r-----  1 xplore xplore  251 Jul 13 12:06 dsearch.war.dodeploy
-rw-r-----  1 xplore xplore   11 Jul 13 12:08 dsearch.war.isdeploying
-rw-rw----  1 xplore xplore 9078 Jul 13 07:17 README.txt
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ sleep 20
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ ls -l
total 20
drwxrwx--- 11 xplore xplore 4096 Jul 13 07:17 dsearchadmin.war
-rw-r-----  1 xplore xplore   16 Jul 13 07:17 dsearchadmin.war.deployed
drwxrwx---  4 xplore xplore 4096 Jul 13 07:17 dsearch.war
-rw-r-----  1 xplore xplore   11 Jul 13 07:18 dsearch.war.deployed
-rw-rw----  1 xplore xplore 9078 Jul 13 07:17 README.txt
[xplore@ds-0 deployments]$

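The marker-file redeployment shown above can be wrapped in a small helper. This is a minimal sketch based on the standard WildFly deployment-scanner markers (`.failed`, `.dodeploy`, `.deployed`); the function name `redeploy_war` and the timeout are my own, not part of the product:

```shell
#!/bin/bash
# Trigger a WildFly deployment-scanner redeploy of a failed application
# by renaming its ".failed" marker to ".dodeploy", then wait for the
# ".deployed" marker to appear (or give up after a timeout).
redeploy_war() {
  local deploy_dir="$1"      # e.g. .../DctmServer_PrimaryDsearch/deployments
  local war="$2"             # e.g. dsearch.war
  local timeout="${3:-60}"   # seconds to wait for the ".deployed" marker

  if [[ -f "${deploy_dir}/${war}.failed" ]]; then
    mv "${deploy_dir}/${war}.failed" "${deploy_dir}/${war}.dodeploy"
  fi

  local waited=0
  while [[ ! -f "${deploy_dir}/${war}.deployed" ]]; do
    if (( waited >= timeout )); then
      echo "Timeout waiting for ${war} to deploy" >&2
      return 1
    fi
    sleep 1
    waited=$(( waited + 1 ))
  done
  echo "${war} deployed"
}
```

For example: `redeploy_war "$XPLORE_HOME/wildfly17.0.1/server/DctmServer_PrimaryDsearch/deployments" dsearch.war`. Note that the scanner replacing `.failed` with `.deployed` only tells you WildFly's view of the deployment, which, as shown below, is not the whole story.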
Somehow, it looked like WildFly was able to deploy the dsearch.war application this time. So, all happy, I looked at the server.log again to confirm:

[xplore@ds-0 deployments]$ cat ../log/server.log
...
2022-07-13 12:06:40,020 UTC ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("deploy") failed - address: ([("deployment" => "dsearch.war")]) - failure description: {"WFLYCTL0080: Failed services" => {"jboss.deployment.unit.\"dsearch.war\".undertow-deployment" => "java.lang.ExceptionInInitializerError
    Caused by: java.lang.ExceptionInInitializerError
    Caused by: java.lang.NullPointerException"}}
2022-07-13 12:06:41,370 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "dsearchadmin.war" (runtime-name : "dsearchadmin.war")
2022-07-13 12:06:41,370 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "dsearch.war" (runtime-name : "dsearch.war")
2022-07-13 12:06:41,373 UTC INFO  [org.jboss.as.controller] (Controller Boot Thread) WFLYCTL0183: Service status report
WFLYCTL0186:   Services which failed to start:      service jboss.deployment.unit."dsearch.war".undertow-deployment: java.lang.ExceptionInInitializerError

2022-07-13 12:06:41,462 UTC INFO  [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
2022-07-13 12:06:41,468 UTC ERROR [org.jboss.as] (Controller Boot Thread) WFLYSRV0026: WildFly Full 17.0.1.Final (WildFly Core 9.0.2.Final) started (with errors) in 36174ms - Started 1074 of 1323 services (2 services failed or missing dependencies, 400 services are lazy, passive or on-demand)
2022-07-13 12:08:51,775 UTC INFO  [org.jboss.as.webservices] (MSC service thread 1-5) WFLYWS0004: Stopping service jboss.ws.endpoint."dsearch.war"."com.emc.documentum.core.fulltext.indexserver.admin.controller.ESSAdminWebService"
2022-07-13 12:08:52,508 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-3) WFLYSRV0028: Stopped deployment dsearch.war (runtime-name: dsearch.war) in 739ms
2022-07-13 12:08:52,517 UTC INFO  [org.jboss.as.server.deployment] (MSC service thread 1-3) WFLYSRV0027: Starting deployment of "dsearch.war" (runtime-name: "dsearch.war")
...
...
2022-07-13 12:09:03,839 UTC INFO  [org.jboss.ws.cxf.deployment] (MSC service thread 1-1) JBWS024074: WSDL published to: file:$XPLORE_HOME/wildfly17.0.1/server/DctmServer_PrimaryDsearch/data/wsdl/dsearch.war/ESSAdminWebServiceService.wsdl
2022-07-13 12:09:03,884 UTC INFO  [org.jboss.as.webservices] (MSC service thread 1-1) WFLYWS0003: Starting service jboss.ws.endpoint."dsearch.war"."com.emc.documentum.core.fulltext.indexserver.admin.controller.ESSAdminWebService"
2022-07-13 12:09:03,884 UTC WARN  [org.jboss.as.server.deployment] (MSC service thread 1-2) WFLYSRV0274: Excluded dependency org.slf4j.impl via jboss-deployment-structure.xml does not exist.
2022-07-13 12:09:03,908 UTC WARN  [org.wildfly.extension.undertow] (MSC service thread 1-6) WFLYUT0101: Duplicate servlet mapping /ESSAdminWebService found
2022-07-13 12:09:07,181 UTC INFO  [javax.enterprise.resource.webcontainer.jsf.config] (ServerService Thread Pool -- 80) Initializing Mojarra 2.3.9.SP02 for context '/dsearch'
2022-07-13 12:09:10,106 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 80) WFLYUT0021: Registered web context: '/dsearch' for server 'default-server'
2022-07-13 12:09:10,168 UTC INFO  [org.jboss.as.server] (DeploymentScanner-threads - 2) WFLYSRV0016: Replaced deployment "dsearch.war" with deployment "dsearch.war"
[xplore@ds-0 deployments]$

As you can see above, the initial deployment failed (12:06:40) but then, after manually requesting a redeployment (12:08:51), it looks like WildFly was able to deploy it successfully, no? Let's look at the URL:

[xplore@ds-0 deployments]$ curl -kI http://`hostname -f`:9300/dsearch
HTTP/1.1 302 Found
Connection: keep-alive
Server: FT1
Location: http://ds-0.domain.com:9300/dsearch/
Content-Length: 0
Date: Wed, 13 Jul 2022 12:10:01 GMT

[xplore@ds-0 deployments]$

The URL now responds with a "302 Found", which looks good, right? To cross-check, I added a trailing "/" to query the real redirected location:

[xplore@ds-0 deployments]$ curl -kI http://`hostname -f`:9300/dsearch/
HTTP/1.1 404 Not Found
Connection: keep-alive
Server: FT1
Content-Length: 0
Date: Wed, 13 Jul 2022 12:10:05 GMT

[xplore@ds-0 deployments]$
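This 302-vs-404 discrepancy shows why a HEAD request on the base URL alone is misleading: curl reports the redirect itself, not the health of the page behind it. A more honest check follows the redirect with `-L` and looks at the final status code. This is a sketch under my own conventions (the `check_url` name and the accepted status range are assumptions, not an xPlore API):

```shell
#!/bin/bash
# Return 0 only if the final URL (after following any redirect with -L)
# answers with a 2xx status. xPlore's healthy "/dsearch/" answers with
# the unusual code 259, which still matches the 2* pattern below.
check_url() {
  local url="$1"
  local status
  # -o /dev/null discards the body; -w '%{http_code}' prints the final code
  status=$(curl -ksL -o /dev/null -w '%{http_code}' "$url")
  case "$status" in
    2*) echo "OK (${status}): ${url}"; return 0 ;;
    *)  echo "KO (${status}): ${url}" >&2; return 1 ;;
  esac
}
```

With this helper, the situation above would have been flagged immediately: following the 302 from "/dsearch" lands on the 404 and the check fails.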

That’s very peculiar: the deployment appears to be successful, and the first URL appears to work as well, but the location it redirects to is not working. Of course, the cps_daemon child processes of the PrimaryDsearch are also still missing. You might therefore think that xPlore started, but it didn’t. As mentioned previously, the stack trace appeared to be linked to the CPS initialization, so I looked at the CPS configuration file, which is, by default, $XPLORE_HOME/dsearch/cps/cps_daemon/###DS_NAME###_local_configuration.xml for a Dsearch or $XPLORE_HOME/dsearch/cps/cps_daemon/###CPS_NAME###_configuration.xml for a CPS Only. In any case, you can find the correct name/path of this file in the “cps.properties” file (for a CPS Only, it’s inside cps.war, not inside dsearch.war):

[xplore@ds-0 deployments]$ cps_conf_file=$(grep cps.configuration.file dsearch.war/WEB-INF/classes/cps.properties | awk -F= '{print $2}')
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ echo ${cps_conf_file}
$XPLORE_HOME/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ ls -l ${cps_conf_file}
ls: cannot access $XPLORE_HOME/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml: No such file or directory
[xplore@ds-0 deployments]$

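The lookup shown above can be turned into a quick sanity check: read the configured path from "cps.properties" and verify the file is actually on disk. A minimal sketch (the `check_cps_config` name and return codes are my own):

```shell
#!/bin/bash
# Sanity check: extract the CPS configuration file path from a
# cps.properties file and verify that the file exists on disk.
check_cps_config() {
  local props="$1"   # e.g. dsearch.war/WEB-INF/classes/cps.properties
  local conf
  conf=$(grep '^cps.configuration.file' "$props" | awk -F= '{print $2}')
  if [[ -z "$conf" ]]; then
    echo "cps.configuration.file not set in $props" >&2
    return 2
  fi
  if [[ ! -f "$conf" ]]; then
    echo "Missing CPS configuration file: $conf" >&2
    return 1
  fi
  echo "CPS configuration file present: $conf"
}
```

Running something like this as part of a pre-start or monitoring script would have caught the missing file hours before anyone looked at the failed deployment.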
It turns out the CPS configuration file is gone, and that is what is causing this issue, since the CPS processes cannot be initialized without it. This file had unfortunately been removed by mistake a few hours earlier, and because of that the xPlore processes couldn’t start anymore. Restoring the file (from the closest backup) and then restarting the xPlore processes was sufficient to bring the services back online (since “/dsearch” isn’t really reachable, there are obviously errors during the shutdown):

[xplore@ds-0 deployments]$ $STARTSTOP stop
  **
  **  The PrimaryDsearch is running with PID: 9178
  **
INFO - Stopping the PrimaryDsearch...
Instance {PrimaryDsearch} is about to shut down, wait for shutdown complete message.
Exception in thread "main" java.lang.IllegalArgumentException: Fail to connect remote server ...
{
    "outcome" => "success",
    "result" => undefined
}
  **
  **  The PrimaryDsearch is shutdown
  **
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ # Restoring the file from the closest backup. Post restore:
[xplore@ds-0 deployments]$ ls -l ${cps_conf_file}
-rw-rw---- 1 xplore xplore 9269 Jul 13 12:22 $XPLORE_HOME/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ $STARTSTOP start
  **
  **  The PrimaryDsearch is shutdown
  **
INFO - Starting the PrimaryDsearch...
  **
  **  The PrimaryDsearch is running with PID: 14826
  **
[xplore@ds-0 deployments]$

And cross-checking with the logs and URLs, it started properly:

[xplore@ds-0 deployments]$ cat ../log/server.log
...
2022-07-13 12:23:08,670 UTC INFO  [org.jboss.ws.cxf.deployment] (MSC service thread 1-2) JBWS024074: WSDL published to: file:$XPLORE_HOME/wildfly17.0.1/server/DctmServer_PrimaryDsearch/data/wsdl/dsearch.war/ESSAdminWebServiceService.wsdl
2022-07-13 12:23:08,714 UTC INFO  [org.jboss.as.webservices] (MSC service thread 1-6) WFLYWS0003: Starting service jboss.ws.endpoint."dsearch.war"."com.emc.documentum.core.fulltext.indexserver.admin.controller.ESSAdminWebService"
2022-07-13 12:23:08,715 UTC WARN  [org.jboss.as.server.deployment] (MSC service thread 1-4) WFLYSRV0274: Excluded dependency org.slf4j.impl via jboss-deployment-structure.xml does not exist.
2022-07-13 12:23:08,755 UTC WARN  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0101: Duplicate servlet mapping /ESSAdminWebService found
2022-07-13 12:23:11,846 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 76) WFLYUT0021: Registered web context: '/dsearchadmin' for server 'default-server'
2022-07-13 12:23:13,634 UTC INFO  [javax.enterprise.resource.webcontainer.jsf.config] (ServerService Thread Pool -- 93) Initializing Mojarra 2.3.9.SP02 for context '/dsearch'
2022-07-13 12:23:27,323 UTC INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 93) WFLYUT0021: Registered web context: '/dsearch' for server 'default-server'
2022-07-13 12:23:27,356 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "dsearch.war" (runtime-name : "dsearch.war")
2022-07-13 12:23:27,357 UTC INFO  [org.jboss.as.server] (ServerService Thread Pool -- 37) WFLYSRV0010: Deployed "dsearchadmin.war" (runtime-name : "dsearchadmin.war")
2022-07-13 12:23:27,607 UTC INFO  [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
2022-07-13 12:23:27,614 UTC INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 17.0.1.Final (WildFly Core 9.0.2.Final) started in 44869ms - Started 1077 of 1323 services (400 services are lazy, passive or on-demand)
[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ curl -kI http://`hostname -f`:9300/dsearch
HTTP/1.1 302 Found
Connection: keep-alive
Server: FT1
Location: http://ds-0.domain.com:9300/dsearch/
Content-Length: 0
Date: Wed, 13 Jul 2022 12:26:05 GMT

[xplore@ds-0 deployments]$
[xplore@ds-0 deployments]$ curl -kI http://`hostname -f`:9300/dsearch/
HTTP/1.1 259 Unknown
Connection: keep-alive
Server: FT1
Content-Type: text/html;charset=UTF-8
Content-Length: 66
Date: Wed, 13 Jul 2022 12:26:07 GMT

[xplore@ds-0 deployments]$

The HTTP response code 259 is normal for the dsearch URL, so it shows that everything works properly now. All cps_daemon child processes are also present and there are no errors in any logs. To conclude this blog: just because WildFly appears to have deployed the xPlore applications properly doesn’t mean they are really working! Make sure to always check the URLs and processes, as they are a better indication. The best is of course to test xPlore end-to-end with Repository searches & indexing, but that can take some time if it’s not automated/scripted.
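The "check URLs and processes" advice can be condensed into a single health-check function. This is a sketch under stated assumptions: the `xplore_health` name is mine, and the process pattern passed in (e.g. "cps_daemon") must be adjusted to whatever your installation's CPS child processes actually match:

```shell
#!/bin/bash
# Minimal post-startup health check: the application URL must answer
# with a 2xx status after following redirects, and at least min_procs
# processes matching the given pattern must be running.
xplore_health() {
  local url="$1"            # e.g. http://$(hostname -f):9300/dsearch/
  local proc_pattern="$2"   # e.g. "cps_daemon" -- adjust to your install
  local min_procs="${3:-1}"

  local status procs
  status=$(curl -ksL -o /dev/null -w '%{http_code}' "$url")
  procs=$(pgrep -fc "$proc_pattern" || true)   # pgrep -c prints 0 on no match

  if [[ "$status" != 2* ]]; then
    echo "KO: ${url} answered HTTP ${status}" >&2
    return 1
  fi
  if (( procs < min_procs )); then
    echo "KO: only ${procs} process(es) matching '${proc_pattern}'" >&2
    return 1
  fi
  echo "OK: HTTP ${status}, ${procs} matching process(es)"
}
```

It is not a substitute for an end-to-end search/indexing test, but as a cheap check after every start it would have distinguished "WildFly says deployed" from "xPlore actually works" in this case.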

The article Documentum – xPlore not able to deploy “dsearch.war” appeared first on the dbi Blog.
