Yann Neuhaus

Subscribe to Yann Neuhaus feed
dbi services technical blog
Updated: 9 hours 31 min ago

SQL Developer Web on the Oracle Cloud

Thu, 2018-05-10 12:21

You like SQL Developer because it is easy to install (just unzip a jar) and has a lot of features? Me too. It can be even easier if it is provided as a web application: no installation, and no java to take all my laptop RAM…
When I say no installation, you will see that you have some little things to setup here in DBaaS. That will probably be done for you in the managed services (PDBaaS) such as ‘Express’ and ‘Autonomous’ ones.

Be careful, Oracle is a Top-Down deployment company. It seems that new products are announced first and then people have to work hard to make them available. Which means that if, like me, you want to test them immediately you may encounter some disappointment.
The announce was there. The documentation was there, mentioning that the Cloud Tooling must be upgraded to 18.2.3. But 18.2.3 was there only a few days later. You can check it from the place where the DBaaS looks for its software. Check from https://storage.us2.oraclecloud.com/v1/dbcsswlibp-usoracle29538/dbaas_patch if you a are not sure.

So, before being able to see SQL Developer in the colorful DBaaS landing page (where you can also access APEX for example) there’s a bit of command line stuff to do as root.

Install the latest Cloud Tooling

SQL Developer Web needs to be installed with the latest version of ORDS, which is installed with the latest version of Cloud Tooling aka dbaastools.rpm

You need to connect as root, so opc and then sudo

ssh opc@
sudo su

Check if there is a new version to install:

dbaascli dbpatchm --run -list_tools | awk '/Patchid/{id=$3}END{print id}'

If something is returned (such as you install it:

dbaascli dbpatchm --run -toolsinst -rpmversion=$(dbaascli dbpatchm --run -list_tools | awk '/Patchid/{id=$3}END{print id}')

Actually I got an error, and I had to ^C:

[root@DB18c opc]# dbaascli dbpatchm --run -toolsinst -rpmversion=$(dbaascli dbpatchm --run -list_tools | awk '/Patchid/{id=$3}END{print id}')
DBAAS CLI version 1.0.0
Executing command dbpatchm --run -toolsinst -rpmversion= -cli
/var/opt/oracle/patch/dbpatchm -toolsinst -rpmversion= -cli
Use of uninitialized value in concatenation (.) or string at /var/opt/oracle/patch/dbpatchm line 4773.

But finally, it was installed because the ‘list_tools’ above returns nothing.

Enable SQL Developer Web

SQL Developer Web (SDW) is running in ORDS (Oracle REST Data Services) and must be enabled with the ORDS Assistant with the enable_schema_for_sdw action.
Here I’ll enable it at CDB level. I provide a password for the SDW schema. I create it in a file:

cat > password.txt <<<'Ach1z0#d'

You may secure that better than I do, as I’m putting the password on command line here. But this is only a test.

Then, still as root, I call the ORDS assistant to install SDW in C##SQLDEVWEB (as I’m installing it in CDB$ROOT I need a common user name).

/var/opt/oracle/ocde/assistants/ords/ords -ords_action=enable_schema_for_sdw -ords_sdw_schema="C##SQLDEVWEB" -ords_sdw_schema_password=$PWD/password.txt -ords_sdw_schema_enable_dba=true

Here is the output. The last lines are important:

WARNING: Couldn't obtain the "dbname" value from the assistant parameters nor the "$OCDE_DBNAME" environment variable
Starting ORDS
Logfile is /var/opt/oracle/log/ords/ords_2018-05-10_10:44:12.log
Config file is /var/opt/oracle/ocde/assistants/ords/ords.cfg
INFO: Starting environment summary checks...
INFO: Database version : 18000
INFO: Database CDB : yes
INFO: Original DBaaS Tools RPM installed : dbaastools-1.0-1+
INFO: Actual DBaaS Tools RPM installed : dbaastools-1.0-1+
INFO: DBTools JDK RPM installed : dbtools_jdk-1.8.0-2.74.el6.x86_64
INFO: DBTools JDK RPM "/var/opt/oracle/rpms/dbtools/dbtools_jdk-1.8.0-2.74.el6.x86_64.rpm" MD5 : 48f13bb401677bfc7cf0748eb1a6990d
INFO: DBTools ORDS Standalone RPM installed : dbtools_ords_standalone-
INFO: DBTools ORDS Standalone RPM "/var/opt/oracle/rpms/dbtools/dbtools_ords_standalone-" MD5 : 480355ac3ce0f357d5741c2c2f688901
INFO: DBTools DBaaS Landing Page RPM installed : dbtools_dbaas_landing_page-2.0.0-1.el6.x86_64
INFO: DBTools DBaaS Landing Page RPM "/var/opt/oracle/rpms/dbtools/dbtools_dbaas_landing_page-2.0.0-1.el6.x86_64.rpm" MD5 : af79e128a56b38de1c3406cfcec966db
INFO: Environment summary completed...
INFO: Action mode is "full"
INFO: Database Role is "PRIMARY"
INFO: Enabling "C##SQLDEVWEB" schema in "CDB$ROOT" container for SQL Developer Web...
SQL*Plus: Release Production on Thu May 10 10:44:27 2018
Copyright (c) 1982, 2017, Oracle. All rights reserved.
Connected to:
Oracle Database 18c EE Extreme Perf Release - Production
SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL Developer Web user enable starting...
Enabling "C##SQLDEVWEB" user for SQL Developer Web...
PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.
Creating "C##SQLDEVWEB" user
PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.
Call completed.
Commit complete.
PL/SQL procedure successfully completed.
Session altered.
PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.
"C##SQLDEVWEB" user enabled successfully. The schema to access SQL Developer Web
is "c_sqldevweb"...
PL/SQL procedure successfully completed.
SQL Developer Web user enable finished...
Disconnected from Oracle Database 18c EE Extreme Perf Release - Production
INFO: To access SQL Developer Web through DBaaS Landing Page, the schema "c_sqldevweb" needs to be provided...
INFO: "C##SQLDEVWEB" schema in the "CDB$ROOT" container for SQL Developer Web was enabled successfully...

The information to remember here is that I will have to provide the c_sqldevweb schema name (which is the schema name I’ve provided but lowercased and with sequences of ‘special’ characters replaced by an underscore). It is lowercased, but it seems that the schemaname has to be provided in uppercase.

Basically what has been done is quite simple: create the C##SQLDEVWEB user and call ORDS.ENABLE_SCHEMA to enable it and map it to the url.

DBCS Landing Page 2.0.0

Now I’m ready to see SQL Developer on the DBCS Landing Page. You access this page by:

  1. Enabling https access from internet (in Access Rules, enable ora_p2_httpssl)
  2. going to default web page for your service, in my case

You may have to accept some self-signed certificates

And here it is with SQL Developer Web in the middle:

The above shows PDB1/pdbadmin for the schema but I installed it at CDB level and the log above tells me that the schema is c_sqldevweb, so given the input, I change the schema to c_sqldevweb then on the login page. Finally, the direct url in my example is

I enter C##SQLDEVWEB (uppercase here) as the user and Ach1z0#d as the password.

And here is the Dashboard:

Do not worry about the 97% storage used which tells me that SYSTEM is full. My datafiles are autoextensible.

Just go to the SQL Worksheet and check your files:

select tablespace_name,bytes/1024/1024 "MBytes", maxbytes/1024/1024/1024 "MaxGB", autoextensible from dba_data_files

Enable SDW for local PDB user

To enable a PDB local user, I run ORDS assistant with a local user name (PDBADMIN here) and an additional parameter with the PDB name (PDB1 here).

cat > password.txt <<<'Ach1z0#d'
/var/opt/oracle/ocde/assistants/ords/ords -ords_action=enable_schema_for_sdw -ords_sdw_schema=PDBADMIN -ords_sdw_schema_password=$PWD/password.txt -ords_sdw_schema_enable_dba=true -ords_sdw_schema_container=PDB1

Now, I can connect to it with PDB1/pdbadmin as schema name.

Error handling

If, like me, you are not used to ORDS applications, you may waste some minutes looking at a splash screen waiting for the result. Always look at the message bar. All actions are REST calls and the message bar will show if a call is running or completed successfully or not. The example on the right shows ‘call failed’. You can click on it to see the REST call, and the error.


Cet article SQL Developer Web on the Oracle Cloud est apparu en premier sur Blog dbi services.

Some ideas about Oracle Database on Docker

Tue, 2018-05-08 15:55

This is not a ‘best practice’ but just some ideas about building Docker images to provide an Oracle Database. I started with the images provided by Oracle: https://github.com/oracle/docker-images/tree/master/OracleDatabase/SingleInstance and this is great to validate the docker environment. Then I customized for my needs and here are the different points about this customization.

Do not send a huge context at each build attempt

I work by iteration. Sending a 3GB context each time I try a build is a no-go for me. Then I quickly stopped to put the Oracle installation .zip in the context of my build. I already blogged about this.

There are several ways to avoid to send a big context, such as having the .zip in an NFS or HTTP server and ADD it or RUN wget from there. I prefer to build one small container with this .zip that I’ll use later

In my current directory I have the linuxx64_12201_database.zip which I explicitly send to the context with this .dockerignore:


And I build a franck/oracle122/zip image with it:

FROM oraclelinux:7-slim
ADD linuxx64_12201_database.zip /var/tmp

When done, I’ll not have to send the context again and I will build my container from this one with another Dockerfile:

FROM franck/oracle122/zip
RUN yum -y install oracle-database-server-12cR2-preinstall unzip tar wget openssl vi && rm -rf /var/cache/yum

Do not yum install at each build attempt

In the same idea, I build another intermediate image with the yum install above. The reason is that once I have it, I don’t need internet access anymore. I did that before boarding for an 8 hours flight. I build the above Dockerfile as franck/oracle122/prereq while on airport wifi and will use it later as the base for the final Dockerfile:




FROM franck/oracle122/prereq
# then follow all the work which do not need large context or internet connection

Even if you are not on a plane, it is always good to avoid internet access. You probably had to get some doors opened in the firewall in order to pull the base image. Now that you have it, you should keep it. Or one day, the security team will close the door again and you will waste a few hours. That also means that you do not start with a :latest image but with a specific version.

Do the long steps first

The Dockerfile provided by Oracle starts with all ENV and a COPY to add all scripts into the container. The problem is that each time you want to change a script, the build has to start from this step. And then the long operations have to be done again: unzip, install,…

I have a small context here (only the scripts and configuration files) but I ADD or COPY them only when needed. For example, here, a modification in install.rsp will re-do the runInstaller step, but the unzip one will not have to be done again because the cache is re-used:

WORKDIR /var/tmp
RUN unzip linuxx64_12201_database.zip
COPY install.rsp /var/tmp
RUN ./database/runInstaller -silent -force -waitforcompletion -responsefile /var/tmp/install.rsp -ignoresysprereqs -ignoreprereq ; true

The script that will run the container is added only at the end so that I can modify and re-build quickly without re-doing the previous steps.

VOLUME ["/opt/oracle/pdbs"] EXPOSE 1521 5500
COPY docker_cmd.sh /opt/oracle
CMD exec /opt/oracle/docker_cmd.sh ${CDB} ${PDB

Another step that I do at the end is removing the files I do not need in the container. Because that’s a guess and try approach and I want to build quickly. Of course, this may not be optimized for the size of all those layers, but I can reduce the final image later. The main feature of Docker build are the layers and I use them to develop the Dockerfile without wasting my time. For the waste of storage, I use ZFS with block level Cow, dedup and compression. For the final image, I’ll –squash it.

Remove all unnecessary files

The detail will probably go into a future blog post. But, as soon as runInstaller is done, and latest bundle patch applied, you can remove a lot of directories that I do not need anymore:

rm -rf $ORACLE_HOME/inventory $ORACLE_HOME/.patch_storage

As soon as the database has been created with DBCA, I do not need the DBCA templates anymore:

rm -rf $ORACLE_HOME/assistants

As this container will run only the instance, I can remove:

rm -rf $ORACLE_HOME/sqldeveloper $ORACLE_HOME/suptools $ORACLE_HOME/jdk

And depending on the options I will provide in the database, I remove the big ones:

rm -rf $ORACLE_HOME/apex $ORACLE_HOME/javavm $ORACLE_HOME/md

There is also a lot to remove from $ORACLE_HOME/lib (I need only a few *.so* that I can determine with strace, perf, lsof, ldd) and from $ORACLE_HOME/bin (basically, I need oracle, tnslsnr, lsnrctl, and sqlplus). Those are executables and you can strip them to reduce the size further. Definitely remove the last relink ones renamed as oracleO, …

Those are just examples, your list will depend on your version and usage, but this may reduce the image to 1GB or less. Of course, this is not supported. But the goal is to provide a small development database. Not an reliable and efficient one for production.

Use ZFS for the storage driver

An Oracle Database is full of large files that are updated sparsely. Just forget about OVERLAY and OVERLAY2 which copies the whole file to the new layer when you update a single byte of a file. I do not consider BTRFS seriously. In my opinion, ZFS is the only filesystem to consider for storing Docker images with large files. Enforce deduplication and compression to overcome the inflation of layering and the ignorance of sparse files. I think that recordsize=32k is a good idea from what I’ve seen about how docker applies writes to layers. More detail in a future blog post.

Note that layering issues are not only for build efficiency but also for container run. You will see that I put some datafiles in the image. Then, at database open, some blocks are changed (at least the headers) and I do not want a full file copy to the runnable layer. Block level CoW is required for that.

Create the CDB in the container

The container is the place to store all the software, and most of CDB$ROOT and PDB$SEED is part of the software distribution. This is what takes time when creating a database (catalog, catproc,…) and I definitely refuse to give a container to a developer where he will have to wait 10 minutes at run because the database has to be created on the external volume. A ‘docker run’ must be fast. And the external volume must contain only the data that has to be persisted, not 500MB of dbms_% package code, which will be all the same for all containers from the same image.

This means that I create the CDB during the build:

RUN /opt/oracle/product/ -silent -createDatabase -templateName General_Purpose.dbc -gdbName ${CDB} -sid ${CDB} -initParams db_unique_name=${CDB},service_names=${CDB},shared_pool_size=600M,local_listener='(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=' -createAsContainerDatabase true -numberOfPdbs 0 -sysPassword oracle -systemPassword oracle -emConfiguration NONE -datafileDestination /opt/oracle/oradata -recoveryAreaDestination /opt/oracle/fast_recovery_area -recoveryAreaSize 1024 -storageType FS -sampleSchema false -automaticMemoryManagement false -totalMemory 1024 -databaseType OLTP -enableArchive false -redoLogFileSize 10 -useLocalUndoForPDBs true -createListener LISTENER:1521 -useOMF true -dbOptions OMS:false,JSERVER:false,SPATIAL:false,IMEDIA:false,ORACLE_TEXT:false,SAMPLE_SCHEMA:false,CWMLITE:false,DV:false,APEX:false
RUN rm -rf /opt/oracle/product/*

No PDB here, as the PDB will be created at ‘docker run’ time into the external volume. I use a template with datafiles here, but I may prefer to run the whole creation to control the creation. I may even hack some .bsq and .sql files in RDBMS admin to reduce the size. I’m in archivelog mode here because I want to allow to flashback the PDB. The container is ephemeral. If it becomes too large (archive logs, audit, …) just remove it and start another one. Or add a script to remove the old ones (those not required by guarantee restore points).

Create the PDB in the external volume

The PDB is the only thing that must be persistent, and controlled by the developer. I create it with the following in my docker_cmd.sh (which is called from the Dockerfile CMD line providing CDB name and PDB name as arguments) script:

create pluggable database $2 admin user admin identified by oracle create_file_dest='/opt/oracle/pdbs';
alter pluggable database $2 open;
show pdbs

The PDB is bound to the external volume ( VOLUME [“/opt/oracle/pdbs”] ) thanks to 12.2 CREATE_FILE_DEST clause so that the developer can create datafiles only there. Then the ‘docker run’ is fast as a clone of PDB$SEED.

The developer will connect only to the PDB. He has nothing to do in CDB$ROOT. If there is a need to change something in CDB$ROOT, I’ll provide a new image. I may even define lockdown profiles rules to limit the PDB and define a listener where only the PDB registers.

Unplug the PDB at container stop

When the developer stops the container, I want to leave something consistent in the external volume. The way to do that quickly is a PDB unplug. An unplug to a PDB archive (a .pdb zip with all datafiles) would be nicer, but that takes too much time to create. I unplug to a .xml file. This is what I do on stop (SIGTERM and SIGSTOP):

alter pluggable database all close;
column pdb_name format a30
select pdb_id,pdb_name,status from dba_pdbs;
for c in (select pdb_name from dba_pdbs where pdb_id>2) loop
dbms_output.put_line('-- Unpluging '||c.pdb_name);
execute immediate 'alter pluggable database "'||c.pdb_name||'" unplug into ''/opt/oracle/pdbs/'||c.pdb_name||'.xml''';
end loop;
for c in (select pdb_name from dba_pdbs where pdb_id>2 and status='UNPLUGGED') loop
dbms_output.put_line('-- Dropping '||c.pdb_name);
execute immediate 'drop pluggable database "'||c.pdb_name||'" keep datafiles';
end loop;
-- All pluggable databases have been unplugged:
host ls /opt/oracle/pdbs/*.xml
-- Shutdown the CDB:
shutdown immediate;
-- You can plug those PDBs by passing the volume to a new container

The script iterates on all PDBs but I see no reason to create more than one. I unplug the PDB and drop it, and then shutdown the instance. We need the unplug to be completed before the stop timeout. The container may be killed before the drop or shutdown, but as long as we have the .xml we can plug the PDB into a new container.

Plug the PDB at container re-start

I mentioned earlier that at start I create the pluggable database mentioned by ${PDB}. But this is only when there is no /opt/oracle/pdbs/${PDB}.xml
If this file is found, this means that we provide a PDB that was unplugged by a previous container stop.
Actually, when the start detects this file, the following will be run:

whenever sqlerror exit failure;
create pluggable database "${PDB}" using '/opt/oracle/pdbs/${PDB}.xml';
host rm /opt/oracle/pdbs/${PDB}.xml
alter pluggable database "${PDB}" open;
select message from pdb_plug_in_violations;

Finally, because I may start a container which has a newer Release Update than the one which unplugged my PDB, I run:


One PDB per container

My scripts process all PDBs but I think that in most cases we need to have a one-to-one relationship between the container and the PDB. The idea is to provide a database that is ready to use and where no administration/troubleshooting is required. The key here is to keep it simple. If you need to provide a large CDB with several PDBs, then Docker is not the solution to your problem. A virtual machine is a better answer for that.

SPfile? Password file?

The image build provided by Oracle stores the persistent configuration files with the database, in the external volume, through symbolic links from ?/dbs. But with my design, I don’t need to. The configuration of the instance, running in the container, is within the container. The passwords for SYS is in the container. Then SPfile and password files stay in the container. The runnable image is not read-only. It is writeable. We can write here as long as the changes do not have to persist beyond the container end of life.

The ‘scope=spfile’ parameters that can be modified by the developer will be PDB parameters. They are persisted because they go to the .xml file at unplug. Only in case of crash, without a clean unplug, those parameters may be stored only in the container. That’s a special case. A crashed container is not dead and jsut waits to be re-started.

Crash recovery needs the same container

There’s one big flaw with my CDB-in-container/PDB-in-volume design. The whole database datafiles must be consistent, are checkpointed together, and are all protected by the same redo stream, which is located in the container. But what’s the real problem about that? If the container is cleanly stopped, the PDB is unplugged and there is a clear separation between my external volume and the container. And both are consistent.

However, if the container crashes, the datafiles in my external volume are fuzzy and need recovery. This cannot be done without the CDB files which are on the container. This has only one consequence: the user must know that if the container was not cleanly stopped, she will need to start the PDB with the same container. I don’t think this is a real problem. I just ensure that the user gets the warning (a big README file in the external volume for example, created at start and removed at clean stop) and that the container will always be able to recover (no 100% full filesystem at start – anyway I have some log cleanups at start).

Handle all errors and signals

My startup script handle 3 situations.
The first one is the first start after creation of the container. This creates the pluggable database.
The second one is the re-start after a clean stop. This plugs the existing pluggable database.
The third one is crash-recovery after a kill. This just runs the automatic instance recovery.

Then the startup script will run in a loop, either tailing the alert.log or displaying some status info every minutes.

But before all of that, the startup script must handle the termination signals.

The clean stop is handled by the following signals:

trap 'shutdown SIGINT' SIGINT
trap 'shutdown SIGTERM' SIGTERM
trap 'shutdown EXIT' 0

SIGINT is for ^C when running the container, SIGTERM is when ‘docker stop’, and the signal 0 is when the container exits by itself. This can happen when my ‘tail -f’ on alert log is killed for example. All of them call my shutdown() procedure which is trying a clean stop (unplug the PDBs).

When the stop timout is expired or when we do a ‘docker kill’, there’s no time for that. The only thing I do here before a shutdown abort is an ‘alter system checkpoint’ to try to reduce the recovery needed. And display a WARNING message saying that the container that was killed must not be removed but be re-started asap to recover the PDB in the external volume. Maybe explicitly name the container and the command to re-start.

I do that with an abort() function called by the following:

trap 'abort SIGKILL' SIGKILL

The kill -9 of the instance, or container crash, cannot be handled. Recovery is needed as for the SIGKILL one. Here is the reason for keeping a permanent README file near the PDB to explain that the container which crashed should be restarted as soon as possible to recover this.


This is not a recipe of how to build an Oracle Database Docker image, but just some ideas. The most important is to know the requirement. If you provide Oracle on Docker just because the developers want that, the solution will probably be wrong: too large, too long, inefficient, and too complex,… They will not use it and they will tell everybody that Oracle is not cool because it cannot be dockerized.
With my PDB-in-volume / CDB-in-container design, I have :

  • Docker Images with the ephemeral software, one per version (down to patches), and with different set of component installed
  • External volume (personal directory in a NFS share, or a local one) with the persistent data and settings
  • Containers running the software on the data, linking them together for the whole lifecycle of the container

Think of them as 2 USB sticks, one with the software (binaries and packages), and one with the data (user metadata and data). When plugged together on the same container, it runs one version of software with one state of data. If the container crashes, you just run it again without unplugging any of the sticks. When you are done with your test or development iteration, you stop the container and remove it. Then you have unplugged the sticks to run another combination of data and software.


Cet article Some ideas about Oracle Database on Docker est apparu en premier sur Blog dbi services.

idql and its column output

Mon, 2018-05-07 03:23
idql and its column output

A few days ago, I was reading an interesting blog from distinguished colleague Clemens Bleile with the title “sqlplus and its column output” (link here https://blog.dbi-services.com/sqlplus-and-its-column-output/) and I said to myself: the lucky Oracle administrators and developers have sqlplus, a rather good, out of the box command-line tool to talk to their databases. What equivalent tool do we have with Documentum ? Well, we have mainly idql, which, to put it mildly, sucks. Unlike sqlplus, idql has no column formatting, no reporting, no variable substitution, no error trapping, actually almost nothing, not even command editing or command history (at least, under Unix/Linux). It just reads DQL statements, passes them to the content server and displays back the received answer. Pretty basic. However, for their defense, the Documentum creators gave away the source code of an ancient version (look for $DOCUMENTUM/share/sdk/example/code/idql.c), so it is relatively easy to enhance it the way you like.
Needless to say, having a nicely displayed output is not possible within idql. Whereas sqlplus’ column formatting allows to control the column width to make it narrower and thusly avoids those unending lines filled with spaces, in idql a query is displayed as is. For example, in sqlplus, “column mycol format A10″ tells sqlplus to display the column mycol as an alphanumeric value in a field not larger than 10 characters; exceeding characters are wrapped around on the next line(s). However, if, many columns are SELECTed, long result lines are unvoidable in both sqlplus and idql and the solution proposed in Clemens’ blog can help with idql too since it applies to the terminal as a whole.
Hereafter though, I’d like to propose a few alternatives that don’t require another terminal software, although they may somewhat lack in interactivity. One of them uses the less command, another uses less + a named pipe and a third one a simple awk script to compact and reflow a query’s output. Here we go.
1. Use less command
If reading the output can be done separately from entering the commands into idql, “less -S” is pretty cool:

# run the select and output the result into a text file;
idql dmtest -Udmadmin -Pdmadmin -w100 > tmp_file
select * from dm_sysobject
# now use less -S to examine the result;
less -S tmp_file

Now, it is possible to scroll left and right and have a good look at the result.
Some columns, though, are so wide and filled with trailing blanks that it is quite distracting. This will be taken care of later in the last alternative.

2. A more interactive variant with less -S
It is possible to stay in idql while the output is being redirected into a named pipe which is read by “less -S” and displayed in a second terminal. To do this, follow the steps below:
a. Create a named pipe named idqlp:

mknod -p idqlp

Contrary to the usual anonymous pipes, named pipes have, well, a name, and are created like files in a filesystem. As expected, like their counterparts, they can be written to and read from.
b. copy/paste the following command, it will create a pre-processor script for the less command:

cat - <<EoScript > lesspipe.sh
#! /bin/sh

# must define LESSOPEN environment variable to be used;
# export LESSOPEN="|~/lesspipe.sh %s"

case "$1" in
      cat idqlp
   *) exit 1
exit $?

The preprocessor script will be invoked when less is launched, right before it. This is a nifty feature of less which allows to play tricks with binary files, e.g. decompressing them before viewing them (if they are compressed files). I guess less-ing a java class file could first invoke a decompiler and then pass the result to less. There is also a postprocessor for tasks to be performed after less exits, such as cleaning up the intermediate file created by the preprocessor. All this is very well presented in less’ man page.
c. make it executable:

chmod +x lesspipe.sh

d. copy/paste the following command in a terminal, it will create the consumer script that will continuously be reading from the named pipe idqlp:

cat - <<EoScript > readpipe.sh
#! /bin/bash

export LESSOPEN="|~/lesspipe.sh %s"

while [ true ]; do
   less -S idqlp

e. make it executable:

chmod +x readpipe.sh

f. in the first terminal, run this script in the foreground:


g. in the second terminal, run idql with a redirection into the named pipe idqlp:

idql dmtest -Udmadmin -Pdmadmin -w100 > idqlp

f. now, type your DQL statements with or without those endless lines:

execute show_sessions


select * from dm_sysobject

select * from dm_document



The DQL output of the second terminal (bottom) is displayed in the first terminal (top) where it can be browsed by less.
This trick works quite well as long as a few precautions are respected. As you know, a pipe blocks the producer when it gets full until the consumer starts reading it at the other end. Thus, idql is blocked as long as less has not finished reading its output; for short outputs, no special action is required but for those long listing, in order to force less to reach the end of the output, type shift-g in the less window; give it enough time so the DQL statement completes its output, then ctrl-C. idql is then released whereas the output can be quietly navigated from within less in the first terminal. Once done, BEFORE entering any new command in idql, quit less (command q) so the next cycle begins. Now, the next command can be typed in idql. Failure to do this can hang the consumer in the first terminal and the commands below must be used to get it back on track:

ctrl-Z         # send the consumer into the background;
jobs -l        # identify the pid that's messed up;
kill -9 pid    # send it ad patres;
./readpipe.sh  # restart the consumer;

Or use the one-liner:

jobs -l | grep readpipe.sh | cut -d\ -f3 | xargs kill -9

Sometimes, even the producer process must be restarted. If all this looks clumsy at first, once you get the habit of it, it becomes quite automatic.
This alternative is nice because it avoids cluttering the command window: the DQL commands are separated from their output and therefore stay visible in the second terminal. Moreover, as illustrated, error messages don’t show in the less-ed output.

3. The third alternative: compact the output
Eventhough those pesky long lines are now tamed, the issue of those extra-wide columns mostly filled with blanks remains and this alternative is aimed at it.
Firstly, here is where we take our inspiration from, Oracle sqlplus. Consider the SQL query below:
— The USER_PASSWORD column is still too wide, let’s narrow it:
See what happened here ? Column USER_PASSWORD’s text has been reflowed inside the column’s width, not truncated.
OK, we want all these 3 things:
. compact the columns by removing trainling blanks;
. control the columns width by resizing the way we like;
. introduce column wrapping if our width is too narrow;
Obviously, since we don’t have the source code of idql, we cannot enhance the way it displays the query results so we will do this outside idql and in 2 steps, execute the query and capture its output to process it.
The output processing is performed by the following awk script:

-- compact_wwa.awk;
# Usage:
#    gawk -v maxw=nn -f compact_wwa.awk file
# or:
#    cmd | gawk -v maxw=nn -f compact_wwa.awk
# where:
#     maxw is the maximum column width; characters outside this limit are wrapped around in their own column;
# example:
#     gawk -v maxw=50 -f compact_wwa.awk tmp_file  | less -S
# C. Cervini, dbi-services.com
   while (getline && !match($0, /^([0-9]+> )+/));
   header = substr($0, RLENGTH + 1)
   nbFields = NF
   fs[0] = 0; fw[0] = -1 # just so that fs[1] = 1, see below;
   headerLine = ""; sepLine = ""
   for (i = 1; i <= NF; i++) {
      fs[i] = fs[i - 1] + fw[i - 1] + 2
      fw[i] = length($i)
      sepLine = sepLine sprintf("%s  ", substr($0, fs[i], min(fw[i], maxw)))
   printf("%s\n", sepLine)
   if (match($0, /^\([0-9]+ rows? affected\)/)) {
function printWithWA(S) {
   do {
      left_over = ""
      for (i = 1; i <= nbFields; i++) {
         Min = min(fw[i], maxw)
         printf("%s  ", substr(S, fs[i], Min))
         subS = substr(S, fs[i] + Min, fw[i] - Min)
         if (length(subS) > 0) {
            left_over = left_over sprintf("%-*s  ", fw[i], subS)
            left_over = left_over sprintf("%*s  ", fw[i], "")
      printf "\n"
      gsub(/ +$/, "", left_over)
      S = left_over
   } while (left_over)
function min(x, y) {
   return(x <= y ? x : y)

Now, let’s put it to use:

idql dmtest -Udmadmin -Pdmadmin <<EoQ | gawk -v maxw=35 -f compact_wwa.awk | less -S
select r_object_id, user_name, user_os_name, user_address, user_group_name, user_privileges, owner_def_permit, world_def_permit, group_def_permit, default_folder, user_db_name, description,
acl_domain, acl_name, user_os_domain, home_docbase, user_state, client_capability, globally_managed, user_delegation, workflow_disabled, alias_set_id, user_source, user_ldap_dn, user_xprivileges,
failed_auth_attempt, user_admin, user_global_unique_id, user_login_name, user_login_domain, user_initials, USER_PASSWORD, user_web_page, first_failed_auth_utc_time, last_login_utc_time,
deactivated_utc_time, deactivated_ip_addr, root_log_dir

By the way, funny thing, here “select user_password” is not the same as “select USER_PASSWORD”. The first returns a sequence of asterisks while the second an ASCII representation of the encrypted password. The generated SQL explains why.
“select user_password” gets compiled into the SQL statement below:

select all '****************' as user_password from dm_user_sp dm_user

whereas “select USER_PASSWORD” is the real one:

select all dm_user.USER_PASSWORD from dm_user_sp dm_user

Unlike unquoted column names in Oracle SQL, attribute names in DQL are case-sensitive !
And here is the result:


The script takes one parameter, maxw, the maximum column width. If the columns have too many characters, they are wrapped around on the next line(s) until the whole column has been displayed.

4. The altogether

What if we want the above line compaction and column wrapping around but interactively like in alternative 2 ? Easy. Just edit the script readpipe.sh and change line

cat idqlp


cat idqlp | gawk -v maxw=35 -f compact_wwa.awk

Said otherwise, we are preprocessing idql’s output through the awk filter before giving it to less.

A final alternative

We can achieve the same result with dmawk by writing a generic procedure that takes a DQL query to execute and a maximum column width, or a list of columns name and width (like if we entered a sequence of “col mycol format …” in sqlplus) so everything is done on the fly, but, as they say, this is left as an exercise to the reader. Or, why not, maybe in a blog to come.


Cet article idql and its column output est apparu en premier sur Blog dbi services.

ADWC – the hidden gem: Zepplin Notebook

Fri, 2018-05-04 14:22

In the previous blog posts I explained how to create, and stop/start the Autonomous Data Warehouse Cloud service. And I didn’t show yet how to connect to it. It is easy, from sqlplus or SQL Developer, or SQLcl.

But there’s something more exciting to run some SQL queries: the Oracle Machine Learning Notebooks based on Apache Zepplin. At first, I didn’t realize why the administration menu entry to create users in the ADWC service was named ‘Manage Oracle ML Users’, and didn’t realize that the ‘Autonomous Data Warehouse Cloud’ header was replaced by ‘Machine Learning’.

But last week at IOUG Collaborate 18, I visited the Demo Grounds and thanks to Charlie Berger I realized all the power of this: we are in the ‘Machine Learning’ interface here and the home button opens all the features available to query the ADWC database, including the SQL Notebooks based on Apache Zepplin.

Here is the path to this hidden Gem. From your ADWC service, you go to the Service Console:

Here you log as the ADMIN user with the >12 characters password that you have defined at service creation. Don’t worry if you forgot it, you can reset it from here:

Once connected, you go to the Administration tab and choose the ‘Manage Oracle ML Users':


Here you have to create a user because the ADMIN user not a Machine Learning user. Machine Learning users need one of the following roles: OML_DEVELOPER, OML_APP_ADMIN, OML_SYS_ADMIN. The user you will create here will have OML_DEVELOPER which is required to use SQL Notebooks.

Now that you have a user created from here, you can click on this little house icon, which is your home in the Machine Learning part of the ADWC:


Here you connect with the user you have created from the Oracle ML User page (not the ADMIN one as it has no OML role granted).


Then you are in your OML home, ready to run SQL from a Notebook:


I’ll show what you can do in future post. But just to give you an idea, you have a Notebook where you can type a query, execute it, and have the result displayed as a table, or as a graph. Here I was looking at I/O latency and the following shows me that the ‘cell single block physical read’, which are nothing else than the buffered one-block-at-a-time reads that are called ‘db file sequential read’ when not on Exadata, in dark green here, have most of their I/O call time between 128 and 512 microseconds.


I like this way to have the result just under the query, with easy formatting. The code, documented, is at the same place as the result, in a notebook that is easy to refresh, or share. And you can export the whole in a simple JSON file.


Cet article ADWC – the hidden gem: Zepplin Notebook est apparu en premier sur Blog dbi services.

SQL Server Security: Are your databases ready for the GDPR?

Fri, 2018-05-04 04:57

The GDRP (General Data Protection Regulation), voted in 2016, will be applied in the European Union soon (May 25, 2018). We heard a lot of this new directive.
The goal is to improve the protection and confidentiality of personally identifiable information for every European citizen. Quid of Switzerland? Read the article of Gregory here.
A personal data is information that identifies a natural person, directly or indirectly. It can be a name, a photograph, an IP address, a phone number, a computer login, a mailing address, a fingerprint, a voice recording, social security number, email, etc. Some data are sensitive as they relate to information that may give rise to discrimination or prejudice.

Microsoft reacts with a Guide to enhancing privacy and addressing GDPR requirements with the Microsoft SQL platform and gives to the DBA a new feature with the latest version of SSMS (SQL Server Management Studio) 17: SQL Data Discovery and Classification.
This feature exists on SSMS since the version 17.5. I have installed the version 17.6 for this article.
Once SSMS updated, just choose the database you want to scan, right click on the database and select Task, Classify Data…:


After few seconds, you have the result of the scan and in my case 39 columns with classification recommendations:


Click on this result to see the detail:


For each column, you will see an Information Type and a Sensitivity Label.
The Information Type gives a better granularity of the type of data:


The Sensitivity label is to have the level of sensitivity of the data:


As you can see, you have 2 level for GDPR: Confidential and Highly Confidential.
You can identify some information like for example the personal phone number and change from Confidential – GDPR to Highly Confidential – GDPR:


After reviewed all columns and change the information Type and Sensitivity Label if needed, you can check all cases or only select few columns and click on “Accept selected recommendations”:


After that save the result. Without saving the result, you cannot see the report…
If you want, you have also the possibility to add manually a classification through the button “Add Classification”.
The message “The classification changes have been updated successfully”:


At the end you can click on View Report to have a global view and see with the application owner how to protect sensitive columns with features like Dynamic Data masking, Always Encrypted or a third-part tool.


If I go to a classified column in the objects explorer, a right-click and select properties, I can see in the Extended Properties the information about the information type and sensitivity label:
As you can see the information is directly written to the column properties (Action of the save button).

Like for Vulnerability Assessment, you need to run it database per database and you don’t have the possibility to scan in one  click all databases from an instance. It is also not possible to script it… Perhaps in the next verison! ;-)

It’s also available on Azure, click here for more information.



Cet article SQL Server Security: Are your databases ready for the GDPR? est apparu en premier sur Blog dbi services.

ADWC – a Docker container to start/stop Oracle Cloud services

Thu, 2018-05-03 01:08

In the previous post, I’ve explained how to start and stop the Autonomous Data Warehouse Cloud service from PSM (PaaS Service Manager). There’s a setup phase, and a run phase starting with service-start and ending with service-stop. And the setup is specific to an Oracle Cloud account, storing information in the local user home. You may want to run different setups, and even provide an easy way to start/stop an Oracle Cloud service without knowing the user, password and tenant name.

A Docker container is perfect to isolate this.


Here is my quick (aka experimental) Dockerfile:

FROM alpine:latest
RUN apk add --update --no-cache python3 curl
ENV user=my.cloud.account@me.com
ENV password=MyP@ssw0rd
ENV tenant=idcs-31bbd63c3cb9466cb8a96f627b6b6116
ENV region=us
# get PSM
RUN curl -X GET -u ${user}:${password} -H X-ID-TENANT-NAME:${tenant} https://psm.us.oraclecloud.com/paas/core/api/v1.1/cli/${tenant}/client -o psmcli.zip
# install PSM
RUN pip3 install -U psmcli.zip
# setup PSM
RUN echo "{\"username\":\"${user}\",\"password\":\"${password}\",\"identityDomain\":\"${tenant}\",\"region\":\"${region}\",\"outputFormat\":\"short\"}" > config-payload &&\
psm setup --config-payload config-payload &&\
rm 421d64918638 # remove file with password
# patch PSM for bugs
RUN sed -ie 's/core\\[/]//g' ~/.psm/data/ADWC.json
# variables which can be overwritten at run time
ENV service=ADWC
CMD trap 'echo "Stopping service ${service}...";psm ${service} stop-service --service-name ${name} -wc true ; exit 0' SIGINT SIGSTOP SIGKILL; echo "Starting service ${service}...";psm ${service} start-service --service-name ${name} -wc true || exit 1 ; echo "You can access to ${service} console with ADMIN user at:";echo;psm ${service} service -s ${name} -of json | jq -r '.serviceConsole';echo ; while sleep 60 ; do echo "Status of service ${service} at $(date)...";psm ADWC service -s ADWC ; done

We need curl to download PSM, and pip3 to install it, and python3 to run it.
You can set your Oracle Cloud Account credentials as environment variables.
Then it fills all required information in a ‘config-payload’ file, runs ‘psm setup’ and removes that file.

At run, it calls a ‘start-service’ and loops while showing the status every minute (you can see them with docker container logs). The INT, STOP and KILL signals call ‘stop-service’. Then, the idea is that while the container exists, the Cloud Service is running. And it is shutdown at the container end of life. There’s no other action to do with the container: it display the console url where you have everything to interact with the service (download client credentials, manage users, go to Machine Learning notebooks,…).

Run example:

Here is a simple example:

# docker run --rm franck/psm/awdc
Starting service MYADWC1...
Message: Job submitted successfully for start of service/system
Job ID: 25583108
Waiting for the job to complete... (it cannot be cancelled)
Command completed with status [SUCCEED].
You can access to MYADWC1 console with ADMIN user at:
Status of service MYADWC1 at Sun Apr 29 18:20:50 UTC 2018...
Service: MYADWC1
Status: Ready
Edition: N/A
Compute Site: N/A
Cloud Storage Container: N/A
Created On: 2018-04-19T19:22:18.360+0000
Status of service MYADWC1 at Sun Apr 29 18:21:51 UTC 2018...
Service: MYADWC1
Status: Ready
Edition: N/A
Compute Site: N/A
Cloud Storage Container: N/A
Created On: 2018-04-19T19:22:18.360+0000
Stopping service MYADWC1...
Message: Job submitted successfully for stop of service/system
Job ID: 25620930
Waiting for the job to complete... (it cannot be cancelled)
Command completed with status [SUCCEED].

After 2 minutes I’ve hit ‘Control-C’ to stop the container. The service has been cleanly shut-down.


Cet article ADWC – a Docker container to start/stop Oracle Cloud services est apparu en premier sur Blog dbi services.

Load Balancing with Docker Swarm mode and SQL Server containers

Thu, 2018-05-03 00:57

Docker swarm mode provides an easy way to publish ports for services. Indeed, in this mode all nodes participate in an ingress routing mesh and accept connections to any service port published. Thus, all incoming requests are routed to available nodes hosting a service.

That’s a pretty cool feature but it has some drawbacks. Let’s say that a node goes wrong … In this case the application must retry connecting to the next available node and it will likely imply changing the application connection string or using a DNS redirection. A load balancer outside of the Swarm provides a better way to connect to the containers without having to worry about the cluster nodes from an application perspective. Thanks to the ingress routing all of the published services are available through any of the swarm nodes and the load balancer can be set to use the swarm private IP addresses without a concern of which node is hosting what service.

After taking a look at the existing open source projects / solutions in the market, I quickly reduced the scope of my search to only 2 ones: nginx and HAProxy. Each product has pros and cons but I choose HAProxy that fit well with my tests. In fact, I used more specifically the dockercloud-haproxy because it includes an HAProxy docker image that runs with Docker Swarm mode including docker-compose. I’m not sure this project is dead or alive referring to the dockerhub page that states the following: Unfortunately, this project is currently put into maintenance mode. Please do not send any PR for new features. We will still fix bugs if there is any. It exists some other alternatives as proxy.dockerflow but once again the dockercloud-haproxy project remained relevant for what I wanted to test.

Here my docker-compose file:

version: '3.1'
    build: .
      - "1433:1433"
      - /u00/db2:/u00
      - /u01/db2:/u01
      - /u02/db2:/u02
      - /u03/db2:/u03
      - /u98/db2:/u98
      - MSSQL_SA_PASSWORD_FILE=/run/secrets/mssql_sa_password
      - MSSQL_PID=Developer
      - MSSQL_USER=dbi
      - MSSQL_USER_PASSWORD_FILE=/run/secrets/mssql_user_password
      - TZ=Europe/Berlin
      - SERVICE_PORTS=1433
      - back-tier
      replicas: 1
        constraints: [node.role != manager]
          cpus: '1'
          memory: 3584M
          cpus: '1'
          memory: 3072M
      - source: mssql_sa_password
        target: mssql_sa_password
      - source: mssql_user_password
        target: mssql_user_password

      image: dockercloud/haproxy
        - db
        - BALANCE=leastconn
        - ADDITIONAL_SERVICES=project_dir:db
        - MODE=tcp
        - /var/run/docker.sock:/var/run/docker.sock
        - 80:80
        - 1936:1936
        - back-tier
          constraints: [node.role == manager]

      name: backend-server
      name: frontend-server

secrets: # top level secrets block
    external: true
    external: true


It includes 2 services:

  • db (my SQL Server container)
  • proxy (my HAProxy server)

I used some Swarm Mode specific environment variables:

  • SERVICE_PORTS=1433 – to expose the port of my db service (mssql default port)
  • BALANCE=leastconn – because it is recommended with long sessions as LDAP, SQL, TSE etc…
  • ADDITIONAL_SERVICES=project_dir:db – to identify the db service from the proxy service
  • MODE=tcp – mode of load balancing for HAProxy (TCP in my case)

I deployed the stack as following:

$ docker stack deploy -c docker-compose-test.yml mssql
Ignoring unsupported options: build

Creating service mssql_db
Creating service mssql_proxy


My service state was as follows:

$ docker service ls --filter name=mssql
ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
26zdisl9r64y        mssql_db            replicated          1/1          *:1433->1433/tcp
nhje3081gwr8        mssql_proxy         replicated          1/1                 dockercloud/haproxy:latest             *:80->80/tcp,*:1936->1936/tcp


Concerning the mssql_proxy service we may notice the exposed ports 80 and 1936. The former will be used to connect to the HAProxy for the SQL Server connection redirection and the latter concers the default port to get HAProxy stats information (from a web browser).

blog 132 - 2 - docker swarm node status1

Here an example of my HAProxy output:

blog 132 - 1 - docker swarm HA proxy stats

The db service task (mssql_db.1.o9s2xxxxx) – that corresponds in fact to my service task (or container) is well-identified by the HA proxy.

So, I tried to connect to my SQL Server DB container through the HAProxy:

C:\Users\clustadmin>sqlcmd -S docker1,80 -Usa -PPassw0rd1 -Q"SELECT @@SERVERNAME"



It worked like a charm!

The next step consisted in simulating a failure of my docker node DOCKER3 …

blog 132 - 3 - docker swarm node status2

… and try again a connection to the SQL Server DB container that has restarted on DOCKER2 node. As expected, the connection kept working and was automatically redirected by the HAProxy to the next available node. Obviously in this case the redirection was not transparent for the application. We have to take to into account the short period of outage that corresponds to the mssql_db container restart duration. By the way, if you take a look at the HAProxy default_service section you may notice that the task id has changed (mssql_db.1.zo8yqxxxxxx) meaning the system has created a new task (or container). However, my mssql_db container host name remained the same in my case.

blog 132 - 4 - docker swarm HA proxy stats2

C:\Users\clustadmin>sqlcmd -S docker1,80 -Usa -PPassw0rd1 -Q"SELECT @@SERVERNAME"



Another interesting point I had to dig further concerns the HAProxy health check by itself. In my case I used TCP-based health check to ensure the service is alive but it is probably not good enough to check if my SQL Server instance (in other words my application) is working correctly. I spent some times to read articles and documentations and I discovered we may use custom health checks with HAProxy with databases as MySQL, PostgreSQL and Redis. Unfortunately, at the time I’m writing this blog post nothing concerning SQL Server probably because it is pretty new in this area (maybe I get wrong … so please feel free to comment) . I found out some other alternatives as xinetd that may be used to trigger a custom routine (by using sqlcmd command line tool for example) but in my context it makes the game a little bit more complex because I have to include it to my SQL Server docker image. After thinking a little bit about this issue, I realized we already implemented such routine directly inside the SQL Server image itself (HEALTHCHECK section) that uses precisely the sqlcmd command line tool to check the SQL Server connection is working correctly.

/opt/mssql-tools/bin/sqlcmd -S localhost,$MSSQL_TCP_PORT -U sa -P ${SA_PASSWORD} -Q "select 1"


By the way, the task status can be viewed through the docker ps command as follows:

$docker ps -f name=mssql --format 'table {{.ID}}\t{{.Names}}\t{{.CreatedAt}}\t{{.Status}}'
CONTAINER ID        NAMES	CREATED AT                       STATUS
9bfe04a97617        mssql_db.1.zc8yqp9llmjdcn3df49izyj72	2018-05-01 23:01:39 +0200 CEST   Up 9 hours (healthy)


So, the combination of the docker health check routine and the HAProxy TCP health check seems to be a good solution to address different kind of failure including docker swarm node failures and application failures as well.

This is just the first step about using an HAProxy with SQL Server (redirection capabilities). You may also extend this first scenario with a low cost solution including multiples SQL Server Express containers to address scale-out needs with Reporting queries for instance. I remember Christophe Laporte(b) gave a session about some years ago. I will try to blog about soon!

See you!






Cet article Load Balancing with Docker Swarm mode and SQL Server containers est apparu en premier sur Blog dbi services.

How uid mapping works in Docker containers?

Wed, 2018-05-02 09:56

It can be interesting to see how uids between the docker host and docker containers are mapped. For example, for security concerns.
As a reminder, docker containers are based on two linux kernel features: linux namespaces and cgroups.

Basically, linux namespaces provide isolation for running processes and cgroups allows you to isolate resource usage.

Let’s first run a docker container. Here, we will run a mariadb docker in background with -d option

[docker@docker1 ~]$ docker run -d -e MYSQL_ROOT_PASSWORD=test123 mariadb

Now the container mariadb is running. Let’s see what is happening on a host level.

[docker@docker1 ~]$ ps -ef
polkitd   1729  1718  0 08:14 ?        00:00:00 mysqld

On a container level:

root@5c4450939d71:~# ps -ef
mysql        1     0  0 06:14 ?        00:00:00 mysqld
root       174     0  0 06:22 pts/0    00:00:00 bash

On the host level the mysqld process is running by polkitd and on a container level the process is running by mysql. Any ideas?
This is because the user id (UID) of the mysql user created in mariadb container corresponds to the same UID of the polkitd user on the host.

Let’s see what is the userid of the mysql user in the mariadb container

root@5c4450939d71:~# id mysql
uid=999(mysql) gid=999(mysql) groups=999(mysql)

The UID of mysql is 999. On the host:

[docker@docker1 ~]$ cat /etc/passwd | grep 999
polkitd:x:999:997:User for polkitd:/:/sbin/nologin

We can see that 999 corresponds to the polkitd user id.

How to change this?

Well, this could be a problem because we don’t want to run docker containers with a system user that we don’t know.

One solution could be to create a mysql user with a certain UID on the host:

[root@docker1 ~]# useradd -g mysql -u 1099 -m -r mysql

Then, we modify the user id inside the docker image. To do so, we need to rebuild a new mariadb image :-)
Let’s first clone the docker mariadb project

[docker@docker1 ~]$ git clone https://github.com/docker-library/mariadb.git
Cloning into 'mariadb'...
remote: Counting objects: 751, done.
remote: Compressing objects: 100% (15/15), done.
remote: Total 751 (delta 9), reused 18 (delta 8), pack-reused 728
Receiving objects: 100% (751/751), 152.38 KiB | 0 bytes/s, done.
Resolving deltas: 100% (338/338), done.

We enter the directory of the mariadb version 10.3

[docker@docker1 ~]$ cd mariadb/10.3/

We need to modify the Dockerfile where all instructions are described

[docker@docker1 10.3]$ vi Dockerfile

Change this line

# vim:set ft=dockerfile:
FROM debian:jessie

# add our user and group first to make sure their IDs get assigned consistently, regardless of whatever dependencies get added
RUN groupadd -r mysql && useradd -r -g mysql mysql

To this line

# vim:set ft=dockerfile:
FROM debian:jessie

# add our user and group first to make sure their IDs get assigned consistently, regardless of whatever dependencies get added
RUN groupadd -g 1099 -r mysql && useradd -u 1099 -r -g mysql mysql

We rebuild a new image, let’s call it mariadbcustom

[docker@docker1 10.3]$ docker build -t mariadbcustom:latest .
Sending build context to Docker daemon  13.31kB
Step 1/19 : FROM debian:jessie
 ---> 5dd74d62fab8
Step 2/19 : RUN groupadd -g 1099 -r mysql && useradd -u 1099 -r -g mysql mysql
 ---> Using cache
 ---> a285892faa45
Step 3/19 : ENV GOSU_VERSION 1.10
 ---> Using cache
 ---> 069252945f7a
Step 4/19 : RUN set -ex;                fetchDeps='             ca-certificates                 wget    ';      apt-get update;         apt-get install -y --no-install-recommends $fetchDeps;   rm -rf /var/lib/apt/lists/*;            dpkgArch="$(dpkg --print-architecture | awk -F- '{ print $NF }')";      wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch";     wget -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch.asc";             export GNUPGHOME="$(mktemp -d)";        gpg --keyserver ha.pool.sks-keyservers.net --recv-keys B42F6819007F00F88E364FD4036A9C25BF357DD4;         gpg --batch --verify /usr/local/bin/gosu.asc /usr/local/bin/gosu;       rm -r "$GNUPGHOME" /usr/local/bin/gosu.asc;             chmod +x /usr/local/bin/gosu;    gosu nobody true;               apt-get purge -y --auto-remove $fetchDeps
 ---> Using cache
 ---> c82d4738b781
Step 5/19 : RUN mkdir /docker-entrypoint-initdb.d
 ---> Using cache
 ---> 08acd0843256
Step 6/19 : RUN apt-get update && apt-get install -y --no-install-recommends            apt-transport-https ca-certificates             pwgen   && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 3ed44a5e3cf5
Step 7/19 : ENV GPG_KEYS        199369E5404BD5FC7D2FE43BCBCB082A1BB943DB        430BDF5C56E7C94E848EE60C1C4CBDCDCD2EFD2A        4D1BB29D63D98E422B2113B19334A25F8507EFA5
 ---> Using cache
 ---> b30af869afbb
Step 8/19 : RUN set -ex;        export GNUPGHOME="$(mktemp -d)";        for key in $GPG_KEYS; do                gpg --keyserver ha.pool.sks-keyservers.net --recv-keys "$key";   done;   gpg --export $GPG_KEYS > /etc/apt/trusted.gpg.d/mariadb.gpg;    rm -r "$GNUPGHOME";     apt-key list
 ---> Using cache
 ---> 7a6e03190271
Step 9/19 : RUN echo "deb https://repo.percona.com/apt jessie main" > /etc/apt/sources.list.d/percona.list      &> /etc/apt/preferences.d/percona
 ---> Using cache
 ---> e55705d326a2
Step 10/19 : ENV MARIADB_MAJOR 10.3
 ---> Using cache
 ---> bb3bc4adcf42
Step 11/19 : ENV MARIADB_VERSION 1:10.3.6+maria~jessie
 ---> Using cache
 ---> 05bb1dc686c8
Step 12/19 : RUN echo "deb http://ftp.osuosl.org/pub/mariadb/repo/$MARIADB_MAJOR/debian jessie main" > /etc/apt/sources.list.d/mariadb.list     &> /etc/apt/preferences.d/mariadb
 ---> Using cache
 ---> 3626c50c8d83
Step 13/19 : RUN {              echo "mariadb-server-$MARIADB_MAJOR" mysql-server/root_password password 'unused';              echo "mariadb-server-$MARIADB_MAJOR" mysql-server/root_password_again password 'unused';         } | debconf-set-selections      && apt-get update       && apt-get install -y           "mariadb-server=$MARIADB_VERSION"                percona-xtrabackup-24           socat   && rm -rf /var/lib/apt/lists/*  && sed -ri 's/^user\s/#&/' /etc/mysql/my.cnf /etc/mysql/conf.d/*        && rm -rf /var/lib/mysql && mkdir -p /var/lib/mysql /var/run/mysqld      && chown -R mysql:mysql /var/lib/mysql /var/run/mysqld  && chmod 777 /var/run/mysqld    && find /etc/mysql/ -name '*.cnf' -print0                | xargs -0 grep -lZE '^(bind-address|log)'              | xargs -rt -0 sed -Ei 's/^(bind-address|log)/#&/'      && echo '[mysqld]\nskip-host-cache\nskip-name-resolve' > /etc/mysql/conf.d/docker.cnf
 ---> Using cache
 ---> 7d3d52632798
Step 14/19 : VOLUME /var/lib/mysql
 ---> Using cache
 ---> 3880f6c65676
Step 15/19 : COPY docker-entrypoint.sh /usr/local/bin/
 ---> Using cache
 ---> 98aa1e3161c4
Step 16/19 : RUN ln -s usr/local/bin/docker-entrypoint.sh / # backwards compat
 ---> Using cache
 ---> a5394275c2b2
Step 17/19 : ENTRYPOINT ["docker-entrypoint.sh"]
 ---> Using cache
 ---> c456c7b34697
Step 18/19 : EXPOSE 3306
 ---> Using cache
 ---> 05068b456523
Step 19/19 : CMD ["mysqld"]
 ---> Using cache
 ---> 5973a27bfd43
Successfully built 5973a27bfd43
Successfully tagged mariadbcustom:latest

Let’s check our image is here

[docker@docker1 10.3]$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
mariadbcustom       latest              5973a27bfd43        8 days ago          403MB

we run a docker container with our new customized image

[docker@docker1 10.3]$ docker run -d -e MYSQL_ROOT_PASSWORD=test123 mariadbcustom

Let’s check if the user id was properly initialized to the mysql user

[docker@docker1 10.3]$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
7e344d87c4bc        mariadbcustom       "docker-entrypoint.s…"   6 minutes ago       Up 6 minutes        3306/tcp            hungry_heisenberg
[docker@docker1 10.3]$ docker exec -it hungry_heisenberg /bin/bash
root@7e344d87c4bc:~# id mysql
uid=1099(mysql) gid=1099(mysql) groups=1099(mysql)

We check also that the mysqld process run on the host as mysql user

[docker@docker1 10.3]$ ps -ef
mysql     2727  2716  2 14:05 ?        00:00:00 mysqld

On the host, we can see that the mysqld process runs as mysql user. Why? Because now the user id of the mysql user existing on the docker container corresponds to the one existing on the host. In this case, the user id is 1099.


In some use cases you might want to use a specific user to run some process and not using root or whatever user.However, in order to do that you sometimes need to change the Dockerfile or add a specific user on the host.


Cet article How uid mapping works in Docker containers? est apparu en premier sur Blog dbi services.

Deploy a Cloudera cluster with Terraform and Ansible in Azure – part 2

Wed, 2018-05-02 04:25

In this part of the blog posts series, we will show how ansible helps us to configure our cluster and install all pre-requisite needed for Cloudera Manager. Ansible is one of the most important automation tools currently.

Ansible will help us to configure all nodes for a manual installation using Cloudera Manager. Our playbook will contain the following roles:

  • cm_repo: add the same C.M repo into all nodes.
  • os_config: Adjust all OS parameter for installing a Cloudera cluster. 
  • java: Java JDK 1.7.80 installation.
  • cm_agents: Installation of the C.M agent’s packages
  • MariaDB: Installation of a MariaDB. C.M needs an Oracle, MySQL (MariaDB) or PostgreSQL database for Cloudera Manager meta-data storage and Hive meta-store.
  • mysql_connector: Installation of the MySQL connector for connecting to MariaDB. 
  • scm: Install and start the Cloudera Manager Server.

In a Big Data cluster, we split the node into roles.

  • Manager: dedicated node for all Cloudera Manager daemons
  • Master: NameNode daemon + Secondary NameNode daemon
  • Workers: DataNode daemons

The first step is to define the Ansible hosts inventory file. Below my inventory file.

manager ansible_host=<manager_ip> id=6

manager  ansible_host=<manager_ip> id=6

master ansible_host=<master_ip>  id=5

worker1 ansible_host=<worker1>  id=2
worker2 ansible_host=<worker2>  id=3
worker3 ansible_host=<worker3>  id=4



We will now, define all variable needed for our roles. Variables are split into roles:

Below the example of variables definition for CDH server instances: cdh_servers.yml


db_hostname: "{{ hostvars[groups['db_server'][0]]['inventory_hostname'] }}"
scm_hostname: "{{ hostvars[groups['cdh_manager'][0]]['inventory_hostname'] }}"

cdh_version: 5.14.2
cluster_display_name: cluster_1

# Users and Groups
  - dbi
  - dbi

# Java variables
java_download_url: http://ftp.osuosl.org/pub/funtoo/distfiles/oracle-java/jdk-7u80-linux-x64.tar.gz
java_download_folder: /usr/java
java_name: "{{java_download_folder}}/jdk1.7_80"
java_archive: "{{java_download_folder}}/jdk-7u80-linux-x64.tar.gz"

# Mysql Java connector
mysql_java: mysql-connector-java-5.1.46
mysql_java_download_url: https://dev.mysql.com/get/Downloads/Connector-J/"{{mysql_java_archive}}"
mysql_java_download_folder: /usr/share/mysql-java/
mysql_java_archive: "{{ mysql_java_download_folder }}/{{ mysql_java }}.tar.gz"

mysql_java_jar: /usr/share/java/mysql-connector-java.jar

Same files will created for database server variable (db_server.yml) and Cloudera Manager server variables (scm_server.yml).

After the variables definition, we can start creating the different roles and their associated tasks.

 Cloudera Manager repo

The goal of this role is to add the same C.M repo in all cluster hosts. We will use a template of the repository file.


# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 7 x86_64
name=Cloudera Manager
gpgcheck = 1


- name: Add Cloudera repo
    src: ../templates/cloudera-manager.repo.j2
    dest: "/etc/yum.repos.d/cloudera-manager{{cdh_version}}.repo"

The definition of the Cloudera Manager version has previously done in the cdh_servers.yml variable file.

OS Configuration

Some requirements are needed before installing a Cloudera cluster. This role will configure all hosts with Cloudera requirements: https://www.cloudera.com/documentation/enterprise/release-notes/topics/rn_consolidated_pcm.html#cmig_topic_4 .

- name: Create groups
    name: "{{item}}"
    state: present
  with_items: "{{group}}"

- name: Create user
    name: "{{item}}"
    shell: /bin/bash
    uid: 1050
    groups: "{{group}}"
  with_items: "{{user}}"

- name: "Build hosts file"
    dest: /etc/hosts
    regexp: '.*{{ item }}$'
    line: "{{ hostvars
  • ['ansible_default_ipv4']['address'] }} {{item}}"     state: present   when: hostvars
  • ['ansible_default_ipv4']['address'] is defined   with_items: '{{groups.all}}' - name: Disable transparent huge page - defrag   shell: echo "never" > /sys/kernel/mm/transparent_hugepage/defrag - name: Disable transparent huge page - enabled   shell: echo "never" > /sys/kernel/mm/transparent_hugepage/enabled - name: VM swappiness - 1   shell: echo "1" > /proc/sys/vm/swappiness - name: Set VM swappiness - 2   sysctl:     name: vm.swappiness     value: 1     state: present - name: Create /data dir   file:     path: /data     state: directory     mode: 0775     owner: dbi     group: dbi - name: Create file system on volume   filesystem:     fstype: ext4     dev: /dev/xvdb - name: Mount volume as /data   mount:     name: /data     src: /dev/xvdb     fstype: ext4     opts: defaults,noatime     state: mounted - name: install the latest version of ntp   yum:     name: ntp     state: latest - name: install the latest version of nscd   yum:     name: nscd     state: latest - name: install wget   yum:     name: wget     state: latest - name: Disable SELinux   selinux:     state: disabled - name: Reboot for SELinux if needed   command: /sbin/shutdown -r +1   async: 0   poll: 0
    Java installation

    The Java installation is one of the most complex parts of the installation. First, we need to choose a supported version of JDK. Then we need to be sure that Java has been installed properly in all hosts. The installation tasks is split into the following part:

    • Create installation directories: /usr/share/java and /usr/java
    • Download Java JDK 1.7.80 which is a supported version for Cloudera Manager
    • Unarchive Java JDK
    • Fix ownership
    • Make Java available for the system with alternatives
    • Clean up installation download folder
    • Add Java home path by exporting $JAVA_HOME variable

    Below the java install tasks.

    - name: Create directories
        path: "{{ item }}"
        state: directory
        - "{{ java_download_folder }}"
        - "/usr/share/java"
    - name: Creates directory
        path:  "{{ java_download_folder }}"
        state: directory
    - name: Download Java
        url: "{{ java_download_url }}"
        dest: "{{ java_archive }}"
        headers: "Cookie:' gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie'"
        validate_certs: no
    - name: Unarchive Java archive
        src: "{{ java_archive }}"
        dest: "{{ java_download_folder }}"
        copy: no
    - name: Fix ownership
        state: directory
        path: "{{ java_name }}"
        owner: root
        group: root
        recurse: yes
    - name: Make Java available for system with alternatives
      command: 'alternatives --install "/usr/bin/java" "java" "{{java_name}}/bin/java" 2'
    - name: Clean up Java download
        state: absent
        path: "{{java_archive}}"
    - name: Add java home path
        dest: /etc/profile
        block: |
          export JAVA_HOME=/usr/java/jdk1.7.0_80
          export PATH=$JAVA_HOME/bin:$PATH
          regexp: "JAVA_HOME"
        state: present
    MariaDB installation

    After installing Java, we can start the installation and configuration of MariaDB database. You can find the entire role for MariaDB installation here.

    MySQL connector

    MySQL connector installation steps will follow approximatively the same steps as Java installation. All details here.

    Cloudera Manager Server installation

    The last role of this playbook is the installation of Cloudera Manager server. This role will simply install the Cloudera Manager server package in the cdh_manager host and start the 2 following deamons:

    • cloudera-manager-daemons
    • cloudera-manager-server
    - include_vars: ../../../group_vars/db_server.yml
    - name: Install the Cloudera Manager Server Packages
        name: "{{ item }}"
        state: installed
        - cloudera-manager-daemons
        - cloudera-manager-server
    # - name: Prepare Cloudera Manager Server External Database
    #   command: /usr/share/cmf/schema/scm_prepare_database.sh
    #              -f
    #              --host {{ hostvars[db_hostname]['inventory_hostname'] }}
    #              mysql {{ databases.scm.name }} {{ databases.scm.user }} {{ databases.scm.pass }}
    #   changed_when: False
    - name: Start the Cloudera Manager Server
        name: "{{ item }}"
        state: restarted
        enabled: yes
        - wait cloudera-scm-server
        - cloudera-scm-server
        - cloudera-scm-agent
    # Trigger handler to wait for SCM to startup
    - meta: flush_handlers



    After creating all roles, we need to define our site.yml in order to execute all tasks in the desired order.

    # Cloudera playbook
    - name: Configure Cloudera Manager Repository
      become: ansible_become
      hosts: cdh_servers
        - cm_repo
      tags: cm_repo
    - name: Configure Epel repository
      become: ansible_become
      hosts: cdh_servers
        - epel
      tags: epel_repo
    - name: OS Configuration
      become: ansible_become
      hosts: cdh_servers
          - os_config
      tags: os_config
    - name: Install Java JDK 7
      become: ansible_become
      hosts: cdh_servers
        - java
      tags: java
    - name: Install MySQL Java Connector
      become: ansible_become
      hosts: cdh_servers
        - mysql_connector
      tags: mysql_java_connector
    - name: Install MariaDB and create databases
      hosts: db_server
        - mariadb
      tags: mysql
    # ##############
    - name: Install Cloudera Manager Agents
      hosts: cdh_servers
        - cm_agents
      tags: cm_agents
    - name: Install Cloudera Manager Server
      hosts: cdh_manager
        - scm
      tags: cluster_template


    When all steps will finish, you can access to Cloudera Manager web interface by the following:


    Be sure, your network configuration is well configured to allow access to Cloudera Manager webUI through the default 7180 port.


    The entire project with all files is available here.


    Cet article Deploy a Cloudera cluster with Terraform and Ansible in Azure – part 2 est apparu en premier sur Blog dbi services.

    ADWC: start/stop with PSM Command Line Interface

    Tue, 2018-05-01 23:10

    In the previous post, I explained how to create an Autonomous Data Warehouse with PSM (PaaS Service Manager Command Line Interface). The most common operation you want to do with it is starting and stopping the service. This is the best way to save credits for hourly billed services. And PSM is the easiest: run from everywhere (it is Python 3) and no need to provide credentials each time. In the previous post, I explained how to setup PSM for the ADWC service.

    Unfortunately, for starting and stopping the instance you may realize that:

    • It is not in the documentation
    • Syntax exists but doesn’t work

    The documentation is there, but no mention of start-service, stop-service nor restart-service: https://docs.oracle.com/en/cloud/paas/autonomous-data-warehouse-cloud/user/adwc-commands.html

    The online help has start/stop/restart-service:

    $ psm adwc h
    Oracle Autonomous Data Warehouse Cloud
    psm ADWC [parameters]  
    o services
    List all Autonomous Data Warehouse Cloud instances
    o service
    List Autonomous Data Warehouse Cloud instance
    o create-service
    Provision Autonomous Data Warehouse
    o delete-service
    Unprovision Autonomous Data Warehouse
    o scale-service
    Scale Autonomous Data Warehouse
    o start-service
    This operation will set the operational state of service as started
    o stop-service
    This operation will set the operational state of service as stopped
    o restart-service
    This operation will set the operational state of service as after...
    o view-backups
    List all backups of Autonomous Data Warehouse Cloud instance
    o view-backup
    List a backup of Autonomous Data Warehouse Cloud instance
    o backup
    Backup Autonomous Data Warehouse
    o view-restores
    List all restore operations for Autonomous Data Warehouse Cloud instance
    o view-restore
    List a specified restore operation for Autonomous Data Warehouse Cloud...
    o restore
    Restore Autonomous Data Warehouse
    o check-health
    Health Check operation
    o operation-status
    View status of Autonomous Data Warehouse Cloud instance operation
    o activities
    View activities for Autonomous Data Warehouse Cloud instance
    o help
    Show help

    All 3 take the same parameters, the service name, the REST API output format, and a boolean for wait of the completion of the job:

    $ psm adwc start-service h
    This operation will set the operational state of service as started
    psm ADWC start-service [parameters] -s, --service-name
    [-of, --output-format ] [-wc, --wait-until-complete ]  
    -s, --service-name (string)
    Name of the Autonomous Data Warehouse Cloud instance
    -of, --output-format (string)
    Desired output format. Valid values are [short, json, html]  
    -wc, --wait-until-complete (boolean)
    Wait until the command is complete. Valid values are [true, false]. Default is
    psm ADWC start-service -s ExampleInstance


    So, the online help show it and I try it:

    $ psm adwc start-service --service-name ADWC --output-format short -wc true
    Error: Not Found. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Draft//EN">
    <TITLE>Error 404--Not Found</TITLE>
    <BODY bgcolor="white">
    <FONT FACE=Helvetica><BR CLEAR=all>
    <TABLE border=0 cellspacing=5><TR><TD><BR CLEAR=all>
    <FONT FACE="Helvetica" COLOR="black" SIZE="3"><H2>Error 404--Not Found</H2>
    <TABLE border=0 width=100% cellpadding=10><TR><TD VALIGN=top WIDTH=100% BGCOLOR=white><FONT FACE="Courier New"><FONT FACE="Helvetica" SIZE="3"><H3>From RFC 2068 <i>Hypertext Transfer Protocol -- HTTP/1.1</i>:</H3>
    </FONT><FONT FACE="Helvetica" SIZE="3"><H4>10.4.5 404 Not Found</H4>
    </FONT><P><FONT FACE="Courier New">The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent.</p><p>If the server does not wish to make this information available to the client, the status code 403 (Forbidden) can be used instead. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address.</FONT></P>

    Unfortunately, this doesn’t work. Is it that those commands are not supported yet, reason why we don’t find them in the documentation? Or maybe the opposite: they do not work and rather than fix them, they removed them from the documentation. One thing I’m 100% sure: start-service and stop-service are the most useful commands for a CLI giving easy access to an hourly billed and I want them to work. And it is Python, JSON and HTML – nothing hidden there.

    Hack Fix

    The error message is about no matching URL. PSM metadata is stored in your user directory (~/.psm/data on Linux) with one JSON file for each Oracle platform service. Having a look at the URLs in ADWC.json the bug is obvious:

    $ jq . ~/.psm/data/ADWC.json | grep uri
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/scale",
    "uri": "/paas/core/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/start",
    "uri": "/paas/core/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/stop",
    "uri": "/paas/core/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/restart",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/backups",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/backups/{backupId}",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/backups",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/restoredbackups",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/restoredbackups/{jobId}",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/restoredbackups",
    "uri": "/paas/api/v1.1/instancemgmt/{identityDomainId}/services/{serviceType}/instances/{serviceName}/healthcheck",
    "uri": "/paas/api/v1.1/activitylog/{identityDomainId}/job/{jobId}",
    "uri": "/paas/api/v1.1/activitylog/{identityDomainId}/filter",

    Let’s remove this ‘/core’ from the uri:

    sed -ie 's/core\\[/]//g' ~/.psm/data/ADWC.json

    And run again the start-service:

    $ psm adwc start-service -s ADWC -wc true
    Message: Job submitted successfully for start of service/system
    Job ID: 25617877
    Waiting for the job to complete... (it cannot be cancelled)

    Here it is.

    As my laptop is on Windows where I use Cygwin, I have setup two icons with:
    C:\cygwin64\bin\mintty.exe -w min /usr/bin/psm adwc stop-service -s ADWC -wc true
    C:\cygwin64\bin\mintty.exe -w min /usr/bin/psm adwc start-service -s ADWC -wc true

    Start/Stop time

    This ADWC service is a PDBaaS. Starting and Stopping is as easy as opening and closing a pluggable database. Here are the timestamps after starting and stoppin in a loop (with graph on 24 loops).

    The startup time is around 30 seconds. The stop time is about a minute. Really easy to use.

    $ psm adwc activities -s ADWC -l 50
    Operation Type Status Start Time End Time
    START_SERVICE SUCCEED 2018-04-29T14:30:36.888+0000 2018-04-29T14:31:21.563+0000
    STOP_SERVICE SUCCEED 2018-04-29T14:27:26.551+0000 2018-04-29T14:27:35.610+0000
    STOP_SERVICE SUCCEED 2018-04-29T14:25:22.172+0000 2018-04-29T14:25:51.586+0000
    START_SERVICE SUCCEED 2018-04-29T14:20:47.957+0000 2018-04-29T14:21:38.131+0000
    STOP_SERVICE SUCCEED 2018-04-29T14:08:09.409+0000 2018-04-29T14:08:48.125+0000
    START_SERVICE SUCCEED 2018-04-29T14:07:24.892+0000 2018-04-29T14:08:08.244+0000
    STOP_SERVICE SUCCEED 2018-04-29T14:04:57.566+0000 2018-04-29T14:05:27.458+0000
    START_SERVICE SUCCEED 2018-04-29T14:03:51.035+0000 2018-04-29T14:04:34.108+0000
    STOP_SERVICE SUCCEED 2018-04-29T14:03:17.701+0000 2018-04-29T14:03:47.262+0000
    START_SERVICE SUCCEED 2018-04-29T14:02:00.944+0000 2018-04-29T14:02:50.978+0000
    STOP_SERVICE SUCCEED 2018-04-29T14:00:56.990+0000 2018-04-29T14:01:29.567+0000
    START_SERVICE SUCCEED 2018-04-29T13:59:52.898+0000 2018-04-29T14:00:39.373+0000
    STOP_SERVICE SUCCEED 2018-04-29T13:59:19.380+0000 2018-04-29T13:59:49.011+0000
    START_SERVICE SUCCEED 2018-04-29T13:58:15.594+0000 2018-04-29T13:58:58.937+0000
    STOP_SERVICE SUCCEED 2018-04-29T13:57:42.355+0000 2018-04-29T13:58:11.845+0000

    Easy command line without having to provide a password interactively, wait for completion, fast operation, this gives a great user experience for this service. The only problem is when you play with several cloud accounts. I’ll show an idea in the next post.


    Cet article ADWC: start/stop with PSM Command Line Interface est apparu en premier sur Blog dbi services.

    ADWC: Creation of Autonomous Database Cloud service

    Mon, 2018-04-30 23:00

    You want to try the Autonomous Database Cloud Service? That’s easy. Here is a Step-by-Step.

    Cloud Credits

    First, you need Cloud Credits. You may have bought them (any recent negotiation with Oracle Sales, even for on-premises, involves some Cloud Credits). You can have a free trial with 300$ Cloud Credits available for 1 month. To get another month, you need a different e-mail address and different Credit Card number (not charged). It is quite easy to have different e-mail addresses and your bank may provide virtual credit card where the number changes each time. Or you may have the 5000$ Cloud Credits available for 1 year from the Education program. I got those thanks to ACE Director program.

    Update 01-MAY-2018 – There’s also the 500$ credits from the “white glove” program – you can ask to your Sales representative

    In all cases you will be able to test the service without spending too much credits because:

    • This service is not expensive ($2.5 per OCPU per Hour in Pay As You Go)
    • It is very easy to start and stop the service, and then pay only for the hours where you connect
    • If you choose ‘Bring You Own License’ in the creation, the per OCPU per Hour is only $0.48 (but be sure that you have covered
      See https://cloud.oracle.com/en_US/datawarehouse/pricing)
    • Capturebilling

    • And finally, during the trial promotion, the credits are consumed at discounted rate
      (after 9 hours of usage, I got less than 1$ used)
    OCI Account

    The first generation of Oracle Cloud, is now called ‘OCI Classic’, and you distinguish it when connecting as the Sign-In page mentions ‘Traditional Cloud Account’. You cannot access to ADWC with this account.

    You need an access to the OCI (Oracle Cloud Infrastructure – the version 2 of Oracle Cloud).
    If, when Sign-In, you are welcomed by this guy looking at his phone, you are at the right place. I’m always curious about how they choose an image for a page used every day and for several years. The oracle.com login page is easy with the headquarters blurry shot. For the OCI account, they choose the “Man On Smart Phone – Young Business Man Texting In Airport – Casual Urban Professional Businessman Using Smartphone App Smiling Happy Inside Office Building Or Airport” from the Adobe image stock.


    For the moment, the ADWC service is available only in the Ashburn Cloud Center. Not yet in Europe (but planned for Frankfurt). You can see the regions here: https://cloud.oracle.com/data-regions. Then, when you receive your access to the Oracle Cloud Services, chose the Ashburn Data Center.

    Update 01-MAY-2018 – It seems that the service is available in Frankfurt.

    Create Instance

    The instance creation is easy and fast. It will create a Pluggable Database (PDB) in the Oracle Cloud CDB. You provide a name, and ADMIN password (be careful, rule is at least 12 characters) which is the password you’ll use to connect as the ADMIN user. You can change it later and add new users. The Shape is different from the DBaaS here. You define the number of threads you want to use (it actually sets the CPU_COUNT for the PDB) and the size of PDB datafiles. You can change both later with Scale Up/Down.

    PaaS Service Manager Command Line Interface

    You can also create an ADWC service from the command line. I’ll show how to install and use PSM, the PaaS Service Manager Command Line Interface). Rodrigo Jorge has a nice description for DBaaS on his blog.

    So, you download PSM:

    curl -X GET -u my.cloud.account@me.com:MyP@ssw0rd -H X-ID-TENANT-NAME:idcs-31bbd63c3cb9466cb8a96f627b6b6116 https://psm.us.oraclecloud.com/paas/core/api/v1.1/cli/idcs-31bbd63c3cb9466cb8a96f627b6b6116/client -o psmcli.zip
    % Total % Received % Xferd Average Speed Time Time Time Current
    Dload Upload Total Spent Left Speed
    0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
    100 86945 0 86945 0 0 16806 0 --:--:-- 0:00:05 --:--:-- 23820


    The user:password are those you use in the account Sign-In.

    The ‘Tenant Name’, you get it from the URL of this Man On Smart Phone Sign-in web page. You will see it also mentioned later as ‘Identity domain’ (like in OCI-Classic). If you have a doubt, create the service from the web console, click on it and you will see the Tenant Name.

    So, you have a zip file and do not unzip it. It is a Python 3 module and you install it with ‘pip3′. You can do that in any OS.

    I have the strange idea to run my laptop on Windows with Cygwin for command line stuff. Here are the python3 packages I have here.

    Here is the installation of PDM:

    pip3 install -U psmcli.zip
    Processing ./psmcli.zip

    And now the nice thing is that you will configure once your credentials with ‘psm setup’. You provide the user, password and tenant name (which is called ‘identity domain’ here):

    $ psm setup
    Username: my.cloud.account@me.com
    Password: MyP@ssw0rd
    Retype Password: MyP@ssw0rd
    Identity domain: idcs-31bbd63c3cb9466cb8a96f627b6b6116
    Region [us]:
    Output format [short]:
    Use OAuth? [n]:
    'psm setup' was successful. Available services are:
    o ADWC : Oracle Autonomous Data Warehouse Cloud
    o ADWCP : Oracle Autonomous Data Warehouse Cloud Platform
    o ANALYTICS : Oracle Analytics Cloud
    o APICS : Oracle API Platform Cloud Service
    o APICatalog : Oracle API Catalog Service

    ADWC is on the list. You are ready to manage ADWC instances, such as create one:

    $ psm adwc create-service -c - <<<' {
    "serviceName": "ADWCx",
    "adminPassword": "Ach1z00dAch1",
    "numCpus": "1",
    "storageCapacity": "1",
    "serviceLevel": "PAAS",
    "serviceVersion": "18.1.1",
    "managedSystemType": "oracle",
    "enableNotification": true,
    "notificationEmail": "notifocations@me.com",
    "isBYOL": true
    } '
    Message: Submitted job to create service [ADWCx] in domain [idcs-31bbd63c3cb9466cb8a96f627b6b6116].
    Job ID: 25509908

    We can check the status of job
    $ psm adwc activities --service-name ADWC
    Operation Type Status Start Time End Time
    CREATE_SERVICE RUNNING 2018-04-28T19:57:31.056+0000 N/A

    And a few minutes later the service is there:
    $ psm adwc activities --service-name ADWC
    Operation Type Status Start Time End Time
    CREATE_SERVICE SUCCEED 2018-04-28T19:57:31.056+0000 2018-04-28T19:59:51.900+0000

    We will see how to connect in a future post. Very easy from SQL Developer or SQLcl.

    You can delete the service when you don’t need it anymore:

    psm adwc delete-service --service-name ADWC

    To save credits, you want an easy way to stop and start the service. That’s for tne next post as PSN requires a little hack there.


    Cet article ADWC: Creation of Autonomous Database Cloud service est apparu en premier sur Blog dbi services.

    Oracle 18c clone PDB and Transparent Data Encryption

    Mon, 2018-04-30 10:02

    Let’s do some tests with the Oracle 18c new feature in creating PDB clone with DBCA. Unfortunately, this feature does not work when you have TDE enabled.

    Just to remember, with Oracle 12.2 we had the possibility to create PDBs with dbca just from PDBseed to from unplugged PDBs:


    Now in version 18c we can create PDBs from existing PDBs as follows (for this test TDE is disabled):


    You can choose the Pluggable Database you want to clone.


    You select the name of your cloned PDB, and in less than 1 minute your original PDB is cloned:

    oracle@localhost:/u00/app/oracle/oradata/PSI/ [DB18] sq
    SQL*Plus: Release Production on Mon Apr 30 12:55:54 2018
    Copyright (c) 1982, 2017, Oracle.  All rights reserved.
    Connected to:
    Oracle Database 18c Enterprise Edition Release - Production
    SQL> show pdbs
          2   PDB$SEED         READ ONLY      NO
          3   PDB1             READ WRITE     NO
          5   PDBNEW           READ WRITE     NO

     Let’s make some tests with PDBs and TDE.

    In Oracle 18c, it is no more mandatory to configure the sqlnet.ora file, we only have to define wallet_root and tde_configuration as follows:

    SQL> alter system set wallet_root='/u00/app/oracle/admin/DB18/wallet_cdb' scope=spfile;
    SQL> startup force;
    ORACLE instance started.
     Total System Global Area 1677717664 bytes
    Fixed Size          8896672 bytes
    Variable Size         520093696 bytes
    Database Buffers     1140850688 bytes
    Redo Buffers            7876608 bytes
    Database mounted.
    Database opened.
    SQL> alter system set tde_configuration="keystore_configuration=file" scope=both;
    System altered.

    We create a management key in the CDB

    SQL> administer key management create keystore identified by manager_cdb;
    keystore altered.

    The wallet file is created:

    SQL> !ls /u00/app/oracle/admin/DB18/wallet_cdb/tde

    We open the keystore for the CDB and the PDBs:

    SQL> administer key management set keystore open identified by manager_cdb 
    keystore altered.

    We check in the pluggable database:

    SQL> alter session set container=pdb1;
    Session altered.
    SQL> select  status from v$encryption_wallet;

    But we receive open_no_master_key …

    We return to the CDB:

    SQL> connect / as sysdba
    SQL> administer key management set key identified by manager_cdb with backup;
    keystore altered.

    In the PDB:

    SQL> alter session set container=pdb1;
    Session altered.
    keystore altered.
    SQL> select status from v$encryption_wallet;

    The keystore is now opened in the PDB, we can encrypt the data. With Oracle 18c there are two modes: united (the CDB owns the keystore for itself and the PDBs) or isolated (the PDB has its own keystore). In our case we are in united mode, let’s see if we can clone the PDB.


    We do the same operations as previously, but the assistant is asking us for the keystore password:


    By looking at oracle error messages, we can find a similar error on PDB switchover : “Metalink Note 2378945.1: “We only support this with auto login wallet”

    So I decided to implement auto login in my configuration and try to clone my PDB:

    SQL> administer key management create local auto_login keystore from keystore '/u00/app/oracle/admin/DB18/wallet_cdb/tde' identified by manager_cdb;
    SQL> startup force;
    ORACLE instance started.
    Total System Global Area 1677717664 bytes
    Fixed Size		    8896672 bytes
    Variable Size		  520093696 bytes
    Database Buffers	 1140850688 bytes
    Redo Buffers		    7876608 bytes
    Database mounted.
    Database opened.

    My PDB TDE configuration is in auto login mode:

    SQL> select wrl_type,status, wallet_type from v$encryption_wallet;

    But even if TDE is implemented in auto login mode, the PDB clone operation fails with the same ORA-46697 error message.

    We also encounter this bad behavior with the 18c new features about PDBs snapshot, which allows to create PDBs snapshots manually or automatically:

    SQL> create pluggable database snap_pdb1 from pdb1
      2  file_name_convert = ('snap_pdb1', 'pdb1')
      3* snapshot mode every 60 minutes
    create pluggable database snap_pdb1 from pdb1
    ERROR at line 1:
    ORA-46697: Keystore password required.

    Cloning PDBs is a very useful tool in order to realize mass deployment to development teams, it should be nice to make it work with TDE enabled.


    Cet article Oracle 18c clone PDB and Transparent Data Encryption est apparu en premier sur Blog dbi services.

    Managing SQL Server sa credentials with Docker secrets on Swarm

    Fri, 2018-04-27 05:39

    A couple of weeks ago, I was working on a MSSQL Server docker image in a context of Hidora, a swiss cloud provider based on jelastic and for Docker-based applications.

    When writing my jps manifest file I was agreeably surprised about the section concerning the MSSQL Server credentials information. We may able to define global variables for SQL Server sa password with ${fn.password} as function as shown below:

      sa_password: ${fn.password}
      user_password: ${fn.password}
      - nodeGroup: cp
        count: 1
        cloudlets: 30
        displayName: mssqlserver-linux-2017-cu4 dbi services
          MSSQL_SA_PASSWORD: ${globals.sa_password}
          MSSQL_PASSWORD: ${globals.user_password}
          MSSQL_USER: ${globals.user_login} 
          TZ: Europe/Berlin
        image: dbi_linux_sql2017_cu4
        registryUrl: node21755-env-6328816:5000


    In fact, jelastic provides this interesting function that generates a random password during the creation of the image container preventing to put the security information in clear text into the manifest. A very good idea!

    But let’s going back to a traditional docker infrastructure. Usually as developers, we put sensible information in the docker-compose file deployment but that’s not a big deal in this case (at least for almost cases). If we start containers based on our image in a Docker infrastructure, we still able to get the SQL Server sa password by running the docker inspect command.

    $docker inspect 23107bf057ef | jq .[].Config.Env


    blog 131 - 0 - docker swarm secret

    Definitely, one thing we want to avoid in Production. Sometimes ago, I wrote about Docker Swarm feature that enables addressing production-oriented workload including scalability, high-availability and others. In production, the game may change a lot because we have to manage sensible data as login credentials and fortunately we may rely on a Docker Swarm feature called Docker secrets.

    As stated to the Docker documentation, when we create and add a secret to a swarm, Docker sends the secret to the swarm manager over a mutual TLS connection. The secret is stored in the Raft log, which is encrypted. The entire Raft log is replicated across the other managers, ensuring the same high availability guarantees for secrets as for the rest of the swarm management data.

    This feature may address the security concern with MSSQL Server containers and sa credentials in production workloads. Indeed, as database administrator, we don’t want to provide the sa password to the application users and we will go further by providing a SQL Server login without any scope-limited permissions and without providing any password in clear text in the docker deployment file.

    Thus, we managed to modify our initial MSSQL Docker image to support Docker secrets on Docker Swarm. In this blog, let’s focus on SQL Server sa password. Firstly we have to create a docker secret concerning the SQL Server sa password

    $echo "Passw0rd1" | docker secret create mssql_sa_password –
    $ docker secret ls
    ID                          NAME                  DRIVER              CREATED             UPDATED
    fpqykdgr4ytcher1j3sb5tgfv   mssql_sa_password                         35 minutes ago      35 minutes ago


    The mssql_sa_password secret is then replicated to the other nodes by the Docker Swarm using TLS as explained above.

    The second step consisted in modifying the docker file as well as the docker-compose file for deployment. The former contains two important sections where we had to put the additional code to extract the docker secret information as entrypoint.sh and healthcheck.sh

    The Docker file (I put only the interesting sample here):

    # Entry point # 
    ENTRYPOINT ["./entrypoint.sh"]
    # Tail the setup logs to trap the process
    CMD ["tail -f /dev/null"]
    # Healthcheck routine for mssql instance
    HEALTHCHECK --interval=15s CMD [ "./healthcheck.sh" ]


    entrypoint.sh bash script includes starting up the sqlservr process and healhcheck.sh a custom health check routine based on sqlcmd command line tool (meaning this approach requires mssql-tools package is already installed in your image).

    /opt/mssql-tools/bin/sqlcmd -S localhost,$MSSQL_TCP_PORT -U sa -P $SA_PASSWORD -Q "select 1" && grep -q "MSSQL CONFIG COMPLETED" ./config.log


    The code to leverage Docker secrets was as follows:

    if [  ! -z ${MSSQL_SA_PASSWORD_FILE} ];


    We added a new $MSSQL_SA_PASSWORD_FILE variable that takes priority over the $MSSQL_SA_PASSWORD if exists. The $MSSQL_SA_PASSWORD_FILE points to the path where the secret is available inside the Docker container by design: /run/secrets/<secret file>. We tried to follow a standard rule that consists in adding the _FILE prefix to the existing SQL Server sa variable (MSSQL_SA_PASSWORD) for convenience.

    Finally, we modified the docker-compose file for deployment that contains all information to connect to the secret password so we may switch easily between using the traditional approach with the password in clear text in the deployment file and the securest way to manage sensible data on Docker Swarm.

    blog 131 - docker secret deploy file_

    After applying the code update, using docker inspect command doesn’t reveal the password anymore.

    $docker inspect 62c42040174a | jq .[].Config.Env


    As you probably know, Docker EE 2.0 is now able to manage container applications both on Docker Swarm and Kubernetes. I’m looking forward to write about for both environments in the context of MSSQL Server databases and managing sensible data :)

    See you!




    Cet article Managing SQL Server sa credentials with Docker secrets on Swarm est apparu en premier sur Blog dbi services.

    IOUG Collaborate 18

    Thu, 2018-04-26 18:12

    The IOUG Collaborate 18 is now done.
    I presented 2 sessions there:

    From Transportable Tablespaces to Pluggable Databases

    The introduction comes from a 5 minutes talk at Oracle Open World 2016 in the ‘EOUC Database ACES Share Their Favorite Database Things’, on the history of having tablespaces self-contained (with a relative file number in 8.0 and locally managed tablespaces in 8.1). I’ve added a demo on a feature that is not well known – using RMAN to transport tablespaces without the need to have the source in read-only, available since 10g. And I demoed all PDB movement features in 12cR2, 12cR2 and 18c: remote clone, refreshable clones, PDB switchover, Online relocation,…

    A full article on the topic is available on Oracle Scene: http://viewer.zmags.com/publication/07098028#/07098028/8 and feel free to gove feedback here if you are using those features. Lot of interesting comments went after the session.


    12.2 Multitenant New Security Features to Clarify DevOps and DBA role separation

    This session is basically a demo of lockdown profiles and resource manager settings at PDB level. With an introduction on DevOps because the goal of those features is to be able to lower the roundtrips between Dev and Ops by giving nearly full privileges on the PDB. Those features were developed by Oracle for their own managed cloud services: Exadata Express Cloud Service and Autonomous Data Warehouse. You are the administrator of your PDB there, but locked down to what cannot break the CDB, and limited to the resources you pay for.

    I’ll give this session next Month in Dusseldorf at DOAG Datenbank: https://programm.doag.org/datenbank/2018/#/scheduledEvent/558645, so you still have the occasion to see how this Autonomous Data Warehouse Cloud service works from command line.

    This is clearly an alternative to having Oracle Database on Docker, where containers have a clear separation between the user data and metadata (in the PDB) and the software distribution and data (in ORACLE_HOME, and in CDB$ROOT). But experience shows a slow adoption of multitenant, and developers are asking for Docker containers. But the separation is not so easy: it is clear that the user data must be in an external volume and the software (the Oracle Home – or at least the minimal part of it required to run the database and options). But a big part of the software (for example the dbms_… packages) is also in the database, in CDB$ROOT. Here again feel free to comment.


    Cet article IOUG Collaborate 18 est apparu en premier sur Blog dbi services.

    Deploying PostgreSQL in MiniShift/OpenShift

    Thu, 2018-04-26 00:05

    The last post quickly outlined on how you can setup MiniShift for playing around with OpenShift on your workstation. In this post we’ll setup PostgreSQL in MiniShift using the default PostgreSQL image that already comes with MiniShift.

    When MiniShift is currently stopped start it up:

    dwe@box:~$ minishift start
    dwe@box:~$ eval $(minishift oc-env)
    dwe@box:~$ which oc

    With OpenShift everything is organized into projects and the first thing you need to do is to create a project. You can either do that using the command line or the web interface. Doing it using the command line is quite simple and fast. The first step is to login to OpenShift:

    dwe@box:~$ oc login
    Authentication required for (openshift)
    Username: system
    Login successful.
    You don't have any projects. You can try to create a new project, by running
        oc new-project 

    The output of the login command already tells what you need to do to create a new project:

    dwe@box:~$ oc new-project postgres
    Now using project "postgres" on server "".
    You can add applications to this project with the 'new-app' command. For example, try:
        oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git
    to build a new example application in Ruby.

    Doing the same with the web interface of course needs more time:


    To get a list of available images from the command line:

    dwe@box:~$ oc get imagestreams --namespace openshift
    NAME             DOCKER REPO                                TAGS                           UPDATED
    dotnet            2.0,latest                     4 hours ago
    dotnet-runtime   2.0,latest                     4 hours ago
    httpd              2.4,latest                     4 hours ago
    jenkins          1,2,latest                     4 hours ago
    mariadb          10.1,10.2,latest               4 hours ago
    mongodb          2.4,2.6,3.2 + 2 more...        4 hours ago
    mysql              5.5,5.6,5.7 + 1 more...        4 hours ago
    nginx              1.10,1.12,1.8 + 1 more...      4 hours ago
    nodejs            8,latest,0.10 + 2 more...      4 hours ago
    perl                5.24,latest,5.16 + 1 more...   4 hours ago
    php                  5.5,5.6,7.0 + 2 more...        4 hours ago
    postgresql       9.2,9.4,9.5 + 2 more...        4 hours ago
    python            3.3,3.4,3.5 + 3 more...        4 hours ago
    redis              3.2,latest                     4 hours ago
    ruby                2.0,2.2,2.3 + 2 more...        4 hours ago
    wildfly          10.1,8.1,9.0 + 2 more...       4 hours ago

    Quite a few to choose from but we of course are interested in the postgresql one.

    dwe@box:~$ oc new-app -e POSTGRESQL_USER=blubb -e POSTGRESQL_PASSWORD=blubb -e POSTGRESQL_DATABASE=blubb postgresql

    Checking the logs is always a good idea. For this we need to know the pod:

    dwe@box:~$ oc get pods
    NAME                 READY     STATUS    RESTARTS   AGE
    postgresql-1-8n85h   1/1       Running   0          5m

    Now that we know the pod we can ask for the logs:

    dwe@box:~$ oc logs postgresql-1-8n85h
    The files belonging to this database system will be owned by user "postgres".
    This user must also own the server process.
    The database cluster will be initialized with locale "en_US.utf8".
    The default database encoding has accordingly been set to "UTF8".
    The default text search configuration will be set to "english".
    Data page checksums are disabled.
    fixing permissions on existing directory /var/lib/pgsql/data/userdata ... ok
    creating subdirectories ... ok
    selecting default max_connections ... 100
    selecting default shared_buffers ... 128MB
    selecting dynamic shared memory implementation ... posix
    creating configuration files ... ok
    running bootstrap script ... ok
    performing post-bootstrap initialization ... ok
    syncing data to disk ... ok
    Success. You can now start the database server using:
        pg_ctl -D /var/lib/pgsql/data/userdata -l logfile start
    WARNING: enabling "trust" authentication for local connections
    You can change this by editing pg_hba.conf or using the option -A, or
    --auth-local and --auth-host, the next time you run initdb.
    waiting for server to start....LOG:  redirecting log output to logging collector process
    HINT:  Future log output will appear in directory "pg_log".
    server started
    => sourcing /usr/share/container-scripts/postgresql/start/set_passwords.sh ...
    waiting for server to shut down.... done
    server stopped
    Starting server...
    LOG:  redirecting log output to logging collector process
    HINT:  Future log output will appear in directory "pg_log".

    Looks good so far. How can we work with the PostgreSQL instance now? One way is to start a remote shell:

    dwe@box:~$ oc rsh postgresql-1-8n85h
    sh-4.2$ ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    1000100+     1     0  0 13:29 ?        00:00:00 postgres
    1000100+    57     1  0 13:29 ?        00:00:00 postgres: logger process  
    1000100+    59     1  0 13:29 ?        00:00:00 postgres: checkpointer process  
    1000100+    60     1  0 13:29 ?        00:00:00 postgres: writer process  
    1000100+    61     1  0 13:29 ?        00:00:00 postgres: wal writer process  
    1000100+    62     1  0 13:29 ?        00:00:00 postgres: autovacuum launcher pr
    1000100+    63     1  0 13:29 ?        00:00:00 postgres: stats collector proces
    1000100+    85     0  0 13:46 ?        00:00:00 /bin/sh
    sh-4.2$ psql -c "\l"
                                     List of databases
       Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
     blubb     | blubb    | UTF8     | en_US.utf8 | en_US.utf8 | 
     postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 
     template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres        
               |          |          |            |            | postgres=CTc/postgr
     template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres        
               |          |          |            |            | postgres=CTc/postgr
    (4 rows)

    This is usually not what you want to do. What you rather want is to make the instance available from outside the cluster. How can you do that? Either you do port forwarding:

    dwe@box:~$ oc port-forward postgresql-1-8n85h 5432
    Forwarding from -> 5432

    This will stay in the foreground. From another session you can use psql to connect:

    dwe@box:~$ psql -h localhost -U blubb blubb
    psql (9.5.12, server 9.6.5)
    WARNING: psql major version 9.5, server major version 9.6.
             Some psql features might not work.
    Type "help" for help.

    … or you can expose a service:

    dwe@box:~$ oc expose dc postgresql --type=LoadBalancer --name=mpostgresql-ingress
    service "mpostgresql-ingress" exposed
    dwe@box:~$ oc get svc
    NAME                  TYPE           CLUSTER-IP     EXTERNAL-IP                     PORT(S)          AGE
    mpostgresql-ingress   LoadBalancer,   5432:31734/TCP   38s

    From now on you can connect using the MiniShift IP and the port listed above:

    dwe@box:~$ psql -h $(minishift ip) -p 31734 -U blubb
    Password for user blubb: 
    psql (9.5.12, server 9.6.5)
    WARNING: psql major version 9.5, server major version 9.6.
             Some psql features might not work.
    Type "help" for help.
    blubb=> \l
                                     List of databases
       Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   
     blubb     | blubb    | UTF8     | en_US.utf8 | en_US.utf8 | 
     postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 
     template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
               |          |          |            |            | postgres=CTc/postgres
     template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
               |          |          |            |            | postgres=CTc/postgres
    (4 rows)

    Be aware that the storage is not persistent in that case and you’ll lose everything when the container is stopped. In the next post we’ll look into how we can deploy the containers which are provided by EnterpriseDB.


    Cet article Deploying PostgreSQL in MiniShift/OpenShift est apparu en premier sur Blog dbi services.

    Oracle Fusion Middleware Infrastructure – Probable Incorrect Firewall Configuration

    Wed, 2018-04-25 15:14

    It was a long time that I wrote my last blog. Lots of customer activities and I had no time to write one. With the acquired knowledges, it’s time to write more blogs and share knowledge, don’t you think ?

    Let’s begin with an easy one. During customer activity when upgrading a complex Fusion Middleware Platform, it was asked to us to provide support during a move to a secure zone. I can’t tell you what’s really behind the secure zone, probably a more protected and more restricted network with more firewall restriction, don’t ask I have no idea but we unfortunately had an issue. Before that move, I was quite confident as it was informed that it’s only impacting the current IP address. No stress as all the Fusion Middleware Component configuration were using a network alias leveraging the impact of any network changes that could happen. So as a sample for this blog post, the WebLogic instances listen addresses was set with the network alias “dbi-cust-1983.dbi-services.com” pointing to the real hostname in the DNS “vmtestdbiofm01.dbi-services.com”. The NodeManager was set the same for the configured machine. Please see some screenshot to help to understand the configuration we had.

    WebLogic Admin Server Listen Address


    NodeManager machine Listen Address


    NodeManager Listen Address

    weblogic@:/home/weblogic/ [dbiOFMHDV] cd $DOMAIN_HOME
    weblogic@:/data/weblogic/config/domains/dbiOFMHDV/ [dbiOFMHDV] find ./ -name nodemanager.properties
    weblogic@:/data/weblogic/config/domains/dbiOFMHDV/ [dbiOFMHDV] cat nodemanager/nodemanager.properties | grep ListenAddress

    Let’s also add that the network naming resolution for the Middleware Components was also done through the DNS and the local naming resolution (/etc/hosts) wasn’t containing the real host nor the DNS alias used.

    weblogic@:/data/weblogic/config/domains/dbiOFMHDV/ [dbiOFMHDV] cat /etc/hosts localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

    So the operation began, we shutdown all the Middleware and Database Components, the Network and Unix Team worked on their tasks (IP changes, DNS correlated changes, forcing changes propagation over the network, aso). Once they completed and once we confirmed that Network Config change have been properly applied (ifconfig, ping, nslookup), we started the platform again without any changes from the Middleware part. Application team performed smoke testing and there was no functional impact.

    After reviewing the logs for any error, I found an Error with the RJVM module as follow

    ####<Apr 16, 2018, 11:08:06,438 AM CEST> <Error> <RJVM> <vmtestdbiofm01.dbi-services.com> <AdminServer> <ExecuteThread: '0' for queue: 'weblogic.socket.Muxer'> <<WLS Kernel>> <> <d41ced14-a5e8-4ef9-bd90-19f63910849d-00000059> <1523869686438> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000572> <The server rejected a connection attempt JVMMessage from: '3739718027889105070S:dbi-cust-1983.dbi-services.com:[-1,-1,9001,9001,-1,-1,-1]:dbiOFMHDV:WLS_FORMS' to: '0B:[8443,-1,-1,-1,-1,-1,-1]' cmd: 'CMD_IDENTIFY_REQUEST', QOS: '101', responseId: '-1', invokableId: '-1', flags: 'JVMIDs Sent, TX Context Not Sent, 0x1', abbrev offset: '183' probably due to an incorrect firewall configuration or administrative command.>
    ####<Apr 16, 2018, 11:08:55,371 AM CEST> <Error> <RJVM> <vmtestdbiofm01.dbi-services.com> <AdminServer> <ExecuteThread: '3' for queue: 'weblogic.socket.Muxer'> <<WLS Kernel>> <> <d41ced14-a5e8-4ef9-bd90-19f63910849d-0000005d> <1523869735371> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000572> <The server rejected a connection attempt JVMMessage from: '-2190410908851642855S:dbi-cust-1983.dbi-services.com:[-1,-1,9002,9002,-1,-1,-1]:dbiOFMHDV:WLS_REPORTS' to: '0B:[8443,-1,-1,-1,-1,-1,-1]' cmd: 'CMD_IDENTIFY_REQUEST', QOS: '101', responseId: '-1', invokableId: '-1', flags: 'JVMIDs Sent, TX Context Not Sent, 0x1', abbrev offset: '183' probably due to an incorrect firewall configuration or administrative command.>
    ####<Apr 16, 2018, 11:09:06,509 AM CEST> <Error> <RJVM> <vmtestdbiofm01.dbi-services.com> <AdminServer> <ExecuteThread: '2' for queue: 'weblogic.socket.Muxer'> <<WLS Kernel>> <> <d41ced14-a5e8-4ef9-bd90-19f63910849d-0000005e> <1523869746509> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000572> <The server rejected a connection attempt JVMMessage from: '3739718027889105070S:dbi-cust-1983.dbi-services.com:[-1,-1,9001,9001,-1,-1,-1]:dbiOFMHDV:WLS_FORMS' to: '0B:[8443,-1,-1,-1,-1,-1,-1]' cmd: 'CMD_IDENTIFY_REQUEST', QOS: '101', responseId: '-1', invokableId: '-1', flags: 'JVMIDs Sent, TX Context Not Sent, 0x1', abbrev offset: '183' probably due to an incorrect firewall configuration or administrative command.>
    ####<Apr 16, 2018, 11:09:59,162 AM CEST> <Error> <RJVM> <vmtestdbiofm01.dbi-services.com> <AdminServer> <ExecuteThread: '0' for queue: 'weblogic.socket.Muxer'> <<WLS Kernel>> <> <d41ced14-a5e8-4ef9-bd90-19f63910849d-00000059> <1523869799162> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000572> <The server rejected a connection attempt JVMMessage from: '-2190410908851642855S:dbi-cust-1983.dbi-services.com:[-1,-1,9002,9002,-1,-1,-1]:dbiOFMHDV:WLS_REPORTS' to: '0B:[8443,-1,-1,-1,-1,-1,-1]' cmd: 'CMD_IDENTIFY_REQUEST', QOS: '101', responseId: '-1', invokableId: '-1', flags: 'JVMIDs Sent, TX Context Not Sent, 0x1', abbrev offset: '183' probably due to an incorrect firewall configuration or administrative command.>

    Facing this issue, I ask colleagues for any recommendation or finding to solve the issue ASAP. They recommended to have a look on the cluster configuration for any broadcasting messaging mode, the default cluster configuration messaging mode of my Middleware component was unicast.

    WebLogic Server – Default Forms & Reports Cluster configuration


    Let’s check the BEA error code to continue the investigation.

    The BEA-00572 error description


    The message is quite clear “The server rejected a connection attempt JVMMessage From …” As said before, all the Fusion Middleware was properly started and the functional tests shown that there was no impact at all. It’s always recommended to find a solution to remove such kind of error message. After some research on the MOS, I found a document “Doc ID 860340.1″ applicable the WebLogic Server since the version 7.0. It clearly described that this issue was not reported, nor tested on more recent WebLogic Server version 10.3 or 12c version. I was not really confident with this Oracle Support Note but I tested in case of and I was surprised.

    The provided solution by Oracle work very well on my case, I append the  JAVA Option with the flag “-Dweblogic.rjvm.enableprotocolswitch=true” in the setUserOverrides.sh script located under the $DOMAIN_HOME/bin.

    export JAVA_OPTIONS="${JAVA_OPTIONS} -Ddomain.home=/u02/weblogic/config/domains/dbiOFMHDV -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.security.SSL.minimumProtocolVersion=TLSv1.2 -Dweblogic.security.disableNullCipher=true -Djava.security.egd=file:///dev/./urandom -Dweblogic.rjvm.enableprotocolswitch=true"

    After this change and after having restarted the whole domain, there error was gone. No more probable Firewall Configuration Error.

    Hope this will help other people that will have to move a Fusion Middleware 12c Platform into a secure zone.


    Cet article Oracle Fusion Middleware Infrastructure – Probable Incorrect Firewall Configuration est apparu en premier sur Blog dbi services.

    Setting up MiniShift

    Wed, 2018-04-25 04:18

    Currently we have many requests to support customers in running PostgreSQL in Docker containers. Some of those use redhat OpenShift as the management platform for their Docker deployments. Setting up an OpenShift cluster requires quite some resources and is nothing you want to do on your workstation usually. To overcome that you can use MiniShift which launches a one node OpenShift cluster which you can use for testing. Setting that up is quite easy.

    Obviously MiniShift needs to be downloaded. I’ll be running MiniShift on my workstation and will use VirtualBox as the Hypervisor for Minishift:

    dwe@box:~$ sudo mkdir /opt/minishift
    [sudo] password for dwe: 
    dwe@box:~$ sudo chown dwe:dwe /opt/minishift
    dwe@box:~$ cd /opt/minishift
    dwe@box:/opt/minishift$ wget https://github.com/minishift/minishift/releases/download/v1.16.1/minishift-1.16.1-linux-amd64.tgz
    dwe@box:/opt/minishift$ tar -axf minishift-1.16.1-linux-amd64.tgz
    dwe@box:/opt/minishift$ cd minishift-1.16.1-linux-amd64/
    dwe@box:/opt/minishift/minishift-1.16.1-linux-amd64$ export PATH=/opt/minishift/minishift-1.16.1-linux-amd64:$PATH
    dwe@box:/opt/minishift/minishift-1.16.1-linux-amd64$ minishift config set vm-driver virtualbox
    No Minishift instance exists. New 'vm-driver' setting will be applied on next 'minishift start'

    Now MiniShift can be started:

    dwe@box:/opt/minishift/minishift-1.16.1-linux-amd64$ minishift start

    The output should look similar to this:

    -- Starting profile 'minishift'
    -- Checking if https://github.com is reachable (using proxy: "No") ... OK
    -- Checking if requested OpenShift version 'v3.9.0' is valid ... OK
    -- Checking if requested OpenShift version 'v3.9.0' is supported ... OK
    -- Checking if requested hypervisor 'virtualbox' is supported on this platform ... OK
    -- Checking if VirtualBox is installed ... OK
    -- Checking the ISO URL ... OK
    -- Downloading OpenShift binary 'oc' version 'v3.9.0'
     42.11 MiB / 42.11 MiB [========================================================================================================================] 100.00% 0s-- Downloading OpenShift v3.9.0 checksums ... OK
    -- Checking if provided oc flags are supported ... OK
    -- Starting local OpenShift cluster using 'virtualbox' hypervisor ...
    -- Minishift VM will be configured with ...
       Memory:    2 GB
       vCPUs :    2
       Disk size: 20 GB
       Downloading ISO 'https://github.com/minishift/minishift-b2d-iso/releases/download/v1.2.0/minishift-b2d.iso'
     40.00 MiB / 40.00 MiB [========================================================================================================================] 100.00% 0s
    -- Starting Minishift VM ........................... OK
    -- Checking for IP address ... OK
    -- Checking for nameservers ... OK
    -- Checking if external host is reachable from the Minishift VM ... 
       Pinging ... OK
    -- Checking HTTP connectivity from the VM ... 
       Retrieving http://minishift.io/index.html ... OK
    -- Checking if persistent storage volume is mounted ... OK
    -- Checking available disk space ... 0% used OK
       Importing 'openshift/origin:v3.9.0' . CACHE MISS
       Importing 'openshift/origin-docker-registry:v3.9.0'  CACHE MISS
       Importing 'openshift/origin-haproxy-router:v3.9.0'  CACHE MISS
    -- OpenShift cluster will be configured with ...
       Version: v3.9.0
    Pulling image openshift/origin:v3.9.0
    Pulled 1/4 layers, 26% complete
    Pulled 2/4 layers, 72% complete
    Pulled 3/4 layers, 82% complete
    Pulled 4/4 layers, 100% complete
    Image pull complete
    Using Docker shared volumes for OpenShift volumes
    Using as the server IP
    Starting OpenShift using openshift/origin:v3.9.0 ...
    OpenShift server started.
    The server is accessible via web console at:
    You are logged in as:
        User:     developer
    To login as administrator:
        oc login -u system:admin
    -- Exporting of OpenShift images is occuring in background process with pid 7708.

    You should also see a new VM in VirtualBox:

    As we will need the oc binary to work with OpenShift we need to add that to the PATH:

    dwe@box:/opt/minishift/minishift-1.16.1-linux-amd64$ minishift oc-env
    export PATH="/home/dwe/.minishift/cache/oc/v3.9.0/linux:$PATH"
    # Run this command to configure your shell:
    # eval $(minishift oc-env)
    dwe@box:/opt/minishift/minishift-1.16.1-linux-amd64$ eval $(minishift oc-env)
    dwe@box:/opt/minishift/minishift-1.16.1-linux-amd64$ which oc

    And we are ready to use OpenShift:

    oc login -u system:admin
    Logged into "" as "system:admin" using existing credentials.
    You have access to the following projects and can switch between them with 'oc project ':
      * edb
    Using project "edb".

    The web interface is up and running is well:

    Logging in as system/admin:

    … we could already start and deploy a PostgreSQL container but this is the topic for the next post. To stop MiniShift:

    dwe@box:~$ minishift stop
    Stopping local OpenShift cluster...
    Cluster stopped.

    Hope that helps.


    Cet article Setting up MiniShift est apparu en premier sur Blog dbi services.

    Can I do it with PostgreSQL? – 19 – Create user … identified by values

    Sat, 2018-04-21 06:39

    Puh, that last post in this series is already half a year old. Time is moving too fast :( Today, while being at a customer again, this question came up: Can I do something comparable in PostgreSQL to what I can do in Oracle, which is: Create a user and provide the hashed password so that the password is the same on the source and the target (which implies not knowing the password at all)? In Oracle you can find the hashed passwords in user$ where can I find that in PostgreSQL? Lets go.

    When we look at the “create user” command there is no option which seems to do that:

    postgres=# \h create user
    Command:     CREATE USER
    Description: define a new database role
    CREATE USER name [ [ WITH ] option [ ... ] ]
    where option can be:
        | LOGIN | NOLOGIN
        | CONNECTION LIMIT connlimit
        | [ ENCRYPTED ] PASSWORD 'password'
        | VALID UNTIL 'timestamp'
        | IN ROLE role_name [, ...]
        | IN GROUP role_name [, ...]
        | ROLE role_name [, ...]
        | ADMIN role_name [, ...]
        | USER role_name [, ...]
        | SYSID uid

    Maybe we can just pass the hashed password? Lets try be creating a new user:

    postgres=# create user u with login password 'u';

    The hashed passwords in PostgreSQL are stored in pg_shadow:

    postgres=# select passwd from pg_shadow where usename = 'u';
    (1 row)

    Lets use that hash and create a new user:

    postgres=# create user w login encrypted password 'md56277e2a7446059985dc9bcf0a4ac1a8f';

    Can we login as w using “u” as a password?

    postgres@pgbox:/home/postgres/ [PG10] psql -X -h -p $PGPORT -U w postgres -W
    Password for user u: 
    psql: FATAL:  no pg_hba.conf entry for host "", user "w", database "postgres", SSL off

    Ok, makes sense. After fixing that:

    postgres@pgbox:/home/postgres/ [PG10] psql -X -h -p $PGPORT -U w postgres -W
    Password for user w: 
    psql: FATAL:  password authentication failed for user "w"

    So obviously this is not the way to do it. Do we have the same hashes in pg_shadow?

    postgres=# select usename,passwd from pg_shadow where usename in ('w','u');
     usename |               passwd                
     u       | md56277e2a7446059985dc9bcf0a4ac1a8f
     w       | md56277e2a7446059985dc9bcf0a4ac1a8f
    (2 rows)

    Hm, exactly the same. Why can’t we login then? The answer is in the documentation:”Because MD5-encrypted passwords use the role name as cryptographic salt, …”. We can verify that be re-creating the “w” user using the same password as that of user “u”:

    postgres=# drop user w;
    postgres=# create user w login password 'u';
    postgres=# select usename,passwd from pg_shadow where usename in ('w','u');
     usename |               passwd                
     u       | md56277e2a7446059985dc9bcf0a4ac1a8f
     w       | md53eae63594a41739e87141e8333d15f73
    (2 rows)

    The hashed values are not the same anymore. What of course is working is to re-create the user with that hash:

    postgres=# drop role w;
    postgres=# create user w login password 'md53eae63594a41739e87141e8333d15f73';

    Now we should be able to login with the password ‘u':

    postgres@pgbox:/home/postgres/ [PG10] psql -X -h -p $PGPORT -U w postgres -W
    Password for user w: 
    psql (10.0 dbi services build)
    Type "help" for help.

    Fine. Another way of getting the password hashes is to use pg_dumpall using the “–globals-only” switch:

    postgres@pgbox:/home/postgres/ [PG10] pg_dumpall --globals-only > a.sql
    postgres@pgbox:/home/postgres/ [PG10] grep -w w a.sql 

    Hope that helps.


    Cet article Can I do it with PostgreSQL? – 19 – Create user … identified by values est apparu en premier sur Blog dbi services.

    2018.pgconf.de, recap

    Mon, 2018-04-16 11:43

    Finally I am home from pgconf.de in Berlin at the beautiful Müggelsee. Beside meeting core PostreSQL people such Devrim and Bruce, Andreas and joining Jan again for great discussions and some beers, joking with Anja, being at the dbi services booth, discussing with people, kidding with Hans: was it worth the effort? Yes, it was, and here is why.


    We had very interesting discussions at our booth, ranging from migrations to PostgreSQL, PostgreSQL training corporations and interest in our OpenDB appliance.

    The opening session “Umdenken! 11 Gebote zum IT-Management” raised a question we do always ask our selfs as well: When you do HA how much complexity does the HA layer add? Maybe it is the HA layer that was causing the outage and that would not have happened without that? Reducing complexity is key to robust and reliable IT operations.

    Listening to Bruce Momjian is always a joy: This time it was about PostgreSQL sharding. Much is already in place, some will come with PostgreSQL 11 and other stuff is being worked on for PostgreSQL 12 next year. Just check the slides which should be available for download from the website soon.

    Most important: The increasing interest in PostgreSQL. We can see that at our customers, at conferences and in the interest in our blog posts about that topic. Sadly, when you have a booth, you are not able to listen to all the talks you would like to. This is the downside :(

    So, mark your calendar: Next years date and location are already fixed: May 10, 2019, in Leipzip. I am sure we will have some updates to:



    Cet article 2018.pgconf.de, recap est apparu en premier sur Blog dbi services.

    Covering indexes in Oracle, and branch size

    Fri, 2018-04-13 16:01

    A covering index is an index that contains all the columns required by your query, so that you don’t have to do a TABLE ACCESS BY INDEX ROWID, which is the major cost of an index range scan. You don’t need any special feature to do that in Oracle. Just add the required columns at the end of the index. In the execution plan you will see the columns used as index keys for the range scan displayed in ‘access’ predicates, and the further filtering done on the remaining columns with ‘filter’ predicates. The ‘projection’ shows the columns that are returned in the rowset result.
    However you may have seen that SQL Server has a special ‘INCLUDING’ keyword to separate those non-key columns added only for filtering or projection but not for access. What does it bring that Oracle doesn’t have?

    An index entry is composed of a key and data associated to the key. The index is sorted on the key. The data for each key have no special order, like in a heap table. The idea of the SQL Server INCLUDING keyword is to separate the columns belonging to the key and the columns belonging to the data. It is not mandatory. You can add all columns to the key but depending on the implementation, the benefit can be:

    • some data types may not be allowed in the key but allowed as data
    • sorting the data when not required may be a performance overhead
    • there can be limitations on the size of the key
    • having a larger key may require more space in the branches
    • adding sorted columns may change the clustering factor

    In Oracle, there are very few data types that cannot be indexed (like LONG). The limitation on the size of the key may come into play for large 12c Extended Datatypes. You can substring them, but that defeats the goal of covering indexes. I see two reasons why ‘INCLUDING’ indexes can be useful. The first reason is about the clustering factor. The second about sorting the whole index entry and referencing it from the branches. I’ll detail those reasons later, but first here is an example.

    SQL> create table DEMO (UNIQU ,RANGE ,RANDOM_TEXT ,CONSTANT_TEXT ) as select rownum UNIQU , mod(rownum,4) RANGE , dbms_random.string('u',80) RANDOM_TEXT , lpad('x',80,'x') CONSTANT_TEXT from xmltable('1 to 100000');
    Table DEMO created.
    SQL> commit;
    Commit complete.

    This table has an all-distinct-values column UNIQ, a few-distinct-values on (RANGE) and I’ll use them for the key. And I’ve two columns I’ll add as additional column for covering queries: one is with lot of distinct values (RANDOM_TEXT) and the other has few distinct values (CONSTANT_TEXT).
    The first rows look like this:

    SQL> select * from DEMO order by ROWID fetch first 5 rows only;
    ----- ----- -------------------------------------------------------------------------------- --------------------------------------------------------------------------------
    1 1 XCFNWCRCFBEPJPSHREUVVVTBUCCXLZMRPJPNQDTHWYRZRUORBPDOBCIRFHLICETULTCZTMPOCMUNQITV xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    2 2 XUSPNDOMPQKOIRCVDDTVYAGKRDGIXOSVUNMRAQLSRQGYKOFEXRQMCPXPYZYKRHHKDXGIINOUUAUJOLOO xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    3 3 ZBCVFTDSRUFIUTSIWOOOBWIRMEFUXNWLADAPUPFNPVYDLPQTOUZVXJKMGIPCGZESXFXOIYVMKNSMMZKB xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    4 0 VOIRCXFVSRVZQRZDRLQRHZWNGQJAAWJXWXJKRCJVPWYDJSZLJIOEWAMCFSRCUPSPPEKITJYHHOUQSVYQ xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    5 1 UUSAMEVRWNLPGCUVMJWVVPDAENRYKIWWMIHTUJSZRQASMTYOVQNCGZGZIJZWNSOJVSIBMMUEAXOHJCOA xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    I’m adding indexes fo access on RANGE as the index key, with only the key, or covering the random or constant text:

    SQL> create index DEMO_RANGE on DEMO(RANGE) pctfree 50;
    Index DEMO_RANGE created.

    An additional one adding the unique column in-between:


    And now for access with the unique column as a key:


    Here are some interesting stats:

    SQL> exec dbms_stats.gather_table_stats(user,'DEMO');
    PL/SQL procedure successfully completed.
    SQL> select index_name,blevel,leaf_blocks,num_rows,clustering_factor from user_indexes where table_name='DEMO' order by 2,3;
    -------------------------------- ------ ----------- -------- -----------------
    DEMO_RANGE 1 353 100000 9757
    DEMO_RANGE_COVERING_RANDOM 2 2440 100000 99967
    DEMO_RANGE_COVERING_CONSTANT 2 2440 100000 9757
    DEMO_UNIQU_COVERING_RANDOM 2 2500 100000 2440
    DEMO_UNIQU_COVERING_CONSTANT 2 2500 100000 2440
    DEMO_RANGE_COVERING_WITH_PK 2 2565 100000 9757
    6 rows selected.

    Leaf size

    About the size, the covering indexes have approximately the same number of leaf blocks because the included column (RANDOM_TEXT or CONSTANT_TEXT) has the same size (80 bytes). Of course, the non-covering index is smaller (but will need table access to query additional column). The key on UNIQU is slightly larger than the one on RANGE because the numbers go higher. The index with 3 columns is the largest.

    Clustering factor

    About the clustering factor, there’s one outlier here which deserves an explanation. But before that, you must understand that this higher clustering factor is not important for a query using the covering index, such as a SELECT RANDOM_TEXT WHERE RANGE=0, because in that case you don’t read the table. However for some queries you may cover only the filter predicates and go to the table for projection.
    But the big problem is that when you add a column to an index to address a specific query, you don’t want to risk a side effect on another query, and changing the clustering factor is a risk here. One solution is to keep the old non-covering index (DEMO_RANGE) but then the side effect is on DML overhead.

    To understand the change in clustering factor we must go deeper on Oracle index key and data implementation. The ‘data’ part exists in Oracle indexes even when not specified explicitely with an INCLUDING clause. The ROWID is the data part. An index entry associates a key (the indexed columns) with a pointer to the table row (the ROWID). At least, this is for UNIQUE indexes where each key is unique.

    Non-unique indexes are a special case. Actually, Oracle implements only unique key indexes. When the indexed columns are not unique, the ROWID is stored on the key part of the index entry, and there is no data part. You should read Richard Foote, Differences between Unique and Non-Unique Indexes for detailed explanation.

    Branch size

    The previous statistics displayed only the number of branch level, which was the same, but we can have more detail about the branch size with an ANALYZE INDEX.

    The non-covering index has only one branch block, the root, which references all the 353 leaf blocks containing the 100000 entries, with an average of 5479/352=15 bytes per branch entry:

    SQL> analyze index DEMO_RANGE validate structure offline;
    Index DEMO_RANGE analyzed.
    SQL> select height,blocks,lf_blks,lf_rows_len,br_rows,br_blks,br_rows_len,most_repeated_key,btree_space,used_space,pct_used,rows_per_key,blks_gets_per_access,opt_cmpr_pctsave,lf_uncmp_rows_len,lf_uncmp_blks from index_stats
    ------ ------ ------- ----------- ------- ------- ----------- ----------------- ----------- ---------- -------- ------------ -------------------- ---------------- ----------------- -------------
    2 384 353 1375000 352 1 5479 25000 2830616 1380479 49 25000 12502.5 19 1375000 353

    The covering index with lot of distinct values for the non-key columns has more branch blocks, with an average of 34623/2439=14 bytes per branch entry:

    SQL> analyze index DEMO_RANGE_COVERING_RANDOM validate structure offline;
    SQL> select height,blocks,lf_blks,lf_rows_len,br_rows,br_blks,br_rows_len,most_repeated_key,btree_space,used_space,pct_used,rows_per_key,blks_gets_per_access,opt_cmpr_pctsave,lf_uncmp_rows_len,lf_uncmp_blks from index_stats
    ------ ------ ------- ----------- ------- ------- ----------- ----------------- ----------- ---------- -------- ------------ -------------------- ---------------- ----------------- -------------
    3 2560 2440 9475000 2439 6 34623 1 19558408 9509623 49 1 4 2 9475000 2440

    Here the number of branches is higher only because there are more leaves (as we have more columns), but not because of the size in the branch entries, which are even smaller. They are smaller because the branch does not have to store the full value of all columns in order to identify one leaf block. Then, only the first bytes are needed and not the full 80 bytes of them.

    The covering index with few of distinct values for the non-key columns has a lot more branch blocks, with an average of 234755/2439=96 bytes per branch entry:

    SQL> analyze index DEMO_RANGE_COVERING_CONSTANT validate structure offline;
    SQL> select height,blocks,lf_blks,lf_rows_len,br_rows,br_blks,br_rows_len,most_repeated_key,btree_space,used_space,pct_used,rows_per_key,blks_gets_per_access,opt_cmpr_pctsave,lf_uncmp_rows_len,lf_uncmp_blks from index_stats
    ------ ------ ------- ----------- ------- ------- ----------- ----------------- ----------- ---------- -------- ------------ -------------------- ---------------- ----------------- -------------
    3 2560 2440 9475000 2439 31 234755 25000 19759108 9709755 50 25000 12503.5 86 9475000 2440

    So, here the size of the branch blocks is higher because we have multiple leaves blocks with the value of COVERING_CONSTANT the second column is not sufficient to identify only one leaf block. The full 80 bytes must be stored, and the rowid in addition to it.

    When the indexed column has only unique values, there is no need to store more in the branches (not the additional columns, not the rowid) and only 12 bytes are needed here on average:

    SQL> analyze index DEMO_UNIQU_COVERING_RANDOM validate structure offline;
    SQL> select height,blocks,lf_blks,lf_rows_len,br_rows,br_blks,br_rows_len,most_repeated_key,btree_space,used_space,pct_used,rows_per_key,blks_gets_per_access,opt_cmpr_pctsave,lf_uncmp_rows_len,lf_uncmp_blks from index_stats
    ------ ------ ------- ----------- ------- ------- ----------- ----------------- ----------- ---------- -------- ------------ -------------------- ---------------- ----------------- -------------
    3 2560 2500 9688892 2499 5 29737 1 20030140 9718629 49 1 4 0 9688892 2500

    As the second column is not needed, the size of branch is the same whether we use RANDOM_TEXT or CONSTANT_TEXT:

    SQL> analyze index DEMO_UNIQU_COVERING_CONSTANT validate structure offline;
    SQL> select height,blocks,lf_blks,lf_rows_len,br_rows,br_blks,br_rows_len,most_repeated_key,btree_space,used_space,pct_used,rows_per_key,blks_gets_per_access,opt_cmpr_pctsave,lf_uncmp_rows_len,lf_uncmp_blks from index_stats
    ------ ------ ------- ----------- ------- ------- ----------- ----------------- ----------- ---------- -------- ------------ -------------------- ---------------- ----------------- -------------
    3 2560 2500 9688892 2499 5 29737 1 20030140 9718629 49 1 4 0 9688892 2500

    Now, the last one is my workaround for the higher size when adding a column that do not have a lot of distinct values: just add a column before with more distinct values. Here I use the UNIQU one, but you probably have one that can be useful for your queries.

    SQL> analyze index DEMO_RANGE_COVERING_WITH_PK validate structure offline;
    SQL> select height,blocks,lf_blks,lf_rows_len,br_rows,br_blks,br_rows_len,most_repeated_key,btree_space,used_space,pct_used,rows_per_key,blks_gets_per_access,opt_cmpr_pctsave,lf_uncmp_rows_len,lf_uncmp_blks from index_stats
    ------ ------ ------- ----------- ------- ------- ----------- ----------------- ----------- ---------- -------- ------------ -------------------- ---------------- ----------------- -------------
    3 2688 2565 9963892 2564 6 37456 1 20557908 10001348 49 1 4 2 9963892 2565

    Now you get the idea. When creating an index, or adding columns for covering index, and you have the choice of column order, then try to have their first bytes selective enough so that the branch needs only a small substring to identify each leaf block (or lower level branches).

    Block dumps

    If you want to see the details about the branch length, here are some info from block dumps. I got them with the following:

    SQL> column value new_value tracefile
    SQL> select value from v$diag_info where name='Default Trace File';
    SQL> exec for i in (select header_file, header_block from dba_segments where owner='DEMO' and segment_name='DEMO_RANGE') loop execute immediate 'alter system dump datafile '||i.header_file||' block '||(i.header_block+1); end loop;
    PL/SQL procedure successfully completed.
    SQL> host tail -20 &tracefile

    Here is the last branch entry for the root block of DEMO_RANGE where the first column is not very selective and then the rowid is required in the branch:

    row#351[3279] dba: 113261807=0x6c03cef
    col 0; len 2; (2): c1 04
    col 1; len 6; (6): 07 00 05 7b 00 25

    Here is the last branch entry for the root block of DEMO_RANGE_COVERING_RANDOM where instead of the rowid the 3 first bytes of the RANDOM_TEXT column are sufficient:

    row#3[8006] dba: 113263037=0x6c041bd
    col 0; len 2; (2): c1 04
    col 1; len 3; (3): 53 51 52
    col 2; TERM

    Here is the last branch entry for the root block of DEMO_RANGE_COVERING_CONSTANT where the full 80 bytes of CONSTANT_TEXT are not even sufficient, and the ROWID is needed as a 3rd column:

    row#28[5316] dba: 117444566=0x7000fd6
    col 0; len 2; (2): c1 04
    col 1; len 80; (80):
    78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78
    78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78
    78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78
    78 78 78 78 78
    col 2; len 6; (6): 07 00 05 43 00 25

    Here is the last branch entry for the root block of DEMO_UNIQU_COVERING_CONSTANT where the first column is sufficient:

    row#2[8026] dba: 117447160=0x70019f8
    col 0; len 4; (4): c3 09 0d 04
    col 1; TERM

    So what?

    We probably don’t need a feature like SQL Server INCLUDING indexes in most of the cases. However, this may require thinking about the order of columns, mainly:

    • ensure that selective columns appear as early as possible (without compromising the index access efficiency of course) in order to lower the bytes required to address branches and leaves
    • when adding columns, try to add first a column that will keep the clustering factor you had with the rowid, such as a date of insert

    Cet article Covering indexes in Oracle, and branch size est apparu en premier sur Blog dbi services.