Yann Neuhaus

dbi services technical blog

Documentum – Increase the number of concurrent_sessions

Sat, 2017-04-01 07:01

In Documentum, there is a parameter named concurrent_sessions which basically defines how many sessions a Content Server can open simultaneously. This parameter is defined in the server.ini file (or server_<dm_server_config.object_name>.ini on a Remote Content Server) of each docbase and it has a default value of 100.

An empty Content Server with an IndexAgent and D2 installed (and no users yet) will usually take around 10 sessions for the jobs, the searches, and so on. As soon as there are users in the environment, the number of concurrent sessions will quickly grow and therefore, depending on how many users you have, you may (will) need to increase this limit. To be more precise, the concurrent_sessions parameter controls the number of connections the server can handle concurrently. This number must take into account not only the number of users who are using the repository concurrently, but also the operations those users are executing. Some operations require a separate connection to complete. For example:

  • Issuing an IDfQuery.execute method with the readquery flag set to FALSE causes an internal connection request
  • Executing an apply method or an EXECUTE DQL statement starts another connection
  • When the agent exec executes a job, it generally requires two additional connections
  • Issuing a full-text query requires an additional connection
  • and so on…

 

Do you already know how to increase the number of allowed concurrent sessions? I’m sure you do, it’s pretty easy:

  1. Calculate the appropriate number of concurrent sessions needed based on the information provided above
  2. Open the file $DOCUMENTUM/dba/config/DOCBASE/server.ini and replace "concurrent_sessions = 100" with the desired value ("concurrent_sessions = 500" for example)
  3. Restart the docbase DOCBASE using your custom script or the default Documentum scripts under $DOCUMENTUM/dba (see the sketch after this list)
  4. Ensure that the Database used can handle the new number of sessions properly and see if you need to increase the sessions/processes for that
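
If you want to script steps 2 and 3, here is a minimal sketch, assuming a docbase named DOCBASE, a target value of 500 and the default start/stop scripts under $DOCUMENTUM/dba (adjust the names and the value to your environment):

# update the parameter, verify it and bounce the docbase
sed -i 's/^concurrent_sessions.*/concurrent_sessions = 500/' $DOCUMENTUM/dba/config/DOCBASE/server.ini
grep "^concurrent_sessions" $DOCUMENTUM/dba/config/DOCBASE/server.ini
$DOCUMENTUM/dba/dm_shutdown_DOCBASE
$DOCUMENTUM/dba/dm_start_DOCBASE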

 

To know how many sessions are currently used, it's pretty simple: you can just execute the DQL "execute show_sessions", but be aware that all sessions will be listed and that's not exactly what we want. Therefore you need to keep only the ones with a dormancy_status of Active in the final count, otherwise the value will be wrong. The number of active sessions is not only linked to a docbase but also to a Content Server (to be more precise, it is linked to a dm_server_config object). This means that if you have a High Availability environment, each Content Server (each dm_server_config object) will have its own number of active sessions. This also means that you need to take this into account when calculating how many concurrent sessions you need.
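
As a quick helper, the following sketch pipes that DQL through idql and counts only the Active sessions. It assumes the idql client is in the PATH and installation owner credentials (dmadmin here); the grep patterns may need to be adapted to the way your client formats the output:

echo -e "execute show_sessions\ngo" | idql DOCBASE -Udmadmin -P<password> | grep -i "dormancy_status" | grep -ci "active"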

For example if you calculated that you will need a total of 500 concurrent sessions (again it’s not the number of concurrent users!) for a docbase, then:

  • If you have only one Content Server, you will need to set "concurrent_sessions = 500" in the server.ini file of this docbase.
  • If you have two Content Servers (two dm_server_config objects) for the docbase, then you can set "concurrent_sessions = 275" in each server.ini file. Yes, I know 2*275 is a bit more than 500, but that's because each Content Server will need its own internal sessions for the jobs, searches, and so on. In addition, the Content Servers might need to talk to each other, so these 25 additional sessions per server won't hurt.

 

Now, does the above procedure work for any value of concurrent_sessions? The answer to this question is actually the purpose of this blog: yes and no. From a logical point of view, there is no restriction on this value but from a technical point of view, there is… A few months ago at one of our customers, I was configuring a new application which had a requirement of 2150 concurrent_sessions across a High Availability environment composed of two Content Servers. Based on the information provided above, I started the configuration with 1100 concurrent sessions on each Content Server to match the requirement. But then when I tried to start the docbase again, I got the following error inside the docbase log file ($DOCUMENTUM/dba/log/DOCBASE.log):

***********************************************************

Program error: Illegal parameter value for concurrent_sessions: 1100

***********************************************************

Usage: ./documentum -docbase_name <docbase name> -init_file <filename> [-o<option>]

    -docbase_name : name of docbase
    -init_file    : name of server.ini file (including path) which
                    contains the server startup parameters

Usage: ./documentum -h

 

As you can see, the docbase refuses to start with a number of concurrent sessions set to 1100. What’s the reason behind that? There is an artificial limit set to 1020. This is actually mentioned in the documentation:
The maximum number of concurrent sessions is dependent on the operating system of the server host machine. The limit is:

  • 3275 for Windows systems
  • 1020 for Linux, Solaris, and AIX systems

 

So why is there a limit? Why 1020? This limit is linked to the FD_SETSIZE value. The documentation on FD_SETSIZE says the following:

An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. Moreover, POSIX requires fd to be a valid file descriptor.

 

Thus FD_SETSIZE doesn't explicitly limit the number of file descriptors that can be worked on with the system select() call. Inside every UNIX process, Documentum maintains a corresponding list (of pointers) of file descriptors for its PID. On UNIX based systems, every Documentum session is created as a separate process. Since the number of sessions that can be created directly depends on the number of file descriptors available on the OS, each of these processes keeps a list of file descriptors, which takes a good chunk of physical memory. With this technical reasoning, the value 1020 has been set as the maximum number of concurrent sessions available by default in Documentum.

So basically, this limit of 1020 has been set arbitrarily by EMC to stay within the default OS (kernel) value, which is 1024 (this can be checked with "ulimit -Sa" for example). An EMC internal task (CS-40186) was opened to discuss this point and the possibility to increase this maximum number. Since the current default limit is set only in regard to the default OS value of 1024, if this OS value is increased to 4096 for example (which was our case from the beginning), then there is no real reason to be stuck at 1020 on the Documentum side. The Engineering Team implemented a change in the binaries that allows changing the limit. This is done by adding the environment variable DM_FD_SETSIZE.
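
Before touching anything on the Documentum side, you can quickly check the current file descriptor limits of the installation owner (a simple sanity check, nothing Documentum specific):

ulimit -Sn    # soft limit for open files (often 1024 by default)
ulimit -Hn    # hard limit for open files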

Therefore, to go above 1020 concurrent sessions (1100 in this example) and in addition to the steps already mentioned before, you also need to do the following (depending on your OS, you might need to update the .bashrc or .profile file instead):

echo "export DM_FD_SETSIZE=1200" >> ~/.bash_profile
source ~/.bash_profile
$DOCUMENTUM/dba/dm_start_DOCBASE

 

With the environment variable DM_FD_SETSIZE now set to 1200, we can use 1100 concurrent sessions without issue. The value that will be used for concurrent_sessions will still be the one from the server.ini file; we just need to define DM_FD_SETSIZE with a value equal to or bigger than what we want. Also, I didn't mention the ulimit but of course you also need to set the limits of your OS accordingly (this is done in the file /etc/security/limits.conf or in any file under /etc/security/limits.d/).
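
As an illustration only (the file name and the values below are examples, and I assume the installation owner is dmadmin), the OS limit could be raised with something like this, executed as root:

cat >> /etc/security/limits.d/documentum.conf << EOF
dmadmin    soft    nofile    4096
dmadmin    hard    nofile    4096
EOF

A new login session is needed for the new limits to be taken into account.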

 

 

This article Documentum – Increase the number of concurrent_sessions appeared first on Blog dbi services.

Documentum – Wrong dfc versions after installation, upgrade or patch

Sat, 2017-04-01 06:09

If you are familiar with Documentum or if you have already read some of my blogs, then you probably already know that EMC sometimes has issues with libraries. In a previous blog (this one), I talked about the Lockbox versions which caused us an issue and in this blog, I will talk about DFC versions.

Whenever you install a CS patch or any other patch, it will probably come with its own DFC libraries, simply because EMC fixed something in them or because a newer version was needed. Whenever you install D2, it will also have its own DFC libraries in the JMS and the WAR files. The problem is that the DFC libraries are everywhere… Each and every DFC client has its own DFC libraries which come when you install it, patch it, and so on. Basically that's not a wrong approach, it ensures that the components will work wherever they are installed and can always talk to the Content Server.

The problem here is that the DFC libraries change with almost every patch and therefore it is kind of complicated to keep a clean environment. It already happened to us that two different patches (CS and D2 for example), released on the exact same day, were using different DFC versions, and you will see below another example coming from the same package… You can live with a server having five different DFC versions, but this also means that whenever a bug impacts one of your DFC libraries, it will be hard to fix because you then need to deploy the next official patch, which is always a pain. It also multiplies the number of issues that impact your DFC versions since you are running several versions at the same time.

I'm not saying that you absolutely need to always use only the latest DFC version, but if you can properly and quickly perform the appropriate testing, I believe it can bring you something. A few weeks ago for example, one of the Application Teams we are supporting had an issue with some search functionalities in D2. This was actually caused by the DFC version bundled with D2 (DFC 7.2P03 I think) and we solved this issue by simply using the DFC version coming from our Content Server (DFC 7.2P05) which was only two patches above.

To quickly and efficiently see which versions of the DFC libraries you are using and where, you can use:

find <WHERE_TO_FIND> -type f -name dfc.jar -print0 | while read -d $'' file; do echo "DFC: `$JAVA_HOME/bin/java -cp "$file" DfShowVersion`  ===  Size: `ls -ks "$file" | awk '{print $1}'` kb  ===  Location: $file"; done

or

find <WHERE_TO_FIND> -type f -name dfc.jar -print0 | while IFS= read -r -d '' file; do echo "DFC: `$JAVA_HOME/bin/java -cp "$file" DfShowVersion`  ===  Size: `ls -ks "$file" | awk '{print $1}'` kb  ===  Location: $file"; done

 

You can execute these commands on a Content Server, Application Server (Note: the dfc.jar files might be inside the D2/DA war files if you aren't using exploded deployments; see the sketch below for that case), Full Text Server or any other Linux server for that matter. These commands handle spaces in the paths even if normally you shouldn't have any for the dfc files. To use them, you just replace <WHERE_TO_FIND> with the base folder of your installation. This can be $DOCUMENTUM for a Content Server, $XPLORE_HOME for a Full Text Server, and so on. Of course you still need to have the proper permissions to see the files, otherwise it will be quite useless to execute this command.
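
For a non-exploded deployment, a quick way to check the dfc.jar bundled inside a war file is to extract it to a temporary folder first. A small sketch (the D2.war path below is just an example, adjust it to your deployment):

cd $(mktemp -d)
unzip -o -q /tmp/D2.war 'WEB-INF/lib/dfc.jar'
$JAVA_HOME/bin/java -cp WEB-INF/lib/dfc.jar DfShowVersion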

A small example on a Content Server 7.3 (no patches are available yet) including xCP 2.3 P05 (End of February 2017 patch which is supposed to be for CS 7.3):

[dmadmin@content_server_01 ~]$ find $DOCUMENTUM -type f -name dfc.jar -print0 | while read -d $'' file; do echo "DFC: `$JAVA_HOME/bin/java -cp "$file" DfShowVersion`  ===  Size: `ls -ks "$file" | awk '{print $1}'` kb  ===  Location: $file"; done
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/product/7.3/install/composer/ComposerHeadless/plugins/com.emc.ide.external.dfc_1.0.0/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/temp/installer/wildfly/dctmutils/templates/dfc/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/dfc/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/modules/system/layers/base/emc/documentum/security/main/dfc.jar
DFC: 7.2.0210.0184  ===  Size: 15212 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/deployments/bpm.ear/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/deployments/ServerApps.ear/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/deployments/acs.ear/lib/dfc.jar
DFC: 7.2.0210.0184  ===  Size: 15212 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/tmp/vfs/deployment/deploymente7c710bab402b3f7/dfc.jar-7ac143a725d0471/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/tmp/vfs/deployment/deploymente7c710bab402b3f7/dfc.jar-bc760ece35b05a08/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/tmp/vfs/deployment/deploymente7c710bab402b3f7/dfc.jar-35c79cfe4b79d974/dfc.jar

 

As you can see above, it looks like there are two different versions of the DFC library on this Content Server which has just been installed: one coming from the CS 7.3, which is therefore 7.3 P00 build number 205, and another version which is still 7.2 P21 build number 184. This second version has been put on the Content Server by the xCP 2.3 P05 installer. Using a 7.2 library on a 7.3 Content Server is a little bit ugly, but the good news is that they are both pretty recent versions since these two libraries were released almost at the same time (end of 2016/beginning of 2017). Therefore I don't think it is a big problem here, even if, as soon as the CS 7.3 P01 is out (normally end of this month), we will replace all dfc.jar files with the 7.3 P01 version.

Another example on a Full Text Server using xPlore 1.6 (same as before, no patches are available yet for xPlore 1.6) including one Primary Dsearch and two IndexAgents for DOCBASE1 and DOCBASE2:

[xplore@fulltext_server_01 ~]$ find $XPLORE_HOME -type f -name dfc.jar -print0 | while read -d $'' file; do echo "DFC: `$JAVA_HOME/bin/java -cp "$file" DfShowVersion`  ===  Size: `ls -ks "$file" | awk '{print $1}'` kb  ===  Location: $file"; done
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/temp/installer/wildfly/dctmutils/templates/dfc/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/dfc/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_Indexagent_DOCBASE1/tmp/vfs/deployment/deployment5417db9ca7307cfc/dfc.jar-aa1927b943be418f/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_Indexagent_DOCBASE1/deployments/IndexAgent.war/WEB-INF/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_Indexagent_DOCBASE2/tmp/vfs/deployment/deploymentbb9811e18d147b6a/dfc.jar-7347e3d3bbd8ffd/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_Indexagent_DOCBASE2/deployments/IndexAgent.war/WEB-INF/lib/dfc.jar
DFC: 7.3.0000.0196  ===  Size: 15220 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_PrimaryDsearch/tmp/vfs/deployment/deployment5fd2cff2d805ceb2/dfc.jar-29edda1355c549b8/dfc.jar
DFC: 7.3.0000.0196  ===  Size: 15220 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_PrimaryDsearch/deployments/dsearchadmin.war/WEB-INF/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/modules/system/layers/base/emc/documentum/security/main/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/watchdog/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/setup/qbs/tool/lib/dfc.jar

 

Do you see something strange here? Because I do! This is a completely new xPlore 1.6 server which has just been installed and yet we have two different versions of the DFC libraries… It's not a difference in the minor version but a difference in the build number! As you can see above, it looks like the PrimaryDsearch uses a DFC 7.3 P00 build number 196 while all other DFC libraries are 7.3 P00 build number 205 (so just like the Content Server). The problem here is that each xPlore module (IA, Dsearch, and so on) is built by a different xPlore team. Therefore the team that packages the Dsearch libraries isn't the same as the one that packages the IndexAgent libraries.

Since there is a difference here, it probably means that the Dsearch team built their package some days/weeks before the other teams (IA, CS, and so on) and therefore the DFC libraries included in the Dsearch are older… Is it an issue or not? According to EMC it's not, BUT I wouldn't be so categorical. If EMC built this library 9 additional times, it's not for nothing… There must be a reason behind those builds and therefore not having the latest build seems a little bit risky to me. Since this is just a sandbox environment, I will most probably just wait for the P01 of xPlore 1.6, which will be released in a few days, and implement it to have an aligned DFC version for all components.

 

Have fun finding issues in the EMC releases :).

 

 

This article Documentum – Wrong dfc versions after installation, upgrade or patch appeared first on Blog dbi services.

Documentum – Thumbnail not working with TCS

Sat, 2017-04-01 04:41

A few months ago, right after a migration of around 0.5TB of documents, we enabled the TCS (Trusted Content Services) for one of our applications. We were using a Content Server 7.2 P05 with the associated D2 4.5 P03. As already mentioned in a previous blog, D2 4.5 doesn't handle the previews of the documents and therefore a Thumbnail Server was also used by this application. The setup of the TCS for the document filestores went well without issue, but when we tried to do the same thing for the Thumbnail filestore, the previews weren't working anymore.

Basically when you configure the Thumbnail to use a TCS filestore, you need to request new renditions for existing documents otherwise they will continue to use the non-TCS filestore.

If you access D2 while inspecting the network traffic or using the browser dev/debug features, you will see that D2 builds the Thumbnail preview URL using the value of "dm_document.thumbnail_url". This thumbnail_url is actually – by default – a concatenation of several things: thumbnail_url = base_url + path + store.

Therefore if you define:

  • the Thumbnail base_url to be “https://thumbnail_alias/thumbsrv/getThumbnail?”
  • the Thumbnail TCS filestore to be “trusted_thumbnail_store_01”
  • a test document 09012345801a8f56 with a file system path (get_path) to be: “00012345/80/00/10/4d.jpg”

Then D2 will retrieve a thumbnail_url that is:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01

 

When accessing this URL, if the filestore mentioned above is indeed a TCS filestore, this is the kind of logs that will be generated:

Jul 05, 2016 9:42:15 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_DOCBASE_THUMB] Retrieving docbase thumbnail...
Jul 05, 2016 9:42:15 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : the ETag received is the path - will return 304 response
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_NEW_REQUEST] New request: path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_DOCBASE_THUMB] Retrieving docbase thumbnail...
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_STORE] Retrieving storage area...
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : getting storage area for docbase: 00012345 store: trusted_thumbnail_store_01
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_CHECK_SECURITY] Checking security...
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : About to start reading files...
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Trusted content store found, will use DFC for decryption
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Content paramters are: format= null; page=null; page modifier=null
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: Object ID missing from request...

 

When using a non-TCS filestore, this URL will retrieve the document preview properly (of course the name of the store was then "thumbnail_store_01" and not "trusted_thumbnail_store_01"), but as you can see above, with a TCS filestore the Thumbnail Server tries to use the DFC for decryption and isn't able to because the Object ID is missing from the request… With a non-TCS filestore, the Thumbnail Server retrieves the content directly from the file system, but that's not possible with a TCS filestore because of the encryption. Therefore a TCS-enabled Thumbnail Server has to use the "getContent" method of the DFC and this method requires the Object ID. That's the issue here.

When we faced this issue last summer, we contacted EMC because this was clearly a bug in how D2 constructs the URLs to request the preview, and not in how the Thumbnail Server processes the requested URLs. After several days, the D2 engineering team provided a hotfix to enable D2 to provide the correct parameters. Basically, with the new hotfix deployed, D2 was now able to generate the following URL:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

 

Note: In this case, we also had to specify the format (jpeg_lres). While waiting for the EMC hotfix, we performed some tests on our side by trying to add the missing parameters and trying to understand what the Thumbnail Server needs in the requests. At some point, we found out that the Thumbnail Server was trying to use "jpeg_th" as a default value for the format if you don't specify any. As far as I know, we have a pretty default Thumbnail configuration but we don't have any "jpeg_th" format. Only jpeg_lres and jpeg_story are used and therefore I don't know where this jpeg_th is coming from. I believe that's an issue of the Thumbnail Server itself, because EMC included the format in the D2 hotfix too after we mentioned that to them.
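
For this kind of manual testing, you don't even need D2: you can call the getThumbnail servlet directly with curl for example (sketch only; -k is there in case a self-signed certificate is used, and the URL has to be adjusted to your environment):

curl -k -o preview.jpg "https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres"
file preview.jpg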

 

So using this new URL, the generated log files were:

Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_NEW_REQUEST] New request: path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_DOCBASE_THUMB] Retrieving docbase thumbnail...
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_STORE] Retrieving storage area...
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : getting storage area for docbase: 00012345 store: trusted_thumbnail_store_01
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_CHECK_SECURITY] Checking security...
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : About to start reading files...
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Trusted content store found, will use DFC for decryption
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Content paramters are: format= jpeg_lres; page=null; page modifier=null
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : will be retrieving object 09012345801a8f56
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Session ok for 00012345

 

And it worked; the preview of the first page of this document was displayed. The hotfix seemed to work so we started to deploy it in our DEV environments while waiting for a confirmation from the Application Team that it was also working from D2 (because of SoD, our admin accounts don't have access to documents in D2). But then, after a few tests, the App Team found out that the hotfix was actually only partially working: only for the first page! Indeed, EMC created this hotfix using a one-page document and never tried to retrieve the previews of a multi-page document. The thing with the Thumbnail Server is that if you have a 25-page document, then this document will have 25 previews. I already talked about that in another previous blog, so you can take a look at it to have more information on how to manage that.

I will suppose that 09012345801a8f56 is a 3-page document. I gathered some information from this document and also got the paths of the thumbnail previews that are related to the 3 pages:

API> ?,c,select r_object_id, full_format, parent_id, content_size, full_content_size, set_time, set_file from dmr_content where any parent_id='09012345801a8f56'
r_object_id       full_format  parent_id         content_size  full_content_size  set_time           set_file
----------------  -----------  ----------------  ------------  -----------------  -----------------  --------------------------------------------------------------------
06012345801d95cc  jpeg_lres    09012345801a8f56  60467         60467              7/4/2016 13:22:39  C:\Users\SYS_AD~1\AppData\Local\Temp\batchFile353776780929575961.tar
	get_path => /data/DOCBASE/trusted_thumbnail_storage_01/00012345/80/00/10/4d.jpg

06012345801d95cd  jpeg_lres    09012345801a8f56  138862        138862             7/4/2016 13:22:39  C:\Users\SYS_AD~1\AppData\Local\Temp\batchFile353776780929575961.tar
	get_path => /data/DOCBASE/trusted_thumbnail_storage_01/00012345/80/00/10/4e.jpg

06012345801d95ce  jpeg_lres    09012345801a8f56  29596         29596              7/4/2016 13:22:39  C:\Users\SYS_AD~1\AppData\Local\Temp\batchFile353776780929575961.tar
	get_path => /data/DOCBASE/trusted_thumbnail_storage_01/00012345/80/00/10/4f.jpg

 

So here we have three different jpg files for the same "parent_id" (the document) and each one is a preview of a specific page. These previews have different sizes (content_size/full_content_size), therefore the previews should indeed be different!

With the information provided above, the previews URLs generated by D2 (using the new hotfix) are:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4e.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres


https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4f.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

 

Three different URLs for three different pages. That looks fine, but actually if you access these three URLs, you will always see the preview of the first page of the document… It goes even beyond that: you can put almost any path in the URL and it will still show the first page of the document (09012345801a8f56). To confirm that, you can access the following URLs and see for yourself that they will also display the same preview:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/zz.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/zz/zz/zz/zz.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres


https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

 

Based on all the above information, you can understand that the Thumbnail Server works quite differently depending on whether a non-TCS or a TCS filestore is used… For non-TCS filestores, the URL points directly to the actual file and that's the reason why the full path is really important: it leads directly to the right file and each page has its own path. For TCS filestores, since all the files are encrypted, the Thumbnail Server can't use the file path directly. Therefore it relies on the DFC to first decrypt the file and then return the result. That's the reason why only the docbase ID is needed inside the "path" when using a TCS filestore and why everything else in the path is completely useless. On the other hand, as seen previously in this blog, you of course also need some additional information to specify which preview you want.

In addition to the parameters we saw above, and in order to uniquely identify the different renditions, page and/or page_modifier is also required with a TCS filestore. Both attributes are part of the dmr_content table: page is the position of the content when the object has multiple contents (generally it will be zero by default) while page_modifier uniquely identifies a rendition within the same page number and format for a document. If no/null page_modifier is passed in the URL, then the rendition with an empty page_modifier value is returned. If there isn't any rendition in the docbase without a page_modifier value, then the one with the smallest page_modifier (in alphabetical order) will be returned instead: that's the reason why all previous URLs always returned the preview of the first page only… In short, for non-TCS filestores the path pretty much does the job, but for TCS filestores we need additional parameters to uniquely identify the rendition.
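
To see which page_modifier values exist for the renditions of a given document (and therefore which value to put in the URL), a query like the following can help (a sketch based on the same test document):

API> ?,c,select r_object_id, full_format, page_modifier, parent_id from dmr_content where any parent_id='09012345801a8f56' order by r_object_id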

 

So to summarize the investigation we did with EMC (because yes we found a definitive solution), the minimum elements that should be present in the Thumbnail URLs are:

  • non-TCS filestore:
    • Full path to the right file
    • Store used
  • TCS filestore:
    • Path containing at least the Docbase ID
    • Store used
    • Document ID (parent_id)
    • Format of the preview
    • page and/or page_modifier (from the dmr_content table)

 

In the end, EMC provided another hotfix which fixes the first hotfix. The complete URLs are now generated by D2 and the previews are working properly even with a TCS Thumbnail filestore. Here are the examples for the 3-page document:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres&page_modifier=000000001

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4e.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres&page_modifier=000000002


https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4f.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres&page_modifier=000000003

 

The definitive hotfix should normally be included in all the 2017 releases according to EMC.

 

 

This article Documentum – Thumbnail not working with TCS appeared first on Blog dbi services.

Can I do it with PostgreSQL? – 12 – dbca (database configuration assistant)

Fri, 2017-03-31 12:59

Short answer: No. Long answer: No. Every other possible answer: No. There is no such tool in PostgreSQL.
In Oracle you can use the Database Configuration Assistant to create your Oracle database. But this is not a must, you can do the same using scripts. Either you use dbca to generate scripts based on the questions you answered in the several screens, or you use your own set of scripts to create an Oracle database (you can use dbca templates in addition, or you invoke dbca in silent mode and pass your options on the command line). In PostgreSQL the steps to create a PostgreSQL instance are totally different, no surprise here. It already starts with the wording: in PostgreSQL you do not create a database, you create a database cluster. Confused? This has nothing to do with multiple nodes or multiple instances as you know it from Oracle (e.g. Real Application Clusters). What a database cluster means here are the files on disk (a collection of databases) that are managed by a set of background processes. Cluster just means: belonging together (there are many types of clusters, actually). I'll use the term database instance from now on. So how can you create a database instance after you installed the PostgreSQL binaries?

When you use the graphical installer provided to the community by EnterpriseDB, a PostgreSQL instance will be created automatically after the binaries have been installed. This will create a service on Windows and startup/shutdown scripts on Linux based operating systems (not sure about Mac OS). On Linux, when you install the standard way, the installer will require root privileges as it needs to add to the system configuration the scripts that start up and shut down PostgreSQL automatically when the server starts up or goes down. Once you started the installer it is just a matter of clicking through the screens:

(Installer screenshots: pg_install_1 through pg_install_9)

What happened in the background is that the PostgreSQL instance was created and started:

postgres@pgbox:/home/postgres/ [pg960final] ps -ef | grep postgres | egrep -v "ssh|bash|ps"
postgres  3412     1  0 07:30 ?        00:00:00 /opt/PostgreSQL/9.6/bin/postgres -D /opt/PostgreSQL/9.6/data
postgres  3413  3412  0 07:30 ?        00:00:00 postgres: logger process   
postgres  3415  3412  0 07:30 ?        00:00:00 postgres: checkpointer process   
postgres  3416  3412  0 07:30 ?        00:00:00 postgres: writer process   
postgres  3417  3412  0 07:30 ?        00:00:00 postgres: wal writer process   
postgres  3418  3412  0 07:30 ?        00:00:00 postgres: autovacuum launcher process   
postgres  3419  3412  0 07:30 ?        00:00:00 postgres: stats collector process   
postgres  3805  3579  0 07:56 pts/0    00:00:00 grep --color=auto postgres

In addition the installer created a systemd service (I am on CentOS) so your instance will start up and shutdown automatically (and a postgres user and group, of course):

postgres@pgbox:/home/postgres/ [pg960final] systemctl status postgresql-9.6.service
● postgresql-9.6.service - PostgreSQL 9.6 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-9.6.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2017-03-31 07:30:01 CEST; 28min ago
  Process: 3410 ExecStart=/opt/PostgreSQL/9.6/bin/pg_ctl start -w -t ${TimeoutSec} -D /opt/PostgreSQL/9.6/data -l /opt/PostgreSQL/9.6/data/pg_log/startup.log (code=exited, status=0/SUCCESS)
 Main PID: 3412 (postgres)
   CGroup: /system.slice/postgresql-9.6.service
           ├─3412 /opt/PostgreSQL/9.6/bin/postgres -D /opt/PostgreSQL/9.6/data
           ├─3413 postgres: logger process   
           ├─3415 postgres: checkpointer process   
           ├─3416 postgres: writer process   
           ├─3417 postgres: wal writer process   
           ├─3418 postgres: autovacuum launcher process   
           └─3419 postgres: stats collector process   

From now on you can connect to the instance and start your journey:

postgres@pgbox:/home/postgres/ [pg960final] /opt/PostgreSQL/9.6/bin/psql postgres
Password: 
psql.bin (9.6.2)
Type "help" for help.

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(3 rows)

postgres=#

But this is not the only method for creating a PostgreSQL instance. On Linux, most of the distributions provide PostgreSQL packages in their repositories which you can then install using yum (when you are on RedHat based systems) or apt (when you are on Debian based systems). There are others as well, such as YaST on SuSE, but yum and apt are the most popular, I believe. In my case, using the official CentOS repositories, I can use yum and check what CentOS provides:

postgres@pgbox:/home/postgres/ [pg960final] yum search postgresql
...
postgresql.i686 : PostgreSQL client programs
postgresql.x86_64 : PostgreSQL client programs
postgresql-contrib.x86_64 : Extension modules distributed with PostgreSQL
postgresql-devel.i686 : PostgreSQL development header files and libraries
postgresql-devel.x86_64 : PostgreSQL development header files and libraries
postgresql-docs.x86_64 : Extra documentation for PostgreSQL
postgresql-jdbc.noarch : JDBC driver for PostgreSQL
postgresql-jdbc-javadoc.noarch : API docs for postgresql-jdbc
postgresql-libs.i686 : The shared libraries required for any PostgreSQL clients
postgresql-libs.x86_64 : The shared libraries required for any PostgreSQL clients
postgresql-odbc.x86_64 : PostgreSQL ODBC driver
postgresql-plperl.x86_64 : The Perl procedural language for PostgreSQL
postgresql-plpython.x86_64 : The Python2 procedural language for PostgreSQL
postgresql-pltcl.x86_64 : The Tcl procedural language for PostgreSQL
postgresql-server.x86_64 : The programs needed to create and run a PostgreSQL server
postgresql-test.x86_64 : The test suite distributed with PostgreSQL
postgresql-upgrade.x86_64 : Support for upgrading from the previous major release of PostgreSQL
...

The packagers decided to split PostgreSQL into several packages as you can see above. If you only want to install the core server you would:

[root@centos7 ~] yum install postgresql-server.x86_64

In this case the instance will not be created automatically:

[root@centos7 ~] ps -ef | grep postgres
root      9981  4558  0 08:18 pts/0    00:00:00 grep --color=auto postgres

But, as with the graphical installer above, a systemd service was created for you:

[root@centos7 ~] systemctl list-unit-files | grep postgres
postgresql.service                            disabled

Then we can just enable and start the service?

[root@centos7 ~] systemctl enable postgresql.service
Created symlink from /etc/systemd/system/multi-user.target.wants/postgresql.service to /usr/lib/systemd/system/postgresql.service.
[root@centos7 ~] systemctl start postgresql.service
Job for postgresql.service failed because the control process exited with error code. See "systemctl status postgresql.service" and "journalctl -xe" for details.

Not really. What the installation with yum did not do for you is create the instance. The good thing with the CentOS provided packages is that you get helpers to create the instance which do not come with PostgreSQL itself. For creating the instance you would:

-bash-4.2$ sudo postgresql-setup initdb
Initializing database ... OK

… and your instance got created but not yet started. To start it:

-bash-4.2$ sudo systemctl start postgresql.service
-bash-4.2$ psql postgres
psql (9.2.18)
Type "help" for help.

Quite easy. But here you can see one issue with the packages provided by the Linux distributions. What was installed in my case is PostgreSQL 9.2.18 and there are some points to consider with that: the latest PostgreSQL 9.2 release is 9.2.20 (as of today), so you miss 2 minor versions containing bug fixes. Even more importantly, PostgreSQL 9.2 will go out of support this September because it was released almost 5 years ago; the current release is 9.6.2 (supported until September 2021). Not a good choice for running a production server.

Luckily there is another possibility: The PostgreSQL community provides repositories as well. There is one for yum based systems and one for apt based systems. For getting the yum based repositories into CentOS you would (for PostgreSQL 9.6):

[root@centos7 ~] wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-centos96-9.6-3.noarch.rpm
[root@centos7 ~] yum localinstall pgdg-centos96-9.6-3.noarch.rpm 

When you check what is available now you’ll find this:

postgresql96.x86_64 : PostgreSQL client programs and libraries
postgresql96-contrib.x86_64 : Contributed source and binaries distributed with PostgreSQL
postgresql96-debuginfo.x86_64 : Debug information for package postgresql96
postgresql96-devel.x86_64 : PostgreSQL development header files and libraries
postgresql96-docs.x86_64 : Extra documentation for PostgreSQL
postgresql96-libs.x86_64 : The shared libraries required for any PostgreSQL clients
postgresql96-odbc.x86_64 : PostgreSQL ODBC driver
postgresql96-plperl.x86_64 : The Perl procedural language for PostgreSQL
postgresql96-plpython.x86_64 : The Python procedural language for PostgreSQL
postgresql96-pltcl.x86_64 : The Tcl procedural language for PostgreSQL
postgresql96-server.x86_64 : The programs needed to create and run a PostgreSQL server
postgresql96-tcl.x86_64 : A Tcl client library for PostgreSQL
postgresql96-tcl-debuginfo.x86_64 : Debug information for package postgresql96-tcl
postgresql96-test.x86_64 : The test suite distributed with PostgreSQL

Installation is the same as with the CentOS repositories above:

[root@centos7 ~] yum install postgresql96-server.x86_64
[root@centos7 ~] systemctl list-unit-files | grep postgres
postgresql-9.6.service    
[root@centos7 ~] systemctl enable postgresql-9.6.service
Created symlink from /etc/systemd/system/multi-user.target.wants/postgresql-9.6.service to /usr/lib/systemd/system/postgresql-9.6.service.
[root@centos7 ~] ls /usr/pgsql-9.6/bin/initdb ^C
[root@centos7 ~] su - postgres
Last login: Fri Mar 31 08:29:10 CEST 2017 on pts/1
-bash-4.2$ /usr/pgsql-9.6/bin/initdb
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/pgsql/9.6/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/pgsql-9.6/bin/pg_ctl -D /var/lib/pgsql/9.6/data -l logfile start

-bash-4.2$ sudo systemctl start postgresql-9.6.service
-bash-4.2$ /usr/pgsql-9.6/bin/psql postgres
psql (9.6.2)
Type "help" for help.

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(3 rows)

And ready. The packages provided by the PostgreSQL community are a much better choice than the ones provided by your distributions because here you get the latest ones.

What we did above as an extra step was calling a utility named "initdb". initdb is the tool you use to create the PostgreSQL instance after you have installed the binaries. The graphical installer uses initdb in the background as well, as there is no other way to do it. When you compile and install PostgreSQL from source you would also need to use initdb to create your PostgreSQL instance.
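
Just as a sketch of what a manual initdb call could look like (the data directory and options below are examples, not values used above):

/usr/pgsql-9.6/bin/initdb -D /var/lib/pgsql/9.6/data2 --encoding=UTF8 --locale=en_US.UTF-8 --data-checksums
/usr/pgsql-9.6/bin/pg_ctl -D /var/lib/pgsql/9.6/data2 -l logfile start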

Conclusion: Depending on how you install PostgreSQL onto your system, the PostgreSQL instance is created either automatically (when you use the installer provided by EnterpriseDB) or almost automatically (when you use the packages provided by your distribution or the packages provided directly by the PostgreSQL community). In the next post we'll look at initdb in more detail.

 

This article Can I do it with PostgreSQL? – 12 – dbca (database configuration assistant) appeared first on Blog dbi services.

SQL Server on Linux and experimental Flexible Architecture

Fri, 2017-03-31 06:28

In this blog post, let's talk a little bit about the optimal installation of SQL Server on the Linux file system. This is probably one of the first considerations we will have as database administrators. Actually, a basic installation of SQL Server boils down to running a yum command and letting the SQL Server installer place all related files on the file system. But in a production environment the story will be different and we will have to follow best practices. In fact, Microsoft guidelines already exist in the Windows world and mainly concern the placement of system / user database files as well as tempdb.

But what about Linux? In my opinion, the game is not so different and existing database file placement rules also apply on Linux. We don’t work with drive letters on Linux but it doesn’t matter because in some cases we already use the same mount point concept.

How to start and how to achieve an optimal placement have been the questions in my head since the last meetup I attended about MongoDB, organized by my colleagues Medhi Bada and Eric Dusquesnoy.

During the meeting, they talked about MongoDB installation according to the OFA architecture initiated by Oracle, which has many similarities with the FHS on Unix systems. Another interesting thing I realized after discussing with my colleagues is that we also propose the same base model for other DBMSs like MySQL or PostgreSQL (with some adjustments to their respective contexts).

In a nutshell, OFA architecture provides the following interesting things:

  • A logical layout for the databases that helps DBAs manage their systems. The idea here is to provide a predictable and consistent file naming structure to manage and deal easily with issues such as adding hardware, installing a new instance, etc.
  • Preventing performance bottlenecks by distributing I/O across enough disks. For instance, we may separate groups of entities that will contend for disk resources. We may think here about separating data and log files, data files in different filegroups, or dedicating tempdb to its own storage path.
  • Maximizing database reliability in case of disk failure

But you may wonder: why scale the existing SQL Server best practices to the OFA standard?

First of all, let's say that the existing Microsoft best practices, in many respects, resemble OFA. Then, providing a uniform approach to customers (Oracle, MySQL, PostgreSQL, MongoDB and SQL Server) about best practices may help contribute to a true immersion of SQL Server in the Linux world. From a customer perspective, we may provide guidelines based on an architecture standard already widely known in the Linux world. This is, at least, the way I see it.

So just to be clear, the next part of this blog is purely experimental and it doesn’t pretend to replace any future guideline that will be provided by Microsoft.

Before getting to the heart of the matter, let's look at the files associated with the mssql-server package. I continue to use the CentOS distribution for my demo but you may easily apply the same with other distributions.

[root@sql mssqlserver]# rpm -qpl mssql-server-14.0.405.200-1.x86_64.rpm
/opt/mssql/bin
/opt/mssql/bin/compress-dump.sh
/opt/mssql/bin/generate-core.sh
/opt/mssql/bin/mssql-conf
/opt/mssql/bin/paldumper
/opt/mssql/bin/sqlpackage
/opt/mssql/bin/sqlservr
/opt/mssql/lib
/opt/mssql/lib/libc++.so.1
/opt/mssql/lib/libc++abi.so.1
/opt/mssql/lib/libjemalloc.so.1
/opt/mssql/lib/libsqlvdi.so
/opt/mssql/lib/libunwind.so.8
/opt/mssql/lib/mssql-conf
/opt/mssql/lib/mssql-conf/checkinstall.sh
/opt/mssql/lib/mssql-conf/checkrunninginstance.sh
/opt/mssql/lib/mssql-conf/collations.txt
/opt/mssql/lib/mssql-conf/disableservice.sh
/opt/mssql/lib/mssql-conf/enableservice.sh
/opt/mssql/lib/mssql-conf/mssql-conf.py
/opt/mssql/lib/mssql-conf/mssqlconfhelper.py
/opt/mssql/lib/mssql-conf/mssqlsettings.py
/opt/mssql/lib/mssql-conf/mssqlsettingsmanager.py
/opt/mssql/lib/mssql-conf/sapassword.sh
/opt/mssql/lib/mssql-conf/set-collation.sh
/opt/mssql/lib/mssql-conf/startservice.sh
/opt/mssql/lib/mssql-conf/stopservice.sh
/opt/mssql/lib/sqlpackage.sfp
/opt/mssql/lib/sqlservr.sfp
/opt/mssql/lib/system.certificates.sfp
/opt/mssql/lib/system.common.sfp
/opt/mssql/lib/system.netfx.sfp
/opt/mssql/lib/system.sfp
/usr/lib/systemd/system/mssql-server.service
/usr/share/doc/mssql-server/LICENSE.TXT
/usr/share/doc/mssql-server/THIRDPARTYNOTICES.TXT
/usr/share/man/man1/mssql-conf.1.gz
/usr/share/man/man1/sqlpackage.1.gz
/usr/share/man/man1/sqlservr.1.gz

From the above output we may find a bunch of files that are part of the proper functioning of SQL Server. For example, we may notice different script files (both bash and python scripts), libraries and sfp files (which are part of the new Drawbridge architecture?), man page files and finally the sqlservr binary.

Most of these files are installed in the main hierarchy /opt/mssql/ with the following specific items

  • /opt/mssql/bin/ (binary files and SQL Server crash dump generation scripts)
  • /opt/mssql/lib (sqlserver libraries and sfp files)
  • /opt/mssql/lib/mssql-conf (python and bash scripts)

In addition, installing a SQL Server instance reveals an additional hierarchy /var/opt/mssql with the following items:

  • data folder = default folder for database data and transaction log files. By the way, system and tempdb database files are located here by default. At the time of writing, SQL Server is in CTP 1.4 and moving the system databases is not supported, but we may expect this limit to be removed in the future
  • log folder = log files are stored here. We may find logs related to the SQL Server engine (errorlog files), to the SQL Server agent (SQLAGENT.OUT), to the default SQL Server trace and to the HK engine extended events.
  • secret folder = contains the machine.key file used by the SQL Server engine, or potentially other files that come with high-availability architectures, for example to store pacemaker credentials.
  • mssql.conf file = stores the default locations for user database files and dump files.

The last one – mssql.conf – drew my attention. According to my tests, it is possible to modify this file manually without getting side effects, but maybe take this with a pinch of salt at this stage.

So here is my flexible architecture version for SQL Server:

  • Binary structure
mkdir -p /u00/app/sqlserver
mkdir -p /u00/app/sqlserver/product/14/mssqlserver
mkdir -p /u00/app/sqlserver/local/dmk
mkdir -p /u00/app/sqlserver/local/mssqltools
mkdir -p /u00/app/sqlserver/admin/mssqlserver/etc
mkdir -p /u00/app/sqlserver/admin/mssqlserver/log
mkdir -p /u00/app/sqlserver/admin/mssqlserver/dump

ln -s /opt/mssql/bin /u00/app/sqlserver/product/14/mssqlserver
ln -s /var/opt/mssql/mssql.conf /u00/app/sqlserver/admin/mssqlserver/etc/
ln -s /var/opt/mssql/log/ /u00/app/sqlserver/admin/mssqlserver/log 

Product files consist of the SQL Server software supplied in the media / package provided by Microsoft.

Administrative files contain data about the instance, including SQL Server and SQL Agent error log files, default traces or HK engine extended event files, server process dumps and finally instance parameter files like the mssql.conf file.

  • Database structure
mkdir -p /u01/sqlserverdata/mssqlserver
mkdir -p /u02/sqlserverlog/mssqlserver
mkdir -p /u03/sqlservertempdb/mssqlserver
mkdir -p /u98/sqlserver/backup/mssqlserver

Then let’s configure the correct permissions on the new hierarchy folders

chown -R mssql:mssql /u01/sqlserverdata/mssqlserver
chmod -R 750 /u01/sqlserverdata/mssqlserver
chown -R mssql:mssql /u02/sqlserverlog/mssqlserver
chmod -R 750 /u02/sqlserverlog/mssqlserver
chown -R mssql:mssql /u03/sqlservertempdb/mssqlserver
chmod -R 750 /u03/sqlservertempdb/mssqlserver
chown -R mssql:mssql /u98/sqlserver/backup/mssqlserver
chmod -R 750 /u98/sqlserver/backup/mssqlserver

After configuring the permissions, let's change the default path parameters with the mssql-conf utility (the resulting mssql.conf content is sketched right after these commands)

/opt/mssql/bin/mssql-conf set filelocation.defaultdatadir /u01/sqlserverdata/mssqlserver 
/opt/mssql/bin/mssql-conf set filelocation.defaultlogdir /u02/sqlserverlog/mssqlserver 
/opt/mssql/bin/mssql-conf set filelocation.defaultbackupdir /u98/sqlserver/backup/mssqlserver
/opt/mssql/bin/mssql-conf set filelocation.defaultdumpdir /u00/app/sqlserver/admin/mssqlserver/dump
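
For illustration, the mssql.conf file mentioned earlier should then contain something like the following (a sketch only, the exact layout may differ between builds):

[filelocation]
defaultbackupdir = /u98/sqlserver/backup/mssqlserver
defaultdatadir = /u01/sqlserverdata/mssqlserver
defaultdumpdir = /u00/app/sqlserver/admin/mssqlserver/dump
defaultlogdir = /u02/sqlserverlog/mssqlserver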

And let’s move tempdb database files by connecting to SQL Server (moving tempdb is supported since the CTP 1.4).

DECLARE @sql NVARCHAR(MAX) = N'';
DECLARE @path_name NVARCHAR(MAX) = N'/u03/sqlservertempdb/mssqlserver/';

USE tempdb;


SELECT 
	@sql += N'ALTER DATABASE [tempdb] MODIFY FILE (NAME = ' + name + ', FILENAME = ''' 
	        + @path_name + REVERSE(SUBSTRING(REVERSE(REPLACE(physical_name, 'C:\', '')), 1 , CHARINDEX('\', REVERSE(REPLACE(physical_name, 'C:\', ''))) - 1)) + ''
			+ ''')' + CHAR(13)
FROM sys.database_files;


--PRINT @sql

EXEC sp_executesql @sql;
GO

And finally let’s restart the SQL Server instance:

systemctl restart mssql-server

Moreover, we may also add the binary structure to the system PATH and change it when a new product is installed

echo 'PATH="$PATH:/u00/app/sqlserver/product/14/mssqlserver/bin"' >> ~/.bashrc
source ~/.bashrc

Invoking mssql-conf is pretty easy in this case as shown below:

[root@sql home]# mssql-conf [options]

Finally let’s have a look at the database files placement :

SELECT 
	DB_NAME(database_id) AS database_name,
	name AS logical_name,
	physical_name
FROM sys.master_files
ORDER BY database_id

(Screenshot: blog 120 – SQL Server and Flexible architecture – query result showing the database file locations)

Drive letters and Windows-style paths are still present for the system databases, but no doubt this will disappear in a future release.

Bottom line

Keep in mind that the OFA architecture is very flexible and provides only guidelines and best practices. Therefore, we will be able to make adjustments over time as the version evolves. Please feel free to comment, it is always appreciated!

See you

 

 

This article SQL Server on Linux and experimental Flexible Architecture appeared first on Blog dbi services.

GoldenGate 12.2: ERROR: Unable to create a new wallet

Thu, 2017-03-30 10:42

Today I had a very strange issue when setting up a brand new GoldenGate 12.2 installation. Installation went fine, patch went fine (PSU 170221) and then I started to configure GoldenGate. One of the first commands we usually do is:

GGSCI (xxxxx) 2> create wallet

I really never had any issues with this, but today I got:

GGSCI (psrdbap03n1) 2> create wallet

ERROR: Unable to create a new wallet.

What a great error message and it was not much more verbose in the GoldenGate logfile:

2017-03-30 15:20:06  INFO    OGG-02096  Oracle GoldenGate Command Interpreter for Oracle:  ERROR: Unable to create a new wallet.

So, what to do? My first thought was:

[oracle@xxx]~[xxx]> ps -ef | grep ggscs
oracle    7683 30606  0 15:29 pts/2    00:00:00 grep ggscs
[oracle@xxx]~[xxx]> ps -ef | grep ggs
oracle    1261 24160  0 15:19 pts/0    00:00:00 ./ggsci
oracle    7698 30606  0 15:29 pts/2    00:00:00 grep ggs
[oracle@xxx]~[xxx]> strace -fp 1261

… and then run the command again and check the strace output. But there was not much to look at. GoldenGate tried to stat the wallet files but failed because they did not exist yet, which is fine:

[pid  1261] stat("dirwlt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid  1261] stat("dirwlt/cwallet.sso", 0x7fff88bd2f30) = -1 ENOENT (No such file or directory)
[pid  1261] stat("dirwlt/cwallet.lck", 0x7fff88bd28f0) = -1 ENOENT (No such file or directory)
[pid  1261] open("dirwlt/cwallet.lck", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 6
[pid  1261] close(6)                    = 0
[pid  1261] open("dirwlt/cwallet.lck", O_RDWR) = 6

Some lines later:

[pid  1262] futex(0x1e68a64, FUTEX_WAIT_PRIVATE, 123, {0, 982142} 
[pid  1261] write(1, "ERROR: Unable to create a new wa"..., 37) = 37
[pid  1261] write(1, "\n", 1)    

And then many, many of these:

[pid  1262] futex(0x1e68a64, FUTEX_WAIT_PRIVATE, 163, {0, 63977012}) = -1 ETIMEDOUT (Connection timed out)
[pid  1262] futex(0x1e68a38, FUTEX_WAKE_PRIVATE, 1) = 0

In short, I had no clue what the issue was (the many futex calls did not help me). The next idea was to enable debugging for GoldenGate. This can be done by creating a file like this in the GoldenGate Home (a picture again, sorry, the xml tags destroy the formatting. Let me know if you want to have the file):
ogg-debug

This will create a file called “gglog-ggsci.log” in the GoldenGate Home which contains more information about what was going on. But the only two lines which showed “ERROR” were these:

2017-03-30 16:09:31.015 ERROR|ggsec.CWallet                 |main| 1090 ggsec/CWallet.cpp        | initNZCtx: Error initializing the nz context. NZERROR 43490
2017-03-30 16:09:31.015 ERROR|ggsec.CWallet                 |main|  439 ggsec/CWallet.cpp        | create: Could not initialize network security context.

Hm, this does not help me either (MOS did not come up with a single note about this error). To make it short: finally, this happened because I had the ORACLE_HOME environment variable set to the GoldenGate Home:

[oracle@xxxxxxx]/u01/app/ogg/ch_src/product/12.2.0.1.170221/via[p1viach1]> echo $ORACLE_HOME
/u01/app/ogg/ch_src/product/12.2.0.1.170221/via

As soon as I changed this to a database home everything was fine again:

[oracle@xxx]/u01/app/ogg/ch_src/product/12.2.0.1.170221/via[p1viach1]> export ORACLE_HOME=/u01/app/oracle/product/11.2.0.3/dbhome_1/
[oracle@xxx]/u01/app/ogg/ch_src/product/12.2.0.1.170221/via[p1viach1]> ./ggsci 

Oracle GoldenGate Command Interpreter for Oracle
Version 12.2.0.1.170221 OGGCORE_12.2.0.1.0OGGBP_PLATFORMS_170123.1033_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Jan 23 2017 21:54:15
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2017, Oracle and/or its affiliates. All rights reserved.

GGSCI (xxx) 1> create wallet

Created wallet at location 'dirwlt'.

Opened wallet at location 'dirwlt'.

GGSCI (xxx) 2> 

In case you ask yourself why I had ORACLE_HOME pointing to the GoldenGate Home: I applied the latest PSU just before, and before invoking opatch I set ORACLE_HOME to point to the GoldenGate Home. When you do the same in GoldenGate 12.1, and even in the GoldenGate 12.2 base release, you will not hit this. Currently there is a bug open with Oracle, let's see what happens :)
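
To avoid this in the future, one option is to scope ORACLE_HOME to the opatch run only, for example in a subshell, so that the interactive session (and ggsci) keeps the database home. This is only a small sketch using the paths from above; adapt it to your own patching procedure:

( # ORACLE_HOME is only changed inside this subshell
  export ORACLE_HOME=/u01/app/ogg/ch_src/product/12.2.0.1.170221/via
  cd $ORACLE_HOME
  ./OPatch/opatch apply
)
# back in the parent shell, ORACLE_HOME still points to the database home
echo $ORACLE_HOME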

 

The article GoldenGate 12.2: ERROR: Unable to create a new wallet appeared first on Blog dbi services.

Can I do it with PostgreSQL? – 11 – NLS_DATE_FORMAT

Thu, 2017-03-30 00:36

Following the last post about time zones this post is about the display and default format of date and time values. What people usually do to get a consistent display of date and time values in Oracle is to set NLS_DATE_FORMAT:

SQL> show parameter nls_date_format;

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
nls_date_format 		     string	 YYYY-MON-DD HH24:MI:SS

SQL> alter session set nls_date_format='DD.MM.YYYY HH24:MI:SS';

Session altered.

SQL> select sysdate from dual;

SYSDATE
-------------------
29.03.2017 16:26:11

SQL> alter session set nls_date_format='YYYY-MON-DD HH24:MI:SS';

Session altered.

SQL> select sysdate from dual;

SYSDATE
--------------------
2017-MAR-29 16:31:01

Obviously PostgreSQL does not know anything about NLS_DATE_FORMAT, so what does it provide for doing the same?

First of all (and this is true for Oracle as well as for PostgreSQL) this is all about displaying date and time values and interpreting strings that shall be converted to date and time values. This has nothing to do with the internal storage.

What controls the display of date and time specific values in PostgreSQL is the parameter datestyle, the default is ISO:

postgres=# show datestyle;
 DateStyle 
-----------
 ISO, MDY
(1 row)

MDY here represents: MONTH-DAY-YEAR. If you want to change this on the session level you can:

postgres=# set datestyle='SQL, MDY';
SET
postgres=# select now();
              now               
--------------------------------
 03/29/2017 18:13:15.475693 GMT
(1 row)

You can do the same by setting the PGDATESTYLE environment variable before starting a new connection:

postgres@pgbox:/home/postgres/ [PG962] export PGDATESTYLE='SQL, MDY'
postgres@pgbox:/home/postgres/ [PG962] psql postgres
psql (9.6.2 dbi services build)
Type "help" for help.

postgres=# select now();
              now               
--------------------------------
 03/29/2017 18:15:29.765396 GMT
(1 row)

When you have a string and you want to insert that string into a column which is defined as date, how can you do this? PostgreSQL is very flexible here:

postgres=# create table t1 ( a date );
CREATE TABLE
postgres=# insert into t1 (a) values ('2017.01.01');
INSERT 0 1
postgres=# insert into t1 (a) values ('2017-01-01');
INSERT 0 1
postgres=# insert into t1 (a) values ('2017-JAN-01');
INSERT 0 1
postgres=# insert into t1 (a) values ('2017-JAN-01 11:11:11');
INSERT 0 1
postgres=# insert into t1 (a) values ('2017/01/01 11:11:11');
INSERT 0 1
postgres=# insert into t1 (a) values ('20170101 111111');
INSERT 0 1

All these work without specifying a format mask as you would need to do in Oracle. The rules for how a string is interpreted are documented here.
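
Be aware, though, that the DMY/MDY part of datestyle decides how an ambiguous string is interpreted, so the same literal can end up as two different dates. A small sketch (the expected results are shown as comments):

# interpreted month-first under MDY
psql -c "set datestyle='ISO, MDY'; select '01/02/2017'::date;" postgres   # 2017-01-02
# interpreted day-first under DMY
psql -c "set datestyle='ISO, DMY'; select '01/02/2017'::date;" postgres   # 2017-02-01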

When you want to convert a string into a date or timestamp you can use one of the make_* functions:

postgres=# select make_date(2017,01,01);
 make_date  
------------
 2017-01-01
(1 row)
postgres=# select make_timestamp (2017,01,01,00,00,00);
   make_timestamp    
---------------------
 2017-01-01 00:00:00
(1 row)

… or use the data type formatting functions:

postgres=# select to_date('2017.01.01','YYYY.MM.DD');
  to_date   
------------
 2017-01-01
(1 row)

postgres=# select to_date('2017-JAN-01','YYYY.Mon.DD');
  to_date   
------------
 2017-01-01
(1 row)

Hope this helps …

 

The article Can I do it with PostgreSQL? – 11 – NLS_DATE_FORMAT appeared first on Blog dbi services.

OUD – Oracle Unified Directory 11.1.2.3 Backups Tips and Tricks

Wed, 2017-03-29 09:51

Backing up an OUD environment involves several components:

  • The OUD software itself
  • The OUD back end data
  • The OUD configuration
  • The OUD logical export as a ldif file

However, in this post I would like to take a closer look at the back end data. Unlike the old OID, the OUD directory server uses the Berkeley DB Java Edition (JE) as its primary back end. The OUD backup command allows you to back up all back ends in one shot or just a single back end; you can do full or incremental backups, you can compress them and you can even encrypt your back end data if you like.

One of the first questions that comes up is where to put the backup files. In a replicated environment, it makes a lot of sense to put them on an NFS share. In case you lose one OUD host, you still have access to the backups on the other host.

I chose to back up the back end data to /u99/backup/OUD, which is a directory on an NFSv4 mount.

[dbafmw@dbidg01 OUD]$ mount | grep u99
dbidg03:/u99 on /u99 type nfs4 (rw,relatime,vers=4.1,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.201,local_lock=none,addr=192.168.56.203)

Before we issue the first backup command, it is good to know which back ends we have. Some back ends change quite frequently and others might not. I am using the OUD only for TNS name resolution, so the most important one for me is OracleContext0.

[dbafmw@dbidg01 ~]$ list-backends
Backend ID        : Base DN
------------------:----------------------------------------
Eus0              :
EusContext0       :
Fa0               :
OIDCompatibility  : cn=OracleContext,cn=OracleSchemaVersion
OracleContext0    : "cn=OracleContext,dc=dbi,dc=com"
adminRoot         : cn=admin data
ads-truststore    : cn=ads-truststore
backup            : cn=backups
monitor           : cn=monitor
schema            : cn=schema
subschemasubentry :
tasks             : cn=tasks
userRoot          : "dc=dbi,dc=com"
virtualAcis       : cn=virtual acis


[dbafmw@dbidg01 ~]$ list-backends -n OracleContext0
Backend ID     : Base DN
---------------:---------------------------------
OracleContext0 : "cn=OracleContext,dc=dbi,dc=com"

OK. Let’s start a full backup of all back ends to the backup directory /u99/backup/OUD and compress them.

[dbafmw@dbidg01 ~]$ backup --backUpAll --compress --backupDirectory=/u99/backup/OUD
[29/Mar/2017:08:55:49 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend virtualAcis
[29/Mar/2017:08:55:49 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:55:49 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend OracleContext0
[29/Mar/2017:08:55:49 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:55:49 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend tasks
[29/Mar/2017:08:55:49 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend schema
[29/Mar/2017:08:55:49 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend OIDCompatibility
[29/Mar/2017:08:55:49 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:55:49 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend userRoot
[29/Mar/2017:08:55:49 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:55:49 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend replicationChanges
[29/Mar/2017:08:55:49 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:55:49 +0200] category=TOOLS severity=NOTICE msgID=10944795 msg=The backup process completed successfully

For backing up your OUD server back ends, the OUD itself does not have to be up and running. You can back it up while it is offline too.

[dbafmw@dbidg01 ~]$ stop-ds
Stopping Server...

[29/Mar/2017:08:57:46 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=OIDCompatibility,cn=Workflow Elements,cn=config is now taken offline
[29/Mar/2017:08:57:46 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=OracleContext0,cn=Workflow elements,cn=config is now taken offline
[29/Mar/2017:08:57:46 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=userRoot,cn=Workflow Elements,cn=config is now taken offline
[29/Mar/2017:08:57:46 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=virtualAcis,cn=Workflow Elements,cn=config is now taken offline
[29/Mar/2017:08:57:46 +0200] category=CORE severity=NOTICE msgID=458955 msg=The Directory Server is now stopped


[dbafmw@dbidg01 ~]$ backup --backUpAll --compress --backupDirectory=/u99/backup/OUD
[29/Mar/2017:08:58:06 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend virtualAcis
[29/Mar/2017:08:58:06 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:58:06 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend OracleContext0
[29/Mar/2017:08:58:06 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:58:06 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend tasks
[29/Mar/2017:08:58:06 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend schema
[29/Mar/2017:08:58:06 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend OIDCompatibility
[29/Mar/2017:08:58:06 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:58:06 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend userRoot
[29/Mar/2017:08:58:06 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:58:06 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend replicationChanges
[29/Mar/2017:08:58:06 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:08:58:06 +0200] category=TOOLS severity=NOTICE msgID=10944795 msg=The backup process completed successfully
[dbafmw@dbidg01 ~]$

Backing up a single back end is done by the following command.

[dbafmw@dbidg01 ~]$ backup --backendID OracleContext0 --compress --backupDirectory=/u99/backup/OUD
[29/Mar/2017:15:14:22 +0200] category=TOOLS severity=NOTICE msgID=10944792 msg=Starting backup for backend OracleContext0
[29/Mar/2017:15:14:22 +0200] category=JEB severity=NOTICE msgID=8847446 msg=Archived: 00000000.jdb
[29/Mar/2017:15:14:22 +0200] category=TOOLS severity=NOTICE msgID=10944795 msg=The backup process completed successfully
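
If a full backup of a back end gets too big or takes too long, the same tool can also take incremental backups on top of the last full one. A hedged example (the --incremental option comes from the OUD backup utility; verify it against your version before relying on it):

[dbafmw@dbidg01 ~]$ backup --backendID OracleContext0 --incremental --compress --backupDirectory=/u99/backup/OUD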

The backup which I have done is now reflected in the following directory.

[dbafmw@dbidg01 OUD]$ ls -rtl /u99/backup/OUD/backup-OracleContext0*
-rw-r--r-- 1 dbafmw oinstall 19193 Mar 28 15:11 /u99/backup/OUD/backup-OracleContext0-20170328131137Z
-rw-r--r-- 1 dbafmw oinstall 56904 Mar 28 15:20 /u99/backup/OUD/backup-OracleContext0-20170328132004Z
-rw-r--r-- 1 dbafmw oinstall 27357 Mar 29 15:14 /u99/backup/OUD/backup-OracleContext0-20170329131419Z
-rw-r--r-- 1 dbafmw oinstall 27357 Mar 29 15:15 /u99/backup/OUD/backup-OracleContext0-20170329131552Z
-rw-r--r-- 1 dbafmw oinstall 84556 Mar 29 15:16 /u99/backup/OUD/backup-OracleContext0-20170329131622Z

The backups done via the OUD backup command are documented in a file called backup.info. If you grep for the last backup piece, you will find it there as the last entry.

[dbafmw@dbidg01 OUD]$ cat /u99/backup/OUD/backup.info | grep -B 8 backup-OracleContext0-20170329131622Z

backup_id=20170329131622Z
backup_date=20170329131625Z
incremental=false
compressed=false
encrypted=false
property.last_logfile_name=00000000.jdb
property.last_logfile_size=84330
property.archive_file=backup-OracleContext0-20170329131622Z

There is another method to find the last backup: before the backup command starts, it copies the current backup.info to backup.info.save, so you just need to do a diff to know which is the latest backup.

[dbafmw@dbidg01 OUD]$ diff backup.info backup.info.save
48,56d47
< backup_id=20170329131622Z
< backup_date=20170329131625Z
< incremental=false
< compressed=false
< encrypted=false
< property.last_logfile_name=00000000.jdb
< property.last_logfile_size=84330
< property.archive_file=backup-OracleContext0-20170329131622Z
<
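
If you want to script this check, the backup_id of the latest backup can be extracted directly from that diff, for example:

[dbafmw@dbidg01 OUD]$ diff backup.info backup.info.save | awk -F'=' '/^< backup_id/ {print $2}'
20170329131622Z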

But what happens if you don’t need some old backups anymore, e.g. the backup-OracleContext0-20170328131137Z?

[dbafmw@dbidg01 OUD]$ cat backup.info | grep backup-OracleContext0-20170328131137Z
property.archive_file=backup-OracleContext0-20170328131137Z

Unfortunately, there is no purge procedure delivered with OUD to clean up old backups. You have to clean them up yourself, e.g. in case you want to clean up Oracle Context backups older than 2 days, you could do it like this.

[dbafmw@dbidg01 OUD]$ find /u99/backup/OUD -maxdepth 1 -type f -name "backup-OracleContext0*" -mtime +2 | awk -F "/" '{ print $5 }' | awk -F "-" '{ print $3 }'
20170328132004Z

[dbafmw@dbidg01 OUD]$ find /u99/backup/OUD -maxdepth 1 -type f -name "backup-OracleContext0*" -mtime +2 | awk -F "/" '{ print $5 }' | awk -F "-" '{ print $3 }' | while read i
do
echo /u99/backup/OUD/backup-OracleContext0-${i}
rm /u99/backup/OUD/backup-OracleContext0-${i}
sed -i "/backup_id=${i}/,/property.archive_file=backup-OracleContext0-${i}/d" /u99/backup/OUD/backup.info
done
[dbafmw@dbidg01 OUD]$ cat backup.info | grep 20170328132004Z
[dbafmw@dbidg01 OUD]$

This script is of course not bullet-proof, but you get the idea. ;-)

Conclusion

The Oracle OUD delivers quite a lot of good options regarding backups. However, when it comes to purging the old stuff, you have to handle it yourself.

 

 

The article OUD – Oracle Unified Directory 11.1.2.3 Backups Tips and Tricks appeared first on Blog dbi services.

Can I do it with PostgreSQL? – 10 – Timezones

Wed, 2017-03-29 08:59

This post is inspired by a question we received from a customer: in Oracle there is sessiontimezone, which returns the time zone of the session. Asking for the time zone of the session in Oracle returns the offset to UTC:

SQL> select sessiontimezone from dual;

SESSIONTIMEZONE
---------------------------------------------------------------------------
+02:00

This is fine as I am based in Switzerland and we skipped one hour in the night from last Saturday to Sunday :)

How can we do something similar in PostgreSQL? To check the current time zone of your session:

(postgres@[local]:4445) [postgres] > show timezone;
   TimeZone   
--------------
 Europe/Vaduz
(1 row)

Or:

(postgres@[local]:4445) [postgres] > select current_setting('timezone');
 current_setting 
-----------------
 Europe/Vaduz
(1 row)

So, PostgreSQL will not show you the offset to UTC but the name of the time zone as specified by the Internet Assigned Numbers Authority (IANA). When you want to have the offset to UTC you can do something like this:

(postgres@[local]:4445) [postgres] > select age(now(),now() at time zone 'UTC');
   age    
----------
 02:00:00
(1 row)

You can do it using the extract function as well:

(postgres@[local]:4445) [postgres] > select extract( timezone from now() ) / 60 /60;
 ?column? 
----------
        2
(1 row)

How can you change the session time zone? One way is to set the PGTZ environment variable before starting a new session when you use a libpq client:

postgres@centos7:/home/postgres/ [PG3] export PGTZ=Europe/Berlin
postgres@centos7:/home/postgres/ [PG3] psql postgres
psql.bin (9.6.2.7)
Type "help" for help.

(postgres@[local]:4445) [postgres] > show timezone;
   TimeZone    
---------------
 Europe/Berlin
(1 row)

The other way is to directly set it in the session:

Time: 1.048 ms
(postgres@[local]:4445) [postgres] > set time zone 'Europe/Paris';
SET
Time: 82.903 ms
(postgres@[local]:4445) [postgres] > show timezone;
   TimeZone   
--------------
 Europe/Paris
(1 row)

Of course you can also set the timezone parameter in postgresql.conf.
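
If you want this persisted per user or per database rather than instance-wide, ALTER ROLE and ALTER DATABASE work as well. A small sketch using the postgres role and database from the examples above (it only affects new sessions):

postgres@centos7:/home/postgres/ [PG3] psql -c "alter role postgres set timezone='Europe/Zurich'" postgres
postgres@centos7:/home/postgres/ [PG3] psql -c "alter database postgres set timezone='Europe/Zurich'" postgres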

To get the current timestamp you can use:

(postgres@[local]:4445) [postgres] > SELECT current_timestamp;
        current_timestamp         
----------------------------------
 29-MAR-17 15:41:59.203485 +02:00
(1 row)

And finally, to calculate the current time in another time zone you can do something like this:

(postgres@[local]:4445) [postgres] > SELECT current_time AT TIME ZONE 'Europe/Zurich', current_time AT TIME ZONE 'US/Samoa';
      timezone      |      timezone      
--------------------+--------------------
 15:43:05.634124+02 | 02:43:05.634124-11
(1 row)

All the time zone names can be found in pg_timezone_names:

(postgres@[local]:4445) [postgres] > select * from pg_timezone_names;
               name               | abbrev | utc_offset | is_dst 
----------------------------------+--------+------------+--------
 America/North_Dakota/Beulah      | CDT    | -05:00:00  | t
 America/North_Dakota/Center      | CDT    | -05:00:00  | t
 America/North_Dakota/New_Salem   | CDT    | -05:00:00  | t
 America/Argentina/Ushuaia        | ART    | -03:00:00  | f
...

Hope this helps…

 

The article Can I do it with PostgreSQL? – 10 – Timezones appeared first on Blog dbi services.

Vertically scale your PostgreSQL infrastructure with pgpool – 3 – Adding another standby

Tue, 2017-03-28 08:52

In the first and second post of this series we did the basic pgpool setup, including the watchdog configuration, and then did a simple setup for automatic failover from a PostgreSQL master instance in case it goes down for any reason. In addition we told pgpool how an old master instance can be reconfigured as a new standby instance that follows the new master. In this post we’ll add another standby instance and then teach pgpool how a standby can be made aware of a new master when the master fails. Sounds interesting? Let’s go…

As a reminder, this is how the system looks right now:

pgpool-architecture

What we want to have is:

pgpool-extended

The idea behind a third node is that we will always have at least one standby server up and running in case the master node goes down. What do we need to do to bring in another instance? Once the operating system is up and running and PostgreSQL is installed, it is actually quite easy. As a first step let’s create the new standby database on the third node using exactly the same layout as on the other nodes:

postgres@pgpool3:/home/postgres/ [pg962] mkdir -p /u02/pgdata/PG1
postgres@pgpool3:/home/postgres/ [pg962] cd /u02/pgdata/PG1
postgres@pgpool3:/u02/pgdata/PG1/ [pg962] pg_basebackup -h 192.168.22.34 -x -D /u02/pgdata/PG1/
postgres@pgpool3:/u02/pgdata/PG1/ [pg962] echo "standby_mode = 'on'
primary_conninfo = 'host=pgpool1 user=postgres'
primary_slot_name = 'standby2'
recovery_target_timeline = 'latest'
trigger_file='/u02/pgdata/PG1/failover_trigger'" > recovery.conf
postgres@pgpool3:/u02/pgdata/PG1/ [pg962] psql -h pgpool1 -c "select * from pg_create_physical_replication_slot('standby2')" postgres
postgres@pgpool3:/u02/pgdata/PG1/ [PG1] pg_ctl -D /u02/pgdata/PG1/ start

Now we have one master instance with two standby instances attached. Let’s configure the third instance into pgpool (the configuration change needs to be done on both pgpool nodes, of course). The lines we need to add to pgpool.conf are:

backend_hostname2 = '192.168.22.40'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/u02/pgdata/PG1'
backend_flag2 = 'ALLOW_TO_FAILOVER'

Reload pgpool (a stop is not necessary) and check the current status:

[postgres@centos7_pgpool_m1 etc]$ pgpool reload
[postgres@centos7_pgpool_m1 etc]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.333333  | primary | 0          | true              | 0
 1       | 192.168.22.35 | 5432 | up     | 0.333333  | standby | 0          | false             | 0
 2       | 192.168.22.40 | 5432 | up     | 0.333333  | standby | 0          | false             | 0
(3 rows)

We have a second standby database which is used to load balance read requests. In case the master fails, what we want is that one of the standby instances gets promoted and the remaining standby instance gets reconfigured to follow the new master. What do we need to do?

As we are using replication slots in this setup we need a way to make the failover scripts independent of the name of the replication slot. The first script that we need to change is “promote.sh” on all the PostgreSQL nodes because currently there is a hard-coded request to create the replication slot:

#!/bin/bash
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
export PATH PGDATA
pg_ctl promote -D ${PGDATA} >> /var/tmp/failover.log
psql -c "select * from pg_create_physical_replication_slot('standby1')" postgres >> /var/tmp/failover.log

The easiest way to do this is to create as many replication slots as you plan to add standby instances, e.g.:

#!/bin/bash
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
export PATH PGDATA
pg_ctl promote -D ${PGDATA} >> /var/tmp/failover.log
psql -c "select * from pg_create_physical_replication_slot('standby1')" postgres >> /var/tmp/failover.log
psql -c "select * from pg_create_physical_replication_slot('standby2')" postgres >> /var/tmp/failover.log
psql -c "select * from pg_create_physical_replication_slot('standby3')" postgres >> /var/tmp/failover.log
psql -c "select * from pg_create_physical_replication_slot('standby4')" postgres >> /var/tmp/failover.log

Of course this is not a good way to do it as you would need to adjust the script every time the number of standby instances changes. A better way is to centrally manage the number of standby instances and the mapping of the standby instances to the replication slots, in a configuration file in the $PGDATA directory of each PostgreSQL node and in the HOME directory of the postgres user on each pgpool node:

postgres@pgpool1:/u02/pgdata/PG1/ [PG1] cat pgpool_local.conf 
# the total amount of instances that
# participate in this configuration
INSTANCE_COUNT=3
# the mapping of the hostname to
# to the replication slot it uses for
# the PostgreSQL instance it is running
# in recovery mode
192.168.22.34=standby1
192.168.22.35=standby2
192.168.22.40=standby3

Having this we can adjust the promote.sh script (sorry, I have to use a screenshot as the source code destroys the formatting of this post. Let me know if you want to have the script):
pgp-promote-sh
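
For reference, here is a hedged sketch of what the adjusted promote.sh shown in the screenshot could look like: it promotes the local instance, creates one replication slot per node listed in pgpool_local.conf and then drops the slot this node used itself while it was a standby. The paths and file format follow this post; treat the hostname lookup and the slot handling details as assumptions and compare them with the screenshot:

#!/bin/bash
# hedged reconstruction of the adjusted promote.sh
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
CONF="${PGDATA}/pgpool_local.conf"
export PATH PGDATA

pg_ctl promote -D ${PGDATA} >> /var/tmp/failover.log

# create one replication slot per node listed in the config file (ip=slotname lines)
for SLOT in $(grep -E '^[0-9]' ${CONF} | awk -F '=' '{print $2}')
do
  psql -c "select * from pg_create_physical_replication_slot('${SLOT}')" postgres >> /var/tmp/failover.log
done

# drop the slot this node used itself before being promoted (ignore errors if it does not exist)
MY_SLOT=$(grep "^$(hostname -i)=" ${CONF} | awk -F '=' '{print $2}')
psql -c "select pg_drop_replication_slot('${MY_SLOT}')" postgres >> /var/tmp/failover.log 2>&1 || true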

Now the script will create all the replication slots on a promoted instance and in addition drop the replication slot it used before being promoted. What else do we need? As we now have a third instance in the configuration there is another point we need to take care of: when the master fails a new standby is promoted, so far so good. But in addition we want the second standby to follow the new master automatically, don’t we? For this we need to tell pgpool to call another script which is executed on the active pgpool node after the failover:

[postgres@centos7_pgpool_m1 ~]$ grep follow /u01/app/postgres/product/pgpool-II/etc/pgpool.conf
follow_master_command = '/home/postgres/follow_new_master.sh "%h" "%H"' 

This will be executed when there is a failover (all pgpool nodes need to have this script):

#!/bin/sh
set -x
master_node_host_name=$2
detached_node_host_name=$1
tmp=/tmp/mytemp$$
trap "rm -f $tmp" 0 1 2 3 15
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
RECOVERY_NODE_REPLICATION_SLOT=`cat /home/postgres/pgpool_local.conf | grep ${detached_node_host_name} | awk -F '=' '{print $2}'`
export PATH PGDATA
# make sure the instance is down
ssh -T $detached_node_host_name /home/postgres/stop_instance.sh
cat > $tmp <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$master_node_host_name user=postgres'
primary_slot_name = '${RECOVERY_NODE_REPLICATION_SLOT}'
recovery_target_timeline = 'latest'
trigger_file='/u02/pgdata/PG1/failover_trigger'
EOF
scp $tmp $detached_node_host_name:$PGDATA/recovery.conf
ssh ${detached_node_host_name} /home/postgres/start_instance.sh
psql -c "select 'done'" postgres

So, this is the status now:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.333333  | primary | 4          | true              | 0
 1       | 192.168.22.35 | 5432 | up     | 0.333333  | standby | 2          | false             | 0
 2       | 192.168.22.40 | 5432 | up     | 0.333333  | standby | 0          | false             | 0
(3 rows)

Let’s shut down the primary and see what happens:

postgres@pgpool1:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/PG1/ stop -m immediate
waiting for server to shut down.... done
server stopped

Pgpool tells us this:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | down   | 0.333333  | standby | 4          | false             | 0
 1       | 192.168.22.35 | 5432 | up     | 0.333333  | primary | 4          | true              | 0
 2       | 192.168.22.40 | 5432 | down   | 0.333333  | standby | 0          | false             | 0
(3 rows)

Re-attach:

[postgres@centos7_pgpool_m1 ~]$ pcp_attach_node -w -n 0
pcp_attach_node -- Command Successful
[postgres@centos7_pgpool_m1 ~]$ pcp_attach_node -w -n 2
pcp_attach_node -- Command Successful
[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.333333  | standby | 4          | false             | 0
 1       | 192.168.22.35 | 5432 | up     | 0.333333  | primary | 4          | true              | 0
 2       | 192.168.22.40 | 5432 | up     | 0.333333  | standby | 0          | false             | 0
(3 rows)

Perfect. The only pain point is that we need to manually re-attach the nodes; everything else is automated. But luckily there is a way to get around this: as we are on the pgpool nodes when the script is executed, we can just use pcp_attach_node at the end of the follow_new_master.sh script (and pass the node id %d into the script):

[postgres@centos7_pgpool_m1 ~]$ grep follow /u01/app/postgres/product/pgpool-II/etc/pgpool.conf
follow_master_command = '/home/postgres/follow_new_master.sh "%h" "%H" %d' 
[postgres@centos7_pgpool_m1 ~]$ cat follow_new_master.sh 
#!/bin/sh
set -x
master_node_host_name=$2
detached_node_host_name=$1
detached_node_id=$3
tmp=/tmp/mytemp$$
trap "rm -f $tmp" 0 1 2 3 15
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
RECOVERY_NODE_REPLICATION_SLOT=`cat /home/postgres/pgpool_local.conf | grep ${detached_node_host_name} | awk -F '=' '{print $2}'`
export PATH PGDATA
# make sure the old master is down
ssh -T $detached_node_host_name /home/postgres/stop_instance.sh
cat > $tmp <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$master_node_host_name user=postgres'
primary_slot_name = '${RECOVERY_NODE_REPLICATION_SLOT}'
recovery_target_timeline = 'latest'
trigger_file='/u02/pgdata/PG1/failover_trigger'
EOF
scp $tmp $detached_node_host_name:$PGDATA/recovery.conf
ssh ${detached_node_host_name} /home/postgres/start_instance.sh
psql -c "select 'done'" postgres
pcp_attach_node -w -n ${detached_node_id}
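
Note that pcp_attach_node is called with -w here, which suppresses the password prompt. For this to work non-interactively, the pcp password must be resolvable from a password file on the pgpool nodes. A minimal sketch (hostname, user and the obviously fake password are assumptions; 9898 is the default pcp port):

# file format: hostname:port:username:password
cat > ~/.pcppass <<EOF
localhost:9898:postgres:secret
EOF
chmod 600 ~/.pcppass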

And now, when you shut down the master, everything is automatic. Hope this helps.

 

The article Vertically scale your PostgreSQL infrastructure with pgpool – 3 – Adding another standby appeared first on Blog dbi services.

JAN17 Proactive Bundle Patch + Adaptive Statistics control

Mon, 2017-03-27 02:30

If you have to create a new database now (I’m writing this in March 2017) for a migration of an OLTP database with minimal risk of regression, I would recommend:

  • The latest patchset of Oracle Database 12cR1
  • The latest Proactive Bundle Patch
  • The two patches to get full control over Adaptive statistics

This post gives more detail about it and about which patches to apply. It would be nice to have those patches merged into each bundle patch, but this is not the case.

dbi services 12cR2 free event

This month, 12cR2 was released and we immediately informed our customers about the new features that we think are important to know in order to plan when to go to 12.2 and for which projects. We started with the most important one, the Adaptive Statistics, which helps to avoid all those performance issues encountered after migrating to 12cR1. We also mentioned that this new behavior has been backported to 12.1 with two patches, as explained here by Clemens Bleile:
IMG_3833

12cR1

This event was to inform about 12.2 but a lot of customers waited for this to plan their 12.1 migration. Knowing the roadmap and the new features helps to plan what can wait for a stable 12cR2 (after a few PSUs) and what must be migrated now to 12cR1. This is why we did everything to roll out this event as soon as possible once the main platforms (Linux and Windows) were publicly available.

What to install for 12.1.0.2

Our recommendation for new installations of 12c for current migrations with minimal risk of regression is:

  • The latest patchset of Oracle Database 12cR1: 12.1.0.2
  • The latest Proactive Bundle Patch: JAN2017 BP
  • The two patches to get full control over Adaptive statistics: patches 22652097 and 21171382

Nothing is easy with patching, so I’ll detail how to find exactly what to install.

Latest patchset of Oracle Database 12cR1: 12.1.0.2

Now that the latest version is 12.2.0.1 you can no longer find 12.1.0.2 on the oracle.com download page.
You can download 12.1.0.2 from the Patchset 21419221: https://updates.oracle.com/download/21419221.html

Latest Proactive Bundle Patch: JAN2017 BP

Finding the latest patch set update is easy if you follow the MOS Note Quick Reference to Patch Numbers for Database/GI PSU, SPU(CPU), Bundle Patches and Patchsets (Doc ID 1454618.1)

You will download https://updates.oracle.com/download/24968615.html for Linux or https://updates.oracle.com/download/25115951.html for Windows.

Adaptive statistics: patches 22652097 and 21171382

Then you have to find the two patches to apply them on top of the JAN17 ProactiveBP.

The first one is for separate parameters to enable adaptive plans and adaptive statistics separately: https://updates.oracle.com/download/22652097.html and you can find it for Oracle 12.1.0.2.170117 Proactive BP

The second one is for the dbms_stats preference to control statistics extension creation, and you will have two surprises when following the link:
The name is AUTO DOP COMPUTES A HIGH DOP UNNECESSARILY because the change has been merged with some other changes on concurrent statistics.
It is available only for the following releases (listed in the non-chronological order of the platform list on MOS):

  • Oracle 12.1.0.2.13 Proactive BP
  • Oracle 12.1.0.2.0
  • Oracle 12.1.0.2.5
  • Oracle 12.1.0.2.160119 Proactive BP
  • Oracle 12.1.0.2.7 Proactive BP

If you want to go further, you have to open an SR and provide an opatch lsinventory (because it seems that Oracle Support Engineers are not able to get the lsinventory for the latest Proactive BP – the recommended one). And this is where the nightmare starts. The latest we have here is for the JAN16 Proactive Bundle Patch – 12.1.0.2.160119 Proactive BP.

I can’t wait for a relevant answer from MOS support engineers, so I had a look at the patch. It is actually a very simple change: in DBMS_STATS it has to check whether AUTO_STAT_EXTENSIONS is ON before creating the column group. This is all in prvtstat.plb and, if we are lucky, there were no changes on it since JAN16.

Before trying it, we can check conflicts in MOS. Here are the 3 patches I would like to apply, in their most recent release for Linux:

CaptureOpatchConflict

Yes, this is a nice feature of My Oracle Support: you can analyze the conflicts online.

The result of conflict analysis shows that we are lucky:
CaptureOpatchConflict003

Ready now to download the files:
CaptureOpatchConflict004

So the last patch we need, https://updates.oracle.com/download/21171382.html, can be downloaded in its latest Proactive BP version, even if it is 1 year old. And don’t worry about its name: p21171382_12102160119DBEngSysandDBIM_Generic.zip

datapatch and verification

Don’t forget to run datapatch on your databases to be sure that the dictionary is patched.

$ORACLE_HOME/OPatch/datapatch
 
SQL Patching tool version 12.1.0.2.0 Production on Mon Mar 27 09:18:47 2017
Copyright (c) 2012, 2017, Oracle. All rights reserved.
 
Connecting to database...OK
Bootstrapping registry and package to current versions...done
Determining current state...done
Adding patches to installation queue and performing prereq checks...done
Installation queue:
Nothing to roll back
The following patches will be applied:
24732088 (DATABASE BUNDLE PATCH 12.1.0.2.170117)
21171382 (AUTO DOP COMPUTES A HIGH DOP UNNECESSARILY)
22652097 (PROVIDE SEPARATE CONTROLS FOR ADAPTIVE PLANS AND ADAPTIVE STATISTICS FEATURES)
 
Installing patches...
Patch installation complete. Total patches installed: 3
 
Validating logfiles...done
SQL Patching tool complete on Mon Mar 27 09:21:33 2017

And then connect to check the new Adaptive Statistics behavior:

SQL> show parameter optimizer_adaptive
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_adaptive_plans boolean TRUE
optimizer_adaptive_reporting_only boolean FALSE
optimizer_adaptive_statistics boolean FALSE
 
SQL> select dbms_stats.get_prefs('AUTO_STAT_EXTENSIONS') from dual;
 
DBMS_STATS.GET_PREFS('AUTO_STAT_EXTENSIONS')
--------------------------------------------------------------------------------
OFF
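
With both patches in place the 12.2 defaults are active: adaptive plans stay on, adaptive statistics are off and no column group extensions are created automatically. If you ever need to set this explicitly, here is a hedged example (parameter and preference names as shown above; run as SYSDBA and test it first):

sqlplus / as sysdba <<'EOF'
-- keep adaptive plans, disable adaptive statistics (the 12.2 default)
alter system set optimizer_adaptive_plans=true scope=both;
alter system set optimizer_adaptive_statistics=false scope=both;
-- control automatic creation of column group extensions
exec dbms_stats.set_global_prefs('AUTO_STAT_EXTENSIONS','OFF')
select dbms_stats.get_prefs('AUTO_STAT_EXTENSIONS') from dual;
EOF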

Summary for 12.1.0.2 new install

Don’t install a 12c database with only the software and documentation that was released 3 years ago. A lot of migration feedback has improved the stability of such migrations, and this is implemented in patchset updates and MOS notes.
A good source of information is Mike Dietrich’s blog (the best source you can find to benefit from the experience of a lot of migration projects):

Then, get the latest recommended software.

  1. The patchset: https://updates.oracle.com/download/21419221.html for your platform
  2. The patchset update https://updates.oracle.com/download/24968615.html for Unix/Linux
    or https://updates.oracle.com/download/25115951.html for Windows
  3. The Adaptive Statistics instance parameters: https://updates.oracle.com/download/22652097.html for Oracle 12.1.0.2.170117 Proactive BP
  4. The Adaptive Statistics stat preference https://updates.oracle.com/download/21171382.html for Oracle 12.1.0.2.160119 Proactive BP

Don’t forget to run datapatch on all databases, even the newly created ones.

 

The article JAN17 Proactive Bundle Patch + Adaptive Statistics control appeared first on Blog dbi services.

Vertically scale your PostgreSQL infrastructure with pgpool – 2 – Automatic failover and reconfiguration

Fri, 2017-03-24 10:18

In the first post of this little series we did the basic setup of pgpool and configured the watchdog. So, as of now, pgpool uses a VIP which is active on one of the pgpool nodes and fails over to the other node in case the node where the VIP currently runs goes down for any reason. This provides the very same address for the application or clients to connect to and eliminates the single point of failure that you have when there is only one pgpool instance running. What is still missing is a mechanism to automatically fail over to a promoted standby instance in case the PostgreSQL master goes down. We even want to be able to reconfigure the old master as a new standby that follows the new master automatically. This is what this post is about …

Let’s take a look at the picture again:

pgpool-architecture

What happens when we stop the standby instance? Currently both instances are in status “up” when you ask pgpool (notice that I connect to the 192.168.22.38 address which is the VIP that is managed by pgpool):

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 0          | true              | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | standby | 0          | false             | 0
(2 rows)

Let’s create a new connection first to see what happens to existing connections:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 postgres
psql (9.6.2 dbi services build)
Type "help" for help.

postgres=# select now();
              now              
-------------------------------
 2017-03-24 10:53:16.077351+01
(1 row)


postgres=#

Shut down the standby instance:

postgres@pgpool2:/home/postgres/ [PG1] psql -c "select pg_is_in_recovery()" postgres
 pg_is_in_recovery 
-------------------
 t
(1 row)

postgres@pgpool2:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/PG1/ stop -m fast
waiting for server to shut down.... done
server stopped

What happened to our connection?

postgres=# select now();
              now              
-------------------------------
 2017-03-24 10:54:46.802984+01
(1 row)

Nothing, we can just keep using it. What is pgpool telling us about our instances?

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 4          | true              | 0
 1       | 192.168.22.35 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
(2 rows)

As expected the standby node is now down. This is the easy case, just start the standby again and you are fine:

postgres@pgpool2:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/PG1/ start -m fast
server starting
postgres@pgpool2:/home/postgres/ [PG1] psql -c "select pg_is_in_recovery()" postgres
 pg_is_in_recovery 
-------------------
 t
(1 row)

… and pgpool should detect this as well:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 5          | true              | 0
 1       | 192.168.22.35 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
(2 rows)

No? What we need to do is to attach the node again:

[postgres@centos7_pgpool_m1 ~]$ pcp_attach_node -n 1
Password: 
pcp_attach_node -- Command Successful
[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 5          | true              | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | standby | 0          | false             | 0

Now comes the not-so-easy part: what happens if the primary instance goes down? This is where the failover_command specified in pgpool.conf comes into play:

[postgres@centos7_pgpool_m1 etc]$ cat pgpool.conf | grep failover_command
failover_command = '/home/postgres/failover.sh %d "%h" %p %D %m %M "%H" %P'

When pgpool detects that the master is gone the failover_command will be executed. This is a very simple failover script (located on the pgpool hosts). The parameters are passed into the script by pgpool automatically:

#!/bin/sh -x
# Execute command by failover.
# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %H = new master node host name
#                  %P = old primary node id
#                  %% = '%' character
failed_node_id=$1
failed_host_name=$2
failed_port=$3
failed_db_cluster=$4
new_master_id=$5
old_master_id=$6
new_master_host_name=$7
old_primary_node_id=$8

if [ $failed_node_id = $old_primary_node_id ];then	# master failed
    ssh -T postgres@$new_master_host_name /home/postgres/promote.sh
fi 

The promote.sh script is very simple as well and must be available on all PostgreSQL nodes:

#!/bin/bash
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
export PATH PGDATA
pg_ctl promote -D ${PGDATA} >> /var/tmp/failover.log
psql -c "select * from pg_create_physical_replication_slot('standby1')" postgres >> /var/tmp/failover.log

So once we shut down the master this gets triggered and the standby is promoted. How can we re-attach the old master as a new standby? The magic is in here:

[postgres@centos7_pgpool_m1 ~]$ grep recovery_1st_stage_command /u01/app/postgres/product/pgpool-II/etc/pgpool.conf
recovery_1st_stage_command = 'recover.sh'

In the pgpool.conf file a script is specified which is called when a recovery is triggered. This script needs to be in PGDATA, otherwise it will not work. It looks like this (on both nodes, of course):

postgres@pgpool1:/u02/pgdata/PG1/ [PG1] cat $PGDATA/recover.sh 
#!/bin/sh
set -x
master_node_host_name=`hostname`
master_db_cluster=$1
recovery_node_hostname=$2
recovery_db_cluster=$3
tmp=/tmp/mytemp$$
trap "rm -f $tmp" 0 1 2 3 15

PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"

export PATH PGDATA

# make sure the old master is down
ssh -T $recovery_node_hostname /home/postgres/stop_instance.sh
# rewind the old master
ssh -T ${recovery_node_hostname} /home/postgres/rewind.sh ${master_node_host_name}

cat > $tmp <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$master_node_host_name user=postgres'
primary_slot_name = 'standby1'
recovery_target_timeline = 'latest'
trigger_file='/u02/pgdata/PG1/failover_trigger'
EOF

scp $tmp $recovery_node_hostname:$recovery_db_cluster/recovery.conf

ssh -T ${recovery_node_hostname} /home/postgres/start_instance.sh

The “set -x” at the top is quite important because this script is executed from inside PostgreSQL and you’ll see the output in the PostgreSQL logfile. The scripts executed within the recovery.sh script here are:

postgres@pgpool1:/home/postgres/ [PG1] cat stop_instance.sh 
#!/bin/bash
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
pg_ctl -D $PGDATA stop -m immediate 

postgres@pgpool1:/home/postgres/ [PG1] cat rewind.sh 
#!/bin/bash
master_node_host_name=$1
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
pg_rewind --target-pgdata=${PGDATA} --source-server="port=5432 user=postgres dbname=postgres host=${master_node_host_name}" >> /var/tmp/recover.log

postgres@pgpool1:/home/postgres/ [PG1] cat start_instance.sh 
#!/bin/bash
PGDATA="/u02/pgdata/PG1"
PATH="/u01/app/postgres/product/96/db_2/bin/:$PATH"
pg_ctl -w -s -D $PGDATA start 2>/dev/null 1>/dev/null < /dev/null &
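
One thing to keep in mind: rewind.sh relies on pg_rewind, which only works if the source cluster was initialized with data checksums or has wal_log_hints enabled. A quick sanity check on the PostgreSQL nodes (assuming a local connection as postgres):

postgres@pgpool1:/home/postgres/ [PG1] psql -c "show data_checksums" postgres
postgres@pgpool1:/home/postgres/ [PG1] psql -c "show wal_log_hints" postgres
# at least one of the two must report "on" for pg_rewind to work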

Having this in place we can rebuild the old master with:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | down   | 0.500000  | standby | 0          | false             | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | primary | 0          | true              | 0

[postgres@centos7_pgpool_m1 ~]$ pcp_recovery_node -n 0 -w
pcp_recovery_node -- Command Successful

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38 -c "show pool_nodes" postgres
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | standby | 0          | false             | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | primary | 0          | true              | 0

The output in the PostgreSQL logfile of the master then looks similar to this:

++ hostname
+ master_node_host_name=pgpool2
+ master_db_cluster=/u02/pgdata/PG1
+ recovery_node_hostname=192.168.22.34
+ recovery_db_cluster=/u02/pgdata/PG1
+ tmp=/tmp/mytemp10901
+ trap 'rm -f /tmp/mytemp10901' 0 1 2 3 15
+ PGDATA=/u02/pgdata/PG1
+ PATH=/u01/app/postgres/product/96/db_2/bin/:/u01/app/postgres/product/96/db_2bin:/u01/app/postgres/product/96/db_2/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/postgres/.local/bin:/home/postgres/bin:/sbin:/usr/edb-bart/bin:/usr/edb-efm/bin:/u01/app/postgres/product/pgpool-II/bin:/u01/app/postgres/local/dmk/bin
+ export PATH PGDATA
+ ssh -T 192.168.22.34 /home/postgres/stop_instance.sh
pg_ctl: PID file "/u02/pgdata/PG1/postmaster.pid" does not exist
Is server running?
+ ssh -T 192.168.22.34 /home/postgres/rewind.sh pgpool2
+ cat
+ scp /tmp/mytemp10901 pgpool1:/u02/pgdata/PG1/recovery.conf
+ ssh 192.168.22.34 /home/postgres/start_instance.sh
+ psql -c 'select '\''done'\''' postgres
 ?column? 
----------
 done
(1 row)

Time: 0.140 ms
+ rm -f /tmp/mytemp10901

I have to admit that I needed a small hack: At the end of the recovery pgpool tries to execute $PGDATA/pgpool_remote_start and I do not want to do it that way so I just created this file on both nodes:

postgres@pgpool1:/u02/pgdata/PG1/ [PG1] cat $PGDATA/pgpool_remote_start
exit 0

In the next post we’ll add a second standby system and then teach pgpool how a remaining standby can be told to follow a new master automatically. We’ll need to adjust the scripts here to be more dynamic then as well.

 

The article Vertically scale your PostgreSQL infrastructure with pgpool – 2 – Automatic failover and reconfiguration appeared first on Blog dbi services.

SQL Server 2016: Does Dynamic Data Masking work with INSERT INTO and SELECT INTO commands?

Tue, 2017-03-21 02:55

I wondered how Dynamic Data Masking (DDM) works with the two commands INSERT INTO and SELECT INTO.

First, I create a table and add some “sensitive data”:

USE [DDM_TEST]
GO

CREATE TABLE [dbo].[Confidential](
[ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
[Name] [nvarchar](70)NULL,
[CreditCard] [nvarchar](16)NULL,
[Salary] [int] NULL,
[Email] [nvarchar](60)NULL)  


insert into [dbo].[Confidential]([Name],[CreditCard],[Salary],[Email]) values (N'Stephane',N'3546748598467584',113459,N'sts@dbi-services.com')
insert into [dbo].[Confidential]([Name],[CreditCard],[Salary],[Email]) values (N'David',N'3546746598450989',143576,'dab@dbi-services.com')
insert into [dbo].[Confidential]([Name],[CreditCard],[Salary],[Email])  values (N'Nathan',N'3890098321457893',118900,'nac@dbi-services.com')
insert into [dbo].[Confidential]([Name],[CreditCard],[Salary],[Email])  values (N'Olivier',N'3564890234785612',98000,'olt@dbi-services.com')
insert into [dbo].[Confidential]([Name],[CreditCard],[Salary],[Email])  values (N'Alain',N'9897436900989342',85900,'ala@dbi-services.com')
insert into [dbo].[Confidential]([Name],[CreditCard],[Salary],[Email])  values (N'Fabrice',N'908323468902134',102345,'fad@dbi-services.com')

select * from [dbo].[Confidential]

DDM_Into01

Afterwards, I create all the masking rules and add a user:

Use DDM_TEST
ALTER Table Confidential
ALTER COLUMN NAME ADD MASKED WITH (FUNCTION='default()')
ALTER Table Confidential
ALTER COLUMN SALARY ADD MASKED WITH (FUNCTION='default()')
ALTER Table Confidential
ALTER COLUMN creditcard ADD MASKED WITH (FUNCTION='partial(1,"XXXX",2)')
ALTER Table Confidential
ALTER COLUMN email ADD MASKED WITH (FUNCTION='email()')

CREATE USER TestDemo WITHOUT LOGIN
GRANT SELECT ON Confidential TO TestDemo

-- Execute a select statement as TestDemo 
EXECUTE AS USER='TestDemo'
SELECT * FROM [dbo].[Confidential] 
REVERT

DDM_Into02

INSERT INTO

This command is used to copy a table.
What happens when I copy data from a table with masked columns to a table without masks?
First, I create a second table [dbo].[Confidential2] and grant the SELECT and INSERT permissions to the user “TestDemo”:

USE [DDM_TEST]
GO

CREATE TABLE [dbo].[Confidential2](
[ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
[Name] [nvarchar](70)NULL,
[CreditCard] [nvarchar](16)NULL,
[Salary] [int] NULL,
[Email] [nvarchar](60)NULL)  

GRANT SELECT ON Confidential2 TO TestDemo
GRANT INSERT ON Confidential2 TO TestDemo

I execute the query to insert data from [dbo].[Confidential] to [dbo].[Confidential2] with the INSERT INTO command:

USE [DDM_TEST]
GO
EXECUTE AS USER='TestDemo'
SELECT * FROM [dbo].[Confidential]
INSERT INTO [dbo].[Confidential2]([Name],[CreditCard],[Salary],[Email])
	SELECT [Name],[CreditCard],[Salary],[Email] FROM [dbo].[Confidential]
SELECT * FROM [dbo].[Confidential]
REVERT

DDM_Into03
As you can see, the data are also masked in the second table [dbo].[Confidential2].
But are they really?
I execute the query again with the actual query plan enabled.
DDM_Into04
As you can see, the masking step is missing in the query plan for the select on [dbo].[Confidential2].
If I select data from [dbo].[Confidential2] with my admin account, the data are “masked data” and not the real data…
DDM_Into05
Finally, the goal is reached: you cannot read the sensitive data if you copy data from one table to another.
Keep in mind that the copied data are not dynamically masked for the user.
These data are copied as “masked data”, which guarantees the anonymization and a good level of security for your sensitive data.
But on the other side, if you use the same WHERE clause against [dbo].[Confidential2], you don’t have the same result… :-(
DDM_Into05a
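
To make that last point visible, here is a hedged sketch with sqlcmd (the server name and authentication options are assumptions, adjust them to your environment): the predicate is evaluated against the real values in [dbo].[Confidential], but against the stored, already masked values in [dbo].[Confidential2].

# returns the row (displayed masked) because the filter works on the real value
sqlcmd -S localhost -d DDM_TEST -Q "EXECUTE AS USER='TestDemo'; SELECT * FROM dbo.Confidential WHERE Salary = 113459; REVERT;"
# returns nothing: Confidential2 physically stores the masked value (0)
sqlcmd -S localhost -d DDM_TEST -Q "EXECUTE AS USER='TestDemo'; SELECT * FROM dbo.Confidential2 WHERE Salary = 113459; REVERT;"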

SELECT INTO

With this command, I also test the copy to a temporary table.
These two cases will be interesting…
I recreate the same table [dbo].[Confidential] with the same masking rules, and the user gets the CREATE TABLE and ALTER ANY SCHEMA permissions needed to do the SELECT INTO:

EXECUTE AS USER='TestDemo'
SELECT * FROM [dbo].[Confidential] 
SELECT * INTO [dbo].[Confidential2] FROM [dbo].[Confidential] ;
REVERT

DDM_Into06
In the query plan, you can see that the masking is between the select and the insert.
We are in the same case as previously: copied data are “masked data”.
To see it, I read data from the table [dbo].[Confidential2] with my sysadmin login:
DDM_Into07
And the result is that all copied data are masked. The data remain anonymous.

Finally, let’s test it with a temporary table and let’s see what happens:

USE [DDM_TEST]
GO
EXECUTE AS USER='TestDemo'
SELECT * FROM [dbo].[Confidential] 
SELECT * INTO #Confidential2 FROM [dbo].[Confidential] ;
REVERT
EXECUTE AS USER='TestDemo'
SELECT * FROM #Confidential2 
REVERT

DDM_Into08

The same query plan is applied and the masked data are copied and remain anonymous.

Finally, these two commands INSERT INTO and SELECT INTO keep your data anonymous in the case of a table copy.

Sorry but cheat mode is disabled … :evil:

 

The article SQL Server 2016: Does Dynamic Data Masking work with INSERT INTO and SELECT INTO commands? appeared first on Blog dbi services.

Purging Unified Audit Trail in 12cR2

Sun, 2017-03-19 06:12

A good thing in 12.2 is that the implementation of Unified Audit Trail has changed a little. It was stored in a weird CLI_SWP$ table but now we have a normal partitioned table named AUD$UNIFIED. In a previous post I traced the two purge methods: purge all before a timestamp, or purge all. Here is the same in 12.2.

Purge old

I have quite a few records in the Unified Audit Trail here.


SQL> select unified_audit_policies,action_name,count(*) from unified_audit_trail group by unified_audit_policies,action_name;
 
UNIFIED_AUDIT_POLICIES ACTION_NAME COUNT(*)
---------------------------------------- -------------------- ----------
EXECUTE 3
ORA_LOGON_FAILURES LOGON 408275

I set the timestamp to 6 hours before now
SQL> exec dbms_audit_mgmt.set_last_archive_timestamp(audit_trail_type=>dbms_audit_mgmt.audit_trail_unified
,last_archive_time=>sysdate-6/24);
PL/SQL procedure successfully completed.

And call the clean procedure:
SQL> exec dbms_audit_mgmt.clean_audit_trail(audit_trail_type=>dbms_audit_mgmt.audit_trail_unified
,use_last_arch_timestamp=>TRUE);
PL/SQL procedure successfully completed.

Here is what I see in the trace:

select sys.dbms_audit_mgmt.is_droppable_partition(:1, :2)
from
dual

This is interesting. The Unified Audit Trail is partitioned on the timestamp and the purge procedure checks if the partition can be dropped instead of running a long delete statement.

Here is the documentation we have about it in ?/rdbms/admin/dbmsamgt.sql

-- is_droppable_partition - IS aud$unified table PARTITION DROPPABLE?
--
--
-- INPUT PARAMETERS
-- partname - aud$unified table's Partition Name
-- lat - Last Archive Timestamp mentioned during cleanup

In my case, I traced the bind variables and the is_droppable_partition procedure was run only once with partname=>'"SYS_P348"' and lat=>'03-MAR-17 03.07.56 PM'. The timestamp is the ‘last timestamp’ I’ve set, and I have only one partition here because my database was created recently.

As we can guess, this checks the high value of the partition:

select high_value
from
dba_tab_partitions where table_owner = 'AUDSYS' and table_name =
'AUD$UNIFIED' and partition_name = :1

Because I have only one partition, which is the current one, my ‘last timestamp’ is below the high_value so it is not possible to truncate this partition and keep the records from after the ‘last timestamp’.

Then a delete is run, which deletes all rows from before my last timestamp (bind variable :1 is ’03-MAR-17 03.07.56 PM’). Note that I don’t know (yet) why we can have DBID=0.

delete from audsys.aud$unified
where
event_timestamp < :1 and (dbid = :2 or dbid = 0)
 
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 2 10.68 31.25 16885 24367 437518 346517
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 10.68 31.25 16885 24367 437518 346517
 
Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS (recursive depth: 1)
Number of plan statistics captured: 1
 
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
0 0 0 DELETE AUD$UNIFIED (cr=12219 pr=16885 pw=0 time=31219023 us starts=1)
346517 346517 346517 PARTITION RANGE ITERATOR PARTITION: 1 KEY (cr=12148 pr=0 pw=0 time=1161311 us starts=1 cost=547 size=1231218 card=68401)
346517 346517 346517 TABLE ACCESS FULL AUD$UNIFIED PARTITION: 1 KEY (cr=12148 pr=0 pw=0 time=788043 us starts=1 cost=547 size=1231218 card=68401)
 
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PGA memory operation 5 0.00 0.00
db file sequential read 16885 0.03 21.03

All these rows are deleted with conventional deletes here. I had about 400000 rows and deleted about 340000, so about 60000 remain.

Purge old with old partitions

I had only one partition here but AUDSYS.AUD$UNIFIED is partitioned by month. Here is what I can see in my alert.log about the creation of this partition:
TABLE AUDSYS.AUD$UNIFIED: ADDED INTERVAL PARTITION SYS_P348 (33) VALUES LESS THAN (TIMESTAMP' 2017-04-01 00:00:00')

Actually, this is automatically interval-partitioned by month. Here is an excerpt of the table’s DDL as displayed by dbms_metadata:
PARTITION BY RANGE ("EVENT_TIMESTAMP") INTERVAL (INTERVAL '1' MONTH)
(PARTITION "SYS_P348" VALUES LESS THAN (TIMESTAMP' 2017-04-01 00:00:00') SEGMENT CREATION IMMEDIATE

When running the same as before, but on a database with a few older partitions (because there was no scheduled purge), I can see that ‘is_droppable_partition’ and the related query are run 4 times:

select high_value
from
dba_tab_partitions where table_owner = 'AUDSYS' and table_name =
'AUD$UNIFIED' and partition_name = :1
 
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 4 0.05 0.05 0 0 0 0
Fetch 4 0.00 0.00 0 143 0 4
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 9 0.05 0.06 0 143 0 4

and I see a drop partition for the 3 old partitions:

ALTER TABLE AUDSYS.AUD$UNIFIED DROP PARTITION AUD_UNIFIED_P0
ALTER TABLE AUDSYS.AUD$UNIFIED DROP PARTITION SYS_P221
ALTER TABLE AUDSYS.AUD$UNIFIED DROP PARTITION SYS_P781

Note that this is called by:
CALL DBMS_PDB_EXEC_SQL('ALTER TABLE AUDSYS.AUD$UNIFIED DROP PARTITION SYS_P781')
which runs it internally as an Oracle script because this DDL is not allowed otherwise.

In summary, purging with a timestamp is optimized to run conventional deletes only on the latest partition; older partitions are dropped. If you schedule a job to regularly set the timestamp and then have the purge job do the cleaning, it is better to set the timestamp at the beginning of the month. If you have to purge a large audit trail, it is better to wait for the beginning of the next month.
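
If you want to automate this, here is a minimal sketch (the job name, the 24-hour interval and the 7-day retention are assumptions, not values from this test):

BEGIN
  -- schedule a regular purge that honors the last archive timestamp
  dbms_audit_mgmt.create_purge_job(
    audit_trail_type           => dbms_audit_mgmt.audit_trail_unified,
    audit_trail_purge_interval => 24,                      -- hours between runs
    audit_trail_purge_name     => 'PURGE_UNIFIED_AUDIT',   -- assumed job name
    use_last_arch_timestamp    => TRUE);
  -- the timestamp itself still has to be advanced regularly, for example:
  dbms_audit_mgmt.set_last_archive_timestamp(
    audit_trail_type  => dbms_audit_mgmt.audit_trail_unified,
    last_archive_time => systimestamp - interval '7' day); -- assumed retention
END;
/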

Purge all

If you don’t need to keep recent records and want to truncate all, then just call the purge without a timestamp.

Here I have about 60000 rows remaining from the previous test, all in the current partition.

SQL> select unified_audit_policies,action_name,count(*) from unified_audit_trail group by unified_audit_policies,action_name;
 
UNIFIED_AUDIT_POLICIES ACTION_NAME COUNT(*)
---------------------------------------- -------------------- ----------
EXECUTE 6
ORA_LOGON_FAILURES LOGON 62152

I call the clean procedure:

SQL> exec dbms_audit_mgmt.clean_audit_trail(audit_trail_type=>dbms_audit_mgmt.audit_trail_unified
,use_last_arch_timestamp=>FALSE);
PL/SQL procedure successfully completed.

And I can see directly in the trace a truncate of the whole table:

TRUNCATE TABLE AUDSYS.AUD$UNIFIED
 
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.04 4.42 67 6 919 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.04 4.42 67 6 919 0
 
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS (recursive depth: 1)
 
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 67 0.00 0.07
enq: RO - fast object reuse 5 0.94 1.76
local write wait 36 0.14 2.54

This is the fastest way to empty the Unified Audit Trail.

So what?

We don’t have long experience with 12.2 in production yet, but from what I see here, this new implementation is a good thing. There were many problems with the 12.1 implementation that are probably solved by having a normal table with normal interval partitioning, purged with normal deletes and normal truncates.
Of course, the next question is what happens when you upgrade a 12.1 database with a huge audit trail? That’s for a future post.
And don’t forget that by default you are in mixed mode. More info here.

 

The article Purging Unified Audit Trail in 12cR2 appeared first on Blog dbi services.

How many members for standby redo logs?

Sat, 2017-03-18 11:43

I see a lot of databases with two members for the online redo logs and also two members for the standby redo logs. Why not, but when asking around I realized that there is some misunderstanding about it. And what was recommended 10 years ago may be different today.

Online and standby redo logs

Your transactions happen on the primary database and are written to the online redo logs before the modification is done on the datafiles. And when you commit, you wait to be sure that the redo is on persistent storage. If you lose the current redo log group, then your database crashes and you lose the latest transactions. This is why we multiplex the online redo logs. Even if you are 100% confident in your storage high availability, the risk of a human error dropping a file exists, and it is considerably lower when there are two files.
For additional protection, in case you lose all the primary redo members, Data Guard synchronizes the transactions to a second site by shipping the redo stream. There, on the standby site, the redo is written to the standby redo logs.

The online redo logs are used only on the primary site, and would better be named primary redo logs. You create them on the standby site only to be prepared for failover, when it will become the primary and will be opened read-write. But let’s be clear: online redo logs are not used when the database is not online, and mounted is not online.

The standby redo logs are not standby at all. They are actively used on the standby site, and this is why they are called ‘standby’. On the primary, they are not used; they are just there to be ready when the primary becomes a standby after a failover.
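
You can see this directly on the standby site: the standby redo log groups are listed in v$standby_log, and the one currently receiving redo shows up as ACTIVE (a minimal check, nothing specific to this configuration):

-- run on the standby: one group is ACTIVE while it receives redo from the primary
SELECT group#, thread#, bytes/1024/1024 AS mb, status
FROM   v$standby_log;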

Members

We have seen why we multiplex the online redo logs:

  • it protects the transactions because without multiplexing you lose transactions when losing one group
  • it protects the instance availability because without multiplexing you crash the instance when losing one group

But this is different with standby redo logs.

  • it is an additional protection. Transactions are still persistent on the primary even if you lose a standby log group.
  • the primary is still available even if one standby cannot be SYNC

Of course, in Maximum Protection mode the availability of the primary is compromised when the standby cannot receive the redo in SYNC. But in this protection mode you probably have multiple standbys, and the loss of one standby redo log group on one standby site is not a problem.

Redo transport and redo apply

I said that transactions are still persisted on the primary; even without standby redo logs they are still shipped to the standby site, but in ASYNC mode. This means that in order to lose transactions after the loss of a standby redo log group, you would need to experience this file loss, a primary site failure and a network failure at the same time. The probability of this is very low, and having an additional member does not lower the risk.

Of course, I’ve tested what happens. I had two standby redo log members and I removed all of them one minute ago:

DGMGRL for Linux: Release 12.2.0.1.0 - Production on Fri Mar 17 14:47:45 2017
 
Copyright (c) 1982, 2016, Oracle and/or its affiliates. All rights reserved.
 
Welcome to DGMGRL, type "help" for information.
Connected to "ORCLA"
Connected as SYSDBA.
 
Database - orclb
 
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 1 minute 30 seconds (computed 0 seconds ago)
Average Apply Rate: 0 Byte/s
Active Apply Rate: 0 Byte/s
Maximum Apply Rate: 0 Byte/s
Real Time Query: ON
Instance(s):
ORCLB
 
Database Warning(s):
ORA-16853: apply lag has exceeded specified threshold
ORA-16857: member disconnected from redo source for longer than specified threshold
ORA-16826: apply service state is inconsistent with the DelayMins property
ORA-16789: standby redo logs configured incorrectly

As you can see, when there is no member remaining, the APPLY is stuck but transport still happens, in ASYNC to archived logs.
The standby alert log mentions the failure:
2017-03-17T14:51:21.209611+01:00
Errors in file /u01/app/oracle/diag/rdbms/orclb/ORCLB/trace/ORCLB_rsm0_6568.trc:
ORA-00313: open failed for members of log group 5 of thread 1
ORA-00312: online log 5 thread 1: '/u01/oradata/ORCLB/onlinelog/m5.log'
ORA-27048: skgfifi: file header information is invalid
Additional information: 2
ORA-00312: online log 5 thread 1: '/u01/oradata/ORCLB/onlinelog/o1_mf_5_dbvmxd52_.log'
ORA-27048: skgfifi: file header information is invalid
Additional information: 2

and the SYNC mode cannot continue without standby redo logs:

03/17/2017 14:58:15
Failed to open SRL files. ORA-313
Redo Apply is running without NODELAY option while DelayMins=0
Failed to open SRL files. ORA-313
Error: LogXptMode value 'SYNC' of requires this database to have standby redo logs, but they are not configured correctly.
03/17/2017 14:59:15
Failed to open SRL files. ORA-313
Redo Apply is running without NODELAY option while DelayMins=0
Failed to open SRL files. ORA-313
Error: LogXptMode value 'SYNC' of requires this database to have standby redo logs, but they are not configured correctly.

Sure, you don’t want to lose a standby redo log member. But the risk is not higher than losing any other file, and this is why there is no reason to multiplex it. Standby redo logs are not the same as the primary online redo logs: on similar hardware you need the same size and one more group, but there is no reason to multiplex them the same way.
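
So on the standby site (and on the primary, to be ready for a role change) you would simply add one more standby group than you have online redo log groups, each with a single member. A minimal sketch, where the file names, the 200MB size and the group numbers are assumptions:

-- one more standby group than online groups, same size, a single member per group
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 4 ('/u01/oradata/ORCLB/onlinelog/srl4.log') SIZE 200M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 5 ('/u01/oradata/ORCLB/onlinelog/srl5.log') SIZE 200M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 6 ('/u01/oradata/ORCLB/onlinelog/srl6.log') SIZE 200M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 7 ('/u01/oradata/ORCLB/onlinelog/srl7.log') SIZE 200M;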

Documentation

The confusion may come from old documentation. The 10g documentation says:
For increased availability, consider multiplexing the standby redo log files, similar to the way that online redo log files are multiplexed.
This documentation dates from 2005, and systems have changed a lot since then when it comes to the availability of files.

More recent documentation is the white paper on Best Practices for Synchronous Redo Transport, which mentions: “It is critical for performance that standby redo log groups only contain a single member.”

So what?

At the time of 10g we had a LUN for the redo logs and were not concerned by the size, but more by its unavailability. Things change. Losing a file, and only one file, is extremely rare today. We are more concerned about consolidation and performance. Having 4 online groups of 200MB or 500MB, and 5 standby groups, all multiplexed, for many databases will take space. And you want to allocate this space on the fastest disks, because user commits wait on log writes (on the primary and on the standby, except in Max Performance). You don’t want to over-allocate space here; better to have larger online redo logs. And your goal is that network shipping + standby log writing takes no longer than the local write to the online redo logs, so that the Data Guard protection does not increase commit latency. Multiplexing the standby redo logs increases the risk of longer writes on the standby site.

So if you have your standby redo logs multiplexed, it’s not wrong. But things change, and today you may prefer to save space and avoid the performance overhead by having only one member.

Before writing this blog post, my poll on Twitter had 40 votes. Only 28% mentioned no multiplexing. But a Twitter poll is not exact science, as you can see from the 8 people who answered 42 members ;)

Because I've seen lot of misunderstanding about it, I'm curious to know how many members you have in your standby redo logs

— Franck Pachot (@FranckPachot) March 17, 2017

 

The article How many members for standby redo logs? appeared first on Blog dbi services.

Oracle 12cR2 on Windows: Virtual Accounts

Thu, 2017-03-16 16:52

Oracle Database 12.2.0.1 is released for Windows, just 2 weeks after the Linux release, and this is very good news. Let’s look at something new you will encounter in the first screens of the Oracle 12.2 installer. Don’t worry, the default choice is the right one. But it is better to understand it.

SYSTEM

On Linux, you don’t install Oracle Database as root. You create a user, usually called oracle, which will be the owner of the database files and of the instance processes and shared memory. This looks obvious, but on Windows, before 12c, the Oracle instance was running as the root equivalent, the built-in SYSTEM user. This was very bad from a security point of view: running software with the most powerful user on the system.

12.1 Oracle Home User

This has changed in 12.1 with the possibility to define another user, which already exists, or which you create at installation providing user name and password.
CaptureWinVA000

This user is called the Oracle Home user. Just to be clear, it is the user which will run the instance; you still install the software as Administrator.
So, in 12.1 the choice is an existing user, a new user or SYSTEM, and the recommendation is to create a user. But it is quite annoying to have to provide a user name and password for a user you will never use to log in.

12.2 Virtual Accounts

Windows 2008 R2 has introduced two new local service users: Managed Service Accounts (MSA) and Virtual Accounts.

Managed Service Accounts are created by the administrator in the Active Directory (using New-ADServiceAccount). And you can use them in 12c by mentioning the name in ‘Use Existing Windows User’.

Virtual Accounts are enabled by default in Windows. In 12.2 you can use this feature for the Oracle Home account. It is the first option, the default one, and the one recommended if you have no reason to use another user:

CaptureWinVA001

oracle.key

If you don’t know what has been defined, look at the registry. Find the ORACLE_HOME you run from, read the registry key from %ORACLE_HOME%\bin\oracle.key and look at the keys:

CaptureOradimDBCA004

Here ORACLE_SVCUSER_TYPE is new, with the value ‘V’, which means that the ORACLE_SVCUSER is a Virtual Account. ORACLE_SVCUSER_PWDREQ mentions that no password has to be provided for the instance services.

Note that the old method, the ‘built-in account’ had the following, mentioning the internal SYSTEM, and without a TYPE:

ORACLE_SVCUSER REG_SZ NT AUTHORITY\SYSTEM
ORACLE_SVCUSER_PWDREQ REG_SZ 0

The 12.1 method of a non-privileged user had ORACLE_SVCUSER_PWDREQ=1 and required the password for the services.

Back to virtual accounts: I said that they are used for the instance services and the database files. Let’s have a look at the services and file security properties:

CaptureOradimDBCA005

The database file owner is the user we have seen above as defined by ORACLE_SVCUSER but the service ‘Log On As’ has the special ‘NT SERVICE\ServiceName‘ which is the Virtual Account. It is not a real account like built-in, local or domain ones. It is more a service that is displayed as an account here.

So what?

Don’t panic in front of this additional choice. Virtual Account is the right choice to run with a minimal privilege user and no additional complexity.

 

The article Oracle 12cR2 on Windows: Virtual Accounts appeared first on Blog dbi services.

Vertically scale your PostgreSQL infrastructure with pgpool – 1 – Basic setup and watchdog configuration

Thu, 2017-03-16 12:21

I have written some posts in the past on how you can make your PostgreSQL deployment highly available by using PostgreSQL’s streaming replication feature (1, 2). The main issue you’ll have to resolve with such a setup is how the application can be made aware of a new master when a failover happened. You could use EDB Failover Manager (1, 2, 3, 4) for that, because it provides the functionality to move a VIP from one host to another so the application can always connect to the very same IP address no matter where the current master is running (EDB EFM requires a subscription). You could also use Pacemaker and Corosync for that. But, and this is the scope of this post, you can also use pgpool, which is widely known in the PostgreSQL community. When you configure it the right way you can even spread your read operations over all hot standby servers in your configuration while only write operations go to the master. This allows you to vertically scale your PostgreSQL deployment by adding more standby nodes when you need more resources. Let’s go…

Starting with a picture is always a good idea. This is what we want to set up:

pgpool-architecture

We will have two nodes dedicated to pgpool (centos7_pgpool_m1/m2). pgpool will be running in a watchdog configuration so that one node can take over in case the other goes down. pgpool will provide a virtual IP address for the clients to connect to (which fails over to the surviving node in case a node goes down for any reason). In the background there are two nodes which host the PostgreSQL 9.6.2 primary and hot standby instances (centos7_pgpool_1/2). At the very beginning the master is running on centos7_pgpool_1 although that does not really matter once the whole setup is completed.

I’ll not describe the PostgreSQL master->standby setup itself. If you need assistance there, take a look here or here, or search the web; there are many great howtos.

Let’s start by installing pgpool onto the hosts dedicated to pgpool (centos7_pgpool_m1/m2):

You can download pgpool here. As pgpool requires libpq we’ll just install the PostgreSQL binaries on the hosts dedicated for pgpool as well before proceeding with the installation of pgpool. Of course these steps need to be done on both hosts (centos7_pgpool_m1/m2):

[root@centos7_pgpool_m1 ~]$ groupadd postgres
[root@centos7_pgpool_m1 ~]$ useradd -g postgres postgres
[root@centos7_pgpool_m1 ~]$ passwd postgres
[root@centos7_pgpool_m1 ~]$ mkdir -p /u01/app/postgres/software
[root@centos7_pgpool_m1 ~]$ chown -R postgres:postgres /u01/app/postgres
[root@centos7_pgpool_m1 ~]$ su - postgres
[postgres@centos7_pgpool_m1 ~]$ cd /u01/app/postgres/software/
[postgres@centos7_pgpool_m1 software]$ wget https://ftp.postgresql.org/pub/source/v9.6.2/postgresql-9.6.2.tar.bz2
[postgres@centos7_pgpool_m1 software]$ tar -axf postgresql-9.6.2.tar.bz2
[postgres@centos7_pgpool_m1 software]$ cd postgresql-9.6.2
[postgres@centos7_pgpool_m1 postgresql-9.6.2]$ PGHOME=/u01/app/postgres/product/96/db_2
[postgres@centos7_pgpool_m1 postgresql-9.6.2]$ SEGSIZE=2
[postgres@centos7_pgpool_m1 postgresql-9.6.2]$ BLOCKSIZE=8
[postgres@centos7_pgpool_m1 postgresql-9.6.2]$ WALSEGSIZE=16
[postgres@centos7_pgpool_m1 postgresql-9.6.2]$ ./configure --prefix=${PGHOME} \
>             --exec-prefix=${PGHOME} \
>             --bindir=${PGHOME}/bin \
>             --libdir=${PGHOME}/lib \
>             --sysconfdir=${PGHOME}/etc \
>             --includedir=${PGHOME}/include \
>             --datarootdir=${PGHOME}/share \
>             --datadir=${PGHOME}/share \
>             --with-pgport=5432 \
>             --with-perl \
>             --with-python \
>             --with-tcl \
>             --with-openssl \
>             --with-pam \
>             --with-ldap \
>             --with-libxml \
>             --with-libxslt \
>             --with-segsize=${SEGSIZE} \
>             --with-blocksize=${BLOCKSIZE} \
>             --with-wal-segsize=${WALSEGSIZE}  \
>             --with-extra-version=" dbi services build"
[postgres@centos7_pgpool_m1 postgresql-9.6.2]$ make world
[postgres@centos7_pgpool_m1 postgresql-9.6.2]$ make install-world
[postgres@centos7_pgpool_m1 postgresql-9.6.2]$ cd ..
[postgres@centos7_pgpool_m1 software]$ rm -rf postgresql-9.6.2*
### download pgpool
[postgres@centos7_pgpool_m1 software]$ ls
pgpool-II-3.6.1.tar.gz
[postgres@centos7_pgpool_m1 software]$ tar -axf pgpool-II-3.6.1.tar.gz 
[postgres@centos7_pgpool_m1 software]$ cd pgpool-II-3.6.1
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ export PATH=/u01/app/postgres/product/96/db_2/bin/:$PATH
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ ./configure --prefix=/u01/app/postgres/product/pgpool-II
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ make
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ make install
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ cd src/sql/pgpool-recovery/
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ make
[postgres@centos7_pgpool_m1 pgpool-recovery]$ make install
[postgres@centos7_pgpool_m1 pgpool-recovery]$ cd ../pgpool-regclass/
[postgres@centos7_pgpool_m1 pgpool-regclass]$ make
[postgres@centos7_pgpool_m1 pgpool-regclass]$ make install

Copy the generated extensions to the PostgreSQL master and standby servers:

[postgres@centos7_pgpool_m1 ~]$ cd /u01/app/postgres/software/pgpool-II-3.6.1
# master node
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool_recovery.control 192.168.22.34:/u01/app/postgres/product/96/db_2/share/extension/pgpool_recovery.control
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool_recovery--1.1.sql 192.168.22.34:/u01/app/postgres/product/96/db_2/share/extension/pgpool_recovery--1.1.sql
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool-recovery.sql 192.168.22.34:/u01/app/postgres/product/96/db_2/share/extension/pgpool-recovery.sql
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/lib/pgpool-recovery.so 192.168.22.34:/u01/app/postgres/product/96/db_2/lib/pgpool-recovery.so
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool_regclass--1.0.sql 192.168.22.34:/u01/app/postgres/product/96/db_2/share/extension/pgpool_regclass--1.0.sql
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool_regclass.control 192.168.22.34:/u01/app/postgres/product/96/db_2/share/extension/pgpool_regclass.control
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/lib/pgpool-regclass.so 192.168.22.34:/u01/app/postgres/product/96/db_2/lib/pgpool-regclass.so
# standby node
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool_recovery.control 192.168.22.35:/u01/app/postgres/product/96/db_2/share/extension/pgpool_recovery.control
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool_recovery--1.1.sql 192.168.22.35:/u01/app/postgres/product/96/db_2/share/extension/pgpool_recovery--1.1.sql
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool-recovery.sql 192.168.22.35:/u01/app/postgres/product/96/db_2/share/extension/pgpool-recovery.sql
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/lib/pgpool-recovery.so 192.168.22.35:/u01/app/postgres/product/96/db_2/lib/pgpool-recovery.so
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool_regclass--1.0.sql 192.168.22.35:/u01/app/postgres/product/96/db_2/share/extension/pgpool_regclass--1.0.sql
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/share/extension/pgpool_regclass.control 192.168.22.35:/u01/app/postgres/product/96/db_2/share/extension/pgpool_regclass.control
[postgres@centos7_pgpool_m1 pgpool-II-3.6.1]$ scp /u01/app/postgres/product/96/db_2/lib/pgpool-regclass.so 192.168.22.35:/u01/app/postgres/product/96/db_2/lib/pgpool-regclass.so

Install the extensions on the master node only (this will be replicated to the standby node automatically as the PostgreSQL instances already operate in hot_standby mode):

postgres@pgpool1:/u01/app/postgres/product/96/db_2/ [PG1] psql template1
psql (9.6.2 dbi services build)
Type "help" for help.

(postgres@[local]:5432) [template1] > create extension pgpool_recovery;
CREATE EXTENSION
(postgres@[local]:5432) [template1] > create extension pgpool_regclass;
CREATE EXTENSION
(postgres@[local]:5432) [template1] > \dx
                                List of installed extensions
      Name       | Version |   Schema   |                    Description                     
-----------------+---------+------------+----------------------------------------------------
 pgpool_recovery | 1.1     | public     | recovery functions for pgpool-II for V3.4 or later
 pgpool_regclass | 1.0     | public     | replacement for regclass
 plpgsql         | 1.0     | pg_catalog | PL/pgSQL procedural language
(3 rows)

Create the pgpool.conf configuration file on both nodes. For node 1 (centos7_pgpool_m1):

echo "echo "listen_addresses = '*'
port = 5432
socket_dir = '/tmp'
pcp_port = 9898
pcp_socket_dir = '/tmp'
backend_hostname0 = '192.168.22.34'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/u02/pgdata/PG1'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = '192.168.22.35'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/u02/pgdata/PG1'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = off
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0
log_destination = 'stderr'
print_timestamp = on
log_connections = off
log_hostname = off
log_statement = off
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 0
pid_file_name = '/tmp/pgpool.pid'
logdir = '/tmp/pgpool'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = on
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = off
ignore_leading_white_space = on
white_function_list = ''
black_function_list = 'nextval,setval'
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 0
sr_check_user = 'postgres'
sr_check_password = ''
delay_threshold = 0
follow_master_command = ''
parallel_mode = off
pgpool2_hostname = 'centos7_pgpool_m2'
system_db_hostname  = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''
health_check_period = 20
health_check_timeout = 20
health_check_user = 'postgres'
health_check_password = ''
health_check_max_retries = 0
health_check_retry_delay = 1
failover_command = '/home/postgres/failover.sh %d \"%h\" %p %D %m %M \"%H\" %P'
failback_command = ''
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'postgres'
recovery_password = ''
recovery_1st_stage_command = 'resync_master.sh'
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
trusted_servers = ''
ping_path = '/usr/bin'
wd_hostname = 'centos7_pgpool_m1'
wd_port = 9000
wd_authkey = ''
other_pgpool_hostname0 = 'centos7_pgpool_m2'
other_pgpool_port0 = 5432
other_wd_port0 = 9000
delegate_IP = '192.168.22.38'
ifconfig_path = '/usr/bin'
if_up_cmd = 'ifconfig enp0s8:0 inet \$_IP_\$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig enp0s8:0 down'
arping_path = '/usr/sbin'
arping_cmd = 'arping -U \$_IP_\$ -w 1'
clear_memqcache_on_escalation = on
wd_escalation_command = ''
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'host0_ip1'
heartbeat_destination_port0 = 9694
heartbeat_device0 = ''
wd_life_point = 3
wd_lifecheck_query = 'SELECT 1'
wd_lifecheck_dbname = 'template1'
wd_lifecheck_user = 'nobody'
wd_lifecheck_password = ''
relcache_expire = 0
relcache_size = 256
check_temp_table = on
memory_cache_enabled = off
memqcache_method = 'shmem'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
memqcache_total_size = 67108864
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600
memqcache_cache_block_size = 1048576
memqcache_oiddir = '/var/log/pgpool/oiddir'
white_memqcache_table_list = ''
black_memqcache_table_list = ''
" > /u01/app/postgres/product/pgpool-II/etc/pgpool.conf

For node 2 (centos7_pgpool_m2):

echo "echo "listen_addresses = '*'
port = 5432
socket_dir = '/tmp'
pcp_port = 9898
pcp_socket_dir = '/tmp'
backend_hostname0 = '192.168.22.34'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/u02/pgdata/PG1'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = '192.168.22.35'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/u02/pgdata/PG1'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = off
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0
log_destination = 'stderr'
print_timestamp = on
log_connections = off
log_hostname = off
log_statement = off
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 0
pid_file_name = '/tmp/pgpool.pid'
logdir = '/tmp/pgpool'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = on
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = off
ignore_leading_white_space = on
white_function_list = ''
black_function_list = 'nextval,setval'
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 0
sr_check_user = 'postgres'
sr_check_password = ''
delay_threshold = 0
follow_master_command = ''
parallel_mode = off
pgpool2_hostname = 'centos7_pgpool_m2'
system_db_hostname  = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''
health_check_period = 20
health_check_timeout = 20
health_check_user = 'postgres'
health_check_password = ''
health_check_max_retries = 0
health_check_retry_delay = 1
failover_command = '/home/postgres/failover.sh %d \"%h\" %p %D %m %M \"%H\" %P'
failback_command = ''
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'postgres'
recovery_password = ''
recovery_1st_stage_command = 'resync_master.sh'
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
trusted_servers = ''
ping_path = '/usr/bin'
wd_hostname = 'centos7_pgpool_m2'
wd_port = 9000
wd_authkey = ''
other_pgpool_hostname0 = 'centos7_pgpool_m1'
other_pgpool_port0 = 5432
other_wd_port0 = 9000
delegate_IP = '192.168.22.38'
ifconfig_path = '/usr/sbin'
if_up_cmd = 'ifconfig enp0s8:0 inet \$_IP_\$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig enp0s8:0 down'
arping_path = '/usr/sbin'
arping_cmd = 'arping -U \$_IP_\$ -w 1'
clear_memqcache_on_escalation = on
wd_escalation_command = ''
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'host0_ip1'
heartbeat_destination_port0 = 9694
heartbeat_device0 = ''
wd_life_point = 3
wd_lifecheck_query = 'SELECT 1'
wd_lifecheck_dbname = 'template1'
wd_lifecheck_user = 'nobody'
wd_lifecheck_password = ''
relcache_expire = 0
relcache_size = 256
check_temp_table = on
memory_cache_enabled = off
memqcache_method = 'shmem'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
memqcache_total_size = 67108864
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600
memqcache_cache_block_size = 1048576
memqcache_oiddir = '/var/log/pgpool/oiddir'
white_memqcache_table_list = ''
black_memqcache_table_list = ''
" > /u01/app/postgres/product/pgpool-II/etc/pgpool.conf

For switching the VIP from one host to another, pgpool must be able to bring up and shut down the virtual interface. You could use sudo for that, or set the suid bit on the ifconfig and arping binaries:

[postgres@centos7_pgpool_m1 pgpool-II]$ sudo chmod u+s /usr/sbin/arping
[postgres@centos7_pgpool_m1 pgpool-II]$ sudo chmod u+s /sbin/ifconfig

The other important configuration file for pgpool is the pcp.conf file. This file holds the authentication for pgpool itself and requires a user name and an md5-hashed password. To generate the password you can use the pg_md5 utility which comes with the pgpool installation:

[postgres@centos7_pgpool_m1 ~]$ /u01/app/postgres/product/pgpool-II/bin/pg_md5 --prompt --username postgres
password: 
e8a48653851e28c69d0506508fb27fc5

Once you have the hashed password we can create the pcp.conf file (on both pgpool nodes of course):

[postgres@centos7_pgpool_m1 ~]$ echo "postgres:e8a48653851e28c69d0506508fb27fc5" > /u01/app/postgres/product/pgpool-II/etc/pcp.conf

Before doing anything else we need to allow connections from the pgpool nodes to the database nodes by adjusting the pg_hba.conf file for both PostgreSQL instances. On both nodes:

postgres@pgpool1:/home/postgres/ [PG1] echo "host    all             postgres        192.168.22.36/32         trust" >> /u02/pgdata/PG1/pg_hba.conf
postgres@pgpool1:/home/postgres/ [PG1] echo "host    all             postgres        192.168.22.37/32         trust" >> /u02/pgdata/PG1/pg_hba.conf
postgres@pgpool1:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/PG1/ reload

Before we start pgpool on both pgpool nodes, let’s take a look at the important watchdog parameters (shown here from node 2; node 1 mirrors them with the host names swapped):

ping_path = '/usr/bin'
wd_hostname = 'centos7_pgpool_m2'
wd_port = 9000
wd_authkey = ''
other_pgpool_hostname0 = 'centos7_pgpool_m1'
other_pgpool_port0 = 5432
other_wd_port0 = 9000
delegate_IP = '192.168.22.38'
ifconfig_path = '/usr/sbin'
if_up_cmd = 'ifconfig enp0s8:0 inet \$_IP_\$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig enp0s8:0 down'
arping_path = '/usr/sbin'
arping_cmd = 'arping -U \$_IP_\$ -w 1'

The various *path* variables are obvious: they tell pgpool where to find the binaries for ping, arping and ifconfig (you can also use the ip command instead). The other_*0 variables specify which other host runs a pgpool instance and on which pgpool and watchdog ports; this is essential for the communication between the two pgpool hosts. Then we have the commands to bring up the virtual interface and to bring it down (if_up_cmd, if_down_cmd). In addition we need an address for the virtual interface, which is specified by the “delegate_IP” variable. Let’s see if it works and start pgpool on both nodes:

# node 1
[postgres@centos7_pgpool_m1 ~]$ /u01/app/postgres/product/pgpool-II/bin/pgpool
[postgres@centos7_pgpool_m1 ~]$ 
# node 2
[postgres@centos7_pgpool_m2 ~]$ /u01/app/postgres/product/pgpool-II/bin/pgpool
[postgres@centos7_pgpool_m2 ~]$ 

That doesn’t look too bad, as no issues are printed to the screen. When everything went fine we should see the new virtual IP (192.168.22.38) on one of the nodes (node 2 in my case):

[postgres@centos7_pgpool_m2 ~]$ ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:d6:95:ab brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
       valid_lft 85216sec preferred_lft 85216sec
    inet6 fe80::a00:27ff:fed6:95ab/64 scope link 
       valid_lft forever preferred_lft forever
3: enp0s8:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:5c:b0:e5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.22.37/24 brd 192.168.22.255 scope global enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.22.38/24 brd 192.168.22.255 scope global secondary enp0s8:0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe5c:b0e5/64 scope link tentative dadfailed 
       valid_lft forever preferred_lft forever

When we shut down pgpool on the node where the VIP is currently running, it should be switched to the other node automatically, so let’s shut down pgpool on the node where it currently runs:

[postgres@centos7_pgpool_m2 ~]$ /u01/app/postgres/product/pgpool-II/bin/pgpool -m fast stop
2017-03-16 17:54:02: pid 2371: LOG:  stop request sent to pgpool. waiting for termination...
.done.

Check the other host for the VIP:

[postgres@centos7_pgpool_m1 ~]$ ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:d6:95:ab brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
       valid_lft 85067sec preferred_lft 85067sec
    inet6 fe80::a00:27ff:fed6:95ab/64 scope link 
       valid_lft forever preferred_lft forever
3: enp0s8:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:5c:b0:e5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.22.36/24 brd 192.168.22.255 scope global enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.22.38/24 brd 192.168.22.255 scope global secondary enp0s8:0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe5c:b0e5/64 scope link tentative dadfailed 
       valid_lft forever preferred_lft forever

Cool, now we have a VIP the application can connect to, and it switches between the pgpool hosts automatically in case the host it currently runs on experiences an issue or is shut down intentionally. There is a pcp command which shows you more details regarding the watchdog:

[postgres@centos7_pgpool_m1 ~]$ /u01/app/postgres/product/pgpool-II/bin/pcp_watchdog_info 
Password: 
2 YES centos7_pgpool_m1:5432 Linux centos7_pgpool_m1 centos7_pgpool_m1

centos7_pgpool_m1:5432 Linux centos7_pgpool_m1 centos7_pgpool_m1 5432 9000 4 MASTER
centos7_pgpool_m2:5432 Linux centos7_pgpool_m2 centos7_pgpool_m2 5432 9000 7 STANDBY

As we now have a VIP we should be able to connect to the PostgreSQL backends by connecting to this VIP:

[postgres@centos7_pgpool_m1 ~]$ psql -h 192.168.22.38
psql (9.6.2 dbi services build)
Type "help" for help.

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(3 rows)

Ok, that works as well. What do we see on the PostgreSQL instances? On the master:

(postgres@[local]:5432) [postgres] > select datname,client_addr,client_hostname from pg_stat_activity where client_addr is not null;
 datname  |  client_addr  | client_hostname 
----------+---------------+-----------------
 postgres | 192.168.22.36 | NULL
(1 row)

We see one connection from the first pgpool node. What do we see on the standby?

(postgres@[local]:5432) [postgres] > select datname,client_addr,client_hostname from pg_stat_activity where client_addr is not null;
 datname  |  client_addr  | client_hostname 
----------+---------------+-----------------
 postgres | 192.168.22.36 | NULL
(1 row)

One connection as well. Looks good.
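
If you want to double-check which role the backend you landed on currently has, you can also ask PostgreSQL directly (this is generic PostgreSQL, nothing pgpool-specific):

-- false = primary, true = hot standby
SELECT pg_is_in_recovery();
-- on the primary: which standbys are streaming from it
SELECT client_addr, state, sync_state FROM pg_stat_replication;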

When you connect to the PostgreSQL instances through pgpool there is a SQL-like syntax for displaying pgpool information as well:

postgres=# show pool_nodes;
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replicati
on_delay 
---------+---------------+------+--------+-----------+---------+------------+-------------------+----------
---------
 0       | 192.168.22.34 | 5432 | up     | 0.500000  | primary | 1          | true              | 0
 1       | 192.168.22.35 | 5432 | up     | 0.500000  | standby | 0          | false             | 0
(2 rows)

To summarize: we now have a pgpool instance running on two nodes. Only one of these nodes hosts the VIP, and the VIP switches to the other host in case there is an issue. Client connections from now on can go to the VIP, and pgpool will redirect the connection to one of the PostgreSQL nodes (depending on whether it is a write or a pure read operation).

In the next post we’ll dig deeper into the pgpool configuration, how you can tell on which instance you actually landed, and how we can instruct pgpool to automatically promote a new master, disconnect the old master and then rebuild the old master as a new standby that follows the new master.

 

The article Vertically scale your PostgreSQL infrastructure with pgpool – 1 – Basic setup and watchdog configuration appeared first on Blog dbi services.

Apache JMeter and Cross-Site Request Forgery (CSRF) token management

Thu, 2017-03-16 08:45

Introduction

In today’s web technologies, a common defensive mechanism against Cross-Site Request Forgery (CSRF) attacks is to use a synchronizer token. This token might be unique for each request, and thus it blocks us from replaying a recorded JMeter test session off the shelf.

This blog will describe how this CSRF feature can be handled in JMeter.

How to implement this feature

The solution is to identify and extract the CSRF token from the response data or header, depending on how it has been set.
The site I was load testing with JMeter uses a cookie to set the CSRF token and adds an X-CSRFToken header to the following HTTP requests.

The HTTP Response header contains something like:

Set-Cookie: csrftoken=sTrKh7qgnuKtuNTkbwlyCv45W2sqOaiY; expires=Sun, 21-Jan-2017 11:34:43 GMT; Max-Age=31449600; Path=/

To extract the CSRF token value from the HTTP response header, add a Regular Expression Extractor post-processor globally.
This way, if the token value is reset to a new value at some point, it will be dynamically picked up for the following requests.

Now configure it as follows:

Apply to: Main sample only
Field to check: Response Headers
Reference Name: CSRF_TOKEN
Regular Expression: Set-Cookie: csrftoken=(.+?);
Template: $1$

Get the Response Cookie via the Regular Expression Extractor

DynCSRF_Regular_Expression

It is always better to have a user variable attached to the extracted value so it is kept during the complete load test run.
Select ‘User Defined Variables’ and add a new variable with the same name as the reference name declared above in the Regular Expression Extractor.

DynCSRF_variable

The next step is to analyse each HTTP request recorded in the scenario and to replace the hard-coded value of the X-CSRFToken header with the variable set by the post-processor, as shown below:

DynCSRF_HTTP_Header

To avoid having to check every request’s HTTP Header Manager as displayed above, which can take some time and might introduce errors, a pre-processor can be used that checks the headers
and automatically replaces the hard-coded X-CSRFToken value with the variable set by the post-processor. This kind of pre-processor can be time consuming and should be as simple as possible. Thus I decided not to check whether the X-CSRFToken header exists in the request and to simply remove the header and add the X-CSRFToken one to all requests. This worked fine for the site I was working on.

The pre-processor code used was the following:

import org.apache.jmeter.protocol.http.control.Header;

// remove the recorded (hard-coded) header and add it again with the extracted token;
// the name passed to vars.get() must match the Regular Expression Extractor's
// reference name / user defined variable (CSRF_TOKEN in this example)
sampler.getHeaderManager().removeHeaderNamed("X-CSRFToken");
newValue = vars.get("CSRF_TOKEN");
sampler.getHeaderManager().add(new Header("X-CSRFToken", newValue));

DynCSRF_BeasnShell

 

The article Apache JMeter and Cross-Site Request Forgery (CSRF) token management appeared first on Blog dbi services.

10th CH Exadata Community Meeting

Thu, 2017-03-16 04:55

Yesterday (15th of March), the 10th Switzerland Exadata Community Meeting took place. It was not a meeting only about Exadata, but also about the Oracle Database Appliance, which becomes more and more popular. It is interesting how many companies in Switzerland already use Exadatas or ODAs.

A big topic was the Exadata 12.2.1.1.0 software release, which includes over 30 unique software features and enhancements, like better analytics, better transaction processing, better consolidation, improved security, and faster and more robust upgrades, to mention just a few.

Attached are the slides from Gurmeet Goindi, the Master Product Manager for Exadata who presented the session about the Exadata 12.2 software.

http://www.oracle.com/technetwork/database/exadata/exadatasoftware-12211-3521134.pdf

Like in the Exadata Community Meetings before, patching is still one of the hot topics. Oracle puts a lot of effort into making it faster and more robust. Starting with 12.2, parallel firmware upgrades across components such as hard disks, flash, ILOM/BIOS and InfiniBand cards are supported, which might lead to a 5x speed-up in storage server software updates. Besides that, the number of reboots for software updates was reduced by using kexec where possible.
Making Exadata patching faster and more robust is in the great interest of Oracle, because Oracle themselves are the biggest Exadata customer in the world. ;-)

Another very interesting session, from the Accenture Enkitec Group, was about the attractiveness of the Oracle Cloud regarding performance and also pricing in some situations.

https://www.accenture.com/t20161013T060358__w__/us-en/_acnmedia/PDF-34/Accenture-Oracle-Cloud-Performance-Test-October2016.pdf

I am very much looking forward to the next Exadata Community Meeting in Switzerland.

Cheers,
William

 

The article 10th CH Exadata Community Meeting appeared first on Blog dbi services.

Postgres Barman and DMK

Tue, 2017-03-14 10:21

As PostgreSQL is more and more present in our clients’ infrastructures, I wanted to describe the barman installation and configuration. Barman is the backup and recovery tool for PostgreSQL; I configured it using DMK, our tool for infrastructure administrators on Oracle, MySQL, and PostgreSQL.

I used two virtual servers running under RedHat Enterprise Linux 7.1, one for the PostgreSQL database server (pg1) and the second for barman (pg2).

First I install PostgreSQL 9.6 on both servers:

[root@pg1 ~]# wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/
rhel-7-x86_64/pgdg-redhat96-9.6-3.noarch.rpm
--2017-02-06 15:08:05--  https://download.postgresql.org/pub/repos/yum/9.6/redhat
/rhel-7-x86_64/pgdg-redhat96-9.6-3.noarch.rpm
Resolving download.postgresql.org (download.postgresql.org)... 
217.196.149.55, 174.143.35.246, 87.238.57.227, ...
Connecting to download.postgresql.org (download.postgresql.org)|
217.196.149.55|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4816 (4.7K) [application/x-redhat-package-manager]
Saving to: ‘pgdg-redhat96-9.6-3.noarch.rpm’
 
100%[======================================>] 4,816       
 
2017-02-06 15:08:05 (2.71 MB/s) - pgdg-redhat96-9.6-3.noarch.rpm saved 
 
[root@pg1 ~]# sudo yum localinstall -y pgdg-redhat96-9.6-3.noarch.rpm
Examining pgdg-redhat96-9.6-3.noarch.rpm: pgdg-redhat96-9.6-3.noarch
Marking pgdg-redhat96-9.6-3.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package pgdg-redhat96.noarch 0:9.6-3 will be installed
--> Finished Dependency Resolution
 
Dependencies Resolved
 
================================================================================
 Package           Arch       Version     Repository                       Size
================================================================================
Installing:
 pgdg-redhat96     noarch     9.6-3       /pgdg-redhat96-9.6-3.noarch     2.7 k
 
Transaction Summary
================================================================================
Install  1 Package
 
Total size: 2.7 k
Installed size: 2.7 k
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : pgdg-redhat96-9.6-3.noarch                                   1/1
  Verifying  : pgdg-redhat96-9.6-3.noarch                                   1/1
 
Installed:
  pgdg-redhat96.noarch 0:9.6-3
 
Complete!

I install barman on the barman server (pg2):

[root@pg2 ~]# sudo yum install barman
pgdg96                                                   | 4.1 kB     00:00
(1/2): pgdg96/7Server/x86_64/group_gz                      |  249 B   00:00
(2/2): pgdg96/7Server/x86_64/primary_db                    | 129 kB   00:02
Resolving Dependencies
--> Running transaction check
---> Package barman.noarch 0:2.1-1.rhel7 will be installed
--> Processing Dependency: python-psycopg2 >= 2.4.2 for package:
barman-2.1-1.rhel7.noarch
--> Processing Dependency: python-argh >= 0.21.2 for package: 
barman-2.1-1.rhel7.noarch
--> Processing Dependency: python-dateutil for package: 
barman-2.1-1.rhel7.noarch
--> Processing Dependency: python-argcomplete for package: 
barman-2.1-1.rhel7.noarch
--> Running transaction check
---> Package python-argcomplete.noarch 0:0.3.7-1.rhel7 will be installed
---> Package python-argh.noarch 0:0.23.0-1.rhel7 will be installed
---> Package python-dateutil.noarch 1:2.5.3-3.rhel7 will be installed
--> Processing Dependency: python-six for package: 1:
python-dateutil-2.5.3-3.rhel7.noarch
---> Package python-psycopg2.x86_64 0:2.6.2-3.rhel7 will be installed
--> Processing Dependency: postgresql96-libs for package: 
python-psycopg2-2.6.2-3.rhel7.x86_64
--> Running transaction check
---> Package postgresql96-libs.x86_64 0:9.6.1-1PGDG.rhel7 will be installed
---> Package python-six.noarch 0:1.9.0-2.el7 will be installed
--> Finished Dependency Resolution
 
Dependencies Resolved
 
================================================================================
 Package                Arch       Version                 Repository      Size
================================================================================
Installing:
 barman                 noarch     2.1-1.rhel7             pgdg96         248 k
Installing for dependencies:
 postgresql96-libs      x86_64     9.6.1-1PGDG.rhel7       pgdg96         308 k
 python-argcomplete     noarch     0.3.7-1.rhel7           pgdg96          23 k
 python-argh            noarch     0.23.0-1.rhel7          pgdg96          33 k
 python-dateutil        noarch     1:2.5.3-3.rhel7         pgdg96         241 k
 python-psycopg2        x86_64     2.6.2-3.rhel7           pgdg96         131 k
 python-six             noarch     1.9.0-2.el7             ol7_latest      28 k
 
Transaction Summary
================================================================================
Install  1 Package (+6 Dependent packages)
 
Total download size: 1.0 M
Installed size: 3.6 M
Is this ok [y/d/N]: y
Downloading packages:
(1/7): barman-2.1-1.rhel7.noarch.rpm                       | 248 kB   00:03
(2/7): python-argcomplete-0.3.7-1.rhel7.noarch.rpm         |  23 kB   00:00
(3/7): python-argh-0.23.0-1.rhel7.noarch.rpm               |  33 kB   00:00
(4/7): postgresql96-libs-9.6.1-1PGDG.rhel7.x86_64.rpm      | 308 kB   00:04
(5/7): python-six-1.9.0-2.el7.noarch.rpm                   |  28 kB   00:00
(6/7): python-dateutil-2.5.3-3.rhel7.noarch.rpm            | 241 kB   00:01
(7/7): python-psycopg2-2.6.2-3.rhel7.x86_64.rpm            | 131 kB   00:01
--------------------------------------------------------------------------------
Total                                              163 kB/s | 1.0 MB  00:06
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : python-argh-0.23.0-1.rhel7.noarch                            1/7
  Installing : postgresql96-libs-9.6.1-1PGDG.rhel7.x86_64                   2/7
  Installing : python-psycopg2-2.6.2-3.rhel7.x86_64                         3/7
  Installing : python-argcomplete-0.3.7-1.rhel7.noarch                      4/7
  Installing : python-six-1.9.0-2.el7.noarch                                5/7
  Installing : 1:python-dateutil-2.5.3-3.rhel7.noarch                       6/7
  Installing : barman-2.1-1.rhel7.noarch                                    7/7
  Verifying  : python-psycopg2-2.6.2-3.rhel7.x86_64                         1/7
  Verifying  : python-six-1.9.0-2.el7.noarch                                2/7
  Verifying  : python-argcomplete-0.3.7-1.rhel7.noarch                      3/7
  Verifying  : postgresql96-libs-9.6.1-1PGDG.rhel7.x86_64                   4/7
  Verifying  : python-argh-0.23.0-1.rhel7.noarch                            5/7
  Verifying  : barman-2.1-1.rhel7.noarch                                    6/7
  Verifying  : 1:python-dateutil-2.5.3-3.rhel7.noarch                       7/7
 
Installed:
  barman.noarch 0:2.1-1.rhel7
 
Dependency Installed:
  postgresql96-libs.x86_64 0:9.6.1-1PGDG.rhel7
  python-argcomplete.noarch 0:0.3.7-1.rhel7
  python-argh.noarch 0:0.23.0-1.rhel7
  python-dateutil.noarch 1:2.5.3-3.rhel7
  python-psycopg2.x86_64 0:2.6.2-3.rhel7
  python-six.noarch 0:1.9.0-2.el7
Complete!

Everything is installed on both servers:

– PostgreSQL 9.6

– DMK (latest version)

– barman

Now we configure as follows:

The barman server is pg2: 192.168.1.101

The database server is pg1: 192.168.1.100

 

On the database server, we create a barman user:

postgres@:5432) [postgres] > create user barman superuser login encrypted password 
'barman';
CREATE ROLE

And a barman_streaming user:

postgres@: [postgres] > create user barman_streaming replication encrypted password 
'barman';
CREATE ROLE

We modify the following parameters: max_replication_slots (which specifies the maximum number of replication slots the server can support) and max_wal_senders (which specifies the maximum number of concurrently running WAL sender processes):

postgres@:5432) [postgres] > alter system set max_replication_slots=10;
ALTER SYSTEM
postgres@:5432) [postgres] > alter system set max_wal_senders=10;
ALTER SYSTEM

As these parameters have been modified, we need to restart the database. We use pgrestart, which is a DMK alias for pg_ctl -D ${PGDATA} restart -m fast:

postgres@pg1:/home/postgres/ [PG1] pgrestart
waiting for server to shut down.... done
server stopped
server starting
postgres@pg1:/home/postgres/ [PG1] 2017-02-06 15:59:14.756 CET - 1 - 17008 -  
- @ LOG:  redirecting log output to logging collector process
2017-02-06 15:59:14.756 CET - 2 - 17008 -  - 
@ HINT:  Future log output will appear in directory 
"/u01/app/postgres/admin/PG1/pg_log".

We modify the pg_hba.conf on the database server in order to allow connections from the barman server as follows:

host    all             barman          192.168.1.101/24       md5
host    replication     barman_streaming 192.168.1.101/24      md5
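
Changes to pg_hba.conf are only taken into account after a reload (or a restart), so we reload the configuration on the database server; a minimal sketch:

-- on pg1: make the new pg_hba.conf entries effective
select pg_reload_conf();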

We modify the .pgpass file on the barman server so that we are not prompted for passwords:

postgres@pg2:/home/postgres/ [pg96] cat .pgpass
*:*:*:postgres:postgres
192.168.1.100:*:*:barman:barman
192.168.1.100:*:*:barman_streaming:barman
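
Note that libpq ignores the .pgpass file if it is readable by group or others, so the permissions have to be restrictive; a quick sketch:

postgres@pg2:/home/postgres/ [pg96] chmod 600 .pgpass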

Finally we test the connection from the barman server to the database server:

postgres@pg2:/home/postgres/ [pg96] psql -c 'select version()'
 -U barman -h 192.168.1.100 -p 5432 postgres
                                                 version
 
--------------------------------------------------------------------------------

 PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (
Red Hat 4.8.5-11), 64-bit
(1 row)
postgres@pg2:/home/postgres/ [pg96] psql -U barman_streaming -h 192.168.1.100 
-p 5432 -c "IDENTIFY_SYSTEM" replication=1
      systemid       | timeline |  xlogpos  | dbname
---------------------+----------+-----------+--------
 6384063115439945376 |        1 | 0/F0006F0 |
(1 row)

Now it’s time to create a configuration file pg96.conf in $DMK_HOME/etc/barman.d on the barman server:

[pg96]
description =  "PostgreSQL 9.6 server"
conninfo = host=192.168.1.100 port=5432 user=barman dbname=postgres
backup_method = postgres
streaming_conninfo = host=192.168.1.100 port=5432 user=barman_streaming 
dbname=postgres
streaming_wals_directory = /u99/received_wal
streaming_archiver = on
slot_name = barman

We create a barman.conf file in $DMK_HOME/etc as follows, mainly defining the barman user, the directory containing the per-server configuration files, the barman backup home, the lock directory and the log directory:

postgres@pg2:/u01/app/postgres/local/dmk/etc/ [pg96] cat barman.conf
; Barman, Backup and Recovery Manager for PostgreSQL
; http://www.pgbarman.org/ - http://www.2ndQuadrant.com/
;
; Main configuration file
 
[barman]
; System user
barman_user = postgres
 
; Directory of configuration files. Place your sections in separate files 
with .conf extension
; For example place the 'main' server section in /etc/barman.d/main.conf
configuration_files_directory = /u01/app/postgres/local/dmk/etc/barman.d
 
; Main directory
barman_home = /u99/backup
 
; Locks directory - default: %(barman_home)s
barman_lock_directory = /u01/app/postgres/local/dmk/etc/
 
; Log location
log_file = /u01/app/postgres/local/dmk/log/barman.log
 
; Log level (see https://docs.python.org/3/library/logging.html#levels)
log_level = DEBUG
 
; Default compression level: possible values are None (default), 
bzip2, gzip, pigz, pygzip or pybzip2
compression = gzip
 
; Pre/post backup hook scripts
;pre_backup_script = env | grep ^BARMAN
;pre_backup_retry_script = env | grep ^BARMAN
;post_backup_retry_script = env | grep ^BARMAN
;post_backup_script = env | grep ^BARMAN
 
; Pre/post archive hook scripts
;pre_archive_script = env | grep ^BARMAN
;pre_archive_retry_script = env | grep ^BARMAN
;post_archive_retry_script = env | grep ^BARMAN
;post_archive_script = env | grep ^BARMAN
 
; Global retention policy (REDUNDANCY or RECOVERY WINDOW) - default empty
retention_policy = RECOVERY WINDOW OF 4 WEEKS
 
; Global bandwidth limit in KBPS - default 0 (meaning no limit)
;bandwidth_limit = 4000
 
; Immediate checkpoint for backup command - default false
;immediate_checkpoint = false
 
; Enable network compression for data transfers - default false
;network_compression = false
 
; Number of retries of data copy during base backup after an error - default 0
;basebackup_retry_times = 0
 
; Number of seconds of wait after a failed copy, before retrying - default 30
;basebackup_retry_sleep = 30
 
; Maximum execution time, in seconds, per server
; for a barman check command - default 30
;check_timeout = 30
 
; Time frame that must contain the latest backup date.
; If the latest backup is older than the time frame, barman check
; command will report an error to the user.
; If empty, the latest backup is always considered valid.
; Syntax for this option is: "i (DAYS | WEEKS | MONTHS)" where i is an
; integer > 0 which identifies the number of days | weeks | months of
; validity of the latest backup for this check. Also known as 'smelly backup'.
;last_backup_maximum_age =
 
; Minimum number of required backups (redundancy)
;minimum_redundancy = 1
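
With both configuration files in place, a quick sanity check (a sketch, output omitted) is to ask barman to list and describe the server it knows about; depending on how DMK wires the configuration, the -c /u01/app/postgres/local/dmk/etc/barman.conf option may be needed, as shown further down:

postgres@pg2:/home/postgres/ [pg96] barman list-server
postgres@pg2:/home/postgres/ [pg96] barman show-server pg96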

 

In order to enable streaming of transaction logs and to use replication slots, we run the following command on the barman server:

postgres@pg2:/u01/app/postgres/local/dmk/etc/ [pg96] barman receive-wal 
--create-slot pg96
Creating physical replication slot 'barman' on server 'pg96'
Replication slot 'barman' created
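
The slot itself lives on the database server; it can be checked there and will be reported as active once receive-wal is running. A minimal sketch:

-- on pg1
select slot_name, slot_type, active from pg_replication_slots;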

Then we can test. First, we force a log switch on the database server:

postgres@pg2:/u01/app/postgres/local/dmk/etc/ [pg96] barman switch-xlog 
--force pg96
The xlog file 00000001000000000000000F has been closed on server 'pg96'

 

We start receive-wal:

postgres@pg2:/u99/received_wal/ [pg96] barman -c 
/u01/app/postgres/local/dmk/etc/barman.conf receive-wal pg96
Starting receive-wal for server pg96
pg96: pg_receivexlog: starting log streaming at 0/68000000 (timeline 3)
pg96: pg_receivexlog: finished segment at 0/69000000 (timeline 3)
pg96: pg_receivexlog: finished segment at 0/6A000000 (timeline 3)
pg96: pg_receivexlog: finished segment at 0/6B000000 (timeline 3)
pg96: pg_receivexlog: finished segment at 0/6C000000 (timeline 3)
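
Started like this, receive-wal runs in the foreground and stops with the session. For permanent operation it is usually maintained by barman cron, typically scheduled every minute in the postgres crontab on the barman server (the path to the barman binary is an assumption here):

# crontab entry on the barman server
* * * * * /usr/bin/barman -c /u01/app/postgres/local/dmk/etc/barman.conf cron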

 

We can check the barman configuration:

postgres@pg2:/u99/restore_test/ [pg96] barman check pg96
Server pg96:
                    PostgreSQL: OK
                    superuser: OK
                    PostgreSQL streaming: OK
                    wal_level: OK
                    replication slot: OK
                    directories: OK
                    retention policy settings: OK
                    backup maximum age: OK (no last_backup_maximum_age provided)
                    compression settings: OK
                    failed backups: FAILED (there are 1 failed backups)
                    minimum redundancy requirements: OK (have 3 backups, 
                    expected at least 0)
                    pg_basebackup: OK
                    pg_basebackup compatible: OK
                    pg_basebackup supports tablespaces mapping: OK
                    pg_receivexlog: OK
                    pg_receivexlog compatible: OK
                    receive-wal running: OK
                    archiver errors: OK

We can run a barman archive-wal command:

postgres@pg2:/home/postgres/ [pg96] barman archive-wal pg96
Processing xlog segments from streaming for pg96
                    00000003.history
                    000000030000000000000067
                    000000030000000000000068

And finally you can run a backup with the command:

postgres@pg2:/home/postgres/ [pg96] barman backup pg96
Starting backup using postgres method for server pg96 in 
/u99/backup/pg96/base/20170214T103226
Backup start at xlog location: 0/69000060 (000000030000000000000069, 00000060)
Copying files.
Copy done.
Finalising the backup.
Backup size: 60.1 MiB
Backup end at xlog location: 0/6B000000 (00000003000000000000006A, 00000000)
Backup completed
Processing xlog segments from streaming for pg96
                    000000030000000000000069

We can list the backups:

postgres@pg2:/u02/pgdata/ [pg96] barman list-backup pg96
pg96 20170214T103226 - Tue Feb 14 09:32:27 2017 - Size: 60.2 MiB - WAL Size: 0 B 
(tablespaces: tab1:/u02/pgdata/PG1/mytab)
pg96 20170207T061338 - Tue Feb  7 06:19:38 2017 - Size: 29.0 MiB - WAL Size: 0 B
pg96 20170207T060633 - Tue Feb  7 06:12:33 2017 - Size: 29.0 MiB - WAL Size: 0 B
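
For more details on a given backup (begin/end WAL, size, tablespaces), barman show-backup can be used; a sketch, output omitted:

postgres@pg2:/home/postgres/ [pg96] barman show-backup pg96 20170214T103226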

 

We can test a restore, for example locally on the barman server:

postgres@pg2:/u02/pgdata/ [pg96] barman recover pg96 20170214T103226 
/u99/restore_test/
Starting local restore for server pg96 using backup 20170214T103226
Destination directory: /u99/restore_test/
                    24648, tab1, /u02/pgdata/PG1/mytab
Copying the base backup.
Copying required WAL segments.
Generating archive status files
Identify dangerous settings in destination directory.
 
IMPORTANT
These settings have been modified to prevent data losses
 
postgresql.conf line 71: archive_command = false
postgresql.auto.conf line 4: archive_command = false

Your PostgreSQL server has been successfully prepared for recovery, and the /u99/restore_test directory now contains:

postgres@pg2:/u99/restore_test/ [pg96] ll

total 64
-rw-------  1 postgres postgres  208 Feb 14 10:32 backup_label
-rw-------  1 postgres postgres  207 Feb 14 10:32 backup_label.old
drwx------ 10 postgres postgres   98 Feb 14 10:32 base
drwx------  2 postgres postgres 4096 Feb 14 10:32 global
drwx------  2 postgres postgres    6 Feb 14 10:32 mytab
drwx------  2 postgres postgres   17 Feb 14 10:32 pg_clog
drwx------  2 postgres postgres    6 Feb 14 10:32 pg_commit_ts
drwx------  2 postgres postgres    6 Feb 14 10:32 pg_dynshmem
-rw-------  1 postgres postgres 4416 Feb 14 10:32 pg_hba.conf
-rw-------  1 postgres postgres 4211 Feb 14 10:32 pg_hba.conf_conf
-rw-------  1 postgres postgres 1636 Feb 14 10:32 pg_ident.conf
drwx------  4 postgres postgres   65 Feb 14 10:32 pg_logical
drwx------  4 postgres postgres   34 Feb 14 10:32 pg_multixact
drwx------  2 postgres postgres   17 Feb 14 10:32 pg_notify
drwx------  2 postgres postgres    6 Feb 14 10:32 pg_replslot
drwx------  2 postgres postgres    6 Feb 14 10:32 pg_serial
drwx------  2 postgres postgres    6 Feb 14 10:32 pg_snapshots
drwx------  2 postgres postgres    6 Feb 14 10:32 pg_stat
drwx------  2 postgres postgres    6 Feb 14 10:32 pg_stat_tmp
drwx------  2 postgres postgres   17 Feb 14 10:32 pg_subtrans
drwx------  2 postgres postgres   18 Feb 14 10:32 pg_tblspc
drwx------  2 postgres postgres    6 Feb 14 10:32 pg_twophase
-rw-------  1 postgres postgres    4 Feb 14 10:32 PG_VERSION
drwx------  3 postgres postgres   81 Feb 14 10:39 pg_xlog
-rw-------  1 postgres postgres  391 Feb 14 10:39 postgresql.auto.conf
-rw-------  1 postgres postgres  358 Feb 14 10:32 postgresql.auto.conf.origin
-rw-------  1 postgres postgres 7144 Feb 14 10:39 postgresql.conf
-rw-------  1 postgres postgres 7111 Feb 14 10:32 postgresql.conf.origin
-rw-------  1 postgres postgres   56 Feb 14 10:32 recovery.done

If you need to restore your backup on the original pg1 database server, you have to use the --remote-ssh-command option (you specify the host to restore to and the target PGDATA directory).
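
This relies on passwordless SSH from the barman server to pg1 for the postgres user. If it is not already in place, a minimal sketch to set it up (default key names assumed):

postgres@pg2:/home/postgres/ [pg96] ssh-keygen -t rsa    # only if no key pair exists yet
postgres@pg2:/home/postgres/ [pg96] ssh-copy-id postgres@pg1

The remote restore then looks as follows: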

postgres@pg2:/home/postgres/.ssh/ [pg96] barman recover --remote-ssh-command "ssh postgres@pg1" pg96 20170214T103226 /u02/pgdata/PG1
Starting remote restore for server pg96 using backup 20170214T103226
Destination directory: /u02/pgdata/PG1
       24648, tab1, /u02/pgdata/PG1/mytab
Copying the base backup.
Copying required WAL segments.
Generating archive status files
Identify dangerous settings in destination directory.
 
IMPORTANT
These settings have been modified to prevent data losses
 
postgresql.conf line 71: archive_command = false
postgresql.auto.conf line 4: archive_command = false
 
Your PostgreSQL server has been successfully prepared for recovery!
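
Before the restored cluster can be used, it has to be started on pg1, either with the DMK aliases or directly with pg_ctl; a minimal sketch:

postgres@pg1:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/PG1 start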

You can also perform a point-in-time recovery (PITR).

In my PG1 database, I create a table employes and insert some data:

(postgres@[local]:5432) [blubb] > create table employes (name varchar(10));
CREATE TABLE
(postgres@[local]:5432) [blubb] > insert into employes values ('fiona');
INSERT 0 1
(postgres@[local]:5432) [blubb] > insert into employes values ('cathy');
INSERT 0 1
(postgres@[local]:5432) [blubb] > insert into employes values ('helene');
INSERT 0 1
(postgres@[local]:5432) [blubb] > select * from employes;
  name  
--------
 fiona
 cathy
 helene

A few minutes later I insert some more records in the employes table:

(postgres@[local]:5432) [blubb] > insert into employes values ('larry');
INSERT 0 1
(postgres@[local]:5432) [blubb] > insert into employes values ('bill');
INSERT 0 1
(postgres@[local]:5432) [blubb] > insert into employes values ('steve');
INSERT 0 1
(postgres@[local]:5432) [blubb] > select * from employes;
  name  
--------
 fiona
 cathy
 helene
 larry
 bill
 steve

The first rows were created at 15:15; let's see if the barman PITR restore works correctly.

I stop the PG1 database:

postgres@pg1:/u02/pgdata/ [PG1] pgstop
waiting for server to shut down....... done
server stopped

I delete the PGDATA directory:

postgres@pg1:/u02/pgdata/ [PG1] rm -rf PG1

And from the barman server I run the PITR recovery command using the --target-time argument:

postgres@pg2:/home/postgres/ [pg96] barman recover --remote-ssh-command "ssh postgres@pg1" pg96 
--target-time "2017-02-14 15:15:48"  20170214T141055 /u02/pgdata/PG1 
Starting remote restore for server pg96 using backup 20170214T141055
Destination directory: /u02/pgdata/PG1
Doing PITR. Recovery target time: '2017-02-14 15:15:48'
       24648, tab1, /u02/pgdata/PG1/mytab
Copying the base backup.
Copying required WAL segments.
Generating recovery.conf
Identify dangerous settings in destination directory.
 
IMPORTANT
These settings have been modified to prevent data losses
 
postgresql.conf line 72: archive_command = false
postgresql.auto.conf line 4: archive_command = false
 
Your PostgreSQL server has been successfully prepared for recovery!
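
As the output above shows ("Generating recovery.conf"), barman writes a recovery.conf into the destination directory that carries the recovery target. A quick way to double-check it before starting the instance (only the target line is shown here, the rest of the generated file depends on the barman version):

postgres@pg1:/u02/pgdata/PG1/ [PG1] grep recovery_target_time recovery.conf
recovery_target_time = '2017-02-14 15:15:48'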

I restart my PG1 database and the data are correctly restored, stopping just before the larry, bill and steve rows were inserted into the employes table:

(postgres@[local]:5432) [blubb] > select * from employes;
  name  
--------
 fiona
 cathy
 helene
(3 rows)

 

 

The article Postgres Barman and DMK appeared first on the dbi services blog.
