Yann Neuhaus

dbi services technical blog

SQL Server 2016: Distributed availability groups and Cross Cluster migration

Mon, 2016-10-24 06:43

How do you migrate an environment that includes availability groups from one Windows Failover Cluster to another? This scenario is definitely uncommon and requires good preparation. How to achieve this task depends mainly on your context: there are plenty of possible scenarios depending on the architecture in place and the customer constraints, for example the maximum downtime allowed. Among all possible scenarios, there is a process called the “cross-cluster migration scenario” that involves two Windows Failover Clusters side by side. In this blog post, I would like to focus on it, and I will then show the improvements SQL Server 2016 brings in this field.

Just to be clear, for the first scenario my intention is not to detail all the steps required to achieve a cross-cluster migration. You may refer to the whitepaper written by Microsoft here. To be honest, my feeling is that the documentation could be improved because it sometimes lacks specificity, but it has the great merit of existing. I remember navigating through this document, scrolling down and jumping back to the beginning several times using the bookmarks, before I was confident in my understanding of the whole process. Anyway, let’s apply this procedure to a real customer case.

In my experience, I can confirm that customers rarely have only one availability group installed in their environment. Most of the time, we encounter customer cases that include several SQL Server instances and replicas on the same cluster node. I will demonstrate this with the following infrastructure (similar to most customer shops):

 

blog 107 - 0 - distributed ag cross cluster migration use case

We have 3 availability groups, the first two hosted on the same SQL14 instance and the last one on the SQL142 instance. The availability group architecture runs on top of a Windows failover cluster (WSFC) – WIN201216CLUST – that includes two cluster nodes and a file share witness (FSW) not shown in the picture above. So this is a pretty common scenario at customer shops, as I said previously. Without going into the details of the customer case, the idea was to migrate the entire physical environment from the first datacenter (subnet 192.168.5.0) to a virtual environment in the second datacenter (subnet 192.168.50.0). As an aside, note that the customer’s actual subnets and IP ranges were not exactly the same as shown here, but this will help to set the scene.

So basically, according to the Microsoft documentation, the migration process is divided into 4 main steps:

  • Preparation of the new WSFC environment (no downtime)
  • Preparation of the migration plan (no downtime)
  • Data migration (no downtime)
  • Resource migration (downtime)

The main advantage of this procedure is probably the short outage that occurs during the last step (resource migration). The preparation steps are more comfortable because they do not require downtime. Migrating the data between two replicas is generally the most time-consuming part of the migration process, and here we are able to prepare the migration between the two WSFCs smoothly.

 

Preparation of the new WSFC

Several points in the process caught my attention. The first one concerns the preparation of the new WSFC environment (first step). Basically, we have to prepare the target environment that will host our existing availability groups, and Microsoft warns us about the number of nodes (temporary or not) required, depending on how the availability groups overlap and on the migration batches. During this preparation step, we also have to set the corresponding cluster registry permissions to allow the new replicas installed on the remote WSFC to switch their cluster context correctly. At first glance, I wondered why we have to perform such an operation, but the answer became obvious when I tried to switch the cluster context from my new SQL Server instance and faced the following error message:

ALTER SERVER CONFIGURATION SET HADR CLUSTER CONTEXT = 'WIN201216CLUST.dbi-services.test'

 

Msg 19411, Level 16, State 1, Line 1
The specified Windows Server Failover Clustering (WSFC) cluster,  ‘WIN201216CLUST.dbi-services.test’, is not ready to become the cluster context of AlwaysOn Availability Groups (Windows error code: 5).

The possible reason could be that the specified WSFC cluster is not up or that a security permissions issue was encountered. Fix the cause of the failure, and retry your ALTER SERVER CONFIGURATION SET HADR CLUSTER CONTEXT = ‘remote_wsfc_cluster_name’ command.

It seems that my SQL Server instance is trying, unsuccessfully, to read the registry hive of the remote WSFC in order to get a picture of the global configuration. As an aside, until the cluster context is switched, we are not able to add the new replicas to the existing availability group. If we put a procmon trace (from Sysinternals) on the primary cluster node, we may notice that executing the above command from the remote SQL Server instance triggers a read of the local HKLM\Cluster hive.
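As a side note, you can check which WSFC an instance currently uses as its cluster context by querying the sys.dm_hadr_cluster DMV. A minimal hedged sketch (once the switch succeeds, the cluster name returned should be the remote WSFC):

-- Which WSFC is the current cluster context of this instance?
SELECT cluster_name, quorum_type_desc, quorum_state_desc
FROM sys.dm_hadr_cluster;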

Well, after fixing the cluster permission issues by using the PowerShell script provided by Microsoft, we may add the concerned replicas to our existing AG configuration. The operation must be applied to all the replicas from the same WSFC. Following the Microsoft documentation, I then added two replicas in synchronous replication mode and another one in asynchronous mode.
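For reference, here is a hedged sketch of what adding one of those remote replicas looks like, run from the current primary; the server and endpoint names below are illustrative assumptions, not the exact ones of this environment:

-- Add a remote replica (hypothetical names) once its cluster context points to WIN201216CLUST
ALTER AVAILABILITY GROUP [DUMMY]
ADD REPLICA ON 'WIN2012NEW1\SQL14' WITH
(
	ENDPOINT_URL = 'TCP://WIN2012NEW1.dbi-services.test:5022',
	AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
	FAILOVER_MODE = MANUAL
);
GO

-- Then, on the new replica itself:
ALTER AVAILABILITY GROUP [DUMMY] JOIN;
GO

A quick look at the concerned DMVs confirms that everything is OK: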

SELECT	
	g.name as ag_name,
	r.replica_server_name as replica_name,
	rs.is_local,
	rs.role_desc AS [role],
	rs.connected_state_desc as connection_state,
	rs.synchronization_health_desc as sync_state
FROM sys.dm_hadr_availability_replica_states as rs
JOIN sys.availability_groups as g
	on g.group_id = rs.group_id
JOIN sys.availability_replicas as r
	on r.replica_id = rs.replica_id

 

blog 107 - 1 - distributed ag cross cluster migration 1

 

SELECT 
	g.name as ag_name,
	r.replica_server_name as replica_name,
	DB_NAME(drs.database_id) AS [db_name],
	drs.database_state_desc as db_state,
	drs.is_primary_replica,
	drs.synchronization_health_desc as sync_health,
	drs.synchronization_state_desc as sync_state
FROM sys.dm_hadr_database_replica_states as drs
JOIN sys.availability_groups as g
	on g.group_id = drs.group_id
JOIN sys.availability_replicas as r
	on r.replica_id = drs.replica_id
ORDER BY r.replica_server_name

 

blog 107 - 2 - distributed ag cross cluster migration 2

If you are curious like me, you may wonder how SQL Server deals with the two remote replicas. A quick look at the registry doesn’t give us any clue, but what about capturing the registry values that change after adding the new replicas? Regshot was a good tool to use in my case to track the changes between two registry snapshots:

blog 107 - 3 - distributed ag cross cluster migration 3

This is probably not an exhaustive list of added or modified keys, but this output provides a lot of useful information to understand what happens in the cluster registry hive when remote replicas are added. The concerned resource is identified by the id 9bb8b518-2d1a-4705-a378-86f282d387da, which corresponds to my DUMMY availability group, so it makes sense to notice some changes at this level.

blog 107 - 4 - distributed ag cross cluster migration 4

I may formulate some assumptions here: those registry changes are necessary to represent the complete picture of the new configuration (with ID 8B62AACE-6AFC-49B7-9369-590D5E832ED6). If we refer to the SQL Server error log, we can easily identify each replica server name by its corresponding hexadecimal value.

 

Resource migration

Resource migration is the critical part of the migration process because it introduces downtime, the duration of which depends mainly on the number of items to migrate. Migrating availability groups includes bringing each availability group offline, dropping each corresponding listener at the source, and then recreating the availability group configuration at the target. In other words, the more items you have to migrate, the longer this migration step might take.

We are also concerned with the availability group topology and the migration batches. Indeed, according to the Microsoft documentation, we may not change the HADR context of the targeted SQL Server instance until we have migrated all the related availability groups; otherwise the instance can no longer be used for the remaining replicas. To understand the importance of migration batches, think about the following scenario: the HADR context is set to the remote cluster and you have just finished migrating the first availability group. You then switch the HADR context back to local, but you forget to migrate the second availability group. At this point, reverting the HADR context to the remote cluster is not possible because the concerned replica is no longer eligible.

Since I used a minimal configuration that includes only two target replicas, as shown in the first picture (WSFC at the destination), I have to group at least the DUMMY and DUMMY2 availability groups into one batch. The DUMMY3 availability group may be migrated as a separate batch.

So basically, the steps to perform the resource migration are as follows:

  • Stop application connectivity on each availability group
  • Bring each availability group offline (ALTER AVAILABILITY GROUP OFFLINE) and drop the corresponding listener (see the sketch after this list)
  • Set the HADR context to local for each cluster node
  • Recreate each availability group and the corresponding listener with the new configuration
  • Validate application connectivity
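As a hedged illustration of the T-SQL behind the offline and HADR context steps (the listener and group names are those of my lab, the ordering is one possible variant, and the full recreation script comes from the Microsoft whitepaper):

-- On the original primary: drop the listener, then bring the group offline
ALTER AVAILABILITY GROUP [DUMMY] REMOVE LISTENER 'LST-DUMMY';
ALTER AVAILABILITY GROUP [DUMMY] OFFLINE;
GO

-- On each destination instance whose context was switched to the old WSFC:
ALTER SERVER CONFIGURATION SET HADR CLUSTER CONTEXT = LOCAL;
GO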

The migration script provided by Microsoft helps a lot in generating the availability group definitions, but it has some restrictions. Indeed, the script doesn’t retain the exact configuration of the old environment, such as the previous replication mode or the backup policy. This is expected behavior and, according to the Microsoft documentation, it is up to you to restore the existing configuration at the destination.

Finally, after performing the last migration step, here is the final configuration of my DUMMY availability group:

blog 107 - 5 - distributed ag cross cluster migration 5

blog 107 - 6 - distributed ag cross cluster migration 6

 

So what about SQL Server 2016 in the context of cross-cluster migration?

Starting with SQL Server 2016, distributed availability groups are probably the way to go. I already introduced this feature in a previous blog, and I would like to show you how interesting it is in this context. Well, let’s first go back to the initial situation. We basically have to perform the exact same steps as in the first scenario, but distributed availability groups drastically reduce the complexity of the entire process.

  • Preparation of the new WSFC environment (no downtime) – We no longer need to grant cluster registry hive permissions to the SQL Server service account, nor to switch the HADR context to the remote cluster
  • Preparation of the new availability groups on the destination WSFC, including the corresponding listeners (no downtime) – We no longer need to take migration batches into account in the migration plan
  • Data migration between the availability groups (no downtime)
  • Application redirection to the new availability groups (downtime) – We no longer need to switch the HADR context back to the local nodes, nor to recreate the different availability group configurations at this level

In short, migrating availability groups across WSFCs with SQL Server 2016 requires less effort and a shorter downtime.

Here is the new scenario after preparing the new WSFC environment and setting up data migration between the availability groups on both sides:

blog 107 - 7 - distributed ag cross cluster migration 7

 

During the first phase, I prepared a new Windows Failover Cluster, WIN2012162CLUST, which hosts the empty availability groups DUMMY10, DUMMY20 and DUMMY30. Those availability groups act as new containers when implementing the distributed availability groups (respectively TEMPDB_DUMMY_AG, TEMPDB_DUMMY2_AG and TEMPDB_DUMMY3_AG). You may notice that I configured ASYNC replication mode between the local and remote availability groups but, in our context, synchronous replication remains a viable option.
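For illustration, here is a hedged sketch of how one of those distributed availability groups can be created; the listener URLs and the endpoint port 5022 are assumptions based on a default database mirroring endpoint:

-- On the primary replica of DUMMY (source WSFC)
CREATE AVAILABILITY GROUP [TEMPDB_DUMMY_AG]
WITH (DISTRIBUTED)
AVAILABILITY GROUP ON
	'DUMMY' WITH
	(
		LISTENER_URL = 'TCP://LST-DUMMY.dbi-services.test:5022',
		AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
		FAILOVER_MODE = MANUAL,
		SEEDING_MODE = AUTOMATIC
	),
	'DUMMY10' WITH
	(
		LISTENER_URL = 'TCP://LST-DUMMY10.dbi-services.test:5022',
		AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
		FAILOVER_MODE = MANUAL,
		SEEDING_MODE = AUTOMATIC
	);
GO

-- On the primary replica of DUMMY10 (destination WSFC), join with the same options
ALTER AVAILABILITY GROUP [TEMPDB_DUMMY_AG]
JOIN AVAILABILITY GROUP ON
	'DUMMY' WITH (LISTENER_URL = 'TCP://LST-DUMMY.dbi-services.test:5022', AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, FAILOVER_MODE = MANUAL, SEEDING_MODE = AUTOMATIC),
	'DUMMY10' WITH (LISTENER_URL = 'TCP://LST-DUMMY10.dbi-services.test:5022', AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, FAILOVER_MODE = MANUAL, SEEDING_MODE = AUTOMATIC);
GO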

 

blog 107 - 8 - distributed ag cross cluster migration 8

 

A quick look at sys.dm_hadr_database_replica_states confirms that all the distributed availability groups are working well (TEMPDB_DUMMY3_AG is not included in the picture below).

blog 107 - 9 - distributed ag cross cluster migration 9
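On SQL Server 2016, sys.availability_groups also exposes an is_distributed flag, so a hedged variant of the earlier replica-state query can list the distributed groups themselves (for a distributed availability group, replica_server_name contains the names of the underlying availability groups):

SELECT
	g.name as ag_name,
	g.is_distributed,
	r.replica_server_name,
	rs.role_desc AS [role],
	rs.synchronization_health_desc as sync_state
FROM sys.dm_hadr_availability_replica_states as rs
JOIN sys.availability_groups as g
	on g.group_id = rs.group_id
JOIN sys.availability_replicas as r
	on r.replica_id = rs.replica_id
WHERE g.is_distributed = 1;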

At this step, the availability groups on the remote WSFC (DUMMY10 and DUMMY20) cannot be accessed and are used only as standbys, waiting to be promoted as the new primaries.

In the last step (resource migration), here are the steps we have to execute:

  • Fail over all the availability groups to the remote WSFC by using the distributed availability group capabilities
  • Redirect applications to point to the new listener

And that’s all!

Regarding the last point, we may use different approaches, and here is mine in this context. We don’t want to modify the application connection strings, which would lead to extra steps on the application side. The listeners previously created with the remote availability groups may be considered technical listeners whose only purpose is to achieve the cross-cluster migration through distributed availability groups. Once this operation is done, we may reuse the old listener names in the new environment and achieve an almost transparent application redirection.

Another important thing I have to cover is the distributed availability group failover behavior. After digging into several failover tests, I noticed that triggering a failover from a distributed availability group moves the primary replica of each secondary availability group to the PRIMARY role but, unfortunately, does not automatically switch the primary of each primary availability group to SECONDARY as expected. In the worst case, this situation may lead to a split-brain scenario. The quick workaround consists in forcing the primary availability group to the SECONDARY role before initiating the failover process. Let me demonstrate a little bit:

/* ========= 0- DEMOTE OLD LOCAL AGS TO SECONDARY TO AVOID UPDATE ACTIVITY  ========= 
   =========    FROM THE OLD PRIMARIES (SPLIT-BRAIN SCENARIO)               ========= */

:CONNECT LST-DUMMY

ALTER AVAILABILITY GROUP [TEMPDB_DUMMY_AG] SET (ROLE = SECONDARY);
ALTER AVAILABILITY GROUP [TEMPDB_DUMMY2_AG] SET (ROLE = SECONDARY);
GO


/* ========= 1- FORCE FAILOVER DISTRIBUTED AVAILABILITY GROUPS ========= */

:CONNECT LST-DUMMY10

USE master;
GO

ALTER AVAILABILITY GROUP [TEMPDB_DUMMY_AG] FORCE_FAILOVER_ALLOW_DATA_LOSS;
GO

ALTER AVAILABILITY GROUP [TEMPDB_DUMMY2_AG] FORCE_FAILOVER_ALLOW_DATA_LOSS;
GO

The new situation is as follows:

blog 107 - 10 - distributed ag cross cluster migration 10

Let’s try to connect to the old primary availability group. Accessing the DUMMY database is no longer permitted, as expected:

Msg 976, Level 14, State 1, Line 1
The target database, ‘DUMMY’, is participating in an availability group and is currently
not accessible for queries. Either data movement is suspended or the availability replica is not
enabled for read access. To allow read-only access to this and other databases in the availability
group, enable read access to one or more secondary availability replicas in the group. 
For more information, see the ALTER AVAILABILITY GROUP statement in SQL Server Books Online.

We now have to reclaim the old listener names for the new primary availability groups. In my case, I decided to drop the old availability groups in order to completely remove the availability group configuration from the primary WSFC and to make the concerned databases definitively inaccessible as well (RESTORING state).

/* ========= 3- DROP LISTENERS FROM THE OLD AVAILABILITY GROUPS  ========= */

:CONNECT WIN20121SQL16\SQL16

USE master;
GO

DROP AVAILABILITY GROUP [DUMMY];
GO

DROP AVAILABILITY GROUP [DUMMY2];
GO

 

Finally, I can drop the technical listeners and recreate the application listeners by using the following script:

/* ========= 4- DROP TECHNICAL LISTENERS FROM THE NEW AVAILABILITY GROUPS  =========
   =========    ADD APPLICATION LISTENERS WITH NEW CONFIG                  ========= */
                      

:CONNECT WIN20122SQL16\SQL16

USE master;
GO

ALTER AVAILABILITY GROUP [DUMMY10] 
REMOVE LISTENER 'LST-DUMMY10';
GO

ALTER AVAILABILITY GROUP [DUMMY20] 
REMOVE LISTENER 'LST-DUMMY20';
GO

ALTER AVAILABILITY GROUP [DUMMY10]
ADD LISTENER 'LST-DUMMY' (WITH IP ((N'192.168.50.35', N'255.255.255.0')), PORT=1433);
GO

ALTER AVAILABILITY GROUP [DUMMY20]
ADD LISTENER 'LST-DUMMY2' (WITH IP ((N'192.168.50.36', N'255.255.255.0')), PORT=1433);
GO

 

Et voilà!

 

Final thoughts

Cross-cluster migration is definitely a complex process, regardless of the SQL Server 2016 improvements. However, as we’ve seen in this blog post, the latest SQL Server version may reduce the overall complexity of the different migration steps. Personally, as a DBA, I don’t like custom registry modifications that directly impact the WSFC level (required in the first migration model), because they may introduce some anxiety and unexpected events. SQL Server 2016 provides a more secure way through distributed availability groups and keeps all migration steps at the SQL Server level, which makes me more confident in the migration process.

Happy cross-cluster migration!

 

 

 

 


Documentum story – Attempt to fetch object with handle 3c failed

Mon, 2016-10-24 02:00

Some time ago, I was preparing an upgrade of a Content Server. Everything was working fine and I was about to begin but, just before that, I checked our monitoring interface for this environment and saw the following alerts coming from the log file of the docbases installed on this CS:

2016-04-05T10:47:01.411651      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"

 

As you might know, an object with an ID starting with “3c” represents the dm_docbase_config object, and therefore I was a little bit afraid when I saw this alert. So the first thing I did was to open an iapi session to see if the docbase was responding:

[dmadmin@content_server_01 log]$ iapi DOCBASE1 -Udmadmin -Pxxx


        EMC Documentum iapi - Interactive API interface
        (c) Copyright EMC Corp., 1992 - 2015
        All rights reserved.
        Client Library Release 7.2.0050.0084


Connecting to Server using docbase DOCBASE1
[DM_SESSION_I_SESSION_START]info:  "Session 013f245a800d7441 started for user dmadmin."


Connected to Documentum Server running Release 7.2.0050.0214  Linux64.Oracle
Session id is s0
API> retrieve,c,dm_docbase_config
...
3c3f245a60000210
API> exit
Bye

 

OK, so apparently the docbase is working properly: we can access D2 and DA, idql is also working, and even the dm_docbase_config object can be retrieved. But the exact same error still appears in the log file, exactly every 5 minutes. So I took a closer look at the error message, because I didn’t want to scroll through hundreds of lines for maybe nothing. The string containing the object ID wouldn’t really help to find the root cause inside the log file, and the same applies to the DM_OBJ_MGR_E_FETCH_FAIL error: it just prints the exact same message again and again with different timestamps. Thinking about it, I realised that the 2 or 3 error lines I could see on my screen were completely identical – except for the timestamps – and that includes the process ID throwing this error (second column of the log file).

 

With this new information, I tried to find all log entries related to this process ID:

[dmadmin@content_server_01 log]$ grep "21425\[21425\]" $DOCUMENTUM/dba/log/DOCBASE1.log | more
2016-04-04T06:58:09.315163      21425[21425]    0000000000000000        [DM_SERVER_I_START_SERVER]info:  "Docbase DOCBASE1 attempting to open"
2016-04-04T06:58:09.315282      21425[21425]    0000000000000000        [DM_SERVER_I_START_KEY_STORAGE_MODE]info:  "Docbase DOCBASE1 is using database for cryptographic key storage"
2016-04-04T06:58:09.315316      21425[21425]    0000000000000000        [DM_SERVER_I_START_SERVER]info:  "Docbase DOCBASE1 process identity: user(dmadmin)"
2016-04-04T06:58:11.400017      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Post Upgrade Processing."
2016-04-04T06:58:11.401753      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Base Types."
2016-04-04T06:58:11.404252      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmRecovery."
2016-04-04T06:58:11.412344      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmACL."
2016-04-04T06:58:11.438249      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmDocbaseIdMap."
2016-04-04T06:58:11.447435      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Error log streams."
2016-04-04T06:58:11.447915      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmUser."
2016-04-04T06:58:11.464912      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmGroup."
2016-04-04T06:58:11.480200      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmSysObject."
2016-04-04T06:58:11.515201      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmExprCode."
2016-04-04T06:58:11.524604      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmKey."
2016-04-04T06:58:11.533883      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmValueAssist."
2016-04-04T06:58:11.541708      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmValueList."
2016-04-04T06:58:11.551492      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmValueQuery."
2016-04-04T06:58:11.559569      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmValueFunc."
2016-04-04T06:58:11.565830      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmExpression."
2016-04-04T06:58:11.594764      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmLiteralExpr."
2016-04-04T06:58:11.603279      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmBuiltinExpr."
2016-04-04T06:58:11.625736      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmFuncExpr."
2016-04-04T06:58:11.636930      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmCondExpr."
2016-04-04T06:58:11.663622      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmCondIDExpr."
2016-04-04T06:58:11.707363      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmDDInfo."
2016-04-04T06:58:11.766883      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmScopeConfig."
2016-04-04T06:58:11.843335      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmDisplayConfig."
2016-04-04T06:58:11.854414      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmNLSDDInfo."
2016-04-04T06:58:11.878566      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmDomain."
2016-04-04T06:58:11.903844      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmAggrDomain."
2016-04-04T06:58:11.929480      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmMountPoint."
2016-04-04T06:58:11.957705      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmLocation."
2016-04-04T06:58:12.020403      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Server Configuration."
2016-04-04T06:58:12.135418      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmPolicy."
2016-04-04T06:58:12.166923      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmDDCommonInfo."
2016-04-04T06:58:12.196057      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmDDAttrInfo."
2016-04-04T06:58:12.238040      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmDDTypeInfo."
2016-04-04T06:58:12.269202      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmRelation."
2016-04-04T06:58:12.354573      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmForeignKey."
2016-04-04T06:58:12.387309      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmEvent."
2016-04-04T06:58:12.403895      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize DQL."
2016-04-04T06:58:12.405622      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmFolder."
2016-04-04T06:58:12.433583      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmDocument."
2016-04-04T06:58:12.480234      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Plugin Type."
2016-04-04T06:58:12.490196      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmNote."
2016-04-04T06:58:12.518305      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmComposite."
2016-04-04T06:58:12.529351      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmStorage."
2016-04-04T06:58:12.539944      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Common area."
2016-04-04T06:58:12.612097      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize File Store."
2016-04-04T06:58:12.675604      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Optical Store."
2016-04-04T06:58:12.717573      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Linked Stores."
2016-04-04T06:58:12.803227      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Distributed Stores."
2016-04-04T06:58:13.102632      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Blob Stores."
2016-04-04T06:58:13.170074      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize External Store."
2016-04-04T06:58:13.242012      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize External File Store."
2016-04-04T06:58:13.305767      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize External URL."
2016-04-04T06:58:13.363407      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize External Free."
2016-04-04T06:58:13.429547      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmContent."
2016-04-04T06:58:13.461400      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmiSubContent."
2016-04-04T06:58:13.508588      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmOutputDevice."
2016-04-04T06:58:13.630872      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmMethod."
2016-04-04T06:58:13.689265      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmRouter."
2016-04-04T06:58:13.733289      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmFederation."
2016-04-04T06:58:13.807554      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmAliasSet."
2016-04-04T06:58:13.871634      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Content Addressable Storage."
2016-04-04T06:58:13.924874      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Formats."
2016-04-04T06:58:13.995154      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Convert."
2016-04-04T06:58:13.998050      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Indices."
2016-04-04T06:58:14.025587      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Dump Files."
2016-04-04T06:58:14.107689      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Load Files."
2016-04-04T06:58:14.176232      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize In Box."
2016-04-04T06:58:14.225954      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Distributed References."
2016-04-04T06:58:14.292782      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Client Registrations."
2016-04-04T06:58:14.319699      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Client Rights."
2016-04-04T06:58:14.330420      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Client Rights Domain."
2016-04-04T06:58:14.356866      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Business Activities."
2016-04-04T06:58:14.411545      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Business Processes."
2016-04-04T06:58:14.444264      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Packages."
2016-04-04T06:58:14.493478      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Workitems."
2016-04-04T06:58:14.521836      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Workflows."
2016-04-04T06:58:14.559605      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Audit Trail."
2016-04-04T06:58:14.639303      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Clean Old Links."
2016-04-04T06:58:14.640511      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Compute Internal Type Tag Cache."
2016-04-04T06:58:14.696040      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize LastActionProcs."
2016-04-04T06:58:14.696473      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize User Types."
2016-04-04T06:58:14.696828      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Start Up - Phase 1."
2016-04-04T06:58:15.186812      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Start Up - Phase 2."
2016-04-04T06:58:15.666149      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Crypto Objects."
2016-04-04T06:58:15.677119      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Queue Views."
2016-04-04T06:58:15.678343      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Port Info Views."
2016-04-04T06:58:15.679532      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Global Cache Finalization."
2016-04-04T06:58:16.939100      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize National Language Character Translation."
2016-04-04T06:58:16.941815      21425[21425]    0000000000000000        [DM_CHARTRANS_I_TRANSLATOR_OPENED]info:  "Translator in directory ($DOCUMENTUM/product/7.2/install/external_apps/nls_chartrans) was added succesfully initialized.  Translator specifics: (Chararacter Translator: , Client Locale: (Windows :(4099), Version: 4.0), CharSet: ISO_8859-1, Language: English_US, UTC Offset: 0, Date Format:%2.2d/%2.2d/%2.2d %2.2d:%2.2d:%2.2d, Java Locale:en, Server Locale: (Linux :(8201), Version: 2.4), CharSet: UTF-8, Language: English_US, UTC Offset: 0, Date Format:%2.2d/%2.2d/%2.2d %2.2d:%2.2d:%2.2d, Java Locale:en, Shared Library: $DOCUMENTUM/product/7.2/install/external_apps/nls_chartrans/unitrans.so)"
2016-04-04T06:58:16.942377      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize LDAP setup."
2016-04-04T06:58:16.961767      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Distributed change-checking."
2016-04-04T06:58:17.022448      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Fulltext Plugin and Configuration."
2016-04-04T06:58:17.113147      21425[21425]    0000000000000000        [DM_FULLTEXT_T_QUERY_PLUGIN_VERSION]info:  "Loaded FT Query Plugin: $DOCUMENTUM/product/7.2/bin/libDsearchQueryPlugin.so, API Interface version: 1.0, Build number: HEAD; Sep 14 2015 07:48:06, FT Engine version: xPlore version 1.5.0020.0048"
2016-04-04T06:58:17.122313      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Distributed Content."
2016-04-04T06:58:17.125027      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Distributed Content Map."
2016-04-04T06:58:17.125467      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Distributed Content Digital Signatures."
2016-04-04T06:58:17.621156      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Acs Config List."
2016-04-04T06:58:17.621570      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmLiteSysObject."
2016-04-04T06:58:17.623010      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmBatchManager."
2016-04-04T06:58:17.624369      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Partition Scheme."
2016-04-04T06:58:17.627552      21425[21425]    0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Authentication Plugins."
2016-04-04T06:58:17.631207      21425[21425]    0000000000000000        [DM_SESSION_I_AUTH_PLUGIN_LOADED]info:  "Loaded Authentication Plugin with code 'dm_krb' ($DOCUMENTUM/dba/auth/libkerberos.so)."
2016-04-04T06:58:17.631480      21425[21425]    0000000000000000        [DM_SESSION_I_AUTH_PLUGIN_LOAD_INIT]info:  "Authentication plugin ( 'dm_krb' ) was disabled. This is expected if no keytab file(s) at location ($DOCUMENTUM/dba/auth/kerberos).Please refer the content server installation guide."
2016-04-04T06:58:17.638885      21425[21425]    0000000000000000        [DM_SERVER_I_START_SERVER]info:  "Docbase DOCBASE1 opened"
2016-04-04T06:58:17.639005      21425[21425]    0000000000000000        [DM_SERVER_I_SERVER]info:  "Setting exception handlers to catch all interrupts"
2016-04-04T06:58:17.639043      21425[21425]    0000000000000000        [DM_SERVER_I_START]info:  "Starting server using service name:  DOCBASE1"
2016-04-04T06:58:17.810637      21425[21425]    0000000000000000        [DM_SERVER_I_LAUNCH_MTHDSVR]info:  "Launching Method Server succeeded."
2016-04-04T06:58:17.818319      21425[21425]    0000000000000000        [DM_SERVER_I_LISTENING]info:  "The server is listening on network address (Service Name: DOCBASE1_s, Host Name: content_server_01 :V4 IP)"
2016-04-04T06:58:17.821615      21425[21425]    0000000000000000        [DM_SERVER_I_IPV6_DISABLED]info:  "The server can not listen on IPv6 address because the operating system does not support IPv6"
2016-04-04T06:58:19.301490      21425[21425]    0000000000000000        [DM_WORKFLOW_I_AGENT_START]info:  "Workflow agent master (pid : 21612, session 013f245a80000007) is started sucessfully."
2016-04-04T06:58:19.302601      21425[21425]    0000000000000000        [DM_WORKFLOW_I_AGENT_START]info:  "Workflow agent worker (pid : 21613, session 013f245a8000000a) is started sucessfully."
2016-04-04T06:58:20.304937      21425[21425]    0000000000000000        [DM_WORKFLOW_I_AGENT_START]info:  "Workflow agent worker (pid : 21626, session 013f245a8000000b) is started sucessfully."
2016-04-04T06:58:21.307256      21425[21425]    0000000000000000        [DM_WORKFLOW_I_AGENT_START]info:  "Workflow agent worker (pid : 21639, session 013f245a8000000c) is started sucessfully."
2016-04-04T06:58:22.307448      21425[21425]    0000000000000000        [DM_SERVER_I_START]info:  "Sending Initial Docbroker check-point "
2016-04-04T06:58:22.325337      21425[21425]    0000000000000000        [DM_MQ_I_DAEMON_START]info:  "Message queue daemon (pid : 21655, session 013f245a80000456) is started sucessfully."

 

OK, so these are the first and second pages I got as a result, and that’s actually when I realised that the process I was talking about above was in fact the docbase process itself… So I displayed the third page of the more command and got the following:

2016-04-05T10:04:28.305238      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_CURSOR_FAIL]error:  "In operation Exec an attempt to create cursor failed; query was: 'SELECT * FROM DM_DOCBASE_CONFIG_RV dm_dbalias_B , DM_DOCBASE_CONFIG_SV dm_dbalias_C  WHERE (dm_dbalias_C.R_OBJECT_ID=:dmb_handle AND dm_dbalias_C.R_OBJECT_ID=dm_dbalias_B.R_OBJECT_ID) ORDER BY dm_dbalias_B.R_OBJECT_ID,dm_dbalias_B.I_POSITION'; error from database system was: ORA-03114: not connected to ORACLE"
2016-04-05T10:04:28.305385      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:04:28.317505      21425[21425]    0000000000000000        [DM_SESSION_I_RETRYING_DATABASE_CONNECTION]info:  "The following error was encountered trying to get a database connection:  ORA-12541: TNS:no listener
2016-04-05T10:04:28.317591      21425[21425]    013f245a80000002        [DM_SESSION_I_RETRYING_DATABASE_CONNECTION]info:  "The following error was encountered trying to get a database connection:  ORA-12541: TNS:no listener
2016-04-05T10:04:58.329725      21425[21425]    0000000000000000        [DM_SESSION_I_RETRYING_DATABASE_CONNECTION]info:  "The following error was encountered trying to get a database connection:  ORA-12541: TNS:no listener
2016-04-05T10:04:58.329884      21425[21425]    013f245a80000002        [DM_SESSION_I_RETRYING_DATABASE_CONNECTION]info:  "The following error was encountered trying to get a database connection:  ORA-12541: TNS:no listener
2016-04-05T10:05:28.339052      21425[21425]    0000000000000000        [DM_SESSION_I_RETRYING_DATABASE_CONNECTION]info:  "The following error was encountered trying to get a database connection:  ORA-12541: TNS:no listener
2016-04-05T10:05:28.339143      21425[21425]    013f245a80000002        [DM_SESSION_I_RETRYING_DATABASE_CONNECTION]info:  "The following error was encountered trying to get a database connection:  ORA-12541: TNS:no listener
2016-04-05T10:05:49.077076      21425[21425]    0000000000000000        [DM_SESSION_I_RETRYING_DATABASE_CONNECTION]info:  "The following error was encountered trying to get a database connection:  ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
2016-04-05T10:05:49.077163      21425[21425]    013f245a80000002        [DM_SESSION_I_RETRYING_DATABASE_CONNECTION]info:  "The following error was encountered trying to get a database connection:  ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
2016-04-05T10:06:23.461495      21425[21425]    013f245a80000002        [DM_SESSION_W_RESTART_AGENT_EXEC]warning:  "The agent exec program has stopped running.  It will be restarted."
2016-04-05T10:06:48.830854      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:11:52.533340      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:16:52.574766      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:21:52.546389      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:26:52.499108      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:31:52.232095      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:36:57.700202      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:42:01.198050      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:47:01.411651      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:52:02.242612      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T10:57:11.886518      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T11:02:13.133405      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"
2016-04-05T11:07:15.364236      21425[21425]    0000000000000000        [DM_OBJ_MGR_E_FETCH_FAIL]error:  "attempt to fetch object with handle 3c3f245a60000210 failed"

 

As you can see above, the first fetch error occurred at “2016-04-05T10:04:28.305385”, which is actually 0.000147s (≈0.15ms) after an error while trying to execute a SQL query on the database for the Exec operation… So these must be linked. The database errors stopped about one minute after the first one, so the DB was available again; I quickly verified that using sqlplus.

 

We opened an SR on the EMC website to work on it with them but, as far as I know, no solution was found. The only workaround we found is that a simple restart of the docbase cleans up these errors from the docbase log file, and they do not appear afterwards. But it is pretty annoying to restart the docbase while it is actually working properly (jobs are running, DFC clients are OK, and so on) just because one error message printed every 5 minutes in the log file is flooding our monitoring tool.

 

The problem with this error is that it can happen frequently if the network isn’t really reliable, if the database listener isn’t always responsive, or if anything else prevents Documentum from reaching the database while it is doing something with the dm_docbase_config objects… Something we haven’t tried yet is to re-initialize the Content Server, to see if it can help restore a proper log file. I think I’m going to try that the next time this issue occurs!

Edit: Re-initializing the Content Server doesn’t help :(.

 


2 x ODA X6-2S + Dbvisit Standby: Easy DR in SE

Fri, 2016-10-21 11:28

What do Standard Edition, simplicity, reliability, high performance, and an affordable price have in common?
Dbvisit standby can be an answer, because it brings Disaster Recovery to Standard Edition without adding complexity.
ODA Lite (the new X6-2S and 2M) is another answer, because you can run Standard Edition on those new appliances.
So it makes sense to bring them together, which is what I did recently at a customer.

I’ll not cover the reasons and the results here, as this will be done later. I’m just sharing a few tips to set up the following configuration: two ODA X6-2S running 12c Standard Edition databases, protected by Dbvisit standby across two datacenters.

ODA repository

ODA X6 comes with a new interface to provision databases from the command line (odacli) or the GUI (https://oda:7093/mgmt/index.html). It’s a layer over the tools we usually use: it calls dbca behind the scenes. What it does in addition is log what has been done in a Java DB repository.

What is done is logged in /opt/oracle/dcs/log/dcs-agent.log:
2016-10-13 15:33:59,816 DEBUG [Database Creation] [] c.o.d.a.u.CommonUtils: run: cmd= '[su, -, oracle, -c, export PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin:/u01/app/oracle/product/12.1.0.2/dbhome_2/bin; export ORACLE_SID=MYNEWDB; export ORACLE_BASE=/u01/app/oracle; export ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_2; export PWD=******** /u01/app/oracle/product/12.1.0.2/dbhome_2/bin/dbca -createDatabase -silent -gdbName MYNEWDB.das.ch -sid MYNEWDB -sysPassword ******* -systemPassword ******* -dbsnmpPassword ******* -asmSysPassword ******* -storageType ASM -datafileJarLocation /u01/app/oracle/product/12.1.0.2/dbhome_2/assistants/dbca/templates -emConfiguration DBEXPRESS -databaseConfType SINGLE -createAsContainerDatabase false -characterSet WE8MSWIN1252 -nationalCharacterSet AL16UTF16 -databaseType MULTIPURPOSE -responseFile NO_VALUE -templateName seed_noncdb_se2016-10-13_15-33-59.0709.dbc -initParams "db_recovery_file_dest_size=174080,db_unique_name=MYNEWDB" -recoveryAreaDestination /u03/app/oracle/fast_recovery_area/]'

Do I like it? Actually, I don’t, for two reasons. The first reason is that I don’t want to learn a new syntax every year. I have known CREATE DATABASE for decades and DBCA for years; I just prefer to use those.
The second reason is that if you want to add a layer on top of something, you need to provide at least the same functionality and the same quality as the tool you call behind the scenes. If you provide a command to create a database, then you must provide a command to delete it, even if the creation has failed. I created a database whose creation failed: the reason was that I had changed the listener port, but the template explicitly sets local_listener to port 1521. Fortunately it calls DBCA and I know where the logs are. So my ODA repository had a database in failed status. The problem is that you can’t drop it (it doesn’t exist for DBCA) and you cannot re-create it (it exists for ODA). I’m not a developer, but when I write code I try to handle exceptions. At the very least, they should implement a ‘force’ mode where errors are ignored when deleting something that does not exist.

So if you have the same problem, here is what I did:

  • Open an SR, in the hope that they understand there’s something to fix in their code without asking me to upload all the log files
  • create a database with same name, directly with DBCA, then drop it with ODACLI

Finally, my workaround worked, while Oracle Support came up with two solutions: create the database with another name, or re-image the ODA!

But, when it doesn’t fail, the creation is very fast: it starts from templates that include datafiles, and the datafiles land on those very fast NVMe SSDs.

Create the standby

I don’t like this additional layer, but I have the feeling that it’s better when the ODA repository knows about my databases. The standby database is created with the Dbvisit interface (a truly user-friendly interface, where errors are handled and you even have the possibility to resume a creation that failed). So how do we make it appear in the ODA repository?

I see 3 possibilities.

odacli has a “--register-database” option to register an already created database. But that probably does too much, because it was designed to register databases created on previous ODAs with oakcli.

odacli also has an “--instanceonly” option, which is there to register a standby database that will be created later, with an RMAN duplicate for example. Again, this does too much, as it creates an instance. I tried it and didn’t have the patience to make it work: when ODACLI encounters a problem, it doesn’t explain what’s wrong but just shows the command-line help.

Finally, what I did was create a database with ODACLI and then drop it (outside of ODACLI). This is ugly, but it’s the only way I found where I understand exactly what is done. This is where I encountered the issue above, so my workflow was actually: create from ODACLI -> fails -> drop from DBCA -> re-create from ODACLI -> success -> drop.

I didn’t drop it from DBCA because I wanted to keep the entry in ORATAB. I did it from RMAN:

RMAN> startup force dba mount
RMAN> drop database including backups noprompt;

Then there was no problem creating the standby from the Dbvisit GUI.

Create a filesystem

I’ve created the databases directly in ASM. I don’t see any reason to create an ACFS volume for them, especially in Standard Edition where you cannot use ACFS snapshots. It’s just a performance overhead (and with those no-latency disks, any CPU overhead counts) and a risk of removing a datafile, as they would be exposed in a filesystem for no reason.

However, Dbvisit needs a folder in which to store the archived logs that are shipped to the standby. I could create a folder in the local filesystem, but I preferred to create an ACFS filesystem for it.
I did it from ODACLI:


odacli create-dbstorage --dataSize 200 -n DBVISIT -r ACFS

This creates a 200GB filesystem mounted as /u02/app/oracle/oradata/DBVISIT/

Who starts the database?

Dbvisit comes with a scheduler that can start the databases in the required mode, but on the ODA the resources are managed by Grid Infrastructure. So, after creating the standby database, you must modify its startup mode:

srvctl modify database -d MYNEWDB -startoption mount

Don’t forget to change the mount modes after a switchover or failover.

This can be scripted with something like:

srvctl modify database -db $db -startoption $(/opt/dbvisit/standby/dbv_oraStartStop status $db| awk '/^Regular Database/{print "OPEN"}/^Standby Database/{print "MOUNT"}')

Keep it simple and test it

ODA is simple if you use it for what it has been designed for: run the certified database versions (currently 11.2.0.4 and 12.1.0.2) and don’t try to customize the configuration. Always test the switchover, so that you can rely on the protection. It’s easy with Dbvisit standby, either from the GUI or the command line. And be sure that your network can keep up with the redo rate. Again, this is easy to check from the GUI. Here is an example when testing the migration with a Data Pump import:
DbvisitTransferLogSize

From public prices, and before any discount, you can get two ODA X6-2S plus perpetual licences for Oracle Database Standard Edition and Dbvisit standby for less than 90K USD.
If you need more storage, you can double the capacity for about an additional 10K USD per ODA.
And if you think that an ODA may still need a DBA sometimes, have a look at our SLAs, and you have a reliable and affordable system on your premises to store and process your data.

 


Configure easily your Stretch Database

Fri, 2016-10-21 10:07

In this blog, I will present the new Stretch Database feature in SQL Server 2016. It couples your on-premises SQL Server database with an Azure SQL Database, allowing you to stretch data from one or more tables to the Azure cloud.
This mechanism lets you use the low-cost storage available in Azure instead of fast and expensive local solid-state drives. Indeed, the Azure SQL Database server resources (and not the on-premises SQL Server) are solicited during data transfers and remote queries.

First, you need to enable the “Remote Data Archive” option at the instance level. To verify if the option is enabled:
USE master
GO
SELECT name, value, value_in_use, description from sys.configurations where name like 'remote data archive'

To enable this option at the instance level:

EXEC sys.sp_configure N'remote data archive', '1';
RECONFIGURE;
GO

Now, you have to link your on-premises database with a remote SQL Database server:
Use AdventureWorks2014;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'masterPa$$w0rd'
GO
CREATE DATABASE SCOPED CREDENTIAL Stretch_cred
WITH IDENTITY = 'dbi' , SECRET = 'userPa$$w0rd' ;
GO
ALTER DATABASE AdventureWorks2014
SET REMOTE_DATA_ARCHIVE = ON
(
SERVER = 'dbisqldatabase.database.windows.net' ,
CREDENTIAL = Stretch_cred
) ;
GO

The process may take some time, as it creates a new SQL Database in Azure linked to your on-premises database. The credential entered here is the one used to connect to your Azure SQL Database server, and it first needs to be secured by a database master key (hence the CREATE MASTER KEY statement above).

To view all the remote databases from your instance:
Select * from sys.remote_data_archive_databases

Now, if you want to migrate one table from your database ([Purchasing].[PurchaseOrderDetail] in my example), proceed as follows:
ALTER TABLE [Purchasing].[PurchaseOrderDetail] SET ( REMOTE_DATA_ARCHIVE ( MIGRATION_STATE = OUTBOUND) ) ;

Of course, repeat this process for each table you want to stretch. You can still access your data during the migration process.

To view all the remote tables from your instance:
Select * from sys.remote_data_archive_tables

To view the migration batches for all the data being migrated (you can also filter on a specific table, as shown below):
Select * from sys.dm_db_rda_migration_status
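For instance, a hedged way to restrict the output to a single table is to filter on its object id:

Select * from sys.dm_db_rda_migration_status
where table_id = OBJECT_ID(N'Purchasing.PurchaseOrderDetail')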

It is also possible to easily migrate your data back:
ALTER TABLE [Purchasing].[PurchaseOrderDetail] SET ( REMOTE_DATA_ARCHIVE ( MIGRATION_STATE = INBOUND ) ) ;

Moreover, you can select which rows to migrate by using a filter function. Here is an example:
CREATE FUNCTION dbo.fn_stretchpredicate(@column9 datetime)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS is_eligible
WHERE @column9 > CONVERT(datetime, '1/1/2014', 101)
GO

Then, when enabling data migration, specify the filter function:
ALTER TABLE [Purchasing].[PurchaseOrderDetail] SET ( REMOTE_DATA_ARCHIVE = ON (
FILTER_PREDICATE = dbo.fn_stretchpredicate(ModifiedDate),
MIGRATION_STATE = OUTBOUND
) )

Of course, in the Microsoft world, you can also use a wizard to set up this feature. The choice is up to you!

 


Documentum story – Unable to start a new Managed Server in SSL in WebLogic

Fri, 2016-10-21 02:00

Some time ago, I was creating a new Managed Server named msD2-02 in an existing WebLogic Server 12.1.3.0 domain created long ago, and I faced a small issue that I will try to explain in this blog. This Managed Server is used to host a D2 4.5 application (a Documentum client); I created it using the Administration Console, customized it, enabled SSL with internal SSL certificates, SAML2 Single Sign-On, and so on.

 

When I wanted to start it for the first time, I got an error showing that the user/password used was wrong… So I recreated the boot.properties file from scratch, set the username/password in there and tried again: same error. What to do then? To be sure that the password was correct (even if I was pretty sure it was), I copied the boot.properties file from another Managed Server and tried again, but with the same result over and over.
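For reference, a boot.properties file only contains two entries, which WebLogic encrypts at the first successful boot. A minimal sketch (placed under $DOMAIN_HOME/servers/msD2-02/security, with placeholder values):

username=weblogic
password=<clear_text_password_encrypted_at_first_boot>

Therefore, I tried one last time, removing the boot.properties completely in order to enter the credentials during the startup: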

[weblogic@weblogic_server_01 msD2-02]$ /app/weblogic/domains/DOMAIN/bin/startManagedWebLogic.sh msD2-02 t3s://weblogic_server_01:8443

JAVA Memory arguments: -Xms2048m -Xmx2048m -XX:MaxMetaspaceSize=512m

CLASSPATH=/app/weblogic/Middleware/wlserver/server/lib/jcmFIPS.jar:/app/weblogic/Middleware/wlserver/server/lib/sslj.jar:/app/weblogic/Middleware/wlserver/server/lib/cryptoj.jar::/app/weblogic/Java/jdk1.8.0_45/lib/tools.jar:/app/weblogic/Middleware/wlserver/server/lib/weblogic_sp.jar:/app/weblogic/Middleware/wlserver/server/lib/weblogic.jar:/app/weblogic/Middleware/wlserver/../oracle_common/modules/net.sf.antcontrib_1.1.0.0_1-0b3/lib/ant-contrib.jar:/app/weblogic/Middleware/wlserver/modules/features/oracle.wls.common.nodemanager_2.0.0.0.jar:/app/weblogic/Middleware/wlserver/common/derby/lib/derbyclient.jar:/app/weblogic/Middleware/wlserver/common/derby/lib/derby.jar:/app/weblogic/Middleware/wlserver/server/lib/xqrl.jar:/app/weblogic/domains/DOMAIN/lib/LB.jar:/app/weblogic/domains/DOMAIN/lib/LBJNI.jar:

PATH=/app/weblogic/Middleware/wlserver/server/bin:/app/weblogic/Middleware/wlserver/../oracle_common/modules/org.apache.ant_1.9.2/bin:/app/weblogic/Java/jdk1.8.0_45/jre/bin:/app/weblogic/Java/jdk1.8.0_45/bin:/app/weblogic/domains/DOMAIN/D2/lockbox:/app/weblogic/domains/DOMAIN/D2/lockbox/lib/native/linux_gcc34_x64:/app/weblogic/Java/jdk1.8.0_45/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/weblogic/bin

***************************************************
*  To start WebLogic Server, use a username and   *
*  password assigned to an admin-level user.  For *
*  server administration, use the WebLogic Server *
*  console at http://hostname:port/console        *
***************************************************
starting weblogic with Java version:
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
Starting WLS with line:
/app/weblogic/Java/jdk1.8.0_45/bin/java -server -Xms2048m -Xmx2048m -XX:MaxMetaspaceSize=512m -Dweblogic.Name=msD2-02 -Djava.security.policy=/app/weblogic/Middleware/wlserver/server/lib/weblogic.policy  -Dweblogic.ProductionModeEnabled=true -Dweblogic.security.SSL.trustedCAKeyStore=/app/weblogic/Middleware/wlserver/server/lib/cacerts  -Dcom.sun.xml.ws.api.streaming.XMLStreamReaderFactory.woodstox=true -Dcom.sun.xml.ws.api.streaming.XMLStreamWriterFactory.woodstox=true -Djava.io.tmpdir=/app/weblogic/tmp/DOMAIN/msD2-02 -Ddomain.home=/app/weblogic/domains/DOMAIN -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.security.SSL.protocolVersion=TLS1 -Dweblogic.security.disableNullCipher=true -Djava.security.egd=file:///dev/./urandom -Dweblogic.security.allowCryptoJDefaultJCEVerification=true -Dweblogic.nodemanager.ServiceEnabled=true  -Djava.endorsed.dirs=/app/weblogic/Java/jdk1.8.0_45/jre/lib/endorsed:/app/weblogic/Middleware/wlserver/../oracle_common/modules/endorsed  -da -Dwls.home=/app/weblogic/Middleware/wlserver/server -Dweblogic.home=/app/weblogic/Middleware/wlserver/server   -Dweblogic.management.server=t3s://weblogic_server_01:8443  -Dweblogic.utils.cmm.lowertier.ServiceDisabled=true  weblogic.Server
<Jun 14, 2016 11:52:43 AM UTC> <Info> <Security> <BEA-090906> <Changing the default Random Number Generator in RSA CryptoJ from ECDRBG128 to FIPS186PRNG. To disable this change, specify -Dweblogic.security.allowCryptoJDefaultPRNG=true.>
<Jun 14, 2016 11:52:43 AM UTC> <Notice> <WebLogicServer> <BEA-000395> <The following extensions directory contents added to the end of the classpath:
/app/weblogic/domains/DOMAIN/lib/LB.jar:/app/weblogic/domains/DOMAIN/lib/LBJNI.jar.>
<Jun 14, 2016 11:52:44 AM UTC> <Info> <WebLogicServer> <BEA-000377> <Starting WebLogic Server with Java HotSpot(TM) 64-Bit Server VM Version 25.45-b02 from Oracle Corporation.>
<Jun 14, 2016 11:52:44 AM UTC> <Info> <Security> <BEA-090065> <Getting boot identity from user.>
Enter username to boot WebLogic server:weblogic
Enter password to boot WebLogic server:
<Jun 14, 2016 11:52:54 AM UTC> <Warning> <Security> <BEA-090924> <JSSE has been selected by default, since the SSLMBean is not available.>
<Jun 14, 2016 11:52:54 AM UTC> <Info> <Security> <BEA-090908> <Using the default WebLogic SSL Hostname Verifier implementation.>
<Jun 14, 2016 11:52:54 AM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the jks keystore file /app/weblogic/Middleware/wlserver/server/lib/cacerts.>
<Jun 14, 2016 11:52:54 AM UTC> <Info> <Management> <BEA-141298> <Could not register with the Administration Server: java.rmi.RemoteException: [Deployer:149150]An IOException occurred while reading the input.; nested exception is:
        javax.net.ssl.SSLException: Error using PKIX CertPathBuilder.>
<Jun 14, 2016 11:52:54 AM UTC> <Info> <Management> <BEA-141107> <Version: WebLogic Server 12.1.3.0.0  Wed May 21 18:53:34 PDT 2014 1604337 >
<Jun 14, 2016 11:52:55 AM UTC> <Info> <Security> <BEA-090908> <Using the default WebLogic SSL Hostname Verifier implementation.>
<Jun 14, 2016 11:52:55 AM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the jks keystore file /app/weblogic/Middleware/wlserver/server/lib/cacerts.>
<Jun 14, 2016 11:52:55 AM UTC> <Alert> <Management> <BEA-141151> <The Administration Server could not be reached at https://weblogic_server_01:8443.>
<Jun 14, 2016 11:52:55 AM UTC> <Info> <Configuration Management> <BEA-150018> <This server is being started in Managed Server independence mode in the absence of the Administration Server.>
<Jun 14, 2016 11:52:55 AM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to STARTING.>
<Jun 14, 2016 11:52:55 AM UTC> <Info> <WorkManager> <BEA-002900> <Initializing self-tuning thread pool.>
<Jun 14, 2016 11:52:55 AM UTC> <Info> <WorkManager> <BEA-002942> <CMM memory level becomes 0. Setting standby thread pool size to 256.>
<Jun 14, 2016 11:52:55 AM UTC> <Notice> <Log Management> <BEA-170019> <The server log file /app/weblogic/domains/DOMAIN/servers/msD2-02/logs/msD2-02.log is opened. All server side log events will be written to this file.>
<Jun 14, 2016 11:52:57 AM UTC> <Notice> <Security> <BEA-090082> <Security initializing using security realm myrealm.>
<Jun 14, 2016 11:52:57 AM UTC> <Notice> <Security> <BEA-090171> <Loading the identity certificate and private key stored under the alias alias_cert from the JKS keystore file /app/weblogic/domains/DOMAIN/certs/identity.jks.>
<Jun 14, 2016 11:52:57 AM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the JKS keystore file /app/weblogic/domains/DOMAIN/certs/trust.jks.>
<Jun 14, 2016 11:52:58 AM UTC> <Critical> <Security> <BEA-090403> <Authentication for user weblogic denied.>
<Jun 14, 2016 11:52:58 AM UTC> <Critical> <WebLogicServer> <BEA-000386> <Server subsystem failed. Reason: A MultiException has 6 exceptions.  They are:
1. weblogic.security.SecurityInitializationException: Authentication for user weblogic denied.
2. java.lang.IllegalStateException: Unable to perform operation: post construct on weblogic.security.SecurityService
3. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of weblogic.jndi.internal.RemoteNamingService errors were found
4. java.lang.IllegalStateException: Unable to perform operation: resolve on weblogic.jndi.internal.RemoteNamingService
5. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of weblogic.t3.srvr.T3InitializationService errors were found
6. java.lang.IllegalStateException: Unable to perform operation: resolve on weblogic.t3.srvr.T3InitializationService

A MultiException has 6 exceptions.  They are:
1. weblogic.security.SecurityInitializationException: Authentication for user weblogic denied.
2. java.lang.IllegalStateException: Unable to perform operation: post construct on weblogic.security.SecurityService
3. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of weblogic.jndi.internal.RemoteNamingService errors were found
4. java.lang.IllegalStateException: Unable to perform operation: resolve on weblogic.jndi.internal.RemoteNamingService
5. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of weblogic.t3.srvr.T3InitializationService errors were found
6. java.lang.IllegalStateException: Unable to perform operation: resolve on weblogic.t3.srvr.T3InitializationService

        at org.jvnet.hk2.internal.Collector.throwIfErrors(Collector.java:88)
        at org.jvnet.hk2.internal.ClazzCreator.resolveAllDependencies(ClazzCreator.java:269)
        at org.jvnet.hk2.internal.ClazzCreator.create(ClazzCreator.java:413)
        at org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:456)
        at org.glassfish.hk2.runlevel.internal.AsyncRunLevelContext.findOrCreate(AsyncRunLevelContext.java:225)
        Truncated. see log file for complete stacktrace
Caused By: weblogic.security.SecurityInitializationException: Authentication for user weblogic denied.
        at weblogic.security.service.CommonSecurityServiceManagerDelegateImpl.doBootAuthorization(CommonSecurityServiceManagerDelegateImpl.java:1023)
        at weblogic.security.service.CommonSecurityServiceManagerDelegateImpl.postInitialize(CommonSecurityServiceManagerDelegateImpl.java:1131)
        at weblogic.security.service.SecurityServiceManager.postInitialize(SecurityServiceManager.java:943)
        at weblogic.security.SecurityService.start(SecurityService.java:159)
        at weblogic.server.AbstractServerService.postConstruct(AbstractServerService.java:78)
        Truncated. see log file for complete stacktrace
Caused By: javax.security.auth.login.FailedLoginException: [Security:090303]Authentication Failed: User weblogic weblogic.security.providers.authentication.LDAPAtnDelegateException: [Security:090295]caught unexpected exception
        at weblogic.security.providers.authentication.LDAPAtnLoginModuleImpl.login(LDAPAtnLoginModuleImpl.java:257)
        at com.bea.common.security.internal.service.LoginModuleWrapper$1.run(LoginModuleWrapper.java:110)
        at java.security.AccessController.doPrivileged(Native Method)
        at com.bea.common.security.internal.service.LoginModuleWrapper.login(LoginModuleWrapper.java:106)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        Truncated. see log file for complete stacktrace
>
<Jun 14, 2016 11:52:58 AM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FAILED.>
<Jun 14, 2016 11:52:58 AM UTC> <Error> <WebLogicServer> <BEA-000383> <A critical service failed. The server will shut itself down.>
<Jun 14, 2016 11:52:58 AM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SHUTTING_DOWN.>

 

As you can see above, the WebLogic Managed Server is able to retrieve and read the SSL Keystores (identity and trust), so this apparently isn’t the issue; the failure seems to be linked to a wrong username/password. Strange, isn’t it?

 

All other Managed Servers are working perfectly, the applications are accessible in HTTPS, we can see the status of the servers via WLST/AdminConsole, aso… But this specific Managed Server isn’t able to start… After some reflection, I thought about the Embedded LDAP! This is a completely new Managed Server and I tried to start it directly in HTTPS. What if this Managed Server isn’t able to authenticate the user weblogic because this user doesn’t exist in its Embedded LDAP? Indeed, during the first start, a Managed Server will try to automatically replicate the Embedded LDAP from the AdminServer, which contains the primary Embedded LDAP. Just for information, we usually create a bunch of Managed Servers for Documentum during the domain creation, and therefore all these Managed Servers are usually started at least once in HTTP before setting up the SSL in the domain: that’s the main difference between the existing Managed Servers and the new one, and therefore I dug deeper in this direction.

 

To test my theory, I tried to replicate the Embedded LDAP manually. In case you don’t know how to do it, please take a look at the blog “Documentum story – Replicate an Embedded LDAP manually in WebLogic” which explains that in detail. After doing that, the Managed Server msD2-02 was indeed able to start because it was able to authenticate the user weblogic, but that doesn’t explain why the Embedded LDAP wasn’t replicated automatically in the first place…

 

So I checked the logs more deeply, and actually the first strange message during startup is always the same:

<Jun 14, 2016 11:52:54 AM UTC> <Info> <Management> <BEA-141298> <Could not register with the Administration Server: java.rmi.RemoteException: [Deployer:149150]An IOException occurred while reading the input.; nested exception is:
        javax.net.ssl.SSLException: Error using PKIX CertPathBuilder.>

 

As said previously, all components are set up in HTTPS and only HTTPS. Therefore all communications use an SSL Certificate. For this customer, we weren’t using a Self-Signed Certificate but a Certificate signed by an internal Certificate Authority. As shown in the Info message, the Managed Server wasn’t able to register with the AdminServer because of an SSL Exception… Therefore I checked the SSL Certificate, as well as the Root and Gold Certificate Authorities, but for me everything was working properly. The Admin Console is accessible in HTTPS, all applications are accessible, the status of the Managed Servers is visible in the Administration Console and via WLST, which shows that they are able to communicate internally too, aso… So what could be wrong? Well, after checking the startup command of the Managed Server (and actually it is also mentioned in the startup logs), I found the following:

[weblogic@weblogic_server_01 servers]$ ps -ef | grep msD2-02 | grep -v grep
weblogic 31313     1  0 14:34 pts/2    00:00:00 /bin/sh ../startManagedWebLogic.sh msD2-02 t3s://weblogic_server_01:8443
weblogic 31378 31315 26 14:34 pts/2    00:00:35 /app/weblogic/Java/jdk1.8.0_45/bin/java -server
    -Xms2048m -Xmx2048m -XX:MaxMetaspaceSize=512m -Dweblogic.Name=msD2-02
    -Djava.security.policy=/app/weblogic/Middleware/wlserver/server/lib/weblogic.policy -Dweblogic.ProductionModeEnabled=true
    -Dweblogic.security.SSL.trustedCAKeyStore=/app/weblogic/Middleware/wlserver/server/lib/cacerts
    -Dcom.sun.xml.ws.api.streaming.XMLStreamReaderFactory.woodstox=true -Dcom.sun.xml.ws.api.streaming.XMLStreamWriterFactory.woodstox=true
    -Djava.io.tmpdir=/app/weblogic/tmp/DOMAIN/msD2-02 -Ddomain.home=/app/weblogic/domains/DOMAIN -Dweblogic.nodemanager.ServiceEnabled=true
    -Dweblogic.security.SSL.protocolVersion=TLS1 -Dweblogic.security.disableNullCipher=true -Djava.security.egd=file:///dev/./urandom
    -Dweblogic.security.allowCryptoJDefaultJCEVerification=true -Dweblogic.nodemanager.ServiceEnabled=true
    -Djava.endorsed.dirs=/app/weblogic/Java/jdk1.8.0_45/jre/lib/endorsed:/app/weblogic/Middleware/wlserver/../oracle_common/modules/endorsed
    -da -Dwls.home=/app/weblogic/Middleware/wlserver/server -Dweblogic.home=/app/weblogic/Middleware/wlserver/server
    -Dweblogic.management.server=t3s://weblogic_server_01:8443 -Dweblogic.utils.cmm.lowertier.ServiceDisabled=true weblogic.Server

 

What is this JVM parameter? Why does WebLogic define a specific cacerts for this Managed Server instead of using the default one (included in Java)? Something is strange with this startup command… So I checked all other WebLogic Server processes and apparently ALL Managed Servers include this custom cacerts while the AdminServer doesn’t… Is that a bug?! Even if it makes sense to create a custom cacerts for WebLogic only, why isn’t the AdminServer using it? This fact doesn’t make any sense and this is why we are facing this issue:
– All Managed Servers are using: /app/weblogic/Middleware/wlserver/server/lib/cacerts
– The AdminServer is using: /app/weblogic/Java/jdk1.8.0_45/jre/lib/security/cacerts

 

After checking the different startup scripts, it appears that this is defined in the file startManagedWebLogic.sh. This JVM parameter is therefore only used by the Managed Servers, so it is apparently a choice from Oracle (or something that has been forgotten…) to only start the Managed Servers with this option and not the AdminServer… Using different cacerts means that the SSL Certificates trusted by Java (default cacerts) will be trusted by the AdminServer, but it will not be the case for the Managed Servers. In our setup, we always add the Root and Gold Certificates (SSL Chain) to the default Java cacerts because it is the one used to allow the setup of our domain and our applications in SSL. This is working properly but it isn’t enough to allow the Managed Servers to start properly: you also need to take care of this second cacerts, and that’s the reason why the new Managed Server wasn’t able to register with the AdminServer and therefore wasn’t able to replicate the Embedded LDAP.
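
A quick way to see the difference is to list both keystores and check whether the internal CA chain is present in each (a small sketch reusing the root_ca and gold_ca aliases shown below; keytool will prompt for the keystore password). On a setup hit by this issue, the first listing would return nothing for the internal CAs:

[weblogic@weblogic_server_01 servers]$ # CAs trusted by the Managed Servers (WebLogic cacerts)
[weblogic@weblogic_server_01 servers]$ keytool -list -keystore /app/weblogic/Middleware/wlserver/server/lib/cacerts | grep -iE "root_ca|gold_ca"
[weblogic@weblogic_server_01 servers]$ # CAs trusted by the AdminServer (default Java cacerts)
[weblogic@weblogic_server_01 servers]$ keytool -list -keystore /app/weblogic/Java/jdk1.8.0_45/jre/lib/security/cacerts | grep -iE "root_ca|gold_ca"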

 

So how to correct that? First, let’s export the Certificate Chain from the identity keystore and import that into the WebLogic cacerts too:

[weblogic@weblogic_server_01 servers]$ keytool -export -v -alias root_ca -file rootCA.der -keystore /app/weblogic/domains/DOMAIN/certs/identity.jks
[weblogic@weblogic_server_01 servers]$ keytool -export -v -alias gold_ca -file goldCA.der -keystore /app/weblogic/domains/DOMAIN/certs/identity.jks
[weblogic@weblogic_server_01 servers]$
[weblogic@weblogic_server_01 servers]$ keytool -import -v -trustcacerts -alias root_ca -file rootCA.der -keystore /app/weblogic/Middleware/wlserver/server/lib/cacerts
Enter keystore password:
[weblogic@weblogic_server_01 servers]$ keytool -import -v -trustcacerts -alias gold_ca -file goldCA.der -keystore /app/weblogic/Middleware/wlserver/server/lib/cacerts
Enter keystore password:

 

After doing that, you just have to remove the Embedded LDAP of this Managed Server to reinitialize it, using the same steps as before, but this time do not copy the ldap folder from the AdminServer since we need to ensure that the automatic replication is working now. Then start the Managed Server one last time and verify that the replication happens properly and therefore that the Managed Server is able to start. For me, everything was now working properly, so that’s a victory! :)
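
For reference, the reinitialization itself boils down to something like this (a minimal sketch; the Managed Server must be stopped first and the paths adapted to your domain):

[weblogic@weblogic_server_01 servers]$ # Back up and remove the local Embedded LDAP of the new Managed Server
[weblogic@weblogic_server_01 servers]$ mv /app/weblogic/domains/DOMAIN/servers/msD2-02/data/ldap /app/weblogic/domains/DOMAIN/servers/msD2-02/data/ldap_bck_$(date "+%Y%m%d")
[weblogic@weblogic_server_01 servers]$ # Then start msD2-02 again: on startup, it replicates the Embedded LDAP from the AdminServer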

 


Enterprise Manager 13.1.0.0 does not display correct values for memory

Thu, 2016-10-20 08:03

I recently had problems with Enterprise Manager 13.1.0.0, receiving alerts such as:

EM Event Critical hostname Memory Utilization is 93,205 % crossed warning (80%) or critical (90%)

When we have a look at the EM 13c console for the host:

me1

On the system the free -m command displays:

oracle@host:~/24437699/ [agent13c] free -m
             total       used       free     shared    buffers     cached
Mem:         48275      44762       3512          0        205      37483
-/+ buffers/cache:       7073      41201
Swap:         8189       2397       5791

EM 13c does not take the buffers/cached components into account.

In fact, the memory calculation has changed between EM 12.1.0.5 and EM 13.1.0.0. According to Metalink Note 2144976.1:

“While the total Memory available in the host target is displayed correctly after applying the latest PSU # 23030165 (Agent-Side 13.1.0.0.160429), the formula used for Memory Utilization is (100.0 * (realMem-freeMem) / realMem) and does not consider Buffers / Cached component for the calculation.”

To solve the problem we have to patch the OMS and the different agents:

For the OMS: apply patch 23134365

For the agents: apply patch 24437699

Watch out: before applying patch 23134365 on the OMS, we have to install the latest version of OMSPatcher. We download Patch 19999993 for Release 13.1.0.0.0 from MOS.

We back up the OMSPatcher directory in the OMS $ORACLE_HOME (oms13c environment):

oracle:OMS_HOME:/ [oms13c] mv OMSPatcher/ OMSPatcher_save

then we copy p19999993_131000_Generic.zip into the $ORACLE_HOME directory and unzip it:

oracle:$OMS_HOME/ [oms13c] unzip p19999993_131000_Generic.zip
Archive:  p19999993_131000_Generic.zip
   creating: OMSPatcher/
   creating: OMSPatcher/oms/
  inflating: OMSPatcher/oms/generateMultiOMSPatchingScripts.pl
   creating: OMSPatcher/jlib/
  inflating: OMSPatcher/jlib/oracle.omspatcher.classpath.jar
  inflating: OMSPatcher/jlib/oracle.omspatcher.classpath.unix.jar
  inflating: OMSPatcher/jlib/omspatcher.jar
  inflating: OMSPatcher/jlib/oracle.omspatcher.classpath.windows.jar
   creating: OMSPatcher/scripts/
   creating: OMSPatcher/scripts/oms/
   creating: OMSPatcher/scripts/oms/oms_child_scripts/
  inflating: OMSPatcher/scripts/oms/oms_child_scripts/omspatcher_wls.bat
  inflating: OMSPatcher/scripts/oms/oms_child_scripts/omspatcher_jvm_discovery
  inflating: OMSPatcher/scripts/oms/oms_child_scripts/omspatcher_jvm_discovery.bat
  inflating: OMSPatcher/scripts/oms/oms_child_scripts/omspatcher_wls
  inflating: OMSPatcher/scripts/oms/omspatcher
  inflating: OMSPatcher/scripts/oms/omspatcher.bat
  inflating: OMSPatcher/omspatcher
   creating: OMSPatcher/wlskeys/
  inflating: OMSPatcher/wlskeys/createkeys.cmd
  inflating: OMSPatcher/wlskeys/createkeys.sh
  inflating: OMSPatcher/omspatcher.bat
  inflating: readme.txt
  inflating: PatchSearch.xml

We check the OMSPatcher version:

oracle:/ [oms13c] ./omspatcher version
OMSPatcher Version: 13.6.0.0.1
OPlan Version: 12.1.0.2.2
OsysModel build: Wed Oct 14 06:21:23 PDT 2015
 
OMSPatcher succeeded.

We download the p23134365_131000_Generic.zip file from Metalink, and we run:

oracle@host:/home/oracle/23134365/ [oms13c] omspatcher apply -analyze
OMSPatcher Automation Tool
Copyright (c) 2015, Oracle Corporation.  All rights reserved.
OMSPatcher version : 13.6.0.0.1
OUI version        : 13.6.0.0.0
Running from       : /u00/app/oracle/product/13.1.0.0/middleware
Log file location  : /u00/app/oracle/product/13.1.0.0/middleware/
cfgtoollogs/omspatcher/opatch2016-09-26_11-06-30AM_1.log
 
OMSPatcher log file: /u00/app/oracle/product/13.1.0.0/middleware/
cfgtoollogs/omspatcher/23134365/omspatcher_2016-09-26_11-06-34AM_analyze.log
 
Please enter OMS weblogic admin server URL(t3s://hostname:7102):>
Please enter OMS weblogic admin server username(weblogic):>
Please enter OMS weblogic admin server password:>
 
Configuration Validation: Success
 
Running apply prerequisite checks for sub-patch(es) "23134365" 
and Oracle Home "/u00/app/oracle/product/13.1.0.0/middleware"...
Sub-patch(es) "23134365" are successfully analyzed for Oracle Home 
"/u00/app/oracle/product/13.1.0.0/middleware"

Complete Summary
================

OMSPatcher succeeded.

We stop the OMS and run:

oracle@hostname:/home/oracle/23134365/ [oms13c] omspatcher apply
OMSPatcher Automation Tool
Copyright (c) 2015, Oracle Corporation.  All rights reserved.
 
OMSPatcher version : 13.6.0.0.1
OUI version        : 13.6.0.0.0
Running from       : /u00/app/oracle/product/13.1.0.0/middleware
 
Please enter OMS weblogic admin server URL(t3s://hostname:7102):>
Please enter OMS weblogic admin server username(weblogic):>
Please enter OMS weblogic admin server password:>
 
Configuration Validation: Success
…

OMSPatcher succeeded.

We finally restart the OMS:

oracle@hostname:/home/oracle/ [oms13c] emctl start oms

Oracle Enterprise Manager Cloud Control 13c Release 1
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
Starting Oracle Management Server...
WebTier Successfully Started
Oracle Management Server Successfully Started
Oracle Management Server is Up
JVMD Engine is Up
Starting BI Publisher Server ...
BI Publisher Server Already Started
BI Publisher Server is Up

 

Now we apply the patch to the agents:

After downloading and unzipping p24437699_131000_Generic.zip, we stop the management agent (emctl stop agent) and run:

oracle@hostname:/home/oracle/24437699/ [agent13c] opatch apply
Oracle Interim Patch Installer version 13.6.0.0.0
Copyright (c) 2016, Oracle Corporation.  All rights reserved.
 
Oracle Home       : /u00/app/oracle/product/13.1.0.0/agent/agent_13.1.0.0.0
Central Inventory : /u00/app/oraInventory
OPatch version    : 13.6.0.0.0
OUI version       : 13.6.0.0.0

OPatch detects the Middleware Home as "/u00/app/oracle/product/13.1.0.0/agent"
 
Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   24437699
 
Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.
Backing up files...
Applying interim patch '24437699' to 
OH '/u00/app/oracle/product/13.1.0.0/agent/agent_13.1.0.0.0'
 
Patching component oracle.sysman.top.agent, 13.1.0.0.0...
Patch 24437699 successfully applied.
 
OPatch succeeded.

Finally we restart the agent with the emctl start agent command.
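
To double-check that the patch is in place on an agent, you can list the OPatch inventory from the agent environment:

oracle@hostname:/home/oracle/ [agent13c] opatch lsinventory | grep 24437699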

After the patches have been applied, the memory utilization displayed is correct:

me2

me3

And we do not receive critical alerts anymore :=)

 

 


Documentum story – Replicate an Embedded LDAP manually in WebLogic

Thu, 2016-10-20 02:00

In this blog, I will talk about the WebLogic Embedded LDAP. This LDAP is created by default on all AdminServers and Managed Servers of any WebLogic installation. The AdminServer always contains the Primary Embedded LDAP and all other Servers are synchronized with this one. This Embedded LDAP is the default security provider database for the WebLogic Authentication, Authorization, Credential Mapping and Role Mapping providers: it usually contains the WebLogic users, groups, and some other stuff like the SAML2 setup, aso… So basically a lot of stuff configured under the “security realms” in the WebLogic Administration Console. This LDAP is based on files that are stored under “$DOMAIN_HOME/servers/<SERVER_NAME>/data/ldap/”.

 

Normally the Embedded LDAP is automatically replicated from the AdminServer to the Managed Servers during startup but this can fail for a few reasons:

  • AdminServer not running
  • Problems in the communications between the AdminServer and Managed Servers
  • aso…

 

Oracle usually recommends to use an external RDBMS Security Store instead of the Embedded LDAP, but not all information is stored in the RDBMS and therefore the Embedded LDAP is always used, at least for a few things. More information on this page: Oracle WebLogic Server Documentation.

 

So now, in case the automatic replication isn’t working properly, for any reason, or if a manual replication is needed, how can it be done? Well, that’s pretty simple and I will explain that below. I will also use a home-made script in order to quickly and efficiently start/stop one, several or all WebLogic components. If you don’t have such a script available, then please adapt the steps below to manually stop and start all WebLogic components.

 

So first you need to stop all components:

[weblogic@weblogic_server_01 ~]$ $DOMAIN_HOME/bin/startstop stopAll
  ** Managed Server msD2-01 stopped
  ** Managed Server msD2Conf-01 stopped
  ** Managed Server msDA-01 stopped
  ** Administration Server AdminServer stopped
  ** Node Manager NodeManager stopped
[weblogic@weblogic_server_01 ~]$ ps -ef | grep weblogic
[weblogic@weblogic_server_01 ~]$

 

Once this is done, you need to retrieve the list of all Managed Servers installed/configured in this WebLogic Domain for which a manual replication is needed. For me, it is pretty simple: they are printed above by the start/stop command, but otherwise you can find them like this:

[weblogic@weblogic_server_01 ~]$ cd $DOMAIN_HOME/servers
[weblogic@weblogic_server_01 servers]$ ls | grep -v "domain_bak"
AdminServer
msD2-01
msD2Conf-01
msDA-01

 

Now that you have the list, you can proceed with the manual replication for each and every Managed Server. First back up the Embedded LDAP and then replicate it from the Primary (in the AdminServer as explained above):

[weblogic@weblogic_server_01 servers]$ current_date=$(date "+%Y%m%d")
[weblogic@weblogic_server_01 servers]$ 
[weblogic@weblogic_server_01 servers]$ mv msD2-01/data/ldap msD2-01/data/ldap_bck_$current_date
[weblogic@weblogic_server_01 servers]$ mv msD2Conf-01/data/ldap msD2Conf-01/data/ldap_bck_$current_date
[weblogic@weblogic_server_01 servers]$ mv msDA-01/data/ldap msDA-01/data/ldap_bck_$current_date
[weblogic@weblogic_server_01 servers]$ 
[weblogic@weblogic_server_01 servers]$ cp -R AdminServer/data/ldap msD2-01/data/
[weblogic@weblogic_server_01 servers]$ cp -R AdminServer/data/ldap msD2Conf-01/data/
[weblogic@weblogic_server_01 servers]$ cp -R AdminServer/data/ldap msDA-01/data/

 

When this is done, just start all WebLogic components again:

[weblogic@weblogic_server_01 servers]$ $DOMAIN_HOME/bin/startstop startAll
  ** Node Manager NodeManager started
  ** Administration Server AdminServer started
  ** Managed Server msDA-01 started
  ** Managed Server msD2Conf-01 started
  ** Managed Server msD2-01 started

 

And if you followed these steps properly, the Managed Servers will now be able to start normally with a replicated Embedded LDAP containing all recent changes coming from the Primary Embedded LDAP.

 


Datawarehouse ODS load is fast and easy in Enterprise Edition

Wed, 2016-10-19 14:56

In a previous post, tribute to transportable tablespaces (TTS), I said that TTS is also used to move data quickly from an operational database to a datawarehouse ODS. For sure, you don’t transport directly from the production database because TTS requires that the tablespace is read only. But you can transport from a snapshot standby. Both features (transportable tablespaces and Data Guard snapshot standby) are free in Enterprise Edition without any option. Here is an example to show that it’s not difficult to automate.

I have a configuration with the primary database “db1a”

DGMGRL> show configuration
 
Configuration - db1
 
Protection Mode: MaxPerformance
Members:
db1a - Primary database
db1b - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS (status updated 56 seconds ago)
 
DGMGRL> show database db1b
 
Database - db1b
 
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 0 seconds (computed 0 seconds ago)
Average Apply Rate: 0 Byte/s
Real Time Query: ON
Instance(s):
db1
 
Database Status:
SUCCESS

I’ve a few tables in the tablespace USERS and this is what I want to transport to the ODS database:

SQL> select segment_name,segment_type,tablespace_name from user_segments;
 
SEGMENT_NAME SEGMENT_TY TABLESPACE
------------ ---------- ----------
DEPT TABLE USERS
EMP TABLE USERS
PK_DEPT INDEX USERS
PK_EMP INDEX USERS
SALGRADE TABLE USERS

Snapshot standby

With Data Guard it is easy to temporarily open the standby database. Just convert it to a snapshot standby with a simple command:


DGMGRL> connect system/oracle@//db1b
DGMGRL> convert database db1b to snapshot standby;
Converting database "db1b" to a Snapshot Standby database, please wait...
Database "db1b" converted successfully

Export

Here you could start to do some extraction/load, but it is better to reduce the window where the standby is not in sync. The only thing we will do is export the tablespace in the fastest way: TTS.

First, we put the USERS tablespace in read only:

SQL> connect system/oracle@//db1b
Connected.
 
SQL> alter tablespace users read only;
Tablespace altered.

and create a directory to export metadata:

SQL> create directory TMP_DIR as '/tmp';
Directory created.

Then the export is easy:

SQL> host expdp system/oracle@db1b transport_tablespaces=USERS directory=TMP_DIR
Starting "SYSTEM"."SYS_EXPORT_TRANSPORTABLE_01": system/********@db1b transport_tablespaces=USERS directory=TMP_DIR
 
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/CONSTRAINT/CONSTRAINT
Master table "SYSTEM"."SYS_EXPORT_TRANSPORTABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYSTEM.SYS_EXPORT_TRANSPORTABLE_01 is:
/tmp/expdat.dmp
******************************************************************************
Datafiles required for transportable tablespace USERS:
/u02/oradata/db1/users01.dbf
Job "SYSTEM"."SYS_EXPORT_TRANSPORTABLE_01" successfully completed at Wed Oct 19 21:03:36 2016 elapsed 0 00:00:52

I’ve got the metadata in /tmp/expdat.dmp and the data in /u02/oradata/db1/users01.dbf. I copy this datafile directly to its destination for the ODS database:

[oracle@VM118 ~]$ cp /u02/oradata/db1/users01.dbf /u02/oradata/ODS/users01.dbf

This is a physical copy, which is the fastest data movement we can do.

I’m ready to import it into my ODS database, but I can already re-sync the standby database because I extracted everything I wanted.

Re-sync the physical standby

DGMGRL> convert database db1b to physical standby;
Converting database "db1b" to a Physical Standby database, please wait...
Operation requires shut down of instance "db1" on database "db1b"
Shutting down instance "db1"...
Connected to "db1B"
Database closed.
Database dismounted.
ORACLE instance shut down.
Operation requires start up of instance "db1" on database "db1b"
Starting instance "db1"...
ORACLE instance started.
Database mounted.
Connected to "db1B"
Continuing to convert database "db1b" ...
Database "db1b" converted successfully
DGMGRL>

The duration depends on the time to flashback the changes (and we did no change here as we only exported) and the time to apply the redo stream generated since the convert to snapshot standby (which has been kept to a minimum).

This whole process can be automated. We did that for several customers and it works well. No need to change anything unless you have new tablespaces.
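
As an illustration, the core of such an automation could look like the following shell sketch (no error handling, reusing the credentials, directory and paths of this demo environment):

#!/bin/bash
# Open the standby as a snapshot standby
dgmgrl system/oracle@//db1b "convert database db1b to snapshot standby;"
# Put the tablespace in read only and export it with TTS
sqlplus -s system/oracle@//db1b <<EOF
alter tablespace users read only;
exit
EOF
expdp system/oracle@db1b transport_tablespaces=USERS directory=TMP_DIR
# Physical copy of the datafile to its ODS destination
cp /u02/oradata/db1/users01.dbf /u02/oradata/ODS/users01.dbf
# Re-sync the physical standby
dgmgrl system/oracle@//db1b "convert database db1b to physical standby;"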

Import

Here is the import to the ODS database and I rename the USERS tablespace to ODS_USERS:

SQL> host impdp system/oracle transport_datafiles=/u02/oradata/ODS/users01.dbf directory=TMP_DIR remap_tablespace=USERS:ODS_USERS
Master table "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01": system/******** transport_datafiles=/u02/oradata/ODS/users01.dbf directory=TMP_DIR remap_tablespace=USERS:ODS_USERS
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/CONSTRAINT/CONSTRAINT
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" completed with 3 error(s) at Wed Oct 19 21:06:18 2016 elapsed 0 00:00:10

Everything is there. You have all your data in ODS_USERS. You can have other data/code in this database. Only the ODS_USERS tablespace has to be dropped to be re-imported. You can have your staging tables here and even permanent tables.

12c pluggable databases

In 12.1 it is even easier because the multitenant architecture gives the possibility to transport pluggable databases in one command, through file copy or database links. It is even faster because metadata is transported physically with the PDB SYSTEM tablespace. I said multitenant architecture here, and didn’t mention any option. The multitenant option is needed only if you want multiple PDBs managed by the same instance. But if you want the ODS database to be an exact copy of the operational database, then you don’t need any option to unplug/plug.

In 12.1 you need to put the source in read only, so you still need a snapshot standby. And from my test, there’s no problem to convert it back to a physical standby after a PDB has been unplugged. In the next release, we may not need a standby at all because it has been announced that PDBs can be cloned online.
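
To give an idea, unplugging the PDB from the snapshot standby and plugging it into the ODS CDB looks like this (a sketch; the PDB names, connect strings and XML manifest path are placeholders):

sqlplus -s "sys/oracle@//db1b as sysdba" <<EOF
alter pluggable database PDB1 close immediate;
alter pluggable database PDB1 unplug into '/tmp/PDB1.xml';
exit
EOF
sqlplus -s "sys/oracle@//ods as sysdba" <<EOF
create pluggable database ODSPDB using '/tmp/PDB1.xml' copy;
alter pluggable database ODSPDB open;
exit
EOF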

I’ll explain the multitenant features available without any option (in 12c current and next release) at Oracle Geneva office on 23rd of November:

CaptureBreakfastNov23
Do not hesitate to register by e-mail.

 


Feedback on my session at Oracle Open World 2016

Wed, 2016-10-19 11:00

I was a speaker at Oracle Open World and received the feedback and demographic data. This session took place on the Sunday, which is the User Group Forum day. It was about Multitenant, defining what the multitenant architecture is and which features it brings to us even when we don’t have the multitenant option. Unfortunately, I cannot upload the slides before the next 12c release is available. If you missed the session or want to hear it in my native language, I’ll give it in Geneva on 23rd of November at the Oracle Switzerland office.

Here is the room when we were setting up the demo on my laptop; according to the demographic statistics below, 84 people attended (or planned to attend) my session.

2016-09-18 10.28.07

Feedback survey

Depending on the conferences, the percentage of people that fill in the feedback goes from low to very low. Here 6 people gave feedback, which is 7% of the attendees:

Number of Respondents: 6
Q1: How would you rate the content of the session? (select a rating of 1 to 3, 3 being the best): 2.67
Q2: How would you rate the speaker(s) of the session? (select a rating of 1 to 3, 3 being the best): 2.83
Q3: Overall, based on content and speakers I would rate this session as (select a rating of 1 to 3, 3 being the best): 2.67

Thanks for this. But the quality matters more than the quantity. I received only one comment and this one is very important for me because it can help me to improve my slides:

Heavy accent was difficult to understand at times where I lost interest/concentration. Ran through slides too quick (understanding time constraints). Did not allow image capturing (respected). Did provide examples which was nice. Advised slides will be downloadable…to be seen.

The accent is not a surprise. It’s an international event and a lot of people coming from all around the world may have an accent that is difficult to understand. I would love to speak English more clearly but I know that my French accent is there, and my lack of vocabulary as well. That will be hard to change. But the remark about the slides is very pertinent. I usually put a lot of material in my presentations: lots of slides, lots of text, lots of demos. My idea is that you don’t need to read all that is written. It is there to read later when you download the slides (I expected 12cR2 to be fully available for OOW when I prepared the slides). And it’s also there in case my live demos fail, so that I have the info on the slides, but I usually skip them quickly when everything was seen in the demo.

But thanks to this comment, I understand that reading the slides is important when you don’t get what I say, and having too much text makes it difficult to follow. For future presentations, I’ll remove text from the slides and put it as PowerPoint presenter notes, made available in the PDF.

So thanks to the one who wrote this comment. I’ll improve that. And don’t hesitate to ping me to know when the slides become downloadable; maybe I can already share a few of them.

Demographic data

Open World gives some demographic data about attendees. As you don’t have to scan your badge at the Sunday sessions, I suppose it’s about people that registered and may not have been there. But intention counts ;)

About countries: we were in the US so that’s the main country represented here. Next come 6 people from Switzerland, the country where I live and work:

CaptureOOW16demoCountries84

When we register for OOW we fill in the industry we are working in. The most represented in the room were Financial, Healthcare and High Tech:

CaptureOOW16demoIndustries84

And the job title, which is a free-text field, has many different values, which makes it difficult to aggregate:

CaptureOOW16demoJobtitles84

It’s no surprise that there were a lot of DBAs. I’m happy to see some managers/supervisors interested in technical sessions.
My goal for future events is to get more attention from developers, because a database is not a black box storage for data, but a full piece of software where data is processed.

I don’t think that 84 people were in that room actually, there were several good sessions at the same time, as well as the bridge run.

oo1Csp1eTvXgAABNcH

This is the kind of slide where there’s a lot of text but I go fast. Actually I had initially 3 slides about this point (feature usage detection, multitenant and CON_IDs). I removed some, and kept one with too much text. When I remove slides, I usually post a blog about what I won’t have time to detail.

Here are those posts:

http://blog.dbi-services.com/unplugged-pluggable-databases/
http://blog.dbi-services.com/how-to-check-multitenant-option-feature-usage/

Thanks

My session was part of the stream which was selected by the EMEA Oracle Usergroup Community. Thanks a lot to EOUC. They have good articles in their newly created magazine www.oraworld.org. Similar name but nothing to do with the team of worldwide OCMs and ACEs publishing for years as the Oraworld Team.

 


Documentum xPlore – ftintegrity tool not working

Wed, 2016-10-19 08:38

I’ve been patching some xPlore servers for a while and recently I ran into an issue regarding the ftintegrity tool. Maybe you noticed it as well: for xPlore 1.5 Patch 15, the ftintegrity tool available in $DSEARCH_HOME/setup/indexagent/tools was corrupted by the patch.
I think for some reason the libs were changed and the tool wasn’t able to load anymore. I asked EMC and they said it was a known bug which would be fixed in the next release.

So I came to patch it again when Patch 17 went out (our customer’s processes don’t allow us to patch every month, so I skipped Patch 16). After patching, I directly started the ftintegrity tool in order to check that everything was fixed, and… no.

In fact yes, but you have something to do before making it work. The errors you get are like ‘Could not load because config is NULL’ or ‘dfc.properties not found’. I found these errors kind of strange, so I wondered if the ftintegrity tool script was patched as well. And the answer is no: the script is still the same but the jar libraries have been changed, which means that the script is pointing to the wrong libraries and it can’t load properly.

Thus the solution is simple: I uninstalled the Index Agent and installed it again right after. The ftintegrity tool script was re-created with the correct pointers to the new libraries. Little tip: if you have several Index Agents and don’t want to re-install them all, you may want to copy the content of the updated ftintegrity tool script and paste it into the other instances (do not forget to adapt it because it may point to different docbases).

To summarize, if you have issues regarding the execution of the ftintegrity tool, check the library calls in the script and re-install the Index Agent in order to get the latest version of it.
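
If you want to check this yourself before re-installing, a quick comparison between the jar files referenced by the script and the libraries actually present on disk can help. This is a purely hypothetical sketch: the script name and library location below are assumptions to adapt to your own installation:

# List the jar files referenced by the ftintegrity script (path and name assumed)
grep -oE '[^ :=]+\.jar' $DSEARCH_HOME/setup/indexagent/tools/ftintegrity.sh | sort -u
# Compare with the libraries actually present after the patch (path assumed)
ls $DSEARCH_HOME/dsearch/lib/*.jar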

 


Documentum story – Jobs in a high availability installation

Wed, 2016-10-19 04:55

When you have an installation with one Content Server (CS), you do not need to care where the jobs run: it’s always on your single CS.
But how should you configure the jobs in case you have several CS? Which jobs have to be executed and which ones not? Let’s see that in this post.

When you have to run your jobs in a high availability installation you have to configure some files and objects.

Update the method_verb of the dm_agent_exec method:

API> retrieve,c,dm_method where object_name = 'agent_exec_method'
API> get,c,l,method_verb
API> set,c,l,method_verb
SET> ./dm_agent_exec -enable_ha_setup 1
API> get,c,l,method_verb
API> save,c,l
API> reinit,c

 

The Java methods have been updated to be restartable:

update dm_method object set is_restartable=1 where method_type='java';

 

On our installation we use jms_max_wait_time_on_failures = 300 instead of the default one (3000).
In server.ini (Primary Content Server) and server_HOSTNAME2_REPO01.ini (Remote Content Server), we have:

incremental_jms_wait_time_on_failure=30
jms_max_wait_time_on_failures=300

 

Based on some issues we faced, for instance with the dce_clean job that ran twice when we had both JMS projected to each CS, EMC advised us to project each JMS to its local CS only. With this configuration, in case the JMS is down on the primary CS, the job (using a java method) is started on the remote JMS via the remote CS.

Regarding which jobs have to be executed – I am describing only the ones which are used for housekeeping.
So the question to answer is which job does what and what is “touched”, either metadata and/or content.

To verify that, check how many CS are used and where they are installed:

select object_name, r_host_name from dm_server_config
REPO1               HOSTNAME1.DOMAIN
HOSTNAME2_REPO1     HOSTNAME2.DOMAIN

 

Verify on which CS the jobs will run and “classify” them.
Check the job settings:

select object_name, target_server, is_inactive from dm_job
Metadata

The following jobs work only on metadata; they can run anywhere, so the target_server has to be empty:

object_name                       target_server                            is_inactive
--------------------------------  ---------------------------------------  -----------
dm_ConsistencyChecker                                                      False
dm_DBWarning                                                               False
dm_FileReport                                                              False
dm_QueueMgt                                                                False
dm_StateOfDocbase                                                          False

 

Content

The following jobs work only on content.

object_name                       target_server
--------------------------------  ---------------------------------------
dm_ContentWarning                 REPO1.REPO1@HOSTNAME1.DOMAIN
dm_ContentWarningHOSTNAME2_REPO1  REPO1.HOSTNAME2_REPO1@HOSTNAME2.DOMAIN
dm_DMClean                        REPO1.REPO1@HOSTNAME1.DOMAIN
dm_DMCleanHOSTNAME2_REPO1         REPO1.HOSTNAME2_REPO1@HOSTNAME2.DOMAIN

As we are using a NAS for the data directory, which is shared between both servers, only one of the two jobs has to run. By default the target_server is defined. So for the one which has to run, the target_server has to be emptied.

object_name                       target_server                            is_inactive
--------------------------------  ---------------------------------------  -----------
dm_ContentWarning                                                          False
dm_ContentWarningHOSTNAME2_REPO1  REPO1.HOSTNAME2_REPO1@HOSTNAME2.DOMAIN   True
dm_DMClean                                                                 False
dm_DMCleanHOSTNAME2_REPO1         REPO1.HOSTNAME2_REPO1@HOSTNAME2.DOMAIN   True

Metadata and Content

The following jobs work on both metadata and content.

object_name                    target_server
-----------------------------  ---------------------------------------
dm_DMFilescan                  REPO1.REPO1@HOSTNAME1.DOMAIN
dm_DMFilescanHOSTNAME2_REPO1   REPO1.HOSTNAME2_REPO1@HOSTNAME2.DOMAIN
dm_LogPurge                    REPO1.REPO1@HOSTNAME1.DOMAIN
dm_LogPurgeHOSTNAME2_REPO1     REPO1.HOSTNAME2_REPO1@HOSTNAME2.DOMAIN

Filescan scans the NAS content storage. As said above, it is shared and therefore the job only needs to be executed once: the target_server has to be empty so it can run on any server.

LogPurge also cleans files under $DOCUMENTUM/dba/log and its subfolders, which are obviously not shared, and therefore both dm_LogPurge jobs have to run. You just have to use another start time to avoid an overlap when objects are removed from the repository.

object_name                    target_server                            is_inactive
-----------------------------  ---------------------------------------  -----------
dm_DMFilescan                                                           False
dm_DMFilescanHOSTNAME2_REPO1   REPO1.HOSTNAME2_REPO1@HOSTNAME2.DOMAIN   True
dm_LogPurge                    REPO1.REPO1@HOSTNAME1.DOMAIN             False
dm_LogPurgeHOSTNAME2_REPO1     REPO1.HOSTNAME2_REPO1@HOSTNAME2.DOMAIN   False

Normally, with this configuration, your housekeeping jobs should be well configured.
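
If you need to empty the target_server of a job, it can be done with a simple DQL update, for instance through idql (a sketch; adapt the repository name, credentials and job name):

idql REPO1 -Udmadmin -Pxxx <<EOF
update dm_job object set target_server = ' ' where object_name = 'dm_DMClean';
go
EOF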

One point you have to take care of is when you use DA to configure your jobs. Once you open the job properties, the “Designated Server” is set to one of your servers and not to “Any Running Server”, which means target_server = ‘ ‘. If you click the OK button, you will set the target_server, and in case this CS is down, the job will fail because it cannot use the second CS.

 


Documentum story – How to display correct client IP address in the log file when a WebLogic Domain is fronted by a load Balancer

Wed, 2016-10-19 04:32

Load Balancers do not forward the client IP address by default. As a consequence, the WebLogic HTTP log file (access_log) does not show the client IP address but the Load Balancer one.
This is sometimes a problem when diagnosing issues, and the Single Sign On configuration does not provide the user name in the HTTP log either.

In most cases, the Load Balancer can provide an additional header named “X-Forwarded-For”, but it needs to be configured by the Load Balancer administrators.
If the “X-Forwarded-For” Header is provided, it can be fetched using the WebLogic Server HTTP extended logging.

To enable the WebLogic Server HTTP logging to fetch the “X-Forwarded-For” Header follow the steps below for each WebLogic Server in the WebLogic Domain:

  1. Browse to the WebLogic Domain administration console and sign in as an administrator user
  2. Open the servers list and select the first managed server
  3. Select the logging TAB and the HTTP sub-tab
  4. Open the advanced folder and change the format to “extended” and the Extended Logging Format Fields to:
    "cs(X-Forwarded-For) date time cs-method cs-uri sc-status bytes"
  5. Save
  6. Browse back to the servers list and repeat the steps for each WebLogic Server from the domain placed behind the load balancer.
  7. Activate the changes.
  8. Stop and restart the complete WebLogic domain.

After this, the WebLogic Servers HTTP Logging (access_log) should display the client IP address and not the Load Balancer one.

When using the WebLogic Server extended HTTP logging, the username field is not available any more.
This feature is described in the following Oracle MOS article:
Missing Username In Extended Http Logs (Doc ID 1240135.1)

To get the authenticated username displayed, an additional custom field provided by a custom Java class needs to be used.

Here is an example of such Java class:

import weblogic.servlet.logging.CustomELFLogger; 
import weblogic.servlet.logging.FormatStringBuffer; 
import weblogic.servlet.logging.HttpAccountingInfo;

/* This example outputs the authenticated user name (remote user)
 into a custom field called MyCustomUserNameField
*/

public class MyCustomUserNameField implements CustomELFLogger {

public void logField(HttpAccountingInfo metrics,
  FormatStringBuffer buff) {
  buff.appendQuotedValueOrDash(metrics.getRemoteUser());
  }
}

The next step is to compile and create a jar library.

Set the environment by running the WebLogic setWLSEnv.sh script.

javac MyCustomUserNameField.java

jar cvf MyCustomUserNameField.jar MyCustomUserNameField.class

Once done, copy the jar library file under the WebLogic Domain lib directory. This way, it will be made available in the class path of each WebLogic Server of this WebLogic Domain.

The WebLogic Server HTTP Extended log format can now be modified to include a custom field named “x-MyCustomUserNameField”.
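
Assuming the class is named MyCustomUserNameField as above (with no package), the Extended Logging Format Fields shown earlier would then become:

cs(X-Forwarded-For) date time cs-method cs-uri sc-status bytes x-MyCustomUserNameField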

 


Manage GitHub behind a proxy

Wed, 2016-10-19 02:00

I’m quite used to GitHub since I’m using it pretty often, but I had actually never tried to use it behind a proxy. In recent months, I was working on a project and I had to use GitHub to version-control the repository that contained scripts, monitoring configurations, aso… When setting up my local workstation (Windows) using the GUI, I faced an error showing that GitHub wasn’t able to connect to the repository while I was able to access it using my Web Browser… This is the problem I faced some time ago and I just wanted to share this experience because even if I’m writing a lot of blogs related to Documentum, it is sometimes good to change your mind, you know… Therefore today is GitHub Day!

 

After some research and analysis, and you already understood it if you read the first paragraph of this blog, I thought that maybe it was a proxy automatically set up in the Web Browser that would prevent the GitHub process from accessing the GitHub repository, and I was right! So GitHub behind a proxy: how can you manage that? Actually that’s pretty simple because everything is there, you just need to configure it. Unfortunately, I didn’t find any option in the GUI that would allow you to do that and therefore I had to use the Command Line Interface for that purpose. If there is a way to do that using the GUI, you are welcome to share!

 

Ok so let’s define some parameters:

  • PROXY_USER = The user’s name to be used for the Proxy Server
  • PROXY_PASSWORD = The password of the proxy_user
  • PROXY.SERVER.COM = The hostname of your Proxy Server
  • PORT = The port used by your Proxy Server in HTTP
  • PORT_S = The port used by your Proxy Server in HTTPS

 

With this information, you can execute the following commands to configure GitHub using the Command Line Interface (Git Shell on Windows). These two lines will simply tell Git that it needs to use a proxy server in order to access the Internet properly:

git config --global http.proxy http://PROXY_USER:PROXY_PASSWORD@PROXY.SERVER.COM:PORT
git config --global https.proxy https://PROXY_USER:PROXY_PASSWORD@PROXY.SERVER.COM:PORT_S

 

If your Proxy Server is public (no authentication needed), then you can simplify these commands as follows:

git config --global http.proxy http://PROXY.SERVER.COM:PORT
git config --global https.proxy https://PROXY.SERVER.COM:PORT_S

 

With this simple configuration, you should be good to go. Now you can decide, whenever you want, to just remove this configuration. That’s also pretty simple since you just have to unset this configuration with the same kind of commands:

git config --global --unset http.proxy
git config --global --unset https.proxy

 

The last thing I wanted to show you is that if it is still not working, then you can check what you entered previously and what is currently configured by executing the following commands:

git config --global --get http.proxy
git config --global --get https.proxy
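
As a side note, if you want the proxy configuration for a single repository only and not globally, the same commands can be run without the --global flag from inside the repository:

git config http.proxy http://PROXY.SERVER.COM:PORT
git config https.proxy https://PROXY.SERVER.COM:PORT_S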

 

This concludes this pretty small blog, but I really wanted to share it because I think it can help a lot of people!

 


Using JMeter to run load test on a ADF application protected by Oracle Access Manager Single Sign On

Tue, 2016-10-18 10:50

Introduction

In one of my missions, I was requested to run performance and load tests on an ADF application running in an Oracle Fusion Middleware environment protected by Oracle Access Manager. For this task we decided to use Apache JMeter because it provides the needed control over the tests and uses multiple threads to emulate multiple users. It can also be used to do distributed testing, which uses multiple systems to run stress tests. Additionally, the GUI provides an easy way to manage the load test scenarios that can be easily recorded using the HTTP(s) Test Script Recorder.

Prepare a JMeter test plan

A good starting point is to review the following blog: My Shot on Using JMeter to Load Test Oracle ADF Applications

The blog above explains how to record and use a test plan in JMeter.
It provides a SimplifiedADFJMeterPlan.jmx JMeter test plan that can be used as a base for the JMeter test plan creation.
But this ADF starter test plan has to be reviewed for the jsessionId and afrLoop extractors, as the regular expressions associated with them might need to be adapted depending on the version of the ADF software.

In this environment, Oracle Fusion Middleware ADF 11.1.2.4, WebLogic Server 10.3.6 and Oracle Access Manager 11.2.3 were used.
The regular expressions for afrLoop and jsessionid needed to be updated as shown below:

reference name    regular expression
afrLoop           _afrLoop\', \'([0-9]{13,16})
jsessionId        ;jsessionid=([-_0-9A-Za-z!]{62,63})

Coming to the Single Sign On layer, it appears that the Oracle Access Manager compatible login screen requires three parameters:

  • username
  • password
  • request_id

The username and password values will first be provided by the recording of the test scenario. To run the same scenario with multiple users, a CSV file is used to store test users and passwords. This will be detailed later in this blog.
The request_id is provided by the Oracle Access Manager Single Sign On layer and needs to be fetched and re-injected into the authentication URL.
To resolve this, a new variable needed to be created and the regular expression below is used.

reference name    regular expression
requestId         name=\'request_id\' value=\'([&#;0-9]{18,25})\'

Once the test plan scenario is recorded, look for the OAM standard “/oam/server/auth_cred_submit” URL and change the request_id parameter to use the defined requestId variable.

OAM Authentication URL
name: request_id   value: ${requestId}

After those changes, the new JMeter test plan can be run.

Steps to run the test plan with multiple users

In JMeter,
Right click on the “Thread Group” on the tree.
Select “Add” – “Config Element” – “CSV Data Set Config”.
Add CSV config in JMeter

Create a CSV file which contains USERNAME,PASSWORD pairs and save it in a folder on your JMeter server. Make sure the users exist in OAM/OID:

ahunold,welcome1
jcooper,welcome1
monty,welcome1
king,welcome1
scott,welcome1

Adapt the path in the “CSV Data Set Config” and define the variable names (USERNAME and PASSWORD) in “Variable Names (comma-delimited)”.
JMeter ADF OAM IMG3
Look for the URL that is submitting the authentication – /oam/server/auth_cred_submit – and click on it. In the right frame, replace the username and password captured during the recording with ${USERNAME} and ${PASSWORD} respectively, as shown below:
JMeter ADF OAM IMG4
Finally, you can adapt the thread group of your test plan to the number of users (Number of Threads) and loops (Loop Count) you want to run and execute it. The Ramp-Up Period (in seconds) is the time over which the threads are started.
JMeter test plan IMG5
The test plan can be executed now and results visualised in tree, graph or table views.
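Note that once the test plan is saved, bigger load tests are usually executed in JMeter non-GUI mode; here is a minimal example (the file names are placeholders):

jmeter -n -t ADF_OAM_LoadTest.jmx -l results.jtl

The -n flag runs JMeter without the GUI, -t points to the test plan and -l stores the sample results for later analysis in the listeners.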

 

 

Cet article Using JMeter to run load tests on an ADF application protected by Oracle Access Manager Single Sign On est apparu en premier sur Blog dbi services.

Documentum story – Change the location of the Xhive Database for the DSearch (xPlore)

Tue, 2016-10-18 02:00

When using xPlore with Documentum, you will need to set up a DSearch which will be used to perform the searches, and this DSearch uses in the background an Xhive Database. This is a native XML Database that persists XML DOMs and provides access to them using XPath and XQuery. In this blog, I will share the steps needed to change the location of the Xhive Database used by the DSearch. You usually don’t want to move this XML Database every day but it might be useful as a one-time action. In this customer case, one of the DSearch instances in a Sandbox/Dev environment had been installed using a wrong path for the Xhive Database (not following our installation conventions) and therefore we had to correct that, just to keep the alignment between all environments and to avoid a complete uninstall/reinstall of the IndexAgents + DSearch.

 

In the steps below, I will suppose that xPlore has been installed under “/app/xPlore” and that the Xhive Database has been created under “/app/xPlore/data”. This is the default value and then when installing an IndexAgent, it will create, under the data folder, a sub-folder with a name equal to the DSearch Domain’s name (usually the name of the docbase/repository). In this blog I will show you how to move this Xhive Database to “/app/xPlore/test-data” without having to reinstall everything. This means that the Xhive Database will NOT be deleted/recreated from scratch (this is also possible) and therefore you will NOT have to perform a full reindex which would have taken a looong time.

 

So let’s start with stopping all components first:

[xplore@xplore_server_01 ~]$ sh -c "/app/xPlore/jboss7.1.1/server/stopIndexagent.sh"
[xplore@xplore_server_01 ~]$ sh -c "/app/xPlore/jboss7.1.1/server/stopPrimaryDsearch.sh"

 

Once this is done, we need to backup the data and config files, just in case…

[xplore@xplore_server_01 ~]$ current_date=$(date "+%Y%m%d")
[xplore@xplore_server_01 ~]$ cp -R /app/xPlore/data/ /app/xPlore/data_bck_$current_date
[xplore@xplore_server_01 ~]$ cp -R /app/xPlore/config/ /app/xPlore/config_bck_$current_date
[xplore@xplore_server_01 ~]$ mv /app/xPlore/data/ /app/xPlore/test-data/

 

Ok now everything in the background is prepared and we can start the actual steps to move the Xhive Database. The first step is to change the data location in the files stored in the config folder. There are actually two files that need to be updated: indexserverconfig.xml and XhiveDatabase.bootstrap. In the first file, you need to update the “storage-location” path that defines where the data are kept and in the second file you need to update all paths pointing to the Database files. Here are some simple commands to replace the old path with the new path and check that it has been done properly:

[xplore@xplore_server_01 ~]$ sed -i "s,/app/xPlore/data,/app/xPlore/test-data," /app/xPlore/config/indexserverconfig.xml
[xplore@xplore_server_01 ~]$ sed -i "s,/app/xPlore/data,/app/xPlore/test-data," /app/xPlore/config/XhiveDatabase.bootstrap
[xplore@xplore_server_01 ~]$ 
[xplore@xplore_server_01 ~]$ grep -A2 "<storage-locations>" /app/xPlore/config/indexserverconfig.xml
    <storage-locations>
        <storage-location path="/app/xPlore/test-data" quota_in_MB="10" status="not_full" name="default"/>
    </storage-locations>
[xplore@xplore_server_01 ~]$ 
[xplore@xplore_server_01 ~]$ grep "/app/xPlore/test-data" /app/xPlore/config/XhiveDatabase.bootstrap | grep 'id="[0-4]"'
        <file path="/app/xPlore/test-data/xhivedb-default-0.XhiveDatabase.DB" id="0"/>
        <file path="/app/xPlore/test-data/SystemData/xhivedb-SystemData-0.XhiveDatabase.DB" id="2"/>
        <file path="/app/xPlore/test-data/SystemData/MetricsDB/xhivedb-SystemData#MetricsDB-0.XhiveDatabase.DB" id="3"/>
        <file path="/app/xPlore/test-data/SystemData/MetricsDB/PrimaryDsearch/xhivedb-SystemData#MetricsDB#PrimaryDsearch-0.XhiveDatabase.DB" id="4"/>
        <file path="/app/xPlore/test-data/xhivedb-temporary-0.XhiveDatabase.DB" id="1"/>

 

The next step is to announce the new location of the data folder to the DSearch so it can create future Xhive Databases at the right location and this is done inside the file indexserver-bootstrap.properties. After the update, this file should look like the following:

[xplore@xplore_server_01 ~]$ cat /app/xPlore/jboss7.1.1/server/DctmServer_PrimaryDsearch/deployments/dsearch.war/WEB-INF/classes/indexserver-bootstrap.properties
# (c) 1994-2009, EMC Corporation. All Rights Reserved.
#Wed May 20 10:40:49 PDT 2009
#Note: Do not change the values of the properties in this file except xhive-pagesize and force-restart-xdb.
node-name=PrimaryDsearch
configuration-service-class=com.emc.documentum.core.fulltext.indexserver.core.config.impl.xmlfile.IndexServerConfig
indexserver.config.file=/app/xPlore/config/indexserverconfig.xml
xhive-database-name=xhivedb
superuser-name=superuser
superuser-password=****************************************************
adminuser-name=Administrator
adminuser-password=****************************************************
xhive-bootstrapfile-name=/app/xPlore/config/XhiveDatabase.bootstrap
xhive-connection-string=xhive://xplore_server_01:9330
xhive-pagesize=4096
# xhive-cache-pages=40960
isPrimary = true
licensekey=**************************************************************
xhive-data-directory=/app/xPlore/test-data
xhive-log-directory=

 

In this file:

  • indexserver.config.file => defines the location of the indexserverconfig.xml file that must be used to recreate the DSearch Xhive Database.
  • xhive-bootstrapfile-name => defines the location and name of the Xhive bootstrap file that will be generated during bootstrap and will be used to create the empty DSearch Xhive Database.
  • xhive-data-directory => defines the path of the data folder that will be used by the Xhive bootstrap file. This will therefore be the future location of the DSearch Xhive Database.

 

As you probably understood, to change the data folder, you just have to adjust the value of the parameter “xhive-data-directory” to point to the new location: /app/xPlore/test-data.

 

When this is done, the third step is to change the Lucene temp path:

[xplore@xplore_server_01 ~]$ cat /app/xPlore/jboss7.1.1/server/DctmServer_PrimaryDsearch/deployments/dsearch.war/WEB-INF/classes/xdb.properties
xdb.lucene.temp.path=/app/xPlore/test-data/temp

 

In this file, xdb.lucene.temp.path defines the path for temporary uncommitted indexes. Therefore it will just be used for temporary indexes but it is still a good practice to change this location since it’s also talking about the data of the DSearch and it helps to keep everything consistent.

 

Then the next step is to clean the cache and restart the DSearch. You can use your custom start/stop script if you have one or use something like this:

[xplore@xplore_server_01 ~]$ rm -rf /app/xPlore/jboss7.1.1/server/DctmServer_*/tmp/work/*
[xplore@xplore_server_01 ~]$ sh -c "cd /app/xPlore/jboss7.1.1/server;nohup ./startPrimaryDsearch.sh & sleep 5;mv nohup.out nohup-PrimaryDsearch.out"

 

Once done, just verify in the log file generated by the start command (for me: /app/xPlore/jboss7.1.1/server/nohup-PrimaryDsearch.out) that the DSearch has been started successfully. If that’s true, then you can also start the IndexAgent:

[xplore@xplore_server_01 ~]$ sh -c "cd /app/xPlore/jboss7.1.1/server;nohup ./startIndexagent.sh & sleep 5;mv nohup.out nohup-Indexagent.out"

 

And here we are, the Xhive Database is now located under the “test-data” folder!

 

 

Additional note: As said at the beginning of this blog, it is also possible to recreate an empty Xhive Database and change its location at the same time. Recreating an empty DB will result in the same thing as the steps above BUT you will have to perform a full reindexing which will take a lot of time if this isn’t a new installation (the more documents are indexed, the more time it will take)… To perform this operation, the steps are mostly the same and are summarized below:

  1. Backup the data and config folders
  2. Remove all files inside the config folder except the indexserverconfig.xml
  3. Create a new (empty) data folder with a different name like “test-data” or “new-data” or…
  4. Update the file indexserver-bootstrap.properties with the reference to the new path
  5. Update the file xdb.properties with the reference to the new path
  6. Clean the cache and restart the DSearch+IndexAgents

Basically, the steps are exactly the same except that you don’t need to update the files indexserverconfig.xml and XhiveDatabase.bootstrap. The first one is normally updated by the DSearch automatically and the second file will be recreated from scratch using the right data path thanks to the update of the file indexserver-bootstrap.properties.
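To illustrate steps 1 to 3, here is a minimal sketch using the same paths as above (the “new-data” folder name is just an example):

[xplore@xplore_server_01 ~]$ current_date=$(date "+%Y%m%d")
[xplore@xplore_server_01 ~]$ cp -R /app/xPlore/data/ /app/xPlore/data_bck_$current_date
[xplore@xplore_server_01 ~]$ cp -R /app/xPlore/config/ /app/xPlore/config_bck_$current_date
[xplore@xplore_server_01 ~]$ # keep only the indexserverconfig.xml inside the config folder
[xplore@xplore_server_01 ~]$ find /app/xPlore/config/ -maxdepth 1 -type f ! -name "indexserverconfig.xml" -delete
[xplore@xplore_server_01 ~]$ mkdir /app/xPlore/new-data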

 

Have fun :)

 

Cet article Documentum story – Change the location of the Xhive Database for the DSearch (xPlore) est apparu en premier sur Blog dbi services.

CDB resource plan: shares and utilization_limit

Mon, 2016-10-17 16:00

I’m preparing some slides about PDB security (lockdown) and isolation (resource) for DOAG and as usual I’ve more info to share than what can fit in 45 minutes. In order to avoid the frustration of removing slides, I usually share them in blog posts. Here are the basic concepts of CDB resource plans in multitenant: shares and resource limits.

The CDB resource plan is mainly about CPU. It also governs the degree of parallelism for parallel queries and the I/O when on Exadata, but the main resource is the CPU: sessions that are not allowed to use more CPU will wait on ‘resmgr: cpu quantum’. In a cloud environment where you provision a PDB, like in the new Exadata Express Cloud Service, you need to ensure that one PDB does not take all CDB resources, but you also have to ensure that resources are fairly shared.

resource_limit

Let’s start with resource limit. This does not depend on the number of PDBs: it is defined as a percentage of the CDB resources. Here I have a CDB with two PDBs and I’ll run a workload on one PDB only: 8 sessions, all CPU-bound, on PDB1.

I’ve defined a CDB resource plan that sets the resource_limit to 50% for PDB1:

CURRENT_TIMESTAMP PLAN PLUGGABLE_DATABASE DIRECTIVE_TYPE SHARES UTIL
------------------------------------ ------------ ------------------------- ------------------------------ ---------- ----------
14-OCT-16 08.46.53.077947 PM +00:00 MY_CDB_PLAN ORA$AUTOTASK AUTOTASK 90
14-OCT-16 08.46.53.077947 PM +00:00 MY_CDB_PLAN ORA$DEFAULT_PDB_DIRECTIVE DEFAULT_DIRECTIVE 1 100
14-OCT-16 08.46.53.077947 PM +00:00 MY_CDB_PLAN PDB1 PDB 1 50
14-OCT-16 08.46.53.077947 PM +00:00 MY_CDB_PLAN PDB2 PDB 1 100
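For reference, here is a minimal sketch of how such a plan can be created with DBMS_RESOURCE_MANAGER (directives matching the listing above, run from CDB$ROOT):

SQL> exec dbms_resource_manager.create_pending_area
SQL> exec dbms_resource_manager.create_cdb_plan(plan=>'MY_CDB_PLAN')
SQL> exec dbms_resource_manager.create_cdb_plan_directive(plan=>'MY_CDB_PLAN',pluggable_database=>'PDB1',shares=>1,utilization_limit=>50)
SQL> exec dbms_resource_manager.create_cdb_plan_directive(plan=>'MY_CDB_PLAN',pluggable_database=>'PDB2',shares=>1,utilization_limit=>100)
SQL> exec dbms_resource_manager.validate_pending_area
SQL> exec dbms_resource_manager.submit_pending_area
SQL> alter system set resource_manager_plan='MY_CDB_PLAN';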

This is an upper limit. I’ve 8 CPUs so PDB1 will be allowed to run only 4 sessions in CPU at a time. Here is the result:

CDB_RESOURCE_PLAN_1_PDB_1_SHARE_50_LIMIT

What you see here is that when more than the allowed percentage has been used the sessions are scheduled out of CPU and wait on ‘resmgr: cpu quantum’. And the interesting thing is that they seem to be stopped all at the same time:

CDB_RESOURCE_PLAN_1_PDB_1_SHARE_50_LIMIT-2

This makes sense because the suspended sessions may hold resources that are used by others. However, this pattern does not reproduce for every workload. More work and future blog posts are probably required about that.

Well, the goal here is to explain that resource_limit is there to define a maximum resource usage. Even if there is no other activity, you will not be able to use all CDB resources if you have a resource limit lower than 100%.

Shares

Shares are there for the opposite reason: to guarantee a minimum of resources to a PDB.
However, the unit is not the same. It cannot be the same: you cannot guarantee a percentage of CDB resources to one PDB because you don’t know how many other PDBs you have. Let’s say you have 4 PDBs and you want to keep them equal. You want to define a minimum of 25% for each. But then, what happens when a new PDB is created? You would need to change all the 25% to 20%. To avoid that, the minimum resources are allocated by shares. You give shares to each PDB and they will get a percentage of resources calculated from their share divided by the total number of shares.

The result is that when there are not enough resources in the CDB to run all the sessions, the PDBs that use more than their share will wait. Here is an example where PDB1 has 2 shares and PDB2 has 1 share, which means that PDB1 will get at least 66% of the resources and PDB2 at least 33%:

CURRENT_TIMESTAMP PLAN PLUGGABLE_DATABASE DIRECTIVE_TYPE SHARES UTIL
------------------------------------ ------------ ------------------------- ------------------------------ ---------- ----------
14-OCT-16 09.14.59.302771 PM +00:00 MY_CDB_PLAN ORA$AUTOTASK AUTOTASK 90
14-OCT-16 09.14.59.302771 PM +00:00 MY_CDB_PLAN ORA$DEFAULT_PDB_DIRECTIVE DEFAULT_DIRECTIVE 1 100
14-OCT-16 09.14.59.302771 PM +00:00 MY_CDB_PLAN PDB1 PDB 2 100
14-OCT-16 09.14.59.302771 PM +00:00 MY_CDB_PLAN PDB2 PDB 1 100

Here is the ASH on each PDB when I run 8 CPU-bound sessions on each. The system is saturated because I have only 8 CPUs.

CDB_RESOURCE_PLAN_2_PDB_2_SHARE_100_LIMIT-PDB1

CDB_RESOURCE_PLAN_2_PDB_1_SHARE_100_LIMIT-PDB2

Because of the shares difference (2 shares for PDB1 and 1 share for PDB2), PDB1 has been able to use more CPU than PDB2 when the system was saturated:
PDB1 was 72% in CPU and 22% waiting, while PDB2 was 50% in CPU and 50% waiting.

CDB_RESOURCE_PLAN_2_PDB_1_SHARE_100_LIMIT-SUM

In order to illustrate what changes when the system is saturated, I’ve run 16 sessions on PDB1 and then, after 60 seconds, 4 sessions on PDB2.

Here is the activity of PDB1:

CDB_RESOURCE_PLAN_SHARE_PDB1

and PDB2:

CDB_RESOURCE_PLAN_SHARE_PDB2

At 22:14 PDB1 was able to use all the available CPU because there is no utilization_limit and no other PDB has activity. The system is saturated, but by PDB1 only.
At 22:15 PDB2 also has activity, so the resource manager must limit PDB1 in order to give resources to PDB2 proportionally to its share. PDB1 with 2 shares is guaranteed to be able to use 2/3 of the CPU. PDB2 with 1 share is guaranteed to use 1/3 of it.
At 22:16 the PDB1 activity has completed, so PDB2 can use more resources. The 4 sessions need less than the available CPU, so the system is not saturated and there is no wait.

What to remember?

Shares are there to guarantee a minimum of resource utilization when the system is saturated.
Resource_limit is there to set a maximum of resource utilization, whether the system is saturated or not.

 

Cet article CDB resource plan: shares and utilization_limit est apparu en premier sur Blog dbi services.

Documentum story – Documentum installers fail with various errors

Mon, 2016-10-17 02:00

Some months ago when installing/removing/upgrading several Documentum components, we ended up facing a strange issue (yes I know, another one!). We were able to see these specific errors during the installation or removal of a Docbase, during the installation of a patch for the Content Server, the installation of the Thumbnail Server, aso… The errors we faced were different for each installer but in the end, all of them were linked to the same root cause. The only error that wasn’t completely useless was the one faced during the installation of a new docbase: “Content is not allowed in trailing section”. Yes I know this might not be really meaningful for everybody but this kind of error usually appears when an XML file isn’t formatted properly: some content isn’t allowed at this location in the file…

 

The strange thing is that these installers were working fine a few days before, so what changed in the meantime exactly? After some research and analysis, I finally found the culprit! One thing that had been added in these few days was D2, which had been installed a few hours before the first error. Now what can be the link between D2 and these errors when running some installers? The first thing to do when there is an issue with D2 on the Content Server is to check the Java Method Server. The first time I saw this error, it was during the installation of a new docbase. As said before, I checked the logs of the Java Method Server and I found the following WARNING which confirmed what I suspected:

2015-10-24 09:39:59,948 UTC WARNING [javax.enterprise.resource.webcontainer.jsf.config] (MSC service thread 1-3) JSF1078: Unable to process deployment descriptor for context ''{0}''.: org.xml.sax.SAXParseException; lineNumber: 40; columnNumber: 1; Content is not allowed in trailing section.
        at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:196) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:175) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:394) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:322) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:281) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.impl.XMLScanner.reportFatalError(XMLScanner.java:1459) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(XMLDocumentScannerImpl.java:1302) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:324) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:845) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:768) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:108) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1196) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:555) [xercesImpl-2.9.1-jbossas-1.jar:]
        at org.apache.xerces.jaxp.SAXParserImpl.parse(SAXParserImpl.java:289) [xercesImpl-2.9.1-jbossas-1.jar:]
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:195) [rt.jar:1.7.0_72]
        at com.sun.faces.config.ConfigureListener$WebXmlProcessor.scanForFacesServlet(ConfigureListener.java:815) [jsf-impl-2.1.7-jbossorg-2.jar:]
        at com.sun.faces.config.ConfigureListener$WebXmlProcessor.<init>(ConfigureListener.java:768) [jsf-impl-2.1.7-jbossorg-2.jar:]
        at com.sun.faces.config.ConfigureListener.contextInitialized(ConfigureListener.java:178) [jsf-impl-2.1.7-jbossorg-2.jar:]
        at org.apache.catalina.core.StandardContext.contextListenerStart(StandardContext.java:3392) [jbossweb-7.0.13.Final.jar:]
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:3850) [jbossweb-7.0.13.Final.jar:]
        at org.jboss.as.web.deployment.WebDeploymentService.start(WebDeploymentService.java:90) [jboss-as-web-7.1.1.Final.jar:7.1.1.Final]
        at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811)
        at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_72]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_72]
        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_72]

 

So the error “Content is not allowed in trailing section” comes from the JMS which isn’t able to properly read the first character of line 40 of an XML “deployment descriptor” file. So which file is that? That’s where the fun begins! There are several deployment descriptors in JBoss like web.xml, jboss-app.xml, jboss-deployment-structure.xml, jboss-web.xml, aso…

 

The D2 installer is updating some configuration files like the server.ini. This is a text file, pretty simple to update and indeed the file is properly formatted so no issue on this side. Except this file, the D2 installer is mainly updating XML files like the following ones:
  • $DOCUMENTUM_SHARED/jboss7.1.1/server/DctmServer_MethodServer/deployments/ServerApps.ear/META-INF/jboss-deployment-structure.xml
  • $DOCUMENTUM_SHARED/jboss7.1.1/server/DctmServer_MethodServer/deployments/ServerApps.ear/DmMethods.war/WEB-INF/web.xml
  • $DOCUMENTUM_SHARED/jboss7.1.1/server/DctmServer_MethodServer/deployments/bpm.ear/META-INF/jboss-deployment-structure.xml
  • $DOCUMENTUM_SHARED/jboss7.1.1/modules/emc/d2/lockbox/main/module.xml
  • aso…

 

At this point, it was pretty simple to figure out the issue: I just checked all these files until I found the wrongly updated/corrupted XML file. And the winner was… the file web.xml for the DmMethods inside the ServerApps. The D2 installer usually updates/reads this file but, in the process of doing so, it can also corrupt it… It is not a big corruption but it is still annoying since it will prevent some installers from working properly and it will display the error shown above in the Java Method Server. Basically whenever you have some parsing errors, I would suggest taking a look at the web.xml files across the JMS. The D2 installer in our case added the word “ap” at the end of this file. As you know, an XML file should be properly formatted to be readable and “ap” isn’t a correct XML ending tag:

[dmadmin@content_server_01 ~]$ cat $DOCUMENTUM_SHARED/jboss7.1.1/server/DctmServer_MethodServer/deployments/ServerApps.ear/DmMethods.war/WEB-INF/web.xml
<?xml version="1.0" encoding="UTF-8"?>
<web-app>
    <display-name>Documentum Method Invocation Servlet</display-name>
    <description>This servlet is for Java method invocation using the DO_METHOD apply call.</description>
    <servlet>
        <servlet-name>DoMethod</servlet-name>
        <description>Documentum Method Invocation Servlet</description>
        <servlet-class>com.documentum.mthdservlet.DoMethod</servlet-class>
        <init-param>
            <param-name>trace</param-name>
            <param-value>f</param-value>
        </init-param>
        <init-param>
            <param-name>docbase_install_owner_name</param-name>
            <param-value>dmadmin</param-value>
        </init-param>
        <init-param>
            <param-name>methodlocation-1</param-name>
            <param-value>$DOCUMENTUM/dba/java_methods</param-value>
        </init-param>
        <init-param>
            <param-name>docbase-GLOBAL_REGISTRY</param-name>
            <param-value>GLOBAL_REGISTRY</param-value>
        </init-param>
        <init-param>
            <param-name>docbase-DOCBASE1</param-name>
            <param-value>DOCBASE1</param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>
    <servlet-mapping>
        <servlet-name>DoMethod</servlet-name>
        <url-pattern>/servlet/DoMethod</url-pattern>
    </servlet-mapping>
</web-app>
ap
[dmadmin@content_server_01 ~]$

 

So to correct this issue, you just have to remove the word “ap” from the end of this file, restart the JMS and finally restart the installer: the issue should be gone. That’s pretty simple but it is still annoying that installers provided by EMC can cause such trouble on their own products.
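A quick way to spot this kind of corruption, assuming xmllint is available on the server, is to check the well-formedness of the deployment descriptors; the command prints nothing when the file is fine and reports the offending line otherwise:

[dmadmin@content_server_01 ~]$ xmllint --noout $DOCUMENTUM_SHARED/jboss7.1.1/server/DctmServer_MethodServer/deployments/ServerApps.ear/DmMethods.war/WEB-INF/web.xml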

 

The errors mentioned above are related to these XML files being wrongly updated by the D2 installer but that’s actually not the only installer that often wrongly updates XML files. As far as I remember, the BPM installer and Thumbnail Server installer can also produce the exact same issue and the reason behind that is probably that the XML files of the Java Method Server on Linux boxes have a wrong FileFormat… We faced this issue with all versions that we installed so far on our different environments: CS 7.2 P02, P05, P16… Each and every time we install a new Documentum Content Server, all XML files of the JMS are using the DOS FileFormat and this prevents the D2/Thumbnail/BPM installers from doing their job.

 

As a sub-note, I have also seen some issues with the file “jboss-deployment-structure.xml”. Just like the “web.xml” above, this one is also present for all applications deployed under the Java Method Server. Some installers will try to update this file (including D2, in order to configure the Lockbox in it) but again the same issue is happening, mostly because of the wrong FileFormat: I have already seen the whole content of this file just being removed by a Documentum installer… So before doing anything, I would suggest taking a backup of the JMS as soon as it is installed and running and before installing all additional components like D2, bpm, Thumbnail Server, aso… On Linux, it is pretty easy to see and change the FileFormat of a file. Just open it using “vi” for example and then write “:set ff?”. This will display the current FileFormat and you can then change it using “:set ff=unix”, if needed.
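To check many files at once, the file command also reveals DOS line endings; here is a small sketch (dos2unix is an assumption here, it may not be installed on every box):

[dmadmin@content_server_01 ~]$ find $DOCUMENTUM_SHARED/jboss7.1.1/server/DctmServer_MethodServer/deployments -name "*.xml" -exec file {} \; | grep CRLF
[dmadmin@content_server_01 ~]$ dos2unix /path/to/the/corrupted/file.xml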

 

I don’t remember seeing this kind of behavior before CS 7.2 so maybe it is just linked to this specific version… If you have already seen such a thing with a previous version, don’t hesitate to share!

 

Cet article Documentum story – Documentum installers fail with various errors est apparu en premier sur Blog dbi services.

Documentum story – Monitoring of WebLogic Servers

Sat, 2016-10-15 02:00

As you already know if you are following our Documentum Story, we have been building and managing, for some time now, a huge Documentum Platform with more than 115 servers so far (still growing). To be able to manage this platform properly, we need an efficient monitoring tool. In this blog, I will not talk about Documentum but rather a little bit about the monitoring solution we integrated with Nagios to be able to support all of our WebLogic Servers. For those of you who don’t know, Nagios is a very popular Open Source monitoring tool launched in 1999. By default Nagios doesn’t provide any interface to monitor WebLogic or Documentum and therefore we chose to build our own script package to be able to properly monitor our Platform.

 

At the beginning of the project when we were installing the first WebLogic Servers, we used the monitoring scripts coming from the old Platform (a Documentum 6.7 Platform not managed by us). The idea behind these monitoring scripts was the following one:

  • The Nagios Server needs to perform a check of a service
  • The Nagios Server contacts the Nagios Agent which executes the check
  • The Check is starting its own WLST script to retrieve only the value needed for this check (each check calls a different WLST script)
  • The Nagios Agent returns the value to the Nagios Server which is then happy with it

 

This pretty simple approach was working fine at the beginning when we only had a few WebLogic Servers with not so much to monitor on them… The problem is that the Platform was growing very fast and we quickly started to see a few timeouts on the different checks because Nagios was trying to execute a lot of checks at the same time on the same host. For example on a specific environment, we had two WebLogic Domains running with 4 or 5 Managed Servers for each domain that were hosting a Documentum Application (DA, D2, D2-Config, …). We were monitoring the heapSize, the number of threads, the server state, the number of sessions, the different URLs with and without Load Balancer, aso… for each Managed Server and for the AdminServers too. Therefore we quickly reached a point where 5 or 10 WLST scripts were running at the same time just for the monitoring.

 

The problem with a WLST script is that it takes a lot of time to initialize itself and start (several seconds) and during that time, 1 or 2 CPUs are fully used only for that. Now correlate this figure with the fact that there are dozens of checks running every 5 minutes for each domain, all starting their own WLST script. In the end, you will get a highly used WebLogic Server with a huge CPU consumption only for the monitoring… That might be acceptable for a small installation but that’s definitely not the right thing to do for a huge Platform.

 

Therefore we needed to do something else. To solve this particular problem, I developed a new set of scripts that I integrated with Nagios to replace the old ones. The idea behind these new scripts was that they should be able to provide us at least the same thing as the old ones but without starting so many WLST scripts, and they should be easily extensible. I worked on this small development and this is what I came up with:

  • The Nagios Server needs to perform a check of a service
  • The Nagios Server contacts the Nagios Agent which executes the check
  • The Check is reading a log file to find the value needed for this check
  • The Nagios Agent returns the value to the Nagios Server which is then happy with it

 

Pretty similar isn’t it? Indeed… And yet so different! The main idea behind this new version is that instead of starting a WLST script for each check, which will fully use 1 or 2 CPUs and last for 2 to 10 seconds (depending on the type of check and on the load), this new version will only read a very short log file (1 log file per check) that contains one line: the result of the check. Reading a log file like that takes a few milliseconds and it doesn’t consume 2 CPUs for doing that… Now the remaining question is: how do we handle the process that populates the log files? Because yes, reading a log file is fast, but how can we ensure that this log file will contain the correct data?

 

To manage that, this is what I did:

  • Creation of a shell script that will:
    • Be executed by the Nagios Agent for each check
    • Check if the WebLogic Domain is running and exit if not
    • Check if the WLST script is running and start it if not
    • Ensure the log file has been updated in the last 5 minutes (meaning the monitoring is running and the value that will be read is correct)
    • Read the log file
    • Analyze the information coming from the log file and return that to the Nagios Agent
  • Creation of a WLST script that will:
    • Be started once, do its job, sleep for 2 minutes and then do it again
    • Retrieve the monitoring values and store that in log files
    • Store error messages in the log files if there is any issue

 

I will not describe the shell script any further because it is just basic shell commands (a minimal sketch is shown later in this post) but I will show you instead an example of a WLST script that can be used to monitor a few things (ThreadPool of all Servers, HeapFree of all Servers, Sessions of all Applications deployed on all Servers):

[nagios@weblogic_server_01 scripts]$ cat DOMAIN_check_weblogic.wls
from java.io import File
from java.io import FileOutputStream

directory='/app/nagios/etc/objects/scripts'
userConfig=directory + '/DOMAIN_configfile.secure'
userKey=directory + '/DOMAIN_keyfile.secure'
address='weblogic_server_01'
port='8443'

connect(userConfigFile=userConfig, userKeyFile=userKey, url='t3s://' + address + ':' + port)

def setOutputToFile(fileName):
  outputFile=File(fileName)
  fos=FileOutputStream(outputFile)
  theInterpreter.setOut(fos)

def setOutputToNull():
  outputFile=File('/dev/null')
  fos=FileOutputStream(outputFile)
  theInterpreter.setOut(fos)

while 1:
  domainRuntime()
  for server in domainRuntimeService.getServerRuntimes():
    setOutputToFile(directory + '/threadpool_' + domainName + '_' + server.getName() + '.out')
    cd('/ServerRuntimes/' + server.getName() + '/ThreadPoolRuntime/ThreadPoolRuntime')
    print 'threadpool_' + domainName + '_' + server.getName() + '_OUT',get('ExecuteThreadTotalCount'),get('HoggingThreadCount'),get('PendingUserRequestCount'),get('CompletedRequestCount'),get('Throughput'),get('HealthState')
    setOutputToNull()
    setOutputToFile(directory + '/heapfree_' + domainName + '_' + server.getName() + '.out')
    cd('/ServerRuntimes/' + server.getName() + '/JVMRuntime/' + server.getName())
    print 'heapfree_' + domainName + '_' + server.getName() + '_OUT',get('HeapFreeCurrent'),get('HeapSizeCurrent'),get('HeapFreePercent')
    setOutputToNull()

  try:
    setOutputToFile(directory + '/sessions_' + domainName + '_console.out')
    cd('/ServerRuntimes/AdminServer/ApplicationRuntimes/consoleapp/ComponentRuntimes/AdminServer_/console')
    print 'sessions_' + domainName + '_console_OUT',get('OpenSessionsCurrentCount'),get('SessionsOpenedTotalCount')
    setOutputToNull()
  except WLSTException,e:
    setOutputToFile(directory + '/sessions_' + domainName + '_console.out')
    print 'CRITICAL - The Server AdminServer or the Administrator Console is not started'
    setOutputToNull()

  domainConfig()
  for app in cmo.getAppDeployments():
    domainConfig()
    cd('/AppDeployments/' + app.getName())
    for appServer in cmo.getTargets():
      domainRuntime()
      try:
        setOutputToFile(directory + '/sessions_' + domainName + '_' + app.getName() + '.out')
        cd('/ServerRuntimes/' + appServer.getName() + '/ApplicationRuntimes/' + app.getName() + '/ComponentRuntimes/' + appServer.getName() + '_/' + app.getName())
        print 'sessions_' + domainName + '_' + app.getName() + '_OUT',get('OpenSessionsCurrentCount'),get('SessionsOpenedTotalCount')
        setOutputToNull()
      except WLSTException,e:
        setOutputToFile(directory + '/sessions_' + domainName + '_' + app.getName() + '.out')
        print 'CRITICAL - The Managed Server ' + appServer.getName() + ' or the Application ' + app.getName() + ' is not started'
        setOutputToNull()

  java.lang.Thread.sleep(120000)

[nagios@weblogic_server_01 scripts]$

 

A few notes related to the above WLST script:

  • userConfig and userKey are two files created previously in WLST that contain the username/password of the current user (at the time of creation of these files) in an encrypted way. This allows you to login to WLST without having to type your username and password and more importantly, without having to put a clear text password in this file…
  • To ensure the security of this environment we are always using t3s to perform the monitoring checks and this requires you to configure the AdminServer to HTTPS.
  • In the script, I’m using the “setOutputToFile” and “setOutputToNull” functions. The first one is to redirect the output to the file mentioned in parameter while the second one is to remove all output. That’s basically to ensure that the log files generated ONLY contain the needed lines and nothing else.
  • There is an infinite loop (while 1) that executes all checks, create/update all log files and then sleep for 120 000 ms (so that’s 2 minutes) before repeating it.

 

As said above, this is easily extendable and therefore you can just add a new paragraph with the new values to retrieve. So have fun with that! :)
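And as promised, here is a minimal sketch of what the shell script side could look like (paths, process checks and the wlst.sh location are simplified assumptions, not our exact production script):

[nagios@weblogic_server_01 scripts]$ cat DOMAIN_check_weblogic.sh
#!/bin/bash
# Usage: DOMAIN_check_weblogic.sh <check_name>, e.g. threadpool_DOMAIN_AdminServer
directory=/app/nagios/etc/objects/scripts
logfile=${directory}/$1.out

# Exit UNKNOWN (3) if the WebLogic Domain is not running at all
pgrep -f "weblogic.Name" > /dev/null || { echo "UNKNOWN - WebLogic Domain not running"; exit 3; }

# Start the WLST monitoring script if it is not running yet
pgrep -f "DOMAIN_check_weblogic.wls" > /dev/null || \
  nohup wlst.sh ${directory}/DOMAIN_check_weblogic.wls > /dev/null 2>&1 &

# Ensure the log file has been updated in the last 5 minutes
if [ -z "$(find ${logfile} -mmin -5 2>/dev/null)" ]; then
  echo "CRITICAL - ${logfile} is missing or outdated"
  exit 2
fi

# Return the value to the Nagios Agent (exit codes: 0=OK, 2=CRITICAL)
grep -q "CRITICAL" ${logfile} && { cat ${logfile}; exit 2; }
cat ${logfile}
exit 0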

 

Now here is a comparison between the two methods, using real figures coming from one of our WebLogic Servers:

  • Old:
    • 40 monitoring checks running every 5 minutes => 40 WLST scripts started
    • each one for a duration of 6 seconds (average)
    • each one using 200% CPU during that time (2 CPUs)
  • New:
    • Shell script:
      • 40 monitoring checks running every 5 minutes => 40 log files read
      • each one for a duration of 0,1s (average)
      • each one using 100% CPU during that time (1 CPU)
    • WLST script:
      • One loop every 2 minutes (so 2.5 loops in 5 minutes)
      • each one for a duration of 0.5s (average)
      • each one using 100% CPU during that time (1 CPU)

 

Period       CPU Time (Old)                                        CPU Time (New)
5 minutes    40*6*2 <~> 480 s                                      40*0.1*1 + 2.5*0.5*1 <~> 5.25 s
1 day        480*(1440/5) <~> 138 240 s <~> 2 304 min <~> 38.4 h   5.25*(1440/5) <~> 1 512 s <~> 25.2 min <~> 0.42 h

Based on these figures, we can see that our new monitoring solution is almost 100 times more efficient than the old one so that’s a success: instead of spending 38.4 hours of CPU time over a 24-hour period (so that’s 1.6 CPUs used the whole day), we are now using 1 CPU for only 25 minutes! Here I’m just talking about the CPU Time but of course you can do the same thing for the memory, processes, aso…

 

Note: Starting with WebLogic 12c, Oracle introduced the RESTful management services which can now be used to monitor WebLogic too… They have been improved in 12.2 and can become a pretty good alternative to WLST scripting, but for now we are still using this WLST approach with one single execution every 2 minutes and Nagios reading the log files when need be.
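For illustration, on a 12.2.1 domain the same kind of values could be fetched with a simple HTTP call; here is a minimal sketch (host, port and credentials are placeholders):

[nagios@weblogic_server_01 scripts]$ curl -s -k -u monitoring_user:password -H "Accept: application/json" \
    "https://weblogic_server_01:8443/management/weblogic/latest/domainRuntime/serverRuntimes?links=none&fields=name,state,healthState"

Each serverRuntime is returned as a JSON document, so the state of all servers can be parsed without starting any JVM on the monitored host.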

 

Cet article Documentum story – Monitoring of WebLogic Servers est apparu en premier sur Blog dbi services.

Using Apache JMeter to run load tests on a Web application protected by Microsoft Active Directory Federation Services

Fri, 2016-10-14 09:37

One of my last missions was to configure Apache JMeter for performance and load tests on a Web Application. The access to this Web Application requires authentication provided by a Microsoft Active Directory Federation Services (ADFS) Single Sign On environment.
This Single Sign On communication is based on SAML (Security Assertion Markup Language). SAML is an XML-based open standard data format for exchanging authentication and authorization data between parties, in particular between an identity provider and a service provider. The ADFS login steps rely on several parameters that need to be fetched and re-injected into the following steps, like ‘SAMLRequest’, ‘RelayState’ and ‘SAMLResponse’.
This step-by-step tutorial shows the SAML JMeter scenario part to perform those ADFS login steps.

Record a first scenario

After installing the Apache JMeter tool, you are ready to record a first scenario. Have a look on the JMeter user manual to configure JMeter for recording scenario.

1. Adapt the HTTP(S) Test Script Recorder

For this task we need to record all HTTP(S) requests: those from the Application and those from the Single Sign On Server. We then need to change the HTTP(S) Test Script Recorder parameters as below.

Open the “WorkBench” on the tree and click on the “HTTP(S) Test Script Recorder”.

JMeter ADFS
The scenario recording requires the following changes to the “HTTP(S) Test Script Recorder”:
Port: this is the port on the server running JMeter that will act as proxy. The default value is 8080.
URL Patterns to Include: add “.*” to include all requests (you may exclude some later, if you desire).

2.    Configure the Browser to use the Test Script Recorder as proxy

Go to your favourite browser (Firefox, Internet Explorer, Chrome, etc.) and configure the proxy as follows.
The example below is for Internet Explorer 11 (it may differ from version to version):

  1. Go to Tools > Internet Options.
  2. Select the “Connection” tab.
  3. Click the “LAN settings” button.
  4. Check the  “Use a proxy server for your LAN” check-box. The address and port fields should be enabled now.
  5. In the Address field, type the server name or the IP address of the server running the JMeter HTTP(S) Test Script Recorder and in the Port field, enter the port defined in Step 1.

From now on, JMeter is proxying the requests.

3.    Record your first scenario

Connect to the Web Application using the browser you have configured in the previous step. Run a simple scenario including the authentication steps. Once done, stop the HTTP(S) Test Script Recorder in JMeter.

4.    Analyse the recorded entries

Analyse the recorded entries to find out the entry that redirects to the login page. In this specific case, it was the first request because the Web Application automatically displays the login page for all users not authenticated. From this request, we need to fetch two values, ‘SAMLRequest’ and ‘RelayState’, included in the page response data and submit them to the ADFS login URL. After successful login, ADFS will provide a SAMLResponse that needs to be submitted back to the callback URL. This can be done by using Regular Expression Extractors. Refer to the image below to see how to do this.

JMeter AFFS IMG2

  • SAMLRequest Extractor (variable SAMLRequest): name="SAMLRequest" value="([0-9A-Za-z;.: \/=+]*)"
  • RelayState Extractor (variable RelayState): name="RelayState" value="([&#;._a-zA-Z0-9]*)"
  • SAMLResponse Extractor (variable SAMLResponse): name="SAMLResponse" value="([&#;._+=a-zA-Z0-9]*)"
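In each Regular Expression Extractor, the fields would typically be filled as follows (example for SAMLRequest; Template and Match No. are the standard single-match settings):

  • Reference Name: SAMLRequest
  • Regular Expression: name="SAMLRequest" value="([0-9A-Za-z;.: \/=+]*)"
  • Template: $1$
  • Match No.: 1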

In the registered scenario, look for the entries having SAMLRequest, RelayState and SAMLResponse as parameters and replace them with the corresponding variables set in the regular expressions created in the previous step.

JMeter ADFS IMG3
JMeter ADFS IMG4

Once this is done, the login test scenario can be executed.

This JMeter test plan can be cleaned from the URL requests and be used as a base plan to record more complex test plans.

 

Cet article Using Apache JMeter to run load tests on a Web application protected by Microsoft Active Directory Federation Services est apparu en premier sur Blog dbi services.

Manage Azure in PowerShell (RM)

Fri, 2016-10-14 02:49

Azure offers two deployment models for cloud components: the Resource Manager (RM) and the Classic deployment model. Newer and easier to manage, the Resource Manager is the model Microsoft recommends.
Even if these two models can exist at the same time in Azure, they are different and managed differently: in PowerShell, the RM model is managed through its own dedicated cmdlets (prefixed with “AzureRm”).

In order to be able to communicate with Azure from On-Premises in PowerShell, you need to download and install the Azure PowerShell from WebPI. For more details, please refer to this Microsoft Azure post “How to install and configure Azure PowerShell“.
 
 
Azure PowerShell installs many modules located in C:\Program Files (x86)\Microsoft SDKs\Azure\PowerShell:
Get-module -ListAvailable -Name *AzureRm*
ModuleType Version Name ExportedCommands
---------- ------- ---- ----------------
Manifest 1.1.3 AzureRM.ApiManagement {Add-AzureRmApiManagementRegion, Get-AzureRmApiManagementSsoToken, New-AzureRmApiManagementHostnam...
Manifest 1.0.11 AzureRM.Automation {Get-AzureRMAutomationHybridWorkerGroup, Get-AzureRmAutomationJobOutputRecord, Import-AzureRmAutom...
Binary 0.9.8 AzureRM.AzureStackAdmin {Get-AzureRMManagedLocation, New-AzureRMManagedLocation, Remove-AzureRMManagedLocation, Set-AzureR...
Manifest 0.9.9 AzureRM.AzureStackStorage {Add-ACSFarm, Get-ACSEvent, Get-ACSEventQuery, Get-ACSFarm...}
Manifest 1.0.11 AzureRM.Backup {Backup-AzureRmBackupItem, Enable-AzureRmBackupContainerReregistration, Get-AzureRmBackupContainer...
Manifest 1.1.3 AzureRM.Batch {Remove-AzureRmBatchAccount, Get-AzureRmBatchAccount, Get-AzureRmBatchAccountKeys, New-AzureRmBatc...
Manifest 1.0.5 AzureRM.Cdn {Get-AzureRmCdnCustomDomain, New-AzureRmCdnCustomDomain, Remove-AzureRmCdnCustomDomain, Get-AzureR...
Manifest 0.1.2 AzureRM.CognitiveServices {Get-AzureRmCognitiveServicesAccount, Get-AzureRmCognitiveServicesAccountKey, Get-AzureRmCognitive...
Manifest 1.3.3 AzureRM.Compute {Remove-AzureRmAvailabilitySet, Get-AzureRmAvailabilitySet, New-AzureRmAvailabilitySet, Get-AzureR...
Manifest 1.0.11 AzureRM.DataFactories {Remove-AzureRmDataFactory, Get-AzureRmDataFactoryRun, Get-AzureRmDataFactorySlice, Save-AzureRmDa...
Manifest 1.1.3 AzureRM.DataLakeAnalytics {Get-AzureRmDataLakeAnalyticsDataSource, Remove-AzureRmDataLakeAnalyticsCatalogSecret, Set-AzureRm...
Manifest 1.0.11 AzureRM.DataLakeStore {Add-AzureRmDataLakeStoreItemContent, Export-AzureRmDataLakeStoreItem, Get-AzureRmDataLakeStoreChi...
Manifest 1.0.2 AzureRM.DevTestLabs {Get-AzureRmDtlAllowedVMSizesPolicy, Get-AzureRmDtlAutoShutdownPolicy, Get-AzureRmDtlAutoStartPoli...
Manifest 1.0.11 AzureRM.Dns {Get-AzureRmDnsRecordSet, New-AzureRmDnsRecordConfig, Remove-AzureRmDnsRecordSet, Set-AzureRmDnsRe...
Manifest 1.1.3 AzureRM.HDInsight {Get-AzureRmHDInsightJob, New-AzureRmHDInsightSqoopJobDefinition, Wait-AzureRmHDInsightJob, New-Az...
Manifest 1.0.11 AzureRM.Insights {Add-AzureRmMetricAlertRule, Add-AzureRmLogAlertRule, Add-AzureRmWebtestAlertRule, Get-AzureRmAler...
Manifest 1.1.10 AzureRM.KeyVault {Get-AzureRmKeyVault, New-AzureRmKeyVault, Remove-AzureRmKeyVault, Remove-AzureRmKeyVaultAccessPol...
Manifest 1.0.7 AzureRM.LogicApp {Get-AzureRmIntegrationAccountAgreement, Get-AzureRmIntegrationAccountCallbackUrl, Get-AzureRmInte...
Manifest 0.9.2 AzureRM.MachineLearning {Export-AzureRmMlWebService, Get-AzureRmMlWebServiceKeys, Import-AzureRmMlWebService, Remove-Azure...
Manifest 1.0.12 AzureRM.Network {Add-AzureRmApplicationGatewayBackendAddressPool, Get-AzureRmApplicationGatewayBackendAddressPool,...
Manifest 1.0.11 AzureRM.NotificationHubs {Get-AzureRmNotificationHubsNamespaceAuthorizationRules, Get-AzureRmNotificationHubsNamespaceListK...
Manifest 1.0.11 AzureRM.OperationalInsights {Get-AzureRmOperationalInsightsSavedSearch, Get-AzureRmOperationalInsightsSavedSearchResults, Get-...
Manifest 1.0.0 AzureRM.PowerBIEmbedded {Remove-AzureRmPowerBIWorkspaceCollection, Get-AzureRmPowerBIWorkspaceCollection, Get-AzureRmPower...
Manifest 1.0.11 AzureRM.Profile {Enable-AzureRmDataCollection, Disable-AzureRmDataCollection, Remove-AzureRmEnvironment, Get-Azure...
Manifest 1.1.3 AzureRM.RecoveryServices {Get-AzureRmRecoveryServicesBackupProperties, Get-AzureRmRecoveryServicesVault, Get-AzureRmRecover...
Manifest 1.0.3 AzureRM.RecoveryServices.Backup {Backup-AzureRmRecoveryServicesBackupItem, Get-AzureRmRecoveryServicesBackupManagementServer, Get-...
Manifest 1.1.9 AzureRM.RedisCache {Reset-AzureRmRedisCache, Export-AzureRmRedisCache, Import-AzureRmRedisCache, Remove-AzureRmRedisC...
Manifest 2.0.2 AzureRM.Resources {Get-AzureRmADApplication, Get-AzureRmADGroupMember, Get-AzureRmADGroup, Get-AzureRmADServicePrinc...
Manifest 1.0.2 AzureRM.ServerManagement {Install-AzureRmServerManagementGatewayProfile, Reset-AzureRmServerManagementGatewayProfile, Save-...
Manifest 1.1.10 AzureRM.SiteRecovery {Stop-AzureRmSiteRecoveryJob, Get-AzureRmSiteRecoveryNetwork, Get-AzureRmSiteRecoveryNetworkMappin...
Manifest 1.0.11 AzureRM.Sql {Get-AzureRmSqlDatabaseImportExportStatus, New-AzureRmSqlDatabaseExport, New-AzureRmSqlDatabaseImp...
Manifest 1.1.3 AzureRM.Storage {Get-AzureRmStorageAccount, Get-AzureRmStorageAccountKey, Get-AzureRmStorageAccountNameAvailabilit...
Manifest 1.0.11 AzureRM.StreamAnalytics {Get-AzureRmStreamAnalyticsFunction, Get-AzureRmStreamAnalyticsDefaultFunctionDefinition, New-Azur...
Manifest 1.0.11 AzureRM.Tags {Remove-AzureRmTag, Get-AzureRmTag, New-AzureRmTag}
Manifest 1.0.11 AzureRM.TrafficManager {Disable-AzureRmTrafficManagerEndpoint, Enable-AzureRmTrafficManagerEndpoint, Set-AzureRmTrafficMa...
Manifest 1.0.11 AzureRM.UsageAggregates Get-UsageAggregates
Manifest 1.1.3 AzureRM.Websites {Get-AzureRmAppServicePlanMetrics, New-AzureRmWebAppDatabaseBackupSetting, Restore-AzureRmWebAppBa...

 
The basic cmdlets to connect and navigate between your different Accounts or Subscriptions are located in the “AzureRM.Profile” module:
PS C:\> Get-Command -Module AzureRM.Profile
CommandType Name Version Source
----------- ---- ------- ------
Alias Login-AzureRmAccount 1.0.11 AzureRM.Profile
Alias Select-AzureRmSubscription 1.0.11 AzureRM.Profile
Cmdlet Add-AzureRmAccount 1.0.11 AzureRM.Profile
Cmdlet Add-AzureRmEnvironment 1.0.11 AzureRM.Profile
Cmdlet Disable-AzureRmDataCollection 1.0.11 AzureRM.Profile
Cmdlet Enable-AzureRmDataCollection 1.0.11 AzureRM.Profile
Cmdlet Get-AzureRmContext 1.0.11 AzureRM.Profile
Cmdlet Get-AzureRmEnvironment 1.0.11 AzureRM.Profile
Cmdlet Get-AzureRmSubscription 1.0.11 AzureRM.Profile
Cmdlet Get-AzureRmTenant 1.0.11 AzureRM.Profile
Cmdlet Remove-AzureRmEnvironment 1.0.11 AzureRM.Profile
Cmdlet Save-AzureRmProfile 1.0.11 AzureRM.Profile
Cmdlet Select-AzureRmProfile 1.0.11 AzureRM.Profile
Cmdlet Set-AzureRmContext 1.0.11 AzureRM.Profile
Cmdlet Set-AzureRmEnvironment 1.0.11 AzureRM.Profile

Using the cmdlets present in the “AzureRM.Profile” module, you can connect to your Azure Account (enter your credentials when prompted):
PS C:\> Login-AzureRmAccount
Environment : AzureCloud
Account : n.courtine@xxxxxx.com
TenantId : a123456b-789b-123c-4de5-67890fg123h4
SubscriptionId : z123456y-789x-123w-4vu5-67890ts123r4
SubscriptionName : ** Subscription Name **
CurrentStorageAccount :

 
You can list your associated Azure Subscriptions:
Get-AzureRmSubscription
SubscriptionName : ** Subscription Name **
SubscriptionId : z123456y-789x-123w-4vu5-67890ts123r4
TenantId : a123456b-789b-123c-4de5-67890fg123h4

 
To switch your Subscription, do as follows:
Select-AzureRmSubscription -SubscriptionId z123456y-789x-123w-4vu5-67890ts123r4
Environment : AzureCloud
Account : n.courtine@xxxxxx.com
TenantId : a123456b-789b-123c-4de5-67890fg123h4
SubscriptionId : z123456y-789x-123w-4vu5-67890ts123r4
SubscriptionName : ** Subscription Name **
CurrentStorageAccount :

 
Or you can take a “snapshot” of your current location in Azure. It will help you to easily return later to the context that was active at the moment you ran the command:
PS C:\> $context = Get-AzureRmContext
Environment : AzureCloud
Account : n.courtine@xxxxxx.com
TenantId : a123456b-789b-123c-4de5-67890fg123h4
SubscriptionId : z123456y-789x-123w-4vu5-67890ts123r4
SubscriptionName : ** Subscription Name **
CurrentStorageAccount :
...
PS C:\> Set-AzureRmContext -Context $context
Environment : AzureCloud
Account : n.courtine@xxxxxx.com
TenantId : a123456b-789b-123c-4de5-67890fg123h4
SubscriptionId : z123456y-789x-123w-4vu5-67890ts123r4
SubscriptionName : ** Subscription Name **
CurrentStorageAccount :
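The other modules follow the same naming pattern. For instance, to list your Resource Groups with the AzureRM.Resources module shown above:
PS C:\> Get-AzureRmResourceGroup | Select ResourceGroupName, Location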

 
It is also possible to list all the available Storage Accounts associated with your current subscription:
PS C:\> Get-AzureRmStorageAccount | Select StorageAccountName, Location
StorageAccountName Location
------------------ --------
semicroustillants259 westeurope
semicroustillants4007 westeurope
semicroustillants8802 westeurope

 
To see the existing blob containers in each Storage Account, you can for instance reuse the storage context of each account:
PS C:\> Get-AzureRmStorageAccount | ForEach-Object { Get-AzureStorageContainer -Context $_.Context }
Blob End Point: https://dbimssql.blob.core.windows.net/
Name Uri LastModified
---- --- ------------
bootdiagnostics-t... https://dbimssql.blob.core.windows.net/bootdiagnostics-ta... 30.09.2016 12:36:12 +00:00
demo https://dbimssql.blob.core.windows.net/demo 05.10.2016 14:16:01 +00:00
vhds https://dbimssql.blob.core.windows.net/vhds 30.09.2016 12:36:12 +00:00
Blob End Point: https://semicroustillants259.blob.core.windows.net/
Name Uri LastModified
---- --- ------------
mastervhds https://semicroustillants259.blob.core.windows.net/master... 28.09.2016 13:41:19 +00:00
uploads https://semicroustillants259.blob.core.windows.net/uploads 28.09.2016 13:41:19 +00:00
vhds https://semicroustillants259.blob.core.windows.net/vhds 28.09.2016 13:55:57 +00:00
Blob End Point: https://semicroustillants4007.blob.core.windows.net/
Name Uri LastModified
---- --- ------------
artifacts https://semicroustillants4007.blob.core.windows.net/artif... 28.09.2016 13:59:47 +00:00

Azure infrastructure can be easily managed from On-Premises in PowerShell. In a previous post, I explained how to deploy a Virtual Machine from an Image in Azure PowerShell.
If you have remarks or advice, do not hesitate to share ;-)

 

Cet article Manage Azure in PowerShell (RM) est apparu en premier sur Blog dbi services.
