Yann Neuhaus

All blog entries from http://www.dbi-services.com/

Solving a customer issue at OOW14: Dbvisit replicate can even replicate tables with no primary key

Wed, 2014-10-01 06:39
Usually, logical replication of changes relies on the primary key. Each row that is updated or deleted generates a statement to be applied on the target, and that statement affects only one row because it accesses the row by its primary key. If there is no primary key, we need something unique, and in the worst case it is the whole row. But sometimes old applications were designed before being implemented in a relational database and have no uniqueness at all. Is that a problem for logical replication? We will see that Dbvisit replicate can address it.

Here is the case I encountered at a customer. The application has a master-detail table design, and the detail rows are inserted and deleted all together for the same master key. There is no primary key, and not even anything unique. The only value that may help is a timestamp, but sometimes timestamps do not have sufficient precision to be unique. And anyway, imagine what happens if we set the system time back, or during daylight saving changes.

At dbi services we have very good contact with our partner Dbvisit, and it's the kind of question that can be addressed quickly by their support. Anyway, I was at Oracle Open World and was able to discuss it directly with the Dbvisit replicate developers. There is a solution, and it is even documented.

The basic issue is that when the delete occurs, a redo entry is generated for each row that is deleted, and Dbvisit replicate then generates a delete statement to do the same on the target. But when there are duplicates, the first statement will affect several rows and the next statements will affect no rows.

This is the kind of replication complexity that is addressed with conflict resolution. It can be addressed manually: the replication stops when a conflict is detected and continues once we have decided what to do. But we can also set rules to address it automatically when the problem occurs again so that the replication never stops.

Here is the demo about that as I tested it before providing the solution to my customer. 

Note that it concerns only deletes here but the same can be done with updates.

1. I create a table with 4 identical rows for each value of N:

SQL> connect repoe/repoe
Connected.

SQL> create table TESTNOPK as select n,'x' x from (select rownum n from dual connect by level <=10),(select rownum from dual connect by level <=4);

Table created.

2. Status of replication from the Dbvisit console:


| Dbvisit Replicate 2.7.06.4485(MAX edition) - Evaluation License expires in 29 days
MINE IS running. Currently at plog 35 and SCN 796568 (10/01/2014 01:08:04).
APPLY IS running. Currently at plog 35 and SCN 796566 (10/01/2014 01:08:04).
Progress of replication dbvrep_XE:MINE->APPLY: total/this execution
--------------------------------------------------------------------------------------------------------------------------------------------
REPOE.CUSTOMERS:              100%  Mine:1864/1864       Unrecov:0/0         Applied:1864/1864   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.ADDRESSES:              100%  Mine:1864/1864       Unrecov:0/0         Applied:1864/1864   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.CARD_DETAILS:           100%  Mine:1727/1727       Unrecov:0/0         Applied:1727/1727   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.ORDER_ITEMS:            100%  Mine:12520/12520     Unrecov:0/0         Applied:12520/12520 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.ORDERS:                 100%  Mine:10040/10040     Unrecov:0/0         Applied:10040/10040 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.INVENTORIES:            100%  Mine:12269/12269     Unrecov:0/0         Applied:12269/12269 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.LOGON:                  100%  Mine:12831/12831     Unrecov:0/0         Applied:12831/12831 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.TESTNOPK:               100%  Mine:40/40           Unrecov:0/0         Applied:40/40       Conflicts:0/0       Last:01/10/2014 01:08:02/OK
--------------------------------------------------------------------------------------------------------------------------------------------
8 tables listed.

3. I delete the lines with the value 10:


SQL> select * from TESTNOPK where n=10;
         N X
---------- -
        10 x
        10 x
        10 x
        10 x
SQL> delete from TESTNOPK where n=10;
4 rows deleted.
SQL> commit;
Commit complete.

5. The apply is stopped on a conflict: too many rows affected by the delete


MINE IS running. Currently at plog 35 and SCN 797519 (10/01/2014 01:10:56).
APPLY IS running. Currently at plog 35 and SCN 796928 (10/01/2014 01:09:08) and 1 apply conflicts so far (last at 01/10/2014 01:10:57) and WAITING on manual resolve of apply conflict id 35010009996.
Progress of replication dbvrep_XE:MINE->APPLY: total/this execution
--------------------------------------------------------------------------------------------------------------------------------------------
REPOE.CUSTOMERS:              100%  Mine:1864/1864       Unrecov:0/0         Applied:1864/1864   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.ADDRESSES:              100%  Mine:1864/1864       Unrecov:0/0         Applied:1864/1864   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.CARD_DETAILS:           100%  Mine:1727/1727       Unrecov:0/0         Applied:1727/1727   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.ORDER_ITEMS:            100%  Mine:12520/12520     Unrecov:0/0         Applied:12520/12520 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.ORDERS:                 100%  Mine:10040/10040     Unrecov:0/0         Applied:10040/10040 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.INVENTORIES:            100%  Mine:12269/12269     Unrecov:0/0         Applied:12269/12269 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.LOGON:                  100%  Mine:12831/12831     Unrecov:0/0         Applied:12831/12831 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.TESTNOPK:                90%  Mine:44/44           Unrecov:0/0         Applied:40/40       Conflicts:1/1       Last:01/10/2014 01:09:17/RETRY:Command affected 4 row(s).
--------------------------------------------------------------------------------------------------------------------------------------------
8 tables listed.

dbvrep> list conflict
Information for conflict 35010009996 (current conflict):
Table: REPOE.TESTNOPK at transaction 0008.003.0000022b at SCN 796930
SQL text (with replaced bind values): delete from "REPOE"."TESTNOPK" where (1=1) and "N" = 10 and "X" = 'x'
Error: Command affected 4 row(s).
Handled as: PAUSE
Conflict repeated 22 times.

6. I resolve the conflict manually, forcing the delete of all rows

dbvrep> resolve conflict 35010009996 as force
Conflict resolution set.

At that point, there are 3 subsequent conflicts that I need to force as well, because the other delete statements affect no rows. I don't reproduce them here.

7. Once the conflicts are resolved, the replication continues:

MINE IS running. Currently at plog 35 and SCN 800189 (10/01/2014 01:19:16).
APPLY IS running. Currently at plog 35 and SCN 800172 (10/01/2014 01:19:14).
Progress of replication dbvrep_XE:MINE->APPLY: total/this execution
--------------------------------------------------------------------------------------------------------------------------------------------
REPOE.CUSTOMERS:              100%  Mine:1864/1864       Unrecov:0/0         Applied:1864/1864   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.ADDRESSES:              100%  Mine:1864/1864       Unrecov:0/0         Applied:1864/1864   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.CARD_DETAILS:           100%  Mine:1727/1727       Unrecov:0/0         Applied:1727/1727   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.ORDER_ITEMS:            100%  Mine:12520/12520     Unrecov:0/0         Applied:12520/12520 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.ORDERS:                 100%  Mine:10040/10040     Unrecov:0/0         Applied:10040/10040 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.INVENTORIES:            100%  Mine:12269/12269     Unrecov:0/0         Applied:12269/12269 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.LOGON:                  100%  Mine:12831/12831     Unrecov:0/0         Applied:12831/12831 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.TESTNOPK:               100%  Mine:44/44           Unrecov:0/0         Applied:44/44       Conflicts:4/4       Last:01/10/2014 01:18:21/RETRY:Command affected 0 row(s).
--------------------------------------------------------------------------------------------------------------------------------------------
8 tables listed.

dbvrep> list conflict
Information for conflict 0 (current conflict):
No conflict with id 0 found.

8. Now I want to set a rule that manages this situation automatically. I add a 'too many rows' conflict rule so that each delete touches only one row:


dbvrep> SET_CONFLICT_HANDLERS FOR TABLE REPOE.TESTNOPK FOR DELETE ON TOO_MANY TO SQL s/$/ and rownum = 1/
Connecting to running apply: [The table called REPOE.TESTNOPK on source is handled on apply (APPLY) as follows:
UPDATE (error): handler: RETRY logging: LOG
UPDATE (no_data): handler: RETRY logging: LOG
UPDATE (too_many): handler: RETRY logging: LOG
DELETE (error): handler: RETRY logging: LOG
DELETE (no_data): handler: RETRY logging: LOG
DELETE (too_many): handler: SQL logging: LOG, regular expression: s/$/ and rownum = 1/
INSERT (error): handler: RETRY logging: LOG
TRANSACTION (error): handler: RETRY logging: LOG]

9. Now testing the automatic conflict resolution:

SQL> delete from TESTNOPK where n=9;
4 rows deleted.
SQL> commit;
Commit complete.
10. The conflicts are automatically managed:

MINE IS running. Currently at plog 35 and SCN 800475 (10/01/2014 01:20:08).
APPLY IS running. Currently at plog 35 and SCN 800473 (10/01/2014 01:20:08).
Progress of replication dbvrep_XE:MINE->APPLY: total/this execution
--------------------------------------------------------------------------------------------------------------------------------------------
REPOE.CUSTOMERS:              100%  Mine:1864/1864       Unrecov:0/0         Applied:1864/1864   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.ADDRESSES:              100%  Mine:1864/1864       Unrecov:0/0         Applied:1864/1864   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.CARD_DETAILS:           100%  Mine:1727/1727       Unrecov:0/0         Applied:1727/1727   Conflicts:0/0       Last:30/09/2014 02:38:30/OK
REPOE.ORDER_ITEMS:            100%  Mine:12520/12520     Unrecov:0/0         Applied:12520/12520 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.ORDERS:                 100%  Mine:10040/10040     Unrecov:0/0         Applied:10040/10040 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.INVENTORIES:            100%  Mine:12269/12269     Unrecov:0/0         Applied:12269/12269 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.LOGON:                  100%  Mine:12831/12831     Unrecov:0/0         Applied:12831/12831 Conflicts:0/0       Last:30/09/2014 02:38:35/OK
REPOE.TESTNOPK:               100%  Mine:48/48           Unrecov:0/0         Applied:48/48       Conflicts:7/7       Last:01/10/2014 01:19:57/OK
--------------------------------------------------------------------------------------------------------------------------------------------
8 tables listed.

Now the replication is automatic and the situation is correctly managed.
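As noted at the beginning, the same can be done for updates. Following the same pattern as the delete rule above (this is an untested sketch built from the syntax shown in step 8, not a command taken from the Dbvisit documentation, so verify it against your version first):

dbvrep> SET_CONFLICT_HANDLERS FOR TABLE REPOE.TESTNOPK FOR UPDATE ON TOO_MANY TO SQL s/$/ and rownum = 1/

With both rules in place, each mined delete or update that hits duplicate rows is restricted to a single row on the target, which keeps the row counts consistent.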



As I already said, Dbvisit replicate is a simple tool but is nevertheless very powerful. And Oracle Open World is an efficient way to learn: share knowledge during the day, and test it during the night when you are too jetlagged to sleep...

OOW14 Day 2 - Delphix #cloneattack

Tue, 2014-09-30 19:14

Do you know Delphix? The first time I heard of it was from Jonathan Lewis. And from Kyle Hailey of course. So it's not only about agile and virtualization: it's real DBA stuff. So, as I did yesterday with the Dbvisit #repattack, let's install the demo.

Here is the setup:

  • one source virtual machine with an XE database
  • one target virtual machine with XE installed but no database
  • one virtual machine with Delphix
And what can we do with that? We can clone the databases instantaneously. It's:
  • a virtual appliance managing storage snapshots for instant cloning
  • snapshots exposed through direct NFS to be used by the database
  • fully automated database maintenance (creation, restore, renaming, etc.) through a nice GUI
So what's the point? You want to clone an environment instantaneously. Choose the point in time you want and it's done. You can clone 50 databases for your 50 developers. You can rewind your test database to run unit tests in a continuous integration development environment. You can do all that stuff that normally requires so many IT procedures with just a few clicks in the Delphix GUI.

Just an example, here is my source database and the way I choose the point in time I want to clone:

CaptureDelphix01.PNG

It's running:

CaptureDelphix02.PNG

The #cloneattack is a good way to test things and discuss with others...

I have now @delphix on my laptop installed with @kylehhailey at #oow14. Cont. tomorrow at OTW http://t.co/QJLVhp93jg pic.twitter.com/QgoAgJPXyo

— Franck Pachot (@FranckPachot) September 30, 2014

@kylehhailey #cloneattack: finished it today now playing with clones while listening to @TanelPoder pic.twitter.com/wH3kQKBp8U

— Franck Pachot (@FranckPachot) September 30, 2014

That's some powerful multitasking - awesome @FranckPachot @TanelPoder

— Kyle Hailey (@kylehhailey) September 30, 2014

Day 2 at Oracle Open World - Best practices

Tue, 2014-09-30 15:03

Today, in this post, I will describe some best practices I learned in several sessions. It's always good to see what is advised by other people who are facing the same or different challenges.

Managing Oracle WebLogic Server with Oracle Enterprise Manager 12c

One session was related to the best practices for managing WebLogic with Cloud Control 12c.

- Use the administration functions:
With Cloud Control 12c, you can now do the WebLogic administration using its console. Starting and stopping the managed servers and applications was already possible, but now you can do more, like configuring resources, deploying applications and so on.
As you are using the Cloud Control console, you can sign in to several target WLS servers, which means entering the required password each time. By providing the credentials and saving them as the preferred ones (in Preferred Credentials), you avoid entering the password each time.

- Automate tasks across domains with predefined jobs.
Predefined jobs can be used to automatically start WLST scripts against one or more domains. As with the WLS console, you can record your actions into a .py script, update it for your new targets, create the job and set the schedule. This can obviously be a script for configuration, but also for monitoring or creating statistics.

- Automatic response to issues via corrective actions
By including corrective actions in templates, you can apply them to managed servers. If the corrective action fails, you can use rules to send an email in a second step to inform that there is an issue which needs to be solved.

- use EMCLI to manage the credentials

- use APEX to query the Management Repository for reporting

Troubleshooting Performance Issues

Another session where best practices were explained was "Oracle WebLogic Server: Best Practices for Troubleshooting Performance Issues". A very helpful session: all chairs in the room were occupied and some people had to stand, which shows how much the session was anticipated.

Some general tips:  

  •  -verbose:gc to find out if the performance issues are related to the garbage collection behaviour  
  •  -Dweblogic.log.RedirectStdoutToServerLogEnabled=true  
  •  use the Java Flight Recorder (JFR)  
  •  use Remote Diagnostic Agent (RDA)  
  •  use WLDF to create an image of your system  
  •  Thread/heap dumps to see how your application is working

One of the first actions you have to take is to read the log files, as they can show you which kinds of errors are logged: stuck threads, too many open files, and so on.

The same application can behave differently whether it is deployed on WebLogic running on Linux or on Windows. For instance a socket can remain in TIME_WAIT 4 minutes in Linux but only 1 minute under Windows.

In case you encounter OutOfMemory errors, log the garbage collector information

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:-PrintGCTimeStamps

More information can be found in the document referred by ID 877172.1

Thread Dump
To analyze your application you can create a thread dump

  •  under Unix/Linux: kill -3 <pid>
  •  jstack <pid>
  •  WLST threadDump()
  •  jcmd print_thread (for Java HotSpot)
  •  jcmd <pid> Thread.print (for Java 7)

More information can be found in the document referred by ID 1098691.1

Once the thread dump has been created you have to analyze it.
For that, several tools are available

  •  Samurai
  •  Thread Dump Analyzer (TDA)
  •  ThreadLogic

Some best practices I already knew; one tool I want to test now is ThreadLogic, so that I am prepared in case I have to use it in a real situation.

Let's see what will happen in the next days.

1st day at Oracle Open World '14 : news on the upcoming Oracle WebLogic 12.2.1

Mon, 2014-09-29 19:33

Landing on Sunday the 28th after a 13-hour trip, my colleague Franck Pachot and I had just the time to register, go to the hotel, and go back to the "Welcome Reception" where we could eat something. After a night where I could feel the jet lag :-) we were ready to "participate" in this amazing event, the Oracle Open World 2014.

The first session I attended was the keynote, where new challenges were exposed: "moving" 20-year-old applications; building new infrastructures with less budget, as the money goes more into the business applications to fulfill user demands and expectations; Big Data, where the analysis but also the delivery of the results has to be fast. To summarize, we are in a period where the infrastructure is changing by using the cloud more and more, but the approach to dealing with the new challenges also has to change to integrate this new digital world.

Another interesting session was the one from Mr. William Lyons about the Oracle WebLogic Server strategy and roadmap. He talked about the Cloud Application Foundation: mobile development productivity, the foundation for Fusion Middleware and applications, high availability, performance, multi-tenancy, cloud management and operation, and so on. He first recapitulated the new features of WebLogic 12.1.2, like the management of Coherence, Oracle HTTP Server and the web tier using only one tool such as the WLS console, WLST or the OFMW console. He also talked about the database integration with GridLink, RAC, the multitenant database, Application Continuity and Database Resident Connection Pooling, which improves performance.

He then moved on to the new features of 12.1.3, which was released in June 2014. This new version improves functionality in the Fusion Middleware, Mobile and High Availability areas. Developers can now have a free development license, and they can install the product by using a zip distribution which also contains the patches. WebLogic 12.1.3 supports Java EE 7 APIs as well as Java SE 8.

The next release, planned for 2015, is WebLogic 12.2.1. With this version, the multitenancy concept is covered: domain partitions can be used to isolate resources for the different tenants. Regarding Java, it will be fully compliant with Java EE 7 and support Java SE 8.

On this first day, lots of information has been ingested, but it will have to be digested in the next weeks :-)

Let's see what will happen in the next days!

OOW14 Day 1 - Dbvisit #repattack

Mon, 2014-09-29 19:02

Oracle Open World is not only about conferences but also about practice and networking. Today at the OTN lounge I installed the following demos on my laptop:

  • a Dbvisit replicate #repattack environment
  • a Delphix cloning environment #cloneattack

I'll detail the former below and the latter tomorrow, but if you are in San Francisco and missed it, please come tomorrow to the same kind of session at the Oak Table World! You don't even need the OOW registration for that - it's independent but at the same place. Here are the details: http://www.oraclerealworld.com/oaktable-world/agenda/

 

Dbvisit replicate

This is the event:

Tweet:

Become a #RepAttack warrior at #OOW14 Get to the OTN lounge for briefing today from 3:00 pm http://t.co/fJRbOuMPqn

— Dbvisit Software (@dbvisit) September 29, 2014

 

Well, actually I did install everything a bit earlier, as I had the #repattack environment before, and I woke up very early because of the jet lag... The installation is straightforward and I've monitored it with another tool which I like (and we are partners as well): Orachrome Lighty.

Tweet:

I woke up because of #jetlag then I've installed @dbvisit #repattack on my laptop and monitor it with @orachrome pic.twitter.com/EVm1GZBo3l

— Franck Pachot (@FranckPachot) September 29, 2014

 

The idea is to quickly set up a source and a target VirtualBox VM, with Oracle XE and Swingbench on the source, and then set up the replication on it. It is really straightforward and shows that logical replication is not too complex to set up. So the OTN lounge was the occasion to meet the Dbvisit team.

 

Delphix

Here is the setup - I will continue tomorrow for cloning:

Tweet:

I have now @delphix on my laptop installed with @kylehhailey at #oow14. Cont. tomorrow at OTW http://t.co/QJLVhp93jg pic.twitter.com/QgoAgJPXyo

— Franck Pachot (@FranckPachot) September 30, 2014

Documentum upgrade project: D2-Client and missing icons

Sun, 2014-09-28 19:49

The new D2-Client does not correctly display the icon for some formats. This usually happens when icon.css is not up to date with the content formats in the repository. The solution is to find these formats and update icon.css.

Here is the solution in three simple steps:

 

1) Find all formats described in icon.css

 

grep -B1 formats icon.css | cut -f1 -d"{" | grep SPAN | cut -f2 -d. > format.txt

 

icon.css is located on the application server under "...\webapps\D2-Client\styles\defaults\css"

 

2) Find which formats in the repository are not defined in icon.css using a DQL query

Use the values in format.txt from the previous step to build the DQL query:

 

select distinct a_content_type from dm_document(all)
where a_content_type not in
('unknown',
'blank',
'crtext',
'text',
'mactext',
'pdftext',
'pdf',
.... .
....
'vsd1Large',
'vsd2Large',
'vsd3Large',
'vsd4Large')

 

3) Update icon.css with the missing formats

Let's take an example: For msw2007, I added:

 

SPAN.msw2007 {  BACKGROUND-IMAGE: url(/D2-Client/servlet/GetResource/images/icons/formats/doc-small.gif) }

 

High availability in the Oracle landscape

Wed, 2014-09-24 15:41

It's now several weeks since I attended some events about high availability (HA). But what actually is high availability? According to our friend Wikipedia, HA is based on 3 main principles:

  1. Elimination of Single Point of Failure (SPoF)
  2. Reliable crossover (with minimal or no downtime)
  3. Detection of failures as they occur

If those principles are met, end users may never notice a failure.
The aim of this blog is not to dive into too much detail, but just to give you an overview and provide entry points for further investigation.

HA can be applied to a broad range of elements of the Oracle Fusion Middleware stack like:

  • Application programming with ADF
  • Reporting
  • Application server
  • Identity management
  • Supporting Database

Let's see how those elements can take care of HA.

Application Programming with ADF for HA:

As easy as application development can be with ADF in JDeveloper, developers still have to consider particular settings to enable the application to run smoothly in a WebLogic cluster and take advantage of the HA features.
BEWARE: by default, when you start a WebLogic managed server with the node manager, it does not take into account any of these specific settings. So the start script should be enabled on the node manager.
- The persistence store type should be set to replicated_if_clustered in the weblogic.xml file
- The controller configuration should be set with "adf scope ha support" being true in the adf-config.xml of the Application Resources
- Application modules should be set for cluster failover (AM pooling jbo.dofailover = "true")
- Managed beans and task flow parameters should be serializable
- UI bindings should be kept in a small scope

So if you follow these principles, your ADF application will get the best out of HA on a WebLogic cluster.

Reporting:

There are 2 main tools provided by Oracle for reporting purpose:
- Forms/Reports
- BI Publisher
Both can be integrated in a HA environment even though the need may not be that big.
There are different challenges, specifically with Forms/Reports, as the same report with a given ID can be generated and delivered by different report servers whereas the client application is looking for a single reference. So it's not very straightforward to create an HA environment for Forms/Reports with replicated servers.

What developers should know about WebLogic Server and HA:

Not only ADF-based Java applications can benefit from a WebLogic cluster: any Java EE application can do so when deployed on a WebLogic cluster.
There are some settings and design principles to be taken into account so that the application can switch between the different nodes of the cluster.

There is a broad range of solutions that can apply to the WebLogic cluster:

• Auto Restart
• Session Replication
• Transaction Failover
• Clustered Messaging
• Server Migration
• Clusterware Integration
• Metadata split
• Online Deployment
• Rolling Redeployment
• Online Configuration Change
• Cluster Management
• Rolling Patching
• Shared Services Architecture

Oracle Database 12c associated with WebLogic 12c enables another level of HA with the Application Continuity feature, allowing automated transaction replay and making developers' lives easier.

You can find more information about WebLogic Cluster and HA in the following Oracle white paper:

http://www.oracle.com/technetwork/middleware/weblogic/learnmore/1534212

Identity and Access Management:

This is one very sensitive subject where HA is key. Without proper availability of your access management, HA in your applications would be almost useless, as users won't be able to use them.

But this is also one of the most complex environments to make highly available in the Oracle landscape, because of all the bricks put together:
- Oracle Internet Directory (OID)
- Oracle Access Management (OAM)
- Oracle Virtual Directory (OVD)
- Metadata Services (MDS)
- Oracle Platform Security Services (OPSS)
- Oracle Entitlement Server (OES)
... and more

You can find details about HA for Identity and Access management on the Oracle website:

http://docs.oracle.com/cd/E40329_01/doc.1112/e28391/toc.htm

Oracle DB - the basement:

Fusion Middleware applications refer to databases, which also need to be HA if the application layer is to provide HA.
The Oracle database provides several solutions for HA, mainly:
- Real Application Clusters (RAC)
- Data Guard
Both can be used separately, but also combined.
HA solutions can also be implemented at the storage and/or hardware level.
Another option is to use third-party solutions like Dbvisit (http://www.dbvisit.com/), which can provide HA on an Oracle Standard Edition and spare the additional cost of an Enterprise Edition.

 

As HA is mostly synonymous with complex environments, I hope you will enjoy setting them up and pleasing your end users by hiding failures from them.

How to measure Exadata SmartScan efficiency

Tue, 2014-09-23 03:09

A thread on the OTN Forum about Exadata came to the following question: "But how can I monitor if it is effectively used or not?" This is a common question. There are 3 exclusive features coming with Exadata, and instance statistics can show their usage. Even better: two of them can even be checked on your current (non-Exadata) system. And that is good to foresee how Exadata can improve your workload.

Let's find how to measure the following feature efficiency:

  • Have reads eligible for SmartScan
  • Avoid I/O with Storage Indexes
  • Avoid transfer with offloading
Have reads eligible for SmartScan

First of all, SmartScan occurs only on direct-path reads. If you don't see 'cell smart table scan' and 'cell smart index scans' in your top timed events, then SmartScan can do nothing for you. On a non-Exadata system, you see those as the 'direct path read' wait event.

If those direct-path reads are not a significant part of your DB Time, then you have something else to do before going to Exadata. You should leverage direct-path reads: full table scans, parallel query, etc.

Then when you are on Exadata and 'cell smart table scan' and 'cell smart index scans' are used, then you can check the proportion of reads that are actually using SmartScan.

The SmartScan input is 'cell physical IO bytes eligible for predicate offload'. This is the amount of reads (in bytes) that goes through the SmartScan code. You have the total amount of reads in 'physical read total bytes', so you can compare the two to know which part of your reads is subject to SmartScan.

If 'cell physical IO bytes eligible for predicate offload' / 'physical read total bytes' is small, then you have something to tune here. You want to do direct-path reads and you want to see 'TABLE ACCESS STORAGE' in the execution plan.

Not yet in Exadata? The Performance Analyzer can simulate it. The statistic is 'cell simulated physical IO bytes eligible for predicate offload.'

Avoid I/O with Storage Index

When you know that SmartScan is used, or can be used, on a significant part of your reads, the first thing you want to do is to avoid physical I/O. Among the 'cell physical IO bytes eligible for predicate offload', some reads will not require any disk I/O at all, thanks to Storage Indexes. You have that volume in 'cell physical IO bytes saved by storage index'. Just compare it with the eligible volume and you know the amount of disk reads that have been saved by Storage Indexes. That is the most efficient optimization of SmartScan: you don't have to read those blocks, you don't have to uncompress them, you don't have to filter them, you don't have to transfer them...

Avoid transfer with offloading

Then there is offloading proper. The previous feature (Storage Indexes) addressed I/O elimination, which is the key feature for performance. Offloading addresses the transfer from storage to database servers, which is the key feature for scalability.

In the last decade, we replaced a lot of direct-attached disks with SANs. That was not for performance reasons; it was for maintainability and scalability. Having a shared storage system helps to allocate disk space when needed, to get good performance by striping, and to get high availability by mirroring. The only drawback is the transfer time, which is higher than with direct-attached disks.

Exadata still has the scalable architecture of the SAN, but it releases the transfer bottleneck with offloading (in addition to the fast interconnect, which is very efficient). What can be filtered early on the storage cells does not have to be transferred: columns not in the select clause, rows outside of the where (or join) clause predicates.

And you can measure it as well. When you measure it on non-Exadata with the Performance Analyzer, you compare the SmartScan output, which is 'cell simulated physical IO bytes returned by predicate offload', to the SmartScan input, 'cell simulated physical IO bytes eligible for predicate offload'. And this is a good estimation of the efficiency you can expect when going to Exadata.

When you are on Exadata, that may be different. Compressed data has to be uncompressed in order to apply the predicates and projections at the storage cells. So the predicate/projection offloading input is 'cell IO uncompressed bytes', and you compare that to 'cell physical IO interconnect bytes returned by smart scan'.

Summary

If you want to see Exadata SmartScan efficiency, just check an AWR report and compare the following:

  • 'cell physical IO bytes eligible for predicate offload' / 'physical read total bytes' (goal: a high %)

  • 'cell physical IO bytes saved by storage index' / 'cell physical IO bytes eligible for predicate offload' (goal: a high %)

  • 'cell physical IO interconnect bytes returned by smart scan' / 'cell IO uncompressed bytes' (goal: a small %)

You probably wonder why I don't use the 'smart scan efficiency ratio' that we find in various places. Such ratios are often wrong for two reasons:

  • They compare 'cell physical IO interconnect bytes returned by smart scan' to 'cell physical IO interconnect bytes'. But the latter includes the writes as well, and because of ASM mirroring, writes are multiplied when measured at the interconnect level.

  • The 'cell physical IO interconnect bytes returned by smart scan' can't be compared with 'physical read total bytes' because the former has some data uncompressed. 

For that reason, we cannot use only a single ratio that covers all the SmartScan features.

This is why I always check the 3 pairs above in order to get a relevant picture. And two of them are available with the simulation mode (I'll blog about it soon).
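If you want to pull those figures directly from the instance statistics rather than from an AWR report, a query along the following lines can be used (a minimal sketch against V$SYSSTAT, assuming the usual select privilege on the dynamic performance views; the statistic names are the ones discussed above):

-- Cumulative values since instance startup; an AWR report gives per-interval values
select name, value
from   v$sysstat
where  name in (
         'physical read total bytes',
         'cell physical IO bytes eligible for predicate offload',
         'cell physical IO bytes saved by storage index',
         'cell physical IO interconnect bytes returned by smart scan',
         'cell IO uncompressed bytes'
       )
order by name;

The three ratios above can then be computed from these values, or from their deltas between two snapshots.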

 

 

SQL Server 2014: classic commit vs commit with delayed durability & I/Os

Mon, 2014-09-22 23:43

When you learn about SQL Server, you will often hear that a commit is a synchronous operation and that you can trust it. In this blog post, I will provide some details about what we mean by synchronous behavior, because sometimes, when I talk about the new delayed durability feature provided by SQL Server 2014, there is some confusion. If you want more details on this new feature, please read the blog post of my colleague Stéphane Haby here. A common shortcut is the following: writing to the transaction log is synchronous, while writing with the new delayed durability feature is asynchronous.

First of all, you probably know that the buffer manager guarantees that the transaction log is written before the changes to the database are written. This is the famous protocol called write-ahead logging (WAL). Log records are not written directly to disk but first into the log buffer and then flushed to disk in a purely asynchronous manner. However, at commit time, the related thread must wait for the writes to complete up to the point of the commit log record in the transaction log. This is the synchronous part of the commit operation, needed to meet the WAL protocol.

On the other hand, the new delayed durability feature makes the commit operation asynchronous (like the writes to the transaction log), with the big difference that the related thread doesn't have to wait until the commit log record is written to the transaction log. This new feature brings some performance improvement but, as a caveat, introduces a potential loss of data.
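As a quick reminder of how this behavior is controlled (a short sketch; the database option and the sys.databases column are standard SQL Server 2014 syntax, and AdventureWorks2012 is simply the demo database used below):

-- Check the current delayed durability setting of each database
SELECT name, delayed_durability_desc FROM sys.databases;

-- Allow a per-transaction choice (as in the demo script below), or force it for all transactions
ALTER DATABASE AdventureWorks2012 SET DELAYED_DURABILITY = ALLOWED;
-- ALTER DATABASE AdventureWorks2012 SET DELAYED_DURABILITY = FORCED;

-- With ALLOWED, an individual transaction opts in at commit time:
-- COMMIT TRANSACTION tran_1 WITH (DELAYED_DURABILITY = ON);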

We can prove that both commit operations write asynchronously by using either the process monitor tool or by using a debugger and trying to catch the part of the code responsible for writing into the transaction log file.

I will use the following T-SQL script for this demonstration:

--> Commit transaction (without delayed durability option)

USE AdventureWorks2012;
GO

-- Ensure DELAYED_DURABILITY is OFF for this test
ALTER DATABASE adventureworks2012 SET DELAYED_DURABILITY = DISABLED;
GO

-- Create table t_tran_delayed_durability
IF OBJECT_ID(N't_tran_delayed_durability', 'U') IS NOT NULL
       DROP TABLE t_tran_delayed_durability;
GO

create table t_tran_delayed_durability
(
       id int identity
);
GO

-- insert 1000 small transactions
declare @i int = 1000

while @i > 0
begin
       insert t_tran_delayed_durability default values

       set @i = @i - 1;
end;

 

--> Commit transaction (with delayed durability enabled)

-- Ensure DELAYED_DURABILITY is ON for this test
ALTER DATABASE adventureworks2012 SET DELAYED_DURABILITY = ALLOWED;
GO

-- Create table t_tran_delayed_durability
IF OBJECT_ID(N't_tran_delayed_durability', 'U') IS NOT NULL
       DROP TABLE t_tran_delayed_durability;
GO

create table t_tran_delayed_durability
(
       id int identity
);
GO

-- insert 1000 small transactions
declare @i int = 1000

while @i > 0
begin
       begin tran tran_1
       insert t_tran_delayed_durability default values
       commit tran tran_1 with (DELAYED_DURABILITY = on)

       set @i = @i - 1;
end;

 

Below, you will find an interesting picture of the Process Monitor trace output that shows the SQL Server file system activity when writing to the transaction log file in both cases.

 

--> Commit transaction (without delayed durability option)

 

blog_17_1_procmon_normal_transaction

 

You will notice that SQL Server uses the WriteFile() function to write to the transaction log for each commit operation (4096 bytes each). I will only show you a sample of the output, but you can imagine the final number of records you can get here. If we take a look at the Process Monitor stack, you will notice that SQL Server uses the WriteFile() Windows function located in the Kernel32.lib library to write to the transaction log with an overlapped structure (in other words, asynchronous I/O).

 

blog_17_3_procmon_stack

 

This test confirms what Bob Dorr explains in the Microsoft article about SQL Server I/Os and transaction log I/O.

 

--> Commit transaction (with delayed durability enabled)

 

blog_17_1_procmon_delayed_transaction

 

In this case, the same function is used by SQL Server, with a big difference here: SQL Server groups some I/Os into chunks (in my case 16K, 48K, and 60K) before writing to disk. Clearly, there is less activity here (in my case 18 lines against approximately 1000 lines for the first test).

We can also attach a debugger (for instance WinDbg) to the SQL Server process and set a breakpoint on the kernel32!WriteFile() function for the calling thread in order to get more details about the execution stack. Note that the Process Monitor stack showed the module KERNELBASE.dll for the WriteFile() function, but as mentioned in this Microsoft article, kernelbase.dll gets its functionality from kernel32.dll and advapi32.dll.

 

blog_17_1_windbg_stack_writefile

 

Both commit operations show the same stack except of course the number of executions.

To summarize, I wanted to show you that both commit operations (with and without delayed durability) use asynchronous I/O to write to the transaction log file. The big difference is that with the delayed durability option, SQL Server improves log write performance by deferring and grouping the I/Os into larger chunks (up to 60K in my test) before writing them to disk. I hope this will help you understand more about SQL Server commit operations.

Oracle OEM Cloud Control 12.1.0.4 - agent upgrade & patch

Mon, 2014-09-22 20:09

With the new Oracle OEM Cloud Control 12.1.0.4 release, the DBA has to migrate the old agent versions to 12.1.0.4. If your infrastructure has a huge number of agents and if you want to apply the agent patches to the upgraded agents, this can be a very time-consuming job. However, there is a way to perform the whole operation in just one shot.

In my example, we have an agent in version 12.1.0.3:

 

oracle@vmtestfusion01:/u00/app/oracle/agent12c/core/12.1.0.3.0/OPatch/ [agent12c] ./opatch lsinventory

Oracle Interim Patch Installer version 11.1.0.10.0

Copyright (c) 2013, Oracle Corporation.

All rights reserved.

Oracle Home       : /u00/app/oracle/agent12c/core/12.1.0.3.0

Central Inventory : /u00/app/oraInventory  

from           : /u00/app/oracle/agent12c/core/12.1.0.3.0/oraInst.loc

OPatch version   : 11.1.0.10.0

OUI version       : 11.1.0.11.0

Log file location : /u00/app/oracle/agent12c/core/12.1.0.3.0/cfgtoollogs/opatch/opatch2014-09-02_08-00-36AM_1.log

OPatch detects the Middleware Home as "/u00/app/oracle/Middleware/11g"

Lsinventory Output file location : /u00/app/oracle/agent12c/core/12.1.0.3.0/cfgtoollogs/opatch/lsinv/lsinventory2014-09-02_08-00-36AM.txt

Installed Top-level Products (1):

EM Platform (Agent)                                                 12.1.0.3.0

There are 1 products installed in this Oracle Home.

Interim patches (2) :

Patch 10203435     : applied on Sat Jun 22 08:51:24 CEST 2013

Unique Patch ID: 15915936.1

Created on 7 Feb 2013, 18:06:13 hrs PST8PDT

Bugs fixed:     10203435

Patch 16087066     : applied on Sat Jun 22 08:51:22 CEST 2013

Unique Patch ID: 15928288  

Created on 4 Feb 2013, 04:52:18 hrs PST8PDT  

Bugs fixed:     13583799, 6895422

OPatch succeeded.

 

In the OMS environment, we have to download and copy the agent-side patches to $OMS_HOME/install/oneoffs/12.1.0.4.0/Generic.

In my example, I downloaded the 19002534 EM DB plugin bundle patch 12.1.0.6.1 (agent side):

 

oracle@vmtestoraem12c:/u01/app/oracle/MiddleWare_12cR4/oms/install/oneoffs/12.1.0.4.0/Generic/ [oms12c] ls

p19002534_121060_Generic.zip

 

The agent upgrade procedure will use this directory to apply the patch.

Let's upgrade the agent from 12.1.0.3 to 12.1.0.4 by using the Cloud Control console:

 

ag1

 

Select the agent to be upgraded:

 

ag2_copy

 

The new job screen lists the different steps:

 

ag3_copy

 

In the log file, we can see the patch being applied:

 

Tue Sep 2 08:07:26 2014 -

Found following valid patch files from the patch location which will be considered in this patching session :

Tue Sep 2 08:07:26 2014 - p19002534_121060_Generic.zip

Tue Sep 2 08:07:26 2014 - /u00/app/oracle/agent12c/core/12.1.0.4.0/bin/unzip -o p19002534_121060_Generic.zip -d /u00/app/oracle/agent12c/oneoffs >> /u00/app/oracle/agent12c/core/12.1.0.4.0/cfgtoollogs/agentDeploy/applypatchesonapplicablehome2014-09-02_08-07-26.log 2>&1

Archive: p19002534_121060_Generic.zip  

creating: /u00/app/oracle/agent12c/oneoffs/19002534/  

creating: /u00/app/oracle/agent12c/oneoffs/19002534/etc/  

creating: /u00/app/oracle/agent12c/oneoffs/19002534/etc/config/

inflating: /u00/app/oracle/agent12c/oneoffs/19002534/etc/config/actions.xml

…………

 

By checking the agent inventory, we verify that the newly upgraded agent has received the EM DB PLUGIN BUNDLE PATCH 12.1.0.6.1:

 

[agent12c] opatch lsinventory -oh /u00/app/oracle/agent12c/plugins/oracle.sysman.db.agent.plugin_12.1.0.6.0/

Oracle Interim Patch Installer version 11.1.0.10.4

Copyright (c) 2014, Oracle Corporation. All rights reserved.

Oracle Home       : /u00/app/oracle/agent12c/plugins/oracle.sysman.db.agent.plugin_12.1.0.6.0

Central Inventory : /u00/app/oraInventory

   from          : /u00/app/oracle/agent12c/plugins/oracle.sysman.db.agent.plugin_12.1.0.6.0//oraInst.loc

OPatch version   : 11.1.0.10.4

OUI version       : 11.1.0.12.0

Log file location : /u00/app/oracle/agent12c/plugins/oracle.sysman.db.agent.plugin_12.1.0.6.0/cfgtoollogs/opatch/opatch2014-09-02_10-09-32AM_1.log

OPatch detects the Middleware Home as "/u00/app/oracle/Middleware/11g"

Lsinventory Output file location : /u00/app/oracle/agent12c/plugins/oracle.sysman.db.agent.plugin_12.1.0.6.0/cfgtoollogs/opatch/lsinv/lsinventory2014-09-02_10-09-32AM.txt

Installed Top-level Products (1):

Enterprise Manager plug-in for Oracle Database                       12.1.0.6.0

There are 1 products installed in this Oracle Home.

Interim patches (1) :

Patch 19002534     : applied on Tue Sep 02 10:05:37 CEST 2014

Unique Patch ID: 17759438

Patch description: "EM DB PLUGIN BUNDLE PATCH 12.1.0.6.1 (AGENT SIDE)"

   Created on 17 Jun 2014, 09:10:22 hrs PST8PDT

   Bugs fixed:

     19002534, 18308719

 

This feature is very useful for massive agent upgrades, because the agent is upgraded and the bundle patch is applied in the same operation. You are also able to use a patch plan to apply bundle patches to multiple agents in one operation.

The SQL Server DBA's essential toolkit list

Mon, 2014-09-22 02:01

This week, I attended SQLSaturday 2014 in Paris. During the pre-conference on Thursday, I followed Isabelle Van Campenhoudt for her SQL Server Performance Audit session. This conference took the form of experience sharing between attendees. Indeed, we tried to list together the most important software, tools, features or scripts which help an SQL Server DBA in his daily work. In this blog, I want to share our final list with you.

 

Windows Server Level: Hardware & Applications


CrystalDiskMark

CrystalDiskMark is a free disk benchmark software. It can be downloaded here.

 

SQLIO

SQLIO is another free disk benchmark software. It can be downloaded here.

 

Windows Performance Monitor (PerfMon)

PerfMon is a Windows native tool which collects log data in real time in order to examine how programs running on the computer affect the performance.

PerfMon provides a lot of counters which measure the system state or the activity.

You can learn more on TechNet.

You can find the most important counters for SQL Server here.

 

Performance Analysis of Logs (PAL)

PAL is an open source tool built on top of PerfMon. It reads and analyzes the main counters, looking for known thresholds.

PAL generates an HTML report which alerts when thresholds are reached.

PAL tool can be downloaded on CodePlex.

 

Microsoft Assessment and Planning (MAP)

MAP is a Microsoft toolkit which provides hardware and software information and recommendations for deployment or migration process for several Microsoft technologies (such as SQL Server or Windows Server).

MAP toolkit can be downloaded on TechNet.

 

SQL Server Level: Configuration & Tuning

 

Dynamic Management Views and Functions (DMV)

DMV are native views and functions of SQL Server which returns server state information of a SQL Server instance.

You can learn more on TechNet.
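As a small illustration of what a DMV gives you (a sketch only; sys.dm_os_wait_stats is just one of the many available DMVs), here is a typical first look at the cumulative wait statistics of an instance:

-- Top waits since the instance started (cumulative values)
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;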

 

sp_Blitz (from Brent Ozar)

It is a free script which checks SQL Server configuration and highlights common issues.

sp_Blitz can be found on Brent Ozar website.

 

Glenn Berry's SQL Server Performance

It provides scripts to diagnose your SQL Server, from SQL Server 2005 onwards.

These scripts can be downloaded here.

 

Enterprise Policy Management (EPM) Framework

EPM Framework is based on Policy-Based Management. It is a reporting solution which tracks SQL Server states which do not meet the specified requirements. It works on all instances of SQL Server since SQL Server 2000.

You can learn more on CodePlex.

 

SQL Server Level: Monitoring & Troubleshooting

 

SQL Profiler

SQL Profiler is a rich interface integrated into SQL Server which allows you to create and manage traces to monitor and troubleshoot a SQL Server instance.

You can learn more on TechNet.

 

Data Collector

Data Collector is a SQL Server feature introduced in SQL Server 2008, and available in all versions.

It gathers performance information from multiple instances for performance monitoring and tuning.

You can learn more on TechNet.

 

Extended Events

Extended Events is a monitoring system integrated into SQL Server. It helps with troubleshooting or identifying a performance problem.

You can learn more on TechNet.
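As a hedged example of what a minimal event session can look like (the event sqlserver.sql_statement_completed and the target package0.event_file are standard; the session name and the 1-second duration filter are arbitrary choices for this sketch):

-- Capture statements running longer than 1 second (duration is in microseconds)
CREATE EVENT SESSION [long_running_statements] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
    (ACTION (sqlserver.sql_text, sqlserver.database_id)
     WHERE duration > 1000000)
ADD TARGET package0.event_file (SET filename = N'long_running_statements');
GO
ALTER EVENT SESSION [long_running_statements] ON SERVER STATE = START;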

 

SQL Nexus

SQL Nexus is an open source tool that helps you identify the root cause of SQL Server performance issues.

It can be downloaded on CodePlex.

 

SQL Server Level: Maintenance

 

SQL Server Maintenance Solution

It is a set of scripts for running backups, integrity checks, and index and statistics maintenance on all editions of Microsoft SQL Server since SQL Server 2005.

This solution can be downloaded on Ola Hallengren's website.
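For instance, a full backup of all user databases with this solution looks roughly like the following (a sketch; the backup directory is a placeholder and the parameters shown are the basic ones documented on the site):

-- Full backup of all user databases (adjust @Directory to your environment)
EXECUTE dbo.DatabaseBackup
    @Databases   = 'USER_DATABASES',
    @Directory   = N'X:\Backup',
    @BackupType  = 'FULL',
    @Verify      = 'Y',
    @CleanupTime = 24;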

 

 

Conclusion

This blog does not pretend to be a complete list of DBA needs, but it tries to cover most of them. You will notice that all these tools are free and recognized by the DBA community as reliable and powerful.

I hope this will help you.

For information, you can learn how to use these tools in our SQL Server DBA Essentials workshop.

Documentum upgrade project: D2-Client, facets and xPlore

Sun, 2014-09-21 19:57

To enhance the search capability we had to configure xPlore to use the new customer attributes as facets and configure D2 to use the default and new facets.

Configuring xPlore to use facets with the customer attributes

  • Stop the Index Agent and Server
  • Update indexserverconfig.xml by adding the following line (e.g.):

 

 xml-code

 

  • Keep only the indexserverconfig.xml file in $DSSEARCH_HOME/config
  • Remove $DSSEARCH_HOME/data/*
  • Start index and agent server
  • Start a full reindexing
  • Once all is indexed, set index to normal mode

 

Necessary tests

You should do two tests before configuring the D2-Client.

 

1. On the content server:

 

java com.emc.d2.api.config.search.D2FacetTest -docbase_name test67 -user_name admin -password xxxx -full_text -facet_names dbi_events

 

2. On the xPlore server:

  • Check if the new lines have been validated by executing $DSEARCH_HOME/dsearch/xhive/admin/XHAdmin
  • Navigate to xhivedb/root-library/dsearch/data/default
  • Under the Indexes Tab, click the "Add Subpaths" button to open the "Add sub-paths to index" window where you can see in the Path column the added customer attributes

 

Configure the search in D2-Config
  • Launch D2-Config
  • Select Interface and then the Search sub-menu
  • Tick "Enable Facets" and enter a value for "Maximum of result by Facet"

 

D2-Config

 

Once this is done, you are able to use the facets with the D2-Client.

Improving your SharePoint performance using SQL Server settings (part 2)

Sun, 2014-09-21 17:36

Last week, I attended SQLSaturday 2014 in Paris and participated in a session on SQL Server optimization for SharePoint by Serge Luca. This session tried to list the best practices and recommendations for database administrators in order to increase SharePoint performance. This blog post is based on this session and is meant as a sequel to my previous post on Improving your SharePoint performance using SQL Server settings (part 1).

 

SQL Server instance

It is highly recommended to use a dedicated SQL Server instance for a SharePoint farm and to set LATIN1_GENERAL_CI_AS_KS_WS as the instance collation.

 

Setup Account permissions

You should give the Setup Account the following permissions in your SQL Server instance:

  • securityadmin server role

  • dbcreator server role

  • db_owner for databases used by the Setup Account

 

Alias DNS

It is recommended to use a DNS alias to connect your SharePoint server to the SQL Server instance. It simplifies maintenance and makes it easier to move SharePoint databases to another server.

 

Disk Priority

When you plan to allocate your SharePoint databases across different disks, you might wonder how to maximize the performance of your system.

This is a possible disk organization (from fastest to slowest):

  • Tempdb data and transaction log files

  • Content database transaction log files

  • Search database data files (except Admin database)

  • Content database data files

 

Datafiles policy

You should use several datafiles for Content and Search databases, as follows:

  • distribute equally-sized data files across separate disks

  • the number of data files should be lower than the number of processors

Multiple data files are not supported for other SharePoint databases.
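A minimal T-SQL sketch of what spreading a content database over several equally-sized data files can look like (the database name, file names, paths and sizes below are placeholders, not SharePoint recommendations):

-- Add equally-sized data files on separate disks to a content database
ALTER DATABASE [WSS_Content_Demo]
ADD FILE (NAME = N'WSS_Content_Demo_2',
          FILENAME = N'E:\Data\WSS_Content_Demo_2.ndf',
          SIZE = 10GB, FILEGROWTH = 1GB);

ALTER DATABASE [WSS_Content_Demo]
ADD FILE (NAME = N'WSS_Content_Demo_3',
          FILENAME = N'F:\Data\WSS_Content_Demo_3.ndf',
          SIZE = 10GB, FILEGROWTH = 1GB);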

 

Content databases size

You should avoid databases bigger than 200 GB. Databases bigger than 4 TB are not supported by Microsoft.

 

Conclusion

SharePoint is quite abstract for SQL Server DBAs because it requires specific configurations.

As a result, you cannot guess the answer: you have to learn about the subject.

Oracle OEM Cloud Control 12.1.0.4 - the new features

Thu, 2014-09-18 17:01

This post describes the main new features of Cloud Control 12.1.0.4. A lot of new features come with the 12.1.0.4 version; I will describe the most important ones, but you can also refer to this Oracle document: http://docs.oracle.com/cd/E24628_01/doc.121/e25353/whats_new.htm#CEGFFGBI

 

New management services repository page

There is a new management services repository page providing details about the management repository:

 

cc1

 

In the Setup menu -> Manage Cloud Control, select Health Overview:

 

cc2

 

You have access to a new performance page:

 

cc3

 

This new Enterprise Manager performance page provides valuable performance information to help administrators check the overall performance of their Enterprise Manager infrastructure.

 

cc4

 

Oracle BI Publisher

Oracle BI Publisher 11g is now installed by default with Enterprise Manager Cloud Control 12.1.0.4, but it is not configured by default. A post installation configuration step has to be done in order to configure the BI Publisher server.

 

New Security Console

A new Security Console allows the administrators to have a single entry point where they can view, analyze, or optimize the security for their environment.

In the Setup menu, select Security, then Security Console:

 

cc5

cc6


This new security console displays your Enterprise Manager security configuration and allows you to view, analyze and optimize the security for your environment.

The categories are:

  • Pluggable authentication (LDAP authentication, Oracle access manager, Oracle SSO based authentication...)
  • Fine-grained Access Control (target and resource privilege, list of super administrator...)
  • Secure Communication (Https and public key infrastructure, Oms secure configuration, Database Encryption configuration)
  • Credentials Management
  • Comprehensive Auditing (Current auditing configuration, Specific Audit operations, Audit Usage Statistics)
  • Active User Session Count (Session settings, active sessions)
  • Best Practices Analysis (quick view of Enterprise Manager Security configuration)

 

cc7

 

Apply privilege delegation template

Enterprise Manager 12.1.0.4 allows you to apply a default privilege delegation template setting to one newly discovered host, or to many already discovered hosts.

This new feature is very interesting for administrators when a lot of new host targets have been added to Enterprise Manager Cloud Control. We can also use emcli with the set_default_privilege_delegation verb to apply those privileges to hosts.

 

cc8

 

In the Setup menu -> Security, we select Privilege Delegation:

 

cc9

 

We can display the templates:

 

cc10

 

We apply the template to multiple hosts:

 

cc11

 

The SUDO_TEMPLATE has been successfully applied to multiple hosts:

 

Emcli

New emcli verbs are available in the 12.1.0.4 version. The command emcli help will show you the new features.

The following ones are especially interesting:

  • get_not_updatable_agents: displays agents not updatable
  • get_updatable_agents: displays updatable Agents
  • update_agents:  performs Agent Update Prereqs and submits Agent Update Job
  • delete_incident_record:   deletes incidents based on the provided IDs, up to a maximum of 20 incidents.
  • resume_job: resumes a job
  • suspend_job:  suspends a job
  • clear_default_privilege_delegation_setting: clears the default privilege delegation settings for a platform.
  • set_default_privilege_delegation_setting: sets the default privilege delegation setting for one or more platforms
  • test_privilege_delegation_setting: tests Privilege Delegation Setting on a host

 

Plugin management

We can deploy multiple plugins from the Cloud Control console in one operation. This new feature helps administrators reduce the number of OMS restarts during the tedious plugin deployment process:

 

cc12

 

Metric alert message customization

Metric alert messages can be customized in order to be more understandable or to comply with the data center's naming conventions. As you can see in the following screenshot, checking Edit Alert Message allows the Enterprise Manager administrator to modify the message:

 

cc13

 

Metric collection schedule enhancement

We now have the possibility to enter a starting time for a metric collection if the schedule frequency is defined by days, by weeks, weekly, or monthly. This new feature can be very useful for administrators when the metric is time-sensitive.

 

cc14

 

Advanced Threshold Management

With Enterprise Manager 12.1.0.4, the new Advanced Threshold Management feature allows us to compute adaptive (self-adjusting) thresholds or time-based static thresholds.

Now, in the Metric and Collection Settings, you can display different kinds of metrics:

 

cc15

 

Adaptive thresholds

Enterprise Manager 12.1.0.4 has improved the alerting mechanism with adaptive thresholds. The expected values of a metric generally depend on the workload of the target, so a fixed threshold value often ends up being too low or too high. Adaptive thresholds are computed from a target's baseline values.

For example (see below), you can set a warning or critical threshold to a high (95%), very high (99%), severe (99.9%), or extreme (99.99%) value.

Select Adaptive Threshold:

 

cc16

 

cc17

 

Select Quick Configuration:

 

cc18

 

Choose Primary OLTP, for example:

 

cc19

 

Select Finish:

 

cc20

 

Then you can edit and modify the thresholds values:

 

cc21

 

Time-based static thresholds

As database activity is quite different during the day, when many users are connected, and during the night, when batch jobs are the main activity, this new feature allows administrators to define higher threshold values for a metric during the night.

For example in the Metric and Collection Settings:

 

cc22

 

We select Advanced Threshold Management in the Related Links:

 

cc23

 

By selecting the Threshold Change Frequency, you can adapt the warning and critical values depending on the time of day (or day of the week):

cc24

 

Day is 7 AM to 7 PM in target timezone, night means 7 PM to 7 AM in target timezone, weekdays means Monday to Friday, and weekend means Sunday and Saturday.

 

Incident rule set simulator

The new Rule Set Simulator in the Incident Rules screen shows you the rules that an event will match. This way, administrators can test their rules without executing the actions specified in the rules, such as sending e-mails or opening tickets.

 

cc25

cc26

 

Incident manager

There are some new features in the Incident Manager screen.

When looking at an incident, the Related Events tab displays recent configuration changes, helping administrators to solve the problem:


cc27

 

The Notifications tab now displays all the notifications sent for the event or the incident (e-mail, SNMP traps, etc.).

 

SNMP V3

The new SNMP version 3 support offers more security when sending information from Enterprise Manager 12.1.0.4 to third-party management systems. SNMPv3 includes three important services: authentication, privacy, and access control.

 

cc28

 

Faster target Down notifications

Target down detection (for hosts, database instances, WebLogic Servers, and Management Agents) has been made much faster. The Oracle documentation says that a target going down is detected within seconds, and my tests confirmed it: the target down incident was generated within a few seconds.

 

Enhanced agent and host down detection

Every Enterprise Manager administrator has encountered problems with agents going down or no longer communicating with the OMS. Enterprise Manager 12.1.0.4 adds a sub-status icon that allows the administrator to discover why the agent is in an unreachable state.

 

cc29

 

When an agent goes down unexpectedly, you can select the symptom analysis icon in the Manage Cloud Control agent page, which may help you determine the root cause of the problem:

 

cc30

 

 

Role and target property when adding a target

When adding a target database to a host managed by a 12.1.0.4 agent, we can specify the global target properties:

 

cc31

 

We can also explicitly specify a group for the targets:

 

cc32

 

New Job progress screen

 

The new job progress screen is enabled by setting the oracle.sysman.core.jobs.ui.useAdfExecutionUi OMS property to true:

oracle@vmtestoraem12c:/home/oracle/ [oms12c] emctl set property -name oracle.sysman.core.jobs.ui.useAdfExecutionUi -value true

Oracle Enterprise Manager Cloud Control 12c Release 4

Copyright (c) 1996, 2014 Oracle Corporation.

All rights reserved.

SYSMAN password:

Property oracle.sysman.core.jobs.ui.useAdfExecutionUi has been set to value true for all Management Servers

OMS restart is not required to reflect the new property value

 

Before setting this property to true, when selecting a job, we could only view the following screen:

 

cc33

 

After setting the property to true:

 

cc34

 

Conclusion

A lot of interesting new features are present in the 12.1.0.4 version. I would particularly mention Advanced Threshold Management and the new Security Console, which will help administrators be more proactive in their job.

Thinking about downgrading from Oracle Enterprise to Standard Edition?

Tue, 2014-09-16 01:01

You are using Oracle Enterprise Edition and thinking about downgrading to Standard Edition? In that case, you must be sure that your applications are compatible, and that is not easy to check. Here are a few ideas.

 

Why?

Why do you want to downgrade to Standard Edition? For licensing costs, of course. Today, it is difficult to find a server with only a few cores, and Oracle Enterprise Edition is licensed per core physically present in the machine. When you change your hardware, you will find that you cannot get a machine with the same low number of cores. Even if the performance is fine, you will need to buy more software licenses because of those new multicore processors.

Another reason is virtualization. You want to consolidate your servers, but you don't want to pay database software licenses for all your datacenter capacity.

So Standard Edition is a good alternative: besides the fact that the licenses are cheaper, they are counted per socket and not per core.

Oracle Standard Edition doesn't have all the features, but you may be able to accept that. The reduction in license costs can pay for several days of development, tuning, or administration, as well as the acquisition of third-party tools to compensate for what is missing in SE (for example, Dbvisit standby for high availability).

But you need to identify the features that you are using and that come with Enterprise Edition only.

 

1. Read feature availability

The documentation lists each feature and shows whether it is available in Standard Edition or only in Enterprise Edition.

So the first thing to be done is to read that list and mark what you know you are using.

But there are two problems:

  • It's sometimes difficult to interpret. For example, is it clear to you that you can't send e-mails for Enterprise Manager notifications when you don't have the Diagnostics Pack?
  • You probably don't know everything that you (or your developers, or your application) actually use.

 

2. Query feature usage

Oracle comes with a nice view about feature usage: DBA_FEATURE_USAGE_STATISTICS. It's nice because you get information about what you have used, with comments, dates, etc. It is also exposed in Cloud Control.
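
A quick way to look at it outside of Cloud Control is a simple query on that view; a minimal sketch, run as a DBA user:

sqlplus -s / as sysdba <<'EOF'
set pagesize 200 linesize 200
column name format a55
-- list only the features that have actually been detected as used
select name, detected_usages, currently_used, last_usage_date
  from dba_feature_usage_statistics
 where detected_usages > 0
 order by name;
EOF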

But did you ever try to match that with the documentation from the link above? That's difficult:

  • some Enterprise Edition features are not tracked. For example, the usage of materialized views is shown, but without distinguishing those using query rewrite (which is an EE feature)
  • some subsets of features trigger usage even when they should not (for example, the Locator part of Spatial does not need the Spatial option)

 

3. Import to standard

One important thing to do is to import your schemas into a Standard Edition database and check what fails with an 'ORA-00439: feature not enabled' error. What is nice is that when you install Standard Edition, the features that are not available are disabled at link time.
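
On the Standard Edition side, you can also check which options were actually linked into the binaries; a minimal sketch, run as a DBA user on the SE test database:

sqlplus -s / as sysdba <<'EOF'
set pagesize 100
column parameter format a45
-- TRUE means the option is compiled in, FALSE means it is not available
select parameter, value from v$option order by parameter;
EOF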

One tip: you probably want to import metadata only, so you will import into a small database. But when you do that, you will see that your datafiles grow because of the initial segment sizes: 'deferred segment creation' is an Enterprise Edition feature. So the tip is:

 

impdp ... content=metadata_only transform="storage:n"
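
After the import, a quick way to spot what was rejected is to scan the import log for the ORA-00439 error; the log file name below is just an example and depends on the logfile parameter you passed to impdp:

# list the 'feature not enabled' errors raised during the metadata import
grep "ORA-00439" impdp_metadata_only.log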

 

The big advantage when testing the import is that you are already testing the migration procedure, because it's the only way to migrate from Enterprise Edition to Standard Edition.

The problem is that it warns you only about static features, those in your data model, not about runtime usage. For example, you will know that you can't create bitmap indexes, but you will not know that you can no longer do bitmap plan conversions from regular indexes.

Testing the import guarantees that the migration can be done, but you should test the application on a SE database with data in order to validate usage and performance.

 

4. Try and test

After having checked everything, from the obvious things that are documented to the little things we know by experience, I usually advise the customer to test. Install a test database in Standard Edition. Test the application, the monitoring, and the administration procedures (no online operations, no flashback database, ...). If you plan to migrate with minimum downtime using a replication tool (such as Dbvisit replicate), you can start to replicate to a Standard Edition database. Then you will be able to test the read-only use cases, such as reporting, which may suffer from the lack of some optimizer features (adaptive plans, result cache, ...).

 

5. Decide

Then you will decide whether you are ready to downgrade to Oracle Standard Edition. Of course, it will not be transparent. You will have to find some workarounds. The decision is just a balance between the cost reduction and the time you can spend doing manually what was automatic in EE.

SQL Saturday 323: SQL Server AlwaysOn and availability groups session slides

Sun, 2014-09-14 23:50

This SQL Saturday’s edition in Paris is now over. It was a great event with a lot of French and international speakers. There were also many attendees, indicating that this event is a great place to share about SQL Server technologies. Maybe the Montparnasse tower in Paris played a role here with its panoramic view over Paris from the 40th floor!


blog_16_landscape_from_spuinfo

blog_16_badge_sqlsaturdays


For those who didn’t attend on Saturday, you will find our SQL Server AlwaysOn and availability groups session slides here: SQLSaturday-323-Paris-2014---AlwaysOn-session.pptx

Don’t forget the next big event of the SQL Server community in Paris (1-2 December): Journées SQL Server

We will probably be there and of course we will be delighted to meet you!

Documentum upgrade project - ActiveX and D2-Client 3.1Sp1

Sun, 2014-09-14 19:31

This is another blog post on our Documentum upgrade project. This time, the following issue occurred: the ActiveX could not be installed using the D2-Client; we had to access the D2-Config URL to have it installed. For a normal user, this workaround could not be used.

Analysis

The workstation had the ActiveX for D2 3.0 installed, the version before the upgrade. Under C:\Windows\Downloaded Program Files, we had:

ctx
ctx.ocx  
D2UIHelper.dll

On my workstation where I could install (using D2-Config) the D2 3.1.1 ActiveX, I also had C:\Windows\Downloaded Program Files\CONFLICT.* folders containing D2UIHelper.dll and ctx.inf

By checking the content of the ctx.inf in this new cab, I saw that we had the wrong version (see FileVersion) of ctx.ocx:

 [ctx.ocx]  
file-win32-x86=thiscab  
RegisterServer=yes  
clsid={8C55AA13-D7D9-4539-8B20-78BC4A795681}  
DestDir=  
FileVersion=3,0,0,2

By checking the "ctx.cab" file in "D2-Client/install" and "D2-Config/install" on the application server, I found that we did not have the same version: both ctx.cab files had the same date and size, but the digital signature was different:

D2-Config ctx.cab: 17 September 2013 10:56:11
D2-Client ctx.cab: 19 April 2013 17:03:08

 

Solution

To solve the issue, I copied the "ctx.cab" from the "D2-Config/install" path to "D2-Client/install/". Once this was done, the ActiveX could be installed using the D2-Client URL.

It was confirmed by the vendor that this is a bug in the delivered package.

Kerberos SSO with Liferay 6.1

Sun, 2014-09-14 02:22

In my previous blog, I described the process to install a Kerberos client and how to kerberize Alfresco. In this blog, I will continue in the same way and present another application that can be configured to use the Kerberos MIT KDC: Liferay. Liferay is a very popular, leading Open Source solution for enterprise web platforms (intranet/extranet/internet web sites). Liferay can be bundled with several application servers like Tomcat, JBoss, or GlassFish, but it can also be installed from scratch (deployment of a war file) on a lot of existing application servers.

 

For this blog, I will need the following properties/variables:

  • example.com = the DNS Domain
  • EXAMPLE.COM = the KDC REALM
  • kdc01oel.example.com = the FQDN of the KDC
  • mpatou@EXAMPLE.COM = the principal of a test user
  • lif01.example.com = the FQDN of the Liferay host server
  • otrs01.example.com = the FQDN of the OTRS host server

 

Please be aware that some configurations below may not be appropriate for a production environment. For example, I don't configure Apache to run as a different user like "www" or "apache", I don't specify the installation directory for Apache or Kerberos, and so on...

Actual test configuration:

  • OS: Oracle Enterprise Linux 6
  • Liferay: Liferay Community Edition 6.1.1 GA2 - installed on /opt/liferay-6.1.1
  • Application Server: Tomcat 7.0.27 - listening on port 8080

 

This version of Liferay doesn't have a default connection to a Linux KDC so everything should be done from scratch. The first thing to do is to add an Apache httpd in front of Liferay, if there is not already one, to process Kerberos requests. This part is described very quickly without extensive explanations because we don't need all the functionalities of Apache. Of course you can, if you want, add some other configurations to the Apache httpd to manage for example an SSL certificate, the security of your application or other very important features of Apache... So first let's check that the Tomcat used by Liferay is well configured for Kerberos with an Apache front-end:

  • The HTTP port should be 8080 for this configuration
  • The maxHttpHeaderSize must be increased to avoid authentication errors, because an HTTP header containing a Kerberos ticket is much bigger than a standard HTTP header
  • The AJP port should be 8009 for this configuration
  • The tomcatAuthentication must be disabled to delegate the authentication to Apache

 

To verify that, just take a look at the file server.xml:

[root ~]# vi /opt/liferay-6.1.1/tomcat-7.0.27/conf/server.xml
1.png
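
For reference, the relevant Connector entries in server.xml look roughly like this (a sketch reflecting the four settings listed above; the maxHttpHeaderSize value is an assumption, any sufficiently large value will do):

    <!-- HTTP connector with a larger header size to accommodate Kerberos tickets -->
    <Connector port="8080" protocol="HTTP/1.1" maxHttpHeaderSize="65536"
               connectionTimeout="20000" redirectPort="8443" URIEncoding="UTF-8" />
    <!-- AJP connector used by mod_jk; authentication is delegated to Apache -->
    <Connector port="8009" protocol="AJP/1.3" redirectPort="8443"
               tomcatAuthentication="false" />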

 

Then download Apache httpd from the Apache web site (or use yum/apt-get), extract the downloaded file and go inside of the extracted folder to install this Apache httpd with some default parameters:

[root ~]# cd /opt
[root opt]# wget http://mirror.switch.ch/mirror/apache/dist//httpd/httpd-2.4.10.tar.gz
[root opt]# tar -xvf httpd-2.4.10.tar.gz
[root opt]# cd httpd-2.4.10
[root httpd-2.4.10]# ./configure
[root httpd-2.4.10]# make
[root httpd-2.4.10]# make install

 

This will install Apache httpd 2.4.10 under /usr/local/apache2. There could be some errors during the execution of "./configure", "make" or "make install", but these kinds of issues are generally well known and the solutions can be found everywhere on the Internet. An installation with apt-get will put the configuration file (named apache2.conf, not httpd.conf) under /etc/apache2/, so please adapt the description below to your environment.

 

Once Apache httpd is installed, it must be configured to understand and use Kerberos for all incoming requests:

[root httpd-2.4.10]# vi /usr/local/apache2/conf/httpd.conf
# Add at the end of the file
Include /opt/liferay-6.1.1/tomcat-7.0.27/conf/mod_jk.conf
    Include /usr/local/apache2/conf/mod_kerb.conf

[root httpd-2.4.10]# vi /usr/local/apache2/conf/mod_kerb.conf
# New file for the configuration of the module "mod_auth_kerb" and Kerberos
    ServerAdmin root@localhost
    # The FQDN of the host server
    ServerName lif01.example.com:80

# Of course, find the location of the mod_auth_kerb and replace it there if
# it's not the same
    LoadModule auth_kerb_module /usr/local/apache2/modules/mod_auth_kerb.so

<Location />
    AuthName "EXAMPLE.COM"
    AuthType Kerberos
    Krb5Keytab /etc/krb5lif.keytab
    KrbAuthRealms EXAMPLE.COM
    KrbMethodNegotiate On
    KrbMethodK5Passwd On
    require valid-user
</Location>

 

The next step is to build mod_auth_kerb and mod_jk. The build of mod_auth_kerb requires an already installed Kerberos client on this Liferay server. As seen below, my Kerberos client on this server is under /usr/local. Moreover, the build of mod_jk may require specifying the apxs binary used by Apache, which is why there is the "--with-apxs" parameter:

[root httpd-2.4.10]# cd ..
[root opt]# wget http://sourceforge.net/projects/modauthkerb/files/mod_auth_kerb/mod_auth_kerb-5.4/mod_auth_kerb-5.4.tar.gz/download
[root opt]# tar -xvf mod_auth_kerb-5.4.tar.gz
[root opt]# cd mod_auth_kerb-5.4
[root mod_auth_kerb-5.4]# ./configure --with-krb4=no --with-krb5=/usr/local --with-apache=/usr/local/apache2
[root mod_auth_kerb-5.4]# make
[root mod_auth_kerb-5.4]# make install

[root mod_auth_kerb-5.4]# cd ..
[root opt]# wget http://mirror.switch.ch/mirror/apache/dist/tomcat/tomcat-connectors/jk/tomcat-connectors-1.2.40-src.tar.gz
[root opt]# tar -xvf tomcat-connectors-1.2.40-src.tar.gz
[root opt]# cd tomcat-connectors-1.2.40-src/native
[root native]# ./configure --with-apxs=/usr/local/apache2/bin/apxs --enable-api-compatibility
[root native]# make
[root native]# make install

 

The mod_auth_kerb module doesn't need extra configuration, but that's not the case for mod_jk, for which we need to define several elements such as the log file and level, the JkMount parameters defining which HTTP requests should be sent to the AJP connector, and so on:

[root native]# cd ../..
[root opt]# vi /opt/liferay-6.1.1/tomcat/conf/mod_jk.conf
LoadModule jk_module /usr/local/apache2/modules/mod_jk.so
    JkWorkersFile /opt/liferay-6.1.1/tomcat-7.0.27/conf/workers.properties
    JkLogFile /usr/local/apache2/logs/mod_jk.log
    JkLogLevel debug
    JkLogStampFormat "[%a %b %d %H:%M:%S %Y]"
    # JkOptions indicate to send SSL KEY SIZE,
    JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
    # JkRequestLogFormat set the request format
    JkRequestLogFormat "%w %V %T"
    JkMount / ajp13
    JkMount /* ajp13

[root opt]# vi /opt/liferay-6.1.1/tomcat/conf/workers.properties
    # Define 1 real worker named ajp13
    worker.list=ajp13
    worker.ajp13.type=ajp13
    worker.ajp13.host=localhost
    worker.ajp13.port=8009
    worker.ajp13.lbfactor=50
    worker.ajp13.cachesize=10
    worker.ajp13.cache_timeout=600
    worker.ajp13.socket_keepalive=1
    worker.ajp13.socket_timeout=300

 

Finally, the last configuration for Apache httpd is to configure a krb5.conf file for the Kerberos client to know where the KDC is located:

[root opt]# vi /etc/krb5.conf
    [libdefaults]
        default_realm = EXAMPLE.COM

    [realms]
        EXAMPLE.COM = {
            kdc = kdc01oel.example.com:88
            admin_server = kdc01oel.example.com:749
            default_domain = example.com
        }

    [domain_realm]
        .example.com = EXAMPLE.COM
        example.com = EXAMPLE.COM

 

Once this is done, there is one step left to execute on the KDC side for the configuration of Kerberos. Indeed, the mod_kerb.conf configuration above references a keytab file named krb5lif.keytab. By default, this file doesn't exist, so we must create it! From the KDC host server, execute the following commands to create a new service account for Liferay and then create the keytab for this service account:

[root opt]# kadmin
Authenticating as principal root/admin@EXAMPLE.COM with password.
Password for root/admin@EXAMPLE.COM:  ##Enter here the root admin password##

kadmin:  addprinc HTTP/lif01.example.com@EXAMPLE.COM
WARNING: no policy specified for HTTP/lif01.example.com@EXAMPLE.COM; defaulting to no policy
Enter password for principal "HTTP/lif01.example.com@EXAMPLE.COM":  ##Enter a new password for this service account##
Re-enter password for principal "HTTP/lif01.example.com@EXAMPLE.COM":  ##Enter a new password for this service account##
Principal "HTTP/lif01.example.com@EXAMPLE.COM" created.

kadmin:  ktadd -k /etc/krb5lif.keytab HTTP/lif01.example.com@EXAMPLE.COM
Entry for principal HTTP/lif01.example.com@EXAMPLE.COM with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/etc/krb5lif.keytab.
Entry for principal HTTP/lif01.example.com@EXAMPLE.COM with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab WRFILE:/etc/krb5lif.keytab.
Entry for principal HTTP/lif01.example.com@EXAMPLE.COM with kvno 2, encryption type des3-cbc-sha1 added to keytab WRFILE:/etc/krb5lif.keytab.
Entry for principal HTTP/lif01.example.com@EXAMPLE.COM with kvno 2, encryption type arcfour-hmac added to keytab WRFILE:/etc/krb5lif.keytab.

kadmin:  exit

[root opt]# scp /etc/krb5lif.keytab root@lif01.example.com:/etc/
root@lif01.example.com's password:
krb5lif.keytab [====================================>] 100% 406 0.4KB/s 00:00
[root opt]# exit

 

From now on, all configurations required by Apache and Tomcat to handle Kerberos tickets are done. The only remaining step, and certainly the most complicated, is to configure Liferay to understand and use this kind of authentication. For that purpose, a Liferay hook must be created (in Eclipse using the Liferay plugin, for example). Let's name this Liferay project created with the liferay-plugins-sdk-6.1.1: "custom-hook". For the configuration below, I will suppose that this project is at the following location: "C:/liferay-plugins-sdk-6.1.1/hooks/custom-hook/", and this location is abbreviated to %CUSTOM_HOOK%. You will find at the bottom of this blog a link to download the files that should be in this custom-hook. Feel free to use it!

 

To create a new authentication method, the first step is to create and edit the file %CUSTOM_HOOK%/docroot/WEB-INF/liferay-hook.xml as follow:

liferay-hook.png

 

Then, create and insert in the file %CUSTOM_HOOK%/docroot/WEB-INF/src/portal.properties the following lines:

    # This line defines the new auto login authentication used by Liferay
    auto.login.hooks=com.liferay.portal.security.auth.KerberosAutoLogin

 

And finally, the last step is to create the Java class %CUSTOM_HOOK%/docroot/WEB-INF/src/com/liferay/portal/security/auth/KerberosAutoLogin with the following content. This class retrieves the Kerberos principal from the Kerberos ticket received by Apache and then transforms this principal to log the user in to Liferay. Please be aware that this code can probably not be used as such because it's specific to our company: the screenName used in Liferay is equal to the principal used in the KDC. That's why there are some logger.info calls in the code: to help you find the right mapping between the Liferay screenName and the KDC principal.

AutoLogin.png

 

After that, just build your hook and deploy it using the liferay deploy folder (/opt/liferay-6.1.1/deploy/). If necessary, restart Apache and Liferay using the services or the control scripts:

[root opt]# /opt/liferay-6.1.1/tomcat-7.0.27/bin/shutdown.sh
[root opt]# /opt/liferay-6.1.1/tomcat-7.0.27/bin/startup.sh
[root opt]# /usr/local/apache2/bin/apachectl -k stop
[root opt]# /usr/local/apache2/bin/apachectl -f /usr/local/apache2/conf/httpd.conf

 

Wait for Liferay to start and that's it: you should be able to obtain a Kerberos ticket from the KDC, access Liferay (through Apache on port 80), and be logged in automatically. That's MAGIC!
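
If the automatic login does not happen, a quick client-side check is to verify that a ticket can be obtained for the test principal listed at the beginning of this post:

# request a ticket-granting ticket for the test user, then list the cached tickets
kinit mpatou@EXAMPLE.COM
klist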

Thanks for reading and I hope you will be able to work with Kerberos for a long long time =).

 

Custom hook download link: custom-hook.zip

MySQL high availability management with ClusterControl

Sat, 2014-09-13 03:03

Installing and managing a highly available MySQL infrastructure can be really tedious. Solutions to facilitate the database and system administrators' tasks exist, but few of them cover the complete database lifecycle and address all the database infrastructure management requirements. Severalnines' product ClusterControl is probably the only solution that covers the full infrastructure lifecycle and is also able to provide the full set of functionalities required by database cluster architectures. In this article, I will show how to install, monitor, and administer a database cluster with ClusterControl.


Introduction

Severalnines is a Swedish company mostly composed of ex-MySQL AB staff. Severalnines provides automation and management software for database clusters. Severalnines' ClusterControl perfectly fits this objective by providing a full "deploy, manage, monitor, and scale" solution. ClusterControl supports several database cluster technologies such as Galera Cluster for MySQL, Percona XtraDB Cluster, MariaDB Galera Cluster, MySQL Cluster, and MySQL Replication. However, ClusterControl does not only support MySQL-based clusters, but also MongoDB clusters such as MongoDB Sharded Cluster, MongoDB Replica Set, and TokuMX. In this article, we will use Percona XtraDB Cluster to demonstrate the ClusterControl functionalities.

 There are two different editions of ClusterControl: the community edition that provides basic functionalities and the enterprise edition that provides a full set of features and a really reactive support. All the details about the features of both editions can be found on the Severalnines website (http://www.severalnines.com/ClusterControl). In this article, we will detail four main global functionalities that are covered by ClusterControl:

 

1. The cluster deployment

2. The cluster management

3. The cluster monitoring

4. The scalability functionalities

 

The cluster architecture that we chose for the purpose of this article is represented in Figure 1. This cluster is composed of three Percona XtraDB nodes (green), two HAProxy nodes (red), and one ClusterControl node (blue).

 

clustercontrol001.png

Figure 1: Percona XtraDB Cluster architecture


1. Cluster Deployment

As stated in the introduction, ClusterControl can manage several kinds of MySQL clusters or MongoDB clusters. The cluster deployment starts on the Severalnines website at http://www.severalnines.com/configurator by choosing the kind of cluster we want to install. Once we have selected Percona XtraDB Cluster (Galera), we can select on which infrastructure we want to deploy the cluster. We can choose between on-premise, Amazon EC2, or Rackspace. Since we want to install this cluster on our own infrastructure, our choice here is "on-premise".

Then we simply have to fill in the general settings forms by specifying parameters such as operating system, platform, number of cluster nodes, port numbers, OS user, MySQL password, system memory, database size, etc., as presented in Figure 2.

 

clustercontrolsetup.png

Figure 2: General Settings


Once the general settings forms are filled in, we have to specify the nodes that belong to the Percona XtraDB cluster as well as the storage details.

The first settings are related to the ClusterControl server, the ClusterControl address and memory. There are also the details regarding the Apache settings, since the web interface is based on an Apache web server:

 

clustercontrolsetup002.png

Figure 3: ClusterControl settings


Now you can fill in the parameters related to the Percona XtraDB data nodes.

 

clustercontrolsetup003.png

Figure 4: Percona XtraDB nodes settings


Once all settings are entered, a deployment package can be automatically generated through the “Generate Deployment Script” button. We simply have to execute it on the ClusterControl server in order to deploy the cluster. Of course, it is still possible to edit the configuration parameters by editing the my.cnf file located in s9s-galera-1.0.0-/mysql/config/my.cnf.

 

[root@ClusterControl severalnines]# tar xvzf s9s-galera-percona-2.8.0-rpm.tar.gz

[root@ClusterControl severalnines]# cd s9s-galera-percona-2.8.0-rpm/mysql/scripts/install/

[root@ClusterControl install]# bash ./deploy.sh 2>&1|tee cc.log

 

The deployment package will download and install Percona XtraDB Cluster on the database hosts, as well as the ClusterControl components to manage the cluster. When the installation is successfully finalized, we can access the ClusterControl web interface via http://ClusterControl

Once logged in to ClusterControl, we are able to view all the database systems that are managed and monitored by ClusterControl. This means that you can have several different cluster installations, all managed from one ClusterControl web interface.

 

clustercontrolsetup004.png

Figure 5: ClusterControl Database Clusters


Now the Percona XtraDB cluster is deployed and provides data high availability by using three data nodes. We still have to implement service high availability and service scalability. In order to do that, we have to set up two HAProxy nodes in the frontend. Adding an HAProxy node with ClusterControl is a straightforward procedure. We use a one-page wizard to specify the nodes to be included in the load balancing set and the node that will act as the load balancer, as presented in Figure 6.

 

clustercontrolsetup005.png

Figure 6 : Load balancer installation, using HAProxy


To avoid having a Single Point Of Failure (SPOF), it is strongly advised to add a second HAProxy node by following the same procedure as for adding the first HAProxy node. Then simply add a Virtual IP, using the “Install Keepalived” menu as presented in Figure 7.

 

clustercontrolsetup0x1.png 

 Figure 7: Virtual IP configuration using KeepAlived


2. Cluster Management 

ClusterControl offers a number of administration features such as: online backup scheduling, database node failover and recovery, schema management, manual start/stop of nodes, process management, automated recovery, database user management, database upgrades/downgrades, adding and removing nodes online, cloning (for Galera clusters), configuration management (independently for each MySQL node), and comparing the status of different cluster nodes.

Unfortunately, presenting all these great management functionalities is not possible in the context of this article. Therefore, we will focus on backup scheduling and user, schema, and configuration management.

 

a. Backup Scheduling

As far as I remember, MySQL backup has always been a hot topic. ClusterControl offers three backup possibilities for MySQL databases: mysqldump, Percona XtraBackup (full), and Percona XtraBackup (incremental). XtraBackup is a hot backup facility that does not lock the database during the backup. Scheduling backups and reviewing the performed backups is really easy with ClusterControl. It is also possible to immediately start a backup from the backup schedules interface. Figure 8 presents the backup scheduling screen.

 

clustercontrolsetup007.png

Figure 8: Backup scheduling screen (retouched image for the purpose of this article)

You no longer have to write a purge script to remove old backups: ClusterControl can purge the backups itself after the defined retention period (from 0 to 365 days).

Unfortunately the restore procedure has to be managed manually since ClusterControl does not provide any graphical interface to restore a backup.

 

b. User, schema, and configuration management 

We can manage the database schemas, upload dumpfiles, and manage user privileges through the ClusterControl web interface.

 

clustercontrolsetup008.png

Figure 9: MySQL user privileges management

 

You can also change the my.cnf configuration file, apply the configuration changes across the entire cluster, and orchestrate a rolling restart – if required. Every configuration change is version-controlled.

 

clustercontrolsetup009.png

 Figure 10: MySQL Configuration management

 

New versions of the database software can be uploaded to ClusterControl, which then automates rolling software upgrades.

 

clustercontrolsetup010.png

Figure 11: Rolling upgrade through ClusterControl interface


A production cluster can easily be cloned, with a full copy of the production data, e.g. for testing purposes.

 

clustercontrolsetup011.png

Figure 12: Cloning Cluster screen


3. Cluster monitoring

With ClusterControl, you are not only able to build a cluster from scratch or get a full set of cluster management functionalities. It is also a great monitoring tool that provides you with a number of graphs and indicators, such as the list of top queries (by execution time or Occurrence), the running queries, the query histogram, CPU/Disk/Swap/RAM/Network usage, Tables/Databases growth, health check, and schema analyzer (showing tables without primary keys or redundant indexes). Furthermore, ClusterControl can record up to 48 different MySQL counters (such as opened tables, connected threads, aborted clients, etc.), present all these counters in charts, and many other helpful things that a database administrator will surely appreciate.

 

clustercontrolsetup012.png

Figure 13: Database performance graphics with time range and zoom functionalities (retouched image for the purpose of this article)


ClusterControl provides some interesting information regarding database growth for data and indexes. Figure 14 presents a chart showing the database growth over the last 26 days.

 

clustercontrolsetup013.png

Figure 14: Database growth since the last 26 days

 

ClusterControl is also able to send e-mail notifications when alerts are raised, or even create custom expressions. The database administrator can also set up their own warning and critical thresholds for CPU, RAM, disk space, and MySQL memory usage. The following figure represents the resource usage for a given node.

 

clustercontrolsetup014.png

Figure 15: Resources usage for a Master node


Power users can set up custom KPIs, and get alerts in case of threshold breaches.

 

clustercontrolsetup015.png

 Figure 16: Custom KPIs definition

 

Health Report consists of a number of performance advisors that automatically examine the configuration and performance of the database servers, and alert in case of deviations from best practice rules.

 

clustercontrolsetup0xx.png

Figure 17: Health report with performance advisors

 

4. Scalability functionalities

Sooner or later, it will be necessary to add or remove a data node or an HAProxy node for scalability or maintenance reasons. With ClusterControl, adding a new node is as easy as selecting the new host and giving it the role we want in the cluster. ClusterControl will automatically install the packages needed for this new node and make the appropriate configuration in order to integrate it into the cluster. Of course, removing a node is just as easy.

 

clustercontrolsetup017.png

 Figure 18: New node addition and "add master" screens

 

Conclusion

With ClusterControl, Severalnines did a great job! For those who have ever tried to build and administer a highly available MySQL architecture using disparate clustering components such as Heartbeat, DRBD (Distributed Replicated Block Device), MySQL replication, or any other high availability component, I am sure you have often wished for a solution that provides a complete package. Deploying multiple clustering technologies can become a nightmare. Of course there are solutions such as MMM (Multi-Master Replication Manager for MySQL), but there is no solution covering the whole cluster lifecycle and offering such an amazing set of features via a nice web interface.

In addition to the great set of functionalities provided by ClusterControl, there is the Severalnines support: their support team is amazingly efficient and reactive. The response time advertised on the Severalnines website is 1 day, but I never waited more than 1 hour before getting a first answer.

As stated in the introduction, there are two editions: the community edition, with a limited set of functionalities, is free, whereas the enterprise edition is available under a commercial license and support subscription agreement. This subscription includes the ClusterControl software, upgrades, and 12 incidents per year. It is also interesting to note that Severalnines and Percona became partners this year.

 

The summary of my ClusterControl experience is presented in the table below:

 

Advantages:

+ Covers the whole cluster lifecycle: installation and upgrade as well as the management and monitoring phases

+ Much easier to use than many other tools that do not even provide half of the ClusterControl functionalities

+ Each operation is submitted as a new job, so all operations are logged

+ Amazingly reactive support!

Drawbacks / limitations:

- Does not provide backup restore functionalities

- It is not possible to acknowledge alerts or blackout targets

 

Additional information can be found at http://www.severalnines.com/blog. Since dbi services is a Severalnines partner and has installed this solution at several customer sites, feel free to contact us if you have any additional questions regarding ClusterControl.

Oracle OEM Cloud Control 12.1.0.4 - AWR Warehouse

Thu, 2014-09-11 20:42

This post explains how to configure and use the new AWR Warehouse functionality present in Enterprise Manager Cloud Control 12.1.0.4. This new feature offers the possibility to store data from the AWR of your main databases in the EM AWR warehouse.

As the OEM AWR warehouse automatically extracts the AWR data of the databases you have selected, there is no impact on your production databases. The main advantage is that it allows you to keep AWR historical data beyond the retention period of the target databases. Another benefit is a performance improvement and space savings on the target databases, because the AWR data is uploaded to a centralized AWR warehouse.

We have to apply a patch to make the AWR warehouse functionality available in the Database Performance menu. For Linux x86-64, you have to apply patch 19176910, which contains the EM DB plugin bundle patch 12.1.0.6.2. Before applying the patch, the screen appears as follows:

 

aw1

 

We have to use the latest OPatch version available for the 11.1 release.

Watch out: OPatch version 11.2.0.x.0 is not supported with Oracle Management Service (OMS) 12.1.0.3 and 12.1.0.4.

 

oracle@vmtestoraem12c:/u01/app/oracle/MiddleWare_12cR4/oms/OPatch/ [oms12c] opatch version

OPatch Version: 11.1.0.11.0

OPatch succeeded.

 

Then we can run:

 

oracle@vmtestoraem12c:/home/oracle/19176910/ [oms12c] opatchauto apply -analyze

OPatch Automation Tool

Copyright (c) 2014, Oracle Corporation. All rights reserved.

OPatchauto version : 11.1.0.11.0

OUI version       : 11.1.0.12.0

Running from       : /u01/app/oracle/MiddleWare_12cR4/oms

Log file location : /u01/app/oracle/MiddleWare_12cR4/oms/cfgtoollogs/opatch/opatch2014-08-25_16-45-18PM_1.log

OPatchauto log file: /u01/app/oracle/MiddleWare_12cR4/oms/cfgtoollogs/opatchauto/19176910/opatch_oms_2014-08-25_16-45-20PM_analyze.log

[Aug 25, 2014 4:46:33 PM]   Prerequisites analysis summary:

                             -------------------------------

                             The following sub-patch(es) are applicable:

                             oracle_sysman_db91:19051532

                             oracle_sysman_vt91:19060193

                             oracle_sysman_mos91:18873245

                             oracle_sysman_emas91:19051543

                             oracle_sysman_ssa91:19051528

 


We can check if any patches are installed in the OMS:


oracle@vmtestoraem12c:/home/oracle/19176910/ [oms12c] opatchauto lspatches

OPatch Automation Tool

Copyright (c) 2014, Oracle Corporation. All rights reserved.

There are no patches installed in the OMS system.

 

Then we stop the OMS:

 

oracle@vmtestoraem12c:/home/oracle/19176910/ [oms12c] emctl stop oms

Oracle Enterprise Manager Cloud Control 12c Release 4

Copyright (c) 1996, 2014 Oracle Corporation. All rights reserved.

Stopping WebTier...

WebTier Successfully Stopped

Stopping Oracle Management Server...

Oracle Management Server Successfully Stopped

Oracle Management Server is Down

 

Then, from the patch directory, we run: opatchauto apply

Once the patch is applied, you can start the OMS with the usual command: emctl start oms
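
For reference, the whole apply-and-restart sequence looks like this (the patch directory path is the one used earlier in this post):

cd /home/oracle/19176910
opatchauto apply
emctl start oms
emctl status oms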

Finally we have the AWR warehouse menu:

 

aw2


We first have to configure the AWR warehouse repository:

 

aw3

 

 

You select the configure button:

 

aw4

 

You enter the database and host credentials:

 

aw4

 

You configure the snapshot management for a one-year retention period; you can also define the snapshot upload interval, from 1 hour to 24 hours:

 

aw6

 

The caw_load_setup job_%id is successful.

Now we have access to the AWR warehouse screen where we can add databases:

 

aw7

 

When you select Add, you have to define preferred credentials for the database you are adding and grant some privileges:

 

aw8

 

We have to grant the execute privilege on dbms_swrf_internal to system:

 

SYS> grant execute on sys.dbms_swrf_internal to system;

Grant succeeded.


Concerning the NMO error, I had forgotten to run the agent 12c root.sh script:


[root@vmtestoradg1 ~]# cd /u00/app/oracle/agent12c/core/12.1.0.4.0

[root@vmtestoradg1 12.1.0.4.0]# ./root.sh

Finished product-specific root actions.

/etc exist

 

Finally the databases are added:

 

aw9

 

We have to grant access to the added databases: we select a target database and choose Privileges:

 

aw10

 

We select the sysman administrator account:

 

aw11

 

As we have configured the snapshot upload interval to 24 hours, we will have data every day:

 

aw12

 

From this dashboard, you can run ADDM comparisons, view ADDM reports and ASH analytics, or go directly to the performance page of the database you have selected.
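
A quick way to check what has landed in the warehouse repository is to count the imported snapshots per source database; a minimal sketch, run as a DBA user on the AWR warehouse database:

sqlplus -s / as sysdba <<'EOF'
set pagesize 100 linesize 150
-- one row per source database uploaded into the AWR warehouse
select dbid, min(begin_interval_time) first_snap, max(end_interval_time) last_snap, count(*) snapshots
  from dba_hist_snapshot
 group by dbid;
EOF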

The AWR Warehouse feature requires the Diagnostics Pack license. This new feature seems very interesting thanks to the central, consolidated AWR warehouse. I will test this exciting feature more deeply in the coming weeks.