Amis Blog

Friends of Oracle and Java

Fastest creation of a Lean VirtualBox VM Image with Oracle Database 11gR2 XE, the Node.JS 7.x and the Oracle DB Driver for Node

Tue, 2017-05-16 11:05

For a workshop on Node.js I needed a VM to demonstrate – and let students try out – the Oracle DB Driver for Node. I wanted a lean VM with the bare minimum: Oracle Database XE, Node, the Oracle DB Driver for Node and the Git client (for fetching sources from GitHub). I stumbled across the OXAR repository on GitHub (https://github.com/OraOpenSource/OXAR) – an Oracle XE & APEX build script along with images for popular cloud platforms, see http://www.oraopensource.com/oxar/ . Using the sources I found there, I could create my VM in a few simple, largely automated steps. I ended up with a 4.5 GB VM image (which exports as a 1.5 GB appliance) that runs in 1 GB of RAM. It is more than adequate for my needs.

I have documented the steps – for myself, should I need to go through them again, and of course for you, the reader, so you can also create this handsome, useful VM.

The steps for creating your own VM image are as follows:

1. make sure that you have Vagrant and VirtualBox installed locally (https://www.vagrantup.com/ and https://www.virtualbox.org/)

2. get the OXAR repository content

git clone https://github.com/OraOpenSource/OXAR

3. Download the Oracle 11gR2 XE installer for Linux from OTN: http://www.oracle.com/technetwork/database/database-technologies/express-edition/downloads/index.html and copy the downloaded file oracle-xe-11.2.0-1.0.x86_64.rpm.zip to the OXAR/files directory

4. edit the file config.properties in the OXAR root directory


– set parameter OOS_ORACLE_FILE_URL to file:///vagrant/files/oracle-xe-11.2.0-1.0.x86_64.rpm.zip and save the change:

OOS_ORACLE_FILE_URL=file:///vagrant/files/oracle-xe-11.2.0-1.0.x86_64.rpm.zip


Use the OOS_MODULE_XXX flags to specify which components should be installed. Here I have chosen not to install APEX and NODE4ORDS.

5. run vagrant using the statement:

vagrant up

This will run for a while: it downloads the CentOS base image, creates the VM (with NAT network configuration) and installs the Git client, Oracle 11gR2 XE Database, Node and the Oracle DB Driver for Node.

6. after rebooting the system, the VM will be started (or you can start it using vagrant up again or by using the VirtualBox manager).


You can start an SSH session into it by connecting to localhost:50022 and then log in to Linux using vagrant/vagrant.


7. connect to the database using sqlplus hr/oracle


8. Try out Node.js


9. To try out Node against Oracle Database using the driver, you can clone the GitHub Repository:

git clone https://github.com/lucasjellema/nodejs-oracledbdriver-quickstart


Next, cd into the newly created directory


and run the file select.js:


10. To try out PL/SQL as well:

Create the procedure get_salary using the file get_salary_proc.sql

 


Run Node program plsql.js:



Oracle SOA Suite: Two-way SSL with TLS1.2 made easy (slightly less complicated)

Wed, 2017-05-10 14:14

Transport layer security (TLS) is not an easy topic. Many blogs have been written about this already. Surprisingly though, I did not find a single blog which was more or less complete and provided me with everything I needed to know to get this working on SOA Suite 12.2.1. In this blog I try to make the topic easier to understand and provide a complete end-to-end example.

If you only want a working implementation and do not care much about the explanation, you can skip the ‘Some basics’ section, execute only the commands in bold in the ‘Let’s get started!’ section and follow the steps in the ‘WebLogic and SOA Suite’ section. Do take into consideration any existing SSL-related configuration on your own system.

Some basics

SSL/TLS

SSL stands for Secure Sockets Layer. SSL is the predecessor of TLS. SSL should be considered insecure since the POODLE attack was announced in October 2014. TLS currently has 4 versions: TLS 1.0, 1.1, 1.2 and 1.3. TLS 1.3 is not widely supported/adopted yet. SSL/TLS provide integrity checks, security and authentication.

Identity

A server which hosts traffic on a port with SSL/TLS enabled has an identity keystore. This identity keystore contains a private key and a public key/certificate. The public key/certificate can safely be given to other parties. When visiting an HTTPS website (HTTP with SSL enabled), the public key is sent to you. The other party / client can use the public key to encrypt messages meant for the server. The only one who can decrypt those messages is the holder of the server's private key, which is usually only the server itself.

Trust

Can you trust a server? You can use a certificate authority to create a signed public key. If someone trusts the certificate authority, that someone also automatically trusts the signed key. With websites you often see a green lock when a website uses HTTPS with a public certificate signed by a certificate authority that is trusted by your web browser.

Usually a truststore is used to store trusted certificate authorities or specific trusted certificates. If you have many servers in your application landscape, it is recommended to use a certificate authority since it is cumbersome to load every public key of every server in every truststore. Trusting a single certificate authority makes things a lot easier.

Certificate authority

A certificate authority has a private key which it can use to sign a so-called certificate signing request. From this certificate signing request you can create a signed public key.

Companies such as Google and Microsoft perform certain checks to confirm someone's identity before providing them with a signed public key. You can pay these companies to perform those checks and give you a signed certificate. Most of these companies are trusted certificate authorities by default in several OSs and browsers. This way, for a website for example, you do not have to make changes on the client for your certificate to be trusted.

If you run several servers within your internal company network, you often do not require these external checks. You can create your own certificate authority private key and create a signed public key yourself. This certificate authority is not trusted by default, so you should trust the public certificate of your self-signed certificate authority in order to establish trust.

Cipher

A cipher is an algorithm for encryption and decryption. With SSL, during the handshake phase (the phase which establishes an SSL session), a cipher is determined. The client usually provides a list of the ciphers it supports and the server chooses which one to use. During an SSL handshake you can see in logfiles which cipher is chosen.
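
To get a feel for which cipher suites a JVM (such as the one running WebLogic) offers during the handshake, a small Java sketch like the one below can be used. It is purely illustrative and the class name is made up; it prints the cipher suites the default SSLContext enables and the full list it supports.

import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;

public class ListCipherSuites {
    public static void main(String[] args) throws Exception {
        // the default SSLContext of the JVM determines which ciphers are offered in the handshake
        SSLContext context = SSLContext.getDefault();
        SSLSocketFactory factory = context.getSocketFactory();
        System.out.println("Enabled by default:");
        for (String suite : factory.getDefaultCipherSuites()) {
            System.out.println("  " + suite);
        }
        System.out.println("Supported (can be enabled explicitly):");
        for (String suite : factory.getSupportedCipherSuites()) {
            System.out.println("  " + suite);
        }
    }
}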

Let’s get started!

I used two SOA Suite 12.2.1.2 installations (complete, no quickstart) in two different VMs for this example: soaserver1 and soaserver2. I used a host-only network with fixed IPs in VirtualBox and added IP/hostname mappings to the hosts files of the two servers.

Create a self-signed certificate authority

A blog explaining the topic on creating your own certificate authority can be found here. This is just my short summary with some corrections. Do read it for some easy to understand background information.

This simple example uses OpenSSL. OpenSSL is installed by default on most Linux environments and can also be installed on other OSs.

First create a private key for your certificate authority:

openssl genrsa -des3 -out rootCA.key 2048

I create an RSA key and protect it with the DES3 cipher algorithm based on a password. I want my key to have a length of 2048 bits. You can also choose ECC keys; they can be smaller than RSA keys while providing the same level of protection. ECDSA (Elliptic Curve Digital Signature Algorithm) ciphers use ECC keys. Keep this key private! It allows you to sign public keys (see later in this post) and create trust.

Next I self-sign this generated key. This creates a public signed key for the certificate authority. I can load this key in truststores to achieve trust for keys which are signed with this certificate:

openssl req -x509 -new -nodes -key rootCA.key -sha256 -days 1024 -out rootCA.pem -subj '/CN=Conclusion/OU=Integration/O=AMIS/L=Nieuwegein/ST=Utrecht/C=NL' -extensions v3_ca

Lets break this down:

  • req: do a request
  • x509: this defines the format of the key to be generated. In the x509 standard, several pieces of metadata can be stored with the certificate and the certificate authority structure is also part of the x509 standard. Read more here.
  • new: generate a new key
  • nodes: this is actually ‘no DES’. My public key does not need to be protected with a password.
  • key: specifies the private key to sign
  • sha256: secure hash algorithm. Hashing is used to provide data integrity functionality. Creating a hash of a transmission allows you to check at a later time if the transmission has been tampered with.
  • days: specifies the validity of the generated certificate
  • subj: provides some metadata for the certificate
  • extensions v3_ca: this adds a metadata field to the certificate indicating that it is a certificate of a certificate authority. If this extension is not added, certain validations might fail

You can use the certificate authority private key and certificate as server identity but you shouldn’t. This will give certain validation errors because of the ‘extensions v3_ca’.

Create server identity keys

Next we create a private key which will be used as identity of the WebLogic server

openssl genrsa -des3 -out soaserver1.key 2048

After we have created this private key, we can create a certificate signing request for this private key

openssl req -new -key soaserver1.key -out soaserver1.csr -subj '/CN=soaserver1/OU=Integration/O=AMIS/L=Nieuwegein/ST=Utrecht/C=NL'

This is pretty similar to what we did for the certificate authority. However, mind the subj clause here: the common name should match the server hostname. This will be used later by the client to verify the identity of the server. In order to allow two-way SSL, I added the server hostname-to-IP mapping to every server's hosts file. In an enterprise you would use DNS (domain name system) for this, since you do not want to maintain every mapping locally on every server.

Next sign the certificate using the information in the private key and certificate of the certificate authority.

openssl x509 -req -in soaserver1.csr -CA rootCA.pem -CAkey rootCA.key -CAcreateserial -out soaserver1.crt -days 1024 -sha256

This is very similar to signing the certificate authority certificate. Mind that a validity period longer than the validity of the certificate authority key is of course useless. -CAcreateserial creates a new file rootCA.srl. This serial number is unique for every signed certificate. You should save it so that at a later time you can check whether a certificate has been tampered with.

The next time you sign a certificate, you can use:

openssl x509 -req -in soaserver1.csr -CA rootCA.pem -CAkey rootCA.key -CAserial rootCA.srl -out soaserver1.crt -days 1024 -sha256

This will increase the previous serial by 1, making sure it is unique.

Creating an identity keystore

Now you have a signed certificate and a private key. Time to make a Java keystore (JKS) which can be used in WebLogic Server, SOA Suite and other pieces of Java.

openssl pkcs12 -export -in soaserver1.crt -inkey soaserver1.key -chain -CAfile rootCA.pem -name "soaserver1" -out soaserver1.p12

keytool -importkeystore -deststorepass Welcome01 -destkeystore soaserver1identity.jks -srckeystore soaserver1.p12 -srcstoretype PKCS12

The above steps:

  • creating a private key
  • creating a certificate signing request
  • signing the certificate with the private key of the certificate authority
  • creating an identity keystore

need to be done for every server.

Creating a trust keystore

Here you can pick the fruits of the above work of using a certificate authority to sign your server private keys. You can use the certificate authority certificate in a truststore, and every key signed with that certificate is trusted. You do not need to load every individual server certificate into every truststore. Creating a truststore is easy and you can do this once and use the same trust.jks file on all your servers.

keytool -import -alias rootCA -file rootCA.pem -keystore trust.jks -storepass Welcome01
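
To illustrate how the identity and trust keystores work together, here is a minimal Java sketch of a two-way SSL test client. It is not part of the setup itself; the class name is made up, and the file names, password and host/port are simply the ones used in the examples in this post, so adjust them to your own environment. It also pins the protocol to TLSv1.2, which is discussed for WebLogic further down.

import java.io.FileInputStream;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManagerFactory;

public class TwoWaySslTestClient {
    public static void main(String[] args) throws Exception {
        char[] password = "Welcome01".toCharArray();

        // identity: the client presents its own certificate (required for two-way SSL)
        KeyStore identity = KeyStore.getInstance("JKS");
        identity.load(new FileInputStream("soaserver1identity.jks"), password);
        KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(identity, password);

        // trust: the client trusts everything signed by the self-signed certificate authority
        KeyStore trust = KeyStore.getInstance("JKS");
        trust.load(new FileInputStream("trust.jks"), password);
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trust);

        SSLContext context = SSLContext.getInstance("TLSv1.2");
        context.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);

        try (SSLSocket socket = (SSLSocket) context.getSocketFactory().createSocket("soaserver1", 7002)) {
            socket.startHandshake();
            System.out.println("Negotiated protocol: " + socket.getSession().getProtocol());
            System.out.println("Negotiated cipher  : " + socket.getSession().getCipherSuite());
        }
    }
}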

WebLogic and SOA Suite

It is interesting to notice that the incoming WebLogic configuration differs from the outgoing SOA Suite configuration. This is of course not surprising, since a server usually only has a single identity, but an integration product like SOA Suite should be able to interact with multiple protected external resources, possibly requiring different ciphers and keys for each of them. Also, SOA Suite in the past (I'm not sure if that is still the case) could run on IBM WebSphere instead of WebLogic Server. Thus I can understand Oracle chose to provide a more generic implementation of SSL in SOA Suite than the WebLogic-specific one.

WebLogic

The WebLogic server configuration is pretty straightforward. In this example I'm only looking at SSL for incoming and outgoing messages for SOA Suite. The WebLogic-specific configuration is only relevant for incoming connections. Basically the steps are as follows:

  • Enable SSL for the managed server
  • Specify keystores for identity and trust
  • Configure incoming SSL specifics
  • Force the server to use TLS1.2
Enable SSL for the managed server

First enable the listen port for SSL. In the WebLogic console, go to Environment, Servers, select your server, Configuration, General and check ‘SSL Listen Port Enabled’. You can also specify the SSL port here.

Specify the keystores for identity and trust

In the WebLogic console, go to Environment, Servers, select your server, Configuration, Keystores. Here you can specify the identity and trust keystores you have created during the above steps.

Configure incoming SSL specifics

In the WebLogic console, go to Environment, Servers, select your server, Configuration, SSL. Here you can specify the identity key used by the server and several checks which can be done when establishing the SSL connection.

Some important settings:

  • BEA Hostname verifier. This indicates the CN field in the certificate is checked against the server hostname.
  • Client certs requested and enforced. If set, Two-Way SSL will be used and the client won’t be able to connect unless it presents a trusted certificate.
  • Built-in SSL Validation and Cert Path Validators. This checks the certificate chain.

It is important to understand what these checks do. A host name verifier ensures the host name in the URL to which the client connects matches the host name in the digital certificate that the server sends back as part of the SSL connection. This helps prevent man in the middle attacks where the client might connect to a different URL.

The situation below is something you won't prevent even with these checks: I could connect from soaserver1 to the soaserver2 WebLogic server without problems using the certificate of soaserver2. Also, when using the private key of soaserver1 as identity on soaserver2, soaserver2 would not complain about this. FireFox would though, and most likely other clients as well.

Force TLS1.2

If you want to force WebLogic / SOA Suite to use TLS 1.2 you can specify the following JVM parameters in the setDomainEnv.sh file.

-Dweblogic.security.SSL.minimumProtocolVersion=TLSv1.2 -Dhttps.protocols=TLSv1.2

SOA Suite

The SOA Suite configuration is a bit more elaborate in that it requires configuration in different places, not all of which can be done from the GUI.

The steps which need to be performed are:

  • Specify the identity store used by SOA
  • Create a keystore password credential in the credential store
  • Configure composite to use two-way SSL
Specify identity store

First you have to specify the identity store which the SOA Suite will use for outbound connections. You can find this setting by going to SOA, soa-infra, SOA Administration, Common Properties, (scroll down), ‘More SOA Infra Advanced Configuration Properties…’

Here you have to specify the server's identity keystore. In my case /home/oracle/certs/soaserver1identity.jks.

Create a keystore password credential in the credential store

Next you have to specify the keystore password. If you forget to do this, you will encounter errors like:

On the client:
<May 7, 2017, 12:58:43,939 PM CEST> <Error> <oracle.integration.platform.blocks.soap> <BEA-000000> <Unable to create SSL Socket Factory>

On the server:
[2017-05-07T12:26:02.364+02:00] [soa_server1] [NOTIFICATION] [] [oracle.integration.platform.common.SSLSocketFactoryManagerImpl] [tid: [ACTIVE].ExecuteThread: ’25’ for queue: ‘weblogic.kernel.Default (self-tuning)’] [userId: <anonymous>] [ecid: cd874e6b-9d05-4d97-a54d-ff9a3b8358e8-00000098,0] [APP: soa-infra] [partition-name: DOMAIN] [tenant-name: GLOBAL] Could not obtain keystore location or password

You can set the keystore password by going to your domain, Security, Credentials. Create a credential map SOA with a key/user name of KeyStorePassword and the password you have used for your keystore. If not specified otherwise, the same password will be used for the key as for the keystore. You can create a KeyPassword entry for the key password if they differ.

  

Configure composite to use two-way SSL

This step is easy. You have to add a binding property to your reference which indicates you want to use two-way SSL.

In the composite.xml file on your reference you can add:

<property name="oracle.soa.two.way.ssl.enabled">true</property>

This causes the composite binding to use the identity (with the credential store password) for outbound SSL specified in the previously configured MBean.

You should of course also not forget to set the endpoint to a port which hosts HTTPS and indicate in the URL that it should use HTTPS to call this endpoint. In my example I’ve overridden the URL in the EM. Be aware though that overriding the endpoint URL might still cause the original endpoint to be called when the overridden endpoint is not accessible (if for example the SSL connection has issues).

Some useful tips

If you want to debug SSL connections, the following tips might help you.

FireFox

It might appear strange to use a web browser to test SSL connections, but which piece of software uses SSL more than a browser? FireFox is very clear in its error messages about what has gone wrong, which greatly helps with debugging. FireFox uses its own certificate store and can provide certificates to log in to a server. You can configure them from FireFox, Preferences, Advanced, Certificates, View Certificates. Here you can import client certificates such as the p12 files you have generated in an earlier step.

This provides for a very easy way to check whether a server can be accessed with SSL and if the server certificate has been correctly generated / set-up. FireFox also extensively checks certificates to provide the green-lock icons people are quite familiar with.

In this case I have SSL enabled for soaserver1 on port 7002. I open https://soaserver1:7002 in FireFox (do not forget the HTTPS part). Since I have enabled ‘Client certs requested and enforced’ in the WebLogic SSL configuration, it will ask me for a client key.

In this case you can check whether a client certificate will be trusted. When opening https://soaserver1:7002 and you get a 404 message, the WebLogic server is responding to you after the SSL handshake has succeeded.

In FireFox you can tweak the cipher suites which are used. Read about this here. Do mind that SSL connections can be cached and FireFox can remember to send specific keys. If you run FireFox on soaserver1 and open a link on soaserver1, Wireshark (read below) will not detect traffic on the same interface which is used to access soaserver2.

Wireshark

Use Wireshark to monitor connections/handshakes.

  • You can confirm the SSL/TLS version being used
  • You can see the number of messages which have crossed the wire (this allows you to distinguish retries if, for example, a handshake fails)
  • Allows you to decrypt SSL traffic (if you have the private key)
  • It allows you to confirm an SSL connection is actually being set up. If you do not see it in Wireshark, no message has been sent and the connection build-up fails on the client. This for example happens when the SOA / KeyStorePassword entry has not been set in the SOA Suite credential store.

SSL debug logging

If you want to see what is happening with your SSL connection, it is very helpful to provide some JVM switches in setDomainEnv.

-Dweblogic.security.SSL.verbose -Djavax.net.debug=all -Dssl.debug=true

You can also enable WebLogic SSL debugging in WebLogic console. Open a server and enable weblogic.security.SSL

Portecle

Portecle is a handy and freely available tool if you want to manage keystores and look at key details.

Which service is called?

Suppose you have process A on server X which calls process B on server Y. For testing you first deploy process B on server X and use the WSDL of process B locally from process A. Next you override the endpoint to refer to the SSL port of server Y. What happens if the SSL connection cannot be established? By default, there are 3 retries after which the process falls back to using the endpoint as specified in the WSDL file. When testing it might seem the call from process A on X to B on Y works but it is actually a local call because the local call is the fallback for the remote call. In this case you should confirm an instance of B is created on Y.

Finally

Performance impact

Using SSL of course has a performance impact. 1-way SSL is faster than 2-way SSL. Using encryption is slower than not using encryption. Key length and cipher suites also play a major role in how fast your SSL connection will be. I have not measured the precise cost of the different options, but you should consider what you need and what you are willing to pay for it in terms of performance impact.

  • One way SSL allows the client to verify the server identity (certificate, hostname). The server provides the client with a public key but not the other way around.
  • Two way SSL also allows the server to verify the client. The client also needs to provide a public key.

SSL verifies host identities, keys, certificate chains. It does not allow you to provide (specific user) application authentication or authorization. You could do it with SSL but it would require giving every user a specific certificate. There are better ways to do that such as WS-Security, SAML or OAuth.

Entropy

If you use a server which has a lot of SSL connections, the random number generator is asked often for a new random number. Random numbers are generated by using entropy (a measure of randomness/disorder), which is a limited resource, especially in virtualized environments.

There is a setting which allows WebLogic server to recycle random numbers at the cost of security (the random number generator becomes predictable). Read more about that here.

-Djava.security.egd=file:/dev/./urandom

Oracle does not recommend using this recycling mechanism in production environments since if you can predict the random number generator, you have introduced a security vulnerability which can be exploited. Next to speeding up SSL connections, your server startup will most likely also be improved.
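
If you want a rough feel for whether entropy is a bottleneck on a particular machine, a tiny Java sketch like the one below can help. It is purely illustrative (the class name and buffer size are made up): a blocking ('strong') SecureRandom typically reads /dev/random on Linux and may stall when the entropy pool is depleted, while the default SecureRandom normally does not.

import java.security.SecureRandom;

public class EntropyCheck {
    public static void main(String[] args) throws Exception {
        // getInstanceStrong() typically maps to a blocking source such as /dev/random on Linux
        SecureRandom strong = SecureRandom.getInstanceStrong();
        byte[] buffer = new byte[64];
        long start = System.nanoTime();
        strong.nextBytes(buffer);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Generated 64 strong random bytes in " + elapsedMs + " ms");
    }
}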

CRLs

I’ve not talked about a lot of things such as certificate revocation lists (CRLs). These lists contain keys which have been compromised. Compromised means the private key of a certificate authority has become public. Using the private CA key, someone is able to create new certificates which are being trusted by people who trust the CA. If a person can do such a thing, he is able to gain access to systems. Remember private keys can also be used to decrypt traffic? This is of course an issue on the internet but also when you have your own certificate authority. More generally speaking, if a private key is compromised, all trust should be revoked since you cannot count anymore on that for example a server is the sole owner of the key and is the only one who can decrypt traffic.

JDBC and SSL

Read more about this in the whitepaper here. It requires Oracle Advanced Security (OAS), which is an Oracle Database Enterprise Edition option. The US Government does not allow double encryption (you can imagine why..). If you configure Oracle Advanced Security to use SSL encryption and another encryption method concurrently, then the connection fails. See SSL Usage issues here.
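
As a sketch of what such a connection could look like from a Java client with the JDBC thin driver (based on the approach in that whitepaper): the host, port, service name, credentials, keystore location and password below are invented for this example, and the exact property names may differ per driver version.

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class JdbcOverTcps {
    public static void main(String[] args) throws Exception {
        // the TCPS protocol in the connect descriptor enables SSL/TLS on the JDBC connection
        String url = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=soaserver1)(PORT=2484))"
                   + "(CONNECT_DATA=(SERVICE_NAME=orcl)))";

        Properties props = new Properties();
        props.setProperty("user", "hr");
        props.setProperty("password", "hr");
        // trust the self-signed CA; add an identity keystore as well when the listener requires client authentication
        props.setProperty("javax.net.ssl.trustStore", "/home/oracle/certs/trust.jks");
        props.setProperty("javax.net.ssl.trustStoreType", "JKS");
        props.setProperty("javax.net.ssl.trustStorePassword", "Welcome01");

        try (Connection connection = DriverManager.getConnection(url, props)) {
            System.out.println("Connected over TCPS: " + !connection.isClosed());
        }
    }
}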

Other things

I have not talked about securing the connection between managed servers in a cluster and between the NodeManager and managed servers. You can read more about that here. Do mind though that using trust can be more efficient than specifically putting every public key in every truststore. Especially when you have many servers.


Consequences of stopping Oracle support

Sun, 2017-05-07 03:40

When buying licenses for Oracle, the first year of support is mandatory. After that, a customer may decide to stop paying for the yearly technical support of the Oracle licenses. The consequences of that decision are not always clear to customers. Most OLSAs will contain the sentence “If you decide not to purchase technical support, you may not update any unsupported program licenses with new versions of the program.”

This is correct, but there is more to think of. This post will cover the elements that should be considered when deciding to stop support.

Unsupported actions

The Technical Support Policy of Oracle clarifies a bit more which actions a customer is not entitled to perform when stopping support:

Customers with unsupported programs are not entitled to download, receive, or apply updates, maintenance  releases, patches, telephone assistance, or any other technical support services for unsupported programs.

This means the software instantly becomes legacy, AND a substantial risk. The Oracle software will not be upgraded or patched, while the environment (OS, client software, middleware, other connected software) will be – with the possible effect that the application might not work in the future.

Audit-ranking

Although Oracle claims that the departments Support, Account Management and LMS act more or less separately and will not share this kind of information, it is naive to assume that the decision to stop support for (part of) the Oracle licenses has no consequences for the customer's rank on LMS's list for submitting an audit.

 

Matching Service Levels

The support of the license to be stopped could be part of a so-called ‘subset’. Then the following rule applies according to the Support Policy:

You may desupport a subset of licenses in a license set only if you agree to terminate that subset of licenses.

The definition of a license subset is quite extensive, but here are two examples:

Oracle Database Enterprise Edition with RAC, Diagnostic and Tuning Pack.

Weblogic Suite with SOA Suite

So stopping support of only the options is a ‘Matching Service Levels’ issue, which LMS will translate as non-compliance, with the chance that My Oracle Support will not be willing to help when you submit a Service Request.

 


Repricing

Support of Oracle software is related to CSI numbers, and there may be several CSI numbers in one contract. A customer may also have multiple contracts, all with their own negotiated discounts. The following line in the Support Policy is important when stopping support of a line item:

Pricing for support is based upon the level of support and the volume of licenses for which support is ordered. In the event that a subset of licenses on a single order is terminated or if the level of support is reduced, support for the remaining licenses on that license order will be priced at Oracle’s list price for support in effect at the time of termination or reduction minus the applicable standard discount.

This is ‘Repricing’, also called ‘Pricing following Reduction’. The updated support renewal would then be recalculated at a less favourable discount, ending up with no savings – just less product on support for the same costs.

This mostly applies to terminating a license and not to terminating support (even though that is also a ‘reduced level of support’), but it is important to know.

Terminating a license within a CSI number – instead of stopping support – is in some cases, by the way, not a reason for repricing, e.g. when there has been a reorganisation of contracts in the past.

Reinstatement

When a customer decides – for whatever reason – to reinstate the support, there will be a reinstatement fee.

The reinstatement fee is computed as follows:

a) if technical support lapsed, then the reinstatement fee is 150% of the last annual technical support fee you paid for the relevant program;

b) if you never acquired technical support for the relevant programs, then the reinstatement fee is 150% of the net technical support fee that would have been charged

Engineered Systems

Stopping support of a product line also has a peculiar effect on products running on engineered systems.

The lifecycle management of engineered systems is maintained by so-called ‘bundle patches’. These bundle patches contain storage firmware patches, BIOS updates, OS updates and… Oracle software patches.

So when stopping Oracle support, you would still receive the database and middleware patches through the bundle patches, which is not allowed. And although it may be possible not to apply these patches, that would break the lifecycle management of the engineered system. I don't think this is advisable.

Prerequisites

The prerequisites of making such a decision:

  • An overview of all the Oracle contracts at your firm, which seems pretty obvious, but sometimes takes quite an effort.
  • An overview of what licences you are actually using, compared to what you are entitled to.
Recap

The OPEX (Operational or Operating Expenditures) can be decreased, in some cases substantially, but before jumping into action and conclusions, contact someone who understands the risks and is able to look further ahead into the future, together with you.

Resources

Example OLSA: http://www.oracle.com/us/corporate/pricing/olsa-ire-v122304-070683.pdf

Oracle Software Technical Support Policies :  http://www.oracle.com/us/support/library/057419.pdf


The Hello World of Machine Learning – with Python, Pandas, Jupyter doing Iris classification based on quintessential set of flower data

Sat, 2017-05-06 01:58

Plenty of articles describe this hello world of Machine Learning. I will merely list some references and personal notes – primarily for my own convenience.

The objective is: get a first hands on exposure to machine learning – using a well known example (Iris classification) and using commonly used technology (Python). After this first step, a second step seems logical: doing the same thing with my own set of data.

Useful Resources:

Starting time: 6.55 AM

6.55 AM Download and install latest version of Oracle Virtual Box (5.1.22)

7.00 AM Download Fedora 64-bit ISO image (https://getfedora.org/en/workstation/download/)

7.21 AM Create Fedora VM and install Fedora Linux on it from the ISO image (create users root/root and python/python); reboot, complete installation, run dnf update (updates worth 850 MB, 1348 upgrade actions – I regret this step), install the VirtualBox Guest Additions (non-trivial) using this article: https://fedoramagazine.org/install-fedora-virtualbox-guest/.

8.44 AM Save a Snapshot of the VM to retain its fresh, mint, new car smell  condition.

8.45 AM Install Python environment for Machine Learning (Python plus relevant libraries; possibly install Notebook server)

8.55 AM Save another snapshot of the VM in its current state

Now that the environment has been prepared, it is time for the real action – based on the second article in the list of resources.

10.05 AM start on machine learning notebook sample – working through Iris classification

10.15 AM done with sample; that was quick. And pretty impressive.

 

It seems the Anaconda distribution of Python may be valuable to use. I have downloaded and installed: https://www.continuum.io/downloads .

Note: to make the contents of a shared Host Directory available to all users

cd (go to home directory of current user)

mkdir share (in the home directory of the user)

sudo mount -t vboxsf Downloads ~/share/ (this makes the shared folder called Downloads on the VirtualBox host available as directory share in the guest (Fedora))

Let’s see about this thing with Jupyter Notebooks (formerly known as IPython). Installing the Jupyter notebook is discussed here: https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch01/README.md . Since I installed Anaconda (4.3.1 for Python 3.6) I have the Jupyter app installed already.

With the following command, I download a number of notebooks:

git clone https://github.com/rhiever/Data-Analysis-and-Machine-Learning-Projects

Let’s try to run one.

cd /home/python/Data-Analysis-and-Machine-Learning-Projects/example-data-science-notebook

jupyter notebook 'Example Machine Learning Notebook.ipynb'

And the notebook opens in my browser:

[Screenshot: the notebook opened in the browser]

I can run the notebook, walk through it step by step, edit the notebook’s contents and run the changed steps. Hey mum, I’m a Data Scientist!

Oh, it’s 11.55 AM right now.

 

Some further interesting reads to get going with Python, Pandas and Jupyter Notebooks – and with data:


Golden Gate 12c and DIY Sequence Replication with PL/SQL

Fri, 2017-05-05 10:58

Recently, while migrating AIX 11gR2 Databases to Oracle Linux 12cR1 on an ODA X5-2, our setup of Sequence Replication by Oracle Golden Gate appeared to be faulty. The target side sequences were not automatically incremented.

The problem came to light during the migration of acceptance databases, and under some time pressure the plan was to generate drop + create statements (start with = DBA_SEQUENCES.LAST_NUMBER + DBA_SEQUENCES.INCREMENT_BY) for all sequences in the Source, and to run these statements on the Target. Although this eventually gave the desired result, there were 2 side effects:

  • With a total of 1270 sequences, the operation as a whole took more than an hour.
  • Packages and triggers referencing these sequences became invalid.

Further research revealed that the Golden Gate Sequence Replication of Production suffered from the same problem, and I wondered if I could find a better solution, now with a bit more time at hand. Well, I discovered that to set any desired sequence “currval” value, a one-time temporary adjustment of the increment and a subsequent call to the sequence “nextval” pseudocolumn is sufficient. What follows is the output of a quick test, but check out what happens with “USER_SEQUENCES.LAST_NUMBER”, and what it really means in combination with the cache.

Create a test sequence

CREATE SEQUENCE TEST_SEQ_01 
START WITH 10 
INCREMENT BY 1000 
MINVALUE 10 
CACHE 20 
NOCYCLE 
NOORDER;
 
-- the sequence returns no current value yet
SELECT TEST_SEQ_01.CURRVAL from dual;
  ORA-08002: sequence TEST_SEQ_01.CURRVAL is not yet defined in this session.

-- check out last_number... it equals nextval because the cache doesn't exist yet
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER 
FROM user_sequences 
WHERE sequence_name = 'TEST_SEQ_01'; 
  MIN_VALUE	INCREMENT_BY CACHE_SIZE	LAST_NUMBER
  10	      1000	       20	        10

-- generate the first number and create the cache
SELECT TEST_SEQ_01.NEXTVAL from dual;
  NEXTVAL
  10

-- last_number is updated as the highest possible number of the cache
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER 
FROM user_sequences 
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      1000	       20	        20010

-- and now a current value is returned
SELECT TEST_SEQ_01.CURRVAL from dual;
  CURRVAL
  10

Set the current sequence value = 20000 without recreating the sequence

-- adjust the increment
ALTER SEQUENCE TEST_SEQ_01 INCREMENT BY 19990;

-- last_number equals the sequence next value
-- the last "alter sequence" command must have flushed the cache
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER 
FROM user_sequences 
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      19990	       20	        20000

-- generate the next value and create a new cache
SELECT TEST_SEQ_01.NEXTVAL from dual
  NEXTVAL
  20000

-- last_number is updated as the highest possible number of the cache
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER 
FROM user_sequences 
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      19990	       20	        419800

-- the sequence has the desired current value
SELECT TEST_SEQ_01.CURRVAL from dual
  CURRVAL
  20000

Reset the increment

-- set the increment_by value back to original
ALTER SEQUENCE TEST_SEQ_01 INCREMENT BY 1000;

-- again, the cache is flushed and last_number equals the next value
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER 
FROM user_sequences 
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      1000	       20	        21000

-- generate the next value and create a new cache
SELECT TEST_SEQ_01.NEXTVAL from dual
  NEXTVAL
  21000

-- last_number is updated as the highest possible number of the cache 
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER 
FROM user_sequences 
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      1000	       20	        41000

-- the increment is back to 1000
SELECT TEST_SEQ_01.NEXTVAL from dual
  NEXTVAL
  22000

This test shows that “USER_SEQUENCES.LAST_NUMBER”:

  • Is identical with sequence “nextval” directly after a “create sequence” or “alter sequence” command, because the cache is not there yet after first definition or gets flushed with an alter.
  • Is updated and saved to disk as the highest possible cache number after a call to “nextval”.
  • Serves as safeguard ( i.e. after a crash ) to ensure that sequence numbers do not conflict with numbers previously issued.

  • I decided to use “DBA_SEQUENCES.LAST_NUMBER” instead of the “currval” pseudo column to compare sequences in Source and Target. The reason is that “currval” is only ( and by definition ) the value returned by my sessions last call to “nextval”. If my session has not called “nextval” yet, “currval” is undefined. So I would have to “nextval” 1270 sequences in Source and also in Target before I could even start with the comparison, while last_numbers are already there to compare with. Also, this activity is unwanted during the short inactive Source and inactive Target migration stage and would take too much time. Last but not least, an exact match of sequence “currval” values is not really necessary… a guarantee of higher sequence “currval” values in Target compared to those in Source is quite enough.

    The next short piece of code is what I eventually came up with and used in the Production migration. It took less than 3 minutes processing time, did not render any Oracle object invalid, and contributed highly to a very limited migration inactivity time.

    -- Code assumes:
    --   1. "nocycle" sequences with positive "increment_by" values
    --   2. identical number of sequences and sequence DDL in Source and Target Database 
    -- Grant 'alter any sequence' and 'select any sequence' to the owner
    -- Replace the database link and schema names with your own
    -- Run the code from Target
    declare
      v_ret PLS_INTEGER := 0;
      v_dummy VARCHAR2(100);
      v_ln number := 0;
      v_ib number := 0;
      v_cz number := 0;
      v_incr number := 0;
    begin
      for i in ( select sequence_owner  so
                      , sequence_name   sn
                      , last_number     ln
                      , increment_by    ib
                      , cache_size      cz 
                 from dba_sequences@<DBLINK_FROM_SOURCE2TARGET>
                 where sequence_owner in ('<SCHEMA01>','<SCHEMA02>','<SCHEMA03>','<SCHEMA04>') )
      loop
          select last_number
               , increment_by 
               , cache_size
            into v_ln
               , v_ib
               , v_cz 
          from dba_sequences
          where sequence_owner = i.so
            and sequence_name = i.sn;
    
    -- set the difference in last_numbers as increment if target.last_number < source.last_number
          if v_ln < i.ln then
            v_incr := i.ln - v_ln;
    -- set the cache as increment if last_numbers match  
          elsif v_ln = i.ln then
            v_incr := v_ib * v_cz;
          end if;
          
          if v_ln <= i.ln then    
            execute immediate 'alter sequence '||i.so||'.'||i.sn||' increment by '||v_incr;
            execute immediate 'select '||i.so||'.'||i.sn||'.nextval from dual' into v_dummy;
            execute immediate 'alter sequence '||i.so||'.'||i.sn||' increment by '||v_ib;
            v_ret := v_ret +1;
          end if;
      end loop;
      dbms_output.put_line('Nr. sequences adjusted: '||v_ret);
    end;
    /
    


    Smooth, easy, lightweight – Node.js and Express style REST API with Java SE

    Thu, 2017-05-04 05:28

    It is easy to be seduced by some of the attractive qualities of Node (aka Node.js) – the JavaScript technology that makes server-side development fun again. Developing lightweight applications that handle HTTP requests in a rapid, straightforward way with little overhead and no bloated infrastructure is easy as pie – and feels a long way from traditional Java development. I like Node. I feel the attraction. I have used Node for simple and more complex applications. It's cool.

    I have realized that what is so nice about Node is also largely available with Java. Of course, there are many ways of doing Java development that are not lightweight, rapid or low-overhead at all – just as I am sure we can find ways to spoil Node development. More importantly, there are ways to make Java development comparably breezy as Node development. In this article I take a brief look at the development of a REST API using nothing but the [Oracle] Java Runtime and Maven as the package manager (Java's equivalent to Node's npm). Using the Java 8 JDK and Maven I am able to program and run a REST API from my command line, running locally on my laptop, using under two dozen lines of code – in a way that is very similar to what I would do with Node and the Express library. The steps described below can be executed in less than 15 minutes – similar to what Node-based development of this type of REST API foundation would require.

    The source code accompanying this article is in GitHub: https://github.com/lucasjellema/java-express-style-rest-api – but it is not a lot of code at all.


    The final result of this article is simple: a REST API running locally that handles simple GET and POST requests. The logic of the API has to be implemented (and some JSON processing may have to be added, which granted is in Java more complex than in Node) – but that is fairly evident to do.

    [Screenshot: the REST API invoked from Postman]

    [Screenshot: the command line of the running REST API]

    The application is started with a single command line (compare to npm start) and listens on port 8765 on localhost to process incoming requests.

    The steps for implementing this REST API and running it locally are described below.

    Implementation of REST API

    Again, the only two prerequisites for these steps are: a locally installed Oracle JDK 8 (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and Maven 3 environment (https://maven.apache.org/download.cgi)

    1. Create scaffold for new application using Maven (compare npm init)

    mvn -B archetype:generate -DarchetypeGroupId=org.apache.maven.archetypes -DgroupId=nl.amis.rest -DartifactId=my-rest


    2. Edit Maven's pom.xml to add dependencies for Jersey and Jersey Container (compare package.json and npm install --save)


    Note: also add build section in pom.xml with explicit indication of Java 1.8 as source and target version (to ensure Lambda expressions are supported)

     

    3. Retrieve required libraries (jar files) using Maven (compare npm install)

    mvn install dependency:copy-dependencies

    This will install all required JARs into the directory target\dependency – compare to node_modules in a Node application.


    4. Edit Java class App to create the most simple and straightforward http request serving application conceivable – use imports for required dependencies (compare require instructions in node application)

    package nl.amis.rest;
    
    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    
    import com.sun.net.httpserver.HttpExchange;
    import com.sun.net.httpserver.HttpServer;
    
    public class App {
        private final static int port = 8765;
    
        public static void main(String[] args) throws IOException {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/app", (HttpExchange t) -> {
                byte[] response = "Hello World from HttpServer".getBytes();
                t.sendResponseHeaders(200, response.length);
                OutputStream os = t.getResponseBody();
                os.write(response);
                os.close();
            });
            server.setExecutor(null); // creates a default executor
            server.start();
            System.out.println("HTTP Server is running and listening at " + server.getAddress() + "/app");
        }
    }
    

     

    and invoke it:

    [Screenshot: the response from http://localhost:8765/app]

    5. Class App2.java builds on App to add the REST API capabilities – using class Api as the REST Resource (@Path("api")) with the Resource Methods to handle GET and POST requests.

    package nl.amis.rest;
    
    import java.io.IOException;
    
    import com.sun.net.httpserver.HttpServer;
    
    import java.net.URI;
    
    import javax.ws.rs.core.UriBuilder;
    
    import org.glassfish.jersey.jdkhttp.JdkHttpServerFactory;
    import org.glassfish.jersey.server.ResourceConfig;
    
    public class App2 {
        private final static int port = 8765;
        private final static String host = "http://localhost/app";
    
        public static void main(String[] args) throws IOException {
            URI baseUri = UriBuilder.fromUri(host).port(port).build();
            ResourceConfig config = new ResourceConfig(Api.class);
            HttpServer server = JdkHttpServerFactory.createHttpServer(baseUri, config);
            System.out.println("HTTP Server is running and listening at "+baseUri+"/api" );
        }
    }
    

    and

    package nl.amis.rest;
    
    import javax.ws.rs.Consumes;
    import javax.ws.rs.GET;
    import javax.ws.rs.POST;
    import javax.ws.rs.Path;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.Context;
    import javax.ws.rs.core.Request;
    
    @Path("api")
    public class Api {
    
        @POST
        @Consumes("application/json")
        @Produces("text/plain")
        public String postApiMessage(@Context Request request, String json) {
            System.out.println("received event:" + json);
            return "post message received " + json;
        }
    
        @GET
        @Produces("text/plain")
        public String getApiMessage(@Context Request request) {
            return "nothing to report from getApiMessage.";
        }
    
    }
    

    6. Build the application using Maven (this step does not really exist for Node applications; programming errors come out at run time)

    mvn package

    This creates a JAR file – my-rest-1.0-SNAPSHOT.jar, 6 KB – that can be shipped, cloud deployed or simply executed (as in the next section)

     

    7. Run application which starts the REST API at http://localhost:8765

    java -cp target/my-rest-1.0-SNAPSHOT.jar;target/dependency/* nl.amis.rest.App

    or

    java -cp target/my-rest-1.0-SNAPSHOT.jar;target/dependency/* nl.amis.rest.App2
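
    To verify the running API without Postman, a simple Java test client will do. The sketch below is illustrative only – the class name and the JSON payload are made up – and calls the GET and POST resource methods of the Api class at http://localhost:8765/app/api.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ApiTestClient {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8765/app/api");

            // GET - prints the message returned by getApiMessage
            HttpURLConnection get = (HttpURLConnection) url.openConnection();
            System.out.println("GET : " + readBody(get));

            // POST - sends a JSON document, handled by postApiMessage
            HttpURLConnection post = (HttpURLConnection) url.openConnection();
            post.setRequestMethod("POST");
            post.setDoOutput(true);
            post.setRequestProperty("Content-Type", "application/json");
            try (OutputStream os = post.getOutputStream()) {
                os.write("{\"greeting\":\"hello\"}".getBytes("UTF-8"));
            }
            System.out.println("POST: " + readBody(post));
        }

        private static String readBody(HttpURLConnection connection) throws Exception {
            StringBuilder body = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line);
                }
            }
            return body.toString();
        }
    }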

     

    Resources

    Get URL parameters using JDK HTTP server  http://www.rgagnon.com/javadetails/java-get-url-parameters-using-jdk-http-server.html

    Example of reading headers and of downloading (PDF) file through HTTP Server: http://www.rgagnon.com/javadetails/java-have-a-simple-http-server.html


    Net usable storage when using Oracle Database Appliance

    Sat, 2017-04-29 08:19

    Since the birth of the ODA X6-2 HA, the net usable storage has become more of a challenge when advising a customer, as the X5-2 HA had substantially more – but slower – storage. This very short blog post is just a quick summary / checklist of the net usable storage, depending on the ODA model and the mirroring you choose.

    By the way, for other ODA-comparisons than storage I wrote another blogpost.

    The following overviews are extracts from Oracle PDFs.

    [Table: usable storage of the single-node ODAs]

     

    [Table: usable storage of the two-node ODAs, including the ‘old’ X5-2 HA for comparison]

     

    [Table: ODA X6-2 HA Normal Redundancy – 2 x Mirroring]

     

    [Table: ODA X6-2 HA High Redundancy – 3 x Mirroring]

     

    Sources:

    ODA, comparing the line: https://technology.amis.nl/2017/01/08/oracle-database-appliance-x6-2l-x6-2-ha-comparing-line/

    Oracle PDF’s :

    – 201701120 EN Oracle Database Appliance X6-2 Portfolio Summary.pdf

    – ODA X6-2HA Net Storage.pdf


    R: Utilizing multiple CPUs

    Sat, 2017-04-22 07:02

    R is a great piece of software to perform statistical analyses. Computing power can however be a limitation. R by default uses only a single CPU. In almost every machine, multiple CPUs are present, so why not utilize them? In this blog post I’ll give a minimal example and some code snippets to help make more complex examples work.

    Utilizing multiple CPUs

    Luckily using multiple CPUs in R is relatively simple. There is a deprecated library multicore available which you shouldn’t use. A newer library parallel is recommended. This library provides mclapply. This function only works on Linux systems so we’re not going to use that one. The below examples work on Windows and Linux and do not use deprecated libraries.

    A very simple example
    library(parallel)
    
    no_cores <- detectCores() - 1
    cl <- makeCluster(no_cores)
    arr <- c("business","done","differently")
    
    #Work on the future together
    result <- parLapply(cl, arr, function(x) toupper(x))
    
    #Conclusion: BUSINESS DONE DIFFERENTLY
    paste (c('Conclusion:',result),collapse = ' ')
    
    stopCluster(cl)
    

    This is a minimal example of how you can use clustering in R. What this code does is spawn multiple processes and process the entries from the array c("business","done","differently") in those separate processes. Processing in this case is just putting them in uppercase. After it is done, the results from the different processes are combined into Conclusion: BUSINESS DONE DIFFERENTLY.

    If you remove the stopCluster command, you can see there are multiple processes open on my Windows machine.

    After having called the stopCluster command, the number of processes is much reduced.

    You can imagine that for such a simple operation as putting things in uppercase, you might as well use the regular apply function, which saves you the overhead of spawning processes. If however you have more complex operations, like the example below, you will benefit greatly from being able to utilize more computing power!

    A more elaborate example

    You can download the code of this example from: https://github.com/MaartenSmeets/R/blob/master/htmlcrawling.R

    The sample however does not work anymore, since it parses Yahoo pages which have recently been changed. It does illustrate, however, how to do parallel processing.

    Because there are separate R processes running, you need to make libraries and functions available to these processes. For example, you can make libraries available like:

    #make libraries available in other nodes
    clusterEvalQ(cl, {
      library(XML)
      library(RCurl)
      library(parallel)
      }
    )
    

    And you can make functions available like

    clusterExport(cl, "htmlParseFunc")
    
    Considerations

    There are several considerations (and probably more than mentioned below) when using this way of clustering:

    • Work packages are separated equally over CPUs. If however the work packages differ greatly in the amount of work, you can encounter situations where parLapply is waiting for a process to complete while the other processes are already done. You should try and use work packages mostly of equal size to avoid this.
    • If a process runs too long, it will timeout. You can set the timeout when creating the cluster like: cl <- makeCluster(no_cores, timeout=50)
    • Every process takes memory. If you process large variables in parallel, you might encounter memory limitations.
    • Debugging the different processes can be difficult. I will not go into detail here.
    • GPUs can also be utilized to do calculations. See for example: https://www.r-bloggers.com/r-gpu-programming-for-all-with-gpur/. I have not tried this but the performance graphs online indicate a much better performance can be achieved than when using CPUs.


    Better track the Usage of Database Options and Management Packs, or it will cost you

    Fri, 2017-04-21 06:34

    So here it is
    Oracle announces a license audit, some urgency kicks in and this familiar but also really serious question comes down from management: “Are we using any unlicensed database features?”. The seriousness is quite understandable, because if so, the company can look forward to some negotiations with Oracle over license fees, possibly resulting in considerable and unforeseen extra costs.

    Tracking… why
    To be able to provide a swift and correct answer to this question, I track the usage of database options and management packs. As you might expect, tracking also enables detection of any deliberate or accidental unlicensed feature usage, so I can stop it sooner than later. And stopping it sooner is better because usage during months or years isn’t as easily excused by Oracle as usage during a day or week.

    Tracking… how
    Tracking is done by way of 2 views, both derived from “options_packs_usage_statistics.sql“, provided by Oracle Support –> MOS Note 1317265. Recently this script has been updated to handle version 12.2, so I had to update my views too. The Oracle script can be used on database version 11gR2 and higher, and on 12c container as well as non-container 12c databases. My views can also be used on 11gR2 databases and higher ( EE, SE and SE2 ), but assume a non-container database.

    Bugs
    Some bugs (Doc ID 1309070.1) are associated with DBA_FEATURE_USAGE_STATISTICS, the main data source for “options_packs_usage_statistics.sql“. At this time they mention false positives over the use of compression or encryption with Secure Files and RMAN, and with the reporting of Oracle Spatial usage where only Oracle Locator is used.

    Disclaimer
    The following code provides usage statistics for Database Options, Management Packs and their corresponding features.
    This information is to be used for informational purposes only and does not represent any license entitlement or requirement.

    SET DEFINE OFF;
    CREATE OR REPLACE FORCE VIEW FEATURE_USAGE
    AS
    select product
         , decode(usage, 'NO_USAGE','NO', usage ) "Used"
         , last_sample_date
         , first_usage_date
         , last_usage_date
    ------- following sql is based on options_packs_usage_statistics.sql  --> MOS Note 1317265.1
    from (
    with
    MAP as (
    -- mapping between features tracked by DBA_FUS and their corresponding database products (options or packs)
    select '' PRODUCT, '' feature, '' MVERSION, '' CONDITION from dual union all
    SELECT 'Active Data Guard'                                   , 'Active Data Guard - Real-Time Query on Physical Standby' , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Active Data Guard'                                   , 'Global Data Services'                                    , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Analytics'                                  , 'Data Mining'                                             , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'ADVANCED Index Compression'                              , '^12\.'                      , 'BUG'     from dual union all
    SELECT 'Advanced Compression'                                , 'Advanced Index Compression'                              , '^12\.'                      , 'BUG'     from dual union all
    SELECT 'Advanced Compression'                                , 'Backup HIGH Compression'                                 , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Backup LOW Compression'                                  , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Backup MEDIUM Compression'                               , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Backup ZLIB Compression'                                 , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Data Guard'                                              , '^11\.2|^12\.'               , 'C001'    from dual union all
    SELECT 'Advanced Compression'                                , 'Flashback Data Archive'                                  , '^11\.2\.0\.[1-3]\.'         , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Flashback Data Archive'                                  , '^(11\.2\.0\.[4-9]\.|12\.)'  , 'INVALID' from dual union all -- licensing required by Optimization for Flashback Data Archive
    SELECT 'Advanced Compression'                                , 'HeapCompression'                                         , '^11\.2|^12\.1'              , 'BUG'     from dual union all
    SELECT 'Advanced Compression'                                , 'HeapCompression'                                         , '^12\.[2-9]'                 , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Heat Map'                                                , '^12\.1'                     , 'BUG'     from dual union all
    SELECT 'Advanced Compression'                                , 'Heat Map'                                                , '^12\.[2-9]'                 , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Hybrid Columnar Compression Row Level Locking'           , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Information Lifecycle Management'                        , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Oracle Advanced Network Compression Service'             , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Oracle Utility Datapump (Export)'                        , '^11\.2|^12\.'               , 'C001'    from dual union all
    SELECT 'Advanced Compression'                                , 'Oracle Utility Datapump (Import)'                        , '^11\.2|^12\.'               , 'C001'    from dual union all
    SELECT 'Advanced Compression'                                , 'SecureFile Compression (user)'                           , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'SecureFile Deduplication (user)'                         , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'ASO native encryption and checksumming'                  , '^11\.2|^12\.'               , 'INVALID' from dual union all -- no longer part of Advanced Security
    SELECT 'Advanced Security'                                   , 'Backup Encryption'                                       , '^11\.2'                     , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'Backup Encryption'                                       , '^12\.'                      , 'INVALID' from dual union all -- licensing required only by encryption to disk
    SELECT 'Advanced Security'                                   , 'Data Redaction'                                          , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'Encrypted Tablespaces'                                   , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'Oracle Utility Datapump (Export)'                        , '^11\.2|^12\.'               , 'C002'    from dual union all
    SELECT 'Advanced Security'                                   , 'Oracle Utility Datapump (Import)'                        , '^11\.2|^12\.'               , 'C002'    from dual union all
    SELECT 'Advanced Security'                                   , 'SecureFile Encryption (user)'                            , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'Transparent Data Encryption'                             , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Change Management Pack'                              , 'Change Management Pack'                                  , '^11\.2'                     , ' '       from dual union all
    SELECT 'Configuration Management Pack for Oracle Database'   , 'EM Config Management Pack'                               , '^11\.2'                     , ' '       from dual union all
    SELECT 'Data Masking Pack'                                   , 'Data Masking Pack'                                       , '^11\.2'                     , ' '       from dual union all
    SELECT '.Database Gateway'                                   , 'Gateways'                                                , '^12\.'                      , ' '       from dual union all
    SELECT '.Database Gateway'                                   , 'Transparent Gateway'                                     , '^12\.'                      , ' '       from dual union all
    SELECT 'Database In-Memory'                                  , 'In-Memory Aggregation'                                   , '^12\.'                      , ' '       from dual union all
    SELECT 'Database In-Memory'                                  , 'In-Memory Column Store'                                  , '^12\.1\.0\.2\.0'            , 'BUG'     from dual union all
    SELECT 'Database In-Memory'                                  , 'In-Memory Column Store'                                  , '^12\.1\.0\.2\.[^0]|^12\.2'  , ' '       from dual union all
    SELECT 'Database Vault'                                      , 'Oracle Database Vault'                                   , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Database Vault'                                      , 'Privilege Capture'                                       , '^12\.'                      , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'ADDM'                                                    , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'AWR Baseline'                                            , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'AWR Baseline Template'                                   , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'AWR Report'                                              , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'Automatic Workload Repository'                           , '^12\.'                      , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'Baseline Adaptive Thresholds'                            , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'Baseline Static Computations'                            , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'Diagnostic Pack'                                         , '^11\.2'                     , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'EM Performance Page'                                     , '^12\.'                      , ' '       from dual union all
    SELECT '.Exadata'                                            , 'Exadata'                                                 , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT '.GoldenGate'                                         , 'GoldenGate'                                              , '^12\.'                      , ' '       from dual union all
    SELECT '.HW'                                                 , 'Hybrid Columnar Compression'                             , '^12\.1'                     , 'BUG'     from dual union all
    SELECT '.HW'                                                 , 'Hybrid Columnar Compression'                             , '^12\.[2-9]'                 , ' '       from dual union all
    SELECT '.HW'                                                 , 'Hybrid Columnar Compression Row Level Locking'           , '^12\.'                      , ' '       from dual union all
    SELECT '.HW'                                                 , 'Sun ZFS with EHCC'                                       , '^12\.'                      , ' '       from dual union all
    SELECT '.HW'                                                 , 'ZFS Storage'                                             , '^12\.'                      , ' '       from dual union all
    SELECT '.HW'                                                 , 'Zone maps'                                               , '^12\.'                      , ' '       from dual union all
    SELECT 'Label Security'                                      , 'Label Security'                                          , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Multitenant'                                         , 'Oracle Multitenant'                                      , '^12\.'                      , 'C003'    from dual union all -- licensing required only when more than one PDB containers are created
    SELECT 'Multitenant'                                         , 'Oracle Pluggable Databases'                              , '^12\.'                      , 'C003'    from dual union all -- licensing required only when more than one PDB containers are created
    SELECT 'OLAP'                                                , 'OLAP - Analytic Workspaces'                              , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'OLAP'                                                , 'OLAP - Cubes'                                            , '^12\.'                      , ' '       from dual union all
    SELECT 'Partitioning'                                        , 'Partitioning (user)'                                     , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Partitioning'                                        , 'Zone maps'                                               , '^12\.'                      , ' '       from dual union all
    SELECT '.Pillar Storage'                                     , 'Pillar Storage'                                          , '^12\.'                      , ' '       from dual union all
    SELECT '.Pillar Storage'                                     , 'Pillar Storage with EHCC'                                , '^12\.'                      , ' '       from dual union all
    SELECT '.Provisioning and Patch Automation Pack'             , 'EM Standalone Provisioning and Patch Automation Pack'    , '^11\.2'                     , ' '       from dual union all
    SELECT 'Provisioning and Patch Automation Pack for Database' , 'EM Database Provisioning and Patch Automation Pack'      , '^11\.2'                     , ' '       from dual union all
    SELECT 'RAC or RAC One Node'                                 , 'Quality of Service Management'                           , '^12\.'                      , ' '       from dual union all
    SELECT 'Real Application Clusters'                           , 'Real Application Clusters (RAC)'                         , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Real Application Clusters One Node'                  , 'Real Application Cluster One Node'                       , '^12\.'                      , ' '       from dual union all
    SELECT 'Real Application Testing'                            , 'Database Replay: Workload Capture'                       , '^11\.2|^12\.'               , 'C004'    from dual union all
    SELECT 'Real Application Testing'                            , 'Database Replay: Workload Replay'                        , '^11\.2|^12\.'               , 'C004'    from dual union all
    SELECT 'Real Application Testing'                            , 'SQL Performance Analyzer'                                , '^11\.2|^12\.'               , 'C004'    from dual union all
    SELECT '.Secure Backup'                                      , 'Oracle Secure Backup'                                    , '^12\.'                      , 'INVALID' from dual union all  -- does not differentiate usage of Oracle Secure Backup Express, which is free
    SELECT 'Spatial and Graph'                                   , 'Spatial'                                                 , '^11\.2'                     , 'INVALID' from dual union all  -- does not differentiate usage of Locator, which is free
    SELECT 'Spatial and Graph'                                   , 'Spatial'                                                 , '^12\.'                      , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'Automatic Maintenance - SQL Tuning Advisor'              , '^12\.'                      , 'INVALID' from dual union all  -- system usage in the maintenance window
    SELECT 'Tuning Pack'                                         , 'Automatic SQL Tuning Advisor'                            , '^11\.2|^12\.'               , 'INVALID' from dual union all  -- system usage in the maintenance window
    SELECT 'Tuning Pack'                                         , 'Real-Time SQL Monitoring'                                , '^11\.2'                     , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'Real-Time SQL Monitoring'                                , '^12\.'                      , 'INVALID' from dual union all  -- default
    SELECT 'Tuning Pack'                                         , 'SQL Access Advisor'                                      , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'SQL Monitoring and Tuning pages'                         , '^12\.'                      , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'SQL Profile'                                             , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'SQL Tuning Advisor'                                      , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'SQL Tuning Set (user)'                                   , '^12\.'                      , 'INVALID' from dual union all -- no longer part of Tuning Pack
    SELECT 'Tuning Pack'                                         , 'Tuning Pack'                                             , '^11\.2'                     , ' '       from dual union all
    SELECT '.WebLogic Server Management Pack Enterprise Edition' , 'EM AS Provisioning and Patch Automation Pack'            , '^11\.2'                     , ' '       from dual union all
    select '' PRODUCT, '' FEATURE, '' MVERSION, '' CONDITION from dual
    ),
    FUS as (
    -- the current data set to be used: DBA_FEATURE_USAGE_STATISTICS or CDB_FEATURE_USAGE_STATISTICS for Container Databases(CDBs)
    select
        0 as CON_ID,
        NULL as CON_NAME,
        -- Detect and mark with Y the current DBA_FUS data set = Most Recent Sample based on LAST_SAMPLE_DATE
          case when DBID || '#' || VERSION || '#' || to_char(LAST_SAMPLE_DATE, 'YYYYMMDDHH24MISS') =
                    first_value (DBID    )         over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc) || '#' ||
                    first_value (VERSION )         over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc) || '#' ||
                    first_value (to_char(LAST_SAMPLE_DATE, 'YYYYMMDDHH24MISS'))
                                                   over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc)
               then 'Y'
               else 'N'
        end as CURRENT_ENTRY,
        NAME            ,
        LAST_SAMPLE_DATE,
        DBID            ,
        VERSION         ,
        DETECTED_USAGES ,
        TOTAL_SAMPLES   ,
        CURRENTLY_USED  ,
        FIRST_USAGE_DATE,
        LAST_USAGE_DATE ,
        AUX_COUNT       ,
        FEATURE_INFO
    from DBA_FEATURE_USAGE_STATISTICS xy
    ),
    PFUS as (
    -- Product-Feature Usage Statistics = DBA_FUS entries mapped to their corresponding database products
    select
        CON_ID,
        CON_NAME,
        PRODUCT,
        NAME as FEATURE_BEING_USED,
        case  when CONDITION = 'BUG'
                   --suppressed due to exceptions/defects
                   then '3.SUPPRESSED_DUE_TO_BUG'
              when     detected_usages > 0                 -- some usage detection - current or past
                   and CURRENTLY_USED = 'TRUE'             -- usage at LAST_SAMPLE_DATE
                   and CURRENT_ENTRY  = 'Y'                -- current record set
                   and (    trim(CONDITION) is null        -- no extra conditions
                         or CONDITION_MET     = 'TRUE'     -- extra condition is met
                        and CONDITION_COUNTER = 'FALSE' )  -- extra condition is not based on counter
                   then '6.CURRENT_USAGE'
              when     detected_usages > 0                 -- some usage detection - current or past
                   and CURRENTLY_USED = 'TRUE'             -- usage at LAST_SAMPLE_DATE
                   and CURRENT_ENTRY  = 'Y'                -- current record set
                   and (    CONDITION_MET     = 'TRUE'     -- extra condition is met
                        and CONDITION_COUNTER = 'TRUE'  )  -- extra condition is     based on counter
                   then '5.PAST_OR_CURRENT_USAGE'          -- FEATURE_INFO counters indicate current or past usage
              when     detected_usages > 0                 -- some usage detection - current or past
                   and (    trim(CONDITION) is null        -- no extra conditions
                         or CONDITION_MET     = 'TRUE'  )  -- extra condition is met
                   then '4.PAST_USAGE'
              when CURRENT_ENTRY = 'Y'
                   then '2.NO_CURRENT_USAGE'   -- detectable feature shows no current usage
              else '1.NO_PAST_USAGE'
        end as USAGE,
        LAST_SAMPLE_DATE,
        DBID            ,
        VERSION         ,
        DETECTED_USAGES ,
        TOTAL_SAMPLES   ,
        CURRENTLY_USED  ,
        case  when CONDITION like 'C___' and CONDITION_MET = 'FALSE'
                   then to_date('')
              else FIRST_USAGE_DATE
        end as FIRST_USAGE_DATE,
        case  when CONDITION like 'C___' and CONDITION_MET = 'FALSE'
                   then to_date('')
              else LAST_USAGE_DATE
        end as LAST_USAGE_DATE,
        EXTRA_FEATURE_INFO
    from (
    select m.PRODUCT, m.CONDITION, m.MVERSION,
           -- if extra conditions (coded on the MAP.CONDITION column) are required, check if entries satisfy the condition
           case
                 when CONDITION = 'C001' and (   regexp_like(to_char(FEATURE_INFO), 'compression used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                                              or regexp_like(to_char(FEATURE_INFO), 'compression used: *TRUE', 'i')                 )
                      then 'TRUE'  -- compression has been used
                 when CONDITION = 'C002' and (   regexp_like(to_char(FEATURE_INFO), 'encryption used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                                              or regexp_like(to_char(FEATURE_INFO), 'encryption used: *TRUE', 'i')                  )
                      then 'TRUE'  -- encryption has been used
                 when CONDITION = 'C003' and CON_ID=1 and AUX_COUNT > 1
                      then 'TRUE'  -- more than one PDB are created
                 when CONDITION = 'C004' and 'N'= 'N'
                      then 'TRUE'  -- not in oracle cloud
                 else 'FALSE'
           end as CONDITION_MET,
           -- check if the extra conditions are based on FEATURE_INFO counters. They indicate current or past usage.
           case
                 when CONDITION = 'C001' and     regexp_like(to_char(FEATURE_INFO), 'compression used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                      then 'TRUE'  -- compression counter > 0
                 when CONDITION = 'C002' and     regexp_like(to_char(FEATURE_INFO), 'encryption used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                      then 'TRUE'  -- encryption counter > 0
                 else 'FALSE'
           end as CONDITION_COUNTER,
           case when CONDITION = 'C001'
                     then   regexp_substr(to_char(FEATURE_INFO), 'compression used:(.*?)(times|TRUE|FALSE)', 1, 1, 'i')
                when CONDITION = 'C002'
                     then   regexp_substr(to_char(FEATURE_INFO), 'encryption used:(.*?)(times|TRUE|FALSE)', 1, 1, 'i')
                when CONDITION = 'C003'
                     then   'AUX_COUNT=' || AUX_COUNT
                when CONDITION = 'C004' and 'N'= 'Y'
                     then   'feature included in Oracle Cloud Services Package'
                else ''
           end as EXTRA_FEATURE_INFO,
           f.CON_ID          ,
           f.CON_NAME        ,
           f.CURRENT_ENTRY   ,
           f.NAME            ,
           f.LAST_SAMPLE_DATE,
           f.DBID            ,
           f.VERSION         ,
           f.DETECTED_USAGES ,
           f.TOTAL_SAMPLES   ,
           f.CURRENTLY_USED  ,
           f.FIRST_USAGE_DATE,
           f.LAST_USAGE_DATE ,
           f.AUX_COUNT       ,
           f.FEATURE_INFO
      from MAP m
      join FUS f on m.FEATURE = f.NAME and regexp_like(f.VERSION, m.MVERSION)
      where nvl(f.TOTAL_SAMPLES, 0) > 0                        -- ignore features that have never been sampled
    )
      where nvl(CONDITION, '-') != 'INVALID'                   -- ignore features for which licensing is not required without further conditions
        and not (CONDITION = 'C003' and CON_ID not in (0, 1))  -- multiple PDBs are visible only in CDB$ROOT; PDB level view is not relevant
    )
    select
        grouping_id(CON_ID) as gid,
        CON_ID   ,
        decode(grouping_id(CON_ID), 1, '--ALL--', max(CON_NAME)) as CON_NAME,
        PRODUCT  ,
        decode(max(USAGE),
              '1.NO_PAST_USAGE'        , 'NO_USAGE'             ,
              '2.NO_CURRENT_USAGE'     , 'NO_USAGE'             ,
              '3.SUPPRESSED_DUE_TO_BUG', 'SUPPRESSED_DUE_TO_BUG',
              '4.PAST_USAGE'           , 'PAST_USAGE'           ,
              '5.PAST_OR_CURRENT_USAGE', 'PAST_OR_CURRENT_USAGE',
              '6.CURRENT_USAGE'        , 'CURRENT_USAGE'        ,
              'UNKNOWN') as USAGE,
        max(LAST_SAMPLE_DATE) as LAST_SAMPLE_DATE,
        min(FIRST_USAGE_DATE) as FIRST_USAGE_DATE,
        max(LAST_USAGE_DATE)  as LAST_USAGE_DATE
      from PFUS
      where USAGE in ('2.NO_CURRENT_USAGE', '4.PAST_USAGE', '5.PAST_OR_CURRENT_USAGE', '6.CURRENT_USAGE')   -- ignore '1.NO_PAST_USAGE', '3.SUPPRESSED_DUE_TO_BUG'
      group by rollup(CON_ID), PRODUCT
      having not (max(CON_ID) in (-1, 0) and grouping_id(CON_ID) = 1)            -- aggregation not needed for non-container databases
    order by GID desc, CON_ID, decode(substr(PRODUCT, 1, 1), '.', 2, 1), PRODUCT );
    
    
    CREATE OR REPLACE FORCE VIEW FEATURE_USAGE_DETAILS
    AS
    select product
         , feature_being_used
         , usage
         , last_sample_date
         , dbid
         , ( select name from v$database ) dbname
         , version
         , detected_usages
         , total_samples
         , currently_used
         , first_usage_date
         , last_usage_date
         , extra_feature_info
    ------- following sql is based on options_packs_usage_statistics.sql  --> MOS Note 1317265.1
    from (
    with
    MAP as (
    -- mapping between features tracked by DBA_FUS and their corresponding database products (options or packs)
    select '' PRODUCT, '' feature, '' MVERSION, '' CONDITION from dual union all
    SELECT 'Active Data Guard'                                   , 'Active Data Guard - Real-Time Query on Physical Standby' , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Active Data Guard'                                   , 'Global Data Services'                                    , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Analytics'                                  , 'Data Mining'                                             , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'ADVANCED Index Compression'                              , '^12\.'                      , 'BUG'     from dual union all
    SELECT 'Advanced Compression'                                , 'Advanced Index Compression'                              , '^12\.'                      , 'BUG'     from dual union all
    SELECT 'Advanced Compression'                                , 'Backup HIGH Compression'                                 , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Backup LOW Compression'                                  , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Backup MEDIUM Compression'                               , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Backup ZLIB Compression'                                 , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Data Guard'                                              , '^11\.2|^12\.'               , 'C001'    from dual union all
    SELECT 'Advanced Compression'                                , 'Flashback Data Archive'                                  , '^11\.2\.0\.[1-3]\.'         , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Flashback Data Archive'                                  , '^(11\.2\.0\.[4-9]\.|12\.)'  , 'INVALID' from dual union all -- licensing required by Optimization for Flashback Data Archive
    SELECT 'Advanced Compression'                                , 'HeapCompression'                                         , '^11\.2|^12\.1'              , 'BUG'     from dual union all
    SELECT 'Advanced Compression'                                , 'HeapCompression'                                         , '^12\.[2-9]'                 , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Heat Map'                                                , '^12\.1'                     , 'BUG'     from dual union all
    SELECT 'Advanced Compression'                                , 'Heat Map'                                                , '^12\.[2-9]'                 , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Hybrid Columnar Compression Row Level Locking'           , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Information Lifecycle Management'                        , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Oracle Advanced Network Compression Service'             , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'Oracle Utility Datapump (Export)'                        , '^11\.2|^12\.'               , 'C001'    from dual union all
    SELECT 'Advanced Compression'                                , 'Oracle Utility Datapump (Import)'                        , '^11\.2|^12\.'               , 'C001'    from dual union all
    SELECT 'Advanced Compression'                                , 'SecureFile Compression (user)'                           , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Compression'                                , 'SecureFile Deduplication (user)'                         , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'ASO native encryption and checksumming'                  , '^11\.2|^12\.'               , 'INVALID' from dual union all -- no longer part of Advanced Security
    SELECT 'Advanced Security'                                   , 'Backup Encryption'                                       , '^11\.2'                     , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'Backup Encryption'                                       , '^12\.'                      , 'INVALID' from dual union all -- licensing required only by encryption to disk
    SELECT 'Advanced Security'                                   , 'Data Redaction'                                          , '^12\.'                      , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'Encrypted Tablespaces'                                   , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'Oracle Utility Datapump (Export)'                        , '^11\.2|^12\.'               , 'C002'    from dual union all
    SELECT 'Advanced Security'                                   , 'Oracle Utility Datapump (Import)'                        , '^11\.2|^12\.'               , 'C002'    from dual union all
    SELECT 'Advanced Security'                                   , 'SecureFile Encryption (user)'                            , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Advanced Security'                                   , 'Transparent Data Encryption'                             , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Change Management Pack'                              , 'Change Management Pack'                                  , '^11\.2'                     , ' '       from dual union all
    SELECT 'Configuration Management Pack for Oracle Database'   , 'EM Config Management Pack'                               , '^11\.2'                     , ' '       from dual union all
    SELECT 'Data Masking Pack'                                   , 'Data Masking Pack'                                       , '^11\.2'                     , ' '       from dual union all
    SELECT '.Database Gateway'                                   , 'Gateways'                                                , '^12\.'                      , ' '       from dual union all
    SELECT '.Database Gateway'                                   , 'Transparent Gateway'                                     , '^12\.'                      , ' '       from dual union all
    SELECT 'Database In-Memory'                                  , 'In-Memory Aggregation'                                   , '^12\.'                      , ' '       from dual union all
    SELECT 'Database In-Memory'                                  , 'In-Memory Column Store'                                  , '^12\.1\.0\.2\.0'            , 'BUG'     from dual union all
    SELECT 'Database In-Memory'                                  , 'In-Memory Column Store'                                  , '^12\.1\.0\.2\.[^0]|^12\.2'  , ' '       from dual union all
    SELECT 'Database Vault'                                      , 'Oracle Database Vault'                                   , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Database Vault'                                      , 'Privilege Capture'                                       , '^12\.'                      , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'ADDM'                                                    , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'AWR Baseline'                                            , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'AWR Baseline Template'                                   , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'AWR Report'                                              , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'Automatic Workload Repository'                           , '^12\.'                      , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'Baseline Adaptive Thresholds'                            , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'Baseline Static Computations'                            , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'Diagnostic Pack'                                         , '^11\.2'                     , ' '       from dual union all
    SELECT 'Diagnostics Pack'                                    , 'EM Performance Page'                                     , '^12\.'                      , ' '       from dual union all
    SELECT '.Exadata'                                            , 'Exadata'                                                 , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT '.GoldenGate'                                         , 'GoldenGate'                                              , '^12\.'                      , ' '       from dual union all
    SELECT '.HW'                                                 , 'Hybrid Columnar Compression'                             , '^12\.1'                     , 'BUG'     from dual union all
    SELECT '.HW'                                                 , 'Hybrid Columnar Compression'                             , '^12\.[2-9]'                 , ' '       from dual union all
    SELECT '.HW'                                                 , 'Hybrid Columnar Compression Row Level Locking'           , '^12\.'                      , ' '       from dual union all
    SELECT '.HW'                                                 , 'Sun ZFS with EHCC'                                       , '^12\.'                      , ' '       from dual union all
    SELECT '.HW'                                                 , 'ZFS Storage'                                             , '^12\.'                      , ' '       from dual union all
    SELECT '.HW'                                                 , 'Zone maps'                                               , '^12\.'                      , ' '       from dual union all
    SELECT 'Label Security'                                      , 'Label Security'                                          , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Multitenant'                                         , 'Oracle Multitenant'                                      , '^12\.'                      , 'C003'    from dual union all -- licensing required only when more than one PDB containers are created
    SELECT 'Multitenant'                                         , 'Oracle Pluggable Databases'                              , '^12\.'                      , 'C003'    from dual union all -- licensing required only when more than one PDB containers are created
    SELECT 'OLAP'                                                , 'OLAP - Analytic Workspaces'                              , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'OLAP'                                                , 'OLAP - Cubes'                                            , '^12\.'                      , ' '       from dual union all
    SELECT 'Partitioning'                                        , 'Partitioning (user)'                                     , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Partitioning'                                        , 'Zone maps'                                               , '^12\.'                      , ' '       from dual union all
    SELECT '.Pillar Storage'                                     , 'Pillar Storage'                                          , '^12\.'                      , ' '       from dual union all
    SELECT '.Pillar Storage'                                     , 'Pillar Storage with EHCC'                                , '^12\.'                      , ' '       from dual union all
    SELECT '.Provisioning and Patch Automation Pack'             , 'EM Standalone Provisioning and Patch Automation Pack'    , '^11\.2'                     , ' '       from dual union all
    SELECT 'Provisioning and Patch Automation Pack for Database' , 'EM Database Provisioning and Patch Automation Pack'      , '^11\.2'                     , ' '       from dual union all
    SELECT 'RAC or RAC One Node'                                 , 'Quality of Service Management'                           , '^12\.'                      , ' '       from dual union all
    SELECT 'Real Application Clusters'                           , 'Real Application Clusters (RAC)'                         , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Real Application Clusters One Node'                  , 'Real Application Cluster One Node'                       , '^12\.'                      , ' '       from dual union all
    SELECT 'Real Application Testing'                            , 'Database Replay: Workload Capture'                       , '^11\.2|^12\.'               , 'C004'    from dual union all
    SELECT 'Real Application Testing'                            , 'Database Replay: Workload Replay'                        , '^11\.2|^12\.'               , 'C004'    from dual union all
    SELECT 'Real Application Testing'                            , 'SQL Performance Analyzer'                                , '^11\.2|^12\.'               , 'C004'    from dual union all
    SELECT '.Secure Backup'                                      , 'Oracle Secure Backup'                                    , '^12\.'                      , 'INVALID' from dual union all  -- does not differentiate usage of Oracle Secure Backup Express, which is free
    SELECT 'Spatial and Graph'                                   , 'Spatial'                                                 , '^11\.2'                     , 'INVALID' from dual union all  -- does not differentiate usage of Locator, which is free
    SELECT 'Spatial and Graph'                                   , 'Spatial'                                                 , '^12\.'                      , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'Automatic Maintenance - SQL Tuning Advisor'              , '^12\.'                      , 'INVALID' from dual union all  -- system usage in the maintenance window
    SELECT 'Tuning Pack'                                         , 'Automatic SQL Tuning Advisor'                            , '^11\.2|^12\.'               , 'INVALID' from dual union all  -- system usage in the maintenance window
    SELECT 'Tuning Pack'                                         , 'Real-Time SQL Monitoring'                                , '^11\.2'                     , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'Real-Time SQL Monitoring'                                , '^12\.'                      , 'INVALID' from dual union all  -- default
    SELECT 'Tuning Pack'                                         , 'SQL Access Advisor'                                      , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'SQL Monitoring and Tuning pages'                         , '^12\.'                      , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'SQL Profile'                                             , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'SQL Tuning Advisor'                                      , '^11\.2|^12\.'               , ' '       from dual union all
    SELECT 'Tuning Pack'                                         , 'SQL Tuning Set (user)'                                   , '^12\.'                      , 'INVALID' from dual union all -- no longer part of Tuning Pack
    SELECT 'Tuning Pack'                                         , 'Tuning Pack'                                             , '^11\.2'                     , ' '       from dual union all
    SELECT '.WebLogic Server Management Pack Enterprise Edition' , 'EM AS Provisioning and Patch Automation Pack'            , '^11\.2'                     , ' '       from dual union all
    select '' PRODUCT, '' FEATURE, '' MVERSION, '' CONDITION from dual
    ),
    FUS as (
    -- the current data set to be used: DBA_FEATURE_USAGE_STATISTICS or CDB_FEATURE_USAGE_STATISTICS for Container Databases(CDBs)
    select
        0 as CON_ID,
        NULL as CON_NAME,
        -- Detect and mark with Y the current DBA_FUS data set = Most Recent Sample based on LAST_SAMPLE_DATE
          case when DBID || '#' || VERSION || '#' || to_char(LAST_SAMPLE_DATE, 'YYYYMMDDHH24MISS') =
                    first_value (DBID    )         over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc) || '#' ||
                    first_value (VERSION )         over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc) || '#' ||
                    first_value (to_char(LAST_SAMPLE_DATE, 'YYYYMMDDHH24MISS'))
                                                   over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc)
               then 'Y'
               else 'N'
        end as CURRENT_ENTRY,
        NAME            ,
        LAST_SAMPLE_DATE,
        DBID            ,
        VERSION         ,
        DETECTED_USAGES ,
        TOTAL_SAMPLES   ,
        CURRENTLY_USED  ,
        FIRST_USAGE_DATE,
        LAST_USAGE_DATE ,
        AUX_COUNT       ,
        FEATURE_INFO
    from DBA_FEATURE_USAGE_STATISTICS xy
    ),
    PFUS as (
    -- Product-Feature Usage Statistics = DBA_FUS entries mapped to their corresponding database products
    select
        CON_ID,
        CON_NAME,
        PRODUCT,
        NAME as FEATURE_BEING_USED,
        case  when CONDITION = 'BUG'
                   --suppressed due to exceptions/defects
                   then '3.SUPPRESSED_DUE_TO_BUG'
              when     detected_usages > 0                 -- some usage detection - current or past
                   and CURRENTLY_USED = 'TRUE'             -- usage at LAST_SAMPLE_DATE
                   and CURRENT_ENTRY  = 'Y'                -- current record set
                   and (    trim(CONDITION) is null        -- no extra conditions
                         or CONDITION_MET     = 'TRUE'     -- extra condition is met
                        and CONDITION_COUNTER = 'FALSE' )  -- extra condition is not based on counter
                   then '6.CURRENT_USAGE'
              when     detected_usages > 0                 -- some usage detection - current or past
                   and CURRENTLY_USED = 'TRUE'             -- usage at LAST_SAMPLE_DATE
                   and CURRENT_ENTRY  = 'Y'                -- current record set
                   and (    CONDITION_MET     = 'TRUE'     -- extra condition is met
                        and CONDITION_COUNTER = 'TRUE'  )  -- extra condition is     based on counter
                   then '5.PAST_OR_CURRENT_USAGE'          -- FEATURE_INFO counters indicate current or past usage
              when     detected_usages > 0                 -- some usage detection - current or past
                   and (    trim(CONDITION) is null        -- no extra conditions
                         or CONDITION_MET     = 'TRUE'  )  -- extra condition is met
                   then '4.PAST_USAGE'
              when CURRENT_ENTRY = 'Y'
                   then '2.NO_CURRENT_USAGE'   -- detectable feature shows no current usage
              else '1.NO_PAST_USAGE'
        end as USAGE,
        LAST_SAMPLE_DATE,
        DBID            ,
        VERSION         ,
        DETECTED_USAGES ,
        TOTAL_SAMPLES   ,
        CURRENTLY_USED  ,
        FIRST_USAGE_DATE,
        LAST_USAGE_DATE,
        EXTRA_FEATURE_INFO
    from (
    select m.PRODUCT, m.CONDITION, m.MVERSION,
           -- if extra conditions (coded on the MAP.CONDITION column) are required, check if entries satisfy the condition
           case
                 when CONDITION = 'C001' and (   regexp_like(to_char(FEATURE_INFO), 'compression used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                                              or regexp_like(to_char(FEATURE_INFO), 'compression used: *TRUE', 'i')                 )
                      then 'TRUE'  -- compression has been used
                 when CONDITION = 'C002' and (   regexp_like(to_char(FEATURE_INFO), 'encryption used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                                              or regexp_like(to_char(FEATURE_INFO), 'encryption used: *TRUE', 'i')                  )
                      then 'TRUE'  -- encryption has been used
                 when CONDITION = 'C003' and CON_ID=1 and AUX_COUNT > 1
                      then 'TRUE'  -- more than one PDB are created
                 when CONDITION = 'C004' and 'N'= 'N'
                      then 'TRUE'  -- not in oracle cloud
                 else 'FALSE'
           end as CONDITION_MET,
           -- check if the extra conditions are based on FEATURE_INFO counters. They indicate current or past usage.
           case
                 when CONDITION = 'C001' and     regexp_like(to_char(FEATURE_INFO), 'compression used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                      then 'TRUE'  -- compression counter > 0
                 when CONDITION = 'C002' and     regexp_like(to_char(FEATURE_INFO), 'encryption used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                      then 'TRUE'  -- encryption counter > 0
                 else 'FALSE'
           end as CONDITION_COUNTER,
           case when CONDITION = 'C001'
                     then   regexp_substr(to_char(FEATURE_INFO), 'compression used:(.*?)(times|TRUE|FALSE)', 1, 1, 'i')
                when CONDITION = 'C002'
                     then   regexp_substr(to_char(FEATURE_INFO), 'encryption used:(.*?)(times|TRUE|FALSE)', 1, 1, 'i')
                when CONDITION = 'C003'
                     then   'AUX_COUNT=' || AUX_COUNT
                when CONDITION = 'C004' and 'N'= 'Y'
                     then   'feature included in Oracle Cloud Services Package'
                else ''
           end as EXTRA_FEATURE_INFO,
           f.CON_ID          ,
           f.CON_NAME        ,
           f.CURRENT_ENTRY   ,
           f.NAME            ,
           f.LAST_SAMPLE_DATE,
           f.DBID            ,
           f.VERSION         ,
           f.DETECTED_USAGES ,
           f.TOTAL_SAMPLES   ,
           f.CURRENTLY_USED  ,
           f.FIRST_USAGE_DATE,
           f.LAST_USAGE_DATE ,
           f.AUX_COUNT       ,
           f.FEATURE_INFO
      from MAP m
      join FUS f on m.FEATURE = f.NAME and regexp_like(f.VERSION, m.MVERSION)
      where nvl(f.TOTAL_SAMPLES, 0) > 0                        -- ignore features that have never been sampled
    )
      where nvl(CONDITION, '-') != 'INVALID'                   -- ignore features for which licensing is not required without further conditions
        and not (CONDITION = 'C003' and CON_ID not in (0, 1))  -- multiple PDBs are visible only in CDB$ROOT; PDB level view is not relevant
    )
    select
        CON_ID            ,
        CON_NAME          ,
        PRODUCT           ,
        FEATURE_BEING_USED,
        decode(USAGE,
              '1.NO_PAST_USAGE'        , 'NO_PAST_USAGE'        ,
              '2.NO_CURRENT_USAGE'     , 'NO_CURRENT_USAGE'     ,
              '3.SUPPRESSED_DUE_TO_BUG', 'SUPPRESSED_DUE_TO_BUG',
              '4.PAST_USAGE'           , 'PAST_USAGE'           ,
              '5.PAST_OR_CURRENT_USAGE', 'PAST_OR_CURRENT_USAGE',
              '6.CURRENT_USAGE'        , 'CURRENT_USAGE'        ,
              'UNKNOWN') as USAGE,
        LAST_SAMPLE_DATE  ,
        DBID              ,
        VERSION           ,
        DETECTED_USAGES   ,
        TOTAL_SAMPLES     ,
        CURRENTLY_USED    ,
        FIRST_USAGE_DATE  ,
        LAST_USAGE_DATE   ,
        EXTRA_FEATURE_INFO
      from PFUS
      where USAGE in ('2.NO_CURRENT_USAGE', '3.SUPPRESSED_DUE_TO_BUG', '4.PAST_USAGE', '5.PAST_OR_CURRENT_USAGE', '6.CURRENT_USAGE')  -- ignore '1.NO_PAST_USAGE'
    order by CON_ID, decode(substr(PRODUCT, 1, 1), '.', 2, 1), PRODUCT, FEATURE_BEING_USED, LAST_SAMPLE_DATE desc, PFUS.USAGE );
    

    The post Better track the Usage of Database Options and Management Packs, or it will cost you appeared first on AMIS Oracle and Java Blog.

    Oracle Mobile Cloud Service (MCS): An introduction to API security: Basic Authentication and OAuth2

    Fri, 2017-04-07 07:41

    As an integration/backend developer starting a project that uses Mobile Cloud Service, it is important to have some understanding of what this MBaaS (Mobile Backend as a Service) has to offer in terms of security features; you need that understanding to configure and test MCS. In this blog I will give examples of how to configure and use the basic authentication and OAuth2 features which are provided to secure APIs. You can read the Oracle documentation (which is quite good for MCS!) on this topic here.

    Introduction

    Oracle Mobile Cloud Service offers platform APIs that provide specific features. You can create custom APIs by writing JavaScript code that runs on Node.js. Connectors are used to access backend systems. This blog focuses on authentication options for incoming requests.

    The connectors are not directly available from the outside. MCS can secure custom and platform APIs. This functionality is taken care of by the Mobile Backend and the custom API configuration.

    Getting started

    The first thing to do when you want to expose an API is assign the API to a Mobile Backend. You can do this in the Mobile Backend configuration screen, APIs tab.

    You can allow anonymous access, but generally you want to know who accesses your API, also because MCS has a license option in which you pay for a specific number of API calls; you want to know whom you are paying for. In order to require authentication on a per-user basis, you first have to create a user and assign it to a group. You can also do this from the Mobile Backend configuration: go to the Mobile Users Management tab to create users and groups.

    After you have done this, you can assign the role to the API. You can also do this on a per-endpoint basis, which makes this authentication scheme very flexible.

    We have now configured our API to allow access to users who are in a specific role, and we can call it using basic authentication or OAuth2.

    Basic Authentication

    In order to test our API, Postman is a suitable option. Postman is a freely available Chrome plugin (but also available standalone for several OSes) which provides many options for testing HTTP calls.

    Basic authentication is a rather weak authentication mechanism. You Base64-encode the string username:password and send it in an HTTP header to the API you are calling. If someone intercepts the message, he/she can easily Base64-decode the username:password string to obtain the credentials. You can thus understand why I’ve blanked out that part of the Authorization field in several screenshots.

    In addition to specifying the basic authentication header, you also need to specify the Oracle-Mobile-Backend-Id HTTP header, whose value can be obtained from the main page of the Mobile Backend configuration.

    Obtain Oracle-Mobile-Backend-Id

    Call your API with Basic authentication

    This mechanism is rather straightforward, but the Authorization header needs to be supplied with every request.
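
    To test the same call outside Postman, a minimal sketch using R and the httr package could look like this; the endpoint URL, the backend id and the credentials are placeholders that you would replace with the values of your own Mobile Backend:

    library(httr)

    res <- GET(
      "https://mymcs.example.com/mobile/custom/myapi/items",   # placeholder API endpoint
      authenticate("jdoe", "welcome1"),                        # adds the Basic Authorization header
      add_headers("Oracle-Mobile-Backend-Id" = "your-backend-id")
    )
    status_code(res)
    content(res)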

    OAuth2

    OAuth2 works a bit differently from basic authentication: first a token is obtained from a token service, and that token is used in subsequent requests. When using the token, no additional authentication is required.

    You can find the token endpoint on the Mobile Backend settings page, as shown above. When you do a request to this endpoint, you need to provide some information:

    You can use basic authentication with the Client ID:Client secret to access the token endpoint. These can be obtained from the screen shown below.

    You also need to supply the username and password of the user for whom the token is generated. After you have made the request to the token service, you receive a token.

This token can be used in subsequent requests to your API. You add it as a Bearer token in the Authorization HTTP header, so you authenticate without sending your username/password every time. This is thus more secure.
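As a rough sketch of this flow in Node.js, assuming a standard OAuth2 resource owner password grant as described above: the token endpoint host and path, client id/secret and user credentials below are placeholders, so check the Mobile Backend settings page for the exact endpoint and required parameters of your instance.

// Sketch: obtain an OAuth2 token and (in the callback) use it as a Bearer token.
// All endpoint and credential values are hypothetical.
const https = require('https');
const querystring = require('querystring');

const clientId = 'my-client-id';           // from the Mobile Backend settings (hypothetical)
const clientSecret = 'my-client-secret';   // idem
const form = querystring.stringify({
  grant_type: 'password',
  username: 'jdoe',                        // the user for whom the token is generated
  password: 'welcome1'
});

const req = https.request({
  hostname: 'mytokenservice.example.com',  // hypothetical token service host
  path: '/oauth2/tokens',                  // hypothetical token endpoint path
  method: 'POST',
  headers: {
    // basic authentication with clientId:clientSecret
    'Authorization': 'Basic ' + Buffer.from(clientId + ':' + clientSecret).toString('base64'),
    'Content-Type': 'application/x-www-form-urlencoded'
  }
}, (res) => {
  let data = '';
  res.on('data', (chunk) => data += chunk);
  res.on('end', () => {
    const token = JSON.parse(data).access_token;
    // subsequent API calls then use:
    // 'Authorization': 'Bearer ' + token  (plus the Oracle-Mobile-Backend-Id header)
    console.log(token);
  });
});
req.end(form);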

    Finally

    I’ve not talked about security options for outgoing requests provided by the supplied connectors.

These have connector-specific options and allow identity propagation. For example, the REST connector (described in the Oracle documentation here) supports SAML tokens, CSF keys, basic authentication, OAuth2 and JWT. The SOAP connector (see here) can use WS-Security in several flavours, SAML tokens, CSF keys, basic authentication, etc. (quite a list).

    The post Oracle Mobile Cloud Service (MCS): An introduction to API security: Basic Authentication and OAuth2 appeared first on AMIS Oracle and Java Blog.

    Machine learning: Getting started with random forests in R

    Fri, 2017-04-07 02:08

According to Gartner, machine learning is on top of the hype cycle, at the peak of inflated expectations. There is a lot of misunderstanding about what machine learning actually is and what can be done with it.

    Machine learning is not as abstract as one might think. If you want to get value out of known data and do predictions for unknown data, the most important challenge is asking the right questions and of course knowing what you are doing, especially if you want to optimize your prediction accuracy.

In this blog I’m exploring an example of machine learning: the random forest algorithm. I’ll provide an example of how you can use this algorithm to do predictions. To implement a random forest, I’m using R with the randomForest library and the iris data set which is provided by the R installation.

    The Random Forest

    A popular method of machine learning is by using decision tree learning. Decision tree learning comes closest to serving as an off-the-shelf procedure for data mining (see here). You do not need to know much about your data in order to be able to apply this method. The random forest algorithm is an example of a decision tree learning algorithm.

    Random forest in (very) short

How it works exactly takes some time to figure out. If you want to know the details, I recommend watching some YouTube recordings of lectures on the topic. Some of the most important features of this method:

• A random forest is a method to do classifications based on features. This implies you need to have features and classifications.
• A random forest generates a set of classification trees (an ensemble) based on splitting a subset of features at locations which maximize information gain. This method is thus very suitable for distributed parallel computation.
• Information gain can be determined by how accurately the splitting point separates the classifications. Data is split based on the value of a feature at a specific point and the classifications on the left and right of the splitting point are checked. If, for example, the splitting point splits all data of a first classification from all data of a second classification, the confidence is 100%: maximum information gain (see the sketch after this list).
• A splitting point is a branching in the decision tree.
• Splitting points are based on values of features (this is fast).
• A random forest uses randomness to determine the features to look at and randomness in the data used to construct the tree. Randomness helps reduce compute time.
• Each tree gets to see a different dataset. This is called bagging.
• Tree classification confidences are summed and averaged. Products of the confidences can also be taken. Individual trees have a high variance because they have only seen a small subset of data. Averaging helps create a better result.
• With correlated features, strong features can end up with low scores and the method can be biased towards variables with many categories.
• A random forest does not perform well with unbalanced datasets: samples where one class occurs much more often than the others.
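To make the information gain idea above a bit more tangible, here is a minimal R sketch (independent of the randomForest package) that scores one candidate splitting point on the iris data using Gini impurity; the chosen feature (Petal.Length) and split value (2.5) are just an illustration.

#impurity of a set of class labels: 1 - sum of squared class proportions
gini <- function(classes) {
  p <- table(classes) / length(classes)
  1 - sum(p^2)
}

#candidate splitting point: Petal.Length < 2.5
left  <- iris$Species[iris$Petal.Length < 2.5]
right <- iris$Species[iris$Petal.Length >= 2.5]

#weighted impurity after the split; the drop compared to the impurity before
#the split is the gain achieved by this splitting point
n <- nrow(iris)
impurity_after  <- (length(left) / n) * gini(left) + (length(right) / n) * gini(right)
impurity_before <- gini(iris$Species)
impurity_before - impurity_after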
    Use case for a random forest

Use cases for a random forest include, for example, text classification such as spam detection. Whether certain words are present in a text can be used as a feature and the classification would be spam/not spam, or even more specific classes such as news, personal, etc. Another interesting use case lies in genetics: determining if the expression of certain genes is relevant for a specific disease. This way you can take someone’s DNA and determine with a certain confidence whether that person will contract the disease. Of course you can also take other features into account such as income, education level, smoking, age, etc.

Why R

    I decided to start with R. Why? Mainly because it is easy. There are many libraries available and there is a lot of experience present worldwide; a lot of information can be found online. R however also has some drawbacks.

    Some benefits

    • It is free and easy to get started. Hard to master though.
    • A lot of libraries are available. R package management works well.
• R has a lot of users. There is a lot of information available online.
• R is powerful in that if you know what you are doing, you require little code to do it.

    Some challenges

    • R loads datasets in memory
    • R is not the best at doing distributed computing but can do so. See for example here
    • The R syntax can be a challenge to learn
    Getting the environment ready

    I decided to install a Linux VM to play with. You can also install R and R studio (the R IDE) on Windows or Mac. I decided to start with Ubuntu Server. I first installed the usual things like a GUI. Next I installed some handy things like a terminal emulator, Firefox and stuff like that. I finished with installing R and R-studio.

    So first download and install Ubuntu Server (next, next, finish)

    sudo apt-get update
    sudo apt-get install aptitude

--Install a GUI
sudo aptitude install --without-recommends ubuntu-desktop

    — Install the VirtualBox Guest additions
    sudo apt-get install build-essential linux-headers-$(uname -r)
    Install guest additions (first mount the ISO image which is part of VirtualBox, next run the installer)

    — Install the below stuff to make Dash (Unity search) working
    http://askubuntu.com/questions/125843/dash-search-gives-no-result
    sudo apt-get install unity-lens-applications unity-lens-files

    — A shutdown button might come in handy
    sudo apt-get install indicator-session

    — Might come in handy. Browser and fancy terminal application
    sudo apt-get install firefox terminator

--For the installation of R I used the following as inspiration: https://www.r-bloggers.com/how-to-install-r-on-linux-ubuntu-16-04-xenial-xerus/
sudo echo "deb http://cran.rstudio.com/bin/linux/ubuntu xenial/" | sudo tee -a /etc/apt/sources.list
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
    sudo apt-get update
    sudo apt-get install r-base r-base-dev

    — For the installation of R-studio I used: https://mikewilliamson.wordpress.com/2016/11/14/installing-r-studio-on-ubuntu-16-10/

    wget http://ftp.ca.debian.org/debian/pool/main/g/gstreamer0.10/libgstreamer0.10-0_0.10.36-1.5_amd64.deb
    wget http://ftp.ca.debian.org/debian/pool/main/g/gst-plugins-base0.10/libgstreamer-plugins-base0.10-0_0.10.36-2_amd64.deb
    sudo dpkg -i libgstreamer0.10-0_0.10.36-1.5_amd64.deb
    sudo dpkg -i libgstreamer-plugins-base0.10-0_0.10.36-2_amd64.deb
    sudo apt-mark hold libgstreamer-plugins-base0.10-0
    sudo apt-mark hold libgstreamer0.10

    wget https://download1.rstudio.org/rstudio-1.0.136-amd64.deb
    sudo dpkg -i rstudio-1.0.136-amd64.deb
    sudo apt-get -f install

    Doing a random forest in R

    R needs some libraries to do random forests and create nice plots. First give the following commands:

    #to do random forests
install.packages("randomForest")

#to work with R markdown language
install.packages("knitr")

#to create nice plots
install.packages("ggplot2")

To get help on a library, you can give the following command, which will show more information about that library.

library(help = "randomForest")

Of course, the randomForest implementation does have some specifics:

    • it uses the reference implementation based on CART trees
    • it is biased in favor of continuous variables and variables with many categories

    A simple program to do a random forest looks like this:

    #load libraries
    library(randomForest)
    library(knitr)
    library(ggplot2)

    #random numbers after the set.seed(10) are reproducible if I do set.seed(10) again
    set.seed(10)

#create a training sample of 45 rows from the iris dataset. replace = FALSE means sampling without replacement, so each row appears at most once in the sample (the out-of-bag error mentioned below comes from the bootstrap sampling that randomForest does internally)
    idx_train <- sample(1:nrow(iris), 45, replace = FALSE)

    #create a data.frame from the data which is not in the training sample
    tf_test <- !1:nrow(iris) %in% idx_train

    #the column ncol(iris) is the last column of the iris dataset. this is not a feature column but a classification column
    feature_columns <- 1:(ncol(iris)-1)

    #generate a randomForest.
    #use the feature columns from training set for this
    #iris[idx_train, ncol(iris)] indicates the classification column
    #importance=TRUE indicates the importance of features in determining the classification should be determined
    #y = iris[idx_train, ncol(iris)] gives the classifications for the provided data
    #ntree=1000 indicates 1000 random trees will be generated
    model <- randomForest(iris[idx_train, feature_columns], y = iris[idx_train, ncol(iris)], importance = TRUE, ntree = 1000)

    #print the model
#printing the model shows how the sample dataset is distributed among the classes. The sum of the sample classifications is 45, which is the sample size. The OOB rate is the 'out of bag' estimate of the overall classification error.

    print(model)

    #we use the model to predict the class based on the feature columns of the dataset (minus the sample used to train the model).
    response <- predict(model, iris[tf_test, feature_columns])

    #determine the number of correct classifications
    correct <- response == iris[tf_test, ncol(iris)]

    #determine the percentage of correct classifications
    sum(correct) / length(correct)

    #print a variable importance (varImp) plot of the randomForest
    varImpPlot(model)

    #in this dataset the petal length and width are more important measures to determine the class than the sepal length and width.

    The post Machine learning: Getting started with random forests in R appeared first on AMIS Oracle and Java Blog.

    How I learned to appreciate MongoDB

    Tue, 2017-04-04 05:11

    Last week at our company we organized a session about NoSQL in general and MongoDB in particular, as you can read here. The MongoDB focus was presented by me and I would like to take you with me on my trip to actually appreciating the NoSQL database world (and MongoDB in particular).

Coming from RDBMS I’m used to ACID. Business models reside within the database, data never gets lost, transactions are isolated, you’ve got your read consistency, strictly defined tables, foreign keys et cetera. MongoDB on the other hand I saw as an unstructured pile of data, no such thing as transactions, it’s ‘eventually consistent’ (that sounds like a leap of faith), no joins hence no foreign keys… You get the picture.

I am a DBA. My major concern is to have all data available or at least recoverable, no matter what. But the world is changing. Developers more and more look at the database as a storage engine; business logic is programmed in the application. We, DBAs, try to teach them to use the database better, but is that really necessary? There’s no law…

    The world is changing fast and so are businesses. It’s not unusual to deploy a new release every night. If developers need to redesign the database every few days, then maybe the structure of data is not that important. If we collect the number of hits on our website, is it a disaster if out of 10,000 hits we occasionally miss 1?

It takes a lot of discussion and ‘yes, but…’ for this other look at data to finally settle in. At least, for me it did. What finally won me over was an online course at MongoDB University that sort of mitigated the pain. Because, once you let go of the ACID model, you gain a lot of flexibility in terms of database design and infrastructure design. Scaling out becomes a very easy operation, for instance. Resilience against hardware failure is a piece of cake. And due to the lack of the RDBMS legacy, the engine can focus almost entirely on reading and writing data, which leads to lightning fast performance.

In the next paragraphs I will show some examples of the resilience and general behaviour of MongoDB, loosely compared to Oracle. It is hands-on, so I will also get you started, as minimally as possible, with MongoDB in general.

    I will not go into the way you read and write data. Only some actions will be shown that are needed for the examples. But in general:

    db = database and can be compared to a schema in Oracle.

    A db contains collections, which can be compared to tables.

A collection contains documents, which can be compared to rows.

    Joins are not possible, so all data you need should be in the same collection.

Documents consist of key:value pairs, as many as you like within one document.

    So, you can have the database ‘trades’ with collection ‘customers’ with documents like

    {“name”:”Larry”,”company”:”Oracle”,”hobby”:”sailing”}

    Getting started.

Go to the MongoDB site here and download the appropriate version. The entire hands-on can be run on your typical Windows or Linux laptop or in VirtualBox. The installation is very easy, no instructions needed from my side.

The software will be installed in C:\Program Files\MongoDB\Server\3.4\bin. The version can change over time of course.

    Add that to your path in your environment variables.

    In linux, add the path where you unpacked the tar ball, followed by bin, to your $PATH.

    Starting an instance is also very easy. Open a command box, create a directory \data\db1 and start the instance with

    mongod --dbpath \data\db1

On Windows you should leave this box open; when you close it the instance shuts down. It would be better to create a service, but for this demonstration this will do.

    Stop the database by pressing ^C.

    On linux you can fork the process so you don’t have to open a terminal for every instance:

    mongod --dbpath /data/db1 --fork --logpath a.log

    End it by killing the process.

From now on I will continue in Windows; Linux users can follow the instructions with minor adjustments, like / instead of \

    Also make sure to use a different logpath for each instance.

    Resilience against hardware failure.

In Oracle we have 2 kinds of resilience against hardware failure: instance failure can be mitigated by RAC, storage failure by Data Guard. Besides, if a single instance crashes you can recover all data, provided you have a decent backup strategy.

MongoDB uses a different approach called replica sets. Each instance (or member as it’s called) has its own storage and can be replicated to another instance (or many) with its own storage too. Only one instance accepts writes: the primary instance. The others only allow you to read data.

    In production this is a no brainer: should a single instance fail, then you can’t recover all data like in Oracle, no matter how often you make backups.

All instances vote on who will be the primary. This can be manipulated by setting the priority parameter. I will not go into that here but just demonstrate a simple replica set.

    Open a command box and type:

mkdir \data\db1\r1
mkdir \data\db1\r2
mkdir \data\db1\r3
    mongod --smallfiles --oplogSize 50 --port 27001 --dbpath \data\db1\r1 --replSet r

    Leave it open and open a second Command box and type:

    mongod --smallfiles --oplogSize 50 --port 27002 --dbpath \data\db1\r2 --replSet r

    Leave it open and open a third Command box and type:

    mongod --smallfiles --oplogSize 50 --port 27003 --dbpath \data\db1\r3 --replSet r

Open a fourth command box. This one will be used to actually talk to the database using the mongo shell. We will then initiate the replica set.

mongo --port 27003
    rs.initiate(
               { _id:'r',
                 members:[
                         { _id:1, host:'localhost:27001' },
                         { _id:2, host:'localhost:27002', "arbiterOnly" : true },
                         { _id:3, host:'localhost:27003' }
                         ]
               }
    )
    rs.status()

I introduced a special member, the Arbiter. This is a member without data; it only helps to have an uneven number of members, which is necessary to always get a majority of votes when it comes to choosing the Primary member.

    In the output you can see that we have 3 members: a Secondary on port 27001, an Arbiter on port 27002 and a Primary on port 27003. You can also see by the prompt that we are connected to the Primary.

    We will now create a collection called ‘simple’ and insert some data. Also, the writeConcern phrase makes sure data is written to at least 2 members. If there are more members they will be ‘eventually consistent’, meaning that they will synchronize but not immediately.

    db.simple.insert( { _id : 1 }, { writeConcern : { w : 2 } } )
    db.simple.insert( { _id : 2 }, { writeConcern : { w : 2 } } )
    db.simple.insert( { _id : 3 }, { writeConcern : { w : 2 } } )

Go to your secondary member and try to read the data. This involves giving yourself permission to read from the secondary, as I’ll show:

    exit
    mongo --port 27001
    r:SECONDARY> db.simple.find()
    Error: error: { "$err" : "not master and slaveOk=false", "code" : 13435 }
    r:SECONDARY> rs.slaveOk()
    r:SECONDARY> db.simple.find()
    { "_id" : 1 }
    { "_id" : 2 }
    { "_id" : 3 }

This looks okay. Not featured here: if I stop the Secondary, add data on the Primary and restart the Secondary, it synchronizes as expected. Just one thing: the writeConcern for 2 members cannot be used while only 1 data-bearing member is up.

    Now it becomes interesting. I’ll stop the Secondary, write some data on the Primary, stop the Primary and start the Secondary. Would the data written whilst the Secondary was down still be visible? If not, would it be recoverable?

    r:SECONDARY> exit
    bye

    Go to your first box and stop the Secondary with ^C.

    Go to the mongoshell box and connect to port 27003, the Primary and add some more data:

    mongo --port 27003
    MongoDB shell version: 3.0.14
    connecting to: 127.0.0.1:27003/test
    r:PRIMARY> db.simple.insert( { _id : 4 } )
    WriteResult({ "nInserted" : 1 })
    r:PRIMARY> db.simple.insert( { _id : 5 } )
    WriteResult({ "nInserted" : 1 })
    r:PRIMARY> db.simple.insert( { _id : 6 } )
    WriteResult({ "nInserted" : 1 })
    r:PRIMARY> db.simple.find()
    { "_id" : 1 }
    { "_id" : 2 }
    { "_id" : 3 }
    { "_id" : 4 }
    { "_id" : 5 }
    { "_id" : 6 }
    
    r:PRIMARY> exit
    bye

    Now stop the primary in your 3rd box with ^C and restart the Secondary in your 1st box. Then go to the mongoshell box and connect to port 27001

    mongo --port 27001
    MongoDB shell version: 3.0.14
    connecting to: 127.0.0.1:27001/test
    r:PRIMARY> rs.status()
    {
    "set" : "r",
    "date" : ISODate("2017-03-20T19:12:43.425Z"),
    "myState" : 1,
    "members" : [
    {
    "_id" : 1,
    "name" : "localhost:27001",
    "health" : 1,
    "state" : 1,
    "stateStr" : "PRIMARY",
    "uptime" : 25,
    "optime" : Timestamp(1490035617, 1),
    "optimeDate" : ISODate("2017-03-20T18:46:57Z"),
    "electionTime" : Timestamp(1490037141, 1),
    "electionDate" : ISODate("2017-03-20T19:12:21Z"),
    "configVersion" : 1,
    "self" : true
    },
    {
    "_id" : 2,
    "name" : "localhost:27002",
    "health" : 1,
    "state" : 7,
    "stateStr" : "ARBITER",
    "uptime" : 24,
    "lastHeartbeat" : ISODate("2017-03-20T19:12:43.354Z"),
    "lastHeartbeatRecv" : ISODate("2017-03-20T19:12:43.167Z"),
    "pingMs" : 0,
    "configVersion" : 1
    },
    {
    "_id" : 3,
    "name" : "localhost:27003",
    "health" : 0,
    "state" : 8,
    "stateStr" : "(not reachable/healthy)",
    "uptime" : 0,
    "optime" : Timestamp(0, 0),
    "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
    "lastHeartbeat" : ISODate("2017-03-20T19:12:43.354Z"),
    "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
    "configVersion" : -1
    }
    ],
    "ok" : 1
    }
    r:PRIMARY> db.simple.find()
    { "_id" : 1 }
    { "_id" : 2 }
    { "_id" : 3 }
    r:PRIMARY>db.simple.insert( { _id : 7 } )
    WriteResult({ "nInserted" : 1 })
    r:PRIMARY> db.simple.find()
    { "_id" : 1 }
    { "_id" : 2 }
    { "_id" : 3 }
    { "_id" : 7 }
    r:PRIMARY>
    

So, member 1 has now become the Primary, but we hit data loss: it never had a chance to synchronize and the members do not share any storage to read from.

    What would happen if we restart the 3rd member? After all, that one does have the lost data stored, somewhere.

    Start up the 3rd member (in the 3rd box)

In the output you will see it transitions to Secondary and it performs a rollback: the lost data is actually rolled back. And the good news: it is stored. Under its data directory \data\db1\r3 it created a directory called rollback which contains a .bson file. This file can be examined and/or imported into the database, as I’ll show.

    Go to the 4th box and exit mongoshell. Then:

    cd \data\db1\r3\rollback
    C:\data\db1\r3\rollback>bsondump test.simple.2017-03-20T19-32-31.0.bson
    {"_id":4.0}
    {"_id":5.0}
    {"_id":6.0}
    2017-03-20T20:45:06.412+0100 3 objects found
    C:\data\db1\r3\rollback>mongorestore --port 27001 --db test --collection simple test.simple.2017-03-20T19-32-31.0.bson
    2017-03-20T20:47:59.880+0100 checking for collection data in test.simple.2017-03-20T19-32-31.0.bson
    2017-03-20T20:47:59.886+0100 restoring test.simple from file test.simple.2017-03-20T19-32-31.0.bson
    2017-03-20T20:48:00.463+0100 no indexes to restore
    2017-03-20T20:48:00.463+0100 finished restoring test.simple (3 documents)
    2017-03-20T20:48:00.463+0100 done
    
    C:\data\db1\r3\rollback>mongo --port 27001
    MongoDB shell version: 3.0.14
    connecting to: 127.0.0.1:27001/test
    r:PRIMARY> db.simple.find()
    { "_id" : 1 }
    { "_id" : 2 }
    { "_id" : 3 }
    { "_id" : 7 }
    { "_id" : 4 }
    { "_id" : 5 }
    { "_id" : 6 }
    r:PRIMARY>
    

Okay, Oracle would have done everything by itself. But at what cost? It needs to maintain redo logs and archived redo logs. It only has 1 member to query, the Primary database. Yes, you can have a read-only standby database with Active Data Guard since 11g, but that’s a licensed option. It’s robust, nevertheless. I only want to say that the alternative is different but not all bad. Not at all.


    Scaling out, also known as sharding

    In the previous paragraph we covered HA and recoverablility. Now let’s have a look at scaling out, best compared to RAC.

RAC enables you to add CPU power and memory to a database. It also enables you to distribute different kinds of workloads over different machines, for instance reporting on one node and OLTP on another node. That distribution can be compared to smart use of the replica sets explained above.

    Adding CPU power and memory is something else. MongoDB heavily relies on memory to perform. And they made it very easy for you to add more nodes to your cluster. This is done by sharding.

Sharding can best be described as range based partitioning. Sharding is done on a per-collection basis. A shard cluster consists of 1 (!) or more nodes that automatically partition a collection and distribute it evenly over all cluster members.

    Let’s have a closer look.

First of all, each mongod needs to know it is part of a shard cluster. That is accomplished with the --shardsvr startup parameter. It is also very wise to explicitly declare the port number with the --port parameter. Finally it needs its own storage, the --dbpath parameter. Example:

    mongod  --shardsvr --port 27020 --dbpath \data\db1\s1
    mongod  --shardsvr --port 27021 --dbpath \data\db1\s2

Next we need a config server. This is also a mongod, but instead of regular data it holds a special database that contains all information there is to know about the cluster: especially which members are known and the relation between the members and the partitions, the shards in mongo language.

As of Mongo 3.4 config servers need to be in a replica set instead of standalone. For demonstration or development purposes, a set of only 1 member is allowed.

    In a production environment you typically create 3 config servers, for now we’ll create just one:

    mongod --configsvr --replSet c --dbpath \data\db1\conf --port 27019

Start the mongo shell in another command box so we can configure the replica set "c":

    rs.initiate(
    {
    _id: "c",
    configsvr: true,
    members: [
    { _id : 0, host : "localhost:27019" }
    ]
    }
    )

Finally we need at least one mongos, which is a routing service and serves as the front end to which the users connect. The mongos has no persistent data; it reads the config server and distributes client requests over the shards.

    It needs to know where to find the config server so we tell it with a parameter configReplSetName/hostname:port:

    mongos --configdb c/localhost:27019

    We can now open a mongo shell. It will by default connect to port 27017 and, lo and behold, a mongos automatically runs on port 27017. Since we are all running on the same host, connecting is very easy.

    In the shell we will add shard servers to the cluster. Next we will enable sharding for a specific database.

    mongo
    mongos> sh.addShard( "localhost:27020")
    mongos> sh.addShard( "localhost:27021")
    mongos> sh.enableSharding("test")

The only thing we have done is enable sharding for a db. But nothing is sharded yet. For that to happen we need to decide which collection(s) will be sharded and on what key, and the collection needs an index on that shard key. Once the collection is sharded, nothing more needs to be done: the cluster distributes the data automatically. A minimal sketch of these last steps is shown below.
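The commands for this are given from the mongos shell; the orders collection and the customerId shard key below are just hypothetical examples.

mongos> use test
mongos> db.orders.createIndex( { customerId : 1 } )
mongos> sh.shardCollection( "test.orders", { customerId : 1 } )
mongos> sh.status()

The last command shows how the chunks of the sharded collection are spread over the shards.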

So what did I learn?
Sure, you lack the robustness of an RDBMS. Sure, you cannot join and sure, therefore you store way more bytes than usual. But it’s fast, it’s easy and it serves many purposes. And last but not least, it takes some serious out-of-the-box thinking for a DBA to actually appreciate this new world: you have to let go of some fundamental principles on which your world was based for the last ten, twenty or more years.

And finally a disclaimer: these examples have been oversimplified. In the real world you’d use many hosts. You’d use 3 config servers, many mongos instances and of course a replicated shard cluster.
Apart from that, there are many ways to make the behaviour more sophisticated and robust. Check out the official documentation; it’s quite good in my opinion and challenges you to many experiments.

    The post How I learned to appreciate MongoDB appeared first on AMIS Oracle and Java Blog.

    The value of the Oracle PaaS Partner Community Forum

    Fri, 2017-03-31 04:10

I have just returned home from the Oracle PaaS Partner Community Forum 2017 which took place in Split, Croatia. Filled with energy, inspiration, impressions, already fond memories, more understanding, longer term plans and shorter term action items and a warm feeling of being part of an international network of smart, open, helpful and fun people. Does that sound soft, a little over the top – slightly brainwashed even? Let me try to explain what this event entails and what it means to me.

200 Representatives from Oracle PaaS and Middleware partners in 25+ countries – primarily in Europe and including North America, Asia and Africa – flocked to Split, Croatia to meet with each other as well as with a few dozen Oracle staff – from Vice President Product Management Fusion Middleware and PaaS Cloud to the Oracle A Team. The objectives of the event are to share real life experiences with Oracle technology and Oracle as a company to do business with. The partners share experience with each other – to learn and teach – as well as with Oracle – to enlighten them as to what works well and what is sorely lacking in products. Oracle staff lay out the product roadmaps for the next 6-18 months, share commercial opportunities, demonstrate new or hidden features in the products and in general enable attendees to make more successful use of their technology. The common goal is clear: become more successful in applying Oracle products for customers – by making the products better and by better understanding how the products can be best used.

The common ground for this community is strong and clear: a shared enthusiasm to work with technology and achieve business value for customers with it. We share a lot of history – even if this is the first time we meet – from working many years with the same technology, trying to achieve similar results for customers, facing the same challenges and probably unknowingly collaborating through discussion forums and blog articles. This shared history and common vocabulary make it very easy to strike up conversations – with just about anyone. Talking about a product, a technology, a wild plan at Oracle, the demo in the previous session. We speak the same language. And this common ground and shared history result in a warm, open atmosphere where conversations about sensitive topics and personal affairs can easily take place. With such a group of smart and very smart people, it seems easy to analyze the large problems of the world – and to overcome them as well (although that may have been the very agreeable local red wine speaking).

Some of the conversations I was part of included topics such as: Political situation in Brazil, economy of Portugal, president Trump, the Dutch elections, Oracle SaaS in Belgium, recent history and current relations in Slovenia, Croatia and Former Yugoslavia, politics and healthy food trends in Germany, history and culture in India, the Brexit and its consequences for Northern Ireland, raising children – especially at that tender age of 16 – in Slovenia, Portugal, Netherlands and the USA, regional differences within Spain, weather and seasons in Peru, absorbing the move from Mexico to Norway.

    I treasure the warm, open, positive, inquisitive, helpful attitude of the attendees – their willingness to share and help, to collaborate and appreciate. I am sure people have similar experiences with scientific communities and conferences, scouting jamborees, musical festivals and other gatherings of people with a shared passion. I am glad I have my own professional social community to be a part of.

Thanks Jürgen for building and facilitating that community – it does not just happen.

     

     

    OMESA – the Open Modern Enterprise Software Architecture

Starting at the Community Day on Monday – but actually continuing an initiative that was started around Oracle OpenWorld 2016 – a small team of likeminded spirits with shared architectural interests, experiences and zeal gathered to work on OMESA – the Open Modern Enterprise Software Architecture initiative, initiated by Hajo Normann and especially Luis Weir. Taking into account many modern themes and upcoming technology patterns and trends – such as microservices, IoT, wearables and other devices, APIs, real time, big data, DevOps, scale out, stateless/server-less – the OMESA initiative tries to come up with good ways to describe, design and eventually implement enterprise software architecture patterns that actually work in real life. Drawing on the many decades experience across organizations, IT landscapes, corporate cultures as well as countries and continents, the OMESA discussions ran deep (more than 2 hours for about 20% of a single slide) and wide (from business objectives to infrastructure design, from fine grained component to end to end business flow). All in good spirit and at times quite passionate. Fun to do and very meaningful as well.

     

     

    Themes

Clearly, one community day, two conference days and two days of in-depth hands-on workshops cover many topics and details. A number of subjects and themes kept reappearing and seemed to define the main scope for the conference and the imminent roadmaps. A prime example was ChatBots, which made an appearance in almost every presentation.

    More far reaching themes included:

    • Integration – various solutions to create end to end integration flow across technology stacks, locations, cloud & on premises ;
    • SaaS Extension – Oracle focuses on various aspects of SaaS Extension (and SaaS enablement) and specific solutions for those. ABCS (Application Builder Cloud Service) is put forward as the primary solution for (simple) UI extensions of SaaS solutions. ABCS is rapidly evolving into a low code/no code/visual development environment for citizen developers and IT professionals to create cloud based user interfaces on top of local, custom business objects, SaaS APIs or custom REST APIs. Integration and Process & Workflow are other areas where through ICS with Cloud Adapters and PCS for human workflow and complex business processes the SaaS offerings can be extended and tailored. In terms of financial conditions – Oracle is offering these cloud services at very attractive price points for use with Oracle SaaS.
    • Machine Learning & AI – leveraging data to predict & recommend, based on statistical analysis of (big data) is rapidly becoming the holy grail for many organizations. The Big Data lakes are to be exploited and machine learning is one way of doing exactly that. IoT is leading to even more data and even more real time requirements. Machine Learning is popping up in various cloud services to provide them with predictive capabilities – for example in Dynamic Processes in PCS and Log and Infrastructure Analytics in OMC.
    • Processes – PCS will be extended and enriched with dynamic process (evolved from adaptive case management), business rules and more.
    • APIs – APIs – internal and external, generic business APIs and single purpose, consumer tailored APIs – API first design; API Platform/Catalog (design time) and API Gateway (runtime). apiary.io will always be free (promises Vikas Anand) – for API design and (design time) publication
• PaaS orchestration – using the Oracle PaaS services together; having them talk to each other, share data, connections, information, logging, events, monitoring, IaaS resources. Oracle is working on that, and it is getting better at it.
• Containers & Microservices – another holy grail seems to be a definition of what a microservice is. It seems obvious that microservices are the next level in decoupling, ownership, stateless and flexible horizontal scale out and modern DevOps. In terms of implementation, containers – including application containers and serverless functions – seem an obvious choice (to an extent where sometimes means and end get confused).
    • DevOps – taking ownership of a software component throughout its lifecycle (you build it, you run it, you fix it), using automated continuous delivery, automated (anti-)regression testing and automated monitoring/problem analysis as well as automated recovery/fail over and scaling. Containers at infra structure level, microservices as organization and architecture level and tooling such as Oracle Management Cloud for monitoring and analysis.

     

     

    REAL – Red Expert Alliance

The partner community forum is a great opportunity for the partners who collaborate even more within the Red Expert Alliance (or REAL) to catch up with each other. With partners in Mexico, USA, Australia, Portugal, Norway, Germany, Spain, UK and The Netherlands, the REAL network forms an international team of Oracle partners and over 400 Oracle Professionals, supporting customers on all continents. Split provided a good opportunity to exchange the latest developments, discuss new business propositions, better get to know each other and in general strengthen the mutual ties. Additionally, some conversations took place with new candidate members for REAL.

     

    Soaring through the Clouds

One of the highlights for me of the community event was the Soaring through the Cloud Live Demonstration of integrating 15 cloud services – prepared by a team of seven Oracle ACEs spanning three countries and five companies. The preparation for this session started over two months before the actual performance – and in a way already started over a year ago with the first and decidedly more challenging (given the state of the PaaS Cloud services at that time) live demo at the community event in Valencia, Spain – March 2016.

In 1 hour and 20 minutes, we described an end to end business flow with many interaction points with things, social media, IT components and people including the audience. We managed to leverage 18 and demonstrate 15 cloud services, with only one or two minor hiccups around switching between Oracle Public Cloud identity domains. In doing so, we claim to have set the world record for integrating PaaS Cloud Services.

    You can find the resources for this presentation at: http://tinyurl.com/SoaringSplit (live endpoints, UI and supporting resources including source code).

    The presentation slide deck itself (including many screenshots from our live demo) is available from SlideShare:

    The post The value of the Oracle PaaS Partner Community Forum appeared first on AMIS Oracle and Java Blog.

    Development and Runtime Experiences with a Canonical Data Model Part I: Standards & Guidelines

    Wed, 2017-03-29 12:21
    Introduction

    In my previous blog I’ve explained what a Canonical Data Model (CDM) is and why you should use it. This blog is about how to do this. I will share my experiences on how to create and use a CDM. I gained these experiences at several projects, small ones, and large ones. All of these experiences were related to an XML based CDM. This blog consists of three parts. This blogpost contains part I: Standards & Guidelines. The next blogpost, part two, is about XML Namespace Standards and the last blogpost contains part three about Dependency Management & Interface Tailoring.
This first part, about standards and naming conventions, primarily applies to XML, but the same principles and ideas will mostly apply to other formats, like JSON, as well. The second part, about XML namespace standards, only applies to a CDM in the XML format, as its title already indicates. The last part, in the third blogpost, about dependency management & interface tailoring, applies to all kinds of data formats.

    Developing a CDM

About the way of creating a CDM: it’s not doable to create a complete CDM upfront and only then start designing services and developing them. This is because you can only determine usage of data, completeness and quality while developing the services and gaining experience in using them. A CDM is a ‘living’ model and will change in time.
    When the software modules (systems or data stores) which are to be connected by the integration layer are being developed together, the CDM will change very often. While developing software you always encounter shortcomings in the design, unseen functional flaws, unexpected requirements or restrictions and changes in design because of new insights or changed functionality. So sometimes the CDM will even change on a daily base. This perfectly fits into the modern Agile Software Development methodologies, like Scrum, where changes are welcome.
When the development stage is finished and the integration layer (SOA environment) is in a maintenance stage, the CDM will still change, but at a much slower pace. It will keep on changing because of maintenance changes and modifications of connected systems or trading partners. Changes and modifications due to new functionality also cause new data entities and structures which have to be added to the CDM. These changes and modifications occur because business processes change in time, caused by a changing world, ranging from technical innovations to social behavioral changes.
Either way, the CDM will never be finished and reach a final, changeless state, so a CDM should be flexible and created in such a way that it welcomes changes.

When you start creating a CDM, it’s wise to define standards and guidelines about defining the CDM and using it beforehand. Make a person (or a group of persons in a large project) responsible for developing and defining the CDM. This means he defines the data definitions and structures of the CDM. When using XML this person is responsible for creating and maintaining the XML Schema Definition (XSD) files which represent the CDM. He develops the CDM based on requests from developers and designers. He must be able to understand the needs of the developers, but he should also keep the model consistent, flexible and future proof. This means he must have experience in data modeling and the data format (e.g. XML or JSON) and specification language (e.g. XSD) being used. Of course, he also guards the standards and guidelines which have been set. He is also able, when needed, to deny requests for a CDM change from (senior) developers and designers in order to preserve a well-defined CDM and provide an alternative which meets their needs as well.

    Standards & Guidelines

    There are more or less three types of standards and guidelines when defining an XML data model:

    • Naming Conventions
    • Structure Standards
    • Namespace Standards
    Naming Conventions

The most important advice is that you define naming conventions upfront and stick to them. Like all naming conventions in programming languages, there are a lot of options and often it’s a matter of personal preference. Changing conventions because of different personal preferences is not a good idea. Mixed conventions result in ugly code. Nevertheless I do have some recommendations.

    Nodes versus types
    The first one is to make a distinction between the name of a node (element or attribute) and an XML type. I’ve been in a project where the standard was to give them exactly the same name. In XML this is possible! But the drawback was that there were connecting systems and programming languages which couldn’t handle this! For example the standard Java library for XML parsing, JAX-P, had an issue with this. The Java code which was generated under the hood used the name of an XML type for a Java class name and the name of an element as a Java variable name. In Java it is not possible to use an identical name for both. In that specific project, this had to be fixed manually in the generated Java source code. That is not what you want! It can easily be avoided by using different names for types and elements.

    Specific name for types
    A second recommendation, which complements the advice above, is to use a specific naming convention for XML types, so their names always differ from node names. The advantage for developers is that they can recognize from the name if something is an XML node or an XML type. This eases XML development and makes the software code easier to read and understand and thus to maintain.
Often I’ve seen a naming convention which tries to implement this by prescribing that the name of an XML type should be suffixed with the token “Type”. I personally do not like this specific naming convention. Consider you have a “Person” entity and so you end up with an XML type named “PersonType”. This perfectly makes sense, doesn’t it? But how about a “Document” entity? You end up with an XML type named “DocumentType” and guess what: there is also going to be a “DocumentType” entity, resulting in an XML type named “DocumentTypeType”…!? Very confusing in the first place. Secondly, you end up with an element and an XML type with the same name! The name “DocumentType” is used as a name for an element (of type “DocumentTypeType”) and “DocumentType” is used as an XML type (of an element named “Document”).
    From experience I can tell you there are more entities with a name that ends with “Type” than you would expect!
My advice is to prefix an XML type with the character “t”. This not only prevents this problem, but it’s also shorter. Additionally you can distinguish an XML node from an XML type by the start of its name. This naming convention results in element names like “Person”, “Document” and “DocumentType” versus type names “tPerson”, “tDocument” and “tDocumentType”.

    Use CamelCase – not_underscores
The third recommendation is to use CamelCase for names instead of using underscores as separators between the words which make up the name of a node or type. This shortens a name while it can still be read easily. I’ve got a slight preference to start a name with an uppercase character, because then I can use camelCase starting with a lowercase character for local variables in logic or transformations (BPEL, XSLT, etc.) in the integration layer or tooling. This results in a node named “DocumentType” of type “tDocumentType”, and when used in a local variable in code, this variable is named “documentType”.
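As a small illustrative XSD fragment, sticking to the Document/DocumentType example from above (the Id and Description child elements are made up for the illustration), these conventions look like this:

<element name="Document" type="tns:tDocument"/>
<element name="DocumentType" type="tns:tDocumentType"/>

<complexType name="tDocument">
  <sequence>
    <element name="DocumentType" type="tns:tDocumentType" minOccurs="0" maxOccurs="1"/>
  </sequence>
</complexType>
<complexType name="tDocumentType">
  <sequence>
    <element name="Id" type="string" minOccurs="0" maxOccurs="1"/>
    <element name="Description" type="string" minOccurs="0" maxOccurs="1"/>
  </sequence>
</complexType>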

    Structure Standards

    I also have some recommendations about standards which apply to the XML structure of the CDM.

    Use elements only
The first one is to never use attributes, so only elements. You can never expand an attribute and create child elements in it. This may not be necessary at the moment, but may be necessary sometime in the future. Also an attribute cannot have the ‘null’ value, in contrast with an element. You can argue that an empty value can represent the null value. But this is only possible with String type attributes (otherwise it’s considered invalid XML when validating against its schema) and often there is a difference between an empty string and a null value. Another disadvantage is that you cannot have multiple attributes with the same name inside an element.
Furthermore, using elements makes XML more readable for humans, so this helps developers in their coding and debugging. A good read about this subject is “Principles of XML design: When to use elements versus attributes”. This article contains a nice statement: “Elements are the extensible engine for expressing structure in XML.” And that’s exactly what you want when developing a CDM that will change in time.
    The last advantage is that when the CDM only consists of elements, processing layers can add their own ‘processing’ attributes only for the purpose of helping the processing itself. This means that the result, the XML which is used in communicating with the world outside of the processing system, should be free of attributes again. Also processing attributes can be added in the interface, to provide extra information about the functionality of the interface. For example, when retrieving orders with operation getOrders, you might want to indicate for each order whether it has to be returned with or without customer product numbers:

    <getOrdersRequest>
      <Orders>
        <Order includeCustProdIds='false'>
          <Id>123</Id>
        </Order>
        <Order includeCustProdIds='true'>
          <Id>125</Id>
        </Order>
        <Order includeCustProdIds='false'>
          <Id>128</Id>
        </Order>
      </Orders>
    </getOrdersRequest>

Beware that these attributes are processing or functionality related, so they should not be a data part of the entity. And ask yourself if they are really necessary. You might consider providing this extra functionality in a new operation, e.g. an operation getCustProdIds to retrieve customer product ids or an operation getOrderWithCustIds to retrieve an order with customer product numbers.

    All elements optional
The next advice is to make all the elements optional! There unexpectedly always is a system or business process which doesn’t need a certain (child) element that you initially thought would always be necessary. On one project this was the case with id elements. Each data entity must have an id element, because the id element contains the functional unique identifying value for the data entity. But then there came a business case with a front end system that had screens in which the data entity was being created. Some of the input data had to be validated before the unique identifying value was known. So the request to the validation system contained the entity data without the identifying id element, and the mandatory id element had to be changed to an optional element. Of course, you can solve this by creating a request which only contains the data that is used in separate elements, so without the use of the CDM element representing the entity. But one of the powers of a CDM is that there is one definition of an entity.
    At that specific project, in time, more and more mandatory elements turned out to be optional somewhere. Likely this will happen at your project as well!

    Use a ‘plural container’ element
There is, of course, one exception: an element which should be mandatory. That is the ‘plural container’ element, which only is a wrapper element around a single element which may occur multiple times. This is my next recommendation: when a data entity (XML structure) contains another data entity as a child element and this child element occurs two or more times, or there is a slight chance that this will happen in the future, then create a mandatory ‘plural container’ element which acts as a wrapper element that contains these child elements. A nice example of this is an address. More often than you might think, a data entity contains more than one address. When you have an order as data entity, it may contain a delivery address and a billing address, while you initially started with only the delivery address. So when initially there is only one address and the XML is created like this:

    <Order>
      <Id>123</Id>
  <CustomerId>456</CustomerId>
      <Address>
        <Street>My Street</Street>
        <ZipCode>23456</ZipCode>
        <City>A-town</City>
        <CountryCode>US</CountryCode>
        <UsageType>Delivery</UsageType>
      </Address>
      <Product>...</Product>
      <Product>...</Product>
      <Product>...</Product>
    </Order>

    Then you have a problem with backwards compatibility when you have to add the billing address. This is why it’s wise to create a plural container element for addresses, and for products as well. The name of this element will be the plural of the element it contains. Above XML will then become like this:

    <Order>
      <Id>123</Id>
  <CustomerId>456</CustomerId>
      <Addresses>
        <Address>
          <Street>My Street</Street>
          <ZipCode>23456</ZipCode>
          <City>A-town</City>
          <CountryCode>US</CountryCode>
          <UsageType>Delivery</UsageType>
        </Address>
      </Addresses>
      <Products>
        <Product>...</Product>
        <Product>...</Product>
        <Product>...</Product>
      </Products>
    </Order>

In the structure definition, the XML Schema Definition (XSD), define the plural container element to be single and mandatory. Make its child elements optional and without a maximum number of occurrences. First, this results in maximum flexibility and second, in this way there is only one way of constructing XML data that doesn’t have any child elements. In contrast, when you make the plural container element optional, you can create XML data that doesn’t have any child elements in two ways: by omitting the plural container element completely and by adding it without any child elements. You may want to solve this by dictating that the plural container element always has at least one child element, but then the next advantage, discussed below, is lost.
    So the XML data example of above will be modeled as follows:

    <complexType name="tOrder">
      <sequence>
        <element name="Id" type="string" minOccurs="0" maxOccurs="1"/>
        <element name="CustomerId" type="string" minOccurs="0" maxOccurs="1"/>
        <element name="Addresses" minOccurs="1" maxOccurs="1">
          <complexType>
            <sequence>
              <element name="Address" type="tns:tAddress" minOccurs="0" maxOccurs="unbounded"/>
            </sequence>
          </complexType>
        </element>
        <element name="Products" minOccurs="1" maxOccurs="1">
          <complexType>
            <sequence>
              <element name="Product" type="tns:tProduct" minOccurs="0" maxOccurs="unbounded"/>
            </sequence>
          </complexType>
        </element>
      </sequence>
    </complexType>
    <complexType name="tAddress">
      <sequence>
        ...
      </sequence>
    </complexType>
    <complexType name="tProduct">
      <sequence>
        ...
      </sequence>
    </complexType>

There is another advantage of this construction for developers. When there is a mandatory plural container element, this element acts as a kind of anchor or ‘join point’ when XML data has to be modified in the software and, for example, child elements have to be added. As this element is mandatory, it’s always present in the XML data that has to be changed, even if there are no child elements yet. So the code of a software developer can safely ‘navigate’ to this element and make changes, e.g. adding child elements. This eases the work of a developer.

    Be careful with restrictions
    You never know beforehand with which systems or trading partners the integration layer will connect in future. When you define restrictions in your CDM, beware of this. For example restricting a string type to a list of possible values (enumeration) is very risky. What to do when in future another possible value is added?
Even a more flexible restriction, like a regular expression, can soon become too strict as well. Take for example the top level domain names on the internet. They once were restricted to two character abbreviations for countries, some other three character abbreviations (“net”, “com”, “org”, “gov”, “edu”) and one four character word “info”, but that’s history now!
This risk applies to all restrictions: restrictions on character length, numeric restrictions, restrictions on value ranges, etc.
Likewise I bet that the length of product id’s in the new version of your ERP system will exceed the current one.
My advice is to minimize restrictions as much as possible in your CDM, preferably no restrictions at all!
Instead, define restrictions on the interfaces, the APIs to the connecting systems. When for example the product id of your current ERP system is restricted to 8 characters, it perfectly makes sense that you define a restriction on the interface with that system, as sketched in the example below. More on this in part III, in my last blogpost, in the section about Interface Tailoring.
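Such an interface-specific restriction could, as a sketch, look like this in the XSD of that ERP interface (the type name tErpProductId is made up for this example):

<simpleType name="tErpProductId">
  <restriction base="string">
    <maxLength value="8"/>
  </restriction>
</simpleType>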

    String type for id elements
Actually this one is the same as the one above about restrictions. I want to discuss it separately, because of its importance and because it often goes wrong. Defining an id element as a numeric type is a way of applying a numeric restriction to a string type id.
The advice is to make all identifying elements (id, code, etc.) of type string and never a numeric type! Even when they always get a numeric value… for now! The integration layer may in future connect to another system that uses non-numeric values for an id element, or an existing system may be replaced by a system that uses non-numeric id’s. Only make those elements numeric which truly contain numbers, so the value has a numeric meaning. You can check this by asking yourself whether it functionally makes sense to calculate with the value or not. So for example phone numbers should be strings. Also when there is a check (algorithm) based on the sequence of the digits whether a number is valid or not (e.g. bank account check digit), this means the number serves as an identification and thus should be a string type element! Another way to detect numbers which are used as identification is to determine if it matters when you add a preceding zero to the value. If that does matter, it means the value is not used numerically. After all, preceding zeros don’t change a numeric value.

    Determine null usage
The usage of the null value in XML (xsi:nil=”true”) always leads to lots of discussions. The most important advice is to explicitly define standards & rules and communicate them! Decide whether null usage is allowed or not. If so, determine in which situations it is allowed and what it functionally means. Ask yourself how it is used and how it differs from an element being absent (optional elements).
For example, I’ve been in a project where a lot of data was updated in the database. An element being absent meant that the value didn’t change, while a null value meant, for a container element, that its record had to be deleted and, for a ‘value’ element, that the database value had to be set to null.
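
To illustrate the difference with a sketch based on that update example (element names are assumptions): an absent element means ‘leave the value unchanged’, while an explicit nil means ‘set the value to null in the database’. Note that the element must be declared with nillable="true" in the XSD for this to be valid.

<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Id>12345</Id>
  <!-- Name is absent: the name in the database is left unchanged -->
  <!-- PhoneNumber is explicitly nil: the phone number must be set to null -->
  <PhoneNumber xsi:nil="true"/>
</Customer>
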
    The most important advice in this is: Make up your mind, decide, document and communicate it!

    To summarize this first part of naming conventions and guidelines:

    • Keep in mind that a CDM keeps on changing, so it’s never finished
    • Define naming and structure standards upfront
    • and communicate your standards and guidelines!

When creating a CDM in the XML format, you also have to think about namespaces and how to design the XML. That is what the second part, in my next blogpost, is all about. When you are not defining a CDM in the XML format, you can skip that one and immediately go to the third and last blogpost about dependency management & interface tailoring.

    The post Development and Runtime Experiences with a Canonical Data Model Part I: Standards & Guidelines appeared first on AMIS Oracle and Java Blog.

    Development and Runtime Experiences with a Canonical Data Model Part II: XML Namespace Standards

    Wed, 2017-03-29 12:21

    This blog is about XML namespace standards. Primary for using them in a Canonical Data Model (CDM), but also interesting for anyone who has to define XML data by creating XML Schema files (XSD). This blogpost is the second part of a trilogy about my experiences in using and developing a CDM. The first blogpost is about naming & structure standards and the third blogpost is about dependency management & interface tailoring.

    XML Namespace Standards

A very important part of an XML model is its namespace. With a namespace you can bind an XML model to a specific domain; it can represent a company, a business domain, a system, a service or even a single component or layer within a service. For a CDM this means that choices have to be made: use one namespace or more, how to deal with newer versions of the CDM, etc.

    Two approaches: one generic namespace vs component specific namespaces
    Basically I’ve come across two approaches of defining a namespace in a CDM. Both ways can be a good approach, but you have to choose one based on your specific project characteristics.

    1. The first approach is to use one generic fixed namespace for the entire CDM. This may also be the ’empty’ namespace which looks like there is no namespace. This approach of one generic fixed namespace is useful when you have a central CDM that is available at run time and all services refer to this central CDM. When you go for this approach, go for one namespace only, so do not use different namespaces within the CDM.
  For maintenance and to keep the CDM manageable, it can be useful to split up the CDM into several definition files (XSDs), each one representing a different group (domain) of entities. However, my advice is to still use the same namespace in all of these definition files. The reason is that in time the CDM will change and you may want to move entities from one group to another or split up a group. If each group had its own namespace, you would get a problem with backward compatibility, because an element which moves from one group to another would change its namespace.
  When at a certain moment you are facing a huge amount of changes which also impact the running software, you can create a new version of the CDM. Examples of such situations are connecting a new external system or replacing an important system by another one. When you have multiple versions of the CDM, each version must have its own namespace, with the version number as part of the namespace name. New functionality can now be developed with the new version of the CDM. When it uses existing functionality (e.g. calling an existing service) it has to transform the data from the new version of the CDM to the old version (and vice versa).

2. The second approach is that each software component (e.g. a SOAP web service) has its own specific namespace. This specific namespace is used as the namespace for a copy of the CDM; the software component uses this copy as its ‘own’ copy of the CDM. A central runtime CDM is not needed any more. This means that the software components have no runtime dependencies on the CDM! The result is that the software components can be deployed and run independently of the current version of the CDM. This is the most important advantage!
  The way to achieve this is to have a central CDM without a namespace (or with a dummy namespace like ‘xxx’), which is only available as an offline library at design time. So there is not even a run time CDM to reference!
  Developers need to create a hard copy of the CDM for the software component they are building and apply a namespace to that copy. The name of this namespace is specific to that software component and typically includes the name (and version) of the software component itself. Because the software component is the ‘owner’ of this copy, the parts (entities) of the CDM which are not used by the software component can be removed from this copy.

    In part III in my last blogpost about run time dependencies and interface tailoring I will advise when to use the first and when to use the second approach. First some words about XML patterns and their usage in these two namespace approaches.

    XML Patterns
XML patterns are design patterns, applicable to the design of XML. Because the design of XML is defined by XML Schema (XSD) files, these XML patterns are actually XML Schema (XSD) patterns. These design patterns describe a specific way of modeling XML. Different ways of modeling can result in the same XML, but may differ in terms of maintenance, flexibility, ease of extension, etc.
As far as I know, there are four XML patterns: “Russian Doll”, “Salami Slice”, “Venetian Blind” and “Garden of Eden”. I’m not going to describe these patterns, because that has already been done by others. For a good description of the first three, see http://www.xfront.com/GlobalVersusLocal.html, and http://www.oracle.com/technetwork/java/design-patterns-142138.html gives a brief summary of all four. I advise you to read and understand them when you want to set up an XML-based CDM.

    I’ve described two approaches of using a CDM above, a central run-time referenced CDM and a design time only CDM. So the question is, which XML design pattern matches best for each approach?

When you’re going for the first approach, a central run-time referenced CDM, there are no translations necessary when passing (a part of) an XML payload from one service to another service. This is easier compared with the second approach, where each service has a different namespace. Because there are no translations necessary and the services need to reference parts of entities as well as entire entity elements, it’s advisable to use the “Salami Slice” or the “Garden of Eden” pattern. They both have all elements defined globally, so it’s easy to reuse them. With the “Garden of Eden” pattern, types are defined globally as well and are thus reusable, providing more flexibility and freedom to designers and developers. The downside is that you end up with a very scattered and verbose CDM.
To solve this disadvantage, you can go for the “Venetian Blind” pattern, set the schema attribute “elementFormDefault” to “unqualified” and not include any element definitions in the root of the schemas (XSDs) which make up the CDM. This means there are only XML type definitions in the root of the schema(s), so the CDM is defined by types. The software components, e.g. a web service, do have their own namespace. In this way the software components define a namespace (through their XSD or WSDL) for the root element of the payload (in the SOAP body), while all the child elements below this root remain ‘namespace-less’.
This makes the life of a developer easier, as there is no namespace and thus no prefixes are needed in the payload messages. Not having to deal with namespaces in all the transformation, validation and processing software that works with those messages makes programming code (e.g. XSLT) less complicated and thus less error prone.
This leads to my advice:

The “Venetian Blind” pattern, with the schema attribute “elementFormDefault” set to “unqualified” and no elements in the root of the schemas, is the best XML pattern for the approach of using a central run-time referenced CDM.
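
A minimal sketch of what this looks like (namespaces and names are illustrative, not prescriptive): the CDM schema contains only global type definitions with elementFormDefault set to “unqualified”, while the service XSD declares the root element of the payload in its own namespace.

<!-- CDM schema: only type definitions, no global elements -->
<schema targetNamespace="http://example.com/cdm"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="unqualified">
   <complexType name="tProduct">
      <sequence>
         <element name="Id" type="string"/>
         <element name="Name" type="string"/>
      </sequence>
   </complexType>
</schema>

<!-- Service schema: only the root element of the payload carries the service namespace;
     the child elements Id and Name remain namespace-less in the instance XML -->
<schema targetNamespace="http://example.com/services/ProductService"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:cdm="http://example.com/cdm">
   <import namespace="http://example.com/cdm" schemaLocation="CDM.xsd"/>
   <element name="getProductResponse" type="cdm:tProduct"/>
</schema>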

When you’re going for the second option, no runtime CDM but only a design time CDM, you shouldn’t use a model which results in payloads (or parts of payloads) of different services having exactly the same namespace. So you cannot use the “Venetian Blind” pattern with “elementFormDefault” set to “unqualified” which I have just explained. You can still use the “Salami Slice” or “Garden of Eden” pattern, but the disadvantages of a large, scattered and verbose CDM remain.
The reason that you cannot have the same namespace for the payloads of services with this approach is that the services have their own copy (‘version’) of the CDM. When (parts of) payloads of different services have the same element with the same namespace (or the empty namespace), the XML structure of both is considered to be exactly equal, while that need not be the case! When they are not the same, you have a problem when services need to call each other and payloads are passed around. They can already be different at design time, and then it’s quite obvious.
Much more dangerous is that they can even become different later in time without being noticed! To explain this, assume that at a certain time two software components were developed using the same CDM version, so the XML structure was the same. But what if one of them changes later in time and these changes are considered backwards compatible (resulting in a new minor version)? The design time CDM has changed, so the newer version of this service uses the newer CDM version. The other service did not change and now receives a payload from the changed service with elements of a newer version of the CDM. Hopefully this unchanged service can handle the new CDM format correctly, but it might not! Another problem is that it might break its own contract (WSDL) when it copies the new CDM entities (or parts of them) into its response to its caller, thus breaking its own contract while the service itself has not changed! Keep in mind that its WSDL still uses the old CDM definitions of the entities in the payload.
    Graphically explained:
    Breach of Service Contract
Service B calls Service A and retrieves (part of) the payload entity X from Service A. Service B uses this entity and returns it to its consumers as (part of) its payload. This is all nice and correct according to its service contract (WSDL).
    Later in time, Service A is updated to version 1.1 and the newer version of the CDM is used in this updated version. In the newer CDM version, entity X has also been updated to X’. Now this X’ entity is passed from Service A to Service B. Service B returns this new entity X’ to its consumers, while they expect the original X entity. So service B returns an invalid response and breaks its own contract!
    You can imagine what happens when there is a chain of services and probably there are more consumers of Service A. Such an update can spread out through the entire integration layer (SOA environment) like ripples on water!
You don’t want to update all the services in the chains affected by such a little update.
I’m aware a service should not do this. Theoretically a service is fully responsible for always complying with its own contract (WSDL), but this is very difficult to enforce when developing lots of services. When there is a mapping in a service, this is quite clear, but then all mappings should be checked. Moreover, an XML entity is often used as a variable (e.g. in BPEL) in some processing code and can be passed on to a caller unnoticed.
    The only solution is to avoid passing complete entities (container elements), so, when passing through, all data fields (data elements) have to be mapped individually (in a so called transformation) for all incoming and outgoing data of the service.
    The problem is that you cannot enforce software to do this, so this must become a rule, a standard, for software developers.
Everyone who has been in software development for some years knows this is not going to work. There will always be a software developer (now, or maybe later during maintenance) who does not know or understand this standard.
The best way to prevent this problem is to give each service its own namespace, so entities (container elements) cannot be copied and passed through in their entirety and developers have to map the data elements individually.

This is why, for the approach of a design time only CDM, I advise to also use the “Venetian Blind” pattern, but now with the schema attribute “elementFormDefault” set to “qualified”. This results in a CDM with these properties:

    • it is easy to copy the elements that are needed, including child elements and necessary types, from the design time CDM to the runtime constituents of the software component being developed. Do not forget to apply the component specific target namespace to this copy.
    • it is possible to reuse type definitions within the CDM itself, preventing multiple definitions of the same entity.

In my next blogpost, part III, about runtime dependencies and interface tailoring, I explain why in most cases you should go for a design time CDM and not a central runtime CDM.

    The post Development and Runtime Experiences with a Canonical Data Model Part II: XML Namespace Standards appeared first on AMIS Oracle and Java Blog.

    Development and Runtime Experiences with a Canonical Data Model Part III: Dependency Management & Interface Tailoring

    Wed, 2017-03-29 12:20
    Introduction

This blogpost is part III, the last part of a trilogy on how to create and use a Canonical Data Model (CDM). The first blogpost contains part I, in which I share my experiences in developing a CDM and provide you with lots of standards and guidelines for creating a CDM. The second part is all about XML namespace standards. This part is about the usage of a CDM in the integration layer: how to use it in a run time environment and what the consequences are for the development of the services which are part of the integration layer.

    Dependency Management & Interface Tailoring

When you’ve decided to use a CDM, it’s quite tempting to put the XSD files that make up the CDM in a central place in the run time environment where all the services can reference them. In this way there is only one model, one ‘truth’, for all the services. However, there are a few problems you run into quite fast when using such a central run time CDM.

    Dependency Management

    Backwards compatibility
The first challenge is to maintain backwards compatibility. This means that when there is a change in the CDM, this change is implemented in such a way that the CDM supports both the ‘old’ data format, according to the CDM before the change, and the new data format with the change. When you’re in the development stage of the project, the CDM will change quite frequently, in large projects even on a daily basis. When these changes are backwards compatible, the services which have already been developed and are considered finished do not need to change (unless of course the change also involves a functional change of a finished service). Otherwise, when these changes are not backwards compatible, all software components, so all services, which have been finished have to be investigated to see whether they are hit by the change. Since all services use the same set of central XSD definitions, many will be hit by a change in these definitions.
If you’re lucky you have nice unit tests or other code analysis tools you can use to detect this. You may ask yourself whether these tests and/or tools will cover 100% of the hits. When services are hit, they have to be modified, tested and released again. To reduce maintenance and rework of finished services, there will be pressure to maintain backwards compatibility as much as possible.
    Maintaining backwards compatibility in practice means

• that all elements that are added to the CDM have to be optional;
• that you can increase the maximum occurrence of an element, but never reduce it;
• that you can make mandatory elements optional, but not vice versa;
• and that structure changes are much more difficult.

Take for example the case where a data element has to be split up into multiple elements. Let’s take a product id element of type string and split it up into a container element that is able to contain multiple product identifications for the same product. The identification container element will have child elements for product id, product id type and an optional owner id for the ‘owner’ of the identification (e.g. a customer may have his own product identification). One way of applying this change and still maintaining backwards compatibility is by using an XML choice construction:

    <complexType name="tProduct">
      <sequence>
        <choice minOccurs="0" maxOccurs="1">
          <element name="Id" type="string" />
          <element name="Identifications">
            <complexType>
              <sequence>
                <element name="Identification" minOccurs="0" maxOccurs="unbounded">
                  <complexType>
                    <sequence>
                      <element name="Id" type="string" />
                      <element name="IdType" type="string" />
                      <element name="IdOwner" type="string" minOccurs="0"/>
                    </sequence>
                  </complexType>
                </element>
              </sequence>
            </complexType>
          </element>
        </choice>
        <element name="Name" type="string" />
        ...
      </sequence>
    </complexType>

There are other ways to implement this change and remain backwards compatible, but they will all result in a redundant and verbose data model. As you can imagine, this soon results in a very ugly CDM, which is hard to read and understand.

    Hidden functional bugs
There is another danger. When keeping backward compatibility in this way, the services which were finished don’t break technically and still run. But they might break functionally! This is even more dangerous, because it may not be visible immediately and it can take quite a long time before this hidden functional bug is discovered. Perhaps the service already runs in a production environment and executes with unnoticed functional bugs!
Take the example above and consider that a service has already been developed which does something with orders. Besides order handling, it also sends the product ids in an order to a CRM system, but only for the product ids in the range 1000-2000. The check in the service on the product id being in the range 1000-2000 will be based upon the original product id field. But what happens if the CDM is changed as described in the previous paragraph, so the original product id field becomes part of a choice and thus optional? This unchanged service might now handle orders that contain products with the newer data definition for a product, in which the new “Identification” element is used instead of the old “Id” element. If you’re lucky, the check on the range fails with a run time exception! Lucky, because you’re immediately notified of this functional flaw. It will probably be detected quite early in a test environment when it concerns common functionality. But what if it is rare functionality? Then the danger is that it might not be detected and you end up with a run time exception in a production environment. That is not what you want, but at least it is detected!
The real problem is that there is a realistic chance that the check doesn’t throw an exception and doesn’t log an error or warning. It might conclude that the product id is not in the range 1000-2000, because the product id field is not there, while the product identification is in that range! It just uses the new way of modeling the product identification with the new “Identification” element. This results in a service that has a functional bug while it seems to run correctly!

    Backward compatibility in time
Sometimes you have no choice and you have to make changes which are not backwards compatible. This can cause another problem: you’re not backwards compatible in time. You might be developing newer versions of services. But what if in production there is a problem with one of these new services using the new CDM and you want to go back to a previous version of that service? You have to go back to the old version of the CDM as well, because the old version of the service is not compatible with the new CDM. But that also means that none of the newer services can run, because they depend on the new CDM. So you have to revert to the old versions of all the new services using the new CDM!

The root cause of these problems is that all software components (services) are dependent on the central run time CDM!
So this central run time CDM introduces dependencies between all (versions of) components. This heavily conflicts with one of the basic principles of SOA: loosely coupled, independent services.

     

    Interface Tailoring

    There is another problem with a central CDM which has more to do with programming concepts, but also impacts the usage of services resulting in a slower development cycle. The interface of a service which is described in its contract (WSDL) should reflect the functionality of a service. However, if you’re using a central CDM, the CDM is used by all the services. This means that the entities in the CDM contain all the data elements which are needed in the contracts of all the services. So basically a CDM entity consists of a ‘merge’ of all these data elements. The result is that the entities will be quite large, detailed and extensive. The services use these CDM entities in their contracts, while functionally only a (small) part of the elements are used in a single service.

    This makes the interface of a service very unclear, ambiguous and meaningless.

    Another side effect is that it makes no sense to validate (request) messages, because all elements will be optional.

Take for example a simple service that returns the street and city based upon the postal code and house number (this is very common functionality in The Netherlands). The interface would be nice, clear and almost self-describing if the service contract dictated that the input (request) is only a postal code and house number and the output (response) only contains the street name and the city. But with a central CDM, the input will be an entity of type address, and so will the output. With some bad luck, the address entity also contains all kinds of elements for foreign addresses, post office boxes, etc. I’ve seen exactly this example in a real project with an address entity containing more than 30 child elements, while the service only needed four of them: two elements, postal code and house number, as input and two elements, street and city, as output. You might consider solving this by defining these separate elements as input and output and not using the entity element, but that’s not the idea of a central CDM! Take notice that this is just a small example. I’ve seen this problem in a project with lawsuit entities. You can imagine how large such an entity can become, with hundreds of elements. Services individually only used some of the elements of the lawsuit entity, but these elements were scattered across the entire entity, so it does not help either to split up the type definition of a lawsuit entity into several sub types. In that project almost all the services needed one or more lawsuit entities, resulting in interface contracts (WSDL) which were all very generic and didn’t make sense. You needed the (up to date) documentation of the service in order to know which elements you had to use in the input and which elements were returned as output, because the definitions of the request and response messages were not useful as they contained complete entities.

    Solution

    The solution to both of the problems described above, is not to use a central run time CDM, but only a design time CDM.
    This design time CDM has no namespace (or a dummy one). When a service is developed, a hard copy is made of (a part of) the CDM at that moment to a (source) location specific for that service. Then a service specific namespace has to be applied to this local copy of the (service specific) CDM.
And now you can shape this local copy of the CDM to your needs! Tailor it by removing elements that the service contract (WSDL) doesn’t need. You can also apply more restrictions to the remaining elements by making optional elements mandatory, reducing the maximum occurrences of an element and even creating data value restrictions for an element (e.g. setting a maximum string length). By doing this, you can tailor the interface in such a way that it reflects the functionality of the service!
    You can even have two different versions of an entity in this copy of the CDM. For example one to use in the input message and one in the output message.
Let’s demonstrate this with the example above: an address with only postal code and house number for the input message and an address with street and city for the output message. The design time CDM contains the full address entity, while the local, tailored copy of the service CDM contains two tailored address entities. This copy can be used by the service XSD which contains the message definitions of the request and response payloads:

CDM XSD and Service XSD

Here is the source code, if you are interested:

    <schema targetNamespace="DUMMY_NAMESPACE"
                xmlns="http://www.w3.org/2001/XMLSchema" 
                version="1.0">
    
       <complexType name="TAddress">
          <sequence>
             <element name="Department" type="string" minOccurs="0"/>
             <element name="Street" type="string" minOccurs="0"/>
             <element name="Number" type="string" minOccurs="0"/>
             <element name="PostalCode" type="string" minOccurs="0"/>
             <element name="City" type="string" minOccurs="0"/>
             <element name="County" type="string" minOccurs="0"/>
             <element name="State" type="string" minOccurs="0"/>
             <element name="Country" type="string" minOccurs="0"/>
          </sequence>
       </complexType>
       
    </schema>
    <schema targetNamespace="http://nl.amis.AddressServiceCDM"
                xmlns="http://www.w3.org/2001/XMLSchema" 
                version="1.0">
    
       <complexType name="TAddressInput">
          <sequence>
             <element name="Number" type="string" minOccurs="0"/>
             <element name="PostalCode" type="string" minOccurs="1"/>
          </sequence>
       </complexType>
    
       <complexType name="TAddressOutput">
          <sequence>
             <element name="Street" type="string" minOccurs="1"/>
             <element name="City" type="string" minOccurs="1"/>
          </sequence>
       </complexType>
       
    </schema>
    <schema targetNamespace="http://nl.amis.AddressService"
            xmlns="http://www.w3.org/2001/XMLSchema" 
            xmlns:cdm="http://nl.amis.AddressServiceCDM" 
            version="1.0">
    
       <import namespace="http://nl.amis.AddressServiceCDM" schemaLocation="AddressServiceCDM.xsd"/>
    
       <element name="getAddressRequest">
    	   <complexType>
    		  <sequence>
    			 <element name="Address" type="cdm:TAddressInput" minOccurs="1"/>
    		  </sequence>
    	   </complexType>
       </element>
    
       <element name="getAddressResponse">
    	   <complexType>
    		  <sequence>
    			 <element name="Address" type="cdm:TAddressOutput" minOccurs="1"/>
    		  </sequence>
    	   </complexType>
       </element>
       
    </schema>

When you’re finished tailoring, you can still deploy these service interfaces (WSDL) containing the shaped data definitions (XSDs) to a central run time location. However, each service must have its own place within this run time location to store its tailored data definitions (XSDs). When you do this, you can store the service interface (abstract WSDL) there as well. In this way there is only one copy of a service interface, which is used by the implementing service as well as by consuming services.
    I’ve worked in a project with SOAP services where the conventions dictated that the filename of a WSDL is the same as the name of the service. The message payloads were not defined in this WSDL, but were included from an external XSD file. This XSD also had the same filename as the service name. This service XSD defined the payload of the messages, but it did not contain CDM entities or CDM type definitions. They were included from another XSD with the fixed name CDM.xsd. This local, service specific, CDM.xsd contained the tailored (stripped and restricted) copy of the central design time CDM, but had the same target namespace as the service.wsdl and the service.xsd:
    Service Files
This approach also gave the opportunity to add operation specific elements to the message definitions in the service.xsd. These operation specific elements were not part of the central CDM and did not belong there due to their nature (operation specific). They were rarely needed, but when needed, they did not pollute the CDM, because you don’t have to somehow add them to the CDM. Think of switches and options on operations which act on functionality, e.g. a boolean type element “includeProductDescription” in the request message of the operation “getOrder”.
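
A sketch of such an operation specific switch in the service.xsd (the element names follow the getOrder example and are assumptions):

<element name="getOrderRequest">
   <complexType>
      <sequence>
         <element name="OrderId" type="string"/>
         <!-- Operation specific switch: deliberately not part of the CDM -->
         <element name="includeProductDescription" type="boolean" minOccurs="0"/>
      </sequence>
   </complexType>
</element>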

Note: the services in the project did all use a little generic XML of which the definitions (XSD) were stored in a central run time location. However, these definitions concern technical data fields and are therefore not part of the CDM. Examples are header fields that are used for security, a generic response entity containing messages (error, warning, info) and optional paging information elements in case a response contains a collection. You need such a central type definition when you use generic functionality (e.g. from a software library) in all services and consuming software.

    Conclusion
    With this approach of a design time CDM and tailored interfaces:

    • There are no run time dependencies on the CDM and thus no dependencies between (versions of) services
• Contract breach and hidden functional bugs are prevented. (Because of the different namespaces, services have to copy each data element individually when passing an entity, or part of an entity, to their output.)
    • Service interfaces reflect the service functionality
    • Method specific parameters can be added without polluting the CDM
    • And – most important – the CDM can change without limitations and as often as you want to!

The result is that the CDM will in time grow into a nice, clean and mature model that reflects the business data model of the organization – while not impeding, and even promoting, the agility of service development. And that is exactly what you want from a CDM!

     

    When to use a central run time CDM

A final remark about a central run time CDM. There are situations where this can be a good solution: for smaller integration projects, and in cases where all the systems and data sources which are to be connected to the integration layer are already in place, so they are not being developed. They have probably already been running in production for a while.
This means that the data and the data format which have to be passed through the integration layer and are used in the services are already fixed. You could state that the CDM is already there, although it still has to be described and documented in a data model. It’s likely that it’s also a project with a ‘one go’ to production, instead of frequent delivery cycles.
When after a while one system is replaced by another system, or the integration layer is extended by connecting one or more new systems, and this results in the CDM having to change, you can add versioning to the CDM. Create a copy of the existing CDM, give it a new version (e.g. with a version number in the namespace) and make the changes in the CDM which are needed. This is also a good opportunity to clean up the CDM by removing unwanted legacy that was only kept for backwards compatibility. Use this newer version of the CDM for all new development and maintenance of services.
Again, only use this central run time CDM for smaller projects and when it is a ‘one go’ to production (e.g. replacement of one system). As soon as the project becomes larger and/or integration of systems keeps on going, switch over to the design time CDM approach.
You can easily switch over by starting to develop the new services with the design time CDM approach and keeping the ‘old’ services running with the central run time CDM. As soon as there is a change in an ‘old’ service, refactor it to the new approach of the design time CDM. In time there will be no more services using the run time CDM, so it can be removed.

After reading this blogpost, together with the previous two blogposts which make up the trilogy about my experiences with a Canonical Data Model, you should have a good understanding of how to set up a CDM and use it in your projects. Hopefully it helps you in making valuable decisions about creating and using a CDM, and your projects will benefit from it.

    The post Development and Runtime Experiences with a Canonical Data Model Part III: Dependency Management & Interface Tailoring appeared first on AMIS Oracle and Java Blog.

    Single-Sign-On to Oracle ERP Cloud

    Tue, 2017-03-21 10:04

More and more enterprises are using Single-Sign-On (SSO) for their on-premise applications today, but what if they want to use SSO for their cloud applications as well?

This blog post addresses this topic for Single-Sign-On to Oracle ERP Cloud in a hybrid environment.

First of all, let’s focus on SSO on-premise and introduce some terminology.

    A user (aka principal) wants to have access to a particular service. This service can be found at the Service Provider (SP). The provided services are secured so the user needs to authenticate itself first. The Identity Provider (IdP) is able to validate (assert) the authenticity of the user by asking, for instance, the username and password (or using other methods).

So for authentication we always have three different roles: User, Service Provider (SP) and Identity Provider (IdP).

    For Single-Sign-On we should have a centralized IdP and we should have a standardized way to assert the authentication information.

    In an on-premise landscape there is plenty of choice for an IdP. Some common used ones are: Microsoft Active Directory (AD) (closed source), Oracle Identity & Access Management (closed source) and Shibboleth (open source). For now we assume we are using AD.

    Kerberos

The most used standard for doing SSO is Kerberos. In that case a Kerberos ticket is asserted by the IdP and is used towards all the Service Providers to be able to log in.

This Kerberos method is suited for an on-premise landscape and is also suited when the connection to a private cloud runs via a VPN (the two are effectively part of the internal network and everything should work fine for the cloud as well). But if we want to integrate a public cloud such as Oracle Fusion Cloud, things get messy.

Arguably the reason Kerberos isn’t used over the public Internet doesn’t have to do with the security of the protocol itself, but rather with the fact that it’s an authentication model that doesn’t fit the needs of most “public Internet” applications. Kerberos is a heavy protocol and cannot be used in scenarios where users want to connect to services from unknown clients, as in a typical Internet or public cloud scenario, where the authentication provider typically has no knowledge of the user’s client system.

    The main standards to be able to facilitate SSO for the internet are:

    • OpenID
    • OAuth
    • Security Assertion Markup Language (SAML)

SAML 2.0

Oracle Fusion Cloud is based on SAML 2.0, so let’s go on with this standard for now.
Conceptually the SAML handshake looks like Kerberos; you can also see the different roles of User, SP and IdP and the assertion of an SSO ticket.

Identity Federation

But how can we integrate Kerberos with SAML?
Now a different concept comes in: Identity Federation. This means linking and using the identity of a user across several security domains (on-premise and public cloud). In simpler terms, an SP does not necessarily need to obtain and store the user credentials in order to authenticate users. Instead, the SP can use an IdP that already stores these. In our case the IdP is on-premise (Active Directory for example) and the SP is the Oracle Fusion Cloud application.
Now there are two things to be done:
• The on-premise Kerberos ticket should be translated to SAML, because we want SSO.
• There has to be trust between the IdP and the SP. Only trusted security domains can access the SP. Trust configuration should be done on both sides (on-premise and cloud).
Translation of the Kerberos ticket is performed by a Security Token Service (STS). This is the broker that sits between an SP and the user. An STS is an issuer of security tokens; “issuer” is often a synonym of an STS. STSs can have different roles: as IdP when they authenticate users, or as Federation Provider (FP) when they sit in the middle of a trust chain and act as “relying parties” for other IdPs.
In our case the STS translates Kerberos to SAML; Microsoft Active Directory Federation Server (ADFS) and Oracle Identity Federation Server (part of the Oracle Identity Governance Suite) are examples of products that can do this.
So the picture looks like this now:

Trust

But how is the trust achieved?
Trust is just metadata about the SP and the IdP. So the metadata from the IdP should be uploaded into Oracle ERP Cloud and vice versa. When you create metadata for the IdP, the IdP entity is added to a circle of trust. A circle of trust is used to group SPs and IdPs in a secure, trusted environment. Other remote provider entities can be added to the circle of trust.
Metadata is defined in XML. An SP uses the metadata to know how to communicate with the IdP and vice versa. Metadata defines things like which services are available, addresses and certificates:
• Location of its SSO service.
• An ID identifying the provider.
• Signature of the metadata and public keys for verifying and encrypting further communication.
• Information about whether the IdP wants the communication signed or encrypted.
There is no prescribed protocol for how the exchange is done, but there is no secret information in the metadata, so the XML can be freely distributed by mail or published in clear text on the Internet.
It is however highly recommended to protect the metadata from unauthorized modification; tampered metadata could be a good starting point for a man-in-the-middle attack.
The integrity of the metadata can be protected using, for example, digital signatures or by transporting the metadata over a secure channel.
Metadata can contain lots of other information. For a full description have a look at the SAML specifications: http://saml.xml.org/saml-specifications
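
To give an impression of what such metadata looks like, here is a heavily trimmed sketch of SP metadata; the entity ID, the endpoint URL and the certificate value are placeholders, not actual Oracle ERP Cloud values.

<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
                     entityID="https://erp.example.com/sp">
   <md:SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
      <md:KeyDescriptor use="signing">
         <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
            <ds:X509Data>
               <!-- Public certificate used to verify the SP's signatures -->
               <ds:X509Certificate>MIIC...placeholder...</ds:X509Certificate>
            </ds:X509Data>
         </ds:KeyInfo>
      </md:KeyDescriptor>
      <!-- Where the IdP must send the SAML response with the assertion -->
      <md:AssertionConsumerService
          Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
          Location="https://erp.example.com/sp/acs" index="0"/>
   </md:SPSSODescriptor>
</md:EntityDescriptor>
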
Oracle ERP Cloud Security

Application Security in Oracle ERP Cloud consists of two parts:
1. Oracle Identity Management (OIM) running in the cloud (Oracle Identity Federation is part of this).
2. Authorization Policy Manager (APM).
Oracle Identity Management is responsible for user account management. Authorization Policy Manager is responsible for the fine-grained SaaS role mapping (aka Entitlements).
    See this blog post from Oracle how this works: http://www.ateam-oracle.com/introduction-to-fusion-applications-roles-concepts/
    Remark: the application security in Oracle ERP Cloud will change with R12 and will benefit from the following new capabilities:
    • Separation between OIM and APM is no longer available. A new simplified Security Console will contain both.

    • Configuration of SSO integration (with IdP) is simplified and can be performed from a single screen.
    • REST API’s based on SCIM (System for Cross-Domain Identity Management) 2.0 are available for Identity Synchronization with IdP.
Another remark: Oracle Identity Cloud Service was released in Q1 2017. Integration with Oracle ERP Cloud is not yet possible, because the Identity Federation functionality has not been implemented yet. The release date for this isn’t clear, so we have to deal with the functionality presented above.

Configuring SSO for Oracle ERP Cloud

For SSO the following aspects should be taken into account:
    • Users and Entitlements
    • Initial upload of identities and users
    • Identity and user Synchronization
    • Exchange of SP and IdP metadata
    Users and Entitlements

    Before going into this I must explain the difference between users and employees.

    • When talking about users we mean the user login account. As explained before these accounts are the domain of IAM.
• Users have access rights based on Role Based Access Control (RBAC). IAM handles this as well.
• Users have entitlements to use particular ERP functionality. This is handled in APM.
• When talking about employees we mean the business employee with its associated business job. This is the domain of Oracle HCM Cloud (even when you don’t have a full-use HCM license). An employee can access Oracle ERP Cloud when he or she has a user account in IAM and the proper entitlements in APM.
    Initial user upload

To establish SSO between the customer’s on-premises environment and the Oracle ERP Cloud environment, the customer must specify which identity attribute (aka GUID) – user name or email address – will be unique across all users in the customer’s organization. The SAML token should pass this attribute so the SP can determine which user is asserted (remember the first picture in this blog post).
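
As a sketch (values are placeholders), the subject of the SAML assertion would then carry this agreed attribute, for example the e-mail address of the user:

<saml:Subject xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
   <!-- The unique identity attribute (GUID) agreed upon: here the e-mail address -->
   <saml:NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress">jane.doe@example.com</saml:NameID>
</saml:Subject>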

But before this can work, the SP must have all users loaded. This is an initial step in the Oracle ERP Cloud on-boarding process.
    Currently (Oracle ERP Cloud R11) the following options are available:
    • If running Oracle HCM Public Cloud, you may need to use HR2HR Integration
    • If running Non-HCM Public Cloud, use Spreadsheet Upload [Document Note 1454722.1] or if you are running CRM Public Cloud, use the CRM upload utility for HCM employees. You could also manually enter the employee.

    Do the following tasks to load the initial user data into Oracle ERP Cloud:

    1. Extract user data from your local LDAP directory service to a local file by using the tools provided by your LDAP directory service vendor.
    2. Convert the data in the file into a format that is delivered and supported by Oracle ERP Cloud.
    3. Load the user data into Oracle ERP Cloud by using one of the supported data loading methods.

    Data loaders in Oracle ERP Cloud import data in the CSV format. Therefore, you must convert user data extracted from your local LDAP directory into the CSV format. Ensure that the mandatory attributes are non-empty and present.

    From Oracle ERP Cloud R12 the initial load can also be performed by using the SCIM 2.0 REST API’s. For details about this see: https://docs.oracle.com/cd/E52734_01/oim/OMDEV/scim.htm#OMDEV5526

    Identity and user Synchronization

The IdP should always hold the truth about the users and business roles. So there should be something in place to push these to Oracle ERP Cloud.

    For R12 the SCIM REST API’s are the best way to do that. For R11 it’s a lot more complicated as explained below.

    Now the concept of employee and job comes in again. As explained earlier in this blog post this is the domain of Oracle HCM Cloud (which is also part of Oracle ERP Cloud).

Oracle HCM Cloud has REST APIs to read Employee and Job data from, and push it to, Oracle HCM Cloud:

    • GET /hcmCoreApi/resources/11.12.1.0/emps
    • POST /hcmCoreApi/resources/11.12.1.0/emps
    • PATCH /hcmCoreApi/resources/11.12.1.0/emps/{empsUniqID}
    • GET /hcmCoreSetupApi/resources/11.12.1.0/jobs
    • GET /hcmCoreSetupApi/resources/11.12.1.0/jobs/{jobsUniqID}

For more details about these REST APIs (which are also available in R11) see: https://docs.oracle.com/cloud/latest/globalcs_gs/FARWS/Global_HR_REST_APIs_and_Atom_Feeds_R12.html

But how can we provision IAM/APM? For that, Oracle HCM Cloud has standard provisioning jobs:

    • Send Pending LDAP Requests: Sends bulk requests and future-dated requests that are now active to OIM. The response to each request from OIM to Oracle Fusion HCM indicates transaction status (for example, Completed).
    • Retrieve Latest LDAP Changes: Requests updates from OIM that may not have arrived automatically because of a failure or error, for example.

    For details see: http://docs.oracle.com/cloud/farel8/common/OCHUS/F1210304AN1EB1F.htm

Now the problem could arise that an administrator has changed user permissions in ERP Cloud (HCM or IAM/APM) which are not reflected in the IdP (which should always reflect the truth), so these are out of sync.

To solve this, the IdP should first read all employee and job data from Oracle HCM Cloud and, based on that, create the delta with its own administration. This delta is pushed to Oracle HCM Cloud so all manual changes are removed. This synchronization job should be performed at least daily.

    The whole solution for Identity and user synchronization for R11 could look like this:

     

Exchange metadata for SSO

In R11 of Oracle ERP Cloud the exchange of SAML metadata for SSO is a manual process. In R12 there is a screen to do this, so for R12 you can skip the rest of this blog.
For R11, generation of the SP metadata.xml (to set up the federation at the IdP) and upload of your IdP metadata.xml into the SP are performed by the Oracle Cloud Operations team. To start the integration process you should create a Service Request and provide the following information:
    • Which type Federation Server is used on-premise.
    • Which SaaS application you want to integrate.
    • How many users will be enabled.
    • URL’s for IdP production and IdP non-production.
    • Technical contacts.

    The following should also be taken into account (at both sites):

    • The Assertion Consumer Service URL of the SP, where the user will be redirected from the IdP with SAML Assertion.
    • The Signing Certificate corresponding to the private key used by the SP to sign the SAML Messages.
    • The Encryption Certificate corresponding to the private key used by the SP to decrypt the SAML Assertion, if SAML encryption is to be used.
    • The Logout service endpoint.
The Oracle Cloud Operations team delivers a document describing how to configure the on-premises IdP (Microsoft Active Directory Federation Server (ADFS) 2.0 or Oracle Identity Federation Server 11g).
Be aware that the Oracle Cloud Operations team needs at least two weeks to do the configuration in the Oracle SSO Cloud.
    For detailed information about this see Oracle Support Document: Co-Existence and SSO: The SSO Enablement Process for Public Cloud Customers (Doc ID 1477245.1).

    The post Single-Sign-On to Oracle ERP Cloud appeared first on AMIS Oracle and Java Blog.

    Oracle SOA Suite: Find that composite instance!

    Mon, 2017-03-20 06:14

BPM and BPEL processes are usually executed in the context of a specific entity. Sometimes you want to find the instances involved with a specific entity. There are different ways to make this easy. You can, for example, use composite instance titles or sensors and set them to a unique identifier of your entity. If they have not been used, you can check the audit trail. However, manually checking the audit trail, especially if there are many instances, can be cumbersome. Also, if different teams use different standards or standards have evolved over time, there might not be a single way to look for your entity identifier in composite instances. You want to automate this.

    It is of course possible to write Java or WLST code and use the API to gather all relevant information. It would however require fetching large amounts of data from the SOAINFRA database to analyse. Fetching all that data into WLST or Java and combining it, would not be fast. I’ve created a database package / query which performs this feat directly on the 11g SOAINFRA database (and most likely with little alteration on 12c).

    How does it work

    The checks which are performed in order (the first result found is returned):

    • Check the composite instance title
    • Check the sensor values
    • Check the composite audit trail
    • Check the composite audit details
    • Check the BPM audit trail
    • Check the Mediator audit trail
    • Do the above checks for every composite sharing the same ECID.

It first looks for instance titles conforming to a specific syntax (with a regular expression), next it looks for sensor values of sensors with a specific name. After that it starts to look in the audit trail and, if even that fails, it looks in the audit details, where messages are stored when they become larger than a set value (look for Audit Trail threshold). Next the BPM and Mediator specific audit tables are looked at and, as a last resort, it uses the ECID to find other composite instances in the same flow which might provide the required information; it then performs the same checks as mentioned above on those composite instances. Using this method I could find a corresponding entity identifier for almost any composite instance in my environment. The package/query has been tested on 11g but not on 12c. You should of course check to see if it fits your personal requirements. The code is mostly easy to read, save for the audit parsing details. For parsing the audit trail and details tables, I’ve used the following blog. The data is saved in a file which can be imported into Excel, and the query can be scheduled on Linux with the provided sh script.

    Getting the script to work for your case

    You can download the script here. Several minor changes are required to make the script suitable for a specific use case.

• In the example script getcomposites_run.sql the identification regular expression AA\d\d\.\d+ is used. You should of course replace this with a regular expression reflecting the format of your entity identification.
    • In the example script getcomposites_run.sql sensors which have AAIDENTIFICATION in the name will be looked at. This should be changed to reflect the names used by your sensors.
    • The getcomposites.sh contains a connect string: connect soainfra_username/soainfra_password. You should change this to your credentials.
    • The getcomposites.sh script can be scheduled. In the example script, it is scheduled to run at 12:30:00. If you do not need it, you can remove the scheduling. It can come in handy when you want to run it outside of office hours because the script most likely will impact performance.
    • The selection in getcomposites_run.sql only looks at running composites. Depending on your usecase, you might want to change this to take all composites into consideration.
• The script has not been updated for 12c. If you happen to create a 12c version of this script (I think not much should have to be changed), please inform me so I can add it to the GitHub repository.
    Considerations
    • The script contains some repetition of code. This could be improved.
    • If you have much data in your SOAINFRA tables, the query will be slow. It could take hours. During this period, performance might be adversely affected.
    • That I had to create a script like this (first try this, then this, then this, etc) indicates that I encountered a situation in which there was not a single way to link composite instances to a specific identifier. If your project uses strict standards and these standards are enforced, a script like this would not be needed. For example, you set your composite instance title to reflect your main entity identifier or use specific sensors. In such a case, you do not need to fall back to parsing audit data.

    The post Oracle SOA Suite: Find that composite instance! appeared first on AMIS Oracle and Java Blog.

    Apache Kafka on the Oracle Cloud: My First experiences with Oracle Event Hub Cloud Service

    Sat, 2017-03-18 06:46

Oracle recently made their ‘Kafka on Cloud’ service available: the Event Hub Cloud Service – offered as part of the Big Data Compute Cloud Service. In this article, I will briefly show the steps I went through to get up and running with this service. To be frank, it was not entirely intuitive – so this may be handy the next time I have to do it, or perhaps when you are struggling. One obvious point of confusion is the overloaded use of the term ‘service’ – which is applied, among other things, to Kafka Topics.

    The steps are:

1. Initialize the Oracle BigData Compute CS | Oracle Event Hub Cloud Service – Platform – expose REST Proxy
    2. Create Network Access Rule to enable access to the Event Hub Platform instance from the public internet
    3. Create a Event Hub service instance on the Event Hub Platform – corresponding to a Kafka Topic
    4. Inspect Event Hub using PSM
    5. Interact with Event Hub through CURL
    6. Create application (e.g. NodeJS) to access the Event Hub Service (aka Kafka Topic): produce message, create consumer, consume messages (through the REST proxy) – discussed in a subsequent blog article
Initialize the Oracle BigData Compute CS | Oracle Event Hub Cloud Service – Platform – expose REST Proxy

     

    From the Dashboard, open the BigData Compute Cloud Service. Open the dropdown menu. Select option Oracle Event Hub Cloud Service – Platform.

     


    Provide the name of the service – the Kafka Event Hub Cluster: soaringEventsHub. Click Next


    Specify Basic deployment type (for simple explorations). Upload the SSH Public Key.

    For a trial, select just a single node (I started out with 5 nodes and it cost me a huge pile of resources).

    Enable REST Access – to be able to communicate with the Event Hub through the REST API. Specify the username and password for accessing the REST proxy. (in hindsight, I probably should have picked a different username than admin).


     

Click Next. The service will be created – with a generous memory and storage allocation.

When the service creation is complete, details are shown, including the public IP for the Kafka cluster and the ZooKeeper instance.

The REST Proxy details are shown as well.

    Create Network Access Rule to enable access to the Event Hub Platform instance from the public internet

To be perfectly honest – I am not sure anymore that this step is really required. I had trouble accessing the Event Hub service from my laptop – and I am still not able to connect to the Kafka broker from a regular Kafka client – so I tried many different things, including exposing the public IP address on the Event Hub (Kafka) node. Whether it helped or made any difference, I am not entirely sure. I will share the steps I took, and you figure out whether you need them. (I am still hoping that accessing the Event Hub from applications running inside the same identity domain on Oracle PaaS will be easier).

    I opened the Access Rule for the Event Hub service:


    I created an access rule


    And enabled it. In red is the new access rule; in the blue rectangle, the access rule that opens up the REST Proxy for the Event Hub service.


    Create an Event Hub Service on the Event Hub Platform Service, aka Create a Kafka Topic

    Open the Oracle Event Hub Cloud Service console (not the Oracle Event Hub Cloud Service – Platform!)

     


    Click on Create Service. This corresponds to creating a Kafka Topic. The naming is awfully confusing here: creating an instance of the Event Hub Service on top of an instance of the Event Hub Platform Service actually means creating a Kafka message topic. Now that is much clearer.

     


     

    Specify the name for the service – which will be part of the name of the Kafka Topic. Specify the Number of Partitions and the default Retention Period for messages published to the topic. Indicate on which instance of the Event Hub Cloud Service Platform – created in the first step in this article – the topic should be created.

    Click Next.

    Click Create.

    The Topic is created with a name that is composed from the name of the identity domain and the name specified for the Event Hub Service instance: partnercloud17-SoaringEventBus.

     


    The popup details for the “service” (aka Topic) show the main configuration settings as well as the REST endpoint for this Topic. Messages can be produced at this endpoint.

    The log lists all Service activity (again, Service here is at the same level as a Kafka Topic).


    The Service Console now lists these instances:


    Inspect Event Hub using PSM

    The PSM (PaaS Service Manager) command line interface for the Oracle Cloud (read here on how to install it and get going) can be used to inspect the state of the Event Hub Service. It cannot be used to produce and consume messages though.

    Using PSM for Event Hub is described in the documentation:  http://docs.oracle.com/en/cloud/paas/java-cloud/pscli/oehcs-commands.html

    To list all services (topics):

    psm oehcs services

    To show the details for a single service:

    psm oehcs service --service-name SoaringEventBus

    To create a new Topic:

    psm oehcs create-service

    To update the retention time for a topic:

    psm oehcs update-service

     

    Interact with Event Hub through CURL

      One way to produce messages to the topic and/or consume messages from a topic is by using cURL.

      It took me some time to figure out the correct syntax for each of these operations; it does not seem abundantly well documented or explained at present. Anyway, here it goes. (Note: the -k option tells cURL to accept an unknown certificate.)

      * To produce a message to the topic on Event Hub, use:

      curl -X POST -k -u username:password -H "Content-Type: application/vnd.kafka.json.v1+json" -d "{ \"records\": [ { \"key\": \"myKey\", \"value\": \"mySpecialValue\"}] }" https://public_ip_REST-PROXY:1080/restproxy/topics/NAME_OF_EVENT_HUB_SERVICE

      * Create a consumer group: in order to consume messages, you first need to create a consumer group with a consumer instance. Subsequently, you can consume messages through that consumer:

      curl -X POST -k -u username:password -H "Content-Type: application/vnd.kafka.json.v1+json" -d "{ \"name\": \"soaring-eventbus-consumer\", \"format\": \"json\" }" https://public_ip_REST-PROXY:1080/restproxy/consumers/soaring-eventbus-consumer-group

      * To consume messages – leveraging the consumer group soaring-eventbus-consumer-group and the consumer instance soaring-eventbus-consumer:
      curl -X POST -k -u username:password -H "Accept: application/vnd.kafka.json.v1+json" https://public_ip_REST-PROXY:1080/restproxy/consumers/soaring-eventbus-consumer-group/instances/soaring-eventbus-consumer/topics/partnercloud17-SoaringEventBus

       

      Here are three commands – create a consumer group, produce a message and consume (all) available messages – all against the same topic. A Node.js sketch of the produce call is included below.

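      For completeness, here is a minimal Node.js sketch of the same produce call, using only the built-in https module. The host, port, credentials and topic name are the placeholders from the cURL examples above (not real values) and need to be replaced with your own; the payload is the "records" structure shown there.

      // Minimal sketch: placeholders for host, credentials and topic taken from the cURL examples above
      const https = require('https');

      const options = {
        host: 'public_ip_REST-PROXY',   // replace with the public IP of the REST proxy
        port: 1080,
        path: '/restproxy/topics/partnercloud17-SoaringEventBus',
        method: 'POST',
        auth: 'username:password',      // the REST proxy credentials chosen when provisioning the platform
        headers: { 'Content-Type': 'application/vnd.kafka.json.v1+json' },
        rejectUnauthorized: false       // same effect as curl -k: accept the unknown certificate
      };

      const request = https.request(options, (response) => {
        let body = '';
        response.on('data', (chunk) => body += chunk);
        response.on('end', () => console.log(response.statusCode, body));
      });

      request.on('error', (err) => console.error('Produce failed:', err));
      request.write(JSON.stringify({ records: [{ key: 'myKey', value: 'mySpecialValue' }] }));
      request.end();

      The consumer group creation and the consume call follow the same pattern, with the paths and headers from the cURL examples above.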

      The post Apache Kafka on the Oracle Cloud: My First experiences with Oracle Event Hub Cloud Service appeared first on AMIS Oracle and Java Blog.

      Introducing NoSQL and MongoDB to Relational Database professionals

      Wed, 2017-03-15 06:25

      Most enterprises have a lot of variety in the data they deal with. Some data is highly structured and other data is very unstructured, some data is bound by strict integrity rules and quality constraints and other data is free of any restrictions, some data is “hot” – currently very much in demand – and other data can be stone cold. Some data needs to be extremely accurate, down to a prescribed number of fractional digits, and other data is only approximate. Some is highly confidential and other publicly accessible. Some is around in small quantities and other in huge volumes.

      Over the years many IT professionals and companies have come to the realization that all this differentiation in data justifies or even mandates a differentiation in how the data is stored and processed. It does not make sense to treat the hottest transactional data in the same way as archived records from 30 years ago. Yet many organizations have been doing exactly that: store it all in the enterprise relational database. It works, keeps all data accessible for those rare occasions when that really old data is required and, most importantly, keeps all data accessible in the same way – through straightforward SQL queries.

      On March 14th, we organized a SIG session at AMIS around NoSQL in general and MongoDB in particular. We presented on the history of NoSQL, how it complements relational databases and a pure SQL approach and what types of NoSQL databases are available. Subsequently we focused on MongoDB, introducing the product and its architecture and discussing how to interact with MongoDB from JavaScript/NodeJS and from Java.

      The slides presented by the speakers – Pom Bleeksma and Lucas Jellema – are shown here (from SlideShare):

       

      The hands-on workshop is completely available from GitHub: https://github.com/lucasjellema/sig-nosql-mongodb.


      An additional slide deck was discussed – demonstrating 30 queries side by side, against MongoDB and Oracle Database SQL. This slide deck covers the following MongoDB operations (a small illustrative example follows the list):

      • Filter & Sort (find, sort)

      • Aggregation ($group, $project, $match, $sort)

      • Lookup & Outer Join ($lookup, $arrayElemAt)

      • Facet Search ($facet, $bucket, $sortByCount)

      • Update (findAndModify, forEach, save, update, upsert, $set, $unset)

      • Date and Time operations

      • Materialized View ($out)

      • Nested documents/tables ($unwind, $reduce)

      • Geospatial (ensureIndex, 2dsphere, $near, $geoNear)

      • Text Search (createIndex, text, $text, $search)

      • Stored Procedures (db.system.js.save, $where)
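      To give a flavour of such a side-by-side comparison, here is a small illustrative example – not taken from the slide deck itself; the employees collection and its field names are hypothetical. It shows a Filter & Sort and an Aggregation in mongo shell statements, with roughly equivalent SQL in the comments:

      // Filter & Sort: employees in department 50, highest salary first
      // SQL: SELECT * FROM employees WHERE department_id = 50 ORDER BY salary DESC
      db.employees.find({ department_id: 50 }).sort({ salary: -1 })

      // Aggregation: average salary per department, only departments averaging above 5000
      // SQL: SELECT department_id, AVG(salary) FROM employees
      //      GROUP BY department_id HAVING AVG(salary) > 5000
      db.employees.aggregate([
        { $group: { _id: "$department_id", avg_sal: { $avg: "$salary" } } },
        { $match: { avg_sal: { $gt: 5000 } } }
      ])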

       

      The post Introducing NoSQL and MongoDB to Relational Database professionals appeared first on AMIS Oracle and Java Blog.
