Amis Blog

Friends of Oracle and Java

Download all directly and indirectly required JAR files using Maven install dependency:copy-dependencies

Thu, 2017-02-09 00:33

My challenge is simple: I am creating a small Java application – a single class with a main method – that has many direct and indirect dependencies. In order to run my simple class locally, I need to:

  • code the Java Class
  • compile the class
  • run the class

In order to compile the class, all directly referenced classes from supporting libraries should be available. To run the class, all indirectly invoked classes should also be available. That means that in addition to the .class file that is the result of compiling my Java code, I need a large number of JAR files.
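To make that concrete, here is a hypothetical sketch of such a class (not the actual App class of the project, which is not listed here); it references only one Kafka Streams class directly, yet needs a whole tree of indirectly required JAR files at run time:

package nl.amis.streams.countries;

// direct dependency: this import only compiles when the kafka-streams JAR is available
import org.apache.kafka.streams.StreamsConfig;

public class App {
    public static void main(String[] args) {
        // at run time, kafka-streams itself drags in kafka-clients, rocksdbjni,
        // jackson-databind and more: the indirect dependencies this article is about
        System.out.println("Kafka Streams config key: " + StreamsConfig.APPLICATION_ID_CONFIG);
    }
}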

Maven is a great mechanism for describing the dependencies of a project. With a few simple XML elements, I can indicate which libraries my application has a direct dependency on. The Maven pom.xml file is where these dependencies are described. Maven uses these dependencies during compilation – to make all directly dependent classes available to the compiler.

In order to help out with all runtime dependencies, Maven can also download all JAR files for the direct and even the indirect dependencies. Take, for example, the dependencies in this pom.xml file (for a Java application that will work with Kafka Streams):

 

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>nl.amis.streams.countries</groupId>
  <artifactId>Country-Events-Analyzer</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>Country-Events-Analyzer</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka-streams -->  
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-streams</artifactId>
        <version>0.10.0.0</version>    
    </dependency>
    <dependency>    
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>0.10.0.0</version>    
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.7.4</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.rocksdb</groupId>
        <artifactId>rocksdbjni</artifactId>
        <version>4.9.0</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

The number of JAR files required to eventually run the generated class is substantial. Finding all these JAR files manually is not simple: it can be hard to determine which files are required, the files may not be easy to locate, and the indirect dependencies (stemming from the JAR files that the application directly depends on) are almost impossible to determine by hand.

Using a simple Maven instruction, all JAR files are gathered and copied to a designated directory. Before the operation, the target directory of the project is still empty.

The statement to use is:

mvn install dependency:copy-dependencies

This instructs Maven to analyze the pom.xml file, find the direct dependencies, locate the associated JAR files, determine the indirect dependencies for each of these direct dependencies, and process those in the same way, recursively.

(screenshots: Maven console output while it resolves and copies the dependencies; the operation completes after a few dozen seconds)

The JAR files are downloaded to the target/dependency directory:


 

I can now run my simple application using the following command, which adds all JAR files to the classpath for the JVM:

java -cp target/Country-Events-Analyzer-1.0-SNAPSHOT.jar;target/dependency/* nl.amis.streams.countries.App

Note: on Linux, the semicolon should be a colon: java -cp target/Country-Events-Analyzer-1.0-SNAPSHOT.jar:target/dependency/* nl.amis.streams.countries.App

Note: the Maven dependencies for specific projects and libraries can be explored on MVNRepository, for example https://mvnrepository.com/artifact/org.apache.kafka/kafka-streams/0.10.0.0 for Kafka Streams.
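Note: instead of invoking the goal explicitly, copy-dependencies can also be bound to the regular build in the pom.xml, so that a plain mvn package gathers the JAR files as well. A sketch of such a plugin configuration (the plugin version and the explicit output directory are assumptions, not taken from the project above):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-dependency-plugin</artifactId>
    <version>2.10</version>
    <executions>
        <execution>
            <id>copy-dependencies</id>
            <phase>package</phase>
            <goals>
                <goal>copy-dependencies</goal>
            </goals>
            <configuration>
                <!-- target/dependency is also the default location -->
                <outputDirectory>${project.build.directory}/dependency</outputDirectory>
            </configuration>
        </execution>
    </executions>
</plugin>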

The post Download all directly and indirectly required JAR files using Maven install dependency:copy-dependencies appeared first on AMIS Oracle and Java Blog.

NodeJS – Publish messages to Apache Kafka Topic with random delays to generate sample events based on records in CSV file

Wed, 2017-02-08 23:59

In a recent article I described how to implement a simple Node.js program that reads and processes records from a delimiter-separated file. That is a stepping stone on the way to my real goal: publish a load of messages on a Kafka Topic, based on records in a file, and semi-randomly spread over time.

In this article I will use the stepping stone and extend it:

  • read all records from CSV file into a memory array
  • create a Kafka Client and Producer using Node module kafka-node
  • process one record at a time, and when done schedule the next cycle using setTimeout with a random delay
  • turn each parsed record into an object and publish the JSON stringified representation to the Kafka Topic


The steps:

1. npm init kafka-node-countries

2. npm install csv-parse --save

3. npm install kafka-node --save

4. Implement KafkaCountryProducer.js

 

/*
This program reads and parses all lines from the csv file countries2.csv into an array (countriesArray) of arrays; each nested array represents a country.
The initial file read is synchronous. The country records are kept in memory.
After the initial read is performed, a function is invoked to publish a message to Kafka for the first country in the array. This function then uses a timeout with a random delay
to schedule itself to process the next country record in the same way. Depending on how the delays pan out, this program will publish a country message to Kafka roughly every 3 seconds for about 10 minutes.
*/

var fs = require('fs');
var parse = require('csv-parse');

// Kafka configuration
var kafka = require('kafka-node');
var Producer = kafka.Producer;
// instantiate client with as connect string the host:port of the ZooKeeper for the Kafka cluster
var client = new kafka.Client("ubuntu:2181/");

// name of the topic to produce to
var countriesTopic = "countries";

var KeyedMessage = kafka.KeyedMessage;
var producer = new Producer(client);
var km = new KeyedMessage('key', 'message');
var countryProducerReady = false;

producer.on('ready', function () {
    console.log("Producer for countries is ready");
    countryProducerReady = true;
});
 
producer.on('error', function (err) {
  console.error("Problem with producing Kafka message "+err);
})


var inputFile='countries2.csv';
var averageDelay = 3000;  // in milliseconds
var spreadInDelay = 2000; // in milliseconds

var countriesArray ;

var parser = parse({delimiter: ';'}, function (err, data) {
    countriesArray = data;
    // when all countries are available, then process the first one
    // note: array element at index 0 contains the row of headers that we should skip
    handleCountry(1);
});

// read the inputFile, feed the contents to the parser
fs.createReadStream(inputFile).pipe(parser);

// handle the current country record
function handleCountry( currentCountry) {   
    // stop when all country records have been processed
    if (currentCountry >= countriesArray.length) return;
    var line = countriesArray[currentCountry];
    var country = { "name" : line[0]
                  , "code" : line[1]
                  , "continent" : line[2]
                  , "population" : line[4]
                  , "size" : line[5]
                  };
     console.log(JSON.stringify(country));
     // produce country message to Kafka
     produceCountryMessage(country)
     // schedule this function to process the next country after a random delay (averageDelay plus or minus half of spreadInDelay)
     var delay = averageDelay + (Math.random() -0.5) * spreadInDelay;
     //note: use bind to pass in the value for the input parameter currentCountry     
     setTimeout(handleCountry.bind(null, currentCountry+1), delay);             
}//handleCountry

function produceCountryMessage(country) {
    // wrap the country JSON in a KeyedMessage, keyed by the country code
    var countryKM = new KeyedMessage(country.code, JSON.stringify(country));
    var payloads = [
        { topic: countriesTopic, messages: countryKM, partition: 0 }
    ];
    if (countryProducerReady) {
        producer.send(payloads, function (err, data) {
            if (err) console.error("Failed to produce message to Kafka: " + err);
            console.log(data);
        });
    } else {
        // the error handling can be improved, for example by scheduling this message to be tried again later on
        console.error("sorry, CountryProducer is not ready yet, failed to produce message to Kafka.");
    }
}//produceCountryMessage

5. Run node KafkaCountryProducer.js
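To verify that the messages actually arrive on the topic, a quick consumer can be written with the same kafka-node module. Below is a minimal sketch (it assumes the same ZooKeeper connect string and topic name as the producer above); run it with node KafkaCountryConsumer.js:

// KafkaCountryConsumer.js - minimal sketch that prints every message on the countries topic
var kafka = require('kafka-node');
var client = new kafka.Client("ubuntu:2181/");
var consumer = new kafka.Consumer(
    client,
    [{ topic: "countries", partition: 0 }],
    { autoCommit: true }
);
consumer.on('message', function (message) {
    // message.value contains the JSON string produced by KafkaCountryProducer.js
    console.log(message.value);
});
consumer.on('error', function (err) {
    console.error("Error in countries consumer: " + err);
});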

The post NodeJS – Publish messages to Apache Kafka Topic with random delays to generate sample events based on records in CSV file appeared first on AMIS Oracle and Java Blog.

NodeJS – reading and processing a delimiter separated file (csv)

Wed, 2017-02-08 23:34

Frequently, there is a need to read data from a file, process it and route it onwards. In my case, the objective was to produce messages on a Kafka Topic. However, regardless of the objective, the basic steps of reading the file and processing its contents are required often. In this article I show the very basic steps with Node.js and the Node module csv-parse.

1. npm init process-csv

Enter a small number of details in the command line dialog.

(screenshot: the npm init dialog, with the entered details shown in blue)

2. npm install csv-parse --save

This will install the Node module csv-parse. This module provides processing of delimiter-separated files.

(screenshot: output of npm install csv-parse)

This also extends the generated file package.json with a reference to csv-parse:

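A sketch of the result (the exact version number depends on what npm installed at the time):

{
  "name": "process-csv",
  "version": "1.0.0",
  "dependencies": {
    "csv-parse": "^1.1.7"
  }
}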

3. Implement file processFile.js

The logic to read records from a csv file and do something (write to console) with each record is very straightforward. In this example, I will read data from the file countries2.csv, a file with records for all countries in the world (courtesy of https://restcountries.eu/).


The fields are semicolon-separated; each record is on a new line.
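To give an idea of the format, a hypothetical excerpt (the values are illustrative and the real file has its own set of columns; I am assuming a capital-city column at index 3, and only the columns at index 0, 1, 2, 4 and 5 are used in the code below):

Netherlands;NL;Europe;Amsterdam;16947904;41543
Canada;CA;Americas;Ottawa;36155487;9984670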

 

/*
This program reads and parses all lines from the csv file countries2.csv into an array of arrays; each nested array represents a country.
The initial file read is synchronous. The country records are kept in memory.
*/

var fs = require('fs');
var parse = require('csv-parse');

var inputFile='countries2.csv';
console.log("Processing Countries file");

var parser = parse({delimiter: ';'}, function (err, data) {
    // when all countries are available, then process them
    // note: array element at index 0 contains the row of headers, so it is skipped here
    data.slice(1).forEach(function(line) {
      // create country object out of parsed fields
      var country = { "name" : line[0]
                    , "code" : line[1]
                    , "continent" : line[2]
                    , "population" : line[4]
                    , "size" : line[5]
                    };
     console.log(JSON.stringify(country));
    });    
});

// read the inputFile, feed the contents to the parser
fs.createReadStream(inputFile).pipe(parser);

 

4. Run the file with node processFile.js:

(screenshot: the console output, one line of JSON per country)

The post NodeJS – reading and processing a delimiter separated file (csv) appeared first on AMIS Oracle and Java Blog.

Oracle Service Bus: Produce messages to a Kafka topic

Mon, 2017-02-06 04:04

Oracle Service Bus is a powerful tool that provides features like transformation, throttling and virtualization for messages coming from different sources. There is a (recently open-sourced!) Kafka transport available for Oracle Service Bus (see here). Oracle Service Bus can thus be used to do all kinds of interesting things to messages coming from Kafka topics. You can then produce the altered messages to other Kafka topics and create a decoupled processing chain. In this blog post I provide an example of how to use Oracle Service Bus to produce messages to a Kafka topic.

Messages from Service Bus to Kafka

First perform the steps as described here to setup the Service Bus with the Kafka transport. Also make sure you have a Kafka broker running.

Next, create a new Business Service (File, New, Business Service). It is not visible in the component palette since it is a custom transport; choose Kafka as the transport.


In the Type screen be sure to select Text as request message and None as response message.


Specify a Kafka bootstrap broker.


The body needs to be of type {http://schemas.xmlsoap.org/soap/envelope/}Body. If you send plain text as the body to the Kafka transport, you will get the below error message:

<Error> <oracle.osb.pipeline.kernel.router> <ubuntu> <DefaultServer> <[STUCK] ExecuteThread: '22' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <43b720fd-2b5a-4c93-073-298db3e92689-00000132> <1486368879482> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <OSB-382191> <SBProject/ProxyServicePipeline: Unhandled error caught by system-level error handler: com.bea.wli.sb.pipeline.PipelineException: OSB Assign action failed updating variable "body": [OSB-395105]The TokenIterator does not correspond to a single XmlObject value

If you send XML as the body of the message going to the transport but not an explicit SOAP body, you will get errors in the server log like below:

<Error> <oracle.osb.pipeline.kernel.router> <ubuntu> <DefaultServer> <[STUCK] ExecuteThread: '22' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <43b720fd-2b5a-4c93-a073-298db3e92689-00000132> <1486368987002> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <OSB-382191> <SBProject/ProxyServicePipeline: Unhandled error caught by system-level error handler: com.bea.wli.sb.context.BindingLayerException: Failed to set the value of context variable "body". Value must be an instance of {http://schemas.xmlsoap.org/soap/envelope}Body.

As you can see, this causes stuck threads. In order to get a {http://schemas.xmlsoap.org/soap/envelope/}Body you can, for example, use an Assign activity. In this case I'm replacing text in the input body and assigning the result to the output body. I'm using <ns:Body xmlns:ns='http://schemas.xmlsoap.org/soap/envelope/'>{fn:replace($body,'Trump','Clinton')}</ns:Body>. This replaces Trump with Clinton.


When you check the output with a tool such as KafkaTool, you can see that the SOAP body is not propagated to the Kafka topic.

Finally

Oracle Service Bus processes individual messages. If you want to aggregate data or perform analytics on several messages, you can consider using Oracle Stream Analytics (OSA). It also has pattern recognition and several other interesting features. It is, however, not very suitable for splitting up messages or performing more complicated actions on individual messages, such as transformations. For such a use case, use Oracle Service Bus.

The post Oracle Service Bus: Produce messages to a Kafka topic appeared first on AMIS Oracle and Java Blog.

How About Oracle Database 12c Threaded_Execution

Sun, 2017-02-05 13:53

THREADED_EXECUTION
Threaded_Execution is an Oracle Database 12c feature aiming to reduce the number of Oracle processes on Linux. After setting the parameter THREADED_EXECUTION to TRUE and bouncing the database, most of the background processes run as threads within just 6 Oracle processes, where more than 60 processes existed before the bounce. And if you want to apply this to client processes as well, just add DEDICATED_THROUGH_BROKER_LISTENER = ON to the listener.ora and reload the listener.
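A sketch of the database side of this change (assuming an spfile is in use; the listener.ora entry mentioned above is a plain one-line addition followed by a listener reload):

-- THREADED_EXECUTION is a static parameter, so a bounce is required after setting it
ALTER SYSTEM SET threaded_execution=TRUE SCOPE=SPFILE;

-- after the bounce, v$process shows which background processes now run as threads
SELECT execution_type, COUNT(*) FROM v$process GROUP BY execution_type;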

Especially within a consolidated environment, and when coping with applications that just can't restrict their connection pool activity and overload the database with sessions, this feature is very welcome. Linux is better off with fewer Oracle processes, the communication between threads within a process is faster and more efficient than between processes, and logins/logoffs of client sessions as threads instead of processes are faster and less stressful for Oracle. Just to make sure you don't mistake this for the shared server option: every session is still a dedicated session, not a shared one. What follows is a summary of issues I encountered when I implemented this feature within databases on an ODA X5-2.

SEPS
Threaded_execution is a medicine that comes with an annoying side effect: database login with OS authentication is no longer possible. Quite an issue, because most of my scripts use the "/ as sysdba" login; am I from now on forced to use "sys/password as sysdba"? Well, this presented me with more of an opportunity than a problem, because now I could implement SEPS (Secure External Password Store), an Oracle wallet in which to keep unique combinations of TNS alias, username and password. I don't find this a particularly user-friendly Oracle tool, but it does the job and enables a passwordless login with "/@TNS-alias as sysdba". If you want some more information on SEPS, see here. Its main aim is to prevent hard-coded passwords in code and scripts.
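A sketch of how such a wallet can be set up (the wallet directory and the TNS alias MYDB are made-up example values; mkstore prompts for the passwords it needs):

# create the wallet and store a credential for TNS alias MYDB
mkstore -wrl /u01/app/oracle/wallet -create
mkstore -wrl /u01/app/oracle/wallet -createCredential MYDB sys

# sqlnet.ora: point clients to the wallet
WALLET_LOCATION = (SOURCE = (METHOD = FILE)(METHOD_DATA = (DIRECTORY = /u01/app/oracle/wallet)))
SQLNET.WALLET_OVERRIDE = TRUE

# after this, "sqlplus /@MYDB as sysdba" logs in without a password on the command line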

JDBC THIN
With threaded_execution also enabled for client sessions, some JDBC thin applications were not able to log in.
Cause: Bug 19929111 : 11.1 JDBC THIN CAN'T CONNECT TO A 12C D/B WITH THREADED_EXECUTION=TRUE
Solved by upgrading the antique JDBC driver (11.1.0.7) to 11.2.0.3 (higher is also OK).

Resident Memory
The Linux server rebooted spontaneously after experiencing a lack of memory and having to use swap.
Cause: Bug 22226365 : THREADED_EXECUTION=TRUE – SCMN PROCESS RES MEMORY INCREASE
Solved by implementing the patch.

Shared Pool
Memory allocation of PGA seems to be from the Shared Pool. I can’t find any mention of this in Oracle docs, so I’m not stating this as a fact… it may or may not change in future releases, but to be on the safe side, until Oracle documents this, I will presume that in combination with threaded_execution the PGA is allocated from the Shared Pool.
Action: for some databases I doubled or even quadrupled the SGA.

Big Table Caching
This 12c feature enables you to set apart a certain percentage of the buffer cache for the caching of full-table-scanned tables. By caching these tables in memory (in part or in full), you can boost application performance considerably where this constituted an application performance bottleneck before.
Action: for some databases I doubled or even quadrupled the SGA.
This is not directly related to threaded_execution, but it’s nice to know that the action taken to accommodate PGA in the shared pool also affects the size of the buffer cache, making room for the implementation of big_table_caching.
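The feature is switched on with a single parameter; the percentage below is just an example value:

-- reserve 20% of the buffer cache for caching full-table-scanned tables
ALTER SYSTEM SET db_big_table_cache_percent_target = 20;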

Datapatch
Datapatch is run, or should be run, as part of Oracle patching. As mentioned before, threaded_execution doesn't allow "/ as sysdba" connections, but datapatch cannot do without them.
Solved by setting threaded_execution=false and bouncing the database, then setting threaded_execution=true after datapatch and bouncing again.
As of the "160719" patch this seems to be resolved: if datapatch.sql can't log in with "/ as sysdba", this is recognized and the script will ask for the SYS password.

ODA Update
Every 3 months the ODA X5-2 must be updated.
At the moment we are at the latest version 2.9 (patch 161018), and the part that updates the Oracle Home(s) still can't cope with threaded_execution. This is solved by setting threaded_execution=false and bouncing the database(s), then setting threaded_execution=true after the update and bouncing again.

The glogin.sql script we use interferes with the ODA update.
Solved by renaming the file before the ODA update, and renaming it back to its original name after the ODA update.

The Oracle Database 12c Patch for Bug 22226365 : THREADED_EXECUTION=TRUE – SCMN PROCESS RES MEMORY INCREASE does not interfere with the 2.9 ODA update.

After all of this
Am I still as enthusiastic as before about this medicine? Yes, but please solve the main side effect of not being able to use database login with OS authentication. It is not so much a problem for me (check the SEPS paragraph) but more so for Oracle itself, because some software (like ODA Update) is not prepared for this configuration (yet).

The post How About Oracle Database 12c Threaded_Execution appeared first on AMIS Oracle and Java Blog.
