
Feed aggregator

Watch: Hadoop vs. HBase

Pythian Group - Mon, 2014-12-08 09:58

Every data platform has its value, and deciding which one will work best for your big data objectives can be tricky. Alex Gorbachev, Oracle ACE Director, Cloudera Champion of Big Data, and Chief Technology Officer at Pythian, has recorded a series of videos comparing the various big data platforms and presenting use cases to help you identify which ones will best suit your needs.

“…It’s actually not quite fair comparing them,” Alex says. “HBase is part of the Hadoop ecosystem… You could see them living with each other in the same cluster.” Learn how HBase and Hadoop can work together by watching Alex’s video Hadoop vs. HBase.

Note: You may recognize this series, which was originally filmed back in 2013. After receiving feedback from our viewers that the content was great, but the video and sound quality were poor, we listened and re-shot the series.

Find the rest of the series here

 

Pythian is a global leader in data consulting and managed services. We specialize in optimizing and managing mission-critical data systems, combining the world’s leading data experts with advanced, secure service delivery. Learn more about Pythian’s Big Data expertise.

Categories: DBA Blogs

ADF Mythbusters UKOUG'14

Andrejus Baranovski - Mon, 2014-12-08 09:11
I would like to post the slides from our recent session at the UKOUG'14 conference - ADF Mythbusters. This session was presented by my colleague from Red Samurai Consulting - Florin Marcus. The goal was to break popular ADF myths. We have logged Oracle Support SRs; each myth in the slides is assigned an SR number.

Slides are available on SlideShare:

ADF Mythbusters UKOUG'14 from andrejusb
The following topics are covered in the presentation:
  • ADF BC Batches Of Functionality
  • AM Pools and DB Connection Pools
  • Activation-Safe Application Modules
  • Maximum Number of Regions Per Page
  • JSP vs Facelets
More detailed description and analysis data are available in the slides.

FIO (Flexible I/O) - a benchmark tool for any operating system

Yann Neuhaus - Mon, 2014-12-08 04:55

I have just attended an interesting session held by Martin Nash (@mpnsh) at UKOUG 14 in Liverpool: "The least an Oracle DBA Should Know about Linux Administration". During this session I had the opportunity to discover some interesting commands and tools such as FIO (Flexible I/O). FIO is a workload generator that can be used both for benchmarking and for stress/hardware verification.

FIO has support for 19 different types of I/O engines (sync, mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O priorities (for newer Linux kernels), rate I/O, forked or threaded jobs, and much more. It can work on block devices as well as files. fio accepts job descriptions in a simple-to-understand text format.
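
To check which of these I/O engines your particular build actually supports, recent fio versions can list them for you (a quick sanity check; the exact engine list varies per platform and build):

fio --enghelp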

This tool has the huge advantage of being available for almost all kinds of operating systems (POSIX, Linux, BSD, Solaris, HP-UX, AIX, OS X, Android, Windows). If you want to use this tool in the context of an Oracle database, I invite you to have a look at the following blog post from Yann Neuhaus: Simulating database-like I/O activity with Flexible I/O
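
As a hedged illustration of how you could approximate Oracle-style datafile reads with fio, the hypothetical job file below uses 8 KB blocks, direct I/O and random reads (roughly mimicking "db file sequential read" activity); the engine, sizes and queue depth are just example values to adapt to your own environment:

[global]
ioengine=libaio
direct=1
bs=8k
size=1g
directory=/u01/fio
runtime=60
time_based

[oracle-like-randread]
rw=randread
iodepth=16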

 

In order to install it on Ubuntu, simply use the following command:


steulet@ThinkPad-T540p:~$ sudo apt-get install fio

 

Once fio is installed, you can run your first test. This first test will run 2 gigabytes of I/O (read/write) in the directory /u01/fio.


steulet@ThinkPad-T540p:~$ mkdir /u01/fio

 

Once the directory has been created, we can set up the configuration file as described below. However, it is also perfectly possible to run the same test directly from the command line without a configuration file (fio --name=global --ioengine=posixaio --rw=readwrite --size=2g --directory=/u01/fio --threads=1 --name=myReadWriteTest-Thread1):

 

[global]
ioengine=posixaio
rw=readwrite
size=2g
directory=/u01/fio
threads=1

[myReadWriteTest-Thread1]

 

Now you can simply run your test with the command below (assuming the job file above was saved as testfio.fio):


steulet@ThinkPad-T540p:~$ fio testfio.fio

 

The output will look like the following:

 

myReadWriteTest-Tread1: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=posixaio, iodepth=1
fio-2.1.3
Starting 1 thread
Jobs: 1 (f=1): [M] [100.0% done] [112.9MB/113.1MB/0KB /s] [28.9K/29.2K/0 iops] [eta 00m:00s]
myReadWriteTest-Tread1: (groupid=0, jobs=1): err= 0: pid=7823: Mon Dec  8 12:45:27 2014
  read : io=1024.7MB, bw=98326KB/s, iops=24581, runt= 10671msec
    slat (usec): min=0, max=72, avg= 1.90, stdev= 0.53
    clat (usec): min=0, max=2314, avg=20.25, stdev=107.40
     lat (usec): min=5, max=2316, avg=22.16, stdev=107.41
    clat percentiles (usec):
     |  1.00th=[    4],  5.00th=[    6], 10.00th=[    7], 20.00th=[    7],
     | 30.00th=[    7], 40.00th=[    7], 50.00th=[    7], 60.00th=[    7],
     | 70.00th=[    8], 80.00th=[    8], 90.00th=[    8], 95.00th=[   10],
     | 99.00th=[  668], 99.50th=[ 1096], 99.90th=[ 1208], 99.95th=[ 1208],
     | 99.99th=[ 1256]
    bw (KB  /s): min=    2, max=124056, per=100.00%, avg=108792.37, stdev=26496.59
  write: io=1023.4MB, bw=98202KB/s, iops=24550, runt= 10671msec
    slat (usec): min=1, max=24, avg= 2.08, stdev= 0.51
    clat (usec): min=0, max=945, avg= 9.71, stdev=24.52
     lat (usec): min=5, max=947, avg=11.79, stdev=24.54
    clat percentiles (usec):
     |  1.00th=[    5],  5.00th=[    8], 10.00th=[    8], 20.00th=[    8],
     | 30.00th=[    8], 40.00th=[    8], 50.00th=[    9], 60.00th=[    9],
     | 70.00th=[    9], 80.00th=[    9], 90.00th=[   10], 95.00th=[   11],
     | 99.00th=[   15], 99.50th=[   20], 99.90th=[  612], 99.95th=[  628],
     | 99.99th=[  652]
    bw (KB  /s): min=108392, max=123536, per=100.00%, avg=114596.33, stdev=3108.03
    lat (usec) : 2=0.01%, 4=0.01%, 10=91.43%, 20=6.93%, 50=0.71%
    lat (usec) : 100=0.13%, 250=0.01%, 500=0.01%, 750=0.47%, 1000=0.01%
    lat (msec) : 2=0.31%, 4=0.01%
  cpu          : usr=10.46%, sys=21.17%, ctx=527343, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262309/w=261979/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=1024.7MB, aggrb=98325KB/s, minb=98325KB/s, maxb=98325KB/s, mint=10671msec, maxt=10671msec
  WRITE: io=1023.4MB, aggrb=98202KB/s, minb=98202KB/s, maxb=98202KB/s, mint=10671msec, maxt=10671msec

Disk stats (read/write):
  sda: ios=6581/67944, merge=0/67, ticks=4908/196508, in_queue=201408, util=56.49%

You will find some really good examples and a detailed list of parameters on the following website: http://www.bluestop.org/fio/HOWTO.txt

This tool is really powerful and has the huge advantage of being available for more or less any operating system. This allows you to make consistent comparisons across different kinds of architecture.


Going Beyond MapReduce for Hadoop ETL Pt.2 : Introducing Apache YARN and Apache Tez

Rittman Mead Consulting - Mon, 2014-12-08 02:00

In the first post in this three part series on going beyond MapReduce for Hadoop ETL, I looked at how a typical Apache Pig script gets compiled into a series of MapReduce jobs, and those MapReduce jobs pass data between themselves by writing intermediate resultsets to disk (HDFS, the Hadoop cluster file system). As a reminder, here’s the Pig script we’re working with:

register /opt/cloudera/parcels/CDH/lib/pig/piggybank.jar
raw_logs = LOAD '/user/mrittman/rm_logs' USING TextLoader AS (line:chararray);
logs_base = FOREACH raw_logs
GENERATE FLATTEN
  (REGEX_EXTRACT_ALL(line,'^(\\S+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] "(.+?)" (\\S+) (\\S+) "([^"]*)" "([^"]*)"')
)AS
  (remoteAddr: chararray, remoteLogname: chararray, user: chararray,time: chararray, request: chararray, status: chararray, bytes_string: chararray,referrer:chararray,browser: chararray);
logs_base_nobots = FILTER logs_base BY NOT (browser matches '.*(spider|robot|bot|slurp|bot|monitis|Baiduspider|AhrefsBot|EasouSpider|HTTrack|Uptime|FeedFetcher|dummy).*');
logs_base_page = FOREACH logs_base_nobots GENERATE SUBSTRING(time,0,2) as day, SUBSTRING(time,3,6) as month, SUBSTRING(time,7,11) as year, FLATTEN(STRSPLIT(request,' ',5)) AS (method:chararray, request_page:chararray, protocol:chararray), remoteAddr, status;
logs_base_page_cleaned = FILTER logs_base_page BY NOT (SUBSTRING(request_page,0,3) == '/wp' or request_page == '/' or SUBSTRING(request_page,0,7) == '/files/' or SUBSTRING(request_page,0,12) == '/favicon.ico');
logs_base_page_cleaned_by_page = GROUP logs_base_page_cleaned BY request_page;
page_count = FOREACH logs_base_page_cleaned_by_page GENERATE FLATTEN(group) as request_page, COUNT(logs_base_page_cleaned) as hits;
page_count_sorted = ORDER page_count BY hits DESC;
page_count_top_10 = LIMIT page_count_sorted 10;
posts = LOAD '/user/mrittman/posts.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage() as (post_id:int,title:chararray,post_date:chararray,post_type:chararray,author:chararray,url:chararray,generated_url:chararray);
posts_cleaned = FOREACH posts GENERATE CONCAT(generated_url,'/') as page_url,author as author, title as title;
pages_and_post_details = JOIN page_count by request_page, posts_cleaned by page_url;
pages_and_posts_trim = FOREACH pages_and_post_details GENERATE page_count::request_page as request_page, posts_cleaned::author as author, posts_cleaned::title as title, page_count::hits as hits;
pages_and_posts_sorted = ORDER pages_and_posts_trim BY hits DESC;
pages_and_post_top_10 = LIMIT pages_and_posts_sorted 10;
store pages_and_post_top_10 into 'top_10s/pages';

When you submit the script for execution using the Pig client, it parses the Pig Latin script, logically optimizes it and then compiles it into MapReduce programs; these programs are then sent in turn to the Hadoop 1.0 JobTracker, which sends them for execution on the Hadoop cluster – in the case of our Pig script, five MapReduce programs are generated in total.


Each one of these MapReduce programs is what’s called a “directed acyclic graph”, or DAG. A directed acyclic graph is a programming style within distributed computing where processing is broken down into functions that can be run independently of one another, as long as one is not an ancestor of another. In MapReduce terms, this means that all mappers can run independently of each other (and therefore on different nodes in a cluster) and it’s only the reducers that have to wait for their ancestors to finish before they can also start their work independently of the other reducers. It’s a great programming model for processing large amounts of data with fault tolerance across a cluster, and it was the key insight that made MapReduce, and the “big data” systems that we work with today, possible.


A Pig script that generates five MapReduce jobs therefore effectively creates five DAGs, with each one using files (HDFS) to persist and hand off data between them, and JVMs being spun up for the individual map and reduce jobs. As such, there’s no way for this version of MapReduce and the Hadoop framework it runs on to consider the overall dataflow, and each DAG has to run in isolation.


To address this issue, version 2.0 of Hadoop introduced a new feature called Apache YARN, or “yet another resource negotiator”. YARN took on the resource management and job scheduling/monitoring parts of Hadoop, so that YARN effectively became an “operating system” for Hadoop on which other frameworks could run; initially MapReduce2 (reworked to run on YARN), but since then others such as Apache Tez and Apache Spark.


YARN also crucially supported frameworks that used DAGs describing the entire dataflow, not just individual MapReduce jobs, and a new framework that came out of this and used that new capability was Apache Tez. Tez is a generalisation of the MapReduce distributed compute framework that supports these dataflow-style DAGs and either runs MapReduce code unaltered, or offers its own API for describing DAGs using vertices (logic and resources) and edges (connections). Both Hive and Pig have been ported to run on Tez; in Pig’s case this means another type of compiler is added alongside the existing MapReduce one, and Pig scripts executed on Tez can now submit a single DAG that encompasses all stages in the dataflow, with the steps linked together via in-memory passing of intermediate results rather than having to write those intermediate steps to disk.


In practice, what this means is that if your version of Pig or Hive has been updated to also run on Tez, you can run your code unaltered on Tez and typically see a 2-3x performance improvement without any changes to your code or application logic. Hortonworks have been the main Hadoop vendor backing Tez, and their new Hortonworks Data Platform 2.2 comes with Tez support, so I took my Pig script and ran it on a five-node cluster to get an initial timing: it took 4 mins 17 secs to run, again generating the same five MapReduce jobs that I saw on the Cloudera CDH5 cluster. Then, running it using Tez as the execution engine (pig -x tez analyzeblog.pig), the same script took 2 mins 25 secs to run, around twice as fast as when we ran it using regular MapReduce.
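
For reference, switching execution engines is a one-line change in both tools. A minimal sketch, assuming a Tez-enabled distribution such as HDP 2.2; analyzeblog.pig is the script name used above, and your_query.hql is just a hypothetical HiveQL file:

# Pig: run the same script on Tez instead of MapReduce
pig -x tez analyzeblog.pig

# Hive: enable Tez for this invocation before the HiveQL runs
hive --hiveconf hive.execution.engine=tez -f your_query.hql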


 

And Tez is great for getting existing Pig and Hive scripts to run faster – if your platform supports it, you should use it instead of MapReduce as your execution engine; MapReduce code submitted “as is” will benefit from better YARN container re-use, and Hive and Pig scripts that run on the Tez execution engine can run as a single DAG and use memory, rather than disk, to pass data between jobs in the DAG. For ETL and analytic jobs that you’re creating from new though, Apache Spark is arguably the framework you should look to use instead of Tez, and tomorrow we’ll find out why.

Categories: BI & Warehousing

Typesafe activator , play framework applications deployed to Pivotal Cloud Foundry

Pas Apicella - Sun, 2014-12-07 17:59
I decided to quickly build a Play Framework Scala application using Typesafe Activator and deploy it to Pivotal Cloud Foundry. You can read more about Typesafe Activator below.

https://typesafe.com/activator

Here are the steps to deploy a Scala Play Framework application created using Typesafe Activator. I created a basic hello world Scala application with the Play Framework. The purpose here is to show what is needed to get it deployed on Pivotal Cloud Foundry.

Note: This assumes we have created an application with the name "hello-play-scala" and that we are in that directory as we create the files for deployment.

1. Create a distribution ZIP file as follows once you have finished developing your application

> ./activator dist

2. Create a manifest file as follows which refers to the DIST zip file created in #1 above.

applications:
- name: pas-helloworld-scala
  memory: 756M
  instances: 1
  host: pas-helloworld-scala
  domain: apj.fe.pivotal.io
  path: ./target/universal/hello-play-scala-1.0-SNAPSHOT.zip

3. Create a build.sh file and make it executable. This simple shell script is going to call sbt/activator:

#!/bin/sh
java -jar activator-launch-1.2.12.jar dist

4. Deploy as shown below.

[Mon Dec 08 10:31:14 papicella@:~/vmware/software/scala/apps/hello-play-scala ] $ cf push -f manifest.yml
Using manifest file manifest.yml

Creating app pas-helloworld-scala in org ANZ / space development as pas...
OK

Using route pas-helloworld-scala.apj.fe.pivotal.io
Binding pas-helloworld-scala.apj.fe.pivotal.io to pas-helloworld-scala...
OK

Uploading pas-helloworld-scala...
Uploading app files from: target/universal/hello-play-scala-1.0-SNAPSHOT.zip
Uploading 1.1M, 131 files
OK

Starting app pas-helloworld-scala in org ANZ / space development as pas...
OK
-----> Downloaded app package (26M)
-----> Java Buildpack Version: v2.4 (offline) | https://github.com/cloudfoundry/java-buildpack.git#7cdcf1a
-----> Downloading Open Jdk JRE 1.7.0_60 from http://download.run.pivotal.io/openjdk/lucid/x86_64/openjdk-1.7.0_60.tar.gz (found in cache)
       Expanding Open Jdk JRE to .java-buildpack/open_jdk_jre (0.9s)
-----> Downloading Play Framework Auto Reconfiguration 1.4.0_RELEASE from http://download.run.pivotal.io/auto-reconfiguration/auto-reconfiguration-1.4.0_RELEASE.jar (found in cache)

-----> Uploading droplet (57M)

1 of 1 instances running

App started

Showing health and status for app pas-helloworld-scala in org ANZ / space development as pas...
OK

requested state: started
instances: 1/1
usage: 756M x 1 instances
urls: pas-helloworld-scala.apj.fe.pivotal.io

     state     since                    cpu    memory           disk
#0   running   2014-12-08 10:32:27 AM   0.0%   164.6M of 756M   118.8M of 1G


5. Finally, access the application in a browser using the route shown above (pas-helloworld-scala.apj.fe.pivotal.io)


Categories: Fusion Middleware

Announcing XtremIO Performance Engineering Lab Report: Facts About Redo Logging And NAND Flash.

Kevin Closson - Sun, 2014-12-07 15:06

I invite you to please read this report.

NAND Flash is good for a lot of things but not naturally good with write-intensive workloads. Unless, that is, skillful engineering is involved to mitigate the intrinsic weaknesses of NAND Flash in this regard. I assert EMC XtremIO architecture fills this bill.

Regardless of your current or future plans for adopting non-mechanical storage I hope this lab report will show some science behind how to determine suitability for non-mechanical storage–and NAND Flash specifically–where Oracle Database redo logging is concerned.

Please note: Not all lab tests are aimed at achieving maximum theoretical limits in all categories. This particular lab testing required sequestering precious lab gear for a 104 hour sustained test.

The goal of the testing was not to show limits but, quite to the contrary, to show a specific lack of limits in the area of Oracle Database redo logging. For a more general performance-focused paper please download this paper (click here).  With that caveat aside, please see the following link for the redo logging related lab report:

Link to XtremIO Performance Engineering Lab Report (click here).

 



Filed under: oracle

APEX 503 – Service Unavailable – And you don’t know the APEX_PUBLIC_USER Password

The Anti-Kyte - Sun, 2014-12-07 12:46

It’s probably Monday morning. The caffeine from your first cup of coffee has not quite worked its way into your system.
The cold sweat running down the back of your neck provides an unpleasant contrast to the warm blast of panicked users as they call up to inform you that the Application is down.
APEX, which has been behaving impeccably all this time, has suddenly decided to respond to all requests with :

503 – Service Unavailable.

The database is up. The APEX Listener is up. But something else is up. APEX just doesn’t want to play.
Better still, the person who set up APEX in the first place has long since departed the company. You have no idea how the APEX Listener was configured.

Out of sympathy with your current predicament, what follows is :

  • How to confirm that this problem is related to the APEX_PUBLIC_USER (the most likely cause)
  • A quick and fairly dirty way of getting things back up and running again
  • How to stop this happening again

Note: These steps were tested on an Oracle Developer Day VM with a 12c database running on Oracle Linux 6.5. In this environment, APEX is configured to run with the APEX Listener.

Confirming the APEX User name

First of all, we want to make sure that APEX is connecting to the database as APEX_PUBLIC_USER. To do this, we need to check the default.xml file.
Assuming you’re on a Linux box :

cd /u01/oracle/apexListener/apex
cat default.xml

If you don’t see an entry for db.username then APEX_PUBLIC_USER is the one that’s being used.
If there is an entry for db.username then that is the name of the database user you need to check in the following steps.
For now, I’ll assume that it’s set to the default.

Incidentally, there will also be an entry for db.password. This will almost certainly be encrypted so is unlikely to be of use to you here.
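
For reference, the relevant entries in default.xml typically look something like the hypothetical snippet below (the file follows the standard Java XML properties layout used by the Listener; the exact keys and values in your installation may differ):

<entry key="db.username">APEX_PUBLIC_USER</entry>
<entry key="db.password">...encrypted value...</entry>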

Confirming the status of the APEX_PUBLIC_USER

The most likely reason for your current troubles is that the APEX_PUBLIC_USER’s database password has expired.
To verify this – and get the information we’ll need to fix it, connect to the database and run the query :

select account_status, profile
from dba_users
where username = 'APEX_PUBLIC_USER'
/

If the account_status is EXPIRED, then the issue you are facing is that the APEX_PUBLIC_USER is expired and therefore APEX can’t connect to the database.

The other item of interest here is the PROFILE assigned to the user.
We need to check this to make sure that there is no PASSWORD_VERIFY_FUNCTION assigned to the profile. If there is then you need to supply the existing password in order to change it, which is a bit of a problem if you don’t know what it is.
Whilst we’re at it, we need to check whether there is any restriction in place as to the length of time or number of password changes that must take place before a password can be reused.
In my case, APEX_PUBLIC_USER has been assigned the DEFAULT profile.

select resource_name, limit
from dba_profiles
where profile = 'DEFAULT'
and resource_name in 
(
  'PASSWORD_REUSE_TIME', 'PASSWORD_REUSE_MAX', 
  'PASSWORD_VERIFY_FUNCTION'
)
/

When I ran this, I was lucky and got :

RESOURCE_NAME                  LIMIT              
------------------------------ --------------------
PASSWORD_REUSE_TIME            UNLIMITED            
PASSWORD_REUSE_MAX             UNLIMITED            
PASSWORD_VERIFY_FUNCTION       NULL     

So, there are no restrictions on password reuse for this profile. Neither is there any verify function.

If your APEX_PUBLIC_USER is attached to a profile that has these restrictions, then you’ll want to change this before re-setting the password.
As we’re going to have to assign this user to another profile anyway, we may as well get it out of the way now.

The New Profile for the APEX_PUBLIC_USER

Oracle’s advice for the APEX_PUBLIC_USER is to set the PASSWORD_LIFE_TIME to UNLIMITED.

Whilst it’s only these four parameters we need to set in the profile for us to get out of our current predicament, it’s worth also including a limitation on the maximum number of failed login attempts, if only to provide some limited protection against brute-forcing.
In fact, I’ve just decided to use the settings from the DEFAULT profile for the attributes that I don’t need to change :

create profile apex_public limit
    failed_login_attempts 10
    password_life_time unlimited
    password_reuse_time unlimited
    password_reuse_max unlimited
    password_lock_time 1 
    composite_limit unlimited
    sessions_per_user unlimited
    cpu_per_session unlimited
    cpu_per_call unlimited
    logical_reads_per_session unlimited
    logical_reads_per_call unlimited
    idle_time unlimited
    connect_time unlimited
    private_sga unlimited
/

As we don’t specify a PASSWORD_VERIFY_FUNCTION, none is assigned to the new profile.

NOTE – it’s best to check the settings in your own default profile as they may well differ from those listed here.

Next, we assign this profile to APEX_PUBLIC_USER…

alter user apex_public_user profile apex_public
/

The next step is to reset the APEX_PUBLIC_USER password, which is the only way to unexpire the user.

No password, no problem

Remember, in this scenario, we don’t know the current password for APEX_PUBLIC_USER. We don’t want to reset the password to just anything because we’re not sure how to set the password in the DAD used by the Apex Listener.

First of all, we need to get the password hash for the current password. To do this :

select password
from sys.user$
where name = 'APEX_PUBLIC_USER'
/

You’ll get back a hex string – let’s say something like 'DF37145AF23CCA4'.

Next step is to re-set the APEX_PUBLIC_USER password :

alter user apex_public_user identified by sometemporarypassword
/

We now immediately set it back to its original value using IDENTIFIED BY VALUES :

alter user apex_public_user identified by values 'DF37145AF23CCA4' 
/

At this point, APEX should be back up and running.

Once the dust settles…

Whilst your APEX installation may now be back up and running, you now have a database user for which the password never changes.
Although the APEX_PUBLIC_USER has only limited system and table privileges, it also has access to any database objects that are available to PUBLIC.
Whilst this is in-line with Oracle’s currently documented recommendations, you may consider that this is a situation that you want to address from a security perspective.
If there is a sensible way of changing the APEX_PUBLIC_USER password without breaking anything, then you may consider it preferable to simply setup some kind of reminder mechanism so that you know when the password is due to expire and can change it ahead of time.
You would then be able to set the password to expire as normal.
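If you do keep a normal password lifetime, one way to drive such a reminder is simply to check the expiry date recorded in the data dictionary; a minimal sketch :

select username, expiry_date
from dba_users
where username = 'APEX_PUBLIC_USER'
/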
If you’re wondering why I’m being a bit vague here, it’s simply because I don’t currently know of a sensible way of doing this.
If you do, it would be really helpful if you could let me know :)


Filed under: APEX, Oracle, SQL Tagged: APEX 503 Unavailable, create profile, dba_profiles, dba_users.account_status, failed_login_attempts, identified by values, password_reuse_max, password_reuse_time, password_verify_function

OBPM versus BPEL, That's the Question

Jan Kettenis - Sun, 2014-12-07 12:20
Recently I was pointed to the so-called Oracle Learning Streams http://education.oracle.com/streams, which provide short presentations on all kinds of topics.

While ironing my clothes on a Sunday afternoon, I watched one with the title "Leveraging OBPM vs BPEL" by David Mills. An excellent story where he explains in less than 13 minutes the high-level difference using a practical example.

One reason why I like this stream is that it is in line with what I have been preaching for years already. Otherwise I would have told you it sucked, obviously.

The main point David makes is that you should use the right tool for the right job. OBPM aims at orchestrating business functions, whereas BPEL aims at orchestrating system functions. The example used is an orchestration of system functions to compose an Update Customer Profile service, which then can be used in a business process, orchestrating business functions where one person is involved to approve some update, while someone else needs to be informed about that. Watch, and you'll see!

For understandable reasons the presentation does not touch the (technical) details. Without any intention to explain those, one should think about differences in the language itself (for example, in BPEL you cannot create loops, while in BPMN that is quite normal to do), and also in the area of configuration and tuning (for example, in the case of BPEL there are more threads to tune, and you can do in-memory optimization, etc.).

Maybe I will find some time to give you a more detailed insight into those differences. It would help if you would express your interest by leaving a comment!

Hadoop’s next refactoring?

DBMS2 - Sun, 2014-12-07 08:59

I believe in all of the following trends:

  • Hadoop is a Big Deal, and here to stay.
  • Spark, for most practical purposes, is becoming a big part of Hadoop.
  • Most servers will be operated away from user premises, whether via SaaS (Software as a Service), co-location, or “true” cloud computing.

Trickier is the meme that Hadoop is “the new OS”. My thoughts on that start:

  • People would like this to be true, although in most cases only as one of several cluster computing platforms.
  • Hadoop, when viewed as an operating system, is extremely primitive.
  • Even so, the greatest awkwardness I’m seeing when different software shares a Hadoop cluster isn’t actually in scheduling, but rather in data interchange.

There is also a minor issue that if you distribute your Hadoop work among extra nodes you might have to pay a bit more to your Hadoop distro support vendor. Fortunately, the software industry routinely solves more difficult pricing problems than that.

Recall now that Hadoop — like much else in IT — has always been about two things: data storage and program execution. The evolution of Hadoop program execution to date has been approximately:

  • Originally, MapReduce and JobTracker were the way to execute programs in Hadoop, period, at least if we leave HBase out of the discussion.
  • In a major refactoring, YARN replaced a lot of what JobTracker did, with the result that different program execution frameworks became easier to support.
  • Most of the relevant program execution frameworks — such as MapReduce, Spark or Tez — have data movement and temporary storage near their core.

Meanwhile, Hadoop data storage is mainly about HDFS (Hadoop Distributed File System). Its evolution, besides general enhancement, has included the addition of file types suitable for specific kinds of processing (e.g. Parquet and ORC to accelerate analytic database queries). Also, there have long been hacks that more or less bypassed central Hadoop data management, and let data be moved in parallel on a node-by-node basis. But several signs suggest that Hadoop data storage should and will be refactored too. Three efforts in particular point in that direction:

The part of all this I find most overlooked is inter-program data exchange. If two programs both running on Hadoop want to exchange data, what do they do, other than reading and writing to HDFS, or invoking some kind of a custom connector? What’s missing is a nice, flexible distributed memory layer, which:

  • Works well with Hadoop execution engines (Spark, Tez, Impala …).
  • Works well with other software people might want to put on their Hadoop nodes.
  • Interfaces nicely to HDFS, Isilon, object storage, et al.
  • Is fully parallel any time it needs to talk with persistent or external storage.
  • Can be fully parallel any time it needs to talk with any other software on the Hadoop cluster.

Tachyon could, I imagine, become that. HDFS caching probably could not.

In the past, I’ve been skeptical of in-memory data grids. But now I think that such a grid could take Hadoop to the next level of generality and adoption.

Related links

Categories: Other

Turkish Hadoop User Group (TRHUG) 2014 meeting

H.Tonguç Yılmaz - Sun, 2014-12-07 08:58
The Turkish Hadoop User Group (TRHUG) 2014 annual meeting will be on Monday, December 22, in Levent, İstanbul. Microsoft TR is the sponsor of the meeting this year. Turkcell has two slots on the agenda this year: one on an interesting project called Curio, based on Kafka, Storm and Cassandra (the real-time side of the ecosystem). The other […]

Statistics on this blog

Hemant K Chitale - Sun, 2014-12-07 08:40
I began this blog on 28-Dec-2006.  For the 8 years 2007 to 2014, I have averaged 56 posts per year.  Unfortunately, this year, 2014 has produced the fewest posts -- 40 including this one.  This includes the "series" on Grid / ASM / RAC and the series on StatsPack / AWR.

2011 was my most prodigious year -- 99 posts.

There were 8,176 page views in July 2007.  To date, there have been more than 930 thousand page views on this blog.  By month, the peak count has been for March 2012 -- 24,346 page views.

My largest viewer counts are from USA, India, UK, Germany and France.  www.google.com has been the largest source of traffic to this blog.

.
.
.



Categories: DBA Blogs

Going Beyond MapReduce for Hadoop ETL Pt.1 : Why MapReduce Is Only for Batch Processing

Rittman Mead Consulting - Sun, 2014-12-07 08:28

Over the previous few months I’ve been looking at the various ways you can load data into Hadoop, process it and then report on it using Oracle tools. We’ve looked at Apache Hive and how it provides a SQL layer over Hadoop, making it possible for tools like ODI and OBIEE to use their usual SQL set-based process approach to access Hadoop data; later on, we looked at another Hadoop tool, Apache Pig, which provides a more dataflow-type language over Hadoop for when you want to create step-by-step data pipelines for processing data. Under the covers, both Hive and Pig generate Java MapReduce code to actually move data around, with MapReduce then working hand-in-hand with the Hadoop framework to run your jobs in parallel across the cluster.

But MapReduce can be slow; it’s designed for very large datasets and batch processing, with overall analysis tasks broken-down into individual map and reduce tasks that start by reading data off disk, do their thing and then write the intermediate results back to disk again.


Whilst this approach means the system is extremely fault-tolerant and effectively infinitely-scalable, this writing to disk of each step in the process means that MapReduce jobs typically take a long time to run and don’t really take advantage of the RAM that’s available in today’s commodity servers. Whilst this is a limitation most early adopters of Hadoop were happy to live with (in exchange for being able to cheaply analyse data on a scale previously unheard of), over the past few years as Hadoop adoption has broadened there have been a number of initiatives to move Hadoop past its batch processing roots and into something more real-time that does more of its processing in-memory. Whilst there are a whole bunch of projects and products out there that claim to improve the speed of Hadoop processing and bring in-memory capabilities – Apache Drill, Cloudera Impala, Oracle’s Big Data SQL are just some examples – the two that are probably of most interest to Hadoop customers working in an Oracle environment are called Apache Spark, and Apache Tez. But before we get into the details of Spark, Tez and how they improve over MapReduce, let’s take a look at why MapReduce can be slow.

MapReduce and Hadoop 1.0 – Scalable, Fault-Tolerant, but Aimed at Batch Processing

Going back to MapReduce and what’s now termed “Hadoop 1.0”, MapReduce works on the principle of breaking larger jobs down into lots of smaller ones, with each one running independently and persisting its results back to disk at the end to ensure data doesn’t get lost if a server node breaks down. To take an example, the Apache Pig script below reads in some webserver log files, parses and filters them, aggregates the data and then joins it to another Hadoop dataset before outputting the results to a directory in the HDFS storage layer:

register /opt/cloudera/parcels/CDH/lib/pig/piggybank.jar
raw_logs = LOAD '/user/mrittman/rm_logs' USING TextLoader AS (line:chararray);
logs_base = FOREACH raw_logs
GENERATE FLATTEN
  (REGEX_EXTRACT_ALL(line,'^(\\S+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] "(.+?)" (\\S+) (\\S+) "([^"]*)" "([^"]*)"')
)AS
  (remoteAddr: chararray, remoteLogname: chararray, user: chararray,time: chararray, request: chararray, status: chararray, bytes_string: chararray,referrer:chararray,browser: chararray);
logs_base_nobots = FILTER logs_base BY NOT (browser matches '.*(spider|robot|bot|slurp|bot|monitis|Baiduspider|AhrefsBot|EasouSpider|HTTrack|Uptime|FeedFetcher|dummy).*');
logs_base_page = FOREACH logs_base_nobots GENERATE SUBSTRING(time,0,2) as day, SUBSTRING(time,3,6) as month, SUBSTRING(time,7,11) as year, FLATTEN(STRSPLIT(request,' ',5)) AS (method:chararray, request_page:chararray, protocol:chararray), remoteAddr, status;
logs_base_page_cleaned = FILTER logs_base_page BY NOT (SUBSTRING(request_page,0,3) == '/wp' or request_page == '/' or SUBSTRING(request_page,0,7) == '/files/' or SUBSTRING(request_page,0,12) == '/favicon.ico');
logs_base_page_cleaned_by_page = GROUP logs_base_page_cleaned BY request_page;
page_count = FOREACH logs_base_page_cleaned_by_page GENERATE FLATTEN(group) as request_page, COUNT(logs_base_page_cleaned) as hits;
page_count_sorted = ORDER page_count BY hits DESC;
page_count_top_10 = LIMIT page_count_sorted 10;
posts = LOAD '/user/mrittman/posts.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage() as (post_id:int,title:chararray,post_date:chararray,post_type:chararray,author:chararray,url:chararray,generated_url:chararray);
posts_cleaned = FOREACH posts GENERATE CONCAT(generated_url,'/') as page_url,author as author, title as title;
pages_and_post_details = JOIN page_count by request_page, posts_cleaned by page_url;
pages_and_posts_trim = FOREACH pages_and_post_details GENERATE page_count::request_page as request_page, posts_cleaned::author as author, posts_cleaned::title as title, page_count::hits as hits;
pages_and_posts_sorted = ORDER pages_and_posts_trim BY hits DESC;
pages_and_post_top_10 = LIMIT pages_and_posts_sorted 10;
store pages_and_post_top_10 into 'top_10s/pages';

Pig works by defining what are called “relations” or “aliases”, similar to tables in SQL, that contain data or pointers to data. You start by loading data into a relation from a file or other source, and then progressively define further relations that take that initial dataset and apply filters, use transformations, re-orientate the data or join it to other relations, until you’ve arrived at the final set of data you’re looking for. In this example we start with raw log data, parse it, filter out bot and spider activity, project just the columns we’re interested in, remove further “noise” from the logs, then join it to reference data and finally return the top ten pages over that period based on total hits.


Pig uses something called “lazy evaluation”, where relations you define don’t necessarily get created when they’re defined in the script; instead they’re used as a pointer to data and instructions on how to produce it if needed, with the Pig interpreter only materializing a dataset when it absolutely has to (for example, when you ask it to store a dataset on disk or output it to the console). Moreover, all the steps leading up to the final dataset you’ve requested are considered as a whole, giving Pig the ability to merge steps, miss out steps completely if they’re not actually needed to produce the final output, and otherwise optimize the flow of data through the process.
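
You can see this behaviour for yourself from the Grunt shell; as a quick illustration using the aliases defined in the script above, EXPLAIN just prints the logical, physical and MapReduce plans without launching any jobs, whereas DUMP (or STORE) is what actually forces the dataset to be materialized:

grunt> EXPLAIN page_count_top_10; -- shows the merged, optimized plan; no jobs run yet
grunt> DUMP page_count_top_10;    -- triggers execution, so the MapReduce jobs are actually submitted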

Running the Pig script and then looking at the console output from the Grunt command-line interpreter, you can see that five separate MapReduce jobs were generated to load in the data, filter, join and transform it, and then produce the output we requested at the end.

JobId                  Maps Reduces Alias                                                                                                               Feature Outputs
job_1417127396023_0145 12   2       logs_base,logs_base_nobots,logs_base_page,logs_base_page_cleaned,logs_base_page_cleaned_by_page,page_count,raw_logs GROUP_BY,COMBINER 
job_1417127396023_0146 2    1       pages_and_post_details,pages_and_posts_trim,posts,posts_cleaned                                                     HASH_JOIN 
job_1417127396023_0147 1    1       pages_and_posts_sorted                                                                                              SAMPLER 
job_1417127396023_0148 1    1       pages_and_posts_sorted                                                                                              ORDER_BY,COMBINER 
job_1417127396023_0149 1    1       pages_and_posts_sorted                                                                                              hdfs://bdanode1....pages2,

Pig generated five separate MapReduce jobs that loaded, parsed, filtered, aggregated and joined the datasets as part of an overall data “pipeline”, with the intermediate results staged to disk before the next MapReduce job took over. On my six-node CDH5.2 VM cluster it took just over five minutes to load, process and aggregate 5m records from our site’s webserver.


Now the advantage of this approach is that it’s more or less infinitely scalable and certainly resilient, but whilst Pig can look at your overall dataflow “graph” and come up with an optimal, efficient way to get to your end result, MapReduce treats every step as atomic and separate and insists on writing every intermediate step to disk before moving on.

What this means in practice is that ETL routines that use Pig, Hive and MapReduce, whilst they scale well, never really get to the point where you can run them as micro-batches or in real-time. For that type of scenario we need to look at moving away from MapReduce and breaking the link between Hadoop (the platform, cluster management and resource handling part) and the processing that runs on it, so that we can run alternative execution engines on the Hadoop platform such as Apache Tez, which we’ll cover in tomorrow’s post.

Categories: BI & Warehousing

Notes on the Hortonworks IPO S-1 filing

DBMS2 - Sun, 2014-12-07 07:53

Given my stock research experience, perhaps I should post about Hortonworks’ initial public offering S-1 filing. :) For starters, let me say:

  • Hortonworks’ subscription revenues for the 9 months ended last September 30 appear to be:
    • $11.7 million from everybody but Microsoft, …
    • … plus $7.5 million from Microsoft, …
    • … for a total of $19.2 million.
  • Hortonworks states subscription customer counts (as per Page 55 this includes multiple “customers” within the same organization) of:
    • 2 on April 30, 2012.
    • 9 on December 31, 2012.
    • 25 on April 30, 2013.
    • 54 on September 30, 2013.
    • 95 on December 31, 2013.
    • 233 on September 30, 2014.
  • Per Page 70, Hortonworks’ total September 30, 2014 customer count was 292, including professional services customers.
  • Non-Microsoft subscription revenue in the quarter ended September 30, 2014 seems to have been $5.6 million, or $22.5 million annualized. This suggests Hortonworks’ average subscription revenue per non-Microsoft customer is a little over $100K/year.
  • This IPO looks to be a sharply “down round” vs. Hortonworks’ Series D financing earlier this year.
    • In March and June, 2014, Hortonworks sold stock that subsequently was converted into 1/2 a Hortonworks share each at $12.1871 per share.
    • The tentative top of the offering’s price range is $14/share.
    • That’s also slightly down from the Series C price in mid-2013.

And, perhaps of interest only to me — there are approximately 50 references to YARN in the Hortonworks S-1, but only 1 mention of Tez.

Overall, the Hortonworks S-1 is about 180 pages long, and — as is typical — most of it is boilerplate, minutiae or drivel. As is also typical, two of the most informative sections of the Hortonworks S-1 are:

The clearest financial statements in the Hortonworks S-1 are probably the quarterly figures on Page 62, along with the tables on Pages F3, F4, and F7.

Special difficulties in interpreting Hortonworks’ numbers include:

  • A large fraction of revenue has come from a few large customers, most notably Microsoft. Details about those revenues are further confused by:
    • Difficulty in some cases getting a fix on the subscription/professional services split. (It does seem clear that Microsoft revenues are 100% subscription.)
    • Some revenue deductions associated with stock deals, called “contra-revenue”.
  • Hortonworks changed the end of its fiscal year from April to December, leading to comparisons of a couple of eight-month periods.
  • There was a $6 million lawsuit settlement (some kind of employee poaching/trade secrets case), discussed on Page F-21.
  • There is some counter-intuitive treatment of Windows-related development (cost of revenue rather than R&D).

One weirdness is that cost of professional services revenue far exceeds 100% of such revenue in every period Hortonworks reports. Hortonworks suggests that this is because:

  • Professional services revenue is commonly bundled with support contracts.
  • Such revenue is recognized ratably over the life of the contract, as opposed to a more natural policy of recognizing professional services revenue when the services are actually performed.

I’m struggling to come up with a benign explanation for this.

In the interest of space, I won’t quote Hortonworks’ S-1 verbatim; instead, I’ll just note where some of the more specifically informative parts may be found.

  • Page 53 describes Hortonworks’ typical sales cycles (they’re long).
  • Page 54 says the average customer has increased subscription payments 25% year over year, but emphasize that the sample size is too small to be reliable.
  • Pages 55-63 have a lot of revenue and expense breakdowns.
  • Deferred revenue numbers (which are a proxy for billings and thus signed contracts) are on Page 65.
  • Pages II 2-3 list all (I think) Hortonworks financings in a concise manner.

And finally, Hortonworks’ dealings with its largest customers and strategic partners are cited in a number of places. In particular:

  • Pages 52-3 cover dealings with Yahoo, Teradata, Microsoft, and AT&T.
  • Pages 82-3 discusses OEM revenue from Hewlett-Packard, Red Hat, and Teradata, none of which amounts to very much.
  • Page 109 covers the Teradata agreement. It seems that there’s less going on than originally envisioned, in that Teradata made a nonrefundable prepayment far greater than turns out to have been necessary for subsequent work actually done. That could produce a sudden revenue spike or else positive revenue restatement as of February, 2015.
  • Page F-10 has a table showing revenue from Hortonworks’ biggest customers (Company A is Microsoft and Company B is Yahoo).
  • Pages F37-38 further cover Hortonworks’ relationships with Yahoo, Teradata and AT&T.

Correction notice: Some of the page numbers in this post were originally wrong, surely because Hortonworks posted an original and amended version of this filing, and I got the two documents mixed up.  A huge Thank You goes to Merv Adrian for calling my attention to this, and I think I’ve now fixed them. I apologize for the errors!

Related links

Categories: Other

Good Blog Bad Blog

Denes Kubicek - Sat, 2014-12-06 03:29
I just checked whether http://www.odtug.com/apex is available again, and it is. It seems the people there are filtering blogs, because I don't see my blog post from yesterday appearing there and I don't understand why. Is that just because I said that the old blog listing was much better? Or is this just another technical problem they have? Am I going to be removed from that blog listing forever if I continue saying things which they may not like?
Categories: Development

UKOUG 2014 : Are you there?

Angelo Santagata - Fri, 2014-12-05 09:55

I'm going to be at UKOUG next week helping out with the AppsTech 2014 Apps "Just Do It Workshop"...

Are you going to be there? If so, come and find me on Monday in the Executive Rooms. On Tuesday/Wednesday I'll be a "participant", attending the various presentations on Cloud, integration technologies, Mobile and ADF. Come and find me :-)

 https://blogs.oracle.com/fadevrel/entry/don_t_miss_us_at


Getting JDeveloper HttpAnalyzer to easily work against SalesCloud

Angelo Santagata - Fri, 2014-12-05 09:48

Hey all

Little tip here. If you’re trying to debug some Java code working against Sales Cloud, one of the tools you might try to use is the HTTP Analyzer. Alas, I couldn’t get it to recognize the Oracle Sales Cloud security certificate, and the current version of JDeveloper (11.1.1.7.1) doesn’t give you an option to ignore the certificate.

However, there is a workaround: simply start JDeveloper using a special flag which tells JDeveloper’s HTTP Analyzer to trust everybody!

jdev -J-Djavax.net.ssl.trusteverybody=true

Very useful… and obviously for testing and development it’s OK, but not for anything else.

For more information please see this  Doc reference

Log Buffer #400, A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2014-12-05 09:40

Another centurion mark achieved by the Log Buffer as it reaches 400. The freshness and uniqueness of Log Buffer are still as youthful as they were with edition 1. Enjoy the gems of Oracle, SQL Server and MySQL.

Oracle:

What Cloud Infrastructure Will Best Deliver?

Adaptive Case Management 12c and ADF Human Tasks.

What Does “Backup Restore Throttle Speed” Wait Mean?

All You Need, and Ever Wanted to Know About the Dynamic Rolling Year.

Using grant connect through to manage database links.

The Future of Oracle Forms Straight From the Source’s Mouth.

SQL Server:

Create a repository of all your database devices and stay informed about changes in their size and usage.

When a hospital’s mission-critical database fails at Christmas, disaster for the hospital – and its hapless DBA – seems certain. With less than an hour to spare before catastrophe, can the DBA Team save the day?

How do you use SQL Server, and how do you expect this to change next year?

How can you get a list of columns that have changed within a trigger in T-SQL? How can you see what bits are set within a varbinary or integer? How would you pass a bitmap parameter to a system stored procedure?

Have you ever wanted to run a query across every database on a server with the convenience of a stored procedure? If so, Microsoft provided a stored procedure to do so. It’s unreliable, outdated, and somewhat obfuscated, though. Let’s improve on it!

MySQL:

Thanks, Oracle, for fixing the stupid and dangerous SET GLOBAL sql_log_bin!

Auto-bootstrapping an all-down cluster with Percona XtraDB Cluster.

Proposal to deprecate collation_database and character_set_database settings.

Puppet is a powerful automation tool that helps administrators manage complex server setups centrally. You can use Puppet to manage MariaDB.

Tips from the trenches for over-extended MySQL DBAs.

Categories: DBA Blogs

Join Us For a Networking Event at UKOUG

Pythian Group - Fri, 2014-12-05 09:25
UKOUG event photo

Ask not what you can do for your data. Ask what your data can do for you!

Join us for an informal networking event alongside Rittman Mead on Monday December 8th during UKOUG. We will be discussing how to leverage data to drive your organization’s success. Come meet with peers and industry experts, Mark Rittman and Jon Mead of Rittman Mead, and Marc Fielding and Christo Kutrovsky of Pythian. The networking event will take place at PanAm Bar and Restaurant in Liverpool from 6-8 PM, and will include drinks and light refreshments.

Please be sure to RSVP to the event here—we hope to see you there! Find more information about Pythian’s speaking sessions here.

Questions? Please contact Elliot Zissman, Director of Sales at zissman@pythian.com.

Categories: DBA Blogs

Are All Your Project Managers Certified?

WebCenter Team - Fri, 2014-12-05 09:14

Originally posted on the Redstone Content Solutions blog
____________________________________________________________________________________________________________________________________

"We place a high value on the manner and effectiveness in which we manage our client’s projects."

Many companies over the years have made this or similar statements to their customers, which begs the question: “What, if anything, have they done to assure their customers that they mean what they say and that it is truly a top priority to them?”

Okay, we admit it. We have made statements similar to the one above here at Redstone Content Solutions, but to us it is not merely a statement used to close deals or woo customers into doing business with us. We truly believe that process and knowledge are key to delivering our customers the most effective and efficient Oracle WebCenter project experience possible. One of the ways we accomplish this is by investing in our project managers on a continuous basis. In fact, all of Redstone’s project managers are trained and certified Project Management Professionals (PMPs).

Read the entire article here