Fusion Middleware

Memories of the way we were...

Greg Pavlik - Sat, 2014-05-31 16:13
The fascinating thing about Hadoop is the obviousness of its evolutionary needs. For example, MapReduce coupled with reliable scale-out storage had a powerful - even revolutionary - effect for organizations with both large volumes of data and multi-structured data. Out of the gate, Hadoop unlocked data "applications" that were for all intents and purposes unimplementable before. At the same time, it didn't take much imagination to see that separating the compute model from resource management would be essential for future applications that did not fit well with MapReduce itself. It took a lot of work and care to get YARN defined, implemented and hardened, but the need for YARN itself was fairly obvious. Now it is here and Hadoop is no longer just about "batch" data processing.

Note, however, that it takes a lot of work to make these evolutionary changes available. In some cases, bolt-on solutions have emerged to fill the gap. For key-value data management, HBase is a perfect example. Several years ago, Eric Baldeschwieler was pointing out that HDFS could have filled that role. I think he was right, but the time it would have taken to get "HBase-type" functionality implemented via HDFS would have been a very long path indeed. In that case, the community filled the gap with HBase, and it is being "back integrated" into Hadoop via YARN in a way that will make for a happier co-existence.

Right now we are seeing multiple new bolt-on attempts to add functionality to Hadoop. For example, there are projects to add MPP databases on top of Hadoop itself. It's pretty obvious that this is at best a stopgap again - and one that comes at a pretty high price. I don't know of anyone who seriously thinks that a bolt-on MPP is ultimately the right model for the Hadoop ecosystem. Since the open source alternatives look to be several years away from being "production ready", that raises an interesting question: is Hadoop evolution moving ahead at a similar or even more rapid rate to provide a native solution - a solution that will be more scalable, more adaptive and more open to a wider range of use cases and applications, including alternative declarative languages and compute models?

I think the answer is yes: while SQL on Hadoop via Hive is really the only open source game in town for production use cases - and it's gotten some amazing performance gains in the first major iteration on Tez that we'll talk more about in the coming days - it's clear that the Apache communities are beginning to deliver a new series of building blocks for data management at scale and speed: Optiq's cost-based optimizer; Tez for structuring multi-node operator execution; ORC and vectorization for optimal storage and compute; HCat for DDL. But what's missing? Memory management. And man, has it ever been missing - that should have been obvious as well (and it was - one reason that so many people are interested in Spark for efficient algorithm development).
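
To make those building blocks concrete, here is a minimal sketch of driving them from Java - assuming a HiveServer2 endpoint with the hive-jdbc driver on the classpath; the host name and table are hypothetical, but the settings shown are the standard Tez and vectorization switches:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class OrcOnTezDemo {
        public static void main(String[] args) throws Exception {
            // Standard HiveServer2 JDBC driver; the endpoint is hypothetical
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive.example.com:10000/default", "user", "");
            Statement stmt = conn.createStatement();

            // Run this session on Tez with vectorized execution enabled
            stmt.execute("SET hive.execution.engine=tez");
            stmt.execute("SET hive.vectorized.execution.enabled=true");

            // ORC provides the columnar layout that vectorization operates on
            stmt.execute("CREATE TABLE IF NOT EXISTS events "
                + "(id BIGINT, payload STRING) STORED AS ORC");

            stmt.close();
            conn.close();
        }
    }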

What we've seen so far are the two extremes of memory management (especially for SQL) - all disk and all memory. An obvious point here is that neither is ultimately right for Hadoop. This is a long-winded intro to point to two interrelated pieces by Julian Hyde and Sanjay Radia unveiling a model that is being introduced across multiple components, called Discardable In-memory Materialized Query (DIMMQ). Once you see this model, it becomes obvious that the future of Hadoop for SQL - and not just SQL - is being implemented in real time. Check out both blog posts:

http://hortonworks.com/blog/dmmq/

http://hortonworks.com/blog/ddm/


MDM isn't about data quality, it's about collaboration

Steve Jones - Tue, 2014-05-27 10:00
I'm going to state a sacrilegious position for a moment: the quality of data isn't a primary goal in Master Data Management. Now, before the perfectly correct 'Garbage In, Garbage Out' objection, let me explain. Data Quality is certainly something that MDM can help with, but it's not actually the primary aim of MDM. MDM is about enabling collaboration, and collaboration is about the cross-reference
Categories: Fusion Middleware

Data Lakes will replace EDWs - a prediction

Steve Jones - Fri, 2014-05-23 15:14
Over the last few years there has been a trend of increased spending on BI, and that trend isn't going away. The analyst predictions, however, have understandably been based on the mentality that the choice was between a traditional EDW/DW model or Hadoop. With the new 'Business Data Lake' type of hybrid approach it's pretty clear that the shift is underway for all vendors to have a hybrid
Categories: Fusion Middleware

Lipstick on the iceberg - why the local view matters for IT evolution

Steve Jones - Thu, 2014-05-22 13:00
There is a massive amount of IT hype that is focused on what people see: it's about the agile delivery of interfaces, about reporting, visualisation and interaction models. If you could weigh hype then it is quite clear that 95% of it is about this area. It's why we need development teams working hand-in-hand with the business, and it's why animations and visualisation are massively important.
Categories: Fusion Middleware

How to select a Hadoop distro - stop thinking about Hadoop

Steve Jones - Thu, 2014-05-22 10:00
Sqoop, Flume, Pig, ZooKeeper. Do these mean anything to you? If they do, then the odds are you are looking at Hadoop. The thing is that while that was cool a few years ago, it really is time to face it: HDFS is a commodity, MapReduce is interesting but not feasible for most users, and the real question is how we turn all that raw data in HDFS into something we can actually use. That means
Categories: Fusion Middleware

Deep Dive: Oracle WebCenter Tips and Traps!

Bex Huff - Tue, 2014-04-08 18:26

I'm currently at IOUG Collaborate 2014 in Las Vegas, and I recently finished my 2-hour deep dive into WebCenter. I collected a bunch of tips & tricks in 5 different areas: metadata, contribution, consumption, security, and integrations.

As usual, a lot of good presentations this year, but the Collaborate Mobile App makes it a bit tough to find them...

Bezzotech will be at booth 1350, right by Oracle; be sure to swing by and register for a free iPad, or even a free consulting engagement!

read more

Categories: Fusion Middleware

Microservices is SOD all within SOA

Steve Jones - Tue, 2014-03-25 10:06
Microservices is a Service Oriented Delivery approach, all within a Service Oriented Architecture context. (Long title ;) OK, so a few more updates since the last time I wrote about Microservices, and I think it's worth just updating, as it really is heavily underlining why Microservices is a Service Oriented Delivery approach that absolutely can fit within a Service Oriented Architecture. Let's be
Categories: Fusion Middleware

Microservices is SOA, for those who know what SOA is.

Steve Jones - Tue, 2014-03-18 10:05
OK, so it's started a bit of debate on Twitter and now there have been emails, but in the spirit of openness I thought I'd better blog. Now it's good that Martin has added a sidebar on SOA to his article on Microservices, but that really makes it worse in many ways. I'll get to that at the end, but first off let's explain why Microservices is just another SOA implementation pattern. It's SOD
Categories: Fusion Middleware

What is real-time? Depends on who you ask

Steve Jones - Wed, 2014-03-12 13:29
"Real-time" its a word that gets thrown about a lot in IT and its worth documenting a few of the different ways it gets used Hard Real-time This is what Real-time Java was created to address (along with Soft Real-time) what is this?  Easiest way to say it is that often in Hard Real-time environments the following statement is true If it doesn't finish in X milliseconds then people might die So
Categories: Fusion Middleware

Microservices - Money for old rope or re-badging SOA for the cool kids

Steve Jones - Tue, 2014-03-11 13:00
Hat tip to John Evdemon for the heads up on this one. Martin Fowler is peddling a new approach, 'Microservices', which... wait for it... is a way of developing applications as a suite of services, each of which has its own process thread and 'communicates via lightweight mechanisms' such as... over HTTP. But wait, there's more: you'll be stunned to know that these services can be built
Categories: Fusion Middleware

What are the types of Data Scientist?

Steve Jones - Tue, 2014-03-11 11:00
There are various views going around on what a Data Scientist is, what their value is to an organisation, and the salaries they command. To me, however, asking 'what is a Data Scientist?' is like asking 'what is a Physicist?' - sure, 'someone who studies Physics' might be factually accurate, but it's a pointless definition. How does that separate someone who did Physics in high school from Albert
Categories: Fusion Middleware

BI change is coming, time to get over it and get on with the job

Steve Jones - Fri, 2014-03-07 11:15
One of the things that always stuns me in IT is how people don't appear to like change. Whether it was the EAI folks pushing back on Web Services in 2000 in favour of their old-school approaches, the package guys pushing back against SaaS, or now the BI guys pushing back against the new wave of BI technologies and approaches, the message is always the same: we are happy doing what we are doing,
Categories: Fusion Middleware

The next big wave of IT is Software Development

Steve Jones - Mon, 2014-03-03 10:25
I can smell a change coming. The last few years have seen cloud and SaaS on the rise, a fragmentation in application development (thanks in large part to the appalling stewardship of Java), and a real focus of budgets around BI and 'vanilla' package approaches. Now this is a good thing, both because I jumped out of the Java boat onto the BI boat a few years ago, but also because it's
Categories: Fusion Middleware

Software Development Wave 4: back to the package

Steve Jones - Mon, 2014-03-03 10:20
The end of the next Software Development wave will be when software development again 'eats itself', as it did with technologies like Hadoop showing a new value in information, with platforms like SFDC showing new pre-built services, and with people like GoodData turning BI into SaaS. So we will see the same evolution again and a new generation of commoditisation which drives
Categories: Fusion Middleware

Configure Coherence HotCache

Edwin Biemond - Tue, 2014-02-04 22:29
Coherence can really accelerate and improve your application because it's fast, highly available, easy to set up and scalable. And when you use it together with the JCache framework (JSR-107) or the new Coherence Adapter in Oracle SOA Suite and OSB 12c, it becomes even easier to use Coherence as your main HA cache. Before Coherence 12.1.2, when you wanted to use Coherence together with
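
As a rough sketch of the JCache (JSR-107) side of this - assuming a JCache provider such as Coherence's is on the classpath, and with a hypothetical cache name - the API boils down to this:

    import javax.cache.Cache;
    import javax.cache.CacheManager;
    import javax.cache.Caching;
    import javax.cache.configuration.MutableConfiguration;

    public class JCacheDemo {
        public static void main(String[] args) {
            // Resolves to whichever JCache provider is on the classpath
            // (with Coherence present, its provider would be picked up)
            CacheManager manager = Caching.getCachingProvider().getCacheManager();

            MutableConfiguration<String, String> config =
                new MutableConfiguration<String, String>()
                    .setTypes(String.class, String.class);

            // "orders" is a hypothetical cache name
            Cache<String, String> cache = manager.createCache("orders", config);
            cache.put("order-1", "pending");
            System.out.println(cache.get("order-1")); // prints "pending"
        }
    }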

WebCenter LDAP Filters Explained!

Bex Huff - Fri, 2014-01-24 18:42

We recently had a client with some LDAP performance issues, and needed to tune how WebLogic was querying their LDAP repository. In WebLogic, the simplest way to do this is with the LDAP filters. While trying to explain how to do this, I was struck by the lack of clear documentation on what exactly these filters are and why on earth you would need them... The best documentation was in the WebCenter guide, but it was still a bit light on the details.

Firstly, all these filters use LDAP query syntax. For those familiar with SQL, LDAP query syntax looks pretty dang weird... mainly because it uses prefix (Polish) notation to construct the queries. So if you wanted all Contact objects in the LDAP repository with a Common Name that began with "Joe", your query would look like this:

    (&(objectClass=contact)(cn=Joe*))

Notice how the ampersand AND operator is at the front, and the conditionals are each in their own parentheses. Also note the * wildcard. If you wanted to grab all Group objects that had either Marketing or Sales in the name, the query would look like this:

    (&(objectClass=group)(|(cn=*Marketing*)(cn=*Sales*)))

Notice that the pipe OR operator prefixes the conditionals checking for Marketing or Sales in the group. Of course, this would not be a great query to run frequently... substring searches are slow, and multiple substring searches are even worse!

Below are what these filters do, and why I think you'd need to change them...

All Users Filter: This is basically the initial filter to grab all "user" objects in the entire repository. LDAP stores all kinds of objects (groups, contacts, computers, domains), and this is a simple query to narrow the list of user objects from the collection of all objects. Common settings include:

    (objectclass=user)
    (&(objectCategory=person)(objectClass=user))
    (sAMAccountType=805306368)

Users From Name Filter: This is a query to find a user object based on the name of the user. It is a sub-filter of the previous All Users Filter that grabs one specific person based on the user name. You would sometimes change this based on what single sign-on system you are using: some use the common name as the official user ID, whereas other systems use the sAMAccountName. The %u token is the name being looked up. One of these two usually works:

    (&(cn=%u)(objectclass=user))
    (&(sAMAccountName=%u)(objectclass=user))

All Groups Filter: Similar to the All Users Filter, this filter narrows the list of all objects in the LDAP repository to just the list of groups. By default, most applications just grab all group objects with this filter:

    (objectCategory=group)

However, if you have a particularly large LDAP repository, this can be a performance problem. We usually don't need all the groups defined in the repository; we just need the ones with a specific name:

    (&(objectCategory=group)(|(cn=Sales)(cn=Marketing)))

Or the ones under a specific organizational unit:

    (&(objectCategory=group)(|(ou:dn:=Sales)(ou:dn:=Marketing)))

Then the list of group objects to query based on name is much smaller and faster.

Group From Name Filter: Similar to the Users From Name Filter, this filter looks up a specific group by name (the %g token). Again, the value here usually depends on what single sign-on solution you are using, but one of these two usually works:

    (&(cn=%g)(objectclass=group))
    (&(sAMAccountName=%g)(objectclass=group))

Hopefully that clears things up a bit! If you have performance problems, your best bet is to modify the All Groups Filter and the All Users Filter to only grab the groups and users relevant to your specific app.
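
For the curious, here is roughly what running one of these filters yourself looks like - a minimal JNDI sketch with a hypothetical host, base DN and user name, using the standard {0} placeholder where WebLogic substitutes %u:

    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.NamingEnumeration;
    import javax.naming.directory.DirContext;
    import javax.naming.directory.InitialDirContext;
    import javax.naming.directory.SearchControls;
    import javax.naming.directory.SearchResult;

    public class LdapFilterDemo {
        public static void main(String[] args) throws Exception {
            Hashtable<String, String> env = new Hashtable<String, String>();
            env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
            env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389"); // hypothetical host
            DirContext ctx = new InitialDirContext(env);

            // Users From Name Filter, with {0} where WebLogic puts %u
            String filter = "(&(sAMAccountName={0})(objectclass=user))";
            SearchControls controls = new SearchControls();
            controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

            // "jdoe" and the base DN are hypothetical
            NamingEnumeration<SearchResult> results = ctx.search(
                "dc=example,dc=com", filter, new Object[] { "jdoe" }, controls);
            while (results.hasMore()) {
                System.out.println(results.next().getNameInNamespace());
            }
            ctx.close();
        }
    }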

read more

Categories: Fusion Middleware

REST, SSE or WebSockets on WebLogic 10.3.6

Edwin Biemond - Wed, 2014-01-15 14:10
WebLogic 10.3.6 comes with Jersey 1.9 and has no support for Server-Sent Events or WebSockets. But for one of our projects we are making an HTML5 / AngularJS application, which needs to invoke some RESTful services, and we also want to make use of SSE or WebSockets. Of course we could use WebLogic 12.1.2, but we already have an OSB / SOA Suite WebLogic 10.3.6 environment. So when you want to pimp your
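
As a rough illustration, a Jersey 1.9-era JAX-RS resource looks something like this (the path and payload are hypothetical; on WebLogic 10.3.6 the resource would be served by registering Jersey's com.sun.jersey.spi.container.servlet.ServletContainer servlet in web.xml):

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.Produces;

    // Hypothetical resource; Jersey 1.x implements the JAX-RS 1.1 annotations
    @Path("/status")
    public class StatusResource {

        @GET
        @Produces("application/json")
        public String status() {
            // Hand-built JSON keeps the example dependency-free
            return "{\"state\":\"running\"}";
        }
    }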

New Puppet 3 WebLogic provisioning module

Edwin Biemond - Sun, 2013-11-24 14:22
The last few weeks I was busy rewriting my Puppet WLS module so it fully supports the power of Puppet 3 (thanks for more than 4000 downloads on Puppet Forge and all the GitHub downloads). With Puppet 3 we can now use Hiera, iterations and lambda expressions. This does not sound like a big change, but with Hiera and the new Puppet language features I can define big WebLogic domains without

Creating your own VirtualBox Development Images

Edwin Biemond - Sat, 2013-11-16 15:11
For my Oracle Puppet provisioning development I can't do without these image-creation tools: Packer and Vagrant, in combination with Oracle VirtualBox or VMware. In this blog post I will explain what these tools can do for you, and how you can make your own images and use Puppet as a provisioning tool. With Vagrant you can create your own virtual images, and it can start Puppet or Chef to do all the
