
Feed aggregator

Adding An EMC XtremIO Volume As An ASM Disk With Oracle Database 12c – It Does Not Get Any Easier Than This.

Kevin Closson - 1 hour 50 min ago
When Something Is Simple It Must Be Simple To Prove

Provisioning high-performance storage has always been a chore. Care and concern over spindle count, RAID type, RAID attributes, number of controller arms involved and a long list of other complexities have burdened storage administrators. Some of these troubles were mitigated by the advent of Automatic Storage Management–but not entirely.

Wouldn’t it be nice if the complexity of storage provisioning could be boiled down to but a single factor? Wouldn’t it be nice if that single factor was, simply, capacity? With EMC XtremIO the only factor storage administrators need to bear in mind when provisioning storage is, indeed, capacity.

With EMC XtremIO a storage administrator hears there is a need for, say, one terabyte of storage and that is the entirety of information needed. No more questions about the I/O pattern (e.g., large sequential writes à la redo logging, etc.). The Database Administrator simply asks for capacity with a very short sentence and the Storage Administrator clicks three buttons in the XtremIO GUI and that’s all there is to it.

Pictures Speak Thousands of Words

I too enjoy the simplicity of XtremIO in my engineering work. Just the other day I ran short on space in a tablespace while testing Oracle Database 12c intra-node parallel query. I was studying a two-node Real Application Clusters setup attached to an EMC XtremIO array via 8 paths of 8GFC Fibre Channel. The task at hand was a single parallel CTAS (Create Table As Select) but the command failed because my ASM disk group ran out of space when Oracle Database tried to extend the BIGFILE tablespace.

Since I had to add some space I thought I’d take a few screen shots to show readers of this blog how simple it is to perform the full cycle of tasks required to add space to an active cluster with ASM in an XtremIO environment.

The following screen shot shows the error I was reacting to:


The following screenshot shows the XtremIO GUI configuration tab. I selected “Add” and then typed a name and size (1TB) of the volume I wanted to create:



The following screenshot shows how I then selected the initiators (think hosts) from the right-hand column that I wanted to see the new volume:


After I clicked “apply” I could see my new volume in my “12C” folder. With the folder construct I can do things like create zero-overhead, immediate, writable snapshots with a single mouse click. As the following screenshot shows, I highlighted “data5” so I could get details about the volume in advance of performing tasks on the host. The properties tab shows me the only information I need to proceed–the NAA Identifier. Once I had the NAA Identifier I moved on to the task of discovering the new volume on the hosts.



Host Discovery

Host discovery consists of three simple steps:

  1. Multipath discovery
  2. Updating the udev rules file with a text editor
  3. Updating udev state with udevadm commands

Multipath Discovery

On both nodes of the cluster I executed the following series of commands. This series of commands generates a lot of terminal output so I won’t show that in this blog post.

# multipath -F ; service multipathd restart ; multipath -r

After executing the multipath related commands I was able to see the new volume (0002a) on both nodes of the cluster. Notice how the volume has different multipath names (mpathab, mpathai) on the hosts. This is not an issue since the volumes will be controlled by udev:


Updating Udev Rules File and Udev State

After verifying the volumes were visible under DM-MPIO I moved on to the udev actions. The following screenshot shows how I added an ACTION line in the udev rules file and copied it to the other RAC host and then executed the udev update commands on both RAC hosts:


I then could see “/dev/asmdisk6” on both RAC hosts:


Adding The New XtremIO Volume As An ASM Disk

The next task was to use ASMCA (ASM Configuration Assistant) to add the XtremIO volume to the ASM disk group called “DATA”:


As the following screenshot shows, the volume is visible as /dev/asmdisk6:


I selected asmdisk6 and the task was complete:


I then saw evidence of ASM rebalancing in the XtremIO GUI Performance tab:




With EMC XtremIO you provision capacity and that allows you to speak in very short sentences with the application owners that share space in the array.

It doesn’t get any easier than this.

Filed under: oracle

Joined twitter

Bobby Durrett's DBA Blog - 3 hours 30 min ago

I joined Twitter. I don’t really know how to use it. I’m set up as Bobby Durrett, @bobbydurrettdba, if that means anything to you. :)

– Bobby

Categories: DBA Blogs

Oracle Priority Support Infogram for 05-MAR-2015

Oracle Infogram - Wed, 2015-03-04 15:12

Oracle Support
Excellent course on using the new MOS: My Oracle Support Accreditation Series - Level 1 My Oracle Support (Doc ID 1579751.1).
From Oracle Partner Hub: ISV Migration Center Team:
Even More Oracle Database Health Checks with ORAchk and (Beta)
Webcast - Oracle Database 12c High Availability New Features
Backing up MySQL using ZFS Snapshots and Clones, from paulie’s world in a blog.
From Oracle's MySQL Blog: Optimize MySQL Database Performance
Transcript of "Free Open Source Tools" OTN Virtual Technology Session, from Geertjan’s Blog.
Last Solaris 8 and 9 patches released, from Patch Corner.
From Matthew P. Barnson’s blog: Maximizing ZFS Backup Throughput
BPM & SOA Application missing in JDeveloper 12c gallery, from the SOA & BPM Partner Community Blog.
Two from Fusion Applications Performance Issues:
How To Set Stack Size to overcome java.lang.StackOverflowError
BI Publisher Memory Guard Performance Monitor
From the Oracle Exalogic blog: Exalogic Run Book.
Oracle Technologies
From ArchBeat: Top 10 2 Min Tech Tips - February 2015
Oracle Service Cloud
From Oracle EMEA CX Partner Community: Upcoming Webinars for Oracle Service Cloud
The Cumulative Feature Overview Tool, which provides concise descriptions of new and enhanced features and functionality available between starting and target releases and was previously available in Excel form, now has a new web version.
From the Oracle E-Business Suite Support Blog:
Just Released - EBS Financials February 2015 Recommended Patch Collections (RPCs)
From the Oracle E-Business Suite Technology blog:
Reminder: Upgrade Database to by July 2015

Oracle University Instructors on the Cruise Ship

The Oracle Instructor - Wed, 2015-03-04 14:14

I’m really looking forward to speaking at the Oracle User Group Norway Spring Seminar 2015, together with my dear colleague Joel Goodman! For sure it’s one of the highlights this year in terms of Oracle events.

Joel will present about Oracle Automatic Parallel Execution on MAR-12, 6pm and about Oracle 12c Automatic Data Optimization and Heat Map on MAR-13, 9:30am.

Yours truly will talk about The Data Guard Broker – Why it is recommended on MAR-12, 6pm and about The Recovery Area – Why it is recommended on MAR-13, 8:30am.

The OUGN board has again gathered an amazing lineup of top-notch speakers for this event, so I will gladly take the opportunity to improve my knowledge :-)

Tagged: #ougn2015
Categories: DBA Blogs

The age of cybermercenaries [VIDEO]

Chris Foot - Wed, 2015-03-04 12:23

Hi, welcome to RDX. Another group of cybercriminals, which has been dubbed Desert Falcons, was discovered by Kaspersky Lab researchers. The organization’s operations typically target the Middle East, but its reach is expanding to Europe and North America.

Desert Falcons specializes in cyberespionage. After infiltrating an infected system, the perpetrators will insert backdoor malware designed to pull data continuously. Its flagship Trojan can take screenshots, log keystrokes, upload files and steal passwords.

The discovery of Desert Falcons is a reminder of how the hacker underground has evolved since the 90s. This particular organization acts as a contractor. This means criminal figures, political groups and other entities can hire Desert Falcons to conduct a variety of surveillance endeavors.

How do you combat these groups? By working with a team of experts to install database monitoring tools and constantly assess back-end systems. Thanks for watching!

The post The age of cybermercenaries [VIDEO] appeared first on Remote DBA Experts.


Kubilay Çilkara - Wed, 2015-03-04 12:15
In this post I will try to show you how I used Oracle Apex and the APEX_WEB_SERVICE PL/SQL package to quickly send a request to a public Internet API and how I handled the response. The code below was written during a 'Hackday' and hasn't been extensively tested.

My use case is integrating Oracle Apex with the public Mendeley REST API for Mendeley Catalog Search.

The idea was to build an application in Oracle Apex to query the Mendeley REST API Catalog with a keyword. The Mendeley REST API returns a JSON response, so I used PL/JSON to parse it. I hear that in Oracle 12c JSON is going to be a native data type. My Oracle Apex host is running Oracle 11g, so I used PL/JSON for ease.

To cut it short, here is how the Mendeley Catalog Search application on Oracle Apex looks.

To integrate with Mendeley REST API from Oracle Apex, I used one PL/SQL function and one procedure.

I used the function to obtain the Mendeley REST API Client Credentials Authorisation flow token and the procedure to make the API request to Mendeley Catalog Search and to handle the response.

Here is the MENDELEY_CALL PL/SQL function I created:

This function returns the Client Credentials Authorisation Flow token from the Mendeley REST API.

create or replace function mendeley_call (p_id in varchar2)
return varchar2
is
  v_token         varchar2(1000);
  token           varchar2(1000);
  jtoken          json;
  v_grant_type    varchar2(400) := 'client_credentials';
  v_client_id     varchar2(500) := p_id;
  v_client_secret varchar2(500) := '<put_your_mendeley_client_secret_here>';
  v_scope         varchar2(300) := 'all';
begin
  /*----------Setting Headers----------------------------------------*/
  apex_web_service.g_request_headers(1).name  := 'Content-Type';
  apex_web_service.g_request_headers(1).value := 'application/x-www-form-urlencoded; charset=utf-8';

  -- Request the token; names and values are colon-separated lists in matching order
  token := apex_web_service.make_rest_request
    (
      p_url         => ''   -- token endpoint URL goes here
    , p_http_method => 'POST'
    , p_parm_name   => apex_util.string_to_table('grant_type:client_id:client_secret:scope')
    , p_parm_value  => apex_util.string_to_table(v_grant_type||':'||v_client_id||':'||v_client_secret||':'||v_scope)
    , p_wallet_path => 'file:/home/app/oracle/product/11.2.0/dbhome_1/owm/wallets/oracle'
    , p_wallet_pwd  => '<put_your_oracle_wallet_password_here>'
    );
  -- debug
  -- dbms_output.put_line(token);
  -- Parse the JSON response with PL/JSON and pull out the access token
  jtoken  := json(token);
  v_token := json_ext.get_string(jtoken,'access_token');
  -- debug
  -- dbms_output.put_line(v_token);
  return v_token;
exception
  when others then
    raise_application_error(-20001,'An error was encountered - '||SQLCODE||' -ERROR- '||SQLERRM);
end;
/
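
For readers who want to see the shape of this exchange outside the database, here is a minimal Python sketch of what the function does: it form-encodes the four client-credentials parameters and then pulls access_token out of the JSON reply. The function names, the client id and secret, and the sample reply are all made up for illustration; no network call is made and no endpoint URL is assumed.

```python
import json
from urllib.parse import urlencode


def build_token_request_body(client_id, client_secret, scope="all"):
    """Form-encode the client-credentials parameters sent in the POST body."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })


def extract_access_token(response_text):
    """Parse the JSON reply and return its access_token field."""
    return json.loads(response_text)["access_token"]


# Hypothetical values, for illustration only.
body = build_token_request_body("my-client-id", "my-secret")
sample_reply = '{"access_token": "abc123", "token_type": "bearer", "expires_in": 3600}'
token = extract_access_token(sample_reply)

print(body)
print(token)
```

The token returned by the second helper plays the same role as v_token in the PL/SQL: it is what goes into the Authorization header of the later search request.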

Here is the anonymous procedure which I put into a PL/SQL region on the Oracle Apex page:

This procedure incorporates the function above and makes the request and handles the response from the Mendeley REST API

Note how the procedure calls the function MENDELEY_CALL (above) to load the variable v_token. 

declare
  v_token  VARCHAR2(599) := mendeley_call('<put_your_mendeley_client_id_here>');
  v_search VARCHAR2(500);
  mendeley_document NCLOB;
  v_status VARCHAR2(100);
  obj json_list;
  v_id VARCHAR2(100);
  v_title NVARCHAR2(1000);
  v_abstract NCLOB;--varchar2(32000);
  v_link     VARCHAR2(1000);
  v_source   VARCHAR2(500);
  v_type     VARCHAR2(100);
  v_pct_hit  VARCHAR2(10);
  v_rows     NUMBER(10);
  v_batch_id NUMBER(10);
begin
  -- Oracle Wallet
  -- Set Authorisation headers and utf8
  -- the following line is necessary if you need to use languages other than latin and
  -- you will use APEX_WEB_SERVICE package
  -- build the Authorisation header
  apex_web_service.g_request_headers(1).name  := 'Content-Type';
  apex_web_service.g_request_headers(1).value := 'application/jsonrequest';
  -- index 2, so the Authorization header does not overwrite Content-Type
  apex_web_service.g_request_headers(2).name  := 'Authorization';
  apex_web_service.g_request_headers(2).value := 'Bearer '||v_token;
  -- Make the request and load the response into a CLOB
  mendeley_document := apex_web_service.make_rest_request
      (
        p_url => ''   -- catalog search endpoint URL goes here
      , p_http_method => 'GET'
      , p_parm_name => apex_util.string_to_table('title:limit')
      , p_parm_value => apex_util.string_to_table('Mendeley:10')
      );
  -- Load the response to JSON_LIST PL/JSON object
  obj := json_list(mendeley_document);
  -- Start extracting values from the JSON and write some HTML
  -- Traverse over JSON_LIST extract elements you like
  FOR i IN 1..obj.count
  LOOP
    v_id       := json_ext.get_string(json(obj.get(i)),'id');
    v_title    := json_ext.get_string(json(obj.get(i)),'title');
    v_abstract := json_ext.get_string(json(obj.get(i)),'abstract');
    v_link     := json_ext.get_string(json(obj.get(i)),'link');
    v_source   := json_ext.get_string(json(obj.get(i)),'source');
    v_type     := json_ext.get_string(json(obj.get(i)),'type');
    -- write extracted data
    dbms_output.put_line(v_title||' ==> '||v_abstract);
  END LOOP;
end;

This shows how easy it is, in this case using one function and one procedure, to make a REST API request to an external Web Service from Oracle Apex.
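
The traversal step can also be sketched in a few lines of Python, which may help when checking what a response should yield before wiring it into Apex. The sample response below is invented and only assumes what the listing above already relies on: a JSON array of documents with fields such as id, title and abstract. The helper name is hypothetical.

```python
import json

# Made-up sample shaped like a catalog search reply: a JSON array of documents.
sample_response = """
[
  {"id": "doc-1", "title": "First paper", "abstract": "About A."},
  {"id": "doc-2", "title": "Second paper"}
]
"""


def summarize_documents(response_text):
    """Return one 'title ==> abstract' line per document in the JSON array."""
    lines = []
    for doc in json.loads(response_text):
        # .get() tolerates documents that lack an optional field.
        title = doc.get("title", "")
        abstract = doc.get("abstract", "")
        lines.append(f"{title} ==> {abstract}")
    return lines


summary = summarize_documents(sample_response)
for line in summary:
    print(line)
```

The second sample document has no abstract on purpose, to show that a missing field needs handling rather than being assumed present.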
Categories: DBA Blogs

Quick update on Tachyon

DBMS2 - Wed, 2015-03-04 12:03

I’m on record as believing that:

That said:

  • It’s an open secret that there will be a Tachyon company. However, …
  • … no details have been publicized. Indeed, the open secret itself is still officially secret.
  • Tachyon technology, which just hit 0.6 a couple of days ago, still lacks many features I regard as essential.
  • As a practical matter, most Tachyon interest to date has been associated with Spark. This makes perfect sense given Tachyon’s origin and initial technical focus.
  • Tachyon was in 50 or more sites last year. Most of these sites were probably just experimenting with it. However …
  • … there are production Tachyon clusters with >100 nodes.

As a reminder of Tachyon basics: 

  • You do I/O with Tachyon in memory.
  • Tachyon data can optionally be persisted.
    • That “tiered storage” capability — including SSDs — was just introduced in 0.6. So in particular …
    • … it’s very primitive and limited at the moment.
    • I’ve heard it said that Intel was a big contributor to tiered storage/SSD support. (Solid-State Drives.)
  • Tachyon has some ability to understand “lineage” in the Spark sense of term. (In essence, that amounts to knowing what operations created a set of data, and potentially replaying them.)

Beyond that, I get the impression that:

  • Synchronous write-through from Tachyon to persistent storage is extremely primitive right now — but even so I am told it is being used in production by multiple companies already.
  • Asynchronous write-through, relying on lineage tracking to recreate any data that gets lost, is slightly further along.
  • One benefit of adding Tachyon to your Spark installation is a reduction in garbage collection issues.

And with that I have little more to say than my bottom lines:

  • If you’re writing your own caching layer for some project you should seriously consider adapting Tachyon instead.
  • If you’re using Spark you should seriously consider using Tachyon as well.
  • I think Tachyon will be a big deal, but it’s far too early to be sure.
Categories: Other

Blueprint for a Post-LMS, Part 1

Michael Feldstein - Wed, 2015-03-04 11:27

By Michael Feldstein

Reading Phil’s multiple reviews of Competency-Based Education (CBE) “LMSs”, one of the implications that jumps out at me is that we see a much more rapid and coherent progression of learning platform designs if you start with a particular pedagogical approach in mind. CBE is loosely tied to a family of pedagogical methods, perhaps the most important of which at the moment is mastery learning. In contrast, questions about why general LMSs aren’t “better” beg the question, “Better for what?” Since conversations of LMS design are usually divorced from conversations of learning design, we end up pretending that the foundational design assumptions in an LMS are pedagogically neutral when they are actually assumptions based on traditional lecture/test pedagogy. I don’t know what a “better” LMS looks like, but I am starting to get a sense of what an LMS that is better for CBE looks like. In some ways, the relationship between platform and pedagogy is similar to the relationship former Apple luminary Alan Kay claimed between software and hardware: “People who are really serious about software should make their own hardware.” It’s hard to separate serious digital learning design from digital learning platform design (or, for that matter, from physical classroom design). The advances in CBE platforms are a case in point.

But CBE doesn’t work well for all content and all subjects. In a series of posts starting with this one, I’m going to conduct a thought experiment of designing a learning platform—I don’t really see it as an LMS, although I’m also not allergic to that term as some are—that would be useful for conversation-based courses or conversation-based elements of courses. Because I like thought experiments that lead to actual experiments, I’m going to propose a model that could realistically be built with named (and mostly open source) software and talk a bit about implementation details like use of interoperability standards. But all of the ideas here are separable from the suggested software implementations. The primary point of the series is to address the underlying design principles.

In this first post, I’m going to try to articulate the design goals for the thought experiment.

When you ask people what’s bad about today’s LMSs, you often get either a very high-level answer—“Everything!”—or a litany of low-level answers about how archiving is a pain, the blog app is bad, the grade book is hard to use, and so on. I’m going to try to articulate some general goals for improvement that are in between those two levels. They will be general design principles. Some of them apply to any learning platform, while others apply specifically to the goal of developing a learning platform geared toward conversation-based classes.

Here are four:

1. Kill the Grade Book

One of the biggest problems with mainstream models of teaching and education is their artificiality. Students complete assignments to get grades. Often, they don’t care about the assignment, and the assignments aren’t often designed to be something that entices students to care about them. To the contrary, they are often designed to test specific knowledge or competency goals, most of which would never be practically tested in isolation in the real world. In the real world, our lives or livelihoods don’t depend solely on knowing how to solve a quadratic equation or how to support an argument with evidence. We use these pieces to accomplish more complex real-world goals that are (usually) meaningful to us. That’s the first layer of artificiality. The second layer is what happens in the grade book. Teachers make up all kinds of complex weighting systems, dropping the lowest score, assigning a percentage weight to different classes of assignments, grading on curves, and so on. Faculty often spend a lot of energy first creating and refining these schemes and then using them to assign grades. And they are all made up, artificial, and often flawed. (For example, many faculty who are not in mathematically heavy disciplines make the mistake at one time or another of mixing points with percentage grades, and then spend many hours experimenting with complex fudge factors because they don’t have an intuition of how those two grading schemes interact with each other.)
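
To make the points-versus-percentages trap concrete, here is a small worked sketch in Python. The two assignments and their scores are invented for illustration, and the function names are hypothetical; the point is only that pooling raw points and averaging percentages can give the same student two quite different grades.

```python
def pooled_points_grade(scores):
    """Grade as total points earned over total points possible."""
    earned = sum(e for e, _ in scores)
    possible = sum(p for _, p in scores)
    return 100.0 * earned / possible


def mean_of_percentages_grade(scores):
    """Grade as the plain average of each assignment's percentage."""
    return sum(100.0 * e / p for e, p in scores) / len(scores)


# Hypothetical student: (points earned, points possible) per assignment.
scores = [(5, 5), (60, 100)]  # a 5-point quiz and a 100-point exam

pooled = pooled_points_grade(scores)
averaged = mean_of_percentages_grade(scores)

print(f"pooled points: {pooled:.1f}%")
print(f"mean of percentages: {averaged:.1f}%")
```

With these made-up numbers the pooled-points grade is about 61.9% while the mean of percentages is 80%, because pooling silently weights the exam twenty times as heavily as the quiz.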

Some of this artificiality is fundamentally baked into the foundational structures of schooling and accreditation, but some of it is contingent. For example, while CBE approaches don’t, in and of themselves, do anything to get rid of the artificiality of the schooling tasks themselves (and may, in fact, exacerbate them, depending on the course design), they can reduce or eliminate a traditional grade book, particularly in mastery learning courses. With CBE in general, you have a series of binary gates: Either you did demonstrate competency or you didn’t. You can set different thresholds, and sometimes you can assess different degrees of competency. But at the end of the day, the fundamental grading unit in a CBE course is the competency, not the quiz or assignment. This simplifies grading tremendously. Rather than forcing teachers to answer questions like, “How many points should each in-class quiz be, and what percentage of the total grade should they count for,” teachers instead have to answer questions like, “How much should students’ ability to describe a minor seventh chord count toward their music theory course grade?” The latter question is both a lot more straightforward and more focused on teachers’ intuitions about what it means for a student to learn what a class has to teach.
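
Here is a minimal sketch, in Python, of what competency-weighted grading could look like. The competency names, the weights, and the pass/fail results are hypothetical, and the function name is made up; this is one simple reading of the idea (binary gates with one weight per competency), not a description of any particular CBE platform.

```python
def competency_grade(weights, demonstrated):
    """Percentage of total competency weight the student has demonstrated."""
    total = sum(weights.values())
    earned = sum(w for name, w in weights.items() if demonstrated.get(name, False))
    return 100.0 * earned / total


# Hypothetical music theory course; names and weights are invented.
weights = {
    "describe a minor seventh chord": 2,
    "identify key signatures": 3,
    "harmonize a melody": 5,
}
demonstrated = {
    "describe a minor seventh chord": True,
    "identify key signatures": True,
    "harmonize a melody": False,
}

grade = competency_grade(weights, demonstrated)
print(f"{grade:.0f}%")
```

The only knob the teacher turns here is the weight on each competency, which is the kind of question the paragraph above argues teachers are well placed to answer.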

[Figure: Master Scale. Details of LoudCloud’s CBE Platform]

Nobody likes a grade book, so let’s see how close we can get to eliminating the need for one. In general, we want a grading system that enables teachers to make adjustments to their course evaluation system based on questions that are closely related to their expertise—i.e., what students need to know and whether they seem to know it—rather than on their skills in constructing complex weighting schemes. The mechanism by which we do so will be different for discussion-based course components than for many typical implementations of CBE, particularly machine-graded CBE, but I believe that a combination of good course design and good software design can actually help reduce both layers of grading artificiality that I mentioned above.

2. Use scale appropriately

Most of the time the word “scale” used in an educational context attaches to a monolithic, top-down model like MOOCs. It takes a simplistic view of Baumol’s Cost Disease (which is probably the wrong model of the problem to begin with) and boils down to asking, “How can we reduce the per-student costs by cramming more students into the same class?” I’m more interested in a different question: What new models can we develop that harness both the economic and the pedagogical benefits of large-scale classes without sacrificing the value of teacher-facilitated cohorts? Models like Mike Caulfield’s and Amy Collier’s distributed flip, or FemTechNet’s Distributed Open Collaborative Courses (DOCCs). There are almost certainly some gains to be made using these designs in increasing access by lowering cost. They might (or might not) be more incremental than the centralized scale-out model, but they should hopefully not come with the same trade-offs in educational quality. In fact, they will hopefully improve educational quality by harnessing global resources (including a global community of peers for both students and teachers) while still preserving the local support. And I think there’s actually a potential for some pretty significant scaling without quality loss when the model I have in mind is used in combination with a CBE mastery learning approach in a broader, problem-based learning course design. More on that later.

Another kind of scaling that interests me is scaling (or propagating) changes in pedagogical models. We know a lot about what works well in the classroom that never gets anywhere because we have few tools for educating faculty about these proven techniques and helping them to adopt them. I’m interested in creating an environment in which teachers share learning design customizations by default, and teachers who create content can see what other teachers are doing with it—and especially what students in other classes are doing with it—by default. Right now, there is a lot of value to the individual teacher of being able to close the classroom door and work unobserved by others. I would like to both lower barriers to sharing and increase the incentives to do so. The right platform can help with that, although it’s very tricky. Learning Object Repositories, for example, have largely failed to be game changers in this regard, except within a handful of programs or schools that have made major efforts to drive adoption. One problem with repositories is that they demand work on the part of the faculty while providing little in the way of rewards for sharing. If we are going to overcome the cultural inhibitions around sharing, then we have to make the barrier as low as possible and the reward as high as possible.

3. Assess authentically through authentic conversations

Circling back to the design goal of killing the grade book, what we want to be able to do is directly assess the student’s quality of participation, rather than mediate it through a complicated assignment grading and weighting scheme. Unfortunately, the minute you tell students they are getting a “class participation” grade, you immediately do massive damage to the likelihood of getting authentic conversation and completely destroy the chances that you can use the conversation as authentic assessment. People perform to the metrics. That’s especially true when the conversations are driven by prompts written by the teacher or textbook publisher. Students will have fundamentally different types of conversations if their conversations are not isolated graded assignments but rather integral steps on their way to accomplish some larger task. Problem-Based Learning (PBL) is a good example. If you have a course design in which students have to do some sort of project or respond to some sort of case study, and that project is hard and rich enough that students have to work with each other to pool their knowledge, expertise, and available time, you will begin to see students act as authentic experts in discussions centered around solving the course problem set before them.

A good example of this is ASU’s Habitable Worlds, which I have blogged about in the past and which will be featured in an episode of the aforementioned e-Literate TV series. Habitable Worlds is roughly in the pedagogical family of CBE and mastery learning. It’s also a PBL course. Students are given a randomly generated star field and are given a semester-long project to determine the likelihood that intelligent life exists in that star field. There are a number of self-paced adaptive lessons built on the Smart Sparrow platform. Students learn competencies through those lessons, but they are competencies that are necessary to complete the larger project, rather than simply a set of hoops that students need to jump through. In other words, the competency lessons are resources for the students. They also happen to be assessments, but that’s not the only reason, and hopefully not the main reason, students have to care about them anymore. The class discussions can be positioned in the same way, given the right learning design. Here’s a student featured in our e-Literate TV episode talking about that experience:

Click here to view the embedded video.

The way the course is set up, students use the discussion board for authentic science-related problem solving. In doing so, they are exhibiting competencies necessary to be a good scientist (or a good problem solver, or a supportive member of a problem-solving team). They have to know when to search for information that already exists on the discussion board, how to ask for help when they are stuck, how to facilitate a problem-solving conversation, and so on. And these are, in fact, more valuable competencies for employers, society, and the students themselves than knowing the necessary ingredients for a planet to be habitable (for example). Yet we generally ignore these skills in our grading and pretend that the knowledge quizzes tell us what we need to know, because those are easier to assess. I would like for us to refuse to settle for that anymore.

This is a great example of how learning design and learning platform design can go hand-in-hand. If the platform and learning design work together to enable students to have their discussions within a very large (possibly global) group of learners who are solving similar problems, then there are richer opportunities to evaluate students’ respective abilities to demonstrate both expertise and problem-solving skills across a wide range of social interactions. Assuming a distributed flip model (where faculty are teaching their own classes on their own campuses with their own students but also using MOOC-like content and discussions that multiple other classes are also using), if you can develop analytics that help the local teachers directly and efficiently evaluate students’ demonstrated skills in these conversations, then you can feed the output of the analytics, tweaked by faculty based on which criteria for evaluating students’ participation they think are most important, into a simplified grading system. I’ll have a fair bit to say about what this could look like in practice in a later post in this series.

4. Leverage the socially constructed nature of expertise (and therefore competence)

Why do colleges exist? Once upon a time, if you went to a local blacksmith that you hadn’t been to before, you could ask your neighbor about his experience as a customer or look at the products the blacksmith produced. If you wanted to hire somebody you didn’t know to work in your shop, you would do the same. You’d generally get a holistic evaluation with some specific examples. “Oh, he’s great. My horse has five hooves. He figured out how to make a special shoe for that fifth hoof and didn’t even charge me extra!” You might gather a few of these stories and then make your decision. One thing you would not do is make a list of the 193 competencies that a blacksmith should have and check to see whether he’s been tested against them.

For a variety of reasons, it’s not that simple to evaluate expertise anymore. Credentialing institutions have therefore become proxies for these sorts of community trust network. “I don’t know you, but you graduated from Harvard, and I believe Harvard is a good school.” There was some of that in the early days—“I don’t know you, but you apprenticed with Rolf, and I trust Rolf”—but the universities (and other guilds) took this proxy relationship to the next step by asking people to invest their trust in the institution rather than the particular teacher. The paradox is that, in order to justify their authority as reputation proxies, these institutions came under increasing pressure to produce objective sounding assessments of their students’ expertise. As we go further and further down this road, these assessments look less and less like the original trust network assessment that the credential is supposed to be a proxy for. This may be one reason why a variety of measures show employers don’t pay much attention to where prospective employees get their degrees and don’t have a high opinion of the degree to which college is preparing students for their careers. As somebody who has made hiring decisions in both big and small companies, I can tell you that I don’t remember even looking at the prospective employees’ college credentials. The first screening was based on what work they had done for whom. If the positions had been entry-level, I might have looked at their college backgrounds, but even there, I probably would have looked more at the recommendations, extra-curricular activities, and any portfolio projects. In other words, who will vouch for you, what you are passionate about, and what work you can show. At most, the college degree is a gateway requirement except in a few specific fields. You may have to have one in order to be considered for some jobs, but it doesn’t help you actually land those jobs. 
And there is little evidence I am aware of that increasingly fine-grained competency assessments improve the value of the credential. This isn’t to say that there is no assessment mechanism better than the old ways. Nor is it to say anything about the value of CBE either for pedagogical purposes (e.g., the way it is used in the Habitable Worlds example above) or for increasing access to education (and educational credentials) through prior learning assessments and the ability to free the students from the tyranny of seat time requirements. It’s just to say that it’s not clear to me that the path toward exhaustive assessment of fine-grained competencies leads us anywhere useful in terms of the value of the credential itself or in fostering the deep learning that a college degree is supposed to certify. In fact, it may be harmful in those respects.

If we could muster the courage to loosen our grip on the current obsession with objective, knowledge-based certification, we might discover that the combination of digital social networks and various types of analytics holds out the promise that we can recreate something akin to the original community trust network at scale. Participants (students, in our case) could be evaluated on their expertise based on whether people with good reputations in their community (or network) think that they have demonstrated expertise. Just as they always have been. And the demonstration of that expertise will be on full display for direct evaluation because the conversation(s) in which the demonstration(s) occurred and got judged by our trusted community members are on permanent digital display.[1] The learning design creates situations in which students are motivated to build trust networks in the pursuit of solving a difficult, college-level problem. The platform helps us to externalize, discover, and analyze these local trust networks (even if we don’t know any of the participants).

* * *

Those are the four main design goals for the series. (Nothing too ambitious.) In my next post, I’ll lay out the use case that will drive the design.



  1. Hat tip to Patrick Masson, among others, for guiding me to this insight.

The post Blueprint for a Post-LMS, Part 1 appeared first on e-Literate.

OAM 11GR2PS2 in a day

Frank van Bortel - Wed, 2015-03-04 10:40
Get Access Manager 11gRel2 PS2 installed in a day. Goal is to get OAM installed and configured in a day - with full control; that is, without using the Installation Wizard. Virtual Box: start with Virtual Box. Allow plenty of memory (10GB), and disk (120GB). Attach V33411-01.iso (Oracle Server V6.3) to the CD, and boot. Minimal (not Basic server!) install, configure network with static IP. Frank

Rittman Mead BI Forum 2015 Now Open for Registration!

Rittman Mead Consulting - Wed, 2015-03-04 09:46

I’m very pleased to announce that the Rittman Mead BI Forum 2015, running in Brighton and Atlanta in May 2015, is now open for registration.

Back for its seventh successful year, the Rittman Mead BI Forum once again will be showcasing the best speakers and presentations on topics around Oracle Business Intelligence and data warehousing, with two events running in Brighton, UK and Atlanta, USA in May 2015. The Rittman Mead BI Forum is different to other Oracle tech events in that we keep the numbers attending limited, topics are all at the intermediate-to-expert level, and we concentrate on just one topic – Oracle Business Intelligence Enterprise Edition, and the technologies and products that support it.


As in previous years, the BI Forum will run on two consecutive weeks, starting in Brighton and then moving over to Atlanta for the following week. Here’s the dates and venue locations:

This year our optional one-day masterclass will be delivered by Jordan Meyer, our Head of R&D, and myself and will be on the topic of “Delivering the Oracle Big Data and Information Management Reference Architecture” that we launched last year at our Brighton event. Details of the masterclass, and the speaker and session line up at the two events are on the Rittman Mead BI Forum 2015 homepage.

Each event has its own agenda, but both will focus on the technology and implementation aspects of Oracle BI, DW, Big Data and Analytics. Most of the sessions run for 45 minutes, but on the first day we’ll be holding a debate and on the second we’ll be running a data visualization “bake-off” – details on this, the masterclass and the keynotes and our special guest speakers will be revealed on this blog over the next few weeks – watch this space!

Categories: BI & Warehousing

In Memory XML Performance (XVM)

Marco Gralike - Wed, 2015-03-04 08:04
I wouldn’t believe the statement about bad XMLType performance given in Martin Preiss’ blog post,…

Creating Real-Time Search Dashboards using Apache Solr, Hue, Flume and Cloudera Morphlines

Rittman Mead Consulting - Wed, 2015-03-04 01:19

Late last week Cloudera published a blog post on their developer site on building a real-time log analytics dashboard using Apache Kafka, Cloudera Search and Hue. As I’d recently been playing around with Oracle Big Data Discovery with our website log data as the data source, and as we’ve also been doing the same exercise in our development labs using ElasticSearch and Kibana I thought it’d be interesting to give it a go; partly out of curiosity around how Solr, Kafka and Hue search works and compares to Elasticsearch, but also to try and work out what extra benefit Big Data Discovery gives you above and beyond free and open-source tools.


In the example, Apache web log data is read from the Linux server via a Flume syslog source, then fed into Apache Kafka as the transport mechanism before being loaded into Solr using a data transformation framework called “morphlines”. I’ve been looking at Kafka as an alternative to Flume for ingesting data into a Hadoop system for a while, mainly because of the tireless advocacy of Cloudera’s Gwen Shapira (Oracle ACE, ex-Pythian, now at Cloudera), whom I respect immensely and who has a great background in Oracle database administration as well as Hadoop, and because it potentially offers some useful benefits if used instead of, or more likely alongside, Flume – a publish-subscribe model vs. push, the ability to have multiple consumers as well as publishers, and a more robust transport mechanism that should avoid data loss when an agent node goes down. Kafka is now available as a parcel and service descriptor that you can download and then install within CDH5, and so I set up a separate VM in my Hadoop cluster as a Kafka broker and also installed Solr at the same time.


Working through the example, in the end I went with a slightly different and simplified approach that swapped the syslog Flume source for an Apache Server file tailing source, as our webserver was on a different host to the Flume agent and I’d already set this up for an earlier blog post. I also dropped the Kafka element, as it wasn’t clear to me from the Cloudera article whether it would work in its published form or needed amending to use with Kafka (“To get data from Kafka, parse it with Morphlines, and index it into Solr, you can use an almost identical configuration”), and so I went with an architecture that looked like this:


Compared to Big Data Discovery, this approach has got some drawbacks, but some interesting benefits. From a drawback perspective, Apache Solr (or Cloudera Search as it’s called in CDH5, where Cloudera have integrated Solr with HDFS storage) needs some quite fiddly manual setup that’s definitely an IT task, rather than the point-and-click dataset setup that you get with Big Data Discovery. In terms of benefits though, apart from being free it’s potentially more scalable than Big Data Discovery as BDD has to sample the full Hadoop dataset and fit that sample (typically 1m rows, or 1-5% of the full dataset) into BDD’s Endeca Server-based DGraph engine; Solr, however, indexes the whole Hadoop dataset and can store its indexes and log files within HDFS across the cluster – potentially very interesting if it works.

Back to drawbacks though, the first complication is that Solr’s configuration settings in this Cloudera Search incarnation are stored in Apache Zookeeper, so you first have to download a template copy of the collection files (schema, index etc) from Zookeeper using solrctl, the command-line tool for SolrCloud (Solr running on a distributed cluster, as it is with Cloudera Search):

solrctl --zk bda5node2:2181/solr instancedir --generate $HOME/accessCollection

Then – and this again is a tricky part compared to Big Data Discovery – you have to edit the schema.xml file that Solr uses to determine which fields to index, what their datatypes are and so on. The Cloudera blog post points to a Github repo with the required schema.xml file for Apache Combined Log Format input files, but I found I had to add an extra entry for the “text” field name before Solr would index properly, added at the end of the file excerpt here:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
   <field name="time" type="tdate" indexed="true" stored="true" />
   <field name="record" type="text_general" indexed="true" stored="false" multiValued="true"/>
   <field name="client_ip" type="string" indexed="true" stored="true" />
   <field name="code" type="string" indexed="true" stored="true" />
   <field name="user_agent" type="string" indexed="true" stored="true" />
   <field name="protocol" type="string" indexed="true" stored="true" />   
   <field name="url" type="string" indexed="true" stored="true" />   
   <field name="request" type="string" indexed="true" stored="true" />
   <field name="referer" type="string" indexed="true" stored="true" />
   <field name="bytes" type="string" indexed="true" stored="true" />
   <field name="method" type="string" indexed="true" stored="true" />
   <field name="extension" type="string" indexed="true" stored="true" />   
   <field name="app" type="string" indexed="true" stored="true" />      
   <field name="subapp" type="string" indexed="true" stored="true" />
   <field name="device_family" type="string" indexed="true" stored="true" />
   <field name="user_agent_major" type="string" indexed="true" stored="true" />   
   <field name="user_agent_family" type="string" indexed="true" stored="true" />
   <field name="os_family" type="string" indexed="true" stored="true" />   
   <field name="os_major" type="string" indexed="true" stored="true" />
   <field name="region_code" type="string" indexed="true" stored="true" />
   <field name="country_code" type="string" indexed="true" stored="true" />
   <field name="city" type="string" indexed="true" stored="true" />
   <field name="latitude" type="float" indexed="true" stored="true" />
   <field name="longitude" type="float" indexed="true" stored="true" />
   <field name="country_name" type="string" indexed="true" stored="true" />
   <field name="country_code3" type="string" indexed="true" stored="true" />
   <field name="_version_" type="long" indexed="true" stored="true"/>
   <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
   <dynamicField name="ignored_*" type="ignored"/>
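Since a missing field entry (like the “text” one above) only shows up once Solr refuses to index, a quick check of the schema before uploading it can save a round trip. The snippet below is a minimal sketch in Python, not part of the Cloudera example: the `SCHEMA_FRAGMENT` string is a hypothetical, trimmed-down stand-in for the fields section of schema.xml, and the function names are my own.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the fields section of schema.xml.
SCHEMA_FRAGMENT = """
<fields>
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
  <field name="time" type="tdate" indexed="true" stored="true" />
  <field name="client_ip" type="string" indexed="true" stored="true" />
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <dynamicField name="ignored_*" type="ignored"/>
</fields>
"""


def declared_field_names(schema_xml):
    """Return the set of explicitly declared <field> names (dynamic fields excluded)."""
    root = ET.fromstring(schema_xml)
    return {field.get("name") for field in root.iter("field")}


def missing_fields(schema_xml, required):
    """Return the required field names that the schema does not declare, sorted."""
    return sorted(set(required) - declared_field_names(schema_xml))


if __name__ == "__main__":
    # "text" is absent from the fragment above, so it is reported as missing.
    print(missing_fields(SCHEMA_FRAGMENT, ["id", "time", "text"]))
```

Run against the real file, you would read schema.xml from the generated instance directory instead of using the inline string.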

Then you have to upload the Solr configuration settings to Zookeeper, and then configure Solr to use this particular set of Zookeeper Solr settings (note the “--create” before the accessCollection collection name in the second command; this was missing from the Cloudera steps but is needed for a valid solrctl command):

solrctl --zk bda5node2:2181/solr instancedir --create accessCollection $HOME/accessCollection
solrctl --zk bda5node2:2181/solr collection --create accessCollection -s 1

At this point you should be able to go to the Solr web admin page within the CDH cluster and see the collection (a distributed Solr index) listed with the updated index schema.


Next I configure the Flume source agent on the RM webserver, using this Flume conf file:

## Local installation: /etc/flume1.5.0
## configuration file location:  /etc/flume1.5.0/conf/conf
## bin file location: /etc/flume1.5.0/conf/bin
## START Agent: bin/flume-ng agent -c conf -f conf/flume-src-agent.conf -n source_agent
source_agent.sources = apache_server
source_agent.sources.apache_server.type = exec
source_agent.sources.apache_server.command = tail -f /etc/httpd/logs/access_log
source_agent.sources.apache_server.batchSize = 1
source_agent.sources.apache_server.channels = memoryChannel
source_agent.sources.apache_server.interceptors = itime ihost itype
source_agent.sources.apache_server.interceptors.itime.type = timestamp
source_agent.sources.apache_server.interceptors.ihost.type = host
source_agent.sources.apache_server.interceptors.ihost.useIP = false
source_agent.sources.apache_server.interceptors.ihost.hostHeader = host
source_agent.sources.apache_server.interceptors.itype.type = static
source_agent.sources.apache_server.interceptors.itype.key = log_type
source_agent.sources.apache_server.interceptors.itype.value = apache_access_combined
source_agent.channels = memoryChannel
source_agent.channels.memoryChannel.type = memory
source_agent.channels.memoryChannel.capacity = 100
## Send to Flume Collector on Hadoop Node
source_agent.sinks = avro_sink
source_agent.sinks.avro_sink.type = avro
source_agent.sinks.avro_sink.channel = memoryChannel
source_agent.sinks.avro_sink.hostname =
source_agent.sinks.avro_sink.port = 4545

and then I set up a Flume sink agent as part of the Flume service using Cloudera Manager, initially set as “stopped”.


The Flume configuration file for this sink agent is where the clever stuff happens.

collector.sources = AvroIn
collector.sources.AvroIn.type = avro
collector.sources.AvroIn.bind = bda5node5
collector.sources.AvroIn.port = 4545
collector.sources.AvroIn.channels = mc1 mc2

collector.channels = mc1 mc2
collector.channels.mc1.type = memory
collector.channels.mc1.transactionCapacity = 1000
collector.channels.mc1.capacity = 100000
collector.channels.mc2.type = memory
collector.channels.mc2.capacity = 100000
collector.channels.mc2.transactionCapacity = 1000

collector.sinks = LocalOut MorphlineSolrSink

collector.sinks.LocalOut.type = file_roll
collector.sinks.LocalOut.sink.directory = /tmp/flume/website_logs
collector.sinks.LocalOut.sink.rollInterval = 0
collector.sinks.LocalOut.channel = mc1

collector.sinks.MorphlineSolrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
collector.sinks.MorphlineSolrSink.morphlineFile = /tmp/morphline.conf
collector.sinks.MorphlineSolrSink.channel = mc2
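Most of my typos in these files turned out to be wiring mistakes: a sink with no channel setting, or a source pointing at a channel that was never declared. As a rough sketch (my own helper, not a Flume tool), a few lines of Python can check that every source and sink in an agent definition refers to a declared channel. The `SAMPLE` text is a deliberately incomplete, hypothetical cut of the collector config, and the parser only handles simple `key = value` lines.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def wiring_problems(props, agent):
    """List sources and sinks whose channel settings are absent or undeclared."""
    channels = set(props.get(f"{agent}.channels", "").split())
    problems = []
    for name in props.get(f"{agent}.sources", "").split():
        used = set(props.get(f"{agent}.sources.{name}.channels", "").split())
        if not used or not used <= channels:
            problems.append(f"source {name}")
    for name in props.get(f"{agent}.sinks", "").split():
        if props.get(f"{agent}.sinks.{name}.channel") not in channels:
            problems.append(f"sink {name}")
    return problems


# Hypothetical, deliberately incomplete config: MorphlineSolrSink has no channel.
SAMPLE = """
collector.sources = AvroIn
collector.sources.AvroIn.channels = mc1 mc2
collector.channels = mc1 mc2
collector.sinks = LocalOut MorphlineSolrSink
collector.sinks.LocalOut.channel = mc1
"""

if __name__ == "__main__":
    print(wiring_problems(parse_properties(SAMPLE), "collector"))
```

Note the asymmetry the check relies on: a Flume source can fan out to several channels (`channels`), whereas each sink drains exactly one (`channel`), which is why the collector needs both mc1 and mc2 to feed two sinks from one Avro source.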

The interesting bit here is the MorphlineSolrSink Flume sink. This Flume sink type routes Flume events to a morphline script that in turn copies the log data into the HDFS storage area used by Solr, and passes it to Solr for immediate indexing. Cloudera Morphlines is a command-based lightweight ETL framework designed to transform streaming data from Flume, Spark and other sources and load it into HDFS, HBase or in our case, Solr. Morphlines config files define ETL routines that then call extensible morphlines Kite SDK functions to perform transformations on incoming data streams such as

  • Split webserver request fields into HTTP protocol, method and URL requested
  • In conjunction with the Maxmind GeoIP database, generate the country, city and geocode for a given IP address
  • Convert dates and times in string format to a Solr-format date and timestamp

with the output then being passed to Solr in this instance, along with the UUID and other metadata Solr needs, for loading to the Solr index, or “collection” as it’s termed when it’s running across the cluster (note the full log files aren’t stored by this process into HDFS, just the Solr indexes and transaction logs). The morphlines config file I used is below, based on the one provided in the Github repo accompanying the Cloudera blog post – note though that you need to download and set up the Maxmind GeoIP database file, and install the Python pip utility and a couple of pip packages before this will work:

# Specify server locations in a SOLR_LOCATOR variable;
# used later in variable substitutions
# Change the zkHost to point to your own Zookeeper quorum
SOLR_LOCATOR : {
    # Name of solr collection
    collection : accessCollection
    # ZooKeeper ensemble
    zkHost : "bda5node2:2181/solr"
}

# Specify an array of one or more morphlines, each of which defines an ETL
# transformation chain. A morphline consists of one or more (potentially
# nested) commands. A morphline is a way to consume records (e.g. Flume events,
# HDFS files or blocks), turn them into a stream of records, and pipe the stream
# of records through a set of easily configurable transformations on its way to
# Solr (or a MapReduceIndexerTool RecordWriter that feeds via a Reducer into Solr).
morphlines : [
  {
    # Name used to identify a morphline. E.g. used if there are multiple morphlines in a
    # morphline config file
    id : morphline1
    # Import all morphline commands in these java packages and their subpackages.
    # Other commands that may be present on the classpath are not visible to this morphline.
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      {
        ## Read each web log line and break it up into space-separated columns;
        ## quoted values (request, referer, user agent) are kept as single fields
        readCSV {
          separator : " "
          columns : [client_ip,C1,C2,time,dummy1,request,code,bytes,referer,user_agent,C3]
          ignoreFirstLine : false
          quoteChar : "\""
          commentPrefix : ""
          trim : true
          charset : UTF-8
        }
      }
      {
        split {
          inputField : request
          outputFields : [method, url, protocol]
          separator : " "
          isRegex : false
          #separator : """\s*,\s*"""
          #isRegex : true
          addEmptyStrings : false
          trim : true
        }
      }
      {
        split {
          inputField : url
          outputFields : ["", app, subapp]
          separator : "\/"
          isRegex : false
          #separator : """\s*,\s*"""
          #isRegex : true
          addEmptyStrings : false
          trim : true
        }
      }
      {
        userAgent {
          inputField : user_agent
          outputFields : {
            user_agent_family : "@{ua_family}"
            user_agent_major  : "@{ua_major}"
            device_family     : "@{device_family}"
            os_family         : "@{os_family}"
            os_major          : "@{os_major}"
          }
        }
      }
      {
        #Extract GEO information
        geoIP {
          inputField : client_ip
          database : "/tmp/GeoLite2-City.mmdb"
        }
      }
      {
        # extract parts of the geolocation info from the Jackson JsonNode Java
        # object contained in the _attachment_body field and store the parts in
        # the given record output fields:
        extractJsonPaths {
          flatten : false
          paths : {
            country_code : /country/iso_code
            country_name : /country/names/en
            region_code  : /continent/code
            #"/subdivisions[]/names/en" : "/subdivisions[]/names/en"
            #"/subdivisions[]/iso_code" : "/subdivisions[]/iso_code"
            city : /city/names/en
            #/postal/code : /postal/code
            latitude : /location/latitude
            longitude : /location/longitude
            #/location/latitude_longitude : /location/latitude_longitude
            #/location/longitude_latitude : /location/longitude_latitude
          }
        }
      }
      #{logInfo { format : "BODY : {}", args : ["@{}"] } }
      {
        # add Unique ID, in case our message_id field from above is not present
        generateUUID {
        }
      }
      {
        # convert the timestamp field to "yyyy-MM-dd'T'HH:mm:ss.SSSZ" format
        #  21/Nov/2014:22:08:27
        convertTimestamp {
          field : time
          inputFormats : ["[dd/MMM/yyyy:HH:mm:ss", "EEE, d MMM yyyy HH:mm:ss Z", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'", "yyyy-MM-dd'T'HH:mm:ss", "yyyy-MM-dd"]
          inputTimezone : America/Los_Angeles
          outputFormat : "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
          outputTimezone : UTC
        }
      }
      {
        # Consume the output record of the previous command and pipe another
        # record downstream.
        # This command sanitizes record fields that are unknown to Solr schema.xml
        # by deleting them. Recall that Solr throws an exception on any attempt to
        # load a document that contains a field that isn't specified in schema.xml
        sanitizeUnknownSolrFields {
          # Location from which to fetch Solr schema
          solrLocator : ${SOLR_LOCATOR}
        }
      }
      {
        # load the record into a SolrServer or MapReduce SolrOutputFormat.
        loadSolr {
          solrLocator : ${SOLR_LOCATOR}
        }
      }
    ]
  }
]
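When the morphline wasn’t behaving, I found it helped to reason about what each command should produce for a single log line. The Python below is only a rough approximation of the readCSV, split and convertTimestamp steps, not the Kite SDK implementation: the sample log line is made up, the function name is my own, and (unlike the morphline, which assumes a fixed input timezone) it takes the UTC offset from the log line itself to keep the sketch dependency-free.

```python
import csv
from datetime import datetime, timezone

# Same column names as the readCSV command in the morphline config.
COLUMNS = ["client_ip", "C1", "C2", "time", "dummy1", "request",
           "code", "bytes", "referer", "user_agent", "C3"]


def parse_access_line(line):
    """Approximate the readCSV, split and convertTimestamp steps for one log line."""
    # readCSV: space-separated, with double quotes grouping multi-word values.
    values = next(csv.reader([line], delimiter=" ", quotechar='"'))
    record = dict(zip(COLUMNS, values))  # trailing columns with no value are absent

    # split: request -> method, url, protocol (assumes a well-formed request).
    method, url, protocol = record["request"].split(" ")
    record.update(method=method, url=url, protocol=protocol)

    # split: url on "/"; the empty piece before the leading slash is discarded.
    parts = url.split("/")
    record["app"] = parts[1] if len(parts) > 1 else ""
    record["subapp"] = parts[2] if len(parts) > 2 else ""

    # convertTimestamp: assumption - use the offset recorded in the log line
    # rather than a configured input timezone. Log times have whole seconds,
    # so the millisecond part is always 000.
    stamp = datetime.strptime(record["time"] + " " + record["dummy1"],
                              "[%d/%b/%Y:%H:%M:%S %z]")
    record["time"] = stamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return record


# Made-up example line in Apache Combined Log Format.
SAMPLE_LINE = ('192.0.2.10 - - [21/Nov/2014:22:08:27 -0800] '
               '"GET /blog/2014/11/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"')

if __name__ == "__main__":
    for key, value in parse_access_line(SAMPLE_LINE).items():
        print(f"{key:12} {value}")
```

The opening bracket in the timestamp format is there for the same reason it appears in the morphline’s first inputFormats entry: splitting on spaces leaves the “[” attached to the date and pushes the offset into the next column.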

Then it’s just a case of starting the target sink agent using Cloudera Manager, and the source agent on the RM webserver using the flume-ng command-line utility, and then (hopefully) watching the web activity log entries start to arrive as documents in the Solr index/collection – which, after a bit of fiddling around and correcting typos, they did:


What’s neat here is that instead of having to use either an ETL tool such as ODI to process and parse the log entries (as I did here, in an earlier blog post series on ODI on Hadoop), or use the Hive-to-DGraph data reload feature in BDD, I’ve instead just got a Flume sink running this morphlines process and my data is added in real-time to my Solr index, and as you’ll see in a moment, a Hue Search dashboard.

To get Hue to work with my Solr service and new index, you first have to add the Solr service URL details to the Hue configuration settings using Cloudera Manager, like this:


Then, you can select the index from the list presented by the Search application within Hue, and start creating your data discovery and faceted search dashboard.


with the end result, after a few minutes of setup, looking like this for me:


So how do Solr, Hue, Flume and Morphlines compare to Oracle Big Data Discovery as a potential search-and-discovery solution on Hadoop? What’s impressive is how little work, once I’d figured it out, it took to set this up including the real-time loading and indexing of data for the dashboard. Compared to loading HDFS and Hive using ODI, and manually refreshing the BDD DGraph data store, it’s much more lightweight and pretty elegant. But, it’s clearly an IT / developer solution, and I spent a fair few late nights getting it all to work – getting the Solr schema.xml right was a tricky task, and with the morphlines / Solr ingestion process it was particularly hard to debug and understand why it wasn’t working.

Oracle Big Data Discovery, by contrast, makes the data loading, transformation and enrichment process available to the business or data analyst, and provides much richer tools for cataloging and exploring the full universe of datasets on the Hadoop cluster. Morphlines compares well to the Groovy transformations provided by Big Data Discovery and Solr is extensible to add functionality such as sentiment analysis and text parsing, but again these are IT tasks and not something the average data analyst will want to do.

In summary then – Hue, Solr and the Morphlines transformation framework can be an excellent tool in the hands of IT professionals and can create surprisingly featureful and elegant solutions with just a bit of code and process configuration – but where Big Data Discovery comes into its own is putting significant parts of this capability in the hands of the business and the data analyst, and providing tools for data upload and wrangling, combining that data with other datasets, analyzing that whole dataset (or “data reservoir”) and then collaborating with others around the organization.

Categories: BI & Warehousing

Coding in PL/SQL in C style, UKOUG, OUG Ireland and more

Pete Finnigan - Tue, 2015-03-03 22:20

My favourite language is hard to pinpoint; is it C or is it PL/SQL? My first language was C and I love the elegance and expression of C. Our product PFCLScan has its main functionality written in C. The....[Read More]

Posted by Pete On 23/07/14 At 08:44 PM

Categories: Security Blogs

Integrating PFCLScan and Creating SQL Reports

Pete Finnigan - Tue, 2015-03-03 22:20

We were asked by a customer whether PFCLScan can generate SQL reports instead of the normal HTML, PDF, MS Word reports so that they could potentially scan all of the databases in their estate and then insert either high level....[Read More]

Posted by Pete On 25/06/14 At 09:41 AM

Categories: Security Blogs

Automatically Add License Protection and Obfuscation to PL/SQL

Pete Finnigan - Tue, 2015-03-03 22:20

Yesterday we released the new version 2.0 of our product PFCLObfuscate . This is a tool that allows you to automatically protect the intellectual property in your PL/SQL code (your design secrets) using obfuscation and now in version 2.0 we....[Read More]

Posted by Pete On 17/04/14 At 03:56 PM

Categories: Security Blogs

Twitter Oracle Security Open Chat Thursday 6th March

Pete Finnigan - Tue, 2015-03-03 22:20

I will be co-chairing/hosting a twitter chat on Thursday 6th March at 7pm UK time with Confio. The details are here . The chat is done over twitter so it is a little like the Oracle security round table sessions....[Read More]

Posted by Pete On 05/03/14 At 10:17 AM

Categories: Security Blogs

PFCLScan Reseller Program

Pete Finnigan - Tue, 2015-03-03 22:20

We are going to start a reseller program for PFCLScan and we have started the planning and recruitment process for this program. I have just posted a short blog on the PFCLScan website titled " PFCLScan Reseller Program ". If....[Read More]

Posted by Pete On 29/10/13 At 01:05 PM

Categories: Security Blogs

PFCLScan Version 1.3 Released

Pete Finnigan - Tue, 2015-03-03 22:20

We released version 1.3 of PFCLScan our enterprise database security scanner for Oracle a week ago. I have just posted a blog entry on the PFCLScan product site blog that describes some of the highlights of the over 220 new....[Read More]

Posted by Pete On 18/10/13 At 02:36 PM

Categories: Security Blogs

PFCLScan Updated and Powerful features

Pete Finnigan - Tue, 2015-03-03 22:20

We have just updated PFCLScan, our company's database security scanner for Oracle databases, to version 1.2 and added some new features and some new contents and more. We are working to release another service update also in the next couple....[Read More]

Posted by Pete On 04/09/13 At 02:45 PM

Categories: Security Blogs

Oracle Security Training, 12c, PFCLScan, Magazines, UKOUG, Oracle Security Books and Much More

Pete Finnigan - Tue, 2015-03-03 22:20

It has been a few weeks since my last blog post but don't worry I am still interested to blog about Oracle 12c database security and indeed have nearly 700 pages of notes in MS Word related to 12c security....[Read More]

Posted by Pete On 28/08/13 At 05:04 PM

Categories: Security Blogs