
BI & Warehousing

Data Integration Tips: ODI 12c Repository Query – Find the Mapping Target Table

Rittman Mead Consulting - Mon, 2016-02-15 04:00

Well, it’s been a while since I’ve posted one of our Rittman Mead Data Integration Tips, so I thought a recent challenge might be the next great candidate. I was working through the Kimball ETL Subsystems and the Error Event Schema, Subsystem 5 if you’re familiar with the methodology, and attempting to build the schema from Oracle Data Integrator 12c (ODI 12c) metadata tables. If you caught my presentation “A Walk Through the Kimball ETL Subsystems with Oracle Data Integration” at Oracle OpenWorld 2015, then you’ll know a bit more about where I’m coming from. You can still catch my presentation on this topic at both IOUG Collaborate16 and ODTUG KScope16 (being invited to speak at each of these events is always an honor). Now this Data Integration Tip isn’t solely related to the Kimball ETL Subsystems, though the solution did prove rather useful for my presentation (and upcoming article in the IOUG Select Journal). It’s actually an interesting twist in how Oracle Data Integrator mapping metadata is stored differently between the ODI 11g and ODI 12c repositories.

The challenge is to find the target table, or Datastore, for a given Mapping in ODI 12c, or Interface in ODI 11g. When I say “find”, I mean query the ODI work repository tables and return the target table, or tables in 12c, for a given Mapping. Luckily, I’m equipped with some guidance from our friends at My Oracle Support. If you look at support document Oracle Data Integrator 11g and 12c Repository Description (Doc ID 1903225.1) you’ll find the data dictionaries for both ODI 11g (11.1.1.6.0+) and ODI 12c (12.1.2, 12.1.3, & 12.2.1). These are invaluable resources from Oracle – though they may be works in progress and somewhat incomplete!

Let’s take a look first at ODI 11g and how simple things used to be. Back when Interfaces were the mechanism for extracting, transforming, and loading, we were allowed only a single target Datastore. Ahh, those were the good ol’ days! Digging into the repository we really only had one place to look for the target table – SNP_POP. This table in the ODI 11g repository, SNP_POP (which essentially stands for Synopsis – Populate), contains a column I_TABLE. This identifying column represents the target table for that particular Interface. Here’s a query that ties it all together.

select
    p.pop_name interface_name,
    t.table_name target_table,
    m.cod_mod model_code
from snp_pop p inner join snp_table t on p.i_table = t.i_table
    inner join snp_model m on t.i_mod = m.i_mod;

As you can see, the key to capturing the target table for an Interface is simply in the SNP_POP.I_TABLE column. Because there is only one target, we can easily figure it out.

[Image: query results showing the Interface name, target table, and model code]

Now, ODI 12c is where the real challenge lies. As you may know, with the move from Interfaces in 11g to flow-based Mappings in 12c, we were allowed to do new and exciting things, such as load multiple target tables from a single Mapping. We may also have a case where a Datastore component maps to a Filter component, which then maps to another Datastore component, etc. As you can see in the image below, we can have lots of tables in a single Mapping, any of which may be sources or targets, but we’re only interested in the final target table (or tables, for that matter!).

[Image: an ODI 12c flow-based Mapping with multiple source and target Datastores]

Ok, so let’s dig into the Work Repository now. It seems an ODI Mapping can’t be that much more difficult than an Interface, right? Well…

First, there are quite a few tables related to the Mapping itself.

SNP_MAPPING
SNP_MAP_ATTR
SNP_MAP_ATTR_INFO
SNP_MAP_COMP
SNP_MAP_COMP_TYPE
SNP_MAP_CONN
SNP_MAP_CP
SNP_MAP_CP_ROLE
SNP_MAP_DATA_TYPE
SNP_MAP_EXPR
SNP_MAP_EXPR_REF
SNP_MAP_PROP
SNP_MAP_PROP_DEF
SNP_MAP_REF
SNP_MAP_REF_PP

Whoa…we’ve got some work to do. Lucky for you, I’ve already done the work. Let’s look at how it all fits together. We can start with the Mapping (SNP_MAPPING) table. We also have different components on the Mapping, such as Lookups, etc, so we can join in the table SNP_MAP_COMP as well. Here’s the information we’ll be able to see with that simple join.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping

[Image: query results listing all components in the Mapping]

That’s interesting, we’ve captured all components in the mapping. But there are still quite a lot of non-targets here. Ok, maybe if we add the connection points for each component we can find the input and output for each. Each component has input and output connector points, allowing the output of one component to flow into the input of another.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp

That just added the connection point information for each component and whether it is an INPUT or OUTPUT. Not extremely useful by itself, so let’s dig a bit deeper. How about adding the SNP_MAP_REF table? This table seems to contain a reference to all types of other attributes about the Mapping and its Components. We also need to consider that Datastores, just as any other Component, will have both an input and output. Right now, the dataset shows both the input and output connectors for each Component. Let’s remove the input connection points to limit our result set.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
where cp.direction = 'O' --output connection point only

Joining the reference table has now allowed us to focus on the Datastores in the Mapping and their OUTPUT connection point. What I really want is to see only the Datastores that do not have their OUTPUT connection point connected to an INPUT connector. Therefore, if the OUTPUT connection point isn’t connected to anything downstream, that Datastore must be the target table!

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
where cp.direction = 'O' and --output connection point only
    cp.i_map_cp not in
        (select i_start_map_cp from snp_map_conn) --not a starting connection point

The SNP_MAP_CONN table, which stores the mapping connections, allows me to limit the query to output connection points that are not the start of any connection; in other words, to Datastores whose output doesn’t feed into any other component. The connections table contains all component connections in the ODI 12c mappings. Here’s what we get as a result.

[Image: query results showing the Mapping target tables]

Hey, now we’re onto something here. In fact, this is what I was looking for! Target table(s) in a single Mapping in ODI 12c. Not quite as simple as ODI 11g, but with a bit of SQL and understanding of the repository tables, you can do it. Here’s the final query again, joining in the SNP_TABLE & SNP_MODEL tables to complete the dataset.

select
    m.name mapping_name,
    mr.qualified_name,
    mc.name datastore_alias,
    t.table_name target_table,
    mdl.cod_mod model_code
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
    inner join snp_table t on mr.i_ref_id = t.i_table
    inner join snp_model mdl on t.i_mod = mdl.i_mod
where cp.direction = 'O' and --output connection point
    cp.i_map_cp not in
        (select i_start_map_cp from snp_map_conn) --not a starting connection point
;
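
If you’d rather see one row per Mapping, the same joins can be aggregated so that multiple target tables collapse into a single column. This is just a sketch based on the query above, assuming the work repository sits on an Oracle database recent enough to support LISTAGG (11gR2 or later):

select
    m.name mapping_name,
    listagg(t.table_name, ', ') within group (order by t.table_name) target_tables
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
    inner join snp_table t on mr.i_ref_id = t.i_table
where cp.direction = 'O' and --output connection point
    cp.i_map_cp not in
        (select i_start_map_cp from snp_map_conn) --not a starting connection point
group by m.name
order by m.name;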

Please let me know if you find this Data Integration Tip useful or if you have a better way of accessing the target table in a mapping. One of my Rittman Mead colleagues asked, “why not just use the ODI Java API?”. For accessing the ODI repository, I do prefer using some Groovy script and the API. But in this case, I’m interested in building out a dimensional schema and writing ETL to load the dimensions and facts, which lends itself to SQL rather than Groovy script.

As always, if your team needs help around Oracle Data Integrator, or Oracle Data Integration Solutions in general, drop us a line at info@rittmanmead.com. Or feel free to reach out to me directly via email (michael.rainey@rittmanmead.com) or Twitter (@mRainey). Cheers!


Categories: BI & Warehousing

OBIEE 12c – Repository Password Corruption Issue

Rittman Mead Consulting - Thu, 2016-02-11 10:00

Here at Rittman Mead we’ve been working with OBIEE 12c for some time now, as part of the beta programme and more recently with clients looking to get the most out of an upgrade to OBIEE 12c. We’ve also been hard at work on our brand new OBIEE 12c training course. What we’ve seen in terms of the stability of OBIEE 12c has been pleasantly surprising. Anyone who’s worked with software long enough will be familiar with the reputation that first releases in general have for nasty bugs, and it’s probably fair to say that with the first release of 11g (11.1.1.3) this was proven out. With the first release of OBIEE 12c, however, we’re seeing a stable tool with very few issues so far.

That said…I’m going to demonstrate an issue here that is a bit of a nasty one. It’s nasty because the trigger for it appears innocuous, and what it breaks is one of the really new things in OBIEE 12c—the way in which the RPD is stored on disk and accessed by the BI Server. This makes it a bit of a tough one to get to the bottom of at first, but it’s a good excuse to go digging!

Summary

If you open the RPD in online mode (use File -> Copy As and then use the Save option), the password on the server gets corrupted.

[nQSError: 43113] Message returned from OBIS  
[nQSError: 13042] Repository password is wrong

[Image: repository password error returned by the BI Server]

From this point on you cannot check in any changes, and when you restart the BI Server it will fail to start up.

Details

In OBIEE (12c, and before) it is possible to open the RPD as a straight binary file on disk (“Offline” mode), or by connecting directly to the BI Server and opening the copy that it is currently running (“Online mode”). Offline mode suits larger changes and development, with the RPD then being deployed onto the server once the development work is ready to be tested. Online mode is a good way for making changes on a dedicated dev server, minor changes on a shared server, or indeed just for viewing the RPD that’s currently being run.

Here’s what we’ve seen as the problem:

  1. Open RPD in online mode
  2. File -> Copy As
  3. Enter a password with which to protect the RPD being saved on disk.
  4. Do one of:
    • File -> Close, and then when prompted to save changes click Yes
    • File -> Save
    • Click the Save icon on the toolbar
    • Ctrl-S

[Image: OBIEE 12c repository password error]

What happens now is two-fold:

  1. You cannot check in any changes made online—the check in fails with an error from the Administration Tool:
    [nQSError: 43113] Message returned from OBIS  
    [nQSError: 13042] Repository password is wrong
  2. The BI Server will fail on restart with the same error:
    Opening latest versioned cached RPD for : /app/oracle/biee/bi/bifoundation/server/empty.rpd which is /app/oracle/biee/user_projects/domains/bi/bidata/service_instances/ssi/metadata/datamodel/customizations/liverpd.rpd_5  
    [nQSError: 13042] Repository password is wrong. [[

We saw this on SampleApp v511, as well as on vanilla installations of OBIEE. Versions on both were 12.2.1.0.0.

After we reported this to Oracle, they agreed it was a bug and have logged it as bug number 22682937, with no patch currently (February 10, 2016) available.

Workaround

If you open the RPD online and use File -> Copy As, don’t hit save or check in, even if prompted by the Admin Tool. Close the RPD straightaway.

Often people will use File -> Copy As to take a copy of the current live RPD before doing some changes to it. At Rittman Mead, we’d always recommend using source control such as git to store all code including the RPD, and using this approach you obviate the need to open the RPD online simply to get the latest copy (because the latest copy is in source control).
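
As a rough sketch of that workflow, assuming the RPD is stored alongside the rest of your code in a git repository (paths and names here are purely illustrative):

git pull                                   # grab the latest committed RPD
cp rpd/myproject.rpd ~/work/               # take a working copy and open it offline
# ... make and test your changes offline ...
git add rpd/myproject.rpd
git commit -m "Add margin calculation to logical fact table"
git push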

You can also use the data-model-cmd downloadrpd option to download the actual live RPD—that’s exactly what this option is provided for.

Solution – if BI Server (OBIS) has not yet been restarted

If you’ve hit this bug and are hitting “Repository password is wrong” when you try to checkin, and if the BI Server is still running, then redeploy the RPD using the data-model-cmd uploadrpd tool. By redeploying the RPD the password appears to get sorted out.

If the BI Server is down, then this is not an option because it has to be running in order for data-model-cmd uploadrpd to work.

Solution – if BI Server (OBIS) has been restarted and failed

At this point using data-model-cmd uploadrpd is not possible because OBIS is not running and so the data-model-cmd uploadrpd will fail with the error:

[oracle@demo ~]$ /app/oracle/biee/user_projects/domains/bi/bitools/bin/data-model-cmd.sh uploadrpd -I /home/oracle/rmoff.rpd -W Password01 -U weblogic -P Admin123 -SI ssi  
Service Instance: ssi

Operation failed.  
An exception occurred during execution, please check server logs.

The only option from this point is to use importServiceInstance to reset the service instance, either to an empty one, to SampleAppLite, or to an existing .bar export of your environment. For example:

/app/oracle/biee/oracle_common/common/bin/wlst.sh  
importServiceInstance('/app/oracle/biee/user_projects/domains/bi','ssi','/app/oracle/biee/bi/bifoundation/samples/sampleapplite/SampleAppLite.bar')

This will enable OBIS to start up correctly, from which point the desired RPD can then be re-uploaded if required using data-model-cmd uploadrpd.

Conclusion

The easiest thing is to simply not use File -> Copy As in online mode. Whilst this on its own is fine, the UI means it’s easy to accidentally use the Save option, which then triggers this problem. Instead, use data-model-cmd downloadrpd, and/or use source control so that you can easily identify the latest RPD that you want to develop against.

If you do hit this repository password corruption problem, then keep calm and don’t restart the BI Server—just re-upload the RPD using data-model-cmd uploadrpd. If you have already uploaded the RPD, then you need to use importServiceInstance to restore things to a working state.

As part of the diagnostics that we did to get to the bottom of this issue, we found some interesting things in OBIEE 12c, such as a web service endpoint for RPD upload/download, as well as the detailed workings of the RPD upload process and that infamous liverpd.rpd file. Stay tuned for a blog post on this and more soon! And in the meantime, be sure to get in touch with us to discuss how we can help you with your OBIEE systems, including OBIEE 12c upgrade, and OBIEE 12c training.


Categories: BI & Warehousing

Implementing OBIEE Commentary

Rittman Mead Consulting - Thu, 2016-02-11 04:00
My Application Development Experience with Rittman Mead

Greetings, readers! My name is Nick Padgett, and this is my first blog post with Rittman Mead. I started with Rittman Mead as an intern over a year ago while I was attending Kennesaw State, a local university, but I’ve graduated from both school and the internship. Since then I’ve been working on several interesting applications. Today, I would like to take some time to discuss developing one such application, our OBIEE Commentary tool, as well as my learning experiences working at Rittman Mead.

When I first started with Rittman Mead in October 2014, I was beginning my final year in a Computer Science program. When I joined, I knew very little about Business Intelligence but was eager to learn everything I could. Learning BI while still in school created a powerful synergy between my class studies and the real world. As I learned web development and security in class, I learned about data warehouses and agile development at work. In addition, I had the opportunity to practice my new knowledge and skills on a daily basis.

The most important lesson I learned during my time as an intern was the importance of developing primarily to meet the user requirements. This subject was often ignored in school, except for the obligatory Software Engineering course. But now, as a full-time developer, I was always being reminded of the users we were developing for. All features for all projects were first qualified by asking “Is this what the user wants or needs?” or “Will this add value for the users?”

The first project I was assigned was the Rittman Mead commentary tool. As I assisted in development, the team always looked back to the initial requirements: “Users want commentary in OBIEE.” All development was focused in that direction, and any feature which hindered the goal, despite the technical impressiveness, was removed from the application.

Now, after spending many hours developing the commentary tool, the team is confident it has effectively answered the primary user requirements. We’re all very excited to provide this tool to our customers, and we are confident it will meet all the requirements we spent so long accumulating and developing for.

The Primary Use-case: OBIEE Commentary

In a typical environment, users are required to leave the application to communicate with their peers. This often involves screenshots, long text descriptions of the subject, and a service such as email or Slack. However, forcing users to leave the application in order to do their job is less than optimum. What’s to keep them using the expensive tool you purchased for them if they have to leave it every few minutes?

Many organizations recognize this issue and seek to implement commentary themselves. Frequent approaches typically use either a text object on a dashboard, a write-back enabled input, or even strange iFrame solutions to achieve the desired capabilities. Obviously, these implementations have serious drawbacks, and may not be maintainable or feasible for many organizations.

The commentary tool we have developed uses none of the above approaches, but instead uses a standard web deployment, hosted in WebLogic, to provide similar capabilities. This means no RPD modifications, no requirement for dashboard designers to become involved, and no security threats with iFrames or Cross-Site Scripting (XSS) attacks, while still fulfilling the primary use case of adding commentary to your OBIEE dashboards.

[Image: OBIEE 12c commentary tool]

Our tool allows for users to create several different types of comments, all displayed through standard HTML in a web browser. Comments can be added on a dashboard to provide commentary for the page as a whole, while other comments can be placed on reports, which will be shown anywhere the report is located. You can even submit replies to existing comments, allowing for comment threads over a particular subject. All of these comments are available for use without having to leave the application.

A Second Use-case: Documentation

Some comments are highly structured and curated, containing a wealth of information. Some users like to add comments explaining how a measure is calculated, or a post-mortem for the launch of a new product. These comments are best classified as documentation, as they don’t serve to engage in a conversation, but to present information. In addition, there may be formal documentation that is required to be on a dashboard, and relates directly to a series of reports. The most common way to provide this is to supply links to external applications specifically suited for documentation. Beyond a myriad of accessibility concerns, this approach forces users to leave the application, and some of them may never return.

As an alternative, documentation can be placed directly on a dashboard or report, but this leads to design problems. Having a twenty-page document on a dashboard is hardly an acceptable solution, even though this is the best place to put your documentation. In addition, users will have to write their documentation in HTML, or at least be able to understand it.

[Image: OBIEE 12c documentation tool]

The commentary tool also allows for this use case. Documentation can be created or edited using a WYSIWYG (What You See Is What You Get) editor, similar to any standard document editor, and added to a “Table of Contents” on a dashboard. The Table of Contents lists all current documents available to view and displays them in a dedicated area. This provides access to any pertinent content directly on a dashboard, rather than forcing users to migrate to an external application, which is the entire point for having native commentary in the first place.

A Third Use-case: Integration

A common concern may be the introduction of yet another channel for communication. For example, many organizations use Slack as a communication tool or Atlassian Confluence for documentation. Rittman Mead does not desire to replace these tools or make them obsolete to your organization. We acknowledge the fact you spend money on these applications, and they may be widely adopted within your organization.

[Image: OBIEE commenting tool]

Instead of forcing you to maintain several channels of communication, our commentary tool allows integrations with applications such as JIRA and Confluence. Comments can be submitted as JIRA issues, and documents can be synchronized with Confluence pages. Several other integrations are available by default, while additional implementations are available on a per-customer basis. Now, if users absolutely must leave the application, they can do it on their own terms, with the convenience of a few clicks.

As I have grown during my time here with Rittman Mead, so has our commentary application. While I learned how to cater to user requirements, the tool evolved into an extension of this knowledge and was developed with the intention to meet all customer needs. I am incredibly excited to see this product released soon, and I can’t wait for users to start getting more out of their BI applications.

More information on this tool will be coming very soon, so keep an eye on our blog and website for updates!


Categories: BI & Warehousing

Using Python to ‘Wrangle’ Your Messy Data

Rittman Mead Consulting - Mon, 2016-02-08 00:00
 or How to Organize Your Comic Book Collection Based on Issue Popularity

In addition to being a product manager at Rittman Mead, I consider myself to be a nerd of the highest order. My love of comic books, fantasy, sci-fi and general geekery began long before the word ‘Stark’ had anything to do with Robert Downey Jr or memes about the impending winter. As such, any chance to incorporate my hobbies into my work is a welcomed opportunity. For my next few blog entries, I’ve decided to construct a predictive classification model using comic book sales data, with the eventual goal of accurately predicting whether a comic will rocket off the shelves or be a sales dud. The first blog of the series shows some of the pitfalls that can come up when preparing your data for analysis. Data preparation, or data wrangling as it has come to be known, is an imperfect process that usually takes multiple iterations of transformation, evaluation and refactoring before the data is “clean” enough for analysis.

While the steps involved in data wrangling vary based on the state and availability of the raw data, for this blog I have chosen to focus on the gathering of data from disparate sources, the enrichment of that data by merging its attributes and the restructuring of it to facilitate analysis. Comic book sales data is readily available on the interwebs, however finding that data in a usable format proved to be a more difficult task. In the end, I had to resort to the dreaded process of screen scraping the data from a comic research site. For those of you who are lucky enough to be unfamiliar with it, screen scraping is the process of programmatically downloading HTML data and stripping away that formatting to make it suitable for use. This is generally used as a last resort because web sites are volatile creatures that are prone to change their appearance as often as my teenage kids do preparing to leave the house. However, for the purposes of this series, as my friend Melvin the handyman would often say, “We works with what we gots.”

This leads us to the first issue you may run into while wrangling your data. You have access to lots of data but it’s not pretty. So make it pretty.  Working with raw data is like being a sculptor working with wood. Your job is not to change the core composition of the data to suit your purposes but to chip away at the excess to reveal what was there all along, a beautiful horse… er I mean insight. Sorry, I got lost in my analogy.  Actually to expand on this analogy a bit, the first tool I pulled out of my toolbox for this project was Python, the Leatherman of programming languages.  Python is fast, plays well with other technologies and most importantly in this case, Python is ubiquitous. Used for tasks ranging from process automation and ETL to gaming and academic pursuits, Python is truly a multipurpose tool. As such, if you have a functional need, chances are there is a native module or someone has already written a public library to perform that function.  In my case, I needed some scripts to “scrape” HTML tables containing comic sales data and combine that data with other comic data that I retrieved elsewhere. The “other” data is metadata about each of the issues. Metadata is just data about data. In this case, information about who authored it, how it was sold, when it was published, etc… More on that later.

Luckily for me, the format of the data I was scraping was tabular, so extracting the data and transforming it into Python objects was a relatively simple matter of iterating through the table rows and binding each table column to the designated Python object field. There was still a lot of unnecessary content on the page that needed to be ignored, like the titles and all of the other structural tags, but once I found the specific table holding the data, I was able to isolate it. At that point, I wrote the objects to a CSV file, to make the data easy to transport and to facilitate usability by other languages and/or processes.

The heavy lifting in this process was performed by three different Python modules: urllib2, bs4 and csv. Urllib2, as the name implies, provides functions to open URLs. In this case, I found a site that hosted a page containing the estimated issue sales for every month going back to the early 1990s. To extract each month without manually updating the hardcoded URL over and over, I created a script that accepted MONTH and YEAR as arguments: month_sales_scraper.py
[Image: month_sales_scraper.py code retrieving the monthly sales page]
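
Since the original code screenshot isn’t reproduced here, the sketch below shows roughly what that entry point might look like. The URL pattern is a made-up placeholder rather than the real research site, and it’s Python 2 since urllib2 is being used:

# month_sales_scraper.py -- minimal sketch (Python 2, as urllib2 is used)
import sys
import urllib2

def get_sales_html(month, year):
    # hypothetical URL pattern; the real site's structure will differ
    url = 'http://comic-sales-site.example.com/sales/%s/%s.html' % (year, month)
    response = urllib2.urlopen(url)
    return response.read()  # the full HTML of the page, ready for parsing

if __name__ == '__main__':
    month, year = sys.argv[1], sys.argv[2]
    html = get_sales_html(month, year)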

The response from the urlopen(url) function call was the full HTML code that is typically rendered by a web browser. In that format, it does me very little good, so I needed to employ a parser to extract the data from the HTML. In this context, a parser is a program that is used to read in a specific document format, break it down into its constituent parts while preserving the established relationships between those parts, and then finally provide a means to selectively access said parts. So an HTML parser would allow me to easily access all the <TD> column tags for a specific table within an HTML document. For my purposes, I chose BeautifulSoup, or bs4.

BeautifulSoup provided search functions that I used to find the specific HTML table containing the sales data and loop through each row, while using the column values to populate a Python object.

[Image: BeautifulSoup code looping through the sales table rows]
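
Again, the screenshot is missing, so here’s a rough sketch of how that parsing loop might be wired up. The table id, column positions and field names are assumptions for illustration, not the site’s actual markup:

from bs4 import BeautifulSoup

def parse_sales_table(html, month, year):
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', {'id': 'salestable'})   # hypothetical table id
    rows = []
    for tr in table.find_all('tr')[1:]:                # skip the header row
        cols = [td.get_text().strip() for td in tr.find_all('td')]
        rows.append({
            'year': year,
            'month': month,
            'rank': cols[0],            # column positions are assumptions
            'title': cols[1],
            'price': cols[2],
            'estimated_sales': cols[3]
        })
    return rows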

This Python object, named data, contains fields populated with data from different sources. The year and month are populated using the arguments passed to the module. The format field is dynamically set based on logic related to the rankings and the remaining fields are set based on their source’s position in the HTML table. As you can see, there is a lot of hard coded logic that would need to be updated, should the scraped site change their format. However, for now this logic gets it done.

The final step of this task was to write those Python objects to a CSV file. The Python csv module provides a writer object whose writerow() function accepts an array as a parameter and writes each of the array elements as columns in the CSV.
[Image: writing the Python objects out with csv.writer]

My first pass raised an exception because the title field contained unicode characters that the CSV writer could not handle.
[Image: unicode error raised by the CSV writer]

To rectify this, I had to add a check for unicode and encode the content as UTF-8. Unicode and UTF-8 are character encodings, meaning they provide a map computers use to identify characters. This includes alphabets and logographic symbols from different languages as well as common symbols like ®.

[Image: checking for unicode and encoding the content as UTF-8]
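
Piecing together the CSV writing and the UTF-8 fix described above, the logic might look something like this (the field order and file naming convention are assumptions):

import csv

def encode_if_unicode(value):
    # the csv writer chokes on unicode values, so encode them as UTF-8 first
    if isinstance(value, unicode):     # Python 2 type check
        return value.encode('utf-8')
    return value

def write_sales_csv(rows, month, year):
    filename = '%s_%s.csv' % (month, year)   # e.g. 11_2015.csv
    with open(filename, 'wb') as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow([encode_if_unicode(row[k]) for k in
                             ('year', 'month', 'rank', 'title', 'price', 'estimated_sales')])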

Additionally, there was the matter of reformatting the values of some of the numeric fields to allow math to be performed on them later (i.e. stripping ‘$’ and commas). Other than that, the data load went pretty smoothly. A file named (MONTH)_(YEAR).CSV was generated for each month. Each file turned out like so:

[Image: sample rows from a generated monthly sales CSV]

While this generated tens of thousands of rows of comic sales data, it was not enough. Rather, it had the volume but not the breadth of information I needed. In order to make an accurate prediction, I needed to feed more variables to the model than just the comic’s title, issue number, and price. The publisher was not relevant as I decided to limit this exercise to only Marvel comics, and passing in the estimated sales would be cheating, as rank is based on sales. So to enhance my dataset, I pulled metadata about each of the issues down from “the Cloud” using Marvel’s Developer API. Thankfully, since the API is a web service, there was no need for screen scraping.

Retrieving and joining this data was not as easy as one might think. My biggest problem was that the issue titles from the scraped source were not an exact match to the titles stored in the Marvel database. For example, the scraped dataset lists one title as “All New All Different Avengers”. Using their API to search the Marvel database with that title retrieved no results. Eventually, I was able to manually find it in their database listed as “All-New All-Different Avengers”. In other cases, there were extra words like “The Superior Foes of Spider-Man” vs “Superior Foes of Spider-Man”. So in order to perform a lookup by name, I needed to know the titles as they expected them. To do this I decided to pull a list of all the series titles whose metadata was modified during the timeframes for which I had sales data. Again, I ran into a roadblock. The Marvel API only allows you to retrieve up to 100 results per request and Marvel has published thousands of titles. To get around this I had to perform incremental pulls, segmented alphabetically.

[Image: incremental, alphabetically segmented pulls of series titles from the Marvel API]

Even then there were a few minor issues, as some letters like ‘S’ had more than 100 titles. To get around that I had to pull the list of ‘S’ titles sorted ascending and descending, then combine the results, making sure to remove duplicates. So my advice on this one is to be sure to read up on the limitations of any API you are using. It may enforce limits but you may be able to work around the limits using creative querying.
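
As a sketch of that workaround, something along the following lines would page through the series list one letter at a time, pulling each letter in both sort orders and de-duplicating the results. The endpoint and parameter names are my best recollection of Marvel’s public API and should be treated as assumptions, and marvel_get is a hypothetical helper that handles the API authentication and returns the decoded JSON response:

import string

def fetch_series_titles(marvel_get):
    titles = {}
    for letter in string.ascii_uppercase:
        # the API caps results at 100 per request, so pull each letter separately,
        # once ascending and once descending, to catch letters with more than 100 titles
        for order in ('title', '-title'):
            results = marvel_get('/v1/public/series',
                                 {'titleStartsWith': letter,
                                  'orderBy': order,
                                  'limit': 100})
            for series in results['data']['results']:
                titles[series['id']] = series['title']   # keyed by id, so duplicates drop out
    return titles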


At this point I have my list of Marvel series titles, stored in some CSV files that I eventually combined into a single file, MarvelSeriesList.csv, for ease of use. Actually, I have more than that. While retrieving the series titles, I also pulled down the ID for the series and an appropriateness rating. Searching the API by ID will be much more accurate than searching by name, and the appropriateness rating may be useful when building out the prediction model. The next step was to iterate through each row of the CSVs we created from the sales data, find the matching ID from MarvelSeriesList.csv and use that ID to retrieve its metadata using the API.

If you remember, the point of doing that last step was that the titles stored in the sales data files don’t match the titles in the API, so I needed to find a way to join the two sources. Rather than writing cases to handle each of the scenarios (e.g. mismatched punctuation, extra filler words), I looked for a Python library to perform some fuzzy matching. What I found was an extremely useful library called FuzzyWuzzy. FuzzyWuzzy provides a function called extractOne() that allows you to pass in a term and compare it with an array of values. The extractOne() function will then return the term in the array that has the highest match percentage. Additionally, you can specify a lower bound for acceptable matches (i.e. only return a result where the match is >= 90%).

Again, it took a few passes to get the configuration to work effectively. The first time through, only about 65% of the titles in the sales file found a match. That was throwing away too much data for my liking so I had to look at the exceptions and figure out why the matches were falling through. One issue that I found was titles that tacked on the publication years in the Marvel database, like “All-New X-Men (2012)”, had a match score in the 80’s when matched against a sales title like “All New X-Men”. This was a pretty consistent issue, so rather than lowering the match percentage, which could introduce some legitimate mismatches, I decided to strip the year, if present, on mismatches and run it through that matching process again. This got me almost there. The only other issue I ran into was that FuzzyWuzzy had trouble matching acronyms/acrostics. So ‘S.H.I.E.L.D.’ had a match score in the 50’s when matching ‘SHIELD’. That’s because half the characters (periods) were missing. Since there were only two titles affected, I built a lookup dictionary of special cases that needed to be translated. For the purposes of this exercise, I would still have had enough matches to skip that step, but doing it brought us up to 100% matching between the two sources. Once the matching function was working, I pulled out urllib2 and retrieved all the metadata I could for each of the issues.
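
Pulling those pieces together, a simplified version of the matching routine might look like the sketch below. The 90% cutoff, the year-stripping regex and the special-case dictionary entry are illustrative assumptions:

import re
from fuzzywuzzy import process

SPECIAL_CASES = {'S.H.I.E.L.D.': 'SHIELD'}   # hypothetical acronym translations

def match_title(sales_title, marvel_titles):
    # translate known special cases, then try a straight fuzzy match
    query = SPECIAL_CASES.get(sales_title, sales_title)
    match = process.extractOne(query, marvel_titles, score_cutoff=90)
    if match:
        return match[0]
    # fall back to matching with any trailing publication year stripped from the
    # Marvel titles, e.g. "All-New X-Men (2012)" -> "All-New X-Men"
    stripped = {re.sub(r'\s*\(\d{4}\)$', '', t): t for t in marvel_titles}
    match = process.extractOne(query, list(stripped.keys()), score_cutoff=90)
    return stripped[match[0]] if match else None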

The resulting files contained not only sales data (title, issue number, month rank, estimated sales), but information about the creative team, issue descriptions, characters, release dates and  associated story arcs. This would be enough to get us started with building our predictive classification model.
[Image: the combined CSV containing sales data and Marvel metadata]

That being said, there was still a lot of structural rearranging required to make it ready for the type of analysis I wanted to do, but we will deal with that in the next blog. Hopefully, you picked up some useful tips on how to combine data from different sources or at the very least found solace in knowing that while you may not be the coolest person in the world, somewhere out there is a grown man who still likes to read comic books enough to write a blog about it. Be sure to tune in next week, True Believers, as we delve into The Mysteries of R!


Categories: BI & Warehousing

Becky’s BI Apps Corner: OBIA install Perl Script Patching and troubleshooting when they fail

Rittman Mead Consulting - Thu, 2016-02-04 05:00

During a recent project installing Oracle BI Applications, I became much better acquainted with OPatch, Oracle’s standard tool for managing application patches. By acquainted, I mean how to troubleshoot when OPatch patching fails. Since, at last count, there are around 50 patches total for Oracle BI Applications 11.1.1.9.2, the first patching attempt may not apply all patches successfully. There are any number of reasons for a failure, like an extra slash at the end of a path, a misspelled word, Weblogic or NodeManager still running, or some other reason. We will take a look at the logs for each section, learn where additional logs can be found, and learn how to turn on OPatch debugging to better understand the issue. Then, following the ideas from a previous OPatch post by Robin, I’ll describe how to manually apply the patches with OPatch at the command line for any patches that weren’t applied successfully using the provided perl script.

*Disclaimers – Please read the readme files for patches and follow all Oracle recommendations. Patch numbers are subject to change depending on OS and OBIA versions. Commands and paths here are of the linux/unix variety, but there are similar commands available for Windows OS.

Perl Script patching

Unzip the patch files to a patch folder. I have included the OBIEE patch as well.

unzip pb4biapps_11.1.1.9.2_.zip -d patches/
unzip pb4biapps_11.1.1.9.2_generic_1of2.zip -d patches/
unzip pb4biapps_11.1.1.9.2_generic_2of2.zip -d patches/
unzip p20124371_111170_.zip -d patches/

While installing the Oracle BI Applications versions 11.1.1.7 and up, patches get applied with a perl script called APPLY_PATCHES.pl. Following Oracle’s install documentation for the 11.1.1.9 version of Oracle BI Applications here, there is a text file to modify and pass to the perl script. Both the perl script and the text file reside in the following directory: $ORACLE_HOME/biapps/tools/bin. In the text file, called apply_patches_import.txt, parameters are set with the path to the following directories:

JAVA_HOME
INVENTORY_LOC
ORACLE_HOME
MW_HOME
COMMON_ORACLE_HOME
WL_HOME
ODI_HOME
WORKDIR
PATCH_ROOT_DIR
WINDOWS_UNZIP_TOOL_EXE (only needed if running on Windows platforms)

Some pro tips to modifying this text file:
1. Oracle recommends you use the JDK in the ORACLE_BI1 directory.
2. Use ORACLE_BI1 as the ORACLE_HOME.
3. Ensure WORKDIR and PATCH_ROOT_DIR are writeable directories.
4. Don’t add a path separator at the end of the path.
5. Commented lines are safe to remove.
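
Putting the parameters and tips together, a filled-in apply_patches_import.txt might look something like the example below, assuming the file takes simple NAME=value entries. The paths are placeholders for a typical Linux environment and will differ on your system:

JAVA_HOME=/u01/app/oracle/biapps/fmw/Oracle_BI1/jdk
INVENTORY_LOC=/u01/app/oraInventory
ORACLE_HOME=/u01/app/oracle/biapps/fmw/Oracle_BI1
MW_HOME=/u01/app/oracle/biapps/fmw
COMMON_ORACLE_HOME=/u01/app/oracle/biapps/fmw/oracle_common
WL_HOME=/u01/app/oracle/biapps/fmw/wlserver_10.3
ODI_HOME=/u01/app/oracle/biapps/fmw/Oracle_ODI1
WORKDIR=/u01/app/oracle/biapps_patch_work
PATCH_ROOT_DIR=/u01/app/oracle/biapps_patches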

Then you run APPLY_PATCHES.pl, passing in apply_patches_import.txt. If everything goes well, the perl script finishes with a patching summary report showing that all patches were applied successfully.

If this is the case, CONGRATULATIONS!! You can move on to the next step in the install documentation. Thanks for stopping by and come back soon! However, if any patch or group of patches failed, the rest of this post is for you.

Log file location

First, the above patching report does not tell you where to find the logs, regardless of success or failure. If you remember though, you set a WORKDIR path in the text file earlier. In that directory is where you will find the following log files:

  1. final_patching_report.log
  2. biappshiphome_generic_patches.log
  3. odi_generic_patches.log
  4. oracle_common_generic_patches.log
  5. weblogic_patching.log

Open the final_patching_report.log to determine first if all patches were applied and identify ones that were not successful. For example, looking at this log may show that the Oracle Common patches failed.

cd $WORKDIR
vi final_patching_report.log

However, this doesn’t tell you what caused the failure. Next we will want to look into the oracle_common_generic_patches.log to gather more information.

From the $WORKDIR:

vi oracle_common_generic_patches.log

Here you will see the error: a component is missing. “Patch ######## requires component(s) that are not installed in OracleHome. These not-installed components are oracle.jrf.thirdparty.jee:11.1.1.7.0.” Notice also that in this log there is a path to another log file location. The path is in the $COMMON_ORACLE_HOME/cfgtoollogs/opatch/ directory. This directory has more detailed logs specific to patches applied to oracle_common. Additionally, there are logs under $ORACLE_HOME/cfgtoollogs/opatch/, $WL_HOME/cfgtoollogs/opatch/, and $ODI_HOME/cfgtoollogs/opatch/. These locations are very helpful to know, so you can find the logs for each group of patches in the same relative path.

Going back to the above error, we are going to open the most recent log file listed in the $COMMON_ORACLE_HOME/cfgtoollogs/opatch/ directory.

cd $COMMON_ORACLE_HOME/cfgtoollogs/opatch/
vi opatch2015-08-08_09-20-58AM_1.log

The beginning of this log file has two very interesting pieces of information to take note of for use later. It has the actual OPatch command used, and it has a path to a Patch History file. Looks like we will have to page down in the file to find the error message.

Now we see our missing component error. Once the error occurs, the java program starts backing out and then starts cleanup by deleting the files extracted earlier in the process. This log does have more detail, but still doesn’t say much about the missing component. After some digging around on the internet, I found a way to get more detailed information out of OPatch by setting export OPATCH_DEBUG=TRUE. After turning OPatch debugging on, run the OPatch command we found earlier that was at the top of the log. A new log file will be generated and we want to open this most recent log file.

Finally, the debug output gives me detailed information about the component and the failure.

Side Note: If you are getting this specific error, I’ll refer you back to a previous blog post that happened to mention making sure to grab the correct version of OBIEE and ODI. If you have a wrong version of OBIEE or ODI for the Oracle BI Apps version you are installing, unfortunately you won’t start seeing errors until you get to this point.

Manually running Oracle BI Application patches

Normally, the error or reason behind a patch or group of patches failing doesn’t take that level of investigation, and the issue will be identified in the first one or two logs. Once the issue is corrected, there are a couple of options available. Rerunning the perl script is one option, but it will cycle through all of the patches again, even the ones already applied. There is no harm in this, but it does take longer than running the individual patches. The other option is to run the OPatch command at the command line. To do that, first I would recommend setting the variables from the text file. I also added the Oracle_BI1/OPatch directory to the PATH variable.

export JAVA_HOME=$ORACLE_HOME/jdk
export INVENTORY_LOC=
export COMMON_ORACLE_HOME=$MW_HOME/oracle_common
export WL_HOME=$MW_HOME/wlserver_10.3
export SOA_HOME=$MW_HOME/Oracle_SOA1
export ODI_HOME=$MW_HOME/Oracle_ODI1
export WORKDIR=
export PATCH_FOLDER=/patches
export PATH=$ORACLE_HOME/OPatch:$JAVA_HOME/bin:$PATH

Next, unzip the patches in the required directory. For example, the $PATCH_FOLDER/oracle_common/generic might look like this after unzipping files:

Below are the commands for each group of patches:

Oracle Common Patches:

cd $PATCH_FOLDER/oracle_common/generic
unzip "*.zip"

$COMMON_ORACLE_HOME/OPatch/opatch napply $PATCH_FOLDER/oracle_common/generic -silent -oh $COMMON_ORACLE_HOME -id 16080773,16830801,17353546,17440204,18202495,18247368,18352699,18601422,18753914,18818086,18847054,18848533,18877308,18914089,19915810

BIApps Patches:

cd $PATCH_FOLDER/biappsshiphome/generic
unzip "*.zip"

opatch napply $PATCH_FOLDER/biappsshiphome/generic -silent -id 16913445,16997936,19452953,19526754,19526760,19822893,19823874,20022695,20257578

ODI Patches:

cd $PATCH_FOLDER/odi/generic
unzip "*.zip"

/$ODI_HOME/OPatch/opatch napply $PATCH_FOLDER/odi/generic -silent -oh $ODI_HOME -id 18091795,18204886

Operating Specific Patches:

cd $PATCH_FOLDER/
unzip "*.zip"

opatch napply $PATCH_FOLDER/ -silent -id ,,

Weblogic Patches:

cd $PATCH_FOLDER/suwrapper/generic
unzip "*.zip"

cd $PATCH_FOLDER/weblogic/generic

$JAVA_HOME/bin/java -jar $PATCH_FOLDER/suwrapper/generic/bsu-wrapper.jar -prod_dir=$WL_HOME -install -patchlist=JEJW,LJVB,EAS7,TN4A,KPFJ,RJNF,2GH7,W3Q6,FKGW,6AEJ,IHFB -bsu_home=$MW_HOME/utils/bsu -meta=$PATCH_FOLDER/suwrapper/generic/suw_metadata.txt -verbose > $PATCH_FOLDER/weblogic_patching.log

Even though this example covered a very specific error, understanding the logs and having the breakdown of all of the patches will help with any number of patch errors at this step of the Oracle BI Applications installation. I would love to hear your thoughts if you found this helpful or if any part was confusing. Keep an eye out for the next Becky’s BI Apps Corner where I move on from installs and start digging into incremental logic and Knowledge Modules.


Categories: BI & Warehousing

Corporate Social Responsibility (Where Can We Serve?)

Rittman Mead Consulting - Mon, 2016-02-01 03:00

At Rittman Mead, we believe that people are more important than profit.
This manifests itself in two ways. First, we want to impact the world beyond data and analytics, and secondly, we want our employees to be able to contribute to organizations they believe are doing impactful work.

This year, we’ve put a Community Service requirement in place for all of our full-time employees.

We’ll each spend 40 hours this year serving with various nonprofits. Most of our team are already involved with some amazing organizations, and this “requirement” allows us to not only be involved after hours and on the weekends, but even during normal business hours.

We want to highlight a few team members and show how they’ve been using their Community Service hours for good.

Beth deSousa
Beth is our Finance Manager and she has been serving with Sawnee Women’s Club. Most of her work has been around getting sponsorship and donations for their annual silent auction. She’s also helped with upgrading a garden at the local high school, collecting toys and gift wrap for their Holiday House, and collecting prom dresses and accessories for girls in need.

Charles Elliott
Charles is the Managing Director of North America. He recently ran in the Dopey Challenge down at Disney World which means he ran a 5k, 10k, half marathon, and full marathon in 4 days. He did the run to raise funds for Autism Speaks. Charles was recognized as the third largest fundraiser for Autism Speaks at the Dopey Challenge!

David Huey
David is our U.S. Business Development rep. He recently served with the nonprofit Hungry For A Day for their Thanksgiving Outreach. He flew up to Detroit the week of Thanksgiving and helped serve over 8,000 Thanksgiving dinners to the homeless and needy in inner city Detroit.

Andy Rocha

Andy is our Consulting Manager. Andy is a regular volunteer and instructor with Vine City Code Crew. VC3 works with inner city youth in Atlanta to teach them about electronics and coding.

Pete Tamisin

Pete is a Principal Consultant. He is also involved as a volunteer and instructor with the aforementioned Code Crew. Pete has taught a course using Makey Makey electronic kits for VC3.

This is just a sample of what our team has done, but engaging in our local communities is something that Rittman Mead is striving to make an integral piece of our corporate DNA.
We can’t wait to show you how we’ve left our communities better in 2016!


Categories: BI & Warehousing

The best OBIEE 12c feature that you’re probably not using

Rittman Mead Consulting - Thu, 2016-01-28 03:00

With the release of OBIEE 12c we got a number of interesting new features on the front-end.  We’re all talking about the cleaner look-and-feel, Visual Analyzer, and the ability to create data-mashups, etc.

While all this is incredibly useful, it’s one of the small changes you don’t hear about that’s truly got me excited.  I can’t tell you how thrilled I am that we can finally save a column back to the web catalog as an independent object (to be absolutely fair, this actually first shipped with 11.1.1.9).

For the most part, calculations should be pushed back to the RPD.  This reduces the complexity of the reports on the front-end, simplifies maintenance of these calculations, and ultimately assures that the same logic is used across the board in all dashboards and reports… all the logic should be in the RPD.  I agree with that 110%… at least in theory.  In reality, this isn’t always practical.  When it comes down to it, there’s always some insane deadline or there’s that pushy team (ahem, accounting…) riding you to get their dashboard updated and migrated in time for year end, or whatever.  It’s quite simply just easier sometimes to just code the calculation in the analysis.  So, rather than take the time to modify the RPD, you fat finger the calculation in the column formula.  We’ve all done it.  But, if you spend enough time developing OBIEE reports and dashboards, sooner or later you’ll find that this is gonna come back to bite you.

Six months, a year from now, you’ll have completely forgotten about that calculation.  But there will be an org change, or a hierarchy update… something to change the logic of that calculation, and you’ll need to make a change.  Only now, you no longer remember the specifics of the logic you coded, and even worse you don’t remember if you included that same calculation in any of the other analyses you were working on at the time.  Sound familiar?  Now, a change that should have been rather straightforward and could have been completed in an afternoon takes two to three times longer as you dig through all your old reports trying to make sense of things.  (If only you’d documented your development notes somewhere…)

Saving columns to the web catalog is that middle ground that gives us the best of both worlds… the convenience of quickly coding the logic on the front-end but the piece of mind knowing that the logic is all in one place to ensure consistency and ease maintenance.

After you update your column formula, click OK.

From the column dropdown, select the Save Column As option.

Save the column to the web catalog.  Also, be sure to use the description field.  The description is a super convenient place to store a few lines of text that your future self or others can use to understand the purpose of this column.

As an added bonus, this description field is also used when searching the web catalog.  So, if you don’t happen to remember what name you gave a column but included a little blurb about the calculation, all is not lost.

Saved columns can be added from the web catalog.

Add will continue to reference the original saved column, so that changes made to the saved column will be reflected in your report. Add Copy will add the column to your report, but future changes to the saved column will not be reflected.

One thing to note, when you add a saved column to a report it can no longer be edited from within the report.  When you click on Edit Formula you will still be able to see the logic, but you will need to open and edit that saved column directly to make any changes to the formula.

Try out the saved columns, you’ll find it’s invaluable and will greatly reduce the time it takes to update reports.  And with all that free time, maybe you’ll finally get to play around with the Visual Analyzer!


Categories: BI & Warehousing

Up in the JCS Clouds !!

Tim Dexter - Wed, 2016-01-27 04:05
Hello Friends,

Oracle BI Publisher has been in the cloud for quite some time… as a part of Fusion Applications or a few other Oracle product offerings. We now announce certification of BI Publisher on Java Cloud Service!!

BI Publisher on JCS

Oracle Java Cloud Service (JCS) is a part of the platform service offerings in Oracle Cloud. Powered by Oracle WebLogic Server, it provides a platform on top of Oracle's enterprise-grade cloud infrastructure for developing and deploying new or existing Java EE applications. Check for more details on JCS here. In this page, under "Perform Advanced Tasks" you can find a link to "Leverage your on-premise licenses". This page cites all the products certified for Java Cloud Services and now we can see BI Publisher 11.1.1.9 listed as one of the certified products using Fusion Middleware 11.1.1.7.


How to Install BI Publisher on JCS?

Here are the steps to install BI Publisher on JCS. The certification supports the Virtual Image option only.

Step 1: Create DBaaS Instance


Step 2: Create JCS Instance

To create an Oracle Java Cloud Service instance, use the REST API for Oracle Java Cloud Service. Do not use the Wizard in the GUI. The Wizard does not allow an option to specify the MWHOME partition size, whereas REST API allows us to specify this. The default size created by the Wizard is generally insufficient for BI Publisher deployments.

The detailed instructions to install JCS instance are available in the Oracle By Example Tutorial under "Setting up your environment", "Creating an Oracle Java Cloud Service instance".


Step 3:  Install and Configure BI Publisher

  1. Set up RCU on DBaaS
    • Copy RCU
    • Run RCU
  2. Install BI Publisher in JCS instance
    • Copy BI Installer in JCS instance
    • Run Installer
    • Use Software Only Install
  3. Configure BI Publisher
    • Extend Weblogic Domain
    • Configure Policy Store
    • Configure JMS
    • Configure Security

You can follow the detailed installation instructions as documented in the “Oracle By Example” tutorial.


Minimum Cloud Compute and Storage Requirements:

  1. Oracle Java Cloud Service: 1 OCPU, 7.5 GB Memory, 62 GB Storage
    • To install Weblogic instance
    • To Install BI Publisher
    • To set Temp File Directory in BI Publisher
  2. Oracle Database Cloud Service: 1 OCPU, 7.5 GB Memory, 90 GB Storage
    • To install RCU
    • To use DBaaS as a data source
  3. Oracle IaaS (Compute & Storage): (Optional - Depends on sizing requirements)
    • To Enable Local & Cloud Storage option in DBaaS (Used with Full Tooling option)

So now you can use your on-premise license to host BI Publisher as a standalone application on Java Cloud Service for all your highly formatted, pixel-perfect enterprise reports for your cloud-based applications. Have a great day!!

Categories: BI & Warehousing

OBIEE 12c – Three Months In, What’s the Verdict?

Rittman Mead Consulting - Sun, 2016-01-24 13:22

I’m over in Redwood Shores, California this week for the BIWA Summit 2016 conference where I’ll be speaking about BI development on Oracle’s database and big data platforms. As it’s around three months since OBIEE12c came out and we were here for Openworld, I thought it’d be a good opportunity to reflect on how OBIEE12c has been received by ourselves, the partner community and of course by customers. Given OBIEE11g was with us for around five years it’s clearly still early days in the overall lifecycle of this release, but it’s also interesting to compare back to where we were around the same time with 11g and see if we can spot any similarities and differences to take-up then.

Starting with the positives; Visual Analyzer (note – not the Data Visualization option, I’ll get to that later) has gone down well with customers, at least over in the UK. The major attraction for customers seems to be “Tableau with a shared governed data model and integrated with the rest of our BI platform” (see Oracle’s slide from Oracle Openworld 2015 below), and seeing as the Data Visualization Option is around the same price as Tableau Server the synergies around not having to deploy and support a new platform seem to outweigh the best-of-breed query functionality of Tableau and Qlikview.

[Image: Oracle Openworld 2015 slide positioning Visual Analyzer]

Given that VA is an extra-cost option what I’m seeing is customers planning to upgrade their base OBIEE platform from 11g to 12c as part of their regular platform refresh, and considering the VA part after the main upgrade as part of a separate cost/benefit exercise. VA however seems to be the trigger for customer interest in an upgrade, with the business typically now holding the budget for BI rather than IT and Visual Analyzer (like Mobile with 11g) being the new capability that unlocks the upgrade spend.

On the negative side, Oracle charging for VA hasn’t gone down well, either from the customer side who ask what it is they actually get for their 22% upgrade and maintenance fee if they have to pay for anything new that comes with the upgrade; or from partners who now see little in the 12c release to incentivise customers to upgrade that’s not an additional cost option. My response is usually to point to previous releases – 11g with Scorecards and Mobile, the database with In-Memory, RAC and so on – and say that it’s always the case that anything net-new comes at extra cost, whereas the upgrade should be something you do anyway to keep the platform up-to-date and be prepared to uptake new features. My observation over the past month or so is that this objection seems to be going away as people get used to the fact that VA costs extra; the other push-back I get a lot is from IT who don’t want to introduce data mashups into their environment, partly I guess out of fear of the unknown but also partly because of concerns around governance, how well it’ll work in the real world, so on and so on. I’d say overall VA has gone down well at least once we got past the “shock” of it costing extra, and I’d expect there’ll be some bundle along the lines of BI Foundation Suite (BI Foundation Suite Plus?) in the next year or so that’ll bundle BIF with the DV option, maybe include some upcoming features in 12c that aren’t exposed yet but might round out the release. We’ll see.

The other area where OBIEE 12c has gone down well, surprisingly well, is with the IT department, thanks to the new back-end features. I’ve been telling people that whilst everyone thinks 12c is about the new front-end features (VA, the new look-and-feel etc.), it’s actually the back-end that has the most changes and will deliver the greatest financial and cost-saving benefits to customers – again, note the slide below from last year’s Openworld summarising these improvements.

NewImage

Simplifying install, cloning, dev-to-test and so on will make BI provisioning considerably faster and cheaper to do, whilst the introduction of new concepts such as BI Modules, Service Instances, layered RPD customizations and BAR files paves the way for private cloud-style hosting of multiple BI applications on a single OBIEE12c domain, hybrid cloud deployments and mass customisation of hosted BI environments similar to what we’ve seen with Oracle Database over the past few years.

What’s interesting with 12c at this point, though, is that these back-end features are only half-deployed within the platform; the lack of a proper RPD upload tool, and BI Modules and Service Instances only existing in the singular, point to a future release where all this functionality gets rounded off and fully realised. So where we are now is that 12c seems oddly half-finished and over-complicated for what it is, but it’s what’s coming over the rest of the lifecycle that will make this part of the product most interesting – see the slide below from Openworld 2014 where this full vision was set out, a slide presumably left out of this year’s Openworld launch deck because the initial release only includes the foundation and not the full capability.

NewImage

Compared to where we were with OBIEE 11g at the start of its product cycle (11.1.1.3), which was largely feature-complete but had significant product quality issues, with 12c we’ve got less of the platform built out but (with a couple of notable exceptions) generally good product quality. This half-completed nature of the back-end must, however, confuse some customers and partners who aren’t really aware of the full vision for the platform.

And finally, cloud: BICS had an update a while ago that gave it Visual Analyzer and data mashups earlier than the on-premise release, and, as I covered in my recent UKOUG Tech’15 conference presentation, it’s now possible to upload an on-premise RPD (but not the accompanying catalog, yet) and run it in BICS, giving you immediate availability of VA and data mashups without having to do a full platform upgrade to 12c.

NewImage

In practice there are still some significant blockers for customers looking to move their BI platform wholesale into Oracle Cloud: there’s no ability yet to link your on-premise Active Directory or other security setup to BICS, meaning that you need to recreate all your users as Oracle Cloud users, and there’s very limited support for multiple subject areas, access to on-premise data sources and other more “enterprise” characteristics of an Oracle BI platform. And Data Visualisation Cloud Service (DVCS) has so far been a non-event; for partners the question is why we would get involved and sell this given the very low cost and the lack of any area where we can really add value, while for customers it’s perceived as interesting but too limited to be of any real practical use. Of course, over the long term this is the future – I expect on-premise installs of OBIEE will be the exception rather than the rule in 5 or 10 years’ time – but for now cloud is more “one to monitor for the future” than something to plan for now, as we’re doing with 12c upgrades and new implementations.

So in summary, I’d say with OBIEE 12c we were pleased and surprised to see it out so early, and VA in particular has driven a lot of interest and awareness from customers that has manifested itself in enquiries around upgrades and new-features presentations. The back-end is, for me, the most interesting new part of the release, promising significant time-saving and quality-improving benefits for the IT department, but at present these benefits are more theoretical than realisable until such time as the full BI Modules/multiple Service Instances capability is rolled out later this year or next. Cloud is still “one for the future”; there’s significant interest from customers in moving either part or all of their BI platform to the cloud, but given the enterprise nature of OBIEE it’s likely BI will follow a wider adoption of Oracle Cloud by the customer rather than being the trailblazer, given the need to integrate with cloud security and data sources and to wait for some other product enhancements to catch up with on-premise functionality.

The post OBIEE 12c – Three Months In, What’s the Verdict? appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Upgrade to ODI 12c: Repository and Standalone Agent

Rittman Mead Consulting - Wed, 2016-01-20 22:00

This is my first post, and I hope to have many more to come. My name is Brian Sauer, and I am fairly new here at Rittman Mead and proud to be here. I have close to 5 years of experience with Oracle Data Integrator, both 10g and 11g, and I have been involved in ETL activities for close to 8 years. Prior to using Oracle Data Integrator, most of my ETL work was done in Perl. Ever since being exposed to Oracle and ODI, I have found a home that has been both comforting and challenging. It is a passion of mine to be able to work with any type of database and to get data from point A to point B, making the necessary changes to get it into users’ hands in a meaningful and useful way. Thus, I’ve put together this blog post to assist with the upgrade of ODI 11g to ODI 12c. I hope it’s useful to you.

After going through the process of upgrading from ODI 11g to ODI 12c, I found the documentation to be scattered around a bit, so I began putting together a roadmap of the information that was vital to the upgrade itself. This post is essentially a brief explanation of some of the enhancements added to ODI in 12.1.3.0 and a straightforward guide to the upgrade process. I’m not addressing 12.2.1.0 in this post, as many of our clients have been moving to 12.1.3.0 due to stability and testing issues with 12.2.1.0. I will take a look at 12.2.1.0 in my next post in this series.

ODI 12.1.3.0 was released in November 2014, and along with it came a number of enhancements and significant changes from ODI 11.1.1.7.0. In this post I am going to focus on the upgrade process and some of the more significant changes.

According to the Oracle documentation, the only supported starting points for upgrading to ODI 12c (12.1.3) are the following:

  • Oracle Data Integrator 11g Release 1 (11.1.1.6.0)
  • Oracle Data Integrator 11g Release 1 (11.1.1.7.0)

There are some significant changes to ODI from 11g. First and foremost is the way agents are handled in 12c: Standalone Agents are now managed by the WebLogic Management Framework and are installed in their own directories. Also, interfaces have given way to mappings and reusable mappings. Lastly, the repository upgrade to 12.1.3 validates name uniqueness for objects such as interfaces, folders, procedures, knowledge modules, packages, and profiles. The upgrade process checks for duplicates and, if any exist, lists them in a report in the upgrade log; you’ll then need to reopen ODI 11g, manually fix the duplicates, and restart the upgrade.

Below is a comparison of the ODI JEE Agent Upgrade topology. Notice that in 12c only an LDAP-based store can be used for OPSS as file-based stores are no longer allowed.

Picture 1

You’ll also notice that the topology for Standalone Agents not registered with a WebLogic Domain has changed, as mentioned earlier; they are now part of the WebLogic Management Framework. The agent is configured in a standalone domain, unlike its predecessor in 11g.

Picture 2

For Standalone Agents in 11g that were registered with a WebLogic Domain, nothing has really changed from a topology standpoint, as shown below:

Picture 3

Some considerations to take into account before diving into the upgrade are the following:

  • Verify that the database which will host the schemas required by Fusion Middleware is supported. You can verify at: http://docs.oracle.com/middleware/1213/core/ODIUG/tasklist.htm#BABGAIFA
  • ODI 12.1.3 requires a 64-bit operating system. Verify your machine is 64-bit, and, if not, then upgrade.
  • Verify you are running a supported version (11.1.1.6, 11.1.1.7).
  • Develop a backup and recovery strategy.
  • Plan for system downtime during the upgrade process.

Upgrade Process

BACKUPS

The first and probably most important step before you even begin the upgrade is to make sure you have a solid backup of your 11g environment. There are many ways to do this, but you will want to make sure that you have a backup of the following so you can restore your prior environment in the event that a failure occurs:

  1. ODI Home Directory
  2. ODI Work Repository(s)
  3. ODI Master Repository

To preserve the ODI 11g Home Directory, I simply make a copy of the ODI_HOME directory, zip it up, and save it in another location. To preserve the ODI Master and Work Repositories, simply use the export utility within ODI.

GUID-08FF568A-7A03-4CE1-9F4E-2F2993BF72C6-default

This will provide a saved copy of each. In addition to the backup, record the Repository IDs in the event you need to restore. Lastly, I create a database backup of the ODI schemas that will be upgraded. These are the schemas where the Master and Work Repositories reside.
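
As a rough sketch, those backup steps might look like the following from the command line; all paths, schema names, and credentials below are placeholders for your own environment, and the repository exports themselves are still taken through ODI Studio as described above.

# Zip a copy of the 11g ODI home to a safe location (illustrative paths).
zip -r /backups/odi11g_home.zip /u01/app/oracle/product/odi_11g

# Database-level backup of the repository schemas with Data Pump
# (schema names are placeholders for your master and work repository owners).
expdp system/password@ORCL schemas=ODI_MASTER,ODI_WORK \
  directory=DATA_PUMP_DIR dumpfile=odi11g_repos.dmp logfile=odi11g_repos.log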

Zero or Limited Downtime Approaches

An important question is: how can I upgrade my system with limited or even zero downtime?

Oracle recommends cloning the work and master repositories from your database into another database instance and then upgrading the cloned repositories. This keeps your current 11g repositories active and untouched during the upgrade process, and once the upgrade is complete and you’ve been able to test it out, you can switch to the new 12c repositories as your active ODI implementation. Oracle has provided sample scripts for different database environments at the following location:

https://docs.oracle.com/middleware/1212/odi/ODIUG/tasklist.htm#ODIUG117

As part of your testing, you will want to verify, after you have set up your cloned repositories, that the cloned master repository points to the correct work repository(s). Also, as part of cloning the repositories, you’ll need to copy the row for the master repository in SYSTEM.SCHEMA_VERSION_REGISTRY$ from the old instance to the new one. This is not a change supported by Oracle; however, it is required for the upgrade to be successful.
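
A minimal sketch of that registry copy is shown below. It assumes a database link called ODI11G_SRC on the clone pointing back to the original instance and a repository schema prefix of DEV (both are placeholders), and, as noted above, this is not an Oracle-supported change.

# Hypothetical sketch only: copy the master repository registry row to the clone.
# The ODI11G_SRC database link and the DEV prefix are assumptions; adjust for
# your environment.
sqlplus -s system/password@CLONE <<'EOF'
INSERT INTO system.schema_version_registry$
  SELECT * FROM system.schema_version_registry$@odi11g_src
  WHERE  owner = 'DEV_ODI_REPO';
COMMIT;
EOF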

Another option, other than cloning, is to create a brand new ODI 12c repository in your database and then export your objects from 11g and import them individually into the new 12c repository using ODI Studio. To perform the imports you’ll need to use the upgrade key, which is new to 12c and which I explain later in this post. This is not my recommended approach, as it takes longer and is more tedious, which can lead to missed objects during the export/import process.

However, if you choose to go this route, the export/import order I have used in the past is the following:

  1. Topology
  2. Models
  3. Global Objects
  4. Projects

Installing Oracle Data Integrator 12c Products

Now that backups have been made, we are confident we can recover in the event of an unsuccessful upgrade, and we have decided on a transition strategy, it is time to begin installing our 12c products.

We have a couple of questions to answer first. Are we upgrading with a Standalone Agent or a Java EE Agent? If we are installing an environment with a Java EE Agent, then we will need to download two separate files from Oracle:

  1. Oracle Data Integrator 12cR1 (12.1.3.0.0) for all platforms install file.
  2. Oracle Fusion Middleware Infrastructure (for All Platforms).

If we are installing a standalone environment, as we are in this post, then the second file is not needed. You will need approximately 1.5GB of space for the home directory where you install ODI 12c, and I would recommend an additional 1GB of space to account for the agent domains. The following link gives the ODI 12c Studio installation instructions in detail:

https://docs.oracle.com/middleware/1212/odi/ODING/install.htm#ODING286
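
For reference, launching the installer from the command line looks roughly like this; the jar name is illustrative (use the file you actually downloaded), and a supported 64-bit JDK is assumed.

# Interactive install of ODI 12.1.3 (the jar name is illustrative).
$JAVA_HOME/bin/java -jar odi_121300.jar

# A silent install driven by a pre-built response file is also possible, e.g.:
# $JAVA_HOME/bin/java -jar odi_121300.jar -silent -responseFile /path/to/odi_install.rsp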

Creating/Upgrading the Master and Work Repositories

Now that ODI 12c Studio is installed, we’ll need to create the Master and Work repositories in the database so that ODI 12c Studio can connect to them. To do this, navigate to the <ODI_HOME>/oracle_common/bin directory and run either ./rcu.sh or rcu.bat, depending on your environment.
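
On Linux, for example (ODI_HOME here is a placeholder for your 12c Oracle home):

# Launch the Repository Creation Utility from the 12c home (illustrative path).
cd $ODI_HOME/oracle_common/bin
./rcu.sh        # rcu.bat on Windows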

Before beginning the creation of the 12c repositories, make sure you have a solid backup of the 11g repositories.                

You’ll want to make sure that you have DBA privileges for this step as you’ll be creating new schemas and tables in the database that will require these permissions. The wizard does give you the option of creating the scripts for the DBA to use later if necessary, although it may be best to work with the DBA directly as you create the repositories.

First, you will want to choose an existing prefix since this is how you identify the repository to be upgraded in the DB.

Secondly, you’ll need to check off the following components to identify them in your 12c upgrade:

  1. Oracle AS Repository Components/AS Common Schemas/Common Infrastructure Services which creates the new *_STB schema for 12c.
  2. Oracle Data Integrator/Master and Work Repository

RCU_screenshot1

As you enter the schema passwords and mapping information, the utility will ultimately reach the summary screen where it will show you the new schemas it will create. If any schemas have already been created, be aware that they will not be listed here since they do not need to be created again. At this screen, you click create, and the creation of the new schemas/components will commence.

RCU_screenshot2

Once you have created the new schemas necessary for 12c, start the Upgrade Assistant so the previous version of ODI can be upgraded to 12.1.3.0.0. You will find the Upgrade Assistant at:

<ODI_HOME>/oracle_common/upgrade/bin

You’ll execute the ua.bat or ua file depending on your environment.
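
In other words, something like the following on Linux; the Upgrade Assistant typically writes its logs under the 12c home in oracle_common/upgrade/logs, which is worth checking if anything goes wrong.

# Launch the Upgrade Assistant from the 12c home (illustrative path).
cd $ODI_HOME/oracle_common/upgrade/bin
./ua            # ua.bat on Windows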

When asked what type of upgrade you wish to perform, you will want to select “Schemas” as this will allow you to identify the schemas where the upgrade will take place.

UA1

The upgrade assistant will then list the schemas available to upgrade. When asked what components to upgrade, you will select “Oracle Data Integrator”.

UA2

The only other item I’d draw your attention to is the screen providing the “Upgrade Key”. This key is used to convert 11g object IDs into unique GUIDs. You will want to save this key, as you will need it for any additional 11g objects that may have to be imported at a later time. The key allows ODI to consistently calculate the same GUID for an 11g object when importing it into 12c, and it uniquely identifies the set of repositories that were working together before the upgrade. Since the 12c repository has switched to using GUIDs to better guarantee unique IDs, an 11g ID will not work without the upgrade key.

UA3

As you upgrade to 12c, the question will likely arise as to why you would need this upgrade key. A situation may arise where you need to restore an object, import a scenario, or revert back to a prior topology; this key will allow you to do so with your 11g exports.

Once the Upgrade Assistant has completed, you will want to verify that the version of ODI has been updated in the schema. You can run the following query to verify:

SELECT OWNER, VERSION, STATUS FROM SCHEMA_VERSION_REGISTRY WHERE OWNER = '<prefix>_ODI_REPO';

You should see the version for your schema as 12.1.3.0.0, with a status of 'VALID'.

Creating the Standalone Agent

Our final step is creating the Standalone Agent. As I wrote earlier, this has changed in 12c, since the WebLogic Management Framework is now used for all agents. Because of this, you cannot upgrade 11g standalone agents to 12c, and you will need to create new ones.

We will place our new Standalone Agent in a domain folder which will sit outside of the ODI_HOME.

To create the Standalone Agent, navigate to ODI_HOME/oracle_common/common/bin and execute either config.sh or config.cmd, depending on your operating system.

The first screen you come across will ask you to either create a new domain or update an existing domain. In our case we did not have an existing domain, so we will create a new Standalone Agent Domain. The wizard will default to a location within your ODI_HOME; you will want to change this location so that the domain resides OUTSIDE the ODI_HOME, as I wrote earlier. The reason is that creating the domain inside the ODI_HOME could cause problems when upgrading your ODI software in the future.

agent1

The rest of the steps are straightforward. You’ll choose to create the “Oracle Data Integrator – Standalone Agent – 12.1.3.0[odi]”, enter database configuration information, and choose a user name and password for the Node Configuration. Then, you will be prompted to create your agent.

Once you have completed these steps, start the Node Manager: go to DOMAIN_HOME/bin and run either startNodeManager.cmd or startNodeManager.sh, depending on your operating system. You’ll be prompted to enter the credentials you chose when configuring the node above.

Once you’ve completed these steps, you’ll go through the process of creating your agent within ODI Studio, in both the Physical and Logical Architecture.

imageedit_1_4071130689

Then run the following command depending on your operating system:

./agent.sh -NAME=<AGENT_NAME>

agent.cmd -NAME=<AGENT_NAME>

Once these commands have executed successfully, you’ll want to confirm the agent is running without issue by opening the agent you created in the Physical Architecture in ODI Studio and clicking the “Test” button.

imageedit_1_8629909932
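
As an additional quick check from the command line, you can hit the agent’s URL directly; the host, port and application name below are common defaults rather than guaranteed values, so use whatever is configured in your Physical Architecture.

# The agent should respond if it is up and listening
# (odi-host, 20910 and oraclediagent are illustrative defaults).
curl -s http://odi-host:20910/oraclediagent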

I would shut down and remove the agent that existed in your 11g environment when you remove the 11g ODI home directory. Remember that, as part of our backup, we zipped a copy of the 11g home directory and stored it in another location, so once you have confirmed that ODI 12c is operating as expected, you may remove the 11g home directory if it’s on the same server.

Testing, Upgrade Issues, and Clean Up

The approach I have used when testing is to check each individual mapping and load plan against the results of its 11g counterpart. I verify the row counts initially and then have the users/owners verify the data. I then allow the load plans to run for a few days to a week and, once the users/owners are comfortable with the results, migrate to the new implementation.

That being said, the following issues and clean-up items, which I and some of my colleagues at Rittman Mead have encountered while upgrading to 12c, may need to be addressed:

  • Variables in mappings/KMs are not picked up by the parser when creating scenarios, so they are not exposed as input parameters. Consequently, you’ll want to wrap each mapping in a package or load plan with the variables declared in it.
  • The variables are case-sensitive in 12c, so if you called a variable with a different case in a mapping, you will need to change it to the matching case.
  • Make sure to prefix all variables with the project code or global.
  • Do not start any data store alias with a number as this does not work well in 12c.
  • Temporary interfaces from 11g are triplicated in 12c during the upgrade and require cleanup in both the projects and the models. The upgrade creates each one as a regular mapping, along with a target table in the models, which can be removed. It also creates two copies of each temporary interface in Reusable Mappings, with suffixes of RMS (Reusable Mapping Source) and RMT (Reusable Mapping Target). For the most part the RMT reusable mappings can be removed; however, verify that the mappings using the temporary interface reference the RMS version before deleting.
  • Upgrade the mappings to flow-based using “Convert to Flow” on the dataset. (Optional)
  • You can check out Michael Rainey’s post on this at: http://www.rittmanmead.com/2015/04/di-tips-odi-convert-to-flow/

In my next post, I will go over the process of upgrading the Java EE Agent from 11g to 12c and highlight where the upgrade to 12.2.1.0 differs from the upgrade to 12.1.3.0. Stay tuned!

 

The post Upgrade to ODI 12c: Repository and Standalone Agent appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Mike Hibbert – A Celebration

Rittman Mead Consulting - Tue, 2016-01-19 16:18

Mike Hibbert

It’s been a long while since I graced the blog with my presence and I had always hoped that when I did return, it would be with something new or exciting to say.  Instead, I find myself writing something much more personal and subdued:  a celebration of one of our dear colleagues who, after a two-year fight, lost his battle to cancer just before Christmas 2015.

Joining us in September 2010, Mike Hibbert was one of our longest-serving team members.  We still don’t have too many people with over 5 years’ service, but Mike was one of us.  I was lucky enough to work with Mike from day one:  he was parachuted into the project I was working on to cover the large gaps in my OWB knowledge and experience.  His role on the project was not that of simply building some mappings (we were already 7 months into the 9-month engagement and the OWB estate had already been built).  Instead, Mike’s task was to mentor the client’s OWB team and ensure their ETL development capability was in place.  At the same time, he had to come to terms with the OWB estate, review it and, where necessary, make it production-ready.  Landing on a project late in the day is never an enviable task for us consultants…even more so when it is your first day on the new job!

Mike brought bags of technical experience in both OWB and OBIEE from his previous roles with Edenbrook & Hitachi Consulting, but what impressed me most about Mike in that first project was the way he built up such a good rapport with all the members of the client’s diverse project team.  He gained the team’s trust on the technical side early but also engaged with people at a very personal level.  His approach was friendly and easy-going and he would always show an active interest in everything about the people he was working with, be it understanding their perspective on a particular technical challenge or simply talking about last night’s TV, major family milestones or the travails of our chosen football teams. His sense of humour was disarming and his natural approach stood him in very good stead throughout his time with Rittman Mead.

Mike spent most of his time with us working on the continent, initially in Belgium, but in 2011 he began working with one of our major clients in the Netherlands.  This was the start of a long and valued relationship with the client, where Mike fulfilled several roles, mainly around OBIEE but also getting his hands increasingly dirty with APEX!  The scope of the projects and the personnel involved changed over time, but Mike was a constant.  He championed new agile delivery methods and became a key member of the client’s team, delivering a number of critical solutions to their business units across Europe.  All told, Mike worked with the same client for over 4 years – a mark of his value and importance to them.  He continued to work through his illness and treatment, taking very little time off.  He was still working into his final week.  I came to learn that this was how Mike wanted to tackle his illness, and the courage and determination he showed in facing it this way is an inspiration to us all.

Rittman Mead UK vs Rest of the World 5-a-side, 2013

Mike was a vital member of our team.  He was a founding father of the Rittman Mead Fantasy Football League, now in its 4th year, and, although he never admitted it, I think he was always a little frustrated that he never won the title.  I always looked forward to the start of each season in anticipation of the interesting team names Mike always seemed to come up with…often cryptic, always cheeky and always guaranteed to raise a chuckle!  As a Manchester United fan, we could always rely on Mike to give us his hints and tips on which of the Red Devils we should have in our teams…occasionally, his advice even made some sense!  Mike’s main interests lay in sport.  He kept the illustrious tennis career he had before entering the BI world under wraps for quite a while, but his main pursuits were running and cycling – two things that he put together (with a swim beforehand for good measure) to compete in the occasional triathlon.  Amazingly, he continued his cycling through his illness, raising money for both Prostate Cancer UK and The Christie.  In proposing a UK vs Rest of the World 5-a-side match, little did Mike know that he would initiate the company’s longest e-mail debate.  It caught our imagination and we had our first match during our BI Forum in 2013.  Mike played on the Rest of the World side (justified by his working relationship with the Netherlands!) and, despite giving away a few years to us all, he duly earned the Man of the Match award.  I hope that we will be able to continue these 5-a-side battles in his memory for years to come.

 

On a personal level, Mike and I lived relatively close to each other and, for a period, this meant us sharing flights to/from Manchester airport.  We spent some very early mornings in MAN departures and many an hour waiting for delayed flights from AMS (and inevitably enjoying the odd continental beer!).  We coincidentally earned our FA Level 1 coaching badges at the same time and each of us coached our own sons’ teams.  Our conversations would always eventually turn into a discussion of recent results, tactics and training methods…just another example of Mike’s personable style and his interest in others.

Mike will be sadly missed as a friend and colleague and the collective thoughts and support of Rittman Mead go out to the family that he leaves behind.

We are keen to lend our support to the amazing charitable efforts that Mike started and his family continues. We have some fund-raising ideas, which I think Mike would appreciate and which I hope can come to fruition in the near future. In the meantime, if you knew or worked with Mike it would be great to hear your memories of him.

The post Mike Hibbert – A Celebration appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

ORA-01671 while creating cascaded standby from standby using RMAN DUPLICATE

Amardeep Sidhu - Sat, 2015-12-19 09:20

On a T5 Super Cluster (running 11.2.0.3), I was creating a cascaded standby from an already functional standby using RMAN DUPLICATE, and it errored out with:

ORA-01671: control file is a backup, cannot make a standby control file
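
For context, the duplicate that hits this error is of the general shape sketched below; the connect strings and options are illustrative placeholders rather than the exact command used here.

# Illustrative sketch only: duplicating from the existing standby to the
# new cascaded standby over the network (connect strings are placeholders).
rman target sys/password@existing_stby auxiliary sys/password@new_cascaded_stby <<'EOF'
DUPLICATE TARGET DATABASE FOR STANDBY
  FROM ACTIVE DATABASE
  DORECOVER
  NOFILENAMECHECK;
EOF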

 

A quick search reveals that it is bug 11715084, which affects most of the 11.x versions except 11.2.0.4. There is a one-off patch available for most versions, or you can install a bundle patch that includes the fix. I applied BP26 and it worked fine after that.
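
If you hit the same issue, applying the one-off or bundle patch is the standard OPatch routine; the sketch below uses placeholder paths, and on a clustered environment like this you should follow the patch README for the recommended rolling or opatch auto procedure.

# Rough sketch of a standard OPatch apply (paths are placeholders;
# always follow the specific patch README, especially on RAC/SuperCluster).
export ORACLE_HOME=/u01/app/oracle/product/11.2.0.3/dbhome_1
cd /stage/patches/<unzipped_patch_directory>
$ORACLE_HOME/OPatch/opatch apply
$ORACLE_HOME/OPatch/opatch lsinventory    # confirm the patch/bundle is listed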

Categories: BI & Warehousing

Sangam 15

Amardeep Sidhu - Tue, 2015-11-24 07:34

This was my 6th year at Sangam and, as always, it was good fun. We were a group of 4 people traveling from Delhi, and we reached Hyderabad on Friday morning. We wanted to keep a day for visiting Ramoji Film City and also wanted to avoid the rush that morning travel on the conference’s starting day brings. So, after dropping the luggage at the hotel, we hired a taxi and headed to Ramoji Film City. It is a huge place and it is tiring to move around checking out everything, but fortunately the weather that day was very pleasant, so wandering around was enjoyable. We took a ride on what they call the Space Walk and saw a few sets where some movies were shot. They also have a pretty good bird sanctuary there, with a good number of beautiful birds. Spending time there was nice and fun.

By 7 PM or so we were done with everything and started back to the hotel. As it was already dinner time, we headed straight to Paradise and had some awesome biryani.

Saturday was the first day of the conference. We reached the venue by 8:30 AM and the registration was pretty quick. Before the technical sessions started at 10 AM, we had plenty of time to move around and meet folks, especially people we knew online but had never met in person. For me it was my first chance to meet Tim Hall in person. Simply put, Tim is brilliant; his website is an inspiration for many bloggers. It was great meeting Tim in person and striking up a few conversations about various technologies.

I also met Kamran for the first time in person. We have been connected on social media for quite some time now. It was great catching up with you, mate.

Had last met Francisco in Sangam 10 and this year got a chance to meet him again. The second question (first was how is job

Categories: BI & Warehousing

Data Mining Scoring Development Process

Dylan Wan - Thu, 2015-04-02 23:39

I think that the process of building a data mining scoring engine is similar to developing an application.


We have the requirement analysis, functional design, technical design, coding, testing, deployment, etc. phases.



Categories: BI & Warehousing