Feed aggregator

Smart View Internals: Exploring the Plumbing of Smart View

Tim Tow - Fri, 2013-12-13 17:46
Ever wonder how Smart View stores the information it needs inside an Excel file?  I thought I would take a look to see what I could learn.  First, I created this simple multi-grid retrieve in Smart View.



Note that the file was saved in the Excel 2007 and higher format (with an .xlsx file extension).  Did you know that the xlsx format is really just a specialized zip file?  Seriously.  It is a zip file containing various files that are primarily in xml format.  I saved the workbook, added the .zip extension to the filename, and opened it in WinRar.  Here is what I found.
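
If you would rather not rename the file and use WinRar, the same peek can be scripted. Here is a minimal sketch using Python's standard zipfile module; the filename workbook.xlsx is just a placeholder for your saved Smart View workbook.

import zipfile

# A .xlsx file is an ordinary zip archive; list everything inside the package.
# "workbook.xlsx" is a placeholder for the saved Smart View workbook.
with zipfile.ZipFile("workbook.xlsx") as xlsx:
    for name in xlsx.namelist():
        print(name)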

I opened the xl folder to find a series of files and folders.


Next, I opened the worksheets folder to see what was in there.  Of course, it is a directory of xml files containing the contents of the worksheets.


My Essbase retrieves were on the sheet named Sheet1, so let’s take a look at what is in the sheet1.xml file.   The xml is quite large, so I can’t show all of it here, but needless to say, there is a bunch of information in the file.  The cell contents are only one of the things in the file.  Here is an excerpt that shows the contents of row 5 of the spreadsheet.



This is interesting as it shows the numbers but not the member name.  What is the deal with that?  I noticed there is an attribute, ‘t’, on that node.  I am guessing that the attribute t=”s” means the cell type is a string.  I had noticed that in one of the zip file screenshots, there was a file named sharedStrings.xml.  Hmm...  I took a look at that file and guess what I found?




That's right!  The item at index 5, counting from zero like all good programmers do, is Profit.  That index corresponds perfectly with the value specified in the xml for cell B5, which was five (circled in blue in the xml file above).  OK, so when are we going to get to the Smart View stuff?  The answer is pretty quick.  I continued looking at sheet1.xml and found these nodes near the bottom.
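
To make the indirection concrete, here is a rough Python sketch of how a shared-string cell can be resolved by hand. The element names come from the standard spreadsheetml schema; the workbook.xlsx filename is again just a placeholder.

import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

with zipfile.ZipFile("workbook.xlsx") as xlsx:
    sheet = ET.fromstring(xlsx.read("xl/worksheets/sheet1.xml"))
    strings = ET.fromstring(xlsx.read("xl/sharedStrings.xml"))

# Find cell B5; if its type attribute is "s", the <v> value is an index
# into the zero-based list of <si> entries in sharedStrings.xml.
cell = sheet.find(".//m:c[@r='B5']", NS)
value = cell.find("m:v", NS).text
if cell.get("t") == "s":
    value = strings.findall("m:si", NS)[int(value)].find("m:t", NS).text
print(value)  # index 5 resolves to the string "Profit" in this workbook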

Hmm, custom properties that contain the name Hyperion?  Bingo!  There were a number of custom property files referenced in the xml.  Let's focus on those.

Custom property #1 is identified by the name CellIDs.  The corresponding file, customProperty1.bin, contained only the empty xml node <root />.  Apparently there aren’t any CellIDs in this workbook.

Custom property #2 is identified by the name ConnName.  The file customProperty2.bin contains the string ‘Sample Basic’ which is the name of my connection.

Custom property #3 is named ConnPOV but it appears to contain the connection details in xml format.  Here is an excerpt of the xml.


Custom property #4 is named HyperionPOVXML and the corresponding file contains xml which lines up with the page fields I have in my worksheet.



What is interesting about the POV xml is that I have two different retrieves, each with a working POV selector implemented as a list-type data validation in Excel.  I don't know what happens internally if I save different values for the two POVs.

Custom property #5 is labeled HyperionXML.  It appears to contain the information about the Essbase retrieval, but it doesn't appear to be the actual retrieval xml because it doesn't contain the numeric data.  My guess is that this xml is used to track what is on the worksheet from a Hyperion standpoint.



There is a lot of information in this simple xml stream, but the most interesting information is contained in the slice element.  Below is a close-up of contents in the slice.



The slice covers 6 rows and 7 columns for a total of 42 cells.  The way the Smart View team chose to serialize this XML is interesting for a couple of reasons.  First, the pipe-delimited format means that every cell must be represented regardless of whether it has a value or not.  This really isn't too much of a problem unless your spreadsheet range is pretty sparse.  Second, the xml itself is easy and fast to parse, but the resulting strings need to be parsed again before they are usable.  For example, the vals node gets split into an array of 42 elements, and the code must then loop over those 42 elements and process them individually (a rough sketch of that reparsing follows the list below).  The other nodes, such as status, contain other pieces of information about the grid.  The status codes appear to be cell attributes returned by Essbase; these attributes are used to apply formatting to cells in the same way the Excel add-in's UseStyles option would apply formatting.  There are a couple of things to take away:

  1. In addition to the data on the worksheet itself, there is potentially *a lot* of information stored under the covers in a Smart View file.
  2. String parsing is a computation-intensive operation and can hurt performance.  Multiply that workload by 8 because, depending on the operation and perhaps the provider, all 8 xml nodes above may need to be parsed.
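
To illustrate the second point, here is a small Python sketch of the kind of reparsing described above.  The delimiter handling and the sample payload are assumptions made up for illustration, not the actual Smart View data.

rows, cols = 6, 7                      # dimensions reported by the slice element
vals = "|" * (rows * cols - 1)         # hypothetical pipe-delimited vals payload (mostly empty cells)

cells = vals.split("|")
# every cell position must be represented, so pad or truncate to rows * cols
cells = (cells + [""] * (rows * cols))[: rows * cols]

grid = [cells[r * cols:(r + 1) * cols] for r in range(rows)]
for row in grid:
    print(row)                         # each row is a list of 7 cell strings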

In addition, the number of rows and columns shown in the slice may be important when you are looking at performance.  Smart View must look at the worksheet to determine the size of the range to read in order to send it to Essbase.  In the case of a non-multi-grid retrieve, the range may not be known and, as a result, the grid may be sized based on the UsedRange of the worksheet.  In our work with Dodeca, we have found that workbooks converted from the older xls format to the newer xlsx format, which supports a larger number of cells, may have the UsedRange flagged internally as 65,536 rows by 256 columns.  One culprit appears to be formatting applied to the sheet in a haphazard fashion.  In Dodeca, this caused only a minor issue, a larger-than-necessary memory allocation on the server.  Based on the format of the Smart View xml, as compared to the more efficient design of the Dodeca xml format, if this were to happen in Smart View it could cause a larger problem due to the number of cells that would need to be parsed and processed.  Disclaimer: I did not attempt to replicate this issue in Smart View; this is an educated guess based on my experience with spreadsheet behavior.

Note: The Dodeca xml format does not need to contain information for cells that are blank.  This format reduces the size and the processing cycles necessary to complete the task.  In addition, when we originally designed Dodeca, we tested a format similar to the one used today by Smart View and found it to be slower and less efficient.

Considering all of this information, I believe the xml format would be difficult for the Smart View team to change at this point as it would cause compatibility issues with previously created workbooks.  Further, this discussion should give some visibility to the on-going challenge the Smart View team faces in maintaining compatibility between the different Smart View versions distributed on desktops and the different versions of the internal formats that customers may have stored in their existing Excel files.  I don't envy their job there.

After looking at all of this, I was curious to see what the xml string would look like on a large retrieve, so I opened up Smart View, connected to Sample Basic and drilled to the bottom of the 4 largest dimensions.  The resulting sheet contained nearly 159,000 rows of data.  Interestingly enough, when I looked at the contents of customProperty5.bin inside that xlsx file, the contents were compressed.  That struck me as a bit strange at first because the xlsx file format is already compressed, but after thinking about it for a minute it makes sense: the old xls file format probably did not automatically compress content, so the compression was there primarily to shrink the content when saved in the xls file format.
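
If you want to check whether one of these custom property streams looks compressed, you can sniff it without knowing the exact algorithm.  The sketch below only checks for common gzip/zlib magic bytes, and both the workbook name and the xl/customProperty5.bin path inside the package are assumptions; verify the actual path in your own file first.

import zipfile

with zipfile.ZipFile("bigretrieve.xlsx") as xlsx:    # placeholder filename
    blob = xlsx.read("xl/customProperty5.bin")       # path is a guess; check the package contents first

if blob[:2] == b"\x1f\x8b":
    print("looks gzip-compressed")
elif blob[:1] == b"\x78":
    print("looks zlib-compressed")
else:
    print(blob[:80])                                 # probably plain text or xml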

Custom property #6 is labeled NameConnectionMap.  The corresponding property file contains xml that appears to map the range names in the workbook to the actual grid and the connection.


Custom property #7 is labeled POVPosition. The file customProperty7.bin contains the number 4 followed by a NUL character.  Frankly, I have no idea what position 4 means.

Moving on, custom property #8 is labeled SheetHasParityContent.  This file contains the number 1 followed by a NUL character.  This is obviously a boolean flag that tells the Smart View code that newer features, such as support for multiple grids, are present in this file.

Custom property #9 is labeled SheetOptions.  The corresponding file, customProperty9.bin, contains an xml stream that (obviously) contains the Hyperion options for the sheet.


Custom property #10 is labeled ShowPOV and appears to contain a simple Boolean flag much like that in custom property #8.

Finally, custom property #11 is labeled USER_FORMATTING and may not be related to Smart View.

I did look through some of the other files in the .zip and found a few other references to Smart View, but I did not see anything significant.

So, now that we have completed an overview of what is contained in one, very simple, multi-grid file, what have we learned?

  1. There is a bunch of stuff stored under the covers when you save a Smart View retrieve as an Excel file.
  2. Given the performance issues reported with Smart View in certain situations, you should now have an idea of where to look when troubleshooting those issues in your environment.

There are a number of files I did not cover in this overview that could also cause performance issues.  For example, Oracle support handled one case where they found over 60,000 Excel styles in the file.  Smart View uses Excel Styles when it applies automatic formatting to member and data cells.  When there are that many styles in the workbook, however, it is logical that Excel would have a lot of overhead searching through its internal list of Style objects to find the right one.  Accordingly, there is a styles.xml file that contains custom styles.  If you have a bunch of Style objects, you could delete the internal styles.xml file.
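
If you want to check whether a workbook has accumulated an unreasonable number of styles before resorting to surgery, a quick count of the records in styles.xml is enough.  Here is a minimal sketch using the standard spreadsheetml element names, with workbook.xlsx again a placeholder.

import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

with zipfile.ZipFile("workbook.xlsx") as xlsx:
    styles = ET.fromstring(xlsx.read("xl/styles.xml"))

# cellStyles holds the named styles; cellXfs holds the cell format records.
named = styles.findall(".//m:cellStyles/m:cellStyle", NS)
formats = styles.findall(".//m:cellXfs/m:xf", NS)
print(len(named), "named styles,", len(formats), "cell format records")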

Note: Be sure to make a copy of your original workbook before you mess with the internal structures.  There is a possibility that you may mess it up and lose everything you have in the workbook. Further, Oracle does not support people going under-the-covers and messing with the workbook, so don’t even bring it up to support if you mess something up.

Wow, that should give you some idea of what may be going on behind the scenes with Smart View.  Even with the experience I have designing and writing the Dodeca web services that talk to Essbase, I wouldn't say that I have a deep understanding of how the information in a Smart View workbook really works.  However, one thing is for certain: Dodeca does not put stuff like this in your Excel files.  It may be interesting to hear what you find when you explore the internals of your workbooks.
Categories: BI & Warehousing

Upgrade to Oracle 12c [1Z0-060 ] exam is available now

Syed Jaffar - Fri, 2013-12-13 14:16
A very quick note about the announcement of the Upgrade to Oracle 12c (1Z0-060) exam: it is no longer in beta. The exam has two sections, with 51 and 34 questions respectively and passing scores of 64% and 65%. You must pass both sections in order to earn the Oracle 12c upgrade certification.
However, I felt the exam fee of $245 is a little high; isn't that expensive?

What are you waiting for? Go ahead and upgrade your certification to the latest Oracle release. It's a good chance to move to the latest release with a single upgrade exam, even if your earlier certification was on Oracle 7.3.

Here is the link for more information about the exam:

http://education.oracle.com/pls/web_prod-plq-dad/db_pages.getpage?page_id=5001&get_params=p_exam_id:1Z0-060

https://blogs.oracle.com/certification/entry/0856_18


Good luck with your exam!

OWB to ODI Migration Utility released for Windows 64 bit.

Antonio Romero - Wed, 2013-12-11 15:13

The OWB to ODI Migration Utility is now available for Windows 64-bit platforms. It can be downloaded from the Oracle support site. It is available as Patch number 17830453. It needs to be applied on top of an OWB 11.2.0.4 standalone install.

More information about the Migration Utility is available here.

OWB to ODI Migration Utility Webcast - Thu 12th December

Antonio Romero - Wed, 2013-12-11 13:33

On Thursday 12th December there is a webcast on the OWB to ODI 12c migration utility; there will be a demo and a drill-down into the utility. Check the meeting URL here - it's at 10am PST on 12th December. Check out the blog post here on getting the utility. It's a good chance to get the inside scoop on the utility and ask questions of the PM and development team.


How To Do Single Sign On (SSO) for Web Services

Mark Wilcox - Wed, 2013-12-11 08:38

A recent question on our internal list was

"A customer has OAM and wants to do SSO to SOAP Web Services".

In this case the customer was using Webcenter Content (the product formerly known as Unified Content Manager UCM). But the scenario applies to any SOAP Web Service.

My answer was well received and there isn't anything proprietary here so I thought I would share to make it easier for people to find and for me to refer to later.

First - There is no such thing as SSO in web services.

There is only identity propagation.

Meaning that I log in to OAM as Fabrizio and connect to a Web application protected by OAM.

That Web application is a Web Services client and I want to tell the client to tell the Web Services that Fabrizio is using the service.

The first step to set this up is to protect the web services via OWSM.

The second step is to translate the OAM token into a WS-Security token.

There are 3 ways to do this second step:

1 - If you are writing a manual client and don't want any other product involved - use OAM STS

2 - Use Oracle Service Bus (which most likely will also use OAM STS but should make this a couple of mouse clicks)

3 - Use OAG - which doesn't need to talk to STS. It has a very simple way to convert an OAM token into a WS-Security header.

If you're not using OSB already - I would recommend OAG. It's by far the simplest plus you get the additional benefits of OAG.

PS - You can use OSB and OAG together in many scenarios - I was only saying to avoid OSB here because the service was already exposed and there was no benefit I could see for having OSB. If you have a reason to have OSB - let me know. I only know OSB at a very high level since my area of focus is security.

Why Oozie?

Chen Shapira - Mon, 2013-12-09 13:04

That's a really frequently asked question. Oozie is a workflow manager and scheduler. Most companies already have a workflow scheduler – Activebatch, Autosys, UC4, HP Orchestration. These workflow schedulers run jobs on all their existing databases – Oracle, Netezza, MySQL. Why does Hadoop need its own special workflow scheduler?

As usual, it depends. In general, you can keep using any workflow scheduler that works for you. No need to change, really.
However, Oozie does have some benefits that are worth considering:

  1. Oozie is designed to scale in a Hadoop cluster. Each job will be launched from a different datanode. This means that the workflow load will be balanced and no single machine will become overburdened by launching workflows. This also means that the capacity to launch workflows will grow as the cluster grows.
  2. Oozie is well integrated with Hadoop security. This is especially important in a kerberized cluster. Oozie knows which user submitted the job and will launch all actions as that user, with the proper privileges. It will handle all the authentication details for the user as well.
  3. Oozie is the only workflow manager with built-in Hadoop actions, making workflow development, maintenance and troubleshooting easier.
  4. Oozie UI makes it easier to drill down to specific errors in the data nodes. Other systems would require significantly more work to correlate jobtracker jobs with the workflow actions.
  5. Oozie is proven to scale in some of the world’s largest clusters. The white paper discusses a deployment at Yahoo! that can handle 1250 job submissions a minute.
  6. Oozie gets callbacks from MapReduce jobs so it knows when they finish and whether they hang without expensive polling. No other workflow manager can do this.
  7. Oozie Coordinator allows triggering actions when files arrive at HDFS. This will be challenging to implement anywhere else.
  8. Oozie is supported by Hadoop vendors. If there is ever an issue with how the workflow manager integrates with Hadoop – you can turn to the people who wrote the code for answers.

So, should you use Oozie? If you find these benefits compelling, then yes. Step out of your comfort zone and learn another new tool. It will be worth it.


Categories: DBA Blogs

Adding Statcounter to Contao-CMS

Dietrich Schroff - Sun, 2013-12-08 03:47
If you are running a Contao-based website and you want to add the Statcounter.com snippet for statistics, there is no manual on how to do this. I tried a few different things, and this was my solution:

  • Create a file named counter.html on your webserver at the top level of the Contao installation (htdocs or something equivalent).
  • Put the Statcounter snippet in this file:
<!-- Start of StatCounter Code for Default Guide -->
<script type="text/javascript">
var sc_project=NNN; 
var sc_invisible=1; 
...
  • Then add the following next to the last line inside index.php:
include 'counter.html';
  • After that you will be able to get your statistics via Statcounter.com....

    My history with Big Data

    Tahiti Views - Sat, 2013-12-07 19:28
    Before I joined Cloudera, I hadn't had much formal experience with Big Data. But I had crossed paths with one of its major use cases before, so I found it easy to pick up the mindset. My previous big project involved a relational database hooked up to a web server. Naturally I wanted to be able to track visitor stats, detect denial-of-service attacks, and chart the most popular pages...

    Oracle Linux containers continued

    Wim Coekaerts - Fri, 2013-12-06 16:10
    More on Linux containers... the use of btrfs in particular and being able to easily create clones/snapshots of container images. To get started : have an Oracle Linux 6.5 installation with UEKr3 and lxc installed and configured.

    lxc by default uses /container as the directory to store container images and metadata. /container/[containername]/rootfs and /container/[containername]/config. You can specify an alternative pathname using -P. To make it easy I added an extra disk to my VM that I use to try out containers (xvdc) and then just mount that volume under /container.

    - Create btrfs volume

    If not yet installed, install btrfs-progs (yum install btrfs-progs)

    # mkfs.btrfs /dev/xvdc1
    
    # mount /dev/xvdc1 /container 
    
    You can auto-mount this at startup by adding a line to /etc/fstab

    /dev/xvdc1		/container		btrfs   defaults 0 0
    

    - Create a container

    # lxc-create -n OracleLinux59 -t oracle -- -R 5.9
    
    This creates a btrfs subvolume /container/OracleLinux59/rootfs

    Use the following command to verify :

    # btrfs subvolume list /container/
    ID 260 gen 33 top level 5 path OracleLinux59/rootfs
    

    - Start/Stop container

    # lxc-start -n OracleLinux59
    

    This starts the container but without extra options your current shell becomes the console of the container.
    Add -c [file] and -d for the container to log console output to a file and return control to the shell after starting the container.

    # lxc-start -n OracleLinux59 -d -c /tmp/OL59console
    
    # lxc-stop -n OracleLinux59
    

    - Clone a container using btrfs's snapshot feature which is built into lxc

    # lxc-clone -o OracleLinux59 -n OracleLinux59-dev1 -s
    Tweaking configuration
    Copying rootfs...
    Create a snapshot of '/container/OracleLinux59/rootfs' in '/container/OracleLinux59-dev1/rootfs'
    Updating rootfs...
    'OracleLinux59-dev1' created
    
    # btrfs subvolume list /container/
    ID 260 gen 34 top level 5 path OracleLinux59/rootfs
    ID 263 gen 34 top level 5 path OracleLinux59-dev1/rootfs
    

    This snapshot clone is instantaneous and is a copy on write snapshot.
    You can test space usage like this :

    # btrfs filesystem df /container
    Data: total=1.01GB, used=335.17MB
    System: total=4.00MB, used=4.00KB
    Metadata: total=264.00MB, used=25.25MB
    
    # lxc-clone -o OracleLinux59 -n OracleLinux59-dev2 -s
    Tweaking configuration
    Copying rootfs...
    Create a snapshot of '/container/OracleLinux59/rootfs' in '/container/OracleLinux59-dev2/rootfs'
    Updating rootfs...
    'OracleLinux59-dev2' created
    
    # btrfs filesystem df /container
    Data: total=1.01GB, used=335.17MB
    System: total=4.00MB, used=4.00KB
    Metadata: total=264.00MB, used=25.29MB
    

    - Adding Oracle Linux 6.5

    # lxc-create -n OracleLinux65 -t oracle -- -R 6.5
    
    lxc-create: No config file specified, using the default config /etc/lxc/default.conf
    Host is OracleServer 6.5
    Create configuration file /container/OracleLinux65/config
    Downloading release 6.5 for x86_64
    ...
    Configuring container for Oracle Linux 6.5
    Added container user:oracle password:oracle
    Added container user:root password:root
    Container : /container/OracleLinux65/rootfs
    Config    : /container/OracleLinux65/config
    Network   : eth0 (veth) on virbr0
    'oracle' template installed
    'OracleLinux65' created
    

    - Install an RPM in a running container

    # lxc-attach -n OracleLinux59-dev1 -- yum install mysql
    Setting up Install Process
    Resolving Dependencies
    --> Running transaction check
    ---> Package mysql.i386 0:5.0.95-3.el5 set to be updated
    ..
    Complete!
    

    This connects to the container and executes # yum install mysql inside the container.

    - Modify container resource usage

    # lxc-cgroup -n OracleLinux59-dev1 memory.limit_in_bytes 53687091
    
    # lxc-cgroup -n OracleLinux59-dev1 cpuset.cpus
    0-3
    
    # lxc-cgroup -n OracleLinux59-dev1 cpuset.cpus 0,1
    

    Assigns cores 0 and 1. You can also use a range 0-2,...

    # lxc-cgroup -n OracleLinux59-dev1 cpu.shares
    1024
    
    # lxc-cgroup -n OracleLinux59-dev1 cpu.shares 100
    
    # lxc-cgroup -n OracleLinux59-dev1 cpu.shares
    100
    
    # lxc-cgroup -n OracleLinux59-dev1 blkio.weight
    500
    
    # lxc-cgroup -n OracleLinux59-dev1 blkio.weight 20
    
    

    etc...
    A list of resource control parameters : http://docs.oracle.com/cd/E37670_01/E37355/html/ol_subsystems_cgroups.html#ol_cpu_cgroups

    Lenz has created a Hands-on lab which you can find here : https://wikis.oracle.com/display/oraclelinux/Hands-on+Lab+-+Linux+Containers

    Parameterizing Hive Actions in Oozie Workflows

    Chen Shapira - Fri, 2013-12-06 12:30

    A very common request I get from my customers is to parameterize the query executed by a Hive action in their Oozie workflow.
    For example, the dates used in the query depend on a result of a previous action. Or maybe they depend on something completely external to the system – the operator just decides to run the workflow on specific dates.

    There are many ways to do this, including using EL expressions, capturing output from shell action or java action.
    Here’s an example of how to pass the parameters through the command line. This assumes that whoever triggers the workflow (Human or an external system) has the correct value and just needs to pass it to the workflow so it will be used by the query.

    Here’s what the query looks like:

    insert into test select * from test2 where dt=${MYDATE}
    

    MYDATE is the parameter that allows me to run the query on a different date each time. When running this query in Hive, I'd use something like "set MYDATE='2011-10-10'" before running the query. But when I run it from Oozie, I need to pass the parameter to the Hive action.

    Let's assume I saved the query in a file named hive1.hql. Here's what the Oozie workflow would look like:

    <workflow-app name="cmd-param-demo" xmlns="uri:oozie:workflow:0.4">
    	<start to="hive-demo"/>
    	<action name="hive-demo">
    		<hive xmlns="uri:oozie:hive-action:0.2">
    			<job-tracker>${jobTracker}</job-tracker>
    			<name-node>${nameNode}</name-node>
    			<job-xml>${hiveSiteXML}</job-xml>
    			<script>${dbScripts}/hive1.hql</script>
    			<param>MYDATE=${MYDATE}</param>
    		</hive>
    		<ok to="end"/>
    		<error to="kill"/>
    	</action>
    	<kill name="kill">
    		<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    	</kill>
    	<end name="end"/>
    </workflow-app>
    

    The important line is “MYDATE=${MYDATE}”. Here I translate an Oozie parameter to a parameter that will be used by the Hive script. Don’t forget to copy hive-site.xml and hive1.hql to HDFS! Oozie actions can run on any datanode and will not read files on local file system.

    And here’s how you call Oozie with the commandline parameter:
    oozie job -oozie http://myserver:11000/oozie -config ~/workflow/job.properties -run -verbose -DMYDATE='2013-11-15'

    That's it!


    Categories: DBA Blogs

    It’s a Miracle

    Mathias Magnusson - Thu, 2013-12-05 14:20

    Time to get back into blogging. I stopped a while ago and the reason was two-fold. As I was leaving my job at Kentor AB I wanted to avoid any concerns with what I wrote while my three month long two week notice played out. The other reason was that once I had left life got really busy both with work for my client as well as with the new adventure I had departed on.

    The new adventure is to bring the Miracle brand to Sweden. I will try to build up an Oracle database focused team here in Stockholm based on the success Miracle A/S has had in Denmark. Yes it is equally exciting as it is scary for someone who has lived their life in medium to large scale consulting practices.

    However, I have always had a big admiration for the guys who started Miracle and what they have achieved. Getting a chance to bring the same style and quality to my home market of Stockholm and Sweden was just an offer that was too good to pass up. That is the kind of opportunity most of us will only get once or twice in a career. This one came at a time when it was close enough to work with everything else that was going on. The one thing that may not be optimal is having a new house built at the same time. Actually, that makes it completely not optimal. But the phrase I keep repeating to others who are wondering when the best time to get started is "There is no time such as the present", so I took my own advice for once.

    So now the work is on getting the Oracle practice going, or rather it is going it just needs a few more legs. And with legs I mean consultants.

    Next is however partying with our colleagues down in Copenhagen close to the headquarters. Tomorrow evening is when the Christmas party kicks off. Who knows when it will end. :-)

    This blog will soon return to a more technical program, though probably with some posts on interesting aspects of starting up a company. I'm sure the next year will teach us a lot about how to do that.

    About that, this hiring business… When is the best time to add more people to a small company and how do we manage risk? Well… ehhh… Yeah… That's right, there is no best time, but there is no time such as the present.


    Largest Tables Including Indexes and LOBs

    Jeremy Schneider - Wed, 2013-12-04 16:04

    Just a quick code snippet. I do a lot of data pumps to move schemas between different databases; for example, taking a copy of a schema to an internal database to try to reproduce a problem. Some of these schemas have some very large tables. The large tables aren't always needed to research a particular problem.

    Here’s a quick bit of SQL to list the 20 largest tables by total size – including space used by indexes and LOBs. A quick search on google didn’t reveal anything similar so I just wrote something up myself. I’m pretty sure this is somewhat efficient; if there’s a better way to do it then let me know! I’m posting here so I can reference it in the future. :)

    with segment_rollup as (
      select owner, table_name, owner segment_owner, table_name segment_name from dba_tables
        union all
      select table_owner, table_name, owner segment_owner, index_name segment_name from dba_indexes
        union all
      select owner, table_name, owner segment_owner, segment_name from dba_lobs
        union all
      select owner, table_name, owner segment_owner, index_name segment_name from dba_lobs
    ), ranked_tables as (
      select rank() over (order by sum(blocks) desc) rank, sum(blocks) blocks, r.owner, r.table_name
      from segment_rollup r, dba_segments s
      where s.owner=r.segment_owner and s.segment_name=r.segment_name
        and r.owner=upper('&schema_name')
      group by r.owner, r.table_name
    )
    select rank, round(blocks*8/1024) mb, table_name
    from ranked_tables
    where rank<=20;
    

    The output looks like this:

    Enter value for schema_name: someschema
    
          RANK         MB TABLE_NAME
    ---------- ---------- ------------------------------
             1      14095 REALLYBIGTABLE_USESLOBS
             2       6695 VERYBIG_MORELOBS
             3       5762 VERYLARGE
             4       5547 BIGBIG_LOTSOFINDEXES
             5        446 MORE_REASONABLE
             6        412 REASONABLE_TABLE_2
             7        377 ANOTHERONE
             8        296 TABLE1235
             9        280 ANOTHER_MADEUP_NAME
            10        141 I_CANT_POST_PROD_NAMES_HERE
            11         99 SMALLERTABLE
            12         92 NICESIZETABLE
            13         89 ILIKETHISTABLE
            14         62 DATATABLE
            15         53 NODATATABLE
            16         48 NOSQLTABLE
            17         30 HERES_ANOTHER_TABLE
            18         28 TINYTABLE
            19         24 ACTUALLY_THERES_400_MORE_TABLES
            20         19 GLAD_I_DIDNT_LIST_THEM_ALL
    
    20 rows selected.
    

    And just a quick reminder – the syntax to exclude a table from a data pump schema export is:

    expdp ... exclude=SCHEMA_EXPORT/TABLE:[TABNAME],[TABNAME],...
    

    Hope this is useful!

    Joint Webinar with Oracle – Top 5 Key Reasons Why Oracle VM is Best for Oracle Database

    Oracle NZ - Wed, 2013-12-04 14:50

    Top 5 Key Reasons Why Oracle VM is Best for Oracle Database


    When: Tuesday December 17th , 10am PST

    Register Here:

    http://event.on24.com/r.htm?e=725350&s=1&k=1512E332202610FE9518AB8B01354C6A&partnerref=Francisco

    Oracle VM application-driven architecture is designed for rapid deployment and ease of lifecycle management providing IT a highly scalable and cost effective virtualization solution for Oracle Database and business-critical applications.

    Join us in this webcast featuring Francisco Munoz Alvarez, a seasoned expert and Oracle ACE Director, as he shares his performance benchmark findings highlighting key reasons why Oracle VM is the best virtualization technology for Oracle Databases.



    Categories: DBA Blogs

    Oracle Linux containers

    Wim Coekaerts - Wed, 2013-12-04 14:24
    So I played a bit with docker yesterday (really cool) and as I mentioned, it uses lxc (linux containers) underneath the covers. To create an image based on OL6, I used febootstrap, which works fine but Dwight Engen pointed out that I should just use lxc-create since it does all the work for you.

    Dwight's one of the major contributors to lxc. One of the things he did a while back, was adding support in lxc-create to understand how to create Oracle Linux images. All you have to do is provide a version number and it will figure out which yum repos to connect to on http://public-yum.oracle.com and download the required rpms and install them in a local subdirectory. This is of course superconvenient and incredibly fast. So... I played with that briefly this morning and here's the very short summary.

    Start out with a standard Oracle Linux 6.5 install and uek3. Make sure to add/install lxc if it's not yet there (yum install lxc) and you're good to go.

    *note - you also have to create /container for lxc - so also do mkdir /container after you install lxc; thanks to Tony for pointing this out.

    # lxc-create -n ol65 -t oracle -- -R 6.5
    

    That's it. lxc-create will know this is an Oracle Linux container, using OL6.5's repository to create the container named ol65.

    lxc-create automatically connects to public-yum, figures out which repos to use for 6.5, downloads all required rpms and generates the container. At the end you will see :

    Configuring container for Oracle Linux 6.5
    Added container user:oracle password:oracle
    Added container user:root password:root
    Container : /container/ol65/rootfs
    Config    : /container/ol65/config
    Network   : eth0 (veth) on virbr0
    'oracle' template installed
    'ol65' created
    

    Now all you need to do is :

    lxc-start --name ol65
    

    And you are up and running with a new container. Very fast, very easy.

    If you want an OL5.9 container (or so) just do lxc-create -n ol59 -t oracle -- -R 5.9. Done. lxc has tons of very cool features, which I will get into more later. You can use this model to import images into docker as well, instead of using febootstrap.

    #  lxc-create -n ol65 -t oracle -- -R 6.5
    #  tar --numeric-owner -jcp -C /container/ol65/rootfs . | \
        docker import - ol6.5
    #  lxc-destroy -n ol65
    

    Oracle Linux 6.5 and Docker

    Wim Coekaerts - Tue, 2013-12-03 23:21
    I have been following the Docker project with great interest for a little while now but never got to actually try it out at all. I found a little bit of time tonight to at least try hello world.

    Since docker relies on cgroups and lxc, it should be easy with uek3. We provide official support for lxc, we are in fact a big contributor to the lxc project (shout out to Dwight Engen) and the docker website says that you need to be on 3.8 for it to just work. So, OL6.5 + UEK3 seems like the perfect combination to start out with.

    Here are the steps to do few very simple things:

    - Install Oracle Linux 6.5 (with the default UEK3 kernel (3.8.13))

    - To quickly play with docker you can just use their example

    (*) if you are behind a firewall, set your HTTP_PROXY

    -> If you start from a Basic Oracle Linux 6.5 installation, install lxc first. Your out-of-the-box OL should be configured to access the public-yum repositories.

    # yum install lxc
    

    -> ensure you mount the cgroups fs

    # mkdir -p /cgroup ; mount none -t cgroup /cgroup
    

    -> grab the docker binary

    # wget https://get.docker.io/builds/Linux/x86_64/docker-latest -O docker
    # chmod 755 docker
    

    -> start the daemon

    (*) again, if you are behind a firewall, set your HTTP_PROXY setting (http_proxy won't work with docker)

    # ./docker -d &
    
    -> you can verify if it works

    # ./docker version
    Client version: 0.7.0
    Go version (client): go1.2rc5
    Git commit (client): 0d078b6
    Server version: 0.7.0
    Git commit (server): 0d078b6
    Go version (server): go1.2rc5
    

    -> now you can try to download an example using ubuntu (we will have to get OL up there :))

    # ./docker run -i -t ubuntu /bin/bash
    

    this will go and pull in the ubuntu template and run bash inside

    # ./docker run -i -t ubuntu /bin/bash
    WARNING: IPv4 forwarding is disabled.
    root@7ff7c2bae124:/# 
    

    and now I have a shell inside ubuntu!

    -> ok so now on to playing with OL6. Let's create and import a small OL6 image.

    -> first install febootstrap so that we can create an image

    # yum install febootstrap
    

    -> now you have to point to a place where you have the repo xml metadata and the packages on an http server. I copied my ISO content over to a web-accessible location.

    I will install some basic packages in the subdirectory ol6 (it will create an OL installed image - this is based on what folks did for centos so it works the same: https://github.com/dotcloud/docker/blob/master/contrib/mkimage-centos.sh)

    # febootstrap -i bash -i coreutils -i tar -i bzip2 -i gzip \
    -i vim-minimal -i wget -i patch -i diffutils -i iproute -i yum ol6 ol6 http://wcoekaer-srv/ol/
    
    # touch ol6/etc/resolv.conf
    # touch ol6/sbin/init
    

    -> tar it up and import it

    # tar --numeric-owner -jcpf ol6.tar.gz -C ol6 .
    # cat ol6.tar.gz | ./docker import - ol6
    

    Success!

    List the image

    # ./docker images
    
    # ./docker images
    REPOSITORY          TAG                 IMAGE ID      
          CREATED             SIZE
    ol6                 latest              d389ed8db59d    
          8 minutes ago       322.7 MB (virtual 322.7 MB)
    ubuntu              12.04               8dbd9e392a96     
          7 months ago        128 MB (virtual 128 MB)
    

    And now I have a docker image with ol6 that I can play with!

    # ./docker run -i -t ol6 ps aux
    WARNING: IPv4 forwarding is disabled.
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    root         1  1.0  0.0  11264   656 ?        R+   23:58   0:00 ps aux
    

    Way more to do but this all just worked out of the box!

    # ./docker run ol6 /bin/echo hello world
    WARNING: IPv4 forwarding is disabled.
    hello world
    

    That's it for now. Next time, I will try to create a mysql/ol6 image and various other things.

    This really shows the power of containers on Linux and Linux itself. We have all these various Linux distributions but inside lxc (or docker) you can run ubuntu, debian, gentoo, yourowncustomcrazything and it will just run, old versions of OL, newer versions of OL, all on the same host kernel.

    I can run OL6.5 and create OL4, OL5, OL6 containers or docker images but I can also run any old debian or slackware images at the same time.

    Android Update: 4.4

    Dietrich Schroff - Tue, 2013-12-03 14:44
    After nearly everyone got the update, my device came up with this tiny icon:
    And here we go:

     If you follow the link you will get the following information:
    • KitKat 4.4: Smart, simple, and truly yours 
    • and several other enhancements of version 4.4
    And here is the system information after the update:
    For a complete history of all updates visit this posting.

    OSP #2c: Build a Standard Platform from the Bottom-Up

    Jeremy Schneider - Mon, 2013-12-02 15:07

    This is the fourth of twelve articles in a series called Operationally Scalable Practices. The first article gives an introduction and the second article contains a general overview. In short, this series suggests a comprehensive and cogent blueprint to best position organizations and DBAs for growth.

    This article – building a standard platform – has been broken into three parts. We’ve already discussed standardization in general and looked in-depth at storage. Now it’s time to look in-depth at three more key decisions: CPU and memory and networking.

    Consolidation

    One of the key ideas of Operationally Scalable Practices is to start early with standards that don’t get in the way of consolidation. As you grow, consolidation will be increasingly important – saving both money and time. Before we dig into specifics of standardizing CPU and memory, we need to briefly discuss consolidation in general.

    Consolidation can happen at many levels:

    1. Single schema and multiple customers
    2. Single database and multiple schemas or tenants (12c CDB)
    3. Single OS and multiple databases
    4. Single hardware and multiple OS’s (virtualization)

    Two important points about this list. First, it works a lot like performance tuning: the biggest wins are always highest in the stack. If you want to save time and money then you should push to consolidate as high as possible, ideally in the application. But there are often forces pushing consolidation lower in the stack as well. For example:

    • Google doesn’t spin up new VMs every time a new customer signs up for Google Apps. Their existing webapp stack handles new customers. This is a great model – but if your app wasn’t designed this way from the beginning, it could require a massive development effort to add it.
    • It’s obvious but worth stating: you can only push consolidation up a homogenous stack. If the DB runs on linux and the app runs on windows then naturally they’ll each need their own VM. Same goes for the other three tiers.
    • Server operating systems have robust multiuser capabilities – but sharing an Operating System can still be tricky and these days virtualization offers a strong value proposition (especially when combined with automation). Then there are containers, which fall somewhere in between single OS and virtualization.
    • Security or regulatory or contractual requirements may require separate storage, separate databases or separate operating systems.
    • A requirement for independent failover may drive separate databases. In data guard, whole databases (or whole container databases) must be failed over as a single unit.

    The second important point is that realistically you will encounter all four levels of consolidation at some point as you grow. Great standards accommodate them all.

    CPU

    In my opinion, batch workloads can vary but interactive workloads should always be CPU-bound (not I/O-bound). To put it another way: there are times when your database is mainly servicing some app where end-users are clicking around. At those times, your “top activity” graph in enterprise manager should primarily be green. Not blue, not red, not any other color. (And not too much of that green!) I’m not talking about reports, backups, or scheduled jobs – just the interactive application itself. (Ideally you even have some way to distinguish between different categories of activity, in which case there are ways to look at the profile of the interactive traffic even when there is other activity in the database!)

    This leads into the question of how much CPU you need. I don’t have any hard and fast rules for CPU minimums in a standard configuration. Just two important thoughts:

    1. Maximum unit of consolidation: CPU is a major factor in how many applications can be consolidated on a single server. (Assuming that we’re talking about interactive applications with effective DB caching – these should be primarily CPU-bound.)
    2. Minimum unit of licensing: If partitioning or encryption becomes a requirement for you six months down the road then you’ll have to license the number of cores in one server. Oracle requires you to license all CPUs physically present in the server if any feature is used on that server.

    The goal is to limit future purchasing to this configuration. And as with storage, if you really must have more than one configuration, then try to keep it down to two (like a high-CPU option).

    Memory

    I don’t have a formula to tell you how much memory you should standardize on either. It’s surprising how often SGAs are still poorly sized today – both too small and too large. You need to understand your own applications and their behavior. It’s worthwhile to spend some time reading sar or AWR reports and looking at historical activity graphs.

    Once you start to get a rough idea what your typical workload looks like, I would simply suggest to round up as you make the final decision on standard total server memory capacity. There are two reasons for this:

    1. OS and database consolidation have higher memory requirements. Application and schema/multitenant consolidation will not be as demanding on memory – but as we pointed out earlier, your standards should support all levels of consolidation.
    2. You’re probably not maxing out the memory capacity of your server and it’s probably not that expensive to bump it up a little bit.
    Consolidation Level                  Common Bottleneck
    Single Schema (Multiple Customers)   CPU
    Multiple Schemas/PDBs                CPU
    Multiple Databases                   Memory
    Multiple OS’s (VMs)                  Memory

    Networking

    Small companies generally start with one network. But these days, networking can quickly get complicated even at small companies since network gear allows you to define and deploy multiple logical networks on the physical equipment. Early on, even if it doesn’t all seem relevant yet, I would recommend discussing these networking topics:

    • Current traffic: Are you gathering data on current network usage? Do you know how much bandwidth is used by various services, and how bursty those services are?
    • Logical segregation: Which network should be used for application traffic? What about monitoring traffic, backup traffic, replication traffic (e.g. data guard or goldengate) and operations traffic (kickstarts, data copies between environments, etc)? What about I/O traffic (e.g. NFS or iSCSI)? What is the growth strategy and how will this likely evolve over the coming years?
    • Physical connections: How many connections do we need, accounting for redundancy and isolation/performance requirements and any necessary physical network separation?
    • Clustering: Clustering generally requires a dedicated private network and tons of IPs (on both the private cluster network and your corporate network). Sometimes it has higher bandwidth and latency requirements than usual. Generally it is recommended to deploy RAC on at least 10G ethernet for the interconnect. Is there a general strategy for how this will be addressed when the need arises?

    It will benefit you greatly to take these discussions into consideration early and account for growth as you build your standard platform.

    Slots

    One design pattern that I’ve found to be helpful is the idea of slots. The basic idea is similar to physical PCI or DIMM slots – but these are logical “slots” which databases or VMs can use. This is a simplified, practical version of the service catalog concept borrowed from ITIL for private cloud architectures – and this can provide a strong basis if you grow or migrate to that point.

    1. Determine the smallest amount of memory which a standardized database (SGA) or VM will use. This will determine a slot size.
    2. Determine the largest amount of memory which can be allocated on the server. For databases, about 70% of server memory for SGA is a good starting point if it’s an interactive system. For VMs it’s possible to even allow more memory than is physically present but I don’t know the latest conventional wisdom about doing this.
    3. Choose additional DB or VM size options as even multiples of the minimum size.

    For example, a database server containing 64GB of memory might have a slot size of 5GB with 9 total available slots. Anyone who wants a database can choose either a small or large database; a small database uses 1 slot and its SGA is 5GB. A large database uses 5 slots and its SGA is 25GB.
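
    As a quick sanity check, the slot arithmetic is easy to script. The figures below are simply the example numbers from this article (64 GB server, 5 GB slot size, roughly 70% of memory reserved for SGAs), so treat it as a back-of-the-envelope sketch rather than a sizing tool.

    server_memory_gb = 64
    slot_size_gb = 5                      # smallest standard SGA
    sga_budget_gb = server_memory_gb * 0.70

    # round to the nearest whole slot; ~45 GB of SGA budget gives 9 slots of 5 GB
    total_slots = round(sga_budget_gb / slot_size_gb)

    small_db_sga_gb = 1 * slot_size_gb    # small database = 1 slot
    large_db_sga_gb = 5 * slot_size_gb    # large database = 5 slots

    print(total_slots, small_db_sga_gb, large_db_sga_gb)   # 9, 5, 25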

    After the basic slot definition has been decided, CPU limits can be drafted. If the database server has 8 physical cores then the small database might have a hard limit of 2 CPUs and a large database might have a hard limit of 6 CPUs.

    One area which can be confusing with CPU limits is factoring in processor threads. When determining your limits for a consolidation environment, make sure that individual applications are capped before pushing the total load over the physical number of CPUs. But allow the aggregate workload to approach the logical number of CPUs in a period of general heavy load coming from lots of applications.

    In practice, that means:

    1. For multiple databases, set cpu_limit on each one low according to the physical count and calibrate the aggregate total against the logical count.
    2. For multiple schemas in a single database: use resource manager to limit CPU for each schema according to physical count and set cpu_count high according to logical count.
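
    To make that calibration concrete, here is a tiny hypothetical sketch; the core counts and per-database caps are made-up numbers in the spirit of the example above, not a recommendation.

    physical_cores, logical_cpus = 8, 16
    db_caps = {"small_db_1": 2, "small_db_2": 2, "large_db": 6}    # hypothetical consolidated databases

    # each individual database is capped below the physical core count...
    assert all(cap < physical_cores for cap in db_caps.values())

    # ...while the aggregate of the caps is allowed to approach the logical count
    aggregate = sum(db_caps.values())
    print(aggregate, "of", logical_cpus, "logical CPUs")           # 10 of 16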

    Slot Example

    Now you have a first draft of memory and CPU definitions for a small and large database. The next step is to define the application workload limits for each database size. As you’re consolidating applications into a few databases, how many instances of your app can be allowed in a small and large database respectively?

    Suppose you’re a SAAS company who hosts and manages lots of SAP databases for small businesses. I don’t actually know what the CPU or memory requirements of SAP are so I’m making these numbers up – but you might decide that a small database (5GB/2cpu) can support one SAP instance and a large database (25GB/6cpu) can support 25 instances (with PDBs).

    Remember that schema/multi-tenant consolidation is very efficient – so you can service many more applications with less memory compared to multiple databases. For a starting point, make sure that the large database uses more than half of the slots then use server CPU capacity to determine how many app instances can be serviced by a large database.

    Another observation is that your production system probably uses more CPU than your test and dev systems. You may be able to double or triple the app instance limits for non-production servers.

    It’s an iterative process to find the right slot sizes and workload limits. But the payoff is huge: something that’s easy to draw on a whiteboard and explain to stakeholders. Your management has some concrete figures to work with when projecting hardware needs against potential growth. The bottom line is that you have flexible yet strong standards – which will enable rapid growth while easing management.

    Example Slot Definitions

    Database Size    Slots    Max SAP Instances (Production Server)    Max SAP Instances (Test Server)
    Small            1        1                                         2
    Large            5        25                                        50
    Standard Server:
    - 8 Cores
    - 64 GB Memory
    - 9 Slots
    
    nr_hugepages = 23552 (45 GB plus 1 GB extra)
    
    Standard Database:
    
    sga_target/max_size = [slots * 5] GB
    pga_aggregate_target = [slots * 2] GB
    cpu_count = [slots + 1]
    processes = [slots * 400]
    
