We have reached the last article of this series. To close it out, we will talk about the Distributed Transaction Coordinator, or simply DTC. I’ll start with a simple explanation, and after that I’ll demonstrate how to prepare the DTC for a clustered instance.
What is DTC (MS DTC)?
The MS DTC is an OS-level service, which comes automatically installed and running under the Network Service account. Its role is to ensure that a distributed transaction is consistent, even in the presence of failures.
Those transactions might be initiated when a transaction deals with data on multiple computers over the network, or when it deals with multiple processes on a single computer.
All the participants of a distributed transaction work in sync with the other participants (computers) involved in the transaction, looking for the right moment to commit or abort their work. For this reason, we need to make sure that the computers can reach each other.
Do I need to configure MS DTC on my environment?
The answer to this question is the standard one for almost anything involving SQL Server: it depends. You need to understand whether or not you will perform distributed transactions. If you have more than one instance on the same computer, you won’t need the DTC. On the other hand, if you have a two-node cluster with two clustered instances communicating with each other, you will need the DTC, since the instances could be running on different nodes. Another possible scenario is when you have both the database engine and SSIS installed; in this case you will also need to configure the DTC.
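To make the distributed-transaction scenario concrete, here is a minimal, hedged T-SQL sketch: an update that spans a linked server, which is exactly the kind of work that enlists MS DTC. The linked server name (LINKED_SRV) and the table names are hypothetical.

BEGIN DISTRIBUTED TRANSACTION;

    -- local update
    UPDATE dbo.Orders
    SET    Status = 'Shipped'
    WHERE  OrderID = 42;

    -- remote update through a linked server; this is what pulls MS DTC in
    UPDATE LINKED_SRV.Warehouse.dbo.Stock
    SET    Qty = Qty - 1
    WHERE  ItemID = 7;

COMMIT TRANSACTION;

If the two servers cannot reach each other’s DTC service, the COMMIT will fail and the whole transaction rolls back on both sides.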
For more information, check this link: http://msdn.microsoft.com/en-us/library/ms189910#MSDTC
How to create a clustered MS DTC?
Since Windows 2008, we are allowed to have more than one instance of MS DTC on a server/cluster. So, for clustered SQL Server installations, it is a best practice to have a Role exclusively for the DTC and a dedicated DTC for each SQL Server Role.
As documented by Microsoft, SQL Server follows this order when choosing the MS DTC instance to use:
- Use MS DTC installed to the local group, else
- Use the mapped instance of MS DTC, else
- Use the cluster’s default instance of MS DTC, else
- Use the local machine’s installed instance of MS DTC
To configure the DTC in a cluster, we will need a disk and a hostname.
To configure a Role exclusively for the DTC, follow these steps:
- Right-click on Roles and pick the “Configure Role” option.
- A new window will open. Click “next”.
- Choose the option “Distributed Transaction Coordinator (DTC)” from the list. Click “Next”.
- Fill in the hostname in the “Name” field and the IP in the “Network” section. Click “Next”.
- Pick the disk to be used. Click “Next”.
- Review the configuration and click “Next”.
- The installation will run, and in the last step you will see a report. Click “Finish”.
- Now you will be able to see a new Role created in the cluster, with all the indicated resources.
To add a DTC resource to the SQL Server Role, follow these steps:
- Right-click the Role, go to “Add Resource”->”More Resources” -> “Distributed Transaction Coordinator”.
- The resource will be created in the selected Role; now we need to configure it. Right-click the “New Distributed Transaction Coordinator” and click “Properties”.
- As mentioned earlier in this article, the DTC needs a hostname and a disk to work. On the “Dependencies” tab, select those items as shown, and click “OK”.
- Now, let’s bring it online.
How to configure the network for distributed transactions?
Note: In clustered environments, you only need to perform the following steps once.
- On “Server Manager” go to “Tools”->”Component Services” or run the command “dcomcnfg”.
- Expand the tree, right-click the desired DTC and choose “Properties”.
- Go to the “Security” tab and check “Network DTC Access” as well as “Allow Inbound” and “Allow Outbound”, as shown below. Click “OK”.
- Let’s briefly describe some of the options in this window:
- “Network DTC Access”: Enables/disables network access for the DTC.
- “Allow Inbound”: Permits a distributed transaction originating on another computer to run on the current computer.
- “Allow Outbound”: Permits a distributed transaction initiated on the current computer to run on a remote computer.
- “Enable XA Transactions” and “Enable SNA LU 6.2 Transactions”: Enable/disable support for those particular distributed transaction specifications.
There’s a tool called DTC Ping which can help us verify that the DTC is working correctly on all the computers that should be involved in a transaction.
You can download this tool here: http://www.microsoft.com/en-us/download/details.aspx?id=2868
I recommend reading this article to learn how to use the tool, as well as how to troubleshoot possible errors: Troubleshooting MSDTC issues with the DTCPing tool.
Another great tool is the DTC Tester, which lets you simulate a distributed transaction on SQL Server.
To download and get more info about this tool, check this link: http://support.microsoft.com/kb/293799 .
And so we finish this series about how to install a clustered instance. There are still many details to cover, and I will try to create separate articles with best practices, configuration alternatives, etc.
I hope you enjoyed this series and as always, if you have any doubts, contact me!
Thank you for reading!
If you want to check the other parts of this series, here are the links:
First things first: Happy New Year y’all! Health, happiness, and all that jazz to each and everyone of youses!
So the Holidays are over — or almost over, if you are one of the few lucky souls. Which means that we have to get back into the saddle. Considering that, ahem, some of us have just fallen off the grid and into the eggnog for the last two weeks, that’s easier said than done. In order to jog those hacking muscles back into shape, I decided to revisit an idea that I outlined in the Holidays of 2012: dancecard.
For the 2014 edition of this dancecard, I decided to take a slightly different approach. Instead of keeping a repository of the libraries locally, I wanted to be even lazier and simply have a main registry that would point to the download location or the git repository of the libraries.
For the moment I won’t go into the details of the code, mostly because I pretty much banged the keyboard like a semi-hungover monkey for the last 2 hours, and it shows in the quality of the codebase. Let it just be said that the git repo is at the usual place, and that I used MooseX::App to build the app.
So, what did those few hours of hacking give me? Well, with the config file ~/.dancecard/config.yaml, which looks like this:
---
browser_command: firefox %s
libs:
    backbone:
        location: http://backbonejs.org/backbone.js
        website: http://backbonejs.org
        documentation: http://backbonejs.org
        dependencies:
            - underscore
    underscore:
        location: http://underscorejs.org/underscore.js
        website: http://underscorejs.org
        documentation: http://underscorejs.org
I can now get a list of all libraries I have at my disposal:
$ dancecard list

INSTALLED LIBS
no lib installed

AVAILABLE LIBS
underscore
bootstrap
backbone
slippy
font-awesome
Of course, I can also install any of those. And… look Ma: dependencies are handled automatically!
$ dancecard install backbone
backbone installed
underscore installed
Something I also often do is peruse the online documentation of those libraries. Noticed those “documentation” URLs in the yaml registry? They’re there so that I can do,
$ dancecard doc backbone
and have the documentation page opened automatically in Firefox. Or, for maximum laziness, I can also do,
$ dancecard doc --installed
which will open the documentation URL of every library found in the current project.

Next Steps
If anyone shows interest in the thing, I’ll bundle the app for CPAN consumption. As for features, things that I’ll probably tackle for my own use are:
- Tags (no software is true software without tags nowadays);
- Virtual libraries (just as a means to install a group of libraries in a single go);
- Fish completion (I wuv the fish shell);
- Cached local documentation (’cause hitting the ‘Net each time I look for a piece of documentation fills me with guilt);
- An online main registry of libraries;
- And whatever else I may think of.
In the previous post I ended with a discussion about how to gather an operating system's CPU consumption by function. More specific for our purposes is gathering Oracle process (session) CPU consumption at the Oracle kernel function level.
I Want Something Like This
At this point, we are ready to take the Linux perf tool output and merge that with Oracle wait event data. The output could look something like this:
Time Component secs %
------------------------------------------------------- ---------- -------
cpu : [.] kcbgtcr 29.714 66.87
cpu : [.] kdstf000010100001km 3.716 8.36
cpu : [.] lnxsum 3.541 7.97
cpu : [?] sum of funcs consuming less than 2% of CPU ti 2.393 5.38
cpu : [.] kaf4reasrp0km 2.180 4.91
wait: latch: cache buffers chains 2.100 4.73
I suspect you're not surprised that the above output was, in fact, taken from an existing tool. The tool is called "fulltime.sh" and is the result of a collaborative effort between myself and Frits Hoogland. You can download the free tool here. If the link doesn't work, just do an OraPub.com search for "fulltime" and you'll see it.
How The Tool Works
Obviously the tool collects both process CPU consumption and Oracle wait event data. But we wanted some usage flexibility, so it's a little more interesting than that. Also, the real trick is outputting the perf data into a comma-delimited file using "perf report -t,". It's the "-t" option that is key here. Oracle can easily read the comma-delimited file as an external table and then combine that with the wait event data collected from the v$sess_time_model view.
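As an illustration of the external-table trick, here is a minimal sketch; the directory, file name, and column layout are my assumptions, not the actual definitions used by fulltime.sh:

create or replace directory perf_dir as '/tmp';

-- external table over the comma-delimited "perf report -t," output
create table perf_report_ext (
  overhead_pct   number,
  command        varchar2(64),
  shared_object  varchar2(128),
  symbol         varchar2(256)
)
organization external (
  type oracle_loader
  default directory perf_dir
  access parameters (
    records delimited by newline
    fields terminated by ','
    missing field values are null
  )
  location ('perf_report.csv')
)
reject limit unlimited;

Once the table exists, the perf samples can be queried and joined like any other table.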
Here is the basic idea:
Help user identify the PID to profile
get oracle wait times (snap 0)
get oracle CPU time (snap 0)
start oracle kernel cpu details collection
get oracle wait times (snap 1)
get oracle CPU time (snap 1)
stop oracle kernel cpu collection
do some cool math and combine the results (see the wait-delta sketch just after this list)
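For the wait side of that math, here is a hedged sketch of the snap-delta idea. The wait_snap0/wait_snap1 tables stand in for two point-in-time copies of v$session_event for the profiled session; fulltime.sh's actual implementation may differ.

-- hypothetical snapshots taken at snap 0 and snap 1, e.g.:
--   create table wait_snap0 as
--     select event, total_waits, time_waited_micro
--     from v$session_event where sid = :profiled_sid;
select s1.event,
       s1.total_waits - s0.total_waits                          waits,
       (s1.time_waited_micro - s0.time_waited_micro) / 1000000  secs
from   wait_snap0 s0,
       wait_snap1 s1
where  s1.event = s0.event
and    s1.time_waited_micro > s0.time_waited_micro
order  by secs desc;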
The looping capability easily allows for a single collection or multiple collections. The multiple-collection option displays much like the "top" tool, with a default cycle duration of three seconds.
If you're running on a virtual machine, make sure to change the shell script variable PERF_SAMPLE_METHOD to '-e cpu-clock' instead of the default '-e cycles'. If you look near the top of the fulltime.sh script, you will see a comment about this and also where to make the change.
Before you run the script, take a quick look at the script, especially the top third. There is some good information about the script, default settings, and usage details.
How To Run The Script
There are two input usages. Let's call them basic and advanced.
The BASIC USAGE helps you find the Oracle process ID, sets some defaults for you, and works on a three-second cycle until you press control-C. For example, let's say you don't know the OS process ID, but you do know the Oracle SID, or perhaps you know the machine name or user name. Then you're in luck, because the fulltime.sh script will display this information and then prompt you for the OS PID! Here is an example of what you can expect if you simply enter "./fulltime.sh" and select process 60505.
For each cycle, you would see something like this:
Notice that both Oracle and OS process details are shown in conjunction with the date and time. Plus the SQL statement the process is running at the end of the sample period! Then the total time is shown along with the two high level components, that is, CPU consumption and Oracle wait time. Finally, the details are shown. Notice the details clearly identify the time component as either CPU or wait time.
If the time is tagged CPU, then the Oracle kernel function is displayed (thank you perf) along with its inferred time (based on both perf and v$sess_time_model). If the time is tagged Oracle wait time, then the wait event name is displayed along with the wait time (based on v$session_event).
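To make the inference concrete, here is a hedged sketch of the apportioning math, reusing the hypothetical perf_report_ext external table from above; the real fulltime.sh arithmetic may differ in the details:

select p.symbol,
       p.overhead_pct,
       -- session's total DB CPU (microseconds) scaled by the perf percentage
       stm.value / 1000000 * p.overhead_pct / 100  inferred_cpu_secs
from   perf_report_ext p,
       v$sess_time_model stm
where  stm.sid       = :profiled_sid
and    stm.stat_name = 'DB CPU'
order  by inferred_cpu_secs desc;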
It's important to understand that if we ran a standard wait event report, the output would be similar but with NO CPU information. The wait event details would still be displayed. This tool simply adds value by incorporating the CPU consumption.
The ADVANCED USAGE gives us full control. The advanced usage requires the OS process ID, the cycle duration, and the number of cycles. The advanced option makes it simple to get a single long cycle or multiple shorter cycles. For example, if I wanted to watch process 1234 in 5 second intervals, I could do this:
$ ./fulltime.sh 1234 5 9999

Or suppose I wanted a single 60 second sample for process 5432. I would enter this:

$ ./fulltime.sh 5432 60 1

The output will look exactly like the output from basic usage. You are just not prompted or given information to help you pick the OS PID.
Here is a short video so you can watch this in action!
Fulltime.sh - Report both Oracle Wait and CPU time details from OraPub on Vimeo.
Why This Is So Important
Look at the example screen shot below and ask yourself, "Is there an IO bottleneck?"
If the CPU information was not combined with the wait time details, I suspect 90% of Oracle DBAs would say there is an IO problem. They would see the top wait event db file sequential read then conclude there is an IO issue. However, as the fulltime.sh script clearly shows, CPU consumption is much larger than Oracle wait time, providing us with a greater opportunity to reduce the total time and devise additional spot-on solutions. Also looking at the report's "total time" breakdown, it clearly shows CPU consumption is 97% of the total time. I wouldn't be surprised if there was a raging CPU bottleneck!
So Where Are We In This Quest?
Now that you have access to the fulltime.sh script and know how to use it, you're going to start asking, "Just what is kcbgtcr anyways?!" In my next posting I'll address this.
But until then, get the fulltime.sh script, log into your non-production yet active Linux box and give it a go! If it's not production, profile the Oracle log writer or database writer background process or perhaps an active server process!
Feeling dangerous yet? (Always!)
Here are the key resource links:
1. Presentation: Detailing Oracle CPU Consumption. Download HERE.
2. Tool: Oracle wait time and kernel function CPU monitor. Download the fulltime.sh script HERE.
Thanks for reading,
If you enjoy my blog, I suspect you'll get a lot out of my courses: Oracle Performance Firefighting, Advanced Oracle Performance Analysis, and my new seminar, Go Faster: Make Oracle Work For You! I teach these classes around the world multiple times each year. For the latest schedule, go to www.orapub.com. I also offer on-site training and consulting services.
P.S. If you want me to respond to a comment or you have a question, please feel free to email me directly at craig@orapub.com. Another option is to send an email to OraPub's general email address, which is currently info@orapub.com.
The new fresh year is already here amidst fireworks, hard partying, resolutions, and new hopes. Log Buffer also enters into this brand new year focusing on the emerging trends while appreciating the proven technologies. Enjoy the first log buffer of this year.
Keeping up with the latest Java Runtime Environment (JRE) updates is both a priority and a challenge for most organizations.
Oracle’s product portfolio offers a multitude of opportunities to combine cross-stack products and deliver incremental value to customers and partners.
Do you Deal with Supplier Profile Management flows?
In case you took a break over New Year’s and aren’t quite ready for “real” work, but still feel obligated to visit sites such as this one, here’s a reward for your efforts.
Using SSD for a temp tablespace on Exadata
This paper discusses whether Hadoop is the right approach for an analytics infrastructure.
Microsoft is releasing SQL Server 2014 with a new set of In-Memory OLTP features to significantly improve the OLTP performance and reduce the processing time for servers with a large amount of memory and multi-core processors.
SQL Server Optimizer — Under the Hood series of short articles.
How to use Profiler to generate TSQL scripts that can be modified and automated to run as a server-side trace by Grant Fritchey.
Big data is now a standard part of information technology architecture for most large organizations.
SSL with MySQL does not have to be complicated
This is a HOW-TO about installing MariaDB Galera Cluster on Debian/Ubuntu.
In many parts of the MySQL world, whether you are a user, developer, or administrator of MySQL, the season holidays are for family, relaxing, prayers, travel, or simply a moment to enjoy what you like or care most about.
How to recover table structure from .frm files with MySQL Utilities
Chinese and Japanese and Korean (CJK) text usually has no spaces between words. Conventional full-text search does its tokenizing by looking for spaces. Therefore conventional full-text search will fail for CJK.
It looks like we hit bug 13930580 on one of our systems, and I wanted to put up some example output showing log file sync times going above 10 milliseconds (10,000 microseconds) after the log writer switches to polling.
Here is the output in the lgwr trace file showing the switch to polling:
*** 2013-12-27 01:50:47.374
kcrfw_update_adaptive_sync_mode: post->poll long#=1 sync#=1 sync=55336 poll=19134 rw=5506 rw+=5506 ack=0 min_sleep=19134

*** 2013-12-27 01:50:47.374
Log file sync switching to polling
Current scheduling delay is 1 usec
Current approximate redo synch write rate is 0 per sec
Here is a report showing the number of log file sync waits and their average time in microseconds:
END_INTERVAL_TIME          number of waits ave microseconds
-------------------------- --------------- ----------------
26-DEC-13 03.00.19.333 PM           976539       2279.06861
26-DEC-13 04.00.43.475 PM           681212       2029.32406
26-DEC-13 05.00.07.975 PM           343932       1575.26284
26-DEC-13 06.00.34.772 PM           163911       1850.74354
26-DEC-13 07.01.00.911 PM            73151       3815.28597
26-DEC-13 08.00.24.463 PM            39304       5038.05427
26-DEC-13 09.00.48.970 PM            32122       5677.00557
26-DEC-13 10.00.13.491 PM           472349       2353.95857
26-DEC-13 11.00.40.521 PM            18679       18655.5294
27-DEC-13 12.00.06.276 AM            19618       17046.2287
27-DEC-13 01.00.30.299 AM            18983       5721.99178
27-DEC-13 02.00.54.261 AM            17724       17106.3415
27-DEC-13 03.00.18.282 AM             9088       25342.7271
27-DEC-13 04.00.42.218 AM            14365          12128.7
27-DEC-13 05.00.06.391 AM            16323       12879.8831
27-DEC-13 06.00.31.379 AM            43758       15326.7298
27-DEC-13 07.00.56.027 AM            83819       14796.2851
27-DEC-13 08.00.20.637 AM           168718       13506.4363
27-DEC-13 09.00.47.262 AM           302827       19116.9491
27-DEC-13 10.00.14.014 AM           480347       19358.6655
27-DEC-13 11.00.41.178 AM           512777       15952.2358
27-DEC-13 12.00.08.220 PM           511091       13799.5512
27-DEC-13 01.00.38.131 PM           576341       10183.4347
27-DEC-13 02.00.06.308 PM           524568       10251.1259
Notice how the average wait time goes above 10,000 microseconds consistently once the log writer switches to polling between 1 and 2 am on 12/27/2013. I didn’t show all the output but this long log file sync wait time has continued since the switch.
Also, these long log file sync times don’t correspond to long log file parallel write times. Here are the number and averages of log file parallel write waits for the same time frame:
END_INTERVAL_TIME          number of waits ave microseconds
-------------------------- --------------- ----------------
26-DEC-13 03.00.19.333 PM           902849       1426.66601
26-DEC-13 04.00.43.475 PM           659701       1394.87763
26-DEC-13 05.00.07.975 PM           344245       1294.92401
26-DEC-13 06.00.34.772 PM           166643       1586.64944
26-DEC-13 07.01.00.911 PM            80457       4019.29429
26-DEC-13 08.00.24.463 PM            46409       5580.67827
26-DEC-13 09.00.48.970 PM            69218       5115.20904
26-DEC-13 10.00.13.491 PM           475297       2219.80541
26-DEC-13 11.00.40.521 PM            40943        19405.052
27-DEC-13 12.00.06.276 AM            38835       18160.8073
27-DEC-13 01.00.30.299 AM            24734       6321.38425
27-DEC-13 02.00.54.261 AM            33617       11723.6698
27-DEC-13 03.00.18.282 AM            36469       17485.2614
27-DEC-13 04.00.42.218 AM            19344       6955.27042
27-DEC-13 05.00.06.391 AM            17857       4399.75718
27-DEC-13 06.00.31.379 AM            45098       4923.02763
27-DEC-13 07.00.56.027 AM            83700       3610.39713
27-DEC-13 08.00.20.637 AM           160919       2841.31507
27-DEC-13 09.00.47.262 AM           266405       3523.86855
27-DEC-13 10.00.14.014 AM           384795        3367.5075
27-DEC-13 11.00.41.178 AM           437806       2729.84248
27-DEC-13 12.00.08.220 PM           448261       2442.81012
27-DEC-13 01.00.38.131 PM           511648       1880.74418
27-DEC-13 02.00.06.308 PM           481106       1919.21158
The average I/O time (log file parallel write) is pretty low when the system is active (more than 100,000 waits per hour), usually less than 4000 microseconds, and yet log file sync is always more than 10,000 after the switch to polling. Also, CPU on the system is consistently less than 30% used, so it isn't a system load issue.
Here are some Oracle support documents related to this issue:
Adaptive Switching Between Log Write Methods can Cause 'log file sync' Waits (Doc ID 1462942.1)
Waits for "log file sync" with Adaptive Polling vs Post/Wait Choice Enabled (Doc ID 1541136.1)
Bug 13930580: LGWR IS BLOCKING SESSIONS
Here is the script I used to get the wait output:
set linesize 32000
set pagesize 1000
set long 2000000000
set longchunksize 1000
set head off;
set verify off;
set termout off;

column u new_value us noprint;
column n new_value ns noprint;

select name n from v$database;
select user u from dual;

set sqlprompt &ns:&us>

set head on
set echo on
set termout on
set trimspool on

UNDEFINE WAITNAME
UNDEFINE MINIMUMWAITS

spool "&ns.&&WAITNAME..log"

column END_INTERVAL_TIME format a26

select sn.END_INTERVAL_TIME,
       (after.total_waits-before.total_waits) "number of waits",
       (after.time_waited_micro-before.time_waited_micro)/
       (after.total_waits-before.total_waits) "ave microseconds",
       before.event_name "wait name"
from DBA_HIST_SYSTEM_EVENT before,
     DBA_HIST_SYSTEM_EVENT after,
     DBA_HIST_SNAPSHOT sn
where before.event_name='&&WAITNAME'
and after.event_name=before.event_name
and after.snap_id=before.snap_id+1
and after.instance_number=1
and before.instance_number=after.instance_number
and after.snap_id=sn.snap_id
and after.instance_number=sn.instance_number
and (after.total_waits-before.total_waits) > &&MINIMUMWAITS
order by after.snap_id;

spool off
I gave log file sync as the WAITNAME and 1 as MINIMUMWAITS for the first output. I changed WAITNAME to log file parallel write for the second one with 1 still for MINIMUMWAITS.
It looks like there is a new feature in 11.2 that was finally turned on by default in 11.2.0.3. The workaround is to set a hidden parameter to false to turn off the new feature. Check out the Oracle support docs I listed for details.
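For reference, the workaround from those support notes looks like the following. Since _use_adaptive_log_file_sync is a hidden parameter, treat this as a sketch and involve Oracle Support before changing it on a real system:

alter system set "_use_adaptive_log_file_sync" = false;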
p.s. I forgot to mention that when I tested on a test database with and without this new feature the log file sync times were lower with the new feature, as they should be.
With _use_adaptive_log_file_sync=TRUE (feature enabled):
Top 5 Timed Foreground Events

Event                     Waits     Time(s) Avg wait (ms) % DB time Wait Class
------------------------- --------- ------- ------------- --------- ----------
log file sync             639,598   3,466   5             86.74     Commit
DB CPU                              397                   9.93
buffer exterminate        683       14      21            0.36      Other
virtual circuit wait      912       12      13            0.31      Network
SQL*Net message to client 1,293,196 7       0             0.17      Network
With _use_adaptive_log_file_sync=FALSE (disabled as in earlier versions of 11.2):
Top 5 Timed Foreground Events

Event                     Waits     Time(s) Avg wait (ms) % DB time Wait Class
------------------------- --------- ------- ------------- --------- -----------
log file sync             639,644   3,553   6             87.31     Commit
DB CPU                              367                   9.02
buffer exterminate        1,364     28      21            0.69      Other
buffer busy waits         2,679     15      6             0.37      Concurrency
virtual circuit wait      903       13      15            0.32      Network
With the new feature enabled log file sync was 5 milliseconds instead of 6 without it. So, the new feature does speed up log file sync waits when it is working normally. But, there must be some bug condition where it degrades to greater than 10 millisecond log file syncs.
We have an incredible set of people doing workshops for our attendees. We are working on finalizing a fourth workshop in the Essbase area, but we are not quite ready to announce that one.
We are going to go live this weekend with the ability to register for our latest addition to the workshops.
From Relational to Hadoop - Migrating your data pipeline

I am very excited to announce this addition!
I will update this post with links when this workshop is online for registration.
Another Oozie tip blog post.
If you use the Sqoop action in Oozie, you know you can use the “command” format, with the entire Sqoop configuration in a single line:
<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <action name="myfirsthivejob">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <command>import --connect jdbc:hsqldb:file:db.hsqldb --table TT --target-dir hdfs://localhost:8020/user/tucu/foo -m 1</command>
        </sqoop>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...
</workflow-app>
This is convenient, but it can be difficult to read and maintain. I prefer using the “arg” syntax, with each argument on its own line:
<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <action name="myfirsthivejob">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>jdbc:hsqldb:file:db.hsqldb</arg>
            <arg>--table</arg>
            <arg>TT</arg>
            <arg>--target-dir</arg>
            <arg>hdfs://localhost:8020/user/tucu/foo</arg>
            <arg>-m</arg>
            <arg>1</arg>
        </sqoop>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...
</workflow-app>
As you can see, each argument here is in its own “arg” tag. Even two arguments that belong together, like “--table” and “TT”, go in two separate tags.
If you try to put them together for readability, as I did, Sqoop will throw its entire user manual at you. It took me a while to figure out why this is an issue.
When you call Sqoop from the command line, all the arguments you pass are sent as a String array, and the spaces separate the arguments into array elements. So if you call Sqoop with “--table TT”, it will be two elements: “--table” and “TT”.
When using “arg” tags in Oozie, you are basically generating the same array in XML. Oozie will turn the XML argument list into an array and pass it to Sqoop just the way it would in the command line. Then Sqoop parses it in exactly the same way.
So every item separated by a space on the command line must be in a separate tag in Oozie.
It’s simple and logical once you figure out why!
If you want to dig a bit more into how Sqoop parses its arguments, it uses Apache Commons CLI with GnuParser. You can read all about it.
Just got off of a Webex with Delphix support. They upgraded our Delphix server from version 184.108.40.206 to 220.127.116.11. It took about twenty minutes. Pretty nice compared to an Oracle database upgrade I think!
The only thing that took some time was making sure I had enough space. It looks like Delphix needs your disk space to be no more than 85% utilized to function fully. We ended up extending our four 1.5 terabyte LUNs to 1.9 TB each to give us space. Then I cleaned up some archive log space by running a snapsync on each source database to completion. Our system just needed a little TLC to get some space free again.
But, the upgrade itself, running the scripts and rebooting the server, took 20 minutes and was all done by Delphix support for no extra charge. Sweet.
In March 2012, I put together a blog post entitled Disabling Oracle triggers on a per-session basis, outlining a way to suspend trigger execution for the current session through a PL/SQL call. Commenter Bryan posted a comment saying he couldn’t get it working in 11.2.0.4:
Unfortunately Oracle seems to have disabled this use in 11.2.0.4, and most likely 12.1 as well. Boo-Hiss! This is needed functionality for DBAs!

A new parameter: enable_goldengate_replication
I tried this on an Oracle 11.2.0.4 system, and I indeed got an error:
SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;

*
ERROR at line 1:
ORA-26947: Oracle GoldenGate replication is not enabled.
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1
A quick look at oerr gives a path forward, assuming you do indeed have a GoldenGate license:
[oracle@ora11gr2b ~]$ oerr ora 26947
26947, 00000, "Oracle GoldenGate replication is not enabled."
// *Cause: The 'enable_goldengate_replication' parameter was not set to 'true'.
// *Action: Set the 'enable_goldengate_replication' parameter to 'true'
//          and retry the operation.
//          Oracle GoldenGate license is needed to use this parameter.
The Oracle reference gives a bit more info:
ENABLE_GOLDENGATE_REPLICATION controls services provided by the RDBMS for Oracle GoldenGate (both capture and apply services). Set this to true to enable RDBMS services used by Oracle GoldenGate.
The RDBMS services controlled by this parameter also include (but are not limited to):
Service to suppress triggers used by GoldenGate Replicat
As do the GoldenGate 12.1.2 docs:
The database services required to support Oracle GoldenGate capture and apply must be enabled explicitly for an Oracle 11.2.0.4 database. This is required for all modes of Extract and Replicat.
To enable Oracle GoldenGate, set the following database initialization parameter. All instances in Oracle RAC must have the same setting.
So here goes nothing:
SQL> alter system set enable_goldengate_replication=true;

System altered.

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;

*
ERROR at line 1:
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1
Another error: missing privileges. I checked and double-checked that the required GoldenGate privileges were indeed assigned.

Tracing and permission checks
It’s time to run a 10046 trace (SQL trace) to see what’s really going on.
SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;

*
ERROR at line 1:
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1
And the tracefile does show some interesting information. A few of the more interesting snippets:
PARSING IN CURSOR #140324121137184 len=76 dep=0 uid=91 oct=47 lid=91 tim=1388531465245781 hv=1323338123 ad='6c1f63a0' sqlid='gvq73797f12cb'
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;
END OF STMT
...
PARSING IN CURSOR #140324121064984 len=187 dep=1 uid=0 oct=3 lid=0 tim=1388531465246387 hv=2028900049 ad='6c128db8' sqlid='aa9h2ajwfx3qj'
SELECT COUNT(*) FROM ( SELECT GP.USERNAME FROM DBA_GOLDENGATE_PRIVILEGES GP WHERE GP.USERNAME = :B1 UNION ALL SELECT GRANTEE FROM DBA_ROLE_PRIVS WHERE GRANTEE=:B1 AND GRANTED_ROLE='DBA' )
END OF STMT
...
Bind#0
  ...
  value="GGS"
...
Bind#1
  ...
  value="GGS"
...
The SQL statement is actually checking two things. The first is looking for the current username in the dba_goldengate_privileges view. This view isn’t listed in the Oracle 11.2 documentation, but it does appear in the 12c docs:
ALL_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges for the user.
Oracle GoldenGate privileges are granted using the DBMS_GOLDENGATE_AUTH package.
DBA_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges for all users who have been granted Oracle GoldenGate privileges.
USER_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges. This view does not display the USERNAME column.
I had previously run dbms_goldengate_auth to grant privs here, so I should be OK.
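For reference, that earlier grant looked something like this minimal sketch (the grantee GGS matches the bind value seen in the trace; other parameters were left at their defaults):

begin
  -- grants the GoldenGate admin privileges and populates dba_goldengate_privileges
  dbms_goldengate_auth.grant_admin_privilege(grantee => 'GGS');
end;
/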
The second check simply verifies that the DBA role has been granted to the current user, again as recommended by the documentation. (A side note: in previous versions, I had avoided granting the overly broad DBA role to the GoldenGate user in favor of specific grants for the objects it uses. There’s no reason for the GoldenGate user to need to read and modify data objects that aren’t related to its own replication activities, for example. And I would argue that it helps avoid errors, such as putting the wrong schema in a map statement, when permissions are restricted. But sadly that’s no longer possible in the world of 11.2.0.4.)
Running the query manually to verify that the grants are indeed in place:
SQL> SELECT COUNT(*) FROM (
       SELECT GP.USERNAME FROM DBA_GOLDENGATE_PRIVILEGES GP WHERE GP.USERNAME = 'GGS'
       UNION ALL
       SELECT GRANTEE FROM DBA_ROLE_PRIVS WHERE GRANTEE='GGS' AND GRANTED_ROLE='DBA'
     );

  COUNT(*)
----------
         2
Looks good, so that doesn’t seem to be the problem.

Tracing #2: system properties
Back to the 10046 tracefile:
PARSING IN CURSOR #140324119717656 len=45 dep=1 uid=0 oct=3 lid=0 tim=1388531465253124 hv=3393782897 ad='78ae2b40' sqlid='9p6bq1v54k13j'
select value$ from sys.props$ where name = :1
END OF STMT
...
Bind#0
  ...
  value="GG_XSTREAM_FOR_STREAMS"
...
FETCH #140324119717656:c=0,e=44,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=415205717,tim=1388531465254441
Because this SQL statement involves an ordinary select without an aggregate function, I can look at the FETCH line in the tracefile to get the number of rows returned. In this case it’s r=0, meaning no rows returned.
The query itself is looking for a system property I haven’t seen before: GG_XSTREAM_FOR_STREAMS. A Google search returns only a single result: the PDF version of the Oracle 11.2 XStream guide. Quoting:
This procedure enables XStream capabilities and performance optimizations for Oracle Streams components.

This procedure is intended for users of Oracle Streams who want to enable XStream capabilities and optimizations. For example, you can enable the optimizations for an Oracle Streams replication configuration that uses capture processes and apply processes to replicate changes between Oracle databases.

These capabilities and optimizations are enabled automatically for XStream components, such as outbound servers, inbound servers, and capture processes that send changes to outbound servers. It is not necessary to run this procedure for these components.

When XStream capabilities are enabled, Oracle Streams components can stream ID key LCRs and sequence LCRs. The XStream performance optimizations improve efficiency in various areas, including:

- LCR processing
- Handling large transactions
- DML execution during apply
- Dependency computation and scheduling
- Capture process parallelism
On the surface, I don’t see what this would have to do with trigger execution, but I’m going to try enabling it as per the newly read document anyway:
SQL> exec dbms_xstream_adm.ENABLE_GG_XSTREAM_FOR_STREAMS(enable=>true);

PL/SQL procedure successfully completed.

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;

*
ERROR at line 1:
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1
No dice.

Tracing #3: process names
Onto the next SQL in the tracefile:
PARSING IN CURSOR #140324120912848 len=114 dep=1 uid=0 oct=3 lid=0 tim=1388531465255628 hv=1670585998 ad='6c2d6098' sqlid='a9mwtndjt67nf'
SELECT COUNT(*) FROM V$SESSION S, V$PROCESS P WHERE P.ADDR = S.PADDR AND S.PROGRAM LIKE 'extract%' AND p.spid = :1
END OF STMT
...
Bind#0
  ...
  value="2293"
Now we look in v$session to see if the session associated with the process with OS PID 2293 (which happens to be the SPID of our current shadow process) has a PROGRAM column starting with the word extract. extract is, naturally, the name of the GoldenGate executable that captures data from the source system. In a GoldenGate system, however, trigger suppression does not happen in the extract process at all, but rather in the replicat process that applies changes on the target system. So I’m going to skip this check and move on to the next one in the tracefile:
PARSING IN CURSOR #140324120905624 len=169 dep=1 uid=0 oct=3 lid=0 tim=1388531465257346 hv=3013382849 ad='6c122b38' sqlid='38pkvxattt4q1'
SELECT COUNT(*) FROM V$SESSION S, V$PROCESS P WHERE P.ADDR = S.PADDR AND (S.MODULE LIKE 'OGG%' OR S.MODULE = 'GoldenGate') AND S.PROGRAM LIKE 'replicat%' AND p.spid = :1
END OF STMT
...
Bind#0
  ...
  value="2293"
This SQL is similar to the previous one, but instead of looking for a program called extract, it looks for one called replicat, and adds an extra check to see if the module column either starts with OGG or is called GoldenGate. And since it’s the replicat process that does trigger disabling in GoldenGate, this check is likely to be the relevant one.
To make this check succeed, I’m going to have to change both the program and module columns in v$session for the current session. Of the two, module is much easier to modify: a single call to dbms_application_info.set_module. But modifying program is less straightforward. One approach is to use Java code with Oracle’s JDBC Thin driver, setting the aptly-named v$session.program property, as explained in De Roeptoeter. But I’m hoping to stay with something I can do in SQL*Plus. If you’ve looked through a packet trace of a SQL*Net connection being established, you will know that the program name is passed by the client at the time of connection establishment, so it could be modified by altering the network packet in transit. But this is also complex to get working, as it involves fixing checksums and the like. There’s a post on Slavik’s blog with a sample OCI C program that modifies its program information. Again, more complexity than I’d like, but it gave me an idea: if program is populated by the name of the client-side executable, why don’t we simply copy sqlplus to a name that dbms_xstream_gg likes better?
[oracle@ora11gr2b ~]$ cp $ORACLE_HOME/bin/sqlplus ./replicat
[oracle@ora11gr2b ~]$ ./replicat ggs

SQL*Plus: Release 11.2.0.4.0 Production on Mon Dec 30 14:09:05 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Enter password:

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> exec dbms_application_info.set_module('OGG','');

PL/SQL procedure successfully completed.

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);

PL/SQL procedure successfully completed.
So it looks like you can disable triggers per-session in 11.2.0.4 just like in previous versions, but you need to jump through a few more hoops to do it. A few conclusions to draw:
- Oracle patchsets, while normally intended to include bugfixes, can have major changes to underlying functionality too. See Jeremy Schneider’s post on adaptive log file sync for an even more egregious example. So before applying a patchset, test thoroughly!
- The enforcement of full DBA privileges for the GoldenGate user in Oracle 11.2.0.4 means very broad permissions are required to use GoldenGate, which can be a concern in security-conscious or consolidated environments.
TL;DR: Yes, you can still disable triggers per-session in Oracle 11.2.0.4, but you have to have a GoldenGate license, set the enable_goldengate_replication parameter, use a program name that starts with replicat, and set your module to OGG.
As an enhancement to the Oracle Database whoami script for versions before 12c, this version also shows the name of the container to which the session is connected:
[oracle@linuxbox ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on Wed Jan 8 12:34:04 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> @whoami
USER: SYS
SESSION ID: 253
CURRENT_SCHEMA: SYS
INSTANCE NAME: cdb1
CDB NAME: cdb1
CONTAINER NAME: CDB$ROOT
DATABASE ROLE: PRIMARY
OS USER: oracle
CLIENT IP ADDRESS:
SERVER HOSTNAME: linuxbox
CLIENT HOSTNAME: linuxbox

PL/SQL procedure successfully completed.

SQL> connect system/oracle_4U@pdb1
Connected.
SQL> @whoami
USER: SYSTEM
SESSION ID: 253
CURRENT_SCHEMA: SYSTEM
INSTANCE NAME: cdb1
CDB NAME: cdb1
CONTAINER NAME: PDB1
DATABASE ROLE: PRIMARY
OS USER: oracle
CLIENT IP ADDRESS: 555.555.5.555
SERVER HOSTNAME: linuxbox
CLIENT HOSTNAME: linuxbox

PL/SQL procedure successfully completed.
The content of whoami.sql:
set serveroutput on
begin
  dbms_output.put_line('USER: '||sys_context('userenv','session_user'));
  dbms_output.put_line('SESSION ID: '||sys_context('userenv','sid'));
  dbms_output.put_line('CURRENT_SCHEMA: '||sys_context('userenv','current_schema'));
  dbms_output.put_line('INSTANCE NAME: '||sys_context('userenv','instance_name'));
  dbms_output.put_line('CDB NAME: '||sys_context('userenv','cdb_name'));
  dbms_output.put_line('CONTAINER NAME: '||sys_context('userenv','con_name'));
  dbms_output.put_line('DATABASE ROLE: '||sys_context('userenv','database_role'));
  dbms_output.put_line('OS USER: '||sys_context('userenv','os_user'));
  dbms_output.put_line('CLIENT IP ADDRESS: '||sys_context('userenv','ip_address'));
  dbms_output.put_line('SERVER HOSTNAME: '||sys_context('userenv','server_host'));
  dbms_output.put_line('CLIENT HOSTNAME: '||sys_context('userenv','host'));
end;
/
A shortcut to get the name of the current container is:
SQL> show con_name

CON_NAME
------------------------------
PDB1
You may find that useful in a multitenant environment with many Pluggable Databases within one Container Database.
I just think this is a cool picture of an Oracle 12c instance and database from Oracle’s 12c Concepts manual (not my own work):
This is from the concepts manual found here: url
Just finished reading the book by Jonathan Lewis titled “Oracle Core: Essential Internals for DBAs and Developers”. I think I picked it up at the Collaborate 13 conference in Denver last April, but I hadn’t had time (or taken the time) to read it until now.
Reading a book like Oracle Core can be a challenge because it is pretty dense with example scripts and outputs including dumps in hex. So, I decided to take the strategy of pushing myself to crash through the book without carefully following every example. I may only have absorbed about 10% of the material but if I didn’t jam through it I would have gotten 0%!
I picked up Oracle Core because I had read another book by the same author titled “Cost-Based Oracle Fundamentals” which has paid for itself 100 times over in terms of helping me tune queries. I highly recommend Cost-Based Oracle Fundamentals without reservation. But, like Oracle Core it can be a challenge to just sit down and read it and follow every SQL example and output. Probably it would be worth making a first pass focusing on just the English language text and skimming the examples, maybe delving into the examples of most interest.
In the case of Oracle Core I haven’t yet put it to practical use but I’m glad to have at least skimmed through it. Now I know what’s in it and can refer back to it when needed.
Next I hope to start reading up on Oracle 12c since I plan to get my certification this year. But, I wanted to finish Oracle Core before I moved on, even if I only read it at a high level.
It is amazing what Oracle Enterprise Manager 12c will report once it is configured for a product. One such product is Oracle GoldenGate. I have stepped into a project where I’m running Oracle GoldenGate between many different environments for production purposes, and just trying to get a handle on what is going on has been a task. In talking with the customer, they were starting to implement Oracle Enterprise Manager 12c. Once OEM was set up, we added the Oracle GoldenGate plug-in and started to monitor the replication environments.
While monitoring the Oracle GoldenGate environments, I noticed a warning in the Incident Manager: “Status for Streams process OGG$_CGGMONX9AC55691 is DISABLED”. I got to thinking: what is this message about? More to the point, how do I resolve this warning? (I like OEM to be quiet.) I started to look around MOS for answers; to my surprise, not much is written about this message.
Oracle GoldenGate in classic capture mode doesn’t report these types of messages within Oracle Enterprise Manager 12c; classic capture mostly reports the up and down status of Oracle GoldenGate processes. This message had to be coming from some integrated version of the extract (the first clue was the word “Streams”). Keeping in mind that Streams may be used in some way, the DBA_CAPTURE view should be able to shine a bit of light on this warning.
From a SQL*Plus prompt or a SQL IDE (I prefer SQL Developer), the DBA_CAPTURE view can be queried.
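Something like the following minimal query is enough; the column list is trimmed to the ones discussed below:

select capture_name,
       status,        -- shows the DISABLED state reported by OEM
       purpose,       -- shows what the capture process is used for
       queue_name     -- the queue created under the GoldenGate owner
from   dba_capture;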
Looking at the STATUS column, I verified that I had found the correct record. The PURPOSE column shows that this extract (capture) is being used for Streams. What!? Wait a minute, I’m using Oracle GoldenGate.
Yes, Oracle GoldenGate is being used. If there is information in the DBA_CAPTURE view, it is because the Extract has been registered (integrated) with the database somehow. The status is DISABLED, an indicator that this extract was registered for logretention:
GGSCI> stop extract cggmonx
GGSCI> dblogin userid ggate password ggate
GGSCI> register extract cggmonx logretention
GGSCI> start extract cggmonx
Now that it is understood that the extract has been registered for log retention, what does this actually mean?
According to the Oracle GoldenGate 11g Reference Guide, an extract can be registered in one of two modes.
1. Database – Enables integrated capture for the Extract group. In this mode, Extract integrates with the database logmining server to receive change data in the form of logical change records (LCRs). Extract does not read the redo logs. Extract performs capture processing, filtering, transformation, and other requirements.

2. Logretention – Enables an Extract group, in classic capture mode, to work with Oracle Recovery Manager (RMAN) to retain the logs that Extract needs for recovery.
As indicated a few lines up, this extract has been registered with logretention. This means that the extract creates an underlying Streams capture process and prevents RMAN from removing any archive logs that may be needed for replication of data. As part of creating the underlying Streams structure, Oracle creates a queue under the Oracle GoldenGate owner (the queue name can also be found in the DBA_CAPTURE view).
Now that the root cause of the DISABLED message in Oracle Enterprise Manager 12c has been identified, how can this message be resolved?
The simplest way is to unregister the extract from the database/logretention, knowing that the Oracle GoldenGate configuration is using classic capture. Keep in mind that when unregistering the extract, retention of archive logs will not be enforced when RMAN backs them up and possibly removes them. Make sure your RMAN retention policies are what you expect them to be.
To unregister an extract that is using logretention, use the steps below:
GGSCI> stop extract cggmonx
GGSCI> dblogin userid ggate password ggate
GGSCI> unregister extract cggmonx logretention
GGSCI> start extract cggmonx
PL/SQL has had this option for a long time:
alter session set current_schema = SOME_SCHEMA_NAME;
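A quick hedged illustration of the effect (APP_OWNER and the customers table are hypothetical): after the ALTER SESSION, unqualified names resolve against the new schema, while your privileges remain those of the logged-in user.

alter session set current_schema = APP_OWNER;

-- unqualified names now resolve against APP_OWNER ...
select count(*) from customers;   -- reads APP_OWNER.CUSTOMERS

-- ... but you are still authenticated as your own user
select user, sys_context('userenv','current_schema') from dual;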
This does not give you full schema owner capabilities (well, it depends on what capabilities your login session has), but it can be very useful. For instance, in some kind of script to validate that all the expected objects exist and are at the correct version, you could use it like this:
set echo off
set feedback on
set heading off
set linesize 168
set serveroutput on size unlimited
set term on
alter session set current_schema = FIRST_SCHEMA_BEING_CHECKED;
begin
  validate_objects.bv_show_valid_messages := TRUE;
  validate_objects.bv_stop_on_error := FALSE;

  -- a bunch of calls against a validation package ... check that tables exist / views exist /
  -- foreign keys exist / indexes exist / packages, procedures, and functions exist / data exists

  -- at the end, check that all objects in the schema are valid ...
end;
/
Anyway, if you are looking for a book about programming Elastic MapReduce, I’d mention one titled Programming Elastic MapReduce: Using AWS Services to Build an End-to-End Application, by Kevin Schmidt and Christopher Phillips.
This book will give readers the best practices for using Amazon EMR and various AWS and Apache technologies. Readers will learn much more about:
- Get an overview of the AWS and Apache software tools used in large-scale data analysis
- Go through the process of executing a Job Flow with a simple log analyzer
- Discover useful MapReduce patterns for filtering and analyzing data sets
- Use Apache Hive and Pig instead of Java to build a MapReduce Job Flow
- Learn the basics for using Amazon EMR to run machine learning algorithms
- Develop a project cost model for using Amazon EMR and other AWS tools
Written By: Surachart Opun http://surachartopun.com