One of the cool things about SQL Server is that it comes bundled with all the Business Intelligence services under the core database engine license. Reporting Services (which includes the Mobile Report Publisher), Analysis Services, and Integration Services are all integrated and ready to go as soon as you install SQL Server. This has not only made deploying BI cost-efficient for many organizations, it has also contributed to wide adoption among the SQL Server customer base.
What is the Mobile Report Publisher?
Currently in preview, the Mobile Report Publisher is a new report and dashboard editor that publishes reports to Reporting Services, and it’s part of the bigger roadmap Microsoft has for its on-premises Business Intelligence story. We all know that in the cloud, Power BI has been getting a large amount of investment, but on-premises there was a gap that kept getting wider and wider, until now.
With this upcoming SQL Server 2016 release, the Microsoft team is focusing on bringing Reporting Services into the future as a one-stop, integrated solution for BI deployment so that cloud BI or a competitor’s product (*cough* Tableau *cough*) are not the only modern alternatives.
This Reporting Services refactor is the biggest change made to the product since SQL Server 2005 was released over 10 years ago. Leveraging the best parts of the Datazen acquisition, the Microsoft team is looking to provide a cohesive BI story that integrates web-based and mobile reports, Report Builder reports, and Power View-style modern visuals.
How is it different?
You’re probably used to working with SSRS Report Builder and are wondering what the idea is with the Mobile Report Publisher. The demo below will make it very clear, but let’s just say that reports have now been split into two types:
1. Paginated reports: this is the ‘legacy’ style of report built by Report Builder. It looks flatter, has the same controls as before, and is fully compatible with all the reports you have already deployed.
2. Mobile reports: even though the name says ‘mobile’, these reports work just as well on desktop, tablet and mobile. If you’re familiar with web design, the appropriate term would be that these reports are “responsive”. They can be built once and will display nicely across devices. This is the new experience that the Mobile Report Publisher is targeting.
Where do we go from here?
Right now the current version is SQL Server 2016 CTP 3.2. Several more versions will be released before SQL Server 2016 goes RTM later this year. Currently, a lot of the planned functionality is not yet there in the main SSRS interface, and you will often be asked to switch to the “classic” view.
The Mobile Report Publisher experience is also very much targeted towards creating a visualization, and there’s no data modeling to be found. You pretty much need to have your datasets pre-made and in the format the tool expects, or you won’t get very far. Hopefully, at some point the team will add modeling capabilities like the robust experience we already have in the Power BI Desktop tool.
If you want to get your feet wet with this release, you can do it now, and for free, by downloading SQL Server 2016 CTP 3.2. You can also preview the Mobile Report Publisher. Like I said before, this is a big departure, so most people will want to train up to be ready when the new versions go live.
In the demo below I’m going to walk through the new SSRS interface preview in CTP 3.2, and then walk through creating a quick dashboard for both desktop and mobile consumption using the publisher. Let’s check it out!
Today I would like to talk about multicast support on Azure, and how to make it work. While it’s not the most commonly requested feature in a virtual environment, some applications do require multicast support on the network. The perfect example is Oracle RAC, where multicast is required starting from version 11.2.0.2. In Oracle RAC, multicast is used for highly available IP (HAIP) on the interconnect. If you’re thinking about building a training environment with Oracle RAC on Azure, you will need multicast support.
How can we check whether it works? First, you can check if it’s supported by your kernel using the netstat utility.
[root@oradb5 ~]# netstat -g | grep mcast
lo 1 all-systems.mcast.net
eth0 1 all-systems.mcast.net
eth1 1 all-systems.mcast.net
You can see that all my interfaces are ready for multicast support. That’s fine, but how can we check if it works on our network? We can use either the iperf utility or a perl script created by Oracle. You can download the script from Oracle support, if you have an account, from the Oracle note “How to Validate Network and Name Resolution Setup for the Clusterware and RAC (Doc ID 1054902.1)”.
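For instance, a quick iperf check might look like this (a sketch assuming iperf2 is installed on both nodes; the group address matches one of those the Oracle script tests):
# On one node, listen on the multicast group
[root@oradb6 ~]# iperf -s -u -B 230.0.1.0 -i 1
# On the other node, send UDP traffic to the same group with a large enough TTL
[root@oradb5 ~]# iperf -c 230.0.1.0 -u -T 32 -t 3 -i 1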
Here’s what I had: two A3-size Azure VMs running Oracle Linux 6, with two network interfaces each. The VM hostnames were oradb5 and oradb6. You can check out my blog on how to create an Azure VM with two network interfaces here. The second interface, eth1, is the one where we are going to enable multicast.
I ran the mcasttest.pl script and saw this:
[oracle@oradb5 mcasttest]$ ./mcasttest.pl -n oradb5,oradb6 -i eth1
########### Setup for node oradb5 ##########
Checking node access 'oradb5'
Checking node login 'oradb5'
Checking/Creating Directory /tmp/mcasttest for binary on node 'oradb5'
Distributing mcast2 binary to node 'oradb5'
########### Setup for node oradb6 ##########
Checking node access 'oradb6'
Checking node login 'oradb6'
Checking/Creating Directory /tmp/mcasttest for binary on node 'oradb6'
Distributing mcast2 binary to node 'oradb6'
########### testing Multicast on all nodes ##########
Test for Multicast address 230.0.1.0
Nov 24 15:05:23 | Multicast Failed for eth1 using address 230.0.1.0:42000
Test for Multicast address 224.0.0.251
Nov 24 15:05:53 | Multicast Failed for eth1 using address 224.0.0.251:42001
The output clearly tells us that we don’t have multicast support for either the 230.0.1.0 or the 224.0.0.251 multicast address.
What does the Virtual Network FAQ for Azure tell us about it?
Here is the answer:
Do VNets support multicast or broadcast?
No. We do not support multicast or broadcast.
What protocols can I use within VNets?
You can use standard IP-based protocols within VNets. However, multicast, broadcast, IP-in-IP encapsulated packets and Generic Routing Encapsulation (GRE) packets are blocked within VNets. Standard protocols that work include TCP, UDP, and ICMP.
So, we need a workaround. Luckily, we have one. Some time ago, while discussing RAC on Amazon AWS, I was pointed to an article written by my former colleague Jeremiah Wilton, in which he described how he worked around the same problem on Amazon. You can read the article here. I decided to give it a try and see if it works for Azure.
We are going to use n2n, a peer-to-peer VPN provided by ntop.
They have mentioned that development of the product has been put on hold, but the tool is still widely used and provides an acceptable solution to our problem. I used Stuart Buckell’s article on how to set it up, and it worked for me.
We could just use an already precompiled package, but compiling the utility from source gives us the opportunity to disable encryption and compression, or to change any other options.
Here is what I’ve done:
Installed kernel headers to be able to compile n2n:
[root@oradb5 n2n_v2]# yum install kernel-headers
ol6_UEK_latest | 1.2 kB 00:00
ol6_u4_base | 1.4 kB 00:00
ol6_u4_base/primary | 2.7 MB 00:00
Setting up Install Process
Installed subversion utility:
[root@oradb5 /]# yum install subversion.x86_64
Setting up Install Process
--> Running transaction check
---> Package subversion.x86_64 0:1.6.11-15.el6_7 will be installed
Downloaded the sources using svn:
[root@oradb5 /]# svn co https://svn.ntop.org/svn/ntop/trunk/n2n
Error validating server certificate for 'https://svn.ntop.org:443':
- The certificate hostname does not match.
- Hostname: shop.ntop.org
- Valid: from Sun, 15 Nov 2015 00:00:00 GMT until Wed, 14 Nov 2018 23:59:59 GMT
- Issuer: COMODO CA Limited, Salford, Greater Manchester, GB
- Fingerprint: fb:a6:ff:a7:58:f3:9d:54:24:45:e5:a0:c4:04:18:d5:58:91:e0:34
(R)eject, accept (t)emporarily or accept (p)ermanently? p
Disabled encryption and compression using this article
Changed directory to n2n/n2n_v2 and compiled it.
[root@oradb5 n2n_v2]# make
gcc -g3 -Wall -Wshadow -Wpointer-arith -Wmissing-declarations -Wnested-externs -c n2n.c
gcc -g3 -Wall -Wshadow -Wpointer-arith -Wmissing-declarations -Wnested-externs -c n2n_keyfile.c
gcc -g3 -Wall -Wshadow -Wpointer-arith -Wmissing-declarations -Wnested-externs -c wire.c
gcc -g3 -Wall -Wshadow -Wpointer-arith -Wmissing-declarations -Wnested-externs -c minilzo.c
gcc -g3 -Wall -Wshadow -Wpointer-arith -Wmissing-declarations -Wnested-externs -c twofish.c
Copied the files to both of my servers (oradb5 and oradb6), into the /usr/sbin directory:
[root@oradb5 n2n_v2]# cp supernode /usr/sbin/
[root@oradb5 n2n_v2]# cp edge /usr/sbin/
Started a supernode daemon on the first node. We only need it running on one machine, and it can even be a totally different machine. I am using port 1200 for it:
[root@oradb5 ~]# supernode -l 1200
Started the edge on both servers. On oradb5 I am creating a sub-interface with IP 192.168.1.1 and providing some parameters:
-E – Accept multicast MAC addresses (default=drop).
-r – Enable packet forwarding through n2n community.
-c – n2n community name the edge belongs to.
-l – our supernode address:port.
[root@oradb5 ~]# edge -l 10.0.2.11:1200 -c RAC -a 192.168.1.1 -E -r
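On oradb6 the invocation is the same apart from the sub-interface address; assuming 192.168.1.2 for the second node (matching the ifconfig output below) and the same supernode at 10.0.2.11, it would be:
[root@oradb6 ~]# edge -l 10.0.2.11:1200 -c RAC -a 192.168.1.2 -E -r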
So we get an edge0 interface on both nodes and can use it for connections that require multicast:
[root@oradb5 ~]# ifconfig edge0
edge0 Link encap:Ethernet HWaddr 52:CD:8E:20:3D:E5
inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::50cd:8eff:fe20:3de5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1400 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:0 (0.0 b) TX bytes:592 (592.0 b)
On the second box:
[root@oradb6 ~]# ifconfig edge0
edge0 Link encap:Ethernet HWaddr 7E:B1:F1:41:7B:B7
inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::7cb1:f1ff:fe41:7bb7/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1400 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:0 (0.0 b) TX bytes:362 (362.0 b)
Now we can run our multicast test again for the edge0 interface and see how it works.
[oracle@oradb5 ~]$ cd mcasttest/
[oracle@oradb5 mcasttest]$ ./mcasttest.pl -n oradb5,oradb6 -i edge0
########### Setup for node oradb5 ##########
Checking node access 'oradb5'
Checking node login 'oradb5'
Checking/Creating Directory /tmp/mcasttest for binary on node 'oradb5'
Distributing mcast2 binary to node 'oradb5'
########### Setup for node oradb6 ##########
Checking node access 'oradb6'
Checking node login 'oradb6'
Checking/Creating Directory /tmp/mcasttest for binary on node 'oradb6'
Distributing mcast2 binary to node 'oradb6'
########### testing Multicast on all nodes ##########
Test for Multicast address 230.0.1.0
Nov 24 16:22:12 | Multicast Succeeded for edge0 using address 230.0.1.0:42000
Test for Multicast address 224.0.0.251
Nov 24 16:22:13 | Multicast Succeeded for edge0 using address 224.0.0.251:42001
As you can see, the test completed successfully. The edge0 interface can now be used for any connections requiring multicast support.
In my next article I will show you how to create an Oracle RAC on Azure using the multicast interface we created and shared storage.
According to the Oracle 12cR1 documentation and concepts, it is 100% clear that there can be only one UNDO tablespace in a multitenant architecture and that it lives at the CDB level; thus, a PDB cannot have any UNDO tablespace.
Are we really sure about that? Let’s test it!
First, we need a PDB with a few tablespaces:
FRED_PDB> select NAME, OPEN_MODE, CON_ID from v$pdbs ;

NAME                                               OPEN_MODE      CON_ID
-------------------------------------------------- ---------- ----------
FRED_PDB                                           READ WRITE          4

FRED_PDB> select tablespace_name from dba_tablespaces ;

TABLESPACE_NAME
------------------------------
SYSTEM
SYSAUX
TEMP
USERS
TBS_DATA

5 rows selected.

FRED_PDB> show parameter undo

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
temp_undo_enabled                    boolean     FALSE
undo_management                      string      AUTO
undo_retention                       integer     900
undo_tablespace                      string      UNDOTBS1

FRED_PDB>
There we have an UNDO tablespace named UNDOTBS1 at the CDB level and no UNDO tablespace at the PDB level. Let’s try to create one:
FRED_PDB> create undo tablespace MY_PDB_UNDO ;

Tablespace created.

FRED_PDB>
It worked! Is the Oracle documentation wrong? Let’s verify this weird successful UNDO tablespace creation:
FRED_PDB> select tablespace_name from dba_tablespaces where tablespace_name like '%UNDO%' ;

no rows selected

FRED_PDB> select tablespace_name from dba_tablespaces ;

TABLESPACE_NAME
------------------------------
SYSTEM
SYSAUX
TEMP
USERS
TBS_DATA

5 rows selected.

FRED_PDB>
No UNDO tablespace has in fact been created, even though no error message was raised by Oracle. Digging into the documentation, this is not a bug but a feature. Indeed, it is well specified that:
When the current container is a PDB, an attempt to create an undo tablespace fails without returning an error.
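To be sure which container a session is actually in before attempting the statement, a quick check (illustrative output):

FRED_PDB> show con_name

CON_NAME
------------------------------
FRED_PDB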
Please note that this is the behavior of the 12cR1 release; from my side, I think this is a “not yet” feature, and we should see some real UNDO tablespaces in PDBs in the next release(s)!
Discover more about our expertise in Oracle.
Troubleshooting connection issues with Oracle SQL*Net can be difficult at times due to the many options that can be taken during configuration. One of those options is where the tnsnames.ora file may be found. There are multiple locations available, and at times there is justification for having more than one copy of the file.
Perhaps there is a hybrid database naming configuration: say a number of company-wide databases are defined in Oracle OID, OpenLDAP or Active Directory, while local test databases are defined in one or more local tnsnames.ora files.
When one of the databases appears to no longer be available, even though you are quite sure it should be available, it’s good to know the default search order used by Oracle to resolve the name.
The Oracle name resolution default search order for Linux and Windows is explained here.
But wait, there’s more!
You may know that on Linux and Unix systems, tnsnames.ora can be placed in the /etc directory.
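One hedged way to see exactly which copies of the file a client consults on Linux is to trace its file-open calls (strace assumed installed; the connect string is a placeholder):
# Show every tnsnames.ora that sqlplus tries to open during a connection attempt
strace -f -e trace=open,openat sqlplus -L scott/tiger@mydb 2>&1 | grep tnsnames.ora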
Do you know just what happens when /etc/tnsnames.ora is used? Learn that and more by watching the rest of the presentation.
Discover more about our expertise in Oracle.
I recently encountered a déjà vu on a client system, something I’ve seen repeatedly over the last couple of years. I’ve decided to write about it to prevent others from tumbling down the same rabbit hole.
The scenario: a Red Hat Enterprise Linux 6 system with a Red Hat support contract on it. A DBA had installed Oracle’s oracle-rdbms-server-12cR1-preinstall RPM package, based on the Oracle support note “Linux OS Installation with Reduced Set of Packages for Running Oracle Database Server (Doc ID 728346.1)”, which in its main section simply states:
“For Oracle database 12cR1 running on OL6/RHEL6, use the command below to install all packages required for running Oracle software and resolve all dependencies.
yum install oracle-rdbms-server-12cR1-preinstall”
I’ve got a bit of a problem with how that note was written. Let’s take a look at what exactly happens to your RHEL 6 system if you do that. First of all, you have to add Oracle’s yum repo to your yum configuration in order to be able to install that package. I’m a firm believer that you should never mix repositories of different Linux distributions on a production server, but I digress.
Then, when you actually install the RPM:
--> Running transaction check
---> Package oracle-rdbms-server-12cR1-preinstall.x86_64 0:1.0-14.el6 will be installed
--> Processing Dependency: kernel-uek for package: oracle-rdbms-server-12cR1-preinstall-1.0-14.el6.x86_64
--> Processing Dependency: libaio-devel for package: oracle-rdbms-server-12cR1-preinstall-1.0-14.el6.x86_64
--> Running transaction check
---> Package kernel-uek.x86_64 0:2.6.39-400.264.13.el6uek will be installed
--> Processing Dependency: kernel-uek-firmware = 2.6.39-400.264.13.el6uek for package: kernel-uek-2.6.39-400.264.13.el6uek.x86_64
---> Package libaio-devel.x86_64 0:0.3.107-10.el6 will be installed
--> Running transaction check
---> Package kernel-uek-firmware.noarch 0:2.6.39-400.264.13.el6uek will be installed
--> Finished Dependency Resolution
Some DBAs just skip over that section entirely and don’t pay attention to it, but right there Oracle has just installed its own kernel on a RHEL 6 system. It has also been activated and marked as the default in grub.conf (which is the norm when installing a kernel RPM):
title Red Hat Enterprise Linux Server (2.6.32-400.37.15.el6uek.x86_64)
kernel /vmlinuz-2.6.32-400.37.15.el6uek.x86_64 ro root=.....
Let that sink in for a minute.
Leaving the system as it is, we’d go ahead with the installation of the Oracle software, start running our database and go into production. If at any point in the future we reboot the server, or if it crashes, we’d suddenly be running the UEK kernel and no longer the Red Hat kernel. There’s also a fairly ugly can of worms awaiting the DBA in question when the SA sees that someone has changed the default kernel.
But the real question is, what would running a different kernel do to us?
Well, Red Hat has an article on the subject that’s locked behind a subscriber-only wall. In a nutshell, the message it contains is that third-party packages are not supported by Red Hat, and third-party kernels render all issues unsupported. Fair enough; that makes perfect sense, doesn’t it?
Thus, in essence, we’ve just voided support for our server. If we hit any issue, we’d have to first clean out any Oracle packages that have replaced Red Hat’s – including the kernel – and reboot the machine back into a clean state, or we’d have to go to the maker of our custom kernel for support. That’s clearly not something you’d want to do during a critical issue on a production server.
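If you catch the kernel swap early, checking and restoring the default boot entry is straightforward; a sketch for RHEL 6’s grub 0.97, where the right index depends on your own grub.conf:
# List the boot entries and the current default (title entries are counted from 0)
grep -E '^(default|title)' /boot/grub/grub.conf
# Point the default back at the Red Hat kernel, e.g. if it is the second entry listed
sed -i 's/^default=.*/default=1/' /boot/grub/grub.conf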
If we read the aforementioned Oracle support note a bit more closely, way at the bottom in “Remarks”, as if it’s of no importance, we see this:
“RHEL systems registered with RHN or without a registration with an update channel, and which should remain RedHat, can generate a primary list of missing dependencies (manually download the oracle-validated rpm package on public-yum):
# rpm -Uhv oracle-validated–.rpm”
“RHEL systems which should remain RedHat”.
Doesn’t this basically mean that the note isn’t really telling us how to prepare for an Oracle database software installation, but is instead telling us how to convert from RHEL to OEL? How to move our support contract over to Oracle?
Also note how the “Applies to” section in that particular note specifically does not include RHEL? It simply says “Linux”. This somehow reminds me of a certain horse that a certain city got as a present at some point in the distant past. Neatly packaged and easy to use, but with potentially severe long-term impact once you install the package.
I’d like to appeal to both Oracle and Red Hat at this point: please, folks, make this clearer. Both sides could do better here. There’s really no reason why solution 55413 should be locked behind a paywall. It’s often the DBAs who are dealing with these packages to prep for a software install, and they often don’t have access to this content. On a similar note, support note 728346.1 could also be written much more clearly to prevent this sort of confusion. Why is the kernel a dependency of that preinstall RPM? There’s absolutely no need for that.
We’re not in a cold war, are we?
TLDR; Don’t mix repositories of different distributions. Don’t install oracle-rdbms-server-12cR1-preinstall on RHEL unless you’re willing to deal with the consequences.
Discover more about our experience in Oracle.
This Log Buffer Edition covers various useful tips and tricks from blogs for Oracle, SQL Server and MySQL.
- pstack (or thread stack) for Windows to diagnose Firefox high CPU usage
- With the ever-changing browser landscape, we needed to make some tough decisions as to which browsers and versions are going to be deemed “supported” for Oracle Application Express. There isn’t enough time and money to support all browsers and all versions, each with different bugs and varying levels of support of standards.
- Are you effectively using Java SE 8 streams for data processing? Introduced in Java 8, streams allow you to process data in a declarative way and leverage multi-core architectures without writing multithreaded code.
- If you are upgrading but would like to increase or decrease the number of data sources you must do so when installing the latest version.
- When talking about BPM security, you need to know about a certain set of information, where that information will come from, and so on.
- Having fun with PARSENAME (SQL Spackle)
- Time and Space: How to Monitor Drive Space in SQL Server
- Application Security with Azure Key Vault
- Declarative SQL: Using CHECK() & DEFAULT
- Microsoft SQL Server 2016 Public Preview Boosts Database Security
- Oracle MySQL 5.7 Database Nears General Availability
- Most of you know, that it is possible to synchronize MySQL and MariaDB servers using replication. But with the latest releases, it is also possible to use more than just two servers as a multi-master setup.
- Transport Layer Security (TLS, also often referred to as SSL) is an important component of a secure MySQL deployment, but the complexities of properly generating the necessary key material and configuring the server dissuaded many users from completing this task.
- Managing MySQL Replication for High Availability
With the increased popularity of cloud services, one of the questions I often receive is: “Is the DBA career dying? What will you do for a living in the future?” In this article I will give my personal opinion about the future of our beloved profession, and try to calm down those who have already started looking for another career.
The first thing I want to point out is that when we started to work in IT, we knew it was a career different from most others out there. Its nature is dynamic and exciting, reinventing itself all the time, with technological news showing up every single year and changing the entire landscape. We have chosen a field that pushes us to keep studying, learning and evolving, and this is the kind of mindset I want you to have while reading this article.
The Database Administrator role is not going anywhere. We are not an endangered species and won’t become one in the foreseeable future. Cloud is not our enemy. The data market is just evolving, and the cloud is bringing a lot of new things that will give us more power and more options.
In today’s market we have two very common problems:
- Companies can’t find enough people to fill in all positions.
We all know this one. I’m sure we all know several companies that have an open position for months, have interviewed dozens of people, and just can’t find anyone that suits the position.
- Companies want to keep their costs as low as possible.
Companies want to make money, and we had a big worldwide crisis just a few years ago that we are still recovering from. This means companies are trying to find ways to improve their productivity, while keeping their costs as low as possible.
In a scenario like this, the cloud offerings come as an aid to both improve our productivity as a DBA, and to help the company save money. Let’s think for a while about how many tasks we perform daily that don’t bring real value for the business. No doubt that when we’re planning the new high availability solution, or doing performance tuning on that slow query we can see the value that it will bring to the company. In the first case, this will guarantee that all applications are up and running at full speed when the company needs it. The latter will make sure that the server is handling the workload, running more sessions at the same time, and making both internal and external customers happy.
But how about the time you spent trying to find more disk space for all your databases? How about trying to find disk space for all your backups because the database has grown too large and we didn’t plan ahead? Then there’s all the time you spend installing SQL and Windows patches. I know, in some big companies there’s a dedicated SAN admin and infrastructure administrators who worry about those tasks, but that’s not everyone’s reality. The vast majority of small and medium companies have a small team that is responsible for multiple areas. Why? Scroll up and read problems 1 and 2 on my list above one more time.
I’ll wait for you.
Now, let’s imagine another reality. Let’s imagine a world where I receive a disk space alert for my backups. The company has acquired a new company, the database growth was much bigger than expected, and we ran out of disk space. I go to a web portal and a few mouse clicks later I have 1TB of disk available to me. All I have to do is open SQL Server Management Studio and change my backup jobs to use the new storage area. Problem solved in less than 15 minutes.
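The backup-job change itself is the easy part; as a hedged illustration (the database name and path are placeholders):

-- Illustrative: point the backup at the newly provisioned storage area
BACKUP DATABASE SalesDB TO DISK = N'X:\NewStorage\SalesDB.bak' WITH COMPRESSION;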
Let’s envision a world where I can take all those small databases that are not too important to the business (yeah, we all have a lot of those, don’t lie to yourself) and move them to the cloud so they don’t use our precious server resources, and where I don’t need to worry about patching and managing them. Wouldn’t that be great? And how about getting rid of the QA and testing servers and replacing them with virtual machines that can simply be turned off when not in use to save money? And those huge tables with hundreds of millions of rows that cause us problems every single day: wouldn’t it be great if I could replace that complicated sliding window partition solution we developed to manage historic data, and instead have SQL Server automatically move old and unused data to the cloud, while keeping the data available to end users in a transparent way?
The cloud is indeed shifting the DBA career, but it is not a shift that will kill the database administrator role and destroy families. Instead, it will make us more efficient, providing tools and options that let us focus on the tasks that bring value to the company. It’s a solution that lets us use existing hardware more efficiently and makes our lives easier. Embrace the changes just like we embraced all the new technologies that came before, and use each one as a tool to be successful in your role.
Discover more about our expertise in the Cloud.
My internship with Pythian started in October and has been an incredible opportunity to build upon my project management skills with a virtual internship. I never imagined working for a tech company. As business students, we are often characterized as lacking the hard skills needed for industries as competitive as those in STEM fields. After my internship with Pythian, I know that my perspective can bring value to a team especially within the tech industry. My work with Pythian has inspired me to apply to computer science programs in Taiwan after I graduate in June.
During my time at Pythian I worked on the Pythia Program which is Pythian’s commitment to improving gender diversity by encouraging, promoting and retaining women and young girls in STEM fields. I was able to work with managers across many different departments and learn how to be part of a virtual team, while building a plan for the Pythia Program.
Making an impact for women in STEM fields
The Pythia program is setting an incredible precedent for other tech companies in North America and I am very excited that I was a part of that process. Researching and compiling data on women in STEM fields, particularly the tech industries, was an eye-opening experience. Identifying the pain points for underrepresented groups in tech, particularly women, is key in developing solutions that encourage women to seek positions in this field.
I am looking forward to seeing Pythian’s impact in the local community with Technovation and the impact on the young women who will be walking the halls and learning from such great mentors. Pythian is not only making great strides for the young women in their local community, but for all women in tech by setting an example with their own diversity initiatives.
Working with the women of Pythian
While assigned to the Pythia Program, I was lucky to be working with women who were heads of their departments and brought a diverse range of skills to the table. Whether building communication plans with the marketing department, or measuring progress with the organizational development team, I was always challenged to look at the issue from different perspectives.
As a project manager it can be very easy to stay on the periphery and remain an outsider, and admittedly this was a concern of mine as an intern and a newcomer to Pythian. The level of trust that the OD department put in me, and their guiding hand in helping me navigate the onboarding period, was instrumental to our success.
Working with women from many different technical backgrounds, I was able to learn much more than if I had stayed within one specific department. I cannot overstate how important it is as a young woman to work with other women on the Pythia Program. It was inspiring to work with accomplished women with so much expertise that they were willing to share.
Working virtually is a whole different ballgame
It has been said that virtual work can be trying and difficult for those new to a team; however, my time with Pythian was the complete opposite. I am very thankful to have been able to navigate a virtual internship with such incredible support from the OD team. The lines of communication have always been open, and this has been instrumental to our success on the Pythia Program.
Pythian’s managers made themselves accessible and available during my time on the Pythia program, and their guidance was excellent, as was learning from their experiences.
All in all, I could not have asked for a greater internship than my time at Pythian. I was able to work on a project that was important to me as a woman, while working with other women at Pythian. Together we made an impact within the organization and in the local tech community in Ottawa. In the months to come we will see the reach of the Pythia Program on others. For myself, the experience has been impressive as I apply to computer science programs abroad and see my own future in tech. I look forward to following the continued success of Pythian and the Pythia Program.
Find out more about the Pythia Program.
Microsoft recently released the first Cumulative Update for SQL Server 2012 SP3. This CU addresses 8 issues in the product. More importantly, it also marks a big shift in the Cumulative Update message sent by Microsoft.
For a long time, DBAs have debated when, or even whether, we should apply a cumulative update. The official Microsoft message has always been that we should apply a cumulative update only if we were facing an issue in our environment caused by a known and fixed bug. This was also evident in the fact that, to be able to download a cumulative update, it was necessary to register on their website and provide an email address, to which they would send a link to download the package.
So, what has changed? Starting now, the latest cumulative update package will be maintained in the Download Center instead of on the hotfix server. This eliminates the need to register to get the latest package, but it is not the only thing that has changed. The entire message in the knowledge base article has changed as well: instead of a warning saying that we should not install the package unless necessary, we now have:
“ Microsoft recommends ongoing, proactive installation of SQL Server CUs as they become available:
SQL Server CUs are certified to the same levels as Service Packs, and should be installed with the same level of confidence
Historical data show a significant number of support cases involve an issue that has already been addressed in a released CU
CUs may contain added value over and above hotfixes such as supportability, manageability, and reliability updates
As with SQL Server Service packs, Microsoft recommends that you test CUs before deploying to production environments”
This is a big change from what we had before. The concerns we had in the past were warranted because hotfixes were not tested to the same levels as service packs. There were no regression tests, and not all possible integration tests were executed, so there was a real concern that something could go wrong in specific scenarios that had not been tested. But this has changed, and now every cumulative update goes through the same certification levels that are applied to Service Packs.
This is a trend that is happening not only with SQL Server; it is the result of an agile development effort happening throughout the entire Microsoft stack of products. Windows, in both its personal and server editions, has had constant updates instead of Service Packs for some time now, and it seems SQL Server will soon follow this road.
This big change in how Microsoft delivers updates brings us to an interesting discussion: how do you manage frequent and constant product updates in your environment? The last item in the Microsoft message clearly says that you need to test CUs, just like you test Service Packs, before applying them. Are customers willing to go through testing and patching processes every couple of months when a new CU is released? How can we convince customers of the benefit of having the latest version of the product?
I believe people will eventually get used to this new model of constant updates and catch up, creating plans to update their environments more often; maybe not applying every single CU, but applying every second release, or every half year, and so on.
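Whatever cadence you settle on, knowing exactly which build each server is running is the starting point; a quick illustrative check:

-- Illustrative: confirm the current build and patch level before planning a CU deployment
SELECT SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('ProductLevel')   AS product_level;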
What do you think? How do you see this new model fitting into your existing environment? I would love to hear other people’s experiences on this subject.
Oh, and before I forget: you can download the latest CU for SQL 2012 SP3 that I mentioned at the beginning of the article. The link will not change with each release, so you will always be able to download the latest version using it.
Find out how the experts answered at Pythian’s Velocity of Innovation event in San Francisco
Once again, I had the pleasure of moderating another Velocity of Innovation event this past November in San Francisco. Both panelists and guests brought with them a varied range of insights, experiences and perspectives. And as always, this resulted in a thoughtful and lively discussion.
The format of these events is to put three IT leaders with expertise in a particular area in a room, and start a conversation. With our Velocity events, we always start our discussions with a few questions, and the panelists provide some answers. But the idea is to get a conversation going between the audience and panelists.
That day, we discussed a range of subjects from IT innovation, to security, to our favourite wrap-up topic: taking a look into the IT crystal ball and commenting on what current technology trends will really take hold in the future. This last one is always a lot of fun. In this blog post I will provide just some of the highlights from the first discussion topic at the event: innovation and agility in IT.
On our panel were three IT experts:
Sean Rich is the director of IT at Mozilla, leading their application services group. He takes care of web operations, along with pretty much everything data related.
Cory Isaacson is the CTO at Risk Management Solutions. He’s currently bringing big data and scalable systems together to create a new cloud-based platform.
Aaron Lee is Pythian’s VP of transformation services. He runs a team that specializes in helping clients harness technology to deliver real outcomes. Usually they involve things like big data, DevOps, cloud, advanced analytics. He’s involved in some of the most leading edge initiatives for Pythian customers.
I started the discussion by asking the panel, and the audience members, to discuss the notions of innovation and agility, and to try to describe what they have explicitly done to improve innovation and to make their own organizations and those of their customers more agile.
Cory: My business evaluates risk. Our customers are the biggest insurance companies in the world. We run catastrophe models for them so that we can actually see what an earthquake or a hurricane might cost them, for example. I run technology for the company and have to build all the software. We’re innovating tremendously, and so now it’s funny because our executives ask us to evaluate the risk of our own projects. We’re trying to do some very, very innovative things. I don’t know if any of you have an insurance background, but it’s not the most up-to-date industry when it comes to technology. As you know, it’s been around a long, long time. But at my company some of the things that we’re trying to do are, honestly, more advanced than most other things I’ve ever seen in my career. And that’s why I took the position. But when you’re doing innovation, it is risky. There’s no way around it. There is a lot to evaluate: from the different algorithms to the risk models and the catastrophe models. How do you evaluate them? Can you actually run them? We’re going from a 25-year-old desktop application running on Microsoft SQL Server to a cloud-based implementation. We’re taking thousands of servers and trying to move all the customers into a cloud implementation.
Sean: In my role I’m interpreting it a little bit differently. Innovation is doing something new. In an effort toward agility, one of the things that we’re doing within our organization is enabling the agility of our business partners by changing our own operating model. Instead of traditional IT, where we run all the services and infrastructure necessary to drive the business, we’re taking more of an enabler or partnership approach: encouraging shadow IT, encouraging the use of SaaS applications, and helping the business do that better through different service offerings like vendor management, change management for user adoption of certain platforms, and data integration, so that when we have workflows that span multiple areas of the business, we can complete those without moving data around manually and the other problems that come with that. That’s one way that we’re doing new things: looking at ourselves differently and asking what new capabilities (processes, tools and skills) we need to develop to enable agility for our marketing group or our product lines, as an example. That’s a little bit of one way I think about it.
Then I asked: What gets in the way of agility?
Aaron: I think it’s interesting, considering how we used to improve innovation, to look right at the root of the motivation for it. Why are we going down this path, trying to innovate something, and what is the value of that thing we’re trying to innovate? It’s my belief that the thing we should value in our colleagues is their judgment. The preconditions for being able to offer judgment are that you need to be fully informed, you need to be aligned with an objective, and there needs to be a reason and even an incentive to offer that kind of input. If the shared goals around innovation opportunities aren’t defined in a way that actually leads to success over time, then the business is just like any other organism: it gets its fingers burned, it starts to get more risk averse, and then it becomes harder and harder to execute any kind of change agenda. Planning in a way that is likely to have a good long-term outcome, even at the outset of any sort of initiative, is one key success criterion that we put in place to help ourselves and our customers get to a good place.
Attendee 1: Established companies who have the ability to have long-term leaders are losing the race because they’re not agile. Those leaders need to transform in their own minds first, to say, “This is something that needs to be done,” commit to it, and maintain an attitude where, having given that direction, they don’t penalize employees for failure as they run small experiments. A lot of companies have complex projects where they spin off a small team and say, “You go do whatever you want and we are going to give you some limited funding, but we are not going to ask you for results.” CEOs and COOs are looking for, “If I spend 10 million bucks, what am I going to get for it?” When you focus on bottom-line results, you hide the cost of innovation.
Sean: Yeah, it’s culturally endemic, sort of near-term focus on success instead of on the long term and the impact that has on innovation.
Cory: There are some companies, like Google, that have been known to allow an engineer to take a day a week, or a day every two weeks, and just look at things. I think, though, the challenge is that you have to get your organization to the point where this is an economically viable thing to do. That’d be something I’d love to do with our team, but they’re so stressed out about getting the next thing done. Once we get more ahead of the curve, I think we could do that kind of thing.
The discussion went on to cover operationalizing innovation, or making it a program within an organization, before we moved on to other similarly thought-provoking subjects.
Interested in being a part of a discussion like this one? VELOCITY OF INNOVATION is a series of thought-leadership events for senior IT management hosted by Pythian. Pythian invites leading IT innovators to participate in discussions about today’s disruptive technologies: big data, cloud, advanced analytics, DevOps, and more. These events are by invitation only.
If you are interested in attending an upcoming Velocity of Innovation event in a city near you, please contact firstname.lastname@example.org.
Almost every company’s success is contingent upon its ability to effectively turn data into knowledge. Data, combined with the power of analytics, brings new, previously unavailable business insights that help companies truly transform the way they do business—allowing them to make more informed decisions, improve customer engagement, and predict trends.
Companies can only achieve these insights if they have a holistic view of their data. Unfortunately, most companies have hybrid data—different types of data, living in different systems that don’t talk to each other. What companies need is a way to consolidate the data they currently have while allowing for the integration of new data.
The Hybrid Data Challenge
There are two primary reasons why data starts life in separate systems. The first is team segregation: each team in an organization builds its systems independently of the others, based on its unique objectives and data requirements. This, in part, leads to the second reason: teams often choose a database system that specializes in handling their specific application tasks. Sometimes they need to prioritize scalability and performance; other times, flexibility and reliability are more important.
Businesses need to evolve their data strategy to establish a “data singularity,” which turns the sum of their information assets into a competitive, strategic advantage and enables real-time knowledge. A key objective of this strategy should be to reduce data movement, especially during processing and while executing queries, because both are very expensive operations. Data movement, or data replication in general, is also a very fragile operation, often requiring significant support resources to run smoothly.
Enter The World Of Hybrid Databases
A hybrid database is a flexible, high-performing data store that can store and access multiple, different data types, including unstructured (pictures, videos, free text) and semi-structured data (XML, JSON). It can locate individual records quickly, handle both analytical and transactional workloads simultaneously, and perform analytical queries at scale. Analytical queries can be quite resource-intensive so a hybrid database needs to be able to scale out and scale linearly. It must also be highly available and include remote replication capabilities to ensure the data is always accessible.
In the past, many organizations consolidated their data in a data warehouse. Data warehouses gave us the ability to access all of our data, and while these systems were highly optimized for analytical long-running queries, they were strictly batch-oriented with weekly or nightly loads. Today, we demand results in real time and query times in milliseconds.
When Cassandra and Hadoop first entered the market, in 2008 and 2011, respectively, they addressed the scalability limitations of traditional relational database systems and data warehouses, but they had restricted functionality. Hadoop offered infinite linear scalability for storing data, but no support for SQL or any kind of defined data structure. Cassandra, a popular NoSQL option, supported semi-structured, distributed document formats, but couldn’t do analytics. They both required significant integration efforts to get the results organizations were looking for.
In 2008, Oracle introduced the first “mixed use” database appliance—Exadata. Oracle Exadata brought a highly performant reference hardware configuration, an engineered system, as well as unique features in query performance and data compression, on top of an already excellent transactional processing system with full SQL support, document-style datatypes, spatial and graph extensions, and many other features.
In recent years, vendors have started pursuing the hybrid market more aggressively, and products have started emerging that cross boundaries. We can now run Hadoop-style MapReduce jobs on Cassandra data, and SQL on Hadoop via Impala. Microsoft SQL Server introduced columnar-format storage for analytical data sets, and In-Memory OLTP, a high-performance transactional engine with log-based disk storage. Oracle also introduced improvements to its product line with Oracle In-Memory, a specific high-performance memory store that brings extreme analytical performance to the data warehouse.
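To make the SQL Server example concrete: turning a large fact table into columnar storage is a single statement (an illustrative sketch; the table name is a placeholder):

-- Illustrative: store an analytical fact table in columnar format (SQL Server 2014+ syntax)
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales;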
Choosing A Hybrid Database Solution
Rearchitecting your data infrastructure and choosing the right platform can be complicated—and expensive. To make informed decisions, start with your end goals in mind, know what types of data you have, and ensure you have a scalable solution to meet your growing and changing organizational and business needs—all while maintaining security, accessibility, flexibility, and interoperability with your existing infrastructure. A key design principle is to reduce your data movement and keep your architecture simple. For example, a solution that relies on data being in Hadoop but performs analytics in another database engine is not optimal because large amounts of data must be copied between the two systems during query execution—a very inefficient and compute-intensive process.
Have questions? Contact Pythian. We can help analyze your business data requirements, recommend solutions, and create a roadmap for your data strategy, leveraging either hybrid or special purpose databases.
As the amount of data generated around us continues to grow exponentially, organizations have to keep coming up with solutions fit for this new technological landscape. Data integration has been part of this challenge for many years now, and many tools have been developed specifically for these needs. Some tools are geared specifically toward moving data from point A to point B, while others provide a full ETL (Extract-Transform-Load) solution that can work with many products using all kinds of different drivers.
For many years, the first-party tool of choice for SQL Server professionals has been SSIS. Interestingly, even though it’s called SQL Server Integration Services, SSIS is really a general-purpose ETL tool. If you want to extract data from Oracle, transform it with the full expressive capabilities of .NET and then upload it to a partner’s FTP as a flat file, you can do it in SSIS!
As we continue our journey into cloud and hybrid environments, more tools will start coming up that work as an ETL PaaS offering. You won’t have to manage the pipeline’s OS, hardware or underlying software; you’ll just create your data pipelines and be off to the races.
What is it?
Azure Data Factory (ADF) is Microsoft’s cloud offering for data integration and processing as a service. You don’t have to install any bits or manage any software; you’re only responsible for creating the pipelines. Since it’s developed to run inside Azure, the tool also has some pre-made hooks that make it really easy to interoperate with other Azure services such as blob storage, HDInsight or Azure Machine Learning.
On-premises, you would need a machine (VM or physical) and a license for your ETL tool (let’s say SSIS); then you would need to keep SSIS patched and the machine up to date, and think about software and hardware refreshes, and so on. Using ADF, you can focus on the pipeline itself and not worry about what underlying software and hardware is actually making it work. The service supports a wide (and growing) array of sources and targets, as well as robust options for scheduling the pipeline or running it continuously to look for new slices of data.
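As a rough sketch of what a pipeline looks like in the ADF v1 JSON authoring model (the dataset names and active window below are made up for illustration):

{
  "name": "CopyBlobToSqlPipeline",
  "properties": {
    "activities": [{
      "name": "CopyFromBlobToSql",
      "type": "Copy",
      "inputs": [{ "name": "InputBlobDataset" }],
      "outputs": [{ "name": "OutputSqlDataset" }],
      "typeProperties": {
        "source": { "type": "BlobSource" },
        "sink": { "type": "SqlSink" }
      }
    }],
    "start": "2016-01-01T00:00:00Z",
    "end": "2016-01-02T00:00:00Z"
  }
}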
When should you use it?
If you’re thinking about creating a new SSIS package and find that your sources are all web or cloud based, then ADF is a good choice. Build a prototype of your pipeline, make sure that it supports your expected transformations, and then operationalize it in the cloud. As a PaaS offering, it takes away the time, cost and effort of dealing with the underlying bits, so you can focus on delivering quality data pipelines in a shorter timeframe.
Like all new things in Azure, there are still some service limitations. The biggest one at the moment is that the service is only available in the West US and North Europe regions. If you don’t have resources in those regions and will be moving a lot of data, then I would advise you to start learning the service and prototyping, but not to put the pipelines into production; any data movement from outside the region will incur an outbound transfer cost. If your resources are in those regions, then there’s no charge and you can ignore this warning.
In the Demo video we’ll look at the user interface of Azure Data Factory, how to add a source and target, scheduling and checking the status of the pipeline. Enjoy!
In this article I’ll explain all the steps to move your database from on-premises to Azure, using three different approaches. You will need to choose the right one based on your migration strategy and on the database that you are migrating. Don’t forget that not all the features supported on-premises are supported on Azure, so some additional work may be needed prior to the migration.
I’ll show how to migrate a database to Azure SQL Database by using two general methods:
- Using the SQL Server Management Studio – Recommended when there are no code compatibility issues blocking the cloud migration.
- Using the SQL Server Data Tools – This approach is highly recommended when there are migration barriers, as the process of detecting and fixing the issues is simpler and more direct.
If you are in doubt about which one to use, the recommendation is to start with the SQL Server Management Studio approach and, in case of failures, proceed with the SQL Server Data Tools.
Migrate Using SQL Server Management Studio
SQL Server Management Studio (SSMS) offers two direct ways to transfer a database to an Azure SQL Database. To proceed, connect to the SQL Server instance and run either the “SQL Database Deployment Wizard” or the “Export data-tier application” option from SQL Server Management Studio.
If you cannot find the preferred option, you will need to update your SQL Server Management Studio (SSMS), which is now a free standalone product. You can do this by downloading the latest version.
The primary difference between the two options is that the “Deploy” option requires an existing database server in Azure and will directly deploy the on-premises database to that location, while the “Export” option creates a file to be imported from the Azure portal afterwards. The exported file can be loaded straight into an Azure Blob Storage account, which avoids an extra step to copy the file (recommended).
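If you prefer scripting over the GUI, the same export can also be driven from the command line with SqlPackage.exe; a hedged sketch (the DAC install path, server, database and file names are all assumptions):

:: Illustrative: export the database to a BACPAC file
"C:\Program Files (x86)\Microsoft SQL Server\120\DAC\bin\SqlPackage.exe" ^
  /Action:Export /SourceServerName:localhost /SourceDatabaseName:AdventureWorks ^
  /TargetFile:C:\temp\AdventureWorks.bacpac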
NOTE: For both options, an Azure Blob Storage account with a container and an Azure SQL server are needed.
Migration Steps Using the Deployment Wizard
- Right-click the database and select the Deploy Database to Microsoft Azure SQL Database
- Fill in the required fields.
The server information is for the target (the Azure SQL Database server). The settings that define the price tier are also configured at this stage. The BACPAC file will be created locally and then applied on the Azure SQL server, so we will need to store the BACPAC file in a temporary place on the server.
- Click Next.
- Review the settings and click Finish.
- Wait for the process to complete.
At this stage the wizard will validate the database, create the BACPAC file, and apply it on the Azure SQL server to create the database.
- The database is now ready; use the server admin account to access it on the Azure SQL server.
Migration Steps using the Export Data-Tier Application Process
- Right-click the database and select the Export Data-tier Application.
- Save the file in an Azure Blob Storage Account. You will need the account name and access key.
- Select the container and click Next.
- Click Finish, and wait for the processing to complete.
- Once the process completes, a “Success” message is shown as in the screen below. Otherwise, there are items that need to be resolved before the database can be converted into an Azure SQL Database.
- Connect to the Azure portal and choose the SQL Servers.
- Select the SQL Server location where the database should be created, and then click the Import Database icon as shown below.
- Complete the required settings, including the BACPAC file location, price tier, and server administrator’s password, and then click Create.
- Once the process completes, the database will be seen in the list.
Migrate Using SQL Server Data Tools
When using SSMS to migrate the database with a BACPAC, we don’t have the flexibility needed to properly detect and fix the issues found. For this purpose, SQL Server Data Tools – Business Intelligence is a better option for analyzing the database objects. To proceed with this option, follow the steps below.
Creating the Main Project
- Using the SQL Server Data Tools BI, click the SQL Server Object Explorer tab and connect to the on-premises instance:
- Right-click the database to be migrated to Azure, and then click Create New Project.
- Add a name to the project and select a path to save the project files.
- Click Next and wait for the processing to complete.
- After the project is created, right-click the project root, go to properties and change the Target Platform to Azure SQL Database. Save and close.
- Right-click the project and click Rebuild. If problems are detected, all the errors will be shown in the Error List.
- Go to File->New->Project, give the project a name (I will name it AWAzure), and in the Solution option, click Add to solution:
Creating the New Schema
In order to filter out the non-supported features and find the code to be corrected, the next step is to create a Schema Comparison. Follow the steps shown:
- Now, select the options. Click the icon shown.
- In the Schema Compare Options window, click to clear the following known non-supported items:
- Application Roles
- Asymmetric Keys
- Broker Providers
- Extended Properties
- Full-Text Stoplists
- Full-Text Catalogs
- Full-Text Indexes
- Message Types
- Partition Functions
- Partition Schemes
- Remote Service Bindings
- Symmetric Keys
- User-Defined Types (CLR)
- XML Indexes
- XML Schema Collections
- Click Ok and save the Schema Comparison, as it can be useful later.
- Select the source: The On-premises database.
- Select the Target: the empty SQL Server project created earlier.
We will have the following:
- Now, click Compare. Wait for the process to complete and then click Update (click YES in the confirmation pop-up), to update the selected target.
- Next, go to the AWAzure (the target) project, right-click on the root, go to properties, and change the Target Platform to Azure SQL Database.
- Click Save and Close the screen.
Now it’s time to resolve the problems. Check the Error List tab and double-click each item found to open the code. Resolve the issue and save the file.
Use the filter to ensure you are dealing with the right project.
Deploying the Schema
After the schema revision, we can publish the database.
- To publish the database, right click the AWAzure project, and click Publish.
- Edit the target instance and connect to the Azure SQL Server:
- Fill in the database name and click Publish.
Moving the Data
The schema is deployed. Now it is time to move the data. To do this, use the Import and Export Wizard from SQL Server Management Studio.
- Connect to the on-premises instance, right click the database used as the data source and follow the steps shown:
- In the wizard, confirm the Server name and the source database, and then click Next.
Now, do the same for the Azure SQL Database.
- In the Destination field, select SQL Server Native Client 11.0, fill in the server name, and select the target database.
- Click Next.
- For this step, keep the first option selected, and then click Next.
- Select all the tables and views from the source. Notice that SQL Server will automatically map the target tables on Azure.
About data hierarchy: If foreign key constraints are being used in the database, the data migration should be made in phases to avoid failure. This needs to be analyzed prior to the final migration.
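To plan those phases, it helps to list the foreign key relationships up front. A simple illustrative query such as the following shows which (parent) tables must be loaded before their child tables:
SELECT fk.name AS constraint_name,
       OBJECT_NAME(fk.parent_object_id) AS child_table,
       OBJECT_NAME(fk.referenced_object_id) AS parent_table
FROM sys.foreign_keys AS fk
ORDER BY parent_table, child_table;
Tables that never appear as a child can be loaded first; the rest follow in dependency order.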
- Make sure that all the tables are highlighted and click Edit Mappings.
- Select Enable Identity Insert and then click Ok.
- Then, in the main Wizard window click Next.
- Make sure the Run immediately check box is selected and click Next.
- In the following screen, review the options, and then click Finish.
- Monitor the data transfer and close the wizard.
That’s it. I hope the steps were clear and this article was useful. If you have questions, do not hesitate to post a comment or contact me on Twitter (@murilocmiranda). “See” you in another article.
Discover more about our expertise in SQL Server.
In the previous article, I covered the basics of how to remove database passwords (credentials) from Oracle monitoring or backup scripts and how to instead secure them using a “Secure External Password Store” (SEPS) and Oracle Wallet.
While this mechanism is far better than putting a plain text credential in a script file, one of the more advanced options, specifically tying the files to the local host with “-auto_login_local”, introduces bugs with Oracle 12cR1 software that are not present with other versions.
This article goes deeper into how to harden the approach, lock-down script access to the local server, and workaround Oracle Wallet limitations and bugs.
Issues with the “-auto_login_local” Option
Oracle suggests using the “-auto_login_local” option to secure an Oracle Wallet and only allow it to be used on the server on which it was created and by the user that created it. See MOS document: “How To Prevent The Secure Password Store Wallet From Being Moved to Another Host (Doc ID 1114599.1)”
This is supposed to protect from a bad actor obtaining a copy of the file, say from a backup, and being able to use it (and the credentials contained within it) from another machine. Unfortunately, there are a number of issues and problems with this option:
- There are ways to work around the protection it provides.
- The option fundamentally doesn’t work with 12.1.0.2 (while it does with 11.2.0.4 and 12.1.0.1). This is clearly an Oracle bug.
Bypassing the “-auto_login_local” Parameter
The “-auto_login_local” parameter is supposed to protect the Wallet from being used on another server. However, testing proves that this is really easy to work around.
I covered the basics of SEPS and Oracle Wallets in my previous article. To enable the -auto_login_local option, we simply modify the existing Wallet file using the orapki utility:
$ orapki wallet create -wallet "/u01/app/oracle/wallet" -auto_login_local
Oracle PKI Tool : Version 11.2.0.4.0 - Production
Copyright (c) 2004, 2013, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:

$ mkstore -wrl "/u01/app/oracle/wallet" -listCredential
Oracle Secret Store Tool : Version 11.2.0.4.0 - Production
Copyright (c) 2004, 2013, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:

List credential (index: connect_string username)
1: ORCL scott

$
Testing on the local machine shows that the connection using the Wallet works as expected:
$ sqlplus /@ORCL

SQL*Plus: Release 11.2.0.4.0 Production on Wed Jan 13 15:27:54 2016

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select SYS_CONTEXT('userenv','IP_ADDRESS') IP_ADDRESS,
  2         SYS_CONTEXT('userenv','DB_NAME') DB_NAME,
  3         SYS_CONTEXT('userenv','CURRENT_USER') CURRENT_USER
  4    from dual;

IP_ADDRESS      DB_NAME      CURRENT_USER
--------------- ------------ ------------
192.168.1.123   ORCL         SCOTT

SQL>
It’s not overly simple for a bad actor already inside the network to obtain all of the information they’d need to access the database remotely, but it is possible. Say, for the sake of an example, that a bad actor obtained access to a backup of the OS. From that they could see the DBA scripts and how they connect, obtain the network files such as the sqlnet.ora and tnsnames.ora files, and obtain the Oracle Wallet files.
If a SEPS and Oracle Wallet was not being used, they’d presumably also be able to work out the database credentials as they’d either be hard-coded in the DBA script files or obfuscated in some other plain text file (not hard to reverse engineer).
Copying the cwallet.sso and ewallet.p12 (and maybe the tnsnames.ora) files to a secondary server simulates the actions of the “bad actor”.
But trying to make the same connection from the secondary server (which the “bad actor” controls) shows the “ORA-12578: TNS:wallet open failed” error:
$ sqlplus /@ORCL

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 15:38:50 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

ERROR:
ORA-12578: TNS:wallet open failed

Enter user-name:
This is the expected error when the “-auto_login_local” option is used. However, it’s simple to work around.
MOS Note 1114599.1 suggests that the /etc/hosts file may cause this error. So the first thing to try is changing the name in the hosts file to that of the legitimate DB server:
# cp /etc/hosts /etc/hosts.backup
# cat /etc/hosts.backup | sed -e "s/HACKED_OS/DBSERVER/ig" > /etc/hosts
# su - oracle -c "sqlplus /@ORCL"

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 15:59:53 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

ERROR:
ORA-12578: TNS:wallet open failed

Enter user-name:
Clearly that didn’t help the situation at all. Undoing that and instead trying to rename the compromised server (separately as root) gives a different error:
# cp /etc/hosts.backup /etc/hosts
# hostname
HACKED_OS
# hostname DBSERVER
# hostname
DBSERVER
# su - oracle -c "sqlplus /@ORCL"

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 15:53:02 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

ERROR:
ORA-21561: OID generation failed

Enter user-name:
But if we do both:
# cp /etc/hosts /etc/hosts.backup
# cat /etc/hosts.backup | sed -e "s/HACKED_OS/DBSERVER/ig" > /etc/hosts
# hostname DBSERVER
# su - oracle -c "sqlplus /@ORCL"

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 16:05:53 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Last Successful login time: Wed Jan 13 2016 16:04:45 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select SYS_CONTEXT('userenv','IP_ADDRESS') IP_ADDRESS,
  2         SYS_CONTEXT('userenv','DB_NAME') DB_NAME,
  3         SYS_CONTEXT('userenv','CURRENT_USER') CURRENT_USER
  4    from dual;

IP_ADDRESS      DB_NAME      CURRENT_USER
--------------- ------------ ------------
192.168.1.200   ORCL         SCOTT

SQL>
So if we change the hostname via the hostname command (or in the /etc/sysconfig/network file) and also update the /etc/hosts file, then the -auto_login_local security is bypassed and we can log into the database from a compromised machine using the credentials stored in the Oracle Wallet!
It’s important to note here that I’m connecting to a 12.1.0.2 database but using a Wallet file that was created using the 11.2.0.4 software.
ORA-12578 with Oracle Database 12.1.0.2
To make matters worse, with Oracle 12.1.0.2 the -auto_login_local option doesn’t work at all.
Back on the database server (legitimate DBA activity, not simulating a “bad actor”), creating the Oracle Wallet file using 12.1.0.2 software seems to prevent connectivity locally:
$ orapki wallet create -wallet "/u01/app/oracle/wallet" -auto_login_local
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:

$ sqlplus /@ORCL

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 16:21:05 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

ERROR:
ORA-12578: TNS:wallet open failed
This is unexpected behaviour and clearly shows an Oracle bug. Taking off the -auto_login_local option (by using -auto_login instead) shows that the Oracle Wallet does indeed work on this server:
$ orapki wallet create -wallet "/u01/app/oracle/wallet" -auto_login
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:

$ sqlplus /@ORCL

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 16:22:30 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Last Successful login time: Wed Jan 13 2016 16:20:38 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select SYS_CONTEXT('userenv','IP_ADDRESS') IP_ADDRESS,
  2         SYS_CONTEXT('userenv','DB_NAME') DB_NAME,
  3         SYS_CONTEXT('userenv','CURRENT_USER') CURRENT_USER
  4    from dual;

IP_ADDRESS      DB_NAME      CURRENT_USER
--------------- ------------ ------------
192.168.1.123   ORCL         SCOTT

SQL>
Hence, there clearly is a bug specific to the 12.1.0.2 software, whereby the ORA-12578 error is returned when it shouldn’t be. Repeating the same procedure using 11.2.0.4 or 12.1.0.1 software does not exhibit the same error.
And it’s important to understand that it doesn’t matter which version of the database the connection is made to. The problem is specific only to the version of the software used to create the Wallet file. So creating the Wallet with 11.2.0.4 software just to use against a 12.1.0.2 database works without issue.
Hardening Using Other Strategies
Due to the above-mentioned issues, other strategies can be used to harden the connections and credential management used by DBA scripts.
Using localhost or 127.0.0.1
The simplest way to prevent the Wallet files from being usable on another server is to change the OracleNET Service Name to an EZconnect string that uses localhost or 127.0.0.1. For example, on the DB server:
$ mkstore -wrl "/u01/app/oracle/wallet" -create -createCredential localhost:1521/ORCL scott
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Enter password:
Enter password again:
Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Create credential oracle.security.client.connect_string1

$ sqlplus /@localhost:1521/ORCL

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 16:33:27 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Last Successful login time: Wed Jan 13 2016 16:31:50 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> show user
USER is "SCOTT"
SQL>
Now if we try using the Oracle Wallet files on a compromised server (with the /etc/hosts and /etc/sysconfig/network spoofing described previously), the connection attempt routes through localhost back to the compromised server itself, not to the database server. Hence, a connection attempt gives:
[oracle@HACKED_OS ~]$ sqlplus /@localhost:1521/ORCL

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 16:34:27 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

ERROR:
ORA-12541: TNS:no listener
Thus, by using an EZconnect connection string with localhost instead of the actual server’s hostname, FQDN, or IP address, we’ve avoided the 12.1.0.2 bug and provided more thorough protection than the -auto_login_local option offers anyway.
And of course we could have used 127.0.0.1 instead of localhost – the results are the same.
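As a quick illustration of how a script would consume that localhost entry, here is a minimal sketch of a monitoring script. It is an assumed example: the localhost:1521/ORCL credential is the one created above, and the query itself is arbitrary; note that no username or password appears anywhere.
#!/bin/bash
# The Oracle Wallet supplies the credentials; nothing sensitive is stored in this script.
sqlplus -s /@localhost:1521/ORCL <<'EOF'
set heading off feedback off
select 'Database ' || name || ' is open' from v$database;
exit
EOF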
Finally, remember that the connection string forms the primary key of the 3DES protected data in the Wallet file which can’t be modified without knowing the Wallet’s password.
Connecting Through a Dedicated Listener and “Valid Node Checking”
Another way to prevent the Oracle Wallet from being used to access the database from an unauthorized server (actually any server other than the DB server) is to have the scripts connect through a dedicated listener. The dedicated listener’s port can then be restricted using either a firewall or the listener’s “valid node checking” functionality.
For example, the dedicated listener could be configured with the following in the listener.ora file:
MONITORING_LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1599))
    )
  )

SID_LIST_MONITORING_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /u01/app/oracle/product/12.1.0/dbhome_1)
      (SID_NAME = ORCL)
    )
  )

PASSWORDS_MONITORING_LISTENER = (F251EDED29514235)
Then, for added (though possibly redundant, due to the use of localhost) protection, the following entries could be added to the server’s sqlnet.ora:
TCP.VALIDNODE_CHECKING = YES
TCP.INVITED_NODES = (localhost)
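If a firewall is preferred over (or layered on top of) valid node checking, a host firewall rule can impose the same restriction. This is an illustrative iptables rule only, assuming the dedicated listener port 1599 from the example above:
# Drop any TCP connection to the dedicated listener port that does not arrive on the loopback interface
iptables -A INPUT -p tcp --dport 1599 ! -i lo -j DROP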
As a result, local connections using localhost:1599/ORCL will work, while connections from the compromised server will receive:
ERROR: ORA-12537: TNS:connection closed
Preventing Use by Another OS User
Another challenge is to prevent another OS user on the same server from using the Oracle Wallet to connect to the database.
Of course the wallet files should be well secured using OS directory and file security. It can further be obfuscated by making the wallet directory a hidden directory (starting with a period).
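For example, here is a quick (illustrative) hardening of the wallet directory used throughout this article; if the directory is renamed, remember to update the WALLET_LOCATION in the sqlnet.ora accordingly:
$ mv /u01/app/oracle/wallet /u01/app/oracle/.wallet
$ chmod 700 /u01/app/oracle/.wallet      # only the owning OS user can enter the directory
$ chmod 600 /u01/app/oracle/.wallet/*    # wallet files readable by the owner only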
If the -auto_login_local option is used, then other users on the same server will not be able to use the Oracle Wallet credentials and will instead get the “ORA-12578: TNS:wallet open failed” error. Hence, creating the Oracle Wallet using a version other than 12.1.0.2 (regardless of the database version) and enabling the -auto_login_local option is still the best solution.
Beyond OS directory and file restrictions and the -auto_login_local option, the only other method for restricting access from other OS users on the same server would be a database scoped logon trigger or secured application role.
Using an Oracle Secure External Password Store (SEPS) and Oracle Wallet files is the best way to handle database credentials and passwords in OS scripts. However, a number of significant problems exist:
- The -auto_login_local parameter can be bypassed on a compromised server by changing the hostname (in /etc/hosts and /etc/sysconfig/network).
- The -auto_login_local parameter doesn’t work at all when the Wallet is created with 12.1.0.2 software.
That being said, we can still harden our script’s database access by following some additional suggestions:
- Create the Oracle Wallet using 11.2.0.4 or 12.1.0.1 software even if connecting to 12.1.0.2 databases.
- If the Oracle Wallet files were created using 11.2.0.4 or 12.1.0.1, protect them from usage by other users by using the -auto_login_local parameter.
- Prevent use from other servers by not using an OracleNET Service Name in Oracle Wallets and instead using an EZconnect connection string using either localhost or 127.0.0.1 (not the proper DB server’s hostname, FQDN, or IP address).
- Another strategy is to use a dedicated listener on a dedicated port with listener “valid node checking” to only permit connections from the local server.
- As a last resort prevent non-authorized IPs or OS Users from connecting using a logon trigger or secure application role within the DB.
Discover more about our expertise in Oracle.
Almost every DBA writes and uses various custom scripts to monitor and backup their Oracle databases. However, finding the optimal and most secure way to connect to the database is often not prioritized.
The short summary is that having your script put the username/password (credentials) in any sort of variables or command arguments is the “bad way”. Using an “Oracle External Password Store” (SEPS) or Oracle Wallet is the “better way”. Yet this technology which has been around since Oracle 10gR2 and which does not require the Advanced Security Option is often not used.
Many DBAs will store credentials in their actual scripts, sometimes obfuscating the actual password through some custom mechanism or script. For example, the following is a simplified version of an RMAN backup script found at a client site:
#!/bin/bash

export DB_USER=backup_user
export DB_PASS=`~/.secure_pwd_extractor`

$ORACLE_HOME/bin/rman << EOF
connect target $DB_USER/$DB_PASS
shutdown immediate
startup mount
backup database;
alter database open;
EOF
The client thought that it was somewhat secure, as the actual password wasn’t used as a command line argument to RMAN and was stored in a Linux “hidden file” (starting with a period), which was protected by properly set OS permissions. However, the script dynamically extracted the password from a plain text file (based on the DB_USER environment variable). Another key problem was that the environment variable was exported and hence became part of the environment under which the database was started.
The exported environment variables in this case pose a security risk in a couple of ways:
First of all, the complete operating environment, including the exported environment variables under which the database is running, is recorded by the listener when the service is registered. Hence, they are visible in a listener “services” command with “set displaymode verbose”:
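The commands to check this are roughly as follows (output omitted here; the point is simply that exported variables such as DB_USER and DB_PASS from the script above would appear in the verbose service listing):
$ lsnrctl
LSNRCTL> set displaymode verbose
LSNRCTL> services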
Secondly, they may be recorded in OS process files, for example in the pmon process’ operating environment, or even the RMAN process’ while it is running:
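On Linux this can be seen in /proc. A hypothetical check (the process name and variable names match the sample script above; output shortened):
# tr '\0' '\n' < /proc/$(pgrep -f ora_pmon_ORCL)/environ | grep '^DB_'
DB_USER=backup_user
DB_PASS=...
Any account that can read the process entries under /proc (the process owner or root) can pull the exported credentials straight out of the process environment.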
But most significantly the credentials can be extracted by anyone with access to the script file and/or the underlying credentials file.
A Better Approach
A better way to store database credentials for monitoring or backup scripts is to use a “Secure External Password Store” (SEPS) which relies on having the necessary credentials securely stored in an Oracle Wallet file.
Typically a DBA might create their own plain text password listing file for use by scripts and batch jobs, usually with three columns: 1) DB_NAME; 2) DB_USERNAME; 3) DB_PASSWORD. The Oracle Wallet is structured in exactly the same way, except:
- The file is protected with 3DES encryption.
- The DB_NAME is really an OracleNET Service Name meaning you can have multiple aliases for the same database.
- The passwords are never exposed.
- A separate Wallet password is required to manipulate the file’s contents.
- You can control whether the Oracle Wallet file is tied to the local machine or whether it can be copied to and used on other machines.
The advantages of this approach include:
- No clear text password in any scripts or files.
- No possible exposure of passwords by the listener or process operating environments.
- Control on whether the Oracle Wallet file can be copied and used on another machine.
The last point is actually a complex one. A Wallet can be created as an “auto_login” wallet (done by default). To secure it to only work on the local server, it can be changed to “auto_login_local“. However, there are various issues, limitations, and 12c bugs with the additional functionality that Oracle provides. A separate article goes into this in detail.
Setting up a “Secure External Password Store” and Oracle Wallet is actually quite quick and easy:
1) Adjust the sqlnet.ora file to point to an Oracle Wallet location. For example, add the following to the sqlnet.ora file (assuming that the specified directory exists):
WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA =
      (DIRECTORY = /u01/app/oracle/wallet)
    )
  )

SQLNET.WALLET_OVERRIDE = TRUE
2) Create the Oracle Wallet files and add a credential. Two files will actually be created in the specified directory:
- ewallet.p12 – the actual password protected Wallet file.
- cwallet.sso – an additional file for auto-login credentials.
This can be done as either two separate commands or all in a single command:
$ mkstore -wrl "/u01/app/oracle/wallet" -create
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Enter password:
Enter password again:

$ mkstore -wrl "/u01/app/oracle/wallet" -createCredential ORCL scott
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Enter wallet password:
Create credential oracle.security.client.connect_string1
Or as a single command:
$ mkstore -wrl "/u01/app/oracle/wallet" -create -createCredential ORCL scott
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Enter password:
Enter password again:
Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Create credential oracle.security.client.connect_string1

$
Notice that the Wallet is secured by a password, and the SCOTT credentials are then stored within the Wallet. The Wallet password is required to manipulate the Wallet’s contents – not for scripts to access the stored credentials.
The first parameter after the “-createCredential” argument is an OracleNET Service Name. Just like with any database connection, here we can specify an OracleNET Service Name (from the tnsnames.ora file), or a full connection string, or an EZconnect string.
Hence, we could add a second and third connection to the same database as:
$ mkstore -wrl "/u01/app/oracle/wallet" -createCredential localhost:1521/ORCL monitoring_user
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Enter wallet password:
Create credential oracle.security.client.connect_string2

$ mkstore -wrl "/u01/app/oracle/wallet" -createCredential 127.0.0.1:1521/ORCL batch_reporting
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Enter wallet password:
Create credential oracle.security.client.connect_string3

$
And to list the contents of the Oracle Wallet:
$ mkstore -wrl "/u01/app/oracle/wallet" -listCredential
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:

List credential (index: connect_string username)
3: 127.0.0.1:1521/ORCL batch_reporting
2: localhost:1521/ORCL monitoring_user
1: ORCL scott

$
Now any of the three can be used (from the same OS account: “oracle”) depending on which OracleNET Service Name is referenced:
$ sqlplus /@ORCL

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 08:59:12 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Last Successful login time: Wed Jan 13 2016 08:56:56 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> show user
USER is "SCOTT"
SQL>
$ sqlplus /@localhost:1521/ORCL

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 08:59:41 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Last Successful login time: Wed Jan 13 2016 08:57:25 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> show user
USER is "MONITORING_USER"
SQL>
$ sqlplus /@127.0.0.1:1521/ORCL

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 09:00:14 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Last Successful login time: Wed Jan 13 2016 08:43:44 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> show user
USER is "BATCH_REPORTING"
SQL>
$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 13 09:01:12 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> show user
USER is "SYS"
SQL>
However, one might point out that the Oracle Wallet file itself must be secured just as a plain text password file would need to be, and that the risk of either being obtained by a “bad actor” is the same. That being said, there are still some benefits to the SEPS method:
- Passwords are never maintained in plain text in scripts or (hidden) password files.
- No risk of password exposure at the operating system process list.
- No risk of password exposure in operating system process environmental files.
- No risk of exposure from any bad actor with access to the script.
- No risk of password exposure in OS backups maintained by sysadmins or backup vendors (though the files themselves may still be usable).
Regardless, using the SEPS and Oracle Wallet shouldn’t make anything less secure. One could argue that the security and risk are at worst equal, but definitely not worse.
However, there are a few operational disadvantages:
- The OracleNET Service Name forms the primary key of the entries in the Wallet and hence must be unique. So if another user credential is required for the same DB, an OracleNET alias will be required (as trying to add another user using the same OracleNET Service Name will generate the “Credential already exists” error based on the OracleNET Service Name not being unique).
- The Wallet doesn’t work after connecting to SQL*Plus using the “CONNECT” command.
- Scripts are now dependent on listener availability, since they establish a TNS connection instead of a BEQ connection. A workaround may be to connect directly to a DISPATCHER port.
Advantages Over OS Authentication
Using a SEPS and an Oracle Wallet may seem functionally similar to just using “OS Authentication”. However, it does offer a few differences and advantages.
Most DBAs operate and implement scripts under the “Oracle software owner” account which is typically called “oracle” on Unix or Linux systems. And hence most are able to connect to the database using a SYSDBA connection. So one solution would be to use a dedicated OS user account specifically for monitoring or database backup scripts. And then rely on OS authentication for database connections. However this is often not done. And if running scripts from a remote server or centralized monitoring server then the REMOTE_OS_AUTHENT=TRUE parameter would have to be set which poses other security risks.
Instead, using a SEPS allows for non-OS authenticated connections to a dedicated and minimally privileged database account even from the software owner (Oracle) account locally, or from any other account remotely.
Q: Do I need to re-create my Wallet file as part of a 12c upgrade?
Q: Do I need to backup the Wallet files?
A: Absolutely. Just back them up as you would other Oracle environmental files, such as the sqlnet.ora. Otherwise they’ll need to be re-created.
Q: Is SEPS compatible with “Proxy Authenticated Connections”?
A: YES. See Proxy Authenticated Database connections.
Q: Can the Wallet be used with RMAN?
A: YES. It definitely can and should be used by RMAN scripts.
Q: Can the Wallet be used with JDBC thin or ODBC connections?
A: YES to both. See MOS documents 1441745.1 and 1430666.1 for further details.
Q: Are SEPS and Wallet connections compatible with the CDB architecture and PDBs?
A: Of course. PDBs connect through Oracle Services and OracleNET Service Names.
Q: Can we tell from within the database whether a connected session referenced an Oracle Wallet and used SEPS?
A: NO. There doesn’t seem to be any indication from within the database. As far as the DB is concerned, it just seems to be a password authenticated connection. A SQLNET trace shows that the Wallet files are accessed but doesn’t transpose that information into any DB metric or SYS_CONTEXT USERENV data.
Preventing the Secure Password Store Wallet File from Being Moved to Another Host
Oracle suggests that we can add additional security by tying the Oracle Wallet file to a specific host. See MOS document 1114599.1. However, this poses some restrictions, and bugs with 12.1.0.2 specifically. As this discussion is complex, this follow-up article has been created.
A good workaround (and a key point from the follow-up article) is to simply use an EZconnect connection in the Oracle Wallet file and to specify localhost or 127.0.0.1.
Summarizing Best Practices
- Continue to properly secure the directory and file permissions of the Wallet files as you would a plain text password file. Further, consider making the entire wallet directory a hidden directory starting with a period. Remember that if the wallet file can be copied to another server, then potentially the credentials within it can continue to be used. See the follow-up article for suggested techniques for securing access further.
- Have scripts connect to the database using dedicated purpose based user accounts with minimal privileges (don’t use SYS or SYSTEM).
- Use OracleNET Service Name aliases (not duplicate entries, but aliases to an existing entry) in the tnsnames.ora file to allow multiple credentials for the same database.
- For additional security add Wallet entries based on EZconnect strings using localhost or 127.0.0.1 instead of relying on OracleNET Service Names from the tnsnames.ora file.
Discover more about our expertise in the world of Oracle.
This Log Buffer Edition begins with some great blog posts from Oracle, goes through SQL Server and then ends with MySQL.
- Ruby-oci8 is a ruby interface for an Oracle Database.
- Another python graph – one wait event.
- This article compares FBL and HDL – two of the commonly used data loading tools in Fusion HCM to highlight key differences and similarities.
- Better Data Modeling: Customizing Oracle SQL Developer Data Modeler (#SQLDevModeler) to Support Custom Data Types.
- Sample code: Oracle Grid Infrastructure action script for Windows.
- Being a database administrator can be very challenging at times when you have to troubleshoot performance issues.
- Another Reason to Use NOEXPAND hints in Enterprise Edition.
- Error: Microsoft .NET framework 3.5 service pack 1 is Required.
- Removing Duplicates from Strings in SQL Server.
- .NET Core is more interesting than the name might suggest. Whereas the .NET framework provides a consistent runtime for all the applications on a machine.
- OpenSSH CVE-2016-0777: Details and Mitigation.
- MySQL Group Replication for MySQL 5.7.10.
- MySQL 5.7 auto-generated root password.
- MySQL Support People – Those Who Were There First.
- Planning the defaults for MySQL 5.8.
Configuring High Availability for Hive requires the following components to be fail-proof:
1. Hive Metastore underlying RDBMS
2. Zookeeper
3. Hive Metastore Server
4. HiveServer2
For the sake of simplicity this blog will focus on enabling HA for the Hive Metastore Server and HiveServer2. We recommend that the Hive Metastore's underlying RDBMS be configured for High Availability, and we have already configured multiple Zookeeper instances on the current cluster.
Enabling High Availability for Hive Metastore Server
1. Log on to Cloudera Manager.
2. Click on HIVE > Hive Metastore Server. Locate the host for the Hive Metastore Server.
3. SSH to the Hive Metastore Server.
# vi /etc/hive/conf.cloudera.hive/hive-site.xml
Expected output below.
4. On the Cloudera Manager Console click Hive > Configuration
Select Scope > Hive Metastore Server.
Select Category > Advanced.
Locate the Hive Metastore Delegation Token Store property and set it to the token store implementation for an HA metastore (typically org.apache.hadoop.hive.thrift.DBTokenStore).
Click Save Changes.
5. On the Cloudera Manager Console click Hive > Instances. Click on Add Role Instances.
Click on Select Hosts for Hive Metastore Server.
6. Choose multiple Hosts (at least 2 more to make a total of 3) to configure Hive Metastore Server on.
Click OK and Continue.
7. Click Finish. You should now see the new hosts added as Hive Metastore Servers.
Click on Restart the service (or the instance) for the changes to take effect.
8. Notice that hive.metastore.uris now has multiple instances of Hive Metastore Server.
Click on Restart Stale Service.
9. Click Restart Now.
10. Review Restart Messages.
11. Notice that you now have multiple instances of Hive Metastore Server.
12. SSH again to the Hive Metastore Server.
# vi /etc/hive/conf.cloudera.hive/hive-site.xml
Expected output below. Note that the new instances have been added.
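The relevant property will look something like the following (the host names here are illustrative and will match the hosts chosen in step 6; 9083 is the default metastore port):
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://ip-10-7-176-204.ec2.internal:9083,thrift://ip-10-229-16-131.ec2.internal:9083,thrift://ip-10-179-159-209.ec2.internal:9083</value>
</property>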
So how do you know the settings are working? The following is the recommended plan for testing the High Availability of the Hive Metastore.
1. SSH to any DataNode. Connect to HiveServer2 using Beeline.
# beeline -u "jdbc:hive2://ip-10-7-176-204.ec2.internal:10000"
2. On the Cloudera Manager Console click Hive > Hive Metastore Server. Stop the first Hive Metastore Server in the list.
Issue the "show databases" command in the beeline shell of step 1. The command should work normally.
3. Stop the second Hive Metastore Server in the list. Issue the "show databases" command in the beeline shell of step 1. The command should still work normally.
4. Stop the third Hive Metastore Server in the list. Issue the "show databases" command in the beeline shell of step 1. This command should fail, which is normal.
Expected Output from beeline below.
5. Now start a random Hive Metastore Server in the list. Issue the "show databases" command in the beeline shell of step 1. The command should start working normally again.
6. After testing is completed, make sure you start all Hive Metastore Servers in the list.
Enabling Load Balancing and High Availability for HiveServer2
To provide high availability and load balancing for HiveServer2, Hive provides a function called dynamic service discovery, where multiple HiveServer2 instances can register themselves with Zookeeper. Instead of connecting to a specific HiveServer2 directly, clients connect to Zookeeper, which returns a randomly selected registered HiveServer2 instance.
1. Log on to Cloudera Manager. Click Hive > Instances. Click on Add Role Instances.
Click on Select Hosts for HiveServer2.
2. Choose multiple Hosts (at least 2 more to make a total of 3) to configure HiveServer2 on.
Click OK and Continue.
3. You should now see new hosts added as HiveServer2.
Choose the newly added instances and Choose Start.
4. Click on Close. The newly added HiveServer2 instances are now ready for use.
5. Open Hive -> Configuration -> Category -> Advanced.
Find "HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml".
Add a new property as below:
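The screenshot is not reproduced here, but the property being added is the standard HiveServer2 dynamic service discovery switch, which looks like this in the safety valve:
<property>
  <name>hive.server2.support.dynamic.service.discovery</name>
  <value>true</value>
</property>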
6. Go to the Cloudera Manager Home Page and Restart Hive Service.
7. You should now have multiple instances of HiveServer2.
So how do you know the settings are working? The following is the recommended plan for testing the load balancing for HiveServer2.
1. As mentioned before, HiveServer2 High Availability is managed through Zookeeper.
The clients connecting to HiveServer2 now go through Zookeeper. An example, JDBC connect string is as follows. Notice that the JDBC now points to a list of nodes that have Zookeeper on them.
beeline -u "jdbc:hive2://ip-10-7-176-204.ec2.internal:2181,ip-10-229-16-131.ec2.internal:2181,ip-10-179-159-209.ec2.internal:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
2. SSH to any data node. Connect to HiveServer2 using Beeline.
# beeline -u "jdbc:hive2://ip-10-7-176-204.ec2.internal:2181,ip-10-229-16-131.ec2.internal:2181,ip-10-179-159-209.ec2.internal:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
3. The connection gets routed to the HiveServer2 instances in a round robin fashion.
Issue the following command on the HiveServer2 nodes.
# tail -f /var/log/hive/hadoop-cmf-hive-HIVESERVER2-ip-10-7-176-204.ec2.internal.log.out
4. You may issue the beeline command from multiple sources and monitor the HiveServer2 logs.
So how do you know the settings are working? The following is the recommended plan for testing the High Availability for HiveServer2.
1. On the Cloudera Manager Console click Hive > HiveServer2. Stop the first HiveServer2 in the list.
Connection to Beeline using the command below should work normally.
# beeline -u "jdbc:hive2://ip-10-7-176-204.ec2.internal:2181,ip-10-229-16-131.ec2.internal:2181,ip-10-179-159-209.ec2.internal:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
2. Stop the second HiveServer2 in the list.
Connection to Beeline using the command below should still work normally.
3. Stop the third HiveServer2 in the list.
Connection to Beeline using the command below should fail.
4. Start the third HiveServer2 in the list.
Connection to Beeline using the command below should work normally again.
5. After the testing completes, make sure you start all HiveServer2 instances in the list.
This Log Buffer Edition covers many aspects discussed this week in the realms of Oracle, SQL Server and MySQL.
- Oracle and Informatica have a very close working relationship and one of the recent results of this collaboration is the joint project done by Informatica and our Oracle ISV Engineering team to test the performance of Informatica software with Oracle Database 12c In-memory on Oracle SPARC systems.
- The only thing you can do easily is be wrong, and that’s hardly worth the effort.
- Enterprise Manager 13c: What’s New in Database Lifecycle Management.
- SnoopEE is a very interesting grass roots open source Java EE ecosystem project. Akin to NetFlixOSS Eureka it enables microservices discovery, lookup and registration.
- Docker is an open source container technology that became immensely popular in 2014. Docker itself is written in Google’s programming language “Go” and supported on all major Linux distributions (RedHat, CentOS, Oracle Linux, Ubuntu etc.).
- This blog helps you understand Graphical Execution Plans in SQL Server.
- New DAX functions in SQL Server 2016.
- JSON support in SQL Server 2016.
- PowerShell Tool Time: Building Help.
- Datetime vs. Datetime2.
- Peter Gulutzan discusses SQL qualified names.
- It is not new that we can store a JSON content in a normal table text field. This has always been the case in the past. But two key features were missing: filtering based on JSON content attributes and indexing of the JSON content.
- MySQL 5.7 Multi-Source Replication – Automatically Combining Data From Multiple Databases Into One.
- OOM killer vs. MySQL 5.7.10 epic results with over-allocating memory.
- Apache Spark with Air ontime performance data.
I worked at Pythian as a Global Talent Acquisition Coordinator (co-op student) within the company’s human resources function in Ottawa, Canada. At the core of my role, I provided direct support to my colleagues in guiding job applicants through our hiring process. I also had the opportunity to take on and create small projects for myself during my four months there.
Since our past technical CO-OPs have written fantastic blogs about their experiences at Pythian (be sure to read them!), how about I write this one about the business side of working there? Here are my top three reasons why any business student would want to work at Pythian:
1. What better place is there to develop cross-cultural literacy?
With Pythian, I had the pleasure of working with remote and international colleagues for the first time. Top that with actively communicating with a global pool of job applicants on a daily basis. Succeeding in this kind of environment definitely requires you to be cross-culturally literate, which means that you understand how cultural differences—both inside and outside an organization—affect a business’s day-to-day practices.
In business school, we are often reminded about the importance of considering the external environment when a firm goes global (CDSTEP, anyone?), so it was quite eye-opening to see how my experience at Pythian really validated my studies. For example, processes that are of no legal concern in Canada might present a huge obstacle when hiring abroad, and pieces of information that North Americans prefer to keep private are quite openly discussed in other cultures. Talking to candidates from around the world definitely taught me how to think more critically about my communication—not only in terms of what I say, but also how I say it.
2. It feels nice not to be just “the CO-OP student”.
Upon my first day, I immediately felt that I would not be looked at as simply an intern. My team greeted me with open arms (already knowing my name!) and repeatedly emphasized the value of having me on board throughout my term. Within days, I could already understand the significance of my tasks and how they contributed not only to the team, but also to the organization as a whole.
Another great thing about not being just “the CO-OP student” is empowerment. At Pythian, you are really treated like a colleague rather than a subordinate. I never worked for my boss, but I always worked with her. During my term, my team enthusiastically invited me to explore our work processes and offer ideas to make things better. It was pretty cool to see my thoughts being listened to and put into action, even after my departure!
3. There’s nothing more rewarding than stepping out of your comfort zone.
One of the things I hated doing most before working at Pythian was using the phone. If you run a quick Google search, you will find many studies showing that Millennials are not as comfortable with making phone calls as their predecessors were—and I could speak to that, 100 percent! Want me to order a pizza? I’ll do it through the restaurant’s online platform. Am I due for a dentist appointment? I’ll ignore the receptionist’s voicemails until she sends me an e-mail (although my telephonophobia might not be the only reason for that one!).
My colleagues helped me overcome this discomfort by having me conduct reference checks. As embarrassing as it might sound, I actually had to take the greatest leap of faith to get my fingers dialling for the first time. Although I certainly had a mini heart-attack whenever I was asked to help with a reference, I eventually eased into the task with time. While I might still shy away from the telephone now and then, I really do feel accomplished in getting more comfortable with using less texting and more talking!
All in all, my experience at Pythian has been nothing less than fantastic. It has truly been a pleasure to work with a diverse group of people from around the world, and I would be thrilled to see my future take me back there one day. If you’re looking to intern for a company with a global focus and an information-sharing, empowering culture, then you would definitely love to join Pythian!
Often we are asked by our clients about Table Partitioning, and specifically, which tables will be good candidates to be partitioned?
Here are some of the main use cases for Table Partitioning:
- You have the Enterprise Edition of SQL Server 2005 or higher.
- The table contains or will contain more than 5-6 million rows and growing fast, or its size is growing by around 1GB per month or more.
- The FULL backup is taking too long or the backup file is too large and older data is not being updated (i.e.: users can only update data from the last 3 months).
- Data needs to be archived or purged on a regular basis, potentially the current archiving or deletion of data is causing blocks or deadlocks to other processes.
- There is a NOT NULL date column or another NOT NULL sequential column that the table can be partitioned upon.
- It is better if most queries include the partitioning column in their WHERE clauses (e.g., within a date range).
When partitioning a table, here are few things to look into:
- Create a file and a filegroup per partition (even if the files are created in the same place). This way, it is easy to backup (i.e.: FILEGROUP backup), maintain and archive/purge.
- The best way to partition a table is by a date column, because data is usually archived or purged by date. If you do not have such a column, consider adding a column that holds the current date/time when the row is created, with a default of GETDATE(). Using an ID or a computed column may cause too many headaches. (See the sketch after this list for an example of partitioning on a date column.)
If the application returns an error when adding the date column, consider using a view on top of the underlying partitioned table.
- Partitioning requires maintenance:
- To add files, filegroups and partitions on a regular basis and in advance.
- Monitor data growth and potentially modify the partition architecture (i.e.: move from a monthly partition to a weekly or daily partition).
- Archiving/purging partitions on a regular basis. Consider using a SWITCH partition for quick archiving and purging.
- You can defragment indexes per partition.
- Remember that statistics cannot be updated by partition, however, you may want to consider FILTERED INDEXES or FILTERED STATISTICS to avoid updating statistics on the entire table, as well as improving performance in specific cases.
- Consider using MAXDOP <> 1 at the instance level or for specific queries that span multiple partitions, in order to take advantage of parallelism. Configure parallelism with caution.
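To make the moving parts concrete, below is a minimal sketch of a monthly partitioning setup on a date column. All object, filegroup, and boundary names are hypothetical, and the filegroups are assumed to already exist (one per partition, as recommended above):
-- Monthly RANGE RIGHT partition function: each boundary value starts a new month
CREATE PARTITION FUNCTION pf_MonthlyDate (datetime)
AS RANGE RIGHT FOR VALUES ('2016-01-01', '2016-02-01', '2016-03-01');

-- Map the four resulting partitions to four pre-created filegroups
CREATE PARTITION SCHEME ps_MonthlyDate
AS PARTITION pf_MonthlyDate
TO (fg_Archive, fg_201601, fg_201602, fg_201603);

-- The partitioning column has a GETDATE() default and is part of the clustered key
CREATE TABLE dbo.SalesOrder
(
    OrderId   int IDENTITY(1,1) NOT NULL,
    CreatedAt datetime NOT NULL CONSTRAINT DF_SalesOrder_CreatedAt DEFAULT (GETDATE()),
    Amount    decimal(12, 2) NOT NULL,
    CONSTRAINT PK_SalesOrder PRIMARY KEY CLUSTERED (CreatedAt, OrderId)
) ON ps_MonthlyDate (CreatedAt);
Queries that filter on CreatedAt will then touch only the relevant partitions, and an old month can be archived with a fast SWITCH instead of a long-running DELETE.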
There is a lot of information around SQL Server Table Partitioning. Here are some of the useful links:
SQL Server Database Partitioning Myths and Truths
Table Partitioning in SQL Server – The Basics
How To Decide if You Should Use Table Partitioning
Query Processing Enhancements on Partitioned Tables and Indexes
Discover more about our expertise in SQL Server.