
Feed aggregator

Slides and demo script from my APEX command line scripting talk at APEX Connect 2016 in Berlin

Dietmar Aust - Thu, 2016-04-28 11:30
Hi everybody,

I just came back from the DOAG APEX Connect 2016 conference in Berlin ... very nice location, great content and the wonderful APEX community to hang out with ... always a pleasure. This time we felt a little pinkish ;)

As promised, you can download the slides and the demo script (as is) from my site. They are in German, but I will give the same talk in English at KScope 2016 in Chicago in June.

Instructions are included.

See you at KScope in Chicago, #letswreckthistogether.

Cheers and enjoy!
~Dietmar. 

Slides and demo script from my ORDS talk at APEX Connect 2016 in Berlin

Dietmar Aust - Thu, 2016-04-28 11:25
Hi everybody,

I just came back from the DOAG APEX Connect 2016 conference in Berlin ... very nice location, great content and the wonderful APEX community to hang out with ... always a pleasure. This time we felt a little bit pink ;)

As promised, you can download the slides and the demo script (as is) from my site.

Instructions are included.

See you at KScope in Chicago, #letswreckthistogether.

Cheers and enjoy!
~Dietmar. 

Log Buffer #471: A Carnival of the Vanities for DBAs

Pythian Group - Thu, 2016-04-28 09:14

This Log Buffer Edition covers Oracle, SQL Server and MySQL blog posts of the week.

Oracle:

Improving PL/SQL performance in APEX

A utility to extract and present PeopleSoft Configuration and Performance Data

No, Oracle security vulnerabilities didn’t just get a whole lot worse this quarter.  Instead, Oracle updated the scoring metric used in the Critical Patch Updates (CPU) from CVSS v2 to CVSS v3.0 for the April 2016 CPU.  The Common Vulnerability Score System (CVSS) is a generally accepted method for scoring and rating security vulnerabilities.  CVSS is used by Oracle, Microsoft, Cisco, and other major software vendors.

Oracle Cloud – DBaaS instance down for no apparent reason

Using guaranteed restore points to navigate through time

SQL Server:

ANSI SQL with Analytic Functions on Snowflake DB

Exporting Azure Data Factory (ADF) into TFS Source Control

Getting started with Azure SQL Data Warehouse

Performance Surprises and Assumptions : DATEADD()

With the new security policy feature in SQL Server 2016 you can restrict write operations at the row level by defining a block predicate.

MySQL:

How to rename MySQL DB name by moving tables

MySQL 5.7 Introduces a JSON Data Type

Ubuntu 16.04 first stable distro with MySQL 5.7

MariaDB AWS Key Management Service (KMS) Encryption Plugin

MySQL Document Store versus Bug hunter

Categories: DBA Blogs

Standard SQL ? – Oracle REGEXP_LIKE

The Anti-Kyte - Thu, 2016-04-28 05:55

Is there any such thing as ANSI Standard SQL ?
Lots of databases claim to conform to this standard. Recent experience tends to make me wonder whether it’s more just a basis for negotiation.
This view is partly the result of having to juggle SQL between three different SQL parsers in the Cloudera Hadoop infrastructure, each with their own “quirks”.
It’s worth remembering, however, that SQL differs across established relational databases as well, as a recent question from Simon (Teradata virtuoso and Luton Town Season Ticket Holder) demonstrates :

Is there an Oracle equivalent of the Teradata LIKE ANY operator when you want to match against a list of patterns, for example :

like any ('%a%', '%b%')

In other words, can you do a string comparison, including wildcards, within a single predicate in Oracle SQL ?

The short answer is yes, but the syntax is a bit different….

The test table

We’ve already established that we’re not comparing apples with apples, but I’m on a bit of a health kick at the moment, so…

create table fruits as
    select 'apple' as fruit from dual
    union all
    select 'banana' from dual
    union all
    select 'orange' from dual
    union all
    select 'lemon' from dual
/
The multiple predicate approach

Traditionally the search statement would look something like :

select fruit
from fruits
where fruit like '%a%'
or fruit like '%b%'
/

FRUIT 
------
apple 
banana
orange

REGEXP_LIKE

Using REGEXP_LIKE takes a bit less typing and – unusually for a regular expression – fewer non-alphanumeric characters …

select fruit
from fruits
where regexp_like(fruit, '(a)|(b)')
/

FRUIT 
------
apple 
banana
orange

We can also search for multiple substrings in the same way :

select fruit
from fruits
where regexp_like(fruit, '(an)|(on)')
/

FRUIT 
------
banana
orange
lemon 

I know, it doesn’t feel like a proper regular expression unless we’re using the top row of the keyboard.

Alright then, if we just want to get records that start with ‘a’ or ‘b’ :

select fruit
from fruits
where regexp_like(fruit, '(^a)|(^b)')
/

FRUIT 
------
apple 
banana

If instead, we want to match the end of the string…

select fruit
from fruits
where regexp_like(fruit, '(ge$)|(on$)')
/

FRUIT
------
orange
lemon

…and if you want to combine searching for patterns at the start, end or anywhere in a string, in this case searching for records that

  • start with ‘o’
  • or contain the string ‘ana’
  • or end with the string ‘on’

select fruit
from fruits
where regexp_like(fruit, '(^o)|(ana)|(on$)')
/

FRUIT
------
banana
orange
lemon

Finally on this whistle-stop tour of REGEXP_LIKE, for a case insensitive search…

select fruit
from fruits
where regexp_like(fruit, '(^O)|(ANA)|(ON$)', 'i')
/

FRUIT
------
banana
orange
lemon

There’s quite a bit more to regular expressions in Oracle SQL.
For a start, here’s an example of using REGEXP_LIKE to validate a UK Post Code.
There’s also a comprehensive guide here on the PSOUG site.
Now I’ve gone through all that fruit I feel healthy enough for a quick jog… to the nearest pub.
I wonder if that piece of lime they put in the top of a bottle of beer counts as one of my five a day ?


Filed under: Oracle, SQL Tagged: regexp_like

Contemplating Upgrading to OBIEE 12c?

Rittman Mead Consulting - Thu, 2016-04-28 04:00
Where You Are Now

OBIEE 12c has been out for some time, and it seems like most folks are delaying upgrading to OBIEE 12c until the very last minute. Or at least until Oracle decides to put out another major version change of OBIEE, which is understandable. You’ve already spent time and money and devoted hundreds of resource hours to system monitoring, maintenance, testing, and development. Maybe you’ve invested in staff training to try to maximize your ROI in your existing OBIEE purchase. And now, after all this time and effort, you and your team have finally gotten things just right. Your BI engine is humming along, user adoption and stickiness are up, and you don’t have a lot of dead objects clogging up the Web Catalog. Your report hacks and work-arounds have been worked and reworked to become sustainable and maintainable business solutions. Everyone is getting what they want.

Sure, this scenario is part fantasy, but it doesn’t mean that as a BI team lead or member, you’re not always working toward this end. It would be nice to think that the people designing the tools with which we do this work understood the daily challenges and processes we must undergo in order to maintain the precarious homeostasis of our BI ecosystems. That’s where Rittman Mead comes in. If you’re considering upgrading to OBIEE 12c, or are even curious, keep reading. We’re here to help.

So Why Upgrade

Let’s get right down to it. Shoot over here and here to check out what our very own Mark Rittman had to say about the good, the bad, and the ugly of 12c. Our Silvia Rauton did a piece on lots of the nuts and bolts of 12c’s new front-end features. They’re all worth a read. Upgrading to OBIEE 12c offers many exciting new features that shouldn’t be ignored.


How Rittman Mead Can Help

We understand what it is to be presented with so many project challenges. Do you really want to risk the potential perils and pitfalls presented by upgrading to OBIEE 12c? We work both harder and smarter to make this stuff look good. And we get the most out of strategy and delivery via a number of in-house tools designed to keep your OBIEE deployment in tip top shape.

Maybe you want to make sure all your Catalog and RPD content gets ported over without issue? Instead of spending hours on testing every dashboard, report, and other catalog content post-migration, we’ve got the Automated Regression Testing package in our tool belt. We deploy this series of proprietary scripts and dashboards to ensure that everything will work just the way it was, if not better, from one version to the next.

Maybe you’d like to make sure your system will fire on all cylinders or you’d like to proactively monitor your OBIEE implementation. For that we’ve got the Performance Analytics Dashboards, built on the open source ELK stack to give you live, active monitoring of critical BI system stats and the underlying database and OS.


On top of these tools, we’ve got the strategies and processes in place to not only guarantee the success of your upgrade, but to ensure that you and your team remain active and involved in the process.

What to Expect

You might be wondering what kinds of issues you can expect to experience during upgrading to OBIEE 12c (which is to say, nothing’s going to break, right?). Are you going to have to go through a big training curve? Does upgrading to OBIEE 12c mean you’re going to experience considerable resource downtime as your team, or even an outside company, manages this process? To answer this question, I’m reminded of a quote from the movie Fight Club: “Choose your level of involvement.”

While we always prefer to work alongside your BI or IT team to facilitate the upgrade process, we also know that resource time is valuable and that your crew can’t stop what they’re doing until things wrap up. We often find that the more clients are engaged with the process, however, the easier the hand-off is because clients better understand best practices, and IT and BI teams are more empowered for the future.

Learning More about OBIEE 12c

But if you’re like many organizations, maybe you have to stay more hands off and get training after the upgrade is complete. Check out the link here to look over the agenda of our OBIEE 12c Bootcamp training course. Like our hugely popular 11g course, this program is five days of back-to-front instruction taught via a selection of seminars and hands-on labs, designed to impart most everything your team will need to know to continue or begin their successful BI practice.

What we often find is that, in addition to being a thorough and informative course, the Bootcamp is a great way to bring together teams or team members, often dispersed among different offices, under one roof to gain common understanding about how each person plays an important role as a member of the BI process. Whether they handle the ETL, data modeling, or report development, everyone can benefit from what often evolves from a training session into some impromptu team building.

Feel Empowered

If you’re still on the fence about whether or not to upgrade, as I said before, you’re not alone. There are lots of things you need to consider, and rightfully so. You might be thinking, “What does this mean for extra work on the plates of my resources? How can I ensure the success of my project? Is it worth it to do it now, or should I wait for the next release?” Whatever you may be mulling over, we’ve been there, know how to answer the questions, and have some neat tools in our utility belt to move the process along. In the end, I hope to have presented you with some bits to aid you in making a decision about upgrading to OBIEE 12c, or at least the impetus to start thinking about it.

If you’d like any more information or just want to talk more about the ins and outs of what an upgrade might entail, send over an email or give us a call.

The post Contemplating Upgrading to OBIEE 12c? appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Any Questions

Jonathan Lewis - Thu, 2016-04-28 02:56

I’m going to be at the OUG Scotland conference on 22nd June, and one of my sessions is a panel session on Optimisation where I’ll be joined by Joze Senegacnik and Carl Dudley.

The panel is NOT restricted to questions about how the cost based optimizer works (or not); we’re prepared to tackle any questions about making Oracle work faster (or more efficiently – which is not always the same thing). This might be configuration, indexing, other infrastructure etc.; and if we haven’t got a clue we can always ask the audience.

To set the ball rolling on the day it would be nice to have a few questions in advance, preferably from the audience but any real-world problems will be welcome and (probably) relevant to the audience. If you have a question that you think suitable please email it to me or add it as a comment below. Ideally a question will be fairly short and be relevant to many people; if you have to spend a long time setting the scene and supplying lots of specific detail then it’s probably a question that an audience (and the panel) would not be able to follow closely enough to give relevant help.

 


Server Problems : Update

Tim Hall - Thu, 2016-04-28 01:08

This is a follow-on from my server problems post from yesterday…

Regarding the general issue, misiaq came up with a great suggestion, which was to use watchdog. It’s not going to “fix” anything, but if I get a reboot when the general issue happens, that would be much better than having the server sit idle for 5 hours until I wake up.
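
If you fancy trying the same thing, here is a minimal sketch of that kind of setup, assuming a systemd-based RHEL/CentOS-style server and the stock /etc/watchdog.conf options; the package name, device and threshold below are examples to adjust, not a recommendation.

# Install and enable the software watchdog daemon (values below are examples only).
yum install -y watchdog

# The stock /etc/watchdog.conf ships with these settings commented out. Enabling
# something like this tells the daemon to reboot the box if it stops responding
# or the 1-minute load average stays above the threshold.
# (You may need "modprobe softdog" first if there is no hardware watchdog device.)
cat >> /etc/watchdog.conf <<'EOF'
watchdog-device = /dev/watchdog
max-load-1      = 40
EOF

systemctl enable watchdog
systemctl start watchdog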

Archive Storage Services vs Archive Cloud Services

Pat Shuff - Thu, 2016-04-28 01:07
Yesterday we started talking about the cost comparison for storage in the cloud. We briefly touched on the cost of long term archive in the cloud. How much does it cost to back up data for long term archive and what is the best way to do this? Years ago the default way of doing this was to copy your data on disk to a tape unit and put the tape in a box. The box was then put in an environmentally controlled room to extend the lifetime of tape and a person was put on staff to pull the data off the shelf when the data was needed. The data might be a backup of data on disk or a secondary copy just in case the disk failed. Tape was typically used to provide the separation of duties required by Sarbanes-Oxley to keep people who report on financial data separate from the financial data. It also allowed companies to take large volumes of data, like seismic data, and not keep it on spinning disks. The traces were reloaded when geophysicists wanted to look at the data.

The first innovation in this technology was to provide a robot to load and unload tapes as a tape unit gets full or needs to be reloaded. Magazines were created that could hold eight tapes, and the robots had bar code readers so that they could seek to the right tape in the magazine, pull it out of the series of tapes and insert it into the tape unit for reading or writing. Management software got more advanced and understood the bar code values and could sequence the whopping 800 GB of data that could be written to an LTO-4 tape. Again, technology gets updated and the industry moved to LTO-5 and LTO-6 tapes with significantly higher densities. A single LTO-6 tape could hold 2.5 TB. Technology marches on and compression allows us to store 6 TB on these tapes.

If we go back to our 120 TB case that we talked about yesterday, this means that we will need 20 tapes (at $30-$45 for each tape) and $25K for a single tape drive unit. Most tape drive systems support 8 tapes per magazine so we are talking about something that will support three magazines. To support three magazines, we need a second shelf in our tape storage, so the price goes up by about $20K. We are sitting at about $55K to back up our 120 TB and $5.5K in support annually for the hardware. We also need about $1K in tapes for the number of full and incremental backups that we want, which would be $20K for four months of retention before we recycle the tapes. These tapes are good for a dozen re-writes so every three years we will need to repurchase tapes. If we spread the cost of the tape unit, tape drives, and tapes across three years we are looking at $2K/month to back up our 120 TB. We also need to factor in $60/week for tape pickup and storage fees at a service like Iron Mountain and a couple of $250 charges to retrieve tapes in the event of a catastrophic failure to drive tapes back to our data center from cold storage. This bumps the cost to $2.2K/month, which is significantly cheaper than the $10K/month for network storage in our data center or $3.6K/month for cloud storage services. Unfortunately, a tape unit requires someone to care for and feed it, and you will pay that person more than $600/month but not the $7.8K/month which you would with the cloud or disk solutions.

If you had a ton of data to archive you could purchase a tape silo that supported hundreds or thousands of magazines. Unfortunately, this expandability comes at a cost. The tape backup unit grows from an eighth of a rack to twenty full racks. There isn't much in between. You can get an eighth of a rack solution, a full rack solution, or a twenty full rack solution. The larger solution comes in at hundreds of thousands of dollars rather than tens of thousands.

Enter cloud solutions. Amazon and Oracle offer tape solutions in the cloud. Both companies offer the twenty full rack solution but only charge a per-tape charge to consumers. Amazon Glacier charges $7/TB/month to store data. Oracle charges $1/TB/month for the same service. Both companies charge for data restoration and outbound transfer of data. The Amazon Glacier cost of writing 120 TB and reading back 10% of it comes in at $2218/month. This is the same cost as having the tape unit on site. The key difference is that we can recover the data by requesting it from Amazon and get it back in less than four hours. There are no emergency recovery charges, and there are no weekly pickup charges. We can expand the amount that we back up, and the bulk of this cost is reading back the data ($1300). Storage is relatively cheap for our backups; we just need to plan on the cost of recovery and try to limit it, since it is the bulk of the cost.
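
As a quick sanity check on that number, using only the figures quoted above (the split between storage and read-back is inferred from those figures, not taken from the Amazon price list):

STORAGE_TB=120
RATE=7                                  # Glacier storage rate quoted above, $/TB/month
STORAGE_COST=$(( STORAGE_TB * RATE ))   # 120 * 7 = $840/month just to hold the data
TOTAL=2218                              # monthly total quoted above for storage plus a 10% read-back
echo $(( TOTAL - STORAGE_COST ))        # = 1378, the "roughly $1300" restore and outbound component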

We can drop this cost even more using the Oracle Archive Cloud Services. The price from Oracle is $1/TB/month, but the recovery and transmission charges are about the same. The same archive service with Oracle is $1560/month, with roughly $1300 being the charges for restoring and outbound transfer of the data. Unfortunately, Oracle does not offer an un-metered archive service, so we have to guesstimate how much we are going to restore on a monthly basis.

Both services use REST APIs to write, restore, and read data. When a container (Oracle Archive) or bucket (Amazon Glacier) is created, a PUT call is made to the endpoint of the service. The first step required by both is authentication, to provide credentials to the service. Below we show the Oracle authentication and creation process through the REST API.
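
The screenshots from the original post don't come through in this feed, so here is a rough curl sketch of that flow. It assumes the Swift-style authentication endpoint and the X-Storage-Class: Archive header documented for the Oracle Storage Cloud Service; the identity domain, user and password are placeholders, so check the current documentation before relying on the details.

# Step 1: authenticate - trade cloud credentials for an auth token and a storage URL.
curl -s -i https://myidentitydomain.storage.oraclecloud.com/auth/v1.0 \
     -H "X-Storage-User: Storage-myidentitydomain:archive.user@example.com" \
     -H "X-Storage-Pass: MySecretPassword"
# The response headers include X-Auth-Token and X-Storage-Url; keep them for the next calls.
export TOKEN="AUTH_tk..."               # value of X-Auth-Token from the response
export STORAGE_URL="https://..."        # value of X-Storage-Url from the response

# Step 2: create a container, flagging it as Archive (tape) rather than standard (spinning disk).
curl -i -X PUT "$STORAGE_URL/archive_backups" \
     -H "X-Auth-Token: $TOKEN" \
     -H "X-Storage-Class: Archive"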

The important part of this is the archive header extension. This differentiates if the container is spinning disk or if it is tape in the cloud.

Amazon recommends using a Windows-based tool like S3 Browser or CloudBerry, or using a language like Java, .NET, or Ruby with their published SDKs. CloudBerry works for the Oracle Archive as well. When you create a container you have the option of selecting storage or archive as the container type.

Both services allow you to encrypt and compress the data as it is written, with HTTP headers changing the characteristics and parameters of the container. Both services require you to issue a PUT request to write the data to tape. Below we show the Oracle REST API.
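
As a hedged sketch (reusing the placeholder token and storage URL from above), writing an object into the archive container is just a PUT of the file:

# Upload a backup file into the archive container created earlier.
curl -i -T backup_2016_04_full.tar.gz \
     -H "X-Auth-Token: $TOKEN" \
     "$STORAGE_URL/archive_backups/backup_2016_04_full.tar.gz"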

For CloudBerry and the other GUI-based tools, uploading is just a drag and drop from your local file system to the tape storage in the cloud.

Amazon details the readback procedure and the job system that shows the status of the restore request. Oracle has a similarly defined retrieval policy as well as an archive tutorial. Both services offer a four-hour window to allow for restoration. Below is an example of a restore request and of checking the status of the job spawned to load the tape and transfer the data for reading. The file is ready to read when the completedPercentage is 100.
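
The restore is asynchronous on both services. Leaving the retrieval-request call itself to the tutorials mentioned above, the waiting part boils down to a polling loop along these lines; JOB_STATUS_URL is a placeholder for whatever status URI the retrieval request hands back, and completedPercentage is the field mentioned above:

JOB_STATUS_URL="$STORAGE_URL/archive_backups/backup_2016_04_full.tar.gz"   # placeholder
until curl -s -H "X-Auth-Token: $TOKEN" "$JOB_STATUS_URL" | grep -q '"completedPercentage" *: *100'; do
    echo "still restoring from tape, checking again in five minutes..."
    sleep 300        # the services quote a window of up to four hours
done
echo "restore complete - the object can now be read back with a normal GET"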

We can do the same thing with the S3 browser and Amazon Glacier. We need to request the restore, check the job status, then download the restored files. The files change color when they are ready to read.

In summary, we have looked at how to reduce the cost of archives and backups. We looked at using a secondary disk at our data center or another data center. We looked at using on-site tape units. We looked at disk in the cloud. Today we looked at tape in the cloud. It is important to remember that no one of these solutions is the answer; a combination of any or all of them is needed. Daily and weekly backups should happen to a secondary disk locally. This data is most likely to be restored on a regular basis. Once you get a full backup or two under your belt, move the data to another site. It might be spinning disk, it might be tape, but something needs to be offsite in the event of a true catastrophic failure, like a communication link going out (think Dell PowerVault and a thunderstorm) and you lose your primary LUN and the secondary LUN that contains your backups.

The whole idea of offsite backups is not restore but primarily insurance and regulatory compliance. If someone needs to see the old data, it is there. You are betting that you won't need to read it back, and the cloud vendors are counting on that. If you do read it back on a regular basis you might want to significantly increase your budget, pass the charges on to the people who want to read the data back, or look for another solution. Tape storage in the cloud is a great way of archiving data for a long time at a low cost.

Fall 2014 IPEDS Data: Top 30 largest online enrollments per institution

Michael Feldstein - Wed, 2016-04-27 17:21

By Phil Hill

The National Center for Educational Statistics (NCES) and its Integrated Postsecondary Education Data System (IPEDS) provide the most official data on colleges and universities in the United States. This is the third year of data.

Let’s look at the top 30 online programs for Fall 2014 (in terms of total number of students taking at least one online course). Some notes on the data source:

  • I have combined the categories ‘students exclusively taking distance education courses’ and ‘students taking some but not all distance education courses’ to obtain the ‘at least one online course’ category;
  • Each sector is listed by column;
  • IPEDS tracks data based on the accredited body, which can differ for systems – I manually combined most for-profit systems into one institution entity as well as Arizona State University[1];
  • See this post for Fall 2013 Top 30 data and see this post for Fall 2014 profile by sector and state.

Fall 2014 Top 30 Largest Online Enrollments Per Institution
Number Of Students Taking At Least One Online Course (Graduate & Undergraduate Combined)

Top 30 Online Enrollments By Fall 2014 IPEDS Data

The post Fall 2014 IPEDS Data: Top 30 largest online enrollments per institution appeared first on e-Literate.

Fall 2014 IPEDS Data: New Profile of US Higher Ed Online Education

Michael Feldstein - Wed, 2016-04-27 16:12

By Phil Hill

The National Center for Educational Statistics (NCES) and its Integrated Postsecondary Education Data System (IPEDS) provide the most official data on colleges and universities in the United States. I have been analyzing and sharing the data in the initial Fall 2012 dataset and for the Fall 2013 dataset. Both WCET and the Babson Survey Research Group also provide analysis of the IPEDS data for distance education. I highly recommend the following analysis in addition to the profile below (we have all worked together behind the scenes to share data and analyses).

Below is a profile of online education in the US for degree-granting colleges and universities, broken out by sector and for each state.

Please note the following:

  • For the most part distance education and online education terms are interchangeable, but they are not equivalent as DE can include courses delivered by a medium other than the Internet (e.g. correspondence course).
  • I have provided some flat images as well as an interactive graphic at the bottom of the post. The interactive graphic has much better image resolution than the flat images.
  • There are three tabs below in the interactive graphic – the first shows totals for the US by sector and by level (grad, undergrad); the second also shows the data for each state; the third shows a map view.
  • Yes, I know I’m late this year in getting to the data.

By Sector

If you select the middle tab, you can view the same data for any selected state. As an example, here is data for Virginia in table form.

By State Table VA

There is also a map view of state data colored by number of, and percentage of, students taking at least one online class for each sector. If you hover over any state you can get the basic data. As an example, here is a view highlighting Virginia private 4-year institutions.

By State Map VA

Interactive Graphic

For those of you who have made it this far, here is the interactive graphic. Enjoy the data.

The post Fall 2014 IPEDS Data: New Profile of US Higher Ed Online Education appeared first on e-Literate.

Oracle Security And Delphix Paper and Video Available

Pete Finnigan - Wed, 2016-04-27 14:50

I did a webinar with Delphix on 30th March 2016 on USA time. This was a very good session with some great questions at the end from the attendees. I did a talk on Oracle Security in general, securing non-production....[Read More]

Posted by Pete On 01/04/16 At 03:43 PM

Categories: Security Blogs

3 Days of Oracle Security Training In York, UK

Pete Finnigan - Wed, 2016-04-27 14:50

I have just updated the public Oracle Security training dates on our Oracle Security training page to remove the public trainings that have already taken place this year and to add a new training in York for 2016. After the....[Read More]

Posted by Pete On 31/03/16 At 01:53 PM

Categories: Security Blogs

Oracle Data Masking and Secure Test Databases

Pete Finnigan - Wed, 2016-04-27 14:50

My daily work is helping my customers secure their Oracle databases. I do this in many ways from performing detailed security audits of key databases to helping in design of secure lock down policies to creating audit trails to teaching....[Read More]

Posted by Pete On 14/03/16 At 08:45 AM

Categories: Security Blogs

BOF: A Sample Application For Testing Oracle Security

Pete Finnigan - Wed, 2016-04-27 14:50

In my Oracle security training classes I use a couple of sample applications for various demonstrations. I teach people how to perform security audits of Oracle databases, secure coding in PL/SQL, designing audit trail solutions and locking down Oracle. We....[Read More]

Posted by Pete On 10/03/16 At 11:07 AM

Categories: Security Blogs

Two New Oracle Security Presentations Available

Pete Finnigan - Wed, 2016-04-27 14:50

I attended the UKOUG conference last week Monday to Wednesday in Birmingham. This is the first year for three years that it has been back at the ICC in the center of Birmingham. The last two years have seen the....[Read More]

Posted by Pete On 14/12/15 At 08:54 PM

Categories: Security Blogs

Oracle Security Training In York

Pete Finnigan - Wed, 2016-04-27 14:50

We ran a five day Oracle Security training event in York, England from September 21st to September 25th at the Holiday Inn hotel. This proved to be very successful and good fun. The event included back to back teaching by....[Read More]

Posted by Pete On 22/10/15 At 08:49 PM

Categories: Security Blogs

New Presentation - Building Practical Oracle Audit Trails

Pete Finnigan - Wed, 2016-04-27 14:50

I wrote a presentation on designing and building practical audit trails back in 2012 and presented it once and then never again. By chance I did not post the pdf's of these slides at that time. I did though some....[Read More]

Posted by Pete On 01/10/15 At 05:16 PM

Categories: Security Blogs

Protect Your APEX Application PL/SQL Source Code

Pete Finnigan - Wed, 2016-04-27 14:50

Oracle Application Express is a great rapid application development tool where you can write your applications functionality in PL/SQL and create the interface easily in the APEX UI using all of the tools available to create forms and reports and....[Read More]

Posted by Pete On 21/07/15 At 04:27 PM

Categories: Security Blogs

How to recover space from already deleted files

Pythian Group - Wed, 2016-04-27 13:15

Wait, what? Deleted files are gone, right? Well, not so if they’re currently in use, with an open file handle by an application. In the Windows world, you just can’t touch it, but under Linux (if you’ve got sufficient permissions), you can!

Often in the Systems Administration and Site Reliability Engineering world, we will encounter a disk space issue being reported, and there's very little we can do to recover the space. Everything is critically important! We then check for deleted files and find massive amounts of space consumed when someone has previously deleted Catalina, Tomcat, or WebLogic log files while Java had them in use, and we can't restart the processes to release the handles due to the critical nature of the service. Conundrum!

Here at Pythian, we Love Your Data, so I thought I’d share some of the ways we deal with situations like this.

How to recover

First, we grab a list of PIDs that still have deleted files open. Then we iterate over their open file descriptors to see which deleted files each process is still holding on to.

PIDS=$(lsof | awk '/deleted/ { if ($7 > 0) { print $2 }; }' | uniq)
for PID in $PIDS; do ll /proc/$PID/fd | grep deleted; done

With great care, this could be scripted to automatically null every deleted file that is still held open; a rough sketch follows.
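
Something along these lines would do it (a sketch only: it blindly truncates every deleted-but-still-open file it finds, so review the list produced by the commands above before letting anything like this loose on a production box):

PIDS=$(lsof | awk '/deleted/ { if ($7 > 0) { print $2 }; }' | uniq)
for PID in $PIDS; do
    # every file descriptor of this process that still points at a deleted file
    for FD in $(ls -l /proc/$PID/fd 2>/dev/null | awk '/\(deleted\)/ { print $9 }'); do
        echo "Truncating PID $PID fd $FD -> $(readlink /proc/$PID/fd/$FD)"
        : > /proc/$PID/fd/$FD    # same effect as cat /dev/null > ...; the space is released immediately
    done
done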

Worked example

1. Locating deleted files:

[root@importantserver1 usr]# lsof | head -n 1 ; lsof | grep -i deleted
 COMMAND   PID   USER   FD  TYPE DEVICE SIZE/OFF NODE   NAME
 vmtoolsd  2573  root   7u  REG  253,0  9857     65005  /tmp/vmware-root/appLoader-2573.log (deleted)
 zabbix_ag 3091  zabbix 3wW REG  253,0  4        573271 /var/tmp/zabbix_agentd.pid (deleted)
 zabbix_ag 3093  zabbix 3w  REG  253,0  4        573271 /var/tmp/zabbix_agentd.pid (deleted)
 zabbix_ag 3094  zabbix 3w  REG  253,0  4        573271 /var/tmp/zabbix_agentd.pid (deleted)
 zabbix_ag 3095  zabbix 3w  REG  253,0  4        573271 /var/tmp/zabbix_agentd.pid (deleted)
 zabbix_ag 3096  zabbix 3w  REG  253,0  4        573271 /var/tmp/zabbix_agentd.pid (deleted)
 zabbix_ag 3097  zabbix 3w  REG  253,0  4        573271 /var/tmp/zabbix_agentd.pid (deleted)
 java      23938 tomcat 1w  REG  253,0  0        32155  /opt/log/tomcat/catalina.out (deleted)
 java      23938 tomcat 2w  REG  253,0  45322216 32155  /opt/log/tomcat/catalina.out (deleted)
 java      23938 tomcat 9w  REG  253,0  174      32133  /opt/log/tomcat/catalina.2015-01-17.log (deleted)
 java      23938 tomcat 10w REG  253,0  57408    32154  /opt/log/tomcat/localhost.2016-02-12.log (deleted)
 java      23938 tomcat 11w REG  253,0  0        32156  /opt/log/tomcat/manager.2014-12-09.log (deleted)
 java      23938 tomcat 12w REG  253,0  0        32157  /opt/log/tomcat/host-manager.2014-12-09.log (deleted)
 java      23938 tomcat 65w REG  253,0  363069   638386 /opt/log/archive/athena.log.20160105-09 (deleted)

2. Grab the PIDs:

[root@importantserver1 usr]# lsof | awk '/deleted/ { if ($7 > 0) { print $2 }; }' | uniq
 2573
 3091
 3093
 3094
 3095
 3096
 3097
 23938

Show the deleted files that each process still has open (and the space they are still consuming):

[root@importantserver1 usr]# export PIDS=$(lsof | awk '/deleted/ { if ($7 > 0) { print $2 }; }' | uniq)
[root@importantserver1 usr]# for PID in $PIDS; do ll /proc/$PID/fd | grep deleted; done
 lrwx------ 1 root root 64 Mar 21 21:15 7 -> /tmp/vmware-root/appLoader-2573.log (deleted)
 l-wx------ 1 root root 64 Mar 21 21:15 3 -> /var/tmp/zabbix_agentd.pid (deleted)
 l-wx------ 1 root root 64 Mar 21 21:15 3 -> /var/tmp/zabbix_agentd.pid (deleted)
 l-wx------ 1 root root 64 Mar 21 21:15 3 -> /var/tmp/zabbix_agentd.pid (deleted)
 l-wx------ 1 root root 64 Mar 21 21:15 3 -> /var/tmp/zabbix_agentd.pid (deleted)
 l-wx------ 1 root root 64 Mar 21 21:15 3 -> /var/tmp/zabbix_agentd.pid (deleted)
 l-wx------ 1 root root 64 Mar 21 21:15 3 -> /var/tmp/zabbix_agentd.pid (deleted)
 l-wx------ 1 tomcat tomcat 64 Mar 21 21:15 1 -> /opt/log/tomcat/catalina.out (deleted)
 l-wx------ 1 tomcat tomcat 64 Mar 21 21:15 10 -> /opt/log/tomcat/localhost.2016-02-12.log (deleted)
 l-wx------ 1 tomcat tomcat 64 Mar 21 21:15 11 -> /opt/log/tomcat/manager.2014-12-09.log (deleted)
 l-wx------ 1 tomcat tomcat 64 Mar 21 21:15 12 -> /opt/log/tomcat/host-manager.2014-12-09.log (deleted)
 l-wx------ 1 tomcat tomcat 64 Mar 21 21:15 2 -> /opt/log/tomcat/catalina.out (deleted)
 l-wx------ 1 tomcat tomcat 64 Mar 21 21:15 65 -> /opt/log/archive/athena.log.20160105-09 (deleted)
 l-wx------ 1 tomcat tomcat 64 Mar 21 21:15 9 -> /opt/log/tomcat/catalina.2015-01-17.log (deleted)

Null the specific files (here, we target the catalina.out file):

[root@importantserver1 usr]# cat /dev/null > /proc/23938/fd/2
Alternative ending

Instead of deleting the contents to recover the space, you might be in the situation where you need to recover the contents of the deleted file. If the application still has the file descriptor open on it, you can then recover the entire file to another one (dd if=/proc/23938/fd/2 of=/tmp/my_new_file.log) – assuming you have the space to do it!

Conclusion

While it’s best not to get in the situation in the first place, you’ll sometimes find yourself cleaning up after someone else’s good intentions. Now, instead of trying to find a window of “least disruption” to the service, you can recover the situation nicely. Or, if the alternative solution is what you’re after, you’ve recovered a file that you thought was long since gone.

Categories: DBA Blogs

Deploy Docker containers using AWS Opsworks

Pythian Group - Wed, 2016-04-27 12:51
Introduction

This post is about how to deploy Docker containers on AWS using Opsworks and Docker Compose.
AWS and Docker need no introduction, so let's quickly introduce Opsworks and Docker Compose.

Opsworks

Opsworks is a great tool provided by AWS which runs Chef recipes on your instances. If the instance is an AWS instance, you don't pay anything for using Opsworks, but you can also manage instances outside of AWS for a flat cost, just by installing the agent and registering the instance with Opsworks.

Opsworks instance types

We have three different types of instances on Opsworks:

1. 24x7x365
Runs non-stop, around the clock.

2. Time based
Runs on a predefined schedule, such as work hours.

3. Load based
Scales up and down according to preconfigured metrics.

You can find more details here.

Custom JSON

Opsworks provides Chef data bags (variables to be used in your recipes) via Custom JSON, and that's the key to this solution. We will manage everything just by changing a JSON file. This file can easily become part of your development pipeline.

Life cycle

Opsworks has five lifecycle events:
1. Setup
2. Configure
3. Deploy
4. Undeploy
5. Shutdown
We will use Setup, Deploy, and Shutdown. You can find more details about the Opsworks lifecycle here.

Docker Compose

Docker Compose was originally developed under the Fig project. Nowadays, Fig is deprecated and docker-compose has become a standard component of the Docker toolset.
Using docker-compose, you can manage all containers and their attributes (links, shared volumes, etc.) on a Docker host. Docker-compose can only manage containers on the local host where it runs; it cannot orchestrate Docker containers across hosts.
All configuration is specified in a YML file.
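
Purely for illustration (this is not the YML that the cookbook in this post generates from the Custom JSON), a docker-compose file in the version-1 format current at the time can be as small as this, which is roughly what would serve the "It works!" page used in the test step later on:

# Hypothetical single-service definition, written out by hand.
cat > docker-compose.yml <<'EOF'
web:
  image: httpd:2.4       # example image; in this solution the image really comes from the Custom JSON
  ports:
    - "80:80"
EOF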

Chef recipes

Using Opsworks, you will manage all hosts with just one small Chef cookbook. All the magic is in translating the Custom JSON from Opsworks into the YML file used by docker-compose.
The cookbook will install all components (Docker, pip, and docker-compose), translate the Custom JSON into the YML file, and send commands to docker-compose.
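
In other words, once the YML file is in place, the recipes mostly wrap the usual docker-compose commands at each lifecycle event. Roughly (an approximation of what the cookbook does, not its actual code):

# Setup    -> install Docker, pip and docker-compose, then render docker-compose.yml from the Custom JSON
# Deploy   -> create or refresh the containers described in the YML file
docker-compose up -d
# Shutdown -> stop the running containers
docker-compose stop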

Hands On

Let’s stop talking and see things happen.

We can split it into five steps:

  1. Resources creation
    1. Opsworks Stack
        1. Log into your AWS account
        2. Go to Services -> Management Tools -> Opsworks
          Accessing Opsworks menu
        3. Click on Add stack (if you already have stacks on Opsworks) or Add your first stack (if it’s the first time you are creating stacks on opsworks)
        4. Select type Chef 12 stack
          Note: The Chef cookbook used in this example only supports Chef12
        5. Fill out stack information
          aws_opsworks_docker_image02
          Note:
          – You can use any name as stack name
          – Make sure the VPC selected is properly configured
          – This solution supports Amazon Linux and Ubuntu
          – Repository URL https://bitbucket.org/tnache/opsworks-recipes.git
        6. Click on advanced if you want to change something. Changing “Use OpsWorks security groups” to No can be a good idea when you need to communicate with instances which are running outside of Opsworks
        7. Click on “Add stack”
    2. Opsworks layer
        1. Click on “Add a layer”
        2. Set Name, Short name and Security groups. I will use webserver

      Note:
      Use a simple name because we will use this name in the next steps.
      The name web is reserved for AWS internal use.

        3. Click on “Add layer”

      aws_opsworks_docker_image03

    3. Opsworks Instance
        1. Click on “Instances” on the left panel
        2. Click on “Add an instance”
        3. Select the size (instance type)
        4. Select the subnet
        5. Click on “Add instance”

      aws_opsworks_docker_image05

  2. Resources configuration
    1. Opsworks stack
        1. Click on “Stack” on the left panel
        2. Click on “Stack Settings”
        3. Click on “Edit”
        4. Find the Custom JSON field and paste the content of the file below

      custom_json_1

      5. Click on “Save”
    2. Opsworks layer
        1. Click on “Layers” on the left panel
        2. Click on “Recipes”
        3. Type docker-compose and press Enter under Setup
        4. Type docker-compose::deploy and press Enter under Deploy
        5. Type docker-compose::stop and press Enter under Shutdown
        6. Click on “Save”

      aws_opsworks_docker_image04

  3. Start
    1. Start instance
        1. Click on start

      aws_opsworks_docker_image06

  4. Tests
    Note: Wait until the instance reaches the online state

      1. Open your browser and you should be able to see “It works!”
      2. Checking running containers

    aws_opsworks_docker_image07

  5. Management
      1. Change the Custom JSON to the file below (see Resources configuration => Opsworks stack)

    custom_json_2

      2. Click on “Deployments” on the left panel
      3. Click on “Run Command”
      4. Select “Execute Recipes” as “Command”
      5. Type “docker-compose::deploy” as “Recipes to execute”
      6. Click on “Execute Recipes”

    Note: Wait until the deployment finishes

      7. Checking running containers

    aws_opsworks_docker_image08

Categories: DBA Blogs