
Pythian Group

Official Pythian Blog - Love Your Data

Tracking ASM Metrics

Wed, 2015-06-03 14:36
Collecting ASM IO Metrics

If you are an Oracle DBA, then it is quite likely that Oracle ASM is used as the storage management for at least some of the databases you manage.

Eventually you will want to see ASM metrics that can be used to track the performance of the storage over time.  There are built-in data dictionary views that allow monitoring ASM IO performance at the database, instance, diskgroup and disk level.  These show only the current cumulative values, however, so they must be collected and saved over time to be of much use.
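For example, a single query of the cumulative counters illustrates the point (a sketch assuming the V$ASM_DISK_STAT and V$ASM_DISKGROUP_STAT views are available in your release):

# One-off snapshot of the cumulative ASM IO counters; the values only become
# meaningful when sampled repeatedly and diffed over time.
sqlplus -s "/ as sysdba" <<'EOF'
set pagesize 100 linesize 200
select g.name diskgroup, d.name disk,
       d.reads, d.writes, d.read_time, d.write_time
  from v$asm_disk_stat d
  join v$asm_diskgroup_stat g on g.group_number = d.group_number
 order by 1, 2;
EOF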

Point your browser at your favorite search engine and search for the terms Oracle, ASM and metrics.  Quite likely one of the most relevant hits near the top of the list will be bdt’s oracle blog. Bertrand Drouvot has created asm_metrics.pl, an excellent utility for monitoring and reporting on ASM IO metrics.  I have used this utility several times now with good results.

As good as asm_metrics.pl is, however, I have on several occasions wanted something slightly different.  The asm_metrics.pl script outputs a formatted report, while my preference is to save the data in a CSV file.  By saving all available metrics in this manner it is not necessary to decide ahead of time how the data is to be used and then choose the appropriate command line options.

When all of the metrics are preserved as data, there is no limitation on the types of post-collection analysis that can be performed. If, for instance, you would like to break down all of the data by disk for a particular disk group, the data will be at your disposal to do so.

And so, the following collection of scripts was created.  First, I will provide a brief description of each script, followed by detailed usage.

asm-metrics-collector.pl

This is the main Perl script used for collecting ASM metrics data.

asm-metrics-aggregator.pl

This Perl script can be used as a filter to aggregate previously collected metrics.

asm-metrics-chart.pl

Output of either of the previous scripts can be used as input to this script. asm-metrics-chart.pl will create an XLSX format Excel file with line charts for the columns you select.

These scripts try not to make too many decisions for you. Their job is simply to get the data, perform some aggregations as needed, and chart the data.

While there are many options that could be added to the scripts, I have attempted to keep from making them unnecessarily complicated.  Standard Linux command line utilities can be used for many operations that would otherwise require a fair bit of code complexity.

Don’t have Linux? In this day of free hypervisors, anyone can install and use Linux.

Using the ASM Scripts

The following are more detailed descriptions of each script and their usage.

For each script the --help option will provide some explanation of the options, while the --man option may be used to show extended help.

The Perl Environment

All recent versions of the Oracle RDBMS software ship with Perl installed.  The Oracle based Perl installations already include the DBI and DBD::Oracle modules which are necessary for connecting to the database.

If you don’t already have a version of Perl with the DBI and DBD::Oracle modules installed, simply use the one installed with Oracle:

 $ORACLE_HOME/perl/bin/perl asm-metrics-collector.pl ...

The asm-metrics-chart.pl script requires the module Excel::Writer::XLSX.

If there is a local copy of Perl that you can add modules to, simply install the Excel::Writer::XLSX module.

Otherwise the module can be installed into your home directory.  Setting the PERL5LIB environment variable will allow the script to find the local copy of the module.
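A minimal sketch of that local install, assuming the cpanm client is available (any local-install method works just as well):

# Install Excel::Writer::XLSX into a per-user directory.
cpanm --local-lib=$HOME/perl5 Excel::Writer::XLSX

# Let Perl find the locally installed module before running the chart script.
export PERL5LIB=$HOME/perl5/lib/perl5:$PERL5LIB
$ORACLE_HOME/perl/bin/perl asm-metrics-chart.pl --help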

asm-metrics-collector.pl

This is the main Perl script used for collecting ASM metrics data.

Connecting to the database

Connections can be made either via bequeath or TNS as SYSDBA or any user with appropriate privileges.

The following example makes a bequeath connection to the local instance specified in $ORACLE_SID.  As no username or database is specified in the options this connection is the same as ‘/ as sysdba’.

asm-metrics-collector.pl -interval 10 --iterations 20 --delimiter ,

This is the most basic usage of this script.   The --interval parameter refers to the number of seconds between snapshots.

With this basic usage, not all columns are captured.  The --help and --man options display the optional columns.

This next example shows how to include optional columns in the output:

 asm-metrics-collector.pl -interval 5 -iterations 5 -delimiter , --opt-cols DISK_NAME COLD_BYTES_WRITTEN PATH > asm_metrics.csv

And here is an example that causes all available columns to be output:

  asm-metrics-collector.pl -interval 5 -iterations 5 -delimiter , --opt-cols ALL-COLUMNS > asm_metrics_allcols.csv

All output is to STDOUT and so must be redirected as needed.
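For example, a longer unattended collection run might look like this (a sketch; the interval and iteration counts are arbitrary):

# Collect every 30 seconds for 24 hours, keep running after logout,
# and send any errors to a separate file.
nohup $ORACLE_HOME/perl/bin/perl asm-metrics-collector.pl \
  --interval 30 --iterations 2880 --delimiter , --opt-cols ALL-COLUMNS \
  > asm_metrics_$(date +%Y%m%d).csv 2> asm_metrics.err &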

You may have noticed there is no password on the command line and indeed, there is no provision for a password on the command line.  In the interest of security, the password must either be entered from the keyboard or sent to the script via STDIN.  Following are some examples of connecting with a user that requires a password.

This first example will require you to type in the password as the script appears to hang:

asm-metrics-collector.pl --database orcl --username scott --sysdba > my-asm.csv 

The following two examples get the password from a file:

 asm-metrics-collector.pl --database orcl --username scott --sysdba < password.txt > my-asm.csv
cat password.txt | asm-metrics-collector.pl --database orcl --username scott --sysdba > my-asm.csv 

And just for fun, this method works with the Bash Shell:

asm-metrics-collector.pl --database orcl --username scott --sysdba <<< scott > my-asm.csv
asm-metrics-aggregator.pl

By default the output of asm-metrics-collector.pl will include a row for each disk in each diskgroup.  Should you wish to observe and chart the read and write times at diskgroup level, the presence of one row per disk causes that operation to be somewhat difficult.  The following brief shell script will read output created by asm-metrics-collector.pl, aggregate the chosen columns at the requested level and write it to STDOUT in CSV format.

The aggregation level is determined by the --grouping-cols option, while the columns to aggregate are specified with the --agg-cols option.


INPUT_DATAFILE='logs/asm-oravm-data-20150523-172525.csv'
OUTPUT_DATAFILE='logs/asm-oravm-aggtest.csv'

./asm-metrics-aggregator.pl  \
        --grouping-cols DISKGROUP_NAME \
        --agg-cols READS WRITES READ_TIME WRITE_TIME \
        --display-cols  DISPLAYTIME ELAPSEDTIME DBNAME DISKGROUP_NAME READS \
                WRITES READ_TIME AVG_READ_TIME WRITE_TIME \
                AVG_WRITE_TIME BYTES_READ BYTES_WRITTEN  \
        -- ${INPUT_DATAFILE}   \
        > ${OUTPUT_DATAFILE}

You may be wondering about the purpose of the double dash "--" that appears in the command line.

This is how the Perl option processor Getopt::Long knows that there are no more options on the command line.  As the --display-cols option can take several arguments, some method is needed to mark the end of those arguments when the text that follows is not to be processed as part of the option.  The "--" is the option list terminator, and will be well known to long time Unix and Linux users.

This script also does one set of calculations behind the scenes; the average read and write times are calculated at the current aggregation level.

What if you don’t know which columns are available in the input file? Simply use the --list-available-cols option with an input file and the script will output the available columns and exit.


> ./asm-metrics-aggregator.pl --list-available-cols logs/asm-oravm-data-20150523-172525.csv
DISPLAYTIME
SNAPTIME
ELAPSEDTIME
INSTNAME
DBNAME
GROUP_NUMBER
DISK_NUMBER
DISKGROUP_NAME
READS
WRITES
READ_TIME
AVG_READ_TIME
WRITE_TIME
AVG_WRITE_TIME
BYTES_READ
BYTES_WRITTEN
DISK_NAME
READ_ERRS

As you have probably noticed, input to this script is from STDIN.

asm-metrics-chart.pl

Now it all gets more interesting as we can visualize the data collected.  There are many methods available to accomplish this.  I chose Excel as it is ubiquitous and easy to work with.  The previously discussed Perl module Excel::Writer::XLSX  makes it relatively easy to create Excel files complete with charts, directly from CSV data.

The following command line will create the default Excel file asm-metrics.xlsx with line charts for reads and writes. The --worksheet-col option specifies that each diskgroup will be shown on a separate worksheet.

asm-metrics-chart.pl --worksheet-col DISKGROUP_NAME \
--chart-cols READS WRITES -- logs/asm-oravm-20150512_01-agg-dg.csv

This next example provides a non-default name for the Excel file:

./asm-metrics-chart.pl \
        --worksheet-col DISKGROUP_NAME \
        --spreadsheet-file oravm-asm-metrics.xlsx \
        --chart-cols READS AVG_READ_TIME WRITES AVG_WRITE_TIME  \
        -- logs/asm-oravm-aggtest.csv
Modifying the Data with Command Line Tools

Suppose you have two disk groups, DATA and FRA.  The DATA disk group has five disks and FRA has two disks. In addition there are two RAC databases with two instances each. For each iteration the data collected by asm-metrics-collector.pl will have 5 rows for DATA and 2 rows for FRA per instance; multiplied by the four instances, that is 28 rows per iteration.

Should you wish to see only the DATA rows for a single instance, and then aggregate these, it can be done via a combination of command line tools and these scripts.

First let’s get the list of columns and number them so we know how to find the instance:


> ./asm-metrics-aggregator.pl --list-available-cols logs/asm-oravm-20150512_01.csv|nl
     1  DISPLAYTIME
     2  SNAPTIME
     3  ELAPSEDTIME
     4  INSTNAME
     5  DBNAME
     6  GROUP_NUMBER
     7  DISK_NUMBER
     8  HEADER_STATUS
     9  REDUNDANCY
    10  OS_MB
    11  TOTAL_MB
    12  FREE_MB
    13  HOT_USED_MB
    14  COLD_USED_MB
    15  DISKGROUP_NAME
    16  DISK_NAME
    17  FAILGROUP
    18  LABEL
    19  PATH
    20  UDID
    21  PRODUCT
    22  CREATE_DATE
    23  MOUNT_DATE
    24  REPAIR_TIMER
    25  PREFERRED_READ
    26  VOTING_FILE
    27  SECTOR_SIZE
    28  FAILGROUP_TYPE
    29  READS
    30  WRITES
    31  READ_ERRS
    32  WRITE_ERRS
    33  READ_TIME
    34  AVG_READ_TIME
    35  WRITE_TIME
    36  AVG_WRITE_TIME
    37  BYTES_READ
    38  BYTES_WRITTEN
    39  HOT_READS
    40  HOT_WRITES
    41  HOT_BYTES_READ
    42  HOT_BYTES_WRITTEN
    43  COLD_READS
    44  COLD_WRITES
    45  COLD_BYTES_READ
    46  COLD_BYTES_WRITTEN

So now that we know that INSTNAME is column #4, let’s see what the available instances are.


> cut -f4 -d, logs/asm-oravm-20150512_01.csv | sort -u
INSTNAME
oravm1
oravm2

To get the headers and the data for only instance oravm1, and only for the DATA diskgroup, aggregated by diskgroup:


(head -1 logs/asm-oravm-20150512_01.csv ; grep ',oravm1,.*,DATA,' logs/asm-oravm-20150512_01.csv ) | \
  ./asm-metrics-aggregator.pl --grouping-cols DISKGROUP_NAME \
  --agg-cols READS WRITES READ_TIME WRITE_TIME \
  --display-cols DISPLAYTIME ELAPSEDTIME DBNAME DISKGROUP_NAME READS \
                 WRITES READ_TIME AVG_READ_TIME WRITE_TIME \
                 AVG_WRITE_TIME BYTES_READ BYTES_WRITTEN \
  -- > new-file.csv

That should get you started on modifying the data via standard command line utilities.

To Do

These scripts do not know about any of the ASM enhancements found in 12c, so there is some room for improvement there. So far they have fit my needs, but you may have some ideas to make these scripts better. Or (heaven forbid) you found a bug. Either way, please try them out and let me know.

The scripts can all be found in asm-metrics on github.

Discover more about Pythian’s expertise in Oracle, and about Jared Still.

Categories: DBA Blogs

Handling the Leap Second – Linux

Tue, 2015-06-02 13:31

Last week I published a blog post titled “Are You Ready For the Leap Second?“, and by looking at the blog statistics I could tell that many of you read it, and that’s good, because you became aware of the risks that the leap second on June 30th, 2015 introduces. On the other hand, I must admit I didn’t provide clear instructions that you could use to avoid all possible scenarios. I’ve been looking into this for a good while and I think the official RedHat announcements and My Oracle Support notes are confusing. This blog post is my attempt to explain how to avoid the possible issues.

Update (June 9th, 2015): Made it clear in the text below that ntp’s slewing mode (ntp -x) is mandatory for Oracle Grid Infrastructure and therefore for RAC too.

The complexity of solving these problems comes from the fact that there are multiple contributing factors. The behavior of the system will depend on a combination of these factors.
In the coming sections I’ll try to explain what exactly you should pay attention to and what you should do to avoid problems. The content of this post is fully theoretical and based on the documentation I’ve read. I have NOT tested it, so it may behave differently. Please, if you notice any nonsense in what I’m writing, let me know by leaving a comment!

1. Collect the data

The following information will be required for you to understand what you’re dealing with:

  1. OS version and kernel version:
    $ cat /etc/issue
    Oracle Linux Server release 6.4
    Kernel \r on an \m
    
    $ uname -r
    2.6.39-400.17.1.el6uek.x86_64
    
  2. Is NTP used and which version of NTP is installed:
    $ ps -ef | grep ntp
    oracle    1627  1598  0 02:06 pts/0    00:00:00 grep ntp
    ntp       7419     1  0 May17 ?        00:00:17 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
    
    $ rpm -qa | grep ntp-
    ntp-4.2.4p8-3.el6.x86_64
    
  3. Version of tzdata and the configuration of /etc/localtime:
    $ rpm -qa | grep tzdata-
    tzdata-2012j-1.el6.noarch
    
    $ file /etc/localtime
    /etc/localtime: timezone data, version 2, 5 gmt time flags, 5 std time flags, no leap seconds, 235 transition times, 5 abbreviation chars
    
2. Check the kernel

Here’s a number of bugs that are related to leap second handling on Linux:

  1. System hangs on printing the leap second insertion message – This bug will hang your server at the time when NTP notifies the kernel about the leap second, and that can happen anytime on the day before the leap second (in our case anytime on June 30th, 2015). It’s fixed in kernel-2.6.9-89.EL (RHEL4) and kernel-2.6.18-164.el5 (RHEL5).
  2. Systems hang due to leap-second livelock – Because of this bug systems repeatedly crash due to the NMI Watchdog detecting a hang. This becomes effective when the leap second is added. The note doesn’t exactly specify which versions fix the bug.
  3. Why is there high CPU usage after inserting the leap second? – This bug causes futex-active applications (e.g. Java) to start consuming 100% CPU. Based on what’s discussed in this email in the Linux Kernel Mailing List Archive, it’s triggered by a mismatch between the timekeeping and hrtimer structures, which the leap second introduces. The document again does not clearly specify which versions fix the problem; however, this "Kernel Bug Fix Update" mentions these symptoms as fixed in 2.6.32-279.5.2.el6.

MOS Note: “How Leap Second Affects the OS Clock on Linux and Oracle VM (Doc ID 1453523.1)” mentions that kernels 2.4 to 2.6.39 are affected, but I’d like to know the exact versions. I’ve searched a lot, but I haven’t found much, so here are the ones that I did find:

I’m quite sure that by reading this you’re thinking: “What a mess!” And that’s true. I believe the safest approach is to be on kernel 2.6.39-200.29.3 or higher.

3. NTP is used

You’re using NTP if the ntpd process is running. In the outputs displayed above it’s running and has the following arguments: ntpd -u ntp:ntp -p /var/run/ntpd.pid -g. The behavior of the system during the leap second depends on which version of NTP you use and on the environment.

  • ntp-4.2.2p1-9 or higher (but not ntp-4.2.6p5-19.el7, ntp-4.2.6p5-1.el6 and ntp-4.2.6p5-2.el6_6) configured in slew mode (with option “-x”) – The leap second is not added by kernel, but the extra time is added by increasing the length of each second over ~2000 second period based on the differences of the server’s time and the time from NTP after the leap second. The clock is never turned backward. This is the configuration you want because:
    • Time never goes back, so there will be no impact to the application logic.
    • Strange time values like 23:59:60 are not used, so you won’t hit any DATE and TIMESTAMP datatype limitation issues.
    • As the leap second is not actually added, it should be possible to avoid all 3 kernel bugs that I mentioned by using this configuration. In many cases updating NTP is much simpler than a kernel upgrade, so if you’re still on an affected kernel, use this option to bypass the bugs.

    The drawbacks of this configuration are related to the fact that the leap second is smeared out over a longer period of time:

    • This probably is not usable for applications requiring very accurate time.
    • This may not be usable for some clusters where all nodes must have exactly the same clock time, because NTP updates are usually received every 1 to 18 minutes; add the ~2000 seconds of time adjustment in slew mode and the clocks could be off for as long as ~50 minutes. Please note, the slewing mode (ntp -x) is mandatory for Oracle Grid Infrastructure as documented in the Oracle® Grid Infrastructure Installation Guides 11g Release 2 and 12c Release 1.
  • ntp-4.2.2p1-9 or higher configured without slew mode (no "-x" option) – NTP will notify the kernel about the upcoming leap second some time during June 30th, and the leap second will be added as an extra "23:59:59" second (time goes backward by one second). You will want to be on a kernel with all the fixes present.
  • below ntp-4.2.2p1-9 – NTP will notify the kernel about the upcoming leap second some time during June 30th, and depending on the environment, the leap second will be added as an extra "23:59:59" second (time goes backward by one second), or the time will freeze for one second at midnight.

Extra precaution: if you’re running NTP, make sure your /etc/localtime does not include leap seconds by running "file /etc/localtime" and confirming that the output includes the message "no leap seconds".

4. NTP is NOT used

If NTP is not used, the time is managed locally by the server. The time is most likely off already, so I really do recommend enabling NTP in slew mode as described above; this is the right moment to do so.

If you have tzdata-2015a or higher installed, the information about the leap second on June 30th, 2015 is also available locally on the server, but that doesn’t yet mean it’s going to be added. Also, if NTP is not used and the leap second is added locally, it will appear as "23:59:60", which is an unsupported value for DATE and TIMESTAMP columns, so this is the configuration you don’t want to use. Here are the different conditions:

  • You’re below tzdata-2015a – the leap second will not be added.
  • You’re on tzdata-2015a or higher and "file /etc/localtime" includes the message "X leap seconds", where X is a number – the leap second will be added as "23:59:60" and will cause problems for your DATE/TIMESTAMP datatypes. You don’t want this configuration. Disable the leap second by copying the appropriate timezone file from /usr/share/zoneinfo over /etc/localtime, as shown in the sketch after this list. It’s a dynamic change, no reboots needed. (Timezone files including the leap seconds are located in /usr/share/zoneinfo/right)
  • “file /etc/localtime” includes message “no leap seconds” – the leap second will not be added.
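A minimal sketch of that check and fix (assuming the server’s timezone is America/New_York; substitute your own zone):

# Check whether the active timezone file carries leap second records.
file /etc/localtime

# If it reports "N leap seconds", overwrite it with the leap-second-free
# zone file; the change takes effect immediately, no reboot needed.
cp /usr/share/zoneinfo/America/New_York /etc/localtime
file /etc/localtime    # should now report "no leap seconds"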
The recommendations

Again I must say this is a theoretical summary on how to avoid leap second issues on Linux, based on what’s written above. Make sure you think about it before implementing as you’re the one who knows your own systems:

  • Single node servers, or clusters where time between nodes can differ – Upgrade to ntp-4.2.2p1-9 or higher and configure it in slew mode (option “-x”). This should avoid the kernel bugs too, but due to lack of accurate documentation it’s still safer to be on kernel 2.6.39-200.29.3 or higher.
  • Clusters or applications with very accurate time requirements – NTP with slew mode is not suitable, as it’s unpredictable when it will start adjusting the time on each server. You want to be on kernel 2.6.39-200.29.3 or higher. NTP should be enabled. The leap second will be added as an extra "23:59:59" second (the time will go backward by one second). Oracle Database/Clusterware should detect the time drifting and should deal with it. Check MOS for any bugs related to time drifting for the versions you’re running.
  • I don’t care about the time accuracy, I can’t update any packages, but need my systems up at any cost – The simplest solution to this is stopping NTP on June 29th and starting it up on July 1st, so that the server is left unaware of the leap second. Also, you need to make sure the /etc/localtime does not contain the leap second for June 30th, 2015 as explained above.
    -- on June 29th (UTC)
    # /etc/init.d/ntpd stop
    # date -s "`date`"    (reset the system clock)
    -- on July 1st (UTC)
    # /etc/init.d/ntpd start
  • Very accurate time requirements + time reduction is not allowed – I don’t know. I can’t see how this can be implemented. Does anyone have any ideas?
Post Scriptum

Initially I couldn’t understand why this extra second causes so much trouble. Don’t we change the time by a round hour twice a year without any issues? I found the answer during my research, and it’s obvious. Servers work in UTC time, which does not have daylight saving time changes; the timezone information is added just for representation purposes later on. UTC time is continuous and predictable, but the leap second breaks this normal continuity, and that’s why it is so difficult to handle. It’s also a known fact that Oracle Databases rely heavily on gettimeofday() system calls, and these work in UTC too.

 

Discover more about Pythian’s Oracle Ace Maris Elsins.

Categories: DBA Blogs

Cassandra Update – Leap Second & JMX Security

Tue, 2015-06-02 08:45

This is a short post about two things that should be on the to-do list for all Cassandra Administrators. The leap second issue and the new JMX default.

The Leap Second

Before we move on, you should learn about how the leap second affects Cassandra in more detail.

In short, you must update your JVM to version 7u60 or above. If you are on Cassandra 2.0.14+ or 2.1.x, then all JVMs on version 8 are safe and tested. One issue that the JVM update doesn’t solve is that time-series data might become interleaved. If this is critical for your deployment (it isn’t for most cases), be aware of it.

JMX Security

Since Cassandra 2.0.14 and 2.1.4, the cassandra-env.sh file sets JMX to listen only on localhost.  If that doesn’t work for you, you should enable remote access, while making sure you activate security!

Short version:

    1. Edit $CASSANDRA_CONF/cassandra-env.sh and set LOCAL_JMX=no
    2. Create /etc/cassandra/jmxremote.password and add the username and password:
monitorRole QED
controlRole R&D
USERNAME PASSWORD
    3. Change ownership to the user you run Cassandra with and permission to read only:
chown cassandra:cassandra /etc/cassandra/jmxremote.password
chmod 400 /etc/cassandra/jmxremote.password
    4. Add the username defined before with readwrite permission to $JAVA_HOME/lib/management/jmxremote.access:
monitorRole readonly
USERNAME readwrite
controlRole readwrite \
create javax.management.monitor.,javax.management.timer. \
unregister
    5. Restart Cassandra
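As a quick sanity check after the restart (the host name and credentials below are placeholders), remote JMX access should now require the password you configured; nodetool talks to Cassandra over JMX, so it makes a convenient test:

# Should fail anonymously and succeed with the credentials from jmxremote.password.
nodetool -h cassandra-node1 -u USERNAME -pw PASSWORD status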

 

Want to learn more about our Cassandra services? Contact us now!

Categories: DBA Blogs

Part 1: How to Effectively Use a Performance Schema

Sat, 2015-05-30 12:36

Performance Schema (PS) has been the subject of many, many recent discussions, presentations, and articles.  With the release of MySQL 5.7, PS has become the main actor for people who want to take the next steps in MySQL monitoring. At the same time, it has become clear that Oracle intends to make PS so powerful, with so many features and so much new instrumentation, that old-style monitoring will begin to look like obsolete tools from the Stone Age.

This article will explain PS and provide guidance on what needs to be done in order to use it effectively.

What I am not going to do is dig into specific performance issues or address polemics about what PS is and what, in a Utopian vision, it should be. I have seen too many presentations, articles and comments like that, and they are not productive, nor are they in line with my target, which is: keeping people informed on how to do things EASILY.

For the scope of this article I will base my code mainly on version MySQL 5.7, with some digression to MySQL 5.6, if and when it makes sense.

Basic Concepts

Before starting the real how-to, it is my opinion that we must cover a few basic concepts and principles about PS. The primary goal of the Performance Schema is to measure (instrument) the execution of the server. A good measure should not cause any change in behavior. To achieve this, the overall design of the Performance Schema complies with the following, very severe design constraints:

  • The parser is unchanged. Also, there are no new keywords or statements. This guarantees that existing applications will run the same way with or without the Performance Schema.
  • All the instrumentation points return “void”, there are no error codes. Even if the performance schema fails internally, execution of the server code will proceed.
  • None of the instrumentation points allocate memory. All the memory used by the Performance Schema is pre-allocated at startup, and is considered “static” during the server life time.
  • None of the instrumentation points use any pthread_mutex, pthread_rwlock, or pthread_cond (or platform equivalents). Executing the instrumentation point should not cause thread scheduling to change in the server.

In other words, the implementation of the instrumentation points, including all the code called by the instrumentation points is:

  • Malloc free
  • Mutex free
  • Rwlock free

Currently, there is still an issue with the usage of LF_HASH, which introduces memory allocation, though a plan exists to replace it with a lock-free/malloc-free hash table.

The observer should not influence the one being observed. As such, the PS must be as fast as possible while remaining as unobtrusive as possible. In cases when there is a choice between:

  • Processing when recording the performance data in the instrumentation.

OR

  • Processing when retrieving the performance data.

Priority is given in the design to make the instrumentation faster, pushing some complexity to data retrieval.

Performance schema was designed while keeping an eye on future developments and how to facilitate the PS usage in new code. As such, to make it more successful, the barrier of entry for a developer should be low, so it is easy to instrument code. This is particularly true for the instrumentation interface. The interface is available for C and C++ code, so it does not require parameters that the calling code cannot easily provide, supports partial instrumentation (for example, instrumenting mutexes does not require that every mutex is instrumented). The Performance Schema instrument interface is designed in such a way that any improvement/additions in the future will not require modifications, as well as old instrumentation remaining unaffected by the changes.

The final goal for PS is to have it implemented in every plugin included in MySQL, although expecting all of them to always use the latest version is unrealistic in most cases. Given that the Performance Schema implementation must provide up-to-date support for multiple versions of the instrumentation interface within the same deployment, it must ensure binary compatibility with each version.

The importance of flexibility means we may have conditions like:

  • Server supporting the Performance Schema + a storage engine that is instrumented.
  • Server supporting the Performance Schema + a storage engine that is not instrumented.
  • Server not supporting the Performance Schema + a storage engine that is instrumented.

Finally, we need to take into account that the Performance Schema can be included or excluded from the server binary, using build time configuration options, with exposure in the compiling interface.

Performance Schema Interfaces

As mentioned above, PS can be excluded from code at the moment of the code compilation, thanks to the PS compile interface. This interface is one of seven that are present in PS. The full list is:

  • Instrument interface
  • Compiling interface
  • Server bootstrap interface
  • Server startup interface
  • Runtime configuration interface
  • Internal audit interface
  • Query interface
Instrument Interface:

This is the one that allows plugin implementers to add their instruments to PS. In general the interface is available for:

  • C implementations
  • C++ implementations
  • The core SQL layer (/sql)
  • The mysys library (/mysys)
  • MySQL plugins, including storage engines,
  • Third party plugins, including third party storage engines.
Compiling Interface:

As mentioned earlier, this is used during the build and will include or exclude PS code from the binaries.

Server Bootstrap Interface:

This is an internal private interface, whose purpose is to provide the instructions needed to create the tables for the PS itself.

Server Startup Interface:

This interface will expose options used with the mysqld command line or in the my.cnf, required to:

  • Enable or disable the performance schema.
  • Specify some sizing parameters.
Runtime Configuration Interface

This is one of the two most important interfaces for DBAs and SAs. It will allow the configuration of the PS at runtime. Using the methods exposed by this interface, we will be able to configure what instruments, consumers, users and more we want to have active. This interface uses standard SQL and is very easy to access and use. Also, it is the preferred method to activate or deactivate instruments. Thus, when we start the server we should always enable the PS with all the instruments and consumers deactivated, and use this interface to choose only the ones we are interested in.

Internal Audit Interface:

The internal audit interface is provided to the DBA to inspect whether the Performance Schema code itself is functioning properly. This interface is necessary because a failure caused while instrumenting code in the server should not cause failures in the MySQL server itself, and in turn the performance schema implementation never raises errors during runtime execution. To access the information, a DBA just needs to issue the SHOW ENGINE PERFORMANCE_SCHEMA STATUS; command.

Query Interface:

Lastly, this interface is the one that allows us to access the collected data, and to perform data filtering, grouping, joins, etc. It also provides access to special tables like the summary and digest tables, which will be discussed later on.

Consumers and Instruments

Another important concept in PS to understand is the difference between Instruments and Consumers.

Instruments:

Instruments are the ones collecting raw data where the calls are embedded in the code, such as:

MYSQL_TABLE_IO_WAIT(PSI_TABLE_FETCH_ROW, active_index, result,

{ result= index_prev(buf); })

In this case the code refers to the MYSQL_TABLE_IO_WAIT instrumentation call used in handler.cc (<mysql_root_code>/sql/handler.cc). If enabled in the compilation phase, the above call will provide PS the information related to a specific table io_wait.

The instrument responsible for managing that data collection is: wait/io/table/sql/handler.

The naming convention for the instruments is quite easy. The first part, wait, is the name of the top-level instrument component (listed later), the second, io, is the observed condition, and table is the object.  The remaining suffix refers to more specific plugin implementations and includes names like innodb, myisam, sql or IO_CACHE::append_buffer_lock. In the above example it refers to the handler class in the SQL tree.

Instruments are organized by top level components like:

  • Idle: An instrumented idle event. This instrument has no further components.
  • Memory: An instrumented memory event.
  • Stage: An instrumented stage event.
  • Statement: An instrumented statement event.
  • Transaction: An instrumented transaction event. This instrument has no further components.
  • Wait: An instrumented wait event.

Each top-level component has a number of instruments:

+-------------+------+
| name        | Numb |
+-------------+------+
| idle        |    1 |
| memory      |  367 |
| stage       |  117 |
| statement   |  191 |
| transaction |    1 |
| wait        |  297 |
+-------------+------+

We can and should keep in mind that it is best practice to enable only the instruments we may require, for the time we need them. This can be achieved using the runtime interface (I will explain exactly how later on).

The official documentation (http://dev.mysql.com/doc/refman/5.7/en/performance-schema-instrument-naming.html) provides more detailed information about what is available for each top-level component.

Consumers:

The Consumers are the destination of the data collected from the instruments. Consumers have different scopes and timelines. Also, a consumer like events_statements has many different tables:

  • Current
  • History
  • History long
  • Summaries (by different aggregation)
  • Summary Digest (like what we can find by processing the slow query log)

Once more it is important to define what we are looking for and enable only what we need. For instance, if we need to review/identify the most impactful SQL, we should enable only events_statements_current, events_statements_history and events_statements_summary_by_digest. All the other consumers can stay off. It is also important to keep in mind that each event may have a relation to another one. In this case, we will be able to navigate the tree relating the events using the fields EVENT_ID and NESTING_EVENT_ID, where the latter is the EVENT_ID of the parent.
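As a small illustration of that parent/child navigation (a sketch; the statement EVENT_ID of 42 is just a placeholder), the stages belonging to one statement can be pulled out through NESTING_EVENT_ID:

# List the stage events nested under a given statement event.
mysql -e "select EVENT_NAME, TIMER_WAIT
            from performance_schema.events_stages_history_long
           where NESTING_EVENT_ID = 42;"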

Pre-Filtering vs. Post-filtering

We are almost there, stay tight! Another important concept to understand is the difference between post- and pre-filtering. As I mentioned, we can easily query the consumer tables with SQL, and we can create complex SQL to join tables and generate complex reports. But this can be quite heavy and resource consuming, especially if we want to dig into specific sections of our MySQL server.

In this case we can use the pre-filtering approach. Pre-filtering is basically a way to tell PS to collect information ONLY from a specific source, like a user/IP (actors) or object(s) like tables, triggers, events, and functions. The latter can be set at a general level or down to a specific object name.

Pre-filtering, combined with activating the right instruments and consumers, is a powerful way to collect the information without overloading the server with useless data. It is also very easy to implement, given we just need to set the objects and/or actors in the setup tables as we like.
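For instance (a sketch only, with a hypothetical application schema named app; the article comes back to object filtering in Part 2), restricting table instrumentation to a single schema is just a couple of statements against setup_objects:

# Disable the catch-all TABLE row and instrument only tables in the "app" schema.
mysql -e "update performance_schema.setup_objects
             set ENABLED='NO', TIMED='NO'
           where OBJECT_TYPE='TABLE' and OBJECT_SCHEMA='%';
          insert into performance_schema.setup_objects
                 (OBJECT_TYPE, OBJECT_SCHEMA, OBJECT_NAME, ENABLED, TIMED)
          values ('TABLE', 'app', '%', 'YES', 'YES');"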

Rolling the Ball, Setup the PS for Observation as Start

Now that we have covered the basic concepts we can start to work on the real implementation.

Compile the Source Code:

As mentioned earlier, we can use the compile interface to include or exclude features from the code compilation. The available options are:

  • DISABLE_PSI_COND Exclude Performance Schema condition instrumentation
  • DISABLE_PSI_FILE Exclude Performance Schema file instrumentation
  • DISABLE_PSI_IDLE Exclude Performance Schema idle instrumentation
  • DISABLE_PSI_MEMORY Exclude Performance Schema memory instrumentation
  • DISABLE_PSI_METADATA Exclude Performance Schema metadata instrumentation
  • DISABLE_PSI_MUTEX Exclude Performance Schema mutex instrumentation
  • DISABLE_PSI_RWLOCK Exclude Performance Schema rwlock instrumentation
  • DISABLE_PSI_SOCKET Exclude Performance Schema socket instrumentation
  • DISABLE_PSI_SP Exclude Performance Schema stored program instrumentation
  • DISABLE_PSI_STAGE Exclude Performance Schema stage instrumentation
  • DISABLE_PSI_STATEMENT Exclude Performance Schema statement instrumentation
  • DISABLE_PSI_STATEMENT_DIGEST Exclude Performance Schema statement_digest instrumentation
  • DISABLE_PSI_TABLE Exclude Performance Schema table instrumentation

This level of detail is so granular that we can only include the things we are planning to use.

The positive aspect of doing so at the compilation level is that we will be sure no one will mess-up adding undesired instruments. The drawback is that if we change our mind and we decide we may need the ones we had excluded, we will have to compile the whole server again.

As a result, I would say that using this approach is not for someone who is just starting to use PS. Given you are still discovering what is there, it makes sense to compile with all the features (the default).

Configure PS in my.cnf:

Setting up the PS correctly in my.cnf is quite important, so I strongly suggest disabling all instruments and consumers at startup. They can be enabled by script later, which is much safer for a production database.

I normally recommend a section like the following:

performance_schema=1
performance_schema_events_waits_history_size=50
performance_schema_events_waits_history_long_size=15000
performance_schema_instrument='%=OFF'
performance_schema_consumer_events_stages_current=0
performance_schema_consumer_events_stages_history=0
performance_schema_consumer_events_stages_history_long=0
performance_schema_consumer_events_statements_current=0
performance_schema_consumer_events_statements_history=0
performance_schema_consumer_events_statements_history_long=0
performance_schema_consumer_events_transactions_current=0
performance_schema_consumer_events_transactions_history=0
performance_schema_consumer_events_transactions_history_long=0
performance_schema_consumer_events_waits_current=0
performance_schema_consumer_events_waits_history=0
performance_schema_consumer_events_waits_history_long=0
performance_schema_consumer_global_instrumentation=0
performance_schema_consumer_thread_instrumentation=0
performance_schema_consumer_statements_digest=0

The settings above will start the server with PS enabled, but with all the instruments and consumers OFF. Well, this is not entirely true: as of the time of writing (MySQL 5.7.7), once the PS is enabled the instruments related to memory/performance_schema are enabled regardless, which makes sense given they are dedicated to monitoring the memory utilization of PS.

A final note about the configuration: we can decide to have instruments only count events instead of capturing latency time. To do so, we just have to declare it as: performance_schema_instrument='statement/sql/%=COUNTED'

In this case I have set ALL the SQL statement instruments to be counted.
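My understanding is that the runtime equivalent is enabling the instrument but disabling its timing, along these lines:

# Count statement events without timing them (same effect as COUNTED).
mysql -e "update performance_schema.setup_instruments
             set ENABLED='YES', TIMED='NO'
           where NAME like 'statement/sql/%';"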

Start Server and Set Only the Users We Need:

Once we have started our MySQL server, we are almost ready to go.

Given we started with NO instruments enabled, we have to decide where to begin. Since we all know the most impactful factor in a database server is how we query it, we will start from there, analyzing what is going on from the SQL point of view. However, I want to catch only the work coming from my application user, not from everywhere, so we can set the user in the setup_actors table. This is very simple given we will use the runtime configuration interface, which uses SQL syntax.

So, let's say I want to trace only my application user named stress running from machines in the 10.0.0.0/24 range. I will need to:

update setup_actors set ENABLED='NO' where user='%';
insert into setup_actors values('10.0.0.%','stress','%','YES');
(root@localhost) [performance_schema]>select * from setup_actors;
+----------+--------+------+---------+
| HOST     | USER   | ROLE | ENABLED |
+----------+--------+------+---------+
| %        | %      | %    | NO      |
| 10.0.0.% | stress | %    | YES     |
+----------+--------+------+---------+
2 rows in set (0.00 sec)

Great, from now on PS will only focus on my user stress, so now let us decide what to enable for instruments and consumers.

Once more using SQL commands we will enable all the instruments related to SQL statements. But wait a minute: if you check the setup_instruments table, you will see we have several variations of the statement instruments:

  • SQL
  • SP
  • Scheduler
  • Com
  • Abstract

Also, not included in this list but still relevant is TRANSACTION. For now, we will only enable the SQL, ABSTRACT, Scheduler and Transaction instruments.

SQL will be:

update setup_instruments SET ENABLED='YES' where ENABLED='NO' and name like 'statement/abstract/%';
update setup_instruments SET ENABLED='YES' where ENABLED='NO' and name like 'statement/sql/%';
update setup_instruments SET ENABLED='YES' where ENABLED='NO' and name like 'transaction';

(root@localhost) [performance_schema]>select count(*) from setup_instruments where ENABLED = 'YES' and name not like 'memory%';
+----------+
| count(*) |
+----------+
|      143 |
+----------+
1 row in set (0.01 sec)

We have 143 instruments active. Now we must set up the consumers and choose the destinations that will receive the data.

The list of consumers is the following:

(root@localhost) [performance_schema]>select * from setup_consumers;
+----------------------------------+---------+
| NAME                             | ENABLED |
+----------------------------------+---------+
| events_stages_current            | NO      |
| events_stages_history            | NO      |
| events_stages_history_long       | NO      |
| events_statements_current        | NO      |
| events_statements_history        | NO      |
| events_statements_history_long   | NO      |
| events_transactions_current      | NO      |
| events_transactions_history      | NO      |
| events_transactions_history_long | NO      |
| events_waits_current             | NO      |
| events_waits_history             | NO      |
| events_waits_history_long        | NO      |
| global_instrumentation           | NO      |
| thread_instrumentation           | NO      |
| statements_digest                | NO      |
+----------------------------------+---------+
15 rows in set (0.00 sec)

To enable ANY of them, first we have to enable the GLOBAL one, which works as a global power on/off. The same thing applies for the Thread instrumentation:

update setup_consumers set ENABLED='YES' where NAME='global_instrumentation';
update setup_consumers set ENABLED='YES' where NAME='thread_instrumentation';

Then we need to activate at least events_statements_current to see something; I suggest also activating the history and statements_digest consumers.

update setup_consumers set ENABLED='YES' where NAME='events_statements_current';
update setup_consumers set ENABLED='YES' where NAME='events_statements_history';
update setup_consumers set ENABLED='YES' where NAME='statements_digest';
update setup_consumers set ENABLED='YES' where NAME='events_transactions_current';
update setup_consumers set ENABLED='YES' where NAME='events_transactions_history';

As a result, we will have the following consumers activated:


(root@localhost) [performance_schema]>select * from setup_consumers;
+----------------------------------+---------+
| NAME                             | ENABLED |
+----------------------------------+---------+
| events_stages_current            | NO      |
| events_stages_history            | NO      |
| events_stages_history_long       | NO      |
| events_statements_current        | YES     |
| events_statements_history        | YES     |
| events_statements_history_long   | NO      |
| events_transactions_current      | YES     |
| events_transactions_history      | YES     |
| events_transactions_history_long | NO      |
| events_waits_current             | NO      |
| events_waits_history             | NO      |
| events_waits_history_long        | NO      |
| global_instrumentation           | YES     |
| thread_instrumentation           | YES     |
| statements_digest                | YES     |
+----------------------------------+---------+
15 rows in set (0.00 sec)

The final optimization for pre-filtering is to decide IF we want to catch all the objects or reduce them to a subset. By default PS will use the settings below:

(root@localhost) [performance_schema]>select * from setup_objects;
+-------------+--------------------+-------------+---------+-------+
| OBJECT_TYPE | OBJECT_SCHEMA      | OBJECT_NAME | ENABLED | TIMED |
+-------------+--------------------+-------------+---------+-------+
| EVENT       | mysql              | %           | NO      | NO    |
| EVENT       | performance_schema | %           | NO      | NO    |
| EVENT       | information_schema | %           | NO      | NO    |
| EVENT       | %                  | %           | YES     | YES   |
| FUNCTION    | mysql              | %           | NO      | NO    |
| FUNCTION    | performance_schema | %           | NO      | NO    |
| FUNCTION    | information_schema | %           | NO      | NO    |
| FUNCTION    | %                  | %           | YES     | YES   |
| PROCEDURE   | mysql              | %           | NO      | NO    |
| PROCEDURE   | performance_schema | %           | NO      | NO    |
| PROCEDURE   | information_schema | %           | NO      | NO    |
| PROCEDURE   | %                  | %           | YES     | YES   |
| TABLE       | mysql              | %           | NO      | NO    |
| TABLE       | performance_schema | %           | NO      | NO    |
| TABLE       | information_schema | %           | NO      | NO    |
| TABLE       | %                  | %           | YES     | YES   |
| TRIGGER     | mysql              | %           | NO      | NO    |
| TRIGGER     | performance_schema | %           | NO      | NO    |
| TRIGGER     | information_schema | %           | NO      | NO    |
| TRIGGER     | %                  | %           | YES     | YES   |
+-------------+--------------------+-------------+---------+-------+
20 rows in set (0.00 sec)

It is easy to see that ANY object existing in the default system schemas (mysql, performance_schema, information_schema) will be ignored. In our case, for now, we will keep it as it is, but this will be our next filtering step after we have analyzed some data. This will happen in Part 2, stay tuned.

Conclusions

For now, you should understand what the Performance Schema is and its basic concepts, as well as what interfaces are available and what they are for. You should also be able to compile the source code with or without PS, or parts of it. You should be able to set up the MySQL configuration file correctly, and perform the initial configuration at runtime. Finally, you should know how to query the PS and how to dig into the information, which will also be discussed in Part 2.

Find out more about Pythian’s services in MySQL.

Categories: DBA Blogs

Log Buffer #425: A Carnival of the Vanities for DBAs

Fri, 2015-05-29 08:39

This Log Buffer edition sheds light on the ongoing innovations and updates in the SQL Server, MySQL and Oracle realms.


Oracle:

  • Warning: standby redo logs not configured for thread – BUG
  • SLOB 2.3 is soon to be released. This version has a lot of new, important features but also a significant amount of tuning in the data loading kit.
  • Even in the highly volatile world of JavaScript frameworks, AngularJS is still very clearly leading the pack.
  • Are you Ready for the Leap Second?
  • Responsive UI is a big deal nowadays, when enterprise applications should run on different platforms and devices.

SQL Server:

  • Three SQL Server MVPs (Jonathan Kehayias, Ted Krueger and Gail Shaw) provide fascinating insight into the most common SQL Server problems, why they occur, and how they can be diagnosed using tools such as Performance Monitor, Dynamic Management Views and server-side tracing.
  • Retrieving a Binary File from Your Database.
  • This tip describes some T-SQL techniques for converting a mmddyyyy string to a SQL Server date, handling rows with missing date strings, and comparing converted date strings.
  • The PIVOT operator was added to the Transact-SQL language in SQL Server 2005. It allows you to turn your row data on its side, so it can be presented as column data.
  • Using the T-SQL PERCENTILE Analytic Functions in SQL Server 2000, 2005 and 2008.

MySQL:

  • Developer Studio.. JavaScript.. ClusterControl DSL.. database clusters.. huh? what the heck is going on here?
  • Percona XtraBackup 2.2.11 is now available
  • Are your databases more costly than they need to be? If you’re not sure, this webinar will show you how to find out, quickly and easily, with KPIs that indicate overprovisioning or opportunity for driving more efficiency from your databases.
  • MySQL Incremental Backup – Point In Time Backup and Recovery of InnoDB and MyIsam Databases
  • MySQL 5.7 key features.

Learn more about Pythian’s expertise in Oracle, SQL Server and MySQL.

Categories: DBA Blogs

Are you Ready for the Leap Second?

Thu, 2015-05-28 11:33

If you’re not aware of what the leap second is, look into it. The fact is, this year the last minute of June 30th will be one second longer and "June 30, 2015 23:59:60" will be a valid and correct time. There are a few issues that could be caused by the leap second, so I’ve reviewed a number of MOS notes and this blog post is a summary of the findings.

Update (June 4th, 2015): I’ve put together another blog post about handling the leap second on Linux here.

There are 2 potential issues, which are described below.

1. NTPD’s leap second update causes a server hang or excessive CPU usage

Any Linux distribution using kernel versions from 2.4 through and including 2.6.39 may be affected (both UEK and RedHat compatible kernels). This range is very wide and includes all RHEL and OEL releases except version 7, unless the kernel versions are kept up to date on the lower releases.

Problems may be observed even a day before the leap second happens, so this year it could cause the symptoms any time on June 30. This is because the NTP server lets the host know about the upcoming leap second up to a day ahead of time, and the update from the NTP triggers the issues.

There are 2 possible symptoms:

  1. Servers will become unresponsive and the following can be seen in system logs, console, netconsole or vmcore dump analysis outputs:
    INFO: task kjournald:1119 blocked for more than 120 seconds.
    "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    kjournald     D ffff880028087f00     0  1119      2 0x00000000
    ffff8807ac15dc40 0000000000000246 ffffffff8100e6a1 ffffffffb053069f
    ffff8807ac22e140 ffff8807ada96080 ffff8807ac22e510 ffff880028073000
    ffff8807ac15dcd0 ffff88002802ea60 ffff8807ac15dc20 ffff8807ac22e140
  2. Any Java applications suddenly starts to use 100% CPU (leap second insertion causes futex to repeatedly timeout).
    $top - 09:38:24 up 354 days,  5:48,  4 users,  load average: 6.49, 6.34, 6.44
    Tasks: 296 total,   4 running, 292 sleeping,   0 stopped,   0 zombie
    Cpu(s): 97.2%us,  1.8%sy,  0.0%ni,  0.7%id,  0.1%wa,  0.1%hi,  0.2%si,  0.0%st
    Mem:     15991M total,    15937M used,       53M free,      107M buffers
    Swap:     8110M total,       72M used,     8038M free,    13614M cached
    PID USER      PR    NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    22564 oracle    16   0 1400m 421m 109m S  353  2.6   2225:11 java
    7294 oracle     17   0 3096m 108m 104m S   22  0.7   0:02.61 oracle
    

And the only workaround mentioned in the notes is to run these commands as root after the problem has occurred (obviously this applies only to issue 2, as issue 1 would require a reboot):

# /etc/init.d/ntpd stop
#  date -s "`date`"    (reset the system clock)
# /etc/init.d/ntpd start

I think, as the problem is triggered by the update coming from NTP on June 30, it should also be possible to stop the NTPD service on June 29th, and re-enable it on July 1st instead. This would allow it to bypass the problem conditions.
Because any Java application can be affected, we need to think about where Java is used. For Oracle DBAs the typical ones to worry about would be all Enterprise Manager agents as well as any Fusion Middleware products. So if you’re using Grid Control or Cloud Control to monitor your Oracle infrastructure, it’s very likely most of your servers are potentially at risk if the kernels are not up to date.
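A quick inventory of the Java processes that could be hit is a one-liner with standard tools:

# List running Java processes -- EM agents, WebLogic/Fusion Middleware, etc.
# The [j] trick keeps grep from matching itself.
ps -eo pid,user,etime,args | grep -i '[j]ava'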

2. Inserts to DATE and TIMESTAMP columns fail with "ORA-01852: seconds must be between 0 and 59"

Any OS could be affected. Based on MOS note “Insert leap seconds into a timestamp column fails with ORA-01852 (Doc ID 1553906.1)”, any inserts of time values having “60” seconds into DATE or TIMESTAMP columns will result in ORA-01852.
This can’t be reliably mitigated by stopping the NTPD, as the up-to-date TZ information on the server may already contain the information about the extra second. The note also provides a "very efficient workaround": the leap second record can be stored in a varchar2 datatype instead. You might be thinking, "What? Are you really suggesting that to me?" According to MOS note 1453523.1 it appears that the time representation during the leap second could differ depending on the OS/kernel/ntpd versions. For example, it could show "23:59:60", or it could show "23:59:59" for 2 consecutive seconds, which would allow avoiding the ORA-01852. Be sure to check it with your OS admins and make sure that the clock never shows "23:59:60" to avoid this issue completely.

Consider your infrastructure

By no means are the issues described above an exhaustive list. There’s too much information to cover everything, but based on what I’m reading the issues caused by the leap second can be quite severe. Please consider your infrastructure and look for information about issues and fixes to address the upcoming leap second. Search MOS for the products you use and add the "leap second" keyword too. If you’re using software or an OS from another vendor, check their support notes regarding leap seconds. Here are additional MOS notes for reading if you’re on some of Oracle’s engineered systems, but again, you’ll find more information if you search:

  • Leap Second Time Adjustment (e.g. on June 30, 2015 at 23:59:59 UTC) and Its Impact on Exadata Database Machine (Doc ID 1986986.1)
  • Exalogic: Affected EECS Releases and Patch Availability for Leap Second (Doc ID 2008413.1)
  • Leap Second on Oracle SuperCluster (Doc ID 1991954.1)
  • Leap Second Handling in Solaris – NTPv3 and NTPv4 (Doc ID 1019692.1)

Discover more about Pythian’s expertise in Oracle.

Categories: DBA Blogs

MySQL Query Profiling with Performance Schema

Tue, 2015-05-26 12:34

One of my favorite tools for query optimization is profiling. But recently I noticed this warning:

mysql> set profiling=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+----------------------------------------------------------------------+
| Level   | Code | Message                                                              |
+---------+------+----------------------------------------------------------------------+
| Warning | 1287 | '@@profiling' is deprecated and will be removed in a future release. |
+---------+------+----------------------------------------------------------------------+

After looking through the documentation, I should indeed start using the Performance Schema to get this information.

Okay, so let’s give that a try.

I confirmed that I started MySQL 5.6.23 with the default of Performance Schema = ON:

mysql> show global variables like '%perf%';
+--------------------------------------------------------+-------+
| Variable_name                                          | Value |
+--------------------------------------------------------+-------+
| performance_schema                                     | ON    |
...

I’ll be using a development server for doing query profiling, so I can turn all of these on:

mysql> update performance_schema.setup_instruments set enabled='YES', timed='YES'; #you want the stage* ones enabled
mysql> update performance_schema.setup_consumers set enabled='YES'; #you want the events_statements_history* and events_stages_history* enabled
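Enabling everything is fine on a development box. On anything busier, a narrower sketch like the following (assuming the default global and thread consumers are already on) limits the change to the stage and statement instrumentation that profiling actually needs:

mysql> update performance_schema.setup_instruments set enabled='YES', timed='YES' where name like 'stage/%' or name like 'statement/%';
mysql> update performance_schema.setup_consumers set enabled='YES' where name like 'events_statements_%' or name like 'events_stages_%';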

Start with fresh collection tables:

mysql> truncate performance_schema.events_stages_history_long;
mysql> truncate performance_schema.events_statements_history_long;

Then turn the profiler on:

mysql> set profiling=1;

Now run a sample query:

mysql> select distinct(msa) from zip.codes;

And find the resulting event IDs to use in the query below:

mysql> select event_id, end_event_id, sql_text from performance_schema.events_statements_history_long where sql_text like '%msa%';
...
|      41 |       938507 | select distinct(msa) from zip.codes                                                                  |
...

Insert those beginning and ending event IDs, and here’s the new profiling output on my test query from Performance Schema:

mysql> select substring_index(event_name,'/',-1) as Status, truncate((timer_end-timer_start)/1000000000000,6) as Duration from performance_schema.events_stages_history_long where event_id>=41 and event_id<=938507;

+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| init                 | 0.000103 |
| checking permissions | 0.000006 |
| Opening tables       | 0.000051 |
| init                 | 0.000014 |
| System lock          | 0.000007 |
| optimizing           | 0.000003 |
| statistics           | 0.000011 |
| preparing            | 0.000011 |
| Creating tmp table   | 0.000048 |
| executing            | 0.000002 |
| Sending data         | 1.251331 |
| end                  | 0.000003 |
| removing tmp table   | 0.000008 |
| query end            | 0.000006 |
| closing tables       | 0.000009 |
| freeing items        | 0.000111 |
| cleaning up          | 0.000002 |
+----------------------+----------+

Compare the legacy profiling available for the query:

mysql> show profile for query 1;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000125 |
| checking permissions | 0.000007 |
| Opening tables       | 0.000020 |
| init                 | 0.000014 |
| System lock          | 0.000007 |
| optimizing           | 0.000003 |
| statistics           | 0.000011 |
| preparing            | 0.000011 |
| Creating tmp table   | 0.000027 |
| executing            | 0.000001 |
| Sending data         | 1.353825 |
| end                  | 0.000005 |
| removing tmp table   | 0.000007 |
| end                  | 0.000002 |
| query end            | 0.000006 |
| closing tables       | 0.000009 |
| freeing items        | 0.000069 |
| cleaning up          | 0.000028 |
+----------------------+----------+

The obvious question is: why would I want to be limited to this information when the Performance Schema has so much more available?

But this shows we can still get profiler information in a familiar format once MySQL removes the profiling tool entirely.
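Looking up the event IDs by hand gets tedious, though. As a rough sketch, each stage row carries a NESTING_EVENT_ID pointing back at its parent statement, so something like the following should produce the same breakdown in one step (filtering on the statement text, as above):

select substring_index(stage.event_name,'/',-1) as Status,
       truncate(stage.timer_wait/1000000000000,6) as Duration
from performance_schema.events_stages_history_long stage
join performance_schema.events_statements_history_long stmt
  on stage.nesting_event_id = stmt.event_id
 and stage.thread_id = stmt.thread_id
where stmt.sql_text like '%msa%'
order by stage.event_id;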

 

Learn more about Pythian’s expertise in MySQL.

Categories: DBA Blogs

Making Existing SQLPLUS Scripts 12c and Container DB (PDB) Compatible

Tue, 2015-05-26 12:21

Oracle 12c introduces new catalog features including CDB_ dictionary views (which include a CON_ID column) superseding the DBA_ views that most DBA sqlplus scripts are based upon.

However, existing DBA sqlplus scripts can easily be modified using just a few simple sqlplus techniques to be compatible with 11g, as well as all types of 12c databases including legacy and container databases.

The following simple SQL and sqlplus techniques can be used to make a “universal script” that is compatible with all versions.

Illustrating the Issue

Let’s say for sake of example that we have a simple 10g/11g monitoring script that’s checking the amount of freespace in each tablespace by querying the DBA_TABLESPACE_USAGE_METRICS view.

On our 10g or 11g database the following query gives the necessary information:

SQL> select version from v$instance;

VERSION
-----------------
11.2.0.4.0

SQL> select tablespace_name, tablespace_size, used_percent
  2  from DBA_TABLESPACE_USAGE_METRICS
  3  order by tablespace_name;

TABLESPACE_NAME                TABLESPACE_SIZE USED_PERCENT
------------------------------ --------------- ------------
FCCDEV                                  256000      .053125
SYSAUX                                 1024000   31.0617188
SYSTEM                                 1024000   9.19453125
TEMP                                   1024000            0
UNDOTBS1                               1024000      .015625
USERS                                   256000        1.275

6 rows selected.

SQL>

 

Now will the same query work on a 12c database? Of course it will:

SQL> select version from v$instance;

VERSION
-----------------
12.1.0.2.0

SQL> select tablespace_name, tablespace_size, used_percent
  2  from DBA_TABLESPACE_USAGE_METRICS
  3  order by tablespace_name;

TABLESPACE_NAME                TABLESPACE_SIZE USED_PERCENT
------------------------------ --------------- ------------
SYSAUX                                 4194302   .773048769
SYSTEM                                 4194302   1.05991414
TEMP                                   4194302            0
UNDOTBS1                               4194302   .031280532
USERS                                  4194302   .003051759

SQL>

 

It executes successfully on the 12c database but there's a problem: the query is only returning the data from the root container (or more accurately, from the container in which the statement was executed). The PDB data is missing; I have both open and closed PDBs in this database:

SQL> select con_id, name, open_mode from V$CONTAINERS order by con_id;

    CON_ID NAME                           OPEN_MODE
---------- ------------------------------ ----------
         1 CDB$ROOT                       READ WRITE
         2 PDB$SEED                       READ ONLY
         3 TEST1                          READ WRITE
         4 LDB3                           MOUNTED

SQL>

 

The LDB3 PDB is closed (mounted) so I’m not interested in monitoring the tablespace freespace in it but I am interested in the details from the opened TEST1 PDB.

To get the required information we need to make two or three changes (the third being optional):

1) Change the view from DBA_ to CDB_
2) Add the CON_ID column to the output
3) Add the CON_ID column to the ORDER BY clause

Hence (executing from CDB$ROOT) the query becomes:

SQL> select con_id, tablespace_name, tablespace_size, used_percent
  2  from CDB_TABLESPACE_USAGE_METRICS
  3  order by con_id, tablespace_name;

    CON_ID TABLESPACE_NAME                TABLESPACE_SIZE USED_PERCENT
---------- ------------------------------ --------------- ------------
         1 SYSAUX                                 4194302   .773048769
         1 SYSTEM                                 4194302   1.05991414
         1 TEMP                                   4194302            0
         1 UNDOTBS1                               4194302   .031280532
         1 USERS                                  4194302   .003051759
         3 AUDIT_DATA                               64000        .2875
         3 SYSAUX                                 4194302   .410843091
         3 SYSTEM                                 4194302   .474167096
         3 TPCCTAB                                1024000   5.63203125

9 rows selected.

SQL>

 

So that works fine, but as it stands we have two versions of the query and therefore we need two monitoring scripts.

 

Building Blocks for the Universal Script

Applying a number of simple sqlplus techniques can help us with this and will allow us to make the single universal version of the sqlplus script.

1) Use a SQLPLUS variable:

The sqlplus DEFINE command allows us to define variables. We can easily define a variable that tells us which view prefix to use depending on whether the database version is 11g or 12c.

SQL> COLUMN view_prefix NEW_VALUE view_prefix
SQL> SELECT DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','CDB','DBA') view_prefix FROM v$instance;

VIE
---
CDB

SQL>

 

2) Dynamically build the view name:

The second tip is that in sqlplus to concatenate a variable with a string a period must be used to show where the variable name ends:

SQL> prompt &view_prefix
CDB

SQL> prompt &view_prefix._TABLESPACE_USAGE_METRICS
CDB_TABLESPACE_USAGE_METRICS

SQL>

 

Plugging that into the original query gives:

SQL> select tablespace_name, tablespace_size, used_percent
  2  from &view_prefix._TABLESPACE_USAGE_METRICS
  3  order by tablespace_name;
old   2: from &view_prefix._TABLESPACE_USAGE_METRICS
new   2: from CDB_TABLESPACE_USAGE_METRICS

TABLESPACE_NAME                TABLESPACE_SIZE USED_PERCENT
------------------------------ --------------- ------------
AUDIT_DATA                               64000        .2875
SYSAUX                                 4194302   .410843091
SYSAUX                                 4194302   .773048769
SYSTEM                                 4194302   1.05991414
SYSTEM                                 4194302   .474167096
TEMP                                   4194302            0
TPCCTAB                                1024000   5.63203125
UNDOTBS1                               4194302   .031280532
USERS                                  4194302   .003051759

9 rows selected.

SQL>

But we’re missing the container ID column.

 

3) Add columns dynamically using additional sqlplus variables:

We can “optionally” include columns such as the CON_ID column using the same technique:

SQL> COLUMN view_prefix NEW_VALUE view_prefix NOPRINT
SQL> SELECT DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','CDB','DBA') view_prefix FROM v$instance;

SQL> COLUMN con_id_col NEW_VALUE con_id_col NOPRINT
SQL> SELECT DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','con_id,','') con_id_col FROM v$instance;

SQL> select &con_id_col tablespace_name, tablespace_size, used_percent
  2  from &view_prefix._TABLESPACE_USAGE_METRICS
  3  order by &con_id_col tablespace_name;
old   1: select &con_id_col tablespace_name, tablespace_size, used_percent
new   1: select con_id, tablespace_name, tablespace_size, used_percent
old   2: from &view_prefix._TABLESPACE_USAGE_METRICS
new   2: from CDB_TABLESPACE_USAGE_METRICS
old   3: order by &con_id_col tablespace_name
new   3: order by con_id, tablespace_name

    CON_ID TABLESPACE_NAME                TABLESPACE_SIZE USED_PERCENT
---------- ------------------------------ --------------- ------------
         1 SYSAUX                                 4194302   .773239504
         1 SYSTEM                                 4194302   1.05991414
         1 TEMP                                   4194302            0
         1 UNDOTBS1                               4194302   .003814699
         1 USERS                                  4194302   .003051759
         3 AUDIT_DATA                               64000        .2875
         3 SYSAUX                                 4194302   .410843091
         3 SYSTEM                                 4194302   .474167096
         3 TPCCTAB                                1024000   5.63203125

9 rows selected.

SQL>

 

Note that the comma is in the variable and not in the column list in the SQL SELECT or ORDER BY clauses.

The script is now dynamically determining whether to use the CDB_ or DBA_ view and similarly dynamically adding the CON_ID column to the SELECT and ORDER BY clauses. (And of course should be executed from the root container.)

And the exact same script still works on the 11g database using the 11g version of sqlplus!

Similarly, the optional column (including the comma) defined in the sqlplus variable could be used in an aggregation GROUP BY clause. However, if the query has no other aggregation columns, we need to add a constant to the GROUP BY (and ORDER BY) clause; otherwise, when the universal sqlplus script is executed against an 11g database, the variable expands to nothing and the GROUP BY clause would be left with no columns at all.

For example:

SQL> COLUMN view_prefix NEW_VALUE view_prefix NOPRINT
SQL> SELECT DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','CDB','DBA') view_prefix FROM v$instance;

SQL> COLUMN con_id_col NEW_VALUE con_id_col NOPRINT
SQL> SELECT DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','con_id,','') con_id_col FROM v$instance;

SQL> select &con_id_col min(extended_timestamp), max(extended_timestamp)
  2  from &view_prefix._AUDIT_TRAIL
  3  group by &con_id_col 1 order by &con_id_col 1;
old   1: select &con_id_col min(extended_timestamp), max(extended_timestamp)
new   1: select con_id, min(extended_timestamp), max(extended_timestamp)
old   2: from &view_prefix._AUDIT_TRAIL
new   2: from CDB_AUDIT_TRAIL
old   3: group by &con_id_col 1 order by &con_id_col 1
new   3: group by con_id, 1 order by con_id, 1

    CON_ID MIN(EXTENDED_TIMESTAMP)                  MAX(EXTENDED_TIMESTAMP)
---------- ---------------------------------------- ----------------------------------------
         3 13-MAY-15 11.54.52.106301 AM -06:00      13-MAY-15 12.16.18.941308 PM -06:00

SQL>

 

Finally, once we’re done testing and debugging, we can get rid of the ugly “old” and “new” statements using:

SET VERIFY OFF

Implementing these techniques allows most existing DBA sqlplus scripts to be turned into universal versions that are compatible with 11g (and likely earlier) databases as well as 12c legacy and container databases.
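Putting the pieces together, a minimal universal version of the tablespace usage script from earlier might look like the following sketch (formatting and alert thresholds omitted):

SET VERIFY OFF

COLUMN view_prefix NEW_VALUE view_prefix NOPRINT
COLUMN con_id_col  NEW_VALUE con_id_col  NOPRINT

SELECT DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','CDB','DBA') view_prefix,
       DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','con_id,','') con_id_col
FROM v$instance;

select &con_id_col tablespace_name, tablespace_size, used_percent
from &view_prefix._TABLESPACE_USAGE_METRICS
order by &con_id_col tablespace_name;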

 

Deeper Dive

What if our monitoring query is based on an underlying catalog table and not a dictionary view?

For example, let’s say that our objective is to report on users and the last time the database password was changed. The password change date isn’t presented in the DBA_USERS or CDB_USERS view, but it is in the underlying SYS.USER$ table. Hence the monitoring query might be something like:

SQL> select name, ptime from SYS.USER$
  2  where type#=1 order by name;

NAME                     PTIME
------------------------ ---------
ANONYMOUS                23-APR-15
...
SYSTEM                   23-APR-15
XDB                      23-APR-15
XS$NULL                  23-APR-15

 

If we look at the view definition of any of the CDB_ views it is apparent that the view traverses the open PDBs by using the new 12c “CONTAINERS” function which accepts a table name as the only argument.

When run from the root container the CONTAINERS() function will traverse all open PDBs (assuming the common user used has local PDB permission to access the referenced table).

NOTE: Prior to 12.1.0.2 the CONTAINERS function was called CDB$VIEW.

Thus, we can use the new function as follows:

SQL> select con_id, name, ptime from CONTAINERS(SYS.USER$)
  2  where type#=1 order by con_id, name;

    CON_ID NAME                     PTIME
---------- ------------------------ ---------
         1 ANONYMOUS                23-APR-15
...
         1 SYSTEM                   23-APR-15
         1 XDB                      23-APR-15
         1 XS$NULL                  23-APR-15
         3 ANONYMOUS                23-APR-15
...
         3 SYSTEM                   23-APR-15
         3 XDB                      23-APR-15
         3 XS$NULL                  23-APR-15

 

Or to make the script universal so the single script can be run on both 11g and 12c:

SQL> COLUMN view_prefix NEW_VALUE view_prefix NOPRINT
SQL> SELECT DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','CONTAINERS(SYS.USER$)','SYS.USER$') view_prefix FROM v$instance;

SQL> COLUMN con_id_col NEW_VALUE con_id_col NOPRINT
SQL> SELECT DECODE(SUBSTR(version,1,INSTR(version,'.')-1),'12','con_id,','') con_id_col FROM v$instance;

SQL> select &con_id_col name, ptime from &view_prefix.
  2  where type#=1 order by &con_id_col name;
old   1: select &con_id_col name, ptime from &view_prefix.
new   1: select con_id, name, ptime from CONTAINERS(SYS.USER$)
old   2: where type#=1 order by &con_id_col name
new   2: where type#=1 order by con_id, name

    CON_ID NAME                     PTIME
---------- ------------------------ ---------
         1 ANONYMOUS                23-APR-15
...
         1 XDB                      23-APR-15
         1 XS$NULL                  23-APR-15
         3 ANONYMOUS                23-APR-15
...
         3 XDB                      23-APR-15
         3 XS$NULL                  23-APR-15

SQL>

 

A final question might be: why isn’t the PDB$SEED database shown in the results?

The answer is that a new 12c initialization parameter, EXCLUDE_SEED_CDB_VIEW, controls whether the seed database is displayed in CDB_ views (or CONTAINERS() function results). EXCLUDE_SEED_CDB_VIEW is dynamic and session-modifiable:

SQL> show parameter EXCLUDE_SEED_CDB_VIEW

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
exclude_seed_cdb_view                boolean     TRUE

SQL> select con_id, count(1) from cdb_users group by con_id;

    CON_ID   COUNT(1)
---------- ----------
         1         18
         3         20

SQL> alter session set EXCLUDE_SEED_CDB_VIEW=FALSE;

Session altered.

SQL> select con_id, count(1) from cdb_users group by con_id;

    CON_ID   COUNT(1)
---------- ----------
         1         18
         2         17
         3         20

SQL>

 

Other tools

A final question is whether this technique will still work if the SQL script is run through other tools. The answer is: "it depends".

It depends on whether the other tools support the "define" command and the use of script variables. Specifically, Oracle SQL Developer and the newer sqlcl tool do. The above examples work fine in SQL Developer and sqlcl using the standard sqlcl "default" sqlformat. Other sqlformat options in sqlcl show some issues (tested with sqlcl version 4.2.0.15.121.1046).

 

Learn more about Pythian’s expertise in Oracle and MySQL.

Categories: DBA Blogs

Log Buffer #424: A Carnival of the Vanities for DBAs

Tue, 2015-05-26 11:45

This Log Buffer Edition covers various valuable blog posts from the fields of Oracle, SQL Server and MySQL.

Oracle:

  • Oracle Big Data Appliance X5-2 with Big Data SQL for the DBA.
  • Loading, Updating and Deleting From HBase Tables using HiveQL and Python.
  • In keeping with the ODA quarterly patching strategy, Appliance Manager 12.1.2.3 is now available.
  • From time to time someone publishes a query on the OTN database forum and asks how to make it go faster, and you look at it and think, “it’s a nice example to explain a couple of principles because it’s short, easy to understand, obvious what sort of things might be wrong, and easy to fix.”
  • Optimizing the PL/SQL Challenge IV: More OR Condition Woes.

SQL Server:

  • Will RDBMs be obsolete? Should Data Professionals care about Big Data technologies? What is NoSQL? What is Hadoop?
  • In a development team, there are times when the relationships between developers and testers can become strained. How can you turn this potential conflict into something more positive?
  • Michael Fal is a huge advocate of automation and many ways it can improve the lives of developers and DBAs alike, but you can’t just automate all your problems away.
  • One way to handle a very complex database project with several databases and cross references.
  • Building the Ideal VMware-based SQL Server Virtual Machine.

MySQL:

  • Optimizing Out-of-order Parallel Replication with MariaDB 10.0.
  • General-purpose MySQL applications should read MySQL option files like /etc/my.cnf, ~/.my.cnf, … and ~/.mylogin.cnf. But ~/.mylogin.cnf is encrypted.
  • Creating and Restoring Database Backups With mysqldump and MySQL Enterprise Backup.
  • If you don’t know much about bash shell, you should start with the prior post to learn about bash arrays.
  • Installing Kubernetes Cluster with 3 minions on CentOS 7 to manage pods and services.

Learn more about Pythian’s expertise in Oracle, SQL Server and MySQL.

Categories: DBA Blogs

EM12c : Login to GUI with the correct password causes authentication failure

Thu, 2015-05-21 16:47

So the other day I was trying to log in to my EM12c R4 environment with the SSA_ADMINISTRATOR user, and I got the error:

“Authentication failed. If problem persists, contact your system administrator”

I was quite sure the password I had was correct, so I tried with the SYSMAN user and got the same error. To verify the password, I used SYSMAN to log in to the repository database directly, which worked, so I knew something else was wrong.


SQL> connect sysman/
Enter password:
Connected.

So I went to the <gc_inst>/em/EMGC_OMS1/sysman/log/emoms.log and saw the following error:


2015-05-18 21:22:06,103 [[ACTIVE] ExecuteThread: '15' for queue: 'weblogic.kernel.Default (self-tuning)'] ERROR audit.AuditManager auditLog.368 - Could not Log audit data, Error:java.sql.SQLException: ORA-14400: inserted partition key does not map to any partition
ORA-06512: at "SYSMAN.MGMT_AUDIT", line 492
ORA-06512: at "SYSMAN.MGMT_AUDIT", line 406
ORA-06512: at line 1

This led me to believe that JOB_QUEUE_PROCESSES was set to 0, but that wasn't the case: it was set to 50. That value is too low for the EM12c repository, though, so I bumped it up to 1000 and reran the EM12c repository DBMS Scheduler jobs as per the documentation in note 1498456.1:


SQL> show parameter JOB_QUEUE_PROCESSES

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
job_queue_processes integer 50
SQL> alter system set JOB_QUEUE_PROCESSES=1000 scope = both;

System altered.

SQL> show parameter both
SQL> show parameter job

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
job_queue_processes integer 1000
SQL> connect / as sysdba
Connected.
SQL> alter system set job_queue_processes = 0;

System altered.

SQL> connect sysman/alyarog1605
Connected.
SQL> exec emd_maintenance.remove_em_dbms_jobs;

PL/SQL procedure successfully completed.

SQL> exec gc_interval_partition_mgr.partition_maintenance;

PL/SQL procedure successfully completed.

SQL> @$OMS_HOME/sysman/admin/emdrep/sql/core/latest/admin/admin_recompile_invalid.sql SYSMAN
old 11: AND owner = upper('&RECOMPILE_REPOS_USER')
new 11: AND owner = upper('SYSMAN')
old 26: dbms_utility.compile_schema(upper('&RECOMPILE_REPOS_USER'),FALSE);
new 26: dbms_utility.compile_schema(upper('SYSMAN'),FALSE);
old 41: WHERE owner = upper('&RECOMPILE_REPOS_USER')
new 41: WHERE owner = upper('SYSMAN')
old 84: AND owner = upper('&RECOMPILE_REPOS_USER')
new 84: AND owner = upper('SYSMAN')
old 104: AND ds.table_owner = upper('&RECOMPILE_REPOS_USER')
new 104: AND ds.table_owner = upper('SYSMAN')

PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.

SQL> connect / as sysdba
Connected.
SQL> alter system set job_queue_processes = 1000;

System altered.

SQL> connect sysman/
Enter password:
Connected.
SQL> exec emd_maintenance.submit_em_dbms_jobs;

PL/SQL procedure successfully completed.

SQL> commit;

Commit complete.

After this I bounced the OMS but still kept getting the same error. Though the scheduler jobs were fixed, I was now seeing the following error in the emoms.log:


2015-05-18 22:29:09,573 [[ACTIVE] ExecuteThread: '15' for queue: 'weblogic.kernel.Default (self-tuning)'] WARN auth.EMRepLoginFilter doFilter.450 - InvalidEMUserException caught in EMRepLoginFilter: Failed to login using repository authentication for user: SSA_ADMIN
oracle.sysman.emSDK.sec.auth.InvalidEMUserException: Failed to login using repository authentication for user: SSA_ADMIN

So I updated the SYSMAN.MGMT_AUDIT_MASTER table and ran the MGMT_AUDIT_ADMIN.ADD_AUDIT_PARTITION procedure, as stated in document ID 1493151.1:


oracle $ sqlplus

Enter user-name: sysman
Enter password:

SQL> update mgmt_audit_master set prepopulate_days=5 where prepopulate_days is null;

1 rows updated.

SQL> select count(1) from mgmt_audit_master where prepopulate_days is null;

COUNT(1)
----------
0

SQL> exec mgmt_audit_admin.add_audit_partition;

PL/SQL procedure successfully completed.

SQL> commit;

Commit complete.

Once I did this, I was able to log in with all my EM12c administrators without any issues:


oracle@em12cr4.localdomain [emrep] /home/oracle
oracle $ emcli login -username=ssa_admin
Enter password

Login successful

Conclusion

Even though JOB_QUEUE_PROCESSES was not set to 0, its low value was still the cause of the failure. Be careful when setting this parameter, and be sure to follow the latest installation guidelines.
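As a quick sanity check after this kind of incident, a sketch like the following (assuming you can connect to the repository database) confirms the parameter value and that the SYSMAN scheduler jobs are back in place:

-- current value of the parameter
select name, value from v$parameter where name = 'job_queue_processes';

-- SYSMAN scheduler jobs and their state after resubmission
select job_name, state
from dba_scheduler_jobs
where owner = 'SYSMAN';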

Note: This was originally published on rene-ace.com

Categories: DBA Blogs

Log Buffer #423: A Carnival of the Vanities for DBAs

Wed, 2015-05-20 16:36

This Log Buffer edition covers Oracle, SQL Server and MySQL blog posts from all over the blogosphere!


Oracle:

Hey DBAs: you know you can install and run Oracle Database 12c on different platforms, but if you install it in an Oracle Solaris 11 zone, you can take advantage of additional benefits.

Here is a video with Oracle VP of Global Hardware Systems Harish Venkat talking with Aaron De Los Reyes, Deputy Director at Cognizant about his company’s explosive growth & how they managed business functions, applications, and supporting infrastructure for success.

Oracle Unified Directory is an all-in-one directory solution with storage, proxy, synchronization and virtualization capabilities. While unifying the approach, it provides all the services required for high-performance enterprise and carrier-grade environments. Oracle Unified Directory ensures scalability to billions of entries. It is designed for ease of installation, elastic deployments, enterprise manageability, and effective monitoring.

Understanding Flash: Summary – NAND Flash Is A Royal Pain In The …

Extracting Oracle data & Generating JSON data file using ROracle.

SQL Server:

It is no good doing some or most of the aspects of SQL Server security right. You have to get them all right, because any effective penetration of your security is likely to spell disaster. If you fail in any of the ways that Robert Sheldon lists and describes, then you can’t assume that your data is secure, and things are likely to go horribly wrong.

How does a column store index compare to a (traditional )row store index with regards to performance?

Learn how to use the TOP clause in conjunction with the UPDATE, INSERT and DELETE statements.

Did you know that scalar-valued, user-defined functions can be used in DEFAULT/CHECK CONSTRAINTs and computed columns?

Tim Smith blogs about how to measure a behavioral streak with SQL Server, an important skill for determining ROI and extrapolating trends.

Pilip Horan lets us know how to run an SSIS Project as a SQL Job.

MySQL:

Encryption is an important component of secure environments. Being intangible, this property of security doesn’t get enough attention when systems are described. “Encryption support” is often the most detail you can get when asking how secure a system is; other important details are omitted, but the devil is in the details, as we know. In this post I will describe how we secure backup copies in TwinDB.

The fsfreeze command is used to suspend and resume access to a file system. This allows consistent snapshots to be taken of the filesystem. fsfreeze supports Ext3/4, ReiserFS, JFS and XFS.

Shinguz: Controlling worldwide manufacturing plants with MySQL.

MySQL 5.7.7 was recently released (it is the latest MySQL 5.7, and is the first “RC” or “Release Candidate” release of 5.7), and is available for download.

Upgrading Directly From MySQL 5.0 to 5.6 With mysqldump.

One of the cool new features in 5.7 Release Candidate is Multi Source Replication.

 

Learn more about Pythian’s expertise in Oracle, SQL Server and MySQL.

Categories: DBA Blogs

fsfreeze in Linux

Thu, 2015-05-14 10:17

The fsfreeze command is used to suspend and resume access to a file system. This allows consistent snapshots to be taken of the filesystem. fsfreeze supports Ext3/4, ReiserFS, JFS and XFS.

A filesystem can be frozen using following command:

# /sbin/fsfreeze -f /data

Now if you write to this filesystem, the process/command will hang. For example, the following command will be stuck in the D (UNINTERRUPTIBLE_SLEEP) state:

# echo "testing" > /data/file

Only after the filesystem is unfrozen using the following command can it continue:

# /sbin/fsfreeze -u /data

As per the fsfreeze man page: "fsfreeze is unnecessary for device-mapper devices. The device-mapper (and LVM) automatically freezes filesystem on the device when a snapshot creation is requested."
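For filesystems that are not on device-mapper, a minimal wrapper along these lines keeps the freeze window short (a sketch only; take_storage_snapshot is a placeholder for whatever your storage or cloud tooling provides):

#!/bin/bash
# Freeze the filesystem so no new writes land while the snapshot is taken
/sbin/fsfreeze -f /data

# Placeholder: take the storage-level snapshot here (EBS, SAN, etc.)
take_storage_snapshot /data    # hypothetical command

# Unfreeze as soon as the snapshot has been requested
/sbin/fsfreeze -u /data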

fsfreeze is provided by the util-linux package in RHEL systems. Along with userspace support, fsfreeze also requires kernel support.

For example, in the following case, fsfreeze was used on the ext4 filesystem of an AWS CentOS node:

# fsfreeze -f /mysql
fsfreeze: /mysql: freeze failed: Operation not supported

From strace we found that ioctl is returning EOPNOTSUPP:

fstat(3, {st_dev=makedev(253, 0), st_ino=2, st_mode=S_IFDIR|0755,
st_nlink=4, st_uid=3076, st_gid=1119, st_blksize=4096, st_blocks=8,
st_size=4096, st_atime=2014/05/20-10:58:56,
st_mtime=2014/11/17-01:39:36, st_ctime=2014/11/17-01:39:36}) = 0
ioctl(3, 0xc0045877, 0) = -1 EOPNOTSUPP (Operation not
supported)

From latest upstream kernel source:

static int ioctl_fsfreeze(struct file *filp)
{
        struct super_block *sb = file_inode(filp)->i_sb;

        if (!capable(CAP_SYS_ADMIN))
                return -EPERM;

        /* If filesystem doesn't support freeze feature, return. */
        if (sb->s_op->freeze_fs == NULL)
                return -EOPNOTSUPP;

        /* Freeze */
        return freeze_super(sb);
}

EOPNOTSUPP is returned when a filesystem does not support the feature.

When we tested freezing an ext4 filesystem on CentOS using an AWS community AMI, fsfreeze worked fine.

This means that the issue was specific to the kernel of that system. It turned out that the AMI used to build the system had a customized kernel without fsfreeze support.

Categories: DBA Blogs

Ingest a Single Table from Microsoft SQL Server Data into Hadoop

Wed, 2015-05-13 15:13
Introduction

This blog describes a best-practice approach to ingesting data from SQL Server into Hadoop. The case scenario is as follows:

  • Single table ingestion (no joins)
  • No partitioning
  • Complete data ingestion (trash old and replace new)
  • Data stored in Parquet format
Pre-requisites

This example has been tested using the following versions:

  • Hadoop 2.5.0-cdh5.3.0
  • Hive 0.13.1-cdh5.3.0
  • Sqoop 1.4.5-cdh5.3.0
  • Oozie client build version: 4.0.0-cdh5.3.0
Process Flow Diagram

(process flow diagram image)

Configuration
  • Create the following directory/file structure (one per data ingestion process). For a new ingestion program please adjust the directory/file names as per requirements. Make sure to replace the <table_name> tag with your table name.

    <table_name>_ingest
    + hive-<table_name>
        create-schema.hql
    + oozie-properties
        <table_name>.properties
    + oozie-<table_name>-ingest
        + lib
            kite-data-core.jar
            kite-data-mapreduce.jar
            sqljdbc4.jar
        coordinator.xml
        impala_metadata.sh
        workflow.xml
  • The ingestion process is invoked using an oozie workflow. The workflow invokes all steps necessary for data ingestion including pre-processing, ingestion using sqoop and post-processing.

    oozie-<table_name>-ingest
    This directory stores all files that are required by the oozie workflow engine. These files should be stored in HDFS for proper functioning of oozie.

    oozie-properties
    This directory stores the <table_name>.properties file. This file stores the oozie variables, such as database users and name node details, used by the oozie process at runtime.

    hive-<table_name>
    This directory stores a file called create-schema.hql which contains the schema definition of the HIVE tables. This file needs to be run in HIVE only once.
  • Configure files under oozie-<table_name>-ingest

    1. Download kite-data-core.jar and kite-data-mapreduce.jar files from http://mvnrepository.com/artifact/org.kitesdk

    2. Download sqljdbc4.jar from https://msdn.microsoft.com/en-us/sqlserver/aa937724.aspx

    3. Configure coordinator.xml. Copy and paste the following XML.

    <coordinator-app name="<table_name>-ingest-coordinator" frequency="${freq}" start="${startTime}" end="${endTime}" timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
      <action>
        <workflow>
          <app-path>${workflowRoot}/workflow.xml</app-path>
          <configuration>
            <property>
              <name>partition_name</name>
              <value>${coord:formatTime(coord:nominalTime(), 'YYYY-MM-dd')}</value>
            </property>
          </configuration>
        </workflow>
      </action>
    </coordinator-app>

    4. Configure workflow.xml. This workflow has three actions:

       a) mv-data-to-old – Deletes old data before refreshing new
       b) sqoop-ingest-<table_name> – Sqoop action to fetch table from SQL Server
       c) invalidate-impala-metadata – Revalidate Impala data after each refresh

       Copy and paste the following XML.

    <workflow-app name="<table_name>-ingest" xmlns="uri:oozie:workflow:0.2">

      <start to="mv-data-to-old" />

      <action name="mv-data-to-old">
        <fs>
          <delete path='${sqoop_directory}/<table_name>/*.parquet' />
          <delete path='${sqoop_directory}/<table_name>/.metadata' />
        </fs>
        <ok to="sqoop-ingest-<table_name>"/>
        <error to="kill"/>
      </action>

      <action name="sqoop-ingest-<table_name>">
        <sqoop xmlns="uri:oozie:sqoop-action:0.3">
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <prepare>
            <delete path="${nameNode}/user/${wf:user()}/_sqoop/*" />
          </prepare>
          <configuration>
            <property>
              <name>mapred.job.queue.name</name>
              <value>${queueName}</value>
            </property>
          </configuration>
          <arg>import</arg>
          <arg>--connect</arg>
          <arg>${db_string}</arg>
          <arg>--table</arg>
          <arg>${db_table}</arg>
          <arg>--columns</arg>
          <arg>${db_columns}</arg>
          <arg>--username</arg>
          <arg>${db_username}</arg>
          <arg>--password</arg>
          <arg>${db_password}</arg>
          <arg>--split-by</arg>
          <arg>${db_table_pk}</arg>
          <arg>--target-dir</arg>
          <arg>${sqoop_directory}/<table_name></arg>
          <arg>--as-parquetfile</arg>
          <arg>--compress</arg>
          <arg>--compression-codec</arg>
          <arg>org.apache.hadoop.io.compress.SnappyCodec</arg>
        </sqoop>
        <ok to="invalidate-impala-metadata"/>
        <error to="kill"/>
      </action>

      <action name="invalidate-impala-metadata">
        <shell xmlns="uri:oozie:shell-action:0.1">
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <configuration>
            <property>
              <name>mapred.job.queue.name</name>
              <value>${queueName}</value>
            </property>
          </configuration>
          <exec>${impalaFileName}</exec>
          <file>${impalaFilePath}</file>
        </shell>
        <ok to="fini"/>
        <error to="kill"/>
      </action>

      <kill name="kill">
        <message>Workflow failed with error message ${wf:errorMessage(wf:lastErrorNode())}</message>
      </kill>

      <end name="fini" />

    </workflow-app>

5. Configure impala_metadata.sh. This file will execute commands to revalidate impala metadata after each restore. Copy and paste the following data.

#!/bin/bash
export PYTHON_EGG_CACHE=./myeggs
impala-shell -i <hive_server> -q "invalidate metadata <hive_db_name>.<hive_table_name>"
  • Configure files under oozie-properties. Create file oozie.properties with contents as under. Edit the parameters as per requirements.
    # Coordinator schedulings
    freq=480
    startTime=2015-04-28T14:00Z
    endTime=2029-03-05T06:00Z

    jobTracker=<jobtracker>
    nameNode=hdfs://<namenode>
    queueName=<queue_name>

    rootDir=${nameNode}/user/<user>/oozie
    workflowRoot=${rootDir}/<table_name>-ingest

    oozie.use.system.libpath=true
    oozie.coord.application.path=${workflowRoot}/coordinator.xml

    # Sqoop settings
    sqoop_directory=${nameNode}/data/sqoop

    # Hive/Impala settings
    hive_db_name=<hive_db_name>
    impalaFileName=impala_metadata.sh
    impalaFilePath=/user/oozie/<table_name>-ingest/impala_metadata.sh

    # MS SQL Server settings
    db_string=jdbc:sqlserver://<sql_server_host>;databaseName=<sql_server_db_name>
    db_username=<sql_server_username>
    db_password=<sql_server_password>
    db_table=<table_name>
    db_columns=<columns>
  • Configure files under hive-<table_name>. Create a new file create-schema.hql with contents as under.
    DROP TABLE IF EXISTS <hive_table_name>;

    CREATE EXTERNAL TABLE <hive_table_name> (<columns>)
    STORED AS PARQUET
    LOCATION 'hdfs://<namenode>/data/sqoop/<table_name>';

Deployment
  • Create new directory in HDFS and copy files
    $ hadoop fs -mkdir /user/<user>/oozie/<table_name>-ingest
    $ hadoop fs -copyFromLocal <directory>/<table_name>/oozie-<table_name>-ingest/lib /user/<user>/oozie/<table_name>-ingest
    $ hadoop fs -copyFromLocal <directory>/<table_name>/oozie-<table_name>-ingest/coordinator.xml /user/<user>/oozie/<table_name>-ingest
    $ hadoop fs -copyFromLocal <directory>/<table_name>/oozie-<table_name>-ingest/impala_metadata.sh /user/<user>/oozie/<table_name>-ingest
    $ hadoop fs -copyFromLocal <directory>/<table_name>/oozie-<table_name>-ingest/workflow.xml /user/<user>/oozie/<table_name>-ingest
  • Create new directory in HDFS for storing data files
$ hadoop fs -mkdir /user/SA.HadoopPipeline/oozie/<table_name>-ingest
$ hadoop fs -mkdir /data/sqoop/<table_name>
  • Now we are ready to select data in HIVE. Go to URL http://<hive_server>:8888/beeswax/#query.
a. Choose existing database on left or create new.
b. Paste contents of create-schema.hql in Query window and click Execute.
c. You should now have an external table in HIVE pointing to data in hdfs://<namenode>/data/sqoop/<table_name>
  • Create Oozie job
    $ oozie job -run -config /home/<user>/<directory>/<table_name>/oozie-properties/oozie.properties

Validation and Error Handling
  • At this point an oozie job should be created. To validate the oozie job creation open URL http://<hue_server>:8888/oozie/list_oozie_coordinators. Expected output as under. In case of error please review the logs for recent runs.
    (screenshot: Oozie coordinators list)
  • To validate the oozie job is running open URL http://<hue_server>:8888/oozie/list_oozie_workflows/ . Expected output as under. In case of error please review the logs for recent runs.
    (screenshot: Oozie workflows list)
  • To validate data in HDFS execute the following command. You should see a file with *.metadata extension and a number of files with *.parquet extension.
$ hadoop fs -ls /data/sqoop/<table_name>/
  • Now we are ready to select data in HIVE or Impala.
    For HIVE go to URL http://<hue_server>:8888/beeswax/#query
    For Impala go to URL http://<hue_server>:8888/impala
    Choose the newly created database on the left and execute the following SQL: select * from <hive_table_name> limit 10
    You should see data being returned from the newly ingested table.
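Finally, when the Sqoop step misbehaves, it can help to run the equivalent import by hand before re-submitting the workflow. A rough sketch, using the same placeholders as the properties file above:

$ sqoop import \
    --connect "jdbc:sqlserver://<sql_server_host>;databaseName=<sql_server_db_name>" \
    --username <sql_server_username> --password <sql_server_password> \
    --table <table_name> --columns "<columns>" \
    --split-by <db_table_pk> \
    --target-dir /data/sqoop/<table_name> \
    --as-parquetfile \
    --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec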
Categories: DBA Blogs