DBA Blogs

The Best of Both Worlds Regarding Mainframe Storage and the Cloud

Kubilay Çilkara - Tue, 2016-12-13 17:01
It might shock you to hear that managing data has never been more difficult than it is today.  Data is growing at the speed of light, while IT budgets are shrinking at a similar pace.  All of this growth and change is forcing administrators to find more relevant ways to successfully manage and store data.  This is no easy task, as there are many regulatory constraints with respect to data retention, and the business value of the data needs to be considered as well.  Those within the IT world likely remember (with fondness) the hierarchical storage management systems (HSM), which have traditionally played a key role in the mainframe information lifecycle management (ILM).  Though this was once a reliable and effective way to manage company data, gone are the days when businesses can put full confidence in such a method.  The truth of the matter is that things have become much more complicated.

There is a growing need to collect information and data, and the bad news with this is that there is simply not enough money in the budget to handle the huge load.  In fact, not only are budgets feeling the growth, but even current systems can’t keep up with the vast pace of the increase in data and its value.  It is estimated that global data center traffic will soon triple its numbers from 2013.  You can imagine what a tremendous strain this quick growth poses to HSM and ILM.  Administrators are left with loads of questions such as how long must data be kept, what data must be stored, what data is safe for deletion, and when it is safe to delete certain data.  These questions are simply the tip of the iceberg when it comes to data management.  Regulatory requirements, estimated costs, and the issues of backup, recovery and accessibility for critical data are areas of concern that also must be addressed with the changing atmosphere of tremendous data growth.

There is an alluring solution that has come on the scene that might make heads turn with respect to management of stored data.  The idea of hybrid cloud storage is making administrators within the IT world and businesses alike think that there might actually be a way to manage this vast amount of data in a cost effective way.  So, what would this hybrid cloud look like?  Essentially, it would be a combination of capabilities found in both private and public cloud storage solutions.  It would combine on-site company data with storage capabilities found on the public cloud.  Why would this be a good solution?  The reason is because companies are looking for a cost effective way to manage the massive influx of data.  This hybrid cloud solution would offer just that.  The best part is, users would only need to pay for what they use regarding their storage needs.  The goods news is, the options are seemingly unlimited, increasing or decreasing as client needs shift over time.  With a virtualized architecture in place, the variety of storage options are endless.  Imagine what it would be like to no longer be worried about the provider or the type of storage you are managing.  With the hybrid cloud storage system in place, these worries would go out the window.  Think of it as commodity storage.  Those within the business world understand that this type of storage has proven to work well within their spheres, ultimately offering a limitless capacity to meet all of their data storage needs.  What could be better?

In this fast-paced, shifting world, it’s high time relevant solutions come to the forefront that are effective for the growth and change so common in the world of technology today.  Keep in mind that the vast influx of data could become a huge problem if solutions such as the hybrid cloud options are not considered.  This combination of cloud storage is a great way to lower storage costs as the retention time increases, and the data value decreases.  With this solution, policies are respected, flexibility is gained, and costs are cut.  When it comes to managing data effectively over time, the hybrid cloud storage system is a solution that almost anyone could get behind!

Jason Zhang is the product marketing person for Rocket Software's Backup, Storage, and Cloud solutions.

Categories: DBA Blogs

Cutting Edge IMS Database Management

Kubilay Çilkara - Tue, 2016-12-13 16:39
Never before has the management of a database been more difficult for those within the IT world.  This should not come as a shock to those reading this, especially when you consider how vast the data volumes and streams currently are.  The unfortunate news is that these volumes and streams are not shrinking anytime soon, but IT budgets ARE, and so are things such as human resources and technical skills.  The question remains...how are businesses supposed to manage these databases in the most effective way?  Well, the very factors mentioned above make automation an extremely attractive choice.

Often times, clients have very specific requirements when it comes to automating their IMS systems.  Concerns arise such as how to make the most of CPU usage, what capabilities are available, strategic advantages, and how to save with respect to OpEx.  These are not simply concerns, but necessities, and factors that almost all clients consider.  Generally speaking, these requirements can be streamlined into 2 main categories.  The first is how to monitor database exceptions and the second is how to implement conditional reorgs. 

Regarding the monitoring of database exceptions, clients must consider how long this process actually takes without automation.  Without this tool, it needs to be accomplished manually and requires complicated analysis by staff that is well-experienced in this arena. However, when automation is utilized, policies are the monitors of databases.  In this instance, exceptions actually trigger a notification by email which ultimately reports what sort of help is necessary in order to help find a solution to the problem. 

Now, don’t hear what we are NOT saying here.  Does automation make life easier for everyone?  Yes!  Is implementing automation an easy and seamless process? No!  In fact, automation requires some detailed work prior to setting it up.  This work is accomplished so that it is clear what needs to be monitored and how that monitoring should be carried out.  Ultimately, a monitor list is created which will help to define and assign various policies. Overall, this list will make clear WHO gets sent a notification in the event of an exception, as well as what type of notification will be sent.  Even further, this list will show what the trigger of the exception was, and in the end will assign a notification list to the policy.  It may sound like a lot of work up front, but one could argue that it is certainly worth it in the long run. 

When it comes to conditional reorgs, automation saves the day once again.  Many clients can prove to be quite scattered with respect to their reorg cycle, some even organizing them throughout a spotty 4-week cycle.  The issue here is that reorg jobs are scheduled during a particular window of time, without the time or resources even being evaluated.  When clients choose to automate a reorg, automation will help determine the necessity of a reorg.  The best part of the whole process is that no manual work is needed!  In fact, the scheduler will continue to submit the reorg jobs, but execute them only if necessary.  Talk about a good use of both time and resources!  It ends up being a win-win situation. 

Automation often ends up “saving the day” when it comes to meeting and exceeding client goals.  Did you know that individual utilities -when teamed with the functionality and the vast capabilities of the automation process- actually improves overall business performance and value?  It is true.  So, if you are looking for the most cutting edge way to manage your IMS database, looking no further than the process of automation!

Jason Zhang is the product marketing person for Rocket Software's Backup, Storage, and Cloud solutions.
Categories: DBA Blogs

Investigating IO Performance on Amazon RDS for Oracle

Pythian Group - Tue, 2016-12-13 15:55

I’ve recently been involved in quite a few database migrations to Oracle RDS. One thing that I had noticed when dealing with post-migration performance issues was related to queries that used TABLE SCAN FULL in their execution. It seemed, that in many cases, it just took a single query to max out the allocated IOPS (IOs per second) or bandwidth, which in turn would caused overall slowness of the RDS instance.

The search in documentation showed that it could have been caused by how IO operations are counted on Amazon RDS, as it’s quite different from what a routine Oracle DBA like me would expect. For multi-block reads the database (depending on storage) would typically issue IOs of size up to 1MB, so if an 8K block size was used the table scans would read up to 128 blocks in a single IO of db file scattered read or direct path read.

Now, pay attention to what the AWS documentation says:
While Provisioned IOPS (io1 storage) can work with I/O sizes up to 256 KB, most databases do not typically use such large I/O. An I/O request smaller than 32 KB is handled as one I/O; for example, 1000 16 KB I/O requests are treated the same as 1000 32 KB requests. I/O requests larger than 32 KB consume more than one I/O request; Provisioned IOPS consumption is a linear function of I/O request size above 32 KB. For example, a 48 KB I/O request consumes 1.5 I/O requests of storage capacity; a 64 KB I/O request consumes 2 I/O requests, etc. … Note that I/O size does not affect the IOPS values reported by the metrics, which are based solely on the number of I/Os over time. This means that it is possible to consume all of the IOPS provisioned with fewer I/Os than specified if the I/O sizes are larger than 32 KB. For example, a system provisioned for 5,000 IOPS can attain a maximum of 2,500 IOPS with 64 KB I/O or 1,250 IOPS with 128 KB IO.
… and …
I/O requests larger than 32 KB are treated as more than one I/O for the purposes of PIOPS capacity consumption. A 40 KB I/O request will consume 1.25 I/Os, a 48 KB request will consume 1.5 I/Os, a 64 KB request will consume 2 I/Os, and so on. The I/O request is not split into separate I/Os; all I/O requests are presented to the storage device unchanged. For example, if the database submits a 128 KB I/O request, it goes to the storage device as a single 128 KB I/O request, but it will consume the same amount of PIOPS capacity as four 32 KB I/O requests.

Based on the statements above it looked like the large 1M IOs issued by the DB would be accounted as 32 separate IO operations, which would obviously exhaust the allocated IOPS much sooner than expected. The documentation talks only about Provisioned IOPS, but I think this would apply to General Purpose SSDs (gp2 storage) too, for which the IOPS baseline is 3 IOPS/GB (i.e. 300 IOPS if allocated size is 100GB of gp2).

I decided to do some testing to find out how RDS for Oracle handles large IOs.

The Testing

For testing purposes I used the following code to create a 1G table (Thanks Connor McDonald and AskTom):

ORCL> create table t(a number, b varchar2(100)) pctfree 99 pctused 1;

Table T created.

ORCL> insert into t  values (1,lpad('x',100));

1 row inserted.

ORCL> commit;

Commit complete.

ORCL> alter table t minimize records_per_block;

Table T altered.

ORCL> insert into t select rownum+1,lpad('x',100) from dual connect by level<131072; 131,071 rows inserted. ORCL> commit;

Commit complete.

ORCL> exec dbms_stats.gather_table_stats(user,'T');

PL/ORCL procedure successfully completed.

ORCL> select sum(bytes)/1024/1024 sizemb from user_segments where segment_name='T';

ORCL> select value/1024/1024 buffer_cache from v$sga where name='Database Buffers';

The code for testing will be:

exec rdsadmin.rdsadmin_util.flush_buffer_cache;
alter session set "_serial_direct_read"=always;
alter session set db_file_multiblock_read_count=&1;

-- Run FTS against T table forever.
  n number:=1;
  while n>0
    select /*+ full(t) */ count(*) into n from t;
  end loop;

Basically, I’ll flush the buffer cache, which will force the direct path reads by setting _serial_direct_read to “ALWAYS”, and then, will choose the db_file_multiblock_read_count based on how big IOs I want to issue (note, by default the db_file_multiblock_read_count is not set on RDS, and it resolves to 128, so the maximum size of an IO from the DB is 1 MB), I’ll test with different sizes of IOs, and will Capture the throughput and effective IOPS by using the “Enhanced Monitoring” of the RDS instance.

Side-note: the testing I had to do turned out to be more complex than I had expected before I started. In few cases, I was limited by the instance throughput before I could reach the maximum allocated IOPS, and due to this, the main testing needed to be one on large enough instance (db.m4.4xlarge), that had more of the dedicated EBS-throughput.

The ResultsProvisioned IOPS storage

Testing was done on a db.m4.4xlarge instance that was allocated 100GB of io1 storage of 1000 Provisioned IOPS. The EBS-optimized throughput for such instance is 256000 KB/s.
The tests were completed by using db_file_multiblock_read_count of 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 16, 32, 64 and 128.
For each test the Throughput and IO/s were captured (from RDS CloudWatch graphs), and also the efficient IO size was derived.
The DB instance was idle, but still, there could be few small IO happening during the test.

Provisioned IOPS Measured Throughput

Provisioned IOPS Measured Throughput

Provisioned IOPS Measured IO/s

Provisioned IOPS Measured IO/s

From the graphs above the following features that are not documented can be observed:

  • The RDS instance is dynamically choosing the physical IO size (I’ll call them “physical“, just to differentiate that these are the IOs to storage, while in fact, that’s only what I see in the CLoudWatch graphs, the real physical IO could be some something different) based on the size of the IO request from the database. The possible physical IO sizes appear to be 16K, 32K, 64K, 128K and probably also 8K (this could also be in fact 16K physical IO reading just 8K of data)
  • The IOPS limit applies only to smaller physical IOs sizes (up to 32K), for larger physical IOs (64K, 128K) the throughput is the limiting factor of the IO capability. The throughput limit appears to be quite close to the maximum throughput that the instance is capable of delivering, but at this point, it’s not clear how the throughput limit for particular situation is calculated.
Throughput Limits for Provisioned IOPS

I ran additional tests on differently sized instances with io1 storage to understand better how the maximum throughput was determined. The graph below represents the throughput achieved on different instances, but all had the same 100G of 1000 PIOPS io1 storage. The throughput was done by using db_file_multiblock_read_count=128:

PIOPS Throughput by Instance Type

PIOPS Throughput by Instance Type

it appears that the maximum throughput is indeed limited by the instance type, except for the very largest instance db.m4.10xlarge (For this instance the situation is somewhat weird even in the documentation because the maximum throughput is mentioned as 500 MB/s, but the maximum throughput for a single io1 EBS volume, which should be there underneath the RDS, is just 320 MB/s, and I was unable to reach any of these limits)

General Purpose SSD storage

Testing was done on a db.m4.4xlarge instance that was allocated 100GB of gp2 storage with 300 IOPS baseline. The EBS-optimized throughput for such instance is 256000 KB/s.
The tests were conducted similarly to how they were done for Provisioned IOPS above (note, this is the baseline performance, not burst performance)

General Purpose SSD Measured Throughput

General Purpose SSD Measured Throughput

General Purpose SSD Measured IO/s

General Purpose SSD Measured IO/s

Similarly to Provisioned IOPS, the General Purpose SSD storage behaves differently from what’s explained in the documentation:

  • The physical IO size again is calculated dynamically based on the size of the IO request from the database. The possible sizes appear to be the same as for io1: (8K), 16K, 32K, 64K and 128K.
  • The IOPS limit (to baseline level) appears to apply to IO sizes only up to 16K (compared to 32K in case of Provisioned IOPS), for larger physical IOs starting from 32K, the limit appears to be throughput-driven.
  • It’s not clear how the throughput limit is determined for the particular instance/storage combination, but in this case, it appeared to be around 30% of the maximum throughput for the instance, however, I didn’t confirm the same ratio for db.m4.large where the maximum achievable throughput depended on the allocated size of the gp2 storage.
Burst Performance

I haven’t collected enough data to derive anything concrete, but during my testing I observed that Burst performance applied to both maximum IOPS and also the maximum throughput. For example, while testing on db.m4.large (max instance throughput of 57600 KB/s) with 30G of 90 IOPS baseline performance, I saw that for small physical IOs it allowed bursting up to 3059 IOPS for short periods of time, while normally it indeed allowed only 300 IOPS. For larger IOs (32K+), the baseline maximum throughput was around 24500 KB/s, but the burst throughput was 55000 KB/s

Throughput Limits for General Purpose SSD storage

I don’t really know how the maximum allowed throughput is calculated for different instance type and storage configuration for gp2 instances, but one thing is clear: that instance size, and size of the allocated gp2 storage are considered in determining the maximum throughput. I was able to achieve the following throughput measures when gp2 storage was used:

  • 75144 KB/s (133776 KB/s burst) on db.m4.4xlarge (100G gp2)
  • 54500 KB/s (same as burst, this is close to the instance limit) on db.m4.large (100G gp2)
  • 24537 KB/s (54872 KB/s burst) on db.m4.large (30G gp2)
  • 29116 KB/s (burst was not measured) on db.m4.large (40G gp2)
  • 37291 KB/s (burst was not measured) on db.m4.large (50G gp2)

The testing provided some insight into how the maximum performance of IO is determined on Amazon RDS for Oracle, based on the instance type, storage type, and volume size. Despite finding some clues I also understood that managing IO performance on RDS is far more difficult than expected for mixed size IO requests that are typically issued by Oracle databases. There are many questions that still need to be answered (i.e. how the maximum throughput is calculated for gp2 storage instances) and it’d take many many hours to find all the answers.

On the other-hand, the testing already revealed a few valuable findings:

  1. Opposite to the documentation that states that all IOs are measured and accounted in 32KB units, we found that an IU reported by amazon can be of size 8K (probably), 16K, 32K, 64K and 128K
  2. For small physical IOs (up to 32K in case of Provisioned IOPS and up to 16K in case of General Purpose SSD) the allocated IOPS is used as the limit for the max performance.
  3. For larger physical IOs (from 64K in case of Provisioned IOPS and from 32K in case of General Purpose SSD) the throughput is used as the limit for the max performance, and the IOPS limit no longer applies.
  4. The Burst performance applies to both IOPS and throughput

P.S. As to my original issue of a single TABLE SCAN FULL severely impacting the overall performance, I found that in many cases we were using small RDS instances db.m3.large or db.m4.large, for which the maximum throughput was ridiculously small, and we were hitting the throughput limitation, not the IOPS limit that actually didn’t apply to the larger physical IOs on gp2 storage.

Categories: DBA Blogs

Partner Webcast - Oracle Cloud Machine Technical Overview

Oracle Cloud Machine delivers Oracle Cloud Services direct in your data center, fully managed by Oracle, so that you can take advantage of the agility, innovation and subscription-based pricing...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Query on application continuity

Tom Kyte - Tue, 2016-12-13 02:06
I'm thinking of configuring Application Continuity for my 2 node RAC (12c) database where my application is running on Weblogic server. Before I proceed with the setup of AC, I need to clarify/confirm few things regarding AC: 1. AC does not requir...
Categories: DBA Blogs

in wait event "disk file operations i/o", P1/p2/p3 is 8/0/8, the first 8 refers to the file type, what kind of file does the file type 8 stands for?

Tom Kyte - Tue, 2016-12-13 02:06
one of our DBs experienced a performance slow problem after a network issue, according to AWR report, the 2 bottlenecks are "disk file operations i/o" and "session login/logoff", since we have db audit turned on and it writes audit info to reboot os ...
Categories: DBA Blogs

Package Performance: Cloned vs Original

Tom Kyte - Tue, 2016-12-13 02:06
Hello Tom, This is my First POST. I experienced a performance issue with one of the package and in order to debug/troubleshoot the package, I cloned the original package and without changing any content of the cloned package; I ran a test for a...
Categories: DBA Blogs

Index Clustering Factor and Insert

Tom Kyte - Tue, 2016-12-13 02:06
HI Connor, I followed the below article by Tom, http://www.oracle.com/technetwork/issue-archive/2012/12-sep/o52asktom-1735913.html Lets ignore the CHAR, VARCHAR, and NUMBER context, and focus only on the clustering factor. Then, what woul...
Categories: DBA Blogs

APPEND hint in MERGE without INSERT clause

Tom Kyte - Tue, 2016-12-13 02:06
Hello, from the performance point of view: does it make sense to add the APPEND hint into the MERGE statement in that case there is only an UPDATE clause? If yes, what is the benefit? Thank you, Zdenek
Categories: DBA Blogs

Is there any way to convert an complex object type to JSON in 12c

Tom Kyte - Tue, 2016-12-13 02:06
Hello Tom We have a procedure which has an complex object type as output (we didn use Refcursor as we had lot of reusable codes in case of using object). This data is populated in Application layer for display and the same is moved back to a JSON...
Categories: DBA Blogs

ORA-02070: database does not support in this context

Tom Kyte - Mon, 2016-12-12 07:46
Hi, I am facing the below exception - ORA-02070: database does not support in this context Here is the query I am using - INSERT INTO table_1 (a,b,c) SELECT table_name , partition_name , subpartition_name FROM user_subpartitions; ...
Categories: DBA Blogs

Problem with PL/SQL, Cursor and trigger

Tom Kyte - Mon, 2016-12-12 07:46
Hi, I have some little problem with my exercises: i have these two tables CREATE TABLE PRENOTAZIONI ( CodPrenotazioni NUMBER(5) PRIMARY KEY, DataPrenotazioni DATE DEFAULT SYSDATE NOT NULL, CodProg NUMBER (3) NOT NULL, CodUtente VARCHAR2 (20) N...
Categories: DBA Blogs

Date Validation without function

Tom Kyte - Mon, 2016-12-12 07:46
Hi There, I've a string column and i would like to validate it to date between 01 Jan 1930 to 01 Jan 2030, Please help me. Date might be anything like below 1226547879 basdhbcus489 34^&dsulch56
Categories: DBA Blogs

Globally unique integers

Tom Kyte - Mon, 2016-12-12 07:46
My application uses NUMBER columns as primary keys, and uses sequences to populate the column (via a trigger with the typical "if :new.customer_id is null then select seq_customer_id.nextval into :new.customer_id from dual" construct). Now, I woul...
Categories: DBA Blogs

Most efficient way to UNPIVOT a table with large # of columns

Tom Kyte - Mon, 2016-12-12 07:46
Database is At this point, cannot use full featured FLASHBACK ARCHIVE with CONTEXT... but it would have the same issues I think <b>ENVIRONMENT</b> I have a change logging table (e.g. CHANGESTABLE) that is basically the mirror image col...
Categories: DBA Blogs

ORA-01460: unimplemented or unreasonable conversion requested in /var/www/html/rest/price/resolve/ResolvePrice.php

Tom Kyte - Mon, 2016-12-12 07:46
when i pass the hardcoded values $PRODUCT_NUM_ARR and $MEMBER_NAME through an oracle bind variable to execute a stored function,it works fine and i get the result. But when i pass the same values from an array i get the ORA error. I have found diffic...
Categories: DBA Blogs

Limit DOP of automatic statistics gathering

Tom Kyte - Mon, 2016-12-12 07:46
Tom - we run a RAC database with Automatic DOP enabled. The 2 rack nodes have 16 CPU cores each (32 with Hyperthreading) and 128GB of RAM. The database did run smoothly for over a year. The OS is Redhat Enterprise Linux 6.5 on both nod...
Categories: DBA Blogs

12.2 Index Advanced Compression “High” Part II (One Of My Turns)

Richard Foote - Mon, 2016-12-12 00:17
In Part I, I introduced the new Index Advanced Compression default value of “HIGH”, which has the potential to significantly compress indexes much more than previously possible. This is due to new index compression algorithms that do more than simply de-duplicate indexed values within a leaf block. Previously, any attempt to completely compress a Unique […]
Categories: DBA Blogs

Google Cloud Platform Fundamentals in Sydney

Pakistan's First Oracle Blog - Sun, 2016-12-11 22:58
Just finished up one day training at Google's Sydney office in Google Cloud Platform Fundamentals. GCP is pretty cool and I think I like it.

Lots of our customers at Pythian are already hosting, migrating and thinking of doing so on cloud. Pythian already has a huge presence in cloud using various technologies.

So it was good to learn something about the Google's cloud offering. It was a pleasant surprise as it all made sense. From App engine to compute engine and from big table to big query, the features are sound, mature and ready to use.

The dashboard is simple too. I will be blogging more about it as I play with it in coming days.
Categories: DBA Blogs


Tom Kyte - Sun, 2016-12-11 13:26
Hi,guy! How can I improve my rman backup speed? Please more details! Thanks a lot!
Categories: DBA Blogs


Subscribe to Oracle FAQ aggregator - DBA Blogs