Feed aggregator

Storage on Azure

Pat Shuff - Thu, 2016-04-21 01:07
Yesterday we were out on a limb. Today we are going to be skating on thin ice. Not only do I know less about Azure than AWS, but Microsoft has significantly different thoughts and solutions on storage than the other two cloud vendors. First, let's look at the available literature on Azure storage.

There are four types of storage available with Azure storage services: blob storage, table storage, queue storage, and file storage. Blob storage is similar to the Oracle Block Storage or Amazon S3 storage. It provides blocks of pages that can be used for documents, large log files, backups, databases, videos, and so on. Blobs are objects placed inside of containers that have characteristics and access controls. Table storage offers the ability to store key/attribute entries in a semi-structured dataset similar to a NoSQL database. The queue storage provides a messaging system so that you can buffer and sequence events between applications. The fourth and final type is file-based storage similar to Dropbox or Google Docs. You can read and write files and file shares and access them through SMB file mounts on Windows systems.

Azure storage does give you the option of deciding upon your reliability model by selecting the replication model. The options are locally triple redundant storage, replication between two data centers, replication between different geographical locations, or read access geo-redundant storage.

Since blob storage is probably more relevant for what we are looking for, let's dive a little deeper into this type of storage. Blobs can be allocated either as block blobs or page blobs. Block blobs are aggregations of blocks that can be allocated in different sizes. Page blobs are made up of smaller, fixed-size pages of 512 bytes each. Page blobs are the foundation of virtual machines and are used by default to support operating systems running in a virtual machine. Blobs are allocated into containers and inherit the characteristics of the container. Blobs are accessed via REST APIs. The address of a blob is formatted as http://(account-name) Note that the account name is defined by the user. It is important to note that the account-name is not private to your account: it has to be unique across all of Azure. This is something that you create and Microsoft adds it to their DNS so that your IP address on the internet can be found. You can't choose simple names like test, testing, my, or other common terms because they have been allocated by someone else.
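Because page blobs are built from 512-byte pages, sizes used with them generally have to land on a 512-byte boundary. A minimal Python sketch of that rounding follows; the helper name round_up_to_page is purely illustrative and is not part of any Azure SDK.

```python
PAGE_SIZE = 512  # page blobs are built from fixed 512-byte pages


def round_up_to_page(size_in_bytes: int) -> int:
    """Return the smallest multiple of PAGE_SIZE that can hold size_in_bytes."""
    if size_in_bytes < 0:
        raise ValueError("size must not be negative")
    # Ceiling division without floating point, then scale back to bytes.
    return -(-size_in_bytes // PAGE_SIZE) * PAGE_SIZE
```

A 513-byte payload, for example, would need two pages (1024 bytes) under this rule.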

To begin the process we need to log into the Azure portal and browse for the Storage create options.

Once we find the storage management page we have to click the plus button to add a new storage resource.

It is important to create a unique name. This name will be used as an extension of the REST api and goes in front of the server address. This name must be unique so picking something like the word "test" will fail since someone else has already selected it.

In our example, we select wwpf which is an abbreviation for a non-profit that I work with, who we play for. We next need to select the replication policy to make sure that the data is highly available.

Once we are happy with the name, replication policy, resource group, and payment method, we can click Create. It takes a while so we see a deploying message at the top of the screen.

When we are finished we should see a list of storage containers that we have created. We can dive into the containers and see what services each contains.

Note that we have the option of blob, table, queue, and files at this point. We will dive into the blob part of this to create raw blocks that can be used for backups, holding images, and generic file storage. Clicking on the blob services allows us to create a blob container.

Note that the format of the container name is critical. You can't use special characters or capital letters. Make sure that you follow the naming convention for container names.
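The naming convention can be checked before you ever reach the portal. The sketch below assumes the rules Microsoft publishes for container names (3 to 63 characters; lowercase letters, digits, and hyphens; starting with a letter or digit; every hyphen surrounded by letters or digits). Those rules come from Azure's documentation rather than from this walkthrough, and the function name is hypothetical.

```python
import re

# Assumed rule set (from Azure's published naming rules, not from this post):
# lowercase letters, digits and single hyphens, never leading or trailing.
_CONTAINER_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def looks_like_valid_container_name(name: str) -> bool:
    """Check the format only; it cannot tell whether the container already exists."""
    if not 3 <= len(name) <= 63:
        return False
    return _CONTAINER_NAME.fullmatch(name) is not None
```

With these assumptions a name such as vm-images passes, while Backups (capital letter) and my--images (consecutive hyphens) are rejected.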

We are going to select a blob type container so that we have access to raw blocks.

When the container is created we can see the REST api point for the newly created storage.

We can examine the container properties by clicking on the properties button and looking at when it was created, lease information, file count, and other things related to container access rights.

The easiest way to access this newly created storage is to do the same thing that we did with Oracle Storage. We are going to use the CloudBerry Explorer. In this GUI tool we will need to create an attachment to the account. Note that the tool used for Azure is different from the Oracle and Amazon tools. Each costs a little money and they are not the same tool, unfortunately. They also only work on a Windows desktop, which is challenging if you use a Mac or Linux desktop.

To figure out your access rights, go to the storage management interface and click on the key at the top right. This should open up a properties screen showing you the account and shared access key.

From here we can access the Azure blob storage and drag and drop files. We first add the account information then navigate to the blob container and can read and write objects.

In this example, we are looking at virtual images located on our desktop "E:\" drive and can drag and drop them into a blob container for use by an Azure compute engine.

In summary, Azure storage is very similar to Amazon S3 and Oracle Storage Cloud Services. The cost is similar. The way we access it is similar. The way we protect and restrict access to it is similar. We can address it through a REST API (which we did not detail) and can access it from our desktop or compute server running in Azure. Overall, storage in the cloud is storage in the cloud. You need to examine your use cases and see which storage type works best for you. Microsoft does have an on-premise gateway product called Azure StorSimple which is similar to the Amazon Storage Gateway or the Oracle Cloud Storage Appliance. It is more of a hardware solution that attaches via iSCSI to existing servers.

Learning to answer questions for yourself!

Tim Hall - Thu, 2016-04-21 00:56

It’s not important that you know the answer. It’s important you know how to get the answer!

I’m pretty sure I’ve written this before, but I am constantly surprised by some of the questions that come my way. Not surprised that people don’t know the answer, but surprised they don’t know how to get the answer. The vast majority of the time someone asks me a question that I can’t answer off the top of my head, this is what I do in this order.

  1. Google their question, often using the subject line of their post or email. A lot of the time, the first couple of links will give me the answer. Sometimes it’s one of my articles that gives me the answer.

Data Encryption at Rest in Oracle MySQL 5.7

Pythian Group - Wed, 2016-04-20 12:28


I’ve previously evaluated MariaDB’s 10.1 implementation of data encryption at rest, and recently did the same for Oracle’s implementation in their MySQL 5.7.


First, here’s a walkthrough of enabling encryption for MySQL 5.7:

1. Install keyring plugin.

1a. Add the following to the [mysqld] section of /etc/my.cnf:


1b. Restart the server:

service mysqld restart

1c. Verify:

| keyring_file | ACTIVE        |

2. Ensure innodb_file_per_table is on.

2a. Check.

mysql> show global variables like 'innodb_file_per_table';
| Variable_name         | Value |
| innodb_file_per_table | ON    |

2b. If OFF, add the following to the [mysqld] section of /etc/my.cnf, restart, and alter each existing table to move it to its own tablespace:


Get list of available InnoDB tables:

mysql> select table_schema, table_name, engine from information_schema.tables where engine='innodb' and table_schema not in ('information_schema');

Run ALTER … ENGINE=INNODB on each of the above InnoDB tables:



Next, I walked through some testing.

1. Create some data.

[root@localhost ~]# mysqlslap --concurrency=50 --number-int-cols=2 --number-char-cols=3 --auto-generate-sql --auto-generate-sql-write-number=10000 --no-drop

2. Observe the mysqlslap.t1 table is not automatically encrypted. Unlike MariaDB’s implementation, there is not an option to encrypt tables by default.

2a. Via the mysql client:

Empty set (0.05 sec)

2b. Via the command line:

(Install xxd if required.)

[root@localhost ~]# yum install vim-common
[root@localhost ~]# xxd /var/lib/mysql/mysqlslap/t1.ibd | grep -v "0000 0000" | less
0010dc0: 5967 4b30 7530 7942 4266 664e 6666 3143  YgK0u0yBBffNff1C
0010dd0: 5175 6470 3332 536e 7647 5761 3654 6365  Qudp32SnvGWa6Tce
0010de0: 3977 6576 7053 3730 3765 4665 4838 7162  9wevpS707eFeH8qb
0010df0: 3253 5078 4d6c 6439 3137 6a7a 634a 5465  2SPxMld917jzcJTe

3. Insert some identifiable data into the table:

mysql> insert into mysqlslap.t1 values (1,2,"private","sensitive","data");
Query OK, 1 row affected (0.01 sec)

mysql> select * from mysqlslap.t1 where charcol2="sensitive";
| intcol1 | intcol2 | charcol1 | charcol2  | charcol3 |
|       1 |       2 | private  | sensitive | data     |
1 row in set (0.02 sec)

4. Observe this data via the command line:

[root@localhost ~]# xxd /var/lib/mysql/mysqlslap/t1.ibd | grep -v "0000 0000" | less
04fa290: 0002 7072 6976 6174 6573 656e 7369 7469  ..privatesensiti

5. Encrypt the mysqlslap.t1 table:

mysql> alter table mysqlslap.t1 encryption='Y';
Query OK, 10300 rows affected (0.31 sec)
Records: 10300  Duplicates: 0  Warnings: 0

6. Observe the mysqlslap.t1 table is now encrypted:

6a. Via the mysql client:

| mysqlslap    | t1         | ENCRYPTION="Y" |

6b. Via the command line:

[root@localhost ~]# xxd /var/lib/mysql/mysqlslap/t1.ibd | grep "private"
[root@localhost ~]#

6c. Observe snippet of the file:

[root@localhost ~]# xxd /var/lib/mysql/mysqlslap/t1.ibd | grep -v "0000 0000" | less
0004160: 56e4 2930 bbea 167f 7c82 93b4 2fcf 8cc1  V.)0....|.../...
0004170: f443 9d6f 2e1e 9ac2 170a 3b7c 8f38 60bf  .C.o......;|.8`.
0004180: 3c75 2a42 0cc9 a79b 4309 cd83 da74 1b06  <u*B....C....t..
0004190: 3a32 e104 43c5 8dfd f913 0f69 bda6 5e76  :2..C......i..^v

7. Observe redo log is not encrypted:

[root@localhost ~]# xxd /var/lib/mysql/ib_logfile0 | less
23c6930: 0000 0144 0110 8000 0001 8000 0002 7072
23c6940: 6976 6174 6573 656e 7369 7469 7665 6461  ivatesensitiveda
23c6950: 7461 3723 0000 132e 2f6d 7973 716c 736c  ta7#..../mysqlsl

This is expected because the documentation reports encryption of files outside the tablespace is not supported: “Tablespace encryption only applies to data in the tablespace. Data is not encrypted in the redo log, undo log, or binary log.”


I found in my testing of MariaDB’s implementation of data encryption at rest that there were still places on the file system that a bad actor could view sensitive data. I’ve found the same in this test of Oracle’s implementation. Both leave data exposed in log files surrounding the tablespace files.
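The xxd and grep checks used in the walkthrough can also be sketched in Python. One reason to do so: xxd prints 16 bytes per row, so a string that straddles a row boundary may be missed by grep, whereas a byte-level search does not depend on row layout. This is only an illustrative sketch; the function name and chunk size are my own choices, and the paths you pass in are whichever tablespace or log files you want to inspect.

```python
def find_plaintext(path: str, needle: bytes, chunk_size: int = 1 << 20) -> list:
    """Return the byte offsets at which needle occurs in the file at path."""
    if not needle:
        raise ValueError("needle must not be empty")
    offsets = []
    overlap = len(needle) - 1
    position = 0   # file offset of the first byte held in `window`
    window = b""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            window += chunk
            start = 0
            while True:
                hit = window.find(needle, start)
                if hit == -1:
                    break
                offsets.append(position + hit)
                start = hit + 1
            # Keep a tail shorter than the needle so that matches spanning
            # two chunks are found exactly once.
            keep = window[-overlap:] if overlap else b""
            position += len(window) - len(keep)
            window = keep
    return offsets
```

Calling find_plaintext on the t1.ibd file with b"sensitive" should return an empty list once the table is encrypted, while the same call on ib_logfile0 would be expected to still report offsets, matching the observations in steps 6 and 7.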


As a bonus to this walkthrough, during this testing, the table definition caught my eye:

mysql> show create table mysqlslap.t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `intcol1` int(32) DEFAULT NULL,
  `intcol2` int(32) DEFAULT NULL,
  `charcol1` varchar(128) DEFAULT NULL,
  `charcol2` varchar(128) DEFAULT NULL,
  `charcol3` varchar(128) DEFAULT NULL
1 row in set (0.00 sec)

As discussed previously, the MariaDB implementation does not include the “encrypted=yes” information in the table definition when tables are implicitly encrypted.

I was curious what would happen if I did a mysqldump of this encrypted table and attempted to restore it to a nonencrypted server. DBAs expect mysqldump to create a portable file to recreate the table definition and data on a different version of mysql. During upgrades, for example, you might expect to use this for rollback.

Here is my test. I first did the dump and looked inside the file.

[root@localhost ~]# mysqldump mysqlslap t1 > mysqlslap_t1_dump
[root@localhost ~]# less mysqlslap_t1_dump
  `intcol1` int(32) DEFAULT NULL,
  `intcol2` int(32) DEFAULT NULL,
  `charcol1` varchar(128) DEFAULT NULL,
  `charcol2` varchar(128) DEFAULT NULL,
  `charcol3` varchar(128) DEFAULT NULL

INSERT INTO `t1` VALUES (

As expected, that definition makes the dump less portable. The restore from dump is not completed and throws an error (this is not remedied by using --force):

On a slightly older 5.7 version:

mysql> select version();
| version() |
| 5.7.8-rc  |

[root@centosmysql57 ~]# mysql mysqlslap < mysqlslap_t1_dump
ERROR 1064 (42000) at line 25: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ENCRYPTION='Y'' at line 7

On a different fork:

MariaDB [(none)]> select version();
| version()       |
| 10.1.12-MariaDB |
1 row in set (0.00 sec)

[root@maria101 ~]# mysql mysqlslap < mysqlslap_t1_dump
ERROR 1911 (HY000) at line 25: Unknown option 'ENCRYPTION'

This doesn’t have anything to do with the encrypted state of the data in the table, just the table definition. I do like the encryption showing up in the table definition, for better visibility of encryption. Maybe the fix is to have mysqldump strip this when writing to the dump file.
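Until something like that exists, the option can be stripped from the dump file before restoring it elsewhere. The following Python sketch is a workaround of my own, not a mysqldump feature. It assumes the option appears in the dump exactly as quoted in the error messages above (ENCRYPTION='Y') and only touches lines that belong to a CREATE TABLE statement, so row data is left alone.

```python
import re

# Matches the table option as it appears in the error messages above;
# the exact layout of the dump file is an assumption.
_ENCRYPTION_OPTION = re.compile(r"[ \t]*ENCRYPTION='[YN]'", re.IGNORECASE)


def strip_encryption_option(dump_sql: str) -> str:
    """Remove ENCRYPTION='Y'/'N' table options from CREATE TABLE statements."""
    cleaned = []
    in_create = False
    for line in dump_sql.splitlines(keepends=True):
        if line.lstrip().upper().startswith("CREATE TABLE"):
            in_create = True
        if in_create:
            line = _ENCRYPTION_OPTION.sub("", line)
            # A trailing semicolon ends the CREATE TABLE statement.
            if line.rstrip().endswith(";"):
                in_create = False
        cleaned.append(line)
    return "".join(cleaned)
```

The cleaned text could then be written to a new file and fed to the mysql client as before. The restored table would of course not be encrypted on the target server.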

Categories: DBA Blogs

Well Run: Oracle PaaS and Fusion Middleware Community Forum XXII

Usable Apps - Wed, 2016-04-20 12:03

The Oracle Fusion Middleware Partner Community Forum, run by Oracle EMEA Alliances and Channels Community Manager Jürgen Kress (@soacommunity) is the premiere EMEA event to attend for Oracle Partners who build solutions using Oracle technology.

This forum is one awesome annual opportunity for partners to come together to show off their greatness, find out about the latest technology from Oracle, and take advantage of hands-on workshops. The forum is about learning and working collaboratively and generating ideas that'll convert into cloud and middleware business wins for partners.

The Oracle PaaS and Fusion Middleware Community Forum XXII (note that emphasis on PaaS) was held in Valencia, Spain, March 15-18, 2016, which coincided with the amazing color and sounds of Valencia's San José Fallas festival.


We run this cloud: #OAUX at the #OFMFORUM 

Jürgen’s report from the event makes impressive reading: There were over 650 attendees, of which about 80% were external (to Oracle) partners, with many already working on live PaaS projects and with lots more PaaS projects in the planning phase. This year saw a big attendance from the Oracle U.S. HQ  team, too, along with some 22 OTN ACEs from around the world. (I caught up with my old pal Rolando Carrasco [@borland_c] from Mexico and many others.)

All of the sessions were rated "excellent" or "good" in the feedback survey (not one was rated average or below). Bottom line: The event is excellent value for money and a very attractive proposition for partners working on PaaS and middleware projects, and it is of great interest to those working with Oracle ADF, BPM, JET, MAF, SOA, and so on and the range of Oracle Cloud services (ICS, IoT, JCS, MCS, PCS, and more). 

This year a User Experience and BPM track was added to the program. With so many PaaS partners interested in cloud UX as a competitive differentiator, the forum was a "must-do" outreach opportunity for Oracle Applications User Experience (OAUX). I represented OAUX, co-presenting with Lonneke Dikmans (@lonnekedikmans), Managing Partner at eProseed NL, and spoke about the OAUX Cloud UX Rapid Development Kit (RDK) and how we enable partners to rapidly design, build, and deploy reusable simplified UI cloud solutions with Oracle ADF, Oracle Alta UI, and other technology.

Lonneke talks about winning business using the developer productivity and other benefits offered by the RDK. 

(Of course, Valencia is La Ciudad del Running, too. How could I not attend?)

I demoed the Release 10 Cloud UX RDK and showed off the design patterns, eBooks, and developer workspace involved, highlighting key features of the RDK, such as how partners can integrate typical web services easily and quickly deploy a winning simplified UI custom app to PaaS. An excellent overview of the RDK is available on the Fusion Applications Developer Relations blog.
Lonneke provided the partner side of our enablement story by telling the audience about the eProseed experience of developing a smart city IoT solution to enable modern ways of working (the famous activity-based approach of Erik Veldhoen). eProseed achieved some impressive results from the project through customer insight, storyboards, the RDK's UX design patterns, Oracle ADF templates and components, Oracle Alta UI, and with workshops that directly involved the customer and OAUX. eProseed benefitted from:
  • A higher quality solution where the concept of a modern working style fit easily with the design and development framework of the RDK. 
  • Reusable, consistent development through library and component reuse across different use cases.
  • Increased developer productivity resulting in a faster time to market, a quicker transfer of the ownership of the solution to the customer, and tighter collaboration between their design and development teams. Nothing got "lost in translation."
  • An impressive user experience based on the simplified UI Glance, Scan, Commit design philosophy. This approach crafts solutions that are easy to use because of the application of proven UX design patterns, which also enables users to make seamless transitions between the cloud's integrated apps while maintaining flexible ways of working across different devices using data in the cloud and responsive UIs.
  • An enhanced partner reputation for eProseed. Showcasing a solution with a clear eProseed UX signature offers customers confidence about pursuing solutions in the cloud.

You can read more about the enablement event we held with eProseed last year.

Lonneke concluded our presentation by illustrating real results by demoing the very cool "Licht is Uit" IoT solution.

eProseed is right up there in EMEA with providing solutions that have a smart user experience, are built with Oracle technology, and that leverage OAUX enablement. Catch up with any eProseed sessions you can at other events! 

Jürgen and Lonneke at the speakers' dinner. Post-sailing and post-running! 

And the running? Ah yes. Oracle folks and partners did some running in the city proper and on the coast, too, taking in all of the sights. I am pleased to say that the OAUX running club Design Time @ Run Time technical shirts were a big hit, too. Watch out for those at the next event!

The Lonneke and Ultan Inaugural EMEA Forum 10K in Valencia. A half marathon is planned for next year's event.

Partners with access to the community's Oracle Beehive space can now get my own and Lonneke's presentation at the Forum.


See you at Forum XXIII!

Log Buffer #470: A Carnival of the Vanities for DBAs

Pythian Group - Wed, 2016-04-20 11:39

This Log Buffer Edition rounds up blog posts from Oracle, SQL Server and MySQL.

Oracle:

jq is a unix utility that can parse json files and pull out individual elements – think of it as sed/awk for json files.

Some thoughts about analyzing performance problems.

Microsites: Add a Map Component and Publish your Microsite

New Installation Cookbook: Oracle Linux 6.7 with Oracle RAC

Are you worried about what to do now that Discoverer is almost out of support

Automatic Big Table Caching in RAC


SQL Server:

sp_msforeachdb: Improving on an Undocumented Stored Procedure

The Practical Problems of Determining Equality and Equivalence in SQL

TEMPDB Enhancements in SQL Server 2016

Performance Surprises and Assumptions : DATEADD()

Knee-Jerk Performance Tuning : Incorrect Use of Temporary Tables

MySQL:

Virtual Hosting with vsftpd + TLS encryption and MySQL on Ubuntu 15.10

MySQL 5.7.12 – Part 4: A new MySQL Command Line Shell

Database Firewall Filter in MaxScale 1.4.1

Orchestrator-agent: How to recover a MySQL database

Rosetta Stone: MySQL, Pig and Spark (Basics)

Categories: DBA Blogs

5 Reasons to Take a New Look at Oracle ERP Cloud

Linda Fishman Hoyle - Wed, 2016-04-20 11:17

A Guest Post by Vice President Terrance Wampler, Oracle Financials Product Strategy (pictured left) 

Is your finance team spending its time collecting data, updating spreadsheets, and developing reports? If so, it's probably just keeping score for you rather than changing the game. To change the game, your team needs modern technology so it can do value-added work. Wampler presents us with five reasons why nearly two thousand companies around the world have chosen Oracle ERP Cloud―and why you should too. This is a summary of the post and here's a link to the full article.

Reason #1: Increase Insight
Insight is listed as #5 in the article, but it’s the one reason why every customer chooses Oracle ERP Cloud. It has a unique reporting platform built with a multi-dimensional model that pulls data directly from the general ledger. This provides real-time reporting on live financial information from a single, central point. Now, finance teams have up-to-the-minute insight into profitability and performance instead of waiting for month end.
Reason #2: Optimize
Most people think “big” when they think of Oracle. But SaaS has opened the door to companies that never would have considered us before. Many of our ERP Cloud customers are fast growing companies ($10-$20M revenue) that want the power of an enterprise system. They’ll never have to worry about outgrowing ERP Cloud, which is capable of processing over 230 million transactions per hour. And, they’ll be ready to expand geographically with country-specific tax and legal reporting, payment processing, and budgetary control.
Reason #3: Simplify
Traditional ERP systems did wonders for process efficiency, and now it’s time to do the same for information workers. ERP Cloud is designed to be ultra efficient for today’s data-driven, on-the-go employees. They know what needs their immediate attention with real-time graphical indicators on dashboards and work areas. The new infolets page also allows them to see the status of key performance indicators and drill further for more detail.
Reason #4: Modernize
Oracle ERP Cloud is made for the digital workplace, enabling finance teams to work socially, collaboratively, and logically. Social capabilities are embedded in the business processes, so team members can always track and refer back to conversations immediately. Other innovations include new integrations between Oracle Procurement Cloud and Oracle Project Management Cloud. Buyers can manage supplier negotiations and requests-for-quote as projects instead of negotiation-by-spreadsheet. They can also author documents using standard, pre-approved contract language; use embedded social workflows to collaborate and review the documents; upload supplier responses into the system; and compare bids using live, negotiation-monitoring analytics.
Reason #5: Enforce Controls
ERP Cloud meets the highest security requirements and standardizes data, rules and processes across the enterprise, including accounting, taxes, and payments.

PS360: A Utility to Extract and Present PeopleSoft Configuration and Performance Data

David Kurtz - Wed, 2016-04-20 11:16
Introduction
Most of a PeopleSoft application is itself stored in the database in PeopleTools tables.  Therefore there is a lot of information about the configuration and operation of a PeopleSoft system.  There are also performance metrics, particularly about batch processes.
PS360 is a new tool on which I am working.  It just uses SQL scripts to extract that data to html files, and package them up in a zip file so that they can be sent for further analysis.  The style and method is closely modelled on Enkitec's EDB360 by Carlos Sierra.  This is another free tool used for health check and performance analysis of any Oracle database system.  PS360 aims to gather PeopleSoft specific information that is not presented by EDB360.  It also runs in Oracle's SQL*Plus tool, and so is only available for use with an Oracle database.
Every section of PS360 is just the output of a SQL query; sometimes pre-processing is done in an anonymous PL/SQL block.  It does not install anything into the database, and does not update any table (other than the PLAN_TABLE which is used for temporary working storage).  Each report is in tabular and/or graphical format.  All the charts are produced with the Google chart API.
The output falls into three sections.  
  • Configuration: Simple reports of certain configuration tables.
  • Performance Metrics: Process Scheduler and Application Engine timings
  • Consistency Checks: both within PeopleTools tables and between the PeopleSoft and Oracle catalogues.
Sample PS360 Index Page generated on PeopleSoft Demo Database
Instructions
The tool can be run by anyone with access to the PeopleSoft Owner database user (usually SYSADM).  That user will already have privilege to read the Oracle catalogue.
Download the tool and unzip it into a directory.  Navigate to the ps360 (master) directory, open SQL*Plus and connect as SYSADM.  Execute the script ps360.sql.  The output will be written to a zip file in the same directory.  Unpack that zip file on your own PC and open the file ps360_[database name]_0_index.html with a browser.
Feedback Enhancements
I am looking for feedback about the tool, and suggestions for further enhancements.
Please either leave comments here or e-mail me at

©David Kurtz

Partition Storage -- 2 : New Rows Inserted in 12.1 Partitioned Table

Hemant K Chitale - Wed, 2016-04-20 09:41
Following up from the previous post,  given the current sizes of the Partitions (The underlying Tablespace is AUTOALLOCATE)  :

SQL> col segment_name format a30
SQL> col partition_name format a12
SQL> l
1 select segment_name, partition_name, segment_type, bytes/1024
2 from user_segments
3 where segment_name like 'MY_PART_%'
4* order by 1,2
SQL> /

------------------------------ ------------ ------------------ ----------

8 rows selected.


I shall now insert rows so that a Partition has to grow beyond the first extent.

SQL> insert into my_part_tbl                      
2 select 25, 'New Row'
3 from dual
4 connect by level < 100001;

100000 rows created.

SQL> insert into my_part_tbl
2 select 45, 'New Row'
3 from dual
4 connect by level < 500001;

500000 rows created.

SQL> /

500000 rows created.

SQL> commit;

Commit complete.

SQL> select segment_name, partition_name, segment_type, bytes/1024
2 from user_segments
3 where segment_name like 'MY_PART_%'
4 order by 1,2;

------------------------------ ------------ ------------------ ----------

8 rows selected.


So, now, the first Partition P_100 has grown to exceed 24MB. Let's check the Extent Sizes.

SQL> select extent_id, blocks, blocks*8192/1024 Size_KB
2 from user_extents
3 where segment_name = 'MY_PART_TBL'
4 and segment_type = 'TABLE PARTITION'
5 and partition_name = 'P_100'
6 order by 1;

---------- ---------- ----------
0 1024 8192
1 1024 8192
2 1024 8192

SQL> l
1 select extent_id, blocks, blocks*8192/1024 Size_KB
2 from user_extents
3 where segment_name = 'MY_PART_TBL_NDX'
4 and segment_type = 'INDEX PARTITION'
5 and partition_name = 'P_100'
6* order by 1
SQL> /

---------- ---------- ----------
0 8 64
1 8 64
2 8 64
3 8 64
4 8 64
5 8 64
6 8 64
7 8 64
8 8 64
9 8 64
10 8 64
11 8 64
12 8 64
13 8 64
14 8 64
15 8 64
16 128 1024
17 128 1024
18 128 1024
19 128 1024
20 128 1024
21 128 1024
22 128 1024
23 128 1024
24 128 1024
25 128 1024
26 128 1024
27 128 1024
28 128 1024
29 128 1024
30 128 1024
31 128 1024
32 128 1024
33 128 1024
34 128 1024
35 128 1024
36 128 1024
37 128 1024
38 128 1024
39 128 1024
40 128 1024
41 128 1024
42 128 1024

43 rows selected.


So, the Table Partition grew as 8MB Extents but the Index Partition used the normal AutoAllocate routine of 64KB Extents for the first 1MB (16 Extents) followed by 1MB Extents.
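As a quick cross-check of the two extent listings, the arithmetic can be replayed in a few lines of Python. The figures are copied from the output above; nothing here queries the database.

```python
# Extent sizes in KB, taken from the two USER_EXTENTS listings above.
table_extents_kb = [8192] * 3                # extents 0-2 of the table partition
index_extents_kb = [64] * 16 + [1024] * 27   # extents 0-15, then 16-42 of the index partition

table_total_kb = sum(table_extents_kb)
index_total_kb = sum(index_extents_kb)

print(f"table partition: {len(table_extents_kb)} extents, {table_total_kb // 1024} MB")
print(f"index partition: {len(index_extents_kb)} extents, {index_total_kb // 1024} MB")
```

This gives 24 MB in 3 extents for the table partition and 28 MB in 43 extents for the index partition, with the first 16 index extents adding up to exactly 1 MB.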

What might have been the expected size of the Table Partition ?   ** see the UPDATE at the end of this post.

SQL> exec dbms_stats.gather_table_stats('','MY_PART_TBL',granularity=>'ALL');

PL/SQL procedure successfully completed.

SQL> select partition_name, trunc(1.2*avg_row_len*num_rows/1024) Expected_KB
2 from user_tab_partitions
3 where table_name = 'MY_PART_TBL'
4 order by 1
5 /

------------ -----------
P_100 14179
P_200 0
P_300 0
P_400 0


I use a 20% factor to account for PCTFREE and block/row overheads.  **  see the UPDATE at the end of  this post.

Next, what if I use Parallel Insert ?

SQL> alter table my_part_tbl parallel 4;

Table altered.

SQL> alter session enable parallel dml ;

Session altered.

SQL> insert /*+ PARALLEL (MANUAL) */
2 into my_part_tbl
3 select 125, 'New Row'
4 from dual
5 connect by level < 100001;

100000 rows created.

SQL> commit;

Commit complete.

SQL> insert /*+ PARALLEL (MANUAL) */
2 into my_part_tbl
3 select 125, 'New Row'
4 from dual
5 connect by level < 500001;

500000 rows created.

SQL> commit;

Commit complete.

SQL> insert /*+ PARALLEL (MANUAL) */
2 into my_part_tbl
3 select 125, 'New Row'
4 from dual
5 connect by level < 500001;

500000 rows created.

SQL> commit;

Commit complete.

SQL> l
1 select segment_name, partition_name, segment_type, extents, bytes/1024
2 from user_segments
3 where segment_name like 'MY_PART_%'
4* order by 1,2
SQL> /

------------------------- ------------ ------------------ ---------- ----------

8 rows selected.


Note the size of P_200.  Parallel Insert actually allocated more extents because each PX server process created at least 1 extent for the rows it inserted (Note : For a smaller row-count it is not necessary that each PX server would have inserted rows, some may not have). So, Parallel Insert can result in more allocated space.

** UPDATE :  What about the "expected size" computation based on AVG_ROW_LEN x NUM_ROWS ?

SQL> select partition_name, avg_row_len, num_rows, blocks, trunc(1.2*avg_row_len*num_rows/8192) Expected_Blocks
2 from user_tab_partitions
3 where table_name = 'MY_PART_TBL'
4 order by 1;

------------ ----------- ---------- ---------- ---------------
P_100 11 1100001 3022 1772
P_200 12 1100001 3668 1933
P_300 26 1 1006 0
P_400 28 1 1006 0
P_MAX 0 0 0 0


This shows how any "formula" based on AVG_ROW_LEN and NUM_ROWS can be inaccurate in forecasting the expected size (blocks used).  Also note that P_100 was populated with a Serial Insert while P_200 was populated with a Parallel (Degree=4) Insert.
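To make the gap concrete, here is the same computation replayed in Python for the two populated partitions. The inputs are copied from the query output above; the helper function is only an illustration of the 1.2 x AVG_ROW_LEN x NUM_ROWS / 8192 formula, written with integer arithmetic so that truncation behaves like TRUNC.

```python
BLOCK_SIZE = 8192  # bytes per block, as used in the queries above


def expected_blocks(avg_row_len: int, num_rows: int) -> int:
    """trunc(1.2 * avg_row_len * num_rows / 8192), done in integer arithmetic."""
    return (12 * avg_row_len * num_rows) // (10 * BLOCK_SIZE)


# (partition, avg_row_len, num_rows, blocks) copied from the output above
partitions = [
    ("P_100", 11, 1100001, 3022),
    ("P_200", 12, 1100001, 3668),
]

for name, avg_row_len, num_rows, actual in partitions:
    estimate = expected_blocks(avg_row_len, num_rows)
    print(f"{name}: expected {estimate}, actual {actual}, ratio {actual / estimate:.2f}")
```

For both partitions the actual block count is roughly 1.7 to 1.9 times the estimate, so the 20% allowance is nowhere near enough for rows this short.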

Categories: DBA Blogs

No Arguments Here – Use xargs for File Management Performance

Pythian Group - Wed, 2016-04-20 09:30


Database Administrators and System Administrators have this in common: managing a large number of log files is just part of the job on Linux systems.

Tools such as logrotate significantly simplify the file management task for routinely created log files. Even so, there are still many ‘opportunities’ to exercise your command line fu to manage thousands or millions of files.  These may be files that need to be moved, removed or searched.

When the files span multiple directories the find command is often used. The following command for instance will find all log files of a certain age and size and remove them.

find . -name "*.log" -size +1M -exec rm  {} \;


For a few files this will work just fine, but what happens if the number of files to be processed is several thousands, or even millions?

The xargs Difference

Let’s first create 200k files to use for testing. These files will all be empty, there is no need for any content for these tests.

The script can be used to create the directories and empty files.
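
The script itself is not reproduced here. As a rough sketch of one way to build such a test tree, the function below assumes a layout of numbered directories, each holding the same number of empty files. The directory names, the counts, and the function name make_test_files are all assumptions; only the file_ prefix comes from the find commands used in this article.

```shell
#!/bin/sh
# make_test_files BASE_DIR NUM_DIRS FILES_PER_DIR
# Creates NUM_DIRS directories under BASE_DIR, each holding FILES_PER_DIR
# empty files named file_1, file_2, ...
# (Hypothetical helper; not the script used in the original article.)
make_test_files() {
    base=$1
    ndirs=$2
    nfiles=$3
    d=1
    while [ "$d" -le "$ndirs" ]; do
        mkdir -p "$base/dir_$d" || return 1
        # seq generates the file numbers; xargs batches the names so that
        # touch is invoked only a few times per directory.
        ( cd "$base/dir_$d" && seq 1 "$nfiles" | sed 's/^/file_/' | xargs touch ) || return 1
        d=$((d + 1))
    done
}

# Example: 200 directories x 1000 files = 200k files
# make_test_files ./xargs-test 200 1000
```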

As it takes some time to create the files, we will not use the rm command here, but rather just the file command. The command will be timed as well.

#  time find . -type f -name file_\* -exec file {} \; >/dev/null

real    1m24.764s
user    0m4.624s
sys     0m12.581s

Perhaps 1 minute and 24 seconds seems to be a reasonable amount of time to process so many files.

It isn’t.

Let’s use a slightly different method to process these files, this time by adding xargs in a command pipe.

 time find . -type f -name file_\* | xargs file >/dev/null

real    0m0.860s
user    0m0.456s
sys     0m0.432s

Wait, what?!  0.8 seconds? Can that be correct?

Yes, it is correct. Using xargs with find can greatly reduce the resources needed to iterate through files.

How then, is it possible for the command that used xargs to complete so much faster than the command that did not use xargs?

When iterating through a list of files with the -exec argument to the find command, find forks a new process and runs the command once for every file it matches.

For a large number of files this requires a lot of resources.

For demonstration purposes I will be using the ‘file’ command rather than ‘rm’.

Could it be that the xargs method may have benefited from the caching effects of running the first find command?

Could be – let’s run find … -exec again and see if it benefits from caching.

# time find . -type f -name file_\* -exec file {} \; >/dev/null

real    1m25.722s
user    0m3.900s
sys     0m11.893s

Clearly any caching didn’t help find … -exec.

Why Is xargs Fast?

Why is the use of xargs so much faster than find -exec? In short, it is because find starts a new process for each file it finds when the -exec option is used, while xargs passes many file names to each invocation of the command.

The command ‘find | xargs’ was wrapped in a shell script to facilitate the use of strace.

The script takes 2 arguments: the number of files to pipe to xargs, and the number of files that xargs should send to the file command for each invocation of file.

The number of files to process is controlled by piping the output of find to head.

The xargs --max-args argument is used to control how many arguments are sent to each invocation of file.
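
The wrapper script is not shown in the text. Based on the description above, it might look something like the following sketch. The names MAX_FILES and MAX_ARGS match the strace output that follows, but the function name and the optional command argument (defaulting to file) are assumptions made for illustration. The short option -n is used here; it is the equivalent of --max-args.

```shell
#!/bin/sh
# find_xargs MAX_FILES MAX_ARGS [COMMAND...]
#   MAX_FILES: how many of the file names found are piped to xargs
#   MAX_ARGS:  how many file names xargs passes to each invocation of COMMAND
# COMMAND defaults to "file". (Hypothetical reconstruction of the wrapper.)
find_xargs() {
    MAX_FILES=$1
    MAX_ARGS=$2
    shift 2
    [ "$#" -gt 0 ] || set -- file

    # Report the settings on stderr so stdout carries only command output.
    echo "MAX_FILES: $MAX_FILES" >&2
    echo "MAX_ARGS: $MAX_ARGS" >&2

    find . -type f -name 'file_*' \
        | head -n "$MAX_FILES" \
        | xargs -n "$MAX_ARGS" "$@"
}

# Example: first 10000 files, 1000 per invocation (10 invocations of file)
# find_xargs 10000 1000 >/dev/null
```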

We can now use strace with the -c option; -c accumulates a count of all calls along with timing information.

Calling the script to run for the first 10000 files, with 1000 files sent to each invocation of file:

# strace -c -f  ./ 10000 1000
MAX_FILES: 10000
MAX_ARGS: 1000
Process 11268 attached
Process 11269 attached
Process 11267 resumed
Process 11269 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.55    0.080017        5001        16         2 wait4
  0.35    0.000280           0     12372           newfstatat
  0.09    0.000074           0       208           getdents
  0.01    0.000006           0     10000           lstat
  0.00    0.000000           0       199           read
  0.00    0.000000           0       276         1 write
  0.00    0.000000           0       384        91 open
  0.00    0.000000           0       313         4 close
  0.00    0.000000           0        68        42 stat
  0.00    0.000000           0       189           fstat
  0.00    0.000000           0         5         1 lseek
  0.00    0.000000           0       209           mmap
  0.00    0.000000           0        71           mprotect
  0.00    0.000000           0        37           munmap
  0.00    0.000000           0        72           brk
  0.00    0.000000           0        41           rt_sigaction
  0.00    0.000000           0        80           rt_sigprocmask
  0.00    0.000000           0         2           rt_sigreturn
  0.00    0.000000           0        13        12 ioctl
  0.00    0.000000           0        77        77 access
  0.00    0.000000           0         2           pipe
  0.00    0.000000           0         6           dup2
  0.00    0.000000           0         1           getpid
  0.00    0.000000           0        14           clone
  0.00    0.000000           0        14           execve
  0.00    0.000000           0         2           uname
  0.00    0.000000           0         4         1 fcntl
  0.00    0.000000           0       206           fchdir
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0         1           getuid
  0.00    0.000000           0         1           getgid
  0.00    0.000000           0         1           geteuid
  0.00    0.000000           0         1           getegid
  0.00    0.000000           0         1           getppid
  0.00    0.000000           0         1           getpgrp
  0.00    0.000000           0        14           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.080377                 24910       232 total

The largest chunk of time was spent in the wait4 system call. These are waits on execve, of which there were 14.

Of the 14 calls to execve, there was 1 each for the use of bash (the script itself), find, head and xargs, leaving 10 calls to be consumed by file.

The following command can be used if you would like to try this yourself:

strace  -f -e trace=execve  ./ 10000 1000  2>&1 | grep execve

What happens when the same type of test is run against find with the -exec argument?

There is no method (that I can find in the man page anyway) by which we can limit the number of files that are sent to the program specified in the -exec argument of find.

We can still learn what is going on, it is just necessary to wait 1.5 minutes for the command to complete.

# strace -c -f find . -type f -name file_\*  -exec file {} \; >/dev/null

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.80    4.101094          21    200000           wait4
  0.69    0.029305           0    200000           clone
  0.46    0.019278           0   2602351   1400007 open
  0.44    0.018833           0    600001           munmap
  0.31    0.013108           0   3200017           mmap
  0.30    0.012715           0   1401173           fstat
  0.16    0.006979           0   1200006   1200006 access
  0.15    0.006543           0   1202345           close
  0.15    0.006288           0   1000004    600003 stat
  0.13    0.005632           0   1000004           read
  0.12    0.004981           0    200000           lstat
  0.09    0.003704           0    600026           brk
  0.07    0.003016           0   1000009           mprotect
  0.07    0.002776           0    200001    200000 ioctl
  0.03    0.001079           0    201169           newfstatat
  0.02    0.000806           0      2347           getdents
  0.01    0.000600           0    200000           write
  0.00    0.000003           0    200001           arch_prctl
  0.00    0.000002           0    202341           fchdir
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0    400001    200000 execve
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         1           fcntl
  0.00    0.000000           0         2           getrlimit
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    4.236742              15811808   3600017 total


You may have noticed there are twice as many calls to execve as there were files to process.

This is due to the way the command is located. Unless a full path name is specified when running a command, the PATH variable is searched for that command. If the command is not found by the first invocation of execve, then another attempt is made in the next directory in PATH.

The following example shows the difference between using the command name only, and then using the fully pathed name of the file command.

# strace -e trace=execve -f find -maxdepth 1 -type f -name \*.sh  -exec file {} \;  2>&1 | grep execve
execve("/usr/bin/find", ["find", "-maxdepth", "1", "-type", "f", "-name", "*.sh", "-exec", "file", "{}", ";"], [/* 83 vars */]) = 0
[pid  9267] execve("/usr/local/bin/file", ["file", "./"], [/* 83 vars */]) = -1 ENOENT (No such file or directory)
[pid  9267] execve("/usr/bin/file", ["file", "./"], [/* 83 vars */]) = 0
[pid  9268] execve("/usr/local/bin/file", ["file", "./"], [/* 83 vars */]) = -1 ENOENT (No such file or directory)
[pid  9268] execve("/usr/bin/file", ["file", "./"], [/* 83 vars */]) = 0
[pid  9269] execve("/usr/local/bin/file", ["file", "./"], [/* 83 vars */]) = -1 ENOENT (No such file or directory)
[pid  9269] execve("/usr/bin/file", ["file", "./"], [/* 83 vars */]) = 0

# strace -e trace=execve -f find -maxdepth 1 -type f -name \*.sh  -exec /usr/bin/file {} \;  2>&1 | grep execve
execve("/usr/bin/find", ["find", "-maxdepth", "1", "-type", "f", "-name", "*.sh", "-exec", "/usr/bin/file", "{}", ";"], [/* 83 vars */]) = 0
[pid  9273] execve("/usr/bin/file", ["/usr/bin/file", "./"], [/* 83 vars */]) = 0
[pid  9274] execve("/usr/bin/file", ["/usr/bin/file", "./"], [/* 83 vars */]) = 0
[pid  9275] execve("/usr/bin/file", ["/usr/bin/file", "./"], [/* 83 vars */]) = 0

Too Much Space

Regardless of how bad a practice it may be, there will be times that file and directory names may contain space characters. Literal spaces, newlines and tabs can all play havoc with file name processing;  xargs has you covered.

Two files are created to demonstrate:


# touch 'this filename has spaces' this-filename-has-no-spaces

# ls -l
total 0
-rw-r--r-- 1 jkstill dba 0 Apr 15 09:28 this filename has spaces
-rw-r--r-- 1 jkstill dba 0 Apr 15 09:28 this-filename-has-no-spaces

What happens when the output of find is piped to xargs?


 find . -type f | xargs file
./this-filename-has-no-spaces: empty
./this:                        ERROR: cannot open `./this' (No such file or directory)
filename:                      ERROR: cannot open `filename' (No such file or directory)
has:                           ERROR: cannot open `has' (No such file or directory)
spaces:                        ERROR: cannot open `spaces' (No such file or directory)

The spaces in one of the filenames cause xargs to treat each word in the filename as a separate file.

Because of this it is a good idea to use the -print0 and -0 args as seen in the following example. These arguments change the output terminator of find to the null character, as well as changing the input terminator of xargs to the null character to deal with space characters in file and directory names.


 find . -type f -print0  | xargs -0 file
./this-filename-has-no-spaces: empty
./this filename has spaces:    empty

There is quite a bit more to xargs than this; I would encourage you to read the man page and experiment with the options to better learn how to make use of it.

Hope For find

For many versions of GNU find there is an easy modification that can be made to the command line that will cause the -exec option to emulate the method xargs uses to pass input to a command.

Simply by changing -exec command {} \; to  -exec command {} +, the find command will execute much faster than previously.

Here the find command has matched the performance of xargs when processing 200k files:


# time find . -type f -name file_\*  -exec file {} +  | wc
 200000  400000 8069198

real    0m0.801s
user    0m0.436s
sys     0m0.404s

This may mean a quick and simple change to maintenance scripts can yield a very large increase in performance.

Does this mean there is no longer a need for xargs?  Not really, as xargs offers levels of control over the input to piped commands that simply are not available in the find command.
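
As one illustration of that extra control, the sketch below caps the number of file names per invocation with -n and runs several invocations at once with -P. Note that -P is an extension found in GNU and BSD xargs rather than part of POSIX, and that the batch size, the degree of parallelism, and the function name are arbitrary choices made for this example.

```shell
#!/bin/sh
# run_parallel BATCH_SIZE JOBS COMMAND...
# Feeds NUL-terminated file names from find to COMMAND, at most BATCH_SIZE
# names per invocation, with up to JOBS invocations running at once.
# (Hypothetical helper for illustration.)
run_parallel() {
    batch=$1
    jobs=$2
    shift 2
    find . -type f -name 'file_*' -print0 \
        | xargs -0 -n "$batch" -P "$jobs" "$@"
}

# Example: 500 files per call, up to 4 calls at a time
# run_parallel 500 4 file >/dev/null
```

Neither the batch size nor the parallelism can be set this way with find -exec alone.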

If you’ve never used xargs, you should consider doing so, as it can reduce the resource usage on your systems and decrease the runtime for maintenance tasks.

Categories: DBA Blogs

Oracle Security Vulnerability Scoring Metric Change (CVSS)

No, Oracle security vulnerabilities didn’t just get a whole lot worse this quarter.  Instead, Oracle updated the scoring metric used in the Critical Patch Updates (CPU) from CVSS v2 to CVSS v3.0 for the April 2016 CPU.  The Common Vulnerability Scoring System (CVSS) is a generally accepted method for scoring and rating security vulnerabilities.  CVSS is used by Oracle, Microsoft, Cisco, and other major software vendors.

As we have discussed previously, CVSS v2 did score Oracle security vulnerabilities for the database, middleware, and applications lower than operating system and network component vulnerabilities.  Contrary to what many security researchers claim, the problem is with the CVSS standard, not manipulation of the scores by Oracle.  CVSS v2 puts a premium on the ability to compromise the entire operating system (i.e., root account) or device.  For most Oracle security vulnerabilities, across all products, it is very difficult to compromise the root operating system account by exploiting an Oracle Database, Fusion Middleware, or application (Oracle E-Business Suite, PeopleSoft, etc.) security bug.  There are some exceptions, mostly limited to the Oracle Database running on Microsoft Windows Server, which allow compromise of the administrator account.

To account for this limitation in CVSS, Oracle included in the CPU advisory matrices for informational purposes only a “Partial+” to indicate where the entire database, middleware server, or application could be compromised.  However, this was not reflected in the score since the CVSS standard says a “Complete” impact “… is total information disclosure, resulting in all system files being revealed.”  As a result, Oracle CVSS v2 scores for critical or severe bugs tended to be 6.5 for the Oracle Database, 7.5 for Fusion Middleware, and 6.4 for applications like the Oracle E-Business Suite and PeopleSoft.

CVSS v3.0 changes the scoring to put more of an emphasis on the asset or component being protected (i.e., database or application).  The key CVSS definition has changed from “system” to “impacted component.”  The scoring algorithm also includes more granularity for the privileges required to exploit a vulnerability and for the scope of the exploit, such as whether a database attack can compromise the underlying operating system.

The Oracle CVSS v3.0 scores will be much higher now, especially for the Fusion Middleware and applications like Oracle E-Business Suite and PeopleSoft.  Critical Fusion Middleware security bugs will rise from 7.5 to 9.8.  Oracle E-Business Suite and PeopleSoft critical security bugs like unauthenticated SQL injection will jump from 6.4 to 9.8.  As almost all Oracle Database security bugs require database authentication, the Oracle Database CVSS scores will go from 6.5 to 8.8 for easy to exploit SQL injection vulnerabilities in PUBLIC packages.

The risk associated with most Oracle security vulnerabilities is still critical.  Now the CVSS score properly reflects the critical nature of many of these bugs.

Oracle Critical Patch Updates
Categories: APPS Blogs, Security Blogs

Oracle E-Business Suite 11i CPU Security Patches Only Available for Tier 1 Support Customers

Oracle E-Business Suite 11i is impacted by 8 security vulnerabilities in the April 2016 CPU, which includes the Oracle Configurator and Oracle Complex Maintenance, Repair, and Overhaul security bugs listed under the Oracle Supply Chain Products Suite.

Starting with the April 2016 Critical Patch Update (CPU), Oracle E-Business Suite 11i security patches are only available for Oracle customers with Tier 1 Support contracts, previously referred to as Advanced Customer Support (ACS).  Tier 1 Support must be purchased and is an additional fee on top of standard Oracle maintenance.  Optional Tier 1 Support will include CPU security patches through October 2016.

CPU information for 11i has been moved from the standard quarterly CPU My Oracle Support (MOS) note for Oracle E-Business Suite to MOS Note ID 2126170.1 “Oracle E-Business Suite Release 11i Critical Patch Update Knowledge Document (For Oracle E-Business Suite 11i Tier 1 Support Customers).”

For more information on CPU support for 11i, please see MOS Note ID 1596629.1 “ANNOUNCEMENT: Additional Coverage Options for 11.5.10 E-Business Suite Sustaining Support.”

As an alternative to Oracle Tier 1 Support or as an additional layer of defense for Oracle E-Business Suite 11i, Integrigy’s web application firewall for Oracle E-Business Suite, AppDefend, provides virtual patching of Oracle E-Business Suite web security vulnerabilities, web application attack surface reduction, and protection from SQL injection and cross-site scripting (XSS) attacks.

Oracle E-Business Suite, Oracle Critical Patch Updates
Categories: APPS Blogs, Security Blogs

A Taste of FinTech: Bitterballen and Banking in the Cloud with Profource

Usable Apps - Wed, 2016-04-20 07:05

Financial technology (#FinTech) innovation and the future of banking are hot topics. If you wondered for one tiny moment how the importance of financial applications and the cloud fit into the Oracle Cloud User Experience #PaaS4SaaS enablement that Oracle Applications User Experience (OAUX) offers Oracle Partners, well, here's one rocking example from EMEA!

Recently, we (OAUX) held a hands-on enablement event at Oracle Nederland in Utrecht to work alongside one of our leading EMEA partners, Profource B.V., to design and build a simplified UI Banking Cloud solution using the Cloud UX Rapid Development Kit (RDK) for Release 10. This event was the culmination of upfront design and development exploration done collaboratively between the two teams online.

Profource and OAUX Teams in Oracle Nederland

Part of the Profource team with the OAUX enablers: (L-R): Julian Orr, Lancy Silveira, Ronald van Herpen, Martijn Rijpkema, Pam Koertshuis, and Ultan Ó Broin

Held over 2.5 days, the Microsoft PowerPoint Karaoke was cut to a minimum as design and development teams stormed and formed around the solution requirements and worked fast to iterate designs and build a modern banking cloud solution that was then deployed as a service to the Profource cloud. A great success!

Banking Cloud Simplified UI Launch Page Wireframe

Part of the Banking Cloud simplified UI launch experience wireframe. Wireframing the solution using the RDK tools, garnering agreement, and transferring the design to the development team made for rapid, agile innovation and iteration, right through to deployment.

Banking Cloud Statements Wireframe

Simply that simplified UI again: This time for an Oracle ERP Cloud Release 10 solution. This is a wireframe view of part of the user experience flow, built using the RDK's Oracle ADF page templates and declarative components, the Oracle Alta UI design system, and the agile Learn-Design-Build approach of the RDK's guidance.

You can read more about the Banking Cloud event on Profource consultant Pam Koertshuis's (@pkoertshuis) blog: Simplified UI for PAAS 4 SAAS solution.

Profource consultant, Hakan Biroglu (@hakanbiroglu), said about the event:

"We [the Profource team] have learned a lot. Not just on a technical level, on how to reuse your RDK, but also on an architectural and design level. Your workshop forced us to rethink our UX solution and to evaluate every item on every page in every flow, 'Why is this needed? What information does it provide me? What does it trigger me to do?'" 

If you're heading to OBUG's APPSCONNECTed16 event in Arnhem, check out the sessions about the Profource Banking Cloud and their other Release 10 cloud solutions (HCM, PPM, and Resource Management) and about how you can use the RDK and OAUX enablement to do the same for your business.

Many thanks to the Profource team and to Oracle Nederland.  

If you are an Oracle Partner who wants to get ahead in the Oracle Cloud, you can contact us through the usual OAUX channels or your Oracle PartnerNetwork contacts.

Oracle Usable Apps Cards and Sketches

Oracle Applications Cloud User Experience: Enablement for partners from design to deployment. It starts with a sketch . . . .

Oh, the bitterballen?

Since you asked...


Bitterballen: An Oracle Nederland culinary delight that somehow always features as part of OAUX enablement events in Utrecht. 

Column Groups

Jonathan Lewis - Wed, 2016-04-20 02:07

Patrick Jolliffe alerted the Oracle-L list to a problem that appears when you combine fixed length character columns (i.e. char() or nchar())  with column group statistics. The underlying cause of the problem is the “blank padding” semantics that Oracle uses by default to compare varchar2 with char, so I’ll start with a little demo of that. First some sample data:

rem     Script:         col_group_char_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Apr 2016

execute dbms_random.seed(0)

create table t1
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        cast(chr(trunc(dbms_random.value(1,6))+64) as char(1))  c1,
        cast(chr(trunc(dbms_random.value(1,6))+64) as char(2))  c2,
        cast('X' as varchar2(2))                                v2
from
        generator       v1
where
        rownum <= 5 * 5 * 10
;

insert into t1(c1, c2, v2)
select  'X', 'X', 'X'
from    t1
;

update t1 set v2 = c2;

The little demos I’m going to report here don’t use all the data in this table – there are several other tests in the script that I won’t be reporting – so I’ll just point out that there are 500 rows in the table, half of them have ‘X’ in all three columns, and half of them have a uniform distribution of the letters ‘A’ to ‘E’ in every column.

  • Column c1 is declared as char(1) – so it will hold the data exactly as it was inserted by the script.
  • Column c2 is declared as char(2) – so even though the script apparently inserts a character string of length 1, this will be padded with a space to two characters before being stored.

Now we can create some stats – in particular a frequency histogram on the c2 column – and check the cardinality estimates for a couple of queries:

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 254'
        );
end;
/

set autotrace traceonly explain

prompt  ==================
prompt  c2 without padding
prompt  ==================

select  *
from    t1
where   c2 = 'X'
;

prompt  ================
prompt  c2 with padding
prompt  ================

select  *
from    t1
where   c2 = 'X '
;

set autotrace off

The first query compares c2 with the single character ‘X’, the second compares it with the two-character string ‘X ‘. But since the comparison is with a char(2) column the optimizer pads the first constant with spaces, and both queries end up predicting the same cardinality:

c2 without padding

Execution Plan
Plan hash value: 3617692013

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |

Predicate Information (identified by operation id):

   1 - filter("C2"='X')

c2 with padding

Execution Plan
Plan hash value: 3617692013

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |

Predicate Information (identified by operation id):

   1 - filter("C2"='X ')

Note that both queries predict the 250 rows where (we know) c2 = ‘X ‘; even though the predicate sections suggest the queries are looking for different data sets. This IS the expected behaviour.

Now let’s make things more complex – we’ll add the predicate “and c1 = ‘X'” to both queries but we’ll create a column group with histogram on (c1, c2) before checking the plans. Again we expect both versions of the new query to predict the same volume of data and (in fact) to produce a perfect prediction because we have so few rows and so few distinct combinations that we should get a perfect frequency histogram:

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 1 for columns (c1, c2) size 254'
        );
end;
/

prompt  ========================
prompt  (c1, c2) without padding
prompt  ========================

select  *
from    t1
where   c1 = 'X' and c2 = 'X'
;

prompt  =====================
prompt  (c1, c2) with padding
prompt  =====================

select  *
from    t1
where   c1 = 'X' and c2 = 'X '
;

And here are the execution plans:

(c1, c2) without padding

Execution Plan
Plan hash value: 3617692013

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |     2 |    16 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    16 |    17   (0)| 00:00:01 |

Predicate Information (identified by operation id):

   1 - filter("C1"='X' AND "C2"='X')

(c1, c2) with padding

Execution Plan
Plan hash value: 3617692013

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |

Predicate Information (identified by operation id):

   1 - filter("C1"='X' AND "C2"='X ')

If we run the query where the literal is padded with spaces to the correct length (2nd query) then the prediction is correct. But if we haven’t padded the literal the prediction is wrong; the estimate is the one the optimizer would have used for “value not found in histogram”.

I think what’s happening is that the optimizer doesn’t “remember” that the literal is being compared with a char() when making the call to sys_op_combined_hash() that it uses for calculating column group stats so it doesn’t pad the column with spaces before calling the function and, as a consequence, the hashed value isn’t the one it should be using.

I’ve run this test on and – the effects are the same on both versions.

Bottom Line:

Be careful about how you use char() data types in your code, and be especially careful if you think you’re going to be creating column group stats involving char() columns, and then remember that 12c may generate column group stats automatically for you. If you use char() columns you will have to ensure that literal values used in predicates are padded with spaces to the correct length if you want to have the best possible chance of getting the correct execution plans.


Storage in Amazon S3

Pat Shuff - Wed, 2016-04-20 01:07
To be honest, I am going out on a limb here. I know just enough about Amazon S3 to be dangerous. Most of the reference material that I used was from the amazon web site or Safari Books. The books that I relied upon the most are The key use cases according to the S3 Essentials book are file hosting, storing data on mobile based applications, static web hosting, video hosting, and data backup. We will look at a couple of these configurations and how to deploy and use them.

With Oracle Storage Cloud Services we had the concept of a container. This container had characteristics like spinning disk, archive, ownership, and other features related to ownership and security. Amazon S3 has a similar concept but they call this container a bucket. A bucket can contain nested folders and has properties associated with it. If we look at the AWS Console, we see six types of storage and content delivery:

  • S3 - object storage in the cloud
  • CloudFront - content delivery network
  • Elastic File System - fully managed file system for EC2
  • Glacier - tape storage in the cloud
  • Snowball - large scale data transport to get data into and out of S3
  • Storage Gateway - an appliance to reduce latency for S3 storage
We will be focusing on S3 and Glacier. Snowball is a mechanism to transport large amounts of data to and from the Amazon data center. The Storage Gateway is an appliance in a data center to reduce latency and provide quicker access to data stored in S3. We will dive a little deeper into S3 and the Storage Gateway in this blog, but not into CloudFront or the Elastic File System.

We first start with the AWS console and click on the S3 console. We can create a new bucket by clicking on Create Bucket.

When we create a new S3 bucket, we can name it and define which data center we want the storage to be allocated into. We have to be careful when we create a bucket name. The namespace is shared with all users. If we want to create a common name, it will probably be used by someone else and we will see an error in creating the name.

If we look at the properties associated with this storage we can see that we have a variety of options to configure.

  • Permissions
  • Static Web Hosting
  • Logging
  • Events
  • Versioning
  • Lifecycle
  • Cross-Region Replication
  • Tags
  • Requester Pays

Let's go through each of these individually. With Permissions, you have the ability to control who can see, modify, delete, and download the contents of the bucket. Bucket policies can get relatively complex and have a variety of conditions and restrictions applied to them. You can find out more at Detailing Advanced Policies. This feature allows you to restrict who can read content by IP address, access keys, or usernames.

Static web hosting allows you to create a web site based on the files in a container. If you have an index.html, it becomes the basis for accessing all of the other files in this directory. This is both good and bad because you get the basic functionality of a web server but you don't get the configuration and access logs. It has some uses but is limited in how it works. It does make static web page presentation easy because you no longer need an EC2 instance, operating system, or application to host the web site.

Logging allows you to view how, who, and from where files were accessed. You can generate logs to look at access patterns and access locations.

Versioning allows you to keep past copies of files. If a file is changed or overwritten, the previous versions are retained, each as a complete copy rather than as a delta. This is a good feature but does cause storage consumption to grow, because older versions are kept until you explicitly delete them or a lifecycle rule expires them.

Lifecycle allows you to automatically archive files from spinning disk to tape after a fixed amount of time and history of access. If no one has accessed a file in months, it can be written to Glacier for long term lower cost archive.

Cross-Region Replication allows you to replicate objects between buckets in different regions automatically. This allows for high availability in the event that one data center fails or storage at one location is having significant problems.

Tags and Requester Pays support charge-back, so that the people who consume resources pay for them. Tags let you allocate costs, and with Requester Pays the person reading the data, rather than the person who created the bucket, is charged for the request and the download. The bucket owner still pays for the storage itself.

Reading and writing to our newly created bucket requires a user interface or use of the Amazon REST API to transfer files. Amazon does provide a user interface to upload files and edit the properties of the files and directories. Because this interface is a bit limiting, we recommend using another graphical tool such as CloudBerry, or the command line utilities.

This blog entry is significantly different from the one yesterday. Yesterday we started with pricing then got technical. Today we dove straight into the technical and ignored pricing. Let's dive into pricing. The cost of S3 storage is $30/TB/month plus outbound charges. I suggest using the S3 price list and the S3 price calculator to figure pricing. Attached are screen shots of pricing for 120 TB of storage using the calculator and screen shots of the price list.
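The arithmetic behind the quoted rate can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: it applies the flat $30/TB/month figure from the text, whereas the actual price list is tiered, and the outbound rate is a hypothetical input because the text does not give one. The function names are illustrative.

```python
# Flat rate quoted in the post. The real S3 price list is tiered, so treat
# the result as a rough estimate only.
S3_USD_PER_TB_MONTH = 30.0


def monthly_storage_cost(tb_stored, usd_per_tb_month=S3_USD_PER_TB_MONTH):
    """Return the estimated monthly storage charge in USD."""
    if tb_stored < 0:
        raise ValueError("tb_stored must be non-negative")
    return tb_stored * usd_per_tb_month


def monthly_total(tb_stored, gb_out, usd_per_gb_out):
    """Return storage charge plus outbound transfer charge in USD.

    usd_per_gb_out is a hypothetical input: the post gives no outbound
    rate, so look it up on the price list before relying on the result.
    """
    if gb_out < 0 or usd_per_gb_out < 0:
        raise ValueError("outbound volume and rate must be non-negative")
    return monthly_storage_cost(tb_stored) + gb_out * usd_per_gb_out


if __name__ == "__main__":
    print(f"120 TB stored: ${monthly_storage_cost(120):,.2f} per month")
```

At the quoted rate, 120 TB works out to $3,600 per month before any outbound charges; the price calculator remains the authoritative source.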

One thing that we talked about with the Oracle Storage Cloud and have not talked about here is an on-premises virtual machine to reduce latency. Amazon offers this with the AWS Storage Gateway. The key differences between the two products are 1) the AWS Storage Gateway uses iSCSI on the client side to provide storage access to the data center and 2) it costs $125/month/gateway. It solves the same latency problem but does it slightly differently. Unfortunately, we are not going to install and configure this virtual instance and play with it because it requires 8 virtual CPUs, which is more than my laptop can offer.

In summary, this is an initial review of S3 storage with Amazon AWS. We did not dive deep into Glacier or the Storage Gateway. We did not review Elastic Block Store (EBS) because these volumes are typically attached to EC2 instances. It is important to note that the focus of S3 is somewhat different from that of Oracle Storage Cloud Services, but the two are very similar. Files and directories can be stored in containers and access can be controlled. S3 extends services to provide things like video streaming, static web site hosting, and migrating data to and from tape in the cloud. You can use S3 for backup archives and generic block storage and access it via REST API or AWS API calls. Products like CloudBerry Explorer and S3 Explorer exist to help translate the human interface to S3 storage calls. The cost for S3 is roughly $30/TB/month with additional charges for outbound data on a per GB basis. Archive storage is roughly $7/TB/month with additional charges for data retrieval and outbound data on a per GB basis. The intent of this blog is not to say that one service is better than the other but to provide resources to help you make your own decisions and decide what works best for your situation and corporation.

Designing PL/SQL Programs: Series home page

Andrew Clarke - Wed, 2016-04-20 00:57
Designing PL/SQL Programs is a succession of articles published in a nonlinear fashion. Eventually it will evolve into a coherent series. In the meantime this page serves as a map and navigation aid. I will add articles to it as and when I publish them.

Introduction
  • Designing PL/SQL Programs
  • It's all about the interface

Principles and Patterns
  • Introducing the SOLID principles
  • Introducing the RCCASS principles
  • Three more principles
  • The Dependency Inversion Principle: a practical example
  • Working with the Interface Segregation Principle

Software Architecture
  • The importance of cohesion

Interface design

Tools and Techniques

The importance of cohesion

Andrew Clarke - Wed, 2016-04-20 00:56
"Come on, come on, let's stick together" - Bryan Ferry

There's more to PL/SQL programs than packages, but most of our code will live in packages. The PL/SQL Reference offers the following benefits of organising our code into packages:

Modularity - we encapsulate logically related components into an easy to understand structure.

Easier Application Design - we can start with the interface in the package specification and code the implementation later.

Hidden Implementation Details - the package body is private so we can prevent application users having direct access to certain functionality.

Added Functionality - we can share the state of Package public variables and cursors for the life of a session.

Better Performance - Oracle Database loads the whole package into memory the first time you invoke a package subprogram, which makes subsequent invocations of any other subprogram quicker. Also packages prevent cascading dependencies and unnecessary recompilation.

Grants - we can grant permission on a single package instead of a whole bunch of objects.

However, we can only realise these benefits if the packaged components belong together: in other words, if our package is cohesive.  

The ever reliable Wikipedia defines cohesion like this: "the degree to which the elements of a module belong together"; in other words, it's a measure of the strength of the relationship between components. It's common to think of cohesion as a binary state - either a package is cohesive or it isn't - but actually it's a spectrum. (Perhaps computer science should use "cohesiveness", which is more expressive, but cohesion it is.)

Cohesion

Cohesion owes its origin as a Comp Sci term to Stevens, Myers, and Constantine. Back in the Seventies they used the terms "module" and "processing elements", but we're discussing PL/SQL so let's use Package and Procedure instead. They defined seven levels of cohesion, with each level being better - more usefully cohesive - than its predecessor.

Coincidental

The package comprises an arbitrary selection of procedures and functions which are not related in any way. This obviously seems like a daft thing to do, but most packages with "Utility" in their name fall into this category.

Logical

The package contains procedures which all belong to the same logical class of functions. For instance, we might have a package to collect all the procedures which act as endpoints for REST Data Services.

Temporal

The package consists of procedures which are executed at the same system event. So we might have a package of procedures executed when a user logs on - authentication, auditing, session initialisation - and a similar package for tidying up when the user logs off. Other than the triggering event the packaged functions are unrelated to each other.

Procedural

The package consists of procedures which are executed as part of the same business event. For instance, in an auction application there are a set of actions to follow whenever a bid is made: compare to asking price, evaluate against existing maximum bid, update lot's status, update bidder's history, send an email to the bidder, send an email to the user who's been outbid, etc.

Communicational

The package contains procedures which share common inputs or outputs. For example a payroll package may have procedures to calculate base salary, overtime, sick pay, commission, bonuses and produce the overall remuneration for an employee.

Sequential

The package comprises procedures which are executed as a chain, so that the output of one procedure becomes the input for another procedure. A classic example of this is an ETL package with procedures for loading data into a staging area, validating and transforming the data, and then loading records into the target table(s).

Functional

The package comprises procedures which are focused on a single task. Not only are all the procedures strongly related to each other but they are fitted to user roles too. So procedures for power users are in a separate package from procedures for normal users. The Oracle built-in packages for Advanced Queuing are a good model of Functional cohesion.

How cohesive is cohesive enough?

The grades of cohesion, with Coincidental as the worst and Functional as the best, are guidelines. Not every package needs to have Functional cohesion. In a software architecture we will have modules at different levels. The higher modules will tend to be composed of calls to lower level modules. The low level modules are the concrete implementations and they should aspire to Sequential or Functional cohesion.

The higher level modules can be organised to other levels. For instance we might want to build packages around user roles - Sales, Production, HR, IT - because Procedural cohesion makes it easier for the UI teams to develop screens, especially if they need to skin them for various different technologies (desktop, web, mobile). Likewise we wouldn't want to have Temporally cohesive packages with concrete code for managing user logon or logoff. But there is a value in organising a package which bundles up all the low level calls into a single abstract call for use in schema level AFTER LOGON triggers.    

Cohesion is not an easily evaluated condition. We need cohesion with a purpose, a reason to stick those procedures together. It's not enough to say "this package is cohesive". We must take into consideration how cohesive the package needs to be: how will it be used? What are its relationships with the other packages?

Applying design principles such as Single Responsibility, Common Reuse, Common Closure and Interface Segregation can help us to build cohesive packages. Getting the balance right requires an understanding of the purpose of the package and its place within the overall software architecture.  

Part of the Designing PL/SQL Programs series

Pop Quiz

Michael Feldstein - Wed, 2016-04-20 00:38

Which CEO has recently said or done all of the following:

  • Suggested to an audience of VCs and ed tech entrepreneurs at the GSV conference that the importance of big data in education has been overstated
  • Told that same audience that the biggest gains from adaptive learning come when it is wrapped in good pedagogy delivered by good teachers
  • Asked former CIOs from Harvard and MIT, both of whom are senior company employees, to develop collaborations with the academic learning science community
  • Accurately described Benjamin Bloom’s two-sigma research, with special attention to the implications for the bottom half of the bell curve
  • When asked a question by an audience member about an IMS technical interoperability standard in development, correctly described both the goals of the standard and its value to educators in plain English

Answer: David Levin of McGraw Hill.

Yes yes, those are just words. But I have gotten a good look at some of what their ed tech product and data science groups have been up to lately, and I have spoken to Levin at length on a few occasions (and grilled him at length on two of them).

My advice: Pay attention to this company. They are not screwing around.

The post Pop Quiz appeared first on e-Literate.

Jonathan Lewis

Bobby Durrett's DBA Blog - Tue, 2016-04-19 17:09

I am finally getting around to finishing my four-part blog series on people who have had the most influence on my Oracle performance tuning work. The previous three people were Craig Shallahamer, Don Burleson, and Cary Millsap. The last person is Jonathan Lewis. These four people, listed and blogged about in chronological order, had the most influence on my understanding of how to do Oracle database performance tuning. There are many other great people out there and I am sure that other DBAs would produce their own, different, list of people who influenced them. But this list reflects my journey through my Oracle database career and the issues that I ran into and the experiences that I had. I ran into Jonathan Lewis’ work only after years of struggling with query tuning and getting advice from others. I ran into his material right around the time that I was beginning to learn about how the Oracle optimizer worked and some of its limits. Jonathan was a critical next step in my understanding of how Oracle’s optimizer worked and why it sometimes failed to pick the most efficient way to run a query.

Jonathan has produced many helpful tuning resources including his blog, his participation in online forums, and his talks at user group conferences, but the first and most profound way he taught me about Oracle performance tuning was through his query tuning book Cost-Based Oracle Fundamentals. It’s $30 on Amazon and that is an incredibly small amount of money to pay compared to the value of the material inside the book. I had spent many hours over several years trying to understand why the Oracle optimizer sometimes chooses the wrong way to run a query. In many cases the fast way to run something was clear to me and the optimizer’s choices left me stumped. The book helped me better understand how the Oracle optimizer chooses what it thinks is the best execution plan. Jonathan’s book describes the different parts of a plan – join types, access methods, etc. – and how the optimizer assigns a cost to the different pieces of a plan. The optimizer chooses the plan with the least cost, but if some mistake causes the optimizer to calculate an unrealistic cost then it might choose a poor plan. Understanding why the optimizer would choose a slow plan helped me understand how to resolve performance issues or prevent them from happening, a very valuable skill.

There is a lot more I could say about what I got from Jonathan Lewis’ book including just observing how he operated. Jonathan filled his book with examples which show the concepts that he was teaching. I think that I have emulated the kind of building of test scripts that you see throughout his book and on his blog and community forums. I think I have emulated not only Jonathan’s approach but the approaches of all four of the people whom I have spotlighted in this series. Each has provided me with profoundly helpful technical information that has helped me in my career. But they have also provided me with a pattern of what an Oracle performance tuning practitioner looks like. What kind of things do they do? To this point in my career I have found the Oracle performance tuning part of my job to be the most challenging and interesting and probably the most valuable to my employers. Jonathan Lewis and the three others in this four-part series have been instrumental in propelling me along this path and I am very appreciative.


Categories: DBA Blogs

April 2016 Critical Patch Update Released

Oracle Security Team - Tue, 2016-04-19 14:02

Oracle today released the April 2016 Critical Patch Update.

This Critical Patch Update provides fixes for a wide range of product families including: Oracle Database Server, Oracle E-Business Suite, Oracle Fusion Middleware, Oracle Sun Products, Oracle Java SE, and Oracle MySQL.

Oracle recommends that this Critical Patch Update be applied as soon as possible. A summary and analysis of this Critical Patch Update has been published on My Oracle Support (MOS Note 2126904.1).

For More Information:

The Critical Patch Update Advisory is located at

My Oracle Support Note 2126904.1 is located at (MOS account required).

USING SELECT 'X' in query/subqueries.

Learn DB Concepts with me... - Tue, 2016-04-19 13:26

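--  DDL for Table TAB1

CREATE TABLE ATEST.TAB1
   (    "ID" NUMBER,
        "NAME" VARCHAR2(20)  -- NAME is used by the inserts below; its type and length are an assumption
   ) ;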

Insert into ATEST.TAB1 (ID,NAME) values (1,'AAA');
Insert into ATEST.TAB1 (ID,NAME) values (2,'BBB');
Insert into ATEST.TAB1 (ID,NAME) values (3,'EEE');
Insert into ATEST.TAB1 (ID,NAME) values (4,'FFF');

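--  DDL for Table TAB2

CREATE TABLE ATEST.TAB2
   (    "ID" NUMBER,
        "NAME" VARCHAR2(20)  -- NAME is used by the inserts below; its type and length are an assumption
   ) ;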

Insert into ATEST.TAB2 (ID,NAME) values (1,'CCC');
Insert into ATEST.TAB2 (ID,NAME) values (2,'DDD');

Get records that exist in TAB1 and not in TAB2 using select 'X':

select * from TAB1 f where not exists (select 'X' from TAB2 where TAB2.ID = f.ID);

ID   NAME
--   ----
4    FFF
3    EEE

In the above query we get all the records from TAB1 whose ID has no match in TAB2.
Hence we do not get the records with IDs 1 and 2, as they exist in both tables.
This is just like using "select * from TAB1 f where not exists (select ID from TAB2 where TAB2.ID = f.ID);"

Get records that exist in TAB1 and in TAB2 using select 'X':

select * from TAB1 f where exists (select 'X' from TAB2 where TAB2.ID = f.ID);

ID   NAME
--   ----
1    AAA
2    BBB

In the above query we get all the records from TAB1 that have the same ID in TAB2.
Hence we get only the records with IDs 1 and 2, as they exist in both tables.
This is just like using "select * from TAB1 f where exists (select ID from TAB2 where TAB2.ID = f.ID);"
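The claim that the select list inside an EXISTS subquery does not change the result can be checked with a small script. This is a sketch that uses Python's built-in sqlite3 module rather than Oracle, so it only illustrates the general SQL behaviour. The correlation predicate (TAB2.ID = f.ID) and the TEXT type of the NAME column are assumptions inferred from the sample data and output, and the function name is illustrative.

```python
import sqlite3


def run_demo():
    """Run EXISTS / NOT EXISTS with two different select lists and return the rows."""
    conn = sqlite3.connect(":memory:")
    # Same sample rows as the post; column types are assumed.
    conn.executescript("""
        CREATE TABLE TAB1 (ID INTEGER, NAME TEXT);
        CREATE TABLE TAB2 (ID INTEGER, NAME TEXT);
        INSERT INTO TAB1 (ID, NAME) VALUES (1,'AAA'),(2,'BBB'),(3,'EEE'),(4,'FFF');
        INSERT INTO TAB2 (ID, NAME) VALUES (1,'CCC'),(2,'DDD');
    """)
    results = {}
    for select_list in ("'X'", "TAB2.ID"):
        for op in ("NOT EXISTS", "EXISTS"):
            # The join predicate TAB2.ID = f.ID is an assumption.
            sql = (
                f"SELECT ID, NAME FROM TAB1 f WHERE {op} "
                f"(SELECT {select_list} FROM TAB2 WHERE TAB2.ID = f.ID) "
                "ORDER BY ID"
            )
            results[(op, select_list)] = conn.execute(sql).fetchall()
    conn.close()
    return results


if __name__ == "__main__":
    for key, rows in run_demo().items():
        print(key, rows)
```

Both select lists return identical rows for each operator, because EXISTS only tests whether the subquery produces at least one row and ignores what that row contains.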
Categories: DBA Blogs