I’ve popped this to the top of the stack because OUG Scotland is only a couple of weeks away:
I’m going to be at the OUG Scotland conference on 22nd June, and one of my sessions is a panel session on Optimisation where I’ll be joined by Joze Senegacnik and Card Dudley.
The panel is NOT restricted to questions about how the cost based optimizer works (or not), we’re prepared to tackle any questions about making Oracle work faster (or more efficiently – which is not always the same thing). This might be configuration, indexing, other infrastructure etc.; and if we haven’t got a clue we can always ask the audience.
To set the ball rolling on the day it would be nice to have a few questions in advance, preferably from the audience but any real-world problems will be welcome and (probably) relevant to the audience. If you have a question that you think suitable please email it to me or add it as a comment below. Ideally a question will be fairly short and be relevant to many people; if you have to spend a long time setting the scene and supplying lots of specific detail then it’s probably a question that an audience (and the panel) would not be able to follow closely enough to give relevant help.Update 29th April
I’ve already had a couple of questions in the comments and a couple by email – but keep them coming.
Some places to learn more about partitioning include
- Oracle Database Problem Solving and Troubleshooting Handbook
- Oracle Database 11g New Features (Oracle Press)
- Oracle Database 12c The Complete Reference (Oracle Press)
- Oracle Database 12c New Features
- Partitioning home page
- 12c Partitioning Tutorial
- 12c new features - partitioning
- Oracle Education class - 12c partitioning
- youtube video - partitioning
- series of youtube videos - partitioning
Before we go down the rabbit hole and dive deep into partitioning, let's review how a select statement works and how data is stored. Say for example we have a database that contains addresses for customers. The table contains an id number, a first name, last name, address, phone number, city, state, zip code, credit card number, credit card expiration, and email address. We have a second table for our on-line catalog that contains part numbers, a title, a description, and a file link for photos. We have a third table for our orders and it contains a customer id number, a part number, an order quantity, and order date. We would create our tables with the following commands
create table customers ( customer_id number(8),, first_name varchar2(32), last_name varchar2(32), address varchar2(64), phone_number varchar2(10), city varchar2(32), state varchar2(16), zip_code varchar2(16), credit_card_number varchar2(16), credit_card_expiration varchar2(8) email_address varchar2(64) ); create table catalog ( part_number number(8), title varchar2(32), description varchar2(128), part_image blob ); create order_entry( order_number number(8), customer_id number(8), part_number number(8), part_quantity number(8), order_date date );
If we have ten million items in our catalog we potentially consume 128 + 32 + 8 + 16 bytes times 10,000,000. This makes our table roughly 2 TB in size. If we have two million orders we have about 0.5 TB for the order_entry table. When we create a database we have the option of defining not only the storage type that we want our table to reside in but we can define how and where to store the data associated with this table. By default all tables that we create as a user are stored in the SYSTEM tablespace. All three of these tables will be stored in the DATA area under the SYSTEM tablespace since we did not specify a storage area or tablespace to hold the tables. For the database that we created in previous blog entries using Oracle DBaaS, these files are stored in /u02. We can dive down the /u02/app/oracle/oradata/ORCL/PDB1 and see that there is a system01.dbf file. This correlates to the SYSTEM tablespace in the PDB1 pluggable database. As tables are added, they are added to the system01.dbf file. If we are in the container database ORCL the files are stored in /u02/app/oracle/oradata/ORCL/system01.dbf file.
To help with database performance, index are created into tables so that a reference to a table knows where in the system01.dbf file the table customers and catalog are located. We can also create an index on the table. This index is also stored in the system01.dbf file so that we can look up common queries as they are executed. For example, if we are looking for all orders that happened in February we can select this data quicker with an index by presorting all of the data related to order_date. The index allows us to directly access the table entries in the system01.dbf table by creating an index link to the entries. This index is also stored in the system01.dbf file and re-created when we enter new data into the order_entry table. Hopefully our indexes are small enough to stay resident in memory and we don't need to go to storage to reload and reindex our indexes. Partitioning helps keep indexes smaller as well and unused indexes can be aged out to disk to free up memory. If we never look at data that is two years old, we don't need to keep an index on our two year old data in memory but pull it in from disk when needed.
To reduce the access time and select time we can pre-sort the data in a different way. We can partition the data and store the table information in different files. Rather than storing everything in system01.dbf, we can store February order data in february.dbf. When an update to a table is done the insert is done into the system01.dbf file or the january.dbf, february.dbf, or march.dbf file. When we transition into April an april.dbf file is created and the january.dbf data is moved into q1_2016.dbf file. The key advantage to this is when we perform a select statement and look for data in March and April, we only look in the march.dbf and april.dbf files. The rest of the data is not loaded because we know that the data is not in the other table extents. This reduces the amount of data that is loaded into memory and reduces the amount of disk operations that are performed for every select statement. If everything was stored in the system01.dbf file, we would need to load all two million orders just to find the one or two hundred that happened in April. We basically read then throw away 97% of the data read because it does not match our request. True, the index would help but this requires multiple writes to the disk when an insert happens. With partitioning enabled for the order_date column, all order entries are stored pre-sorted by date in the different table extents. Since 11g interval partitioning automatically creates new partition tables. As we cross from February to March, the march.dbf is created and all writes corresponding to march orders are written to the new partition.
There are a variety of partition types that you can use to divide data
- Range partitioning - typically based on date columns, months, quarters, a range of numberic or character values. You can specify a value less than or value greater than when defining the partition. The value that you compare can be relative or specific to a current date or number.
- List partitioning - this describes a discrete value and assigns them to their own tablespace. We might split our catalog into plumbing products, lawn and garden products, or appliances. This helps searches into our catalog for a specific item. Note when you do a search at HomeDepot or Amazon you have the option of search in department. They are using list partitions on the back end.
- Hash partitioning - this is good if you don't have a specific range that will split the data. If you want to sort by address for example, it is difficult to list all of the addresses or sort them into a range. The hash partition allows you to split your data into 16 different partitions and the database will split the data with a best effort to spread all of the data between the number of partitions you define.
- Composite partitioning - this is a combination of two of the types described above. Composite partitioning is accomplished with the subpartition command where we first sort by one method then sub-sort by another. We could use a list-list or a list-range. We can use two of any of the above to help manage a large tablespace into smaller chunks.
- Reference partitioning - this allows you to partition data based on referential constraints. If, for example, you want to create a constraint in a table creation and sort on that constraint, you can do this with partition by reference. If we create a table and add a constraint that an order_id must be tied to a customer_id in our customers table we can partition by this constraint which effectively splits the orders table into orders by customer_id which is not defined in this table.
- Virtual column-based partitioning - virtual column partitioning allows us to split a table based on part of a column value. If, for example, we index our parts in our catalog by sequence numbers with 1-3 representing plumbing and 4-6 representing appliances, we can partition based on the first number in our part_id and effectively split the catalog based on departments without having to define the department as a column in our table. We just need to make sure that all part numbers that are inserted into our catalog follow our numbering convention and not put a gas range into the catalog staring with a 2 as the part number.
If we change the customers table described and append a partition by range statement with the command
create table customers ( .... ) partition by range (state);we divide the table into potentially fifty different tables. As a new customer is added, they are added to the new state table. Inserts happen quicker, selects happen quicker, backups happen quicker unless all of our customers are located in one state.
If we group our customers into regions and want to store data not in fifty states but in three regions we could do this wist a list range. Note that we can define the table name when we define the partition.
create table customers (....) partition by list (state) (partition part1 values ('Texas', 'Louisiana', 'Oklahoma', 'Arkansas') tablespace tola_ts , partition part2 values ('California', 'Oregon', 'Washington', 'Hawaii') tablespace pac_ts, partition category_other values (default));In this example we create the tola_ts, pac_ts, and default tablespace. We split seven of the states into three buckets and store customers into the three areas. This make reporting simpler and optimizes for select statements looking for customers in or around Texas or along the Pacific Ocean. Note that we could also subpartition this data to separate the big cities from rural areas
create table customers (.....) partition by list (state) (partition part1 values ('Texas") tablespace texas_ts subpartition big_cities (partition texas_cities values('Houston', 'Dallas', 'San Antonio', 'Austin', 'Fort Worth', 'El Paso') tablespace big_texas_ts, partition category_other values(default) tablespace small_texas_ts), partition part2 values ('California', 'Oregon', 'Washington', 'Hawaii') tablespace pac_ts, partition category_other values (default)); );This will create four tablespaces. One for Texas big cities, one for Texas small cities, one for Pacific rim states, and one for all other states.
Database 12c added a few new commands to help manage and maintain partitions. We can now alter partitions and add, truncate, drop, split, and merge. The add and merge are very valuable functions that allow us to update ranges. If, for example, we paid a consultant two years ago to define a partition by range and they went out four years with the following
create table sales (.....) partition by range (salesdate) ( partition part_2015 values less than (TO_DATE('01-Jan-2016', 'DD-MON-YYYY')), partition part_2016 values less than (TO_DATE('01-Jan-2017', 'DD-MON-YYYY')) ) ENABLE ROW MOVEMENT;But we want to start saving data by quarter rather than by year we could execute the following
alter table sales add partition p_q1_2016 values less than (TO_DATE('01-Apr-2016', 'DD-MON-YYYY')), partition p_q2_2016 values less than (TO_DATE('01-Jul-2016', 'DD-MON-YYYY'));This would slide in quarterly reporting and allow us to handle a larger volume than was created before. If at the end of the year we want to aggregate everything back into a year rather than a quarter basis we can do this with a merge command
alter table sales merge partitions p_q1_2016, p_q2_2016, p_q3_2016, p_q45_2016 into partition part_2016;
Fortunately, Enterprise Manager has a partition advisor that looks at the history of your select statements and suggests how you should divide your tables into partitions. It notices that you do a lot of select by state or select by zip code and recommends partitioning by list or by hash based on your usage patterns. This was a new feature added with Enterprise Manager 11 and has gotten more robust and reliable with 13c. We should see a significant speed up if we get the right combination of partitions and indexes and could potentially take a select statement from 45 seconds to sub seconds as shown in the Enterprise Manager screen shots below.
In summary, partitioning is very powerful. It helps you split up your larger tables so that they fit into the memory that you have allocated. The return on investment is difficult to do because the cost for partitioning vs the cost of memory and resulting speed up for queries is hard to measure. Enterprise Manager has tools to help you with this analysis but it is difficult to put into future dollars and what if analysis. It would be nice if you could say that splitting your table into partitions would reduce your buffer cache and allow you to shrink your SGA size by 25%. The tools are not quite there. They do tell you that you can reduce your select times by partitioning the data and predict relatively accurately how much faster a select statement will be with partitioning based on your current hardware configuration. All of these functions should work on Amazon RDS with the exception of manipulating a tablespace. This requires a different command syntax since manipulation of a tablespace requires system access. Typically the command would be alter database default tablespace users2 but with Amazon RDS you have to execute exec rdsadmin.rdsadmin_util.alter_default_tablespace('users2') instead. Given that this is not done very often, it is up to you to decide how and where you deploy your large table database.
- Oracle Big Data Discovery Cloud Service for Visual Face of Big Data (Oracle Partner Hub: ISV Migration Center Team)
via Oracle Partner Hub: ISV Migration Center Team http://ift.tt/1AAiVSD
My plan was to spend a long weekend in Amsterdam before heading off to the next conference…
I got off the train at Amsterdam Central and felt hot, very tired, sick and had the start of a headache. By the time I got to my hotel room I was feeling about 3/4 Machu Picchu. I took some Lemsip capsules and crashed out. I woke up about 20 minutes later still feeling bad, but knowing I would survive.
The next day came and went. I spent most of the day in bed. I ventured out of my room to film some bits of the hotel and get some food, which I promptly threw up, then went back to bed. If nothing else, I did manage to edit some videos between bouts of sleep. If you want to see where I spent Friday evening and all day Saturday, check out the video below.
Sunday morning was significantly better. After my Machu Picchu incident last year, a couple from the US gave me some electrolyte replacement sachets. I didn’t use them at the time, but found them in my bag and took one as soon as I got up. I decided to go and have a walk around, but it quickly became apparent I had zero energy. I didn’t feel ill, but just couldn’t walk any sort of distance. After about 2 miles I was totally done.
To make sure the day wasn’t a total right-off, I got onto a Sightseeing Boat and did one of the three possible loops (Green). I then picked up some food and headed back to the hotel. No puking this time, which was good. Once I was sure I wouldn’t be seeing lunch again, I headed back out and did another loop in the boat (Blue). You can see the photos and videos from the day here.
It wasn’t exactly what I had in mind for the weekend, but what ya gonna do?
So tomorrow I have to get a train down to Zeist for the OGH Oracle DBA and SQL Celebration Day on the following day.
PS. The “My Glamorous Life” title for these type of videos was always meant to be a joke, but in this case it is really funny. There was nothing glamorous about this weekend.
For my tests I've set up a 126.96.36.199 single instance database using "4 OCPUs" (Oracle CPUs) which ends up as an Oracle Linux 6 system showing 8 CPUs *and* 8 cores of type "Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz".
As edition for the database I've chosen the "Extreme Performance" Enterprise Edition which also shows up at the version banner (note the difference to regular database installations, not sure this might break some applications that don't know this banner):
The graph is supposed to show how many of the runs deviated how much from the overall median runtime, so the Y axis represents the percentage, and the X axis represents the deviation from the median runtime, for example 0.5 on the Y axis means 0.5 percent deviation from the median value
- The DBaaS service after a couple of days showed a very stable consistency pattern, only during the first four days the pattern was different:
As I observed yet again last week, much of analytics is concerned with anomaly detection, analysis and response. I don’t think anybody understands the full consequences of that fact,* but let’s start with some basics.
An anomaly, for our purposes, is a data point or more likely a data aggregate that is notably different from the trend or norm. If I may oversimplify, there are three kinds of anomalies:
- Important signals. Something is going on, and it matters. Somebody — or perhaps just an automated system — needs to know about it. Time may be of the essence.
- Unimportant signals. Something is going on, but so what?
- Pure noise. Even a fair coin flip can have long streaks of coming up “heads”.
Two major considerations are:
- Whether the recipient of a signal can do something valuable with the information.
- How “costly” it is for the recipient to receive an unimportant signal or other false positive.
What I mean by the latter point is:
- Something that sets a cell phone buzzing had better be important, to the phone’s owner personally.
- But it may be OK if something unimportant changes one small part of a busy screen display.
Anyhow, the Holy Grail* of anomaly management is a system that sends the right alerts to the right people, and never sends them wrong ones. And the quest seems about as hard as that for the Holy Grail, although this one uses more venture capital and fewer horses.
*The Holy Grail, in legend, was found by 1-3 knights: Sir Galahad (in most stories), Sir Percival (in many), and Sir Bors (in some). Leading vendors right now are perhaps around the level of Sir Kay.
Difficulties in anomaly management technology include:
- Performance is a major challenge. Ideally, you’re running statistical tests on all data — at least on all fresh data — at all times.
- User experiences are held to high standards.
- False negatives are very bad.
- False positives can be very annoying.
- Robust role-based alert selection is often needed.
- So are robust visualization and drilldown.
- Data quality problems can look like anomalies. In some cases, bad data screws up anomaly detection, by causing false positives. In others, it’s just another kind of anomaly to detect.
- Anomalies are inherently surprising. We don’t know in advance what they’ll be.
Consequences of the last point include:
- It’s hard to tune performance when one doesn’t know exactly how the system will be used.
- It’s hard to set up role-based alerting if one doesn’t know exactly what kinds of alerts there will be.
- It’s hard to choose models for the machine learning part of the system.
Donald Rumsfeld’s distinction between “known unknowns” and “unknown unknowns” is relevant here, although it feels wrong to mention Rumsfeld and Sir Galahad in the same post.
And so a reasonable summary of my views might be:
Anomaly management is an important and difficult problem. So far, vendors have done a questionable job of solving it.
But there’s a lot of activity, which I look forward to writing about in considerable detail.
- The most directly relevant companies I’ve written about are probably Rocana and Splunk.
We share our skills to maximize your revenue!
Here is a list of download options for APR2016 (188.8.131.52.160419) PSU:
OJVM PSU (Linux/Unix) 22674697
Combo OJVM + DB PSU 22738777
Combo OJVM + DB SPU 22738732
Combo OJVM + GI PSU 22738793
That’s not enough and there should be more (sarcasm).
FYI: GI PSU includes DB and DB PSU does not include GI.
There is a thread from MOS – OJVM PSU and RAC: What happened to high availability? and I was wondering the same.
One document says do A then B while another has do B then A.Oracle Recommended Patches — “Oracle JavaVM Component Database PSU” (OJVM PSU) Patches (Doc ID 1929745.1)
1. Shutdown databases and services on all nodes
2. Apply DB PSU (or equivalent) but DO NOT RUN DB PSU POST INSTALL STEPS
3. Apply OJVM PSU patch [see note-1 below]
4. October 2014 only for DB versions below 184.108.40.206: Apply the JDBC Patch [see note-2 below]
5. Run post install steps on all DBs in the patched home:
For 220.127.116.11 and 18.104.22.168 run the OJVM PSU post install steps followed by the DB PSU (or equivalent) post install steps.
6. Re-start any stopped databases / services running from this ORACLE_HOMEPatch 22738793 – Combo of OJVM Component 22.214.171.124.160419 DB PSU + GI PSU 126.96.36.199.160419 (Apr2016) README has the opposite.
SQL> @catbundle.sql psu apply
When opatch auto is used for Combo OJVM + GI PSU 22738793, database will be shutdown and may not start up.
PRCH-1061 Applying OCT2014 Grid PSU 19380115 With Opatch Auto (Doc ID 1946048.1)
srvctl stop/status home creates empty file after applying oct2014 psu 19380115 (Doc ID 1946050.1)
[root@arrow ~]# . oraenv <<< grid ORACLE_SID = [root] ? The Oracle base has been set to /u01/app/oracle [root@arrow ~]# which make ar ld nm /usr/bin/make /usr/bin/ar /usr/bin/ld /usr/bin/nm [root@arrow ~]# export PATCH_TOP_DIR=/media/sf_linux_x64 [root@arrow ~]# $ORACLE_HOME/OPatch/ocm/bin/emocmrsp -no_banner -output /tmp/ocm.rsp Provide your email address to be informed of security issues, install and initiate Oracle Configuration Manager. Easier for you if you use your My Oracle Support Email address/User Name. Visit http://www.oracle.com/support/policies.html for details. Email address/User Name: You have not provided an email address for notification of security issues. Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]: Y The OCM configuration response file (/tmp/ocm.rsp) was successfully created. [root@arrow ~]# $ORACLE_HOME/OPatch/opatch auto $PATCH_TOP_DIR/22738793 -ocmrf /tmp/ocm.rsp Executing /u01/app/11.2.0/grid/perl/bin/perl /u01/app/11.2.0/grid/OPatch/crs/patch11203.pl -patchdir /media/sf_linux_x64 -patchn 22738793 -ocmrf /tmp/ocm.rsp -paramfile /u01/app/11.2.0/grid/crs/install/crsconfig_params This is the main log file: /u01/app/11.2.0/grid/cfgtoollogs/opatchauto2016-06-04_09-11-57.log This file will show your detected configuration and all the steps that opatchauto attempted to do on your system: /u01/app/11.2.0/grid/cfgtoollogs/opatchauto2016-06-04_09-11-57.report.log 2016-06-04 09:11:57: Starting Oracle Restart Patch Setup Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params Stopping RAC /u01/app/oracle/product/11.2.0/dbhome_1 ... Stopped RAC /u01/app/oracle/product/11.2.0/dbhome_1 successfully patch /media/sf_linux_x64/22738793/22502456 apply successful for home /u01/app/oracle/product/11.2.0/dbhome_1 patch /media/sf_linux_x64/22738793/22502549/custom/server/22502549 apply successful for home /u01/app/oracle/product/11.2.0/dbhome_1 patch /media/sf_linux_x64/22738793/22674697 apply successful for home /u01/app/oracle/product/11.2.0/dbhome_1 Stopping CRS... Stopped CRS successfully patch /media/sf_linux_x64/22738793/22502456 apply successful for home /u01/app/11.2.0/grid patch /media/sf_linux_x64/22738793/22502549 apply successful for home /u01/app/11.2.0/grid patch /media/sf_linux_x64/22738793/22502505 apply successful for home /u01/app/11.2.0/grid Starting CRS... CRS-4123: Oracle High Availability Services has been started. Starting RAC /u01/app/oracle/product/11.2.0/dbhome_1 ... Failed to start resources from database home /u01/app/oracle/product/11.2.0/dbhome_1 ERROR: Refer log file for more details. opatch auto failed. [root@arrow ~]# tail /u01/app/11.2.0/grid/cfgtoollogs/opatchauto2016-06-04_09-11-57.log 2016-06-04 09:30:23: Removing file /tmp/fileR3nOfE 2016-06-04 09:30:23: Successfully removed file: /tmp/fileR3nOfE 2016-06-04 09:30:23: /bin/su exited with rc=1 2016-06-04 09:30:23: /u01/app/oracle/product/11.2.0/dbhome_1/bin/srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s /u01/app/oracle/product/11.2.0/dbhome_1/srvm/admin/stophome.txt output is PRCH-1001 : Failed to start resources in Oracle home /u01/app/oracle/product/11.2.0/dbhome_1 PRCH-1061 : The following resources specified in the state file do not exist: svc-hulk_svc, svc-thor_svc 2016-06-04 09:30:23: Failed to start resources from database home /u01/app/oracle/product/11.2.0/dbhome_1 2016-06-04 09:30:23: ERROR: Refer log file for more details. [root@arrow ~]# tail /u01/app/11.2.0/grid/cfgtoollogs/opatchauto2016-06-04_09-11-57.report.log 33: /u01/app/11.2.0/grid/rdbms/install/rootadd_rdbms.sh : run as root 34: /u01/app/11.2.0/grid/crs/install/roothas.pl -patch : run as root 35: /u01/app/oracle/product/11.2.0/dbhome_1/bin/emctl start dbconsole 36: /u01/app/oracle/product/11.2.0/dbhome_1/bin/emctl start agent 37: /u01/app/oracle/product/11.2.0/dbhome_1/bin/srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s /u01/app/oracle/product/11.2.0/dbhome_1/srvm/admin/stophome.txt [root@arrow ~]# grep stop /u01/app/11.2.0/grid/cfgtoollogs/opatchauto2016-06-04_09-11-57.report.log 13: /u01/app/oracle/product/11.2.0/dbhome_1/bin/emctl stop dbconsole 14: /u01/app/oracle/product/11.2.0/dbhome_1/bin/emctl stop agent 18: /u01/app/oracle/product/11.2.0/dbhome_1/bin/srvctl stop home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s /u01/app/oracle/product/11.2.0/dbhome_1/srvm/admin/stophome.txt -f 37: /u01/app/oracle/product/11.2.0/dbhome_1/bin/srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s /u01/app/oracle/product/11.2.0/dbhome_1/srvm/admin/stophome.txt [root@arrow ~]#
Day 2 of AMIS 25 – Beyond the Horizon started with me playing catch-up on some blogging, then the conference proper started.
The first session I went to was Jeremy Ashley presenting “General Session – Oracle Applications User Experience: Trends and Strategy”. I wrote loads of notes on this, because I find it really interesting. I’m not going to post them here because I will make some mistakes and look like a fool. I’ve seen a number of sessions by the UX team over the years and each time I do, it seems to click a little more. I’m starting to think SQL*Plus may not be the ultimate in UI or UX, but I’m not 100% sure…
Next up was Björn Rost doing a “Quick Guide to Open Source SQL Tuning Tools (Quickie)”. This was a short session talking about some tools I know and use (rlWrap, SQLcl, MOATS, Snapper, SQL Developer) and something that was new to me TUNAs360.
After that was my session about running Oracle databases in the cloud.
I then took a break do some work, upgrading Tomcat and deploying a new application, before heading off to the OTN Cloud Developer Challenge results.
I would like to say we were robbed, but we really weren’t.
A wise man (or woman – the quote is unattributed) once said that assumption is the mother of all cock-ups.
This is especially true in the wonderful world of databases.
The term NoSQL covers databases as different from each other as they are from the traditional Relational Database Management Systems (RDBMS).
The assumption implicit in that last sentence is that Relational Databases are broadly the same.
The problems with this assumption begin to manifest themselves when a team is assembled to write a new application running on an Oracle RDBMS.
Non-Oracle developers may have been used to treating databases as merely a persistence layer. Their previous applications may well have been written to be Database Agnostic.
This is a term which is likely to cause consternation among Oracle Developers, or at least, Oracle Developers who have ever tried to implement and support a Database Agnostic application running on Oracle. They may well think of this approach as the “Big Skip” anti-pattern where the database is treated as a dumping ground for any old rubbish the application feels like storing.
As a consequence, they will strongly favour the application being “Front-End Agnostic”. In other words, they will lean toward the Thick Database Paradigm as a template for application architecture.
With all of this Agnosticism about it’s amazing how religious things can get as the relative merits of these opposing views are debated.
These diametrically opposing views on the optimum architecture for a database centric application all stem from that one assumption about Relational Databases.
To make things even more interesting, both sides in this debate share this assumption.
The fact of the matter is that Oracle is very different from other RDBMSs. Oracle Developers need to appreciate this so that they can accept that the Database Agnostic Architecture is a legitimate choice for some RDBMSs and is not simply the result of non-Oracle Developers not knowing anything about databases.
The other point to note is that Oracle is very different from other RDBMS – OK, it’s technically the same point, but it’s such an important one, it’s worth mentioning twice.
Non-Oracle Developers need to understand this so that they can accept that the Thick Database Paradigm is a legitimate choice for the Oracle RDBMS and not simply the result of technological parochialism on the part of Oracle Developers.
Whatever kind of developer you are, you’re probably wondering just what I’m banging on about right now and where this is going.
Well, the purpose of this post is to take several steps back from the normal starting point for the debate over the optimal application architecture for a Database Centric Application on Oracle and set out :
- Why Relational Databases are different from each other
- Why the Thick Database Approach can be particularly suited to Oracle
- Under what circumstances this may not be the case
Hopefully, by the end I’ll have demonstrated to any non-Oracle Developers reading this that the Thick Database Paradigm is at least worth considering when developing this type of application when Oracle is the RDBMS.
I will also have reminded any Oracle Developers that Oracle is a bit different to other RDBMS and that this needs to be pointed out to their non-Oracle colleagues when the subject of application architecture is being discussed.
I will attempt to keep the discussion at a reasonably high-level, but there is the odd coding example.
Where I’ve included code, I’ve used the standard Oracle demo tables from the HR application.
There are several good articles that do dive into the technical nitty-gritty of the Thick Database Paradigm on Oracle and I have included links to some of them at the end of this post.
I can already hear some sniggering when the term Thick Database gets used. Yes, you there in the “Web Developers Do It Online” t-shirt.
In some ways it would be better to think of this as the Intelligent Database Paradigm, if only to cater to those with a more basic sense of humour.
Before I go too much further, I should really be clear about the assumptions all of this is based on.Application Requirements
To keep things simple, I’m going to assume that our theoretical application implements some form On-Line Transaction Processing (OLTP) functionality.
Of course, I’m going to assume that Oracle is the chosen database platform (or at least, the one you’re stuck with).
Most importantly, I’m going to assume that the fundamental non-functional requirements of the application are :
On a not entirely unrelated topic, I should also mention some terms, when used in the context of the Oracle RDBMS, have a slightly different meanings to that you might expect…
- database – normally a term used to describe the database objects in an application – in Oracle we’d call this a schema. This is because database objects in Oracle must be owned by a database user or schema.
- stored procedure – it’s common practice in PL/SQL to collect procedures and functions into Packages – so you’ll often hear the term Packaged Procedures, Packages, or Stored Program Units to cover this
- database object – this is simply any discrete object held in the database – tables, views, packages etc
- transaction – by default, Oracle implements the default ANSI SQL behaviour that a transaction consists of one or more SQL statements. A transaction is normally terminated explicitly by the issuing of a COMMIT or a ROLLBACK command.
This is normally pre-installed with every Oracle database, although your DBA may have removed it as part of the installation.
If you want to follow along and it’s not installed, you can find the build script for it in :
Note that the script requires you to provide the SYS password for the database.
I’ve created copies of two of the tables from this application, the EMPLOYEES and DEPARTMENTS tables, for use in the examples below.Database Agnostic and Thick Database – definitions
To keep things simple, we can explain each in the context of the Model-View-Controller(MVC) design pattern.
In MVC, the application components are divided into three categories :
- The View – the GUI
- The Controller – where all of the application logic exists. This layer sits in the middle between the view and the…
- Model – the persistence layer – traditionally a Relational Database implementing a Physical Data Model
The Database Agnostic Approach is to treat the Model simply as a persistence layer. Implementation of Referential Integrity in the database is minimal and the implementation of any business logic is done entirely in the Controller layer, little or none of which impinges upon the RDBMS in the Model.
The main idea behind this approach is that it is trivial to migrate the application from one RDBMS to another.
The Thick Database Paradigm takes a very different approach.
The Referential Integrity is rigorously applied in the RDBMS, and the Data Model is done in some approximation of Third Normal Form.
The Controller layer is in fact implemented as two physical layers.
The code outside of the database – the Data Access Layer (DAL) accesses the model by means of a Transactional API (XAPI) which is held in Stored Procedures inside the RDBMS engine itself.
We’re going to explore the advantages of this approach in the context of the Oracle RDBMS.
There we are then, something for everyone to object to.
The thing is, both of these approaches have their place. The trick is to know the circumstances under which one is more appropriate.
It may help at this point then, if we can return to the question of…
Maybe that heading should read “Are RDBMSs different from each other ?” Superficially at least, they do seem to have a fair bit in common.
To start with, they all implement the relational model to some degree. This means that data is arranged in tables and that (in the main) it is possible to define relationships between these tables.
For circumstances where a Business Transaction may require multiple DML statements, the RDBMS will enable the creation of Stored Procedures to enable such transactions to be done in a single call to the database.
The most obvious similarity is, of course, that any retrieval of or amendment to data stored in the database is ultimately done by means of a Structured Query Language (SQL) statement.
A fundamental characteristic of SQL is that it is a Declarative Language. You use it to tell the database what data you want to access. It is the Database Engine that then has to figure out how to do this.
Whilst the implementation of SQL is (more-or-less) standard across RDBMSs, the underlying Database Engines behave very differently.
One example of the differences between Database Engines can be seen when you need to execute a query that contains many table joins.
If you were running such a query on MSSQL, it may well be more efficient to do this in multiple steps. This would be done by writing a query to populate a temporary table with a result set and then joining from that table to return the final results.
This contrasts with Oracle, where the optimal approach is usually to do this with a single SQL statement.
For the moment, I’ll assume that the above has been sufficient to persuade you that Relational Databases are in fact different from each other in a fairly fundamental way.
Feel free to put this to the test yourself. Go ahead, I’ll wait….
OK. We’re all agreed on that then.
The question you’re now asking is this –
If RDBMSs are different from each other is the Database Agnostic approach the best architecture for all of them ?
The next thing we need to understand is….Why Oracle is Special
“Because it’s so expensive !” may well be your first thought. Remember that we’re assuming that Oracle is the RDBMS platform that you have chosen ( or been lumbered with) for your application. This being the case, we come back to the question of why the Thick Database Paradigm is worthy of consideration for your application architecture.
Returning to our list of non-functional application requirements, can you guess which of the application components is likely to have the biggest impact on performance of an Oracle Database Application ? Clue : It’s also the most expensive thing to change after go-live as it’s the card at the bottom of the house of cards that is your Application….The Physical Data Model
This aspect of the Thick Database Paradigm is often overlooked. However, it is by far the most important aspect in maximizing the success of the implementation of this architectural approach.
Oh, that’s your sceptical face, isn’t it. You’re really not entirely sure about this. You’re probably not alone, even some Oracle Developers will be giving me that same look about now. I hope this next bit is convincing because my flame-proof underpants are currently in the wash.
OK, as I said a little while ago ( and I think you pretty much agreed at the time), any interaction with stored in an RDBMS will ultimately require the execution of an SQL statement by the Database Engine.
The particular bit of the Oracle Kernel that works out the how is probably called KX$ something. Friends however, tend to refer to it as the Cost Based Optimizer (CBO).
The CBO is pretty sophisticated. The more information you can provide Oracle about your data model the better the execution plans the CBO generates.
The upshot is that the better the data model, the faster that statements against it will run.
For example, the CBO understands RI constraints and can account for them in it’s execution plans as I will now demonstrate…
I’ve copied the EMPLOYEES and DEPARTMENTS tables, including data, from the standard Oracle demo – the HR Application.
The DEPARTMENTS table looks like this :
create table departments ( department_id number(4) not null, department_name varchar2(30) not null, manager_id number(6), location_id number(4) ) / alter table departments add constraint departments_pk primary key( department_id) /
…and the EMPLOYEES like this :
create table employees ( employee_id number(6) not null, first_name varchar2(20), last_name varchar2(25) not null, email varchar2(25) not null, phone_number varchar2(20), hire_date date not null, job_id varchar2(10) not null, salary number(8,2), commission_pct number(2,2), manager_id number(6), department_id number(4) ) / alter table employees add constraint employees_pk primary key (employee_id) /
Note that whilst DEPARTMENT_ID is listed in both tables I’ve not implemented any RI constraints at this point.
Now consider the following query
select emp.first_name, emp.last_name, dept.department_id from employees emp inner join departments dept on emp.department_id = dept.department_id where emp.department_id = 60 /
If we ask the CBO for an execution plan for this query…
explain plan for select emp.first_name, emp.last_name, dept.department_id from employees emp inner join departments dept on emp.department_id = dept.department_id where emp.department_id = 60 /
… it will come back with something like this :
select * from table(dbms_xplan.display) / Plan hash value: 2016977165 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 110 | 3 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 5 | 110 | 3 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN| DEPARTMENTS_PK | 1 | 4 | 0 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| EMPLOYEES | 5 | 90 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("DEPT"."DEPARTMENT_ID"=60) 3 - filter("EMP"."DEPARTMENT_ID"=60) 16 rows selected.
If we now add a constraint to ensure that a DEPARTMENT_ID in the EMPLOYEES table must already exist in the DEPARTMENTS table…
alter table employees add constraint emp_dept_fk foreign key (department_id) references departments(department_id) /
…and then get the execution plan…
explain plan for select emp.first_name, emp.last_name, dept.department_id from employees emp inner join departments dept on emp.department_id = dept.department_id where emp.department_id = 60 / select * from table(dbms_xplan.display) / Plan hash value: 1445457117 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 90 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| EMPLOYEES | 5 | 90 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("EMP"."DEPARTMENT_ID"=60) 13 rows selected.
…we can see that the CBO is smart enough to know that the RI constraint eliminates the need to read the DEPARTMENTS table at all for this query.
A sensible data model has some other key benefits.
insert into hr.employees ( employee_id, first_name, last_name, email, hire_date, job_id, department_id ) values ( 207, 'MIKE', 'S', 'mikes', sysdate, 'IT_PROG', 999 -- department_id does not exist in the DEPARTMENTS table ) /
…results in …
SQL Error: ORA-02291: integrity constraint (HR.EMP_DEPT_FK) violated - parent key not found
Simply by typing a one-line statement to add this constraint, we’ve prevented the possibility of orphaned records being added to our application.
Better still, this rule will be enforced however the data is added – no just records added via the application.
About now, non-Oracle developers may well be making the point that this logic needs to be implemented in the application code anyway. By adding it to the data model, aren’t we effectively coding the same functionality twice ?
Well, as we can see from the example above, the code required to create an RI constraint is minimal. Also, once it’s created, it exists in Oracle, there is no need to explicitly invoke it every time you need to use it.
Additionally, if you fully adopt the Thick Database approach, you don’t necessarily have to write code to re-implement rules enforced by constraints.
One other point that may well come up is the fact that most of the world now uses some variation of the Agile Development Methodology. Producing a complete data model in Sprint 1 is going to be a bit of a tall order for an application of even moderate complexity.
This is true. However, by implementing the Data Access Layer (DAL) pattern and separating the application code from the underlying data model, it’s possible to create stubs in place of parts of the data model that haven’t been developed. This does make it possible to fit Data Modelling into the structure required by these methodologies.
The key point here is that, even if this is the only bit of the thick db paradigm you implement your app will be far more maintainable.
The tuning tools at your disposal will be far more effective and useful if your application is based on a well defined, relational data model.
Whilst we’re on the subject of Application Code, it’s probably worth asking….What are Stored Procedures Good For ?
In order to understand this, we need to look at the concept of a Database Transaction.
The ANSI Standard for SQL mandates that a transaction consists of one or more DML statements. In general terms, if the transaction is committed then all of the changes made by each statement in the transaction is saved. Otherwise, none of them are.
By default, many RDBMSs implement a transaction as a single SQL statement. In Oracle, the default behaviour conforms to the ANSI Standard.
In circumstances where a Business Transaction requires multiple DML statements, things can get a bit tricky without a Stored Procedure.
The application needs to issue multiple individual statements and commit each one in turn.
If a second or subsequent statement fails for any reason then you find that your data is left in an inconsistent state.
Stored Procedures solve this problem by bundling these statements up into a single transaction.
We’ll have a look at a specific example of this approach in Oracle using…
The typical approach taken by vendors to implement Stored Procedures in an RDBMS involves providing some extensions to SQL to make it Turing Complete.
These extensions ( variable declaration, conditional statements, looping) are normally fairly minimal.
Oracle took a rather different approach with PL/SQL.
They took the ADA programming language and provided it with SQL extensions.
From the start then, PL/SQL was rather more fully featured than your average Stored Procedure language.
In the almost 30 years of it’s existence, PL/SQL has been further integrated within the RDBMS engine. Also, the addition of thousands of Oracle supplied libraries (packages) have extended it’s functionality to the point where it can be used for tasks as diverse as inter-session communication, backup and recovery, and sending e-mail.
Being a fully-fledged 3GL embedded into the heart of the database engine, PL/SQL is the fastest language for processing data in Oracle.
This is partly due to the fact that the code is co-located with the data, so network latency and bandwidth are not really an issue.
Yes, and you thought the idea of co-locating code and data was invented when those whizzy NoSQL databases came along, didn’t you ?
PL/SQL allows the developer to take a set-based approach to working with data. You can pretty much drop a SQL DML statement straight into a PL/SQL program without (necessarily) having to build it as a string first.
Furthermore, remember that transactions can encompass multiple database changes. By implementing these in PL/SQL, the entire transaction can be completed with a single database call, something that is not necessarily the case when the Controller code is outside of the database.
Implementing Business Transactions in PL/SQL is commonly done using the Transactional API (XAPI) pattern.
There is one particular aspect of ADA which has become central to the way that PL/SQL applications are written and that is the Package.
Rather than having lots of standalone procedures and functions, it is common practice to group these “stored procedures” into PL/SQL packages.
This approach has several advantages.
Grouping related functionality into packages reduces the number of individual programs you need to keep track of.
PL/SQL packages are stored in Oracle’s Database Catalogue ( the Data Dictionary) as two distinct objects – a Package Header or Specification – essentially the signature of all of the functions and procedures in the package ( package members) – and a Package Body – the actual code.
The Package Header is the object that is called to invoke a member procedure.
Provided you are not changing the signature of a public package member, you can amend the code in the package body without having to re-compile the header.
This means that you can make changes to the transactional code “under the hood” without necessarily requiring any re-coding in the caller to a packaged procedure.
Right, it’s time for an example.
Say we want to change the Manager of a Department. In this case, the current IT Department Manager – Alexander Hunold has decided that he’s far too busy to attend all of those planning meetings. I mean he took this job so he didn’t have to speak to anybody. You can tell he’s not really a people person, I mean just look at that T-shirt.
Diana Lorentz on the other hand, whilst also having the required technical background has a much better way with people.
So, in order to change the manager in the IT Department from Alexander to Diana we need to :
- Update the record in the DEPARTMENTS table with the ID of the new manager
- Update the EMPLOYEES records for members of that department so that they now report to the new manager
- Update the EMPLOYEES record for the new manager so that she now reports to the Department’s overall boss
Among other things, we’ll need to know which DEPARTMENT_ID we need to make these changes for. This would normally be selected from a drop-down list in the Application’s UI, with the name of the Department being displayed to the user but the ID being passed to our procedure.
Whilst the list of Departments is static/reference data and may well be cached on the mid-tier of our application to save repeated database calls, we’ll still need a means of getting this data out of the database initially.
Therefore, we may well have a package that contains two members :
- a function to return the department information
- a procedure to assign the new manager
Such a package will probably look something like this. First the Package Header…
create or replace package manage_departments is -- -- This is the package header or specification. -- It gives the signature of all public package members (functions and packages -- function get_department_list return sys_refcursor; procedure change_manager ( i_department_id departments.department_id%type, i_old_manager_id employees.employee_id%type, i_new_manager_id departments.manager_id%type ); end manage_departments; /
… and now the body…
create or replace package body manage_departments is -- -- This is the package body. -- It contains the actual code for the functions and procedures in the package -- function get_department_list return sys_refcursor is l_rc sys_refcursor; begin open l_rc for select department_name, department_id from departments; return l_rc; end get_department_list; procedure change_manager ( i_department_id departments.department_id%type, i_old_manager_id employees.employee_id%type, i_new_manager_id departments.manager_id%type ) is l_dept_head_manager employees.manager_id%type; begin -- -- First update the department record with the new manager -- update departments set manager_id = i_new_manager_id where department_id = i_department_id; -- Now find the Manager of the existing department head -- we'll need this to assign to the new department head -- select manager_id into l_dept_head_manager from employees where employee_id = i_old_manager_id; -- -- Now update all of the employees in that department to -- report to the new manager...apart from the new manager themselves -- who reports to the department head. update employees set manager_id = case when employee_id != i_new_manager_id then i_new_manager_id else l_dept_head_manager end where department_id = i_department_id; -- -- Note - for the purposes of simplicity I have not included any -- error handling. -- Additionally, best practice is normally to allow transaction control -- to be determined by the caller of a procedure so an explicit commit -- or rollback needs to take place there. -- end change_manager; end manage_departments; /
Using the Oracle CLI, SQL*Plus to act as the caller, we can see how the function works :
set autoprint on set pages 0 var depts refcursor exec :depts := manage_departments.get_department_list PL/SQL procedure successfully completed. Administration 10 Marketing 20 Purchasing 30 Human Resources 40 Shipping 50 IT 60 Public Relations 70 Sales 80 Executive 90 Finance 100 Accounting 110 Treasury 120 Corporate Tax 130 Control And Credit 140 Shareholder Services 150 Benefits 160 Manufacturing 170 Construction 180 Contracting 190 Operations 200 IT Support 210 NOC 220 IT Helpdesk 230 Government Sales 240 Retail Sales 250 Recruiting 260 Payroll 270 27 rows selected.
Now we need to call the procedure to change the manager. In order to keep things simple, I’ve cheated a bit here and not included the code to lookup the EMPLOYEE_IDs of Alexander (103) and Diana ( 107).
So, using SQL*Plus once again :
exec manage_departments.change_manager(60, 103, 107) commit;
NOTE – it is also possible (and often preferred) to pass parameters by reference when calling PL/SQL. So, the following code would work equally well ( and possibly be a bit more readable) :
exec manage_departments.change_manager( i_department_id => 60, i_old_manager_id => 103, i_new_manager_id =>; 107); commit;
We can now see that both of the DML changes have been applied :
select emp.first_name||' '||emp.last_name, dept.manager_id from departments dept inner join employees emp on dept.manager_id = emp.employee_id where dept.department_id = 60 / EMP.FIRST_NAME||''||EMP.LAST_NAME MANAGER_ID ---------------------------------------------- ---------- Diana Lorentz 107 select first_name, last_name, manager_id from employees where department_id = 60 / FIRST_NAME LAST_NAME MANAGER_ID -------------------- ------------------------- ---------- Alexander Hunold 107 Bruce Ernst 107 David Austin 107 Valli Pataballa 107 Diana Lorentz 102
The fact that Packages are stored in the Data Dictionary means that Oracle automatically keeps track of the dependencies that they have on other database objects.
This makes impact analysis much easier. For example, if we were going to make a change to the DEPARTMENTS table, we could see what other database objects might be impacted by running the following query on the Data Dictionary :
select name, type from user_dependencies where referenced_name = 'DEPARTMENTS' and referenced_type = 'TABLE' / NAME TYPE ------------------------------ ------------------ MANAGE_DEPARTMENTS PACKAGE MANAGE_DEPARTMENTS PACKAGE BODY
One more significant benefit of using PL/SQL is that any parameters passed into a stored procedure – whether part of a package or standalone – are automatically bound.
Bind variables are advantageous for two reasons.
Firstly, use of them enables Oracle to re-execute frequently invoked statements from memory, without having to re-validate them each time. This is known as a soft parse. This offers significant performance benefits.
The second, and perhaps more important advantage is that bind variables tend not to be susceptible to SQL Injection strings.
Effectively, calling a PL/SQL stored program unit is the equivalent of making a Prepared Statement call.
Whilst this automatic binding does not render PL/SQL completely immune from SQL Injection, it does greatly reduce the attack surface for this kind of exploit.
In-Memory processing is big at the moment. It’s one of those things like Big Data in that there is lots of enthusiasm around something which, to be frank, has already been happening for many years.
Oracle has some rather sophisticated memory management out of the box.
As already mentioned, SQL and PL/SQL code that is frequently executed, together with the meta-data required to parse it, is cached in memory.
The same is true for frequently used data blocks. In other words, if you have data that is frequently accessed, Oracle will look to store this in memory, thus reducing the amount of physical I/O it needs to do.
This has nothing to do with Oracle’s newfangled “In-memory” option. It’s a core part of the product.
Generally speaking, the more application code you add to the RDBMS, the more efficiently Oracle will work.
When measured against the non-functional requirements for our application, the Thick Database approach ticks all of the boxes.Accuracy
Referential Integrity in the Data Model means that we can prevent incorrect data from being stored.
The flexibility of PL/SQL and it’s close coupling with SQL means that we can easily implement business rules to ensure system accuracy.
By implementing a XAPI layer in PL/SQL, we ensure that there is a single point of entry into the application. Because business transactions always execute the same code, we can ensure that the results of those transactions are repeatable, and accurate.
As we have seen, a well-defined Data Model allows the CBO to choose the optimum execution plan for each query.
The use of bind variables ensures that frequently executed statements are cached in memory.
The fact that most of the processing happens inside the database engine means that network latency is minimized as a performance overhead.
By it’s very nature, any application that manipulates and stores data will increase the amount of data it handles over time.
This increase in data volumes will start to affect performance.
Oracle is designed and optimized to handle data stored in relational structures. Having a properly defined data model will enable you to maximise the effectiveness of the tuning tools at your disposal.
Having your application code in a single location ( i.e. the PL/SQL XAPI layer) means that code is not replicated across multiple application layers.
As PL/SQL is tightly coupled with SQL, it also means that you tend to need fewer lines of code to implement application functionality.
Having the application code in the database means that dependency tracking comes “for free” by means of the Data Dictionary.
This is especially handy when doing Impact Analysis on any application changes that may be required down the line.
PL/SQL parameters are bound auto-magically. Unless you’re being careless with some dynamic SQL inside of the PL/SQL code itself, these parameters are pretty much immune to SQL Injection.
Still feeling sceptical after reading that ? Good. Whilst I have provided some evidence to support these assertions, it’s not what you’d call incontrovertible.
But I’m getting ahead of myself. Before summarising, I did say that there may be some circumstances where this approach may not be suitable…
By it’s very nature the Thick Database approach on Oracle RDBMS puts an Application smack in the middle of an Oracle “walled garden”.
If you ever want to migrate to another RDBMS, the task is unlikely to be straight forward.
Yes, PostgresSQL is similar in nature to PL/SQL. As I’ve never attempted a migration from Oracle to Postgres, I can’t comment on whether this lessens the effort required.
So, if you’re in a situation where you know that your application will need to move to another RDBMS in the short term, the pain of sub-optimal performance on Oracle may be worth the gain when you come to do the migration.
A word of warning here – I have personal experience of applications that we’re only supposed to be on Oracle for six months after Go-live…and we’re still in Production several years later.
Alternatively, you may be a software vendor who needs to support your application across multiple database platforms.
The benefit of having a single code base for all supported platforms may outweigh the overhead of the additional effort required to address the issues that will almost certainly arise when running a Database Agnostic application on an Oracle RDBMS.
If you do find yourself in this situation then you may consider recommending a database other than Oracle to your clients.
It is worth pointing out however, that in either case, a well-designed physical data model where Referential Integrity is enforced by means of constraints will provide substantial mitigation to some of the performance issues you may encounter.
This is certainly not going to help with an application using the Entity-Attribute-Value (EAV) model.
I would suggest that if EAV is absolutely essential to your solution then a Relational Database almost certainly isn’t.
If you’ve made it this far, I hope that you have at least been persuaded that the Thick Database Paradigm is not a completely bonkers way of writing an application against an Oracle database.
That’s not to say that you’re sold on the idea by any means. As I’ve said already, what I’ve attempted to do here is provide some illustrations as to why this approach is preferred among Oracle Developers. It’s not cast-iron proof that this is the case with your specific application.
What you’ll probably want to do now is read up a bit more on this approach following which, you may well want to do some testing to see if all of these claims stack up.
So, if you want some proper, in-depth technical discussions on the Thick Database Paradigm, these links may be of some use :
- Bryn Llewellyn, Oracle Distinguished Product Manager for PL/SQL, has written an in-depth White Paper on this subject – Why Use PL/SQL ?
- Dr Paul Dorsey’s presentation is worth a read
- There’s an excellent article challenging the conventional wisdom about Oracle Stored Procedures from Morten Braten
If and when you do come to do some testing, it’s important to remember that the benefits of the Thick Database approach – certainly in performance terms – become more apparent the greater the volume of data and transactions the application needs to handle.
Running performance tests against the tiny HR application that I’ve used here is probably not going to tell you too much.
Filed under: Oracle, PL/SQL, SQL Tagged: CBO, database agnostice, dbms_xplan, Ref cursors, Referential Integrity, thick database paradigm, USER_DEPENDENCIES
In my last post, I looked at using recursive WITH to implement simple recursive algorithms in SQL. One very common use of recursion is to traverse hierarchical data. I recently wrote a series of posts on hierarchical data, using Oracle’s CONNECT BY syntax and a fun example. In this post, I’ll be revisiting the same data using recursive WITH.
We had already been at the venue the day before for the beginning of the OTN Cloud Developer Challenge, but today was the first day of AMIS 25 – Beyond the Horizon. The day started at about 07:00, with a short minibus trip to the venue.
The first session I attended was “Get your money’s worth out of your Database” by Patrick Barel. He presented a whole bunch of database features that developers need to be using to make the most of their Oracle databases, including a bunch of 12c features. I like these kind of sessions. Much of what we end up doing at conferences is giving people pointers to what we think is interesting.
Next I went to “Smart Offices Are the Future of Work, Powered by the Internet of Things” by Noel Portugal. This started with some talk about designing for mobility, because people are connected all the time. It then moved on to the internet of things, where he discussed how small, cheap electronics can bridge the gap between the physical world and the internet, using examples like Amazon Dash, gesture controls, voice controls like Amazon Echo etc. The UX team are regularly using a lot of this stuff for their own office, kind-of making it the office of the present, not the future.
At this point, our team got together to carry on working on the OTN Cloud Developer Challenge, which took a big chunk of the day…
Later I headed off to see Jonathan Lewis present a session called “Just Don’t Do It”. As the name suggests, this session was focussed on improving performance by avoiding unnecessary work. The session was made up of examples from the OTN forums and consulting jobs where SQL was doing loads of unnecessary work. Often reframing the question allowed the statement to be rewritten to reduce the amount of work necessary to achieve the same goal. Neat!
From there it was back to the OTN Cloud Developer Challenge. All the groups got together to do the pitch for their solutions. With such a short timescale and the complexity of some of the services, most groups had hit some roadblocks, but everyone produced something. Debra actually did the pitch for two groups, but she did an exceptional job for ours. As I mentioned in a previous post, our team didn’t actually include people who do development outside of the database, and the challenge focussed on the non-DB side of things, which left us in a rather awkward position. By substituting comedy for content, Debra managed to make our pitch sound a lot better than it was. She also demoed our mobile app on her phone, and conveniently forgot to mention it was written in APEX. Your can see the home page below.
- To access cloud storage
- Mozilla Firefox
- RestClient extension for Firefox
- Google Chrome
- Postman extension for Chrome
- CloudBerry for OpenStack
- To access files in our instance
- To access our database instance
- SQL Developer
- Microsoft Visual C++ libraries
The eventual target that we are looking to get to is
To do this we need to go to the Oracle Cloud Marketplace and look for the Windows 2012 Server instance. What we want to do is provision a Windows instance and use it as a remote desktop for connecting to the cloud. We could do this on our desktop but the benefit of using a Windows Server is that we can create more users and use this instance for a hands on workshop. We don't need to have anyone load any virtual machines, fight the differences between Mac and Windows, or wait for the binaries to download and install. We can do most of this on a virtual machine in the cloud and just add and delete users for workshops. To provision the Windows server, we go to the cloud marketplace, select Infrastructure, and Compute. We can the search for Windows and get a bootable image to use as our foundation.
Once we agree to the legal terms we can select an instance to provision this into. The way it works is that we copy a bootable image into a cloud instance. We can then create compute instances from this bootable image and customize it to our liking. Once we agree to the terms the marketplace connects to the Oracle Cloud and uses your authentication credentials to connect to the instance. From this is gets a list of instances associated with this account, checks to see if you have agreed to terms of marketplace use for this instance by setting your profile settings for the instance. Once the bootable image is ready, a splash screen is presented stating that you are ready to provision a compute instance.
The screen dumps you into a compute creation wizard that walks you through the compute provisioning. Rather than going through that interface we decided to start from scratch and log into the instance and provision a compute engine from scratch. We first select the boot image from our private images, select the shape to boot, define the instance name, configure ssh connectivity as well as set the Admininstrator password (not shown). Once we get the confirmation screen it takes a few minutes to create the boot disk then boot the compute instance on this newly formatted disk.
We can check the progress by looking at the storage and compute instance. When everything is done we should see a public ip address for our instance. If we don't see our instance it is either still building or we should see an error in the history. Unfortunately, the history is hidden and a failed provisioning is now shown unless you look for it by expanding the history.
Before we can connect to our instance with remote desktop, we need to define a security list to allow for rdp, associate this rule with our instance, and define the security rule for rdp and associate it with the security list and instance.
Once we have rdp enabled to our instance, we look up the public ip address and connect as the Administrator user with the password that we passed in with a json header at the bottom of the creation screen (not shown). When we log in we see the server tools splash screen pop up.
We want to create a secondary user, give this user admin rights as well as rights to remote desktop connect to the server. We might want to add more users not as admins but with remote desktop rights for hands on labs. We can add and delete users using this method and it refreshes the workshop for the next class.
At this point we can create a staging directory and install the software that we listed above. The only product that causes a problem with the install is the SQL Developer because it requires a Microsoft package that is not installed by default. We need to download the library and all of the packages that we downloaded are ready to install. I did not go through customization of the desktop or downloading the public and private keys used for the workshop. These are obvious steps using filezilla from a shared network storage on a server in the cloud. We downloaded Firefox and Chrome primarily because Internet Explorer does not support REST Api protocols and we will need a way to create and list storage containers. We could have skipped this installation and done everything through CloudBerry but we can do everything similarly on a Mac (no need for putty or cygwin). With Firefox you need to install the REST Client api extension and Chrome requires the Postman Extension.
In summary, we created a compute Windows 2012 Server instance in the Oracle Compute IaaS. We added a new user as a backup to our Administrator user. We enabled remote desktop and configured a Mac to connect to this service remotely. We then downloaded a set of binaries to our Windows desktop to allow us to manage and manipulate storage containers and database instances. We also downloaded some utilities to help us use command line tools to access our database and customize our instances. We technically could do all of this with a Windows desktop, Internet Explorer, and SQL Developer. We went to the extra steps so that we can do the same from a Mac or Windows desktop using the same tools.
I occasionally get questions from customers and colleagues about performance expectations for the Oracle Database procedure called calibrate_io on XtremIO storage. This procedure must be executed in order to update the data dictionary. I assert, however, that it shouldn’t be used to measure platform suitability for Oracle Database physical I/O. The main reason I say this is because calibrate_io is a black box, as it were.
The procedure is, indeed, documented so it can’t possibly be a “black box”, right? Well, consider the fact that the following eight words are the technical detail provided in the Oracle documentation regarding what calibrate_io does:
This procedure calibrates the I/O capabilities of storage.
OK, I admit it. I’m being too harsh. There is also this section of the Oracle documentation that says a few more words about what this procedure does but not enough to make it useful as a platform suitability testing tool.A Necessary Evil?
Yes, you must run calibrate_io. The measurements gleaned by calibrate_io are used by the query processing runtime (specifically involving Auto DOP functionality). The way I think of it is similar to how I think of gathering statistics for CBO. Gathering statistics generates I/O but I don’t care about the I/O it generates. I only care that CBO might have half a chance of generating a reasonable query plan given a complex SQL statement, schema and the nature of the data contained in the tables. So yes, calibrate_io generates I/O—and this, like I/O generated when gathering statistics, is I/O I do not care about. But why?
Here are some facts about the I/O generated by calibrate_io:
- The I/O is 100% read
- The reads are asynchronous
- The reads are buffered in the process heap (not shared buffers in the SGA)
- The code doesn’t even peek into the contents of the blocks being read!
- There is limited control over what tablespaces are accessed for the I/O
- The results are not predictable
- The results are not repeatable
Having provided the above list of calibrate_io characteristics, I feel compelled to elaborate.About Asynchronous I/O
My main issue with calibrate_io is it performs single-block random reads with asynchronous I/O calls buffered in the process heap. This type of I/O has nothing in common with the main reason random single-block I/O is performed by Oracle Database. The vast majority of single-block random I/O is known as db file sequential read—which is buffered in the SGA and is synchronous I/O. The wait event is called db file sequential read because each synchronous call to the operating system is made sequentially, one after the other by foreground processes. But there is more to SGA-buffered reads than just I/O.About Server Metadata and Mutual Exclusion
Wrapped up in SGA-buffered I/O is all the necessary overhead of shared-cache management. Oracle can’t just plop a block of data from disk in the SGA and expect that other processes will be able to locate it later. When a process is reading a block into the SGA buffer cache it has to navigate spinlocks for the protected cache contents metadata known as cache buffers chains. Cache buffers chains tracks what blocks are in the buffer cache by their on-disk address. Buffer caches, like that in the SGA, also need to track the age of buffers. Oracle processes can’t just use any shared buffer. Oracle maintains buffer age in metadata known as cache buffers lru—which is also spinlock-protected metadata.
All of this talk about server metadata means that as the rate of SGA buffer cache block replacement increases—with newly-read blocks from storage—there is also increased pressure on these spinlocks. In other words, faster storage means more pressure on CPU. Scaling spinlocks is a huge CPU problem. It always has been—and even more so on NUMA systems. Testing I/O performance without also involving these critical CPU-intensive code paths provides false comfort when trying to determine platform suitability for Oracle Database.
Since applications to not drive random single-block asynchronous reads in Oracle Database, why measure it? I say don’t! Yes, execute calibrate_io, for reasons related to Auto DOP functionality, but not for a relevant reading of storage subsystem performance.About User Data
This is one that surprises me quite frequently. It astounds me how quick some folks are to dismiss the importance of test tools that access user data. Say what? Yes, I routinely point out that neither calibrate_io nor Orion access the data that is being read from storage. All Orion and calibrate_io do is perform the I/O and let the data in the buffer remain untouched. It always seems strange to me when folks dismiss the relevance of this fact. Is it not database technology we are talking about here? Databases store your data. When you test platform suitability for Oracle Database I hold fast that it is best to 1) use Oracle Database (thus an actual SQL-driven toolkit as opposed to an external kit like Orion or fio or vdbench or any other such tool) and 2) that the test kit access rows of data in the blocks! I’m just that way.
Of course SLOB (and other SQL-driven test kits such as Swingbench do indeed access rows of data). Swingbench handily tests Oracle Database transaction capabilities and SLOB uses SQL to perform maximum I/O per host CPU cycle. Different test kits for different testing.A Look At Some Testing Results
The first thing about calibrate_io I’ll discuss in this section is how the user is given no control or insight into what data segments are being accessed. Consider the following screenshot which shows:
- Use of the calibrate.sql script found under the misc directory in the SLOB kit (SLOB/misc/calibrate.sql) to achieve 371,010 peak IOPS and zero latency. This particular test was executed with a Linux host attached to an XtremIO array. Um, no, the actual latencies are not zero.
- I then created a 1TB tablespace. What is not seen in the screenshot is that all the tablespaces in this database are stored in an ASM disk group consisting of 4 XtremIO volumes. So the tablespace called FOO resides in the same ASM disk group. The ASM disk group uses external redundancy.
- After adding a 1TB tablespace to the database I once again executed calibrate_io and found that the IOPS increased 13% and latencies remained at zero. Um, no, the actual latencies are not zero!
- I then offlined the tablespace called FOO and executed calibrate_io to find that that IOPS fell back to within 1% of the first sample.
- Finally, I onlined the tablespace called FOO and the IOPS came back to within 1% of the original sample that included the FOO tablespace.
My objections to this result is calibrate_io is a black box. I’m left with no way to understand why adding a 1TB tablespace improved IOPS. After all, the tablespace was created in the same ASM disk group consisting of block devices provisioned from an all-flash array (XtremIO). There is simply no storage-related reason for the test result to improve as it did.
I decided to spend some time taking a closer look at calibrate_io but since I wanted more performance capability I moved my testing to an XtremIO array with 4 X-Bricks and used a 2-Socket Xeon E5-2699v3 (HSW-EP 2s36c72t) server to drive the I/O.
The following screenshot shows the result of calibrate_io. This test configuration yielded 572,145 IOPS and, again, zero latency. Um, no, the real latency is not zero. The latencies are sub-millisecond though. The screen shot also shows the commands in the SLOB/misc/calibrate.sql file. The first two arguments to DBMS_RESOURCE_MANAGER.CALIBRATE_IO are “in” parameters. The value seen for parameter 2 is not the default. The next section of this blog post shows a variety of testing with varying values assigned to these parameters.
As per the documentation, the first parameter to calibrate_io is “approximate number of physical disks” being tested and the second parameter is “the maximum tolerable latency in milliseconds” for the single-block I/O.
As the table above shows I varied the “approximate number of physical disks” from 1 to 10,000 and the “maximum tolerable latency” from 10 to 20 and then 100. For each test I measured the elapsed time.
The results show us that the test requires twice the elapsed time with 1 approximate physical disk as it does for with 10,000 approximate physical disks. This is a nonsensical result but without any documentation on what calibrate_io actually does we are simply left scratching our heads. Another oddity is that with 10,000 approximate disks the throughput in megabytes per second is reduced by nearly 40% and that is without regard for the “tolerable latency” value. This is clearly a self-imposed limited within calibrate_io but why is the big question.
I’ll leave you, the reader, to draw your own conclusions about the data in the table. However, I use the set of results with “tolerable latency” set to 20 as validation for one of my indictments above. I stated calibrate_io is not predictable. Simply look at the set of results in the 20 “latency” parameter case and you too will conclude calibrate_io is not predictable.So How Does CALIBRATE_IO Compare To SLOB?
I get this question quite frequently. Jokingly I say it compares in much the same way a chicken compares to a snake. They both lay eggs. Well, I should say they both perform I/O.
I wrote a few words above about how calibrate_io uses asynchronous I/O calls to test single-block random reads. I also have pointed out that SLOB performs the more correct synchronous single block reads. There is, however, an advanced testing technique many SLOB users employ to test PGA reads with SLOB as opposed to the typical SLOB reads into the SGA. What’s the difference? Well, revisit the section above where I discuss the server metadata management overhead related to reading blocks into the SGA. If you tweak SLOB to perform full scans you will test the flow of data through the PGA and thus the effect of eliminating all the shared-cache overhead. The difference is dramatic because, after all, “everything is a CPU problem.”
In a subsequent blog post I’ll give more details on how to configure SLOB for direct path with single-block reads!
To close out this blog entry I will show a table of test results comparing some key time model data. I collected AWR reports when calibrate_io was running as well as SLOB with direct path reads and then again with the default SLOB with SGA reads. Notice how the direct path SLOB increased IOPS by 19% just because blocks flowed through the PGA as opposed to the SGA. Remember, both of the SLOB results are 100% single-block reads. The only difference is the cache management overhead is removed. This is clearly seen by the difference in DB CPU. When performing the lightweight PGA reads the host was able to drive 29,884 IOPS per DB CPU but the proper SLOB results (SGA buffered) shows the host could only drive 19,306 IOPS per DB CPU. Remember DB CPU represents processor threads utilization on a threaded-processor. These results are from a 2s36c72t (HSW-EP) so these figures could also be stated as per DB CPU or per CPU thread.
If you are testing platforms suitability for Oracle it’s best to not use a test kit that is artificially lightweight. Your OLTP/ERP application uses the SGA so test that!
The table also shows that calibrate_io achieved the highest IOPS but I don’t care one bit about that!
I’d like to offer the following links to the full AWR reports summarized in the above table:
- I recommend everything that Luca Canali writes and this deck about calibrate_io is no exception.
- This post is more about calibrate_io as an I/O testing tool but I did mention Auto DOP. For more on Auto DOP please see the following posts:
Use calibrate_io. Just don’t use it to test platform suitability for Oracle Database.
Filed under: oracle
Oracle Traffic Director is a high-throughput low-latency load-balancer that can optimize HTTP and HTTPS traffic for Oracle E-Business Suite environments. It offers built-in optimizations for Oracle WebLogic Server along with rule-based request routing, reverse proxy capabilities, request rate limiting, throttling, QoS tuning, and more. Oracle Traffic Director is included with Oracle's engineered systems and is now available for standalone deployments, too.
Oracle Traffic Director 12c is now certified with E-Business Suite 12.1 and 12.2. See:
- Oracle Traffic Director 12c Integration with Oracle E-Business Suite Releases 12.1 and 12.2 (Note 2130592.1)
Oracle Traffic Director 12c is certified to run on any operating system for which Oracle WebLogic Server 12c is certified. See:
- Using a Reverse Proxy as an SSL/TLS Termination Point for EBS 12.1.3
- In-Depth: Demilitarized Zones and the E-Business Suite
- In-Depth: Load-Balancing E-Business Suite Environments
Customer has requirement to alert if the multipath count goes down from 4.
Command to get the count as privileged User:
Metric Extension Examples:
How to monitor Weblogic correct HEALTH STATE using EM12c Metric Extension
Monitoring BRM Host Processes using Metric Extension in EM12c
Monitoring the Filesystem for READONLY mounts using Metric Extension in OEM12c
Em12c Creating Metric Extension(User Defined Metrics) for BPEL Process State OFF
Creating an Enterprise Manager Metric Extension to monitor Huge Page Allocation
One of the standard slides that I use to explain the differences in the editions is shown below.
The options are cumulative when you look at them. The Enterprise Edition, for example, comes with Transparent Data Encryption (TDE). TDE is also included in the High Performance and Extreme Performance Editions. We are going to pull the pricing for all of these options from the Technology Price List. Below is a list of the options.
- Enterprise Edition
- Transparent Data Encryption
- High Performance Edition
- Advanced Compression
- Advanced Security
- Data Guard
- Label Security
- Audit Vault
- Database Vault
- Real Application Testing
- Spatial and Graphics
- Extreme Performance Edition
- Active Data Guard
- In Memory
- Real Application Clusters (RAC)
- RAC One
Transparent Data Encryption
TDE is a subset of the Advanced Security option. TDE stops would-be attackers from bypassing the database and reading sensitive information from storage by enforcing data-at-rest encryption in the database layer. Data is stored in the table extents encrypted and read into the database encrypted. The Oracle Wallet is needed to read the data back and perform operations on the data. Advanced Security and Security Inside Out are blogs to dive deeper into TDE features, functions, and tutorials. There is also a Community Security Discussion Forum. The Advanced Security option is priced at $300 per named user or $15,000 per processor. If we assume a four year amortization the cost of this option is $587.50 per month per processor. The database license is $1,860 per month per processor. This says that a dual core system on Amazon EC2, RDS, or Azure Compute running the Oracle database will cost you the cost of the server plus $2,448 per month. If we go with a t2.large on Amazon EC2 (2 vCPUs and 8 GB of RAM) and 128 GB of disk our charge is $128 per month. If we bump this up to an r3.large (2 vCPU, 15 GB of RAM) the price goes up to $173 per month. The cost will be $2,620 per month which compares to Enterprise Edition at $3,000 per month per processor for PaaS/DBaaS. We could also run this in Oracle IaaS Compute at $150 per month (2 vCPUs, 30 GB of RAM) to compare apples to apples. It is strongly recommended that any data that you put in the cloud be encrypted. Security is good in the cloud but encryption of data in storage is much better. When you replicate data or backup data it is copied in the format that it is stored in. If your data is clear text, your backups could be clear text thus exposing you to potential loss of data. Encrypting the data at rest is storage is a baseline for running database in the cloud.
Diagnostics is a subset of the Database Management Packs that allows you to look into the database and figure out things like lock contention, what is holding up a wait queue, and what resources are being consumed by processes inside the database. Historic views into the automated workload repository (AWR) reports are available with this option. You can get spot options but not historical views and comparative analytics on AWR information. Some of the tools are free like compression advisor and partitioning advisor while others are part of the diagnostics pack. Diagnostics are licensed at $150 per named user or $7,500 per processor. This correlates to $294 per processor per month. Unfortunately, you can't purchase Enterprise Edition DBaaS and add this but need to go with IaaS Compute and add this to the bring your own database license. The only way to get this feature is to go with the High Performance Edition. The binary that is installed on the cloud service specifically labels the database as Enterprise Edition, High Performance Edition, or Extreme Performance Edition. All of the features listed from here and below are prohibited from running on the Enterprise Edition when provisioned into the Oracle DBaaS. If you just want Diagnostics Pack on Enterprise Edition it does not make economic sense to purchase High Performance Edition at $4,000 per month per processor when you can do this on IaaS at $2,914 (the $2,620 from above plus $294).
Tuning is also a subset of the Database Management Packs that allows you to look into sql queries, table layouts, and overall performance issues. Options like the SQL Tuning Advisor and Automatic SQL Tuning are part of this option. Tuning pack is $100 per named user or $5,000 per processor. This comes in at $196 per processor per month if purchased separately. A Tuning Whitepaper details some of the features and functions of the tuning pack if you want to learn more.
Partitioning is a way of improving performance of your database and backup by splitting how data is stored and read. Partitioning is powerful functionality that allows tables, indexes, and index-organized tables to be subdivided into smaller pieces, enabling these database objects to be managed and accessed at a finer level of granularity. Oracle provides a comprehensive range of partitioning schemes to address every business requirement. The key improvement is to reduce the amount of data that you are reading into memory on a query. For example, if you are looking for financial summary data for the last quarter, issuing a query into eight years of financial data should not need to read in 32 quarters of data but only data from the last quarter. If we partition the data on a monthly basis we only read in three partitions rather than all 32. Partitioning also allows us to compress older data to consume less storage while at rest. When we backup the database we don't need to copy the older partitions that don't change, only backup the partitions that have updated since our last backup. Partitioning is licensed at $230 per named user or $11,500 per processor. This comes in at $450 per processor per month. The three most purchased database options are diagnostics, tuning, and partitioning. The combined cost of these three options is $940 per processor per month. When we compare the $4,000 per processor per month of DBaaS to IaaS with these three options we are at parity.
Advanced Compression is a feature that allows you to compress data at rest (and in memory) so that it consumes less resources. Oracle Advanced Compression provides a comprehensive set of compression capabilities to help improve performance and reduce storage costs. It allows organizations to reduce their overall database storage footprint by enabling compression for all types of data: relational (table), unstructured (file), network, Data Guard Redo and backup data. Cost comparisons for this feature are directly comparable to storage costs. Advanced compression is licensed at $230 per named user or $11,500 per processor. This comes in at $450 per processor per month. Typical compression ratios are 3x to 10x compressions. This means that 1 TB of data will take up 600 GB or 100 GB at these compression ratios. Lower compression rates are recommended for data that lightly changes and high compression for data that will not change. The penalty for compression comes in when you update data that is compressed. The data must be uncompressed, the new data inserted, and recompressed.
Advanced Security allows you to secure and encrypt data in the database. Advanced Security provides two important preventive controls to protect sensitive data at the source including transparent database encryption and on-the-fly redaction of display data. TDE stops would-be attackers from bypassing the database and reading sensitive information directly from storage by enforcing data-at-rest encryption in the database layer. Data Redaction complements TDE by reducing the risk of unauthorized data exposure in applications, redacting sensitive data before it leaves the database. Advanced Security is priced at $300 per named user and $15,000 per processor. The monthly cost will be $587.50 per month per processor for this option. Data redaction is typically required for replicating production data to development and test. If you have credit card, social security numbers, home addresses, or drivers license information in your database, redaction is important to have to remain Sarbanes Oxly and PCI compliant.
Data Guard is a key foundation piece of Maximum Availability Architecture and does not cost any additional money. You get data replication between two databases at no additional cost and data can be replicated as physical or logical replication between the database instances. This feature ensures high availability, data protection, and disaster recovery for enterprise data. Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable production Oracle databases to survive disasters and data corruptions. Data Guard maintains these standby databases as transactionally consistent copies of the production database. Then, if the production database becomes unavailable because of a planned or an unplanned outage, Data Guard can switch any standby database to the production role, minimizing the downtime associated with the outage. Data Guard can be used with traditional backup, restoration, and cluster techniques to provide a high level of data protection and data availability. It is important to note that Data Guard is not allowed in Amazon RDS and you must use EC2 or another cloud service to use this feature.
Label Security has the ability to control access based on data classification and to enforce traditional multi-level security (MLS) policies for government and defense applications. Oracle Label Security benefits commercial organizations attempting to address numerous access control challenges including those associated with database and application consolidation, privacy laws and regulatory compliance requirements. When a user requests data, the database looks at the user credentials and roles that they have access to and filters the results that the user sees from a query. Label Security is licensed at $230 per named user or $11,500 per processor. This comes in at $450 per processor per month. Note that this is different than data redaction. With redaction, data is scrambled when data is copied. With Label Security, the data is not returned if the user does not have rights to read the data. An error is not returned from a query but a null value is returned if the user does not have rights to read a column. The biggest benefit to this option it does not require program changes to restrict access to data and present results to users. If, for example, we are going to show sales in a customer relationship program, we don't need to change the code based on the user being a sales rep or sales manager. The sales manager can see all of the sales rep information to track how their team is performing. Each sales rep can see their data but not the other sales rep data. It is important to note that Label Security is not allowed in Amazon RDS and you must use EC2 or another cloud service to use this feature.
Multitenant or Pluggable Database allows you to consolidate instances onto one server and reduce your overall management cost. The many pluggable databases in a single multitenant container database share its memory and background processes. This enables consolidation of many more pluggable databases compared to the old architecture, offering similar benefits to schema-based consolidation but with none of the major application changes required by that approach. Backups are done at the parent layer. Users are provisioned at the pluggable layer. Features of the instance (RAC, DataGuard, etc) are inherent to the parent and adopted by the pluggable container. To take a test system from single instance to data guard replicated only requires unplugging the database from the single instance system and plugging it into a data guard system. The same is true for RAC and all other features. Multitenant is licensed at $350 per user or $17,500 per processor. This come in at $685 per processor per month. It is important to note that this option is not available on Amazon RDS. This option is specifically disabled and not allowed. You must run this on EC2 to use this functionality or on another cloud platform.
Audit Vault and Database Firewall monitors Oracle and non-Oracle database traffic to detect and block threats, as well as improves compliance reporting by consolidating audit data from databases, operating systems, directories, and other sources. Audit vault is licensed at $6,000 per processor and is not available on a per user basis. This comes in at $235 per processor per month. This option typically requires a separate server for security reasons where logs and logging information is copied to prevent data to be manipulated on a single system and the auditing system.
Database Vault reduces the risk of insider and outsider threats and addresses common compliance requirements by preventing privileged users (DBA) from accessing sensitive application data, preventing compromised privileged users accounts from being used to steal sensitive data or make unauthorized changes to databases and applications, providing strong controls inside the database over who can do what and controls over when and how applications, data and databases can be accessed, providing privilege analysis for all users and applications inside the database to help achieve least privilege model and make the databases and applications more secure. Database Vault is licensed at $230 per named user or $11,500 per processor. This comes in at $450 per processor per month. It is important to note that this option is not available on Amazon RDS. This option is specifically disabled and not allowed. You must run this on EC2 to use this functionality or on another cloud platform.
Real Application Testing
Real Application Testing helps you fully assess the effect of such system changes on real-world applications in test environments before deploying the change in production. Oracle Real Application Testing consists of two features, Database Replay and SQL Performance Analyzer. Together they enable enterprises to rapidly adopt new technologies that add value to the business while minimizing risk. Traces can be recorded for reads and writes and replayed on a test system. This makes the replay option perfect for development and testing instances. The product is licensed at $230 per named user or $11,500 per processor. This comes in at $450 per processor per month. It is important to note that not having the sys level access might or might not break this feature in Amazon RDS based on what you are trying to replay.
Online Analytics Processing or OLAP is a multidimensional analytic engine embedded in Oracle Database 12c. Oracle OLAP cubes deliver sophisticated calculations using simple SQL queries - producing results with speed of thought response times. This outstanding query performance may be leveraged transparently when deploying OLAP cubes as materialized views – enhancing the performance of summary queries against detail relational tables. Because Oracle OLAP is embedded in Oracle Database 12c, it allows centralized management of data and business rules in a secure, scalable and enterprise-ready platform. OLAP is licensed at $460 per user or $23,000 per processor. This comes in at $901 per processor per month. This feature is good for BI Analytics packages and Data Warehouse systems.
Spatial and Graphics
Spatial and Graphics supports a full range of geospatial data and analytics for land management and GIS, mobile location services, sales territory management, transportation, LiDAR analysis and location-enabled Business Intelligence. The graph features include RDF graphs for applications ranging from semantic data integration to social network analysis to linked open data and network graphs used in transportation, utilities, energy and telcos and drive-time analysis for sales and marketing applications. This option is licensed at $350 per user or $17,500 per processor. This come in at $685 per processor per month. It is important to note that this option is not supported in Amazon RDS. You must select EC2 or another cloud service to get this option.
All of the above options are bundled into the High Performance Edition. If we add up all of the options we get a total of
- Transparent Data Encryption - $587.50 per month
- Diagnostics - $294 per month
- Tuning - $196 per month
- Partitioning - $450 per month
- Advanced Compression - $450 per month
- Advanced Security - $587.50 per month
- Data Guard - bundled
- Label Security - $450 per month
- Multitenant - $685 per month
- Audit Vault - $235 per month
- Database Vault - $450 per month
- Real Application Testing - $450 per month
- OLAP - $901 per month
- Spatial and Graphics - $685 per month
Active Data Guard
Active Data Guard has the same features and functions as Data Guard but allows the target database to be open for read/write and updates happen bidirectionally. Active Data Guard is licensed at $230 per user or $11,500 per processor. This come in at $450 per processor per month.
In Memory optimizes both analytics and mixed workload OLTP, delivering outstanding performance for transactions while simultaneously supporting real-time analytics, business intelligence, and reports. Most DBAs optimize performance by creating indexes to find data quicker. This works if you know the questions ahead of time. If you don't know the question it is difficult to tune for everything. In Memory allows you to create a row based copy of the data as well as a column based copy of the data for quick column sorts and searches. In Memory is licensed at $460 per user or $23,000 per processor. This come in at $901 per month per processor. The key advantage of this option is that it prevents you from purchasing a second database to do analytics and reporting on the same box as your transactional system.
Real Application Clusters (RAC)
RAC is a cluster database with a shared cache architecture that overcomes the limitations of traditional shared-nothing and shared-disk approaches to provide highly scalable and available database solutions for all your business applications. Oracle RAC is a key component of Oracle's private cloud architecture. Oracle RAC support is included in the Oracle Database Standard Edition for higher levels of system uptime and a critical part of the MAA strategy. RAC is licensed at $460 per user or $23,000 per processor. This come in at $901 per month per processor. It is important to note that RAC is not supported in Amazon or Azure. The system requires shared storage between compute instances that neither platforms provide. The only option for this configuration is Oracle DBaaS/PaaS.
The options for Extreme performance come in at $2,252 per processor per month but Oracle only charges an extra $1000 with Extreme Edition.
In Summary, there are a ton of options for the database. You need to figure out what options you need and if you need more than a couple it is economically beneficial to go with High Performance. If you need RAC, Active Data Guard, or In Memory you must purchase the Extreme Performance Edition. It is also important to note that not all features are supported in Amazon RDS and you must either go with Oracle Database as a Service or build a system using IaaS. RAC is the only exception where it is only available with Oracle DBaaS. We will go into a couple of these features in upcoming days to look at the value, how to use, and what is required to make the functionality work with other cloud providers.
After a long (but I’m sure well worth it) wait, Oracle just announced APEX 5.1 Early Adopter is available.
Head over to https://apexea.oracle.com/ to sign up and request a Workspace.
The two big items everyone has been waiting for are
- Interactive Grids
- Jet Charts
but reading through the “Additional Features in Application Express 5.1” section on the EA homepage it looks like there are lots of other interesting improvements and enhancements.