The Anti-Kyte
The Ultimate Question of Life, the Universe, and… how big is my Oracle Table ?
At a time when the debate rages about how much you should trust what AI tells you, it’s probably worth recalling Deep Thought’s wildly incorrect assertion that the answer to the Ultimate Question of Life, the Universe, and Everything is forty-two.
As any Database specialist will know, the answer is the same as it is to the question “How big is my Oracle Table ?” which is, of course, “It depends”.
What it depends on is whether you want to know the volume of data held in the table, or the amount of space the database is using to store the table and any associated segments (e.g. indexes).
Connecting to my trusty Free Tier OCI Oracle Instance ( running 19c Enterprise Edition), I’ve set out on my journey through (disk) space to see if I can find some more definitive answers…
How big is my table in terms of the raw data stored in it ?
Before going any further, I should be clear on the database language settings and character set that’s being used in the examples that follow. Note particularly that I’m not using a multi-byte character set :
select parameter, value
from gv$nls_parameters
order by parameter
/

PARAMETER                      VALUE
------------------------------ ------------------------------
NLS_CALENDAR                   GREGORIAN
NLS_CHARACTERSET               AL32UTF8
NLS_COMP                       BINARY
NLS_CURRENCY                   £
NLS_DATE_FORMAT                DD-MON-YYYY
NLS_DATE_LANGUAGE              ENGLISH
NLS_DUAL_CURRENCY              €
NLS_ISO_CURRENCY               UNITED KINGDOM
NLS_LANGUAGE                   ENGLISH
NLS_LENGTH_SEMANTICS           BYTE
NLS_NCHAR_CHARACTERSET         AL16UTF16
NLS_NCHAR_CONV_EXCP            FALSE
NLS_NUMERIC_CHARACTERS         .,
NLS_SORT                       BINARY
NLS_TERRITORY                  UNITED KINGDOM
NLS_TIMESTAMP_FORMAT           DD-MON-RR HH24.MI.SSXFF
NLS_TIMESTAMP_TZ_FORMAT        DD-MON-RR HH24.MI.SSXFF TZR
NLS_TIME_FORMAT                HH24.MI.SSXFF
NLS_TIME_TZ_FORMAT             HH24.MI.SSXFF TZR

19 rows selected.
Now, let’s see if we can work out how much raw data is held in a table.
We’ll start with a very simple example :
create table marvin as
select 1000 + rownum as id
from dual
connect by rownum <= 1024;
Marvin may have a brain the size of a planet but his tabular namesake has more modest space requirements.
Its 1024 records are all 4 digits long.
Therefore, the size of the table data should be 4096 bytes, right ?
“Hang on”, you’re thinking, “why not just look up the size in USER_SEGMENTS and make this a really short post ?”
Well :
select bytes
from user_segments
where segment_name = 'MARVIN'
and segment_type = 'TABLE'
/

     BYTES
----------
     65536
USER_SEGMENTS will give you a size in bytes, but it’s not the same as the amount of raw data.
We’ll come back to this in a bit.
For now though, we can cross-check the size from elsewhere in the data dictionary, provided the stats on the table are up-to-date.
To ensure that this is so, I can run :
exec dbms_stats.gather_table_stats('MIKE', 'MARVIN');
This will ensure that statistics data is populated in the USER_TABLES view. This means that we can estimate the data volume by running the following query :
select num_rows, avg_row_len,
num_rows * avg_row_len as data_in_bytes
from user_tables
where table_name = 'MARVIN'
/
…which returns…
  NUM_ROWS AVG_ROW_LEN DATA_IN_BYTES
---------- ----------- -------------
      1024           4          4096
That looks promising.
We can further verify this by running :
select sum(length(id)) as data_in_bytes
from marvin
/

DATA_IN_BYTES
-------------
         4096
OK, now let’s see what happens with a slightly more complex data set, and an index as well …
create table hitchikers
(
id number generated always as identity,
character_name varchar2(250),
quote varchar2(4000),
constraint hitchikers_pk primary key (id)
)
/
declare
procedure ins( i_character in varchar2, i_quote in varchar2)
is
begin
insert into hitchikers( character_name, quote)
values(i_character, i_quote);
end;
begin
for i in 1..1024 loop
ins('Deep Thought', 'Forty-Two');
ins('Trillian', q'[we have normality... anything you can't cope with is, therefore, your own problem]');
ins('Ford Prefect', 'Time is an illusion. Lunchtime doubly so.');
ins('Zaphod Beeblebrox', q'[If there's anything more important than my ego around, I want it caught and shot right now!]');
ins(null, 'Anyone who is capable of getting themselves made President should on no account be allowed to do the job');
ins('Marvin', q'[Life! Loathe it or hate it, you can't ignore it.]');
ins('Arthur Dent', 'This must be Thursday. I never could get the hang of Thursdays');
ins('Slartibartfast', q'[I'd rather be happy than right any day]');
end loop;
commit;
end;
/
commit;
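As before, the stats need gathering so that the dictionary figures are current. A quick sketch, using the same call as for MARVIN (but passing user rather than the schema name this time) :

exec dbms_stats.gather_table_stats(user, 'HITCHIKERS');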
Once stats are present on the table, we can check the expected data size as before :
select num_rows, avg_row_len,
    num_rows * avg_row_len as data_in_bytes
from user_tables
where table_name = 'HITCHIKERS'
/

  NUM_ROWS AVG_ROW_LEN DATA_IN_BYTES
---------- ----------- -------------
      8192          75        614400
This time, the size in bytes figure we get back is not exact, as we can confirm with :
select sum( length(id) + nvl(length(character_name),0) + nvl(length(quote), 0)) as data_in_bytes
from hitchikers
/

DATA_IN_BYTES
-------------
       598957
To verify the actual size in bytes, we can dump the contents of a table into a csv file. In this case, I’m using SQLDeveloper :

The resulting file is a different size again :
ls -l hitchikers.csv

-rw-rw-r-- 1 mike mike 656331 May 13 11:50 hitchikers.csv
This can be accounted for by the characters added as part of the csv formatting.
First, the csv file includes a header row :
head -1 hitchikers.csv

"ID","CHARACTER_NAME","QUOTE"
Including the line terminator this is 30 bytes :
head -1 hitchikers.csv |wc -c
30
The format in each of the 8192 data rows includes :
- a comma after all but the last attribute on a row
- a line terminator after the last attribute
- double quotes enclosing each of the two VARCHAR attributes.
For example :
grep ^42, hitchikers.csv
42,"Trillian","we have normality... anything you can't cope with is, therefore, your own problem"
That’s a total of 7 extra bytes per data row.
Add all that up and it comes to 57374 bytes which are a consequence of csv formatting.
Subtract that from the file size and we get back to the calculated data size we started with :
656331 - 57374 = 598957
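If you’d rather let the database do the sums, the reconciliation can be run as a query using the figures above (8192 data rows, 7 formatting bytes each, plus the 30-byte header) :

select (8192 * 7) + 30 as csv_overhead_bytes,
    656331 - ((8192 * 7) + 30) as raw_data_bytes
from dual;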
This confirms that the figures in USER_TABLES are approximate and you’ll need to bear this in mind if you’re relying on them to calculate the size of the data in a table.
Whilst we’re here, let’s see what effect compression might have on our ability to determine the raw data size.
We can do this by creating a table that has the same structure as HITCHIKERS and contains the same data, but which is compressed :
create table magrathea
(
id number,
character_name varchar2(250),
quote varchar2(4000),
constraint magrathea_pk primary key (id)
)
row store compress advanced
/
insert into magrathea( id, character_name, quote)
select id, character_name, quote
from hitchikers
/
commit;
exec dbms_stats.gather_table_stats(user, 'MAGRATHEA');
It turns out that, for the purposes of our raw data calculation, the effect of table compression is…none at all :
select num_rows, avg_row_len, num_rows * avg_row_len
from user_tables
where table_name = 'MAGRATHEA'
/

  NUM_ROWS AVG_ROW_LEN NUM_ROWS*AVG_ROW_LEN
---------- ----------- --------------------
      8192          75               614400
However, if you look at the number of blocks used to store the table, the effects of compression are more evident :
select table_name, blocks
from user_tables
where table_name in ('HITCHIKERS', 'MAGRATHEA')
order by 2
/

TABLE_NAME                         BLOCKS
------------------------------ ----------
MAGRATHEA                              20
HITCHIKERS                             95
Incidentally, it’s worth noting that, as with the data size, the number of blocks reported in USER_TABLES is somewhat approximate.
USER_SEGMENTS reports the number of blocks for each table as :
select segment_name, blocks
from user_segments
where segment_name in ('HITCHIKERS', 'MAGRATHEA')
order by 2
/

SEGMENT_NAME                       BLOCKS
------------------------------ ----------
MAGRATHEA                              24
HITCHIKERS                            104
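If you want to see both figures side by side, something like this should do the trick. It’s just a sketch and assumes the tables aren’t partitioned, as USER_SEGMENTS would then return one row per partition :

select tab.table_name,
    tab.blocks as stats_blocks,
    seg.blocks as allocated_blocks
from user_tables tab
inner join user_segments seg
    on seg.segment_name = tab.table_name
    and seg.segment_type = 'TABLE'
where tab.table_name in ('HITCHIKERS', 'MAGRATHEA')
order by tab.table_name;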
So it looks like compression will affect the amount of database space required to store an object but not the size of the actual data. This brings us nicely on to…
How big is my table in terms of the amount of space it’s taking up in the database ?
Let’s go back to MARVIN. Remember, this table contains 4K of raw data, but USER_SEGMENTS claims that it’s quite a bit larger :
select bytes
from user_segments
where segment_name = 'MARVIN'
and segment_type = 'TABLE'
/

     BYTES
----------
     65536
To understand how Oracle has come up with this figure, you need to consider that :
- the smallest unit of space that Oracle addresses is the block
- the size of these blocks is defined at tablespace level.
- any object that uses space is allocated that space in units of an extent – which is a number of contiguous blocks.
If we take a look at MARVIN, we can see that the table resides in the DATA tablespace and has been allocated a single extent of 8 blocks :
select tablespace_name, bytes, blocks, extents
from user_segments
where segment_name = 'MARVIN';

TABLESPACE_NAME                     BYTES     BLOCKS    EXTENTS
------------------------------ ---------- ---------- ----------
DATA                                65536          8          1
The block size is defined at the tablespace level and is held in USER_TABLESPACES in bytes :
select block_size
from user_tablespaces
where tablespace_name = 'DATA';

BLOCK_SIZE
----------
      8192
If we now multiply the number of blocks in the table by the size of those blocks, we get back to the size that USER_SEGMENTS is reporting :
select seg.blocks * tsp.block_size
from user_segments seg
inner join user_tablespaces tsp
    on seg.tablespace_name = tsp.tablespace_name
where seg.segment_name = 'MARVIN';

SEG.BLOCKS*TSP.BLOCK_SIZE
-------------------------
                    65536
MARVIN is a table with no ancillary segments, such as indexes.
To find the total space being used for the HITCHIKERS table, we’ll also need to consider the space being taken up by its index, HITCHIKERS_PK :
select seg.segment_name, seg.segment_type, seg.blocks, ts.block_size, seg.bytes
from user_segments seg
inner join user_tablespaces ts
    on ts.tablespace_name = seg.tablespace_name
where seg.segment_name in ('HITCHIKERS', 'HITCHIKERS_PK')
/

SEGMENT_NAME         SEGMENT_TYPE        BLOCKS BLOCK_SIZE      BYTES
-------------------- --------------- ---------- ---------- ----------
HITCHIKERS           TABLE                  104       8192     851968
HITCHIKERS_PK        INDEX                   24       8192     196608
…in other words…
select sum(seg.bytes)
from user_segments seg
where seg.segment_name in ('HITCHIKERS', 'HITCHIKERS_PK')
/

SUM(SEG.BYTES)
--------------
       1048576
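If you’d rather not list the index names by hand, you can drive the lookup off USER_INDEXES instead. This is just a sketch, and it covers indexes but not LOB segments, which we’ll get to in a moment :

select sum(seg.bytes) as total_bytes
from user_segments seg
where seg.segment_name = 'HITCHIKERS'
or seg.segment_name in (
    select index_name
    from user_indexes
    where table_name = 'HITCHIKERS')
/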
On the subject of ancillary segments, what about LOBS ?
create table the_guide(
id number generated always as identity,
message clob);
declare
v_msg clob;
begin
for i in 1..1000 loop
v_msg := v_msg||q'[Don't Panic!]';
end loop;
insert into the_guide(message) values( v_msg);
end;
/
commit;
Unlike other segment types, LOBSEGMENT and LOBINDEX segments do not have their parent tables listed as the SEGMENT_NAME in USER_SEGMENTS.
Therefore, we need to look in USER_LOBS to identify the parent table of a LOBSEGMENT, and in USER_INDEXES for a LOBINDEX :
select segment_name, segment_type, bytes, blocks
from user_segments
where (
    segment_name = 'THE_GUIDE'
    or segment_name in (
        select segment_name
        from user_lobs
        where table_name = 'THE_GUIDE')
    or segment_name in (
        select index_name
        from user_indexes
        where table_name = 'THE_GUIDE')
)
/

SEGMENT_NAME                   SEGMENT_TYPE         BYTES     BLOCKS
------------------------------ --------------- ---------- ----------
THE_GUIDE                      TABLE                65536          8
SYS_IL0000145509C00002$$       LOBINDEX             65536          8
SYS_LOB0000145509C00002$$      LOBSEGMENT         1245184        152
In this instance, although the table segment itself is only taking up 65536 bytes, when you add in all of the supporting objects, the total space requirement increases to 1376256 bytes.
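To save adding those figures up by hand, the same dictionary lookups can be wrapped in a SUM (a sketch re-using the query above) :

select sum(bytes) as total_bytes
from user_segments
where segment_name = 'THE_GUIDE'
or segment_name in (
    select segment_name
    from user_lobs
    where table_name = 'THE_GUIDE')
or segment_name in (
    select index_name
    from user_indexes
    where table_name = 'THE_GUIDE')
/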
If you’ve managed to make it this far then meet me at the Big Bang Burger Bar for a Pan Galactic Gargle Blaster. I need a drink after that.
(You Gotta) Fight for Your Right (To Query !) – the easy way to use External Tables with CSV files
Since the days when dinosaurs roamed the Earth – and I realised that programming was indoor work with little physical effort – the CSV file has been an ETL workhorse in the world of Data Warehousing.
In the case of Oracle, working with CSVs is somewhat simplified by the use of External Tables.
That’s not to say there aren’t challenges, especially when things don’t quite go as expected and you have to wade through various operating system files looking for exactly why some or all of the records in a file have not been loaded.
As enthusiastic as I am about the joys of the linux command-line, the prospect of grappling with sed, awk and regexp to diagnose data issues does not appeal.
I’d much rather use SQL. After all, that’s what it’s designed for.
Fortunately, for me, external tables make it relatively simple to upload the contents of a text file directly into a database table, from where SQL can be used to :
- validate the structure and contents of the file
- transform the data into its target data types
- identify any problems with the data we’ve received
As you may have guessed from the title (and the age of the author), the data we’re using has a bit of a Beastie Boys theme.
We have a file containing details of Beastie Boys singles that have charted in the UK :
row_type,track_name,peak_chart_position,album,release_date
"","She's On It",10,,19860912,
"","(You Gotta) Fight for Your Right (To Party!)",11,"Licensed to Ill",,
"","No Sleep till Brooklyn",14,"Licensed to Ill",19870301,
"","Girls",34,"Licensed to Ill",19870506,
"","Hey Ladies",76,"Paul's Boutique",19890725,
"","Pass the Mic",47,"Check Your Head",19920407,
"","Jimmy James",55,"Check Your Head",19920828,
"","Sabotage",19,"Ill Communication",19940128,
"","Sure Shot",27,"Ill Communication",19940602,
"","Intergalactic",5,"Hello Nasty",19980602,
"","Remote Control/Three MC's and One DJ",21,"Hello Nasty",19990123
"","Alive",28,,,
"","Ch-Check It Out",8,"To the 5 Boroughs",20040503,
"","Triple Trouble",37,"To the 5 Boroughs",,
"","An Open Letter to NYC",38,"To the 5 Boroughs",,
"TRAILER",15
I want to load this data into my application tables on an Oracle 19c database.
One thing that often gets overlooked when considering a CSV file is that, whatever datatype the fields started out as (or indeed, are intended to end up as), in the file they are all merely text strings.
This can make the design of the external table we’re using to read this file quite simple in terms of datatypes…
create table beastie_xt (
row_type varchar2(4000),
track_name varchar2(4000),
peak_chart_position varchar2(4000),
album varchar2(4000),
release_date varchar2(4000)
)
organization external (
type oracle_loader
default directory dbshare
access parameters (
fields csv without embedded
missing field values are null
reject rows with all null fields
)
location('beastie.csv')
)
reject limit unlimited
/
Note that using the csv keyword in the access parameters clause causes the external table to assume that the delimiter character is a "," and that fields may be enclosed by double-quotes (").
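As far as I can tell, fields csv without embedded is effectively shorthand for spelling those delimiters out longhand, with the added stipulation that records don’t contain embedded newlines. A hypothetical equivalent definition (the table name is invented purely for illustration) would look something like :

create table beastie_explicit_xt (
    row_type varchar2(4000),
    track_name varchar2(4000),
    peak_chart_position varchar2(4000),
    album varchar2(4000),
    release_date varchar2(4000)
)
organization external (
    type oracle_loader
    default directory dbshare
    access parameters (
        records delimited by newline
        fields terminated by ',' optionally enclosed by '"'
        missing field values are null
        reject rows with all null fields
    )
    location('beastie.csv')
)
reject limit unlimited
/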
If we test the table now, we can see that all of the rows in the file are loaded, including the header and trailer records :

Now, we could do all of our data interrogation and manipulation using the external table. However, each time the table is queried, the file is read in its entirety and the fact recorded in the external table’s associated logfile.
As well as being a bit slow for large files, this does cause the logfile to grow rather rapidly.
As an alternative to this then, I’m going to use the external table only to read the file as its contents are loaded into a permanent database segment :
create table beastie_stg(
file_name varchar2(500),
file_row_number number,
row_type varchar2(4000),
track_name varchar2(4000),
peak_chart_position varchar2(4000),
album varchar2(4000),
release_date varchar2(4000))
/
insert into beastie_stg
select
'beastie.csv' as file_name,
rownum as file_row_number,
row_type,
track_name,
peak_chart_position,
album,
release_date
from beastie_xt
/
commit;
Now we have our data in a permanent segment, we can validate and load it into the application proper.
Remember, because both our External Table and Staging table definitions are quite permissive, we can load the entire file, including its header and trailer records.
To start with, let’s check that the file contains all of the fields we expect, in the order in which we expect them :
-- Expected ordered field list
select listagg( lower(column_name), ',') within group( order by column_id)
from user_tab_columns
where table_name = 'BEASTIE_XT'
minus
-- Actual ordered field list
select
row_type
||','||track_name
||','||peak_chart_position
||','||album
||','||release_date
from beastie_stg
where row_type = 'row_type';
If this query returns any rows then there is a mismatch between the header record and what we’re expecting so we’ll want to investigate.
In this instance though, we’re OK.
Next we need to make sure that we’ve received all of the data records. We can see that the trailer record is saying we should have 15 data records in the file. Note that, because we’re still treating everything as a string, the count in the trailer record is actually in the TRACK_NAME column of our staging table :
select track_name as data_record_count
from beastie_stg
where row_type = 'TRAILER'
and file_name = 'beastie.csv';

DATA_RECORD_COUNT
------------------
15
We can verify this by simply counting the records that we’ve staged, excluding the header and trailer…
select count(*)
from beastie_stg
where row_type is null
and file_name = 'beastie.csv'
/

  COUNT(*)
----------
        15
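If you prefer to let the database do the comparison, the two checks can be combined into a single reconciliation query, along these lines :

select trl.expected_count, dat.actual_count,
    case when trl.expected_count = dat.actual_count then 'MATCH' else 'MISMATCH' end as status
from (
    select to_number(track_name) as expected_count
    from beastie_stg
    where row_type = 'TRAILER'
    and file_name = 'beastie.csv') trl,
    (
    select count(*) as actual_count
    from beastie_stg
    where row_type is null
    and file_name = 'beastie.csv') dat
/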
Finally, we can load our data records into our core table, converting them to their target datatypes in the process.
The core table looks like this :
create table beastie_core (
file_name varchar2(500),
file_row_number number,
track_name varchar2(500),
peak_chart_position number,
album varchar2(500),
release_date date)
/
If we want to load all of the valid records whilst handling any that we cannot transform correctly, there are several options available. In this case we’re going to use an error logging table.
To create the error logging table itself…
exec dbms_errlog.create_error_log('beastie_core');
Now we can populate the core application table using the LOG ERRORS clause to direct any failing records to the error table :
insert into beastie_core
select
file_name,
file_row_number,
track_name,
to_number(peak_chart_position) as peak_chart_position,
album,
to_date(release_date, 'YYYYMMDD')
from beastie_stg
where row_type is null -- just load the data records
log errors('beastie.csv')
reject limit unlimited
/
commit;
Once the core load is completed, we can check that we’ve loaded all of the records…
select count(*)
from beastie_core
where file_name = 'beastie.csv';

  COUNT(*)
----------
        15
…and confirm that there are no errors…
select count(*)
from err$_beastie_core
where ora_err_tag$ = 'beastie.csv';

  COUNT(*)
----------
         0
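Had any records failed the conversion, the error table would also tell us why. As well as holding the target table’s columns (as VARCHAR2), DBMS_ERRLOG adds a handful of ORA_ERR_* columns that we can query, for example :

select ora_err_number$, ora_err_mesg$, file_row_number, release_date
from err$_beastie_core
where ora_err_tag$ = 'beastie.csv'
order by to_number(file_row_number)
/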
Our file has been successfully validated and loaded, all without the need for any fiendishly clever regular expressions.
Loading data into Oracle directly from compressed or encrypted files
Whilst it’s not uncommon to transfer data between systems by means of text files, the files themselves often turn up in a binary format.
They may have been compressed or even encrypted before transit.
Turning them back into text so that they can be processed may be a bit of an overhead.
Not only can you end up with two copies of the data ( the binary original and the re-constituted text version), but the process of conversion may also be both time consuming and resource intensive.
In the case of encrypted files, persisting the unencrypted data in a file may have additional security implications.
Fortunately, it’s possible to load data from such binary formatted files into Oracle without first having to write it to a text file.
Irrespective of whether your incoming feed file is encrypted or merely compressed, loading it into Oracle should be effortless for you after reading this.
You need not worry about any potential pitfalls because I’ve already fallen into each pit in turn, as I shall now recount.
To start with, I’ll be looking at how to use an External Table Preprocessor to load data from a compressed file.
I’ll then go through loading data that’s GPG encrypted.
Following that, we’ll take a look at why PDB_OS_CREDENTIAL might not be the help you hoped it might be when dealing with GPG decryption and how SQL*Loader can help.
Whilst I was writing this, Oracle considerately released 23c Free and made it available in a VirtualBox appliance running Oracle Linux Server 8.7, so it’d be rude not to use it for the examples that follow…
Setup
To start with I’ve created a new OS user called app_user and a group called app_os.
I’ve made this the primary group for app_user as well as adding oracle to the group.
sudo useradd -m app_user
sudo groupadd app_os
sudo usermod -g app_os app_user
sudo usermod -a -G app_os oracle
NOTE – for this change to take effect for oracle, I had to restart the VM.
Next, I’ve created some directories under the app_user home and granted appropriate permissions.
The app_user home is visible to the group :
cd $HOME
ls -ld

drwxr-x---. 7 app_user app_os 180 Apr 8 11:55 .
The bin directory will hold any shell scripts we need to execute.
The upload directory will hold feed files to be loaded into the database.
Note that oracle will have access to execute these through the group, but will not be able to write to them :
drwxr-x---. 2 app_user app_os 27 Apr 8 12:16 bin drwxr-x---. 2 app_user app_os 50 Apr 9 10:36 upload
Finally, we have a directory to hold any log files generated. Obviously, oracle does need write access to this directory :
drwxrwx---. 2 app_user app_os 54 Apr 8 14:09 logs
In the Database, I’ve created directory objects on each of these directories.
Note that home on Oracle Linux Server is a symbolic link to /opt/oracle/userhome, so we need to use the physical path in the Directory Object definition :
create or replace directory app_bin as '/opt/oracle/userhome/app_user/bin'
/
create or replace directory app_log as '/opt/oracle/userhome/app_user/logs'
/
create or replace directory app_upload as '/opt/oracle/userhome/app_user/upload'
/
…and granted the appropriate permissions to the HR database user :
grant read, execute on directory app_bin to hr;
grant read, write on directory app_log to hr;
grant read on directory app_upload to hr;
I’ve created a csv file containing all of the records in the employees table.
In SQLCL this is accomplished by connecting to the database as HR and running :
set sqlformat csv
spool employees.csv
select 'DATA' as row_type,
emp.*
from employees
/
spool off
I’ve also tweaked the row count at the end of the file to make it look like a trailer record you might expect to see on a feed file…
TRA,107

Coming gundun with gunzip
I’ve got a compressed version of the file in the uploads directory :
-rw-r--r--. 1 app_user app_os 3471 Apr 9 10:36 employees.csv.gz
Now I need to create a simple shell script to call from an external table preprocessor to unzip the file on the fly and send the output to STDOUT. This will then be read and uploaded into the external table automatically.
The script is called unzip_file.sh and, initially, it looks like this :
#!/usr/bin/sh
/usr/bin/gunzip -c $1
When invoked by an external table preprocessor call, the script will be passed the fully qualified path of the file specified as the location of the external table.
The permissions on the script are :
-rwxr-x---. 1 app_user app_os 198 Apr 8 13:04 unzip_file.sh
Now for the external table itself. This is called employees_zip_xt and is created in the HR schema :
create table employees_zip_xt
(
row_type varchar2(10),
employee_id number,
first_name varchar2(20),
last_name varchar2(25),
email varchar2(25),
phone_number varchar2(20),
hire_date date,
job_id varchar2(10),
salary number,
commission_pct number,
manager_id number,
department_id number
)
organization external
(
type oracle_loader
default directory app_upload
access parameters
(
records delimited by newline
logfile app_log : 'employees_zip_xt.log'
badfile app_log : 'employees_zip_xt.bad'
nodiscardfile
preprocessor app_bin : 'unzip_file.sh'
skip 1
load when row_type = 'DATA'
fields terminated by ',' optionally enclosed by '"'
missing field values are null
(
row_type char(10),
employee_id integer external(6),
first_name char(20),
last_name char(25),
email char(25),
phone_number char(20),
hire_date date "DD-MON-RR",
job_id char(10),
salary float external,
commission_pct float external,
manager_id integer external(6),
department_id integer external(4)
)
)
location('employees.csv.gz')
)
reject limit unlimited
/
The table will process the data written to STDOUT by the preprocessor script.
Hmmm, I wonder what that suspicious clump of leaves and twigs is covering…
select * from employees_zip_xt;

ORA-29913: error while processing ODCIEXTTABLEFETCH routine
ORA-29400: data cartridge error
KUP-04095: preprocessor command /opt/oracle/userhome/app_user/bin/unzip_file.sh
encountered error "/usr/bin/gunzip: line 57: /opt/oracle/product/23c/dbhomeFree/dbs/gzip: No such file or directory
Give me a hand up, will you ?
As well as the ever inscrutable ODCIEXTTABLEFETCH, we have KUP-04095 complaining that it can’t find an executable that we haven’t called.
Whilst we’ve specified the full path to the gunzip executable in our script, GNU gunzip calls gzip during execution. Because that doesn’t use a fully qualified path, it looks in the wrong place.
The solution then, is to set the PATH environment variable in our script…
#!/usr/bin/sh
# Need the PATH specified because gunzip executable is calling gzip under the covers
export PATH=/usr/bin
gunzip -c $1
Now, when we query the table, it works as expected :

…and the unencrypted data is not persisted in a file…
ls -l emp*

-rw-r--r--. 1 app_user app_os 3471 Apr 9 10:36 employees.csv.gz
Now we’ve established the concept, let’s try something a little more ambitious…
Where did I leave my keys ?
Just for a change, I’m going to start with the example that actually works as expected.
I’ve generated GPG keys for oracle and used the public key to encrypt the original csv. The encrypted file has been moved to the upload directory :
ls -l *.gpg

-rw-r--r--. 1 app_user app_os 3795 Apr 9 12:31 employees.csv.gpg
Note that the passphrase has been saved in passphrase.txt in oracle‘s home directory with the following permissions :
ls -l passphrase.txt

-rw-------. 1 oracle oracle 19 Apr 9 15:41 passphrase.txt
Now we need a script in our bin directory to decrypt the file (decrypt_file.sh) :
#!/usr/bin/sh
/usr/bin/gpg \
--decrypt \
--pinentry-mode loopback \
--passphrase-file /opt/oracle/userhome/oracle/passphrase.txt \
--batch $1 2>/dev/null
We have a new external table – employees_enc_xt, which may look somewhat familiar :
create table employees_enc_xt
(
row_type varchar2(10),
employee_id number,
first_name varchar2(20),
last_name varchar2(25),
email varchar2(25),
phone_number varchar2(20),
hire_date date,
job_id varchar2(10),
salary number,
commission_pct number,
manager_id number,
department_id number
)
organization external
(
type oracle_loader
default directory app_upload
access parameters
(
records delimited by newline
logfile app_log : 'employees_enc_xt.log'
badfile app_log : 'employees_enc_xt.bad'
nodiscardfile
preprocessor app_bin : 'decrypt_file.sh'
skip 1
load when row_type = 'DATA'
fields terminated by ',' optionally enclosed by '"'
missing field values are null
(
row_type char(10),
employee_id integer external(6),
first_name char(20),
last_name char(25),
email char(25),
phone_number char(20),
hire_date date "DD-MON-RR",
job_id char(10),
salary float external,
commission_pct float external,
manager_id integer external(6),
department_id integer external(4)
)
)
location('employees.csv.gpg')
)
reject limit unlimited
/
…as will the results when we query it :

The point of a private GPG key is that it’s only accessible to the user that owns it. If that’s not oracle then we need to find a means of executing as the account which does own the key.
No biggie, the PDB_OS_CREDENTIAL parameter has that covered…oh, what have I stepped in.
PDB_OS_CREDENTIAL has been around since 12c and should allow you to specify the OS user that the preprocessor runs as when it’s invoked from an external table.
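Setting it up looks straightforward enough. Were it not for what follows, something along these lines (run as a suitably privileged user, with a credential name I’ve just made up, and taking effect after a restart) should cause the preprocessor to run as app_user :

begin
    dbms_credential.create_credential(
        credential_name => 'APP_USER_OS_CRED',
        username        => 'app_user',
        password        => 'the app_user OS password');
end;
/

alter system set pdb_os_credential = APP_USER_OS_CRED scope = spfile;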
Coincidentally, the bug that causes this setting to be ignored has been around for a similar length of time.
I have personally confirmed this on versions 19c (splash !), 21c (squelch !) and now 23c (splat !).
I do hope that’s mud.
In these articles, Szymon Skorupinski does a marvellous job of explaining system behaviour both before and after the application of the patch to fix this issue :
If applying the patch is going to be problematic, not least because of the regression testing effort required to make sure none of your existing code breaks, then you do have an alternative…
I should be able to scrape it off once it dries
Before external tables there was SQL*Loader. It’s still supported, which is just as well in this case as we can use it to run a script as the owner of the PGP key and load the data directly into a staging table in the database.
The advantage of this approach is that you can execute it as the owner of the GPG key (although I’m still using the oracle user in this example).
To start with, we need a table to stage the data into :
create table employees_stg
(
row_type varchar2(10),
employee_id number,
first_name varchar2(20),
last_name varchar2(25 ),
email varchar2(25),
phone_number varchar2(20),
hire_date date,
job_id varchar2(10),
salary number,
commission_pct number,
manager_id number,
department_id number
)
/
Now we create a SQL*Loader control file in the bin directory on the server (load_employees.ctl) :
options(skip=1)
load data
badfile '/opt/oracle/userhome/app_user/logs/load_employees.bad'
discardfile '/opt/oracle/userhome/app_user/logs/load_employees.dis'
append
into table employees_stg
when row_type='DATA'
fields terminated by "," optionally enclosed by '"'
trailing nullcols
(
row_type char(10),
employee_id integer external(6),
first_name char(20),
last_name char(25),
email char(25),
phone_number char(20),
hire_date date "DD-MON-RR",
job_id char(10),
salary float external,
commission_pct float external,
manager_id integer external(6),
department_id integer external(4)
)
Finally, we create a script to perform the load (load_employees.sh) :
#!/usr/bin/sh
/usr/bin/gpg \
--decrypt \
--pinentry-mode loopback \
--passphrase-file /opt/oracle/userhome/oracle/passphrase.txt \
--batch \
/home/app_user/upload/employees.csv.gpg 2>/dev/null|sqlldr \
control=load_employees.ctl \
userid=connect_string \
log=/home/app_user/logs/load_employees.log \
data=\'-\'
…replacing connect_string with the connect string for the schema you’re loading into.
Incidentally, in order to avoid having a database password hard-coded in the script, an Oracle Wallet would come in handy here.
Note that we tell SQL*Loader to read its STDIN for the data by specifying :
data =\'-\'
When we run this, we can see that it works as expected :


You could easily automate this load process by creating a DBMS_SCHEDULER external job.
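A minimal sketch of such a job might look like this. The job name and schedule are invented, and you’d probably also want to attach a credential so that it runs as the GPG key’s owner rather than the default external job user :

begin
    dbms_scheduler.create_job(
        job_name        => 'LOAD_EMPLOYEES_JOB',
        job_type        => 'EXECUTABLE',
        job_action      => '/opt/oracle/userhome/app_user/bin/load_employees.sh',
        start_date      => systimestamp,
        repeat_interval => 'FREQ=DAILY; BYHOUR=6',
        enabled         => true);
end;
/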
Conclusion
As we’ve seen, it’s perfectly possible to pipe data from binary format files into the database without having to persist it in plain text. However, if you do decide to go down this path, you may want to wear your wellies.
Loading selected fields from a delimited file into Oracle using an External Table
If you’ve dealt with ETL processes for any length of time, sooner or later, you’ll be faced with the need to load data from a delimited file into your database.
In the case of Oracle, External Tables are tailor-made for this purpose.
However, whilst they make loading an entire file very simple, exactly how do you persuade them to load just certain fields ?
What we’ll be looking at here is :
- an external table that loads an entire file
- an external table that just loads the first few fields of each record in the file
- an external table that loads a selection of fields that are not contiguous in the file
All of these examples have been tested on Oracle 21cXE, although I’ve not seen any behaviour here that’s not consistent with Oracle versions back to 11.2.
The example file is simply a csv (called employees.csv) containing all the records from the EMPLOYEES table in the HR sample schema.

The external table which reads the entire file is called ALL_EMPS_XT :
create table all_emps_xt(
employee_id number(6,0),
first_name varchar2(20),
last_name varchar2(25),
email varchar2(25),
phone_number varchar2(20),
hire_date date,
job_id varchar2(10),
salary number(8,2),
commission_pct number(2,2),
manager_id number(6,0),
department_id number(4,0))
organization external (
type oracle_loader
default directory my_files
access parameters (
records delimited by newline
logfile 'all_emps.log'
badfile 'all_emps.bad'
discardfile 'all_emps.dis'
skip 1
load when (1:3) != 'TRL'
fields terminated by ',' optionally enclosed by '"'
missing field values are null (
employee_id integer external(6),
first_name char(20),
last_name char(25),
email char(25),
phone_number char(20),
hire_date date "DD-MON-YYYY",
job_id char(10),
salary float external,
commission_pct float external,
manager_id integer external(6),
department_id integer external(4))
)
location('employees.csv')
)
reject limit unlimited
/
Sure enough, if we select from the external table, we can see that all of the fields have been populated as expected …
select *
from all_emps_xt
where job_id in ('AD_PRES', 'SA_MAN')
order by employee_id
/

… and that all 107 data records have been loaded :
select count(*) from all_emps_xt;

  COUNT(*)
----------
       107

Loading first few fields only
Let’s say that we weren’t interested in all of the fields in the file and we just wanted the first five – i.e. :
- EMPLOYEE_ID
- FIRST_NAME
- LAST_NAME
- EMAIL
- PHONE_NUMBER
This is simple enough : we just specify definitions for those fields and our external table will ignore all of the fields after phone_number, simply by making sure that
missing field values are null
…is still specified in the access parameters clause of the External Table. This one is called START_EMPS_XT :
create table start_emps_xt(
employee_id number(6,0),
first_name varchar2(20),
last_name varchar2(25),
email varchar2(25),
phone_number varchar2(20))
organization external (
type oracle_loader
default directory my_files
access parameters (
records delimited by newline
logfile 'start_emps.log'
badfile 'start_emps.bad'
discardfile 'start_emps.dis'
skip 1
load when (1:3) != 'TRL'
fields terminated by ',' optionally enclosed by '"'
missing field values are null (
employee_id integer external(6),
first_name char(20),
last_name char(25),
email char(25),
phone_number char(20))
)
location('employees.csv')
)
reject limit unlimited
/
select *
from start_emps_xt
where employee_id between 100 and 110
order by employee_id
/

This is where it gets a bit more tricky…but not much, as it turns out. Say we want to load :
Field No  Field Name
1         EMPLOYEE_ID
2         FIRST_NAME
3         LAST_NAME
6         HIRE_DATE
10        MANAGER_ID
11        DEPARTMENT_ID

We simply need to include placeholder values in the access parameters clause for the columns we don’t want to load. In this case, I’ve used the actual field names but not specified a data type for them. That way, if I ever need to add one of these columns into the external table definition, I know where it appears in the file and the change is comparatively simple. Anyhow, the table is called SELECTED_EMPS_XT :
create table selected_emps_xt(
employee_id number(6,0),
first_name varchar2(20),
last_name varchar2(25),
hire_date date,
manager_id number(6,0),
department_id number(4,0))
organization external (
type oracle_loader
default directory my_files
access parameters (
records delimited by newline
logfile 'selected_emps.log'
badfile 'selected_emps.bad'
discardfile 'selected_emps.dis'
skip 1
load when (1:3) != 'TRL'
fields terminated by ',' optionally enclosed by '"'
missing field values are null (
employee_id integer external(6),
first_name char(20),
last_name char(25),
email,
phone_number,
hire_date date "DD-MON-YYYY",
job_id,
salary,
commission_pct,
manager_id integer external(6),
department_id integer external(4))
)
location('employees.csv')
)
reject limit unlimited
/
Sure enough, if we query the table :
select *
from selected_emps_xt
where department_id = 60
order by hire_date
/

FORALL DML – why context isn’t everything
This post is the latest in an occasional series on the theme of stuff that doesn’t work quite how I thought it did.
It is the result of finding out the fun way that, rather than being “much faster” than using a humble Cursor For Loop , Bulk Collect/Forall for DML in PL/SQL can merely be a bit less slow.
Just in case my boss is reading this, I’d better get my excuses in right at the beginning.
This is what the Oracle PL/SQL Language Reference has to say about Forall :
A FORALL statement is usually much faster than an equivalent FOR LOOP statement. However, a FOR LOOP statement can contain multiple DML statements, while a FORALL statement can contain only one. The batch of DML statements that a FORALL statement sends to SQL differ only in their VALUES and WHERE clauses. The values in those clauses must come from existing, populated collections.
…and later in the same document ….
The FORALL statement runs one DML statement multiple times, with different values in the VALUES and WHERE clauses. The different values come from existing, populated collections or host arrays. The FORALL statement is usually much faster than an equivalent FOR LOOP statement.
I’ll be testing that assertion in this post.
Using a simple test case, which involves performing different DML operations on different tables using a single record set, I’ll begin by comparing the relative performance of:
- simple SQL statements inside a PL/SQL block
- Cursor For Loops
- Bulk Collect/Forall operations
I’ll then explore the potential performance gains available using database objects such as VARRAYs and Global Temporary Tables to hold the array being used in the DML.
NOTE – If you want to run these tests yourself on your own environment/database version to validate these findings, you can find the scripts I’ve used here on my Github Repo.
Test Scenario
I have a single set of records that I want to use in an insert, an update and a delete.
Each of these operations has a different target table.
I’ve created a simple test case, using this script ( called setup.sql) :
clear screen
set serverout on size unlimited
drop table input_records;
drop table record_statuses;
drop table target_table;
create table input_records
as
select rownum as id
from dual
connect by rownum <=100000;
create table record_statuses
as
select id, 'WAITING' as status
from input_records;
create table target_table ( id number);
The tests have all been run on Oracle Enterprise Edition 19c, although the behaviour described here appears to be consistent across Oracle Database versions from 11g upwards.
I’ve run each script twice in succession and then taken the fastest runtime. This is to ensure I’ve not been caught out by the effects of any caching. Unlikely as I drop and re-create the tables each time, but as I’ve already fallen foul of one assumption recently, I didn’t want to take any chances.
I’ve used tkprof to format tracefile information, so it’s probably worth bearing in mind the column heading definitions for tkprof output :
******************************************************************************
count    = number of times OCI procedure was executed
cpu      = cpu time in seconds executing
elapsed  = elapsed time in seconds executing
disk     = number of physical reads of buffers from disk
query    = number of buffers gotten for consistent read
current  = number of buffers gotten in current mode (usually for update)
rows     = number of rows processed by the fetch or execute call
******************************************************************************
To generate the trace files, I’ve wrapped the anonymous blocks containing the test code in the following statements.
Before the PL/SQL block containing the test code :
alter session set timed_statistics = true;
alter session set sql_trace = true;
alter session set tracefile_identifier = '<a string>';
…where <a string> is a string to include in the tracefile name to make it easier to find in the trace directory on the database server.
After the block :
alter session set sql_trace = false;
select value
from v$diag_info
where name = 'Default Trace File'
/
A Simple SQL Baseline
Let’s start by benchmarking just using SQL.
This is as fast as it gets in Oracle terms…generally speaking.
Therefore it should provide a useful indicator of what the RDBMS is capable of in terms of performing these DML operations ( straight_iud.sql ):
@setup.sql
set timing on
begin
insert into target_table(id)
select id from input_records;
update record_statuses
set status = 'LOADED'
where id in ( select id from input_records);
delete from input_records
where id in ( select id from record_statuses where status = 'LOADED');
commit;
end;
/
set timing off
Once the setup is done, the block completes in 1.878 seconds.
Whilst it may be faster than a fast thing on fast pills, a SQL-only solution is not always practical.
As Steven Feuerstein points out in his excellent article on this subject, there are times when you need PL/SQL to get the job done.
This is where the Bulk Collect/Forall and Cursor For Loop approaches come to the fore.
Let’s start by looking at how they compare when called upon to perform the first of our DML operations…
First, the straight SQL option (straight_insert.sql ):
@setup.sql
set timing on
begin
insert into target_table(id)
select id from input_records;
commit;
end;
/
set timing off
This runs in a predictably rapid 0.128 seconds.
Now for the Cursor For Loop ( forloop_insert.sql ):
@setup.sql
set timing on
declare
v_count pls_integer := 0;
begin
for r_rec in(
select id
from input_records)
loop
insert into target_table(id)
values( r_rec.id);
end loop;
commit;
end;
/
set timing off
…which completes in an, equally predictable, laggardly 1.445 seconds.
Incidentally, tracing this operation confirms that the select statement in a Cursor For Loop behaves in the same way as a Bulk Collect with the LIMIT set to 100. The Fetch count shows the 100K records have been fetched in batches of 100.
SELECT ID
FROM INPUT_RECORDS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch     1002      0.43       0.38        154     328059          0      100000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1006      0.43       0.39        154     328059          0      100000
This brings us to the much vaunted Forall.
If we start by running with a limit of 100 on the Bulk Collect, we can get an idea of how much difference it makes when we minimize the context switches between the SQL and PL/SQL engine as compared to the humble Cursor For Loop ( forall_insert_100.sql ):
@../setup.sql
set timing on
declare
v_count pls_integer := 0;
cursor c_input is
select id
from input_records;
type typ_input is table of input_records%rowtype index by pls_integer;
v_arr_id typ_input;
begin
open c_input;
loop
fetch c_input
bulk collect into v_arr_id
limit 100;
forall i in 1..v_arr_id.count
insert into target_table(id)
values( v_arr_id(i).id);
exit when v_arr_id.count = 0;
end loop;
close c_input;
commit;
end;
/
set timing off
Yep, that’s much faster at 0.165 seconds.
We can conclude from this that most of the Cursor For Loop execution time is due to context switching between the SQL and PL/SQL engines.
However, after some trial and error, I’ve found that setting a limit of 1000 is about optimal for the test case at hand – not using a limit clause actually proves to be a bit slower – so I’ll be using that in the rest of the Bulk Collect tests in this post. Starting with this ( forall_insert.sql ):
@../setup.sql
set timing on
declare
v_count pls_integer := 0;
cursor c_input is
select id
from input_records;
type typ_input is table of input_records%rowtype index by pls_integer;
v_arr_id typ_input;
begin
open c_input;
loop
fetch c_input
bulk collect into v_arr_id
limit 1000;
forall i in 1..v_arr_id.count
insert into target_table(id)
values( v_arr_id(i).id);
exit when v_arr_id.count = 0;
end loop;
close c_input;
commit;
end;
/
set timing off
This runs in 0.145 seconds.
Let’s tabulate those execution times for ease of comparison :
Rank  Method               Runtime (seconds)
1     Simple SQL           0.128
2     Forall (Limit 1000)  0.145
3     Cursor For Loop      1.445

So far, it all seems to be pretty much as you might expect.
OK, let’s try some more DML statements…
We’re going to insert records into a table as before.
Then, using the same record set, we’ll update a second table then delete from a third table.
We’ve already seen the unadorned SQL script for this test, so let’s move on to the Cursor For Loop (forloop_iud.sql). Once I kick this off, you may want to take the opportunity to go and make a coffee…
@setup.sql
set timing on
declare
v_count pls_integer := 0;
begin
for r_rec in(
select id
from input_records)
loop
insert into target_table(id)
values( r_rec.id);
update record_statuses
set status = 'LOADED'
where id = r_rec.id;
delete from input_records
where id = r_rec.id;
end loop;
commit;
end;
/
set timing off
That took almost four minutes ( 238.486 seconds).
Let’s see what “much faster” looks like with Forall (forall_iud.sql ):
@setup.sql
set timing on
declare
v_count pls_integer := 0;
cursor c_input is
select id
from input_records;
type typ_input is table of input_records%rowtype index by pls_integer;
v_arr_id typ_input;
begin
open c_input;
loop
fetch c_input
bulk collect into v_arr_id
limit 1000;
forall i in 1..v_arr_id.count
insert into target_table(id)
values( v_arr_id(i).id);
forall j in 1..v_arr_id.count
update record_statuses
set status = 'LOADED'
where id = v_arr_id(j).id;
forall k in 1..v_arr_id.count
delete from input_records
where id = v_arr_id(k).id;
exit when v_arr_id.count = 0;
end loop;
close c_input;
commit;
end;
/
set timing off
My coffee’s gone cold. That’s 224.162 seconds – around 6% faster if you’re interested.
I’m not sure that qualifies as “much faster”.
Remember, the simple SQL runtime for the same DML operations was 1.878 seconds.
Time for another table :
Rank  Method               Runtime (seconds)
1     Simple SQL           1.878
2     Forall (Limit 1000)  224.162
3     Cursor For Loop      238.486

In order to find out just what’s going on, it may be worth isolating a single DML operation, as we did with the INSERT statement earlier. This time, we’ll try an UPDATE…
Tracing an Update
The simple SQL benchmark for the update is ( straight_update.sql ):
@../setup.sql
insert into target_table
select id from input_records;
commit;
set timing on
begin
update record_statuses
set status = 'LOADED'
where id in ( select id from input_records);
commit;
end;
/
set timing off
This runs in 0.798 seconds.
Now for the Cursor For Loop ( forloop_update.sql):
@setup.sql
insert into target_table
select id from input_records;
commit;
set timing on
declare
v_count pls_integer := 0;
begin
for r_rec in(
select id
from input_records)
loop
update record_statuses
set status = 'LOADED'
where id = r_rec.id;
end loop;
commit;
end;
/
set timing off
144.856 seconds.
Now for the Forall (forall_update.sql ):
@setup.sql
insert into target_table(id)
select id from input_records;
commit;
set timing on
declare
v_count pls_integer := 0;
cursor c_input is
select id
from input_records;
type typ_input is table of input_records%rowtype index by pls_integer;
v_arr_id typ_input;
begin
open c_input;
loop
fetch c_input
bulk collect into v_arr_id
limit 1000;
forall j in 1..v_arr_id.count
update record_statuses
set status = 'LOADED'
where id = v_arr_id(j).id;
exit when v_arr_id.count = 0;
end loop;
close c_input;
commit;
end;
/
set timing off
That’s a not-much-faster-at-all-really 141.449 seconds.
Rank  Method               Runtime (seconds)
1     Simple SQL           0.798
2     Forall (Limit 1000)  141.449
3     Cursor For Loop      144.856

Once I’ve turned tracing on and re-executed these scripts, the reason for the performance discrepancy becomes clear.
Here’s the tkprof output for the Simple SQL Update. Remember, query is the number of buffers gotten for a consistent read :
UPDATE RECORD_STATUSES SET STATUS = 'LOADED'
WHERE ID IN ( SELECT ID FROM INPUT_RECORDS)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          2           0
Execute      1      0.51       0.92        249        416     102416      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.51       0.92        249        416     102418      100000
As you’d expect, the Cursor For Loop performs a huge number of gets by comparison as it’s executing 100K update statements, rather than just one :
UPDATE RECORD_STATUSES SET STATUS = 'LOADED'
WHERE ID = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 100000    215.15     221.86        248   25104059     103919      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100001    215.15     221.86        248   25104059     103919      100000
Now, although the Forall is only executing the update 100 times, the number of gets is on a par with the Cursor For Loop :
UPDATE RECORD_STATUSES SET STATUS = 'LOADED'
WHERE ID = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    100    217.41     220.58        248   25105754     103319      100000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      101    217.41     220.58        248   25105754     103319      100000
Further digging shows that it’s a similar story with the DELETEs.
At this point, maybe we should take another look at that documentation I quoted at the start. Specifically, the bit which says :
The FORALL statement runs one DML statement multiple times, with different values in the VALUES and WHERE clauses.
So, despite the tkprof execution count for the statement suggesting otherwise, it looks like Forall is actually running a separate UPDATE statement for each value in the collection.
If we’re going to get this particular DML code to perform, we’ll need to look at storing our array in a structure other than a PL/SQL collection.
VARRAY
In their guise as a database object (as opposed to a PL/SQL collection), VARRAYs are directly visible to the SQL engine. This means that they can be directly referenced in SQL queries without the necessity for any context switching (varray_iud.sql) …
@setup.sql
-- If we try to declare and use the varray entirely within the PL/SQL block
-- we run into PLS-00642 : local collection types not allowed in SQL statements
-- so...
create or replace type prince_varray as varray(100000) of number
/
set timing on
declare
v_id_arr prince_varray := prince_varray();
begin
select id
bulk collect into v_id_arr
from input_records;
insert into target_table(id)
select * from table(v_id_arr);
update record_statuses
set status = 'LOADED'
where id in (select * from table(v_id_arr));
delete from input_records
where id in (select * from table(v_id_arr));
commit;
end;
/
set timing off
Side Note : some may consider it bad practice to name objects simply for the purposes of making a weak pun.
The big news is that the runtime is a rather respectable 1.846 seconds.
If you’re sure that you’re never going to need to process more than 32767 records in your VARRAY you can even save yourself the effort of coming up with a sensible name for one.
The following types are already created in Oracle for each of the base datatypes:
- sys.ODCINumberList
- sys.ODCIDateList
- sys.ODCIVarchar2List
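For example, here’s a sketch of just the update step using sys.ODCINumberList. Bear in mind that it tops out at 32767 elements, so the fetch is capped accordingly :

declare
    v_id_arr sys.odcinumberlist := sys.odcinumberlist();
begin
    -- sys.odcinumberlist is a varray(32767), so cap the fetch to fit
    select id
    bulk collect into v_id_arr
    from input_records
    where rownum <= 32767;

    update record_statuses
    set status = 'LOADED'
    where id in (select column_value from table(v_id_arr));

    commit;
end;
/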
Incidentally, in case you’re wondering about the maximum size of a VARRAY…
create or replace type varray_kane as varray(2147483647) of number
/

Type VARRAY_KANE compiled
Whilst a VARRAY is an option in this instance, in most cases, you’ll be dealing with a multi-dimensional array.
If only we could use a table…
Using a Global Temporary Table (GTT) offers all of the advantages of using a VARRAY with the added bonus that it’ll work for multi-dimensional arrays (gtt_iud.sql).
@setup.sql
drop table gtt_array;
create global temporary table gtt_array (
id number)
on commit delete rows;
set timing on
begin
insert into gtt_array
select id from input_records;
insert into target_table(id)
select id from gtt_array;
update record_statuses
set status = 'LOADED'
where id in (select id from gtt_array);
delete from input_records
where id in (select id from gtt_array);
commit;
end;
/
set timing off
Runtime is 1.950 seconds.
If we look at our final results table for the runs that involved all three DML statements, we can see that the performance of the VARRAY and GTT methods is broadly similar to that of using simple SQL statements, mainly because they are also, effectively simple SQL statements.
I say “broadly similar” because I’ve not found one method to be consistently faster than the other two. All of them are, however, much faster than Forall.
Rank  Method               Runtime (seconds)
1     VARRAY               1.846
2     Simple SQL           1.878
3     GTT                  1.950
4     Forall               224.162
5     Cursor For Loop      238.486

Conclusions
Using Forall for an INSERT is likely to be significantly faster than using a Cursor For Loop due to the reduction in context switching. However, this does not hold for other types of DML.
If you find yourself in circumstances where performance is an issue then you may want to consider using a database object, such as a VARRAY or GTT, in your array processing.
Using a full outer join to “diff” two tables
I was inspired to write this post by Joshua Ottwell’s thoughts on finding rows present in one table but not another.
What follows is an exploration of how we can use a Full Outer Join to perform a “diff” on the data in two tables. We’ll also look at doing something similar for two distinct result sets from the same table.
In this instance, we want to identify :
- records that exist in the first table but not the second
- records that exist in the second table but not the first
- records that exist in both tables but where some values differ
Before going any further, I should say that the example that follows will make more sense if you consider Terry Pratchett’s observation that :
“In ancient times cats were worshipped as gods; they have not forgotten this.”

Let’s say I have a report that I want to change.
I’ve simulated the output of the report and captured it in a table like this :
create table employee_report_baseline(
id number,
emp_name varchar2(100),
job_title varchar2(100) )
/
insert into employee_report_baseline( id, emp_name, job_title)
values(1,'DEBBIE', 'CEO');
insert into employee_report_baseline( id, emp_name, job_title)
values(2, 'MIKE', 'MINION');
insert into employee_report_baseline( id, emp_name, job_title)
values(3, 'TEDDY', 'DOG');
commit;
The output of the new version of the report is simulated like this :
create table employee_report_new
as
select *
from employee_report_baseline
where emp_name != 'MIKE';
update employee_report_new
set job_title = 'LADY-IN-WAITING'
where emp_name = 'DEBBIE';
insert into employee_report_new( id, emp_name, job_title)
values(4, 'CLEO', 'CAT');
commit;
Identifying missing or new records
First of all, let’s see if the records returned are the same for both reports. Happily ID is a unique key for the dataset, which makes the comparison fairly simple :
select bsl.id as baseline_id, nr.id as new_id,
case
when bsl.id is null then 'Added in New Report'
when nr.id is null then 'Missing from New Report'
end as status
from employee_report_baseline bsl
full outer join employee_report_new nr
on bsl.id = nr.id
where bsl.id is null
or nr.id is null
order by 1 nulls last;
The absence of the key value from either table indicates that the record is in one table but not the other.
Sure enough, when we run this query we get :
BASELINE_ID NEW_ID STATUS
----------- ------ ------------------------------
          2        Missing from New Report
                 4 Added in New Report

All Differences
If we want to identify all of the differences, including changed records, we can do so with just a little extra typing :
select
nvl(bsl.id, nr.id) as id,
bsl.emp_name as old_emp_name,
nr.emp_name as new_emp_name,
bsl.job_title as old_job_title,
nr.job_title as new_job_title,
case
when bsl.id is null then 'Added in New Report'
when nr.id is null then 'Missing from New Report'
else 'Different between Report Versions'
end as status
from employee_report_baseline bsl
full outer join employee_report_new nr
on bsl.id = nr.id
where bsl.id is null
or nr.id is null
or nvl(bsl.emp_name, 'X') != nvl(nr.emp_name, 'X')
or nvl(bsl.job_title, 'X') != nvl(nr.job_title, 'X')
order by 1 nulls last
/
As well as the two records that are missing from either result set, we can see that Debbie has been demoted ( or possibly usurped). Cleo is now the top dog(!) :

One thing to bear in mind when using a full outer join is that it will match any row in the tables being joined.
This gets a bit annoying when you want to do something with a subset of data such as comparing the column definitions of two tables in USER_TAB_COLUMNS in Oracle.
To demonstrate :
create table old_firm(
id number,
first_name varchar2(500),
last_name varchar2(500),
job_title varchar2(100),
start_date date,
end_date date)
/
create table new_firm(
id number,
first_name varchar2(250),
last_name varchar2(500),
salary number,
start_date date,
end_date varchar2(20))
/
If we want to compare only the records relating to these two tables then we’ll need a couple of in-line-views to restrict the result sets that our full outer join will look at.
Incidentally, as we’re running this on Oracle 19c, we can throw in an in-line function as well to save a bit of typing…
with function col_diff(
i_old_name in varchar2 default null,
i_new_name in varchar2 default null,
i_old_type in varchar2 default null,
i_new_type in varchar2 default null,
i_old_len in number default null,
i_new_len in number default null)
return varchar2
is
begin
if i_old_name is null then
return 'NEW';
elsif i_new_name is null then
return 'MISSING';
end if;
-- compare the attributes for the columns
if i_old_type != i_new_type then
return 'TYPE';
elsif i_old_len != i_new_len then
return 'LENGTH';
end if;
end;
ofm as (
select column_name, data_type, data_length
from user_tab_columns
where table_name = 'OLD_FIRM'),
nfm as (
select column_name, data_type, data_length
from user_tab_columns
where table_name = 'NEW_FIRM')
select
nvl(ofm.column_name, nfm.column_name) as column_name,
case col_diff( ofm.column_name, nfm.column_name, ofm.data_type, nfm.data_type, ofm.data_length, nfm.data_length)
when 'NEW' then 'New Column'
when 'MISSING' then 'Missing Column in NEW_FIRM'
when 'TYPE' then 'Type Mismatch. OLD_FIRM type is '||ofm.data_type||' NEW_FIRM is '||nfm.data_type
when 'LENGTH' then 'Length Mismatch. OLD_FIRM length is '||ofm.data_length||' NEW_FIRM is '||nfm.data_length
end as status
from ofm
full outer join nfm
on ofm.column_name = nfm.column_name
where (ofm.column_name is null or nfm.column_name is null)
or( ofm.data_type != nfm.data_type or ofm.data_length != nfm.data_length)
/

I’ll have to leave it there. The “boss” is demanding Dreamies to appease her wrath for being referred to as “top dog” earlier.
The Oracle SQL Limit Clause and teaching a New Dog Old Tricks
It’s been a few weeks now and I’ve finally gotten over England’s latest World Cup penalty drama.
I’m not so sure about Teddy though…

Anyhow, he has decided that I need to know about the SQL Limit Clause that Oracle introduced in 12c and has decided to use data from the Tournament in the examples that follow…
Environment
These examples have all been run on my OCI Free Tier database ( Oracle 19c at the time of writing).
The table we’re going to use contains details of players who scored or gave an assist during the recent Men’s World Cup Finals and looks like this :
create table goal_scorers (
player varchar2(4000),
team varchar2(500),
goals number,
assists number)
/
The table contains 176 rows.
select player, team, goals, assists from goal_scorers order by goals desc, assists desc / PLAYER TEAM GOALS ASSISTS ----------------------- ----------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 SAKA ENGLAND 3 0 RASHFORD ENGLAND 3 0 GAKPO NETHERLANDS 3 0 RICHARLISON BRAZIL 3 0 E VALENCIA ECUADOR 3 0 ... snip ... PRECIADO ECUADOR 0 1 MCGREE AUSTRALIA 0 1 RODRYGO BRAZIL 0 1
Before we get into the new(ish) syntax, let’s have a quick history lesson…
Before 12c
Back in the mists of time, if you wanted to retrieve the first five rows in a table based on an order you specified, you’d need to do something like this :
select player, team, goals, assists
from (
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc, player, team)
where rownum <= 5;
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 GONCALO RAMOS PORTUGAL 3 1 5 rows selected.
Teddy reckons that you’ll only come across code like this if it’s really old…or written by a really old programmer.
Ignoring the cheeky young pup, we move on to the point where Analytic Functions become available (8i if you’re counting), which allows us to dispense with the – in this context – artificial order criteria of player and team and ensure that our query returns any tied records :
with top_scorers as (
select player, team, goals, assists,
rank() over ( order by goals desc, assists desc) as recnum
from goal_scorers )
select *
from top_scorers
where recnum <= 5
order by recnum;
PLAYER TEAM GOALS ASSISTS RECNUM -------------------- -------------------- ----- ------- ---------- MBAPPE FRANCE 8 2 1 MESSI ARGENTINA 7 3 2 ALVAREZ ARGENTINA 4 0 3 GIROUD FRANCE 4 0 3 MORATA SPAIN 3 1 5 GONCALO RAMOS PORTUGAL 3 1 5 6 rows selected.
In the latest Oracle versions, Teddy reckons that you can get that same analytical function goodness but with a bit less typing…
Fetch First
Let’s re-write our top 5 query using the limit clause…
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 5 rows only
/
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 5 rows selected.
As we’ve already seen, that’s ever so slightly different from the top 5 goal scorers. Like Morata, Goncalo Ramos also has 3 goals and 1 assist but he has been arbitrarily excluded from the result set.
Remember, Oracle does not guarantee the order of a result set of a SELECT statement other than that specified in the ORDER BY clause. Therefore, there’s no guarantee a future run of this query won’t include Goncalo Ramos and exclude Morata.
A more reliable query would include all of the tied records for 5th place, as with the RANK() query above.
Happily, we can achieve the same effect with a limit clause :
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 5 rows with ties
/
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 6 rows selected.
As well as specifying a set number of rows, you can also specify a percentage using either the ONLY clause…
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 5 percent rows only
/
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 SAKA ENGLAND 3 0 RASHFORD ENGLAND 3 0 GAKPO NETHERLANDS 3 0 9 rows selected.
… or the WITH TIES clause…
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 5 percent rows with ties
/
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 SAKA ENGLAND 3 0 RASHFORD ENGLAND 3 0 GAKPO NETHERLANDS 3 0 RICHARLISON BRAZIL 3 0 E VALENCIA ECUADOR 3 0 11 rows selected.
Offset
If we want to skip the first n rows we can use the OFFSET clause :
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
offset 10 rows
/
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- E VALENCIA ECUADOR 3 0 KANE ENGLAND 2 3 BRUNO FERNANDES PORTUGAL 2 1 LEWANDOWSKI POLAND 2 1 NEYMAR BRAZIL 2 1 ...snip... RODRYGO BRAZIL 0 1 166 rows selected.
We can also specify the number of rows to fetch in the offset clause :
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
offset 10 rows fetch next 10 rows only
/
or…
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
offset 10 rows fetch next 10 rows with ties
/
If you’re thinking of “paging” your results using this method and your order by clause can result in ties, you may get some unexpected results.
For example, when I ran :
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 10 rows only
/
I got :
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 E VALENCIA ECUADOR 3 0 SAKA ENGLAND 3 0 RICHARLISON BRAZIL 3 0 GAKPO NETHERLANDS 3 0 10 rows selected.
However, when I then ran…
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
offset 10 rows fetch next 10 rows only
/
Gakpo appeared again…
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- GAKPO NETHERLANDS 3 0 KANE ENGLAND 2 3 NEYMAR BRAZIL 2 1 LEWANDOWSKI POLAND 2 1 BRUNO FERNANDES PORTUGAL 2 1 TAREMI IRAN 2 1 ABOUBAKAR CAMEROON 2 1 CHO SOUTH KOREA 2 0 KRAMARIC CROATIA 2 0 AL DAWSARI SAUDI ARABIA 2 0 10 rows selected.
In this example, if I want to eliminate duplicates then I need to make sure that my order by clause does not allow any ties :
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc, player, team
fetch first 10 rows only
/
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 GONCALO RAMOS PORTUGAL 3 1 MORATA SPAIN 3 1 E VALENCIA ECUADOR 3 0 GAKPO NETHERLANDS 3 0 RASHFORD ENGLAND 3 0 RICHARLISON BRAZIL 3 0 10 rows selected.
select player, team, goals, assists from goal_scorers order by goals desc, assists desc, player, team offset 10 rows fetch next 10 rows only / PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- SAKA ENGLAND 3 0 KANE ENGLAND 2 3 ABOUBAKAR CAMEROON 2 1 BRUNO FERNANDES PORTUGAL 2 1 LEWANDOWSKI POLAND 2 1 NEYMAR BRAZIL 2 1 TAREMI IRAN 2 1 AL DAWSARI SAUDI ARABIA 2 0 CHO SOUTH KOREA 2 0 DE ARRASCAETA URUGUAY 2 0 10 rows selected.
So that’s it then, give the dog a biscuit and be on our way ?
Well, it’s just possible that the rownum method from olden times is not ready to be consigned to history quite yet…
Let’s say we have a table with lots of rows…
create table bigtab as
select rownum as id from dual connect by rownum <= 10000000;
If we run the following queries against it, the relative performance may be a bit of a surprise…
set timing on with ordered_recs as ( select id from bigtab order by 1) select id from bigtab where rownum <=5; ID ---------- 1 2 3 4 5 Elapsed: 00:00:00.042 select id from bigtab order by id fetch first 5 rows only; ID ---------- 1 2 3 4 5 Elapsed: 00:00:00.608 select id from ( select id, rank() over (order by id) as recnum from bigtab) where recnum <= 5; ID ---------- 1 2 3 4 5 Elapsed: 00:00:00.636
Note that this is the second run of each query to account for the effects of caching.
The similarity in runtime between the limit clause and the analytic function is not that surprising. If you look at an explain plan for a limit clause query, it appears to be using analytical functions under the covers.
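If you want to check this for yourself, here’s one way to do it (a quick sketch against the BIGTAB table above, rather than anything from the original test run) :
explain plan for
select id from bigtab order by id fetch first 5 rows only;

select * from table(dbms_xplan.display);
The resulting plan should include something like a WINDOW SORT PUSHED RANK operation with a filter on ROW_NUMBER(), which is the analytic function at work.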
The big news here is that the shonky old rownum query is about 14 times faster than the shiny new limit clause equivalent.
Remember, I’m running this on an OCI 19c instance managed by Oracle so there’s nothing odd in the database configuration.
It appears that there is a patch to address an optimizer issue with this type of query, details of which can be found in this article by Nigel Bayliss.
However, if you do come across a performance issue with a limit clause query, it may be worth seeing if good old-fashioned rownum will dig you out of a hole.
Conditionally calling a script in a SQL*Plus control script
For those of us who haven’t quite gotten around to incorporating Liquibase in our database code release pipeline, we sometimes find ourselves in a situation where it would be extremely useful to be able to apply some branching logic in a SQL*Plus control script.
In order to save myself the pain of trying to figure out exactly how you can do this without writing lots of dynamic code, I’ve decided to write it down.
NOTE – the scripting techniques outlined here work the same way in both SQLCL and SQL*Plus.
What follows are examples of SQL*Plus scripts which implement branching to :
- choose which of two scripts to run
- choose whether or not to run a script
The difference between these two use cases is fairly subtle, but worth exploring…
Choosing between two scripts
Here are two very simple scripts. The first is called weekday.sql :
prompt Another working day :-(
…and the second is called weekend.sql
prompt Yay, it's the weekend :-)
I want to write a control script which calls one – and only one – of these scripts based on what day it is today.
We can establish the day easily enough with :
select to_char(sysdate, 'DY') from dual;
In order to use this information to influence the behaviour of the control script at runtime, we’ll need to assign it to a variable that we can reference. In SQL*Plus, we can do this with NEW_VALUE :
column weekday new_value v_script noprint
select
case when to_char(sysdate, 'DY') in ('SAT', 'SUN')
then 'weekend.sql'
else 'weekday.sql'
end as weekday
from dual;
@&v_script
The first line of the script…
column weekday new_value v_script noprint
…takes the value of the column “weekday” from the result of the subsequent query and assigns it to the variable v_script.
I’ve saved this script as choose_message.sql.
When I run it I get :

As you can see, only the message from weekend.sql is shown. The weekday.sql script is not executed.
Choosing whether or not to run a script
This is a little different as, with this technique, the script will always call a file at the end.
Therefore, I’ve got a placeholder script which will get executed if we decide we don’t want to run the actual target script. The new script is placeholder.sql and is predictably simple :
prompt You have chosen to skip weekday.sql
This time, the control script is called run_weekday_yn.sql and accepts a parameter :
set verify off
accept run_yn prompt 'Do you want to run the script ? (Y/N) : '
column run_it new_value v_script noprint
select
case when upper('&run_yn') = 'Y'
then 'weekday.sql'
else 'placeholder.sql'
end run_it
from dual;
@&v_script
This will run weekday.sql…

…but only if we tell it to…

Of course, you can use a positional notation for the parameter to allow it to be passed unprompted at runtime. This is run_it_yn.sql :
set verify off
column run_it new_value v_script noprint
select
case
when upper('&1') = 'Y' then 'weekday.sql'
else 'placeholder.sql'
end run_it
from dual;
@&v_script
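For example, assuming run_it_yn.sql and the scripts it calls are all in the current directory, you can pass the parameter positionally from an existing SQL*Plus session :
-- supplying the parameter on the command line means no substitution prompt is issued
@run_it_yn.sql Y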

Converting a non-partitioned table to partitioned in Oracle
Teddy tells me that his New Year’s Resolution is to teach an old dog new tricks.
I’ve no idea which particular “Old Dog” he has in mind.
Anyway, it looks like 2023 is going to be The Year of The “Ted(dy) Talk” on this blog, starting with the “modern” method ( i.e. in Oracle 12c and later) of converting a non-partitioned table to a partitioned one.

Such an undertaking used to require a lot of messing about, probably involving DBMS_REDEFINITION.
If you’re still stuck on 11g, you can find a good example of this method on Oracle Base
Since 12c however, things have become much easier.
Before we go on however…
License Warning
Use of partitioning may require an additional license depending on the Oracle version and Edition so it’s a good idea to make sure you’re covered before you start using it.
With that in mind…
We have a non-partitioned table called TEDDY, which we’ve created like this :
create table teddy as
select
rownum as id,
case
when mod(trunc(dbms_random.value(1,5)),4) = 0 then 'Woof'
when mod(trunc(dbms_random.value(1,5)),4) = 1 then 'Bark'
when mod(trunc(dbms_random.value(1,5)),4) = 2 then 'Whine'
else 'Growl'
end as message,
add_months(sysdate, - trunc(dbms_random.value(1,4))) as start_date
from dual
connect by rownum <= 20
/
alter table teddy add constraint teddy_pk primary key (id);
create index teddy_local_start_date_ix on teddy(start_date);
This DDL results in the creation of the following segments :
select segment_name, segment_type from user_segments where segment_name like 'TEDDY%' / SEGMENT_NAME SEGMENT_TYPE ---------------------------------------- -------------------- TEDDY TABLE TEDDY_LOCAL_START_DATE_IX INDEX TEDDY_PK INDEX
Some time after creating the table, we’ve realised that TEDDY should be Interval Partitioned on START_DATE.
Happily for us, on Oracle versions from 12c, we can accomplish this very simply :
alter table teddy modify
partition by range (start_date) interval( numtoyminterval(1, 'MONTH'))
(partition p_default values less than (to_date('2022-01-01', 'YYYY-MM-DD')))
online
/
If we look at the segments now we can see that Oracle has created multiple table and index partitions :
select segment_name, partition_name, segment_type from user_segments where segment_name like 'TEDDY%'; SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE ---------------------------- --------------- -------------------- TEDDY P_DEFAULT TABLE PARTITION TEDDY SYS_P621 TABLE PARTITION TEDDY SYS_P622 TABLE PARTITION TEDDY SYS_P623 TABLE PARTITION TEDDY_LOCAL_START_DATE_IX P_DEFAULT INDEX PARTITION TEDDY_LOCAL_START_DATE_IX SYS_P621 INDEX PARTITION TEDDY_LOCAL_START_DATE_IX SYS_P622 INDEX PARTITION TEDDY_LOCAL_START_DATE_IX SYS_P623 INDEX PARTITION TEDDY_PK INDEX 9 rows selected.
As the TEDDY_LOCAL_START_DATE_IX index is on the partition key we’ve chosen (START_DATE), Oracle has converted this index to be locally partitioned.
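If you want to confirm this from the data dictionary (not part of the original steps, just a quick check), USER_PART_INDEXES should show the locality of the index :
select index_name, locality, partitioning_type
from user_part_indexes
where table_name = 'TEDDY'
/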
Teddy has one final tip to impart.
If you want to query a specific partition of a table, as well as specifying the partition name…
select count(*)
from teddy partition(SYS_P623)
where message = 'Woof'
/
…you can also specify a value that falls within the partition using PARTITION FOR…
select count(*)
from teddy partition for(date '2022-10-30')
where message = 'Growl'
/
These syntax options have been around since at least 11g, but he knows I seldom use them and it would be good to have examples to hand.
Building a Master-Detail Drill-Down with APEX
APEX is one of those technologies that I get to use intensively for a short period of time and then don’t touch for years.
As I’ve recently acquired an OCI Free Tier Database (version 19c) and, consequently, Oracle updates the APEX version for me (currently 22.3), now seems like a good time to see how it’s evolved since I last had a play.
It just so happens that I’m in need of a UI for my hastily constructed World Cup Wallchart application.
What I’m going to look at here is :
- how to use the APEX Drill-Down creation wizard to create related application pages
- how I can customize the generated pages without writing much code
Incidentally, I’ve put the database source-code for the app on Github should you wish to take a look.
Health Warning : when I say “hastily constructed”, I’m not kidding.
On the plus side, I have spent some time on the physical data model, with special attention paid to the table relations.
“OK Boomer”, I hear you say, “but how’s that going to help with the UI ?” Let’s find out…
Requirements
I want to see a list of matches in the tournament.
For each match, I want to be able to enter the score and record anything that happened in the match (e.g. goal scorers).
The Physical Data Model underpinning this functionality looks like this :

After reviewing the options available in APEX, it looks like a Drill-Down Master Detail will best meet this requirement…
APEX Drill-Down Wizard
I’ve already created a new APEX application, based on the WC22 database schema. Now I want to create a page :

…specifically a Master Detail….

…Drill Down…

I’ve called the first page Tournament Fixtures.
The Master Data Source (parent table) is FIXTURES :

The Primary Key Column 1 is already populated as FIXTURES has a Primary Key defined on the ID column :

I’ve set MATCH_DATE as the Form Navigation Order column.
The Create Master Detail step has Show Only Related Tables toggled On, which makes searching the drop-down list of tables to use as the detail rather simple :

…and in the next screen, APEX has pre-populated the required details using the Primary and Foreign Key definitions on the MATCH_DETAILS table :

Now let’s see what APEX has made of all that….

The good news is that APEX seems to have realised that the STAGE_CID, T1_CID and T2_CID are actually Foreign Keys (FKs) to lookup tables in the application and pulled through the appropriate NAME value from the lookup in question.
The numeric columns are all displayed as floats, which is fair enough as the underlying columns are defined as NUMBER rather than INTEGER. It’s almost as if it was hastily constructed :).
Less encouraging is that the columns containing “Draw” codes seem to have incorrect values.
If we click on the link icon and take a look at the drill-down we can see a similar story :

All in all, it’s a reasonable start considering I haven’t had to write a line of code to generate either of these pages. However, I will need to do some tweaking to get the UI into a useable state.
Let’s start with the Report.
The Tournament Fixtures Reports In the Page Designer
First of all, there are a number of columns where the data appears to be incorrect :
- GROUP_CID
- T1_DRAW_CID
- T2_DRAW_CID
- WINNER_DRAW_CID
- LOSER_DRAW_CID
Most of these won’t be needed in the final report but it’s probably wise to correct them, just in case I stumble across a bug.
We can see that APEX has defined GROUP_CID to have a Type of Plain Text (based on List of Values) and created a List of Values (LOV) on GROUPS.WINNER_DRAW_CID to use as its lookup.

As a consequence of this, the value it displays on the report is taken from the LOV rather than being the actual value in the underlying database row.
To fix this, change the Type to Plain Text, which automatically removes the association with the LOV :

The “Draw” columns (T1_DRAW_CID, T2_DRAW_CID, WINNER_DRAW_CID and LOSER_DRAW_CID) are similarly afflicted (LOV on DRAW.STAGE_CID) and the solution is the same.
To ensure that the Score and Points columns ( T1_SCORE, T2_SCORE, T1_PENS, T2_PENS, T1_PTS, T2_PTS) are displayed as integers, change the Appearance/Format Mask to : 999

One other formatting change I’d like to make is for the date of the fixture to include the day of the week as I’ll be referring to the application to see who is playing and when so I can plan my evenings accordingly.
Therefore, I’ve set the format mask for MATCH_DATE to : fmDay, fmDD fmMonth, YYYY
This application should not have functionality to create Fixtures as I’ve already added them in separately.
Therefore, we need to remove the CREATE button, which links to the Match Events page.
To accomplish this, I’ve simply deleted this button in the Page Designer (right-click on the button and select Delete).
Next, I’m going to remove the columns that have no relevance to the user on this report.
The “Draw” columns are part of the application logic to work out which teams go into each knockout fixture. This functionality is coded in the back end and requires no direct user interaction.
The T1_PTS and T2_PTS columns are virtual columns used to calculate Points totals for the Group Tables.
These are also irrelevant to this report.
There are several ways to prevent these columns being displayed in the report, but the one I’ve chosen is to set the Server-Side Condition to Never

To change the column labels so that they are a bit more user-friendly is a simple matter of changing the Heading property :

The changes are :
Column     Label
STAGE_CID  Stage
GROUP_CID  Group
T1_CID     Team 1
T2_CID     Team 2
T1_SCORE   T1 Goals
T2_SCORE   T2 Goals
T1_PENS    T1 Shootout
T2_PENS    T2 Shootout
As a result, the Report is now looking a bit more useable :

I can make further changes in the Report Runtime using the Actions Menu
Reports runtime
Using various Actions Menu Options, I’ve ordered the Primary Report by Match Date :

I’ve created a Named Report for Group Stage Fixtures.
Note that the Stage and Shootout Columns have been removed from the Report as they aren’t relevant for Group Stage matches.

There’s also a Named Report for Knockout Fixtures.
In this case, the Shootout Columns are relevant but Group isn’t :

The final Named Report is Upcoming Fixtures – i.e. fixtures that have no result entered for them as yet :

As we’ve seen, this page contains two Regions :
- a Form on the FIXTURES table
- an Interactive Grid on the MATCH_EVENTS table
Before we go any further, we need to make the Form useable when updating the base table records.
Remember, T1_PTS and T2_PTS in the FIXTURES table are virtual columns. At the moment, the Form will try to set these values on update, resulting in :

Also, the only values we want to change are in the *_SCORE and *_PENS columns – i.e.
- P3_T1_SCORE
- P3_T2_SCORE
- P3_T1_PENS
- P3_T2_PENS
So let’s make everything else Query Only :

We only want to update records from this Region. We do not want to Create or Delete any.
Therefore, we need to remove the DELETE and CREATE buttons from the page.
This is done by simply right-clicking the buttons in the Page Designer and selecting Delete from the pop-up menu.
The items we don’t want displayed are :
- P3_T1_DRAW_CID
- P3_T2_DRAW_CID
- P3_WINNER_DRAW_CID
- P3_LOSER_DRAW_CID
- P3_T1_PTS
- P3_T2_PTS
To hide them we simply go to Identification and set the Type to Hidden.
The items we do want to display are :
- P3_MATCH_DATE
- P3_STAGE_CID (Tournament Stage )
- P3_GROUP_CID (Group Stage Group – only for Group Stage fixtures)
- P3_T1_CID (Team 1 )
- P3_T2_CID (Team 2 )
- P3_T1_SCORE – Team 1 Goals Scored
- P3_T2_SCORE – Team 2 Goals Scored
For Knockout fixtures only :
- P3_T1_PENS – Team 1 Shootout Penalties Scored
- P3_T2_PENS – Team 2 Shootout Penalties Scored
For the fields we want to display but not update, we need to change the properties as follows…
Identification/Type : Display Only
For P3_STAGE_CID, P3_T1_CID and P3_T2_CID, the value shown is derived from an LOV.
To retain this functionality :
In Settings/Based On : Display Value of List of Values
In List of Values choose :
Type : Shared Component
For P3_STAGE_CID :
List of Values : STAGES.STAGE_NAME
For P3_T1_CID and P3_T2_CID :
List of Values : TEAMS.TEAM_NAME
For all of them :
Display Extra Values : Toggle Off
For the layout changes, all displayed fields have been changed to :
Appearance/Template : Optional
Next, I want to display some fields only under certain conditions.
I only want to show P3_GROUP_CID for a Group Stage fixture.
I only want to show the *PENS fields for a Knockout Stage fixture.
I can do this with a Server-Side Condition property.
For P3_GROUP_CID :
Type : Item is NOT NULL
Item : P3_GROUP_CID
For The *PENS fields :
Type : Item Is NULL
Item : P3_GROUP_CID
Our form now looks like this :

Rather than having each field on its own line, I want to display the Match Date, Tournament Stage and (if applicable) the Group on the first line.
To do this, I can simply toggle Off the Layout/Start New Row property for P3_STAGE_CID and P3_GROUP_CID.
Whilst I’m at it, I’ll remove the labels for P3_MATCH_DATE and P3_STAGE_CID as they are now a bit superfluous.
The next line will be the teams
For P3_T2_CID set Layout/Label Column Span : 0
This stops the field justifying to the right.
Then Goals
Set P3_T1_SCORE Label to Goals
Set P3_T2_SCORE Label to be blank
Then Penalties
Set P3_T1_PENS Label to Shootout Penalties
Set P3_T2_PENS Label to blank
For the *SCORE and *PENS columns set :
Appearance/Width : 3
Finally, I’ve updated the label on the Save button to say Save rather than Apply Changes.
Match Details Interactive Grid
Being an Interactive Grid (IG) the layout requires a bit less tweaking than for a Form.
For EVENT_TYPE_CID, we need to use a List of Values based on the MATCH_EVENT_TYPES table.
First, set Identification/Type : Select List
In the List of Values properties :
Type : SQL Query
SQL Query is :
select event_name as display,
cid as actual
from match_event_types
order by event_name
Display Extra Values : Toggle Off
Display Null Value : Toggle Off
NOTE – I know this is adding to our pile of technical debt and that it would be better to create the LOV as a shared component in case it is needed elsewhere in the application in future. I’ve created the LOV “in-line” here because, to be honest, I can never remember which way round the display and actual value columns should go and it’s quicker to muck about with it here than as a shared component.
Look, I’ll do it later. Promise.
Anyhow, the EVENT_TYPE_CID column can now only be populated with a value from the LOV :

PLAYER_NAME is currently a Textarea so we need to change it to a Type of Text Field.
Also, I want to ensure that the player name is entered in Uppercase so that I don’t have to worry about mixed case records when composing the SQL Query to show the Tournament top scorers. To achieve this :
Settings/Text Case : Upper
PLAYER_TEAM_CID should also be an LOV but should only be Team 1 or Team 2 for the Fixture currently in context.
Once again, we set the Type to Select List.
In the List of Value properties :
Type : SQL Query
This time, the query references values in the items on the current page :
select team_name as display,
cid as actual
from teams
where cid in (:P3_T1_CID, :P3_T2_CID)
Display Extra Values : Toggle Off
Display Null Value : Toggle Off

On this occasion, I’d argue that specifying the LOV in the column properties is reasonable, as it is specific to the FIXTURES record currently in context.
That brings us to MATCH_MINUTE, which is a little more complex than you might think at first glance.
Timekeeping in Association Football is done a bit differently from other sports. Rather than stopping the clock when play is held up for any reason, time is added to the end of a half. By convention this Added Time is recorded as a “+” figure appended to the last minute of the half e.g. 45+4 = 4th minute of added time at the end of the first half.
The second half begins counting from 46 so, in game terms, the 46th minute is chronologically after 45+4.
To make sure that our Match Events are displayed in chronological order, I need to set the Match Details Region Order By Clause to :
case instr(match_minute, '+',1,1)
    when 0 then to_number(match_minute)
    else to_number(substr(match_minute,1,instr(match_minute, '+',1,1) -1))
        + (to_number(substr(match_minute,-1, instr(match_minute, '+',-1,1))) *.01)
end
Note that the above is formatted for readability.
The Order By Clause field appears to be limited to 256 characters so I had to input this code on a single line.
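As a quick sanity check (not from the original post, just a sketch using some made-up minute values), you can see the expression putting added time in the right place :
with sample_minutes as (
    select '44' as match_minute from dual union all
    select '45+4' from dual union all
    select '46' from dual)
select match_minute
from sample_minutes
order by
    case instr(match_minute, '+',1,1)
        when 0 then to_number(match_minute)
        else to_number(substr(match_minute,1,instr(match_minute, '+',1,1) -1))
            + (to_number(substr(match_minute,-1, instr(match_minute, '+',-1,1))) *.01)
    end
/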
The final step is to add some highlights for Yellow Cards, Red Cards and Goals.
This can all be done from the Actions Menu in the runtime.
The result looks like this :

So, with a reasonable Physical Data Model and very little additional code, we’ve managed to produce a serviceable two-screen User Interface.
Who are you calling “Boomer” ?
Flipping Plans ! Retrieving past execution plans from AWR
SQL is a special programming language.
It’s quite possible that I’ll be interrupted at this point by some Tech Bro arguing that SQL is not a programming language actually. The phrase “Turing Complete” may already be heading my way.
To save time, I’ll simply observe that I get paid folding money to write SQL so I don’t really care whether it lacks some of the properties of Powerpoint (e.g. being Turing Complete).
As I was saying, SQL is almost unique among modern programming languages in that it is a 4GL.
In other words it is declarative – you use it to describe the information you want to retrieve. It is then up to the database engine to figure out how best to fulfill your requirements.
Oracle likes to remind me of this every so often when, after months of efficient and trouble-free running, it suddenly gets bored and chooses a new and exciting execution plan which causes a query to take hours rather than minutes.
What follows is a short exploration of how we can tell this is happening and how to retrieve current and previous execution plans for a SQL statement from Oracle’s Active Workload Repository (AWR).
Before we go on, however…
WARNING – Licensing
Before you begin exploring AWR, please make sure that you have a license for the Oracle Diagnostics Pack ( or the Diagnostics and Tuning Pack in earlier Oracle versions).
The documentation for every Oracle version includes details of the licensing approach – here’s the one for 19c, which is the version I’ll be using for this post.
Environment and Setup
In an appropriately licensed production environment, it’s usual for AWR to have been configured already. However, I’m using an Oracle Developer Day Virtual Box Image running 19c Enterprise Edition, so I’ll need to do some tweaking.
First of all (connected as SYS as SYSDBA on the PDB), I need to check the AWR parameters :
show parameter awr NAME TYPE VALUE --------------------------- ------- ----- awr_pdb_autoflush_enabled boolean FALSE awr_pdb_max_parallel_slaves integer 10 awr_snapshot_time_offset integer 0
I can see that I need to enable the autoflush :
alter system set awr_pdb_autoflush_enabled = true; System SET altered
At present, the interval between AWR snapshots is a bit lengthy :
select * from cdb_hist_wr_control;

We’re going to shorten this interval to 10 minutes. Also, we’re going to record details of as many statements as possible in the snapshot :
begin
dbms_workload_repository.modify_snapshot_settings(
interval => 10,
topnsql => 'MAXIMUM');
end;
/

Even with AWR configured like this, it’s not guaranteed to capture every single statement run in a snapshot so I need to create a sizeable table to make sure my queries make the cut.
DBA_OBJECTS contains around 79K rows so I’ll use that as a basis for my table :
create table chunky as
    select * from dba_objects;

begin
    for i in 1..100 loop
        insert into chunky
        select * from dba_objects;
        -- commit after each iteration as we're a bit tight
        -- on resources
        commit;
    end loop;
end;
/

create index chunky_owner on chunky(owner);

exec dbms_stats.gather_table_stats(user, 'CHUNKY');
CHUNKY contains 7.9 million rows.
Now that’s done, we should be able to start the “slideshow” …
Simulating a Plan Flip
I’m going to start with a new snapshot…
exec dbms_workload_repository.create_snapshot;
Now to run a query against the table. In order to make it easier to find, I’ll pretend that I’m working in a Turing Complete medium…
select /* slide 1 */ *
from chunky
where owner = 'HR';
Now that I’ve executed the query, I should be able to find its SQL_ID :
select sql_id, sql_text
from v$sql
where sql_text like 'select%/*%slide 1%*/%'
and sql_text not like '%v$sql%'
/
SQL_ID SQL_TEXT ------------- ------------------------------------------------------ 9nwrantgwhcta select /* slide 1 */ * from chunky where owner = 'HR'
Before we run the query again, let’s ensure that the optimizer uses a different plan for the next execution by removing the index :
alter index chunky_owner invisible;
To simulate a separate batch run, the next run will be in a new snapshot :
exec dbms_workload_repository.create_snapshot;
select /* slide 1 */ *
from chunky
where owner = 'HR';
exec dbms_workload_repository.create_snapshot;
The two runs should now be in separate AWR snapshots.
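If you want to confirm that the runs really have landed in separate snapshots (a quick check of my own; snapshot ids will obviously differ on your system), you can list them :
select snap_id, begin_interval_time, end_interval_time
from dba_hist_snapshot
order by snap_id
/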
Identifying an Execution Plan Change
It’s the morning after the overnight batch run and I’m trying to figure out why my job has taken much longer than normal.
Thanks to this extremely useful query from Kerry Osborne, I can see that the execution plan has changed from when the query was first captured in Snapshot 29 to the latest execution in Snapshot 32 :
select
snap.snap_id,
snap.instance_number,
begin_interval_time,
sql_id,
plan_hash_value,
nvl(executions_delta,0) execs,
(elapsed_time_delta/decode(nvl(executions_delta,0),0,1,executions_delta))/1000000 avg_etime,
(buffer_gets_delta/decode(nvl(buffer_gets_delta,0),0,1,executions_delta)) avg_lio
from dba_hist_sqlstat stat, dba_hist_snapshot snap
where sql_id = '9nwrantgwhcta'
and snap.snap_id = stat.snap_id
and snap.instance_number = stat.instance_number
and executions_delta > 0
order by 3
/

The component lines of these execution plans can be found in DBA_HIST_SQL_PLAN :
select *
from dba_hist_sql_plan
where sql_id = '9nwrantgwhcta'
/
Alternatively, if you’d like to see the plans nicely formatted, you can use DBMS_XPLAN.DISPLAY_WORKLOAD_REPOSITORY.
A couple of points to note here :
- the DISPLAY_AWR function is deprecated from 12c onwards in favour of this function
- if, as in this case, we’re viewing the contents of a Pluggable Database Repository then we need to specify this in the AWR_LOCATION parameter.
Other than that, we simply need to provide the SQL_ID and PLAN_HASH_VALUE.
For the current plan (in SNAP_ID 32 above) :
select *
from table(
dbms_xplan.display_workload_repository(
sql_id => '9nwrantgwhcta',
plan_hash_value => 1105471336,
awr_location => 'AWR_PDB'));

We can see the old plan ( from SNAP_ID 29) by running :
select *
from table(
dbms_xplan.display_workload_repository(
sql_id => '9nwrantgwhcta',
plan_hash_value => 317341109,
awr_location => 'AWR_PDB'));

As well as the aforementioned article by Kerry Osborne, I also found the following quite useful :
Connor McDonald’s Superhero alter ego provides an overview of how AWR works here
Generating Journal Triggers with help from DBMS_UTILITY
It’s good to see modern working practices being implemented in Government, especially in Downing Street, where they seem to have embraced the idea of hot-desking.
Not to say there isn’t room for improvement. In the recent leadership election, candidates claimed a total of 380 nominations…from an electorate of 357. Given the state of the Economy, you do rather hope they chose the candidate that can count.
In order to distract myself from the ongoing shenanigans in Whitehall, I’m going to address myself to one of the regular chores facing anyone working on a new Data Warehousing application – implementing journal triggers on your mapping tables.
Any variation on the theme of Star Schema is likely to result in numerous mapping tables, the contents of which will affect how the application behaves.
Recording DML changes to data in these tables is therefore important to maintain an audit trail, or even just to aid debugging.
Rather than writing nearly identical triggers for each of these tables, wouldn’t it be better if we could automate the process to some degree ?
As well as the usual data-dictionary lookups, I’m going to keep things interesting by :
- creating a journal table for the trigger to insert into
- ensuring consistent formatting (canonicalization) of “key” character values – by making them all uppercase
- keeping my dynamic SQL statements readable by using a template
- converting comma-separated lists to PL/SQL tables and back again with DBMS_UTILITY
The trigger will need to do the following :
- capture an updated record before the update is applied.
- capture a deleted record
- ensure that “key” values on an inserted or updated record are canonicalized
Note that we won’t be recording newly inserted records as they’ll be available in the base table anyway.
Also, in this context “key” columns are those that are either used in the predicate of a lookup or returned as the result of a lookup rather than necessarily being part of a physical key in the lookup table.
We’ll need a journal table to write to. This table will contain all of the non-virtual columns of the base table plus a column for the DML operation being performed, the user (or os_user) and a timestamp.
Using “<>” to signify placeholders, a trigger implementing this functionality would look something like this :
create or replace trigger <trigger_name>
before insert or update or delete on <base_table>
for each row
declare
v_user varchar2(4000);
v_operation varchar2(25);
begin
v_user := sys_context('userenv', 'os_user');
v_operation := case when inserting then 'INSERT' when updating then 'UPDATE' else 'DELETE' end;
if inserting or updating then
<canonicalize_cols>
end if;
if updating or deleting then
insert into <jnl_table> (
operation, os_user, change_ts,
<col_list>)
values( v_operation, v_user, systimestamp,
<old_list>);
end if;
end;
I know what you’re thinking, that’s going to look messy once we start building the actual statements dynamically to create each trigger. After all, we’re building a DDL statement, which means that EXECUTE IMMEDIATE will not be accepting bind variables. Therefore we’ll need to do lots of concatenation.
Well, maybe not.
Replacing the placeholders with actual values rather than concatenating them in, we should be able to keep our code fairly readable.
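As a minimal sketch of the idea (the template and table names here are made up purely for illustration) :
declare
    -- an illustrative template with <placeholders>
    v_template varchar2(4000) :=
        'create table <jnl_table> as select * from <base_table> where 1=0';
    v_stmnt varchar2(4000);
begin
    v_stmnt := replace(v_template, '<jnl_table>', 'some_table_jnl');
    v_stmnt := replace(v_stmnt, '<base_table>', 'some_table');
    dbms_output.put_line(v_stmnt);
end;
/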
Before we get to that though, we really should look at …
Generating the Journal Table
The Journal tables consist of the permanent columns from the base table plus
- OPERATION – to identify the DML operation for which the record was created
- OS_USER – the Operating System user performing the operation
- CHANGE_TS – the timestamp of the record
Now, we could use the data dictionary to painstakingly reconstruct the required columns.
Alternatively, we can simply do this :
create table <jnl_table> as
select
rpad(' ', 6) as operation,
rpad(' ', 4000) as os_user,
systimestamp as change_ts,
t.*
from <base_table> t
where 1=0
/
The rpad for OPERATION and OS_USER will cause those columns to be defined as varchar2 and large enough to store the padded string.
At this point it’s probably a good idea to look at the base table we’ll be using in the examples that follow :
create table cabinet_members(
office varchar2(500),
first_name varchar2(255),
last_name varchar2(255) not null,
email varchar2(500),
start_date date,
end_date date,
constraint cabinet_members_pk primary key (office, start_date))
/
Using this as an example, the journal table we create using this technique would look like this :
create table cabinet_members_jnl as
select rpad(' ', 25) as operation,
rpad(' ', 4000) as os_user,
systimestamp as change_ts,
t.*
from cabinet_members t
where 1=0
/
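If you want to double-check the column definitions that result (again, just a quick look at the data dictionary rather than part of the original steps) :
select column_name, data_type, data_length
from user_tab_columns
where table_name = 'CABINET_MEMBERS_JNL'
order by column_id
/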

The list of columns we need to canonicalize will be passed into our journal trigger generation procedure as a comma-delimited list at runtime.
The columns to insert into can be retrieved from the data dictionary.
For CABINET_MEMBERS, the columns to canonicalize are :
office, first_name, last_name
For each element of this list, we’ll want to :
- make sure that it is the name of a column in the base table
- generate code to uppercase the new value that’s being inserted or updated
If you’re looking for a simple way to loop through this comma-delimited list that doesn’t involve too much typing, DBMS_UTILITY.COMMA_TO_TABLE can help. For example :
clear screen
set serverout on
declare
v_col_list varchar2(4000) := 'office, first_name, last_name';
v_col_arr dbms_utility.lname_array;
v_tab_count binary_integer;
begin
dbms_utility.comma_to_table(
list => v_col_list,
tablen => v_tab_count,
tab => v_col_arr);
dbms_output.put_line('Table populated with '||v_tab_count||' elements');
for i in 1..v_tab_count loop
dbms_output.put_line(trim(v_col_arr(i)));
end loop;
end;
/
Run this and you’re rewarded with…
Table populated with 3 elements office first_name last_name PL/SQL procedure successfully completed.
However, things are not quite as they seem.
Let’s try running that script again, but this time add a second loop using the COUNT attribute of the PL/SQL table variable as well as the tablen parameter output value from the procedure call…
clear screen
set serverout on
declare
v_col_list varchar2(4000) := 'office, first_name, last_name';
v_col_arr dbms_utility.lname_array;
v_tab_count binary_integer;
begin
dbms_utility.comma_to_table(
list => v_col_list,
tablen => v_tab_count,
tab => v_col_arr);
dbms_output.put_line('Table populated with '||v_tab_count||' elements');
for i in 1..v_tab_count loop
dbms_output.put_line(i||' - '||trim(v_col_arr(i)));
end loop;
-- NOTE - we can use v_col_arr.count but this includes an empty element...
dbms_output.put_line(q'[...and here's all ]'||v_col_arr.count||' of them');
for j in 1..v_col_arr.count loop
dbms_output.put_line(j||' - '||v_col_arr(j));
end loop;
end;
/
Table populated with 3 elements 1 - office 2 - first_name 3 - last_name ...and here's all 4 of them 1 - office 2 - first_name 3 - last_name 4 - PL/SQL procedure successfully completed.
In our case, we’ll want to use the element count provided by the procedure call rather than the table variable attribute.
For the columns in the Journal table that we want to insert into, we could just use a listagg in the query against the dictionary, but that would miss a perfect opportunity to demonstrate DBMS_UTILITY.TABLE_TO_COMMA…
declare
v_tab_cols_arr dbms_utility.lname_array;
v_tablen binary_integer;
v_col_list varchar2(32767);
begin
select column_name
bulk collect into v_tab_cols_arr
from user_tab_columns
where table_name = 'CABINET_MEMBERS'
order by column_id;
dbms_output.put_line(v_tab_cols_arr.count||' columns in table');
dbms_utility.table_to_comma(
tab => v_tab_cols_arr,
tablen => v_tablen,
list => v_col_list);
dbms_output.put_line(v_col_list);
end;
/
6 columns in table OFFICE,FIRST_NAME,LAST_NAME,EMAIL,START_DATE,END_DATE PL/SQL procedure successfully completed.
The Journal Trigger Generator
Now we have all of the elements we require to populate our placeholder strings, let’s see what the final package might look like :
create or replace package generate_jnl_trigger
authid current_user
as
procedure build_trigger(
i_base_table in varchar2,
i_jnl_table in varchar2 default null,
i_trigger_name in varchar2 default null,
i_canonicalize_cols in varchar2 default null);
end;
/
create or replace package body generate_jnl_trigger
as
function check_object_name( i_name in varchar2)
return boolean
is
begin
-- Ensure that object name does not contain any suspicious characters
return nvl(length(regexp_replace(i_name, '[[:alnum:]]|[$_#]')), 0) = 0;
end check_object_name;
procedure build_trigger(
i_base_table in varchar2,
i_jnl_table in varchar2 default null,
i_trigger_name in varchar2 default null,
i_canonicalize_cols in varchar2 default null)
is
v_base_table user_tables.table_name%type;
v_tab_name_max_len pls_integer;
v_jnl_table varchar2(4000);
v_trigger varchar2(4000);
v_jnl_tab_ddl_stmnt clob;
v_stmnt clob;
v_canon_count binary_integer;
v_canon_arr dbms_utility.lname_array;
v_canon_stmnt clob;
v_target_col_arr dbms_utility.lname_array;
v_ins_cols varchar2(32767);
v_old_cols varchar2(32767);
C_TRIGGER_TEMPLATE constant clob :=
q'[
create or replace trigger <trigger_name>
before insert or update or delete on <base_table>
for each row
declare
v_user varchar2(4000);
v_operation varchar2(25);
begin
v_user := sys_context('userenv', 'os_user');
v_operation := case when inserting then 'INSERT' when updating then 'UPDATE' else 'DELETE' end;
if inserting or updating then
<canonicalize_cols>
end if;
if updating or deleting then
insert into <jnl_table> (
operation, os_user, change_ts,
<col_list>)
values( v_operation, v_user, systimestamp,
<old_list>);
end if;
end; ]';
C_JNL_TAB_DDL_TEMPLATE constant clob :=
q'[
create table <jnl_table> as
select
rpad(' ', 6) as operation,
rpad(' ', 4000) as os_user,
systimestamp as change_ts,
t.*
from <base_table> t
where 1=0]';
C_CANON_TEMPLATE constant varchar2(4000) := ':new.<col_name> := upper(:new.<col_name>);';
begin
if i_base_table is null then
raise_application_error(-20000, 'Base Table must be specified.');
end if;
-- Make sure that we've been passed the name of an existing table
v_base_table := dbms_assert.sql_object_name(i_base_table);
--
-- Determine the maximum length of a table_name in this Oracle version
--
select data_length
into v_tab_name_max_len
from all_tab_columns
where table_name = 'USER_TABLES'
and column_name = 'TABLE_NAME';
v_trigger := nvl(i_trigger_name, substr(v_base_table,1,(v_tab_name_max_len -4))||'_TRG');
if not check_object_name(v_trigger) then
raise_application_error(-20010, 'Trigger Name '||v_trigger||' not allowed');
end if;
v_jnl_table := nvl(i_jnl_table, substr(v_base_table,1, (v_tab_name_max_len -4))||'_JNL');
if not check_object_name(v_jnl_table) then
raise_application_error(-20020, 'Journal Table Name '||v_jnl_table||' not allowed');
end if;
-- Create the Journal table
-- Replace the placeholder text in the template with the actual values.
-- Doing this one placeholder at-a-time for readability
v_jnl_tab_ddl_stmnt := replace(C_JNL_TAB_DDL_TEMPLATE, '<jnl_table>', v_jnl_table);
v_jnl_tab_ddl_stmnt := replace(v_jnl_tab_ddl_stmnt, '<base_table>', v_base_table);
execute immediate v_jnl_tab_ddl_stmnt;
v_stmnt := replace(C_TRIGGER_TEMPLATE, '<trigger_name>', v_trigger);
v_stmnt := replace(v_stmnt, '<base_table>', i_base_table);
v_stmnt := replace(v_stmnt, '<jnl_table>', v_jnl_table);
if i_canonicalize_cols is not null then
dbms_utility.comma_to_table( i_canonicalize_cols, v_canon_count, v_canon_arr);
-- Remember to use the count returned from the procedure...
for i in 1..v_canon_count loop
v_canon_stmnt := v_canon_stmnt ||
replace(C_CANON_TEMPLATE, '<col_name>', trim(v_canon_arr(i)))||chr(10);
end loop;
v_stmnt := replace(v_stmnt, '<canonicalize_cols>', v_canon_stmnt);
else
-- Just do nothing
v_stmnt := replace(v_stmnt, '<canonicalize_cols>', 'null;');
end if;
select column_name
bulk collect into v_target_col_arr
from user_tab_columns
where table_name = upper(i_base_table)
order by column_id;
-- We know that these columns belong to the base table because we've just got them from the data dictionary.
-- However, we want to double check there's nothing fishy about them.
-- Check that they contain only valid characters ( which may not be the case if the column names are quoted).
-- I know I don't have any quoted column names in my data model...
for i in 1..v_target_col_arr.count loop
if not check_object_name(v_target_col_arr(i)) then
raise_application_error(-20030, 'Column name contains invalid characters : '||v_target_col_arr(i));
end if;
v_ins_cols := v_ins_cols||v_target_col_arr(i)||', ';
v_old_cols := v_old_cols||':old.'||v_target_col_arr(i)||', ';
end loop;
v_ins_cols := rtrim(v_ins_cols, ', ');
v_old_cols := rtrim(v_old_cols, ', ');
v_stmnt := replace(v_stmnt, '<col_list>', v_ins_cols);
v_stmnt := replace(v_stmnt, '<old_list>', v_old_cols);
execute immediate(v_stmnt);
end build_trigger;
end;
/
If we execute this for CABINET_MEMBERS ( having first dropped the CABINET_MEMBERS_JNL table we created earlier)…
begin
generate_jnl_trigger.build_trigger(
i_base_table => 'cabinet_members',
i_canonicalize_cols => 'office, first_name, last_name');
end;
/
…the Journal table is created, together with a DML trigger on the table :
create or replace trigger cabinet_members_TRG
before insert or update or delete on cabinet_members
for each row
declare
v_user varchar2(4000);
v_operation varchar2(25);
begin
v_user := sys_context('userenv', 'os_user');
v_operation := case when inserting then 'INSERT' when updating then 'UPDATE' else 'DELETE' end;
if inserting or updating then
:new.office := upper(:new.office);
:new.first_name := upper(:new.first_name);
:new.last_name := upper(:new.last_name);
end if;
if updating or deleting then
insert into cabinet_members_JNL (
operation, os_user, change_ts,
OFFICE, FIRST_NAME, LAST_NAME, EMAIL, START_DATE, END_DATE)
values( v_operation, v_user, systimestamp,
:old.OFFICE, :old.FIRST_NAME, :old.LAST_NAME, :old.EMAIL, :old.START_DATE, :old.END_DATE);
end if;
end;
Let’s run some DML on the table to check the effect :
insert into cabinet_members (
office,
first_name,
last_name,
email,
start_date,
end_date)
values(
'Prime Minister',
'Boris',
'Johnson',
'bigdog@partyhq.co.uk',
date '2019-07-24',
date '2022-09-05');
commit;
insert into cabinet_members (
office,
first_name,
last_name,
email,
start_date,
end_date)
values(
'pRImE mINisteR',
'Liz',
'Truss',
'liz@lettuce.org',
date '2022-09-05',
date '2022-10-25');
commit;
insert into cabinet_members (
office,
first_name,
last_name,
email,
start_date )
values(
'prime minister',
'rishi',
'sunak',
'rishi@tolduso.com',
date '2022-10-25');
commit;
We can see that the office, first_name and last_name values have been canonicalized :
select office, first_name, last_name
from cabinet_members
where office = 'PRIME MINISTER'
order by end_date nulls last
/
OFFICE FIRST_NAME LAST_NAME
-------------------- ------------------------------ ------------------------------
PRIME MINISTER BORIS JOHNSON
PRIME MINISTER LIZ TRUSS
PRIME MINISTER RISHI SUNAK
If we now do some tidying up…
delete from cabinet_members
where end_date is not null;
commit;
…we can see that these deletes have been recorded in the Journal table :
select * from cabinet_members_jnl;

Saving the World from Fat-finger moments – with regexp_like
It’s not uncommon for a database application to have its behaviour defined, to an extent at least, by records in reference data tables.
By its nature, this data is static and the tables in which it resides tend to contain comparatively few rows. However, such an approach can be susceptible to erroneous data entry, especially where key values are concerned.
Having spent many an “entertaining” afternoon/evening/dead-of-night, trying to hunt down some mystery bug, only to find that one of these values includes an extraneous space or invisible control character, I’ve come to appreciate the ability of regexp_like to point these things out.
The code examples that follow should be based on some sensible data set, probably from the HR demo schema. However, Dr Fatfinger does sound rather like a Bond villain…
Over the years, MI6 has built up a list of individuals whose activities bear close scrutiny.
They want to know if any of these people start bulk buying piranha food, or looking at hollowed out volcanoes on Zoopla :
create table megalomaniacs (
evil_genius varchar2(100),
start_date date,
constraint megalomaniacs_pk primary key (evil_genius))
/
insert into megalomaniacs( evil_genius, start_date)
values('DOCTOR NO', to_date('01-OCT-1962', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('ERNST STAVRO'||chr(256)||' BLOFELD', to_date('10-OCT-1963', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values(' AURIC GOLDFINGER', to_date('17-SEP-1964', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('EMILIO LARGO', to_date('09-DEC-1965', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('DOCTOR KANANGA', to_date('27-JUN-1973', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('FRANCISCO SCARAMANGA', to_date('19-DEC-1974', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('KARL STROMBERG', to_date('7-JUL-1977', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('HUGO DRAX ', to_date('26-JUN-1979', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('ARISTOTLE KRISTATOS', to_date('24-JUN-1981', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('KAMAL KHAN', to_date('06-JUN-1983', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('MAX ZORIN', to_date('22-MAY-1985', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('GENERAL KOSKOV', to_date('29-JUN-1987', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('FRANZ SANCHEZ', to_date('13-JUN-1989', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('ALEC TREVELYAN', to_date('13-NOV-1995', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('ELLIOT CARVER', to_date('09-DEC-1997', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('ELEKTRA KING', to_date('08-NOV-1999', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('COLONEL TAN-SUN MOON', to_date('20-NOV-2002', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('MR WHITE', to_date('14-NOV-2006', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('DOMINIC GREEN', to_date('20-OCT-2008', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('RAOUL SILVA', to_date('23-OCT-2012', 'DD-MON-YYYY'));
insert into megalomaniacs( evil_genius, start_date)
values('LYUTSIFER SAFIN ', to_date('28-SEP-2021', 'DD-MON-YYYY'));
commit;
However, some of these people are slipping through the net…
select evil_genius, to_char(start_date, 'DD-MON-YYYY') as start_date
from megalomaniacs
where evil_genius in ( 'MR WHITE', 'LYUTSIFER SAFIN', 'AURIC GOLDFINGER', 'ERNST STAVRO BLOFELD');
EVIL_GENIUS START_DATE
------------------------------ --------------------
MR WHITE 14-NOV-2006
1 row selected.
We suspect the handiwork of Dr Fatfinger, possibly through the activities of those notorious henchpeople, Copy and Paste.
Fortunately, we can use a regexp to identify any records that contain :
- a leading or trailing non-printing character
- a control character
select evil_genius, length( evil_genius),
to_char(start_date, 'DD-MON-YYYY') as start_date
from megalomaniacs
where regexp_like(evil_genius, '^[[:space:]]|[[:cntrl:]]|[[:space:]]$', 'i');
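Once the offending rows have been identified, a clean-up might look something like the following. This is just a sketch, assuming the fix is simply to trim leading and trailing spaces and strip out any control characters – do check what the rogue characters actually are before running anything like this :
update megalomaniacs
set evil_genius = trim( regexp_replace( evil_genius, '[[:cntrl:]]', ''))
where regexp_like(evil_genius, '^[[:space:]]|[[:cntrl:]]|[[:space:]]$', 'i');

commit;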

Linux – Checking that two files contain the same data but in a different order – Sorted!
Trying to regression test a change to a feed-file generation program can be tricky. Whether the format is CSV or some fashionable markup language, the ordering of the result set tends to be unimportant in such circumstances.
When testing, we need to verify that the files produced by the new version of the program contain the same data as those produced by the old version, irrespective of the order in which the records are written.
Recently, I was rescued from my struggle with just such a problem by my colleague, Don Thomson, who imparted some (Linux) Jedi wisdom resulting in a simple yet effective solution, involving an inventive combination of Linux utilities.
What we’re going to look at here is :
- comparing files with diff
- using sort to give diff a hand
- comparing sets of files in different directories using sum
- ignoring trailer records in delimited files using grep or head
The first file we’ll use in our example is called episodes.txt and contains the following :
one
two
three
four
five
six
seven
eight
nine
The file we’re comparing it to is called releases.txt :
four
five
six
one
two
three
seven
eight
nine
As you can see, the files contain the same data but the first is in numerical order and the second is the result of what may be considered a “Skywalker” sort.
Predictably, diff decides that they are not identical :
diff episodes.txt releases.txt
1,3d0
< one
< two
< three
6a4,6
> one
> two
> three
Before we go any further, let’s use some switches to minimize the diff output as, for the purposes of this exercise, we just want to tell whether or not the files are the same.
diff -qs episodes.txt releases.txt
Files episodes.txt and releases.txt differ
Fortunately, Don knows just the thing to help us perform a data – rather than line-by-line – comparison…
Sorting it out
Let’s see what happens when we use the sort command on episodes.txt :
sort episodes.txt
eight
five
four
nine
one
seven
six
three
two
Interesting. It seems to have sorted the file contents ( “data”) into alphabetical order. Let’s see what it does with releases.txt :
sort releases.txt
eight
five
four
nine
one
seven
six
three
two
That’s useful. The output is identical. Now we just need to pass the sort output for each file to diff.
We can do this using process substitution, with each sort running in its own sub-shell. As we’re running in Bash, the syntax for this is :
diff -qs <(sort episodes.txt) <(sort releases.txt)
Files /dev/fd/63 and /dev/fd/62 are identical
Note that the filenames in the output are the file descriptors through which diff reads the output (stdout) of each sub-shell.
Just to prove that this solution does detect when the rows of data are different in the files, let’s introduce a “rogue” one…
echo 'three-and-a-bit' >>episodes.txt
diff -qs <(sort episodes.txt) <(sort releases.txt)
Files /dev/fd/63 and /dev/fd/62 differ
Comparing two sets of files in different directories with a checksum
Whilst this approach works really well for comparing two files, you may find that it’s not that quick when you’re comparing a large number of large files. For example, we have a directory containing files that we generated before making any changes to our (fictitious) program :
mkdir baseline_text_files
ls -l baseline_text_files/*.txt
-rw-rw-r-- 1 mike mike 14 Sep 10 13:36 baseline_text_files/originals.txt
-rw-rw-r-- 1 mike mike 14 Sep 10 13:36 baseline_text_files/prequels.txt
-rw-rw-r-- 1 mike mike 17 Sep 10 13:36 baseline_text_files/sequels.txt
The file contents are :
cat originals.txt
four
five
six
cat prequels.txt
one
two
three
cat sequels.txt
seven
eight
nine
Files generated after modifying the program are in the new_text_files directory :
cat originals.txt
four
six
five
cat prequels.txt
three
two
one
cat sequels.txt
eight
nine
seven
Don’s rather neat alternative to diffing each pair of files is to create a checksum for each file and write the output to a temporary file – one per directory. We then just diff the two temporary files.
There are a number of utilities you can use to do this and the complexity of the checksum algorithm used may impact the runtime for a large number of files.
In light of this, we’ll be using sum, which seems to be the simplest and therefore (in theory) the fastest.
A quick test first :
sum baseline_text_files/originals.txt
45749 1
The first number is the checksum. The second is the file block count.
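Combining sum with sort gives us an order-independent fingerprint for a file’s data. For example (the checksum values themselves will vary) :
sort baseline_text_files/originals.txt | sum
sort new_text_files/originals.txt | sum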
Now we’ve identified the required utilities, this script should do the job. I’ve called it data_diff.sh and you can download it from Github should you feel inclined to do so. The link is here.
#!/bin/sh
# Compare the data in the files in two directories, irrespective of row order
orig_dir=$1
new_dir=$2
TMPFILE1=$(mktemp)
TMPFILE2=$(mktemp)
# delete the temporary files before exiting, even if we hit an error
trap 'rm -f $TMPFILE1 $TMPFILE2' exit
# checksum of the sorted contents of each file in the baseline directory...
for file in $orig_dir/*
do
sort $file |sum >> $TMPFILE1
done
# ...and in the new directory
for file in $new_dir/*
do
sort $file|sum >>$TMPFILE2
done
diff -qs $TMPFILE1 $TMPFILE2
is_same=$?
if [ $is_same -eq 1 ]
then
echo 'Files do not match'
else
echo 'Files are identical'
fi
Run this and we get :
sh data_diff.sh baseline_text_files new_text_files
Files /tmp/tmp.OshLmwGL0J and /tmp/tmp.Uz2mUa0SSY are identical
Files are identical
If we introduce a difference in one of the existing files…
echo 'SOLO' >>new_text_files/sequels.txt
sh data_diff.sh baseline_text_files new_text_files
Files /tmp/tmp.4OGnjQls0S and /tmp/tmp.L7OUZyGUzl differ
Files do not match
Unsurprisingly, the script will also detect a difference if we’re missing a file…
touch baseline_text_files/tv_series.txt
sh data_diff.sh baseline_text_files new_text_files
Files /tmp/tmp.LsCHbhxK1D and /tmp/tmp.358UInXSJX differ
Files do not match
Ignoring Trailer Records in delimited files
With text delimited files, it’s common practice to include a trailer record at the end of the file to confirm it is complete.
This record will typically include the date (or timestamp) of when the file was created.
Such a file, in the baseline_files directory, might look like this :
cat baseline_files/episodes.csv
HEADER|episode|title|release_year
I|The Phantom Menace|1999
II|Attack of the Clones|2002
III|Revenge of the Sith|2005
IV|A New Hope|1977
V|The Empire Strikes Back|1980
VI|Return of the Jedi|1983
VII|The Force Awakens|2015
VIII|The Last Jedi|2017
IX|The Rise of Skywalker|2019
TRAILER|9|20220903
The trailer in the corresponding file in the new directory includes a different date :
cat new_files/episodes.csv
HEADER|episode|title|release_year
I|The Phantom Menace|1999
II|Attack of the Clones|2002
III|Revenge of the Sith|2005
IV|A New Hope|1977
V|The Empire Strikes Back|1980
VI|Return of the Jedi|1983
VII|The Force Awakens|2015
VIII|The Last Jedi|2017
IX|The Rise of Skywalker|2019
TRAILER|9|20220904
To accurately compare the data in these files, we’ll need to ignore the trailer record.
Once again, there are numerous ways to do this. We could use :
grep -iv trailer baseline_files/episodes.csv
HEADER|episode|title|release_year
I|The Phantom Menace|1999
II|Attack of the Clones|2002
III|Revenge of the Sith|2005
IV|A New Hope|1977
V|The Empire Strikes Back|1980
VI|Return of the Jedi|1983
VII|The Force Awakens|2015
VIII|The Last Jedi|2017
IX|The Rise of Skywalker|2019
…which would result in our diff looking like this :
diff -qs <(sort baseline_files/episodes.csv|grep -iv trailer) <(sort new_files/episodes.csv| grep -iv trailer)
Alternatively, if we know that the trailer record is always the last line of the file we can use head to output everything apart from the last line :
head -n -1 baseline_files/episodes.csv
HEADER|episode|title|release_year
I|The Phantom Menace|1999
II|Attack of the Clones|2002
III|Revenge of the Sith|2005
IV|A New Hope|1977
V|The Empire Strikes Back|1980
VI|Return of the Jedi|1983
VII|The Force Awakens|2015
VIII|The Last Jedi|2017
IX|The Rise of Skywalker|2019
…which would entail a diff command like this :
diff -qs <(head -n -1 baseline_files/episodes.csv| sort) <(head -n -1 new_files/episodes.csv| sort)
This being Linux, there are probably several more options, but these should cover at least some of the more common circumstances where files need to be compared by data rather than line-by-line.
Excluding English Public Holidays from a DBMS_SCHEDULER Schedule
This week we’re taking on the entire Establishment through the medium of… DBMS_SCHEDULER Calendar Strings.
I should probably explain that, in a normal year, England has eight Public Holidays. Four of these are based around religious festivals. Whilst Christmas Day is always on 25th December, working out when Easter falls requires some complex calculations.
On top of that, there are times when the secular holidays are re-arranged and even days added. These are usually due to some Royal anniversary or event.
What I want to explore is how we can construct DBMS_SCHEDULER calendar strings which include working days ( in this case week-days) but exclude Public Holidays.
The three specific examples I’d like to end up with are :
- a schedule for the first working day of each month
- a schedule for the last working day of each month
- a schedule for the working day on or immediately after the 6th of each month ( which we’ll say is Pay day)
In order to achieve this we’ll :
- explore how the DBMS_SCHEDULER calendar syntax can help us to calculate when each holiday will fall
- write a PL/SQL function to return the date of Easter Sunday for a given year
- create a table to hold the holiday dates and also to allow any changes to be recorded
- write a procedure to maintain the table
- write a procedure to dynamically build and maintain a Schedule of Public Holiday dates
- build the three schedules we want using the Public Holidays Schedule to exclude the Public Holidays.
The code in what follows can be found in this Github repo
Public Holidays in England
The regular Public Holidays in England and Wales (Scotland and Northern Ireland have their own variations) are :
- New Year’s Day Holiday – the first weekday of the year
- Good Friday – the Friday before Easter Sunday
- Easter Monday – the day after Easter Sunday
- May Day – the first Monday in May
- Spring Bank Holiday – the last Monday in May
- August Bank Holiday – the last Monday in August
- Christmas Day Holiday – the first weekday on or after 25th December
- Boxing Day Holiday – the second weekday on or after 25th December
DBMS_SCHEDULER’s calendaring syntax does allow for quite a bit of flexibility and there are several ways to achieve the same result.
This is covered fairly comprehensively in the documentation, but hopefully the following examples will prove useful to future me when I’m skim-reading this in a hurry…
May Day is the first Monday in May :
'freq=yearly; bymonth=5; byday=mon; bysetpos=1'
- bymonth=5 is the 5th month of the Year (May).
- byday=mon – we’re only interested in Mondays
- bysetpos=1 – we want the earliest date that matches
The Spring Bank Holiday is the last Monday in May :
'freq=yearly; bydate=0531-span:7d; byday=mon; bysetpos=1'
- bydate=0531-span:7d – evaluate from the 31st May but go back (-span) up to 7 days
For the August Bank Holiday ( the last Monday in August), let’s try a different approach to the one we’ve used for Spring Bank Holiday :
'freq=monthly; bymonth=8; byday=-1mon'
- byday=-1mon – find the last Monday of the month.
Note that unlike the other calendar strings we’ve used, which have a frequency of yearly, this one is set to monthly. That’s because using yearly with this form of the byday syntax will give you :
ORA-27419: unable to determine valid execution date from repeat interval
Christmas Day and New Year’s Day holidays are similar in that they occur on the first week day on or after a specific date.
Boxing Day occurs on the second week day on or after 25th December.
For Christmas Day Holiday :
'freq=yearly; bydate=1225+span:7d; byday=mon,tue,wed,thu,fri; bysetpos=1'
Boxing Day is almost exactly the same as Christmas, but bysetpos is set to 2 instead of 1 :
'freq=yearly; bydate=1225+span:7d; byday=mon,tue,wed,thu,fri; bysetpos=2'
Whilst the syntax in these examples is similar to that in the string we’ve used for the last Monday in May, here we’re searching forward from the date specified rather than backward, as denoted by the “+” in the bydate string :
bydate=1225+span:7d
New Year’s Day Holiday
We’re using the same approach as that for the Christmas and Boxing Day Holidays :
'freq=yearly; bydate=0101+span:7d; byday=mon,tue,wed,thu,fri; bysetpos=1'
We’ll be using DBMS_SCHEDULER.EVALUATE_CALENDAR_STRING shortly, but you can use it as described here if you want to play with any of the above.
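For instance, a minimal anonymous block along these lines (assuming serveroutput is enabled) will show the first date a string resolves to after a given point in time :
declare
    v_next_run_date date;
begin
    dbms_scheduler.evaluate_calendar_string(
        calendar_string   => 'freq=yearly; bydate=0531-span:7d; byday=mon; bysetpos=1',
        start_date        => null,
        return_date_after => trunc(sysdate, 'YYYY'),
        next_run_date     => v_next_run_date);
    dbms_output.put_line('Next date : '||to_char(v_next_run_date, 'DD-MON-YYYY'));
end;
/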
Now for the “fun” bit…
Easter
It seems that, if you’re a Major Abrahamic Religion, an arcane mathematical calculation is mandatory to work out when at least some of your holy days are.
Some years ago, I implemented the Easter algorithm in PL/SQL.
I’ve re-created it as part of the package I’m going to use to bring all this calendar malarkey together.
The only thing I can really say about it is that it works. I don’t know how. I guess you just need to have a bit of faith…
...
function calculate_easter( i_year in number default null)
return date
is
a pls_integer;
b pls_integer;
c pls_integer;
d pls_integer;
e pls_integer;
f pls_integer;
g pls_integer;
h pls_integer;
i pls_integer;
k pls_integer;
l pls_integer;
m pls_integer;
p pls_integer;
v_month pls_integer;
v_day pls_integer;
v_year pls_integer;
v_easter_sunday date;
begin
v_year := nvl(i_year, extract( year from sysdate));
a := mod( v_year, 19);
b := floor(v_year/100);
c := mod(v_year,100);
d := floor(b/4);
e := mod(b,4);
f := floor((b+8)/25);
g := floor((b-f+1)/3);
h := mod((19*a+b-d-g+15),30);
i := floor(c/4);
k := mod(c,4);
l := mod((32+2*e+2*i-h-k),7);
m := floor((a+11*h+22*l)/451);
v_month := floor((h+l-7*m+114)/31); -- 3=March, 4=April
p := mod((h+l-7*m+114),31);
v_day := p+1 ; -- date in Easter Month
v_easter_sunday := to_date(v_day||'-'||v_month||'-'||v_year, 'DD-MM-YYYY');
return v_easter_sunday;
end calculate_easter;
...
The table to hold the holiday dates looks like this :
create table public_holidays(
holiday_date date constraint public_holidays_pk primary key,
holiday_name varchar2(100) not null,
notes varchar2(4000))
/
Using the HOLIDAY_DATE as the Primary Key means that we should be careful to ensure that it only holds a truncated date value :
create or replace trigger public_holidays_bi_trg
before insert on public_holidays
for each row
begin
:new.holiday_date := trunc(:new.holiday_date);
end;
/
In order to keep our schedule…er…up-to-date, I’ve wrapped up the code into a neat(ish) package :
create or replace package maintain_public_holidays
as
function calculate_easter( i_year in number default null) return date;
procedure add_standard_holidays( i_year in number);
procedure generate_schedule;
end;
/
create or replace package body maintain_public_holidays
as
function calculate_easter( i_year in number default null)
return date
is
a pls_integer;
b pls_integer;
c pls_integer;
d pls_integer;
e pls_integer;
f pls_integer;
g pls_integer;
h pls_integer;
i pls_integer;
k pls_integer;
l pls_integer;
m pls_integer;
p pls_integer;
v_month pls_integer;
v_day pls_integer;
v_year pls_integer;
v_easter_sunday date;
begin
v_year := nvl(i_year, extract( year from sysdate));
a := mod( v_year, 19);
b := floor(v_year/100);
c := mod(v_year,100);
d := floor(b/4);
e := mod(b,4);
f := floor((b+8)/25);
g := floor((b-f+1)/3);
h := mod((19*a+b-d-g+15),30);
i := floor(c/4);
k := mod(c,4);
l := mod((32+2*e+2*i-h-k),7);
m := floor((a+11*h+22*l)/451);
v_month := floor((h+l-7*m+114)/31); -- 3=March, 4=April
p := mod((h+l-7*m+114),31);
v_day := p+1 ; -- date in Easter Month
v_easter_sunday := to_date(v_day||'-'||v_month||'-'||v_year, 'DD-MM-YYYY');
return v_easter_sunday;
end calculate_easter;
procedure add_standard_holidays( i_year in number)
is
type rec_holidays is record (
holiday_name public_holidays.holiday_name%type,
calendar_string varchar2(4000),
holiday_date date);
type typ_holidays is table of rec_holidays
index by pls_integer;
tbl_holidays typ_holidays;
v_start_date date;
v_easter_sunday date;
begin
-- We'll be using this as the RETURN_DATE_AFTER parameter in the call to
-- DBMS_SCHEDULER.EVALUATE_CALENDAR_STRING so set it to the last day of the year
-- *before* the one we want to add dates for
v_start_date := to_date(i_year -1||'1231', 'YYYYMMDD');
tbl_holidays(1).holiday_name := 'NEW YEARS DAY';
tbl_holidays(1).calendar_string := 'freq=yearly; bymonth=1; byday=mon,tue,wed,thu,fri; bysetpos=1';
tbl_holidays(2).holiday_name := 'MAY DAY';
tbl_holidays(2).calendar_string := 'freq=yearly; bydate=0501+span:7D; byday=mon; bysetpos=1';
tbl_holidays(3).holiday_name := 'SPRING BANK HOLIDAY';
tbl_holidays(3).calendar_string := 'freq=yearly; bydate=0531-span:7D; byday=mon; bysetpos=1';
tbl_holidays(4).holiday_name := 'AUGUST BANK HOLIDAY';
tbl_holidays(4).calendar_string := 'freq=yearly; bydate=0831-span:7D; byday=mon; bysetpos=1';
tbl_holidays(5).holiday_name := 'CHRISTMAS DAY';
tbl_holidays(5).calendar_string := 'freq=yearly; bydate=1225+span:7d; byday=mon,tue,wed,thu,fri; bysetpos=1';
tbl_holidays(6).holiday_name := 'BOXING DAY';
tbl_holidays(6).calendar_string := 'freq=yearly; bydate=1225+span:7d; byday=mon,tue,wed,thu,fri; bysetpos=2';
for i in 1..tbl_holidays.count loop
dbms_scheduler.evaluate_calendar_string(
calendar_string => tbl_holidays(i).calendar_string,
start_date => null,
return_date_after => v_start_date,
next_run_date => tbl_holidays(i).holiday_date);
end loop;
v_easter_sunday := calculate_easter( i_year);
tbl_holidays(7).holiday_name := 'GOOD FRIDAY';
tbl_holidays(7).holiday_date := v_easter_sunday - 2;
tbl_holidays(8).holiday_name := 'EASTER MONDAY';
tbl_holidays(8).holiday_date := v_easter_sunday + 1;
for j in 1..tbl_holidays.count loop
dbms_output.put_line(tbl_holidays(j).holiday_name||' : '||to_char(tbl_holidays(j).holiday_date, 'DD-MON-YYYY'));
insert into public_holidays( holiday_name, holiday_date, notes)
values( tbl_holidays(j).holiday_name, tbl_holidays(j).holiday_date, 'Generated record');
end loop;
end add_standard_holidays;
procedure generate_schedule
is
C_SCHEDULE_NAME constant user_scheduler_schedules.schedule_name%type := 'PUBLIC_HOLIDAYS_ENGLAND';
v_calendar_string user_scheduler_schedules.repeat_interval%type;
v_start_date date;
v_placeholder pls_integer;
v_exists boolean;
cursor c_schedule_exists is
select null
from user_scheduler_schedules
where schedule_name = C_SCHEDULE_NAME;
begin
v_calendar_string := 'freq=yearly;bydate=';
for r_hols in (
select to_char(holiday_date, 'YYYYMMDD') as string_date
from public_holidays
order by holiday_date)
loop
v_calendar_string := v_calendar_string||r_hols.string_date||',';
end loop;
-- strip off the last ','
v_calendar_string := rtrim(v_calendar_string, ',');
dbms_output.put_line(v_calendar_string);
open c_schedule_exists;
fetch c_schedule_exists into v_placeholder;
v_exists := c_schedule_exists%found;
close c_schedule_exists;
if v_exists then
dbms_scheduler.set_attribute( C_SCHEDULE_NAME, 'repeat_interval', v_calendar_string);
else
dbms_scheduler.create_schedule(
schedule_name => C_SCHEDULE_NAME,
repeat_interval => v_calendar_string,
comments => 'Bank Holidays in England');
end if;
end generate_schedule;
end maintain_public_holidays;
/
Note that DBMS_OUTPUT statements are included for illustrative purposes.
Let’s use the standard holiday dates for 2022 for the initial table population :
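The call to do this should look something like the following ( serveroutput on so we can see the generated dates) :
set serverout on
exec maintain_public_holidays.add_standard_holidays(2022);
commit;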

If we check the table, we can confirm that we have the expected holidays :
select holiday_date, holiday_name, notes
from public_holidays
where extract( year from holiday_date) = 2022
order by holiday_date
/
HOLIDAY_DATE HOLIDAY_NAME        NOTES
------------ ------------------- ----------------
03-JAN-2022  NEW YEARS DAY       Generated record
15-APR-2022  GOOD FRIDAY         Generated record
18-APR-2022  EASTER MONDAY       Generated record
02-MAY-2022  MAY DAY             Generated record
30-MAY-2022  SPRING BANK HOLIDAY Generated record
29-AUG-2022  AUGUST BANK HOLIDAY Generated record
26-DEC-2022  CHRISTMAS DAY       Generated record
27-DEC-2022  BOXING DAY          Generated record

8 rows selected.
Now we can use these dates for the initial Public Holidays Schedule creation :
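This should just be a matter of calling the procedure we created earlier :
exec maintain_public_holidays.generate_schedule;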

We can check the details of the schedule in the Data Dictionary :
select schedule_type, repeat_interval, comments
from user_scheduler_schedules
where schedule_name = 'PUBLIC_HOLIDAYS_ENGLAND'
/

Just to make things interesting, the Bank Holidays have been altered in 2022.
The Spring Bank Holiday has been moved and an extra holiday has been added to commemorate the Queen’s Platinum Jubilee.
In DML terms, the effect on our data is :
delete from public_holidays
where holiday_date = date '2022-05-30'
and holiday_name = 'SPRING BANK HOLIDAY';
insert into public_holidays( holiday_date, holiday_name, notes)
values(date '2022-06-02', 'SPRING BANK HOLIDAY', q'[Moved for the Queen's Platinum Jubilee]');
insert into public_holidays( holiday_date, holiday_name, notes)
values(date '2022-06-03', 'PLATINUM JUBILEE', q'[The Queen's Platinum Jubilee]');
commit;
…which means that the Holidays for 2022 are now :
select holiday_date, holiday_name, notes
from public_holidays
where extract(year from holiday_date) = 2022
order by holiday_date
/

HOLIDAY_DATE HOLIDAY_NAME        NOTES
------------ ------------------- ----------------------------
03-JAN-2022  NEW YEARS DAY       Generated record
15-APR-2022  GOOD FRIDAY         Generated record
18-APR-2022  EASTER MONDAY       Generated record
02-MAY-2022  MAY DAY             Generated record
02-JUN-2022  SPRING BANK HOLIDAY Moved for Platinum Jubilee
03-JUN-2022  PLATINUM JUBILEE    The Queen's Platinum Jubilee
29-AUG-2022  AUGUST BANK HOLIDAY Generated record
26-DEC-2022  CHRISTMAS DAY       Generated record
27-DEC-2022  BOXING DAY          Generated record

9 rows selected.
Fortunately, we can keep the Schedule in sync by re-running the generate_schedule procedure :
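In other words :
exec maintain_public_holidays.generate_schedule;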

You may be struck by the fact that there’s no functionality to curate the dates in the PUBLIC_HOLIDAYS table, by deleting dates in the past for example.
Also the GENERATE_SCHEDULE procedure doesn’t bother checking to see if anything has changed before it goes ahead and updates the Schedule.
The reason for this is that we’re dealing with a rather small and fairly static data set. Remember, under normal circumstances there are eight Bank Holidays in a year.
The limiting factor for the number of individual dates that can be included in a Schedule is the maximum size of the REPEAT_INTERVAL, which (up to and including 21c) appears to be 4000 judging by the definition of USER_SCHEDULER_SCHEDULES.
For the PUBLIC_HOLIDAYS_ENGLAND Schedule, the first 19 characters are taken up with :
"'freq=yearly;bydate='"
I’ve chosen to put the dates in the string using the ISO date format – YYYYMMDD. In addition, each date has a single ‘,’ character as a separator in the list. Therefore, one date takes up 9 characters.
This means that, assuming we’re using a single byte character set, we’ve got enough space to hold roughly 440 dates which, unless we suddenly get a lot of additional holidays, is around 55 years worth.
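If you want to keep an eye on how close the Schedule is getting to that limit, a quick check against the data dictionary should suffice :
select length(repeat_interval) as interval_length
from user_scheduler_schedules
where schedule_name = 'PUBLIC_HOLIDAYS_ENGLAND'
/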
I’ll probably refrain from adding all of those dates for now, but no harm in doing the next few years…
set serverout on size unlimited
clear screen
declare
v_start_year number(4);
v_end_year number(4);
v_range number := 5;
begin
-- find the latest year for which we have holidays defined
select extract( year from max(holiday_date))
into v_start_year
from public_holidays;
v_start_year := v_start_year + 1;
v_end_year := v_start_year + v_range;
for i in v_start_year..v_end_year loop
dbms_output.put_line('Adding Standard Holidays for '||i);
maintain_public_holidays.add_standard_holidays(i);
end loop;
commit;
end;
/
exec maintain_public_holidays.generate_schedule;

select repeat_interval
from user_scheduler_schedules
where schedule_name = 'PUBLIC_HOLIDAYS_ENGLAND'
/

REPEAT_INTERVAL
--------------------------------------------------------------------------------
freq=yearly;bydate=20220103,20220415,20220418,20220502,20220602,20220603,2022082
9,20221226,20221227,20230102,20230407,20230410,20230501,20230529,20230828,202312
25,20231226,20240101,20240329,20240401,20240506,20240527,20240826,20241225,20241
226,20250101,20250418,20250421,20250505,20250526,20250825,20251225,20251226,2026
0101,20260403,20260406,20260504,20260525,20260831,20261225,20261228,20270101,202
70326,20270329,20270503,20270531,20270830,20271227,20271228,20280103,20280414,20
280417,20280501,20280529,20280828,20281225,20281226

The Final Calendar Strings
With this schedule in place, we can now define our required calendar strings.
Note that you can get the output below by running final_schedules.sql, which is included in the Repo.
The first working day of each month is :
'freq=monthly; byday=mon,tue,wed,thu,fri; exclude=public_holidays_england; bysetpos=1'
For 2022, the dates returned are :
First Working Day for each month in 2022

Tuesday 4th January 2022
Tuesday 1st February 2022
Tuesday 1st March 2022
Friday 1st April 2022
Tuesday 3rd May 2022
Wednesday 1st June 2022
Friday 1st July 2022
Monday 1st August 2022
Thursday 1st September 2022
Monday 3rd October 2022
Tuesday 1st November 2022
Thursday 1st December 2022
The last working day of the month is :
'freq=monthly; byday=mon,tue,wed,thu,fri; exclude=public_holidays_england; bysetpos=-1'
which returns :
Last Working Day for each month in 2022

Monday 31st January 2022
Monday 28th February 2022
Thursday 31st March 2022
Friday 29th April 2022
Tuesday 31st May 2022
Thursday 30th June 2022
Friday 29th July 2022
Wednesday 31st August 2022
Friday 30th September 2022
Monday 31st October 2022
Wednesday 30th November 2022
Friday 30th December 2022
Pay Day ( the working day on or immediately after the 6th of the month) is :
'freq=monthly; bymonthday=6,7,8,9,10; byday=mon,tue,wed,thu,fri; exclude=public_holidays_england;bysetpos=1'
resulting in :
Pay Day for each month in 2022

Thursday 6th January 2022
Monday 7th February 2022
Monday 7th March 2022
Wednesday 6th April 2022
Friday 6th May 2022
Monday 6th June 2022
Wednesday 6th July 2022
Monday 8th August 2022
Tuesday 6th September 2022
Thursday 6th October 2022
Monday 7th November 2022
Tuesday 6th December 2022
Connecting to an Autonomous Database in the Oracle Cloud
Look, this is a post about setting up and configuring databases in the Cloud.
Therefore, by law, I am required to make numerous references to Captain Scarlet and the Mysterons.
This is because Spectrum has an HQ called Skybase, that looks like this :

Meet Lieutenant Green. He is the entirety of Spectrum’s IT Department, responsible for the system that runs Skybase, and who therefore understands the full implications of the phrase “unplanned downtime”.
Indestructible as Captain Scarlet may be, it’s Lieutenant Green who’s indispensable.
We’re going to follow the Lieutenant as he :
- creates an Oracle Autonomous Database in the Free Tier of the Oracle Cloud Infrastructure (OCI)
- connects to the new database from his local copies of SQLDeveloper and SQLCL using a Cloud Wallet
- configures the TNS settings to allow him to initiate Thick client connections from other client tools
- gets to the bottom of whether SQLDeveloper Web has been kidnapped by Mysterons
Right, S.I.G….
Sign Up
Green starts by signing up for an Oracle Cloud Account, broadly following the steps detailed by Todd Sharp.
In this case, the database is called “spectrum”.
Following Todd’s instructions all the way, Green now has a file in the Downloads folder called Wallet_spectrum.zip.
We’ll take a peek inside the zip in a bit.
However, the first order of business for Green is to connect using his favourite Oracle Database IDE…
Green currently has a fairly up-to-date version of SQLDeveloper ( 21.2) on his laptop.
Before he starts testing connectivity to the new database, he makes sure that he’s working with a clean configuration by isolating the new zip file in its own directory, away from the existing $ORACLE_HOME/network/admin directory used by his Oracle Client.
cd $HOME
mkdir cloud_tns
cp $HOME/Downloads/Wallet_spectrum.zip $HOME/cloud_tns/.
export TNS_ADMIN=/home/green/cloud_tns
With TNS_ADMIN pointing at the directory holding the zip, he now starts SQLDeveloper :
sh /opt/sqldeveloper212/sqldeveloper/sqldeveloper.sh
NOTE – if you happen to fall over “ORA-12529 – TNS : connect request rejected based on current filtering rules”, then it’s a fair bet that there’s a conflicting setting somewhere in your existing client TNS setup.
Using the zip file he creates a new connection with a Connection Type of Cloud Wallet :

As we can see :
- The User Info is specified as normal ( i.e. database username and password)
- The Connection Type is Cloud Wallet
- The Configuration File value is simply the full path to the .zip file we downloaded.
- The Service drop-down is populated with the services available in the .zip

As you might expect, SQLcl is also able to use the wallet to connect.
This requires the wallet to be specified before the database connection is attempted :
sql /nolog
set cloudconfig <path_to_wallet_zip>
connect uid/password@connect_str
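For Green, that might look something like this ( spectrum_high is an assumption – use whichever of the service names from the wallet’s tnsnames.ora suits your workload, and you’ll be prompted for the password ) :
sql /nolog
set cloudconfig /home/green/cloud_tns/Wallet_spectrum.zip
connect admin@spectrum_high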

For other client tools which are using Thick Client connections, we need to do a bit more legwork.
Time to have a look at the contents of the zip file :
unzip -l Wallet_spectrum.zip
Archive: Wallet_spectrum.zip
Length Date Time Name
--------- ---------- ----- ----
7475 2022-08-06 18:37 ewallet.pem
2782 2022-08-06 18:37 README
6701 2022-08-06 18:37 cwallet.sso
1750 2022-08-06 18:37 tnsnames.ora
3378 2022-08-06 18:37 truststore.jks
691 2022-08-06 18:37 ojdbc.properties
114 2022-08-06 18:37 sqlnet.ora
6656 2022-08-06 18:37 ewallet.p12
3191 2022-08-06 18:37 keystore.jks
--------- -------
32738 9 files
That looks rather like a set of files we might find in the $ORACLE_HOME/network/admin directory of an Oracle client installation.
If we take a look at the sqlnet.ora in the zip file, it seems to confirm this impression :
unzip -c Wallet_spectrum.zip sqlnet.ora
Archive: Wallet_spectrum.zip
inflating: sqlnet.ora
WALLET_LOCATION = (SOURCE = (METHOD = file) (METHOD_DATA = (DIRECTORY="?/network/admin")))
SSL_SERVER_DN_MATCH=yes
Therefore by unzipping the files into /home/green/cloud_tns…
unzip Wallet_spectrum.zip -d /home/green/cloud_tns
…editing the sqlnet.ora to point to the correct location…
WALLET_LOCATION = (SOURCE = (METHOD = file)
(METHOD_DATA = (DIRECTORY="/home/green/cloud_tns")))
SSL_SERVER_DN_MATCH=yes
…and pointing the client at this directory…
export TNS_ADMIN=/home/green/cloud_tns
…Green can connect to his cloud database in the usual way :

Green has been looking forward to having a good look round SQLDeveloper Web. However, locating it is proving something of a challenge.
Whilst Mysteron activity cannot be entirely ruled out, the most likely causes for this absence are :
- it’s called Database Actions in OCI
- it’s not enabled by default for non-Admin users
As ADMIN, Green first creates a database user by doing the following :
Open the Database Actions link and select the Database Users option from the hamburger menu.

Click Create User

Here, he completes the form :

Incidentally, by toggling the Show Code button at the bottom of the form, he can see the code that will actually be executed.
By clicking the CREATE USER button he should now get a new database user. As he’s specified that they are Web enabled, he can connect to SQLDeveloper Web as this new user, using the URL specified in their user card :

When this user goes to the URL, they need to provide their database username and password. Once connected, they should see the Database Actions Launchpad.
Select the SQL tab and things look rather familiar :

Whilst it won’t prove decisive in the battle with the Mysterons, Lieutenant Green is confident that having Skybase connected to a Cloud Database should at least reduce latency.
Creating a Datapump Export using an External Table
Oracle’s Data Pump utility provides a fast way of extracting data from the database and – potentially – moving it to another Oracle database.
Both the command line utility and the DBMS_DATAPUMP package offer the ability to develop sophisticated data extraction logic.
However, if you just want to capture a single table ( or even the result set from a single query), using the ORACLE_DATAPUMP access driver in an external table is a much simpler alternative.
What I’m going to cover here is :
- creating a Data Pump dump file using an external table
- reading a dump file from an external table
- using a dump file to transfer data to an older version of Oracle
We need to extract some data from the HR application, running on an 18c database, to populate the Company Phone Directory application.
Obviously, this application was commissioned before Video Conferencing reached its current level of ubiquity. Fittingly therefore, our target application runs on a venerable Oracle 11g database running on a separate server.
We already have a directory object called APP_DIR on our source database, which is used for data extracts.
Creating the Export Dump File
This is achieved by simply creating the external table using a common-or-garden Create-Table-As-Select (CTAS) statement and specifying the external table driver as oracle_datapump :
create table phonebook_dp
organization external (
type oracle_datapump
default directory app_dir
access parameters (logfile 'phonebook_report.log')
location('phonebook.dmp')
)
as
select emp.employee_id, emp.first_name, emp.last_name, emp.phone_number,
dept.department_name, loc.city, coun.country_name, reg.region_name
from employees emp
inner join departments dept
on emp.department_id = dept.department_id
inner join locations loc
on dept.location_id = loc.location_id
inner join countries coun
on loc.country_id = coun.country_id
inner join regions reg
on coun.region_id = reg.region_id
/
If we now look at the directory on disc, we can see that the dump file has been generated :
ls -l phonebook.*
-rw-r-----. 1 oracle oinstall 20480 Jul 30 13:44 phonebook.dmp
-rw-r--r--. 1 oracle oinstall 41 Jul 30 13:44 phonebook.log
We can query the new table in the usual way :
select first_name, last_name, phone_number
from phonebook_dp
where department_name = 'IT'
/
FIRST_NAME LAST_NAME PHONE_NUMBER
-------------------- -------------------- --------------------
Alexander Hunold 590.423.4567
Bruce Ernst 590.423.4568
David Austin 590.423.4569
Valli Pataballa 590.423.4560
Diana Lorentz 590.423.5567
However, we cannot perform any non-select DML :
delete from phonebook_dp
where department_name = 'IT'
/
SQL Error: ORA-30657: operation not supported on external organized table
30657.0000 - "operation not supported on external organized table"
*Cause: User attempted on operation on an external table which is
not supported.
*Action: Don't do that!
A Note on Data Security
Whilst a Data Pump dump file is in a binary format and therefore not readable by the casual observer, it is not encrypted and can be read by anyone with access to a copy of the relevant Oracle software.
From a security perspective therefore, these dump files should be treated in the same way as any other data that’s written out of the database into a file.
If we want to read the data in the dump file, we can use the oracle_datapump Access Driver and simply point an external table at it :
create table read_phonebook_xt (
employee_id number,
first_name varchar2(4000),
last_name varchar2(4000),
phone_number varchar2(4000),
department_name varchar2(4000),
city varchar2(4000),
country_name varchar2(4000),
region_name varchar2(4000))
organization external (
type oracle_datapump
default directory app_dir
location('phonebook.dmp')
)
/
Whilst this dump file works perfectly well on 18c, when I try to open it in my 11.2 database things don’t run quite so smoothly.
I’ve transferred the file to the 11g database server (the directory object here is called MY_FILES) and created an external table to read it :
create table read_phonebook_xt (
employee_id number,
first_name varchar2(4000),
last_name varchar2(4000),
phone_number varchar2(4000),
department_name varchar2(4000),
city varchar2(4000),
country_name varchar2(4000),
region_name varchar2(4000))
organization external (
type oracle_datapump
default directory my_files
access parameters (nologfile)
location('phonebook.dmp')
)
/
However, if I try to select from the table I get :
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-39142: incompatible version number 5.1 in dump file "/u01/app/oracle/my_files/phonebook.dmp"
29913. 00000 - "error in executing %s callout"
*Cause: The execution of the specified callout caused an error.
*Action: Examine the error messages take appropriate action.
To remedy this, back on the source database, I need to generate a dump file that’s compatible with Oracle 11.2. Fortunately, this is quite straightforward as I simply need to specify a minimum compatible VERSION for the dump file :
create table phonebook_for_11g_dp
organization external (
type oracle_datapump
default directory app_dir
access parameters (
logfile 'phonebook_report.log'
version '11.2')
location('phonebook_11g.dmp')
)
as
select emp.employee_id, emp.first_name, emp.last_name, emp.phone_number,
dept.department_name, loc.city, coun.country_name, reg.region_name
from employees emp
inner join departments dept
on emp.department_id = dept.department_id
inner join locations loc
on dept.location_id = loc.location_id
inner join countries coun
on loc.country_id = coun.country_id
inner join regions reg
on coun.region_id = reg.region_id
/
Once we’ve transferred the new file to the target database server, we can re-point the external table to the new dump file…
alter table read_phonebook_xt location('phonebook_11g.dmp');
…and we can now see the data :
select first_name, last_name, phone_number
from read_phonebook_xt
where department_name = 'Executive'
/
FIRST_NAME LAST_NAME PHONE_NUMBER
-------------------- -------------------- --------------------
Steven King 515.123.4567
Neena Kochhar 515.123.4568
Lex De Haan 515.123.4569
That should keep things going until we get round to upgrading the database.
Coming a cropper with SQL*Plus COPY – coping with unexpected DML on error
The Oracle Docs have this to say about the SQL*Plus COPY command :
“The COPY command will be deprecated in future releases of SQL*Plus. After Oracle 9i, no new datatypes are supported by COPY.”
Then again, they’ve been saying that since 9i back in 2002. COPY, it seems, is not ready to go quietly just yet.
Such tenacity is a trait shared by Women’s Football, which is currently thriving having at one point survived a 50 year ban by the footballing authorities.
As the Women’s Euros are currently in full swing, it seems fitting that the examples that follow should have a footballing theme.
What we’re going to look at is :
- the functionality of SQL*Plus COPY and when you might choose to use it
- how copy may insert records ( and even commit) despite hitting an error
Incidentally, the functionality detailed here is pretty much unchanged between 11g and 21c. For the record, I’m using the Oracle 19c Developer Day Virtual Box VM, running Oracle Enterprise Edition for the examples that follow.
What COPY does and when to use it
Returning to the docs, COPY :
“Copies data from a query to a table in the same or another database”
OK, so when would you use it in preference to a database link or even datapump ?
Well, as Tim Hall says, for copying between databases, it’s a lot less hassle than datapump.
Other use cases may include copying data between databases running different versions of Oracle when there may well be compatibility issues with datapump and/or database links.
A simple example
We have a table in our source database, which holds details of all of the previous Women’s Euro tournaments :

We want to copy this data to an existing table – EURO_TARGET, which is in the database we’re currently logged into.
Before we actually run the COPY, it’s worth considering the two SQL*Plus variables which affect how it behaves :
- arraysize – the number of rows that SQL*Plus will fetch from the database at one time (15 by default). The maximum value is 5000
- copycommit – the number of arraysize batches that will be fetched before a commit is issued. This is set to 0 by default, which means the commit will be issued only after all batches are fetched.
Whilst we’re on the subject of SQL*Plus variables, it’s worth mentioning that autocommit is set to OFF in all of the examples that follow.
Before we run anything, let’s check that the environment variables are all set as expected :
SQL> show arraysize
arraysize 15
SQL> show copycommit
copycommit 0
SQL> show autocommit
autocommit OFF
SQL>
As we’re inputting this across multiple lines, we need to add a “-” at the end of each line as the continuation character :
copy from hr@orcl -
insert euro_target using -
select year, hosts, winners -
from womens_euros;
The output is :
Enter FROM password:
Array fetch/bind size is 15. (arraysize is 15)
Will commit when done. (copycommit is 0)
Maximum long size is 80. (long is 80)
12 rows selected from hr@orcl.
12 rows inserted into EURO_TARGET.
12 rows committed into EURO_TARGET at DEFAULT HOST connection.
Irrespective of the autocommit setting, we can see that the inserted rows have been committed :
SQL> rollback;
Rollback complete.
SQL> select count(*) from euro_target;
COUNT(*)
----------
12
As well as the INSERT keyword which – as you’d expect – inserts data into an existing table, copy supports three other operations :
- APPEND – before inserting data, this will create the target table if it does not already exist
- REPLACE – this will drop and re-create the target table before inserting
- CREATE – this will create the target table before inserting but will error if it already exists
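For example, a CREATE copy from the remote database might look something like this ( EURO_ARCHIVE is a made-up table name for illustration; the command will error if that table already exists ) :
copy from hr@orcl -
create euro_archive using -
select year, hosts, winners -
from womens_euros;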
Whether we’re copying from a remote database to the one we’re currently connected to or simply copying between tables in the same database, the syntax is the same.
Whilst, individually, the FROM and TO clauses are optional, at least one must be specified :
copy insert euro_target using -
> select year, hosts, winners -
> from womens_euros;
SP2-0495: FROM and TO clauses both missing; specify at least one.
Copying to a remote database from the current one ( or even within the same database) can be achieved with :
copy to hr@orcl -
replace euro_target using -
select year, hosts, winners -
from womens_euros;
Until now, we’ve omitted the password from the connect string so that COPY prompts for it.
It will work without prompting if the password is specified :
copy from hr/Thel1onesses@orcl -
insert euro_target( year, hosts, winners) using -
select year, hosts, winners -
from womens_euros;
Array fetch/bind size is 15. (arraysize is 15)
Will commit when done. (copycommit is 0)
Maximum long size is 80. (long is 80)
12 rows selected from hr@orcl.
12 rows inserted into EURO_TARGET.
12 rows committed into EURO_TARGET at DEFAULT HOST connection.
Fortunately, COPY also works with Oracle Wallet :
copy to /@hr_on_orcl replace euro_target using -
select year, hosts, winners -
from womens_euros;
Array fetch/bind size is 15. (arraysize is 15)
Will commit when done. (copycommit is 0)
Maximum long size is 80. (long is 80)
Table EURO_TARGET dropped.
Table EURO_TARGET created.
12 rows selected from DEFAULT HOST connection.
12 rows inserted into EURO_TARGET.
12 rows committed into EURO_TARGET at @hr_on_orcl.
Unexpected Inserts on Error
Let’s consider another table that we might like to copy :

The target table looks like this :
desc wwc_target
Name Null? Type
------------ -------- ---------------
YEAR NUMBER
HOSTS VARCHAR2(100)
WINNERS VARCHAR2(100)
ORGANIZERS VARCHAR2(4)
Can you spot where we might have an issue ?
copy from hr@orcl insert wwc_target (year, hosts, winners, organizers) using -
select year, hosts, winners, organizers -
from womens_world_cups -
order by year;
Array fetch/bind size is 15. (arraysize is 15)
Will commit when done. (copycommit is 0)
Maximum long size is 80. (long is 80)
ERROR:
ORA-12899: value too large for column "HR"."WWC_TARGET"."ORGANIZERS" (actual:
5, maximum: 4)
SQL>
As you’d expect, no rows are inserted into the target table :
select count(*) from wwc_target;
COUNT(*)
----------
0
Notice that I used an order by clause in the query. This meant that the erroring records were the first ones processed. However, if we reverse the order…
copy from hr@orcl insert wwc_target using -
select year, hosts, winners, organizers -
from womens_world_cups -
order by year desc;
Array fetch/bind size is 15. (arraysize is 15)
Will commit when done. (copycommit is 0)
Maximum long size is 80. (long is 80)
ERROR:
ORA-12899: value too large for column "HR"."WWC_TARGET"."ORGANIZERS" (actual:
5, maximum: 4)
SQL> select count(*) from wwc_target;
COUNT(*)
----------
8
So, the records retrieved prior to hitting the error have been inserted.
At this point however, the insert has not committed so we can remedy the situation :
rollback;
select count(*) from wwc_target;
COUNT(*)
----------
0
Things are more problematic if you’re processing multiple batches. For example, if we’re processing two batches and we’re committing after each batch – i.e. :
set arraysize 5
set copycommit 1
and we execute again, we still get the 8 rows inserted. However, the first batch (i.e. where we do not encounter an error) is also committed :
SQL> copy from hr@orcl -
> insert wwc_target( year, hosts, winners, organizers) -
> using -
> select year, hosts, winners, organizers -
> from womens_world_cups -
> order by year desc;
Enter FROM password:
Array fetch/bind size is 5. (arraysize is 5)
Will commit after every array bind. (copycommit is 1)
Maximum long size is 80. (long is 80)
ERROR:
ORA-12899: value too large for column "HR"."WWC_TARGET"."ORGANIZERS" (actual:
5, maximum: 4)
select count(*) from wwc_target;
COUNT(*)
----------
8
rollback;
Rollback complete.
SQL> select count(*) from wwc_target;
COUNT(*)
----------
5
Remember, the maximum value of arraysize is 5000, so if you’re retrieving more records than that, COPY will fetch multiple batches of records.
Fortunately, we can exert some control over the COPY’s transaction processing with WHENEVER SQLERROR :
whenever sqlerror exit failure rollback
copy from hr@orcl insert wwc_target using -
select year, hosts, winners, organizers -
from womens_world_cups -
order by year desc;
Enter FROM password:
Array fetch/bind size is 15. (arraysize is 15)
Will commit when done. (copycommit is 0)
Maximum long size is 80. (long is 80)
ERROR:
ORA-12899: value too large for column "HR"."WWC_TARGET"."ORGANIZERS" (actual:
5, maximum: 4)
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
We can see that the exit status is the value of the FAILURE constant :
echo $?
1
Re-connecting to the database, we can confirm that the insert has been rolled back :
SQL> select count(*) from wwc_target;
COUNT(*)
----------
0
Further Reading
Jana has written a useful article on the COPY command, which includes some handy info on monitoring a COPY operation.
Using an Access Control List to set default permission for new files in a directory on Linux
Today’s “I’ve finally figured out how to do this” instalment is all about Linux file permissions.
In this case, it’s how to ensure that files in a given directory are created with a default group and set of permissions.
The idea is that any files created in a given directory are readable by members of a specific group.
Teddy will be joining me for this Linux adventure. Lately he’s started to wonder if there’s much future in the Boris-Johnson-Lookalike business and so he’s looking to diversify…

Let’s start by giving teddy an account…
sudo useradd teddy
…and then creating a group…
sudo groupadd pack
…to which both teddy and mike are added…
sudo usermod -a -G pack mike
sudo usermod -a -G pack teddy
We can confirm that both users now belong to the group :
getent group pack
pack:x:1007:mike,teddy
The Directory
When I create a directory as mike, I can see that it inherits that user’s primary group :
mkdir treats
ls -ld treats
drwxrwxr-x 2 mike mike 4096 Jul 2 10:40 treats
We want to change this so that it uses the group we’ve just created :
chgrp pack treats
ls -ld treats
drwxrwxr-x 2 mike pack 4096 Jul 2 10:40 treats
This does not affect any new files created in the directory, which are still assigned their owners’ primary group by default :
mike$ touch treats/biscuit.txt
mike$ ls -l treats/biscuit.txt
-rw-rw-r-- 1 mike mike 0 Jul 2 10:44 treats/biscuit.txt
teddy$ touch treats/woof.txt
teddy$ ls -l treats/woof.txt
-rw-rw-r-- 1 teddy teddy 0 Jul 2 10:46 treats/woof.txt
To ensure that any new files created in the directory inherit the directory’s group, we need to set the setgid bit. Don’t worry, that’s not as complicated as it sounds. In fact, we simply need to run :
chmod g+s treats
We can see the effect this has on the directory permissions :
ls -ld treats
drwxrwsr-x 2 mike pack 4096 Jul 2 10:46 treats
When teddy creates his next file, the directory’s group is used :
teddy $ touch treats/growl.txt
teddy $ ls -l treats/growl.txt
-rw-rw-r-- 1 teddy pack 0 Jul 2 10:54 treats/growl.txt
We still have a bit of work to do. Remember, that we want any new files to be read-write for the owner, read-only for group members and not accessible for anyone else.
Access Control Lists
In Linux each file has a File Access Control List (ACL). Being just a type of file, directories are no different. For treats, the current ACL looks like this :
getfacl treats
# file: treats
# owner: mike
# group: pack
# flags: -s-
user::rwx
group::rwx
other::r-x
Incidentally, we can see the setgid bit in the flags line.
Side Note – Execute Permissions
You may have noticed that the execute permission is set for everyone. However, none of our files have been created with execute enabled.
As explained in this stack overflow answer, it doesn’t really make sense to grant execute permissions on a file unless it’s known to be executable, therefore Linux does not do this automatically.
Directories however, are subtly different. The execute permission is required to enter a directory.
To demonstrate, we can create a directory and then revoke the execute permission for others …
mike$ mkdir walkies
chmod o-x walkies
ls -ld walkies
drwxrwxr-- 2 mike mike 4096 Jul 2 11:30 walkies
As teddy is neither the owner of the directory nor a member of its group (mike), it’s the other permissions that apply to him.
Whilst he can list the contents of the directory, he cannot navigate to it :
teddy$ ls -l walkies
total 0
teddy$ cd walkies
sh: 3: cd: can't cd to walkies
Defaulting the ACL settings
Remember, we want any new files we create to be read-only for members of the pack group and to have no permissions for other users.
We can do this by setting the default ACL permissions.
To remove write permissions from group, we add a default setting to the directory’s ACL :
setfacl -d -m g::r-- treats
The ACL now looks like this :
getfacl treats
# file: treats
# owner: mike
# group: pack
# flags: -s-
user::rwx
group::rwx
other::r-x
default:user::rwx
default:group::r--
default:other::r-x
The directory listing has also changed :
ls -ld treats
drwxrwsr-x+ 2 mike pack 4096 Jul 1 12:43 treats
The “+” at the end of the permissions indicates that this directory now has defaults set in its ACL.
Next, we want to revoke privileges from other :
setfacl -d -m o::--- treats
mike $ getfacl treats
# file: treats
# owner: mike
# group: pack
# flags: -s-
user::rwx
group::rwx
other::r-x
default:user::rwx
default:group::r--
default:other::---
These permissions are now applied to any new files created in the directory :
mike $ echo 'Biscuit'>treats/goodboy.log
mike $ ls -l treats/goodboy.log
-rw-r----- 1 mike pack 8 Jul 2 13:27 treats/goodboy.log
Teddy can read the new file (although he refuses to use cat for some reason) :
teddy $ more treats/goodboy.log
Biscuit
However, he cannot write to the file :
teddy $ echo 'More biscuits' >>treats/goodboy.log
sh: 3: cannot create treats/goodboy.log: Permission denied
On the other hand, he can create a new file which mike can see, but not write to (much to teddy’s relief) …
teddy$ echo 'Munch! Wag! Woof!' >treats/happydog.log
teddy$ ls -l treats/happydog.log
-rw-r----- 1 teddy pack 18 Jul 2 13:30 treats/happydog.log
mike$ cat treats/happydog.log
Munch! Wag! Woof!
mike$ echo 'Bath Time !' >>treats/happydog.log
bash: treats/happydog.log: Permission denied
Remember, the ACL changes we’ve made to file permissions in the directory do not apply retrospectively :
ls -lrt treats
total 8
-rw-rw-r-- 1 mike mike 0 Jul 2 10:44 biscuit.txt
-rw-rw-r-- 1 teddy teddy 0 Jul 2 10:46 woof.txt
-rw-rw-r-- 1 teddy pack 0 Jul 2 10:54 growl.txt <- after the setgid
-rw-r----- 1 mike pack 8 Jul 2 13:27 goodboy.log <- after ACL changes
-rw-r----- 1 teddy pack 18 Jul 2 13:30 happydog.log
Further Reading
There’s a useful guide to linux file permissions here.
This article provides further information on configuring Linux ACLs.
We’ve managed to get through all of this file permission malarkey without mentioning sticky bits. If you’re still curious you can check out this article.
Skippy – Lazy Logging for Lazy PL/SQL Devs
What’s that Skippy ? Mike’s finally gotten round to publishing that logging framework ?
Yes, it’s only taken around seven years but I’ve finally managed to put together a coherent version of the PL/SQL logger I described in this post.
Yes, Skippy has now bounded into view on Github.
Skippy is fairly lightweight, the core components consisting of one package, one view and three tables.
It’s also designed for people like me, who just don’t want to be bothered with using a variable to hold the name of the current package member and remembering to include it in every single logging statement.
Skippy does that for you, and a bit more besides.
What I’ll cover here is :
- installing Skippy
- using Skippy in your PL/SQL code
- options for configuration
- running the unit tests
- some notes on its behaviour in 11g
- The SQLDeveloper Reports
Right, let’s hop to it…
Installation
Skippy has been tested against multiple Oracle Database versions and works on everything from 11gR2 onwards. So, provided your Oracle version falls into this very broad range, you’re good to go.
In terms of database privileges and grants you need the following system privileges…
- create table
- create view
- create package
- create sequence
…execute permissions on these packages ( granted to PUBLIC by default )…
- DBMS_DB_VERSION
- OWA_UTIL
- DBMS_DEBUG_JDWP
…and a quota on the schema’s default tablespace.
Now you can head over to Github and grab a copy of the Skippy Repo.
Once it’s on your machine, navigate to the top-level directory.
You can then connect to the database as the schema you want to install into and run :
install.sql
The script output should be similar to the following ( as they say in the phone ads, some sequences have been shortened) :
SQL> @install.sql
Installing the Skippy Logging framework :
Creating Tables...

Table created.

Comment created.

Comment created.

***snip***

Creating Reference Data...

1 row created.

1 row created.

*** snip ***

Commit complete.

Creating Package

Package created.

Package body created.

Creating Views

View created.
You should now have some new objects in the schema :
select object_name, object_type
from user_objects
where object_name like 'SKIPPY%'
order by 2,1
/
OBJECT_NAME OBJECT_TYPE
------------------------------ -----------------------
SKIPPY_LOGS_PK INDEX
SKIPPY_MESSAGE_TYPES_PK INDEX
SKIPPY_USERENV_PARAMETERS_PK INDEX
SKIPPY PACKAGE
SKIPPY PACKAGE BODY
SKIPPY_LOGS_ID_SEQ SEQUENCE
SKIPPY_LOGS TABLE
SKIPPY_MESSAGE_TYPES TABLE
SKIPPY_USERENV_PARAMETERS TABLE
SKIPPY_ENV VIEW
10 rows selected.
Now it’s installed, we can start playing with our logging framework.
Using Skippy
The examples that follow are taken from the demo package which is included in the demo folder in the repo.
The source for this package is :
create or replace package demo
as
-- Package to demonstrate features of the SKIPPY logging framework.
procedure jill;
procedure boomer;
procedure flyer(
i_string in varchar2 default 'Skippy',
i_number in number default 91,
i_date in date default to_date('19680502', 'YYYYMMDD'),
i_boolean in boolean default true);
procedure joey;
procedure jack;
procedure run_all;
end demo;
/
create or replace package body demo as
procedure jill
is
begin
-- Default usage for writing a log message
skippy.log('Morning everyone !');
end jill;
procedure boomer
is
begin
-- set the group for the current session. Log messages will be assigned to this group
-- unless the group is overridden
skippy.set_msg_group('TROUPE');
skippy.log(q'[G'day mate]');
-- Log the current not null sys_context userenv settings
skippy.env;
end boomer;
procedure flyer(
i_string in varchar2 default 'Skippy',
i_number in number default 91,
i_date in date default to_date('19680502', 'YYYYMMDD'),
i_boolean in boolean default true)
is
v_paramlist varchar2(4000);
begin
-- build a list of parameter values to log
skippy.add_param( 'I_STRING', i_string, v_paramlist);
skippy.add_param( 'I_NUMBER', i_number, v_paramlist);
skippy.add_param( 'I_DATE', i_date, v_paramlist);
skippy.add_param( 'I_BOOLEAN', i_boolean, v_paramlist);
skippy.log( v_paramlist);
end flyer;
procedure joey
is
begin
raise_application_error(-20501, 'Who are you calling a wallaby ?');
exception when others then
-- Log the error stack
skippy.err;
end joey;
procedure jack
is
begin
-- Setting the log level to W(arning)...
skippy.set_log_level('W');
-- ...means that this message will not be logged
skippy.log(q'[What's that Skippy ?]');
-- ...but this one will
skippy.log('Down a mine shaft ?', 'W');
end jack;
procedure run_all
is
begin
jill;
boomer;
flyer;
joey;
-- Turn off message grouping in this session
skippy.set_msg_group(null);
jack;
-- Explicitly set the message group
skippy.log('No worries', i_group=> 'TROUPE');
end run_all;
end demo;
/
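If you want to generate some entries to look at while following along, you can compile the demo package and then run everything in one go. I’m deliberately using select * here rather than guessing at the column names in SKIPPY_LOGS :
exec demo.run_all;

-- have a look at what's been recorded
select *
from skippy_logs
/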
Simply calling skippy with a message like this…
skippy.log('Morning everyone !');
…will result in quite a bit of message meta-data being logged…

Whilst Skippy will accept explicit values for the message source and line number, it’s smart enough to work these out if you don’t provide them.
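If you do want to be explicit, the call looks something like this ( we’ll see the same optional parameters again when we come to the 11g examples later on ) :
skippy.log(
    i_msg => 'Morning everyone !',
    i_source => 'DEMO.JILL',
    i_line_no => $$plsql_line);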
That’s not all.
Capturing Parameter Values
To record a list of parameter values, you can use the ADD_PARAM procedure to build the list. Once the list is complete, it can be logged in the normal way.
Parameters of type VARCHAR2, DATE, NUMBER and BOOLEAN are supported :
skippy.add_param( 'I_STRING', i_string, v_paramlist);
skippy.add_param( 'I_NUMBER', i_number, v_paramlist);
skippy.add_param( 'I_DATE', i_date, v_paramlist);
skippy.add_param( 'I_BOOLEAN', i_boolean, v_paramlist);
skippy.log( v_paramlist);

You can assign messages to a group. This can be useful if you are running a batch process which involves calls to a number of database objects.
You can either assign a message to a group explicitly…
skippy.log('No worries', i_group=> 'TROUPE');
…or you can set the group at session level, after which all logs will be assigned to the group…
skippy.set_msg_group('TROUPE');
skippy.log('No worries');

Skippy supports the following logging levels out-of-the-box :
- E(rror)
- W(arning)
- I(nformation)
- D(ebug)
- A(ll) – the default
You can set the logging level for a session.
For example, if you only want Warning and Error messages recorded :
skippy.set_log_level('W');
-- ...means that this message will not be logged
skippy.log(q'[What's that Skippy ?]');
-- ...but this one will
skippy.log('Down a mine shaft ?', 'W');

You can turn off logging altogether in the current session by running :
skippy.disable_logging;
You can configure logging levels by editing/adding them in the SKIPPY_MESSAGE_TYPES table.
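I’m deliberately not going to guess at the column names here, but it’s worth having a look at what’s already defined before you start adding to it :
-- review the existing message types before editing or adding your own
select *
from skippy_message_types
/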
USERENV Parameters
Do you want to log the SYS_CONTEXT userenv parameter settings ? Can’t remember what they all are ? Don’t worry, Skippy knows…
skippy.env;

Incidentally, after many years of being unable to find exactly where Oracle keeps the userenv parameter settings in the data dictionary, I’ve made my own list by combing through the various release notes since 11gR2 and putting them into the SKIPPY_USERENV_PARAMETERS table.
If you want to see your current settings in one place, you can use the SKIPPY_ENV view :
select parameter_name, session_value
from skippy_env
order by 1;
To log the current error stack :
skippy.err;
Note that this is less useful in versions after 11gR2 as the error stack is much better at reporting the location at which the error occurred, as opposed to where it was raised.
Show current settings
To see the message group that’s currently set in the session :
select skippy.current_setting('group') from dual;
For the current logging level :
select skippy.current_setting('level') from dual;
In writing Skippy, I’ve attempted to ensure that :
- it’s lightweight and simple to install
- it does not use any features which may require additional licensing ( e.g. partitioning)
- it can run on 11gR2
- the database objects are easily identifiable (hence the SKIPPY_ prefix)
As I have no way of anticipating the data volumes, performance requirements, archiving requirements etc. of any particular installation, I’ve left these matters open for the user to customize as they see fit.
I have steered away from using a CLOB to store the message text in SKIPPY_LOGS because the number of log messages that would exceed the size of a standard VARCHAR2 (currently 4000) is likely to be relatively small.
Using a VARCHAR2 means that we’ll never have to worry about out-of-line storage of messages with the potential administrative and performance overheads that may bring.
Skippy uses the Oracle supplied OWA_UTIL package to determine the source of the log message.
This involves OWA_UTIL.WHO_CALLED_ME parsing the call stack ( i.e. the output from DBMS_UTILITY.FORMAT_CALL_STACK) to determine this information.
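If you’ve not come across it before, WHO_CALLED_ME returns details of the caller via OUT parameters. A minimal standalone sketch ( nothing Skippy-specific here, the local procedure is just for illustration ) :
set serveroutput on size unlimited
declare
    procedure whereami is
        v_owner varchar2(128);
        v_name varchar2(128);
        v_lineno number;
        v_caller_t varchar2(128);
    begin
        -- owner, program unit name, line number and type of whatever called this procedure
        owa_util.who_called_me(v_owner, v_name, v_lineno, v_caller_t);
        dbms_output.put_line(v_owner||' '||v_name||' line '||v_lineno||' ('||v_caller_t||')');
    end whereami;
begin
    whereami;
end;
/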
In 11g, the call stack only holds the name of the top level calling object ( e.g. a package name).
In later versions it holds the package_name.package_member.
This limitation means that, in 11g, Skippy uses ALL_IDENTIFIERS to derive the name of a package member where required. This approach throws up a couple of differences in terms of reporting the LOG_SOURCE.
Firstly, if you’re logging from a function or procedure that’s declared locally within a package member, 11g will report the name of the nested program unit but not the package member that contains it.
In 12c and later, the name of the outer package member is included as well.
For example :
create or replace package roo as
procedure doe;
end roo;
/
create or replace package body roo as
procedure doe is
procedure pouch is
begin
skippy.log('Pockets are so useful');
end pouch;
begin
pouch;
end doe;
end roo;
/
exec roo.doe;
If we run the above in 11g, the log_source is reported as ROO.POUCH.
In later versions, it is more accurately identified as ROO.DOE.POUCH
Note that the correct line number is returned in both instances.
We’d need to provide some override values to make 11g behave consistently with later releases. For example :
skippy.log(
i_msg => 'Pockets are so useful',
i_source => 'ROO.DOE.POUCH',
i_line_no => $$plsql_line);
Another situation where you may experience different behaviour between Oracle versions is where you have a block in the main package body :
create or replace package kangaroos as
procedure family;
end kangaroos;
/
create or replace package body kangaroos as
procedure family is
v_court dbms_utility.name_array;
-- red, eastern grey, western grey, antilopine
begin
v_court(1) := 'Red';
v_court(2) := 'Eastern Grey';
v_court(3) := 'Western Grey';
v_court(4) := 'Antilopine';
end family;
-- Main package block
begin
skippy.log(q'[The gang's all here !]');
end kangaroos;
/
exec kangaroos.family;
In releases after 11g, the log_source is reported as KANGAROOS.__pkg_init
However, in 11g, the framework just picks up the previous procedure in the package.
Once again, we could use override values to correct this.
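For example, supplying the same overrides as before from within the package initialisation block ( assuming we want 11g to report the same source as later releases do ) :
create or replace package body kangaroos as
    procedure family is
        v_court dbms_utility.name_array;
    begin
        v_court(1) := 'Red';
        v_court(2) := 'Eastern Grey';
        v_court(3) := 'Western Grey';
        v_court(4) := 'Antilopine';
    end family;
-- Main package block : override the source and line number explicitly
begin
    skippy.log(
        i_msg => q'[The gang's all here !]',
        i_source => 'KANGAROOS.__pkg_init',
        i_line_no => $$plsql_line);
end kangaroos;
/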
An alternative workaround would be to add an empty private procedure to the package :
create or replace package body kangaroos as
procedure family is
v_court dbms_utility.name_array;
-- red, eastern grey, western grey, antilopine
begin
v_court(1) := 'Red';
v_court(2) := 'Eastern Grey';
v_court(3) := 'Western Grey';
v_court(4) := 'Antilopine';
end family;
-- Main package block
procedure pkg_init is begin null; end;
begin
skippy.log(q'[The gang's all here !]');
end kangaroos;
/
There are 24 unit tests in the repository, written using the utPLSQL 3.x test framework.
If you’re so inclined ( and have the framework installed), you can run the tests by connecting via your favourite Oracle tool and running :
run_all_tests.sql
Here’s the output for a test execution on an Oracle 21c instance :
Running Full Test Suite on Oracle Database Version 21.0.0.0.0
skippy_ut
  skippy_log
    default message [.088 sec]
    non-default valid message level [.005 sec]
    message group [.005 sec]
    override source [.004 sec]
    override_line_no [.004 sec]
    long_message [.01 sec]
    logging_disabled [.004 sec]
  skippy_group
    set group [.006 sec]
    unset group [.004 sec]
    log group [.007 sec]
    log no group [.005 sec]
  skippy_err
    exception_block [.009 sec]
    specify_group [.006 sec]
  skippy_env
    environment variables [.019 sec]
    override message type [.007 sec]
    specify message group [.007 sec]
  skippy_current_settings
    Get Current Log Level [.008 sec]
    Get current Message Group when set [.003 sec]
    Get current Message Group when not set [.003 sec]
  skippy_params
    add varchar param [.004 sec]
    add number param [.003 sec]
    add date param [.003 sec]
    add boolean param [.004 sec]
    build param string [.003 sec]

Finished in .250975 seconds
24 tests, 0 failed, 0 errored, 0 disabled, 0 warning(s)

PL/SQL procedure successfully completed.
Reports
For anyone using SQLDeveloper, I’ve included a suite of User Defined reports to play with :
- Log entries for last n hours
- Message group log entries with intervals
- Tail Log Table ( log entries are ordered by timestamp, latest first)
The message group report, for example, accepts two parameters :
- Log Date ( defaults to today)
- Message Group

After running the demo package, the report might look like this :

I’d like to say a particular thank-you to Jacek Gebal, who first suggested I publish this code to Github.
What’s that Skippy ? Don’t forget Uncle Steve – daredevil and corrupter of youth ?
How could I.
Finally, thanks to the youth in question, my favourite (only) son, Michael. It’s taken so long to put this all together that he’s now a grown-up IT Professional in his own right and the current de facto first line support for all family printer problems !