The Anti-Kyte

Oracle - for when it was like that when you got there

Using Edition Based Redefinition for Rolling Back Stored Program Unit Changes

Thu, 2017-08-10 15:14

We had a few days of warm, sunny weather in Milton Keynes recently and this induced Deb and me to purchase a garden umbrella to provide some shade.
After a lifetime of Great British Summers we should have known better. The sun hasn’t been seen since.
As for the umbrella ? Well that does still serve a purpose – it keeps the rain off.

Rather like an umbrella, Oracle’s Edition Based Redefinition feature can be put to uses other than those for which it was designed.
Introduced in Oracle Database 11gR2, Edition Based Redefinition (EBR to its friends) is a mechanism for facilitating zero-downtime releases of application code.
It achieves this by separating the deployment of code to the database from the point at which that code becomes visible to the application.

To fully retro-fit EBR to an application, you would need to create special views – Editioning Views – for each application table and then ensure that any application code references those views and not the underlying tables.
Even if you do have a fully automated test suite to perform your regression tests, this is likely to be a major undertaking.
The other aspect of EBR, one which is of interest here, is the way it allows you to have multiple versions of the same stored program unit in the database concurrently.

Generally speaking, as a database application matures, the changes made to it tend to be in the code rather more than in the table structure.
So, rather than diving feet-first into a full EBR deployment, what I’m going to look at here is how we could use EBR to:

  • decouple the deployment and release of stored program units
  • speed up the process of rolling back the release of multiple stored program unit changes
  • create a simple mechanism to roll back individual stored program unit changes

There’s a very good introductory article to EBR on OracleBase.
Whilst you’re here though, forget any Cross-Edition Trigger or Editioning View complexity and let’s dive into…

Fun and Games when releasing Stored Program Units

As I’ve observed before, deploying a new version of a PL/SQL program unit is destructive in nature. By default, the old version of the program is overwritten by the new version and is unrecoverable from within the database.
This can be problematic, especially on those occasions when you discover that your source control repository doesn’t contain what you thought it did.

Having the safety net of the old version stored in the database, ready to be restored should the need arise, is not something to be sniffed at.
Incidentally, Connor McDonald has his own roll-your-own method for backing up PL/SQL source.

Before we get into how EBR can help with this, we need to do a bit of configuration…

Setting up Editions

From 11gR2 onward, any Oracle database will have at least one Edition…
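
The screenshot from the original post isn’t reproduced here but, if you want to see the Edition(s) in your own database, a query along these lines against the DBA_EDITIONS view should do the job ( on a freshly created database you should see just ORA$BASE) :

select edition_name, parent_edition_name
from dba_editions
/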

When you connect to the database, you can specify an Edition to connect to. By default this is the current Database Edition.
To start with, when you’re connected, both the current edition and session edition will be ORA$BASE :

select sys_context('userenv', 'current_edition_name') as default_edition,
    sys_context('userenv', 'session_edition_name') as session_edition
from dual;

However, by default, it does not appear that any database objects are associated with the ORA$BASE edition.
Taking the HR schema as an example :

select object_name, object_type, edition_name
from dba_objects_ae
where owner = 'HR'
and object_type != 'INDEX'
order by 1,2
/

When we execute this query, we get a list of HR’s objects but with nothing in the EDITION_NAME column for any of them.

That’s because, at this point, HR is blissfully unaware of any Editions. However, if we enable editions for this user…

alter user hr enable editions
/

…and re-execute the query, we can see that things have changed a bit…

The Editionable objects in the schema ( Procedures, Triggers and the View) are now associated with the ORA$BASE edition.

The scope of an Edition is the database ( or the PDB if you’re on 12c). To demonstrate this, let’s say we have another schema – called EDDY :

create user eddy identified by eddy
/

alter user eddy temporary tablespace temp
/

alter user eddy default tablespace users
/

alter user eddy quota unlimited on users
/

grant create session, create table, create procedure to eddy
/

alter user eddy enable editions
/

Eddy is a bit of an ‘ed-banger and the schema contains the following…

create table rock_classics
(
    artist varchar2(100),
    track_name varchar2(100)
)
/

create or replace package i_love_rock_n_roll as
	function eddy_the_ed return varchar2;
end i_love_rock_n_roll;
/

create or replace package body i_love_rock_n_roll as
    function eddy_the_ed return varchar2 is
    begin
        return 'Run to the Hills !';
    end eddy_the_ed;
end i_love_rock_n_roll;
/

At this stage, these objects have the Edition you would expect. This time, we can query the USER_ version of the OBJECTS_AE view whilst connected as EDDY…

select object_name, object_type, edition_name
from user_objects_ae
order by 2,1
/

I want to make some changes to the code in the EDDY application. In order to preserve the “old” code, as well as making deployment a fair bit easier, I need a new edition…

Using a New Edition

First off, as a user with the CREATE EDITION privilege…

create edition deep_purple
/

We can see that the new Edition has been created with ORA$BASE as its parent…

At present ( i.e. as of 12cR2), an Edition can have at most one child and at most one parent. Every Edition other than ORA$BASE must have a parent.
Therefore, it’s probably helpful to think of Editions as release labels rather than branches.

At this point, whilst we now have two editions in the database, it’s only possible for EDDY to use one of them.
If EDDY attempts to switch to the new Edition…

alter session set edition = deep_purple
/

…we get ORA-38802: edition does not exist…

In order for EDDY to be able to use the new Edition, we need to grant it…

grant use on edition deep_purple to eddy
/

Now Eddy can see the new edition as well as the existing one :

alter session set edition = deep_purple
/
select property_value as default_edition,
    sys_context('userenv', 'session_edition_name') as session_edition
from database_properties
where property_name = 'DEFAULT_EDITION'
/

Now that we have access to the new Edition, we’re going to make some changes to the application code.
First of all, we want to add a function to the package :

create or replace package i_love_rock_n_roll as
	function eddy_the_ed return varchar2;
    function motor_ed return varchar2;
end i_love_rock_n_roll;
/

create or replace package body i_love_rock_n_roll as
    function eddy_the_ed return varchar2 is
    begin
        return 'Run to the Hills !';
    end eddy_the_ed;
    function motor_ed return varchar2
    is
    begin
        return 'Sunrise, wrong side of another day';
    end motor_ed;
end i_love_rock_n_roll;
/

We’re also going to create a new standalone function :

create or replace function for_those_about_to_rock
	return varchar2 as
begin
	return 'We salute you !';
end for_those_about_to_rock;
/

Looking at how these changes have affected the Editions with which these objects are associated is revealing :

select object_name, object_type, edition_name
from user_objects_ae
order by 1,2
/

OBJECT_NAME              OBJECT_TYPE   EDITION_NAME
-----------              -----------   ------------
FOR_THOSE_ABOUT_TO_ROCK  FUNCTION      DEEP_PURPLE
I_LOVE_ROCK_N_ROLL       PACKAGE       ORA$BASE
I_LOVE_ROCK_N_ROLL       PACKAGE       DEEP_PURPLE
I_LOVE_ROCK_N_ROLL       PACKAGE BODY  ORA$BASE
I_LOVE_ROCK_N_ROLL       PACKAGE BODY  DEEP_PURPLE
ROCK_CLASSICS            TABLE

The new function, for_those_about_to_rock, is assigned to the current session edition, as we would expect. However, it appears that the i_love_rock_n_roll package is now assigned to both Editions.
That’s not right, surely ?

Let’s do a quick check…

select i_love_rock_n_roll.motor_ed
from dual
/

MOTOR_ED
--------
Sunrise, wrong side of another day

So, we can see the new package function.
However, if we now switch to the other Edition…

alter session set edition = ORA$BASE
/

Session altered.

…and try to invoke the new package function we just created…

select i_love_rock_n_roll.motor_ed
from dual
/

Error starting at line : 1 in command -
select i_love_rock_n_roll.motor_ed
from dual

Error at Command Line : 1 Column : 8
Error report -
SQL Error: ORA-00904: "I_LOVE_ROCK_N_ROLL"."MOTOR_ED": invalid identifier
00904. 00000 -  "%s: invalid identifier"
*Cause:
*Action:

However, we can still see the original package…

select i_love_rock_n_roll.eddy_the_ed
from dual
/

EDDY_THE_ED
-----------
Run to the Hills !

Where it gets really interesting – for our current purposes at least – is that we can see the source code for both versions of the package in the USER_SOURCE_AE view.
For the original Package Header :

select text
from user_source_ae
where type = 'PACKAGE'
and name = 'I_LOVE_ROCK_N_ROLL'
and edition_name = 'ORA$BASE'
order by line
/

…we get …

TEXT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
package i_love_rock_n_roll as
	function eddy_the_ed return varchar2;
end i_love_rock_n_roll;

…but we can also get the new version from the same view…

select text
from user_source_ae
where type = 'PACKAGE'
and name = 'I_LOVE_ROCK_N_ROLL'
and edition_name = 'DEEP_PURPLE'
order by line
/

…returns…

TEXT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
package i_love_rock_n_roll as
	function eddy_the_ed return varchar2;
    function motor_ed return varchar2;
end i_love_rock_n_roll;

One other point to note is that you can grant privileges on an object that only exists in your “new” edition…

SQL> grant execute on for_those_about_to_rock to hr;

Grant succeeded.

…but when connected as that user, the object will not be visible…

select eddy.for_those_about_to_rock from dual;

Error starting at line : 1 in command -
select eddy.for_those_about_to_rock from dual
Error at Command Line : 1 Column : 8
Error report -
SQL Error: ORA-00904: "EDDY"."FOR_THOSE_ABOUT_TO_ROCK": invalid identifier
00904. 00000 -  "%s: invalid identifier"
*Cause:
*Action:

…nor will the grantee be able to switch to the Edition if they have not separately been granted the USE privilege on it…

alter session set edition = deep_purple;

Error starting at line : 1 in command -
alter session set edition = deep_purple
Error report -
ORA-38802: edition does not exist
38802. 00000 -  "edition does not exist"
*Cause:    This error occurred because a reference was made to an edition that
           does not exist or that the current user does not have privileges on.
*Action:   Check the edition name and verify that the current user has
           appropriate privileges on the edition.

Releasing code using Editions

As we can see, Editions allow us to separate the deployment of code from the actual release.
We’ve already deployed our application changes but they are only visible to eddy at the moment.
NOTE – as I said at the start, we’re only using EBR for releasing stored program units. If we had any table DDL then we’d need to deal with that separately from EBR in these particular circumstances.

Anyhow, once we’re sure that all is well, we just need to “release” the code from the DEEP_PURPLE edition as follows :

alter database default edition = deep_purple
/

Now when we connect as hr…

select sys_context('userenv', 'session_edition_name')
from dual
/

SYS_CONTEXT('USERENV','SESSION_EDITION_NAME')
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
DEEP_PURPLE

…and the new function is now accessible…

select eddy.for_those_about_to_rock
from dual
/

FOR_THOSE_ABOUT_TO_ROCK
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
We salute you !                                                                                                                                                        

Note that, whilst the Editionable objects in the HR schema itself have not directly inherited the new Edition…

select object_name, object_type, edition_name
from user_objects_ae
where object_type in ('PROCEDURE', 'TRIGGER', 'VIEW')
/

OBJECT_NAME               OBJECT_TYPE         EDITION_NAME
------------------------- ------------------- ------------------------------
UPDATE_JOB_HISTORY        TRIGGER             ORA$BASE
ADD_JOB_HISTORY           PROCEDURE           ORA$BASE
SECURE_EMPLOYEES          TRIGGER             ORA$BASE
SECURE_DML                PROCEDURE           ORA$BASE
EMP_DETAILS_VIEW          VIEW                ORA$BASE                      

…they are still usable now that we’ve migrated to the DEEP_PURPLE edition…

select first_name, last_name
from emp_details_view
where department_id = 60
/

FIRST_NAME           LAST_NAME
-------------------- -------------------------
Alexander            Hunold
Bruce                Ernst
David                Austin
Valli                Pataballa
Diana                Lorentz                  

Rolling back the entire release

If we need to roll back all of the code changes we’ve made, EBR makes this process very simple.

Remember, the objects owned by EDDY in the DEEP_PURPLE Edition are :

select object_name, object_type
from dba_objects
where owner = 'EDDY'
order by 2,1
/

OBJECT_NAME              OBJECT_TYPE
-----------              -----------
FOR_THOSE_ABOUT_TO_ROCK  FUNCTION
I_LOVE_ROCK_N_ROLL       PACKAGE
I_LOVE_ROCK_N_ROLL       PACKAGE BODY
ROCK_CLASSICS            TABLE         

…and the package currently contains both the eddy_the_ed and motor_ed functions.
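
The original screenshot isn’t reproduced here but, if you want to confirm this for yourself, a query along these lines against DBA_PROCEDURES should list the package members :

select procedure_name
from dba_procedures
where owner = 'EDDY'
and object_name = 'I_LOVE_ROCK_N_ROLL'
and procedure_name is not null
order by procedure_name
/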

Now, to roll back all of the application changes associated with the DEEP_PURPLE Edition, we simply need to run…

alter database default edition = ora$base
/

We can see that this has had the desired effect :

select object_name, object_type
from dba_objects
where owner = 'EDDY'
order by 2,1
/

OBJECT_NAME         OBJECT_TYPE
-----------         -----------
I_LOVE_ROCK_N_ROLL  PACKAGE
I_LOVE_ROCK_N_ROLL  PACKAGE BODY
ROCK_CLASSICS       TABLE

The function has disappeared, along with the additional package member…

Well, that’s nice and easy, but how could we use EBR to roll back a single change rather than the entire release ?

Rolling back a single change

To demonstrate this, we need to set the current Edition back to DEEP_PURPLE…

alter database default edition = deep_purple
/

Remember that, where relevant, EBR ensures that a copy of an object’s source code for previous Editions is kept in the Data Dictionary.
We can use this stored code to restore these versions to the current Edition.
NOTE – the owner of this next procedure will need the ALTER ANY PROCEDURE privilege :

create or replace procedure restore_version
(
    i_owner dba_source_ae.owner%type,
    i_name dba_source_ae.name%type,
    i_type dba_source_ae.type%type,
    i_source_edition dba_source_ae.edition_name%type,
    i_target_edition dba_source_ae.edition_name%type
)
is
--
-- Simple procedure to demonstrate restoring a given Edition's version
-- of a stored program unit.
--
    -- The DDL we execute will complete the current transaction so...
    pragma autonomous_transaction;

    rb_source clob;
begin
    if i_owner is null or i_name is null or i_type is null
        or i_source_edition is null or i_target_edition is null
    then
        raise_application_error(-20000, 'Values for all parameters must be supplied');
    end if;

    -- Make sure our session is in the target edition. If not then error.
    if upper(i_target_edition) != upper(sys_context('userenv', 'session_edition_name')) then
        raise_application_error(-20001, 'Session must be in the target edition');
    end if;

    for r_code in
    (
        select line,text
        from dba_source_ae
        where owner = upper(i_owner)
        and name = upper(i_name)
        and type = upper(i_type)
        and edition_name = upper(i_source_edition)
        order by line
    )
    loop
        if r_code.line = 1 then
            rb_source := 'create or replace '
                ||replace(lower(r_code.text), lower(i_type)||' ', i_type||' '||i_owner||'.');
        else
            rb_source := rb_source||r_code.text;
        end if;
    end loop;

    if nvl(length(rb_source),0) = 0 then
        raise_application_error(-20002, 'Object source not found');
    end if;    

    -- execute the ddl to restore the object
    execute immediate rb_source;

end restore_version;
/

In the current example we have the Package header and Package body of EDDY.I_LOVE_ROCK_N_ROLL in both the ORA$BASE and DEEP_PURPLE Editions.
If we want to reverse these changes but leave the rest of the release unaffected, we can simply invoke this procedure…

begin
    restore_version('EDDY', 'I_LOVE_ROCK_N_ROLL', 'PACKAGE', 'ORA$BASE', 'DEEP_PURPLE');
    restore_version('EDDY', 'I_LOVE_ROCK_N_ROLL', 'PACKAGE BODY', 'ORA$BASE', 'DEEP_PURPLE');
end;
/

We can now see that the original package has been restored and is available in the DEEP_PURPLE Edition, along with the other code from the release. However, the package function we’ve removed isn’t :

As it stands, this is a one-time operation, as we’re effectively restoring the old version by creating it in the new Edition. At that point the stored program units are identical in both Editions.
The obvious solution would be to change the session edition programmatically in the procedure. Unfortunately, attempts to do so run into :

ORA-38815: ALTER SESSION SET EDITION must be a top-level SQL statement

Of course, you could issue the ALTER SESSION commands in the script you’re using to call the procedure. However, you would then also need to make a copy of the current Edition code before restoring the old version and it all gets fairly involved.

Conclusion

Whilst all of this isn’t quite using EBR for its intended purpose, it does offer a couple of advantages over the more traditional method of releasing stored program unit changes.
First of all, you can separate deploying code into your production environment from making it visible to users.
Secondly, releasing the code becomes a single ALTER DATABASE statement, as does rolling back those changes.
Finally, it is possible to quickly revert individual stored program units should the need become evident once the release has been completed.
All of this functionality becomes available without you having to write much code.
The downside is that a reversion of an individual program unit is a one-time operation unless you write some custom code around this, which is what we were trying to get away from to start with.
Additionally, without implementing any Editioning Views, you will still have to manage structural changes to tables in the same way as before.

The weather forecast is sunny for this coming weekend. Unfortunately, that means I’ll have to mow the lawn rather than sit under the umbrella. Honestly, I’m sure astro-turf can’t be that expensive…



Keyboard not working in Citrix Receiver for Linux – a workaround

Sat, 2017-07-08 14:47

In technological terms, this is an amazing time to be alive.
In many ways, the advances in computing over the last 20-odd years have changed the way we live.
The specific advance that concerns me in this post is the ability to securely and remotely connect from my computer at home, to the computer in the office.
These days, remote working of this nature often requires the Citrix Receiver to be installed on the client machine – i.e. the one I’m using at home.
In my case, this machine is almost certainly running a Linux OS.
This shouldn’t be a problem. After all, the Citrix Receiver is available for Linux. However, as with any application available on multiple platforms, any bugs may be specific to an individual platform.
I was reminded of this recently. Whilst my Windows and Mac using colleagues were able to use the Citrix Receiver with no problems, I found the lack of a working keyboard when connecting to my work machine something of a handicap.
What follows is a quick overview of the symptoms I experienced, together with the diagnosis of the issue. Then I go through the workaround – i.e. uninstalling the latest version of the Receiver and installing the previous version in its place.

Version and OS specifics

I’ve replicated what follows on both Ubuntu 16.04 ( the current LTS version) and Linux Mint 17.3 (Cinnamon desktop). Whilst these are both Debian based distros using the .deb package, I believe that the issue in question applies to the Receiver for any Linux distro.
Both of the machines I worked on were built on the x86_64 architecture (essentially any 64-bit Intel or AMD processor).
The Receiver version in which the problem was encountered is 13.5.

The symptoms

The problem I encountered was that, once I had polluted my lovely Linux desktop by connecting to my Windows 7 workstation via the Receiver, the keyboard was unresponsive in the Receiver window.
The mouse still works. If you switch out of the Receiver window, the keyboard still works.
Switching between Window and Full-Screen view in the Receiver – which sometimes solves intermittent responsiveness issues – does not resolve this particular problem.

The Diagnosis

Whilst I initially suspected this could be some kind of hardware or driver issue specific to my machine, the fact that I was able to replicate it on multiple PCs using multiple Linux distros led me to do some digging.
This led me to this bug report on the Citrix site.

Good news then. I don’t have to delve into the murky world of drivers. Bad news, it looks like I’m going to have to schlep into the office until Citrix get around to fixing the bug. Or maybe not…

The Workaround

Faced with the prospect of being nose-to-armpit with a bunch of strangers on The Northern Line, I decided that installing the previous version of the Receiver was worth a go.

***Spoiler Alert*** – it worked.

The steps I took to uninstall and re-install The Receiver are as follows…

First of all, verify the version of The Receiver that’s installed :

dpkg -l icaclient

If it’s installed you should see something like :

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                          Version                     Architecture                Description
+++-=============================================-===========================-===========================-===============================================================================================
ii  icaclient                                     13.5.0.10185126             amd64                       Citrix Receiver for Linux

Remember, the version with the problem is 13.5 so, if you have that version installed, you first need to uninstall it. This can be done by :

sudo apt-get remove icaclient

Once that’s done, we need to head over to the Citrix Download site and get hold of the previous version of the Receiver, in this case 13.4.

First, we need to go to the Citrix Receiver Downloads Page and find the section for “Earlier Versions of Receiver for Linux”

In our case we select the link to take us to the download page for version 13.4.

I selected the Full Package (64-bit version) :

Accept the licence agreement and a short while later, you should have a new file in your Downloads folder ( or wherever you chose to store it) :

ls -l icaclient*

-rw-rw-r-- 1 mike mike 19000146 Jun 19 12:18 icaclient_13.4.0.10109380_amd64.deb

To install…

sudo gdebi icaclient_13.4.0.10109380_amd64.deb 
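
If gdebi isn’t already present on your system, you’ll need to install it first ( a small extra step that wasn’t covered in the original post) :

sudo apt-get install gdebi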

To verify the installation we can now run dpkg again…

dpkg -l icaclient

…which this time should say …

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                          Version                     Architecture                Description
+++-=============================================-===========================-===========================-===============================================================================================
ii  icaclient                                     13.4.0.10109380             amd64                       Citrix Receiver for Linux

After all of that, I do now have the option of working from home rather than catching the bus.



Installing SQLDeveloper and SQLCL on CentOS

Mon, 2017-06-19 14:02

As is becoming usual in the UK, the nation has been left somewhat confused in the aftermath of yet another “epoch-defining” vote.
In this case, we’ve just had a General Election campaign in which Brexit – Britain’s Exit from the EU – played a vanishingly small part. However, the result is now being interpreted as a judgement on the sort of Brexit that is demanded by the Great British Public.
It doesn’t help that, beyond prefixing the word “Brexit” with an adjective, there’s not much detail on the options that each term represents.
Up until now, we’ve had “Soft Brexit” and “Hard Brexit”, which could describe the future relationship with the EU but equally could be how you prefer your pillows.
Suddenly we’re getting Open Brexit and even Red-White-and-Blue Brexit.
It looks like the latest craze sweeping the nation is Brexit Bingo.
This involves drawing up a list of adjectives and ticking them off as they get used as a prefix for the word “Brexit”.
As an example, we could use the names of the Seven Dwarfs. After all, no-one wants a Dopey Brexit, ideally we’d like a Happy Brexit but realistically, we’re likely to end up with a Grumpy Brexit.

To take my mind off all of this wacky word-play, I’ve been playing around with CentOS again. What I’m going to cover here is how to install Oracle’s database development tools and persuade them to talk to a locally installed Express Edition database.

Specifically, I’ll be looking at :

  • Installing the appropriate Java Developer Kit (JDK)
  • Installing and configuring SQLDeveloper
  • Installing SQLCL

Sound like a Chocolate Brexit with sprinkles ? OK then…

Environment

I’m running on CentOS 7 (64 bit). I’m using the default Gnome 3 desktop (3.14.2).
CentOS is part of the Red Hat family of Linux distros which includes Red Hat, Fedora and Oracle Linux. If you’re running on one of these distros, or on something that’s based on one of them then these instructions should work pretty much unaltered.
If, on the other hand, you’re running a Debian based distro ( e.g. Ubuntu, Mint etc) then you’ll probably find these instructions rather more useful.

I’ve also got Oracle Database 11gR2 Express Edition installed locally. Should you feel so inclined, you can perform that install on CentOS using these instructions.

One other point to note, I haven’t bothered with any Oracle database client software on this particular machine.

Both SQLDeveloper and SQLCL require Java so…

Installing the JDK

To start with, we’ll need to download the JDK version that SQLDeveloper needs to run against. At the time of writing ( SQLDeveloper 4.2), this is Java 8.

So, we need to head over to the Java download page
… and download the appropriate rpm package. In our case :

jdk-8u131-linux-x64.rpm

Once the file has been downloaded, open the containing directory in Files, right-click our new rpm and open it with Software Install :

Now press the install button.
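
Alternatively, if you’d rather avoid the GUI altogether, installing the rpm from a terminal should work just as well ( assuming the file is in your current directory) :

sudo rpm -ivh jdk-8u131-linux-x64.rpm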

Once it’s all finished, you need to make a note of the directory that the jdk has been installed into as we’ll need to point SQLDeveloper at it. In my case, the directory is :

/usr/java/jdk1.8.0_131

Speaking of SQLDeveloper…

SQLDeveloper

Head over to the SQLDeveloper Download Page and get the latest version. We’re looking for the ??? option. In my case :

sqldeveloper-4.2.0.17.089.1709-1.noarch.rpm

While we’re here, we may as well get the latest SQLCL version as well. The download for this is a single file as it’s platform independent.

Once again, we can take advantage of the fact that Oracle provides us with an rpm file by right-clicking it in Files and opening with Software Install.

Press the install button and wait for a bit…

Once the installation is complete, we need to configure SQLDeveloper to point to the JDK we’ve installed. To do this, we need to run :

sh /opt/sqldeveloper/sqldeveloper.sh

…and provide the jdk path when prompted, in this case :

/usr/java/jdk1.8.0_131
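
Incidentally, the path you enter here should end up being recorded in a product.conf file under your home directory – the exact location varies by version, but it will be something like $HOME/.sqldeveloper/4.2.0/product.conf – as a line similar to :

SetJavaHome /usr/java/jdk1.8.0_131

…so that’s the place to look if you ever need to point SQLDeveloper at a different JDK.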

The end result should look something like this :

In my case I have no previous install to import preferences from so I’ll hit the No button.

Once SQLDeveloper opens, you’ll want to create a connection to your database.

To do this, go to the File Menu and select New/Connection.

To connect as SYSTEM to my local XE database I created a connection that looks like this :

Once you’ve entered the connection details, you can hit Test to confirm that all is in order and you can actually connect to the database.
Provided all is well, hit Save and the Connection will appear in the Tree in the left-side of the tool from this point forward.

One final point to note, as part of the installation, a menu item for SQLDeveloper is created in the Programming Menu. Once you’ve done the JDK configuration, you can start the tool using this menu option.

SQLCL

As previously noted, SQLCL is a zip file rather than an rpm, so the installation process is slightly different.
As with SQLDeveloper, I want to install SQLCL in /opt .
To do this, I’m going to need to use sudo so I have write privileges to /opt.

To start with then, open a Terminal and then start Files as sudo for the directory that holds the zip. So, if the directory is $HOME/Downloads …

sudo nautilus $HOME/Downloads

In Files, right click the zip file and select Open With Archive Manager

Click the Extract Button and extract to /opt

You should now have a sqlcl directory under /opt.

To start sqlcl, run

/opt/sqlcl/bin/sql

…and you should be rewarded with…
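
The original screenshot isn’t shown here, but you should see the SQLCL banner followed by a username prompt. If you want to connect straight to the local XE database as SYSTEM, something along these lines should also work ( you’ll be prompted for the password) :

/opt/sqlcl/bin/sql system@localhost:1521/XE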

There, hopefully that’s all gone as expected and you’ve not been left with a Sneezy Brexit.



Dude, Where’s My File ? Finding External Table Files in the midst of (another) General Election

Mon, 2017-05-22 16:11

It’s early summer in the UK, which means it must be time for an epoch defining vote of some kind. No, I’m not talking about Britain’s Got Talent.
Having promised that there wouldn’t be another General Election until 2020, our political classes have now decided that they can’t go any longer without asking us what we think. Again.
Try as I might, it may not be possible to prevent the ear-worm phrases from the current campaign slipping into this post.
What I want to look at is how you can persuade Oracle to tell you the location on disk of any files associated with a given external table.
Specifically, I’ll be covering :

  • getting the name of the Database Server
  • finding the fully qualified path of the datafile the external table is pointing to
  • finding other files associated with the table, such as logfiles

In the course of this, we’ll be challenging the orthodoxy of Western Capitalism “If You Can Do It In SQL…” with the principle of DRY ( Don’t Repeat Yourself).
Hopefully I’ll be able to come up with a solution that is “Strong and Stable” and yet at the same time “Works For The Many, Not the Few”…

The Application

For the most part, I’ve written this code against Oracle 11g Express Edition. However, there are two versions of the final script, one of which is specifically for 12c. I’ll let you know which is which when we get there.

I have an external table which I use to load data from a csv file.

Initially, our application’s external table looks like this :

create table plebiscites_xt
(
    vote_year number(4),
    vote_name varchar2(100)
)
    organization external
    (
        type oracle_loader
        default directory my_files
        access parameters
        (
            records delimited by newline
            badfile 'plebiscites.bad'
            logfile 'plebiscites.log'
            skip 1
            fields terminated by ','
            (
                vote_year integer external(4),
                vote_name char(100)
            )
        )
            location('plebiscites.csv')
    )
reject limit unlimited
/

I’ve created the table in the MIKE schema.

The file that we’re currently loading – plebiscites.csv – contains the following :

year,vote_name
2014,Scottish Independence Referendum
2015,UK General Election
2016,EU Referendum
2017,UK General Election

For the purposes of this exercise, I’ll assume that the file is uploaded frequently ( say once per day). I’ll also assume that there’s some ETL process that loads the data from the external table into a more permanent table elsewhere in the database.
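
Such a load doesn’t need to be anything sophisticated. As a minimal sketch – assuming a hypothetical permanent table called PLEBISCITES with the same columns – it might be little more than :

insert into plebiscites( vote_year, vote_name)
    select vote_year, vote_name
    from plebiscites_xt
/
commit
/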

As is the nature of this sort of ETL, there are times when it doesn’t quite work as planned.
This is when, equipped with just Read-Only access to production, you will need to diagnose the problem.

In these circumstances, just how do you locate any files that are associated with the external table?
Furthermore, how do you do this without having to create any database objects of your own ?

Finding the server that the files are on

There are a couple of ways to do this.
You could simply look in V$INSTANCE…

select host_name 
from v$instance
/

Alternatively…

select sys_context('userenv', 'server_host')
from dual
/

…will do the same job.
Either way, you should now have the name of the server that your database is running on and, therefore, the server from which the file in question will be visible.
Now to find the location of the data file itself…

Finding the datafile

In keeping with the current standard of public discourse, we’re going to answer the question “How do you find an External Table’s current Location File when not connected as the table owner” by answering a slightly different question ( i.e. as above but as the table owner)…

Our search is simplified somewhat by the fact that the location of any external table is held in the _EXTERNAL_LOCATIONS dictionary views :

select directory_owner, directory_name, location
from user_external_locations
where table_name = 'PLEBISCITES_XT'
/

With this information, we can establish the full path of the file by running…

select dir.directory_path||'/'||uel.location as xt_file
from user_external_locations uel
inner join all_directories dir
    on dir.owner = uel.directory_owner
    and dir.directory_name = uel.directory_name
where uel.table_name = 'PLEBISCITES_XT'
/

…which results in…

XT_FILE                                                                        
--------------------------------------------------------------------------------
/u01/app/oracle/my_files/plebiscites.csv                                        

This is all rather neat and simple. Unfortunately, our scenario of having to investigate an issue with the load is likely to take place in circumstances that render all of this of limited use, at best.

Remember, the scenario here is that we’re investigating an issue with the load on a production system. Therefore, it’s quite likely that we are connected as a user other than the application owner.
In my case, I’m connected as a user with CREATE SESSION and the LOOK_BUT_DONT_TOUCH role, which is created as follows :

create role look_but_dont_touch
/

grant select any dictionary to look_but_dont_touch
/

grant select_catalog_role to look_but_dont_touch
/

As well as the table’s data file, we’re going to want to look at any logfiles, badfiles and discardfiles associated with the table.

Finding other External Table files

At this point it’s worth taking a look at how we can find these additional files. Once again, we have two options.
First of all, we can simply check the table definition using DBMS_METADATA…

set long 5000
set pages 100
select dbms_metadata.get_ddl('TABLE', 'PLEBISCITES_XT', 'MIKE')
from dual
/

…alternatively, we can use the _EXTERNAL_TABLES views to home in on the ACCESS_PARAMETERS defined for the table…

set long 5000
select access_parameters
from dba_external_tables
where owner = 'MIKE'
and table_name = 'PLEBISCITES_XT'
/

For our table as it’s currently defined, this query returns :

records delimited by newline
            badfile 'plebiscites.bad'
            logfile 'plebiscites.log'
            skip 1
            fields terminated by ','
            (
                vote_year integer external(4),
                vote_name char(100)
            )

In either case, we end up with a CLOB that we need to search to find the information we need.
To do this programmatically, you may be tempted to follow the time-honoured approach of “If you can do it in SQL, do it in SQL”…

with exttab as
(
    select dir.directory_path,  
        regexp_replace( ext.access_parameters, '[[:space:]]') as access_parameters
    from dba_external_tables ext
    inner join dba_directories dir
        on dir.owner = ext.default_directory_owner
        and dir.directory_name = ext.default_directory_name
    where ext.owner = 'MIKE' 
    and ext.table_name = 'PLEBISCITES_XT'
)
select directory_path||'/'||
    case when instr(access_parameters, 'logfile',1,1) > 0 then
        substr
        ( 
            access_parameters, 
            instr(access_parameters, 'logfile') +8, -- substring start position
            instr(access_parameters, chr(39), instr(access_parameters, 'logfile') +8, 1) - (instr(access_parameters, 'logfile') +8) -- substr number of characters
        ) 
        else to_clob('Filename not specified')    
    end as log_file_name,
    directory_path||'/'||
    case when instr(access_parameters, 'badfile',1,1) > 0 then
        substr
        ( 
            access_parameters, 
            instr(access_parameters, 'badfile') +8, -- substring start position
            instr(access_parameters, chr(39), instr(access_parameters, 'badfile') +8, 1) - (instr(access_parameters, 'badfile') +8) -- substr number of characters
        ) 
        else to_clob('Filename not specified')    
    end as bad_file_name,
    directory_path||'/'||
    case when instr(access_parameters, 'discardfile',1,1) > 0 then    
        substr
        ( 
            access_parameters, 
            instr(access_parameters, 'discardfile') +12, -- substring start position
            instr(access_parameters, chr(39), instr(access_parameters, 'discardfile') +12, 1) - (instr(access_parameters, 'discardfile') +12) -- substr number of characters
        ) 
        else to_clob('Filename not specified')    
    end as discard_file_name    
from exttab    
/

…which returns…

Hmmm, it’s possible that a slightly more pragmatic approach is in order here…

set serveroutput on size unlimited
declare
    function get_file
    ( 
        i_owner in dba_external_tables.owner%type,
        i_table in dba_external_tables.table_name%type, 
        i_ftype in varchar2
    )
        return varchar2
    is
        separator constant varchar2(1) := '/';
        dir_path dba_directories.directory_path%type;
        access_params dba_external_tables.access_parameters%type;
        start_pos pls_integer := 0;
        end_pos pls_integer := 0;
    begin
        select dir.directory_path, lower(regexp_replace(ext.access_parameters, '[[:space:]]'))
        into dir_path, access_params
        from dba_external_tables ext
        inner join dba_directories dir
            on dir.owner = ext.default_directory_owner
            and dir.directory_name = ext.default_directory_name
        where ext.owner = upper(i_owner) 
        and ext.table_name = upper(i_table);

        start_pos := instr( access_params, i_ftype||chr(39),1,1) + length(i_ftype||chr(39));
        if start_pos - length(i_ftype||chr(39)) = 0 then
            return 'Filename Not Specified';
        end if;    
        end_pos := instr(access_params, chr(39), start_pos, 1);
        return dir_path||separator||substr(access_params, start_pos, end_pos - start_pos);
    end get_file;

begin
    dbms_output.put_line('LOGFILE '||get_file('MIKE', 'PLEBISCITES_XT', 'logfile'));
    dbms_output.put_line('BADFILE '||get_file('MIKE', 'PLEBISCITES_XT','badfile'));
    dbms_output.put_line('DISCARDFILE '||get_file('MIKE', 'PLEBISCITES_XT','discardfile'));
end;
/

Yes, it’s PL/SQL. No, I don’t think I’ll be getting a visit from the Database Police as this is a rather more DRY method of doing pretty much the same thing…

LOGFILE /u01/app/oracle/my_files/plebiscites.log
BADFILE /u01/app/oracle/my_files/plebiscites.bad
DISCARDFILE Filename Not Specified


PL/SQL procedure successfully completed.

As we’re about to find out, this solution also falls short of being a panacea…

Separate Directory Definitions

What happens when the directories that the files are created in are different from each other ?
Let’s re-define our table :

drop table plebiscites_xt
/

create table plebiscites_xt
(
    vote_year number(4),
    vote_name varchar2(100)
)
    organization external
    (
        type oracle_loader
        default directory my_files
        access parameters
        (
            records delimited by newline
            badfile 'plebiscites.bad'
            logfile my_files_logs:'plebiscites.log'
            discardfile my_files_discards:'plebiscites.disc'
            skip 1
            fields terminated by ','
            (
                vote_year integer external(4),
                vote_name char(100)
            )
        )
            location('plebiscites.csv')
    )
reject limit unlimited
/

You’ll notice here that we’ve added a discard file specification. More pertinently, directory locations are now specified for both the discard file and the log file.
Therefore, our solution needs some tweaking to ensure that it is fit for the many. In fact, while we’re at it, we may as well report the location (data) file as well…

set serveroutput on size unlimited
declare
    separator constant varchar2(1) := chr(47); -- '/'

    loc_dir_path dba_directories.directory_path%type;
    loc_file user_external_locations.location%type;
    
    function get_file( i_table user_external_tables.table_name%type, i_ftype in varchar2)
        return varchar2
    is
        squote constant varchar2(1) := chr(39); -- " ' "
        colon constant varchar2(1) := chr(58); -- ':'
        
        dir_path dba_directories.directory_path%type;
        access_params dba_external_tables.access_parameters%type;
        
        filedef_start pls_integer := 0;
        filedef_end pls_integer := 0;
        filedef_str clob;
        
        dir_defined boolean;
        
        dir_start pls_integer := 0;
        dir_end pls_integer := 0;
        
        dir_name dba_directories.directory_name%type;
        
        fname_start pls_integer := 0;
        fname_end pls_integer := 0;
        
        fname varchar2(4000);
        
    begin
        select dir.directory_path, lower(regexp_replace(ext.access_parameters, '[[:space:]]'))
        into dir_path, access_params
        from dba_external_tables ext
        inner join dba_directories dir
            on dir.owner = ext.default_directory_owner
            and dir.directory_name = ext.default_directory_name
        where ext.table_name = upper(i_table);
        
        filedef_start := instr(access_params, i_ftype, 1,1); 
        
        if filedef_start = 0 then
            return 'Filename Not Specified';
        end if;
        filedef_end := instr(access_params, squote, filedef_start, 2) + 1;
        filedef_str := substr(access_params, filedef_start, filedef_end - filedef_start);

        dir_defined := instr( filedef_str, colon, 1, 1) > 0;
        if dir_defined then 

            dir_start := length(i_ftype) + 1; 
            dir_end := instr( filedef_str, colon, 1, 1);
            dir_name := substr(filedef_str, dir_start, dir_end - dir_start);
            begin
                select directory_path
                into dir_path
                from dba_directories
                where directory_name = upper(dir_name);
            exception when no_data_found then
                return 'The directory object specified for this file does not exist';
            end;    
        end if;    
        
        fname_start := instr(filedef_str, squote, 1, 1) + 1; 
        fname_end := instr(filedef_str, squote, 1, 2);
        fname := substr( filedef_str, fname_start, fname_end - fname_start);
        return dir_path||separator||fname;
    end get_file;

begin
    -- Get the current file that the XT is pointing to 
    select dir.directory_path, ext.location
        into loc_dir_path, loc_file 
    from dba_external_locations ext
    inner join dba_directories dir
        on dir.owner = ext.directory_owner
        and dir.directory_name = ext.directory_name
        and ext.table_name = 'PLEBISCITES_XT';
        
    dbms_output.put_line('LOCATION '||loc_dir_path||separator||loc_file);    
    dbms_output.put_line('LOGFILE '||get_file('PLEBISCITES_XT', 'logfile'));
    dbms_output.put_line('BADFILE '||get_file('PLEBISCITES_XT','badfile'));
    dbms_output.put_line('DISCARDFILE '||get_file('PLEBISCITES_XT','discardfile'));
    dbms_output.put_line('PREPROCESSOR '||get_file('plebiscites_xt', 'preprocessor'));
end;
/

Run this and we get :

LOCATION /u01/app/oracle/my_files/plebiscites.csv
LOGFILE /u01/app/oracle/my_files/logs/plebiscites.log
BADFILE /u01/app/oracle/my_files/plebiscites.bad
DISCARDFILE /u01/app/oracle/my_files/discards/plebiscites.disc
PREPROCESSOR Filename Not Specified


PL/SQL procedure successfully completed.

Having made such a big thing of preferring the DRY principle to the “Do it in SQL” doctrine, I feel it’s only fair to point out that the new features of the WITH clause in 12c do tend to blur the line between SQL and PL/SQL somewhat…

set lines 130
column ftype format a20
column file_path format a60
with function get_file( i_table in dba_external_tables.table_name%type, i_ftype in varchar2)
    return varchar2
    is
    
        separator constant varchar2(1) := chr(47); -- '/'
        squote constant varchar2(1) := chr(39); -- " ' "
        colon constant varchar2(1) := chr(58); -- ':'
        
        dir_path dba_directories.directory_path%type;
        access_params dba_external_tables.access_parameters%type;
        
        filedef_start pls_integer := 0;
        filedef_end pls_integer := 0;
        filedef_str clob;
        
        dir_defined boolean;
        
        dir_start pls_integer := 0;
        dir_end pls_integer := 0;
        
        dir_name dba_directories.directory_name%type;
        
        fname_start pls_integer := 0;
        fname_end pls_integer := 0;
        
        fname varchar2(4000);
        
    begin
        select dir.directory_path, lower(regexp_replace(ext.access_parameters, '[[:space:]]'))
        into dir_path, access_params
        from dba_external_tables ext
        inner join dba_directories dir
            on dir.owner = ext.default_directory_owner
            and dir.directory_name = ext.default_directory_name
        where ext.table_name = upper(i_table);
        
        filedef_start := instr(access_params, i_ftype, 1,1); 
        
        if filedef_start = 0 then
            return 'Filename Not Specified';
        end if;
        filedef_end := instr(access_params, squote, filedef_start, 2) + 1;
        filedef_str := substr(access_params, filedef_start, filedef_end - filedef_start);

        dir_defined := instr( filedef_str, colon, 1, 1) > 0;
        if dir_defined then 

            dir_start := length(i_ftype) + 1; 
            dir_end := instr( filedef_str, colon, 1, 1);
            dir_name := substr(filedef_str, dir_start, dir_end - dir_start);
            begin
                select directory_path
                into dir_path
                from dba_directories
                where directory_name = upper(dir_name);
            exception when no_data_found then
                return 'The directory object specified for this file does not exist';
            end;    
        end if;    
        
        fname_start := instr(filedef_str, squote, 1, 1) + 1; 
        fname_end := instr(filedef_str, squote, 1, 2);
        fname := substr( filedef_str, fname_start, fname_end - fname_start);
        return dir_path||separator||fname;
    end get_file;

select 'LOCATION ' as ftype, dir.directory_path||sys_context('userenv', 'platform_slash')||ext.location as file_path
from user_external_locations ext
inner join dba_directories dir
    on dir.owner = ext.directory_owner
    and dir.directory_name = ext.directory_name
    and ext.table_name = 'PLEBISCITES_XT'
union select 'LOGFILE', get_file('plebiscites_xt', 'logfile') from dual
union select 'BADFILE', get_file('plebiscites_xt', 'badfile') from dual
union select 'DISCARDFILE', get_file('plebiscites_xt', 'discardfile') from dual
union select 'PREPROCESSOR', get_file('plebiscites_xt', 'preprocessor') from dual
/

Hopefully that’s something to think about in between the Party Election Broadcasts.



Having a mid-life crisis on top-of-the-range hardware

Mon, 2017-05-08 17:31

I’ve recently begun to realise that I’m not going to live forever.
“Surely not”, you’re thinking, “look at that young scamp in the profile picture, he’s not old enough to be having a mid-life crisis”.

Well, five minutes ago, that was a recent picture. Suddenly, it’s more than 10 years old. As Terry Pratchett once observed, “Inside every old person is a young person wondering what happened”.

Fortunately, with age comes wisdom…or a sufficiently good credit rating with which to be properly self-indulgent.
Now, from what I’ve observed, men who get to my stage in life seem to seek some rather fast machinery as a cure for the onset of morbid reflections on the nature of their own mortality.
In this case however, it’s not the lure of a fast car that I’ve succumbed to. First and foremost, I am a geek. And right now, I’m a geek with a budget.

Time then to draw up the wish list for my new notebook. It will need to…

  • be bigger than my 10-inch netbook but small enough to still be reasonably portable
  • have a fast, cutting-edge processor
  • have an SSD with sufficient storage for all my needs
  • have large quantities of RAM
  • come with a Linux Operating System pre-installed

For any non-technical readers who’ve wandered down this far, the rough translation is that I want something with more silicon in it than one of those hour-glasses for measuring the time left before Brexit that have been on the telly recently.
It’s going to have to be so fast that it will, at the very least, offer Scotty the prospect of changing the Laws of Physics.
Oh, and I should still be able to use it on the train.

The requirement for a pre-installed Linux OS may be a factor which limits my choices.
Usually, I’m happy enough to purchase a machine with Windows pre-installed and then replace it with a Linux Distro of my choice.
Yes, this may involve some messing about with drivers and – in some cases – a kernel upgrade, but the process is generally fairly painless.
This time though, I’m going to be demanding. However much of a design classic a Mac may be, OSX just isn’t going to cut it. Linux is my OS of choice.
Furthermore, if I’m going to be paying top dollar for top-of-the range then I want everything to work out of the box.
Why? (pause to flick non-existent hair) Because I’m worth it.

Oh, as a beneficial side-effect it does also mean that I’ll save myself a few quid because I won’t have to fork out for a Windows License.

In the end, a combination of my exacting requirements and the advice and guidance of my son, who knows far more about this sort of thing, led me to my final choice – the Dell XPS13

What follows is in the style of an Apple fanboy/fangirl handling their latest iThing…

Upon delivery, the package was carried to the kitchen table where it lay with all its promise of untold joy…

Yea, and there followed careful unwrapping…

It’s a….box

…Russian doll-like…

…before finally…

If Geekgasm isn’t a thing, it jolly well should be.

Now to setup the OS…

…before finally re-starting.

The first re-boot of a machine usually takes a little while as it sorts itself out so I’ll go and make a cof… oh, it’s back.
Yep, Ubuntu plus SSD ( 512GB capacity) plus a quad-core i7-7560 CPU equals “are you sure you actually pressed the button ?”

Ubuntu itself wasn’t necessarily my Linux distro of choice. That doesn’t matter too much however.
First of all, I’m quite happy to get familiar with Unity if it means I can still access all of that Linux loveliness.
Secondly, with the insane amount of system resources available (16GB RAM to go with that CPU), I can simply spin up virtual environments with different Linux distros, all sufficiently fast to act as they would if run natively.
For example…

Right, now I’ve got that out of my system, I can wipe the drool off the keyboard and start to do something constructive…like search for cat videos.



Kicking the habit of WM_CONCAT for a delimited list of rows, with LISTAGG

Tue, 2017-05-02 15:40

I gave up smoking recently.
Among other bad habits that I need to kick is using the (not so) trusty WM_CONCAT.

Say I want to get a record set consisting of a comma-delimited list of columns in the EMPLOYEES table. In the past, this may have been somewhat challenging to do in a single SQL query, unless you knew about the undocumented WM_CONCAT…

select wm_concat(column_name)
from user_tab_cols
where hidden_column = 'NO'
and table_name = 'EMPLOYEES';

From around 10g, right up to 11gR2 Enterprise Edition, this function would return your result set in a single, comma-delimited list.
However, if you attempt to execute the same query in 12c, or even 11g Express Edition, you’ll get a nasty surprise …

Error at Command Line : 1 Column : 8
Error report -
SQL Error: ORA-00904: "WM_CONCAT": invalid identifier
00904. 00000 -  "%s: invalid identifier"
*Cause:    
*Action:

Fortunately, a more modern (and supported) alternative has been around since 11g…

select listagg( column_name, ',') within group( order by column_id)
from user_tab_cols
where hidden_column = 'NO'
and table_name = 'EMPLOYEES'
/

LISTAGG(COLUMN_NAME,',')WITHINGROUP(ORDERBYCOLUMN_ID)
----------------------------------------------------------------------------------------------------------------------------------
EMPLOYEE_ID,FIRST_NAME,LAST_NAME,EMAIL,PHONE_NUMBER,HIRE_DATE,JOB_ID,SALARY,COMMISSION_PCT,MANAGER_ID,DEPARTMENT_ID

Unlike WM_CONCAT, LISTAGG allows you to specify the order in which the delimited values should be concatenated. It also allows you to specify the delimiter to use.
So you could use a “|” symbol, for example, or, if you have definite ideas about how a list of columns should be written you may consider something like :

select listagg( column_name, chr(10)||',') within group( order by column_id)
from user_tab_cols
where hidden_column = 'NO'
and table_name = 'EMPLOYEES'
/

LISTAGG(COLUMN_NAME,CHR(10)||',')WITHINGROUP(ORDERBYCOLUMN_ID)
--------------------------------------------------------------------------------
EMPLOYEE_ID                                                                     
,FIRST_NAME                                                                     
,LAST_NAME                                                                      
,EMAIL                                                                         
,PHONE_NUMBER                                                                   
,HIRE_DATE                                                                      
,JOB_ID                                                                         
,SALARY                                                                         
,COMMISSION_PCT                                                                 
,MANAGER_ID                                                                     
,DEPARTMENT_ID                      

Now, if only I could remember not to squeeze the toothpaste tube in the middle…


Filed under: Oracle, SQL Tagged: listagg, wm_concat

The Rest of the Django App – the View and Controller Tiers

Fri, 2017-04-21 15:27

As is the way of Software Projects, I’m starting to get a bit of pressure from the customer about delivery.
As is slightly less usual in such circumstances, the question I’m being asked is “when are you going to get out there and mow that lawn ?”
Fortunately, Django is “for perfectionists with deadlines” …or minions with gardening chores waiting (probably) so I’d better crack on.

Now, I could do with some assistance. Fortunately, these guys will be around to help :

Pay bananas, get minions.

In case you haven’t been following the story to date, this project is to create an Application to allow my better half to look at which movies we have on DVD or Blu-Ray.

So far my Django journey has consisted of :

Django follows the Model-View-Controller (MVC) pattern of application design. Having spent some time looking at the Database (Model) layer, we’re now going to turn our attention to the View (what the end-user sees) and the Controller ( the application logic that makes the application work).

Recap of the Current Application state

After developing the data model, which looks like this :

…the application codebase currently looks like this :

tree -L 2 --dirsfirst --noreport

We have a useable database for our application. Now we need to provide a means of presenting the application data to our users.

Before I go any further, I’m going to try and simplify matters somewhat when it comes to talking about file locations.

I’m going to set an environment variable called $DJANGO_HOME to hold the root directory of the Python virtual environment we’ll be using.
This is the directory that has manage.py in it.

To do this, I’ve written the following shell script, which also starts the virtual environment :

#!/bin/sh

export DJANGO_HOME=`pwd`
source dvdsenv/bin/activate
echo $DJANGO_HOME

Once we’ve granted execute permissions on the script…

chmod a+x set_virt.sh

…we can set our environment variable and start the virtual environment…

. ./set_virt.sh

For the remainder of this post, I’ll be referencing file locations using $DJANGO_HOME to denote the root directory of the python virtual environment.

Now, let’s take a first look at how to retrieve the application data from the database and present it to the users…

Our First Django Page

In $DJANGO_HOME/dvds we need to create some controller code. Confusingly, this is in a file called views.py :

from django.shortcuts import render
from django.views.generic import ListView

from .models import Title

class TitleList(ListView) :
    model = Title

There’s not really much to it. A simple ListView class based on the Title model object, which will list all of the records in the Title table.

Now, $DJANGO_HOME/dvds/urls.py (having stripped out the default comments for brevity):

from django.conf.urls import url
from django.contrib import admin

from dvds.views import TitleList
app_name = 'dvds'

urlpatterns = [
    url(r'^admin/', admin.site.urls),
    url(r'^$', TitleList.as_view(), name="main"),
]

So, we’ve imported the TitleList view we’ve just created in views.py and then added a pattern so that we can navigate to it.

If we go ahead and run this now, we won’t get the result we’re hoping for. By default, a ListView looks for a template called dvds/title_list.html and, as yet, we don’t have one…

…so let’s make one…

First, we want to create a directory in which to hold the templates. We’ll also want to keep templates from different applications separate so…

cd $DJANGO_HOME/dvds
mkdir templates
cd templates
mkdir dvds

Now, in the newly created directory, we want to create a file called title_list.html, which looks something like this…

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>Deb's DVD Library</title>
    </head>
    <body>
        <h1>DVD Library List</h1>
        <table border="1">
            <tr>
                <th>Title</th>
                <th>Released</th>
                <th>Certificate</th>
                <th>Format</th>
                <th>Director</th>
                <th>Synopsis</th>
                <th>series</th>
                <th>No. In Series</th>
                <th>Categories</th>
            </tr>
            {% for title in object_list %}
                <tr>
                    <td>{{title.title_name}}</td>
                    <td>{{title.year_released}}</td>
                    <td>{{title.bbfc_certificate}}</td>
                    <td>{{title.get_media_format_display}}</td>
                    <td>{{title.director}}</td>
                    <td>{{title.synopsis}}</td>
                    <td>{{title.series}}</td>
                    <td>{{title.number_in_series}}</td>
                    <td>
                        {% for cat in title.categories.all %}
                            {{cat}}&nbsp;
                        {% endfor %}
                    </td>
                </tr>
            {% endfor %}
        </table>
    </body>
</html>

This should look fairly familiar if you’ve used languages such as PHP. Effectively, the {% %} tags indicate programmatic structures ( in this case, for loops), and the {{}} encase actual values pulled from the database.

Examples of this in the above listing include :

  • Line 20 – loop through the Object List
  • Line 25 – display the plain db column values until here, where we use the built-in “get display” to retrieve the display value from the MEDIA_FORMAT_CHOICES list we’ve assigned to the media_format column (there’s a short sketch of this just after this list).
  • Line 31 – loop through all the values in the nested category column.
  • Line 36 – close the for loop
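
If you haven’t come across it before, get_FOO_display() is generated automatically by Django for any model field defined with a choices list. A quick, purely illustrative check from the Django shell (./manage.py shell), assuming at least one Title record exists :

from dvds.models import Title

title = Title.objects.first()                # assumes at least one Title has been added
print(title.media_format)                    # the stored value, e.g. 'BR'
print(title.get_media_format_display())      # the display value, e.g. 'Blu Ray'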

The effect of this is quite exciting… from a programming perspective ….

Not only have we managed to display all of the Title data we’ve added to our application, we can also see that :

  • Django is smart enough to return the name of the Series rather than the series id
  • we can display all of the categories by means of a for loop through the categories column in the Titles table

At this point though, the look-and-feel is rather…utilitarian. Not only that, it looks like I may be in for some fairly tedious copy-and-paste action unless I can find a way to re-use some of this code.
Maybe Django can help…

A Site Map

Before we go too much further, it’s probably worth pausing to consider which pages our application will ultimately comprise.
We already have our main page, but we’ll probably want to search for titles in our database.
Given that all the DML in the application is done through the Django Admin site that we get out of the box, we’ll also need to link to that.

Ultimately then, our application pages will be :

  • Main Page – listing of titles in the library
  • Search Page – to enter search criteria
  • Results Page – to display search results

To make the application look a bit nicer, each page will need to use the same style sheet.

cd $DJANGO_HOME/dvds
mkdir static
mkdir static/css
mkdir static/images

The file – $DJANGO_HOME/dvds/static/css/style.css looks like this…

*
{
    font-family: sans-serif;
    font-size: large;
    background: #e4ffc9;
}

table 
{
    border-collapse: collapse;
}
table, th, td
{
    border: 1px solid black;
    text-align: left;
    vertical-align: top;
}

.explanation
{
    display: table-row;
}


.explanation .landscape
{
    vertical-align: middle;
    height: 150px;
    width: 300px;
}


.explanation .portrait
{
    vertical-align: middle;
    height: 250px;
    width: 200px;    
}

.explanation textarea
{
    vertical-align: top;
    height: 150px;
    width: 400px;
    border: none;
    background: none;
    font-size: large;
}

.row
{
    display: table-row;
}

#search_form label
{
    display : table-cell;
    text-align : right;
}

#search_form input, select
{
    display : table-cell;
    width : 300px;
}

#search_form input[type=submit]
{
    font-size : large;
    font-weight : bold;
    border-radius:25px;
    background-color : #2FC7C9;
}

Once we’ve added the style sheet, along with the images we’re going to use in the application, the file layout in the static directory should look like this :

From a layout perspective, I’d like each page to have :

  • a navigation bar
  • in-line explanatory text about what to do on the page
  • the page specific content

If only I could apply a template for this…

Templates

First off we can create a template, called base.html which will serve to apply the layout to all of our application pages.
Essentially, the pages can inherit this template and then override the various sections (blocks).
We can also use this template to include all of the tedious HTML header stuff, as well as a link to our application style sheet.

The file is saved in the same location as our existing template ($DJANGO_HOME/dvds/templates/dvds) and looks like this :

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="utf8">
        <title>{% block title %}Base Title{% endblock %}</title>
        <link rel="stylesheet" type="text/css" href="/static/css/style.css" />
    </head>
    <body>
        {% block breadcrumb %}{% endblock %}
        {% block explanation %}{% endblock %}
        {% block page_content %}{% endblock %}
    </body>
</html> 

Before we incorporate this into our main page, we also need to consider that the code to list titles can be used in the application results page as well as in the main page. Fortunately, we can also move this out into a template.
The file is called display_titles.html and is saved in the same templates directory as all the other html files…

<table>
    <tr>
        <th>Title</th> <th>Released</th> <th>Certificate</th>
        <th>Format</th> <th>Director</th>
        <th>Synopsis</th> <th>Series</th> <th>No. In Series</th>
        <th>Categories</th> 
    </tr>
    {% for title in titles %}
        <tr>
            <td>{{title.title_name}}</td> <td>{{title.year_released}}</td> <td>{{title.bbfc_certificate}}</td>
            <td>{{title.get_media_format_display}}</td>
            <td>{{title.director}}</td> <td>{{title.synopsis}}</td> <td>{{title.series}}</td>
            <td>{{title.number_in_series}}</td>
            <td>
                {% for cat in title.categories.all %}
                    {{cat}}&nbsp;
                {% endfor %}
            </td>
        </tr>
    {% endfor %}
</table>

The final template component of our application is the navigation menu. The template is called breadcrumb.html and includes all of the breadcrumb menu entries in the application :

<img src="/static/images/deb_movie_logo.jpg">&nbsp;
{% if home %}<a href="/">Home</a>&nbsp;{% endif %}
{% if admin %}|&nbsp;<a href="admin" target="_blank">Add or Edit Titles</a>{% endif %}
{% if back %}|&nbsp;<a href="javascript:history.back(-1)">Back</a>&nbsp;{% endif %}
{% if search_refine %}|&nbsp;<a href="javascript:history.back(-1)">Refine Search</a>&nbsp;{% endif %}
<!-- additional logic required to see if search is first option -->
{% if search %}|&nbsp;<a href="/search">Search</a>{% endif %}
{% if search_again %}|&nbsp;<a href="/search">Search Again</a>{% endif %}

Which navigation items are displayed depends on how the breadcrumb template is called. Essentially we treat it like a function.

So, putting all of this together in our main page, we get something like :

{% extends "dvds/base.html" %}
{% block title %}Deb's Movie Library{% endblock %}
{% block breadcrumb %}{% include "dvds/breadcrumb.html" with search=True admin=True %}{% endblock %}
{% block explanation %}
    <div class="explanation">
        <h1>So, you'd like to watch a Film...</h1>
        <img src="static/images/avengers.jpg">&nbsp;
        <textarea>We are here to help. 
There's a list of films in the library right here. 
Alternatively, click on "Search" at the top of the page if you're looking for something specific.
        </textarea>
    </div>
{% endblock %}
{% block page_content %}{% include "dvds/display_titles.html" with titles=object_list %}{% endblock %}

To begin with, the extends tag inherits a template as the page parent. The blocks in the remainder of the page are used to override the blocks in the parent template (base.html).

The breadcrumb and page_content blocks use the include tag to import templates into the page.
In the case of the breadcrumb template, we specify the value of the search and admin parameters.
This results in these links being displayed.

When we run the application, the page will look like this :

Before we get there though, it’s probably an idea to put together the first draft of our other application pages…

Search Pages

To start with, we’ll have a fairly simple search screen, which simply involves searching on some user-entered text.

We’ll come onto the pages themselves shortly. First though…

for the search form itself, we’ll need a simple function in $DJANGO_HOME/dvds/views.py :

def search_titles(request):
    return render( request, 'dvds/search.html')

We need to add the appropriate code to execute the search. In views.py, we also add a function called search_results :

def search_results(request) :
    if 'search_string' in request.GET and request.GET['search_string'] :
        titles = Title.objects.filter(title_name__icontains=request.GET['search_string'])
    else :
        titles = Title.objects.all()
    return render( request, 'dvds/results.html', {'titles' : titles, 'search_string' : request.GET['search_string']})

So, if the user submits a search with the search_string populated, we find any titles which contain that string.
The icontains method performs a case-insensitive “LIKE” comparison (there’s a sketch below of how to see the SQL this generates).
If no search string is entered, we’ll display all records in the results.
The search results are then displayed in the dvds/results.html page.
The render function is also passing a couple of arguments to the results page :

  • the titles object containing the search results
  • the original search string entered by the user
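
As a quick sanity check, you can see the SQL that an icontains filter will generate by printing a QuerySet’s query attribute from the Django shell. This is just a sketch - the search term is made up :

from dvds.models import Title

qs = Title.objects.filter(title_name__icontains='spider')   # illustrative search term
print(qs.query)   # the generated SELECT - on SQLite, a (case-insensitive) LIKE on title_name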

In urls.py, we need to add the appropriate entries for these pages :

...
urlpatterns = [
    url(r'^admin/', admin.site.urls),
    url(r'^$', TitleList.as_view(), name="main"),
    url(r'^search/$', views.search_titles),
    url(r'^results/$', views.search_results),
]

Once again, the html code resides in the templates directory. It looks rather familiar.
The search form – search.html :

{% extends "dvds/base.html" %}
{% block title %}Find a Movie{% endblock %}

{% block breadcrumb %}{% include "dvds/breadcrumb.html" with home=True %}{% endblock %}

{% block explanation %}
    <div class="explanation">
        <h1>Search For A Title</h1>
        <img src="/static/images/spiderman_minion.jpg" >&nbsp;
        <textarea>Enter some text that the title you're searching for contains.</textarea>
    </div>
{% endblock %}

{% block page_content %}
    <form id="search_form" action="/results" method="GET">
        <!-- Name of the movie (or part therof) -->
            <label>Title Contains &nbsp;:&nbsp;</label>
            <input name="search_string" placeholder="Words in the Title" />
            <label>&nbsp;</label><input type="submit" value="Find My Movie"/>
    </form>
{% endblock %}

The results page, results.html :

{% extends "dvds/base.html" %}
{% block title %}Search Results{% endblock %}

{% block breadcrumb %}
    {% include "dvds/breadcrumb.html" with home=True back=True search_refine=True search_again=True %}
{% endblock %}

{% block explanation %}
    <div class="explanation">
        <h1>Search Results</h1>
        <img src="/static/images/minion_family.jpg" >&nbsp;
       <h2>You searched for titles ...</h2>
       <ul><li>containing the phrase&nbsp;<strong><em>{{search_string}}</em></strong></li></ul>
    </div>
{% endblock %}
{% block page_content %}
    {% if titles %}
        <h2>We found {{titles | length}} title{{ titles|pluralize}}...</h2>
        {% include "dvds/display_titles.html" %}
    {% else %}
        <h2>No Titles matched your search criteria</h2>
    {% endif %}
{% endblock %}

Notice that the call to breadcrumb.html provides different parameter values in the two files.
Now let’s give it all a try…

Click on the Search link from the Home Page and we should get :

Click the “Find My Movie” button and we get :

Refining the Search Functionality

As well as words in the title, it would be useful to be able to search on other criteria such as the Series to which a movie belongs, or maybe the Media Format.

In both cases our application has a limited number of values to select from. The Series table contains a list of all of the series in the application. The MEDIA_FORMAT_CHOICES list contains all of the valid media formats.

By including these objects in $DJANGO_HOME/dvds/views.py

from .models import Title, Series, MEDIA_FORMAT_CHOICES

…we can reference them in the search function we need to create (also in views.py)…

def search_titles(request):
    series_list = Series.objects.all()

    return render( request, 'dvds/search.html', {'series_list' : series_list, 'format_list':MEDIA_FORMAT_CHOICES})

The series_list and format_list objects passed in the call to render can now be used in the html template for the search screen – $DJANGO_HOME/dvds/templates/dvds/search.html :

{% extends "dvds/base.html" %}
{% block title %}Find a Movie{% endblock %}

{% block breadcrumb %}{% include "dvds/breadcrumb.html" with home=True %}{% endblock %}

{% block explanation %}
    <div class="explanation">
        <h1>Search For A Title</h1>
        <img class="portrait" src="/static/images/spiderman_minion.jpg" >&nbsp;
        <textarea>Enter some text that the title you're searching for contains, and/or a Series and/or a format.
Note - search criteria is additive
        </textarea>
    </div>
    <br />
{% endblock %}

{% block page_content %}
    <form id="search_form" action="/results" method="GET">
        <!-- Name of the movie (or part therof) -->
        <div class="row">    
            <label>Title Contains &nbsp;:&nbsp;</label>
            <input name="search_string" placeholder="Words in the Title" />
        </div>
        <!-- Series drop-down -->
        <div class="row">
            <label>Series&nbsp;:&nbsp;</label>
            <select class="drop_down_list" name="series_id">
                <option value=>None</option>
                {% for series in series_list %}
                    <option value={{series.id}}>{{series.series_name}}</option>
                {% endfor %}
            </select>
        </div>
        <!-- Media Format -->
        <div class="row">
            <label class="search_label">Media Format : </label>
            <select class="drop_down_list" name="media_format">
                <option value=>None</option>
                {% for id, value in format_list %}
                    <option value={{id}}>{{value}}</option>
                {% endfor %}
            </select>
        </div>
        <div class="row">
            <label>&nbsp;</label><input type="submit" value="Find My Movie" />
        </div>
    </form>
{% endblock %}

After all of that, our search screen now looks like this :

As for the search results, well, here’s where things get a bit tricky.
The search function in $DJANGO_HOME/dvds/views.py has changed a bit :

def search_results(request) :
    if 'search_string' in request.GET and request.GET['search_string'] :
        titles = Title.objects.filter(title_name__icontains=request.GET['search_string'])
    else :
        titles = Title.objects.all()
        search_string = None
    if 'series_id' in request.GET and request.GET['series_id'] :
        series_name = Series.objects.filter(pk = request.GET['series_id']).values_list('series_name', flat=True)[0]
        titles = titles.filter(series_id =  request.GET['series_id'])
    else :
        series_name = None
        titles = titles
    if 'media_format' in request.GET and request.GET['media_format'] :
        titles = titles.filter(media_format = request.GET['media_format'])
        # Get the display value to pass to the results page
        media_format = mf_display(request.GET['media_format'])
    else :
        titles = titles
        media_format = None
    return render( request, 'dvds/results.html', {'titles' : titles, 'search_string' : request.GET['search_string'], 'series_name' : series_name, 'media_format': media_format})

The first point to note is that, despite the multiple assignment to the titles object, the only database call that will actually be made is the one when render is called. Prior to that, the search conditions will be added. The effect is pretty much the same as building a SQL statement dynamically, adding predicates based on various conditions, before finally executing the finished statement.
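
If you want to convince yourself of this, a quick (illustrative) session in ./manage.py shell shows that chaining filters doesn’t touch the database until the QuerySet is actually consumed :

from dvds.models import Title

titles = Title.objects.all()                          # no query issued yet
titles = titles.filter(media_format='BR')             # still no query
titles = titles.filter(title_name__icontains='man')   # still nothing
print(titles.query)                                   # inspect the single, combined SELECT
results = list(titles)                                # the query only runs here (or when the template iterates)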

We still want to display the search criteria values that were entered.
Whilst looking up display values in the context of a record retrieved from the database is simple enough, the same is not true outside of this context.

For the Series Name, I’ve taken the rather profligate approach of performing an additional database lookup (line 7 in the listing above). In a higher-volume application, you might well look for something a bit less resource intensive.
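
For what it’s worth, one entirely hypothetical alternative (not part of the application code above) would be to cache the lookup in views.py, so that repeated searches against the same Series only hit the database once per process :

from functools import lru_cache
from .models import Series

@lru_cache(maxsize=None)
def series_name_for(series_id):
    # cached per process - bear in mind the cache will go stale if a Series is renamed
    return Series.objects.filter(pk=series_id).values_list('series_name', flat=True).first()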

As for the media format, views.py now also includes this function :

def mf_display( media_format) :
    for (val, mf_name) in MEDIA_FORMAT_CHOICES :
        if val == media_format :
            return mf_name
    return None

…which is invoked on line 16.

This enables us to pass the appropriate values to the results page, $DJANGO_HOME/dvds/templates/dvds/results.html, which now looks like this :

{% extends "dvds/base.html" %}
{% block title %}Search Results{% endblock %}

{% block breadcrumb %}
    {% include "dvds/breadcrumb.html" with home=True search_refine=True search_again=True %}
{% endblock %}

{% block explanation %}
    <div class="explanation">
        <h1>Search Results</h1>
        <img class="landscape" src="/static/images/minion_family.jpg" >&nbsp;
       <h2>You searched for titles ...</h2>
       <ul>
            {% if search_string %}
                <li>containing the phrase&nbsp;<strong><em>{{search_string}}</em></strong></li>
            {% endif %}
            {% if series_name %}
                <li>in the series&nbsp;<strong><em>{{series_name}}</em></strong></li>
            {% endif %}
            {% if media_format %}
                <li>on&nbsp;<strong><em>{{media_format}}</em></strong></li>
            {% endif %}
       </ul>
    </div>
{% endblock %}
{% block page_content %}
    {% if titles %}
        <h2>We found {{titles | length}} title{{ titles|pluralize}}...</h2>
        {% include "dvds/display_titles.html" %}
    {% else %}
        <h2>No Titles matched your search criteria</h2>
    {% endif %}
{% endblock %}

The End Result

It’s probably helpful at this point to provide complete listings of all of the code we’ve changed.
This is partly to show that we’ve accomplished a fair bit with Django with surprisingly little code. The more practical reason is to help in the event that I’ve been a bit unclear as to which file a certain code snippet might be in.

Starting in $DJANGO_HOME/dvds we already had the Model layer of our application when we started :

models.py
from django.db import models
from django.core.validators import MinValueValidator

# Allowable values Lists
MEDIA_FORMAT_CHOICES = (
    ('BR', 'Blu Ray'),
    ('DVD', 'DVD'),
)

# British Board of Film Classification Certificates
# as per the official BBFC site - http://www.bbfc.co.uk
BBFC_CHOICES = (
    ('U', 'U'),
    ('PG', 'PG'),
    ('12A', '12A'),
    ('15', '15'),
    ('18', '18'),
    ('R18', 'R18'),
)

class Category(models.Model) :
    # Object properties defined here map directly to database columns.
    # Note that Django creates a synthetic key by default so no need to
    # specify one here
    category_name = models.CharField(max_length = 50, unique = True)
        
    def __str__(self):
        return self.category_name

    class Meta :
        # set the default behaviour to be returning categories in alphabetical order by category_name
        ordering = ["category_name"]
        # Define the correct plural of "Category". Among other places, this is referenced in the Admin application
        verbose_name_plural = "categories"

class Series(models.Model) :
    series_name = models.CharField( max_length = 50, unique = True)

    def __str__(self) :
        return self.series_name

    class Meta :
        ordering = ["series_name"]
        verbose_name_plural = "series"

class Title( models.Model) :

    # For optional fields, blank = True means you can leave the field blank when entering the record details
    # null = True means that the column is nullable in the database 
    title_name = models.CharField( max_length = 250)
    year_released = models.IntegerField(validators=[MinValueValidator(1878)]) # Movies invented around 1878.
    bbfc_certificate = models.CharField("BBFC Certificate", max_length = 3, choices = BBFC_CHOICES)
    media_format = models.CharField("Format", max_length = 3, choices = MEDIA_FORMAT_CHOICES)
    director = models.CharField(max_length = 100, null=True, blank=True)
    synopsis = models.CharField( max_length = 4000, null = True, blank = True)
    series = models.ForeignKey( Series, on_delete = models.CASCADE, null = True, blank = True)
    number_in_series = models.IntegerField(null = True, blank = True)
    categories = models.ManyToManyField(Category, blank = True)

    class Meta :
        ordering = ["series", "number_in_series", "title_name"]
        # Natural Key for a Title record is title_name, year_released and media_format - we have some films on DVD AND Blu-Ray.
        unique_together = ('title_name', 'year_released', 'media_format',)

    def __str__(self) :
        return self.title_name

Additionally, we now have the following files which pretty much comprise the Controller layer of our MVC application :

urls.py
from django.conf.urls import url
from django.contrib import admin

from dvds.views import TitleList
from dvds import views
app_name = 'dvds'

urlpatterns = [
    url(r'^admin/', admin.site.urls),
    url(r'^$', TitleList.as_view(), name="main"),
    url(r'^search/$', views.search_titles),
    url(r'^results/$', views.search_results),
]

views.py
from django.shortcuts import render
from django.views.generic import ListView

from .models import Title, Series, MEDIA_FORMAT_CHOICES

def mf_display( media_format) :
    # Get the display value for a MEDIA_FORMAT_CHOICES entry.
    # NOTE this is for use in the search results screen where we confirm
    # the search criteria entered so it's NOT in the context of a Title record at that point.
    for (val, mf_name) in MEDIA_FORMAT_CHOICES :
        if val == media_format :
            return mf_name
    return None

class TitleList(ListView):
    model = Title

def search_titles(request):
    series_list = Series.objects.all()

    return render( request, 'dvds/search.html', {'series_list' : series_list, 'format_list':MEDIA_FORMAT_CHOICES})

def search_results(request) :
    if 'search_string' in request.GET and request.GET['search_string'] :
        titles = Title.objects.filter(title_name__icontains=request.GET['search_string'])
    else :
        titles = Title.objects.all()
        search_string = None
    if 'series_id' in request.GET and request.GET['series_id'] :
        series_name = Series.objects.filter(pk = request.GET['series_id']).values_list('series_name', flat=True)[0]
        titles = titles.filter(series_id =  request.GET['series_id'])
    else :
        series_name = None
        titles = titles
    if 'media_format' in request.GET and request.GET['media_format'] :
        titles = titles.filter(media_format = request.GET['media_format'])
        # Get the display value to pass to the results page
        media_format = mf_display(request.GET['media_format'])
    else :
        titles = titles
        media_format = None
    return render( request, 'dvds/results.html', {'titles' : titles, 'search_string' : request.GET['search_string'], 'series_name' : series_name, 'media_format': media_format})

The View MVC Layer is found in $DJANGO_HOME/dvds/templates/dvds. First, the templates that are extended or included :

base.html
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="utf8">
        <title>{% block title %}Base Title{% endblock %}</title>
        <link rel="stylesheet" type="text/css" href="/static/css/style.css" />
    </head>
    <body>
        {% block breadcrumb %}{% endblock %}
        {% block explanation %}{% endblock %}
        {% block page_content %}{% endblock %}
    </body>
</html> 
breadcrumb.html
<img src="/static/images/deb_movie_logo.jpg">&nbsp;
{% if home %}<a href="/">Home</a>&nbsp;{% endif %}
{% if admin %}|&nbsp;<a href="admin" target="_blank">Add or Edit Titles</a>{% endif %}
{% if back %}|&nbsp;<a href="javascript:history.back(-1)">Back</a>&nbsp;{% endif %}
{% if search_refine %}|&nbsp;<a href="javascript:history.back(-1)">Refine Search</a>&nbsp;{% endif %}
<!-- additional logic required to see if search is first option -->
{% if search %}|&nbsp;<a href="/search">Search</a>{% endif %}
{% if search_again %}|&nbsp;<a href="/search">Search Again</a>{% endif %}
display_titles.html
<table>
    <tr>
        <th>Title</th> <th>Released</th> <th>Certificate</th>
        <th>Format</th> <th>Director</th>
        <th>Synopsis</th> <th>Series</th> <th>No. In Series</th>
        <th>Categories</th> 
    </tr>
    {% for title in titles %}
        <tr>
            <td>{{title.title_name}}</td> <td>{{title.year_released}}</td> <td>{{title.bbfc_certificate}}</td>
            <td>{{title.get_media_format_display}}</td>
            <td>{{title.director}}</td> <td>{{title.synopsis}}</td> <td>{{title.series}}</td>
            <td>{{title.number_in_series}}</td>
            <td>
                {% for cat in title.categories.all %}
                    {{cat}}&nbsp;
                {% endfor %}
            </td>
        </tr>
    {% endfor %}
</table>

Finally, the application pages themselves, starting with :

title_list.html
{% extends "dvds/base.html" %}
{% block title %}Deb's Movie Library{% endblock %}
{% block breadcrumb %}{% include "dvds/breadcrumb.html" with search=True admin=True %}{% endblock %}
{% block explanation %}
    <div class="explanation">
        <h1>So, you'd like to watch a Film...</h1>
        <img class="landscape" src="static/images/avengers.jpg">&nbsp;
        <textarea>We are here to help. 
There's a list of films in the library right here. 
Alternatively, click on "Search" at the top of the page if you're looking for something specific.
        </textarea>
    </div>
{% endblock %}
{% block page_content %}{% include "dvds/display_titles.html" with titles=object_list %}{% endblock %}

…which looks like this :

search.html
{% extends "dvds/base.html" %}
{% block title %}Find a Movie{% endblock %}

{% block breadcrumb %}{% include "dvds/breadcrumb.html" with home=True %}{% endblock %}

{% block explanation %}
    <div class="explanation">
        <h1>Search For A Title</h1>
        <img class="portrait" src="/static/images/spiderman_minion.jpg" >&nbsp;
        <textarea>Enter some text that the title you're searching for contains, and/or a Series and/or a format.
Note - search criteria is additive
        </textarea>
    </div>
    <br />
{% endblock %}

{% block page_content %}
    <form id="search_form" action="/results" method="GET">
        <!-- Name of the movie (or part therof) -->
        <div class="row">    
            <label>Title Contains &nbsp;:&nbsp;</label>
            <input name="search_string" placeholder="Words in the Title" />
        </div>
        <!-- Series drop-down -->
        <div class="row">
            <label>Series&nbsp;:&nbsp;</label>
            <select class="drop_down_list" name="series_id">
                <option value=>None</option>
                {% for series in series_list %}
                    <option value={{series.id}}>{{series.series_name}}</option>
                {% endfor %}
            </select>
        </div>
        <!-- Media Format -->
        <div class="row">
            <label class="search_label">Media Format : </label>
            <select class="drop_down_list" name="media_format">
                <option value=>None</option>
                {% for id, value in format_list %}
                    <option value={{id}}>{{value}}</option>
                {% endfor %}
            </select>
        </div>
        <div class="row">
            <label>&nbsp;</label><input type="submit" value="Find My Movie" />
        </div>
    </form>
{% endblock %}

…which, remember, looks like this :

Finally, we have :

results.html

{% extends "dvds/base.html" %}
{% block title %}Search Results{% endblock %}

{% block breadcrumb %}
    {% include "dvds/breadcrumb.html" with home=True search_refine=True search_again=True %}
{% endblock %}

{% block explanation %}
    <div class="explanation">
        <h1>Search Results</h1>
        <img class="landscape" src="/static/images/minion_family.jpg" >&nbsp;
       <h2>You searched for titles ...</h2>
       <ul>
            {% if search_string %}
                <li>containing the phrase&nbsp;<strong><em>{{search_string}}</em></strong></li>
            {% endif %}
            {% if series_name %}
                <li>in the series&nbsp;<strong><em>{{series_name}}</em></strong></li>
            {% endif %}
            {% if media_format %}
                <li>on&nbsp;<strong><em>{{media_format}}</em></strong></li>
            {% endif %}
       </ul>
    </div>
{% endblock %}
{% block page_content %}
    {% if titles %}
        <h2>We found {{titles | length}} title{{ titles|pluralize}}...</h2>
        {% include "dvds/display_titles.html" %}
    {% else %}
        <h2>No Titles matched your search criteria</h2>
    {% endif %}
{% endblock %}

which now looks like this :

Obviously, as more films get added to the library, further enhancements will be needed.
For now though, 1.0 is ready to see the light of day…

Bananas all round !


Filed under: python Tagged: display a choices list name from a value, Django, drop down lists from choices list, drop-down lists from table values, extends by template, get display, include template, ListView, templates, tree command, urls.py, views.py

The Django Fandango Farrago – Looking at Django’s Physical Data Model Design

Wed, 2017-03-15 08:40

I’m sure I’m not the only Oracle Developer who, over the years, has conjured a similar mental image during a planning meeting for a new web-based application…


…and we’re going to use an ORM

If you want the full gory details as to why this is so troubling from an Oracle database perspective, it is a topic I have covered at length previously.

This time, however, things are different.
Yes, I am somewhat limited in my choice of database due to the hardware my application will run on (Raspberry Pi).
Yes, Django is a logical choice for a framework as I’m developing in Python.
But, here’s the thing, I plan to do a bit of an audit of the database code that Django spits out.
< obligatory-Monty-Python-reference >That’s right Django, No-one expects the Spanish Inquisition ! < obligatory-Monty-Python-reference / >


Donde esta el Base de datos ?!

I know, this is a character from Blackadder and not Monty Python, but I’ve often regretted the fact that there never seems to be a vat of warm marmalade around (or some kind of gardening implement for that matter), when you enter those all important application architecture discussions at the start of a project.

As a result, one or two further Blackadder references may have crept in to the remainder of this post…

What we’re looking at

The Application I’m developing is as described in my previous post and we’ll be using SQLite as the database for our application.

What I’ll be covering here is :

  • The physical data model we want to implement for our DVD Library Application
  • Using Django to generate the data Model
  • Installation and use of the SQLite command line
  • Tweaking our code to improve the model

We’re not too concerned about performance at this point. The application is low-volume in terms of both data and traffic.
I’ll point out aspects of the code that have a potential performance impact as and when they come up (and I notice them), but performance optimisation is not really the objective here.
The main aim is to ensure that we maximise the benefits of using a relational database by ensuring data integrity.

The target model

By default, Django applies a synthetic key to each table it creates. I have indulged this proclivity in the model that follows, although it’s something I will return to later on.

The application I’m building is a simple catalogue of DVDs and Blu-Rays we have lying around the house.
The main table in this application will be TITLE, which will hold details of each Title we have on Disk.
Note that the Unique Key for this table is the combination of TITLE_NAME, YEAR_RELEASED and MEDIA_FORMAT. Yes I do have some films on both DVD and Blu-Ray.
As for the relationships :

  • a film/tv SERIES may have one, or more than one TITLE
  • a TITLE may belong to one or more CATEGORY
  • a CATEGORY may apply to one or more TITLE

So, in addition to our main data table, TITLE, we need two reference tables – SERIES and CATEGORY. We also need a join table between CATEGORY and TITLE to resolve the many-to-many relationship between them.
Each of the tables will have a Synthetic Key, which makes storing of Foreign Key values simple. However, Synthetic Key values alone are no guarantee of the uniqueness of a record (beyond that of the key itself), so these tables will also require unique constraints on their Natural Keys to prevent duplicate records being added.

The final data model should ideally look something like this :


Fun with Synthetic Keys

The first tables we’re going to generate are the CATEGORY and SERIES reference tables.
As we’re using Django, we don’t need to type any SQL for this.
Instead, we need to go to the project directory and create a file called models.py.

So, if we’re using the installation I setup previously…

cd ~/dvds/dvds
nano models.py

…and now we can define the CATEGORY object like this :

from django.db import models

class Category(models.Model) :
    # Object properties defined here map directly to database columns.
    category_name = models.CharField(max_length = 50)

    def __str__(self):
        return self.category_name

We now need to tell Django to implement (migrate) this definition to the database so…

cd ~/dvds
./manage.py makemigrations dvds
./manage.py migrate

Now, if this were a common or garden Python article, we’d be heading over to the Python interactive command line (possibly via another Monty Python reference). The fact is though that I’m getting withdrawal symptoms from not writing any SQL so, we’re going to install a CLI for SQLite.
Incidentally, if hacking around on the command line is not your idea of “a big party”, you can always go down the route of obtaining an IDE for SQLite – SQLite Studio seems as good as any for this purpose.

If like me however, you regard the command line as an opportunity for “a wizard-jolly time”…

sudo apt-get install sqlite3

…and to access the CLI, we can now simply run the following :

cd ~/dvds
sqlite3 db.sqlite3

Django will have created the table using the application name as a prefix. So, in SQLite, we can see the DDL used to generate the table by running …

.schema dvds_category

The output (reformatted for clarity) is :

CREATE TABLE "dvds_category"
(
    "id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
    "category_name" varchar(50) NOT NULL
);

The problem with this particular table can be demonstrated easily enough (incidentally, a Blokebuster is the opposite of a Chick Flick, in case you’re wondering)…

insert into dvds_category(category_name) values ('BLOKEBUSTER');

insert into dvds_category(category_name) values ('BLOKEBUSTER');

select *
from dvds_category;

1|BLOKEBUSTER
2|BLOKEBUSTER

As is evident, the Unique Key on category_name has not been implemented. Without this, the Synthetic Key on the table (the ID column) does nothing to prevent the addition of what are, in effect, duplicate records.

After tidying up…

delete from dvds_category;
.quit

…we need to re-visit the Category class in models.py…

from django.db import models

class Category(models.Model) :
    # Object properties defined here map directly to database columns.
    # Note that Django creates a synthetic key by default so no need to
    # specify one here
    category_name = models.CharField(max_length = 50, unique = True)

    def __str__(self):
        return self.category_name

This time, we’ve told Django that category_name has to be unique as well. So, when we migrate our change…

cd ~/dvds
./manage.py makemigrations dvds
./manage.py migrate

…and check the DDL that Django has used this time…

sqlite3 db.sqlite3
.schema dvds_category

…we can see that Django has added a Unique Constraint on the category_name…

CREATE TABLE "dvds_category"
(
    "id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
    "category_name" varchar(50) NOT NULL UNIQUE
);

…meaning that we now no longer get duplicate category_names in the table…

insert into dvds_category(category_name) values('BLOKEBUSTER');
insert into dvds_category(category_name) values('BLOKEBUSTER');
Error: UNIQUE constraint failed: dvds_category.category_name

It’s worth noting here that some RDBMS engines create a Unique Index to enforce a Primary Key. Were this the case for this table, you’d end up with two indexes on a two-column table. This would not be the most efficient approach in terms of performance or storage.
Assuming that’s not a problem, we can move on and add the Series object to models.py as its structure is similar to that of Category…

from django.db import models

class Category(models.Model) :
    # Object properties defined here map directly to database columns.
    # Note that Django creates a synthetic key by default so no need to
    # specify one here
    category_name = models.CharField(max_length = 50, unique = True)

    def __str__(self):
        return self.category_name

class Series(models.Model) :
    series_name = models.CharField( max_length = 50, unique = True)

    def __str__(self) :
        return self.series_name

…and deploy it…

cd ~/dvds
./manage.py makemigrations dvds
./manage.py migrate

…which should result in a table that looks like this in SQLite :

CREATE TABLE "dvds_series"
(
    "id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
    "series_name" varchar(50) NOT NULL UNIQUE
);

Fun as it is messing around in the database on a command line, it’s not very frameworky…

DML using the Admin Interface

It’s a fairly simple matter to persuade Django to provide an interface that allows us to manage the data in our tables.
Step forward admin.py. This file lives in the same directory as models.py and, for our application as it stands at the moment, contains :

from django.contrib import admin
from .models import Category, Series

#Tables where DML is to be managed via admin
admin.site.register(Category)
admin.site.register(Series)

Save this. There’s no migration to run this time, as admin.py doesn’t change the data model.

Now, if we run the server…

./manage.py runserver

…we can navigate to the admin site (appending /admin to the development server URL)…


You can then connect using the credentials of the super user you created when you set up Django initially.

Once connected, you’ll notice that Django admin has a bit of an issue with pluralising our table names


We’ll come back to this in a mo. First though, let’s add some Category records…

Click the Add icon next to “Categorys” and you’ll see…

Once we’ve added a few records, we can see a list of Categories just by clicking on the name of the table in the Admin UI :


This list appears to be sorted by most recently added Category first. It may well be that we would prefer this listing to be sorted in alphabetical order.

We can persuade Django to implement this ordering for our tables, as well as correctly pluralizing our table names by adding a Meta class for each of the corresponding classes in models.py :

from django.db import models

class Category(models.Model) :
    # Object properties defined here map directly to database columns.
    # Note that Django creates a synthetic key by default so no need to
    # specify one here
    category_name = models.CharField(max_length = 50, unique = True)

    def __str__(self):
        return self.category_name

    class Meta :
        # set the default behaviour to be returning categories in alphabetical order by category_name
        ordering = ["category_name"]
        # Define the correct plural of "Category". Among other places, this is referenced in the Admin application
        verbose_name_plural = "categories"

class Series(models.Model) :
    series_name = models.CharField( max_length = 50, unique = True)

    def __str__(self) :
        return self.series_name

    class Meta :
        ordering = ["series_name"]
        verbose_name_plural = "series"

Once we migrate these changes :

./manage.py makemigrations dvds
./manage.py migrate

…and restart the dev server…

./manage.py runserver

…we can see that we’ve managed to cure the Admin app of it’s speech impediment…


…and that the Category records are now ordered alphabetically…


It’s worth noting that specifying the ordering of records in this way will cause an additional sort operation whenever Django goes to the database to select from this table.
For our purposes the overhead is negligible. However, this may not be the case for larger tables.
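
You can see the extra ORDER BY for yourself by printing the SQL that Django generates - again, just an illustrative check from ./manage.py shell :

from dvds.models import Category

print(Category.objects.all().query)
# ...which ends with : ORDER BY "dvds_category"."category_name" ASC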

So far, we’ve looked at a couple of fairly simple reference data tables. Now however, things are about to get rather more interesting…

Foreign Keys and other exotic database constructs

The Title object (and its corresponding table) are at the core of our application.
Unsurprisingly therefore, it’s the most complex class in our models.py.

In addition to the Referential Integrity constraints that we need to implement, there are also the media_format and bbfc_certificate fields, which can contain one of a small number of static values.
We also need to account for the fact that Django doesn’t really do composite Primary Keys.
I’m going to go through elements of the code for Title a bit at a time before presenting the final models.py file in its entirety.

To start with then, we’ll want to create a couple of choices lists for Django to use to validate values for some of the columns in the Title table…

# Allowable values Lists
MEDIA_FORMAT_CHOICES = (
    ('BR', 'Blu Ray'),
    ('DVD', 'DVD'),
)

# British Board of Film Classification Certificates
# as per the official BBFC site - http://www.bbfc.co.uk
BBFC_CHOICES = (
    ('U', 'U'),
    ('PG', 'PG'),
    ('12A', '12A'),
    ('15', '15'),
    ('18', '18'),
    ('R18', 'R18'),
)

In a database, you would expect these valid values to be implemented by check constraints. Django, however, goes its own way on this. Judging by the lack of resulting database constraints, these choices lists are only enforced so long as you always populate/update the underlying tables via the Django application itself.
Incidentally, it is possible to reference these name/value pairs in Django templates should the need arise, something I will cover in a future post. It is for this reason that I’ve declared them outside of the classes in which they’re used here.
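
To illustrate the point (a sketch using made-up values), the choices are applied by Django’s validation layer - full_clean() or a ModelForm - rather than by the database itself :

from django.core.exceptions import ValidationError
from dvds.models import Title

t = Title(title_name='Not A Real Film', year_released=2017,
    bbfc_certificate='XX', media_format='VHS')   # neither value appears in its choices list
try:
    t.full_clean()              # Django's validation rejects the invalid choices...
except ValidationError as e:
    print(e.message_dict)
# ...but a bare t.save() would still write the row quite happily - there is no check constraint in the database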

The same appears to apply to the check we’ve added to ensure that we don’t get a silly value for the year a film was released (it’s enforced by Django rather than by a database constraint), which necessitates …

from django.core.validators import MinValueValidator
...
year_released = models.IntegerField(validators=[MinValueValidator(1878)]) # Movies invented around 1878.

Our first attempt at the Title class looks like this :

class Title( models.Model) :

    # For optional fields, blank = True means you can leave the field blank when entering the record details
    # null = True means that the column is nullable in the database
    title_name = models.CharField( max_length = 250)
    year_released = models.IntegerField(validators=[MinValueValidator(1878)]) # Movies invented around 1878.
    bbfc_certificate = models.CharField("BBFC Certificate", max_length = 3, choices = BBFC_CHOICES)
    media_format = models.CharField("Format", max_length = 3, choices = MEDIA_FORMAT_CHOICES)
    director = models.CharField(max_length = 100, null=True, blank=True)
    synopsis = models.CharField( max_length = 4000, null = True, blank = True)
    series = models.ForeignKey( Series, on_delete = models.CASCADE, null = True, blank = True)
    number_in_series = models.IntegerField(null = True, blank = True)
    categories = models.ManyToManyField(Category, blank = True)

    class Meta :
        ordering = ["series", "number_in_series", "title_name"]

    def __str__(self) :
        return self.title_name

Hang on, haven’t I forgotten something here ? Surely I need some way of implementing the Natural Key on this table ?
You’re right. However this omission is deliberate at this stage, for reasons that will become apparent shortly.
Yes, this is part of a plan “so cunning you could brush your teeth with it”.

Even without this key element, there’s quite a lot going on here. In the main class :

  • the year_released cannot be before 1878
  • the bbfc_certificate and media_format columns are associated with their choices lists using the choices option
  • we’ve specified that series as type models.ForeignKey
  • we’ve specified categories as the somewhat intriguing type models.ManyToManyField

In the Meta class, we’ve stipulated a multi-column ordering clause. Note that the default (ascending) ordering appears to put nulls first. Therefore Title records that have null series and number_in_series values will appear first.

When we plug this into our models.py and apply the changes…

./manage.py makemigrations dvds
./manage.py migrate

…then check in the database…

sqlite3 db.sqlite3
.tables dvds_title%
dvds_title             dvds_title_categories

…we can see that Django has created not one, but two new tables.

In addition to the DVDS_TITLE table, which we may have expected and which looks like this :

CREATE TABLE dvds_title (
    id               INTEGER        NOT NULL
                                    PRIMARY KEY AUTOINCREMENT,
    title_name       VARCHAR (250)  NOT NULL,
    year_released    INTEGER        NOT NULL,
    bbfc_certificate VARCHAR (3)    NOT NULL,
    media_format     VARCHAR (3)    NOT NULL,
    director         VARCHAR (100),
    synopsis         VARCHAR (4000),
    number_in_series INTEGER,
    series_id        INTEGER        REFERENCES dvds_series (id)
);

…Django has been smart enough to create a join table to resolve the many-to-many relationship between TITLE and CATEGORY :

CREATE TABLE dvds_title_categories (
    id          INTEGER NOT NULL
                        PRIMARY KEY AUTOINCREMENT,
    title_id    INTEGER NOT NULL
                        REFERENCES dvds_title (id),
    category_id INTEGER NOT NULL
                        REFERENCES dvds_category (id)
);

Whilst Django can’t resist slapping on a gratuitous Synthetic Key, it is at least clever enough to realise that a composite key is also required. To this end, it also creates a Unique Index on DVDS_TITLE_CATEGORIES :

CREATE UNIQUE INDEX dvds_title_categories_title_id_96178db6_uniq ON dvds_title_categories (
    title_id,
    category_id
);

So, it seems that Django can handle composite keys after all. Well, not quite.

Remember that we still need to add a unique key to TITLE as we’ve modelled it to have a Natural Key consisting of TITLE_NAME, YEAR_RELEASED and MEDIA_FORMAT.

We can do that easily enough, simply by adding a unique_together clause to Title’s Meta class in models.py :

class Meta :
    ordering = ["series", "number_in_series", "title_name"]
    # Natural Key for a Title record is title_name, year_released and media_format - we have some films on DVD AND Blu-Ray.
    unique_together = ('title_name', 'year_released', 'media_format',)

If we now apply this change…

./manage.py makemigrations dvds
./manage.py migrate

…we can see that Django has added the appropriate index…

CREATE UNIQUE INDEX dvds_title_title_name_ae9b05c4_uniq ON dvds_title (
    title_name,
    year_released,
    media_format
);

The really wacky thing about all this is that, if we had used the unique_together option in the first place, Django would not have created the Unique Key on the DVDS_TITLE_CATEGORIES table. However, as we’ve added Title’s Natural Key in a separate migration, Django leaves the Unique Key on DVDS_TITLE_CATEGORIES in place.
Irrespective of how practical the Synthetic Key on DVDS_TITLE may be, the fact is, it is defined as the Primary Key for that table. As DVDS_TITLE_CATEGORIES is a Join Table then, in relational terms, it should itself have a Natural Key consisting of the Primary Keys of the two tables it’s joining.
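
As an aside, if you did want that Natural Key on the join table guaranteed irrespective of the order in which migrations are applied, one hypothetical option is to define the join table explicitly as a “through” model and declare the unique constraint yourself, e.g. :

class TitleCategory(models.Model) :
    title = models.ForeignKey(Title, on_delete = models.CASCADE)
    category = models.ForeignKey(Category, on_delete = models.CASCADE)

    class Meta :
        unique_together = ('title', 'category',)

# ...with the Title.categories field then declared as :
# categories = models.ManyToManyField(Category, blank = True, through = 'TitleCategory')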

Anyway, our final models.py looks like this :

from django.db import models
from django.core.validators import MinValueValidator

# Allowable values Lists
MEDIA_FORMAT_CHOICES = (
    ('BR', 'Blu Ray'),
    ('DVD', 'DVD'),
)

# British Board of Film Classification Certificates
# as per the official BBFC site - http://www.bbfc.co.uk
BBFC_CHOICES = (
    ('U', 'U'),
    ('PG', 'PG'),
    ('12A', '12A'),
    ('15', '15'),
    ('18', '18'),
    ('R18', 'R18'),
)

class Category(models.Model) :
    # Object properties defined here map directly to database columns.
    # Note that Django creates a synthetic key by default so no need to
    # specify one here
    category_name = models.CharField(max_length = 50, unique = True)

    def __str__(self):
        return self.category_name

    class Meta :
        # set the default behaviour to be returning categories in alphabetical order by category_name
        ordering = ["category_name"]
        # Define the correct plural of "Category". Among other places, this is referenced in the Admin application
        verbose_name_plural = "categories"

class Series(models.Model) :
    series_name = models.CharField( max_length = 50, unique = True)

    def __str__(self) :
        return self.series_name

    class Meta :
        ordering = ["series_name"]
        verbose_name_plural = "series"

class Title( models.Model) :

    # For optional fields, blank = True means you can leave the field blank when entering the record details
    # null = True means that the column is nullable in the database
    title_name = models.CharField( max_length = 250)
    year_released = models.IntegerField(validators=[MinValueValidator(1878)]) # Movies invented around 1878.
    bbfc_certificate = models.CharField("BBFC Certificate", max_length = 3, choices = BBFC_CHOICES)
    media_format = models.CharField("Format", max_length = 3, choices = MEDIA_FORMAT_CHOICES)
    director = models.CharField(max_length = 100, null=True, blank=True)
    synopsis = models.CharField( max_length = 4000, null = True, blank = True)
    series = models.ForeignKey( Series, on_delete = models.CASCADE, null = True, blank = True)
    number_in_series = models.IntegerField(null = True, blank = True)
    categories = models.ManyToManyField(Category, blank = True)

    class Meta :
        ordering = ["series", "number_in_series", "title_name"]
        # Natural Key for a Title record is title_name, year_released and media_format - we have some films on DVD AND Blu-Ray.
        unique_together = ('title_name', 'year_released', 'media_format',)

    def __str__(self) :
        return self.title_name

We also want to add Title to admin.py so that we can perform DML on the table in the admin application. Hence our final admin.py looks like this :

from django.contrib import admin
from .models import Category, Series, Title

#Tables where DML is to be managed via admin
admin.site.register(Category)
admin.site.register(Series)
admin.site.register(Title)

Conclusion

Django makes a pretty decent fist of implementing and maintaining a Relational Data Model without the developer having to write a single line of SQL.
Of course, as with any code generator, some of its design decisions may not be those that you might make if you were writing the code by hand.
So, if the data model and its physical implementation are important to your application, then it’s probably worth just checking up on what Django is up to in the database.
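For example, assuming an Oracle back end, a quick sanity check on the constraints Django has actually generated can be as simple as something like this (a sketch – adjust the table name pattern to suit your own models) :

-- adjust the LIKE pattern to match your own generated table names
select table_name, constraint_name, constraint_type
from user_constraints
where table_name like 'DVDS%'
order by table_name, constraint_name
/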


Filed under: python, SQL Tagged: admin.py, Django, foreign key, makemigrations, manage.py, migrate, models.py, Natural Key, runserver, SQLite, synthetic key, unique_together

Configuring Django with Apache on a Raspberry Pi

Tue, 2017-02-21 07:07

Deb has another job for me to do around the house.
She would like to have a means of looking up which Films/TV Series we have lying around on Blu-Ray or DVD so she can save time looking for films we haven’t actually got. Just to be clear, she doesn’t mind hunting around for the disc in question, she just wants to make sure that it’s somewhere to be found in the first place.
She wants to be able to do this on any device at any time (let’s face it, there’s even a browser on your telly these days).
As DIY jobs go, this is a long way from being the worst as far as I’m concerned. After all, this time I should be able to put something together without the potential for carnage that’s usually attendant when I reach for the toolbox.

I happen to have a Raspberry Pi lying around which should serve as the perfect hardware platform for this sort of low traffic, low data-volume application.
The Pi is running Raspbian Jessie.
Therefore, Python is the obvious choice of programming language to use. By extension, Django appears to be a rather appropriate framework.
In order to store the details of each movie we have, we’ll need a database. Django uses SQLite as the default.

We’ll also need an HTTP server. Whilst Django has its own built-in “development” server for playing around with, the favoured production http server appears to be Apache.

Now, getting Django and Apache to talk to each other seems to get a bit fiddly in places so what follows is a description of the steps I took to get this working…leaving out all the bits where I hammered my thumb…

Other places you may want to look

There are lots of good resources for Django out there.
The Django Project has a list of Django Tutorials.
One particularly good beginners tutorial, especially if you have little or no experience of programming, is the Django Girls Tutorial.

Making sure that Raspbian is up-to-date

Before we start installing the bits we need, it’s probably a good idea to make sure that the OS on the Pi is up-to-date.
Therefore, open a Terminal Window on the Pi and run the following two commands…

sudo apt-get update -y
sudo apt-get upgrade -y

This may take a while, depending on how up-to-date your system is.
Once these commands have completed, you’ll probably want to make sure you haven’t got any unwanted packages lying around. To achieve this, simply run :

sudo apt-get autoremove
Python Virtual Environments

Look, don’t panic. This isn’t the sort of Virtual Environment that requires hypervisors and Virtual Machines and all that other complicated gubbins. We’re running on a Pi, after all, we really haven’t got the system resources to expend on that sort of nonsense.
A Python virtual environment is simply a way of “insulating” your application’s Python dependencies from those of any other applications you have/are/will/may develop/run on the same physical machine.

Getting this up and running is fairly simple, but first, just as a sanity check, let’s make sure that we have Python 3 installed and available :

python3 --version

python3_version

Provided all is well, the next step is to install the appropriate Python 3 package for creating and running Virtual Environments so…

sudo pip3 install virtualenv

Next, we need to create a parent directory for our application. I’m going to create this under the home directory of the pi user that I’m connected as on the pi.
I’m going to call this directory “dvds” because I want to keep the name nice and short.
To create a directory under your home in Linux…

mkdir ~/dvds

You can confirm that the directory has been created in the expected location by running …

ls -ld ~/dvds

drwxr-xr-x 5 pi pi 4096 Feb 14 13:05 /home/pi/dvds

Now…

cd ~/dvds
virtualenv dvdsenv

…will create the python executables referenced in this environment :

virtualenv

Notice that this has created a directory structure under a new directory called dvdsenv :

dvdsenv

Now start the virtualenv and note what happens to the prompt :

source dvdsenv/bin/activate

virtual_prompt

One small but welcome advantage to running in your new environment is that you don’t have to remember the “3” whenever you want to run python. The easiest way to demonstrate this is to stop the virtual environment, get the python version, then re-start the virtual environment and check again, like this…

virtual_python

Installing Django

We want to do this in our newly created virtual environment.
So, if you’re not already in it, start it up :

cd ~/dvds
source dvdsenv/bin/activate

Now we use pip3 to get django. NOTE – as with the python command, we don’t need to remember the “3” for pip inside the virtual environment…

pip install django

django_install

Still in the Virtual environment, we can now create our new django project ( be sure to be in the dvds directory we created earlier) :

cd ~/dvds
django-admin.py startproject dvds .

Note the “.” at the end of this command. That means that the directory tree structure of the new application will be created in the current directory.

Once this has run, you should see a sub-directory called dvds :

django_dir

We now need to make some changes to some of the files that Django has created in this directory. To make these changes I’m going to use the default Raspbian graphical editor, Leafpad. If you’d prefer something like nano, then knock yourself out. Just replace “leafpad” with the executable name of your editor in the commands that follow…

leafpad ~/dvds/dvds/settings.py

We need to make a couple of changes to this file.
Firstly, in the INSTALLED_APPS section of the file (around about line 33) we want to add our application – dvds. After the change, this particular section of the file should look something like this :

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'dvds',
]

The other thing to do is to make sure that STATIC_ROOT has been defined. If this does not already exist in settings.py then add it at the end of the file :

STATIC_ROOT = os.path.join( BASE_DIR, "static/")

To get Django to accept these changes we need to migrate them. Note that we need to do this from inside the virtual environment so start it if it’s not already running…

cd ~/dvds
source dvdsenv/bin/activate
./manage.py makemigrations
./manage.py migrate

migrations

Before we finally get Django up and running, we need to setup the default admin UI.
To do this, we first need to create an admin user :

./manage.py createsuperuser

superuser

…then setup the static files used by the admin app…

./manage.py collectstatic

You have requested to collect static files at the destination
location as specified in your settings:

    /home/pi/dvds/static

This will overwrite existing files!
Are you sure you want to do this?

Type 'yes' to continue, or 'no' to cancel:

Type “yes” and you’ll get …

Copying '/home/pi/dvds/dvdsenv/lib/python3.4/site-packages/django/contrib/admin/static/admin/css/base.css'
Copying '/home/pi/dvds/dvdsenv/lib/python3.4/site-packages/django/contrib/admin/static/admin/css/widgets.css'
Copying '/home/pi/dvds/dvdsenv/lib/python3.4/site-packages/django/contrib/admin/static/admin/css/rtl.css'
...
Copying '/home/pi/dvds/dvdsenv/lib/python3.4/site-packages/django/contrib/admin/static/admin/js/admin/RelatedObjectLookups.js'

61 static files copied to '/home/pi/dvds/static'.

Now we can test that everything is working as expected by running Django’s own “development” http server :

./manage.py runserver

django_server

If we now point the Epiphany browser on the pi to that address, we should see the default Django page :

django_default

Better even than that, if you append “/admin” to the url – i.e.

http://127.0.0.1:8000/admin

You should see…

admin_login

Using the username and password you just created with the “createsuperuser” command, you should get access to :

admin_page_new

Installing Apache

This is fairly straightforward, to start with at least.
First of all, you don’t need to be in the Python Virtual Environment for this so, if you are then deactivate it :

deactivate

Once this command has completed, the prompt should now return to normal.

I’ll be sure to tell you when you need the Virtual Environment again.

To install Apache…

sudo apt-get install apache2 -y

Once that’s completed, you should be able to confirm that Apache is up and running simply by pointing your browser to :

http://localhost

…which should display the Apache Default Page :

apache_default_page

In addition to Apache itself, we need some further packages to persuade Apache to serve pages from our Django application :

sudo apt-get install apache2-dev -y
sudo apt-get install apache2-mpm-worker -y
sudo apt-get install libapache2-mod-wsgi-py3 

Got all that ? Right…

Configuring Apache to serve Django Pages using WSGI

First of all, we need to tell Apache about our Django application. To do this we need to edit the 000-default.conf which can be found in the Apache directories :

leafpad /etc/apache2/sites-available/000-default.conf

We need to add some entries to the <VirtualHost> section of the file. Once we’re done, the entire file should look something like this :

<VirtualHost *:80>
	# The ServerName directive sets the request scheme, hostname and port that
	# the server uses to identify itself. This is used when creating
	# redirection URLs. In the context of virtual hosts, the ServerName
	# specifies what hostname must appear in the request's Host: header to
	# match this virtual host. For the default virtual host (this file) this
	# value is not decisive as it is used as a last resort host regardless.
	# However, you must set it for any further virtual host explicitly.
	#ServerName www.example.com

	ServerAdmin webmaster@localhost
	DocumentRoot /var/www/html

	# Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
	# error, crit, alert, emerg.
	# It is also possible to configure the loglevel for particular
	# modules, e.g.
	#LogLevel info ssl:warn

	ErrorLog ${APACHE_LOG_DIR}/error.log
	CustomLog ${APACHE_LOG_DIR}/access.log combined

	# For most configuration files from conf-available/, which are
	# enabled or disabled at a global level, it is possible to
	# include a line for only one particular virtual host. For example the
	# following line enables the CGI configuration for this host only
	# after it has been globally disabled with "a2disconf".
	#Include conf-available/serve-cgi-bin.conf

 Alias /static /home/pi/dvds/static
    <Directory /home/pi/dvds/static> 
        Require all granted
    </Directory>

    <Directory /home/pi/dvds/dvds>
        <Files wsgi.py>
            Require all granted
        </Files>
    </Directory>

    WSGIDaemonProcess dvds python-path=/home/pi/dvds python-home=/home/pi/dvds/dvdsenv
    WSGIProcessGroup dvds
    WSGIScriptAlias / /home/pi/dvds/dvds/wsgi.py
</VirtualHost>

# vim: syntax=apache ts=4 sw=4 sts=4 sr noet

Next, we need to make sure that Apache has access to the bits of Django it needs. To do this, we’ll give write access to the group to which the Apache user belongs :

chmod g+w ~/dvds/db.sqlite3
chmod g+w ~/dvds
sudo chown :www-data ~/dvds/db.sqlite3
sudo chown :www-data ~/dvds

After all of that, the “Apache” group (www-data) should now be the group owner of the dvds directory as well as our SQLite database :

apache_privs

Finally, we need to re-start Apache for these changes to take effect :

sudo service apache2 restart

If we now go to the Apache url (http://localhost), we can see that it’s now showing the Django default page :

django_in_apache
If you see that then, congratulations, it works !

Accessing Django from another computer on the network

The server name of my Raspberry Pi is raspberrypi. If you want to check that this is the case for you, simply open a Terminal on the pi and run :

uname -n

In order to access the application from other computers on my local network, I’ll need to add this server name to the ALLOWED_HOSTS list in the settings.py file of the application.

To do this :

leafpad ~/dvds/dvds/settings.py

Amend the ALLOWED_HOSTS entry from this :

ALLOWED_HOSTS=[]

…to this…

ALLOWED_HOSTS=['raspberrypi']

And you should now be able to access the Django application from a remote machine by using the url :

http://raspberrypi

…like this…

remote_page_view

Hopefully, this has all helped you to get up and running without hitting your thumb.


Filed under: python Tagged: Apache, Django, Python virtual environment, Raspberry Pi

Breaking the Rules – why sometimes it’s OK to have a standalone PL/SQL Function

Mon, 2017-02-06 16:45

It was late. We were snuggled up on the sofa, watching a Romcom and debating whether to go to bed or see it through to the bitter( well, sickly sweet) end.

Wearily, I made the point that in the end the film would follow Heigl’s Iron Law of Romcom which can be summarised as “Katherine always gets her man”.

Deb begged to differ. Her argument was that, for every Colin Firth, riding into the sunset with his Bridget Jones, there’s a poor( largely blameless) Patrick Dempsey whose immediate future includes long-evenings alone in front of the telly and shopping for microwave meals for one.
The point is that even the most rigid rules tend to have their exceptions.

The star of this post is the oft-quoted rule that PL/SQL program units should always be incorporated into a Package.
There are special cameo appearances by “Never use Public Synonyms” and the ever popular “Never grant privileges to Public”.

Why Grouping Functions and Procedures in Packages is a Good Idea

“Always use a package. Never use a standalone procedure.”
This is a quote from Tom Kyte.
More precisely, it’s a partial quote. We’ll come back to that in a moment.

Mr Kyte goes on to expound the virtues of packages because they ( quoting once again)…

“- break the dependency chain (no cascading invalidations when you install a new package body — if you have procedures that call procedures — compiling one will invalidate your database)

– support encapsulation — I will be allowed to write MODULAR, easy to understand code — rather then MONOLITHIC, non-understandable procedures

– increase my namespace measurably. package names have to be unique in a schema, but I can have many procedures across packages with the same name without colliding

– support overloading

– support session variables when you need them

– promote overall good coding techniques, stuff that lets you write code that is modular, understandable, logically grouped together….

Well that all seems fairly comprehensive. So why are we even having the discussion ? Well, it comes back to the rest of the above quote, which tends to get missed when rules like this are invoked.

The full quote is actually :

“Always use a package. Never use a standalone procedure except for demos, tests and standalone utilities (that call nothing and are called by nothing).”

Having recently covered the fact that “unless you’re writing tests” should be appended to any rule relating to Oracle code, I’m going to focus on…

An Application-Independent Standalone function

It just so happens that I have one of these lying around.
The radix_to_decimal function takes a string representation of a number in a base between 2 and 36 and returns its decimal equivalent.
The function does not read from or write to any application tables :

create or replace function radix_to_decimal( i_number in varchar2, i_radix in pls_integer)
    return pls_integer
--
-- Function to return the decimal representation of i_number in i_radix.
-- This handles bases between 2 and 36 (i.e. any base where numeric values are represented by alphanumeric characters)
--
is
    ASCII_0 constant pls_integer := 48; -- the ascii value for '0'
    ASCII_9 constant pls_integer :=  57; -- the ascii value for '9'

    revnum varchar2(38);
    rtnval pls_integer := 0;
    digit varchar2(1);

    e_missing_param_value exception;
    e_invalid_radix exception;
    e_invalid_digit_for_base exception;
begin
    -- Parameter sanity checks
    if i_number is null or i_radix is null then
        raise e_missing_param_value;

    elsif i_radix not between 2 and 36 then
        raise e_invalid_radix;

    elsif i_radix = 10 then
        return i_number;

    -- Validate that i_number is actually a valid i_radix value.
    elsif (i_radix > 10 and instr( i_number, chr(55 + i_radix),1,1) > 0)
        or ( i_radix < 10 and instr( i_number, i_radix, 1, 1) > 0) 
    then
            raise e_invalid_digit_for_base;
    end if;

    -- Reverse the i_number string so we can loop through and sum the decimal numbers represented by each character
    -- without having to check the length of i_number.
    -- The REVERSE function is a SQL, rather than PL/SQL built-in, hence...

    select reverse(i_number) into revnum from sys.dual;

    for i in 1..length(revnum) loop
        digit := substr(revnum, i, 1);
        if ascii(digit) between ASCII_0 and ASCII_9 then
            rtnval := rtnval + ( digit * power(i_radix, i - 1));
        else
            -- letters in bases above 10 are always offset from 10 - e.g. A = 10, B = 11 etc.
            -- so, subtracting 55 from the ascii code of the upper case letter will give us the decimal value
            rtnval := rtnval + ( ( ascii( upper( digit)) - 55) * power( i_radix, i - 1) );
        end if;
    end loop;
    return rtnval;

exception

    when e_missing_param_value then
        raise_application_error( -20000, 'Both a number and a base must be specified');

    when e_invalid_radix then
        raise_application_error( -20001, 'This function only converts bases 2 - 36');

    when e_invalid_digit_for_base then
        raise_application_error( -20002, 'Number '||i_number||' is not a valid '||i_radix||' number.');

end radix_to_decimal;
/

Here’s a quick demo of the function in action….

select radix_to_decimal('101', 2) from dual;

RADIX_TO_DECIMAL('101',2)
-------------------------
                        5

select radix_to_decimal('401', 8) from dual;

RADIX_TO_DECIMAL('401',8)
-------------------------
                      257

select radix_to_decimal('7E0', 16) from dual;

RADIX_TO_DECIMAL('7E0',16)
--------------------------
                      2016

I’ve also uploaded the function to LiveSQL so feel free to have a play around with it.

Meanwhile, back in the database, to make this function generally available, grant execute to everyone…

grant execute on radix_to_decimal to public
/

What’s that ? I’ve violated the principle of least privilege ? Well, you may have a point. However, that principle has been weighed against the practicality of being able to re-use this code ( the principle of Don’t Repeat Yourself).
Whilst, under most circumstances, security wins out, there are (in Oracle at least) one or two exceptions as you can see by running…

select count(*)
from all_tab_privs
where privilege = 'EXECUTE'
and grantee = 'PUBLIC'
/

In order to make it easy to call this function, we don’t want to have to remember which schema we happened to put this in, so we’re going to create a public synonym…

create or replace public synonym radix_to_decimal for mike.radix_to_decimal
/

Once again, you may raise the very valid issue of namespace pollution caused by the use of Public Synonyms.
Once again, I’ve chosen pragmatism over principle in this specific instance.
Of course, if the next version of Oracle contains a function called radix_to_decimal you can come back and say “I told you so !”


Filed under: Oracle, PL/SQL Tagged: grant to public, LiveSQL, public synonyms, radix_to_decimal, Standalone Function

Automated Testing Frameworks and General Rule-Breaking in PL/SQL

Sat, 2017-01-07 08:58

If there’s one thing that 2016 has taught us, it’s that rules (and in some cases, rulers) are made for breaking. Oh, and that it’s worth putting a fiver on when you see odds of 5000-1 on Leicester winning the League.

Having lacked the foresight to benefit from that last lesson, I’ve spent several months looking at Unit Testing frameworks for PL/SQL. In the course of this odyssey I’ve covered:

  • SQLDeveloper Unit Testing
  • utPLSQL
  • Ruby-plsql-spec

This post is a summary of what I’ve learned from this exercise, starting with the fact that many of the rules we follow about good programming practice are wrong…

Writing Unit Tests means Breaking the Rules

OK, so maybe that should be “incomplete” rather than wrong.

As well as general “golden rules” that govern good programming practice, each language will have its own specific rules. These rules are usually along the lines of “Never do x” or “Always do Y”.
Leaving aside the problems inherent in using the words “Always” and “Never” in this context, I am now of the opinion that they should normally end with the words “…unless you’re writing a Unit Test”.

Reviewing the test code I’ve written over the last few months offers numerous examples of this.
Of course, it could just be down to the quality of the programmer but…

Always use Bind Variables

This is a standard in PL/SQL for very good reasons. Bind variables not only offer significant performance advantages, they serve to protect against the injection of malicious code. However, whilst I found myself rather uncomfortable about writing this code in Ruby-plsql-spec…

def get_tournament_rec( i_comp_code, i_year_completed, i_host_nation)
# Return the record for a tournament
# Alternative version concatenating arguments into a string...
    l_stmnt = "
        select id, comp_code, year_completed, host_nation, year_started, number_of_teams
        from footie.tournaments
        where comp_code = '#{i_comp_code}'
        and year_completed = #{i_year_completed}
        and host_nation "
         
    if i_host_nation.nil? then
        l_stmnt = l_stmnt + "is null"
    else
        l_stmnt += " = '#{i_host_nation}'"
    end
        plsql.select_first l_stmnt
end

…this equivalent example using SQLDeveloper Unit Testing seemed perfectly fine…

select null
from competitions
where comp_code = '{I_CODE}'

…even in utPLSQL, we find stuff like …

procedure ut_update_no_of_teams
is
begin
	-- Execute
	footie.manage_tournaments.edit_tournament(i_id => g_id, i_teams => 16, i_year_start => null);
	-- Validate
	utAssert.eqQueryValue
	(
		'Confirm Number of Teams Updated',
		'select count(*) from footie.tournaments where id = '||g_id||' and number_of_teams = 16 and year_started is null',
		1
	);
end ut_update_no_of_teams;

Seeing these constructs in Application code would start alarm bells ringing. So why then, are they apparently OK in Unit Tests ?

Well, first of all, performance is not necessarily as crucial an issue for Unit Tests as it might be for the Application itself. You may well be able to live with the odd additional hard-parse in your test suite if it means writing a bit less test code.

From a security perspective, whilst it’s still prudent to be wary of concatenating user input into executable statements, in all of the above instances, the variables in question do not contain user supplied values. They are either generated at runtime or are hard-coded.

Wait, that’s not right is it ? I mean you’re supposed to avoid hard-coding values right ?

Do not hard-code values

The fundamental purpose of a unit test could be summarised as :

With an application in a known state, a known input will result in a known output.

That’s an awful lot of stuff that you need to know in order to execute a Unit Test and to then verify the output.
With this in mind, it’s not surprising that this sort of thing becomes commonplace in Unit Test Code :

create or replace package body  utp_footie.ut_edit_tournament 
as
	g_code footie.competitions.comp_code%type := 'EURO';
	g_year_end footie.tournaments.year_completed%type := 2016;
	g_host footie.tournaments.host_nation%type := 'FRANCE';
	g_teams footie.tournaments.number_of_teams%type := '24';
	g_year_start footie.tournaments.year_started%type := 2013;
...

Perhaps even more controversial in PL/SQL circles is code that contravenes the rule that says…

A When Others Exception not followed by a RAISE is a bug

In testing terms, my evidence for having this commandment re-worded is…

    procedure ut_add_tourn_start_after_end
    is
        l_err_code number := 0;
        l_year_start footie.tournaments.year_started%type := 1918;
    begin
        footie_ut_helpers.ensure_comp_exists(g_code);
        begin
            footie.manage_tournaments.add_tournament
            (
                i_code => g_code,
                i_year_end => g_year_end,
                i_teams => g_teams,
                i_year_start => l_year_start
            );
        exception when others then
            l_err_code := sqlcode;
        end;
        utAssert.this('Cannot add a tournament that ends before it starts', l_err_code = -20000);
        rollback;
    end ut_add_tourn_start_after_end;

In this instance of course, we want to compare the error we actually get with the (hard-coded) error we were expecting.

One more illustration of exactly why a healthy disregard for rules is an asset when writing unit tests…

Any database interaction should be via a PL/SQL API

This is an approach that I’m particularly fond of, having had rather too much experience of applications where this architecture was not followed. However, if you’re writing a test for your PL/SQL API in a language that isn’t PL/SQL then something like this seems to be perfectly reasonable :

...
    expect( 
        plsql.footie.competitions.select( :all, "order by comp_code")
    ).to eq @expected
...

Now I’ve got that out of the way, it’s time to compare the frameworks in detail, starting with :

What these frameworks have in common

The first thing you may notice is that they are all similarly priced. That is to say that they are all free.
Both utPLSQL and Ruby-plsql-spec are Open Source.
SQLDeveloper, of which SQLDeveloper Unit Testing is an integral part, is also available at no cost.

As the first framework to be developed for PL/SQL, it is perhaps not surprising that utPLSQL has provided a template which the other frameworks have followed. This template itself originated from JUnit.

In simple terms, a Unit Test consists of up to four phases (sketched in code after this list) :

  • Setup – any steps necessary to ensure that the Application is in a known state
  • Execute – run the code that is being tested
  • Validate – check that the actual results were what was expected
  • Teardown – any steps necessary to return the Application to its original state prior to the test being run
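
Mapped onto utPLSQL (v2) conventions, those four phases end up looking something like this minimal sketch – the DEMO_TAB table and DEMO_PKG package are purely hypothetical stand-ins for whatever it is you happen to be testing :

-- DEMO_TAB and DEMO_PKG are hypothetical - substitute your own table and package
create or replace package ut_demo as
    procedure ut_setup;
    procedure ut_teardown;
    procedure ut_change_description;
end ut_demo;
/

create or replace package body ut_demo as

    procedure ut_setup is
    begin
        -- Setup - put the application into a known state
        insert into demo_tab( id, description) values( 1, 'KNOWN STATE');
        commit;
    end ut_setup;

    procedure ut_teardown is
    begin
        -- Teardown - put things back the way we found them
        delete from demo_tab where id = 1;
        commit;
    end ut_teardown;

    procedure ut_change_description is
    begin
        -- Execute - run the code being tested
        demo_pkg.change_description( i_id => 1, i_description => 'NEW VALUE');
        -- Validate - compare the actual result with the expected one
        utAssert.eqQueryValue
        (
            'Description has been updated',
            'select count(*) from demo_tab where id = 1 and description = ''NEW VALUE''',
            1
        );
    end ut_change_description;
end ut_demo;
/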

In terms of their capabilities, all of the Frameworks facilitate testing of scenarios that are commonly found in PL/SQL applications :

  • ability to test DML actions against a data model fully implementing Referential Integrity
  • ability to test DDL statements
  • ability to handle both OLTP and Data Warehouse style operations – including test steps that cross transaction boundaries
  • Ability to handle IN/OUT Ref Cursors

Additionally, they also share characteristics such as :

  • ability to run tests singly or as a larger suite
  • tests can be saved into discrete files and are therefore amenable to being stored in a Source Code Repository
  • possible (with varying degrees of configuration) to incorporate tests into a Continuous Integration tool

In short, any of the frameworks I’ve covered here will do the basics when it comes to Unit Testing your PL/SQL code. Their main distinguishing characteristics lie in their architecture…

SQLDeveloper Unit Testing

SQLDeveloper Unit Testing (SUT) is at the pointy-and-clicky end of the scale.
Typically for a declarative tool, there’s a bit of typing to start with but this reduces quickly once you start to add code to the Library.
Whilst SUT does require a database to house its repository objects, the fact that it’s built in to SQLDeveloper means that the repository objects (and the tests themselves) can be separated completely from the database that holds the code to be tested.
The tests can be saved into text files (xml) and therefore placed under source control like any other text file.
If you want to execute SUT tests in a Continuous Integration environment, that environment will need to have SQLDeveloper installed. The tests themselves, can be executed using the SQLDeveloper CLI.

SUT is likely to be appealing as the PL/SQL testing framework of choice if :

  • SQLDeveloper is your standard Oracle IDE
  • You want to get up and running quickly with your TDD effort
  • You want to maintain a separation between test and application code in the RDBMS
utPLSQL

Unsurprisingly, utPLSQL is at the opposite end of the spectrum.
As with SUT, the repository lives in the database. However, the actual tests themselves are PL/SQL packages and are therefore likely to exist right alongside the application code being tested.
Of course, it’s perfectly possible to maintain some distinction and ease administration by ensuring that the tests are located in a separate schema.
Issues of Source Control and CI don’t really arise as the infrastructure required for utPLSQL is the same as that for the PL/SQL application code that you’re testing.

utPLSQL may come up for consideration because :

  • it is (probably) the most widely used PL/SQL testing framework
  • it has no dependency on any software that you don’t already have up and running
  • it provides a greater degree of control than SUT as all tests are pretty much hand-written
  • SQLDeveloper is not your default IDE
Ruby-plsql-spec

Ruby has long been a favourite language when writing unit tests. The RSpec framework, upon which ruby-plsql-spec is based, is highly regarded by many who make their living writing code outside of the database.
Unlike the other two frameworks, ruby-plsql-spec does not need to create any objects in any Oracle database.
All it needs is a means of connecting to the database holding the application code you want to test.
In terms of pre-requisites for executing tests in a CI environment, all you need is Ruby itself (which comes as standard in most Linux distros), and the Ruby Gems that we installed for the client.

Ruby-plsql-spec is a contender when :

  • You’re already testing other technologies in your application using Ruby
  • You want a complete separation between test and application code
  • SQLDeveloper is not your default IDE
  • The overhead of writing more code is offset by either/both of the above
  • You’re happy to sacrifice some of the flexibility offered by the fact that the other frameworks are native to Oracle ( e.g. Merge statements)

It’s clear that each of these frameworks has its own particular strengths and weaknesses, but which one would I choose ?

If the Fence is Strong Enough, I’ll Sit on It

If you ask me which of these frameworks I’ll use going forward, the answer is all of them.

For my Footie application, which served as a Guinea Pig for the framework evaluation, I’m going to stick with SUT.
I use SQLDeveloper, and using the declarative tools means that I’ll spend much less time coding tests and more time doing the fun stuff.

As far as utPLSQL is concerned, not only is it highly likely that I’ll find myself working on a project where this is used as the standard, there is a major overhaul under way to bring utPLSQL up to date.
utPLSQL Version 3.0 may well be a game-changer.

As for Ruby-plsql-spec, that’s the framework that we use for the crudo application. I’m in the process of adding more tests using this framework ( or will be when I finish writing this).

Conclusion

Whatever approach to Unit Testing and/or Test Driven Development that appeals, there is a framework freely available for PL/SQL.
Whilst introducing tests to an existing code base can be challenging, adopting one of these at the start of a new project could well lead to a saving in time and effort down the line.


Filed under: Oracle, PL/SQL, SQL, SQLDeveloper Tagged: bind variables, ruby-plsql-spec, SQLDeveloper Unit Testing, utPLSQL, when others exception

Post-Truth PL/SQL

Fri, 2016-12-09 15:19

We’re living in a Post-truth age. I know this because I read it in my Fake News Feed.
Taking advantage of this, I’ve updated the definition of PL/SQL.
Up until now, it would be true to say that PL/SQL is a 3GL based on Ada that’s incorporated into the Oracle RDBMS.
Post truth, the definition is that PL/SQL is a 3GL that comes with its own built-in Oracle RDBMS.

By a stroke of good fortune, my son recently bought me a copy of Ghost in the Wires by Kevin Mitnick and William L Simon, which begins each chapter with an encrypted phrase.
If you’re anything like me, you’d spend a fair amount of time geeking over this sort of problem, most likely using some fashionable programming language to help solve the riddles with which you were presented.

In my house at least, PL/SQL is back in fashion…

Some true (and post-true) statements

The code presented here has been written and tested on an Oracle 11g Express Edition database.
The purpose of this exercise is to demonstrate the power and utility of PL/SQL as a language in its own right. Yes, I make use of the database. To ignore it would be a bit like trying to write C without using pointers.

The point is that PL/SQL is not simply another stored procedure language, SQL tricked out to make it Turing Complete. It’s a proper 3GL in its own right.

The PL/SQL “standards” I’ve adopted here are intended for aesthetic reasons as well as those of readability. In other words, not using “l_” as a prefix for a local variable makes the code look less spikey.

I’ve followed the C convention with constants, which I’ve defined and used in uppercase.
I still can’t bring myself to use Camel Case in PL/SQL for reasons which I go into here.
If I look back at this code in six months’ time and try to figure out just what I was doing, I’ll know whether the “aesthetic” standards were a good idea.

The cryptograms contained in this post are taken from the Hardback edition of the book. The ones in the paperback are different.

Before we go any further, I would like it to be known that I managed to solve all of the encrypted phrases without any help at all. I did this by googling “Ghost in the Wires encrypted phrases” and then reading the comprehensive solution offered by Fabien Sanglard.
In the old days, this may have been considered akin to “looking in the back of the book for the answers”. As we’ve already established, times have changed…

The Caesar Cipher

Let’s take a look at the first encrypted phrase in the book :

yjcv ku vjg pcog qh vjg uauvgo wugf da jco qrgtcvqtu vq ocmg htgg rjqpg ecnnu ?

Pretending that I haven’t already “looked in the back of the book” for the answers, we can deduce a couple of things from the way this phrase is formatted.

The fact that the letter groupings separated by spaces are not of a constant length would seem to suggest that these are possibly words.
The question mark at the end would seem to re-enforce this notion and tell us that the encrypted phrase is a question.
This being the case, the repeating three letter pattern “vjg” might represent the word “THE”.
If this were the case, then it would mean that all of the letters in the phrase had been shifted forwards by 2 letters in the alphabet. Thus, V = T, J = H and G = E.
This method of encoding is known as a Caesar Cipher.

If we’re going to “have a stab” at cracking a Caesar Cipher then brutus.sql would seem to be an appropriate name for the following program…

set serveroutput on size unlimited
declare
    UPPER_A constant pls_integer := 65; -- Ascii code for 'A'
    UPPER_Z constant pls_integer := 90; -- Ascii code for 'Z'
    LOWER_A constant pls_integer := 97; -- Ascii code for 'a'

    phrase varchar2(4000); 
    offset pls_integer;
    this_char varchar2(1);
    decrypted_char varchar2(1);
    decrypted_phrase varchar2(32767);
    ascii_a pls_integer;
begin
    
    phrase := 'yjcv ku vjg pcog qh vjg uauvgo wugf da jco qrgtcvqtu vq ocmg htgg rjqpg ecnnu ?';
    offset := 2;

    for i in 1..length( phrase) loop
        this_char := substr( phrase, i, 1);
        if ascii( upper( this_char)) not between UPPER_A and UPPER_Z then
            -- not a letter, just use the character unchanged
            decrypted_char :=  this_char;
        else
            -- Make sure that the character stays within the bounds of
            -- alpha ascii codes after the offset is applied
            ascii_a := case when ascii(this_char) between UPPER_A and UPPER_Z then UPPER_A else LOWER_A end;

            -- now apply the offset...
            decrypted_char := chr( ascii( this_char) - offset);

            if ascii(decrypted_char) < ascii_a then
                -- character needs to "wrap around" to the other end of the alphabet
                decrypted_char := chr( ascii(decrypted_char) + 26);
            end if;
        end if;
        decrypted_phrase := decrypted_phrase||decrypted_char;
    end loop;
    dbms_output.put_line( decrypted_phrase);
end;
/

Using the fact that uppercase letters have ASCII codes between 65 and 90, we can apply the offset easily enough by subtracting it from the ASCII code of each letter and then converting the result back into a character.
The ASCII function returns the ASCII code for a given character. The CHR function converts an ASCII value to its corresponding character.
In pre-truth terms, these are both SQL, rather than PL/SQL functions. However, SQL is merely a component in the Oracle RDBMS and therefore a subset of the all-encompassing Post Truth PL/SQL.
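
As a quick illustration of the two functions :

select ascii('A'), chr(122) from dual;

…which returns 65 and 'z' respectively.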

Leaving the semantics aside, when we run this we get :

@brutus.sql
what is the name of the system used by ham operators to make free phone calls ?

PL/SQL procedure successfully completed.

… so our hypothesis is correct.

One down, 37 to go. Whilst the script we’ve got at the moment is fine for cracking a single code, we could probably do with something a bit more parameterized…

Brute Forcing with Brutus

As you may have observed already, the Caesar Cipher is rather susceptible to brute forcing as the offset used to encode a phrase can only be up to 25.
OK, you can offset to 26 (the number of letters in the English alphabet) but then your encoded string will be the same as the unencoded one which rather defeats the whole object of the exercise.

To start with then, we can make things a bit easier for ourselves by persisting the phrases we want to crack. We’ll also want to record the plain text for each of them once we’ve managed to figure it out. Hang on, we’ve got a database lying around somewhere…

create table giw_codes
(
    chapter_no number(2) not null,
    cryptogram varchar2(4000) not null,
    encryption_method varchar2(50),
    message varchar2(4000),
    answer varchar2(4000),

    constraint giwc_pk primary key (chapter_no)
)
/

comment on table giw_codes is 'Cryptograms from the book Ghost in the Wires by Kevin Mitnick with William L. Simon. Table Alias : GIWC'
/

comment on column giw_codes.chapter_no is 'The chapter number in which the cryptogram appears. Part of Primary Key'
/

comment on column giw_codes.cryptogram is 'The encrypted phrase'
/

comment on column giw_codes.encryption_method is 'Method of encryption used'
/

comment on column giw_codes.message is 'The deciphered message'
/

comment on column giw_codes.answer is 'The answer to the question in the deciphered message.'
/

Now to populate it. Post-Truth PL/SQL still allows you to be a bit lazy with your inserts

declare

    -- because I can't be bothered to type the insert statement 38 times...

    procedure ins( i_chapter giw_codes.chapter_no%type, i_cryptogram giw_codes.cryptogram%type)
    is
    begin
        insert into giw_codes( chapter_no, cryptogram, encryption_method, message, answer)
        values( i_chapter, i_cryptogram, null, null, null);
    end ins;
    
begin
    -- ... I'll just call the procedure...
    
    ins(1, 'yjcv ku vjg pcog qh vjg uauvgo wugf da jco qrgtcvqtu vq ocmg htgg rjqpg ecnnu ?');
    ins(2, 'wbth lal voe htat oy voe wxbirtn vfzbqt wagye C poh aeovsn vojgav ?');
    ins(3, 'Nyrk grjjnfiu uzu Z xzmv kf jvklg re rttflek fe Kyv Rib ?');
    ins(4, q'[Flle ujw esc wexp mo xsp kjr hsm hiwwcm, 'Wplpll stq lec qma e wzerg mzkk!' ?]');
    ins(5, 'Bmfy ytbs ini N mnij tzy ns zsynq ymj Ozajsnqj Htzwy qtxy ozwnxinhynts tajw rj ?');
    ins(6, 'Kyoo olxi rzr Niyovo Cohjpcx ojy dn T apopsy ?');
    ins(7, 'Kvoh wg hvs boas ct hvs Doqwtwq Pszz sadzcmss kvc fsor hvs wbhsfboz asac opcih am voqywbu oqhwjwhwsq cjsf hvs voa forwc ?');
    ins(8, 'Iwh xwqv wpvpj fwr Vfvyj qks wf nzc ncgsoo esg psd gwc ntoqujvr ejs rypz nzfs ?');
    ins(9, 'Hsle td esp epcx qzc dzqehlcp mfcypo zy esp nsta esle Yzglepw dpye xp ?');
    ins(10, 'Bprf cup esanqneu xmm gtknv amme U biiwy krxheu Iwqt Taied ?');

    ins(11, 'Lwpi idlc sxs bn upiwtg axkt xc lwtc X bdkts xc lxiw wxb ?');
    ins(12, q'[Yhlt xak tzg iytfrfad RanBfld squtpm uhst uquwd ce mswf tz wjrwtsr a wioe lhsv Ecid mwnlkoyee bmt oquwdo't ledn mp acomt ?]');
    ins(13, 'Zkdw lv wkh qdph ri wkh SL ilup wkdw zdv zluhwdsshg eb Sdflilf Ehoo ?');
    ins(14, 'Plpki ytw eai rtc aaspx M llogw qj wef ms rh xq ?');
    ins(15, 'Ituot oaybmzk ymwqe ftq pqhuoq ftmf Xqiue geqp fa buow gb mzk dmpua eusmxe zqmd Qduo ?');
    ins(16, q'[Kwth qzrva rbq lcq rxw Svtg vxcz zm vzs lbfieerl nsem rmh dg ac oef'l cwamu ?]');
    ins(17, 'Epib qa bpm vium wn bpm ixizbumvb kwuxtmf epmzm Q bziksml lwev Mzqk Pmqvh ?');
    ins(18, 'Khkp wg wve kyfcqmm yb hvh TBS oeidr trwh Yhb MmCiwus Wko ogvwgxar hr ?');
    ins(19, q'[Rcvo dn ivhz ja ocz omvinvxodji oj adiy v kzmnji'n njxdvg nzxpmdot iphwzm pndib oczdm ivhz viy yvoz ja wdmoc ?]');

    ins(20, q'[Wspa wdw gae ypte rj gae dilan lbnsp loeui V tndllrhh gae awvnh 'HZO, hzl jaq M uxla nvu']');
    ins(21, '4A 75 6E 67 20 6A 6E 66 20 62 68 65 20 61 76 70 78 61 6E 7A 72 20 74 76 69 72 61 20 67 62 20 47 72 65 65 6C 20 55 6E 65 71 6C 3F ');
    ins(22, 'Gsig cof dsm fkqeoe vnss jo farj tbb epr Csyvd Nnxub mzlr ut grp lne ?');
    ins(23, 'Fqjc nunlcaxwrl mnerln mrm cqn OKR rwcnwcrxwjuuh kanjt fqnw cqnh bnjalqnm vh jyjacvnwc rw Ljujkjbjb ?');
    ins(24, 'Xvof jg qis bmns lg hvq thlss ktffb J cifsok EAJ uojbthwsbhlsg ?');
    ins(25, 'Cngz zuct ngy znk grsg sgzkx lux znk xkgr Kxoi Ckoyy ?');
    ins(26, 'Aslx jst nyk rlxi bx ns wgzzcmgw UP jnsh hlrjf nyk TT seq s cojorpdw pssx gxmyeie ao bzy glc ?');
    ins(27, '85 102 121 114 32 103 113 32 114 102 99 32 108 121 107 99 32 109 100 32 114 102 99 32 122 109 105 113 '
        ||'114 109 112 99 32 71 32 100 112 99 111 115 99 108 114 99 98 32 103 108 32 66 99 108 116 99 112 63');
    ins(28, 'Phtm zvvvkci sw mhx Fmtvr VOX Ycmrt Emki vqimgv vowx hzh L cgf Ecbst ysi ?');
    ins(29, '126 147 172 163 040 166 172 162 040 154 170 040 157 172 162 162 166 156 161 143 040 145 156 161 '
        ||'040 163 147 144 040 115 156 165 144 153 153 040 163 144 161 154 150 155 172 153 040 162 144 161 165 '
        ||'144 161 040 150 155 040 122 172 155 040 111 156 162 144 077');

    ins(30, q'[Ouop lqeg gs zkds ulv V deds zq lus DS urqstsn't wwiaps ?]');
    ins(31, 'Alex B25 rixasvo hmh M ywi xs xli HQZ qemrjveqi ?');
    ins(32, q'[Caem alw Ymek Xptq'd tnwlchvw xz lrv lkkzxv ?]');
    ins(33, 'Ozg ojglw lzw hshwj gf AH Khggxafy lzsl BKR skcww ew stgml ?');
    ins(34, q'[Nvbx nte hyv bqgs pj gaabv jmjmwdi whd hyv UVT'g Giuxdoc Gctcwd Hvyqbuvz hycoij ?]');
    ins(35, '2B 2T W 2X 2Z 36 36 2P 36 2V 3C W 3A 32 39 38 2Z W 3D 33 31 38 2V 36 3D W '
        ||'2R 2Z 3C 2Z W 3E 3C 2V 2X 2Z 2Y W 3E 39 W 2R 32 2V 3E W 2V 3A 2V 3C 3E 37 2Z 38 3E '
        ||'W 2X 39 37 3A 36 2Z 2S 1R');
    ins(36, 'Lsar JSA cryoi ergiu lq wipz tnrs dq dccfunaqi zf oj uqpctkiel dpzpgp I jstcgo cu dy hgq ?');
    ins(37, 'V2hhdCBGQkkgYWdlbnQgYXNrZWQgU3VuIE1pY3Jvc3lzdGVtcyB0byBjbGFpbSB0aGV5IGxvc3QgODAgbWlsbGlvbiBkb2xsYXJzPw==');
    ins(38, '100-1111-10-0 011-000-1-111 00-0100 1101-10-1110-000-101-11-0-1 '
        ||'0111-110-00-1001-1-101 111-0-11-0101-010-1-101 111-10-0100 110011');


    commit;
        
end;
/

Now, whilst we could modify our Brutus script to simply select the encrypted phrases from the table and just loop through all possible offsets for each of them, that would result in the output of 950 phrases, almost all of which would be gibberish. It would be good then, if we could persuade the program to be a bit more discerning about its output.

According to Wikipedia, the 25 most common words make up about one third of all printed material in English. So, by searching for these, we can potentially filter out most of the junk from our result set.
Another point to consider is that many of the messages we’re trying to decrypt appear to be questions, so throwing in some common ‘question’ words ( e.g. how, why, who, what etc) may be helpful.
I’m going to exclude ‘I’ and ‘A’ from this list as matching on them is likely to generate a lot of false positives.
In the end, what I’m left with then (using the boringly conventional method of inserting rows to a table) is…

-- Creating this as an Index Organized Table ensures that no words are duplicated
-- whilst avoiding the overhead of copying the entire contents of the table to an index.

create table common_words( word varchar2(30), constraint cw_pk primary key(word))
    organization index
/

insert into common_words( word) values('THE');
insert into common_words( word) values('BE');
insert into common_words( word) values('TO');
insert into common_words( word) values('OF');
insert into common_words( word) values('AND');
insert into common_words( word) values('IN');
insert into common_words( word) values('THAT');
insert into common_words( word) values('HAVE');
insert into common_words( word) values('IT');
insert into common_words( word) values('FOR');
insert into common_words( word) values('NOT');
insert into common_words( word) values('ON');
insert into common_words( word) values('WITH');
insert into common_words( word) values('HE');
insert into common_words( word) values('AS');
insert into common_words( word) values('YOU');
insert into common_words( word) values('DO');
insert into common_words( word) values('AT');
insert into common_words( word) values('THIS');
insert into common_words( word) values('BUT');
insert into common_words( word) values('HIS');
insert into common_words( word) values('BY');
insert into common_words( word) values('FROM');

-- Throw in some lexemes (whatever they are)...

insert into common_words( word) values('IS');
insert into common_words( word) values('WERE');
insert into common_words( word) values('WAS');
insert into common_words( word) values('SHE');
insert into common_words( word) values('HERS');
insert into common_words( word) values('THEIRS');

-- And some 'question' words...

insert into common_words( word) values('WHO');
insert into common_words( word) values('WHAT');
insert into common_words( word) values('HOW');
insert into common_words( word) values('WHERE');
insert into common_words( word) values('WHEN');
insert into common_words( word) values('WHY');

-- Add past tense of the verb to do...
insert into common_words(word) values('DID');

-- and a conditional...possibly not as grammatically correct as it should be but hey...
insert into common_words(word) values('IF');
insert into common_words(word) values('ELSE');
insert into common_words(word) values('DOES');
insert into common_words(word) values('WHILE');

-- and whatever other random stuff seems reasonable...
insert into common_words(word) values('WE');
insert into common_words(word) values('US');
insert into common_words(word) values('THEM');
insert into common_words(word) values('THEIR');
insert into common_words(word) values('OUR');

commit;

Now we’ve persisted our encrypted phrases and some of the most common English words, we can save ourselves a fair bit of typing by incorporating the required code into a package.

First we need to create a package header to define the signature of the public package members (functions and procedures that are available to callers from outside of the package) :

create or replace package decrypt as
    --
    -- utilities for decrypting the cryptograms in the GIW_CODES table.
    --

    -- These constants were originally in the brutus (now caesar) function.
    -- However, I have a funny feeling that they may be needed elsewhere...

    UPPER_A constant pls_integer := 65; -- Ascii code for 'A'
    UPPER_Z constant pls_integer := 90; -- Ascii code for 'Z'
    LOWER_A constant pls_integer := 97; -- Ascii code for 'a'

    -- If i_string contains at least i_num_matches common words then return true
    function has_common_words( i_string in varchar2, i_num_matches in pls_integer default 2)
        return boolean;

    -- update GIW_CODES with any decrypted messages.
    procedure save_decrypted
    (
        i_chapter_no giw_codes.chapter_no%type,
        i_encryption_method giw_codes.encryption_method%type,
        i_message giw_codes.message%type
    );

    -- Call this function to decrypt a single cryptogram generated from a single cipher with a known offset
    function caesar( i_cryptogram in giw_codes.cryptogram%type, i_offset in pls_integer)
        return giw_codes.message%type;

    -- For the impatient, this will attempt to decrypt all cryptograms in the GIW_CODES table...
    procedure brute_force_caesar;
end decrypt;
/

…now for the code itself, which is in the package body…

create or replace package body decrypt as

    function has_common_words( i_string in varchar2, i_num_matches in pls_integer default 2)
        return boolean
    is
    --
    -- check i_string for whole words in COMMON_WORDS.
    -- if we get i_num_matches then return true
    --
    
        SPACE constant varchar2(1) := chr(32);
        matches pls_integer := 0;
        
    begin
        for r_words in (select SPACE||word||SPACE as word from common_words) loop
            if instr( upper( i_string), r_words.word, 1, 1) > 0 then
                matches := matches + 1;
                if matches = i_num_matches then
                    return true;
                end if;
            end if;
        end loop;
        return false;
    end has_common_words;

    procedure save_decrypted
    (
        i_chapter_no giw_codes.chapter_no%type,
        i_encryption_method giw_codes.encryption_method%type,
        i_message giw_codes.message%type
    )
    is
    --
    -- Update the GIW_CODES record identified by i_chapter_no
    -- with i_encryption_method and i_message
    --
    begin
        update giw_codes
        set encryption_method = i_encryption_method,
            message = i_message
        where chapter_no = i_chapter_no;
    end save_decrypted;

    function caesar( i_cryptogram in giw_codes.cryptogram%type, i_offset in pls_integer)
        return giw_codes.message%type
    is
        --
        -- Translate i_cryptogram using an offset of i_offset characters.
        -- This is essentially brutus.sql made a bit more respectable now it's part of a package...
        --
        phrase varchar2(4000); 
        offset pls_integer;
        this_char varchar2(1);
        decrypted_char varchar2(1);
        decrypted_phrase varchar2(32767);
        ascii_a pls_integer;
    begin
        -- Parameter sanity check...
        if i_cryptogram is null or i_offset is null then
            raise_application_error(-20000, 'Both the cryptogram and the number of characters to offset must be supplied');
        end if;

        for i in 1..length( i_cryptogram) loop
            this_char := substr( i_cryptogram, i, 1);
            if ascii( upper( this_char)) not between UPPER_A and UPPER_Z then
                -- not a letter, just use the character unchanged
                decrypted_char :=  this_char;
            else
                -- Make sure that the character stays within the bounds of
                -- alpha ascii codes after the offset is applied
                ascii_a := case when ascii(this_char) between UPPER_A and UPPER_Z then UPPER_A else LOWER_A end;

                -- now apply the offset...
                decrypted_char := chr( ascii( this_char) - i_offset);

                if ascii(decrypted_char) < ascii_a then
                    -- this_char needs to "wrap around" to the other end of the alphabet
                    decrypted_char := chr( ascii(decrypted_char) + 26);
                end if;
            end if;
            decrypted_phrase := decrypted_phrase||decrypted_char;
        end loop;
        return decrypted_phrase;
    end caesar;
    
    procedure brute_force_caesar is
    --
    -- Cycle through all of the undeciphered cryptograms in GIW_CODES.
    -- Check the "decrypted" string for common words and if it passes this test, update the record in GIW_CODES
    --
    
        candidate_string giw_codes.message%type;
        
    begin
        for r_phrase in
        (
            select chapter_no, cryptogram
            from giw_codes
            where encryption_method is null
            order by 1,2
        )
        loop
            for i in 1..25 loop
                -- Loop through each possible Caesar cipher, stop if we get a match
                candidate_string := caesar( r_phrase.cryptogram, i);
                if has_common_words( candidate_string) then
                    save_decrypted( r_phrase.chapter_no, 'CAESAR', candidate_string);
                    exit;
                end if;
            end loop;
        end loop;
    end brute_force_caesar;
end decrypt;
/

After all of that, let’s see how many cryptograms we can decipher…

begin
    decrypt.brute_force_caesar;
end;
/

PL/SQL procedure successfully completed.

commit;

…and through the medium of sqlcl, we can see…

crack_caesar

So, it looks like 14 of our cryptograms are using a Caesar Cipher. Now for the other 22…

Chipping away with Vigenere

It’s rather appropriate in an era of post-truth that we now come to a Cipher that’s not named after the person who invented it.
The Vigenere Cipher works in the same way as the Caesar Cipher but adds a key phrase into the mix. The key dictates the offset for each individual letter in the message.
It’s possible to look up the letter code for each letter of the key using a grid known as a Vigenere Square or Vigenere Table.
Hmmm, table, that gives me an idea…

create table vigenere_squares
(
    cipher_key varchar2(1),
    plain_value varchar2(1),
    cipher_value varchar2(1)
)
/

comment on table vigenere_squares is
    'Table to translate the Vigenere Cipher. For a given key character (cipher_key), the cipher_value translates to the plain_value'
/

declare
    cipher varchar2(1);
    key varchar2(1);
    plain varchar2(1);
begin
    for i in 65..90 loop
        key := chr(i);
        for j in 0..25 loop
            plain := chr( j + 65);
            if (ascii(key) + j) > 90 then
                cipher := chr(ascii( key) + j - 26);
            else
                cipher := chr( ascii( key) + j);
            end if;
            insert into vigenere_squares( cipher_key, plain_value, cipher_value)
            values( key, plain, cipher);
        end loop;
    end loop;
end;
/

So, if our key phrase contains, say, F (for Fabien), we can look up the translation for any letter…

vigenere_f
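
Behind that lookup there’s nothing more exotic than a query along these lines (the cipher letter 'X' is picked purely as an example) :

select plain_value
from vigenere_squares
where cipher_key = 'F'
and cipher_value = 'X';

…which, with the table populated as above, comes back with S.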

Alexandre Dumas is reputed to have remarked that English is French, badly spoken. As a true Englishman, with the requisite appalling French accent, when someone mentions Vigenere, I automatically think of vinegar…malt vinegar – the condiment that defines the Great British Chip and distinguishes it from those continental frites which are usually subjected to mayonnaise.
Let’s see if we can ensure that the cipher used in chapter 2 has had its chips (or French Fries if you’re American).
Once again, let’s pretend I haven’t just copied Fabien but have instead been struck by an inspiration and have tried to use the answer to the question in chapter one as the key for the cipher in chapter 2…

set serveroutput on size unlimited
declare
    phrase varchar2(4000);
    key varchar2(100);
    ptr pls_integer := 1;
    enc_char varchar2(1);
    plain_char varchar2(1);
    message varchar2(4000);
    
begin
    phrase := 'wbth lal voe htat oy voe wxbirtn vfzbqt wagye C poh aeovsn vojgav ?';
    key := 'AUTOPATCH';

    for i in 1..length(phrase) loop
        enc_char := substr( phrase, i, 1);
        if ascii( upper( enc_char)) not between 65 and 90 then
            -- not an alpha character...
            plain_char := enc_char;
        else
            -- lookup the plain value, preserving case
            select case when ascii( enc_char) between 65 and 90 then plain_value else lower(plain_value) end
            into plain_char
            from vigenere_squares
            where cipher_value = upper( enc_char)
            and cipher_key = upper( substr( key, ptr, 1));

            -- Move to the next character in the key phrase 
            ptr := ptr + 1;
            if ptr > length( key) then
                -- we've reached the end of the key, loop around to the start
                ptr := 1;
            end if;
        end if;

        message := message||plain_char;
    end loop;
    dbms_output.put_line(message);
end;
/

…sure enough…

what was the name of the central office where I was almost caught ?

The next step then, would be to fill in the answers to all of the questions we’ve already decoded in case the answer for the Caesar Ciphered question in one chapter always provides the key for the Vigenere Ciphered question in the next…*sound of pages being turned*

update giw_codes set answer = 'AUTOPATCH' where chapter_no = 1;
update giw_codes set answer = 'JELLY' where chapter_no = 3;
update giw_codes set answer = 'OROVILLE' where chapter_no = 5;
update giw_codes set answer = 'BILLCOOK' where chapter_no = 7;
update giw_codes set answer = 'FIRMWARE' where chapter_no = 9;
update giw_codes set answer = 'CALABASAS' where chapter_no = 11;
update giw_codes set answer = 'TELTEC' where chapter_no = 13;
update giw_codes set answer = 'OPTOELECTRONICS' where chapter_no = 15;
update giw_codes set answer = 'OAKWOOD' where chapter_no = 17;
update giw_codes set answer = 'ALPHADENT' where chapter_no = 19;
update giw_codes set answer = 'BOOMBOX' where chapter_no = 23;
update giw_codes set answer = 'ELLENSBURG' where chapter_no = 25;
update giw_codes set answer = 'GTETELENET' where chapter_no = 31;
update giw_codes set answer = 'ROBERTMORRIS' where chapter_no = 33;

commit;

We can add Vigenere decryption functionality to the package by means of a couple of new subprograms. In the package header, we add :

-- Decrypt i_cryptogram using vigenere cipher with i_key as key
function vigenere( i_cryptogram in giw_codes.cryptogram%type, i_key in varchar2)
    return giw_codes.message%type;

-- If we have enough data, check to see if any un-cracked cryptograms could be Vigenere Ciphered
procedure chip_away;

The first is to apply the Vigenere decryption itself…

...
    function vigenere( i_cryptogram in giw_codes.cryptogram%type, i_key in varchar2)
        return giw_codes.message%type
    is 
    --
    -- Decrypt i_cryptogram using the letters in phrase i_key to determine the offset for each character.
    --
    ptr pls_integer := 1;
    enc_char varchar2(1);
    plain_char varchar2(1);
    message giw_codes.message%type;
    begin
        -- parameter sanity check...
        if i_cryptogram is null or i_key is null then
            -- Same check as for Caesar above, but with a different error number so we can tell them apart
            raise_application_error(-20001, 'Both the cryptogram and the key must be supplied');
        end if;
        for i in 1..length(i_cryptogram) loop
            enc_char := substr( i_cryptogram, i, 1);
            if ascii( upper( enc_char)) not between UPPER_A and UPPER_Z then
                -- not an alpha character...
                plain_char := enc_char;
            else

                select case when ascii( enc_char) between UPPER_A and UPPER_Z then plain_value else lower(plain_value) end
                into plain_char
                from vigenere_squares
                where cipher_value = upper( enc_char)
                and cipher_key = upper( substr( i_key, ptr, 1));

                -- Move the pointer to the next character in the key
                ptr := ptr + 1;
                if ptr > length( i_key) then
                    ptr := 1;
                end if;
            end if;

            message := message||plain_char;
        end loop;
        return message;
    end vigenere;
...

… and the second is to loop through the GIW_CODES table and see what records we can apply this to …

...
    procedure chip_away is
    --
    -- Attempt to decrypt any cryptograms that have not yet been cracked where we
    -- have an answer to the previous chapter's message to use as a key.
    --
        candidate_string giw_codes.message%type;
        key_string giw_codes.answer%type;
    begin
        for r_phrase in
        (
            select chapter_no, cryptogram, encryption_method, answer
            from giw_codes
            order by 1,2
        )
        loop
            -- Go through each record in the table...
            if r_phrase.encryption_method is not null
                and r_phrase.answer is not null
            then
                -- although this cryptogram has already been solved, the answer may serve as the
                -- key for the next record if it has been encrypted with a Vigenere Cipher...
                key_string := r_phrase.answer;
                continue;
             elsif r_phrase.encryption_method is null
                and key_string is not null
             then
                candidate_string := vigenere( r_phrase.cryptogram, key_string);
                if has_common_words( candidate_string) then
                    save_decrypted(r_phrase.chapter_no, 'VIGENERE', candidate_string);
                end if;
            end if;
        end loop;
    end chip_away;
...

…we can use it to plug some more of the gaps we have…
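
Running the new procedure ( the call itself isn't shown in the screenshot below, but it follows the same pattern as brute_force_caesar earlier) is simply a matter of :

begin
    decrypt.chip_away;
end;
/

…followed by a commit once we're happy with the results…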

In the best British tradition, if it’s not working, blame the French

Whilst some of the remaining uncracked cryptograms look suspiciously like they could be encrypted with Vigenere, others look rather different.
Our next challenge is rather more to do with numbers than letters (at least initially)…

Hello Hexy

Taking a look at the cryptogram for chapter 21, it does appear to consist of a series of hexadecimal numbers. What was that ? Fabien who ?…

select cryptogram
from giw_codes
where chapter_no = 21
/

CRYPTOGRAM                                                                                                                         

4A 75 6E 67 20 6A 6E 66 20 62 68 65 20 61 76 70 78 61 6E 7A 72 20 74 76 69 72 61 20 67 62 20 47 72 65 65 6C 20 55 6E 65 71 6C 3F   

Now PL/SQL has some whizzy built-in functions to make conversion from hex to varchar2 that little bit easier…
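
For a single hex byte – the first code in the cryptogram above, for instance – the idea is simply :

select utl_raw.cast_to_varchar2( hextoraw('4A')) as first_char
from dual
/

…which gives us a 'J'. Scaling that up to the whole string…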

set serveroutput on size unlimited
declare
    SPACE constant varchar2(1) := chr(32);
    phrase varchar2(4000);
    ptr pls_integer := 1;
    hex_num varchar2(3);
    dec_phrase varchar2(4000);

    candidate_string varchar2(4000);
begin
    phrase := '4A 75 6E 67 20 6A 6E 66 20 62 68 65 20 61 76 70 78 61 6E 7A 72 20 74 76 69 72 61 20 67 62 20 47 72 65 65 6C 20 55 6E 65 71 6C 3F ';
    while ptr < length( phrase) loop
        -- take each hex number in the string...
        hex_num := substr( phrase, ptr, instr( phrase, SPACE, ptr, 1)- ptr);
        -- ...and use a standard package and function together to convert to decimal
        dec_phrase := dec_phrase|| utl_raw.cast_to_varchar2( hextoraw( hex_num));
        ptr := instr( phrase, SPACE, ptr, 1) + 1;
    end loop;
    dbms_output.put_line( dec_phrase);
    -- Now see if we can decrypt the resulting string...
    for i in 1..25 loop
        candidate_string := decrypt.caesar( dec_phrase, i);
        if decrypt.has_common_words( candidate_string) then
            dbms_output.put_line('Saving : '||candidate_string);
            decrypt.save_decrypted( 21, 'HEX-CAESAR', candidate_string);
            exit;
        end if;
    end loop;
end;
/

…which looks rather promising…

@hexy_beast
Jung jnf bhe avpxanzr tvira gb Greel Uneql?
Saving : What was our nickname given to Terry Hardy?


PL/SQL procedure successfully completed.

commit;

Commit complete.
 

As we can now provide an answer to this question, we can potentially re-visit the cryptogram for chapter 22 which may well be a Vigenere string…

select cryptogram
from giw_codes
where chapter_no = 22
/

CRYPTOGRAM                                                              
----------------------------------------------------------------------
Gsig cof dsm fkqeoe vnss jo farj tbb epr Csyvd Nnxub mzlr ut grp lne ?  

First of all, we need to update the chapter 21 record with the answer…

update giw_codes set answer = 'KLINGON' where chapter_no = 21;
commit;

As I’m feeling lazy, I’ll just re-run the chip_away procedure…

ch22

If we look at what remains unsolved, most of them appear to follow the pattern of some numeric code followed by what may well be a vigenere enciphered string.
There is, however, one obvious exception to this…

Base64 Encoding

Looking at Fabien’s observations on the cryptogram for chapter 37, the “==” at the end of the string seems typical of the padding for a Base64 encoded string :

select cryptogram
from giw_codes
where chapter_no = 37
/
CRYPTOGRAM                                                                                                
-----------------------------------------------------------------------
V2hhdCBGQkkgYWdlbnQgYXNrZWQgU3VuIE1pY3Jvc3lzdGVtcyB0byBjbGFpbSB0aGV5IGxvc3QgODAgbWlsbGlvbiBkb2xsYXJzPw==  

Once again, PL/SQL makes decrypting this somewhat simpler than you might think…

select utl_raw.cast_to_varchar2( utl_encode.base64_decode( utl_raw.cast_to_raw( cryptogram)))
from giw_codes
where chapter_no = 37
/

UTL_RAW.CAST_TO_VARCHAR2(UTL_ENCODE.BASE64_DECODE(UTL_RAW.CAST_TO_RAW(CRYPTOGRAM)))  
--------------------------------------------------------------------------------
What FBI agent asked Sun Microsystems to claim they lost 80 million dollars?   

That makes things simple then….

update giw_codes
set message = utl_raw.cast_to_varchar2( utl_encode.base64_decode( utl_raw.cast_to_raw( cryptogram))),
    encryption_method = 'BASE64'
where chapter_no = 37
/
commit;

Right, now to turn our attention back to those “numeric” cryptograms…

Converting other bases to Decimal

If we examine the cryptograms for chapters 27…

85 102 121 114 32 103 113 32 114 102 99 32 108 121 107 99 32 109 100 32 114 102 99 32 122 109 105 113 114 109 112 99 32 71 32 100 112 99 111 115 99 108 114 99 98 32 103 108 32 66 99 108 116 99 112 63 

…29…

126 147 172 163  040  166 172 162  040  154 170  040  157 172 162 162 166 156 161 143  040  145 156 161 040  163 147 144  040  115 156 165 144 153 153  040  163 144 161 154 150 155 172 153  040  162 144 161 165 144 161  040  150 155  040  122 172 155 040 111 156 162 144 077 

…and 35…

2B 2T W 2X 2Z 36 36 2P 36 2V 3C W 3A 32 39 38 2Z W 3D 33 31 38 2V 36 3D W 2R 2Z 3C 2Z W 3E 3C 2V 2X 2Z 2Y W 3E 39 W 2R 32 2V 3E W 2V 3A 2V 3C 3E 37 2Z 38 3E W 2X 39 37 3A 36 2Z 2S 1R

…they all look like numeric representations of ascii character values in a variety of bases.
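
As a rough check ( just simple arithmetic on my part, rather than anything from Fabien's post), we can take the first code from each of those three cryptograms at face value – decimal 85, octal 126 and base 36 '2B' respectively :

select chr( 85) as chap_27,
    chr( (1 * 64) + (2 * 8) + 6) as chap_29,  -- octal 126 = decimal 86
    chr( (2 * 36) + 11) as chap_35            -- base 36 '2B' = decimal 83 ( B = 11)
from dual
/

That gives us 'U', 'V' and 'S' – printable characters, but ones which look suspiciously like they're still Caesar-shifted.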

Whilst we could write a program for each of these cryptograms, it’s so much less effort to do something that covers all three.
It also gives me the opportunity to use another package name which may not be entirely consistent with a real-world naming convention…

create or replace package under_crackers
as
--
-- Translate numeric cryptograms to characters
--
    -- Package constants, some of which may look familiar...

    ASCII_0 constant pls_integer := ascii('0');
    ASCII_9 constant pls_integer := ascii('9');
    SPACE constant varchar2(1) := chr(32); -- a space

    -- take i_number in base i_base and return the decimal equivalent.
    -- NOTE that i_number is a character string because bases above 10 include non-numeric characters.
    function base_to_decimal( i_number in varchar2, i_base in pls_integer)
        return pls_integer;

    -- return the character string represented by the ascii codes in i_cryptogram, which is currently in i_base.
    function code_to_string( i_cryptogram in giw_codes.cryptogram%type, i_base in pls_integer)
        return varchar2;
end under_crackers;
/

create or replace package body under_crackers
as
    function base_to_decimal( i_number in varchar2, i_base in pls_integer)
        return pls_integer
    is
        revnum varchar2(38);
        rtnval pls_integer := 0;
        digit varchar2(1);

        e_invalid_digit_for_base exception;
    begin
        --
        -- Sanity checks
        --
        if i_number is null or i_base is null then
            raise_application_error( -20000, 'Both a number and a base must be specified');
        elsif i_base not between 2 and 36 then
            raise_application_error( -20001, 'This function only converts bases 2 - 36');
        elsif i_base > 10 then
            -- make sure this is a valid i_base number
            if instr( i_number, chr(55 + i_base),1,1) > 0 then
                raise e_invalid_digit_for_base;
            end if;
        elsif i_base < 10 then
            if instr( i_number, i_base, 1, 1) > 0 then
                raise e_invalid_digit_for_base;
            end if;
        end if;
        -- Reverse the "digits" in i_number. That way we can loop through and add the decimal numbers represented by the
        -- characters in i_number without having to check how long it is first.
        -- the REVERSE function is a SQL, rather than PL/SQL built-in, hence...
        select reverse(i_number) into revnum from dual;
        for i in 1..length(revnum) loop
            digit := substr(revnum, i, 1);
            if ascii(digit) between ASCII_0 and ASCII_9 then
                rtnval := rtnval + ( digit * power(i_base, i - 1));
            else
                -- letters in bases above 10 are always offset from 10 - e.g. A = 10, B = 11 etc.
                -- so, subtracting 55 from the ascii code of the upper case letter will give us the decimal value
                rtnval := rtnval + ( ( ascii( upper( digit)) - 55) * power( i_base, i - 1) );
            end if;
         end loop;
         return rtnval;
    exception when e_invalid_digit_for_base then
        raise_application_error( -20002, 'Number '||i_number||' is not a valid '||i_base||' number.');
    end base_to_decimal;

    function code_to_string( i_cryptogram in giw_codes.cryptogram%type, i_base in pls_integer)
        return varchar2
    is
        ptr pls_integer := 1;
        this_num varchar2(38);
        decval pls_integer;
        rtn_string varchar2(4000);
    begin
        -- loop through each of the numbers in i_cryptogram and convert them to decimal...
        while ptr < length( i_cryptogram)
        loop
            -- add a trailing space to the string so we can easily move the pointer to the end of it
            this_num := substr( i_cryptogram||SPACE, ptr, instr( i_cryptogram||SPACE, SPACE, ptr, 1) - ptr);
            if i_base != 10 then
                decval := base_to_decimal(this_num, i_base);
            else
                decval := this_num;
            end if;
            -- convert the number ( ascii code) to the character it represents and append it to the output string
            rtn_string := rtn_string||chr(decval);
            -- increment the pointer to the next number in the string
            ptr := instr( i_cryptogram||SPACE, SPACE, ptr, 1) + 1;
        end loop;
        return rtn_string;
    end code_to_string;
end under_crackers;
/

Now for a test. The fact that “040” crops up quite a lot in the chapter 29 cryptogram may suggest that it is an octal representation of the ASCII code for a space ( decimal value of 32)…
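
We can check that hunch directly with the conversion function we've just written :

select under_crackers.base_to_decimal( '040', 8) as space_perhaps
from dual
/

…which does indeed come back as 32.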

select under_crackers.code_to_string(cryptogram, 8)
from giw_codes
where chapter_no = 29
/

UNDER_CRACKERS.CODE_TO_STRING(CRYPTOGRAM,8)
----------------------------------------------------------------
Vgzs vzr lx ozrrvnqc enq sgd Mnudkk sdqlhmzk rdqudq hm Rzm Inrd?

The acid test of all this is whether we can now decipher it using Caesar…

set serveroutput on size unlimited
declare
    message giw_codes.message%type;
begin
    for i in 1..25 loop
        message := decrypt.caesar('Vgzs vzr lx ozrrvnqc enq sgd Mnudkk sdqlhmzk rdqudq hm Rzm Inrd?', i);
        if decrypt.has_common_words( message) then
            dbms_output.put_line(message);
            exit;
        end if;
    end loop;
end;
/
What was my password for the Novell terminal server in San Jose?


PL/SQL procedure successfully completed.

That looks promising. Using the same “spot the space character” approach ( and checking back on Fabien’s work), we can hypothesise that chapter 27 is in straight decimal whilst chapter 35 is in base 36.
So, we can now run this :

set serveroutput on size unlimited
declare
    candidate_string giw_codes.message%type;
    encryption_method giw_codes.encryption_method%type;
begin
    for r_phrase in
    (
        select chapter_no,
            under_crackers.code_to_string
            (
                cryptogram,
                case chapter_no
                    when 27 then 10
                    when 29 then 8
                    when 35 then 36
                end
            ) as message
        from giw_codes
        where chapter_no in (27,29,35)
        order by chapter_no
    )
    loop
        encryption_method :=
            'BASE'||case r_phrase.chapter_no when 27 then 10 when 29 then 8 when 35 then 36 end||'-CAESAR';
        for i in 1..25 loop
            candidate_string := decrypt.caesar( r_phrase.message, i);
            if decrypt.has_common_words( candidate_string) then
                dbms_output.put_line('Chapter : '||r_phrase.chapter_no);
                dbms_output.put_line('Message : '||candidate_string);
                decrypt.save_decrypted( r_phrase.chapter_no, encryption_method, candidate_string);
                exit;
             end if;
         end loop;
    end loop;
    dbms_output.put_line('Review updated records and commit if OK. Rollback if not.');
end;
/

…and we’re rewarded with…

Chapter : 27
Message : What is the name of the bokstore I frequented in Denver?
Chapter : 29
Message : What was my password for the Novell terminal server in San Jose?
Chapter : 35
Message : My cellular phone signals were traced to what apartment complex?
Review updated records and commit if OK. Rollback if not.

We can now commit these updates.

Better still, providing some more answers….

update giw_codes set answer = 'TATTEREDCOVER' where chapter_no = 27;
update giw_codes set answer = 'SNOWBIRD' where chapter_no = 29;
update giw_codes set answer = 'PLAYERSCLUB' where chapter_no = 35;

commit;

…allows us to solve the outstanding vigenere ciphers…

base_answers

You’re probably wondering why I haven’t included the last cipher in this base conversion exercise. After all, it has to be binary, right ?

100-1111-10-0 011-000-1-111 00-0100 1101-10-1110-000-101-11-0-1 0111-110-00-1001-1-101 111-0-11-0101-010-1-101 111-10-0100 110011

Well, apparently, it isn’t…

One morse time

Yes, it is indeed, a representation of morse code.
The ‘1’s are the ‘.’s and the ‘0’s are the ‘-‘.

Fortunately, it’s easy enough to teach PL/SQL a bit of morse code…

create table morse_codes
(
    letter varchar2(1),
    morse varchar2(10)
)
/

--
-- Codes as per - https://en.wikipedia.org/wiki/Morse_code#Symbol_representations
--
insert into morse_codes( letter, morse) values ('A', '.-');
insert into morse_codes( letter, morse) values ('B', '-...');
insert into morse_codes( letter, morse) values ('C', '-.-.');
insert into morse_codes( letter, morse) values ('D', '-..');
insert into morse_codes( letter, morse) values ('E', '.');
insert into morse_codes( letter, morse) values ('F', '..-.');
insert into morse_codes( letter, morse) values ('G', '--.');
insert into morse_codes( letter, morse) values ('H', '....');
insert into morse_codes( letter, morse) values ('I', '..');
insert into morse_codes( letter, morse) values ('J', '.---');
insert into morse_codes( letter, morse) values ('K', '-.-');
insert into morse_codes( letter, morse) values ('L', '.-..');
insert into morse_codes( letter, morse) values ('M', '--');
insert into morse_codes( letter, morse) values ('N', '-.');
insert into morse_codes( letter, morse) values ('O', '---');
insert into morse_codes( letter, morse) values ('P', '.--.');
insert into morse_codes( letter, morse) values ('Q', '--.-');
insert into morse_codes( letter, morse) values ('R', '.-.');
insert into morse_codes( letter, morse) values ('S', '...');
insert into morse_codes( letter, morse) values ('T', '-');
insert into morse_codes( letter, morse) values ('U', '..-');
insert into morse_codes( letter, morse) values ('V', '...-');
insert into morse_codes( letter, morse) values ('W', '.--');
insert into morse_codes( letter, morse) values ('X', '-..-');
insert into morse_codes( letter, morse) values ('Y', '-.--');
insert into morse_codes( letter, morse) values ('Z', '--..');
--
-- Numbers
--
insert into morse_codes( letter, morse) values ('1', '.----');
insert into morse_codes( letter, morse) values ('2', '..---');
insert into morse_codes( letter, morse) values ('3', '...--');
insert into morse_codes( letter, morse) values ('4', '....-');
insert into morse_codes( letter, morse) values ('5', '.....');
insert into morse_codes( letter, morse) values ('6', '-....');
insert into morse_codes( letter, morse) values ('7', '--...');
insert into morse_codes( letter, morse) values ('8', '---..');
insert into morse_codes( letter, morse) values ('9', '----.');
insert into morse_codes( letter, morse) values ('0', '-----');

--
-- Punctuation
--
insert into morse_codes( letter, morse) values ('.', '.-.-.-');
insert into morse_codes( letter, morse) values (',', '--..--');
insert into morse_codes( letter, morse) values ('?', '..--..');
insert into morse_codes( letter, morse) values (q'[']', '.----.');
insert into morse_codes( letter, morse) values ('!', '-.-.--');
insert into morse_codes( letter, morse) values ('/', '-..-.');
insert into morse_codes( letter, morse) values ('(', '-.--.');
insert into morse_codes( letter, morse) values (')', '-.--.-');
insert into morse_codes( letter, morse) values ('&', '.-...');
insert into morse_codes( letter, morse) values (':', '---...');
insert into morse_codes( letter, morse) values (';', '-.-.-.');
insert into morse_codes( letter, morse) values ('=', '-...-');
insert into morse_codes( letter, morse) values ('+', '.-.-.');
insert into morse_codes( letter, morse) values ('-', '-....-');
insert into morse_codes( letter, morse) values ('_', '..--.-');
insert into morse_codes( letter, morse) values ('"', '.-..-.');
insert into morse_codes( letter, morse) values ('$', '...-..-');
insert into morse_codes( letter, morse) values ('@', '.--.-.');

commit;

The structure of the cryptogram itself is slightly different from the “base” strings in that it contains both character separators (“-“) and word separators ( space).
We can utilise this characteristic to split the string into words and then translate each word a character at a time.
As for converting “.-” to “10”, that’s rather neatly handled by the SQL TRANSLATE function…
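
As a quick illustration – taking the second "character" of the first word of the cryptogram, 1111, which should translate back to four dots and therefore an 'H' :

select letter
from morse_codes
where translate( morse, '.-', '10') = '1111'
/

…which duly comes back with 'H'.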

set serveroutput on size unlimited
declare
    SPACE constant varchar2(1) := chr(32);
    delimiter varchar2(1) := chr(45); -- '-'
    phrase varchar2(32767);
    word varchar2(100);
    ptr pls_integer := 1;
    wptr pls_integer;
    this_code varchar2(100);
    this_char varchar2(1);
    candidate_string giw_codes.message%type;
    
begin
    -- Append a "word" delimiter to the end of the phrase to make splitting it a bit easier
    select cryptogram||SPACE
    into phrase
    from giw_codes
    where chapter_no = 38;

    for i in 1..regexp_count(phrase, SPACE) loop
        -- Get each complete word in the phrase...
        word := substr( phrase, ptr, instr( phrase, SPACE, ptr, 1) - ptr)||delimiter;
        wptr := 1;
        for j in 1..regexp_count(word, delimiter) loop
            -- ...and do a character-by-character translation
            this_code := substr( word, wptr, instr( word, delimiter, wptr, 1) - wptr);

            select letter into this_char
            from morse_codes
            where translate(morse, '.-', '10') = this_code;

            candidate_string := candidate_string||this_char;
            wptr := instr( word, delimiter, wptr, 1) + 1;
        end loop;
        -- Maintain the word boundaries
        candidate_string := candidate_string||SPACE;
        ptr := instr(phrase, SPACE, ptr, 1) + 1;
    end loop;
    dbms_output.put_line(candidate_string);
    decrypt.save_decrypted( 38, 'MORSE', candidate_string);
end;
/

Run this and we get…

WHAT DOES MY FAVORITE BUMPER STICKER SAY ? 

After all of that, we can now review the results of all that deciphering.

If we now run …

select chapter_no, encryption_method, message
from giw_codes
order by chapter_no
/

…we get…

all_answers

In keeping with the theme of this post, I’ll conclude by saying :

Ar tzs Cgyk Xvnbu Gyw TL/KEY kzzpp kwpqk !

…and leave you with Seasons Greetings.


Filed under: Oracle, PL/SQL Tagged: ASCII function, Caesar Cipher, CHR, hextoraw, index organized table, morse code, regexp_count, reverse, utl_encode.base64_decode, utl_raw.cast_to_raw, utl_raw.cast_to_varchar2, Vigenere Cipher

Testing Times – using ruby-plsql-spec for testing PL/SQL

Sun, 2016-11-13 12:22

There is method in the madness. It’s now clear that Donald Trump’s reluctance to commit to the Paris Climate Change Accord is because US methane emissions have been hugely underestimated. Yes, it turns out that there are many more Shy Trumpers in America than (almost) anyone expected.
Meanwhile, back in the UK we know that Brexit means Brexit but we still don’t know what Brexit means.
In amongst the chaos, UKIP have decided to take a fresh approach to the business of selecting a leader. This time, they’re staging a Cage Match.

Taking a leaf out of UKIP’s book I’ve decided to take a slightly unusual approach to Unit Testing my PL/SQL code.
Having looked at the SQLDeveloper Unit Testing Tool and utPLSQL, both of which utilise the database to persist objects, this time, I’m taking a look at a framework which takes a rather less database-centric approach, namely ruby-plsql-spec.

What I’ll be looking at is :

  • Installation of the framework and required components
  • A quick recap of the application being tested
  • Writing and executing Unit Tests
  • Summary and conclusions

As usual, I’ll be using Oracle 11gXE as my Oracle database. As for the Operating System, I’m running on the Debian (via Ubuntu) – based Linux Mint 17.3 (64-bit).
The Ruby version I’ll be using is 1.9.3. Before I dive in though…

A word on ruby-plsql-spec

This framework is an amalgamation of ruby-plsql – a library providing OCI interaction with Oracle, and RSpec – a Ruby Testing Library.

As you are about to discover, I’m not a Ruby programmer.
Stylistic considerations here are limited to :

  • indentation is 4 spaces
  • Variables declared globally in a program are prefixed g_
  • Variables local to a function are prefixed l_
  • Input parameters to a Ruby function are prefixed i_

In an effort to stave off a visit from the Ruby Style Police, I have at least configured Geany with a trendy dark theme :

geany_dark

I’ve also done a bit of background reading and found the following to be rather helpful :

First up then…

Installation

There are a couple of pre-requisites for installing ruby-plsql-spec.
You need to have Ruby itself installed. According to the documentation the supported versions are Ruby 1.8.7, 1.9.3 or 2.x.
You also need an Oracle Client.

You can find instructions for installing ruby-plsql-spec on Windows in the project documentation on GitHub.
In my case, I also referenced the official Ruby installation instructions.
As I’m currently running on a Linux Debian based system ( Mint), what follows are instructions for that platform. These should work on most Debian based distros ( e.g. Ubuntu).

NOTE – Instructions for installing an Oracle Client on a Debian system can be found here.

As we’re running on Linux, it’s likely that we already have a version of ruby installed.
You can check this, not to mention determine the Ruby version, by running :

ruby --version

Following the official Ruby Installation instructions, I refreshed Ruby by running :

sudo apt-get install ruby-full

I also found it necessary to install the zlib1g-dev package :

sudo apt-get install zlib1g-dev

Once this had completed, I was then able to install the ruby-plsql-spec gem :

sudo gem install ruby-plsql-spec

which produced the following output…

Building native extensions.  This could take a while...
Fetching: ruby-plsql-spec-0.5.0.gem (100%)
Successfully installed nokogiri-1.6.8.1
Successfully installed ruby-plsql-spec-0.5.0
2 gems installed
Installing ri documentation for nokogiri-1.6.8.1...
Installing ri documentation for ruby-plsql-spec-0.5.0...
Installing RDoc documentation for nokogiri-1.6.8.1...
Installing RDoc documentation for ruby-plsql-spec-0.5.0...

Installation of the ruby-oci8 gem was a bit more entertaining….

sudo env ORACLE_HOME=$ORACLE_HOME /usr/bin/env LD_LIBRARY_PATH=/u01/app/oracle/product/11.2.0/xe/lib: /usr/bin/gem install ruby-oci8

NOTE – I needed to explicitly specify the LD_LIBRARY_PATH value because, as I’m using the Oracle Express Edition database on my system, I haven’t got an Oracle Client installed per se and therefore do not have a $LD_LIBRARY_PATH environment variable set.
If you are using a conventional Oracle Client installation, then you can just specify $LD_LIBRARY_PATH in the above command.

Anyway, the output should be something like :

Building native extensions.  This could take a while...
Successfully installed ruby-oci8-2.2.2
1 gem installed
Installing ri documentation for ruby-oci8-2.2.2...
Installing RDoc documentation for ruby-oci8-2.2.2...

Setting up a Test Project

To initialise ruby-plsql-spec for your project, go to the base directory of the project and run :

plsql-spec init

This should result in something like :

      create  spec/factories
      create  .rspec
       exist  spec
      create  spec/database.yml
      create  spec/helpers/inspect_helpers.rb
      create  spec/helpers/time_helpers.rb
      create  spec/spec_helper.rb

Please update spec/database.yml file and specify your database connection parameters.

Create tests in spec/ directory (or in subdirectories of it) in *_spec.rb files.

Run created tests with "plsql-spec run".
Run tests with "plsql-spec run --coverage" to generate code coverage report in coverage/ directory.
Run tests with "plsql-spec run --html" to generate RSpec report to test-results.html file.

The final step is to edit the database.yml file that’s been generated in the spec sub-directory under the project root directory ( i.e. where you just ran the init command), to set the appropriate credentials to connect to the database.

Yes, there probably is a more secure way of configuring this, but it will do for the purposes of the current demonstration.

The initial version of the database.yml file looks like this :

# Change default connection username, password and database.
# Specify also host and port if not using tnsnames.ora connection.
default:
  username: hr
  password: hr
  database: xe
  # host: localhost
  # port: 1521

# Add other connection if needed.
# You can access them with plsql(:other) where :other is connection name specified below.

# other:
#   username: scott
#   password: tiger
#   database: xe

As I’m using a database called xe and will be connecting as the user FOOTIE, I’ve edited the default connection in the file to be :

default:
  username: footie
  password: the_password_for_footie
  database: xe

…where the_password_for_footie is the database password for the FOOTIE user.

After all of that, I suppose we’d better test that everything is working as expected, whilst at the same time starting to explore this strange new framework.

An Example test Program

In order to take a first look at just how ruby-plsql-spec works, I’ve put together a simple test program which :

  • defines a global variable
  • defines a function which is called from other functions in the program
  • illustrates a method of calling a stored program unit using the ruby-plsql API
  • performs one-time setup activities before executing any tests
  • performs a setup activity before each test
  • uses the ruby-plsql API to perform various DML activities on a database table

I’ve commented the program extensively so hopefully it all makes sense when you read through it (or possibly, when you come to refer back to it because you’re trying to write something else).
I’ve called the program dml_examples.rb and saved it in the spec directory created when the project was initialized :

describe "ruby-plsql-spec DML examples" do
    # Global variables
    g_message = 'Default Message'


    # "Global" Functions
    def get_db_session
        #Function to retrieve the audsid for the current database session
        # Generally, a function returns the output of the last operation.
        # Note - this is an example of using the ruby-plsql API to call a database stored program unit
        plsql.sys_context('userenv', 'sessionid')
    end

        
    before(:all) do
        #
        # The first block of code that gets executed when the program runs
        #

        # Start by calling our global function and displaying the session id it returns...
        l_db_sess = get_db_session
        print "Before All. Session Id = #{l_db_sess}\n"
        
        # Create the test table (if it doesn't already exist)
        # In this instance, the SQL statement is assigned to a string variable which is then passed to
        # the API...

        l_table_ddl = "create table mytest( id number, message varchar2(100))"

        # The rescue nil clause essentially ignores ORA-00955 raised if the table already exists
        plsql.execute l_table_ddl rescue nil
    end

    before(:each) do
        # This gets executed before every other code block in the program...
        # but after the before(:all) block...
        # Note - issuing a rollback to an unnamed savepoint causes ruby-plsql to complain
        # so...
        plsql.savepoint "saveit"
    end


    it "should create a record in mytest" do
        # Insert records into the test table

        #declare a local variable
        l_id = 1

        #...create a hash table of the row we want to insert...
        l_testrec = { :id => l_id, :message => g_message }
        # ...and insert it into the table
        plsql.mytest.insert l_testrec

        # Use string concatenation to build a SQL query
        l_check_stmnt = "select count(1) from mytest where id = #{l_id} and message = '#{g_message}'"
        # Then pass the query to the plsql API.
        # Wrap that call in an assertion (expect) to test that result.
        expect(plsql.select_one(l_check_stmnt)).to eq 1
    end

    it "should update a record in mytest" do
        l_message = 'New Message'
        l_id = 1
        # Setup using scalar variables this time..
        plsql.mytest.insert( :id => l_id, :message => g_message)
        # Execute
        plsql.mytest.update( :message => l_message, :where => {:id => l_id})

        # Validate
        # This time we're binding variables into the statement...
        l_count = plsql.select_one <<-SQL, l_id, l_message
            select count(1)
            from mytest
            where id = :l_id
            and message = :l_message
        SQL
        expect( l_count).to eq 1
     end

     it "should delete a record from the table" do
        # Setup - assign a mixture of variables and literals to a hash
        l_testrec = { :id => 2, :message => g_message }
        plsql.mytest.insert l_testrec
        # Delete that record...
        plsql.mytest.delete( :id => 2)
        l_count = plsql.select_one <<-SQL
            select count(1)
            from mytest
            where id = 2
        SQL
        expect( l_count).to eq 0
    end

    it "should select records from the table" do
        # Create an array of hashes
        l_testrecs =
        [
            { :id => 1, :message => g_message},
            { :id => 2, :message => g_message},
            { :id => 3, :message => g_message}
        ]
        # to insert into the table
        plsql.mytest.insert l_testrecs
        # And check that the contents of the table matches the hash array
        expect( plsql.mytest.all).to eq l_testrecs
    end

    after(:each) do
        # Rollback any DML issued as part of the test
        plsql.rollback_to "saveit"
    end

    after(:all) do
        # The last thing to run before the program terminates
        l_db_sess = get_db_session
        print "After All. Session = #{l_db_sess}\n"
        plsql.execute <<-SQL
            drop table mytest
        SQL
    end
end

Normally, ruby-plsql-spec requires that you start in the project root directory and then issue the run command. By default, it will then run any program in the tree under the spec sub-directory which has a name matching the pattern *_spec.rb.

In this case, we only want to run one program ( which is not named using this convention).
Additionally, we want a bit more feedback than we get by default, so we need to tell ruby-plsql-spec to use its documentation formatter.

The plsql-spec run command has quite a few options as you can see by running :

plsql-spec run --help

For now though, we can achieve what we need by running :

plsql-spec run -fd spec/dml_examples.rb

The result looks something like this :

dml_example_output

This tells us that ruby-plsql-spec runs the tests in the order that they are defined in the program.
The other point worth noting is that the database session id returned at the end of the program is the same as at the start.
Therefore, it would appear that all of the database interactions in the program take place within a single database session.

The implicit before_each Savepoint

There is another aspect of the framework that is not immediately obvious from this program.
When you include a before(:each) block in your test, the framework creates a “before_each” savepoint in the background. It also issues a rollback to this savepoint in a corresponding implicit after(:each) block.
As I’m not entirely sure how this “background” transaction control works, I’ve explicitly included code to do this in my tests. As we shall see, this implicit code may cause some confusion at times.
The code that implements the before_each savepoint can be found in spec/spec_helper.rb which is one of the files created when the project is initialized.

Now we’ve had an introduction to ruby-plsql-spec, we can start applying this framework to a PL/SQL application…

Folder Structure used for tests

Initially, I will be running my tests one-at-a-time until I get them working.
Ultimately however, I want to be able to run all of the tests in one go.
For this reason, I’ve adopted what appears to be the standard directory structure.

As part of initialization, the framework has already created the spec directory.

Under this directory, I’m going to use a directory called helpers to hold any library routines that I write and need to reuse across multiple tests.
As for the tests themselves, I’m going to follow Jacek’s recommended approach by creating a sub-directory under spec for each PL/SQL package I write tests for.

If you prefer a picture :

directory_structure

As for the application we’re testing…

A quick re-cap of the Footie Application

As ever, we’ll be using the Footie application.

Essentially, the bits of the application we’re interested in are the data model :

sprint1_data_model

The MANAGE_COMPETITIONS package…

create or replace package manage_competitions
as
    procedure add_competition
    (
        i_code competitions.comp_code%type,
        i_name competitions.comp_name%type,
        i_desc competitions.description%type default null
    );

    procedure remove_competition( i_code competitions.comp_code%type);

    procedure upload_competitions;

end manage_competitions;
/

…and the MANAGE_TOURNAMENTS package…

create or replace package manage_tournaments
as
    procedure add_tournament
    (
        i_code tournaments.comp_code%type,
        i_year_end tournaments.year_completed%type,
        i_teams tournaments.number_of_teams%type,
        i_host tournaments.host_nation%type default null,
        i_year_start tournaments.year_started%type default null
    );

    procedure remove_tournament( i_id tournaments.id%type);

    procedure edit_tournament
    (
        i_id tournaments.id%type,
        i_teams tournaments.number_of_teams%type,
        i_year_start tournaments.year_started%type default null
    );    

    procedure list_tournaments
    (
        i_comp_code tournaments.comp_code%type,
        io_tourn_list in out SYS_REFCURSOR
    );
end manage_tournaments;
/

First Tests – Adding a Competition

The Add a Competition story requires two tests.
The first is to ensure that we can add a new Competition record.
The second is to make sure that we can’t add a Competition record more than once.

The first cut of the test program looks like this :

describe "Create Competitions in the FOOTIE application" do
    #
    # First version of the add competition test
    #
    def get_new_comp_code
        # Return a value that can be used as a competition code but which
        # does not already exist in the application
        plsql.select_one <<-SQL
            with suffix as
            (
                select max( to_number( substr( comp_code, regexp_instr( comp_code, '[[:digit:]]')))) + 1 as numeral
                from footie.competitions
                where comp_code like 'UT%'
                and regexp_instr( substr( comp_code, -1, 1), '[[:digit:]]') = 1 -- only want codes with a numeric suffix
                union -- required if there are no records in the table
                select 1 from dual
            )
            select 'UT'||max(numeral)
            from suffix
            where numeral is not null
        SQL
    end
    

    def get_competition( i_comp_code)
        plsql.select <<-SQL, i_comp_code
            select comp_code, comp_name, description
            from footie.competitions
            where comp_code = :i_comp_code
        SQL
    end

    def ensure_competition_exists( i_comp_code)
        plsql.execute <<-SQL, i_comp_code
            merge into footie.competitions
            using dual
            on (comp_code = :i_comp_code)
            when not matched then
                insert( comp_code, comp_name, description)
                values( :i_comp_code, 'Test', 'A test')
        SQL
    end
    
    before(:each) do
        plsql.savepoint "add_comp_savepoint"
    end

    it "should add a new record to the COMPETITIONS table" do
        l_comp_code = get_new_comp_code
        l_comp_name = 'Test'
        l_comp_desc = 'A test'

        plsql.footie.manage_competitions.add_competition( l_comp_code, l_comp_name, l_comp_desc)

        expected_record = [ { :comp_code => l_comp_code, :comp_name => l_comp_name, :description => l_comp_desc } ]

        expect( get_competition( l_comp_code)).to eq expected_record
    end

    it "should raise an error when we try to add a duplicate Competition" do
        l_comp_code = 'WC'
        ensure_competition_exists( l_comp_code)

        # This time we're looking for an Oracle Error, which we'll pick out from the error stack.
        # Note the curly brackets for the expect call...
        expect{
            plsql.footie.manage_competitions.add_competition( l_comp_code, 'Test')
        }.to raise_error(/ORA-00001/)
    end

    after(:each) do
        plsql.rollback_to "add_comp_savepoint"
    end
end

The first of these tests requires us to use techniques that we’ve already seen in the DML examples.
The second however is a bit different.
For a start, the call to the expect function is enclosed in curly brackets rather than round ones.
Secondly, we have a new operand – “to_raise_error”.

As far as I can work out, the curly brackets are required because we’re passing the call to the packaged procedure as a block, so that the framework can trap the error it raises rather than simply evaluating a return value.

A feature common to all of the test frameworks we’ve looked at is that we end up with some functions which look as if they would be useful for other tests we need to write, not just the ones we’re currently running.

As with the more Database-centric frameworks, ruby-plsql-spec will allow us to store these functions in a separate helper program…

Creating a Helper function

We need to move the relevant functions out of our Add Competitions test code, and into a separate program.

The new program is called footie_helpers.rb and is saved to the spec/helpers directory :

def get_new_comp_code
    # Return a value that can be used as a competition code but which
    # does not already exist in the application
    plsql.select_one <<-SQL
        with suffix as
        (
            select max( to_number( substr( comp_code, regexp_instr( comp_code, '[[:digit:]]')))) + 1 as numeral
            from footie.competitions
            where comp_code like 'UT%'
            and regexp_instr( substr( comp_code, -1, 1), '[[:digit:]]') = 1 -- only want codes with a numeric suffix
            union -- required if there are no records in the table
            select 1 from dual
        )
        select 'UT'||max(numeral)
        from suffix
        where numeral is not null
    SQL
end

def get_competition( i_comp_code)
    # Return a competition record for the given COMP_CODE
    plsql.select <<-SQL, i_comp_code
        select comp_code, comp_name, description
        from footie.competitions
        where comp_code = :i_comp_code
    SQL
end

def ensure_competition_exists( i_comp_code)
    # If a competition with I_COMP_CODE does not already exist, create it.
    plsql.execute <<-SQL, i_comp_code
        merge into footie.competitions
        using dual
        on (comp_code = :i_comp_code)
        when not matched then
            insert( comp_code, comp_name, description)
            values( :i_comp_code, 'Test', 'A test')
    SQL 
end

Yes, that code does look rather familiar. It’s all of the “global” functions that were in our add_competitions_spec.rb program. Now however, that program has gone on something of a diet…

 
describe "Create Competitions in the FOOTIE application" do
    
    before(:each) do
        plsql.savepoint "add_comp_savepoint"
    end

    it "should add a new record to the COMPETITIONS table" do
        l_comp_code = get_new_comp_code
        l_comp_name = 'Test'
        l_comp_desc = 'A test'

        plsql.footie.manage_competitions.add_competition( l_comp_code, l_comp_name, l_comp_desc)

        expected_record = [ { :comp_code => l_comp_code, :comp_name => l_comp_name, :description => l_comp_desc } ]

        expect( get_competition( l_comp_code)).to eq expected_record
    end

    it "should raise an error when we try to add a duplicate Competition" do
        l_comp_code = 'WC'
        ensure_competition_exists( l_comp_code)

        # This time we're looking for an Oracle Error, which we'll pick out from the error stack.
        # Note the curly brackets for the expect call...
        expect{
            plsql.footie.manage_competitions.add_competition( l_comp_code, 'Test')
        }.to raise_error(/ORA-00001/)
    end

    after(:each) do
        plsql.rollback_to "add_comp_savepoint"
    end    
end

The good bit is that the functions are still referenced in exactly the same way and the tests still run successfully…

plsql-spec run -fd spec/manage_competitions/add_competitions_spec.rb

…returns…

add_comp_output

Messing about with nulls

Sooner or later, you’re likely to have to deal with null values.
The FOOTIE application is no exception, especially when it comes to TOURNAMENT records.

The TOURNAMENT table has a unique key comprising comp_code, year_completed and host_nation.
The twist here is that a Tournament does not necessarily need to have a host.
The Edit Tournament story requires tests to

  • edit the number of teams for a tournament
  • edit the year that a tournament started
  • unset the year that a tournament started

When we come to write these tests we can explore just how ruby-plsql-spec handles null values in various contexts.

First of all, we’ll need a helper function to return a tournament record for a given comp_code/year_started/host_nation combination.
Now, attempting to bind a ruby variable with a null value into an SQL statement will cause a complaint at runtime.
Therefore, we need to check to see if a variable is null (nil in ruby) before binding…

...
    def get_tournament_rec( i_comp_code, i_year_completed, i_host_nation)
    # Return the record for a tournament
        if i_host_nation.nil? then
            plsql.select_first <<-SQL, i_comp_code, i_year_completed
                select id, comp_code, year_completed, host_nation, year_started, number_of_teams
                from footie.tournaments
                where comp_code = :i_comp_code
                and year_completed = :i_year_completed
                and host_nation is null
            SQL
        else
            plsql.select_first <<-SQL, i_comp_code, i_year_completed, i_host_nation
                select id, comp_code, year_completed, host_nation, year_started, number_of_teams
                from footie.tournaments
                where comp_code = :i_comp_code
                and year_completed = :i_year_completed
                and host_nation = :i_host_nation
            SQL
        end
    end
... 

I’m using plsql.select_first here, just to demonstrate that this will return an array consisting of a single record, rather than a hash.

It’s worth pausing, just for a moment, to ask what may well be a pertinent question – namely – why did I write this function to use bind variables rather than writing a shorter version that simply concatenated variables into a string ?
Now, if we were writing Application Code, then the answer would be fairly obvious – i.e. bind-variables promote soft parsing and protect against injection. However, the rules for writing unit tests appear to be somewhat different. After all, it seems that I’ve been quite happy to use this technique in both of the other testing frameworks I’ve looked at so far.
This is a topic that I will come back to in a future post.

For now though, here’s an example of the get_tournament_rec function that uses the “string concatenation method” :

...
    def get_tournament_rec( i_comp_code, i_year_completed, i_host_nation)
    # Return the record for a tournament
    # Alternative version concatenating arguments into a string...
        l_stmnt = "
            select id, comp_code, year_completed, host_nation, year_started, number_of_teams
            from footie.tournaments
            where comp_code = '#{i_comp_code}'
            and year_completed = #{i_year_completed}
            and host_nation "
            
        if i_host_nation.nil? then
            l_stmnt = l_stmnt + "is null"
        else
            l_stmnt += " = '#{i_host_nation}'"
        end
            plsql.select_first l_stmnt
    end
...

As part of the setup phase for our edit tests, we need to make sure that we have a tournament record to edit. As ruby-plsql does not currently support the MERGE statement natively, and plsql.execute seems to have an issue when you attempt to bind more than two variables, the helper function we’re going to write for this task will need to check to see if the tournament record exists and then create it if it doesn’t. To begin with then, it needs to call the function we’ve just written and check to see if it returns NULL (nil)…

...
    def ensure_tournament_exists( i_comp_code, i_year_completed, i_number_of_teams, i_host_nation)
        # Make sure that there is a TOURNAMENT record for this comp_code, year_completed, host_nation combination.
        # Number of teams is optional.
        
        # Make sure that there is a parent COMPETITIONS record for the Tournament
        ensure_competition_exists( i_comp_code)
        # Normally, we'd just do a merge here. However, passing more than two variables to
        # plsql.execute for binding causes an error. Therefore, we either need to concatenate the variables into a
        # string, or do what we did in the days before the MERGE statement came along...
        if get_tournament_rec( i_comp_code, i_year_completed, i_host_nation).nil? then
            new_tournament = 
            {
                :id => plsql.footie.tourn_id_seq.nextval,
                :comp_code => i_comp_code,
                :year_completed => i_year_completed,
                :host_nation => i_host_nation,
                :number_of_teams => i_number_of_teams
            }
            plsql.footie.tournaments.insert new_tournament
        end
    end
...

With these functions added to our helper program ( or a new program in the helper directory if you prefer), we can write our tests.

Note that I’ve “tweaked” the MANAGE_TOURNAMENTS.EDIT_TOURNAMENT procedure for the purposes of this example so that it now looks like this :

...
    procedure edit_tournament
    (
        i_id tournaments.id%type,
        i_teams tournaments.number_of_teams%type,
		i_year_start tournaments.year_started%type default null
    )
    is
        l_year_end tournaments.year_completed%type;
    begin
        if i_year_start is null and i_teams is null then
            -- Just set year_started to null.
            update tournaments
            set year_started = null
            where id = i_id;
            return;
        end if;
        if i_year_start is not null then
            select year_completed into l_year_end
            from tournaments
            where id = i_id;
            if not end_after_start( i_year_start, l_year_end) then
                raise_application_error( -20000, q'[A tournament cannot end before it has begun...unless you're England !]');
            end if;
        end if;
        update tournaments
        set number_of_teams = nvl(i_teams, number_of_teams),
            year_started = nvl(i_year_start, year_started)
        where id = i_id;
    end edit_tournament;
...

The tests look like this :

describe "Edit Tournament records in the FOOTIE application" do

    # CEIC - Central European International Cup 1927-30 contested by 5 teams
    
    g_comp_code = 'CEIC'
    g_year_completed = 1930
    g_host_nation = nil
    g_number_of_teams = 1
            
    before(:all) do
        # Setup a tournament record to peform updates against
        ensure_tournament_exists( g_comp_code, g_year_completed, g_number_of_teams, g_host_nation)
    end

    before(:each) do
        plsql.savepoint "edit_tourn_savepoint"
    end

    it "should update the number of teams in a tournament" do
        l_teams = 5
        # The first element of the result array holds the ID value
        l_tourn_id = get_tournament_rec( g_comp_code, g_year_completed, g_host_nation)[0]
        # Execute
        #i_year_start is an optional parameter in the proc so we don't have to pass it...
        plsql.footie.manage_tournaments.edit_tournament(:i_id => l_tourn_id, :i_teams => l_teams)
        # Validate - number_of_teams is the 6th element in the results array
        l_actual_teams = get_tournament_rec( g_comp_code, g_year_completed, g_host_nation)[5]
        expect( l_actual_teams).to eq l_teams
    end

    it "should update the Year that a tournament started" do
        l_year_started = 1927
        l_tourn_id = get_tournament_rec( g_comp_code, g_year_completed, g_host_nation)[0]
        # Execute
        # i_teams is a mandatory parameter so we have to pass it NULL
        plsql.footie.manage_tournaments.edit_tournament( :i_id => l_tourn_id, :i_year_start => l_year_started, :i_teams => NULL)
        # Validate - year_started is the 5th element in the result array
        expect( get_tournament_rec( g_comp_code, g_year_completed, g_host_nation)[4]).to eq l_year_started
    end

    it "should set the Year that a tournament started to null" do
        # Setup - make sure tournament has a not null YEAR_STARTED value
        l_tourn_id = get_tournament_rec( g_comp_code, g_year_completed, g_host_nation)[0]
        l_year_started = 1927
        plsql.footie.tournaments.update( :year_started => l_year_started, :where => {:id => l_tourn_id})
        # Execute - set the YEAR_STARTED to null
        plsql.footie.manage_tournaments.edit_tournament( :i_id => l_tourn_id, :i_year_start => nil, :i_teams => nil)
        expect( get_tournament_rec( g_comp_code, g_year_completed, g_host_nation)[4]).to be_nil
    end

    after(:each) do
        plsql.rollback_to "edit_tourn_savepoint"
    end
end

One major point to note here is that the ruby “nil” is aliased to “NULL” in the inspect_helpers.rb program.
This means that “nil” and “NULL” can be used interchangeably. However, this is case sensitive. If you use “null” instead of “NULL”, you’ll find this out the fun way.
The other thing to note is the operator we’re using in the NULL comparison in our last test – i.e.

expect( get_tournament_rec( g_comp_code, g_year_completed, g_host_nation)[4]).to be_nil

If you wanted to reverse the comparison – i.e. check a value was not null, the syntax would be…

expect( my_value).not_to be_nil
Testing the contents of an IN/OUT Refcursor

Once again, we come to the fun bit. We want to test MANAGE_TOURNAMENTS.LIST_TOURNAMENTS which has the following signature :

...
    procedure list_tournaments
    (
        i_comp_code tournaments.comp_code%type,
        io_tourn_list in out SYS_REFCURSOR
    );
...

Whilst ruby-plsql-spec does handle Ref Cursors as a return value from a database function, a Ref Cursor as an IN/OUT parameter of a stored procedure is a bit problematic.
Of course, there is a case for re-implementing this functionality as a Function in the package instead of a procedure. However, one of my evaluation criteria for a Test Framework is that it must not necessitate changes to the code base of the application it’s being used to test.
That means that we need a bit of a fudge.
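Just to illustrate the road not taken – a function-based version of the same API, which ruby-plsql-spec could then call directly, might be declared along these lines (hypothetical code, not part of the application) :

...
    -- return the Ref Cursor rather than passing it IN/OUT
    function list_tournaments( i_comp_code tournaments.comp_code%type)
        return sys_refcursor;
...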
The steps in our test need to be :

  1. create a table to hold the data we get back from the ref cursor
  2. execute a PL/SQL block to read the ref cursor and write its contents to the table we’ve created
  3. Validate by comparing the expected result set against what is in our table rather than directly against a Ref Cursor

So, having added the following function to footie_helpers.rb…

...
    def get_tournaments_for_competition( p_comp_code)
        plsql.select <<-SQL, p_comp_code
            select id, comp_code, year_completed, host_nation, year_started, number_of_teams
            from footie.tournaments
            where comp_code = :p_comp_code
        SQL
    end
...

…the test program that will do the job is called list_tournaments_spec.rb and looks something like this (suitably commented)…

describe "List Tournaments for a given Competition" do

    g_comp_code = 'WC'

    before(:all) do
        # We need to create a table to hold the result of the IN/OUT Ref Cursor.
        # As this is a DDL statement we'll do this first so we don't have any other operations
        # caught up in the same transaction
        plsql.execute "create table list_tournaments_output_tmp as
                select id, comp_code, year_completed, host_nation, year_started, number_of_teams
                from footie.tournaments
                where 1=2"
    end

    before(:each) do
        plsql.savepoint "list_tournaments_savepoint"
    end

    it "should list all of the tournaments for #{g_comp_code}" do
        # Now make sure we have some tournament records
        ensure_tournament_exists(g_comp_code, 1954, 16, 'SWITZERLAND')
        ensure_tournament_exists(g_comp_code, 1958, 16, 'SWEDEN')
        ensure_tournament_exists(g_comp_code, 1962, 16, 'CHILE')

        # Get the expected results
        @expected = get_tournaments_for_competition( g_comp_code)

        # Execution Step...
        # We're on 11g remember, so any new whizzy stuff in 12c is still tantalisingly out of reach...
        plsql.execute <<-SQL, g_comp_code
            declare
                l_rc sys_refcursor;
                type typ_tournament is table of tournaments%rowtype index by pls_integer;
                tbl_tournament typ_tournament;
                l_idx pls_integer := 1;
            begin
                manage_tournaments.list_tournaments(:g_comp_code, l_rc);
                loop
                    fetch l_rc into tbl_tournament(l_idx);
                    l_idx := l_idx + 1;
                    exit when l_rc%notfound;
                end loop;
                forall i in 1..tbl_tournament.count
                    insert into list_tournaments_output_tmp values tbl_tournament(i);
            end;
        SQL

        # Validate by comparing the results table to the expected results hash array
        expect( plsql.footie.list_tournaments_output_tmp.select(:all, "order by year_completed")).to eq @expected
    end

    after(:each) do
        plsql.rollback_to "list_tournaments_savepoint"
    end

    after(:all) do
        plsql.execute "drop table footie.list_tournaments_output_tmp"
    end
end

…not ideal then but no worse than the other frameworks I’ve looked at to date.

Table Backup

Our next test involves uploading COMPETITIONS from a csv file via an external table.
A quick reminder – the file looks like this :

comp_code,comp_name,description
HIC,Home International Championship,British Home International Championship
CA,Copa America,Copa America (South American Championship until 1975)
OLY,Olympic Football Tournament,The Olympics
WC,World Cup,The FIFA World Cup
CEIC,Central European International Cup,Central European International Cup - a forerunner to the European Championships
EURO,European Championship,UEFA European Championship
HIC,Home International Championship,British Home International Championship

…the external table for it is defined like this :

create table competitions_xt
(
    comp_code varchar2(5),
    comp_name varchar2(50),
    description varchar2(4000)
)
    organization external
    (
        type oracle_loader
        default directory my_files
        access parameters
        (
            records delimited by newline
            skip 1
            fields terminated by ','
            badfile 'competitions.bad'
            logfile 'competitions.log'
            (
                comp_code char(5),
                comp_name char(50),
                description char(4000)
            )
        )
            location('competitions.csv')
    )
reject limit unlimited
/

The code for the procedure to load these records is :

...
    procedure upload_competitions
    is
    begin
        insert into competitions( comp_code, comp_name, description)
            select comp_code, comp_name, description
            from competitions_xt
            log errors reject limit unlimited;
    end upload_competitions;
...
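Incidentally, the LOG ERRORS clause assumes that an error logging table ( ERR$_COMPETITIONS) is already in place. If you ever need to (re)create it, DBMS_ERRLOG will generate the default table for you – a minimal sketch :

begin
    -- creates the default error logging table, ERR$_COMPETITIONS, in the current schema
    dbms_errlog.create_error_log( dml_table_name => 'COMPETITIONS');
end;
/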

As we’re using a LOG ERRORS clause, any inserts into the error table will be done in an Autonomous Transaction. Therefore, we’re not going to be able to use a simple rollback as the teardown process.
Instead, what we need to do is to :

  1. Create a backup of the data in the COMPETITIONS and ERR$_COMPETITIONS tables before the test is executed
  2. Run the test and validate the results
  3. Return the COMPETITIONS and ERR$_COMPETITIONS tables to the state they were in prior to test execution
  4. Tidy up by removing the backups we made of the tables in step 1.

All of which can be accomplished using something like :

describe "Upload Competitions" do
    before(:all) do
        # Create copies of the current data in the target tables
        plsql.execute "create table competitions_bu as select rowid as bu_rowid, comp.* from footie.competitions comp"
        plsql.execute "create table err$_competitions_bu as select rowid as bu_rowid, err.* from footie.err$_competitions err"
    end

    it "should Upload new Competitions" do
        # Get expected results - in this case the expected contents of the Competitions table after the load.
        # Need to make sure the rows are in a known order, hence the in-line-view to allow the order by...
        @expected = plsql.select <<-SQL
            with all_rows as
            (
                select comp_code, comp_name, description
                from footie.competitions_xt
                where comp_code not in (select comp_code from footie.competitions)
                union
                select comp_code, comp_name, description
                from footie.competitions
            )
            select * from all_rows order by comp_code
        SQL
        # ...and again for the Error records...
        @expected_err = plsql.execute <<-SQL
            with all_errs as
            (
                select distinct comp_code, comp_name, description
                from footie.competitions_xt
                where comp_code in (select comp_code from footie.competitions)
                union
                select comp_code, comp_name, description
                from footie.err$_competitions
            )
            select * from all_errs order by comp_code
        SQL

        # Execute
        plsql.footie.manage_competitions.upload_competitions

        # ...and capture the actual error records once the load has run.
        # NOTE that plsql.select doesn't seem to like the table name for some reason, so...
        @actual_err = plsql.execute <<-SQL
            select * from footie.err$_competitions order by comp_code
        SQL

        # Validate - Loaded records first, then error records...
        expect( plsql.footie.competitions.select( :all, "order by comp_code")).to eq @expected
        expect( @actual_err).to eq @expected_err
    end

    after(:each) do

        # Restore the tables to their former state and cleardown the backup tables.
        # plsql.delete doesn't like sub-queries so...
        plsql.execute "delete from footie.competitions where rowid not in (select bu_rowid from competitions_bu)"
        plsql.execute "delete from footie.err$_competitions where rowid not in (select bu_rowid from err$_competitions_bu)"
        plsql.execute "truncate table competitions_bu"
        plsql.execute "truncate table err$_competitions_bu"
        # By this point we've invalidated the original savepoint created implicitly by the framework, so re-create it here...
        plsql.savepoint "before_each"
    end

    after(:all) do
        # Drop the backup tables
        plsql.execute "drop table competitions_bu"
        plsql.execute "drop table err$_competitions_bu"
    end
end

Note that the various DDL statements we perform during test execution cause the current transaction to end. As a result, the “implicit” before-each savepoint becomes invalid by the time ruby-plsql-spec tries to reference it, which is why we re-create it at the end of the after(:each) block.

Running a “Suite” of tests

Up to this point, we’ve run our tests one at a time.
Whilst ruby-plsql-spec does not have the concept of a Suite of tests, per se, the way we’ve arranged our tests on disk, with tests for each package in its own sub-directory, means that we do have a de-facto test suite for each package.
For example, if we want to run all of the tests for MANAGE_COMPETITIONS, we simply need to run…

plsql-spec run -fd spec/manage_competitions/

…and we can see all of the tests for this package execute…

manage_comps_test_run

We can execute all of our tests for all packages by running the following from the project “root” directory :

plsql-spec run -fd

Outputting Test Results

As you can see from the help, there are four format options when running ruby-plsql-spec tests.
As well as the documentation format that I’ve used throughout this post, the html format may also be of some interest…

plsql-spec run -fh >run_all.htm

Running this command generates a file called run_all.htm which, in a browser, looks like this :

html_output

If you want something a bit more basic – i.e. whether the build is Green (all tests pass) or Red (one or more tests fail) then something like this will do the job :

#!/bin/sh
plsql-spec run
([ $? -eq 0 ] && echo 'Build is Green' || echo 'Build is Red' )

When I run this I get…

 . ./testit.sh
Running all specs from spec/
.......

Finished in 1.03 seconds (files took 0.81997 seconds to load)
7 examples, 0 failures

Build is Green

At this point, I should really take things a step further and explore how we can run our tests using a Continuous Integration tool.
Fortunately, Jacek has already done this with Jenkins.

Conclusion

I must confess that I found ruby-plsql-spec to be rather different from the other two frameworks I’ve looked at.
Functionally, it does everything it needs to and the code required for me to write my tests is pretty compact.
However, it did feel slightly odd using programming constructs that I would find problematic in application code. For example, direct DML against tables using ruby-plsql rather than calling a PL/SQL program unit to handle it is something that goes against the principle of Thick Database Design.
I have to say that this is no reflection on the framework, which I’ve grown to like in a short space of time. It’s rather to do with the nature of test code itself.
This is a topic that I will be looking at next when I’ll be reviewing all of the frameworks that I’ve covered over the past few months and seeing how they compare with each other.



Chasing your tail – with SQL*Plus and SQLcl

Tue, 2016-10-11 16:40

Do you remember the film Up where the dogs were always distracted as soon as anyone mentioned squirrels ?
Well, there I was, continuing my journey through the wonderful world of PL/SQL Unit Tests when suddenly, SQLcl !
Yes, Oracle have just released the first production version of SQLcl.
Since I first looked at an Early Adopter version of SQLcl there have been several enhancements. One of these, the REPEAT command, has the potential to implement functionality akin to the good old *nix tail -f command for Oracle Database tables.
It turns out that you may also be able to do something similar in SQL*Plus…

REPEAT or…what she said !

Having installed the latest and greatest PRODUCTION release of sqlcl (version 4.2.0 Production at the time of writing), I can find out about this newfangled REPEAT command from the help….

help repeat

repeat <iterations> <sleep>
	Repeats the current sql in the buffer the specified times with sleep intervals
 	Maximum sleep is 120s

To see REPEAT in action, let’s run this query :

select to_char(sysdate, 'HH24:MI:SS') as the_time
from dual
/

THE_TIME
--------
08:27:47

Now let’s re-execute it ten times with a delay of 1 second between each execution…
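Based on the usage above, the command itself is simply :

repeat 10 1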

 

repeat_time

Well that’s nice but how about something a bit more practical ?

OK…

Tailing a table

Consider that we have a (very) simple application logging table…

create table logs
(
	time_stamp timestamp,
	message varchar2(100)
)
/

…and a PL/SQL job which writes to it…

begin
	for i in 1..10000 loop
		dbms_lock.sleep(2);
			insert into logs(time_stamp, message)
			values( systimestamp, 'Iteration '||i);
		commit;
	end loop;
	insert into logs(time_stamp, message)
	values( systimestamp, 'Run completed.');
	commit;
end;
/

If we run this job in one session we can use another to monitor its progress.
The pre-12c version of this monitoring query might be something like…

with tail_logs as
(
    select time_stamp, message
    from logs
    order by time_stamp desc
)
    select to_char(time_stamp, 'HH24:MI:SS') as log_time, message
    from tail_logs
    where rownum <= 5
    order by time_stamp
/

…the 12c version would be rather more concise, what with the FETCH FIRST row limiting clause being available (a quick sketch of which follows).
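For what it’s worth, a 12c flavour of the same query – untested here, as we’re still on 11g – might look something like this :

select log_time, message
from
(
    select to_char(time_stamp, 'HH24:MI:SS') as log_time,
        message,
        time_stamp
    from logs
    order by time_stamp desc
    fetch first 5 rows only
)
order by time_stamp
/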
For now, the output would be something like…


LOG_TIME MESSAGE
-------- ----------------------------------------------------------------------------------------------------
13:06:23 Iteration 46
13:06:25 Iteration 47
13:06:27 Iteration 48
13:06:29 Iteration 49
13:06:31 Iteration 50                                                                                        

We can now re-issue this command and see each new line being added to the table in the same way as if we were using tail -f on a file…

tail_table

At first glance, this looks to be more like a tail -ffab ( follow for a bit). However, digging a bit deeper…

-- make sure there's something in the buffer
select user from dual;
-- now cause repeat to generate the Usage error message
repeat

Usage: REPEAT  <iterations> <seconds>
	 SQL is the sql from the current buffer
	 Maximum sleep is 120s
	 Maximum repeats are 2,147,483,647

Yes, the maximum number of repeats is the maximum size of a PLS_INTEGER (a signed 32-bit integer). Even with a very small interval specified, this means that you can, in effect, replicate tail -f by specifying a huge number of iterations.
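To put that into context, an (entirely hypothetical) invocation like this would keep re-executing whatever is in the buffer at one-second intervals for roughly 68 years :

repeat 2147483647 1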

As with tail -f, you can also cancel a REPEAT by issuing a CTRL+C. SQLcl should pick up and process the cancel during the next iteration of the REPEAT “loop”.
Therefore, this may take a while to process, depending on what you have set your repeat interval to.

Time for an old favourite…

Tailing the alert log in SQL – look no external tables !

Now, there was a time when tailing the alert log was the main method of monitoring a running instance. Later on, there was a time when you could do this from within the database, but only with the help of an External Table on the alert log.

Since 11g however, it has been possible to do this by using the X$DBGALERTEXT fixed table.
As with fixed tables generally however, getting access to it can be a bit fiddly.

We could still do this, but we’d need to be connected as SYS. Rather than going through all of that, I’m just going to follow Benedikt Nahlovsky’s fine example and expose this fixed table in the usual way….

-- as SYS as SYSDBA...
create or replace view alert_log_vw as select * from x$dbgalertext
/

create public synonym alert_log for alert_log_vw
/

grant select on alert_log to dba
/

Now, as a user with the DBA role, I can run this in SQLcl to see the last 10 entries in the alert log…

with log_lines as
(
	select originating_timestamp,
		to_char(originating_timestamp, 'Day Mon DD HH24:MI:SS YYYY') as message_ts,
		message_text
	from alert_log
	order by originating_timestamp desc
)
select message_ts, message_text as file_line
from log_lines
where rownum <= 10
order by originating_timestamp
/

…followed by this to “tail -f” the alert log….

repeat 86400 1

Kris Rice, one of the team behind SQLcl (among other things) has another demonstration of what is possible with this command.

Let’s face it, you can’t do that with SQL*Plus…can you ?

REPEAT…just like Mother used to make

You may be surprised to learn that, there are circumstances in which you can implement a pretty good imitation REPEAT/tail -f functionality in good old SQL*Plus.
If you’re running on a *nix (or similar) environment then this code from Katsumi is a bit of an eye-opener

ho mkfifo /tmp/myfifo.sql
select to_char(systimestamp, 'HH24:MI:SS') as the_time from dual;
ho eval 'for((i=0;i<10;i++));do sleep 1;echo "/";done >/tmp/myfifo.sql &'
@/tmp/myfifo

Save this as plus_tailf.sql and then run it, and the output looks like this…

plus_tailf

As you’d expect, provided your host is running an appropriate OS, this will work in SQLcl as well as in SQL*Plus.
Better still, there is a way of invoking this functionality without having to type that eval statement every time.

First of all, we’re going to create a script which accepts the name of a file holding a SQL script to run, the number of times to execute the script, and the interval between each execution ….

--
-- Usage : tailf <scriptname> <iterations> <interval in seconds>
--
def fname=&1
def iter=&2
def interval=&3
def command = 'for((i=0;i<&iter;i++));do sleep &interval;echo "/";done >/tmp/myfifo.sql &'
!mkfifo /tmp/myfifo.sql
@&fname
!eval "&command"
@/tmp/myfifo
!rm /tmp/myfifo.sql

We’re going to save this in the SQLPATH directory as tailf.sql.
My SQLPATH is defined in my .bashrc as :

export SQLPATH=$HOME/sqlplus_scripts

We also need to save our query to get the current time into a file called time.sql. This can be in any directory.

select to_char(sysdate, 'HH24:MI:SS') as time_now
from dual
/

We can now invoke it from SQL*Plus ( or SQLcl) simply by running…

@tailf time.sql 10 1
Why SQL*Plus still matters

You may think that this is all rather a lot of effort to implement this functionality in SQL*Plus, when it’s readily available in SQLcl anyway. After all, why not just switch to using SQLcl, which provides a rather more concise (not to mention platform independent) solution ?

It’s worth remembering that Oracle client software doesn’t just run on Developers’ workstations, it runs on Application and Database Servers as well.
The fact of the matter, particularly in large organisations, is that there is considerable inertia to overcome in getting new software onto production servers.
Just think for a moment, it’s now over three years since Oracle 12c was officially released. However, many (most ?) Oracle databases are still running on older versions of the database.
As well as that, SQLcl has a dependency on Java so either getting the correct version, or just getting Java installed in the first place, is an additional challenge when it comes to navigating SQLcl through whatever Change Management procedures may be in place.
So, whilst SQLcl will undoubtedly become ubiquitous over time, SQL*Plus won’t be disappearing just yet.



(Almost) Everything you ever wanted to know about SQLDeveloper Unit Testing but were afraid to ask

Sat, 2016-08-20 15:46

The Political fallout from Brexit continues unabated.
In the immediate aftermath of the vote, David Cameron resigned as Prime Minister, triggering a power-struggle to succeed him.
Secret pacts, double-crosses and political “assassination” followed in short order.
It was like watching an episode of Game of Thrones although, mercifully, without the full-frontal nudity.
As for the Labour Party, they have captured the Zeitgeist…by descending into the sort of internal conflict for which the word “internecine” was invented.

Following the current trend that promises are made for breaking, this post has arrived a little bit later than advertised.
I write most of this stuff on the train to and from work, and, as they have been unusually punctual of late, my Sprint 1 velocity is somewhat lower than anticipated.
So, with apologies for my tardiness…

A Quick Overview

When it comes to Unit Testing Tools/Frameworks for PL/SQL, SQLDeveloper Unit Testing can probably be considered the “official” Oracle offering.
After all, SQLDeveloper is shipped with the Oracle Database.

Therefore, this seems like a fitting place to continue my PL/SQL Unit Testing odyssey.

Useful Links and Resources

There is a very good tutorial covering setting up Tests on a Stored Procedure right there in the SQLDeveloper Help, as well as other useful info about SQLDeveloper Unit Testing :

sut_help

This post from Jeff Smith is a useful introduction to SQLDeveloper Unit Testing, in which he takes a different approach to some of the tasks that I’ve covered here.

For a completely different take on things, this article by Wouter Groeneveld is so good that I will be plagiarising bits of it later on.

Before I go any further, SQLDeveloper Unit Testing is a bit of a mouthful. Additionally I’m never sure if I should be calling it a Tool or a Framework. Therefore, I’m going to refer to it as SUT for the remainder of this post.

Testing Approach

The approach I’ve taken to evaluating SUT (and the other frameworks that I will look at in later posts) is to use the application I introduced last time and to see how I can test some of the User Stories I have for it.
These Stories cover functionality that you would commonly find in a PL/SQL application ( DML operations, ETL data load and Ref Cursor population).
Additionally, I’ve taken the view that the Application Code should not be changed in any way to accommodate any vagaries of SUT.

As it’s the Testing Tool that’s my primary concern here, the Application Code I’ve included here is pretty much the minimum required to get the tests to pass.
It’s not what you’d call Production ready.

Also, as this is my first foray into SUT so some of the Tests may not have been written in the optimum fashion.
The examples below are more a case of me saying “this is how I managed to test this feature in SUT” rather than “this is the recommended way to test this feature”.

Test Environment

These tests have been put together and executed in the current Production version of SQLDeveloper ( 4.1.3).
The database used is Oracle 11g Express Edition.

Toolbars and Buttons

The SUT toolbar looks like this :

sut_toolbar

From left-to-right, the buttons are :

  • Freeze View
  • Refresh
  • Debug – provides feedback on the runnable bits of your test without actually running it
  • Run

Additionally, I’ll be mentioning the “Plus” button quite a bit. That would be this :

big_green_plus

Right, I think we’re just about ready to go.
Before we start writing any tests, we need to do a bit of groundwork…

SQLDeveloper Unit Testing – Configuration

First things first, according to the SQLDeveloper Help :
“The Unit Test Repository is a set of tables, views, indexes and other schema objects…”

Look, don’t panic. Despite this apparently database-centric architecture, you do not have to deploy your test code along with your application code base to execute your tests. Remember, as with all the mainstream Oracle IDEs, SQLDeveloper allows you to be connected to different databases simultaneously.
Added to this, it would appear that a significant element of the SUT architecture involves XML configuration files. This would also explain the lack of any requirement to mess about with database links to get all this stuff to work.

Repository Pre-Setup Tasks

The first design decision I made was that I wanted to keep any testing infrastructure entirely separate from my Application Code.
Therefore, I’ve created a separate database schema – TEST_REPO – specifically to hold the Repository.

Whilst it’s perfectly possible to have SQLDeveloper guide you through all of the setup work, this does require that you connect as a highly privileged user (SQLDeveloper tends to ask for the SYS password).
That’s not a problem if, like me, you’re just messing about on your own copy of Oracle Express Edition. If you’re in a more formal environment however, you may well need to provide your DBA with a script.

This should do the job (runnable in SQL*Plus) …

--
-- Script to create the roles required for a SQLDeveloper Unit Testing Repository
-- and a schema to host the repository
--
set verify off
define repo_owner = test_repo
define repo_default_ts = users
define repo_temp_ts = temp
accept passwd prompt 'Enter a password for the new schema [] : ' hide

-- Setup Roles
create role ut_repo_administrator;
grant create public synonym,drop public synonym to ut_repo_administrator;
grant select on dba_role_privs to ut_repo_administrator;
grant select on dba_roles to ut_repo_administrator;
grant select on dba_tab_privs to ut_repo_administrator;
grant execute on dbms_lock to ut_repo_administrator;

create role ut_repo_user;
grant select on dba_role_privs to ut_repo_user;
grant select on dba_roles to ut_repo_user;
grant select on dba_tab_privs to ut_repo_user;
grant execute on dbms_lock to ut_repo_user;

-- Create Schema to host Repository
create user &repo_owner identified by &passwd;
alter user &repo_owner default tablespace &repo_default_ts;
alter user &repo_owner temporary tablespace &repo_temp_ts;
alter user &repo_owner quota unlimited on &repo_default_ts;

-- System Privs
grant create session, connect, resource, create view to &repo_owner;

-- Role Privs
grant ut_repo_administrator to &repo_owner with admin option;
grant ut_repo_user to &repo_owner with admin option;

-- Object Priv
grant select on dba_roles to &repo_owner;
grant select on dba_role_privs to &repo_owner;

Note that you may want to change the Repository Owner Schema and tablespace variables to values more suited to your environment.

Now we’ve created the new schema, we need to create a connection for it in SQLDeveloper.

In the Connections Navigator, click the Plus button …

create_conn1

Input the appropriate details in the New/Select Database Connection Window…

create_conn2

If you hit the Test button, you should see the Status set to “Success” in the bottom left-hand corner of the window.

Once it’s all working, hit the Save button, to retain the connection, then hit Connect to logon as TEST_REPO.

Now that’s all done we’re ready for the next step…

Creating the Repository

From the Main Menu, select View/Unit Test

A new, but currently quite boring navigator window will open up under the Connections Navigator :

no_repo

From the Main Menu, select Tools/Unit Test/Select Current Repository…

In the Repository Connection window, select the test_repo connection we’ve just created…

repo_conn

…and hit OK. This should give you :

create_repo

Just say Yes.

After a brief interlude, during which SQLDeveloper does an impression of a Cylon from the original Battlestar Galactica…

create_repo_progress

By your command !

you will get…

repo_created

If you now have a look in the Object Browser for the test_repo user, you’ll see that SQLDeveloper has been quite busy…

repo_tree

Next up…

Granting Access to the Repository

From the Tools menu select Unit Tests then Manage Users…

When prompted for the connection to manage users, I’ve chosen TEST_REPO as it’s the only one that currently has admin rights on the repository.

The Manage Users Window that appears has two tabs, one for Users and one for Administrators :

manage_users1

I’ve added the owner of the application code that I’ll be testing to keep things simple.

The result is that, if I now connect as FOOTIE ( the application owner), I should be able to start adding some tests.

Naming Conventions

The SQLDeveloper Help has something to say on naming tests, which you may want to consider.
I’ve simply gone down the route of using the format package_name.procedure_name_n.
I want to group the unit tests at the same level as the programs for which they are written, so the Test Suites I’ll be creating are grouped by (and named for) the packages that the tests run against.

One aspect of object naming that I haven’t formalised in this post is that of items that I’ve added to the Library. This is an area to which you may well want to give some consideration.

Testing Insert Statements

The first set of tests I need to write centre around inserting data into tables. To cover some of the scenarios we might commonly encounter when performing this sort of operation, I’ll be looking at adding records to Tables related to each other by means of Referential Integrity constraints.

My First Test – Inserting a Record into a table with a Primary Key

I’ve got a User Story about adding Competitions to the application. The Story has two Acceptance Criteria :

  1. A new competition can be added
  2. A Competition cannot be added more than once

Here’s a quick reminder of what the COMPETITIONS table looks like :

create table competitions
(
    comp_code varchar2(5) constraint comp_pk primary key,
    comp_name varchar2(50) not null,
    description varchar2(4000)
)
/

As we’re trying to follow the approach of Test Driven Development and write our tests first, we just have a stub PL/SQL procedure to run the initial tests against :

create or replace package body manage_competitions
as

    procedure add_competition
    (
        i_code competitions.comp_code%type,
        i_name competitions.comp_name%type,
        i_desc competitions.description%type default null
    )
    is
    begin
        null;
    end add_competition;
end manage_competitions;
/
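Incidentally, the corresponding package specification isn’t shown here. Based on the procedure signature, it would be something along the lines of :

create or replace package manage_competitions
as
    procedure add_competition
    (
        i_code competitions.comp_code%type,
        i_name competitions.comp_name%type,
        i_desc competitions.description%type default null
    );
end manage_competitions;
/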

Time to meet…

The Create Unit Test Wizard

As this is my first test, I’m going to go through it step-by-step here. After that, the novelty will quickly wear off and I’ll just reference the steps or the Test Process Stages as appropriate.

In the Unit Test Navigator, right-click the Tests node and select Create Test…

create_test1

This will bring up the Create Unit Test Wizard

create_test_wiz1a

Select the Connection for the Application Owner (FOOTIE, in this case) from the Connections drop-down…

create_test_wiz1b

…and navigate the tree to the procedure we want to test – i.e. manage_competitions.add_competition…

create_test_wiz1c

And now click Next

This brings us to step 2 of the Wizard, specifying the Test Name. In line with my standard, I’m calling this one MANAGE_COMPETITIONS.ADD_COMPETITION_1.
Leave the Radio Group selection as Create with single Dummy implementation

create_test_wiz2

Click Next

I’m not going to create a Setup Process in this test for the moment. If you did want to, then you’d click the Plus button and…well, we’ll return to that later.
Anyway, step 3 looks like this :

create_test_wiz3

Click Next

Now we need to specify the input parameter values that we’ll be using in the test.
The values I’m going to use are :

  • I_CODE – UT1
  • I_NAME – Test1
  • I_DESC – Test competition

Leave the Expected Result as Success

create_test_wiz4

Click Next

…which takes us to…

create_test_wiz5a

Click the Plus to add a Process Validation and a drop-down will appear.
Scroll through this and select Query returning row(s) :

create_test_wiz5b

…and you’ll be presented with…

create_test_wiz5c

We now replace the default SQL statement with one that checks that we’ve inserted our record successfully.
Rather than hard-coding the COMP_CODE value we’ve input into the procedure, we can use SUT’s substitution syntax, thus inheriting the value of the I_CODE input parameter we specified back in Step 4. The code we’re going to add is :

select null
from competitions
where comp_code = '{I_CODE}'

Note that the parameter names appear to be case sensitive. If it’s entered in lowercase, SUT will complain at runtime.
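In other words (purely illustrative) :

-- this is fine...
where comp_code = '{I_CODE}'
-- ...but this will cause an error when the test runs
where comp_code = '{i_code}'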
The end result looks like this :

create_test_wiz5d

Click OK to dismiss the Process Validation Window.

Back on the Wizard page, click Next

Finally, we need to specify the Teardown Process – i.e. code to return the application to the state it was prior to the test being run.

create_test_wiz6a

Hit the Plus button again and you will get a slightly different drop_down. This time, we want User PL/SQL Code

create_test_wiz6b

You’ll now get the Teardown Process Window. As we’ve only inserted a single row as part of this test, we can simply rollback the transaction to put things back as they were. This being a PL/SQL block, the code is :

begin
    rollback;
end;

create_test_wiz6c

Click OK to dismiss the Window.
Now click Next to get to the Summary Page, which should look something like…

create_test_wiz7

Once you’re happy with the Summary, click Finish.

You should now see the new test in the Unit Test Navigator :

first_test

Click on the new test and you will see :

test_win_details

We can now run the test by clicking the Run Button. This will cause the Test Pane to switch to the details tab where the results of our first execution will be displayed :

test_win_results

As you’d expect, because the program we’re calling doesn’t do anything, the test validation fails.

Before we go any further, there are a couple of things we probably want to do with this test.
First of all, we really should have code that ensures that we won’t be trying to insert a value that already exists in this table.
Secondly, it’s quite likely that we’ll want to reuse some of the test code we’ve written in future tests.

OK, first of all then …

Dynamic Value Queries

In my previous attempt to write Unit Tests in anonymous PL/SQL blocks, it was relatively simple to set an appropriate value for the Primary Key of the record we wanted to insert programmatically.
Obviously, this is not the case with the Unit Test we’ve just written.

Fortunately, SUT allows you to populate the call parameters with values from what it calls a Dynamic Value Query.

In the SQLDeveloper help, the method demonstrated for setting up a Dynamic Value Query, involves creating a table to to hold the values and then querying that at runtime.
In this instance, we don’t need a table to determine the parameter values we need to use.

In the Details tab of the test, we need to click the Dynamic Value Query pencil icon :

dyn_qry_edit

This will bring up the Dynamic Value Query window.

This code should do the job for now…

--
-- Make (reasonably) certain that the comp_code value we're trying to create
-- does not already exist in the table.
with suffix as
(
    select max( to_number( substr( comp_code, regexp_instr( comp_code, '[[:digit:]]')))) + 1 as numeral
    from competitions
    where comp_code like 'UT%'
    and regexp_instr( substr( comp_code, -1, 1), '[[:digit:]]') = 1 -- only want codes with a numeric suffix
    union -- required if there are no records in the table...
    select 1 from dual
)
select 'UT'||max(numeral) as I_CODE,
    'Test1' as I_NAME,
    'A test' as I_DESC
from suffix
where numeral is not null

…so I’ve dropped it into the Dynamic Value Query…

dyn_qry_final

Click OK and we’re ready to re-run the test to check that our changes have worked.

Press the Run button and SQLDeveloper will ask if you want to save the changes you’ve just made.

unsaved_changes

We do.

The test will still fail ( we’re still running against a stub, remember), however, we can see from the test results that the Dynamic Values Query has generated the input values we were expecting :

dvq_run

NOTE – whilst the Dynamic Values Query we have written here will (eventually) work as expected, there is rather more to them than meets the eye. This is mainly because they are executed before any other step in a test. This can be problematic when a Dynamic Values Query needs to rely on values that are set in a Test’s Startup Process. There is a solution to this however, which I’ll cover later on.

Adding Code to the Library

We’ve written three distinct bits of code for our first test which we’ll probably want to use again.

SUT provides the Library as a means of storing such code for re-use in other tests.

Looking at our test in SQLDeveloper, the first code we see is the Teardown Process.

To add this to the library, click the Pencil icon to bring up the Teardown Process code window.

In the field at the bottom, we can now give this code a name and then hit the Publish button to send it to the library :

add_to_lib

You should now see that the label we’ve given this code (rollback_transaction, in this case) appears in the Library drop-down at the top of the Window :

add_to_lib2

Now click the OK button to dismiss the window.

We can repeat this process for our Dynamic Values Query ( labelled add_new_competition) and our Validation ( competition_row_added).

If we now expand the Library node in the Unit Test pane, we can see that our code appears in the Library :

lib_tree.png

This should mean that creating our second test requires a little less typing…

Testing for an Exception

Our second test is to check that the Application does not allow us to add a duplicate COMPETITIONS record.

Opening the Wizard again, we select the same package and procedure as for our first test.

Following my naming convention, this test is called MANAGE_COMPETITIONS.ADD_COMPETITION_2.
This time, we do need a Startup Process to make sure that the record we attempt to insert already exists.
So this time, hit the Plus button on the Startup Process step and select User PL/Sql Code from the drop-down.

In the Startup Process Code Window, we’re going to add the following PL/SQL code :

begin
    merge into competitions
    using dual
    on (comp_code = 'UT1')
    when not matched then
        insert ( comp_code, comp_name, description)
        values( 'UT1', 'Test', null);
end;

Essentially, we want to begin by making sure that a competition with a COMP_CODE value of ‘UT1’ exists in the table.

We’ll probably need to do this again somewhere down the line, so I’ve added it to the library as setup_competition_UT1.

For this test, in the Specify Parameters step, we can type in the parameters manually. Our Setup Process ensures that a record exists and our test execution is to try to insert this record again. The parameter values we enter are :

  • I_CODE = UT1
  • I_NAME = Test
  • I_DESC – leave as null

This time, the expected result should be changed from Success to Exception.
We’re expecting this call to fail with “ORA-00001: unique constraint violated”.
Therefore, we need to set the Expected Error number to 1.

test2_params.png

Our test will fail unless it encounters the exception we’ve specified when we call the procedure.
Given this, we don’t need to add a separate validation step.

We will be needing a Teardown Process however.
As before, we’re going to choose User PL/Sql Code from the drop-down.
This time however, we won’t have to type anything in to the window.
Instead, click on the library drop-down at the top of the code window :

import_td1

… select rollback_transaction from the drop-down, and then check Subscribe

import_td2

By subscribing to this code snippet, we “inherit” it from the library. Therefore we cannot edit it directly in the test.
If we did want to “tweak” it for the purposes of this specific test, we could hit the Copy button instead of subscribing.
This would copy the code block into the current test where it would then be editable for that test.

The test Summary looks like this :

test2_summary

As expected, when we execute this, it fails because the expected exception was not raised :

test2_first_run

Now we have both of our tests written, we need to write some application code to get them to pass.

create or replace package body manage_competitions
as
	procedure add_competition
	(
		i_code competitions.comp_code%type,
		i_name competitions.comp_name%type,
		i_desc competitions.description%type default null
	)
	is
	begin
        insert into competitions( comp_code, comp_name, description)
        values( i_code, i_name, i_desc);
	end add_competition;
end manage_competitions;
/

Now when we re-execute our tests, we can see that they pass :

test1_success.png

…and…

test2_success.png

Note that the latest test runs appear at the top of the Results listing.

Removing Old Test Results

If you want to remove an older result you can do this by right-clicking on it in the Results Tab and selecting Delete Result….

You can also clear down all results for a test by right-clicking in the Unit Test pane and selecting Purge Test Results…

purge_test1

This will bring up the Purge Test Results dialogue which gives you the option to remove all results for the test, or just those from before a given time and date.

purge_test2

NOTE – if you have a Test or a Suite with its Results tab showing when you purge, then they may not disappear immediately.
If this is the case, just click the Refresh button on the Test toolbar.

Creating a Test Suite

The development process around an application such as ours, will revolve around the PL/SQL package as the atomic unit of code.
Even though packages are containers for specific procedures, functions etc, it’s at the package level that the code is deployed to the database and therefore, it’s at that level it’s stored in Source Control.
So, when a developer needs to make a change to a package, we want them to be able to checkout and run all of the tests for the package.
SUT allows us to group our tests by means of a Test Suite.

So, we’re going to create a Suite for the MANAGE_COMPETITIONS package so that we can group the tests we’ve just created, and add more tests to it later on.

In the Unit Test tree, right-click on Suites and select Add Suite…

add_suite1

In the Add Suite dialogue that appears, enter a name for the Suite.

add_suite2

The new suite now appears in the Unit Test tree.
Now we need to add our tests to it.

Expand the tree under the new test suite, right-click on Tests and select Add Test…

add_suite_test1

Now select the tests you want to add from the list that appears in the Add to Suite Window :

add_suite_test2

…and click OK.

Notice that, although the tests have been added to the Suite, they still appear under the Test Node in the tree.
This node is going to get fairly crowded as more tests are added. This is one reason that a sensible naming convention is quite useful.

If we now run our new suite, we can see that all tests in the suite will run :

suite_results

There is more to Test Suites than simply grouping tests together, but more on that later.

Thus far, we’ve covered similar ground to the Tutorial exercise in the SQLDeveloper Help, although it’s fair to say we’ve taken a slightly different route.

We’re now going back to a time when Scotland ruled the world in order to test…

Adding data to a “child” table

The Acceptance Criteria for the Add a Tournament Story are :

  1. A tournament can be added
  2. A tournament cannot be added for a non-existent competition
  3. The competition that this tournament is for must be specified
  4. The same tournament cannot be added for a competition more than once
  5. If specified, the year the tournament begins cannot be greater than the year that it ends

The stub of the code we’ll be testing is…

create or replace package manage_tournaments
as
	procedure add_tournament
	(
		i_code tournaments.comp_code%type,
		i_year_end tournaments.year_completed%type,
		i_teams tournaments.number_of_teams%type,
		i_host tournaments.host_nation%type default null,
		i_year_start tournaments.year_started%type default null
	);
end manage_tournaments;
/

create or replace package  body manage_tournaments
as
	procedure add_tournament
	(
		i_code tournaments.comp_code%type,
		i_year_end tournaments.year_completed%type,
		i_teams tournaments.number_of_teams%type,
		i_host tournaments.host_nation%type default null,
		i_year_start tournaments.year_started%type default null
	)
	is
	begin
            null;
	end add_tournament;
end manage_tournaments;
/

…and the table DDL is…

create table tournaments
(
    id number constraint tourn_pk primary key,
    comp_code varchar2(5),
    year_completed number(4) not null,
    host_nation varchar2(100),
    year_started number(4),
    number_of_teams number(3) not null,

    constraint tourn_uk unique( comp_code, year_completed, host_nation)
)
/

alter table tournaments
    add constraint tourn_comp_fk foreign key
        (comp_code) references competitions(comp_code)
/

Hmmm, does something there look odd to you ? We may come back to it in a while.
First though, let’s write the tests for…

Variable Substitutions using binds

Our first test is wittily and originally entitled MANAGE_TOURNAMENTS.ADD_TOURNAMENT_1.
It’s simply testing that we can legitimately add a TOURNAMENT record to our application.

For the Test Startup, we need to make sure that we have a COMPETITION record to assign the tournament to so I’ve subscribed to the setup_competition_ut1 that we added to the Library earlier.

As for the call to the package, the parameters I’m using are from the very first International Football Tournament – the British Home Championships ( won by Scotland ) :

  • I_CODE = UT1
  • I_YEAR_END = 1884
  • I_TEAMS = 4
  • I_HOST = null
  • I_YEAR_START = null

The Validation Process is a Boolean function. Now, I had a few issues with the I_HOST replacement as a string (possibly because I passed in a parameter value of null in the test).
Fortunately, you can reference parameter values as bind variables…

declare
    l_count pls_integer;
    l_host tournaments.host_nation%type := :I_HOST;
begin
    select count(*)
    into l_count
    from tournaments
    where comp_code = '{I_CODE}'
    and year_completed = {I_YEAR_END}
    and nvl(host_nation, 'MIKE') = nvl(l_host, 'MIKE');

    return l_count = 1;
end;

I know I’ll need to use this again to verify the test for adding a duplicate tournament so I’ve saved it to the Library as single_tournament_record_exists.

The teardown process is also taken from the library (a simple rollback once again).

We’ve got a database and we’re not afraid to use it !

The test to ensure that you can’t add a TOURNAMENT for a non-existent COMPETITION is more about the Data Model than the code. What we’re actually testing is that the Foreign Key from TOURNAMENTS to COMPETITIONS is in place and working as expected.

It follows a similar pattern to the very first test we created for adding a competition.
Indeed, the Dynamic Values Query looks rather familiar :

with suffix as
(
    select max( to_number( substr( comp_code, regexp_instr( comp_code, '[[:digit:]]')))) + 1 as numeral
    from competitions
    where comp_code like 'UT%'
    and regexp_instr( substr( comp_code, -1, 1), '[[:digit:]]') = 1 -- only want codes with a numeric suffix
    union -- required if there are no records in the table...
    select 1 from dual
)
select 'UT'||max(numeral) as I_CODE,
    1884 as I_YEAR_END,
    4 as I_TEAMS,
    null as I_HOST,
    null as I_YEAR_START
from suffix
where numeral is not null

Whilst the hard-coded names and parameter values reflect the fact that we’re calling a different procedure, the code to derive the I_CODE parameter value is identical.

Whilst we could add this query to the Library then copy it where we needed to and make necessary changes in each test, there is an alternative method of reuse we might consider. We could create a database function in the Repository to return the desired value for I_CODE.

NOTE – there are several considerations when determining whether or not this is a route that you wish to go down in terms of your own projects. However, in this instance, my Test Repository is being used for a single application, and by doing this, I’m ensuring that this code only needs to be written once.

We’re going to create this function in the TEST_REPO schema.
Before we do that though, we need to grant access to the COMPETITIONS table to TEST_REPO. So, connected as FOOTIE :

grant select on competitions to test_repo
/

Then, as TEST_REPO :

create function get_new_comp_code
    return varchar2
is
    l_comp_code varchar2(5);
begin
    with suffix as
    (
        select max( to_number( substr( comp_code, regexp_instr( comp_code, '[[:digit:]]')))) + 1 as numeral
        from footie.competitions
        where comp_code like 'UT%'
        and regexp_instr( substr( comp_code, -1, 1), '[[:digit:]]') = 1 -- only want codes with a numeric suffix
        union -- required if there are no records in the table...
        select 1 from dual
    )
    select 'UT'||max(numeral) into l_comp_code
    from suffix
    where numeral is not null;

    return l_comp_code;
end get_new_comp_code;
/

Yes, in the Real World you may very well do this as part of a package rather than as a stand-alone function.

Next, we need to grant privileges on the function to the Repository Role :

grant execute on get_new_comp_code to ut_repo_user
/

The Dynamic Values Query can now be written to use the new function :

select test_repo.get_new_comp_code as I_CODE,
    1884 as I_YEAR_END,
    4 as I_TEAMS,
    null as I_HOST,
    null as I_YEAR_START
from dual

We expect this call to fail with ORA-02291 – integrity constraint violated – so the Expected Result for the test is an Exception, with the error number set to 2291.
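To illustrate what the framework should encounter, this is the sort of error that the underlying insert will hit ( ‘XYZ’ here being a made-up code that doesn’t exist in COMPETITIONS) :

insert into tournaments( id, comp_code, year_completed, number_of_teams)
values( tourn_id_seq.nextval, 'XYZ', 1884, 4);

-- ORA-02291: integrity constraint (FOOTIE.TOURN_COMP_FK) violated - parent key not found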

Multiple Processes for a single Testing Stage

To test that the application will not allow the addition of a duplicate TOURNAMENT we need to make sure that records exist in two tables rather than one – i.e. a “parent” COMPETITIONS record and a “child” TOURNAMENTS record, which we’ll be attempting to duplicate.

Now we could do this in a single Startup Process. However, if we do it as two separate steps then we can save both to the Library and get a greater degree of re-use out of them. So, resisting the distant memory of Forms 2.3 Step Triggers and the feeling that you’re coding like it’s 1989…

On the Specify Startup Step of the Wizard, click the Plus button, add the first Startup Process ( i.e. ensure that the parent COMPETITIONS record exists).
Once you’re done with that, hit the Plus button again :

another_startup1

Select User PL/Sql code from the drop-down and then enter the following :

begin
    merge into tournaments
    using dual
    on
    (
        comp_code = 'UT1'
        and year_completed = 1916
        and host_nation = 'ARGENTINA'
    )
    when not matched then
        insert
        (
            id, comp_code, year_completed,
            host_nation, year_started, number_of_teams
        )
        values
        (
            tourn_id_seq.nextval, 'UT1', 1916,
            'ARGENTINA', null, 4
        );
end;

Incidentally we’ve moved on a bit in terms of test data and are now using details from the first Continental Tournament – the Copa America.
I’ve added this to the library as setup_tournament_for_UT1.

After all that, you should end up with something like …

multple_startup

The remaining tests don’t present anything new so we come to the point where we need to get the code to pass.
At this point you might be confident that this will do the job…

create or replace package body manage_tournaments
as
	procedure add_tournament
	(
		i_code tournaments.comp_code%type,
		i_year_end tournaments.year_completed%type,
		i_teams tournaments.number_of_teams%type,
		i_host tournaments.host_nation%type default null,
		i_year_start tournaments.year_started%type default null
	)
	is
	begin
        if i_year_start is not null then
            if nvl(i_year_end, i_year_start) < i_year_start then
                raise_application_error( -20000, q'[A tournament cannot end before it has begun...unless you're England !]');
            end if;
        end if;
		insert into tournaments
		(
			id, comp_code, year_completed,
			host_nation, year_started, number_of_teams
		)
		values
		(
			tourn_id_seq.nextval, i_code, i_year_end,
			i_host, i_year_start, i_teams
		);
	end add_tournament;
end manage_tournaments;
/

Right, let’s test it shall we ?

When Data Models go bad…

I’ve added all five of the tests I’ve written to the MANAGE_TOURNAMENTS Test Suite so, I’m going to use it to execute my tests…

In the Unit Test Pane, expand the Suites node and click on the Suite we want to run. This will bring up the details of the Suite :

suite_run1

Now press the Run button and…not everything went as expected.

By collapsing the result tree, we can see which Test (or Tests) failed :

suite_results1

I think you noticed the problem a bit earlier. Someone has been a bit remiss when doing the data modelling. The COMP_CODE column is missing a NOT NULL constraint.

The Data Modeller claims that this was a deliberate oversight so we can see the value of testing the Data Model and not simply the code. Yeah, right.

Once we address this :

alter table tournaments modify comp_code not null
/

… and re-run the offending test on its own to check that it passes, we can then re-run the entire suite as a regression test.
Looks like the change hasn’t broken anything else ( the failure at the bottom is just the previous run result):

suite_results2

Summary of the Insert Tests

So far, we’ve found out how SUT is put together as a Testing Framework.

The tests follow the expected four stage pattern – Setup, Execute, Validate, Teardown, but of these only the Execute stage is mandatory.
Multiple Processes are permissible in any stage apart from Execute.
The facility to group tests into Test Suites is present, as you would expect.

We’ve also started to explore some of the specifics of SUT.

Pleasingly, it handles expected Exceptions with the minimum of input down to the level of the expected error code.

The Dynamic Values Query provides a means of generating conditional test input criteria.
The Variable Substitution syntax makes validation of tests executed with such generated values somewhat more straightforward.

If and when limitations of SUT are encountered then you always have the option of extending its capabilities using custom PL/SQL Stored Program Units.
I can see that you’re still a bit dubious about that last point. OK, I’m sure we’ll come back to that later (mainly because I’ve already written that bit !)

In the meantime, let’s see what we can learn from the tests for the second DML activity we’re looking at…

Testing Deletes

As with any DML action, how well an application handles Deletion of records is crucial in terms of how well it ensures data integrity.
With that statement of the obvious out of the way, let’s take a look at how SUT can help us with testing this aspect of our Application’s functionality.

Some of the tests we need to write require records to be present in the TOURNAMENT_TEAMS table, which looks like this :

create table tournament_teams
(
    tourn_id number,
    team_name varchar2(100),
    group_stage varchar2(2),

    constraint team_pk primary key (tourn_id, team_name)
)
/

alter table tournament_teams
    add constraint team_tourn_fk foreign key
        (tourn_id) references tournaments(id)
/
The thing about Dynamic Value Queries

The Acceptance Criteria we want to test for in the User Story for deleting a tournament are :

  1. A Tournament can be deleted
  2. A Tournament cannot be deleted if there are Teams assigned to the Tournament

I’ve created a stub procedure in the MANAGE_TOURNAMENTS package.
The appropriate signature has been added to the package header and the procedure in the body currently looks like this :

...
	procedure remove_tournament( i_id tournaments.id%type)
	is
	begin
        null;
	end remove_tournament;
...

Yes, using the Natural Key for a TOURNAMENT record as input into this procedure would have made life easier in terms of writing the test. However, I feel quite strongly that we should not be making compromises in the Application Code to accommodate any Testing Tools we may need to use.

To ensure that our test is completely standalone, we may need to create both a COMPETITIONS and a TOURNAMENTS record in the Test Startup.
We then need to find out what the ID value is of the TOURNAMENT record we want to delete.
No problem there. We can re-use the routines we already have in the Library to Create the UT1 Competition and associated tournament.
Then we just need to use a Dynamic Values Query to retrieve the ID – something like :

select id as I_ID
from tournaments
where comp_code = 'UT1'
and year_completed = 1916
and host_nation = 'ARGENTINA'

Nothing to see here, right ? Well, if you had taken a look at the SQLDeveloper Help topic on Dynamic Value Queries you would have noticed that…

“A dynamic value query is executed before the execution of all implementations in a test, including any startup action for the test. If you must populate a table before a dynamic value query is evaluated, you can do this in the startup action for a suite that includes the test.”

Fortunately, if you’re not entirely sure about how your test is going to work, you can click the Debug button and get details of the execution steps of the test or, in this case…

debug_dqv

So, it would appear that the solution to this particular problem is to wrap this test in its own Test Suite.
This way, we can run the Startup Processes at the Suite level instead of the Test level to ensure that they execute before the Dynamic Values Query.

So, I’m going to create a suite called manage_tournaments.remove_tournament_ws – for Wrapper Suite.
As well as the Startup Processes, I’ll also move the Teardown Process from the test to this Suite.
Then, I’ll allocate the test to the Suite.

Creating the Wrapper Suite

This is the same process for creating any other Test Suite – i.e. Go to the Unit Test Navigator, right-click on the Suites node and select Add Suite… from the menu.

dvq_ws1

If we now bring up our new Suite, we can add the Startup and Teardown processes in the same way as we do for a test (i.e. hit the Plus button and away you go).
Once that’s done, we need to assign a test to the Suite.
Once again Hit the Plus button, this time in the Test or Suite Section to bring up the Add to Suite dialogue :

add_to_suite1

Select the test we want to add, in our case, MANAGE_TOURNAMENTS.REMOVE_TOURNAMENT_1 and make sure that the Run Test Startups and Run Test Teardowns are unchecked :

add_to_suite2

Click OK and…

dvq_ws2

Even though we’ve specified that the Startup and Teardown Processes should not be executed when the test is run within the Suite, it’s probably a good idea to go back and remove them, if only to save much confusion later on.

Anyhow, when we now execute the suite we can see that the results are what we’re expecting and that, reassuringly, the generated synthetic key value ( I_ID) is being passed in :

successful_fail

Running Startup Processes in the Suite and the Test

To make sure that the Foreign Key from TOURNAMENT_TEAMS to TOURNAMENTS is working, we need to insert a TOURNAMENT_TEAMS record for the appropriate TOURNAMENT.ID as part of the Setup Process.

As with the previous tests, we’re going to need to include this in the Wrapper Suite we’ve just created so that the Dynamic Values Query to get the ID value works.

Hang on, let’s consider that decision for a moment.

It is true that the second test will require the same Startup Processes that we have in our existing Wrapper Suite for the first test. It will also need these Startup Processes to be executed in a Wrapper Suite as it needs to have access to the TOURNAMENTS.ID value in a Dynamic Values Query.

To a programmer, it’s logical therefore that the second test should be allocated to the same Test Suite as the first, since the code has already been written there and there’s absolutely no need to go duplicating effort (even if it is mainly just importing stuff from the Library).

Of course, we will need to “move things about a bit” to make sure that both tests can run properly within the same suite. For example, we need to perform the “successful” delete test last as the test for an Exception is relying on the record to be deleted…er…not being deleted when it runs.

To a tester, things may appear a little different. One of the principles of Unit Testing is to make sure that, as far as possible, tests can be run independently of each other.

It is for this reason that you should give serious consideration to creating a separate Wrapper Suite for our second test.

The alternative, as I’m about to demonstrate, gets a bit messy…

So for our new test, as well as the Setup Processes in the Suite, we’ve also included one in the test for the addition of the TOURNAMENT_TEAMS record.
The creation of the TOURNAMENT_TEAMS record needs to remain in the Test Startup Process rather than in the Suite as it’s only relevant to this test and not to all tests in the Suite. However, as the TOURNAMENT record we’re looking for will definitely have been created by the Wrapper Suite Startup Processes before the Test Startup Process fires, this should not be a problem.

So, the main differences between this test – MANAGE_TOURNAMENTS.REMOVE_TOURNAMENT_2 – and its predecessor are simply that we are expecting this test to error with ORA-2292 – “Integrity constraint violated” – and that we now include the following Startup Process code to create the TOURNAMENT_TEAMS record :

declare
    l_id tournaments.id%type;
begin
    select id
    into l_id
    from tournaments
    where comp_code = 'UT1'
    and year_completed = 1916
    and host_nation = 'ARGENTINA';

    merge into tournament_teams
    using dual
    on
    (
        tourn_id = l_id
        and team_name = 'URUGUAY'
    )
    when not matched then
    insert( tourn_id, team_name, group_stage)
    values(l_id, 'URUGUAY', null);
end;

Now things start to get a bit complicated. In order to make sure that the test for a legitimate delete does not fail, we need to “teardown” the child record in TOURNAMENT_TEAMS that we created in our Startup Process. Well, no problem…except that SUT does not appear to allow Variable Substitution syntax to be used in a Teardown Process.
Therefore, we need to indulge in a little light hacking and put the following code in a Validation Action in our test :

declare
    l_id tournament_teams.tourn_id%type := :I_ID;
begin
    delete from tournament_teams
    where tourn_id = l_id
    and team_name = 'URUGUAY';
end;

This time, when we add the test to the Wrapper Suite, we make sure that the Test Startups are run :

add_to_suite_test2

Finally, we need to make sure that our new test runs first in the suite. In the Test Suite listing, click on the Test name then click the blue up arrow…

reorder_tests1

…until our test is at the top of the pile…

reorder_tests2

Once we’re happy with our tests, we can then fix the application code :

...
procedure remove_tournament( i_id tournaments.id%type)
is
begin
    delete from tournaments
    where id = i_id;
end remove_tournament;
...

…and run our wrapper suite to make sure everything passes…

ws_final_run

Finally, we need to add the manage_tournaments.remove_tournament_ws suite to the Suite we have for all the tests for the MANAGE_TOURNAMENTS package.
To do this, go to the Unit Test Navigator and expand the MANAGE_TOURNAMENTS suite.
Then, right-click the Suites node and select Add Suite…

suite_to_suite1

Now select the manage_tournaments.remove_tournament_ws suite from the list …

suite_to_suite2

…and click OK.

Near enough is good enough ? Finding a key value generated during a Startup Process

The other record deletion story we have concerns COMPETITIONS records.

The procedure we’ll be testing is in the MANAGE_COMPETITIONS package :

...
procedure remove_competition( i_code competitions.comp_code%type) is
begin
    delete from competitions
    where comp_code = i_code;
end remove_competition;
...

The acceptance criteria, and indeed the functionality, that we’re testing here are very similar to our previous User Story.
The Acceptance Criteria are :

  • A competition can be deleted
  • A competition cannot be deleted if a tournament exists for it

To make sure that the test to Delete a COMPETITION is self-contained, we need to make sure that the record we are deleting has no child records.
The easiest way to do this is to create the record as part of the Startup Process.
Obviously, this will need to be referenced by a Dynamic Values Query and therefore this code will need to run in another Wrapper Suite.

Once again, I’m using the GET_NEW_COMP_CODE function I created earlier. Yes, that one that you weren’t sure about. The one that you’re probably still not sure about. The Startup Process in my Wrapper Suite will be User PL/Sql Code :

declare
    l_code competitions.comp_code%type;
begin
    l_code := test_repo.get_new_comp_code;
    insert into competitions(comp_code, comp_name, description)
    values( l_code, 'Test', null);
end;

The next step may well be a bit tricky – in the Dynamic Values Query we use to determine the parameters to pass to the procedure, we need to find the COMP_CODE created in the Startup Process.
Now, we can do something like this…

select 'UT'
    || to_char(substr(test_repo.get_new_comp_code, regexp_instr( test_repo.get_new_comp_code, '[[:digit:]]')) -1) as I_CODE
from dual;

…but if the table has a change made to it in another session in the interval between our Startup Process and our Dynamic Values Query executing then we may well end up using an incorrect COMP_CODE value.

Let’s stop and think for a moment.
What we are writing here is not application code that may be executed concurrently by multiple users. We are writing Unit Tests.
Therefore, whilst this potential inexactitude would be problematic within the core Application Codebase, it’s not so much of an issue for a Unit Test.
Remember, the Unit tests will probably be run on a Development Environment with few users ( i.e. Developers) connected. They may also be run on a Continuous Integration Environment, in which case they are likely to be the only thing running on the database.
OK, so I could do something clever with the COMP_CODE value used in the Startup Process being assigned to a package variable/temporary table/whatever for reference by later testing steps in the same session, but I’m really not sure I need to go to all that effort right now.
You may well disagree with this approach, but as I’m the one at the keyboard right now, we’re pressing on…
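For what it’s worth, a minimal sketch of that package variable idea might look something like the following – the package name is made up and this is not what I’m actually doing here :

create or replace package test_session_state
as
    -- Holds the COMP_CODE generated by the Startup Process so that later
    -- steps in the same session can reference the exact value...
    g_comp_code competitions.comp_code%type;
end test_session_state;
/

The Startup Process would then assign the result of test_repo.get_new_comp_code to test_session_state.g_comp_code before doing the insert, and the Dynamic Values Query would simply select test_session_state.g_comp_code as I_CODE from dual.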

The validation code for this test will be a Query returning no row(s) :

select null
from competitions
where comp_code = '{I_CODE}'

The final Test (inside its wrapper suite) looks like this :

del_comp1

By contrast, making sure that we can’t delete a COMPETITIONS record which has TOURNAMENTS records associated with it is pretty straightforward.
We simply use the MERGE statements we’ve already added to the library to make sure we have a COMP_CODE UT1 and test against that.
As we know the value that we want to pass in ahead of time, we don’t even need a Dynamic Values Query. Therefore, we don’t need another Wrapper Suite.
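For reference, the Library MERGE for the COMPETITIONS record is along these lines – the name and description in the actual Library version may differ slightly :

merge into competitions
using dual
on ( comp_code = 'UT1')
when not matched then
insert( comp_code, comp_name, description)
values( 'UT1', 'UT1 Comp', 'Test');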

The test ends up looking like this :
del_comp2

Having checked that they work as expected, I’ve added the second test and the wrapper suite for the first test to the MANAGE_COMPETITIONS suite.
Together with our earlier tests for this package, the Suite as a whole now looks like this :

manage_comps_suite_contents

Deletion Tests Summary

By using the ability to define Startup (and Teardown) Processes at Suite level, we can work around some of the limitations of Dynamic Values Queries.
Additionally, this property of Suites offers some additional flexibility within SUT.
This does mean that some test code may end up in parts of the test structure where you would not normally expect to find it.

Updates

There doesn’t seem to be anything too different about the way SUT lets you test Update operations, except for the facility to have multiple implementations of a single test. There’s an example of this in the Testing Tutorial in the SQLDeveloper Help. Alternatively…

Multiple Implementations of the same test

We’ve got a Story about updating TOURNAMENTS records.

The Acceptance Criteria are :

  • Change the number of teams taking part in a tournament
  • Change the year a tournament started
  • The year a tournament started cannot be after the year a tournament finished

The procedure that we’re testing is in the MANAGE_TOURNAMENTS package :

procedure edit_tournament
(
    i_id tournaments.id%type,
    i_teams tournaments.number_of_teams%type default null,
    i_year_start tournaments.year_started%type default null
)
is
begin
    update tournaments
    set number_of_teams = nvl(i_teams, number_of_teams),
        year_started = nvl(i_year_start, year_started)
    where id = i_id;
end edit_tournament;

We’ve also added a check constraint to the table :

alter table tournaments add constraint chk_end_after_start
    check(nvl(year_started, year_completed) <= year_completed)
/

Once again, as the procedure we’re testing requires an ID value that we may or may not be creating at runtime, we’ll be needing a Wrapper Suite to feed a Dynamic Values Query to generate the TOURNAMENTS.ID value we pass into the procedure we’re testing.

Once we’ve got our Suite – which I’ve called manage_tournaments.edit_tournament_ws – we can start looking at the Test.

The Dynamic Value Query for the first test is :

select id as I_ID,
    5 as I_TEAMS,
    null as I_YEAR_START
from tournaments
where comp_code = 'UT1'
and year_completed = 1916
and host_nation = 'ARGENTINA'

I’ve published this to the Library as edit_tourn_params_ut1 as we’re going to need variations of it shortly.
The Expected Result is SUCCESS.

The Process Validation is a Boolean Function, which I’m adding to the library as verify_edit_tournament :

declare
    l_id tournaments.id%type := :I_ID;
    l_teams tournaments.number_of_teams%type := :I_TEAMS;
    l_year_start tournaments.year_started%type := :I_YEAR_START;
    l_count pls_integer;
begin
    select 1
    into l_count
    from tournaments
    where id = l_id
    and number_of_teams = nvl(l_teams, number_of_teams)
    and nvl(year_started, year_completed) = coalesce(l_year_start, year_started, year_completed);

    return l_count = 1;
end;

If we expand our new test in the Unit Test Navigator we can see that we have something called Test Implementation 1 under it.

utn_impl

Each test can have multiple implementations, a feature we’re going to make the most of for the User Acceptance Criteria we’re dealing with now.
First thing to do then, is to rename Test Implementation 1 to something a bit more meaningful.

To do this, right-click on the Implementation and select Rename Implementation…

impl_rc_rename

Then enter the new name and hit OK

impl_rc_rename2

Now we can create a second implementation by right-clicking the Test itself and selecting Add Implementation…

test_rc_add_impl

This time, I’ve called the implementation update_year_started.

We can now see that the new Implementation is in focus in the test, but that the Execution and Validation Processes have not been populated :

impl_in_focus

I’ve copied in the Dynamic Values Query from the Library and made the necessary changes for this implementation…

impl2_dvq

…and subscribed to the verify_edit_tournament Boolean Function we created in the first implementation.
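For reference, the amended query for this implementation will be something along these lines – the start year here is just an illustrative value that’s earlier than the tournament’s YEAR_COMPLETED :

select id as I_ID,
    null as I_TEAMS,
    1914 as I_YEAR_START
from tournaments
where comp_code = 'UT1'
and year_completed = 1916
and host_nation = 'ARGENTINA'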

The third Implementation is called update_start_after_end and is the same as the second, except I’m passing in a year later than the current YEAR_STARTED value for the TOURNAMENTS record.
The Expected Result is Exception with an ORA-2290 Check Constraint violated error so there’s no need to include the validation function.

One point to note here is that the Implementations seem to execute alphabetically by name and I don’t see any way of changing this manually.
This is not an issue in this case, when each test is reasonably independent, but it’s worth bearing in mind.

Once all of the application code is in place, the Test Suite result looks like this :

impl_results

Testing an In/Out SYS_REFCURSOR

Yes, we’re at the point that you may well have been dreading.
The Ref Cursor has been such a wonderfully useful addition to the PL/SQL language. It makes passing data between PL/SQL and programs written in other languages so much easier.
It is ironic therefore, that getting stuff out of a Ref Cursor is often quite painful when using a PL/SQL client.

Given this, one option we might consider when testing Ref Cursors could be to use whatever test framework is being employed to test the client code calling our PL/SQL API. However, that would be to pass up the opportunity to use Wouter’s clever little trick.

So, with nothing up my sleeves…

Is this a Ref Cursor that I see before me ?

The Acceptance Criterion for our User Story is that the application lists all tournaments in the system for the specified competition.

The procedure we need to test is in the MANAGE_TOURNAMENTS package and looks like this :

...
procedure list_tournaments
(
    i_comp_code tournaments.comp_code%type,
    io_tourn_list in out SYS_REFCURSOR
)
is
begin
    open io_tourn_list for
        select id, comp_code, year_completed, host_nation, year_started, number_of_teams
        from tournaments
        where comp_code = i_comp_code;
end list_tournaments;
...

The first clue to the fact that this test will be a bit different from normal comes when you select this procedure right at the start of the Create Test Wizard.
Immediately you will get :

dvq_warning

In a way this is reassuring. SUT recognizes that we need to handle a REF CURSOR. However, the template Query it provides for the Dynamic Values Query we need to use appears to pose more questions than answers…

select ? as I_COMP_CODE,
    ? as IO_TOURN_LIST,
    ? as IO_TOURN_LIST$
from ?
where ?

Now, there may well be a way of getting this to work as intended, but I’ve not been able to find out what it is.
What we can do instead is a bit of light cheating…

Wouter’s Method

To start with, we need to create a procedure in the TEST_REPO schema. This will act as a dummy Test Execution so we can do the real testing in the Validation Process.

Still don’t like me creating my own objects in the Repository ? Well, fair enough, but in this case, I can’t see any other option.
The procedure then is :

create or replace procedure this_is_not_a_cursor
as
--
-- Been listening to a bit of Public Image Ltd, hence the name of this proc...
--
begin
    null;
end;
/

…look, you can put it in a package if that’ll make you feel any better about it.

Anyway, we need to grant execute permissions to the SUT roles :

grant execute on this_is_not_a_cursor to ut_repo_user
/
grant execute on this_is_not_a_cursor to ut_repo_administrator
/

Now, let’s try creating our test again. We’re using the FOOTIE connection as usual. However, this time, we’ll be selecting a program from the Other Users node…

cursor_wiz1

The test name still follows our naming convention – i.e. MANAGE_TOURNAMENTS.LIST_TOURNAMENTS_1.

The Startup Process makes sure that we have records to query :

declare
    procedure ins( i_year tournaments.year_completed%type,
        i_host tournaments.host_nation%type,
        i_teams tournaments.number_of_teams%type)
    is
    begin
        merge into tournaments
        using dual
        on
        (
            comp_code = 'WC'
            and year_completed = i_year
            and host_nation = i_host
        )
        when not matched then
            insert ( id, comp_code, year_completed,
                host_nation, year_started, number_of_teams)
            values( tourn_id_seq.nextval, 'WC', i_year,
                i_host, null, i_teams);
    end ins;
begin
    merge into competitions
    using dual
        on ( comp_code = 'WC')
        when not matched then
        insert( comp_code, comp_name, description)
        values('WC', 'World Cup', 'FIFA World Cup');

    ins(1930, 'URUGUAY', 13);
    ins(1934, 'ITALY', 16);
    ins(1938, 'FRANCE', 16);
    ins(1950, 'BRAZIL', 13);

end;

We don’t need to specify any input parameters for the Execution Step of our test so we can skip straight on to Process Validation. Here we define some User PL/Sql Code, where all the fun happens…

declare
    l_rc sys_refcursor;
    rec_tourn tournaments%rowtype;
    l_count pls_integer := 0;
    l_exp_count pls_integer;
begin
    -- The "real" test...
    manage_tournaments.list_tournaments('WC', l_rc);
    loop
        fetch l_rc into rec_tourn;
        exit when l_rc%notfound;
        l_count := l_count + 1;
    end loop;
    close l_rc;
    -- confirm that the correct number of records have been retrieved
    select count(*) into l_exp_count
    from tournaments
    where comp_code = 'WC';

    if l_count != l_exp_count then
        raise_application_error(-20900, 'Number of records in ref cursor '||l_count||' does not match expected count of '||l_exp_count);
    end if;
end;

So, as we can use a PL/SQL block in Process Validation, we can define our SYS_REFCURSOR variable and execute the call to our procedure here.

Having added our “standard” Teardown, we’re ready to test.

The result….

ref_cur_magic

The main drawback with this approach is that you now have a “customised” repository and will have to cope with the extra administration and deployment overhead that comes with such objects. On the plus side, you can test Ref Cursor stuff.

Startup/Teardown using Table/Row Copy

Sooner or later you will encounter a testing scenario where a simple rollback just won’t do.
The next User Story is just such an example…

Bulk Upload Competitions – Using Table Copy for testing and rollback

This Story is intended to replicate the sort of ETL process that is quite common, especially in a Data Warehouse environment.
The scenario here is that you receive a delimited file containing data that needs to be loaded into your application.
The load needs to be permissive – i.e. you don’t want to fail the entire load if only a few records error.
The file format is validated as being what is expected by being loaded into an external table.
The load process then uses LOG ERRORS to upload all the records it possibly can, whilst keeping track of those records that failed by dumping them into an Error table.
The thing about LOG ERRORS is that it runs an Autonomous Transaction in the background.
Therefore, even if you issue a rollback after the load, any records written to the error table will be persisted.
In light of this, we’re going to need to use something else for our Teardown process.
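If you want to see this behaviour for yourself, a quick demonstration using the COMPETITIONS_XT external table and ERR$_COMPETITIONS error table described below would be :

insert into competitions( comp_code, comp_name, description)
    select comp_code, comp_name, description
    from competitions_xt
    log errors reject limit unlimited;

rollback;

-- the rejected rows are still there, despite the rollback...
select count(*)
from err$_competitions;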

The Data Model

Just to quickly recap, we already have an external table :

create table competitions_xt
(
    comp_code varchar2(5),
	comp_name varchar2(50),
	description varchar2(4000)
)
    organization external
    (
        type oracle_loader
        default directory my_files
        access parameters
        (
            records delimited by newline
            badfile 'competitions.bad'
            logfile 'competitions.log'
            skip 1
            fields terminated by ','
            (
                comp_code char(5),
                comp_name char(50),
                description char(4000)
            )
        )
        location('competitions.csv')
    )
    reject limit unlimited
/

We also have a csv file – competitions.csv with the data to load (including a duplicate record) :

comp_code,comp_name,description
HIC,Home International Championship, British Home International Championship
CA,Copa America,Copa America (South American Championship until 1975)
OLY,Olympic Football Tournament,The Olympics
WC,World Cup,The FIFA World Cup
CEIC,Central European International Cup,Central European International Cup - a forerunner to the European Championships
EURO,European Championship,UEFA European Championship
HIC,Home International Championship, British Home International Championship

We have an error table – ERR$_COMPETITIONS – that’s been created by :

begin
    dbms_errlog.create_error_log('COMPETITIONS');
end;
/

…and we have a stub we’ll be using to test the load (in the MANAGE_COMPETITIONS package) :

    procedure upload_competitions
    is
    begin
        null;
    end;

When we create the test, the Startup Processes need to backup both the COMPETITIONS table and the ERR$_COMPETITIONS table.
Creating the first Startup Process, we select Table or Row Copy from the drop-down :

backup_tab1

In the Window that pops up, the Source Table is the table we want to copy.
The Target Table is the temporary table that SQLDeveloper is going to create as a copy of the Source Table.
Note that the Target Table defaults to the same name irrespective of how many Startup Processes we specify.
For our first Startup Process, things look like this :

startup1

Notice that the Generated Query field updates as you enter the name of the source table. If you want to make sure that this Query is going to work at runtime, you can hit the Check button and (hopefully) be reassured with the message :

startup1a

So, we’ve got a backup for the COMPETITIONS table, now we need one for the error table.
This is pretty similar to the first Startup Process except that this time we rename the Temporary Table to TMP$MANAGE_COMPETITIONS.UPLERR :

startup2

As with the startup, there are two validation processes required. Actually, you could probably do it all in one but that would be to pass up the opportunity to demonstrate both at work.

Both of these will be User PL/SQL Code blocks. First off, check that we’ve loaded the correct number of rows :

declare
    l_count pls_integer;
    wrong_count exception;
begin
    select count(comp.comp_code)
    into l_count
    from competitions comp
    inner join competitions_xt cxt
    on comp.comp_code = cxt.comp_code;

    if l_count != 7 then
        raise wrong_count;
    end if;
end;

…and then make sure that we have the correct number of error records ….

declare
    l_count pls_integer;
    wrong_count exception;
begin
    select count(*)
    into l_count
    from err$_competitions;

    if l_count != 1 then
        raise wrong_count;
    end if;
end;

Finally, for the Teardown, we need to restore our tables to the state prior to the test execution.

This time, the process type from the drop-down is Table or Row Restore

Note that the check-box to drop the temp table comes pre-checked…

restore1

For the second Teardown process, to restore the Error table to its former state, we need to do a bit more typing.
This is because SQLDeveloper defaults to the same values for each Teardown Process.
So, we need to specify that our Target Table is ERR$_COMPETITIONS and our Source Table is “TMP$MANAGE_COMPETITIONS.UPLERR” :

restore2

After all that, we can see that we have two Startup Processes, two Validation Processes, and two Teardown processes in our new test :

test7_summary

After confirming that the test fails as expected…

test7_fail

…we update the application code…

...
procedure upload_competitions
is
begin
    insert into competitions( comp_code, comp_name, description)
        select comp_code, comp_name, description
        from competitions_xt
        log errors reject limit unlimited;
end upload_competitions;
...

…and re-run the test…

test7_summary

Sharing your Suites – version control and code promotion for Tests

If you plan to use your SUT tests in anything other than a single repository, chances are that you’ll want to be able to :

  • transfer them between environments
  • put them under some form of source control

Well, you’re in luck. Not only can you export Tests or Suites to an xml file on disk, an export will automatically include any Subscribed objects from the Library.
To demonstrate, right-click on the Suite we want to export :

rc_export

…and select the file location…

exp_file

…and SUT will anticipate your every need…

exp_conf

…almost.

Obviously, you’ll need to make sure you deploy any custom Stored Program Units upon which the exported objects are dependent.

To import a file into a repository, you can use the Main Tools/Unit Test menu :

imp_menu

…which allows you to choose the file to import, as well as the option of whether or not to overwrite an object of the same name that already exists in the repository :

imp_file_select

Summary

Overall, SQLDeveloper Unit Testing provides a number of helpful features to reduce the burden of writing and maintaining tests.
Notable plus points are :

  • being able to save code in a Library to promote re-use
  • table backup and restore functionality for test Startups and Teardowns
  • the seamless way that exceptions can be tested down to the level of the error code
  • the fact that test results are retained and collated in the tool

Being declarative in nature, SUT provides a common structure for Unit Tests. Being declarative in nature, SUT does have some limitations.
It is possible to overcome some of these limitations by adding custom database objects to the Repository. Some consideration needs to be given as to what extent you want to do this.

I will be comparing SUT with other PL/SQL testing frameworks in a future post. Before that, I need to evaluate some other frameworks.
The next one on my list is utPLSQL…


Filed under: PL/SQL, SQLDeveloper Tagged: Create Unit Test Wizard, Dynamic Value Queries, exceptions, Export, Import, Library, manage users, Purge Results, Repository, SQLDeveloper Unit Testing, Startup Process, sys_refcursor, Table or Row Copy, Teardown Process, Test Implementations, Test Suites, Testing in/out ref cursors, ut_repo_administrator role, ut_repo_user role, Variable Substitution, wrapper suites

Test Driven Development and PL/SQL – The Odyssey Begins

Sun, 2016-07-31 14:58

In the aftermath of the Brexit vote, I’m probably not alone in being a little confused.
Political discourse in the UK has focused on exactly who it was who voted to Leave.
The Youth Spokesperson you get on a lot of political programs right now, will talk accusingly of older voters “ruining” their future by opting to Leave.
Other shocked Remainers will put it down to people without a University Degree.
I’m not sure where that leaves me as someone who is ever so slightly over the age of 30, does not have a degree…and voted to Remain. I’m pretty sure I’m not Scottish…unless there’s some dark family secret my parents haven’t let me in on.
I suppose I must be a member of the “Metropolitan Elite” the Leave side was always muttering darkly about.
After all, I do pay a great deal of money to be driven from my country residence to London to work every day…although I do have to share the train with the odd one or two fellow elitists who’ve made the same extravagant choice.
This does of course assume that Milton Keynes qualifies as being “in the country” and that my living there is a matter of choice rather than a question of being able to afford living any closer to London.

With all the excrement…er…excitement of the Referendum Campaign and its aftermath, I somehow never got around to writing my application to track the progress of the Euros (or the Copa America for that matter).
Whenever a major football tournament comes around, I always resolve to do this, if only to evoke memories of my youth when a large part of my bedroom wall was taken up with a World Cup Wallchart where you could fill in the results as they happened. That’s without mentioning the months leading up to the tournament and trying to complete the Panini collection – the only time you’d ever hear a conversation such as “OK, I’ll let you have Zico in exchange for Mick Mills”.

In order to prevent this happening again, I’ve resolved to write an application capable of holding details of any major international football tournament.
In the course of writing this application, I’d like to take the opportunity to have a look at an aspect of PL/SQL development that maybe isn’t as commonly used as it should be – Unit Testing.

Over the next few weeks, I plan to take a look at some of the Testing Frameworks available for PL/SQL and see how they compare.
The objective here is not so much to find which framework is the best/most suitable, but to perform an objective comparison between them using the same set of tests which implement fairly commonly encountered functionality.

If you’re looking for recommendations for a framework, then this article by Jacek Gebal is probably a good place to start.

In this post, I’ll be outlining the functionality that I’ll be testing in the form of User Stories, together with the application data model (or at least, the bit of it I need to execute the tests).
I’ll also have a look at the common pattern that tests written in these frameworks tend to follow.
Just to highlight why using a Test Framework might be useful, I’ll also script a couple of simple tests in SQL to see just how much code you have to write to implement tests without using a framework.

Unit Testing Approach

Taking the purist approach to Test-Driven Development, we’d need to :

  1. Write the test first and ensure that it fails
  2. Write the minimum amount of code required for the test to pass
  3. Run the test to make sure it passes

Additionally, we’d need to make sure that the tests were independent of each other – that the execution of one test is not dependent on the successful execution of a prior test.

Following this approach to the letter would cause one or two issues.
Firstly, if the procedure you’re testing does not exist, your test will not run and fail. It will error.
As the point of this step is, essentially, to ensure that your test code is sound (i.e. it won’t pass irrespective of the code it runs against), this is not what we’re after.
The second issue is specific to PL/SQL.
When defining PL/SQL procedures that interact with database tables, it’s usually a good idea to use anchored declarations where appropriate.
Even if we write a stub procedure, if the tables it will interact with do not exist, we’d have to use native types for our parameters and update the signature of the procedure once the tables had been created.
There is always the danger that this additional step would be missed.

So, in terms of PL/SQL then, I’d suggest that the pre-requisites for writing our test are :

  • The data model components (tables, views, RI constraints) that the procedure will interact with
  • A stub procedure with correctly typed parameters

Many testing frameworks seem to adopt four basic steps for each test, some of which are optional. They are :

  1. Setup – put the application into a known state from which the test can run
  2. Execute – run the code that you want to test
  3. Verify – check that what you expected to happen did happen
  4. Teardown – return the system to the state it was in before this test was run

This is the general pattern that I’ll be following for my tests.

The Stories to Test

My story finding is complete and the backlog has been populated.
The stories selected for Sprint 1 are on the board. OK, they’ve been chosen so that they cover some of the more common scenarios that we might need to test.
The stories are :

  1. Add a competition – tests insert into a table with a Primary Key
  2. Add a Tournament – insert into a table with a Foreign Key constraint
  3. Remove Tournament – delete from a table
  4. Remove Competitions – delete from a Parent table
  5. Edit a Tournament – update of a record in a table
  6. View Tournaments by Competition – select multiple records using an in/out ref cursor
  7. Bulk Upload Competitions – insert records in bulk using LOG ERRORS
  8. Add a Team to a Tournament – insert a record using a synthetic foreign key

Each of these stories has multiple Acceptance Criteria.
It’s worth noting that, as some of the functionality (i.e. the data integrity) is implemented in the data model (Primary Keys, Foreign Keys etc), the Acceptance Criteria for these stories need to cover this as well as the functionality implemented in the PL/SQL code itself.

The Data Model

I’ve taken the approach that, for a story to be Sprint Ready, the Data Model to support it must already be in place.
Currently, the data model looks like this :

sprint1_data_model

The DDL to create the application owner is :

create user footie identified by password
/

alter user footie default tablespace USERS
/

grant create session, create table, create procedure, create sequence to footie
/

alter user footie quota unlimited on users
/

grant read, write on directory my_files to footie
/

…where password is the password you want to give this user.
Note that a directory called MY_FILES already exists in my database.

The DDL to create the Data Model includes the COMPETITIONS table…

create table competitions
(
    comp_code varchar2(5) constraint comp_pk primary key,
    comp_name varchar2(50) not null,
    description varchar2(4000)
)
/

comment on table competitions is
    'International Football competitions for which tournament data can be added'
/

comment on column competitions.comp_code is
    'Internal code to uniquely identify this competition'
/

comment on column competitions.comp_name is
    'The name of the competition'
/

comment on column competitions.description is
    'A description of the competition'
/

…an external table to facilitate the bulk upload of COMPETITIONS…

create table competitions_xt
(
    comp_code varchar2(5),
    comp_name varchar2(50),
    description varchar2(4000)
)
    organization external
    (
        type oracle_loader
        default directory my_files
        access parameters
        (
            records delimited by newline
            skip 1
            fields terminated by ','
            badfile 'competitions.bad'
            logfile 'competitions.log'
            (
                comp_code char(5),
                comp_name char(50),
                description char(4000)
            )
        )
            location('competitions.csv')
    )
reject limit unlimited
/

….the TOURNAMENTS table…

create table tournaments
(
    id number constraint tourn_pk primary key,
    comp_code varchar2(5),
    year_completed number(4) not null,
    host_nation varchar2(100),
    year_started number(4),
    number_of_teams number(3) not null,

    constraint tourn_uk unique( comp_code, year_completed, host_nation)
)
/

comment on table tournaments is
    'A (finals) tournament of an International Football competition. Table alias is tourn'
/

comment on column tournaments.id is
    'Synthetic PK for the table as the Natural Key includes host_nation, which may be null. Values taken from sequence tourn_id_seq'
/
comment on column tournaments.comp_code is
    'The Competition that this tournament was part of (e.g. World Cup). Part of the Unique Key. FK to COMPETITIONS(comp_code)'
/

comment on column tournaments.year_completed is
    'The year in which the last match of this tournament took place. Mandatory. Part of Unique Key'
/

comment on column tournaments.host_nation is
    'The nation where the tournament was played (if a finals tournament). Part of Unique Key but is optional'
/
comment on column tournaments.year_started is
    'The year in which the first match was played ( if applicable). Cannot be later than the value in YEAR_COMPLETED'
/

comment on column tournaments.number_of_teams is
    'The number of teams taking part in the tournament'
/

…the TOURNAMENT_TEAMS table…

create table tournament_teams
(
    tourn_id number,
    team_name varchar2(100),
    group_stage varchar2(2),

    constraint team_pk primary key (tourn_id, team_name)
)
/

comment on table tournament_teams is
    'Teams participating in the Tournament. Alias is TEAM'
/

comment on column tournament_teams.tourn_id is
    'The ID of the tournament the team is participating in. Foreign Key to TOURNAMENTS(id).'
/

comment on column tournament_teams.team_name is
    'The name of the participating team'
/

comment on column tournament_teams.group_stage is
    'If the tournament has an initial group stage, the group identifier that the team is drawn in'
/

…this being 11g, a sequence to provide the values for the TOURNAMENTS.ID synthetic key (in 12c you can define this as part of the table ddl)…

create sequence tourn_id_seq
/
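As an aside, on 12c the sequence could be dispensed with altogether by declaring the key as an identity column – something like this (the table name here is purely illustrative) :

create table tournaments_12c
(
    id number generated always as identity
        constraint tourn_12c_pk primary key,
    comp_code varchar2(5)
)
/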

… a Foreign Key constraint from TOURNAMENTS to COMPETITIONS….

alter table tournaments
    add constraint tourn_comp_fk foreign key
        (comp_code) references competitions(comp_code)
/

…and a Foreign Key from TOURNAMENT_TEAMS to TOURNAMENTS…

alter table tournament_teams
    add constraint team_tourn_fk foreign key
        (tourn_id) references tournaments(id)
/

I’ve kept the Foreign Keys in separate files to make the initial deployment of the application simpler. By doing this, I can create the tables in any order without worrying about RI constraints. I can then add these as a separate step after all of the tables have been created.
The tables’ non-RI constraints (Primary, Unique Keys, Not Null constraints etc.) are created along with the table.

One other point to note is that I know there are one or two issues with the first-cut of the DDL above. This is so that I can see how well the tests I write highlight these issues.
As we know, before we begin writing a test, we’ll need to have a stub procedure for it to compile against.

The first of these is :

create or replace package footie.manage_competitions
as
    procedure add_competition
    (
        i_code footie.competitions.comp_code%type,
        i_name footie.competitions.comp_name%type,
        i_desc footie.competitions.description%type
    );
end manage_competitions;
/

create or replace package body footie.manage_competitions
as
    procedure add_competition
    (
        i_code footie.competitions.comp_code%type,
        i_name footie.competitions.comp_name%type,
        i_desc footie.competitions.description%type
    )
    is
    begin
        null;
    end add_competition;
end manage_competitions;
/
Scripting the Tests

I want to write a test script for my first story – Add a Competition.
There are two Acceptance Criteria that I need to test :

  • A new competition is added
  • A competition cannot be added more than once

That’s pretty simple, so the test should be fairly straight-forward. Using a SQL script, the first test would probably look something like this :

set serveroutput on size unlimited
declare
    l_result varchar2(4);
    l_err_msg varchar2(4000) := null;

    l_rec_count pls_integer;
    l_code footie.competitions.comp_code%type;
    l_name footie.competitions.comp_name%type := 'UT1 Comp';
    l_desc footie.competitions.description%type := 'Test';

    l_counter pls_integer := 1;

begin
    -- Setup - make sure that the competition we're using for the test does not already exist
    select 'UT'||numeral
    into l_code
    from
    (
        select max( to_number( substr( comp_code, regexp_instr( comp_code, '[[:digit:]]')))) + 1 as numeral
        from footie.competitions
        where comp_code like 'UT%'
        and regexp_instr( substr( comp_code, -1, 1), '[[:digit:]]') = 1
        union
        select 1 from dual
    )
    where numeral is not null;
    -- execute - nested block so that we can handle/report any exceptions
    begin
        footie.manage_competitions.add_competition(l_code, l_name, l_desc);
        l_result := 'PASS';
    exception
        when others then
            l_err_msg := dbms_utility.format_error_backtrace;
            l_result := 'FAIL';
    end; -- execute block
    -- validate
    if l_result = 'PASS' then
        select count(*) into l_rec_count
        from footie.competitions
        where comp_code = l_code
        and comp_name = l_name
        and description = l_desc;

        if l_rec_count != 1 then
            l_result := 'FAIL';
            l_err_msg := 'Record not added';
        end if;
    end if;
    -- teardown
    rollback;
    -- Display Result
    dbms_output.put_line('Add Competition : '||l_result);
    if l_result = 'FAIL' then
        dbms_output.put_line(l_err_msg);
    end if;
end;
/

If we run this, we’d expect it to fail, as things stand :

PL/SQL procedure successfully completed.

Add Competition : FAIL
Record not added

Our second test will probably look like this :

set serveroutput on size unlimited
declare
    l_result varchar2(4);
    l_err_msg varchar2(4000) := null;
    l_err_code number;

    l_rec_count pls_integer;
    l_code footie.competitions.comp_code%type := 'UT2';
    l_name footie.competitions.comp_name%type := 'UT2 Comp';
    l_desc footie.competitions.description%type := null;

    l_counter pls_integer := 1;

begin
    -- Setup - make sure that the competition we're using for the test exists
    merge into footie.competitions
    using dual
    on (comp_code = l_code)
    when not matched then
    insert( comp_code, comp_name, description)
    values(l_code, l_name, l_desc);

    -- execute - nested block so that we can handle/report any exceptions
    begin
        footie.manage_competitions.add_competition(l_code, l_name, l_desc);
        l_result := 'FAIL';
    exception
        when others then
            l_err_msg := dbms_utility.format_error_backtrace;
            l_result := 'PASS';
    end; -- execute block
    -- validate
    if l_result = 'PASS' then
        select count(*) into l_rec_count
        from footie.competitions
        where comp_code = l_code
        and comp_name = l_name
        and nvl(description, 'X') = nvl(l_desc, 'X');

        if l_rec_count > 1 then
            l_result := 'FAIL';
            l_err_msg := 'Duplicate record has been added';
        end if;
    end if;
    -- teardown
    rollback;
    -- Display Result
    dbms_output.put_line('Add Competition : '||l_result);
    if l_result = 'FAIL' then
        dbms_output.put_line(l_err_msg);
    end if;
end;
/

That looks rather similar to our first test. Furthermore, the two test scripts combined add up to quite a lot of code.
At least when we run it, it fails as expected…

PL/SQL procedure successfully completed.

Add Competition : FAIL

The next step is to write the code to pass the tests…

create or replace package body manage_competitions
as
	procedure add_competition
	(
		i_code competitions.comp_code%type,
		i_name competitions.comp_name%type,
		i_desc competitions.description%type default null
	)
	is
	begin
        insert into competitions( comp_code, comp_name, description)
        values( i_code, i_name, i_desc);
	end add_competition;
end manage_competitions;
/

Now when we run our tests we get …

PL/SQL procedure successfully completed.

Add Competition : PASS

PL/SQL procedure successfully completed.

Add Competition : PASS

Now, I could carry on writing SQL scripts for Acceptance Criteria for all of the other stories in the sprint, but I think you get the idea.
Writing tests this way requires a large amount of code, much of which is being replicated in separate tests.

Of course, you could go down the route of moving some of your repeating test routines into PL/SQL packages and deploying your test code along with your application code to non-production environments.
Before going to all of that extra effort though, it’s probably worth checking to see if there’s a framework out there that can help reduce the testing burden.
Fortunately, I’ve got a couple of years until the next major international tournament so, I’ll be taking some time to do just that.
So, tune in next week ( or sometime soon after) when I’ll be taking a look at the first of these frameworks…SQLDeveloper Unit Testing


Filed under: Oracle, PL/SQL, SQL Tagged: external table, foreign key, Test Driven Development, unique key

The Oracle Data Dictionary – Keeping an eye on your application in uncertain times

Sun, 2016-07-17 13:15

I’ve got to say that it’s no surprise that we’re leaving Europe. It’s just that we expected it to be on penalties, probably to Germany.
Obviously, that “we” in the last gag is England. Wales and Northern Ireland have shown no sense of decorum and continued to antagonise our European Partners by beating them at football.
Currently, the national mood seems to be that of a naughty child who stuck their fingers in the light socket to see what would happen, and were awfully surprised when it did.

In the midst of all this uncertainty, I’ve decided to seek comfort in the reassuringly familiar.
Step forward the Oracle Data Dictionary – Oracle’s implementation of the Database Catalog.

However closely you follow the Thick Database Paradigm, the Data Dictionary will serve as the Swiss Army Knife in your toolkit for ensuring Maintainability.
I’ll start off with a quick (re)introduction of the Data Dictionary and how to search it using the DICTIONARY view.
Then I’ll cover just some of the ways in which the Data Dictionary can help you to get stones out of horses’ hooves…er…keep your application healthy.

Right then….

What’s in the Data Dictionary ?

The answer is, essentially, metadata about any objects you have in your database down to and including source code for any stored program units.
Data Dictionary views tend to come in three flavours :

  • USER_ – anything owned by the currently connected user
  • ALL_ – anything in USER_ plus anything the current user has access to
  • DBA_ – anything in the current database
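For example, the same count run against different flavours of the TABLES view will usually return different results – the numbers will obviously depend on your own schema and privileges :

select 'USER' as scope, count(*) as tab_count from user_tables
union all
select 'ALL', count(*) from all_tables
/

-- DBA_TABLES would need additional privileges (e.g. SELECT ANY DICTIONARY), so it's left out here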

The Data Dictionary has quite a lot of stuff in it, as you can tell by running this query :

select count(*)
from dictionary
/

You can sift through this mountain of information by having a look at the comments available in DICTIONARY (DICT to its friends) for each of the Views listed.
For example…

select comments
from dict
where table_name = 'USER_TABLES'
/

COMMENTS
--------------------------------------------------
Description of the user's own relational tables

You can see a graphical representation of these USER_ views in whatever Oracle IDE you happen to be using.
For example, in SQLDeveloper…

sqldev_tree

This graphical tree view corresponds roughly to the following Data Dictionary views :

  • USER_TABLES – Description of the user’s own relational tables
  • USER_VIEWS – Description of the user’s own views
  • USER_EDITIONING_VIEWS – Descriptions of the user’s own Editioning Views
  • USER_INDEXES – Description of the user’s own indexes
  • USER_OBJECTS – Objects owned by the user (this includes functions, packages, procedures etc)
  • USER_QUEUES – All queues owned by the user
  • ALL_QUEUE_TABLES – All queue tables accessible to the user
  • USER_TRIGGERS – Triggers having FOLLOWS or PRECEDES ordering owned by the user (includes Cross Edition Triggers)
  • USER_TYPES – Description of the user’s own types
  • USER_MVIEW_LOGS – All materialized view logs owned by the user
  • USER_SEQUENCES – Description of the user’s own SEQUENCEs
  • USER_SYNONYMS – The user’s private synonyms
  • ALL_SYNONYMS – All synonyms for base objects accessible to the user and session (includes PUBLIC synonyms)
  • USER_DB_LINKS – Database links owned by the user
  • ALL_DB_LINKS – Database links accessible to the user
  • ALL_DIRECTORIES – Description of all directories accessible to the user
  • ALL_EDITIONS – Describes all editions in the database
  • USER_XML_SCHEMAS – Description of XML Schemas registered by the user
  • USER_SCHEDULER_JOBS – All scheduler jobs in the database
  • RESOURCE_VIEW – Whilst not part of the DICTIONARY per se, you can see details of XML DB Schema in this view
  • USER_RECYCLEBIN – User view of his recyclebin
  • ALL_USERS – Information about all users of the database

As all of this metadata is available in views, it can be interrogated programmatically via SQL, as we’ll discover shortly. Before that though, let’s introduce…

The Brexit Schema

To add an element of topicality, the following examples will be based on this schema.

The user creation script looks like this :

grant connect, create table, create procedure, create sequence
    to brexit identified by ceul8r
/

alter user brexit default tablespace users
/
alter user brexit quota unlimited on users
/

You’ll probably want to choose your own (weak) pun-based password.

The tables in this schema are ( initially at least)…

create table countries
(
    iso_code varchar2(3),
    coun_name varchar2(100) not null,
    curr_code varchar2(3) not null,
    is_eu_flag varchar2(1)
)
/

create table currencies
(
    iso_code varchar2(3) constraint curr_pk primary key,
    curr_name varchar2(100)
)
/

For reasons which will become apparent, we’ll also include this procedure, complete with “typo” to ensure it doesn’t compile…

create or replace procedure add_currency
(
	i_iso_code currencies.iso_code%type,
	i_curr_name currencies.curr_name%type
)
as

begin
	-- Deliberate Mistake...
	brick it for brexit !
	insert into currencies( iso_code, curr_name)
	values( i_iso_code, i_curr_name);
end add_currency;
/

The examples that follow are based on the assumption that you are connected as the BREXIT user.

First up….

Spotting tables with No Primary Keys

Say that we want to establish whether a Primary Key has been defined for each table in the schema.
Specifically, we want to check permanent tables which comprise the core application tables. We’re less interested in checking on Global Temporary Tables or External Tables.
Rather than wading through the relevant DDL scripts, we can get the Data Dictionary to do the work for us :

select table_name
from user_tables
where temporary = 'N' -- exclude GTTs
and table_name not in
(
    -- exclude External Tables ...
    select table_name
    from user_external_tables
)
and table_name not in
(
    -- see if table has a Primary Key
    select table_name
    from user_constraints
    where constraint_type = 'P'
)
/

TABLE_NAME
------------------------------
COUNTRIES

It looks like someone forgot to add constraints to the countries table. I blame the shock of Brexit. Anyway, we’d better fix that…

alter table countries add constraint
	coun_pk primary key (iso_code)
/

…and add an RI constraint whilst we’re at it…

alter table countries add constraint
	coun_curr_fk foreign key (curr_code) references currencies( iso_code)
/

…so that I’ve got some data with which to test…

Foreign Keys with No Indexes

In OLTP applications especially, it’s often a good idea to index any columns that are subject to a Foreign Key constraint in order to improve performance.
To see if there are any FK columns in our application that may benefit from an index…

with cons_cols as
(
    select cons.table_name,  cons.constraint_name,
        listagg(cols.column_name, ',') within group (order by cols.position) as columns
    from user_cons_columns cols
    inner join user_constraints cons
		on cols.constraint_name = cons.constraint_name
	where cons.constraint_type = 'R'
    group by cons.table_name, cons.constraint_name
),
ind_cols as
(
select ind.table_name, ind.index_name,
    listagg(ind.column_name, ',') within group( order by ind.column_position) as columns
from user_ind_columns  ind
group by ind.table_name, ind.index_name
)
select cons_cols.table_name, cons_cols.constraint_name, cons_cols.columns
from cons_cols
where cons_cols.table_name not in
(
    select ind_cols.table_name
    from ind_cols
    where ind_cols.table_name = cons_cols.table_name
    and ind_cols.columns like cons_cols.columns||'%'
)
/

Sure enough, when we run this as BREXIT we get…

TABLE_NAME		       CONSTRAINT_NAME	    COLUMNS
------------------------------ -------------------- ------------------------------
COUNTRIES		       COUN_CURR_FK	    CURR_CODE
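
The most likely fix here is simply to index the Foreign Key column – the index name below is just my choice :

create index coun_curr_fk_idx on countries(curr_code)
/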

Post Deployment Checks

It’s not just the Data Model that you can keep track of.
If you imagine a situation where we’ve just released the BREXIT code to an environment, we’ll want to check that everything has worked as expected. To do this, we may well recompile any PL/SQL objects in the schema to ensure that everything is valid….

exec dbms_utility.compile_schema(user)

…but once we’ve done this we want to make sure. So…

select object_name, object_type
from user_objects
where status = 'INVALID'
union
select constraint_name, 'CONSTRAINT'
from user_constraints
where status = 'DISABLED'
/

OBJECT_NAME		       OBJECT_TYPE
------------------------------ -------------------
ADD_CURRENCY		       PROCEDURE

Hmmm, I think we’d better fix that. But how do we find out what the error is without recompiling ?

select line, position, text
from user_errors
where name = 'ADD_CURRENCY'
and type = 'PROCEDURE'
order by sequence
/

LINE POSITION TEXT
---- -------- --------------------------------------------------------------------------------
  10        8 PLS-00103: Encountered the symbol "IT" when expecting one of the following:

                 := . ( @ % ;
Impact Analysis

Inevitably, at some point during the life of your application, you will need to make a change to it. This may well be a change to a table structure, or even to some reference data you previously thought was immutable.
In such circumstances, you really want to get a reasonable idea of what impact the change is going to have in terms of changes to your application code.
For example, if we need to make a change to the CURRENCIES table…

select name, type
from user_dependencies
where referenced_owner = user
and referenced_name = 'CURRENCIES'
and referenced_type = 'TABLE'
union all
select child.table_name, 'TABLE'
from user_constraints child
inner join user_constraints parent
	on child.r_constraint_name = parent.constraint_name
where child.constraint_type = 'R'
and parent.table_name = 'CURRENCIES'
/

NAME                           TYPE
------------------------------ ------------------
ADD_CURRENCY                   PROCEDURE
COUNTRIES                      TABLE             

Now we know the objects that are potentially affected by this proposed change, we have the scope of our Impact Analysis, at least in terms of objects in the database.

Conclusion

As always, there’s far more to the Data Dictionary than what we’ve covered here.
Steven Feuerstein has written a more PL/SQL focused article on this topic.
That about wraps it up for now, so time for Mexit.


Filed under: Oracle, PL/SQL, SQL Tagged: Data Dictionary, dbms_utility.compile_schema, dict, dictionary, listagg, thick database paradigm, user_constraints, user_cons_columns, USER_DEPENDENCIES, user_errors, user_ind_columns, user_objects, user_tables

Oracle – Pinning table data in the Buffer Cache

Thu, 2016-06-23 15:13

As I write, Euro 2016 is in full swing.
England have managed to get out of the Group Stage this time, finishing second to the mighty…er…Wales.
Fortunately Deb hasn’t mentioned this…much.

In order to escape the Welsh nationalism that is currently rampant in our house, let’s try something completely unrelated – a tale of Gothic Horror set in an Oracle Database…

It was a dark stormy night. Well, it was dark and there was a persistent drizzle. It was Britain in summertime.
Sitting at his desk, listening to The Sisters of Mercy ( required to compensate for the lack of a thunderstorm and to maintain the Gothic quotient) Frank N Stein was struck by a sudden inspiration.
“I know”, he thought, “I’ll write some code to cache my Reference Data Tables in a PL/SQL array. I’ll declare the array as a package header variable so that the data is available for the entire session. That should cut down on the amount of Physical IO my application needs to do !”

Quite a lot of code later, Frank’s creation lurched off toward Production.
The outcome wasn’t quite what Frank had anticipated. The code that he had created was quite complex and hard to maintain. It was also not particularly quick.
In short, Frank’s caching framework was a bit of a monster.

In case you’re wondering, no, this is not in any way autobiographical. I am not Frank (although I will confess to owning a Sisters of Mercy album).
I am, in fact, one of the unfortunates who had to support this application several years later.

OK, it’s almost certain that none of the developers who spawned this particular delight were named after a fictional mad doctor…although maybe they should have been.

In order to prevent others from suffering from a similar misapplication of creative genius, what I’m going to look at here is :

  • How Oracle caches table data in Memory
  • How to work out what tables are in the cache
  • Ways in which you can “pin” tables in the cache (if you really need to)

Fortunately, Oracle memory management is fairly robust, so there will be no mention of leeks.

Data Caching in Action

Let’s start with a simple illustration of data caching in Oracle.

To begin with, I’m going to make sure that there’s nothing in the cache by running …

alter system flush buffer_cache
/

…which, provided you have DBA privileges should come back with :

System FLUSH altered.

Now, with the aid of autotrace, we can have a look at the difference between retrieving cached and uncached data.
To start with, in SQL*Plus :

set autotrace on
set timing on

…and then run our query :

select *
from hr.departments
/

The first time we execute this query, the timing and statistics output will be something like :

...
27 rows selected.

Elapsed: 00:00:00.08
...

Statistics
----------------------------------------------------------
	106  recursive calls
	  0  db block gets
	104  consistent gets
	 29  physical reads
	  0  redo size
       1670  bytes sent via SQL*Net to client
	530  bytes received via SQL*Net from client
	  3  SQL*Net roundtrips to/from client
	  7  sorts (memory)
	  0  sorts (disk)
	 27  rows processed

If we now run the same query again, we can see that things have changed a bit…

...
27 rows selected.

Elapsed: 00:00:00.01
...

Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  8  consistent gets
	  0  physical reads
	  0  redo size
       1670  bytes sent via SQL*Net to client
	530  bytes received via SQL*Net from client
	  3  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	 27  rows processed

The second run was a fair bit faster. This is mainly because the data required to resolve the query was cached after the first run.
Therefore, the second execution required no Physical I/O to retrieve the result set.

So, exactly how does this caching malarkey work in Oracle ?

The Buffer Cache and the LRU Algorithm

The Buffer Cache is part of the System Global Area (SGA) – an area of RAM used by Oracle to cache various things that are generally available to any sessions running on the Instance.
The allocation of blocks into and out of the Buffer Cache is achieved by means of a Least Recently Used (LRU) algorithm.

You can see details of this in the Oracle documentation but, in very simple terms, we can visualise the workings of the Buffer Cache like this :

[Diagram : the Buffer Cache LRU algorithm]

When a data block is first read from disk, it’s loaded into the middle of the Buffer Cache.
If it’s then “touched” frequently, it will work its way towards the hot end of the cache.
Otherwise it will move to the cold end and ultimately be discarded to make room for other data blocks that are being read.
Sort of…

The Small Table Threshold

In fact, blocks that are retrieved as the result of a Full Table Scan will only be loaded into the mid-point of the cache if the size of the table in question does not exceed the Small Table Threshold.
The usual definition of this ( unless you’ve been playing around with the hidden initialization parameter _small_table_threshold) is a table that is no bigger than 2% of the buffer cache.
As we’re using the default Automatic Memory Management here, it can be a little difficult to pin down exactly what this is.
Fortunately, we can find out (provided we have SYS access to the database) by running the following query :

select cv.ksppstvl value,
    pi.ksppdesc description
from x$ksppi pi
inner join x$ksppcv cv
on cv.indx = pi.indx
and cv.inst_id = pi.inst_id
where pi.inst_id = userenv('Instance')
and pi.ksppinm = '_small_table_threshold'
/

VALUE      DESCRIPTION
---------- ------------------------------------------------------------
589        lower threshold level of table size for direct reads

The current size of the Buffer Cache can be found by running :

select component, current_size
from v$memory_dynamic_components
where component = 'DEFAULT buffer cache'
/

COMPONENT                                                        CURRENT_SIZE
---------------------------------------------------------------- ------------
DEFAULT buffer cache                                                251658240

Now I’m not entirely sure about this but I believe that the Small Table Threshold is reported in database blocks.
The Buffer Cache size from the query above is definitely in bytes.
The database we’re running on has a uniform block size of 8k.
Therefore, the Buffer Cache is around 30,720 blocks.
This would make 2% of it roughly 614 blocks, which is slightly more than the 589 being reported as the Small Table Threshold.
If you want to explore further down this particular rabbit hole, have a look at this article by Jonathan Lewis.
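
If you’d rather let the database do the arithmetic for you, something along these lines should get you in the right ballpark ( a sketch – rather than assuming an 8k block, it reads the block size from the db_block_size parameter ) :

select round(mem.current_size / prm.value) as cache_blocks,
    round((mem.current_size / prm.value) * 0.02) as two_percent_of_cache
from v$memory_dynamic_components mem
cross join
(
    select value
    from v$parameter
    where name = 'db_block_size'
) prm
where mem.component = 'DEFAULT buffer cache'
/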

This all sounds pretty good in theory, but how do we know for definite that our table is in the Buffer Cache ?

What’s in the Buffer Cache ?

In order to answer this question, we need to have a look at the V$BH view. The following query should prove adequate for now :

select obj.owner, obj.object_name, obj.object_type,
    count(buf.block#) as cached_blocks
from v$bh buf
inner join dba_objects obj
    on buf.objd = obj.data_object_id
where buf.class# = 1 -- data blocks
and buf.status != 'free'
and obj.owner = 'HR'
and obj.object_name = 'DEPARTMENTS'
and obj.object_type = 'TABLE'
group by obj.owner, obj.object_name, obj.object_type
/

OWNER                OBJECT_NAME          OBJECT_TYPE          CACHED_BLOCKS
-------------------- -------------------- -------------------- -------------
HR                   DEPARTMENTS          TABLE                            5

Some things to note about this query :

  • the OBJD column in v$bh joins to data_object_id in DBA_OBJECTS and not object_id
  • we’re excluding any blocks with a status of free because they are, in effect, empty and available for re-use
  • the class# value needs to be set to 1 – data blocks

So far we know that there are data blocks from our table in the cache. But we need to know whether all of the table is in the cache.

Time for another example…

We need to know how many data blocks the table actually has. Provided the statistics on the table are up to date we can get this from the DBA_TABLES view.

First of all then, let’s gather stats on the table…

exec dbms_stats.gather_table_stats('HR', 'DEPARTMENTS')

… and then check in DBA_TABLES…

select blocks
from dba_tables
where owner = 'HR'
and table_name = 'DEPARTMENTS'
/

    BLOCKS
----------
	 5

Now, let’s flush the cache….

alter system flush buffer_cache
/

…and try a slightly different query…


select *
from hr.departments
where department_id = 60
/
DEPARTMENT_ID DEPARTMENT_NAME		     MANAGER_ID LOCATION_ID
------------- ------------------------------ ---------- -----------
	   60 IT				    103        1400

We can now use the block total in DBA_TABLES to tell how much of the HR.DEPARTMENTS table is in the cache …

select obj.owner, obj.object_name, obj.object_type,
    count(buf.block#) as cached_blocks,
    tab.blocks as total_blocks
from v$bh buf
inner join dba_objects obj
    on buf.objd = obj.data_object_id
inner join dba_tables tab
    on tab.owner = obj.owner
    and tab.table_name = obj.object_name
    and obj.object_type = 'TABLE'
where buf.class# = 1
and buf.status != 'free'
and obj.owner = 'HR'
and obj.object_name = 'DEPARTMENTS'
and obj.object_type = 'TABLE'
group by obj.owner, obj.object_name, obj.object_type, tab.blocks
/

OWNER	   OBJECT_NAME	   OBJECT_TYP CACHED_BLOCKS TOTAL_BLOCKS
---------- --------------- ---------- ------------- ------------
HR	   DEPARTMENTS	   TABLE		  1	       5

As you’d expect the data blocks for the table will only be cached as they are required.
With a small, frequently used reference data table, you can probably expect it to be fully cached fairly soon after the application is started.
Once it is cached, the way the LRU algorithm works should ensure that the data blocks are constantly in the hot end of the cache.

In the vast majority of applications, this will be the case. So, do you really need to do anything ?

If your application is not currently conforming to this sweeping generalisation then you probably want to ask a number of questions before taking any precipitous action.
For a start, is the small, frequently accessed table you expect to see in the cache really frequently accessed ? Is your application really doing what you think it does ?
Whilst we’re on the subject, are there any rogue queries running more regularly than you might expect, causing blocks to be aged out of the cache prematurely ?

Once you’re satisfied that the problem does not lie with your application, or your understanding of how it operates, the next question will probably be, has sufficient memory been allocated for the SGA ?
There are many ways you can look into this. If you’re fortunate enough to have the Tuning and Diagnostic Packs licensed, there are various advisors that can help.
Even if you don’t, you can always take a look at V$SGA_TARGET_ADVICE.

If, after all of that, you’re stuck with the same problem, there are a few options available to you, starting with…

The Table CACHE option

This table property can be set so that a table’s data blocks are loaded into the hot end of the LRU as soon as they are read into the Buffer Cache, rather than the mid-point, which is the default behaviour.

Once again, using HR.DEPARTMENTS as our example, we can check the current setting on this table simply by running …

select cache
from dba_tables
where owner = 'HR'
and table_name = 'DEPARTMENTS'
/

CACHE
-----
    N

At the moment then, this table is set to be cached in the usual way.

To change this….

alter table hr.departments cache
/

Table HR.DEPARTMENTS altered.

When we check again, we can see that the CACHE property has been set on the table…

select cache
from dba_tables
where owner = 'HR'
and table_name = 'DEPARTMENTS'
/

CACHE
-----
    Y

This change does have one other side effect that is worth bearing in mind.
It causes the LRU algorithm to ignore the Small Table Threshold and dump all of the selected blocks into the hot end of the cache.
Therefore, if you do this on a larger table, you run the risk of flushing other frequently accessed blocks from the cache, thus causing performance degradation elsewhere in your application.
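
If you want to keep an eye out for larger tables that have had the CACHE property set, a query along these lines may help ( a sketch – the 2% figure mirrors the Small Table Threshold discussion earlier and the block size is again read from the db_block_size parameter ) :

select tab.owner, tab.table_name, tab.blocks
from dba_tables tab
where trim(tab.cache) = 'Y'
and tab.blocks >
(
    select (mem.current_size / prm.value) * 0.02
    from v$memory_dynamic_components mem
    cross join
    (
        select value
        from v$parameter
        where name = 'db_block_size'
    ) prm
    where mem.component = 'DEFAULT buffer cache'
)
/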

The KEEP Cache

Normally you’ll have a single Buffer Cache for an instance. If you have multiple block sizes defined in your database then you will have a Buffer Cache for each block size. However, you can define additional Buffer Caches and assign segments to them.

The idea behind the Keep Cache is that it will hold frequently accessed blocks without ageing them out.
It’s important to note that the population of the KEEP CACHE uses the identical algorithm to that of the Buffer Cache. The difference here is that you select which tables use this cache…

In order to take advantage of this, we first need to create a KEEP Cache :

alter system set db_keep_cache_size = 8m scope=both
/

System altered.

Note that, on my XE 11gR2 instance at least, the minimum size for the Keep Cache appears to be 8 MB ( or 1024 8k blocks).
We can now see that we do indeed have a Keep Cache…

select component, current_size
from v$memory_dynamic_components
where component = 'KEEP buffer cache'
/

COMPONENT               CURRENT_SIZE
----------------------  ------------
KEEP buffer cache       8388608

Now we can assign our table to this cache….

alter table hr.departments
    storage( buffer_pool keep)
/

Table altered.

We can see that this change has had an immediate effect :

select buffer_pool
from dba_tables
where owner = 'HR'
and table_name = 'DEPARTMENTS'
/

BUFFER_POOL
---------------
KEEP

If we run the following…

alter system flush buffer_cache
/

select * from hr.departments
/

select * from hr.employees
/

…we can see which cache is being used for each table, by amending our Buffer Cache query…

select obj.owner, obj.object_name, obj.object_type,
    count(buf.block#) as cached_blocks,
    tab.blocks as total_blocks,
    tab.buffer_pool as Cache
from v$bh buf
inner join dba_objects obj
    on buf.objd = obj.data_object_id
inner join dba_tables tab
    on tab.owner = obj.owner
    and tab.table_name = obj.object_name
    and obj.object_type = 'TABLE'
where buf.class# = 1
and buf.status != 'free'
and obj.owner = 'HR'
and obj.object_type = 'TABLE'
group by obj.owner, obj.object_name, obj.object_type,
    tab.blocks, tab.buffer_pool
/   

OWNER      OBJECT_NAME          OBJECT_TYPE     CACHED_BLOCKS TOTAL_BLOCKS CACHE
---------- -------------------- --------------- ------------- ------------ -------
HR         EMPLOYEES            TABLE                       5            5 DEFAULT
HR         DEPARTMENTS          TABLE                       5            5 KEEP

Once again, this approach seems rather straightforward. You have total control over what goes in the Keep Cache so why not use it ?
On closer inspection, it becomes apparent that there may be some drawbacks.

For a start, the KEEP and RECYCLE caches are not automatically managed by Oracle. So, unlike the Default Buffer Cache, if the KEEP Cache finds it needs a bit more space then it’s stuck, it can’t “borrow” some from other caches in the SGA. The reverse is also true, Oracle won’t allocate spare memory from the KEEP Cache to other SGA components.
You also need to keep track of which tables you have assigned to the KEEP Cache. If the number of blocks in those tables is greater than the size of the cache, then you’re going to run the risk of blocks being aged out, with the potential performance degradation that that entails.
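
One way of keeping an eye on this is to compare the total number of blocks in the tables assigned to the KEEP Cache with the size of the cache itself. Something like this should give a rough idea ( a sketch – it relies on the table stats being reasonably up to date and assumes a single block size throughout ) :

select sum(tab.blocks) as keep_table_blocks,
    (
        select round(mem.current_size / prm.value)
        from v$memory_dynamic_components mem
        cross join
        (
            select value
            from v$parameter
            where name = 'db_block_size'
        ) prm
        where mem.component = 'KEEP buffer cache'
    ) as keep_cache_blocks
from dba_tables tab
where tab.buffer_pool = 'KEEP'
/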

Conclusion

Oracle is pretty good at caching frequently used data blocks and thus minimizing the amount of physical I/O required to retrieve data from small, frequently used, reference tables.
If you find yourself in a position where you just have to persuade Oracle to keep data in the cache then the table CACHE property is probably your least worst option.
Creating a KEEP Cache does have the advantage of affording greater manual control over what is cached. The downside here is that it also requires some maintenance effort to ensure that you don’t assign too much data to it.
The other downside is that you are ring-fencing RAM that could otherwise be used for other SGA memory components.
Having said that, the options I’ve outlined here are all better than sticking a bolt through the neck of your application and writing your own database caching in PL/SQL.


Filed under: Oracle, SQL Tagged: alter system flush buffer_cache, autotrace, buffer cache, dba_objects, dbms_stats.gather_table_stats, Default Buffer cache, how to find the current small table threshold, Keep Cache, lru algorithm, small table threshold, Table cache property, v$bh, v$memory_dynamic_components, what tables are in the buffer cache, x$ksppcv, x$ksppi

What’s Special About Oracle ? Relational Databases and the Thick Database Paradigm

Fri, 2016-06-03 09:46

A wise man (or woman – the quote is unattributed) once said that assumption is the mother of all cock-ups.
This is especially true in the wonderful world of databases.
The term NoSQL covers databases as different from each other as they are from the traditional Relational Database Management Systems (RDBMS).
The assumption implicit in that last sentence is that Relational Databases are broadly the same.

The problems with this assumption begin to manifest themselves when a team is assembled to write a new application running on an Oracle RDBMS.

Non-Oracle developers may have been used to treating databases as merely a persistence layer. Their previous applications may well have been written to be Database Agnostic.
This is a term which is likely to cause consternation among Oracle Developers, or at least, Oracle Developers who have ever tried to implement and support a Database Agnostic application running on Oracle. They may well think of this approach as the “Big Skip” anti-pattern where the database is treated as a dumping ground for any old rubbish the application feels like storing.

As a consequence, they will strongly favour the application being “Front-End Agnostic”. In other words, they will lean toward the Thick Database Paradigm as a template for application architecture.
With all of this Agnosticism about, it’s amazing how religious things can get as the relative merits of these opposing views are debated.

These diametrically opposing views on the optimum architecture for a database centric application all stem from that one assumption about Relational Databases.
To make things even more interesting, both sides in this debate share this assumption.
The fact of the matter is that Oracle is very different from other RDBMSs. Oracle Developers need to appreciate this so that they can accept that the Database Agnostic Architecture is a legitimate choice for some RDBMSs and is not simply the result of non-Oracle Developers not knowing anything about databases.
The other point to note is that Oracle is very different from other RDBMS – OK, it’s technically the same point, but it’s such an important one, it’s worth mentioning twice.
Non-Oracle Developers need to understand this so that they can accept that the Thick Database Paradigm is a legitimate choice for the Oracle RDBMS and not simply the result of technological parochialism on the part of Oracle Developers.

Whatever kind of developer you are, you’re probably wondering just what I’m banging on about right now and where this is going.
Well, the purpose of this post is to take several steps back from the normal starting point for the debate over the optimal application architecture for a Database Centric Application on Oracle and set out :

  • Why Relational Databases are different from each other
  • Why the Thick Database Approach can be particularly suited to Oracle
  • Under what circumstances this may not be the case

Hopefully, by the end I’ll have demonstrated to any non-Oracle Developers reading this that the Thick Database Paradigm is at least worth considering when developing this type of application when Oracle is the RDBMS.
I will also have reminded any Oracle Developers that Oracle is a bit different to other RDBMS and that this needs to be pointed out to their non-Oracle colleagues when the subject of application architecture is being discussed.
I will attempt to keep the discussion at a reasonably high-level, but there is the odd coding example.
Where I’ve included code, I’ve used the standard Oracle demo tables from the HR application.
There are several good articles that do dive into the technical nitty-gritty of the Thick Database Paradigm on Oracle and I have included links to some of them at the end of this post.

I can already hear some sniggering when the term Thick Database gets used. Yes, you there in the “Web Developers Do It Online” t-shirt.
In some ways it would be better to think of this as the Intelligent Database Paradigm, if only to cater to those with a more basic sense of humour.

Assumptions

Before I go too much further, I should really be clear about the assumptions all of this is based on.

Application Requirements

To keep things simple, I’m going to assume that our theoretical application implements some form On-Line Transaction Processing (OLTP) functionality.
Of course, I’m going to assume that Oracle is the chosen database platform (or at least, the one you’re stuck with).
Most importantly, I’m going to assume that the fundamental non-functional requirements of the application are :

  • Accuracy
  • Performance
  • Security
  • Maintainability
Terminology

On a not entirely unrelated topic, I should also mention that some terms, when used in the context of the Oracle RDBMS, have slightly different meanings from those you might expect…

  • database – normally a term used to describe the database objects in an application – in Oracle we’d call this a schema. This is because database objects in Oracle must be owned by a database user or schema.
  • stored procedure – it’s common practice in PL/SQL to collect procedures and functions into Packages – so you’ll often hear the term Packaged Procedures, Packages, or Stored Program Units to cover this
  • database object – this is simply any discrete object held in the database – tables, views, packages etc
  • transaction – Oracle implements the ANSI SQL behaviour whereby a transaction consists of one or more SQL statements. A transaction is normally terminated explicitly by issuing a COMMIT or a ROLLBACK command.
The HR Schema

This is normally pre-installed with every Oracle database, although your DBA may have removed it as part of the installation.
If you want to follow along and it’s not installed, you can find the build script for it in :

$ORACLE_HOME/demo/schema/human_resources/hr_main.sql

Note that the script requires you to provide the SYS password for the database.

I’ve created copies of two of the tables from this application, the EMPLOYEES and DEPARTMENTS tables, for use in the examples below.

Database Agnostic and Thick Database – definitions

To keep things simple, we can explain each in the context of the Model-View-Controller(MVC) design pattern.

In MVC, the application components are divided into three categories :

  • The View – the GUI
  • The Controller – where all of the application logic exists. This layer sits in the middle between the view and the…
  • Model – the persistence layer – traditionally a Relational Database implementing a Physical Data Model

The Database Agnostic Approach is to treat the Model simply as a persistence layer. Implementation of Referential Integrity in the database is minimal and the implementation of any business logic is done entirely in the Controller layer, little or none of which impinges upon the RDBMS in the Model.
The main idea behind this approach is that it is trivial to migrate the application from one RDBMS to another.

The Thick Database Paradigm takes a very different approach.
The Referential Integrity is rigorously applied in the RDBMS, and the Data Model is done in some approximation of Third Normal Form.
The Controller layer is in fact implemented as two physical layers.
The code outside of the database – the Data Access Layer (DAL) accesses the model by means of a Transactional API (XAPI) which is held in Stored Procedures inside the RDBMS engine itself.
We’re going to explore the advantages of this approach in the context of the Oracle RDBMS.

There we are then, something for everyone to object to.
The thing is, both of these approaches have their place. The trick is to know the circumstances under which one is more appropriate.
It may help at this point then, if we can return to the question of…

Why RDBMSs are different from each other

Maybe that heading should read “Are RDBMSs different from each other ?” Superficially at least, they do seem to have a fair bit in common.
To start with, they all implement the relational model to some degree. This means that data is arranged in tables and that (in the main) it is possible to define relationships between these tables.
For circumstances where a Business Transaction may require multiple DML statements, the RDBMS will enable the creation of Stored Procedures to enable such transactions to be done in a single call to the database.
The most obvious similarity is, of course, that any retrieval of or amendment to data stored in the database is ultimately done by means of a Structured Query Language (SQL) statement.

A fundamental characteristic of SQL is that it is a Declarative Language. You use it to tell the database what data you want to access. It is the Database Engine that then has to figure out how to do this.

Whilst the implementation of SQL is (more-or-less) standard across RDBMSs, the underlying Database Engines behave very differently.

One example of the differences between Database Engines can be seen when you need to execute a query that contains many table joins.
If you were running such a query on MSSQL, it may well be more efficient to do this in multiple steps. This would be done by writing a query to populate a temporary table with a result set and then joining from that table to return the final results.
This contrasts with Oracle, where the optimal approach is usually to do this with a single SQL statement.

For the moment, I’ll assume that the above has been sufficient to persuade you that Relational Databases are in fact different from each other in a fairly fundamental way.
Feel free to put this to the test yourself. Go ahead, I’ll wait….

OK. We’re all agreed on that then.
The question you’re now asking is this –

If RDBMSs are different from each other is the Database Agnostic approach the best architecture for all of them ?

The next thing we need to understand is….

Why Oracle is Special

“Because it’s so expensive !” may well be your first thought. Remember that we’re assuming that Oracle is the RDBMS platform that you have chosen ( or been lumbered with) for your application. This being the case, we come back to the question of why the Thick Database Paradigm is worthy of consideration for your application architecture.

Returning to our list of non-functional application requirements, can you guess which of the application components is likely to have the biggest impact on performance of an Oracle Database Application ? Clue : It’s also the most expensive thing to change after go-live as it’s the card at the bottom of the house of cards that is your Application….

The Physical Data Model

This aspect of the Thick Database Paradigm is often overlooked. However, it is by far the most important aspect in maximizing the success of the implementation of this architectural approach.

Oh, that’s your sceptical face, isn’t it. You’re really not entirely sure about this. You’re probably not alone, even some Oracle Developers will be giving me that same look about now. I hope this next bit is convincing because my flame-proof underpants are currently in the wash.

OK, as I said a little while ago ( and I think you pretty much agreed at the time), any interaction with data stored in an RDBMS will ultimately require the execution of an SQL statement by the Database Engine.
The particular bit of the Oracle Kernel that works out the how is probably called KX$ something. Friends however, tend to refer to it as the Cost Based Optimizer (CBO).

The CBO is pretty sophisticated. The more information you can provide Oracle about your data model the better the execution plans the CBO generates.
The upshot is that the better the data model, the faster that statements against it will run.

For example, the CBO understands RI constraints and can account for them in its execution plans as I will now demonstrate…

I’ve copied the EMPLOYEES and DEPARTMENTS tables, including data, from the standard Oracle demo – the HR Application.

The DEPARTMENTS table looks like this :

create table departments
(
    department_id number(4) not null,
    department_name varchar2(30) not null,
    manager_id number(6),
    location_id number(4)
)
/

alter table departments
    add constraint departments_pk primary key( department_id)
/  

…and the EMPLOYEES like this :

create table employees
(
    employee_id number(6) not null,
    first_name varchar2(20),
    last_name varchar2(25) not null,
    email varchar2(25) not null,
    phone_number varchar2(20),
    hire_date date not null,
    job_id varchar2(10) not null,
    salary number(8,2),
    commission_pct number(2,2),
    manager_id number(6),
    department_id number(4)
)
/

alter table employees 
    add constraint employees_pk  primary key (employee_id)
/

Note that whilst DEPARTMENT_ID is listed in both tables I’ve not implemented any RI constraints at this point.

Now consider the following query

select emp.first_name, emp.last_name, dept.department_id
from employees emp
inner join departments dept
    on emp.department_id = dept.department_id
where emp.department_id = 60
/

If we ask the CBO for an execution plan for this query…

explain plan for
select emp.first_name, emp.last_name, dept.department_id
from employees emp
inner join departments dept
    on emp.department_id = dept.department_id
where emp.department_id = 60
/

… it will come back with something like this :

select *                  
from table(dbms_xplan.display)
/

Plan hash value: 2016977165

-------------------------------------------------------------------------------------
| Id  | Operation	   | Name	    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |		    |	  5 |	110 |	  3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS	   |		    |	  5 |	110 |	  3   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN| DEPARTMENTS_PK |	  1 |	  4 |	  0   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| EMPLOYEES	    |	  5 |	 90 |	  3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DEPT"."DEPARTMENT_ID"=60)
   3 - filter("EMP"."DEPARTMENT_ID"=60)

16 rows selected.

If we now add a constraint to ensure that a DEPARTMENT_ID in the EMPLOYEES table must already exist in the DEPARTMENTS table…

alter table employees 
    add constraint emp_dept_fk foreign key (department_id) references departments(department_id)
/   

…and then get the execution plan…

explain plan for
select emp.first_name, emp.last_name, dept.department_id
from employees emp
inner join departments dept
    on emp.department_id = dept.department_id
where emp.department_id = 60
/

select *                  
from table(dbms_xplan.display)
/

Plan hash value: 1445457117

-------------------------------------------------------------------------------
| Id  | Operation	  | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	      |     5 |    90 |     3	(0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMPLOYEES |     5 |    90 |     3	(0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EMP"."DEPARTMENT_ID"=60)

13 rows selected.

…we can see that the CBO is smart enough to know that the RI constraint eliminates the need to read the DEPARTMENTS table at all for this query.

A sensible data model has some other key benefits.

For example…

insert into hr.employees
(
    employee_id, 
    first_name, 
    last_name,
    email, 
    hire_date, 
    job_id,
    department_id
)
values
(
    207,
    'MIKE',
    'S',
    'mikes',
    sysdate,
    'IT_PROG',
    999 -- department_id does not exist in the DEPARTMENTS table
)
/

…results in …

SQL Error: ORA-02291: integrity constraint (HR.EMP_DEPT_FK) violated - parent key not found

Simply by typing a one-line statement to add this constraint, we’ve prevented the possibility of orphaned records being added to our application.
Better still, this rule will be enforced however the data is added – not just for records added via the application.

About now, non-Oracle developers may well be making the point that this logic needs to be implemented in the application code anyway. By adding it to the data model, aren’t we effectively coding the same functionality twice ?
Well, as we can see from the example above, the code required to create an RI constraint is minimal. Also, once it’s created, it exists in Oracle, there is no need to explicitly invoke it every time you need to use it.
Additionally, if you fully adopt the Thick Database approach, you don’t necessarily have to write code to re-implement rules enforced by constraints.

One other point that may well come up is the fact that most of the world now uses some variation of the Agile Development Methodology. Producing a complete data model in Sprint 1 is going to be a bit of a tall order for an application of even moderate complexity.
This is true. However, by implementing the Data Access Layer (DAL) pattern and separating the application code from the underlying data model, it’s possible to create stubs in place of parts of the data model that haven’t been developed. This does make it possible to fit Data Modelling into the structure required by these methodologies.

The key point here is that, even if this is the only bit of the Thick Database Paradigm you implement, your application will be far more maintainable.
The tuning tools at your disposal will be far more effective and useful if your application is based on a well defined, relational data model.

Whilst we’re on the subject of Application Code, it’s probably worth asking….

What are Stored Procedures Good For ?

In order to understand this, we need to look at the concept of a Database Transaction.
The ANSI Standard for SQL mandates that a transaction consists of one or more DML statements. In general terms, if the transaction is committed then all of the changes made by each statement in the transaction are saved. Otherwise, none of them are.
By default, many RDBMSs implement a transaction as a single SQL statement. In Oracle, the default behaviour conforms to the ANSI Standard.
In circumstances where a Business Transaction requires multiple DML statements, things can get a bit tricky without a Stored Procedure.
The application needs to issue multiple individual statements and commit each one in turn.
If a second or subsequent statement fails for any reason then you find that your data is left in an inconsistent state.
Stored Procedures solve this problem by bundling these statements up into a single transaction.
We’ll have a look at a specific example of this approach in Oracle using…

PL/SQL

The typical approach taken by vendors to implement Stored Procedures in an RDBMS involves providing some extensions to SQL to make it Turing Complete.
These extensions ( variable declaration, conditional statements, looping) are normally fairly minimal.
Oracle took a rather different approach with PL/SQL.
They took the ADA programming language and provided it with SQL extensions.
From the start then, PL/SQL was rather more fully featured than your average Stored Procedure language.
In the almost 30 years of its existence, PL/SQL has been further integrated within the RDBMS engine. Also, the addition of thousands of Oracle supplied libraries (packages) has extended its functionality to the point where it can be used for tasks as diverse as inter-session communication, backup and recovery, and sending e-mail.
Being a fully-fledged 3GL embedded into the heart of the database engine, PL/SQL is the fastest language for processing data in Oracle.
This is partly due to the fact that the code is co-located with the data, so network latency and bandwidth are not really an issue.
Yes, and you thought the idea of co-locating code and data was invented when those whizzy NoSQL databases came along, didn’t you ?
PL/SQL allows the developer to take a set-based approach to working with data. You can pretty much drop a SQL DML statement straight into a PL/SQL program without (necessarily) having to build it as a string first.
Furthermore, remember that transactions can encompass multiple database changes. By implementing these in PL/SQL, the entire transaction can be completed with a single database call, something that is not necessarily the case when the Controller code is outside of the database.
Implementing Business Transactions in PL/SQL is commonly done using the Transactional API (XAPI) pattern.

There is one particular aspect of ADA which has become central to the way that PL/SQL applications are written and that is the Package.
Rather than having lots of standalone procedures and functions, it is common practice to group these “stored procedures” into PL/SQL packages.
This approach has several advantages.
Grouping related functionality into packages reduces the number of individual programs you need to keep track of.
PL/SQL packages are stored in Oracle’s Database Catalogue ( the Data Dictionary) as two distinct objects – a Package Header or Specification – essentially the signature of all of the functions and procedures in the package ( package members) – and a Package Body – the actual code.
The Package Header is the object that is called to invoke a member procedure.
Provided you are not changing the signature of a public package member, you can amend the code in the package body without having to re-compile the header.
This means that you can make changes to the transactional code “under the hood” without necessarily requiring any re-coding in the caller to a packaged procedure.

Right, it’s time for an example.

Say we want to change the Manager of a Department. In this case, the current IT Department Manager – Alexander Hunold has decided that he’s far too busy to attend all of those planning meetings. I mean he took this job so he didn’t have to speak to anybody. You can tell he’s not really a people person, I mean just look at that T-shirt.
Diana Lorentz on the other hand, whilst also having the required technical background has a much better way with people.
So, in order to change the manager in the IT Department from Alexander to Diana we need to :

  1. Update the record in the DEPARTMENTS table with the ID of the new manager
  2. Update the EMPLOYEES records for members of that department so that they now report to the new manager
  3. Update the EMPLOYEES record for the new manager so that she now reports to the Department’s overall boss

Among other things, we’ll need to know which DEPARTMENT_ID we need to make these changes for. This would normally be selected from a drop-down list in the Application’s UI, with the name of the Department being displayed to the user but the ID being passed to our procedure.
Whilst the list of Departments is static/reference data and may well be cached on the mid-tier of our application to save repeated database calls, we’ll still need a means of getting this data out of the database initially.
Therefore, we may well have a package that contains two members :

  • a function to return the department information
  • a procedure to assign the new manager

Such a package will probably look something like this. First the Package Header…

create or replace package manage_departments
is
    --
    -- This is the package header or specification.
    -- It gives the signature of all public package members (functions and packages
    --
    function get_department_list return sys_refcursor;
    procedure change_manager
    (
        i_department_id departments.department_id%type,
        i_old_manager_id employees.employee_id%type,
        i_new_manager_id departments.manager_id%type
    );
end manage_departments;
/

… and now the body…

create or replace package body manage_departments
is
    --
    -- This is the package body.
    -- It contains the actual code for the functions and procedures in the package
    --
    function get_department_list return sys_refcursor
    is
        l_rc sys_refcursor;
    begin
        open l_rc for
            select department_name, department_id
            from departments;
        return l_rc;
    end get_department_list;
    
    procedure change_manager
    (
        i_department_id departments.department_id%type,
        i_old_manager_id employees.employee_id%type,
        i_new_manager_id departments.manager_id%type
    )
    is
        l_dept_head_manager employees.manager_id%type;
    begin
        --
        -- First update the department record with the new manager
        --
        update departments
        set manager_id = i_new_manager_id
        where department_id = i_department_id;

        -- Now find the Manager of the existing department head
        -- we'll need this to assign to the new department head
        --
        select manager_id 
        into l_dept_head_manager
        from employees
        where employee_id = i_old_manager_id;        
        --
        -- Now update all of the employees in that department to
        -- report to the new manager...apart from the new manager themselves
        -- who reports to the department head.
        update employees
        set manager_id = 
            case when employee_id != i_new_manager_id 
                then i_new_manager_id
                else l_dept_head_manager
            end
        where department_id = i_department_id;        
        --
        -- Note - for the purposes of simplicity I have not included any
        -- error handling.
        -- Additionally, best practice is normally to allow transaction control
        -- to be determined by the caller of a procedure so an explicit commit
        -- or rollback needs to take place there.
        --
    end change_manager;
end manage_departments;
/

Using the Oracle CLI, SQL*Plus to act as the caller, we can see how the function works :

set autoprint on
set pages 0
var depts refcursor
exec :depts := manage_departments.get_department_list

PL/SQL procedure successfully completed.

Administration				  10
Marketing				  20
Purchasing				  30
Human Resources 			  40
Shipping				  50
IT					  60
Public Relations			  70
Sales					  80
Executive				  90
Finance 				 100
Accounting				 110
Treasury				 120
Corporate Tax				 130
Control And Credit			 140
Shareholder Services			 150
Benefits				 160
Manufacturing				 170
Construction				 180
Contracting				 190
Operations				 200
IT Support				 210
NOC					 220
IT Helpdesk				 230
Government Sales			 240
Retail Sales				 250
Recruiting				 260
Payroll 				 270

27 rows selected.

Now we need to call the procedure to change the manager. In order to keep things simple, I’ve cheated a bit here and not included the code to lookup the EMPLOYEE_IDs of Alexander (103) and Diana ( 107).
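
For completeness, that lookup would be nothing more exotic than ( a sketch ) :

select employee_id, first_name, last_name
from employees
where last_name in ('Hunold', 'Lorentz')
/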

So, using SQL*Plus once again :

exec manage_departments.change_manager(60, 103, 107)
commit;

NOTE – it is also possible (and often preferred) to pass parameters using named notation when calling PL/SQL. So, the following code would work equally well ( and possibly be a bit more readable) :

exec manage_departments.change_manager( i_department_id => 60, i_old_manager_id => 103, i_new_manager_id => 107);
commit;

We can now see that both of the DML changes have been applied :

select emp.first_name||' '||emp.last_name, dept.manager_id
from departments dept
inner join employees emp
    on dept.manager_id = emp.employee_id
where dept.department_id = 60
/ 

EMP.FIRST_NAME||''||EMP.LAST_NAME	       MANAGER_ID
---------------------------------------------- ----------
Diana Lorentz					      107


select first_name, last_name, manager_id
from employees
where department_id = 60
/

FIRST_NAME	     LAST_NAME		       MANAGER_ID
-------------------- ------------------------- ----------
Alexander	     Hunold			      107
Bruce		     Ernst			      107
David		     Austin			      107
Valli		     Pataballa			      107
Diana		     Lorentz			      102

The fact that Packages are stored in the Data Dictionary means that Oracle automatically keeps track of the dependencies that they have on other database objects.
This makes impact analysis much easier. For example, if we were going to make a change to the DEPARTMENTS table, we could see what other database objects might be impacted by running the following query on the Data Dictionary :

select name, type
from user_dependencies
where referenced_name = 'DEPARTMENTS'
and referenced_type = 'TABLE'
/

NAME			       TYPE
------------------------------ ------------------
MANAGE_DEPARTMENTS	       PACKAGE
MANAGE_DEPARTMENTS	       PACKAGE BODY

One more significant benefit of using PL/SQL is that any parameters passed into a stored procedure – whether part of a package or standalone – are automatically bound.
Bind variables are advantageous for two reasons.
Firstly, use of them enables Oracle to re-execute frequently invoked statements from memory, without having to re-validate them each time. This is known as a soft parse. This offers significant performance benefits.
The second, and perhaps more important advantage is that bind variables tend not to be susceptible to SQL Injection strings.
Effectively, calling a PL/SQL stored program unit is the equivalent of making a Prepared Statement call.
Whilst this automatic binding does not render PL/SQL completely immune from SQL Injection, it does greatly reduce the attack surface for this kind of exploit.
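
To illustrate the contrast, here’s a sketch of a purely hypothetical procedure showing the two approaches side by side when you do venture into dynamic SQL ( remember that the static SQL inside our packages gets the bind behaviour for free ) :

create or replace procedure count_by_name( i_last_name in employees.last_name%type)
is
    l_count pls_integer;
begin
    --
    -- Vulnerable to SQL Injection - the parameter value becomes part of the statement text
    --
    execute immediate
        'select count(*) from employees where last_name = '''||i_last_name||''''
        into l_count;
    --
    -- Safer - the parameter is passed as a bind variable and the statement text never changes
    --
    execute immediate
        'select count(*) from employees where last_name = :b_last_name'
        into l_count using i_last_name;
end count_by_name;
/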

In-Memory Processing

In-Memory processing is big at the moment. It’s one of those things like Big Data in that there is lots of enthusiasm around something which, to be frank, has already been happening for many years.
Oracle has some rather sophisticated memory management out of the box.
As already mentioned, SQL and PL/SQL code that is frequently executed, together with the meta-data required to parse it, is cached in memory.
The same is true for frequently used data blocks. In other words, if you have data that is frequently accessed, Oracle will look to store this in memory, thus reducing the amount of physical I/O it needs to do.
This has nothing to do with Oracle’s newfangled “In-memory” option. It’s a core part of the product.
Generally speaking, the more application code you add to the RDBMS, the more efficiently Oracle will work.
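
If you’re curious about which statements are currently sitting in the Shared Pool, you can take a peek ( a sketch – it simply lists the ten most executed statements parsed by the HR schema ) :

select *
from
(
    select sql_text, executions, parse_calls
    from v$sqlarea
    where parsing_schema_name = 'HR'
    order by executions desc
)
where rownum <= 10
/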

Benefits of the Thick Database Paradigm

When measured against the non-functional requirements for our application, the Thick Database approach ticks all of the boxes.

Accuracy

Referential Integrity in the Data Model means that we can prevent incorrect data from being stored.
The flexibility of PL/SQL and it’s close coupling with SQL means that we can easily implement business rules to ensure system accuracy.
By implementing a XAPI layer in PL/SQL, we ensure that there is a single point of entry into the application. Because business transactions always execute the same code, we can ensure that the results of those transactions are repeatable, and accurate.

Performance

As we have seen, a well-defined Data Model allows the CBO to choose the optimum execution plan for each query.
The use of bind variables ensures that frequently executed statements are cached in memory.
The fact that most of the processing happens inside the database engine means that network latency is minimized as a performance overhead.
By its very nature, any application that manipulates and stores data will increase the amount of data it handles over time.
This increase in data volumes will start to affect performance.
Oracle is designed and optimized to handle data stored in relational structures. Having a properly defined data model will enable you to maximise the effectiveness of the tuning tools at your disposal.

Maintainability

Having your application code in a single location ( i.e. the PL/SQL XAPI layer) means that code is not replicated across multiple application layers.
As PL/SQL is tightly coupled with SQL, it also means that you tend to need fewer lines of code to implement application functionality.
Having the application code in the database means that dependency tracking comes “for free” by means of the Data Dictionary.
This is especially handy when doing Impact Analysis on any application changes that may be required down the line.

Security

PL/SQL parameters are bound auto-magically. Unless you’re being careless with some dynamic SQL inside of the PL/SQL code itself, these parameters are pretty much immune to SQL Injection.

Still feeling sceptical after reading that ? Good. Whilst I have provided some evidence to support these assertions, it’s not what you’d call incontrovertible.
But I’m getting ahead of myself. Before summarising, I did say that there may be some circumstances where this approach may not be suitable…

When the Thick Database Paradigm may not be appropriate

By its very nature, the Thick Database approach on the Oracle RDBMS puts an Application smack in the middle of an Oracle “walled garden”.
If you ever want to migrate to another RDBMS, the task is unlikely to be straight forward.
Yes, PostgreSQL’s PL/pgSQL is similar in nature to PL/SQL. As I’ve never attempted a migration from Oracle to Postgres, I can’t comment on whether this lessens the effort required.

So, if you’re in a situation where you know that your application will need to move to another RDBMS in the short term, the pain of sub-optimal performance on Oracle may be worth the gain when you come to do the migration.
A word of warning here – I have personal experience of applications that were only supposed to be on Oracle for six months after Go-live…and are still in Production several years later.

Alternatively, you may be a software vendor who needs to support your application across multiple database platforms.
The benefit of having a single code base for all supported platforms may outweigh the overhead of the additional effort required to address the issues that will almost certainly arise when running a Database Agnostic application on an Oracle RDBMS.
If you do find yourself in this situation then you may consider recommending a database other than Oracle to your clients.

It is worth pointing out however, that in either case, a well-designed physical data model where Referential Integrity is enforced by means of constraints will provide substantial mitigation to some of the performance issues you may encounter.

This is certainly not going to help with an application using the Entity-Attribute-Value (EAV) model.
I would suggest that if EAV is absolutely essential to your solution then a Relational Database almost certainly isn’t.

Summary and Further Reading

If you’ve made it this far, I hope that you have at least been persuaded that the Thick Database Paradigm is not a completely bonkers way of writing an application against an Oracle database.
That’s not to say that you’re sold on the idea by any means. As I’ve said already, what I’ve attempted to do here is provide some illustrations as to why this approach is preferred among Oracle Developers. It’s not cast-iron proof that this is the case with your specific application.
What you’ll probably want to do now is read up a bit more on this approach following which, you may well want to do some testing to see if all of these claims stack up.

So, if you want some proper, in-depth technical discussions on the Thick Database Paradigm, these links may be of some use :

If and when you do come to do some testing, it’s important to remember that the benefits of the Thick Database approach – certainly in performance terms – become more apparent the greater the volume of data and transactions the application needs to handle.
Running performance tests against the tiny HR application that I’ve used here is probably not going to tell you too much.


Filed under: Oracle, PL/SQL, SQL Tagged: CBO, database agnostice, dbms_xplan, Ref cursors, Referential Integrity, thick database paradigm, USER_DEPENDENCIES

Null is Odd…or Things I Used to Know about SQL Aggregate Functions

Mon, 2016-05-16 15:03

Brendan McCullum recently played his final Test for New Zealand.
That’s something of an understatement. In his last game he made a century in a mere 54 balls, a feat unmatched in 139 years of test cricket.
From the outside looking in, it seemed that McCullum had come to realise something he’d always known. Playing cricket is supposed to be fun.
What’s more, you can consider yourself quite fortunate if you get paid for doing something you enjoy, especially when that something is hitting a ball with a stick.

With the help of Mr McCullum, what follows will serve to remind me of something I’ve always known but may forget from time to time.
In my case, it’s the fact that NULL is odd. This is especially true when it comes to basic SQL aggregation functions.

Some test data

We’ve got a simple table that holds the number of runs scored by McCullum in each of his Test innings together with a nullable value to indicate whether or not he was dismissed in that innings.

This is relevant because one of the things we’re going to do is calculate his batting average.

In Cricket, the formula for this is :

Runs Scored / (Innings Batted – Times Not Out)

Anyway, here’s the table :

create table mccullum_inns
(
    score number not null,
    not_out number
)
/

…and the data…

insert into mccullum_inns( score,not_out) values (57, null);
insert into mccullum_inns( score,not_out) values (19, 1);
insert into mccullum_inns( score,not_out) values (13, null);
insert into mccullum_inns( score,not_out) values (55, null);
insert into mccullum_inns( score,not_out) values (3, null);
insert into mccullum_inns( score,not_out) values (5, null);
insert into mccullum_inns( score,not_out) values (96, null);
insert into mccullum_inns( score,not_out) values (54, null);
insert into mccullum_inns( score,not_out) values (20, null);
insert into mccullum_inns( score,not_out) values (21, null);
insert into mccullum_inns( score,not_out) values (4, null);
insert into mccullum_inns( score,not_out) values (143, null);
insert into mccullum_inns( score,not_out) values (17, 1);
insert into mccullum_inns( score,not_out) values (10, null);
insert into mccullum_inns( score,not_out) values (8, null);
insert into mccullum_inns( score,not_out) values (10, null);
insert into mccullum_inns( score,not_out) values (36, null);
insert into mccullum_inns( score,not_out) values (29, null);
insert into mccullum_inns( score,not_out) values (24, null);
insert into mccullum_inns( score,not_out) values (3, null);
insert into mccullum_inns( score,not_out) values (25, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (99, null);
insert into mccullum_inns( score,not_out) values (7, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (111, null);
insert into mccullum_inns( score,not_out) values (24, null);
insert into mccullum_inns( score,not_out) values (19, null);
insert into mccullum_inns( score,not_out) values (74, null);
insert into mccullum_inns( score,not_out) values (23, null);
insert into mccullum_inns( score,not_out) values (31, null);
insert into mccullum_inns( score,not_out) values (33, null);
insert into mccullum_inns( score,not_out) values (5, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (5, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (14, 1);
insert into mccullum_inns( score,not_out) values (43, null);
insert into mccullum_inns( score,not_out) values (17, null);
insert into mccullum_inns( score,not_out) values (9, null);
insert into mccullum_inns( score,not_out) values (26, null);
insert into mccullum_inns( score,not_out) values (13, null);
insert into mccullum_inns( score,not_out) values (21, null);
insert into mccullum_inns( score,not_out) values (7, null);
insert into mccullum_inns( score,not_out) values (40, null);
insert into mccullum_inns( score,not_out) values (51, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (25, null);
insert into mccullum_inns( score,not_out) values (85, null);
insert into mccullum_inns( score,not_out) values (9, null);
insert into mccullum_inns( score,not_out) values (42, null);
insert into mccullum_inns( score,not_out) values (97, null);
insert into mccullum_inns( score,not_out) values (24, null);
insert into mccullum_inns( score,not_out) values (11, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (9, null);
insert into mccullum_inns( score,not_out) values (71, null);
insert into mccullum_inns( score,not_out) values (25, null);
insert into mccullum_inns( score,not_out) values (2, null);
insert into mccullum_inns( score,not_out) values (66, null);
insert into mccullum_inns( score,not_out) values (8, null);
insert into mccullum_inns( score,not_out) values (3, null);
insert into mccullum_inns( score,not_out) values (30, null);
insert into mccullum_inns( score,not_out) values (84, 1);
insert into mccullum_inns( score,not_out) values (25, null);
insert into mccullum_inns( score,not_out) values (31, null);
insert into mccullum_inns( score,not_out) values (19, null);
insert into mccullum_inns( score,not_out) values (3, null);
insert into mccullum_inns( score,not_out) values (84, null);
insert into mccullum_inns( score,not_out) values (115, null);
insert into mccullum_inns( score,not_out) values (24, null);
insert into mccullum_inns( score,not_out) values (6, null);
insert into mccullum_inns( score,not_out) values (1, null);
insert into mccullum_inns( score,not_out) values (29, null);
insert into mccullum_inns( score,not_out) values (18, null);
insert into mccullum_inns( score,not_out) values (13, null);
insert into mccullum_inns( score,not_out) values (78, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (24, null);
insert into mccullum_inns( score,not_out) values (89, null);
insert into mccullum_inns( score,not_out) values (185, null);
insert into mccullum_inns( score,not_out) values (19, 1);
insert into mccullum_inns( score,not_out) values (24, null);
insert into mccullum_inns( score,not_out) values (104, null);
insert into mccullum_inns( score,not_out) values (5, null);
insert into mccullum_inns( score,not_out) values (51, null);
insert into mccullum_inns( score,not_out) values (65, null);
insert into mccullum_inns( score,not_out) values (11, 1);
insert into mccullum_inns( score,not_out) values (4, null);
insert into mccullum_inns( score,not_out) values (225, null);
insert into mccullum_inns( score,not_out) values (40, null);
insert into mccullum_inns( score,not_out) values (25, null);
insert into mccullum_inns( score,not_out) values (56, null);
insert into mccullum_inns( score,not_out) values (35, null);
insert into mccullum_inns( score,not_out) values (2, null);
insert into mccullum_inns( score,not_out) values (64, null);
insert into mccullum_inns( score,not_out) values (14, null);
insert into mccullum_inns( score,not_out) values (11, null);
insert into mccullum_inns( score,not_out) values (34, null);
insert into mccullum_inns( score,not_out) values (1, null);
insert into mccullum_inns( score,not_out) values (16, null);
insert into mccullum_inns( score,not_out) values (12, null);
insert into mccullum_inns( score,not_out) values (83, null);
insert into mccullum_inns( score,not_out) values (48, null);
insert into mccullum_inns( score,not_out) values (58, 1);
insert into mccullum_inns( score,not_out) values (61, null);
insert into mccullum_inns( score,not_out) values (5, null);
insert into mccullum_inns( score,not_out) values (31, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (25, null);
insert into mccullum_inns( score,not_out) values (84, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (19, null);
insert into mccullum_inns( score,not_out) values (22, null);
insert into mccullum_inns( score,not_out) values (42, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (23, null);
insert into mccullum_inns( score,not_out) values (68, null);
insert into mccullum_inns( score,not_out) values (13, null);
insert into mccullum_inns( score,not_out) values (4, null);
insert into mccullum_inns( score,not_out) values (35, null);
insert into mccullum_inns( score,not_out) values (7, null);
insert into mccullum_inns( score,not_out) values (51, null);
insert into mccullum_inns( score,not_out) values (13, null);
insert into mccullum_inns( score,not_out) values (11, null);
insert into mccullum_inns( score,not_out) values (74, null);
insert into mccullum_inns( score,not_out) values (69, null);
insert into mccullum_inns( score,not_out) values (38, null);
insert into mccullum_inns( score,not_out) values (67, 1);
insert into mccullum_inns( score,not_out) values (2, null);
insert into mccullum_inns( score,not_out) values (8, null);
insert into mccullum_inns( score,not_out) values (20, null);
insert into mccullum_inns( score,not_out) values (1, null);
insert into mccullum_inns( score,not_out) values (21, null);
insert into mccullum_inns( score,not_out) values (22, null);
insert into mccullum_inns( score,not_out) values (11, null);
insert into mccullum_inns( score,not_out) values (113, null);
insert into mccullum_inns( score,not_out) values (9, null);
insert into mccullum_inns( score,not_out) values (37, null);
insert into mccullum_inns( score,not_out) values (12, null);
insert into mccullum_inns( score,not_out) values (224, null);
insert into mccullum_inns( score,not_out) values (1, null);
insert into mccullum_inns( score,not_out) values (8, null);
insert into mccullum_inns( score,not_out) values (302, null);
insert into mccullum_inns( score,not_out) values (7, null);
insert into mccullum_inns( score,not_out) values (17, null);
insert into mccullum_inns( score,not_out) values (4, null);
insert into mccullum_inns( score,not_out) values (3, null);
insert into mccullum_inns( score,not_out) values (31, null);
insert into mccullum_inns( score,not_out) values (25, null);
insert into mccullum_inns( score,not_out) values (18, null);
insert into mccullum_inns( score,not_out) values (39, null);
insert into mccullum_inns( score,not_out) values (43, null);
insert into mccullum_inns( score,not_out) values (45, null);
insert into mccullum_inns( score,not_out) values (202, null);
insert into mccullum_inns( score,not_out) values (195, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (22, null);
insert into mccullum_inns( score,not_out) values (42, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (41, null);
insert into mccullum_inns( score,not_out) values (55, null);
insert into mccullum_inns( score,not_out) values (6, null);
insert into mccullum_inns( score,not_out) values (80, null);
insert into mccullum_inns( score,not_out) values (27, null);
insert into mccullum_inns( score,not_out) values (4, null);
insert into mccullum_inns( score,not_out) values (20, null);
insert into mccullum_inns( score,not_out) values (75, null);
insert into mccullum_inns( score,not_out) values (17, 1);
insert into mccullum_inns( score,not_out) values (18, null);
insert into mccullum_inns( score,not_out) values (18, null);
insert into mccullum_inns( score,not_out) values (0, null);
insert into mccullum_inns( score,not_out) values (10, null);
insert into mccullum_inns( score,not_out) values (145, null);
insert into mccullum_inns( score,not_out) values (25, null);
commit;

I’ve loaded this into my Oracle 11g Express Edition (XE) database.

Don’t count on COUNT()

Let’s just check the number of rows in the table :

select count(*), count(score), count(not_out)
from mccullum_inns
/

  COUNT(*) COUNT(SCORE) COUNT(NOT_OUT) 
---------- ------------ --------------
       176          176             9
       

Hmmm, that’s interesting. Whilst there are 176 rows in the table, a count of the NOT_OUT column only returns 9, which is the number of rows with a non-null value in this column.

The fact is that COUNT(*) behaves a bit differently from COUNT(some_column)…

with stick as
(
    select 1 as ball from dual
    union all select 2 from dual
    union all select null from dual
)    
select count(*), count(ball)
from stick
/

COUNT(*)                             COUNT(BALL)
---------- ---------------------------------------
         3                                       2

Tanel Poder provides the explanation as to why this happens here.
Due to this difference in behaviour, you may well consider that COUNT(*) is a completely different function from COUNT(column), at least where NULLs are concerned.
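
One practical upshot of this, sticking with the MCCULLUM_INNS table above, is that the difference between the two counts tells you how many rows hold a NULL in that column :

select count(*) - count(not_out) as innings_dismissed
from mccullum_inns
/

INNINGS_DISMISSED
-----------------
              167

…167 being the number of innings in which McCullum was actually dismissed, a figure we’ll be needing when we come to calculate the average.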

When all else fails, Read the Manual

From very early on, database developers learn to be wary of columns that may contain null values and code accordingly, making frequent use of the NVL function.
However, aggregate functions can prove to be something of a blind spot. This can lead to some interesting results.
Whilst we know (and can prove) that NULL + anything equals NULL…

select 3 + 1 + 4 + 1 + null as ball
from dual
/

     BALL
----------
         

…if we use an aggregate function…

with stick as 
(
    select 3 as ball from dual
    union all select 1 from dual
    union all select 4 from dual
    union all select 1 from dual
    union all select null from dual
)
select sum(ball) 
from stick
/

SUM(BALL)
----------
         9

…the NULL is simply ignored. So, calculating an average may well lead to some confusion…

with stick as 
(
    select 3 as ball from dual
    union all select 1 from dual
    union all select 4 from dual
    union all select 1 from dual
    union all select null from dual
)
select avg(ball)
from stick
/

AVG(BALL)
----------
      2.25

…2.25 being 9 divided by the four non-null values rather than all five rows, which is not what we would expect given :

with stick as 
(
    select 3 as ball from dual
    union all select 1 from dual
    union all select 4 from dual
    union all select 1 from dual
    union all select null from dual
)
select sum(ball)/count(*) as Average
from stick
/

   AVERAGE
----------
       1.8 

You can see similar behaviour with the MAX and MIN functions :

with stick as 
(
    select 3 as ball from dual
    union all select 1 from dual
    union all select 4 from dual
    union all select 1 from dual
    union all select null from dual
)
select max(ball), min(ball)
from stick
/

 MAX(BALL)  MIN(BALL)
---------- ----------
         4          1

Looking at the documentation, we can see that :

“All aggregate functions except COUNT(*), GROUPING, and GROUPING_ID ignore nulls. You can use the NVL function in the argument to an aggregate function to substitute a value for a null. COUNT and REGR_COUNT never return null, but return either a number or zero. For all the remaining aggregate functions, if the data set contains no rows, or contains only rows with nulls as arguments to the aggregate function, then the function returns null.”
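
To see the documentation’s NVL suggestion in action, here’s a quick sketch using the same values as before. Substituting zero for the NULL brings the average back into line with the figure we calculated by hand :

with stick as 
(
    select 3 as ball from dual
    union all select 1 from dual
    union all select 4 from dual
    union all select 1 from dual
    union all select null from dual
)
select avg(nvl(ball, 0)) as average
from stick
/

   AVERAGE
----------
       1.8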

So, if we want our aggregate functions to behave themselves, or at least behave as we might expect, we need to account for situations where the column on which they are operating may be null.
Returning to COUNT…

select count(nvl(not_out, 0)) 
from mccullum_inns
/

                 COUNT(NVL(NOT_OUT,0))
---------------------------------------
                                    176
                    

Going back to our original task, i.e. finding McCullum’s final batting average, we could do this :

select count(*) as Inns, 
    sum(score) as Runs,
    sum(nvl(not_out,0)) as "Not Outs",
    round(sum(score)/(count(*) - count(not_out)),2) as Average
from mccullum_inns
/

INNS  RUNS   Not Outs AVERAGE
----- ----- ---------- -------
  176  6453          9   38.64
  

However, now that we’ve re-learned how NULLs are treated by aggregate functions, we can save ourselves a bit of typing…

select count(*) as Inns, 
    sum(score) as Runs,
    count(not_out) as "Not Outs",
    round(sum(score)/(count(*) - count(not_out)),2) as Average
from mccullum_inns
/

INNS  RUNS   Not Outs AVERAGE
----- ----- ---------- -------
  176  6453          9   38.64
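
Incidentally, AVG on its own is no shortcut here. Just as a sanity check against the same table, AVG(SCORE) divides the total runs by all 176 innings, giving the arithmetic mean per innings rather than the cricketing average :

select round(avg(score), 2) as mean_per_innings
from mccullum_inns
/

MEAN_PER_INNINGS
----------------
           36.66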

Time to draw stumps.


Filed under: Oracle, SQL Tagged: avg, count(*), max, min, null and aggregate functions, NVL, sum
