PeopleSoft Public users are not required to authenticate (sign on). These are generic accounts created for specific purposes, for example, informational pages or company directories. Public users are also not subject to session inactivity timeouts. Because no authentication is required, no sensitive data should be accessible to these users. It also goes without saying that if you don’t need Public accounts, don’t use them.
When performing a PeopleSoft security audit, Integrigy identifies Public users and analyzes their authorization privileges. To do this yourself, use the SQL below to list your public users and then query the application or database to look at their authorization privileges.
--List the public users
SELECT O.OPRID,
       O.OPRDEFNDESC,
       O.ACCTLOCK,
       O.LASTPSWDCHANGE,
       O.FAILEDLOGINS,
       O.ENCRYPTED,
       O.EMPLID
FROM   SYSADM.PSWEBPROFILE P,
       SYSADM.PSOPRDEFN O
WHERE  P.BYPASSSIGNON = 'Y'
AND    P.DEFAULTUSERID = O.OPRID;
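For example, here is a sketch of a follow-on query to list the roles and permission lists granted to each public user. It assumes the standard PeopleTools security tables PSROLEUSER and PSROLECLASS; verify the table and column names against your PeopleTools release.

--List roles and permission lists granted to the public users (sketch)
SELECT O.OPRID,
       RU.ROLENAME,
       RC.CLASSID
FROM   SYSADM.PSWEBPROFILE P,
       SYSADM.PSOPRDEFN O,
       SYSADM.PSROLEUSER RU,
       SYSADM.PSROLECLASS RC
WHERE  P.BYPASSSIGNON = 'Y'
AND    P.DEFAULTUSERID = O.OPRID
AND    RU.ROLEUSER = O.OPRID
AND    RC.ROLENAME = RU.ROLENAME
ORDER BY O.OPRID, RU.ROLENAME, RC.CLASSID;

Any role or permission list returned here should be reviewed to confirm it grants access to nothing beyond the public pages these accounts were created for.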
If you have questions, please contact us at email@example.com
Michael A. Miller, CISSP-ISSMP, CCSP

References
- A new command line interface for launching and controlling compute
- A new way to back up instances with snapshots was introduced
- Shared cloud storage via NFSv4 was made generally available
- Security Lists were changed a little
- Updating private machine images has changed
- Orchestrations were updated
- Resizing instances has been updated
- Backing up storage volumes with snapshots was introduced
- Compute Instance High Availability was updated with Monitor as an option
- From the compute console, you can select different domains for your instance for high availability and to reduce latency
- You can use GET /instanceconsole/name, and updates were made to the monitoring API, to help troubleshoot booting issues
- End Points for VPN and Compute were updated to reflect changes in instances
Warning: this is based on the Apex 5.1 Early Adopter and details may change.
The standard File Upload item type is getting a nice little upgrade in Apex 5.1. By simply changing attributes on the item, you can allow users to select multiple files (from a single directory) at the same time.
In addition, you can now restrict the type of file they may choose, according to the MIME type of the file, e.g. image/jpg. This file type restriction can use a wildcard, e.g. image/*, and can have multiple patterns separated by commas, e.g. image/*,application/pdf.
Normally, to access the file that was uploaded you would query APEX_APPLICATION_TEMP_FILES with a predicate like name = :P1_FILE_ITEM. If multiple files are allowed, however, the item will be set to a comma-delimited list of names, so the suggested code to get the files is:
declare
    arr apex_application_global.vc_arr2;
begin
    -- the item holds a comma-delimited list of the uploaded file names
    arr := apex_util.string_to_table(:P1_MULTIPLE_FILES);
    for i in 1..arr.count loop
        select t.whatever
        into   your_variable
        from   apex_application_temp_files t
        where  t.name = arr(i);
    end loop;
end;
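For instance, here is a minimal sketch of copying each uploaded file into your own table. The target table MY_UPLOADED_FILES and its columns are hypothetical; FILENAME, MIME_TYPE and BLOB_CONTENT are columns of APEX_APPLICATION_TEMP_FILES.

declare
    arr apex_application_global.vc_arr2;
begin
    arr := apex_util.string_to_table(:P1_MULTIPLE_FILES);
    for i in 1..arr.count loop
        -- copy each uploaded file into a (hypothetical) application table
        insert into my_uploaded_files (filename, mime_type, file_content)
        select t.filename, t.mime_type, t.blob_content
        from   apex_application_temp_files t
        where  t.name = arr(i);
    end loop;
end;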
You can play with a simple demo here: https://apexea.oracle.com/pls/apex/f?p=UPLOAD_DEMO&cs=JK64 (login as demo / demodemo).
If you want to support drag-and-drop, image copy&paste, load large files asynchronously, or restrict the maximum file size that may be uploaded, you will probably want to consider a plugin instead, like Daniel Hochleitner’s DropZone.
In the aftermath of the Brexit vote, I’m probably not alone in being a little confused.
Political discourse in the UK has focused on exactly who it was who voted to Leave.
The Youth Spokesperson you get on a lot of political programs right now will talk accusingly of older voters “ruining” their future by opting to Leave.
Other shocked Remainers will put it down to people without a University Degree.
I’m not sure where that leaves me as someone who is ever so slightly over the age of 30, does not have a degree…and voted to Remain. I’m pretty sure I’m not Scottish…unless there’s some dark family secret my parents haven’t let me in on.
I suppose I must be a member of the “Metropolitan Elite” the Leave side was always muttering darkly about.
After all, I do pay a great deal of money to be driven from my country residence to London to work every day…although I do have to share the train with the odd one or two fellow elitists who’ve made the same extravagant choice.
This does of course assume that Milton Keynes qualifies as being “in the country” and that my living there is a matter of choice rather than a question of being able to afford living any closer to London.
With all the excrement…er…excitement of the Referendum Campaign and its aftermath, I somehow never got around to writing my application to track the progress of the Euros (or the Copa America for that matter).
Whenever a major football tournament comes around, I always resolve to do this, if only to evoke memories of my youth when a large part of my bedroom wall was taken up with a World Cup Wallchart where you could fill in the results as they happened. That’s without mentioning the months leading up to the tournament and trying to complete the Panini collection – the only time you’d ever hear a conversation such as “OK, I’ll let you have Zico in exchange for Mick Mills”.
In order to prevent this happening again, I’ve resolved to write an application capable of holding details of any major international football tournament.
In the course of writing this application, I’d like to take the opportunity to have a look at an aspect of PL/SQL development that maybe isn’t as commonly used as it should be – Unit Testing.
Over the next few weeks, I plan to take a look at some of the Testing Frameworks available for PL/SQL and see how they compare.
The objective here is not so much to find which framework is the best/most suitable, but to perform an objective comparison between them using the same set of tests which implement fairly commonly encountered functionality.
If you’re looking for recommendations for a framework, then this article by Jacek Gebal is probably a good place to start.
In this post, I’ll be outlining the functionality that I’ll be testing in the form of User Stories, together with the application data model (or at least, the bit of it I need to execute the tests).
I’ll also have a look at the common pattern that tests written in these frameworks tend to follow.
Just to highlight why using a Test Framework might be useful, I’ll also script a couple of simple tests in SQL to see just how much code you have to write to implement tests without using a framework.
Taking the purist approach to Test-Driven Development, we’d need to :
- Write the test first and ensure that it fails
- Write the minimum amount of code required for the test to pass
- Run the test to make sure it passes
Additionally, we’d need to make sure that the tests were independent of each other – that the execution of one test is not dependent on the successful execution of a prior test.
Following this approach to the letter would cause one or two issues.
Firstly, if the procedure you’re testing does not exist, your test will not run and fail. It will error.
As the point of this step is, essentially, to ensure that your test code is sound (i.e. it won’t pass irrespective of the code it runs against), this is not what we’re after.
The second issue is specific to PL/SQL.
When defining PL/SQL procedures that interact with database tables, it’s usually a good idea to use anchored declarations where appropriate.
Even if we write a stub procedure, if the tables it will interact with do not exist, we’d have to use native types for our parameters and update the signature of the procedure once the tables had been created.
There is always the danger that this additional step would be missed.
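To illustrate, using the COMPETITIONS table that appears later in this post (the signatures below are just for comparison, not part of the application) :

-- anchored declaration - the parameter type follows the table column,
-- but this will not compile until COMPETITIONS exists...
procedure add_competition( i_code competitions.comp_code%type );

-- ...whereas a native type compiles without the table, leaving you to
-- remember to re-anchor the signature once the table has been created
procedure add_competition( i_code varchar2 );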
So, in terms of PL/SQL then, I’d suggest that the pre-requisites for writing our test are :
- The data model components (tables, views, RI constraints) that the procedure will interact with
- A stub procedure with correctly typed parameters
Many testing frameworks seem to adopt four basic steps for each test, some of which are optional. They are :
- Setup – put the application into a known state from which the test can run
- Execute – run the code that you want to test
- Verify – check that what you expected to happen did happen
- Teardown – return the system to the state it was in before this test was run
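Sketched as a trivial, self-contained anonymous block (the “code under test” here is just a stand-in query), the pattern looks something like this :

declare
    -- Setup : put things into a known state
    l_expected number := 1;
    l_actual   number;
    l_result   varchar2(4);
begin
    -- Execute : run the code under test
    select 1 into l_actual from dual;
    -- Verify : check that what we expected to happen did happen
    l_result := case when l_actual = l_expected then 'PASS' else 'FAIL' end;
    dbms_output.put_line('My test : '||l_result);
    -- Teardown : return the system to the state it was in before the test
    rollback;
end;
/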
This is the general pattern that I’ll be following for my tests.

The Stories to Test
My story finding is complete and the backlog has been populated.
The stories selected for Sprint 1 are on the board. OK, they’ve been chosen so that they cover some of the more common scenarios that we might need to test.
The stories are :
- Add a competition – tests insert into a table with a Primary Key
- Add a Tournament – insert into a table with a Foreign Key constraint
- Remove Tournament – delete from a table
- Remove Competitions – delete from a Parent table
- Edit a Tournament – update of a record in a table
- View Tournaments by Competition – select multiple records using an in/out ref cursor
- Bulk Upload Competitions – insert records in bulk using LOG ERRORS
- Add a Team to a Tournament – insert a record using a synthetic foreign key
Each of these stories has multiple Acceptance Criteria.
It’s worth noting that, as some of the functionality (i.e. the data integrity) is implemented in the data model (Primary Keys, Foreign Keys etc.), the Acceptance Criteria for these stories need to cover this as well as the functionality implemented in the PL/SQL code itself.
I’ve taken the approach that, for a story to be Sprint Ready, the Data Model to support it must already be in place.
Currently, the data model looks like this :
The DDL to create the application owner is :
create user footie identified by password
/
alter user footie default tablespace USERS
/
grant create session, create table, create procedure, create sequence to footie
/
alter user footie quota unlimited on users
/
grant read, write on directory my_files to footie
/
…where password is the password you want to give this user.
Note that a directory called MY_FILES already exists in my database.
The DDL to create the Data Model includes the COMPETITIONS table…
create table competitions (
    comp_code varchar2(5) constraint comp_pk primary key,
    comp_name varchar2(50) not null,
    description varchar2(4000)
)
/
comment on table competitions is 'International Football competitions for which tournament data can be added'
/
comment on column competitions.comp_code is 'Internal code to uniquely identify this competition'
/
comment on column competitions.comp_name is 'The name of the competition'
/
comment on column competitions.description is 'A description of the competition'
/
…an external table to facilitate the bulk upload of COMPETITIONS…
create table competitions_xt (
    comp_code varchar2(5),
    comp_name varchar2(50),
    description varchar2(4000)
)
    organization external (
        type oracle_loader
        default directory my_files
        access parameters (
            records delimited by newline
            skip 1
            fields terminated by ','
            badfile 'competitions.bad'
            logfile 'competitions.log'
            (
                comp_code char(5),
                comp_name char(50),
                description char(4000)
            )
        )
        location('competitions.csv')
    )
reject limit unlimited
/
….the TOURNAMENTS table…
create table tournaments (
    id number constraint tourn_pk primary key,
    comp_code varchar2(5),
    year_completed number(4) not null,
    host_nation varchar2(100),
    year_started number(4),
    number_of_teams number(3) not null,
    constraint tourn_uk unique( comp_code, year_completed, host_nation)
)
/
comment on table tournaments is 'A (finals) tournament of an International Football competition. Table alias is tourn'
/
comment on column tournaments.id is 'Synthetic PK for the table as the Natural Key includes host_nation, which may be null. Values taken from sequence tourn_id_seq'
/
comment on column tournaments.comp_code is 'The Competition that this tournament was part of (e.g. World Cup). Part of the Unique Key. FK to COMPETITIONS(comp_code)'
/
comment on column tournaments.year_completed is 'The year in which the last match of this tournament took place. Mandatory. Part of Unique Key'
/
comment on column tournaments.host_nation is 'The nation where the tournament was played (if a finals tournament). Part of Unique Key but is optional'
/
comment on column tournaments.year_started is 'The year in which the first match was played (if applicable). Cannot be later than the value in YEAR_COMPLETED'
/
comment on column tournaments.number_of_teams is 'The number of teams taking part in the tournament'
/
…the TOURNAMENT_TEAMS table…
create table tournament_teams (
    tourn_id number,
    team_name varchar2(100),
    group_stage varchar2(2),
    constraint team_pk primary key (tourn_id, team_name)
)
/
comment on table tournament_teams is 'Teams participating in the Tournament. Alias is TEAM'
/
comment on column tournament_teams.tourn_id is 'The ID of the tournament the team is participating in. Foreign Key to TOURNAMENTS(id).'
/
comment on column tournament_teams.team_name is 'The name of the participating team'
/
comment on column tournament_teams.group_stage is 'If the tournament has an initial group stage, the group identifier that the team is drawn in'
/
…this being 11g, a sequence to provide the values for the TOURNAMENTS.ID synthetic key (in 12c you can define this as part of the table DDL)…
create sequence tourn_id_seq
/
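For comparison, here is a 12c-style sketch where the key is generated by an identity column declared in the table DDL itself (not used here, as the target is 11g) :

-- 12c only - an identity column would replace the separate sequence
create table tournaments (
    id number generated always as identity constraint tourn_pk primary key
    -- ...remaining columns and constraints as in the 11g version above
)
/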
… a Foreign Key constraint from TOURNAMENTS to COMPETITIONS….
alter table tournaments
    add constraint tourn_comp_fk foreign key (comp_code)
    references competitions(comp_code)
/
…and a Foreign Key from TOURNAMENT_TEAMS to TOURNAMENTS…
alter table tournament_teams
    add constraint team_tourn_fk foreign key (tourn_id)
    references tournaments(id)
/
I’ve kept the Foreign Keys in separate files to make the initial deployment of the application simpler. By doing this, I can create the tables in any order without worrying about RI constraints. I can then add these as a separate step after all of the tables have been created.
The tables’ non-RI constraints (Primary, Unique Keys, Not Null constraints etc.) are created along with the table.
One other point to note is that I know there are one or two issues with the first-cut of the DDL above. This is so that I can see how well the tests I write highlight these issues.
As we know, before we begin writing a test, we’ll need to have a stub procedure for it to compile against.
The first of these is :
create or replace package footie.manage_competitions as
    procedure add_competition(
        i_code footie.competitions.comp_code%type,
        i_name footie.competitions.comp_name%type,
        i_desc footie.competitions.description%type);
end manage_competitions;
/

create or replace package body footie.manage_competitions as
    procedure add_competition(
        i_code footie.competitions.comp_code%type,
        i_name footie.competitions.comp_name%type,
        i_desc footie.competitions.description%type)
    is
    begin
        null;
    end add_competition;
end manage_competitions;
/

Scripting the Tests
I want to write a test script for my first story – Add a Competition.
There are two Acceptance Criteria that I need to test :
- A new competition is added
- A competition cannot be added more than once
That’s pretty simple, so the test should be fairly straight-forward. Using a SQL script, the first test would probably look something like this :
set serveroutput on size unlimited
declare
    l_result varchar2(4);
    l_err_msg varchar2(4000) := null;
    l_rec_count pls_integer;

    l_code footie.competitions.comp_code%type;
    l_name footie.competitions.comp_name%type := 'UT1 Comp';
    l_desc footie.competitions.description%type := 'Test';
    l_counter pls_integer := 1;
begin
    -- Setup - make sure that the competition we're using for the test does not already exist
    select 'UT'||numeral
    into l_code
    from
    (
        select max( to_number( substr( comp_code, regexp_instr( comp_code, '[[:digit:]]')))) + 1 as numeral
        from footie.competitions
        where comp_code like 'UT%'
        and regexp_instr( substr( comp_code, -1, 1), '[[:digit:]]') = 1
        union
        select 1 from dual
    )
    where numeral is not null;

    -- execute - nested block so that we can handle/report any exceptions
    begin
        footie.manage_competitions.add_competition(l_code, l_name, l_desc);
        l_result := 'PASS';
    exception
        when others then
            l_err_msg := dbms_utility.format_error_backtrace;
            l_result := 'FAIL';
    end; -- execute block

    -- validate
    if l_result = 'PASS' then
        select count(*)
        into l_rec_count
        from footie.competitions
        where comp_code = l_code
        and comp_name = l_name
        and description = l_desc;

        if l_rec_count != 1 then
            l_result := 'FAIL';
            l_err_msg := 'Record not added';
        end if;
    end if;

    -- teardown
    rollback;

    -- Display Result
    dbms_output.put_line('Add Competition : '||l_result);
    if l_result = 'FAIL' then
        dbms_output.put_line(l_err_msg);
    end if;
end;
/
If we run this, we’d expect it to fail, as things stand :
PL/SQL procedure successfully completed.

Add Competition : FAIL
Record not added
Our second test will probably look like this :
set serveroutput on size unlimited
declare
    l_result varchar2(4);
    l_err_msg varchar2(4000) := null;
    l_err_code number;
    l_rec_count pls_integer;

    l_code footie.competitions.comp_code%type := 'UT2';
    l_name footie.competitions.comp_name%type := 'UT2 Comp';
    l_desc footie.competitions.description%type := null;
    l_counter pls_integer := 1;
begin
    -- Setup - make sure that the competition we're using for the test exists
    merge into footie.competitions
    using dual
    on (comp_code = l_code)
    when not matched then
        insert( comp_code, comp_name, description)
        values( l_code, l_name, l_desc);

    -- execute - nested block so that we can handle/report any exceptions
    begin
        footie.manage_competitions.add_competition(l_code, l_name, l_desc);
        l_result := 'FAIL';
    exception
        when others then
            l_err_msg := dbms_utility.format_error_backtrace;
            l_result := 'PASS';
    end; -- execute block

    -- validate
    if l_result = 'PASS' then
        select count(*)
        into l_rec_count
        from footie.competitions
        where comp_code = l_code
        and comp_name = l_name
        and nvl(description, 'X') = nvl(l_desc, 'X');

        if l_rec_count > 1 then
            l_result := 'FAIL';
            l_err_msg := 'Duplicate record has been added';
        end if;
    end if;

    -- teardown
    rollback;

    -- Display Result
    dbms_output.put_line('Add Competition : '||l_result);
    if l_result = 'FAIL' then
        dbms_output.put_line(l_err_msg);
    end if;
end;
/
That looks rather similar to our first test. Furthermore, the two test scripts combined add up to quite a lot of code.
At least when we run it, it fails as expected…
PL/SQL procedure successfully completed.

Add Competition : FAIL
The next step is to write the code to pass the tests…
create or replace package body manage_competitions as
    procedure add_competition(
        i_code competitions.comp_code%type,
        i_name competitions.comp_name%type,
        i_desc competitions.description%type default null)
    is
    begin
        insert into competitions( comp_code, comp_name, description)
        values( i_code, i_name, i_desc);
    end add_competition;
end manage_competitions;
/
Now when we run our tests we get …
PL/SQL procedure successfully completed.

Add Competition : PASS

PL/SQL procedure successfully completed.

Add Competition : PASS
Now, I could carry on writing SQL scripts for Acceptance Criteria for all of the other stories in the sprint, but I think you get the idea.
Writing tests this way requires a large amount of code, much of which is being replicated in separate tests.
Of course, you could go down the route of moving some of your repeating test routines into PL/SQL packages and deploying your test code along with your application code to non-production environments.
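For example, even something as simple as a shared results-reporting routine would remove a chunk of the repetition. The package below is a hypothetical sketch, not part of the application :

create or replace package test_utils as
    procedure report_result(
        i_test_name in varchar2,
        i_passed    in boolean,
        i_err_msg   in varchar2 default null);
end test_utils;
/

create or replace package body test_utils as
    procedure report_result(
        i_test_name in varchar2,
        i_passed    in boolean,
        i_err_msg   in varchar2 default null)
    is
    begin
        -- print a PASS/FAIL line, plus any error message on failure
        dbms_output.put_line(i_test_name||' : '||case when i_passed then 'PASS' else 'FAIL' end);
        if not i_passed and i_err_msg is not null then
            dbms_output.put_line(i_err_msg);
        end if;
    end report_result;
end test_utils;
/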
Before going to all of that extra effort though, it’s probably worth checking to see if there’s a framework out there that can help reduce the testing burden.
Fortunately, I’ve got a couple of years until the next major international tournament, so I’ll be taking some time to do just that.
So, tune in next week (or sometime soon after) when I’ll be taking a look at the first of these frameworks… SQLDeveloper Unit Testing
During my recent visit to Databricks, I of course talked a lot about technology — largely with Reynold Xin, but a bit with Ion Stoica as well. Spark 2.0 is just coming out now, and of course has a lot of enhancements. At a high level:
- Using the new terminology, Spark originally assumed users had data engineering skills, but Spark 2.0 is designed to be friendly to data scientists.
- A lot of this is via a focus on simplified APIs, based on DataFrames.
- Unlike similarly named APIs in R and Python, Spark DataFrames work with nested data.
- Machine learning and Spark Streaming both work with Spark DataFrames.
- There are lots of performance improvements as well, some substantial. Spark is still young enough that Bottleneck Whack-A-Mole yields huge benefits, especially in the SparkSQL area.
- SQL coverage is of course improved. For example, SparkSQL can now perform all TPC-DS queries.
The majority of Databricks’ development efforts, however, are specific to its cloud service, rather than being donated to Apache for the Spark project. Some of the details are NDA, but it seems fair to mention at least:
- Databricks’ notebooks feature for organizing and launching machine learning processes and so on is a biggie. Jupyter is an open source analog.
- Databricks has been working on security, and even on the associated certifications.
Two of the technical initiatives Reynold told me about seemed particularly cool. One, on the machine learning side, was a focus on training models online as new data streams in. In most cases this seems to require new algorithms for old model types, with a core idea being that the algorithm does a mini gradient descent for each new data point.
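As a rough sketch of that idea (my notation, not anything Databricks has published): each arriving example $(x_t, y_t)$ triggers a single stochastic-gradient-style update of the model weights,

$$ w_{t+1} = w_t - \eta_t \, \nabla \ell(w_t; x_t, y_t) $$

where $\ell$ is the per-example loss and $\eta_t$ is a (possibly decaying) learning rate.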
The other cool idea fits the trend of alternatives to the “lambda architecture”. Under the name “structured streaming”, which seems to be a replacement for “DStreams”, the idea is to do set-based SQL processing even though membership of the set changes over time. Result sets are extracted on a snapshot basis; you can keep either all the results from each snapshot query or just the deltas.
Despite all this, there’s some non-trivial dissatisfaction with Spark, fair or otherwise.
- Some of the reason is that SparkSQL is too immature to be great.
- Some is annoyance that Databricks isn’t putting everything it has into open source.
- Some is that everything has its architectural trade-offs.
To the last point, I raised one of the biggest specifics with Reynold, namely Spark’s lack of a strong built-in data persistence capability. Reynold’s answer was that they’re always working to speed up reading and writing from other forms of persistent storage. E.g., he cited a figure of ~100 million rows/core/second decoded from Parquet.
I visited Databricks in early July to chat with Ion Stoica and Reynold Xin. Spark also comes up in a large fraction of the conversations I have. So let’s do some catch-up on Databricks and Spark. In a nutshell:
- Spark is indeed the replacement for Hadoop MapReduce.
- Spark is becoming the default platform for machine learning.
- SparkSQL (née Shark) is puttering along predictably.
- Databricks reports good success in its core business of cloud-based machine learning support.
- Spark Streaming has strong adoption, but its position is at risk.
- Databricks, the original authority on Spark, is not keeping a tight grip on that role.
I shall explain below. I also am posting separately about Spark evolution, especially Spark 2.0. I’ll also talk a bit in that post about Databricks’ proprietary/closed-source technology.
Spark is the replacement for Hadoop MapReduce.
This point is so obvious that I don’t know what to say in its support. The trend is happening, as originally decreed by Cloudera (and me), among others. People are rightly fed up with the limitations of MapReduce, and — niches perhaps aside — there are no serious alternatives other than Spark.
The greatest use for Spark seems to be the same as the canonical first use for MapReduce: data transformation. Also in line with the Spark/MapReduce analogy:
- Data-transformation-only use cases are important, but they don’t dominate.
- Most other use cases typically have a data transformation element as well …
- … which has to be started before any other work can be done.
And so it seems likely that, at least for as long as Spark is growing rapidly, data transformation will appear to be the biggest Spark use case.
Spark is becoming the default platform for machine learning.
Largely, this is a corollary of:
- The previous point.
- The fact that Spark was originally designed with machine learning as its principal use case.
To do machine learning you need two things in your software:
- A collection of algorithms. Spark, I gather, is one of numerous good alternatives there.
- Support for machine learning workflows. That’s where Spark evidently stands alone.
And thus I have conversations like:
- “Are you doing anything with Spark?”
- “We’ve gotten more serious about machine learning, so yes.”
SparkSQL (née Shark) is puttering along.
SparkSQL is pretty much following the Hive trajectory.
- Useful from Day One as an adjunct to other kinds of processing.
- A tease and occasionally useful as a SQL engine for its own sake, but really not very good, pending years to mature.
Databricks reports good success in its core business of cloud-based machine learning support.
Databricks, to an even greater extent than I previously realized, is focused on its cloud business, for which there are well over 200 paying customers. Notes on that include:
- As you might expect based on my comments above, the majority of usage is for data transformation, but a lot of that is in anticipation of doing machine learning/predictive modeling in the near future.
- Databricks customers typically already have their data in the Amazon cloud.
- Naturally, a lot of Databricks customers are internet companies — ad tech startups and the like. Databricks also reports “strong” traction in these segments:
- Financial services (especially but not only insurance)
- Health care/pharma
- The main languages Databricks customers use are R and Python. Ion said that Python was used more on the West Coast, while R was used more in the East.
Databricks’ core marketing concept seems to be “just-in-time data platform”. I don’t know why they picked that, as opposed to something that emphasizes Spark’s flexibility and functionality.
Spark Streaming’s long-term success is not assured.
To a first approximation, things look good for Spark Streaming.
- Spark Streaming is definitely the leading companion to Kafka, and perhaps also to cloud equivalents (e.g. Amazon Kinesis).
- The “traditional” alternatives of Storm and Samza are pretty much done.
- Newer alternatives from Twitter, Confluent and Flink aren’t yet established.
- Cloudera is a big fan of Spark Streaming.
- Even if Spark Streaming were to generally decline, it might keep substantial “good enough” usage, analogously to Hive and SparkSQL.
- Cool new Spark Streaming technology is coming out.
But I’m also hearing rumbles and grumbles about Spark Streaming. What’s more, we know that Spark Streaming wasn’t a core part of Spark’s design; the use case just happened to emerge. Demanding streaming use cases typically involve a lot of short-request inserts (or updates/upserts/whatever). And if you were designing a system to handle those … would it really be based on Spark?
Databricks is not keeping a tight grip on Spark leadership.
- Databricks’ main business, as noted above, is its cloud service. That seems to be going well.
- Databricks’ secondary business is licensing stuff to Spark distributors. That doesn’t seem to amount to much; it’s too easy to go straight to the Apache distribution and bypass Databricks. No worries; this never seemed it would be a big revenue opportunity for Databricks.
At the moment, Databricks is pretty clearly the general leader of Spark. Indeed:
- If you want the story on where Spark is going, you do what I did — you ask Databricks.
- Similarly, if you’re thinking of pushing the boundaries on Spark use, and you have access to the Databricks folks, that’s who you’ll probably talk to.
- Databricks employs ~1/3 of Spark committers.
- Databricks organizes the Spark Summit.
But overall, Databricks doesn’t seem to care much about keeping Spark leadership. Its marketing efforts in that respect are minimal. Word-of-mouth buzz paints a similar picture. My direct relationship with the company gives the same impression. Oh, I’m sure Databricks would like to remain the Spark leader. But it doesn’t seem to devote much energy toward keeping the role.
Starting with my introduction to Spark, previous overview posts include those in:
I learned some newish terms on my recent trip. They’re meant to solve the problem that “data scientists” used to be folks with remarkably broad skill sets, few of whom actually existed in ideal form. So instead now it is increasingly said that:
- “Data engineers” can code, run clusters, and so on, in support of what’s always been called “data science”. Their knowledge of the math of machine learning/predictive modeling and so on may, however, be limited.
- “Data scientists” can write and run scripts on single nodes; anything more on the engineering side might strain them. But they have no-apologies skills in the areas of modeling/machine learning.
- I raised concerns about the “data science” term 4 years ago.