Yann Neuhaus

dbi services technical blog

Control File issues on duplicating with non patched Oracle version.

Wed, 2018-07-18 02:34

Introduction:

RMAN has the ability to duplicate, or clone, a database from a backup or from an active database.
It is possible to create a duplicate database on a remote server with the same file structure,
or on a remote server with a different file structure or on the local server with a different file structure.
For old, unpatched Oracle versions, such as those earlier than 11.2.0.4, a duplicate (from active database or from backup) can be a real
challenge even for DBAs with years of experience, due to the various bugs encountered.

The scenario described below focuses on control file issues revealed by a duplicate from active database on Oracle 11.2.0.2 EE.

<INFO>Make sure to use the nohup command-line utility, which allows a command, process or shell script to keep running after you log out.

Demonstration:

Step 1: Prepare your script:

vi script_duplicate.ksh

#!/bin/ksh
export ORACLE_HOME=<your_oracle_home>
export PATH=$PATH:$ORACLE_HOME/bin
rman target sys/pwd@TNS_NAME_TARGET auxiliary sys/pwd@TNS_NAME_AUXILIARY log=duplicate.log cmdfile=/home/oracle/rman_bkup.cmd

vi rman_bkup.cmd
run
{
allocate channel ch1 device type disk;
allocate channel ch2 device type disk;
allocate channel ch3 device type disk;
allocate auxiliary channel dh1 device type disk;
allocate auxiliary channel dh2 device type disk;
allocate auxiliary channel dh3 device type disk;
duplicate target database to <AUXILIARY_NAME> from active database nofilenamecheck;
release channel ch3;
release channel ch2;
release channel ch1;
}

and launch it like this: nohup ./script_duplicate.ksh &

Step 2: Check instance parameters.
Depending on the PSU level of your instance, the duplicate can fail with this error even before starting:

RMAN-00571: ===================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS
RMAN-00571: ===================================================
RMAN-03002: failure of Duplicate Db command at 11/02/2011 06:05:48
RMAN-04014: startup failed: ORA-00600: internal error code, arguments: [kck_rls_check must use (11,0,0,0,0) or lower], [kdt.c], [9576], [11.2.0.2.0], [], [], [], [], [], [], [], []
RMAN-04017: startup error description: ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
RMAN-03015: error occurred in stored script Memory Script
RMAN-04014: startup failed: ORA-00600: internal error code, arguments: [kck_rls_check must use (11,0,0,0,0) or lower], [kdt.c], [9576], [11.2.0.2.0], [], [], [], [], [], [], [], []
RMAN-04017: startup error description: ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance

According to Oracle Support note 1064264.1:

1. Edit the pfile, add parameter:
_compression_compatibility= "11.2.0"

2. Restart the instance using the pfile
SQL> startup pfile='<fullpath name of pfile>'

3. Create the SPFILE again
SQL> create spfile from pfile;

4. Restart the instance with the SPFILE
SQL> shutdown immediate;
SQL> startup

and relaunch the previous command (Step 1).
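
Before relaunching, it can be worth double-checking that the hidden parameter is really in place. Explicitly set hidden parameters normally show up in V$PARAMETER, so a simple query like this one should return the value we just added:

SQL> SELECT name, value FROM v$parameter WHERE name = '_compression_compatibility';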

Step 3: Control file issue when trying to open the database.
After transferring the datafiles, the duplicate will crash with the following errors while trying to open the database:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 15/07/2018 17:39:30
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script



SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-19838: Cannot use this control file to open database

Basically, this is because of a known bug (Bug 11063122 in 11gR2): the control file created during the duplicate in 11gR2 stores the redo log file locations of the primary.
We need to recreate the control file, changing the locations of the redo log files and datafiles, and open the database with RESETLOGS.
In the control file recreation script, the database name is the source <db_name> and the directory names for the redo logs still point to the source database.

The workaround is:

1. Back up your control file to trace (on the cloned DB):

SQL> alter database backup controlfile to trace;

2. Open the generated trace file and extract the RESETLOGS section, modifying it like this:

CREATE CONTROLFILE REUSE DATABASE "<src_db_name>" RESETLOGS  ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 11680
LOGFILE
  GROUP 9  '<path_of_the_cloned_DB>/redo09.log'  SIZE 150M BLOCKSIZE 512,
  GROUP 10 '<path_of_the_cloned_DB>/redo10.log'  SIZE 150M BLOCKSIZE 512,
  GROUP 11 '<path_of_the_cloned_DB>/redo11.log'  SIZE 150M BLOCKSIZE 512

DATAFILE
  '<path_of_the_cloned_DB>/system01.dbf',
  '<path_of_the_cloned_DB>/undotbs01.dbf',
  '<path_of_the_cloned_DB>/sysaux01.dbf',
  '<path_of_the_cloned_DB>/users01.dbf',
-------------more datafiles
CHARACTER SET EE8ISO8859P2;

Save as trace_control.ctl

3. SQL> alter system set db_name=<new db_name> scope=spfile;
4. SQL> startup nomount
5. SQL>@trace_control.ctl
      --control file created and multiplexed in all the destinations mentioned on your spfile 
6. SQL> alter database open resetlogs

<INFO>If your source DB had activity during the duplicate process, you may have to apply some required archivelogs manually.

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 15/07/2018 19:21:30
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/u01/oradata/DBName/system01.dbf'

Search the source database for the archivelogs with sequence# greater than or equal to 399747 and apply them manually on the target DB.
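
On the source database, a query like the following can help to locate them (standard V$ARCHIVED_LOG view; 399747 is the sequence number reported by the recovery):

SQL> SELECT sequence#, name, first_time
     FROM   v$archived_log
     WHERE  thread# = 1
     AND    sequence# >= 399747
     ORDER BY sequence#;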

If somehow those are not available you need to take an incremental backup to roll forward your cloned database.

7. SQL> recover database using backup controlfile;

ORA-00279: change 47260162325 generated at  15/07/2018 19:27:40 needed for thread 1
ORA-00289: suggestion : <path>o1_mf_1_399747_%u_.arc
ORA-00280: change 47260162325 for thread 1 is in sequence #399747

Once the required archivelog files have been applied, try again to open your database:

RMAN> alter database open resetlogs;

database opened

RMAN> exit

Conclusion:
If you are the kind of Oracle administrator who has the power to approve or deny, you must know how dangerous it is to run your applications on unpatched Oracle databases.
The data within your organization is better protected if you take advantage of the patches issued by Oracle and run your production data on supported Oracle versions only.

 


Restarting a failed transportable tablespace metadata import

Tue, 2018-07-17 09:39

I’m currently working on a project to migrate a data warehouse database from Solaris to Linux Intel (an endian change). We use the cross-platform incremental backup method described in My Oracle Support Note 1389592.1 for that.
I.e. incremental backups are applied to the database files, and during the migration the recovered datafiles are attached to the target database via the transportable tablespace method. When testing the transportable tablespace metadata import I got an error for one table:


Processing object type TRANSPORTABLE_EXPORT/TABLE
ORA-39083: Object type TABLE:"DWHT"."TDWHTAB" failed to create with error:
ORA-01843: not a valid month
Failing sql is:
CREATE TABLE "DWHT"."TDWHTAB" (...,"PROCESSING_TM" DATE DEFAULT '01-01-0001' NOT NULL ENABLE,...

I.e. the metadata of all tables in the tablespace was imported successfully except for one table. The reason for the error was a default definition of column “PROCESSING_TM” without a TO_DATE cast, i.e. it relied on the NLS_DATE_FORMAT setting. E.g.


SQL> create table t1 ("PROCESSING_TM" DATE DEFAULT '01-01-0001' NOT NULL ENABLE);
create table t1 ("PROCESSING_TM" DATE DEFAULT '01-01-0001' NOT NULL ENABLE)
  *
ERROR at line 1:
ORA-01843: not a valid month
 
SQL> alter session set nls_date_format='dd-mm-yyyy';
 
Session altered.
 
SQL> create table t1 ("PROCESSING_TM" DATE DEFAULT '01-01-0001' NOT NULL ENABLE);
 
Table created.

There are different reasons why a metadata import may fail for specific tables (see e.g. My Oracle Support Note 1082116.1 for another one). The question was: how do we restart the import without going through the whole process of copying the datafiles from the source database again? As we imported the metadata through a network link, the tablespace on the source database and the tablespace on the target database were still set to “READ ONLY”.

So first of all I fixed the error in the table definition on the source database:


alter table "DWHT"."TDWHTAB" modify ("PROCESSING_TM" DATE DEFAULT TO_DATE('01-01-0001', 'DD-MM-YYYY'));

That was possible even with the associated tablespace of the table in READ ONLY mode, as it is just a dictionary metadata update and does not touch the tablespace itself.
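
To double-check that the tablespace was still READ ONLY on both sides before restarting, a simple query like this one can be used (standard DBA_TABLESPACES view, DWH_DATA being the tablespace of this migration):

SELECT tablespace_name, status
FROM   dba_tablespaces
WHERE  tablespace_name = 'DWH_DATA';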

On the target database I cleaned up the tablespace of the failed table metadata import:


drop tablespace DWH_DATA including contents;

It’s of course important to NOT specify “AND DATAFILES” when dropping the tablespace to avoid having to copy all files again.

In my case I actually transported 146 tablespaces and modified the impdp parameter file xttplugin.par to contain only the single tablespace which contained the failing table. Of course you have to drop and include all tablespaces which have tables with dependencies on each other. In my case that was not an issue as my tables had no dependencies on tables in other tablespaces.

Afterwards I could just restart my metadata-import with the single tablespace:


$ cat xttplugin2.par
directory=DATA_PUMP_DIR
logfile=tts_imp2.log
network_link=ttslink
transport_full_check=no
transport_tablespaces=DWH_DATA
transport_datafiles='/u01/DWH/oradata/DWH_DATA_001.dbf','/u01/DWH/oradata/DWH_DATA_002.dbf'
$ impdp parfile=./xttplugin2.par
Password: / as sysdba

So to restart a failed transportable tablespace metadata import, just fix the root cause, drop the associated tablespace on the target database without dropping the datafiles, and restart the import. That comes in handy especially in cross-platform incremental backup migration scenarios.

 


Google Cloud Spanner – no decimal numeric data types

Mon, 2018-07-16 10:51

Google Cloud Spanner is a distributed relational database focused on scalability without compromising consistency and integrity. It is available only as a managed service in Google Cloud. Basically, the idea is to keep the scalability advantages of NoSQL databases (like Bigtable) while adding transactions, relational tables, SQL, structured data,… as in the relational databases we have loved for decades.
The commercial pitch includes all the NoSQL buzzwords, with the addition of the legacy properties of SQL databases:
Cloud Spanner is a fully managed, mission-critical, relational database service that offers transactional consistency at global scale, schemas, SQL (ANSI 2011 with extensions), and automatic, synchronous replication for high availability.
Here I’m testing something that is not mentioned, but is taken for granted with all SQL databases: the ability to add numbers without erroneous arithmetic results.

It is easy to test on the Google Cloud (which offers 1 year trials) by creating an instance:
[Screenshot: creating a Cloud Spanner instance]

Then create a Spanner database:
[Screenshot: creating a Spanner database]

And create a table:
[Screenshot: creating a table]

The table creation can also be done with the SQL CREATE TABLE statement. Here I'm testing one of the most important features of SQL databases: the numeric datatypes. This is where humans and computers do not speak the same language: humans have 10 fingers, whereas computers deal only with binary digits. Human numbers are decimal. Computer numbers are binary.

It seems that Google Spanner is binary only. According to the documentation, the only numeric types are:

  • INT64 for signed integers up to 9E18
  • FLOAT64 for floating point IEEE-754

So, there are no decimal datatypes and decimal values will be approximated by binary values. This is ok to store computer numbers, but not human numbers such as prices, salaries,…

In order to show the problem I’ve created a table with FLOAT64:

CREATE TABLE NUMBERS (
ID INT64 NOT NULL,
AMOUNT FLOAT64,
) PRIMARY KEY (ID)

The SQL Query interface does not allow DML other than SELECT:
DML not supported

So we can use the API or this simple form from the ‘data’ tab:
[Screenshot: inserting rows from the ‘data’ tab]

I’ve added 10 rows with ‘0.1’ which is easy to represent in decimal arithmetic, but not in binary arithmetic. Look at the sum:
[Screenshot: the sum over the 10 rows]

This is binary arithmetic applied to decimal numbers: approximation. You can select each row and see ‘0.1’, but when you sum all 10 rows together, you get less than 1. That's probably close enough for some ‘Big Data’ usage, but accountants will not like it.
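
The behaviour is not specific to Spanner, it is simply IEEE-754 binary floating point arithmetic. As an illustration (here with PostgreSQL double precision, which is also IEEE-754 binary, not with Spanner), the sum of ten times 0.1 also ends up slightly below 1:

select sum(amount) as sum_amount, sum(amount) < 1 as less_than_one
from (select 0.1::float8 as amount from generate_series(1,10)) t;
-- less_than_one is true; you may need SET extra_float_digits = 3 to display all digits of the sum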

If you wonder why it takes 100 milliseconds for this 10-row table, remember that this is a distributed database across 3 continents. Here is the execution plan:
[Screenshot: the execution plan]

So what?

Do not forget that all the new database trends, in the ‘micro-services’ era, are focused on specific use cases. They do not compete with the ‘old’ relational databases, which are general purpose and have integrated, version after version, all the different ways to store and process data shared by multiple applications. Those NoSQL and NewSQL databases can be considered an alternative only within the scope of what they are designed for. Spanner was designed for Google internal use in Google AdWords and then provided as a service for similar uses. It was developed to solve a specific problem: the lack of transactions in Bigtable.

Note that the Open Source alternative that is close to Google Spanner is CockroachDB which has a DECIMAL datatype to store fixed-point decimal numbers.
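
For comparison, a fixed-point decimal type gives the exact result. Again illustrated with the PostgreSQL NUMERIC type rather than with Spanner or CockroachDB:

select sum(amount)
from (select 0.1::numeric as amount from generate_series(1,10)) t;
-- returns exactly 1.0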

 


Drilling down the pgSentinel Active Session History

Sun, 2018-07-15 13:35

In pgSentinel: the sampling approach for PostgreSQL I mentioned that one of the advantages of the ASH approach is the ability to drill down from an overview of the database activity, down to the details where we can do some tuning. The idea is to always focus on the components which are relevant to our tuning goal:

  • Filter/Group by the relevant dimension to focus on what you want to tune (a program, session, query, time window,…)
  • Sort by the most active samples, to spend time only where you know you can improve significantly

The idea is to start at a high level. Here is a GROUP BY BACKEND_TYPE to show the activity of the ‘client backend’ and the ‘autovacuum worker':

select count(*), backend_type
from pg_active_session_history
where ash_time>=current_timestamp - interval '5 minutes'
group by backend_type
order by 1 desc
;
count | backend_type
-------+-------------------
1183 | client backend
89 | autovacuum worker

I selected only the last 5 minutes (the total retention is defined by pgsentinel_ash.max_entries and the sampling frequency by pgsentinel_ash.pull_frequency).

I ordered by the number of samples for each one, which gives a good idea of the proportions: most of the activity here comes from the ‘client backend’. It may be more interesting to show percentages, such as 93% of the activity from the client and 7% from the vacuum. However, this removes an interesting measure about the overall activity. The fact that we have 1183 samples within 5 minutes is an indication of the total load. In 5 minutes, we have 300 seconds, which means that each session can have at most 300 samples when being 100% active in the database during that time. 1183 samples during 5 minutes means that we have on average 1183/300 ≈ 4 sessions active. This measure, calculated as the number of samples divided by the number of seconds, and known as Average Active Sessions (AAS), gives two different pieces of information:

  • The overall activity in the database, similar to the load average at OS level
  • The relative activity of an aggregate (per session, program, event, time…)
AAS (Average Active Sessions)

In the previous post I counted the number of samples with count(distinct ash_time) because I knew that I had several sessions active during the whole time. But if there are periods of inactivity during those 5 minutes, there are no samples at all. And when drilling down to more detail, there will be some samples with no activity for a specific group. Here I calculate the number of seconds covered by the samples, using a window function:

with ash as (
select *,ceil(extract(epoch from max(ash_time)over()-min(ash_time)over()))::numeric samples
from pg_active_session_history where ash_time>=current_timestamp - interval '5 minutes'
) select round(count(*)::numeric/samples,2) as "AAS",
backend_type
from ash
group by samples,
backend_type
order by 1 desc fetch first 20 rows only
;
AAS | backend_type
-------+-------------------
3.95 | client backend
0.29 | autovacuum worker
(2 rows)

From this output, I know that I have about 4 client sessions running. This is what I want to tune.

Drill down on wait events

Adding the WAIT_EVENT_TYPE to the GROUP BY, I can have more detail about the resources used by those sessions:

with ash as (
select *,ceil(extract(epoch from max(ash_time)over()-min(ash_time)over()))::numeric samples
from pg_active_session_history where ash_time>=current_timestamp - interval '5 minutes'
) select round(count(*)::numeric/samples,2) as "AAS",
backend_type,wait_event_type
from ash
group by samples,
backend_type,wait_event_type
order by 1 desc fetch first 20 rows only
;
AAS | backend_type | wait_event_type
-------+-------------------+-----------------
2.57 | client backend | IO
0.94 | client backend | CPU
0.45 | client backend | LWLock
0.16 | autovacuum worker | CPU
0.12 | autovacuum worker | IO
0.00 | autovacuum worker | LWLock
(6 rows)

This gives a better idea about which system component may be tuned to improve the response time or the throughput. IO is the major component here, with 2.57 AAS being on an I/O call. Let's get more information about which kind of I/O.

Drilling down to the wait event:

with ash as (
select *,ceil(extract(epoch from max(ash_time)over()-min(ash_time)over()))::numeric samples
from pg_active_session_history where ash_time>=current_timestamp - interval '5 minutes'
) select round(count(*)::numeric/samples,2) as "AAS",
backend_type,wait_event_type,wait_event
from ash
group by samples,
backend_type,wait_event_type,wait_event
order by 1 desc fetch first 20 rows only
;
AAS | backend_type | wait_event_type | wait_event
-------+-------------------+-----------------+------------------
1.52 | client backend | IO | DataFileWrite
0.94 | client backend | CPU | CPU
0.46 | client backend | IO | DataFileExtend
0.41 | client backend | IO | DataFileRead
0.33 | client backend | LWLock | WALWriteLock
0.15 | autovacuum worker | CPU | CPU
0.12 | client backend | LWLock | buffer_mapping
0.10 | autovacuum worker | IO | DataFileRead
0.08 | client backend | IO | WALInitWrite
0.08 | client backend | IO | BufFileWrite
0.02 | client backend | IO | WALWrite
0.01 | autovacuum worker | IO | DataFileWrite
0.01 | client backend | IO | DataFilePrefetch
0.00 | client backend | LWLock | buffer_content
0.00 | autovacuum worker | LWLock | buffer_mapping
(15 rows)

This gives more information. Of the average 2.57 sessions active on IO, 1.52 are actually writing, 0.41 are reading, and 0.46 are waiting for a datafile to be extended. That helps to focus on the areas where we might improve the performance, without wasting time on events which are only a small part of the session activity.

Drill-down on queries

This was a drill-down on the system axis (wait events are system call instrumentation). This is useful when we think something is wrong on the system or the storage. But performance tuning must also drive the investigation on the application axis. The highest level is the user call, the TOP_LEVEL_QUERY:

with ash as (
select *,ceil(extract(epoch from max(ash_time)over()-min(ash_time)over()))::numeric samples
from pg_active_session_history where ash_time>=current_timestamp - interval '5 minutes'
) select round(count(*)::numeric/samples,2) as "AAS",
backend_type,top_level_query
from ash
group by samples,
backend_type,top_level_query
order by 1 desc fetch first 20 rows only
;
AAS | backend_type | top_level_query
-------+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
0.95 | client backend | SELECT * FROM mypgio('pgio3', 50, 3000, 131072, 255, 8);
0.95 | client backend | SELECT * FROM mypgio('pgio2', 50, 3000, 131072, 255, 8);
0.95 | client backend | SELECT * FROM mypgio('pgio4', 50, 3000, 131072, 255, 8);
0.95 | client backend | SELECT * FROM mypgio('pgio1', 50, 3000, 131072, 255, 8);
0.25 | autovacuum worker | autovacuum: VACUUM ANALYZE public.pgio2
0.02 | client backend | commit;
0.01 | client backend | select * from pg_active_session_history where pid=21837 order by ash_time desc fetch first 1 rows only;
0.01 | client backend | with ash as ( +
| | select *,ceil(extract(epoch from max(ash_time)over()-min(ash_time)over()))::numeric samples +
| | from pg_active_session_history where ash_time>=current_timestamp - interval '5 minutes' +
...

Here I see 4 user calls responsible for most of the 4 active sessions related to the ‘client backend’, each one with AAS=0.95, and this is actually what is running: the PGIO benchmark (see https://kevinclosson.net/) with 4 sessions calling the mypgio function.

The function we see in TOP_LEVEL_QUERY is itself running some queries, and the big advantage of the pgSentinel extension, over pg_stat_activity, is the capture of the actual statement running, with the actual values of the parameters:

with ash as (
select *,ceil(extract(epoch from max(ash_time)over()-min(ash_time)over()))::numeric samples
from pg_active_session_history where ash_time>=current_timestamp - interval '5 minutes'
) select round(count(*)::numeric/samples,2) as "AAS",
backend_type,substr(query,1,100)
from ash
group by samples,
backend_type,substr(query,1,100)
order by 1 desc fetch first 20 rows only
;
AAS | backend_type | substr
-------+-------------------+----------------------------------------------------------------------------------------
0.26 | autovacuum worker |
0.02 | client backend | commit
0.02 | client backend | SELECT sum(scratch) FROM pgio1 WHERE mykey BETWEEN 3567 AND 3822
0.01 | client backend | SELECT sum(scratch) FROM pgio4 WHERE mykey BETWEEN 5729 AND 5984
0.01 | client backend | SELECT sum(scratch) FROM pgio4 WHERE mykey BETWEEN 5245 AND 5500
0.01 | client backend | truncate table l_ash.ps
0.01 | client backend | SELECT sum(scratch) FROM pgio1 WHERE mykey BETWEEN 3249 AND 3504
0.01 | client backend | SELECT sum(scratch) FROM pgio1 WHERE mykey BETWEEN 57 AND 312
0.01 | client backend | UPDATE pgio4 SET scratch = scratch + 1 WHERE mykey BETWEEN 3712 AND 3720
0.01 | client backend | SELECT sum(scratch) FROM pgio2 WHERE mykey BETWEEN 1267 AND 1522
0.01 | client backend | SELECT sum(scratch) FROM pgio1 WHERE mykey BETWEEN 703 AND 958
0.01 | client backend | SELECT sum(scratch) FROM pgio2 WHERE mykey BETWEEN 2025 AND 2280
0.01 | client backend | insert into l_ash.ps_diff +
| | select ps1.pid,ps1.uname,ps1.pr,ps1.ni,ps1.virt,ps1.res,ps1.shr,ps1.s,ps1.
0.01 | client backend | UPDATE pgio4 SET scratch = scratch + 1 WHERE mykey BETWEEN 2690 AND 2698
0.01 | client backend | SELECT sum(scratch) FROM pgio3 WHERE mykey BETWEEN 5463 AND 5718
0.01 | client backend | SELECT sum(scratch) FROM pgio4 WHERE mykey BETWEEN 1467 AND 1722
0.01 | client backend | SELECT sum(scratch) FROM pgio1 WHERE mykey BETWEEN 4653 AND 4908
(20 rows)

Here, no session is at the top. We have a few samples for each execution. This is because each execution is different (different values for the parameters) and they have a balanced execution time. If we had one query being longer with one specific set of parameter values, it would show up at the top here.

Finally, we can also aggregate at a higher level than QUERY with QUERYID, which is per prepared statement and does not change when executing with different parameter values. If we want to get the text, we can join with PG_STAT_STATEMENTS:

with ash as (
select *,datid dbid,ceil(extract(epoch from max(ash_time)over()-min(ash_time)over()))::numeric samples
from pg_active_session_history where ash_time>=current_timestamp - interval '5 minutes'
) select round(count(*)::numeric/samples,2) as "AAS",dbid,
backend_type,queryid,pg_stat_statements.query
from ash left outer join pg_stat_statements using(dbid,queryid)
group by samples,dbid,
backend_type,queryid,pg_stat_statements.query
order by 1 desc fetch first 15 rows only
;
AAS | dbid | backend_type | queryid | query
-------+-------+----------------+------------+------------------------------------------------------------------------------------------------------
0.89 | 17487 | client backend | 837728477 | SELECT sum(scratch) FROM pgio2 WHERE mykey BETWEEN 100926 AND 101181
0.70 | 17487 | client backend | 3411884874 | SELECT sum(scratch) FROM pgio4 WHERE mykey BETWEEN $1 AND $2
0.68 | 17487 | client backend | 1046864277 | SELECT sum(scratch) FROM pgio3 WHERE mykey BETWEEN 1591 AND 1846
0.67 | 17487 | client backend | 2994234299 | SELECT sum(scratch) FROM pgio1 WHERE mykey BETWEEN $1 AND $2
0.33 | 17487 | client backend | 1648177216 | UPDATE pgio1 SET scratch = scratch + 1 WHERE mykey BETWEEN 2582 AND 2590
0.32 | 17487 | client backend | 3381000939 | UPDATE pgio3 SET scratch = scratch + $1 WHERE mykey BETWEEN $2 AND $3
0.30 | 17487 | client backend | 1109524376 | UPDATE pgio4 SET scratch = scratch + 1 WHERE mykey BETWEEN 5462 AND 5470
0.11 | 17487 | client backend | 3355133240 | UPDATE pgio2 SET scratch = scratch + $1 WHERE mykey BETWEEN $2 AND $3
0.05 | 17547 | client backend | 2771355107 | update l_ash.parameters set value=now(),timestamp=now() where name=$1
0.05 | 17547 | client backend | 1235869898 | update l_ash.parameters set value=$1,timestamp=now() where name=$2
0.02 | 13806 | client backend | 935474258 | select * from pg_active_session_history where pid=$1 order by ash_time desc fetch first $2 rows only
0.01 | 13806 | client backend | 164740364 | with ash as ( +

This shows the main queries running: SELECT and UPDATE on PGIO1, PGIO2, PGIO3 and PGIO4. They run with different parameter values but have the same QUERYID. It seems that PG_STAT_STATEMENTS is not very consistent when capturing the query text: some entries show the parameter placeholders, others show actual values. But you must know that those are the prepared statements. We do not have 0.89 average sessions running ‘SELECT sum(scratch) FROM pgio2 WHERE mykey BETWEEN 100926 AND 101181’. This is ‘SELECT sum(scratch) FROM pgio2’ running with different parameter values, and for whatever reason the PG_STAT_STATEMENTS extension displays one specific set of values rather than ‘BETWEEN $1 AND $2’.

Time dimension

Of course we can also query all samples and drill down with a graphical tool. For the time axis, this is a better visualization. Here is a quick Excel PivotChart from those 5 minutes of samples:
[Chart: active sessions over time, broken down by wait event]
I always have 4 sessions running, as we have seen in the average, but the wait event detail is not uniform along the timeline. This is where you will drill down on the time axis. This can be helpful to investigate a short-duration issue, or to try to understand non-uniform response times. For example, coming from Oracle, I'm not used to this pattern where, from one second to the next, the wait profile is completely different. This is probably because of all the background activity such as vacuum, WAL, syncing buffers to disk, garbage collection,… The workload here, PGIO, the SLOB method for PostgreSQL, consists of short uniform queries. It would be interesting to have some statistics about the response time variation.

Note that in this database cluster, in addition to the PGIO workload, I have a small application running and committing very small changes occasionally, and this is why you see the peaks with 1 session on WALWrite and 4 sessions waiting on WALWriteLock. This adds to the chaos of waits.
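
If you prefer to stay in SQL rather than exporting to a spreadsheet, a minimal sketch of such a time-series query could look like this (same pg_active_session_history view as above, one row per second and per wait event type):

select date_trunc('second', ash_time) as second,
       wait_event_type,
       count(*) as active_sessions
from pg_active_session_history
where ash_time>=current_timestamp - interval '5 minutes'
group by 1,2
order by 1,2;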

This extension providing active session sampling is only the first component of pgSentinel, so do not spend too much time building queries, reports and graphs on this; let's see what will come next with pgSentinel:

pgSentinel is in progress….@postgresql @amplifypostgres @PostgreSQLFR @BertrandDrouvot @ckikof pic.twitter.com/Pwq8vB69MI

— pgSentinel (@Pg_Sentinel) July 11, 2018

 


PGDay Amsterdam – follow up 2 – Where do null values go to in a hash partitioned table?

Fri, 2018-07-13 23:17

This is the second follow-up, which covers this question: when you hash partition a table in PostgreSQL 11, where do NULL values for the partitioning column go? Let's go…

In the demo I used this little table:

postgres=# select version();
                                                            version                                                          
-----------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)
postgres=# create table part2 ( a int, list varchar(10) ) partition by hash (a);
CREATE TABLE
postgres=# create table part2_1 partition of part2 FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE
postgres=# create table part2_2 partition of part2 FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE
postgres=# create table part2_3 partition of part2 FOR VALUES WITH (MODULUS 3, REMAINDER 2);
CREATE TABLE
postgres=# \d+ part2
                                          Table "public.part2"
 Column |         Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+-----------------------+-----------+----------+---------+----------+--------------+-------------
 a      | integer               |           |          |         | plain    |              | 
 list   | character varying(10) |           |          |         | extended |              | 
Partition key: HASH (a)
Partitions: part2_1 FOR VALUES WITH (modulus 3, remainder 0),
            part2_2 FOR VALUES WITH (modulus 3, remainder 1),
            part2_3 FOR VALUES WITH (modulus 3, remainder 2)

The data we played with was this:

postgres=# insert into part2 (a,list) values (1,'beer');
INSERT 0 1
postgres=# insert into part2 (a,list) values (2,'whine');
INSERT 0 1
postgres=# insert into part2 (a,list) values (3,'schnaps');
INSERT 0 1
postgres=# select * from only part2_1;
 a | list  
---+-------
 2 | whine
(1 row)

postgres=# select * from only part2_2;
 a |  list   
---+---------
 3 | schnaps
(1 row)

postgres=# select * from only part2_3;
 a | list 
---+------
 1 | beer
(1 row)

We have the data evenly distributed over the three partitions. When we insert a row which contains a NULL value for the column we partitioned on:

postgres=# insert into part2 (a,list) values (null,'cocktail');
INSERT 0 1

… where does that row go?

postgres=# select * from only part2_1;
 a |   list   
---+----------
 2 | whine
   | cocktail
(2 rows)

postgres=# select * from only part2_2;
 a |  list   
---+---------
 3 | schnaps
(1 row)

postgres=# select * from only part2_3;
 a | list 
---+------
 1 | beer
(1 row)

It goes to the first partition, and every following row with a NULL key goes there as well:

postgres=# insert into part2 (a,list) values (null,'rum');
INSERT 0 1
postgres=# select * from only part2_1;
 a |   list   
---+----------
 2 | whine
   | cocktail
   | rum
(3 rows)

I couldn’t find anything in the documentation about that, so I sent a mail to the general mailing list, and here is the answer: “The calculated hash value for the null value will be zero, therefore, it will fall to the partition having remainder zero.”
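
A quick way to verify on which partition each row ended up, without querying the partitions one by one, is to look at tableoid (just an additional check, not part of the original demo). Both NULL-key rows show up under part2_1:

postgres=# select tableoid::regclass as partition, a, list from part2;
 partition | a |   list   
-----------+---+----------
 part2_1   | 2 | whine
 part2_1   |   | cocktail
 part2_1   |   | rum
 part2_2   | 3 | schnaps
 part2_3   | 1 | beer
(5 rows)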

 


Database Vault : Rules, Rule Sets and Command Rules

Fri, 2018-07-13 14:31

In a previous blog I talked about protecting data using Realms. With Database Vault we can also protect our database against certain SQL statements. These statements can include SELECT, ALTER SYSTEM, data definition language (DDL), and data manipulation language (DML) statements.
We can do this with Command Rules. In this blog I demonstrate how we can use a Command Rule to prevent SYS from creating a new pluggable database in a multitenant environment.

Before starting the demonstration, we can see that there are some predefined Command Rules which apply to all users.

SQL> show user
USER is "C##DBV_OWNER_ROOT"
SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> SELECT COMMAND, RULE_SET_NAME FROM DVSYS.DBA_DV_COMMAND_RULE;

COMMAND              RULE_SET_NAME
-------------------- --------------------------------------------------
ALTER PROFILE        Can Maintain Accounts/Profiles
ALTER SYSTEM         Allow Fine Grained Control of System Parameters
ALTER USER           Can Maintain Own Account
CHANGE PASSWORD      Can Maintain Own Account
CREATE PROFILE       Can Maintain Accounts/Profiles
CREATE USER          Can Maintain Accounts/Profiles
DROP PROFILE         Can Maintain Accounts/Profiles
DROP USER            Can Maintain Accounts/Profiles

8 rows selected.
SQL>

Because of these default Command Rules, for example, user sys cannot create a user once Database Vault is enabled.

SQL> conn sys/root@pdb1 as sysdba
Connected.
SQL> create user myuser identified by test;
create user myuser identified by test
                                 *
ERROR at line 1:
ORA-01031: insufficient privileges

To grant a user the ability to use these commands, you can grant the user the role that the rule set checks.

SQL> SELECT PRIVILEGE FROM DBA_SYS_PRIVS WHERE GRANTEE = 'DV_ACCTMGR';

PRIVILEGE
----------------------------------------
DROP PROFILE
ALTER PROFILE
ALTER USER
CREATE PROFILE
CREATE USER
CREATE SESSION
DROP USER

7 rows selected.

SQL>

To allow sys to create a user we can grant the DV_ACCTMGR role to SYS

SQL> show user
USER is "C##DBV_ACCTMGR_ROOT"

SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL>

SQL> grant  DV_ACCTMGR to sys;

Grant succeeded.

And now SYS can create a user

SQL> conn sys/root@pdb1 as sysdba
Connected.
SQL> create user myuser identified by test;

User created.

SQL>

Before starting the demonstration let’s verify that user SYS, by default, can create a pluggable database

SQL> conn sys as sysdba
Enter password:
Connected.
SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT
SQL> create pluggable database PDB2 ADMIN USER pdb2adm IDENTIFIED BY root create_file_dest='/u01/app/oracle/oradata/DBSEC/PDB2';

Pluggable database created.

SQL>

To prevent sys from creating a pluggable database, we are first going to create a RULE. This rule will determine when the command rule will be fired.

SQL> exec DVSYS.DBMS_MACADM.CREATE_RULE(rule_name => 'MY_PDB_RULE', 
                                        rule_expr => 'SYS_CONTEXT(''USERENV'', ''SESSION_USER'') != ''SYS''');

PL/SQL procedure successfully completed.

SQL>

Afterwards we have to create a RULE SET, which is a collection of one or more rules. We can associate a rule set with a realm authorization, factor assignment, command rule, or secure application role.

SQL> exec DVSYS.DBMS_MACADM.CREATE_RULE_SET(rule_set_name => 'MY_PDB_RULESET', 
                                            description => ' About managing Pdbs', 
                                            enabled => DBMS_MACUTL.G_YES, eval_options => DBMS_MACUTL.G_RULESET_EVAL_ANY,
                                            audit_options => DBMS_MACUTL.G_RULESET_AUDIT_FAIL + DBMS_MACUTL.G_RULESET_AUDIT_SUCCESS, 
                                            fail_options => DBMS_MACUTL.G_RULESET_FAIL_SILENT, fail_message => '', 
                                            fail_code => '', 
                                            handler_options => DBMS_MACUTL.G_RULESET_HANDLER_OFF, 
                                            handler => '',
                                            is_static => FALSE);

PL/SQL procedure successfully completed.
SQL>

We then add the RULE to the RULE SET

BEGIN
DVSYS.DBMS_MACADM.ADD_RULE_TO_RULE_SET(
                                       rule_set_name => 'MY_PDB_RULESET',
                                       rule_name => 'MY_PDB_RULE');
END;
   /

PL/SQL procedure successfully completed.

And finally we create a COMMAND RULE which will prevent SYS from executing a CREATE PLUGGABLE DATABASE statement:

SQL> exec DVSYS.DBMS_MACADM.CREATE_COMMAND_RULE(command=> 'CREATE PLUGGABLE DATABASE', 
                                                rule_set_name => 'MY_PDB_RULESET', 
                                                object_owner => DBMS_ASSERT.ENQUOTE_NAME('%',FALSE), 
                                                object_name => '%',
                                                enabled => 'Y');

PL/SQL procedure successfully completed.

SQL>

And now if we try to create a PDB with SYS:

SQL> show user
USER is "SYS"
SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT
SQL>  CREATE PLUGGABLE DATABASE PDB3 ADMIN USER pdb3adm IDENTIFIED BY root create_file_dest='/u01/app/oracle/oradata/DBSEC/PDB3';
 CREATE PLUGGABLE DATABASE PDB3 ADMIN USER pdb3adm IDENTIFIED BY root create_file_dest='/u01/app/oracle/oradata/DBSEC/PDB3'
*
ERROR at line 1:
ORA-47400: Command Rule violation for CREATE PLUGGABLE DATABASE on PDB3

SQL>
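
To verify that the new Command Rule is in place, we can query the same DVSYS.DBA_DV_COMMAND_RULE view as at the beginning of this post. And if we ever need to remove it, DBMS_MACADM.DELETE_COMMAND_RULE is the counterpart of CREATE_COMMAND_RULE (the parameters below simply mirror the creation call above, adapt them to your environment):

SQL> SELECT COMMAND, RULE_SET_NAME FROM DVSYS.DBA_DV_COMMAND_RULE WHERE COMMAND = 'CREATE PLUGGABLE DATABASE';

SQL> BEGIN
       DVSYS.DBMS_MACADM.DELETE_COMMAND_RULE(
         command      => 'CREATE PLUGGABLE DATABASE',
         object_owner => DBMS_ASSERT.ENQUOTE_NAME('%',FALSE),
         object_name  => '%');
     END;
     /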
 


PGDay Amsterdam – follow up 1 – Adding columns with a default value and changing the default value right after

Fri, 2018-07-13 01:32

As always, during my talk about the PostgreSQL 11 new features in Amsterdam there were questions I could not immediately answer. The first one was this: suppose we add a column with a default value in PostgreSQL 11, what happens when we change that default afterwards? Does the table get rewritten? Do we get more than one distinct default value for that column? Here we go …

The sample table:

postgres=# select version();
                                                            version                                                            
-------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11beta1 build on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)
postgres=# create table t1 ( a int, b text );
CREATE TABLE
postgres=# insert into t1 (a,b) 
           select a.*, md5(a::text) 
             from generate_series(1,1000) a;
INSERT 0 1000

Lets add a new column with a default value:

postgres=# alter table t1 add column c text default 'aa';;
ALTER TABLE

This populates the two columns in pg_attribute as described in a previous post:

postgres=# select atthasmissing,attmissingval 
             from pg_attribute 
            where attrelid = 't1'::regclass and attname = 'c';
 atthasmissing | attmissingval 
---------------+---------------
 t             | {aa}
(1 row)

When we check for the distinct values in column “c” we should only see one result (which is “aa”):

postgres=# select c, count(*) from t1 group by c;
 c  | count 
----+-------
 aa |  1000
(1 row)

If I got the question right, the concern was: when we change the default now, do we see two results when we ask for the distinct values in column “c”? Of course not, and the table is not rewritten:

postgres=# alter table t1 alter column c set default 'bb';
ALTER TABLE
postgres=# select c, count(*) from t1 group by c;
 c  | count 
----+-------
 aa |  1000
(1 row)

postgres=# select atthasmissing,attmissingval from pg_attribute where attrelid = 't1'::regclass and attname = 'c';
 atthasmissing | attmissingval 
---------------+---------------
 t             | {aa}
(1 row)
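
One way to confirm that the table was not rewritten is to compare the relfilenode of the table before and after the ALTER TABLE … SET DEFAULT: it stays the same when no rewrite happens (a quick additional check, not part of the original demo):

postgres=# select relfilenode from pg_class where relname = 't1';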

What does that mean? For the existing rows the value is still “aa”, as that was the default when the column was added. For new rows we will get “bb”:

postgres=# \d t1
                  Table "public.t1"
 Column |  Type   | Collation | Nullable |  Default   
--------+---------+-----------+----------+------------
 a      | integer |           |          | 
 b      | text    |           |          | 
 c      | text    |           |          | 'bb'::text

postgres=# insert into t1 (a,b) values (1001,'aa');
INSERT 0 1
postgres=# select c, count(*) from t1 group by c;
 c  | count 
----+-------
 bb |     1
 aa |  1000
(2 rows)

I hope that answers the question. If not, please leave a comment.

 


What a great event – PGDay Amsterdam

Fri, 2018-07-13 00:14

PostgreSQL conferences are always cool, and this time it was in Amsterdam: PGDay Amsterdam. Besides the fact that meeting all the great people was fun again, the location was really outstanding: the TOBACCO Theater:

[Photo: the TOBACCO Theater]

Some impressions:

Here you can see Devrim preparing the opening of the event with the introduction session:
[Photo]

… and then it happened: We finally started:

[Photo]

Jan kicked off the sessions with his talk about the reasons he finally arrived in the PostgreSQL community after working for years in another one:

[Photo]

Oleksi took over to speak about ACID, transactions and much more, a great talk:

[Photo]
I had the pleasure to speak about PostgreSQL 11 to close the first sessions before the coffee break:

[Photo]

Stefanie followed with foreign data wrappers and data integration with PostgreSQL (another great one):

Selection_005

And then there was something special: you might know that Devrim has a real PostgreSQL tattoo, and that was taken as an opportunity to offer temporary tattoos to everyone, which looked like this:

[Photos]

Hans rocked the stage right after:
[Photo]

Devrim right after his talk about WAL:
[Photo]

As in Rapperswil two weeks ago Bruce closed the sessions with his talk: Will PostgreSQL live forever:
[Photo]

There have been other sessions not mentioned here, which were also great, but I didn’t ask if it was fine to publish the pictures. I could not attend the party after the event, but I am sure that was great as well. See you next year. And never forget: PostgreSQL rocks :).

 


pgSentinel: the sampling approach for PostgreSQL

Thu, 2018-07-12 17:12

Here is the first test I did with the beta of pgSentinel. This Active Session History sampling is a new approach to Postgres tuning. For people coming from Oracle, this is something that has made our lives a lot easier when optimizing database applications. Here is a quick example showing how it links together some information that is missing without this extension.

The installation of the extension is really easy (more details in Daniel’s post):

cp pgsentinel.control /usr/pgsql-10/share/extension
cp pgsentinel--1.0.sql /usr/pgsql-10/share/extension
cp pgsentinel.so /usr/pgsql-10/lib

and declare it in postgresql.conf

grep -i pgSentinel $PGDATA/postgresql.conf
 
shared_preload_libraries = 'pg_stat_statements,pgsentinel'
#pgsentinel_ash.pull_frequency = 1
#pgsentinel_ash.max_entries = 1000000

and restart:

/usr/pgsql-10/bin/pg_ctl restart

Then create the views in psql:

CREATE EXTENSION pgsentinel;

I was running PGIO (the SLOB method for PostgreSQL from Kevin Closson https://kevinclosson.net/)

Without the extension, here is what I can see about the current activity from the OS point of view, with ‘top -c':

top - 21:57:23 up 1 day, 11:22, 4 users, load average: 4.35, 4.24, 4.16
Tasks: 201 total, 2 running, 199 sleeping, 0 stopped, 0 zombie
%Cpu(s): 27.6 us, 19.0 sy, 0.0 ni, 31.0 id, 19.0 wa, 0.0 hi, 3.4 si, 0.0 st
KiB Mem : 4044424 total, 54240 free, 282220 used, 3707964 buff/cache
KiB Swap: 421884 total, 386844 free, 35040 used. 3625000 avail Mem
 
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9766 postgres 20 0 440280 160036 150328 D 50.0 4.0 10:56.63 postgres: postgres pgio [local] SELECT
9762 postgres 20 0 439940 160140 150412 D 43.8 4.0 10:55.95 postgres: postgres pgio [local] SELECT
9761 postgres 20 0 440392 160088 150312 D 37.5 4.0 10:52.29 postgres: postgres pgio [local] SELECT
9763 postgres 20 0 440280 160080 150432 R 37.5 4.0 10:41.94 postgres: postgres pgio [local] SELECT
9538 postgres 20 0 424860 144464 142956 D 6.2 3.6 0:30.79 postgres: writer process

As I described in a previous post, PostgreSQL changes the title of the process to display the current operation. This looks interesting, but it is not very detailed (only ‘SELECT’ here) and quite misleading, because here I’m running PGIO with 50% updates. The ‘SELECT’ here is the user call, not the actual SQL statement running.

We have more information from PG_STAT_ACTIVITY, but again only the top-level call is displayed, as I mentioned in a previous post:

select * from pg_stat_activity where pid=9766;
-[ RECORD 1 ]----+---------------------------------------------------------
datid | 17487
datname | pgio
pid | 9766
usesysid | 10
usename | postgres
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2018-07-12 21:28:46.539052+02
xact_start | 2018-07-12 21:28:46.542203+02
query_start | 2018-07-12 21:28:46.542203+02
state_change | 2018-07-12 21:28:46.542209+02
wait_event_type | IO
wait_event | DataFileWrite
state | active
backend_xid | 37554
backend_xmin | 37553
query | SELECT * FROM mypgio('pgio4', 50, 3000, 131072, 255, 8);
backend_type | client backend

Here, I know what the user is doing: a call to mypgio() started at 21:28:46. And I know which resources are involved on the system: DataFileWrite. But again the most important piece is missing: the link between the user call and the system resources. And you can only guess it here because you know that a SELECT does not write to datafiles. There's something hidden in the middle, which is actually an UPDATE. Of course, we can see this UPDATE in PG_STAT_STATEMENTS. But there, it will not be linked with the current activity, the mypgio() call, or the DataFileWrite wait event. And we also need some timing information to be able to see the database load over time.

Here is where the pgSentinel extension fills the gap, providing:

  • The actual query running, with the queryid which links to PG_STAT_STATEMENTS, but also the full text with all parameter values
  • multiple samples of the activity, with their timestamp information


select ash_time,pid,wait_event_type,wait_event,state,queryid,backend_type,top_level_query,query from pg_active_session_history order by ash_time desc,pid fetch first 10 rows only;
 
ash_time | pid | wait_event_type | wait_event | state | queryid | backend_type | top_level_query | query
-------------------------------+------+-----------------+---------------+--------+------------+----------------+----------------------------------------------------------+--------------------------------------------------------------------------
2018-07-12 21:57:22.991558+02 | 9761 | IO | DataFileWrite | active | 837728477 | client backend | SELECT * FROM mypgio('pgio2', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio2 WHERE mykey BETWEEN 1065 AND 1320
2018-07-12 21:57:22.991558+02 | 9762 | IO | DataFileWrite | active | 1046864277 | client backend | SELECT * FROM mypgio('pgio3', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio3 WHERE mykey BETWEEN 267 AND 522
2018-07-12 21:57:22.991558+02 | 9763 | IO | DataFileRead | active | 1648177216 | client backend | SELECT * FROM mypgio('pgio1', 50, 3000, 131072, 255, 8); | UPDATE pgio1 SET scratch = scratch + 1 WHERE mykey BETWEEN 1586 AND 1594
2018-07-12 21:57:22.991558+02 | 9766 | IO | DataFileWrite | active | 3411884874 | client backend | SELECT * FROM mypgio('pgio4', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio4 WHERE mykey BETWEEN 3870 AND 4125
2018-07-12 21:57:21.990178+02 | 9761 | CPU | CPU | active | 837728477 | client backend | SELECT * FROM mypgio('pgio2', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio2 WHERE mykey BETWEEN 13733 AND 13988
2018-07-12 21:57:21.990178+02 | 9762 | IO | DataFileRead | active | 1046864277 | client backend | SELECT * FROM mypgio('pgio3', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio3 WHERE mykey BETWEEN 4135 AND 4390
2018-07-12 21:57:21.990178+02 | 9763 | IO | DataFileWrite | active | 2994234299 | client backend | SELECT * FROM mypgio('pgio1', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio1 WHERE mykey BETWEEN 4347 AND 4602
2018-07-12 21:57:21.990178+02 | 9766 | CPU | CPU | active | 3411884874 | client backend | SELECT * FROM mypgio('pgio4', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio4 WHERE mykey BETWEEN 14423 AND 14678
2018-07-12 21:57:20.985253+02 | 9761 | IO | DataFileWrite | active | 837728477 | client backend | SELECT * FROM mypgio('pgio2', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio2 WHERE mykey BETWEEN 129 AND 384
2018-07-12 21:57:20.985253+02 | 9762 | IO | DataFileWrite | active | 1046864277 | client backend | SELECT * FROM mypgio('pgio3', 50, 3000, 131072, 255, 8); | SELECT sum(scratch) FROM pgio3 WHERE mykey BETWEEN 3313 AND 3568
(10 rows)

Everything is there. The timeline where each sample links together the user call (top_level_query), the running query (queryid and query – which is the text with parameter values), and the wait event (wait_event_type and wait_event).

Here is, on one sample, what is currently available in the beta version:

select * from pg_active_session_history where pid=9766 order by ash_time desc fetch first 1 rows only;
-[ RECORD 1 ]----+-----------------------------------------------------------------------
ash_time | 2018-07-12 21:57:23.992798+02
datid | 17487
datname | pgio
pid | 9766
usesysid | 10
usename | postgres
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2018-07-12 21:28:46.539052+02
xact_start | 2018-07-12 21:28:46.542203+02
query_start | 2018-07-12 21:28:46.542203+02
state_change | 2018-07-12 21:28:46.542209+02
wait_event_type | IO
wait_event | DataFileExtend
state | active
backend_xid | 37554
backend_xmin | 37553
top_level_query | SELECT * FROM mypgio('pgio4', 50, 3000, 131072, 255, 8);
query | UPDATE pgio4 SET scratch = scratch + 1 WHERE mykey BETWEEN 700 AND 708
queryid | 1109524376
backend_type | client backend

Then, what do we do with this? This is a fact table with many dimensions. And we can drill down on the database activity.

A quick overview of the load shows that I have, on average, 4 foreground sessions running for my user calls, and very low vacuuming activity:

postgres=# select backend_type
postgres-# ,count(*)/(select count(distinct ash_time)::float from pg_active_session_history) as load
postgres-# from pg_active_session_history
postgres-# group by backend_type
postgres-# ;
backend_type | load
-------------------+--------------------
client backend | 4.09720483938256
autovacuum worker | 0.07467667918231
(2 rows)

I’ll show in a future post how to query this view to drill down into the details. For the moment, here is a short explanation about the reason to go to a sampling approach.

Here is an abstract sequence diagram showing some typical user calls to the database. Several components are involved: CPU for the backend process or for background processes, the OS, the storage… Our tuning goal is to reduce the user call duration, and then to reduce or optimize the work done in the different layers. With the current statistics available on PostgreSQL, like PG_STAT_ACTIVITY or PG_STAT_STATEMENTS, or available from the OS (strace to measure system call duration), we have a vertical approach on the load. We can look at each component individually:
[Diagram: the vertical approach – statistics cumulated per component]

This is basically what we did on Oracle before ASH (Active Session History) was introduced in 10g, 12 years ago. The activity sampling approach takes an orthogonal point of view. Rather than cumulating statistics for each component, it looks at what happens on the system at specific points in time, across all components. We don’t have all measures (such as how many executions of a query) but only samples. However, each sample gives a complete view from the user call down to the system calls. And 1-second samples are sufficient to address any relevant activity, without taking too much space for a short retention. For each sample, we cover all layers end-to-end:
[Diagram: the horizontal approach – each sample covers all layers end-to-end]
This horizontal approach makes the link between the user calls (the user perception of the database performance) and the system resources where we can analyze and optimize. With this, we can ensure that our tuning activity always focuses on the problem (the user response time) by addressing the root cause on the right component.

 


Active session history in PostgreSQL: Say hello to pgSentinel

Thu, 2018-07-12 00:17

There is a new project, currently in beta, which aims to bring active session history (and probably more) to PostgreSQL: pgSentinel. Because PostgreSQL is highly extensible, such projects are possible and usually come as an extension. pgSentinel is no exception to that, so let's see how it can be installed. If you want to try the beta yourself, please contact the project on Twitter.

This is what I got:

postgres@pgbox:/home/postgres/beta/ [pg103] ll
total 120
-rw-r--r--. 1 postgres postgres   1108 Jul  8 22:13 pgsentinel--1.0.sql
-rw-r--r--. 1 postgres postgres    117 Jul  5 22:15 pgsentinel.control
-rwxr-xr-x. 1 postgres postgres 108000 Jul  9 11:12 pgsentinel.so
-rw-r--r--. 1 postgres postgres    623 Jul  9 11:22 readme.txt

You can already see from here that we probably need to load a library, because of the pgsentinel.so file. Let's copy the files to the correct locations, in my case:

postgres@pgbox:/home/postgres/beta/ [pg103] cp pgsentinel--1.0.sql pgsentinel.control /u01/app/postgres/product/10/db_3/share/extension/
postgres@pgbox:/home/postgres/beta/ [pg103] cp pgsentinel.so /u01/app/postgres/product/10/db_3/lib/

As I plan to run pgbench later to get some load onto the system I’ve created a separate database for installing the extension:

postgres@pgbox:/home/postgres/ [PG10] psql -c "create database bench" postgres
CREATE DATABASE
postgres@pgbox:/home/postgres/ [PG10] pgbench -i -s 10 bench

When we need to load a library, we have to tell PostgreSQL about it by setting the shared_preload_libraries parameter. As pgsentinel depends on pg_stat_statements, that library needs to be loaded as well.

postgres=# alter system set shared_preload_libraries='pg_stat_statements,pgsentinel';
ALTER SYSTEM

So once we have that set and the instance is restarted:

postgres@pgbox:/home/postgres/beta/ [PG10] pg_ctl -D $PGDATA restart -m fast

… we should see the new extension in the pg_available_extensions view:

postgres=# select * from pg_available_extensions where name = 'pgsentinel';
    name    | default_version | installed_version |        comment         
------------+-----------------+-------------------+------------------------
 pgsentinel | 1.0             |                   | active session history
(1 row)
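
We can also double-check that both libraries were really loaded at startup (just a sanity check, not part of the original steps):

postgres=# show shared_preload_libraries;
   shared_preload_libraries    
-------------------------------
 pg_stat_statements,pgsentinel
(1 row)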

Ready to install the extensions:

postgres=# create extension pg_stat_statements;
CREATE EXTENSION
postgres=# create extension pgsentinel ;
CREATE EXTENSION
postgres=# \dx
                                     List of installed extensions
        Name        | Version |   Schema   |                        Description                        
--------------------+---------+------------+-----------------------------------------------------------
 pg_stat_statements | 1.5     | public     | track execution statistics of all SQL statements executed
 pgsentinel         | 1.0     | public     | active session history
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language
(3 rows)

So what did we get? One solution is to look at the sql file:

cat pgsentinel--1.0.sql
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION pgsentinel" to load this file. \quit

CREATE FUNCTION pg_active_session_history(
    OUT ash_time timestamptz,
    OUT datid Oid,
    OUT datname text,
    OUT pid integer,
    OUT usesysid Oid,
    OUT usename text,
    OUT application_name text,
    OUT client_addr text,
    OUT client_hostname text,
    OUT client_port integer,
    OUT backend_start timestamptz,
    OUT xact_start timestamptz,
    OUT query_start timestamptz,
    OUT state_change timestamptz,
    OUT wait_event_type text,
    OUT wait_event text,
    OUT state text,
    OUT backend_xid xid,
    OUT backend_xmin xid,
    OUT top_level_query text,
    OUT query text,
    OUT queryid bigint,
    OUT backend_type text
   
)
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'pg_active_session_history'
LANGUAGE C STRICT VOLATILE PARALLEL SAFE;

-- Register a view on the function for ease of use.
CREATE VIEW pg_active_session_history AS
  SELECT * FROM pg_active_session_history();

GRANT SELECT ON pg_active_session_history TO PUBLIC;

The other solution is to ask PostgreSQL directly:

bench=# \dx+ pgsentinel 
  Objects in extension "pgsentinel"
          Object description          
--------------------------------------
 function pg_active_session_history()
 view pg_active_session_history
(2 rows)

Basically we got a function and a view over that function. Let's have a look at the view then:

postgres=# \d pg_active_session_history
                   View "public.pg_active_session_history"
      Column      |           Type           | Collation | Nullable | Default 
------------------+--------------------------+-----------+----------+---------
 ash_time         | timestamp with time zone |           |          | 
 datid            | oid                      |           |          | 
 datname          | text                     |           |          | 
 pid              | integer                  |           |          | 
 usesysid         | oid                      |           |          | 
 usename          | text                     |           |          | 
 application_name | text                     |           |          | 
 client_addr      | text                     |           |          | 
 client_hostname  | text                     |           |          | 
 client_port      | integer                  |           |          | 
 backend_start    | timestamp with time zone |           |          | 
 xact_start       | timestamp with time zone |           |          | 
 query_start      | timestamp with time zone |           |          | 
 state_change     | timestamp with time zone |           |          | 
 wait_event_type  | text                     |           |          | 
 wait_event       | text                     |           |          | 
 state            | text                     |           |          | 
 backend_xid      | xid                      |           |          | 
 backend_xmin     | xid                      |           |          | 
 top_level_query  | text                     |           |          | 
 query            | text                     |           |          | 
 queryid          | bigint                   |           |          | 
 backend_type     | text                     |           |          | 

Most of the columns are already in pg_stat_activity but there is more. Before going further, let's generate some load:

postgres@pgbox:/home/postgres/ [PG10] pgbench -c 5 -j 4 -T 60 bench 
starting vacuum...end.
transaction type: 
scaling factor: 10
query mode: simple
number of clients: 5
number of threads: 4
duration: 60 s
number of transactions actually processed: 151397
latency average = 1.982 ms
tps = 2522.898859 (including connections establishing)
tps = 2523.280694 (excluding connections establishing)

Now we should see sampled data in the pg_active_session_history view:

bench=# select ash_time,top_level_query,query,queryid,wait_event_type,wait_event from pg_active_session_history limit 10;
           ash_time            |                               top_level_query                               |                                   query                                    |  queryid   | wait_event_type |  wait_event   
-------------------------------+-----------------------------------------------------------------------------+----------------------------------------------------------------------------+------------+-----------------+---------------
 2018-07-09 14:51:48.883599+02 | create database bench                                                       | create database bench                                                      | 3222771996 | CPU             | CPU
 2018-07-09 14:52:37.291115+02 | copy pgbench_accounts from stdin                                            | copy pgbench_accounts from stdin                                           | 4164808321 | CPU             | CPU
 2018-07-09 14:52:38.292674+02 | alter table pgbench_accounts add primary key (aid)                          | alter table pgbench_accounts add primary key (aid)                         | 4164808321 | CPU             | CPU
 2018-07-09 14:55:51.111621+02 | UPDATE pgbench_branches SET bbalance = bbalance + 2228 WHERE bid = 4;       | UPDATE pgbench_branches SET bbalance = bbalance + 2228 WHERE bid = 4       |  553956422 | Lock            | transactionid
 2018-07-09 14:55:51.111621+02 | END;                                                                        | END                                                                        | 3376944276 | CPU             | CPU
 2018-07-09 14:55:51.111621+02 | UPDATE pgbench_accounts SET abalance = abalance + -2408 WHERE aid = 973208; | UPDATE pgbench_accounts SET abalance = abalance + -2408 WHERE aid = 973208 | 2992934481 | CPU             | CPU
 2018-07-09 14:55:52.112507+02 | UPDATE pgbench_tellers SET tbalance = tbalance + -4957 WHERE tid = 87;      | UPDATE pgbench_tellers SET tbalance = tbalance + -4957 WHERE tid = 87      | 3459630226 | Client          | ClientRead
 2018-07-09 14:55:52.112507+02 | END;                                                                        | END                                                                        | 3376944276 | LWLock          | WALWriteLock
 2018-07-09 14:55:52.112507+02 | END;                                                                        | END                                                                        | 3376944276 | CPU             | CPU
 2018-07-09 14:55:52.112507+02 | UPDATE pgbench_branches SET bbalance = bbalance + -3832 WHERE bid = 8;      | UPDATE pgbench_branches SET bbalance = bbalance + -3832 WHERE bid = 8      |  553956422 | Lock            | transactionid
(10 rows)
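Because each sample roughly represents one second of active time, simply counting samples already gives a top-activity style report. A minimal sketch (the 5 minute window and the grouping columns are my choice, not something imposed by pgSentinel):

select datname, wait_event_type, wait_event, count(*) as seconds
  from pg_active_session_history
 where ash_time > now() - interval '5 minutes'
 group by datname, wait_event_type, wait_event
 order by seconds desc
 limit 10;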

The important point here is that we have the queryid which we can use to map that back to pg_stat_statements. So if we want to know what the shared_blks_* statistics for the update statement with query id 553956422 are, we can do that (or write a join over the two views, of course):

bench=# select shared_blks_hit,shared_blks_read,shared_blks_dirtied,shared_blks_written from pg_stat_statements where queryid = 553956422;
 shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_written 
-----------------+------------------+---------------------+---------------------
          453201 |               29 |                  37 |                   0
(1 row)
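And here is a minimal sketch of such a join, counting samples per statement next to its cumulated block statistics (the grouping is my choice; keep in mind that pg_stat_statements can have one row per user and database for the same queryid):

select s.queryid, s.query, count(*) as samples,
       s.shared_blks_hit, s.shared_blks_read
  from pg_active_session_history ash
  join pg_stat_statements s on s.queryid = ash.queryid
 group by s.queryid, s.query, s.shared_blks_hit, s.shared_blks_read
 order by samples desc
 limit 5;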

Really looks promising, automatic session sampling in PostgreSQL. More tests to come …

 

This article Active session history in PostgreSQL: Say hello to pgSentinel appeared first on Blog dbi services.

Documentum xPlore: Failed to load content file + file system full

Wed, 2018-07-11 08:33

We had issues with our indexing server when sending batches of index requests to the queue on the xPlore server. The indexing was taking so much time that almost all queue items ended up in the warning/error state.

In addition, our file system was growing too fast: 100GB for only 30,000 docs. It looked like the temp folders weren’t cleared properly. On the other hand, the queue was telling that the temp files couldn’t be loaded because they were cleared too early…

Thoughts

I observed the behavior of the temp folders on the index agent server and noticed that the files were only 10 minutes old, even 30 minutes after sending the batch of index requests. Hence, I deduced that the clearing was still running, which could explain the indexing warnings telling that the files couldn’t be found.

That means the processing of the indexes takes too long while the clearing thread runs anyway… But I also noticed that the file system was still growing way too much.

Solution

If you didn’t know: by default, the CPS (which parses the downloaded files) only has 1 thread. It means that if it takes too long (50MB files in my case), the thread will be busy and other files will not be indexed during this time. But the documents will still be downloaded during this time, and the clearing process will still continue to harvest our beloved files.

The point here is to add more CPS threads to parallelize the process and thus avoid missing files. You can also increase the time between two clearing phases, but it is not efficient, and increasing the number of threads will improve your overall performance.

To do so, edit the following config file:

$DSEARCH_HOME/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml

Change the following line from 1 to 4:

<daemon_count>1</daemon_count>

A restart will be needed. You can change the value from 1 to a maximum of 6. Please note that xPlore uses two other threads for clearing and other processes and allows only 8 threads to run at the same time, so 6 is the maximum number of CPS daemons to run; otherwise you will have issues with the clearing thread and you will end up with a full file system.

 

This article Documentum xPlore: Failed to load content file + file system full appeared first on Blog dbi services.

Creating and Using a Parcel Repository for Cloudera Manager

Wed, 2018-07-11 07:59

This blog post describes how to create a hosted Cloudera repository and use it in your Cloudera Manager deployment.

The first step is to install a web server, which will host the parcel files and their metadata. The common way is to use an Apache web server.

Installing Apache HTTPD service
[cdhtest@edge ]$ sudo yum install httpd -y

 

Starting Apache HTTPD service
[cdhtest@edge ]$ sudo systemctl start httpd

Verify that the service has been started properly.

[cdhtest@master html]$ sudo systemctl status httpd
* httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2018-07-11 09:16:45 UTC; 1h 26min ago
     Docs: man:httpd(8)
           man:apachectl(8)
 Main PID: 53284 (httpd)
   Status: "Total requests: 40; Current requests/sec: 0; Current traffic:   0 B/sec"
   CGroup: /system.slice/httpd.service
           |-53284 /usr/sbin/httpd -DFOREGROUND
           |-53285 /usr/sbin/httpd -DFOREGROUND
           |-53286 /usr/sbin/httpd -DFOREGROUND
           |-53287 /usr/sbin/httpd -DFOREGROUND
           |-53288 /usr/sbin/httpd -DFOREGROUND
           |-53289 /usr/sbin/httpd -DFOREGROUND
           |-53386 /usr/sbin/httpd -DFOREGROUND
           |-53387 /usr/sbin/httpd -DFOREGROUND
           |-53388 /usr/sbin/httpd -DFOREGROUND
           `-58024 /usr/sbin/httpd -DFOREGROUND

Jul 11 09:16:45 master systemd[1]: Starting The Apache HTTP Server...
Jul 11 09:16:45 master httpd[53284]: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.2.6. Set the 'ServerName' directive globally to suppress this message
Jul 11 09:16:45 master systemd[1]: Started The Apache HTTP Server.

 

Downloading Parcels for CDH5 and Publishing files

Download parcels according to your OS distribution for CDH5 (latest release) from the below link:

https://archive.cloudera.com/cdh5/parcels/latest/

Download the 2 files:

  • the .parcel file for your OS distribution
  • manifest.json

Before downloading the files, create the CDH parcel directory tree in your web server.

[cdhtest@master html]$ cd /var/www/html/
[cdhtest@master html]$ sudo mkdir -p cdh5.15/
[cdhtest@master html]$ sudo chmod -R ugo+rX /var/www/html/cdh5.15/
[cdhtest@master html]$ cd /var/www/html/cdh5.15/
[cdhtest@master cdh5.15]$ sudo wget https://archive.cloudera.com/cdh5/parcels/latest/CDH-5.15.0-1.cdh5.15.0.p0.21-el5.parcel https://archive.cloudera.com/cdh5/parcels/latest/manifest.json
--2018-07-11 12:16:04--  https://archive.cloudera.com/cdh5/parcels/latest/CDH-5.15.0-1.cdh5.15.0.p0.21-el5.parcel
Resolving archive.cloudera.com (archive.cloudera.com)... 151.101.32.167
Connecting to archive.cloudera.com (archive.cloudera.com)|151.101.32.167|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1675168741 (1.6G) [binary/octet-stream]
Saving to: 'CDH-5.15.0-1.cdh5.15.0.p0.21-el5.parcel'

100%[==================================================================================================================================================================================================================================================================================================>] 1,675,168,741 53.2MB/s   in 29s

2018-07-11 12:16:32 (56.0 MB/s) - 'CDH-5.15.0-1.cdh5.15.0.p0.21-el5.parcel' saved [1675168741/1675168741]

--2018-07-11 12:16:32--  https://archive.cloudera.com/cdh5/parcels/latest/manifest.json
Reusing existing connection to archive.cloudera.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 74072 (72K) [application/json]
Saving to: 'manifest.json'

100%[====================================================================================================================================================================================================================================================================================================>] 74,072      --.-K/s   in 0s

2018-07-11 12:16:32 (225 MB/s) - 'manifest.json' saved [74072/74072]

FINISHED --2018-07-11 12:16:32--
Total wall clock time: 29s
Downloaded: 2 files, 1.6G in 29s (56.0 MB/s)
[cdhtest@master cdh5.15]$
[cdhtest@master cdh5.15]$ ll
total 1635984
-rw-r--r-- 1 root root 1675168741 Jun 14 18:06 CDH-5.15.0-1.cdh5.15.0.p0.21-el5.parcel
-rw-r--r-- 1 root root      74072 Jun 14 18:08 manifest.json

 

Your Parcels remote repository is now available

CM_Parcels8

 

Configuring the Cloudera Manager Server to Use the Parcel URL for Hosted Repositories

1. In the Cluster Installation – Select Repository step

Click on More Options.

CM_Parcels3

Add your Remote Parcel Repository URL .

CM_Parcels4

Then Cloudera Manager will download, distribute, unpack and activate parcels for all cluster hosts.

CM_Parcels5

 

2. You can also configure your local parcels repository in the Cloudera Manager Configuration Menu.

Click on Administration Menu > Settings

Click on Parcels category > Add your Remote Parcel Repository URL here

CM_Parcels6

Click Save Changes to commit the changes.

 

Activate Parcels

In the Cloudera Manager Parcels page,

Click on Check for New Parcels

Click on Download, Distribute, Activate buttons for the parcels found.

CM_Parcels7

 

This article Creating and Using a Parcel Repository for Cloudera Manager appeared first on Blog dbi services.

Oracle Database Vault: Realm in a Pluggable Database

Tue, 2018-07-10 11:03

Database Vault can also be used in a multitenant environment. In a multitenant environment we must register Oracle Database Vault in the root first, and then in the PDBs.
In this blog we will see how we can use realms to protect data in a 12.1 pluggable database.

In CDB$ROOT we have to create common accounts that will be used for the Database Vault Owner (DV_OWNER role) and Database Vault Account Manager (DV_ACCTMGR role) accounts. It is also recommended to create a backup account for each user.

SQL> conn sys as sysdba
Enter password:
Connected.
SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT
SQL>

SQL> GRANT CREATE SESSION, SET CONTAINER TO c##dbv_owner_root IDENTIFIED BY root CONTAINER = ALL;

SQL> GRANT CREATE SESSION, SET CONTAINER TO c##dbv_acctmgr_root IDENTIFIED BY root CONTAINER = ALL;

SQL> grant select any dictionary to C##DBV_OWNER_ROOT;

Grant succeeded.

SQL> grant select any dictionary to C##DBV_ACCTMGR_ROOT;

Grant succeeded.

SQL>
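As a sketch of those recommended backup accounts (the account names and passwords below are made up for illustration; the DV_OWNER and DV_ACCTMGR roles can only be granted to them by the primary accounts once Database Vault is enabled):

SQL> GRANT CREATE SESSION, SET CONTAINER TO c##dbv_owner_backup IDENTIFIED BY root CONTAINER = ALL;

SQL> GRANT CREATE SESSION, SET CONTAINER TO c##dbv_acctmgr_backup IDENTIFIED BY root CONTAINER = ALL;

-- once DBMS_MACADM.ENABLE_DV has been run, connected as c##dbv_owner_root:
SQL> GRANT DV_OWNER TO c##dbv_owner_backup WITH ADMIN OPTION;

-- and connected as c##dbv_acctmgr_root:
SQL> GRANT DV_ACCTMGR TO c##dbv_acctmgr_backup WITH ADMIN OPTION;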

The next step is to configure the Database Vault user accounts in CDB$ROOT

BEGIN
 DVSYS.CONFIGURE_DV (
   dvowner_uname         => 'c##dbv_owner_root',
   dvacctmgr_uname       => 'c##dbv_acctmgr_root');
 END;
  6  /

PL/SQL procedure successfully completed.

SQL> @?/rdbms/admin/utlrp.sql

We can then enable Oracle Database Vault with the user c##dbv_owner_root in CDB$ROOT

SQL> conn c##dbv_owner_root/root
Connected.
SQL> show user
USER is "C##DBV_OWNER_ROOT"
SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT
SQL>

SQL> EXEC DBMS_MACADM.ENABLE_DV;

PL/SQL procedure successfully completed.

SQL>

After a restart of CDB$ROOT, we can verify the status. These queries should return TRUE.

SQL> SELECT VALUE FROM V$OPTION WHERE PARAMETER = 'Oracle Database Vault';

VALUE
----------------------------------------------------------------
TRUE

SQL> SELECT VALUE FROM V$OPTION WHERE PARAMETER = 'Oracle Label Security';

VALUE
----------------------------------------------------------------
TRUE

SQL>  SELECT * FROM DVSYS.DBA_DV_STATUS;

NAME                STATUS
------------------- ----------------------------------------------------------------
DV_CONFIGURE_STATUS TRUE
DV_ENABLE_STATUS    TRUE

SQL>

At PDB level, we must register the common users we created earlier. In this example I am using a pluggable database named PDB1.

SQL> show user
USER is "SYS"
SQL> show con_name

CON_NAME
------------------------------
PDB1

SQL> GRANT CREATE SESSION, SET CONTAINER TO c##dbv_owner_root CONTAINER = CURRENT;

Grant succeeded.

SQL> GRANT CREATE SESSION, SET CONTAINER TO c##dbv_acctmgr_root CONTAINER = CURRENT;

Grant succeeded.

SQL>

SQL> grant select any dictionary to C##DBV_OWNER_ROOT;

Grant succeeded.

SQL> grant select any dictionary to C##DBV_ACCTMGR_ROOT;

Grant succeeded.

SQL>

As in CDB$ROOT, we also have to configure the Database Vault users in PDB1

SQL> show user
USER is "SYS"
SQL> show con_name

CON_NAME
------------------------------
PDB1


SQL> BEGIN
 DVSYS.CONFIGURE_DV (
   dvowner_uname         => 'c##dbv_owner_root',
   dvacctmgr_uname       => 'c##dbv_acctmgr_root');
 END;
  6  /

PL/SQL procedure successfully completed.

SQL>

SQL> @?/rdbms/admin/utlrp.sql

And now let’s enable Oracle Database Vault on PDB1

SQL> show user
USER is "C##DBV_OWNER_ROOT"
SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> EXEC DBMS_MACADM.ENABLE_DV;

PL/SQL procedure successfully completed.

SQL>

With SYS let’s restart PDB1

SQL> show user
USER is "SYS"
SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> alter pluggable database pdb1 close immediate;

Pluggable database altered.

SQL> alter pluggable database pdb1 open;

Pluggable database altered.

As in CDB$ROOT we can verify

SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> SELECT VALUE FROM V$OPTION WHERE PARAMETER = 'Oracle Database Vault';

VALUE
----------------------------------------------------------------
TRUE

SQL> SELECT VALUE FROM V$OPTION WHERE PARAMETER = 'Oracle Label Security';

VALUE
----------------------------------------------------------------
TRUE

SQL> SELECT * FROM DVSYS.DBA_DV_STATUS;

NAME                STATUS
------------------- ----------------------------------------------------------------
DV_CONFIGURE_STATUS TRUE
DV_ENABLE_STATUS    TRUE

SQL>

Now that Database Vault is configured, we can create a realm to protect our data. In this example we are protecting the data of the SCOTT table EMP. We are using EM 12c to create the realm.
From the Database Home, select Security and then Database Vault
dbvault1
On the Database Vault page, log in with any user having the appropriate privileges: the DV_OWNER or DV_ADMIN role and SELECT ANY DICTIONARY
dbvault2
Before creating the realm, we can verify that user SYSTEM has access to table SCOTT.EMP

SQL> show user
USER is "SYSTEM"
SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> select * from scott.emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10

14 rows selected.

SQL>

Under the Administration tab, select Realms
dbvault3
And Click on Create
Give a name and a description for the realm
dbvault4
Click on Next
On the Realm Secured Objects click on Add
dbvault5
Click on OK
dbvault6
Click on Next
On the Realm Authorizations page, select Add
dbvault7
Click on OK
dbvault8
Click Next
On the Review page Click Finish
dbvault9
And at the end we should have:
dbvault10
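For reference, the same realm can also be defined without EM, using the DBMS_MACADM API from SQL*Plus. Here is a rough sketch based on the names of this example (the realm name is made up, and the DBMS_MACUTL constants are the usual ones for enabled/audit_options), to be run in PDB1 as the Database Vault owner:

SQL> BEGIN
  DBMS_MACADM.CREATE_REALM(
    realm_name    => 'SCOTT EMP Realm',
    description   => 'Protects SCOTT.EMP',
    enabled       => DBMS_MACUTL.G_YES,
    audit_options => DBMS_MACUTL.G_REALM_AUDIT_FAIL);
  DBMS_MACADM.ADD_OBJECT_TO_REALM(
    realm_name    => 'SCOTT EMP Realm',
    object_owner  => 'SCOTT',
    object_name   => 'EMP',
    object_type   => 'TABLE');
  DBMS_MACADM.ADD_AUTH_TO_REALM(
    realm_name    => 'SCOTT EMP Realm',
    grantee       => 'EDGE');
END;
/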
And that’s all. We can verify now that SYSTEM is no longer allowed to query SCOTT.EMP

SQL> conn system/root@pdb1
Connected.
SQL> show user
USER is "SYSTEM"
SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> select * from scott.emp;
select * from scott.emp
                    *
ERROR at line 1:
ORA-01031: insufficient privileges
SQL>

And that user EDGE is allowed to query SCOTT.EMP

SQL> show user
USER is "EDGE"
SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> select * from scott.emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10

14 rows selected.

SQL>

 

This article Oracle Database Vault: Realm in a Pluggable Database appeared first on Blog dbi services.

Create an HDFS user’s home directory

Tue, 2018-07-10 07:07

Let’s assume we need to create an HDFS home directory for a user named “dbitest”.

We first need to verify whether the user exists on the local filesystem. It’s important to understand that HDFS maps users from the local filesystem.

[cdhtest@master ~]$ cat /etc/passwd | grep dbitest
 Create a user on the local file system

If the user does not exist yet, we can easily create one with its associated group.

[cdhtest@master ~]$ sudo groupadd dbitest

[cdhtest@master ~]$ sudo useradd -g dbitest -d/home/dbitest dbitest

[cdhtest@master ~]$ cat /etc/passwd | grep dbitest
dbitest:x:1002:1002::/home/dbitest:/bin/bash
[cdhtest@master ~]$

Note that the user dbitest should be created on all cluster hosts.

Create a directory in HDFS for a new user

Then we can create the directory under /user in HDFS for the new user dbitest. This directory needs to be created as the hdfs user, as hdfs is the superuser for admin commands.

[cdhtest@master ~]$ sudo -u hdfs hdfs dfs -mkdir /user/dbitest

 

Verify the owner for our new directory
[cdhtest@master ~]$ sudo -u hdfs hdfs dfs -ls /user
Found 5 items
drwxr-xr-x   - hdfs   supergroup          0 2018-07-10 10:10 /user/dbitest
drwxrwxrwx   - mapred hadoop              0 2018-07-10 07:54 /user/history
drwxrwxr-t   - hive   hive                0 2018-07-10 07:55 /user/hive
drwxrwxr-x   - hue    hue                 0 2018-07-10 07:55 /user/hue
drwxrwxr-x   - oozie  oozie               0 2018-07-10 07:56 /user/oozie

The new home directory has been created but it’s owned by hdfs user.

Change owner for /user/dbitest directory

Use the command below to change the owner of the newly created home directory.

[cdhtest@master ~]$ sudo -u hdfs hdfs dfs -chown dbitest:dbitest /user/dbitest

Let’s see if the owner has changed.

[cdhtest@master ~]$ sudo -u hdfs hdfs dfs -ls /user
Found 5 items
drwxr-xr-x   - dbitest dbitest          0 2018-07-10 10:10 /user/dbitest
drwxrwxrwx   - mapred  hadoop           0 2018-07-10 07:54 /user/history
drwxrwxr-t   - hive    hive             0 2018-07-10 07:55 /user/hive
drwxrwxr-x   - hue     hue              0 2018-07-10 07:55 /user/hue
drwxrwxr-x   - oozie   oozie            0 2018-07-10 07:56 /user/oozie
Change permissions

Change the permissions of the newly created home directory so that no other users can have read, write and execute permissions except the owner.

[cdhtest@master ~]$ sudo -u hdfs hdfs dfs -chmod 700 /user/dbitest
[cdhtest@master ~]$ sudo -u hdfs hdfs dfs -ls /user
Found 6 items
drwxr-xr-x   - admins  cdhtest          0 2018-07-10 08:56 /user/cdhtest
drwx------   - dbitest dbitest          0 2018-07-10 10:10 /user/dbitest
drwxrwxrwx   - mapred  hadoop           0 2018-07-10 07:54 /user/history
drwxrwxr-t   - hive    hive             0 2018-07-10 07:55 /user/hive
drwxrwxr-x   - hue     hue              0 2018-07-10 07:55 /user/hue
drwxrwxr-x   - oozie   oozie            0 2018-07-10 07:56 /user/oozie

 

Test the user dbitest home directory

We can now test the user home directory creation by uploading data into it without specifying the destination directory. The file will be automatically uploaded to the user’s home directory if no destination is specified.

[cdhtest@master ~]$ sudo su dbitest
[dbitest@master ~]$ hdfs dfs -ls /user/dbitest
[dbitest@master ~]$ hdfs dfs -put HelloWorld.txt
[dbitest@master ~]$ hdfs dfs -ls /user/dbitest
Found 1 items
-rw-r--r--   3 dbitest dbitest         39 2018-07-10 10:30 /user/dbitest/HelloWorld.txt

 

Your user home directory has been created successfully.

 

This article Create an HDFS user’s home directory appeared first on Blog dbi services.

Deploy a Cloudera cluster with Terraform and Ansible in Azure – part 3

Tue, 2018-07-10 03:42

After the deployment step with Terraform and the configuration/installation with Ansible, we will continue the installation of our Cloudera cluster with Cloudera Manager.

By following the below steps you will see how to install CDH on our hosts using Cloudera Manager.

Connection

First, log in to the Cloudera Manager URL.

Cloudera-Manager

When you connect to C.M for the first time, you need to accept the Cloudera Terms and Conditions.

Cloudera-Manager-Conditions

Then choose your desired edition of Cloudera. For this blog post, we will use the Data Hub trial edition (60 days trial).

C.M-Edition

 

Hosts discovery

In this step, provide the IP or hostname of all cluster machines you want to use. To complete this step, check that the /etc/hosts file of each cluster host is properly defined.

Cloudera-Manager-Hosts

When all hosts are reachable by the Cloudera Manager server, you can continue with the next step.

CDH Installation

This step is about the version of CDH to install in your cluster.

C.M-Installation_Methods

Use parcels installation.

By default, the parcel directory is /opt/cloudera/parcels. A best practice is to have a separate filesystem for /opt (at least 15GB), in order to separate the Cloudera installation from the root (/) filesystem.

If you don’t have a specific filesystem for /opt, you will have some performance impact on your server.

Java automatic installation

Since we installed the Java JDK automatically and properly with Ansible, we don’t need to check the box for Java installation. Please skip this step.

CM-Java_install

Account details

In this step, we will provide the user account information to Cloudera Manager in order to install all components needed in all cluster hosts.

It’s not recommended to give root access to Cloudera Manager; use a dedicated user with sudo access instead. For our example we will use the user created during the installation part, dbi, with its associated password.

CM-Users1

Cloudera Installation – Install Agents

In this step, Cloudera Manager will install and configure cloudera-scm-agent on all cluster hosts.

CM-Install-Agents

 

Cloudera Installation – Parcels installation

After cloudera-scm-agent installation and configuration, Cloudera Manager will install the CDH Parcel and additional parcels on all cluster hosts.

CM-Parcels2

Cloudera Installation – Hosts Inspector

In this step, the hosts inspector will check all cluster host requirements and notify you if you have any problem.

Note that you can go through all validation sections to understand all the prerequisites for a Cloudera installation. You will get to know the complete checklist items Cloudera uses to validate your cluster hosts.

CM-Hosts-Inspector

You can ignore the warnings for now and resolve them after the installation. Click on the Finish button and go to the next step.

Cluster Setup – Select Services

In this step, choose your services to install. For our sandbox environment we will only install Core Hadoop first.

CM-Services

 

Cluster Setup – Customize Role Assignments

Assign roles by host and click on Continue.

CM-Roles

 

Cluster Setup – Setup databases

In this step, set up the remote databases for the Hive metastore, Hue, the Cloudera reports manager and the Oozie server.

CM-Databases

Test the connection and click on Continue .

Cluster Setup – Review changes

Ensure that you use the /data directory previously created with Terraform and Ansible.

CM-ReviewsUsers

 

Cluster Setup – Start services

CM-StartServices

 

Congratulations, your Cloudera cluster is now installed and configured!

CM-End

CM-Dashboard

 

 

This article Deploy a Cloudera cluster with Terraform and Ansible in Azure – part 3 appeared first on Blog dbi services.

Data Guard: always set db_create_file_dest on the standby

Mon, 2018-07-09 12:21

The file name convert parameters are not dynamic and require a restart of the instance. An enhancement request was filed in 2011. I mentioned recently on Twitter that it can be annoying with Active Data Guard when a file on the primary server is created on a path that has no file name conversion. However, Ian Baugaard mentioned that there is a workaround for this specific case because db_create_file_dest is dynamic:

I recall seeing a MOS note indicating the order of precedence when it comes to db_file_name_convert and db_create_file_dest. The latter wins and makes config much easier especially when using OMF and ASM

— Ian Baugaard (@IanBaugaard) July 5, 2018

I’ve quickly created an 18c Data Guard configuration on the Oracle Cloud DBaaS to test it, and here it is.

In the primary database and the standby database, here are the datafiles:

RMAN> report schema;
 
Report of database schema for database with db_unique_name ORCL_01
 
List of Permanent Datafiles
===========================
File Size(MB) Tablespace RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1 830 SYSTEM YES /u02/app/oracle/oradata/ORCL/system01.dbf
3 510 SYSAUX NO /u02/app/oracle/oradata/ORCL/sysaux01.dbf
4 60 UNDOTBS1 YES /u02/app/oracle/oradata/ORCL/undotbs01.dbf
5 340 PDB$SEED:SYSTEM NO /u02/app/oracle/oradata/ORCL/pdbseed/system01.dbf
6 620 PDB$SEED:SYSAUX NO /u02/app/oracle/oradata/ORCL/pdbseed/sysaux01.dbf
7 5 USERS NO /u02/app/oracle/oradata/ORCL/users01.dbf
8 200 PDB$SEED:UNDOTBS1 NO /u02/app/oracle/oradata/ORCL/pdbseed/undotbs01.dbf
12 340 PDB1:SYSTEM YES /u02/app/oracle/oradata/ORCL/PDB1/system01.dbf
13 620 PDB1:SYSAUX NO /u02/app/oracle/oradata/ORCL/PDB1/sysaux01.dbf
14 200 PDB1:UNDOTBS1 YES /u02/app/oracle/oradata/ORCL/PDB1/undotbs01.dbf
15 50 PDB1:USERS NO /u02/app/oracle/oradata/ORCL/PDB1/PDB1_users01.dbf
 
List of Temporary Files
=======================
File Size(MB) Tablespace Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1 33 TEMP 32767 /u04/app/oracle/oradata/temp/temp01.dbf
2 62 PDB$SEED:TEMP 32767 /u04/app/oracle/oradata/temp/pdbseed_temp012018-02-08_13-49-27-256-PM.dbf
4 62 PDB1:TEMP 32767 /u04/app/oracle/oradata/temp/temp012018-02-08_13-49-27-256-PM.dbf

The properties of the standby database define no DbFileNameConvert because the directory structure is supposed to be the same:

DGMGRL> show configuration
 
Configuration - fsc
 
Protection Mode: MaxPerformance
Members:
ORCL_01 - Primary database
ORCL_02 - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS (status updated 45 seconds ago)
 
 
DGMGRL> show database verbose 'ORCL_02';
 
Database - ORCL_02
 
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 0 seconds (computed 0 seconds ago)
Average Apply Rate: 15.00 KByte/s
Active Apply Rate: 532.00 KByte/s
Maximum Apply Rate: 535.00 KByte/s
Real Time Query: ON
Instance(s):
ORCL
 
Properties:
DGConnectIdentifier = 'ORCL_02'
...
DbFileNameConvert = ''
LogFileNameConvert = 'dummy, dummy'
...
 
Log file locations:
Alert log : /u01/app/oracle/diag/rdbms/orcl_02/ORCL/trace/alert_ORCL.log
Data Guard Broker log : /u01/app/oracle/diag/rdbms/orcl_02/ORCL/trace/drcORCL.log
 
Database Status:
SUCCESS

You can see that Oracle defines a dummy log file name convert. This is a good idea to avoid some RMAN duplicate issues.

On the standby server, I have no db_create_file_dest defined:

SQL> show parameter create%dest
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_create_file_dest string
db_create_online_log_dest_1 string .
db_create_online_log_dest_2 string
db_create_online_log_dest_3 string
db_create_online_log_dest_4 string
db_create_online_log_dest_5 string

Note that the Oracle Cloud DBaaS defines it. I’ve reset it for the purpose of this demo.
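For the record, clearing it for this demo is dynamic as well; a sketch of what I ran on the standby:

SQL> alter system set db_create_file_dest='' scope=both;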

New filesystem on Primary server only

I create a new filesystem on the primary server:

[root@DG-dg01 opc]# mkdir /DATA ; chown oracle:dba /DATA

I create a datafile on this new filesystem:

SQL> alter session set container=PDB1;
Session altered.
 
SQL> create tablespace FRANCK datafile '/DATA/franck.dbf' size 100M;
Tablespace created.

The apply is stuck:

DGMGRL> show database 'ORCL_02';
 
Database - ORCL_02
 
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 2 seconds ago)
Apply Lag: 11 seconds (computed 2 seconds ago)
Average Apply Rate: 16.00 KByte/s
Real Time Query: OFF
Instance(s):
ORCL
 
Database Error(s):
ORA-16766: Redo Apply is stopped
 
Database Status:
ERROR

The standby alert.log shows the error about the impossibility to create the datafile:

2018-07-06T08:04:59.077730+00:00
Errors in file /u01/app/oracle/diag/rdbms/orcl_02/ORCL/trace/ORCL_pr00_29393.trc:
ORA-01274: cannot add data file that was originally created as '/DATA/franck.dbf'
2018-07-06T08:04:59.111881+00:00
Background Media Recovery process shutdown (ORCL)

db_file_name_convert

The first idea is to set db_file_name_convert; however, this requires an instance restart, which means downtime when you have sessions on the Active Data Guard standby:

DGMGRL> edit database 'ORCL_02' set property DbFileNameConvert='/DATA,/u02/app/oracle/oradata/ORCL';
Warning: ORA-16675: database instance restart required for property value modification to take effect
 
Property "dbfilenameconvert" updated
 
DGMGRL> show database 'ORCL_02';
 
Database - ORCL_02
 
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 2 seconds ago)
Apply Lag: 3 minutes 32 seconds (computed 2 seconds ago)
Average Apply Rate: 16.00 KByte/s
Real Time Query: OFF
Instance(s):
ORCL
Warning: ORA-16675: database instance restart required for property value modification to
take effect
Warning: ORA-16714: the value of property DbFileNameConvert is inconsistent with the member setting
 
Database Error(s):
ORA-16766: Redo Apply is stopped
 
Database Warning(s):
ORA-16853: apply lag has exceeded specified threshold
 
Database Status:
ERROR

db_create_file_dest

The solution is to set db_create_file_dest which, on the standby, has higher priority than the convert parameters:

SQL> alter system set db_create_file_dest='/u02/app/oracle/oradata';
System altered.

I restart the apply:

DGMGRL> edit database 'ORCL_02' set state=apply-on;
Succeeded.

No need to restart, and future datafiles will be created there. However, it is too late for this datafile as it has already been created as UNNAMED in the controlfile:

ORA-01186: file 18 failed verification tests
ORA-01157: cannot identify/lock data file 18 - see DBWR trace file
ORA-01111: name for data file 18 is unknown - rename to correct file
ORA-01110: data file 18: '/u01/app/oracle/product/18.0.0/dbhome_1/dbs/UNNAMED00018'

Manual CREATE DATAFILE

Then I must manually create it, but I cannot do that while I am in standby_file_management=auto:

SQL> alter session set container=PDB1;
Session altered.
 
SQL> alter database create datafile '/u01/app/oracle/product/18.0.0/dbhome_1/dbs/UNNAMED00018' as '/u02/app/oracle/oradata/ORCL/franck.dbf';
alter database create datafile '/u01/app/oracle/product/18.0.0/dbhome_1/dbs/UNNAMED00018' as '/u02/app/oracle/oradata/ORCL/franck.dbf'
*
ERROR at line 1:
ORA-01275: Operation CREATE DATAFILE is not allowed if standby file management
is automatic.

This can be changed dynamically:

DGMGRL> edit database 'ORCL_02' set property StandbyFileManagement=manual;
Property "standbyfilemanagement" updated

And then the creation is possible:

SQL> alter session set container=PDB1;
Session altered.
 
SQL> alter database create datafile '/u01/app/oracle/product/18.0.0/dbhome_1/dbs/UNNAMED00018' as new;
Database altered.

You can see that, because I have defined db_create_file_dest, I don’t need to name the datafile and can create it as OMF with the ‘new’ keyword.

Now I can start the apply and it will resolve the gap:

DGMGRL> edit database 'ORCL_02' set state=apply-on;
Succeeded.
 
DGMGRL> show database 'ORCL_02';
 
Database - ORCL_02
 
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 3 seconds ago)
Apply Lag: 0 seconds (computed 3 seconds ago)
Average Apply Rate: 22.00 KByte/s
Real Time Query: ON
Instance(s):
ORCL
Warning: ORA-16675: database instance restart required for property value modification to take effect
Warning: ORA-16714: the value of property DbFileNameConvert is inconsistent with the member setting
 
Database Status:
WARNING

Do not forget to put standby_file_management back to auto:

DGMGRL> edit database 'ORCL_02' set property StandbyFileManagement=auto;
Property "standbyfilemanagement" updated

So, now that db_create_file_dest is set, new datafiles will be created automatically as OMF (Oracle Managed Files), without caring about file name conversion:

SQL> alter session set container=PDB1;
Session altered.
 
SQL> alter tablespace FRANCK add datafile '/DATA/franck2.dbf' size 100M;
Tablespace altered.

This is confirmed from the standby alert.log:

(4):Datafile 19 added to flashback set
(4):Successfully added datafile 19 to media recovery
(4):Datafile #19: '/u02/app/oracle/oradata/ORCL_02/7050211FE75F26FAE05392781D0AADAA/datafile/o1_mf_franck_fmybw332_.dbf'

Conclusion

Always define db_create_file_dest in the standby database so that datafiles will always be created. It is better to have them in the wrong place than to stop the apply. And anyway, if you don’t like the OMF names, and you are at least in 12c Enterprise Edition, you can change their name later with an online move:

SQL> alter session set container=PDB1;
Session altered.
 
SQL> alter database move datafile '/u02/app/oracle/oradata/ORCL_02/7050211FE75F26FAE05392781D0AADAA/datafile/o1_mf_franck_fmybw332_.dbf' to '/u02/app/oracle/oradata/ORCL/franck2.dbf';
Database altered.

 

This article Data Guard: always set db_create_file_dest on the standby appeared first on Blog dbi services.

My tmux scripts to script demos

Fri, 2018-07-06 10:27

When I did my first demo using tmux, I wrote the following blog post to explain it: https://blog.dbi-services.com/using-tmux-for-semi-interactive-demos/. Since then I’ve done all my demos with this technique. My scripts and configuration files have evolved, and now I have uploaded them on GitHub to share them: https://github.com/FranckPachot/scripts/tree/master/tmux-demo

The README.md explains the pre-requisites (a recent version of Tmux), how to download the scripts quickly (I use this to get everything ready on a new Cloud host), and how to run it. The ‘demo’ alias starts tmux attached to the same session (in order to see it on the laptop and the beamer). The .vimrc defines the macros to run lines from the script file as tmux-send commands.

If you have questions, the best is to ask them on Twitter:

Here are the scripts and configuration files I use for my demos with tmux: https://t.co/OuuhHClmt3

— Franck Pachot (@FranckPachot) July 6, 2018

 

This article My tmux scripts to script demos appeared first on Blog dbi services.

18c: Order by in WITH clause is not preserved

Thu, 2018-07-05 14:29

For a previous post, I ran on 18c a script of mine that gets the V$MYSTAT delta values between two queries. This script (a new version is available on GitHub) generates the queries to store some values and subtract them on the next execution. But I had to fix it for 18c because I relied on an ORDER BY in a CTE, which is lost in 18c.
The idea was to get the statistic names in a Common Table Expression (CTE):

with stats as (
select rownum n,stat_id,name from (select stat_id,name from v$statname where name in (&names) order by stat_id)
)

and query it from different parts of the UNION ALL which generates the script:

select 'select ' from dual
union all
select ' '||decode(n,1,' ',',')||'"CUR'||stat_id||'" - '||'&'||'LAG'||stat_id||' "DIF'||stat_id||'"' from stats
union all
select ' '||',nvl("CUR'||stat_id||'",0) "CUR'||stat_id||'"' from stats
union all

The rownum in the WITH clause is used later to add the comma for all rows except the first one – using decode(n,1,' ',',')

But this relied on two assumptions:

  1. the WITH keeps the rows ordered
  2. the UNION ALL keeps the rows ordered

Of course, it would be better to add a number on each part and an ORDER BY at the end of the query. But for better readability, I didn’t.
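For reference, here is a sketch of what that safer shape would look like for the script above: keep the n computed inside the CTE, number each branch, and order by both at the end (the generated expressions are abbreviated to '...'):

with stats as (
  select rownum n, stat_id, name
  from (select stat_id, name from v$statname where name in (&names) order by stat_id)
)
select txt from (
  select 1 part, 0 n, 'select ' txt from dual
  union all
  select 2 part, n, ' '||decode(n,1,' ',',')||'...' txt from stats  -- generated expression abbreviated
  union all
  select 3 part, n, ' '||',...' txt from stats                      -- generated expression abbreviated
)
order by part, n;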

However, the first assumption is wrong since 12.2, where the optimizer introduced In-Memory Cursor Duration Temp Tables. In the query above, the CTE is materialized because it is used multiple times. And this optimization keeps it in memory. But it seems that this structure does not keep the rows ordered.

Here is an example where I have a 4-row table:

SQL> create table DEMO as select rownum id from xmltable('1 to 4');
Table DEMO created.
 
SQL> select * from DEMO order by id;
 
ID
----------
1
2
3
4

When I put the same query in a WITH clause, with its ORDER BY, and query it from 2 union all statements, the rows are not ordered anymore:

SQL> with cte as ( select * from DEMO order by id)
select rownum,id from cte
union all
select rownum,id from cte ;
 
ROWNUM ID
---------- ----------
1 4
2 3
3 2
4 1
1 4
2 3
3 2
4 1

The execution plan shows the materialization of the CTE result, and mentions that it is an In-Memory Cursor Duration Temp Table (CURSOR DURATION MEMORY)

SQL> select * from dbms_xplan.display_cursor(format=>'+projection');
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 9uwc06ana6trn, child number 0
-------------------------------------
with cte as ( select * from DEMO order by id) select rownum,id from
cte union all select rownum,id from cte
 
Plan hash value: 4025392480
 
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 4 (100)| |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6640_23D74B | | | | |
| 3 | SORT ORDER BY | | 4 | 12 | 4 (25)| 00:00:01 |
| 4 | TABLE ACCESS FULL | DEMO | 4 | 12 | 3 (0)| 00:00:01 |
| 5 | UNION-ALL | | | | | |
| 6 | COUNT | | | | | |
| 7 | VIEW | | 4 | 52 | 2 (0)| 00:00:01 |
| 8 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6640_23D74B | 4 | 12 | 2 (0)| 00:00:01 |
| 9 | COUNT | | | | | |
| 10 | VIEW | | 4 | 52 | 2 (0)| 00:00:01 |
| 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6640_23D74B | 4 | 12 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------------
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 1 - STRDEF[22], STRDEF[22]
 2 - SYSDEF[4], SYSDEF[0], SYSDEF[1], SYSDEF[120], SYSDEF[0]
 3 - (#keys=1) "DEMO"."ID"[NUMBER,22]
 4 - (rowset=256) "DEMO"."ID"[NUMBER,22]
 5 - STRDEF[22], STRDEF[22]
 6 - "ID"[NUMBER,22], ROWNUM[8]
 7 - "ID"[NUMBER,22]
 8 - "C0"[NUMBER,22]
 9 - "ID"[NUMBER,22], ROWNUM[8]
 10 - "ID"[NUMBER,22]
 11 - "C0"[NUMBER,22]

/*+ inline */

If the CTE is not materialized, the rows are ordered, but the table is read two times:


SQL> with cte as ( select /*+ inline */ * from DEMO order by id)
2 select rownum,id from cte
3 union all
4 select rownum,id from cte ;
 
ROWNUM ID
---------- ----------
1 1
2 2
3 3
4 4
1 1
2 2
3 3
4 4
 
8 rows selected.
 
SQL>
SQL> select * from dbms_xplan.display_cursor(format=>'+projection');
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0dad0vwg3k0th, child number 0
-------------------------------------
with cte as ( select /*+ inline */ * from DEMO order by id) select
rownum,id from cte union all select rownum,id from cte
 
Plan hash value: 2913170750
 
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 8 (100)| |
| 1 | UNION-ALL | | | | | |
| 2 | COUNT | | | | | |
| 3 | VIEW | | 4 | 52 | 4 (25)| 00:00:01 |
| 4 | SORT ORDER BY | | 4 | 12 | 4 (25)| 00:00:01 |
| 5 | TABLE ACCESS FULL| DEMO | 4 | 12 | 3 (0)| 00:00:01 |
| 6 | COUNT | | | | | |
| 7 | VIEW | | 4 | 52 | 4 (25)| 00:00:01 |
| 8 | SORT ORDER BY | | 4 | 12 | 4 (25)| 00:00:01 |
| 9 | TABLE ACCESS FULL| DEMO | 4 | 12 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 
1 - STRDEF[22], STRDEF[22]
 2 - "ID"[NUMBER,22], ROWNUM[8]
 3 - "ID"[NUMBER,22]
 4 - (#keys=1) "DEMO"."ID"[NUMBER,22]
 5 - (rowset=256) "DEMO"."ID"[NUMBER,22]
 6 - "ID"[NUMBER,22], ROWNUM[8]
 7 - "ID"[NUMBER,22]
 8 - (#keys=1) "DEMO"."ID"[NUMBER,22]
 9 - (rowset=256) "DEMO"."ID"[NUMBER,22]

“_in_memory_cdt”=off

If the CTE is materialized, but the new feature to keep it in memory is disabled, the rows are ordered (but probably by chance – there is no guarantee):


SQL> alter session set "_in_memory_cdt"=off;
 
Session altered.
 
SQL>
SQL> with cte as ( select /*+ materialize */ * from DEMO order by id)
2 select rownum,id from cte
3 union all
4 select rownum,id from cte ;
 
ROWNUM ID
---------- ----------
1 1
2 2
3 3
4 4
1 1
2 2
3 3
4 4
 
8 rows selected.
 
SQL> select * from dbms_xplan.display_cursor(format=>'+projection');
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 9uwc06ana6trn, child number 1
-------------------------------------
with cte as ( select * from DEMO order by id) select rownum,id from
cte union all select rownum,id from cte
 
Plan hash value: 4025392480
 
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 4 (100)| |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT | SYS_TEMP_0FD9D664D_23D74B | | | | |
| 3 | SORT ORDER BY | | 4 | 12 | 4 (25)| 00:00:01 |
| 4 | TABLE ACCESS FULL | DEMO | 4 | 12 | 3 (0)| 00:00:01 |
| 5 | UNION-ALL | | | | | |
| 6 | COUNT | | | | | |
| 7 | VIEW | | 4 | 52 | 2 (0)| 00:00:01 |
| 8 | TABLE ACCESS FULL | SYS_TEMP_0FD9D664D_23D74B | 4 | 12 | 2 (0)| 00:00:01 |
| 9 | COUNT | | | | | |
| 10 | VIEW | | 4 | 52 | 2 (0)| 00:00:01 |
| 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D664D_23D74B | 4 | 12 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 
1 - STRDEF[22], STRDEF[22]
 2 - SYSDEF[4], SYSDEF[0], SYSDEF[1], SYSDEF[120], SYSDEF[0]
 3 - (#keys=1) "DEMO"."ID"[NUMBER,22]
 4 - (rowset=256) "DEMO"."ID"[NUMBER,22]
 5 - STRDEF[22], STRDEF[22]
 6 - "ID"[NUMBER,22], ROWNUM[8]
 7 - "ID"[NUMBER,22]
 8 - "C0"[NUMBER,22]
 9 - "ID"[NUMBER,22], ROWNUM[8]
 10 - "ID"[NUMBER,22]
 11 - "C0"[NUMBER,22]

So what?

We should never rely on the preservation of the row order except when explicitly documented (like rownum over a subquery with an order by). And this In-Memory Cursor Duration Temp Table also works in parallel and in RAC, so we can understand that the rows are not read in the same order as they were stored. This is always a problem when we migrate. When something works by chance in one version and not in the newer versions, people usually blame the migration, rather than recognizing an old bug in the application, just because it was working before.

 

This article 18c: Order by in WITH clause is not preserved appeared first on Blog dbi services.

ODA fiber interface link detection issue

Thu, 2018-07-05 11:55

Configuring fiber interfaces during an ODA installation can sometimes be challenging. The configuration can depend on the kind of gigabit interface converter (GBIC) and switch being used. In this blog, I would like to show a real case faced during one of our customer deployments, and come up with a solution.

Problem

We were facing a link detection issue when using the GBIC converter delivered by ORACLE, while using a CISCO GBIC converter made the link available.
The em2 interface was used with a CISCO GBIC.
The em3 interface was used with an ORACLE GBIC.

[root@TEST1 tmp]# ethtool em2
Settings for em2:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 10000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000000 (0)

        Link detected: yes
        
[root@TEST1 tmp]# ethtool em3
Settings for em3:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: Unknown!
        Duplex: Unknown! (255)
        Port: FIBRE
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000000 (0)

        Link detected: no

 

Solution

Forcing the interface to 10Gb and disabling auto-negotiation solved the link detection issue with the ORACLE GBIC converter.
This has been performed by updating both ifcfg-em2 and ifcfg-em3 (stored in /etc/sysconfig/network-scripts) with:

ETHTOOL_OPTS="speed 10000 duplex full autoneg off"

 

Both links could then be detected successfully.

[root@TEST1 ~]# ethtool em2
Settings for em2:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Speed: 10000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000000 (0)

        Link detected: yes
        
[root@TEST1 ~]# ethtool em3
Settings for em3:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Speed: 10000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000000 (0)

        Link detected: yes

 

SFP28 Ethernet Controller firmware upgrade

The firmware version we were running was 20.06.04.06.

[root@TEST1 ~]# ethtool -i em3
driver: bnxt_en
version: 1.8.0
firmware-version: 20.6.141/1.8.1 pkg 20.06.04.06
bus-info: 0000:18:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no

In case of further trouble, a firmware upgrade to version 20.08.01.14 can help, as described in the following MOS note:
Using the onboard SFP28 ports on an ODA X7-2 server node (Doc ID 2373070.1)
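
Before and after such an upgrade, a quick loop over the SFP28 interfaces (assumed here to be em2 and em3) confirms the driver and firmware package in use:

for i in em2 em3; do echo "== $i"; ethtool -i $i | grep -E 'driver|firmware-version'; done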

 

This article, ODA fiber interface link detection issue, appeared first on Blog dbi services.

18c: some optimization about redo size

Tue, 2018-07-03 15:19

Some years ago, at the time of the 12.1 release, I published in the SOUG newsletter some tests showing the amount of redo generated by different operations on a 10000-row table. I had run them on 12.2 without seeing any difference, and I have now run them again on 18.1.
I get the statistics from V$MYSTAT using a script that displays them as columns, each value being the difference from the previous run. I ran the same tests as in the article, and most of the statistics were in the same ballpark.
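
The _mystat_diff.sql script itself is not reproduced here, but the idea behind it can be sketched with a simple query on the session statistics (this is my simplification, not the author's script): take a snapshot before and after each statement and display the difference of the values.

select sn.name, ms.value
from v$mystat ms
join v$statname sn on sn.statistic# = ms.statistic#
where sn.name in ('db block changes','redo size','undo change vector size','redo entries')
order by sn.name;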

12.2

In 12.2 I get the same numbers. I was surprised by that, because 12.2 has an optimization when updating a column to the same value. But this optimization may not occur in all situations. This reduction of redo generation has been analyzed by Maris Elsins with redo log dumps and by Sayan Malakshinov on triggers, and the origin of the optimization has been explained by Bryn Llewellyn. All info and links are in the following Twitter conversation:

I hadn't blogged in a long time, but there was a very good reason for a new post: A redo optimization in 12cR2. https://t.co/M9LRNy3PuS @Pythian

— Maris Elsins (@MarisDBA) June 26, 2018
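
For context, a test table comparable to the one used in these runs can be rebuilt with something like the statements below. The column names, the 10000 rows and the lowercase padding come from the article and from the dump further down; the exact datatypes, the padding length and the four indexes (used in the update tests) are my assumptions.

create table TEST_TABLE as
select rpad(to_char(rownum),24,'x') a,
       rpad(to_char(rownum),24,'x') b,
       rpad(to_char(rownum),24,'x') c,
       rpad(to_char(rownum),24,'x') d
from dual connect by level <= 10000;
create index TEST_TABLE_A on TEST_TABLE(a);
create index TEST_TABLE_B on TEST_TABLE(b);
create index TEST_TABLE_C on TEST_TABLE(c);
create index TEST_TABLE_D on TEST_TABLE(d);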

Here are the numbers I have in 18c, which are very similar to those from 12.1:


SQL> --- update with different values (the original ones were all lowercase)
SQL>
SQL> update TEST_TABLE set a=upper(a),b=upper(b),c=upper(c),d=upper(d);
10000 rows updated.
 
SQL> commit;
Commit complete.
 
SQL> @ _mystat_diff.sql
SQL> set termout off verify off
SQL> /
 
db block changes redo size undo change vector size redo entries
---------------- ---------------- ----------------------- ----------------
190,485 32,226,368 13,288,940 95,258
 
1 row selected.
 
SQL>
SQL> --- update with same values
SQL>
SQL> update TEST_TABLE set a=upper(a),b=upper(b),c=upper(c),d=upper(d);
10000 rows updated.
 
SQL> commit;
Commit complete.
 
SQL> @ _mystat_diff.sql
SQL> set termout off verify off
SQL> /
 
db block changes redo size undo change vector size redo entries
---------------- ---------------- ----------------------- ----------------
20,346 4,594,528 1,844,012 10,085
 
1 row selected.

The second update sets the 4 columns to the same values they already had. For about 1MB of data (10000 rows with avg_row_len=100), we have 1.8MB of undo and 4.4MB of redo (which covers both the new data and the undo). I have 4 indexes there, but Oracle does not update the index entries when the old and new values are the same.

The first update changes all the values, and then, in addition to the changes in the table blocks, the indexes must be updated.

So here, in my test, it seems that the 12.2 optimization referenced in the tweet above did not occur, because the redo generated for the table blocks is still fully logged even when the old and new values are the same. I can check from a dump of the redo record that I have the same value in the undo and the redo:

REDO RECORD - Thread:1 RBA: 0x000008.00002444.0010 LEN: 0x01c8 VLD: 0x01 CON_UID: 1008806272
SCN: 0x00000000002cb8a4 SUBSCN:3047 07/03/2018 12:23:22
CHANGE #1 CON_ID:4 TYP:0 CLS:36 AFN:14 DBA:0x02405a20 OBJ:4294967295 SCN:0x00000000002cb8a4 SEQ:34 OP:5.1 ENC:0 RBL:0 FLG:0x0000
ktudb redo: siz: 184 spc: 2020 flg: 0x0022 seq: 0x0147 rec: 0x22
xid: 0x000a.009.000002bd
ktubu redo: slt: 9 rci: 33 opc: 11.1 objn: 77968 objd: 77978 tsn: 0
Undo type: Regular undo Undo type: Last buffer split: No
Tablespace Undo: No
0x00000000
KDO undo record:
KTB Redo
op: 0x02 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C uba: 0x02405a20.0147.21
KDO Op code: URP row dependencies Disabled
xtype: XAxtype KDO_KDOM2 flags: 0x00000080 bdba: 0x0040a994 hdba: 0x0040a7d8
itli: 3 ispac: 0 maxfr: 4863
tabn: 0 slot: 3(0x3) flag: 0x2c lock: 0 ckix: 0
ncol: 4 nnew: 4 size: 0
Vector content:
col 0: [24] 31 30 30 30 30 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
col 1: [24] 31 30 30 30 30 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
col 2: [24] 31 30 30 30 30 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
col 3: [24] 31 30 30 30 30 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
CHANGE #2 CON_ID:4 TYP:0 CLS:1 AFN:12 DBA:0x0040a994 OBJ:77978 SCN:0x00000000002cb8a4 SEQ:3 OP:11.5 ENC:0 RBL:0 FLG:0x0000
KTB Redo
op: 0x02 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C uba: 0x02405a20.0147.22
KDO Op code: URP row dependencies Disabled
xtype: XAxtype KDO_KDOM2 flags: 0x00000080 bdba: 0x0040a994 hdba: 0x0040a7d8
itli: 3 ispac: 0 maxfr: 4863
tabn: 0 slot: 3(0x3) flag: 0x2c lock: 3 ckix: 0
ncol: 4 nnew: 4 size: 0
Vector content:
col 0: [24] 31 30 30 30 30 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
col 1: [24] 31 30 30 30 30 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
col 2: [24] 31 30 30 30 30 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
col 3: [24] 31 30 30 30 30 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58

The redo record has the old and new values even when they are the same. I hope that we will have more documentation about the 12.2 redo optimization so that it can be predictable and tunable.
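
For reference, a redo dump like the one above can be produced with ALTER SYSTEM DUMP LOGFILE on the current or an archived log file; the file name below is a placeholder and the SCN range simply brackets the SCN seen in the record (0x2cb8a4 = 2930852); this is not necessarily how the dump in this post was generated.

alter session set tracefile_identifier = 'redo_dump';
alter system dump logfile '/u01/app/oracle/oradata/arch_1_8.arc' scn min 2930852 scn max 2930853;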

18.1

So, I didn't see the 12.2 optimization I expected here. However, it seems that we have one for deletes in 18c.

In 12.2, the delete of all 10000 rows, without any index, generates 2MB of undo and 3.5MB of redo:

SQL> --- delete all rows
SQL>
SQL> delete from TEST_TABLE;
10000 rows deleted.
 
SQL> commit;
Commit complete.
 
SQL> @ _mystat_diff.sql
SQL> set termout off verify off
SQL> /
 
db block changes redo size undo change vector size redo entries
---------------- ---------------- ----------------------- ----------------
20,690 3,670,476 2,053,292 10,145
 
1 row selected.

The same delete in 18.1 generates only about 1MB of undo and 1MB of redo:

SQL> --- delete all rows
SQL>
SQL> delete from TEST_TABLE;
10000 rows deleted.
 
SQL> commit;
Commit complete.
 
SQL> @ _mystat_diff.sql
SQL> set termout off verify off
SQL> /
 
db block changes redo size undo change vector size redo entries
---------------- ---------------- ----------------------- ----------------
872 1,187,120 1,116,812 293
 
1 row selected.

So in 12.2 I had one block change, and one redo record per row deleted. In 18.1 it seems that I have one redo record per block where all rows are deleted. Still in the same Twitter conversation, Tanel Poder had the idea to do a sparse delete leaving one row in each block:

SQL> delete from TEST_TABLE where dbms_rowid.rowid_row_number(rowid)!=42;
9849 rows deleted.
 
SQL> commit;
Commit complete.
 
SQL> @ _mystat_diff.sql
SQL> set termout off verify off
SQL> /
 
db block changes redo size undo change vector size redo entries
---------------- ---------------- ----------------------- ----------------
20,431 3,660,204 2,102,584 10,011

Now, as there are no blocks that are fully emptied, I'm back to one redo entry per row deleted.
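
To double-check that the sparse delete really left one row in every block, the remaining rows can be counted per block with dbms_rowid:

select count(distinct dbms_rowid.rowid_block_number(rowid)) as blocks_still_populated,
       count(*) as remaining_rows
from TEST_TABLE;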

 

This article, 18c: some optimization about redo size, appeared first on Blog dbi services.
