Yann Neuhaus

dbi services technical blog

When PDB name conflicts with CDB name

Sun, 2017-09-03 09:24

Going to the multitenant architecture is not a big change. The administration things (DBA, monitoring, backups) connect to the CDB and the application things connect to the PDB. Even without the multitenant option, it is still recommended to go to the CDB architecture: the non-CDB is deprecated and the multitenant architecture brings interesting features. People often ask how to name the CDB and the PDB, especially when they have naming rules or policies in the company. My recommendation is to name the PDB as you are used to naming the databases: the name often gives an idea of the data that is inside, the application, and the environment. The CDB is the container, and in my opinion, you should apply the same naming rules as for servers. Don’t forget that pluggable databases are made to be moved across CDBs, so the CDB name should not depend on the content.

But with single tenant you have a one-to-one relationship between the CDB and the PDB, and then the idea may come up to give the same name to the CDB and the PDB… I’m not sure whether it is supported or not, and please, don’t do that.

Service Name

There’s one rule: the service name must be unique on a server, especially when registered to the same listener. The PDB name will be the default service name registered by the PDB. And the DB_UNIQUE_NAME of the CDB will be the default service name registered by the CDB. So the PDB name must be different from the DB_UNIQUE_NAME.

With this rule, it should be possible to have the same name for the CDB (the DB_NAME) and the PDB, given that we have set a different DB_UNIQUE_NAME.
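
Before trying it, here is a quick way to see which names are actually involved — a small sketch to run in the CDB (nothing here is specific to my environment):

select name, value from v$parameter where name in ('db_name','db_unique_name','instance_name');
select name, network_name, pdb from v$services order by name;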

Here is an example. The name of my Container Database is CDB1. But as it is part of a Data Guard configuration I changed the unique name to CDB1A (and standby will be CDB1B).

Here are the services from my CDB:

SQL> select * from v$services;
 
SERVICE_ID NAME NAME_HASH NETWORK_NAME CREATION_DATE CREATION_DATE_HASH GOAL DTP AQ_HA_NOTIFICATION CLB_GOAL COMMIT_OUTCOME SESSION_STATE_CONSISTENCY GLOBAL PDB SQL_TRANSLATION_PROFILE MAX_LAG_TIME STOP_OPTION FAILOVER_RESTORE DRAIN_TIMEOUT CON_ID
---------- ---- --------- ------------ ------------- ------------------ ---- --- ------------------ -------- ---------------------------------- ------ --- ----------------------- ------------ ----------- ---------------- ------------- ------
7 CDB1A 3104886812 CDB1A 27-AUG-17 1962062146 NONE N NO LONG NO NO CDB$ROOT NONE NONE 0 1
1 SYS$BACKGROUND 165959219 26-JAN-17 1784430042 NONE N NO SHORT NO NO CDB$ROOT NONE NONE 0 1
2 SYS$USERS 3427055676 26-JAN-17 1784430042 NONE N NO SHORT NO NO CDB$ROOT NONE NONE 0 1
0 pdb1 1888881990 pdb1 0 NONE N NO SHORT NO NO PDB1 NONE NONE 0 4
6 CDB1XDB 1202503288 CDB1XDB 27-AUG-17 1962062146 NONE N NO LONG NO NO CDB$ROOT NONE NONE 0 1

All are default services: CDB1A is the DB_UNIQUE_NAME, SYS$BACKGROUND is for background processes, SYS$USERS is used when connecting without a service name, and CDB1XDB is used to connect to XDB dispatchers. PDB1 is the default service of my pluggable database PDB1.

I can also look at the services registered in the listener:


SQL> host lsnrctl status
 
LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 28-AUG-2017 20:34:36
 
Copyright (c) 1991, 2016, Oracle. All rights reserved.
 
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date 27-AUG-2017 20:41:33
Uptime 0 days 23 hr. 53 min. 3 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Log File /u01/app/oracle/diag/tnslsnr/VM104/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=VM104)(PORT=1521)))
Services Summary...
Service "57c2283990d42152e053684ea8c05ea0" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "CDB1A" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "CDB1XDB" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
The command completed successfully

There is just one additional service here: the GUID of my PDB (see https://blog.dbi-services.com/service-696c6f76656d756c746974656e616e74-has-1-instances/)

ORA-65149

Do you see any service named ‘CDB1’ here? No. Then I should be able to create a PDB with this name.

SQL> create pluggable database CDB1 admin user admin identified by covfefe file_name_convert=('pdbseed','cdb1');
 
Error starting at line : 1 in command -
create pluggable database CDB1 admin user admin identified by covfefe file_name_convert=('pdbseed','cdb1')
Error report -
ORA-65149: PDB name conflicts with existing service name in the CDB or the PDB
65149. 00000 - "PDB name conflicts with existing service name in the CDB or the PDB"
*Cause: An attempt was made to create a pluggable database (PDB) whose
name conflicts with the existing service name in the container
database (CDB) or the PDB.
*Action: Choose a different name for the PDB.

Ok. This is impossible. However, the error message is not correct: my PDB name does not conflict with any existing service name. It may conflict with the instance name or the DB_NAME, but not with any service.
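
To show that no service with this name exists in any container, here is a quick check (just a sketch; CDB_SERVICES is the container-wide view of DBA_SERVICES):

select con_id, name, network_name from cdb_services order by con_id, name;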

NID

As I’m not satisfied with this, I try to find another way to have the same name for CDB and PDB. I have a pluggable database named ‘PDB1’ and I’ll try to change the CDB name to this:


[oracle@VM104 ~]$ nid dbname=PDB1 target=sys/oracle
 
DBNEWID: Release 12.2.0.1.0 - Production on Mon Aug 28 20:40:08 2017
 
Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.
 
Connected to database CDB1 (DBID=926862412)
 
Connected to server version 12.2.0
 
Control Files in database:
/u01/oradata/CDB1A/control01.ctl
/u01/fast_recovery_area/CDB1A/control02.ctl
 
The following datafiles are read-only:
/u01/oradata/CDB1A/PDB1/USERS2.db (17)
These files must be writable by this utility.
 
Change database ID and database name CDB1 to PDB1? (Y/[N]) => Y
 
Proceeding with operation
Changing database ID from 926862412 to 3460932968
Changing database name from CDB1 to PDB1
Control File /u01/oradata/CDB1A/control01.ctl - modified
Control File /u01/fast_recovery_area/CDB1A/control02.ctl - modified
Datafile /u01/oradata/CDB1A/system01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/sysaux01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/undotbs01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/pdbseed/system01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/pdbseed/sysaux01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/users01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/pdbseed/undotbs01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/PDB1/system01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/PDB1/sysaux01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/PDB1/undotbs01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/PDB1/USERS.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/PDB1/USERS2.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/temp01.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/pdbseed/temp012017-08-27_18-30-16-741-PM.db - dbid changed, wrote new name
Datafile /u01/oradata/CDB1A/PDB1/temp012017-08-27_18-30-16-741-PM.db - dbid changed, wrote new name
Control File /u01/oradata/CDB1A/control01.ctl - dbid changed, wrote new name
Control File /u01/fast_recovery_area/CDB1A/control02.ctl - dbid changed, wrote new name
Instance shut down
 
Database name changed to PDB1.
Modify parameter file and generate a new password file before restarting.
Database ID for database PDB1 changed to 3460932968.
All previous backups and archived redo logs for this database are unusable.
Database is not aware of previous backups and archived logs in Recovery Area.
Database has been shutdown, open database with RESETLOGS option.
Succesfully changed database name and ID.
DBNEWID - Completed succesfully.
 
SQL> startup
ORACLE instance started.
 
Total System Global Area 859832320 bytes
Fixed Size 8798552 bytes
Variable Size 784338600 bytes
Database Buffers 58720256 bytes
Redo Buffers 7974912 bytes
ORA-01103: database name 'PDB1' in control file is not 'CDB1'
 
SQL> alter system set db_name=PDB1 scope=spfile;
 
System altered.
 
SQL> shutdown immediate
ORA-01507: database not mounted
 
ORACLE instance shut down.
SQL>
SQL> startup
ORACLE instance started.
 
Total System Global Area 859832320 bytes
Fixed Size 8798552 bytes
Variable Size 784338600 bytes
Database Buffers 58720256 bytes
Redo Buffers 7974912 bytes
Database mounted.
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open
 
SQL> alter database open resetlogs;
 
Database altered.

That’s done.
My CDB is named PDB1:
SQL> select * from v$database;
 
DBID NAME CREATED RESETLOGS_CHANGE# RESETLOGS_TIME PRIOR_RESETLOGS_CHANGE# PRIOR_RESETLOGS_TIME LOG_MODE CHECKPOINT_CHANGE# ARCHIVE_CHANGE# CONTROLFILE_TYPE CONTROLFILE_CREATED CONTROLFILE_SEQUENCE# CONTROLFILE_CHANGE# CONTROLFILE_TIME OPEN_RESETLOGS VERSION_TIME OPEN_MODE PROTECTION_MODE PROTECTION_LEVEL REMOTE_ARCHIVE ACTIVATION# SWITCHOVER# DATABASE_ROLE ARCHIVELOG_CHANGE# ARCHIVELOG_COMPRESSION SWITCHOVER_STATUS DATAGUARD_BROKER GUARD_STATUS SUPPLEMENTAL_LOG_DATA_MIN SUPPLEMENTAL_LOG_DATA_PK SUPPLEMENTAL_LOG_DATA_UI FORCE_LOGGING PLATFORM_ID PLATFORM_NAME RECOVERY_TARGET_INCARNATION# LAST_OPEN_INCARNATION# CURRENT_SCN FLASHBACK_ON SUPPLEMENTAL_LOG_DATA_FK SUPPLEMENTAL_LOG_DATA_ALL DB_UNIQUE_NAME STANDBY_BECAME_PRIMARY_SCN FS_FAILOVER_STATUS FS_FAILOVER_CURRENT_TARGET FS_FAILOVER_THRESHOLD FS_FAILOVER_OBSERVER_PRESENT FS_FAILOVER_OBSERVER_HOST CONTROLFILE_CONVERTED PRIMARY_DB_UNIQUE_NAME SUPPLEMENTAL_LOG_DATA_PL MIN_REQUIRED_CAPTURE_CHANGE# CDB CON_ID PENDING_ROLE_CHANGE_TASKS CON_DBID FORCE_FULL_DB_CACHING
---- ---- ------- ----------------- -------------- ----------------------- -------------------- -------- ------------------ --------------- ---------------- ------------------- --------------------- ------------------- ---------------- -------------- ------------ --------- --------------- ---------------- -------------- ----------- ----------- ------------- ------------------ ---------------------- ----------------- ---------------- ------------ ------------------------- ------------------------ ------------------------ ------------- ----------- ------------- ---------------------------- ---------------------- ----------- ------------ ------------------------ ------------------------- -------------- -------------------------- ------------------ -------------------------- --------------------- ---------------------------- ------------------------- --------------------- ---------------------- ------------------------ ---------------------------- --- ------ ------------------------- -------- ---------------------
3460932968 PDB1 27-AUG-17 1495032 28-AUG-17 1408558 27-AUG-17 ARCHIVELOG 1495035 0 CURRENT 27-AUG-17 2574 1496538 28-AUG-17 NOT ALLOWED 27-AUG-17 READ WRITE MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE ENABLED 3460947145 3460947145 PRIMARY 0 DISABLED NOT ALLOWED DISABLED NONE NO NO NO NO 13 Linux x86 64-bit 3 3 1497050 NO NO NO CDB1A 0 DISABLED 0 NO NO YES 0 NOT APPLICABLE 3460932968 NO

And I have a PDB with the same name:

SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO
4 PDB1 MOUNTED
 
SQL> alter pluggable database PDB1 open;
 
Pluggable database PDB1 altered.
 
SQL> show pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO
4 PDB1 READ WRITE NO

So what was forbidden by CREATE PLUGGABLE DATABASE, with a misleading error message, turns out to be possible through this other way.

So what?

Please, do not take this as a solution. There is clearly a problem here. Maybe the documentation and the error message are wrong. Maybe NID has a bug, allowing something that should be blocked. Or maybe CREATE PLUGGABLE DATABASE has a bug, blocking something that should be possible. Until this is clarified (an SR is opened), I recommend keeping the PDB name different from the CDB name, independently of service names. I would recommend it anyway, as the same name brings a lot of confusion: when you mention a database name, people will not know whether you are referring to the CDB or the PDB.

 


Postgres vs. Oracle access paths XI – Sample Scan

Sat, 2017-08-26 09:33

I was going to end this series with the previous post because the last access path available in Postgres is a bit special: a Seq Scan that returns only a sample of the rows, at random. However, it is a good occasion to come back to the difference between random and sequential reads.

I’m still working on the same table as in the previous posts, with 10000 rows in 1429 pages. 5% of rows is 500 rows and 5% of blocks is about 72 pages.

Rows

Sometimes you can answer your business question on a sample of rows, when you need an approximate result, a trend or a pattern. Let’s say that you want to sum() over only 5 percent of the rows:
explain (analyze,verbose,costs,buffers) select sum(a) from demo1 tablesample bernoulli(5) ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1435.25..1435.26 rows=1 width=8) (actual time=1.940..1.940 rows=1 loops=1)
Output: sum(a)
Buffers: shared hit=1429
-> Sample Scan on public.demo1 (cost=0.00..1434.00 rows=500 width=4) (actual time=0.007..1.890 rows=509 loops=1)
Output: n, a, x
Sampling: bernoulli ('5'::real)
Buffers: shared hit=1429
Planning time: 0.373 ms
Execution time: 1.956 ms

This row sampling reads all rows and picks a sample of them at random. Unfortunately, it reads all blocks, because you cannot get a good sample if you don’t know how many rows there are in each block. Working on a sample can still make sense if you want to apply complex operations on the result. Here the cost in the database is similar to a Seq Scan: 1429 blocks read at seq_page_cost=1, plus 500 sampled tuples and 1 result tuple at cpu_tuple_cost=0.01, plus the sum() applied to 500 rows at cpu_operator_cost=0.0025.
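
To double-check, the plan’s figures can be recomputed from the default cost parameters (a quick sanity check, nothing more):

select 1429*1 + 500*0.01                       as sample_scan_total_cost,
       1429*1 + 500*0.01 + 500*0.0025 + 1*0.01 as aggregate_total_cost;

This returns 1434.00 and 1435.26, matching the Sample Scan and Aggregate total costs above.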

From execution statistics, you can see that the result is exactly what we asked: 500 rows returned.

Oracle has a different syntax and different algorithm:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 1tsadjdd9ddam, child number 0
-------------------------------------
select /*+ */ sum(a) from demo1 sample(5)
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 397 (100)| 1 |00:00:00.01 | 581 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 581 |
| 2 | TABLE ACCESS SAMPLE| DEMO1 | 1 | 500 | 397 (0)| 478 |00:00:00.01 | 581 |
-----------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) SUM("A")[22] 2 - (rowset=256) "A"[NUMBER,22]

Here we have not read all the blocks. Only 40% of them. This is faster than the Postgres approach, but the drawback is that the result is not exact: 478 rows were returned here.

Blocks

When we can afford an approximate sampling, we can sample on blocks rather than on rows:
explain (analyze,verbose,costs,buffers) select sum(a) from demo1 tablesample system(5) ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Aggregate (cost=290.25..290.26 rows=1 width=8) (actual time=0.479..0.480 rows=1 loops=1)
Output: sum(a)
Buffers: shared hit=73
-> Sample Scan on public.demo1 (cost=0.00..289.00 rows=500 width=4) (actual time=0.016..0.377 rows=511 loops=1)
Output: n, a, x
Sampling: system ('5'::real)
Buffers: shared hit=73
Planning time: 0.698 ms
Execution time: 0.509 ms

The number of rows is still good here, but the result may depend on which blocks are sampled. Only 73 blocks were read, which is about 5%, and of course the rows may be distributed differently within the blocks. The advantage is that it is faster as it reads fewer blocks. But those blocks being picked at random, they are by definition random reads: 71 pages estimated at random_page_cost=4 and, as in the previous case, 501 cpu_tuple_cost and 500 cpu_operator_cost.
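
Here again the cost can be reconstructed from the defaults (the planner rounds 5% of 1429 pages down to 71):

select 71*4 + 500*0.01                       as sample_scan_total_cost,
       71*4 + 500*0.01 + 500*0.0025 + 1*0.01 as aggregate_total_cost;

This returns 289.00 and 290.26, matching the plan above.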

With block sampling, Oracle reads a smaller number of blocks than with row sampling, but still more than 5%, and the number of rows is not exact: 798 rows here:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID fqgbwqfavgdrn, child number 0
-------------------------------------
select /*+ */ sum(a) from demo1 sample block(5)
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 22 (100)| 1 |00:00:00.01 | 134 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 134 |
| 2 | TABLE ACCESS SAMPLE| DEMO1 | 1 | 500 | 22 (0)| 798 |00:00:00.01 | 134 |
-----------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) SUM("A")[22] 2 - (rowset=256) "A"[NUMBER,22]

Again, as for the previous access paths: same concepts but different implementation between Postgres and Oracle. Everything looks similar and easily portable from a high-level overview, but going into the details you see all those little differences which make it not so easy to be database agnostic or easily portable.

Summary

This is the end of this series comparing Postgres access paths with Oracle ones. The goal is not to tell you that one is better than the other. They have different approaches, different targets, different prices, different histories and probably different futures. But understanding how they work and how they estimate the cost is a good way to learn. I’m learning a lot about Postgres myself while writing those posts, matching things I discover in Postgres with those I have known for a while in Oracle.

Here is the list of posts on Access Path:

  1. Postgres vs. Oracle access paths – intro
  2. Postgres vs. Oracle access paths I – Seq Scan
  3. Postgres vs. Oracle access paths II – Index Only Scan
  4. Postgres vs. Oracle access paths III – Partial Index
  5. Postgres vs. Oracle access paths IV – Order By and Index
  6. Postgres vs. Oracle access paths V – FIRST ROWS and MIN/MAX
  7. Postgres vs. Oracle access paths VI – Index Scan
  8. Postgres vs. Oracle access paths VII – Bitmap Index Scan
  9. Postgres vs. Oracle access paths VIII – Index Scan and Filter
  10. Postgres vs. Oracle access paths IX – Tid Scan
  11. Postgres vs. Oracle access paths X – Update
  12. Postgres vs. Oracle access paths XI – Sample Scan

I think my next series will be on Join methods.

 


Postgres vs. Oracle access paths X – Update

Thu, 2017-08-24 15:03

In the previous post we have seen the cheapest way to get one row: reading only one block from its physical location. But that’s the optimal case where the row has not moved. I’ll (nearly) conclude this series about access paths with an update.

ROWID in Oracle

Here is the ROWID of one row in Oracle:

select rowid from demo1 where n=1000;
ROWID
------------------
AAAR4WAAMAAAAEaAAF

There’s enough information here to get directly to the block with file_name and offset:
select file_name,dbms_rowid.rowid_block_number('AAAR4WAAMAAAAEaAAF')*block_size offset
from dba_data_files join dba_tablespaces using(tablespace_name)
where file_id=dbms_rowid.rowid_to_absolute_fno('AAAR4WAAMAAAAEaAAF','DEMO','DEMO1');
 
FILE_NAME OFFSET
---------------------------------------- ----------
/u01/oradata/CDB1A/PDB/users01.dbf 2310144

The ROWID also contains the index of the row within the block’s row directory:

select dbms_rowid.rowid_row_number('AAAR4WAAMAAAAEaAAF') from dual;
 
DBMS_ROWID.ROWID_ROW_NUMBER('AAAR4WAAMAAAAEAAAF')
-------------------------------------------------
5

TID in Postgres

And the TID of similar row in Postgres:

select ctid from demo1 where n=1000;
ctid
---------
(142,6)

The file is known from the table, as there is only one file per table:

show data_directory;
data_directory
----------------------------
/usr/share/postgresql/data
 
select pg_relation_filepath('demo1');
pg_relation_filepath
----------------------
base/16437/125852

The blocksize is common for the whole database:

show block_size;
block_size
------------
8192

Then the block is at byte offset 142*8192=1163264 within this file.
Within the block, the row is at index 6.
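
To double-check that arithmetic from SQL (block_size is exposed as a read-only setting):

select 142 * current_setting('block_size')::int as block_offset_bytes;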

SELECT

We have seen in the previous post that we can select using the ROWID/TID, and that Oracle and Postgres behave the same: only one block to read, with a cost estimation based on one random read:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8mj3ms08x0sfh, child number 0
-------------------------------------
select /*+ */ a from demo1 where rowid='AAAR4WAAMAAAAEaAAF'
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 (100)| 1 |00:00:00.01 | 1 |
| 1 | TABLE ACCESS BY USER ROWID| DEMO1 | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 1 |
-----------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "A"[NUMBER,22]

Different units but the same meaning: cost=1 in Oracle is the cost of one random read; in Postgres cost=1 is the cost of a sequential read, and a random read is estimated at cost=4:

explain (analyze,verbose,costs,buffers) select a from demo1 where ctid='(142,6)' ;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Tid Scan on public.demo1 (cost=0.00..4.01 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=1)
Output: a
TID Cond: (demo1.ctid = '(142,6)'::tid)
Buffers: shared hit=1
Planning time: 0.358 ms
Execution time: 0.016 ms

Oracle UPDATE

Now I’m updating this row, changing the column X which contains 1000 ‘x’ characters to 1000 ‘y’ characters:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID gpqv4k6m1q20y, child number 0
-------------------------------------
update /*+ */ demo1 set x=lpad('y',1000,'y') where rowid='AAAR4WAAMAAAAEaAAF'
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 1 | | 1 (100)| 0 |00:00:00.01 | 4 |
| 1 | UPDATE | DEMO1 | 1 | | | 0 |00:00:00.01 | 4 |
| 2 | TABLE ACCESS BY USER ROWID| DEMO1 | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 1 |
------------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
2 - (upd=2) ROWID[ROWID,10], "X"[VARCHAR2,1000]

In addition to the access to the block (1 buffer), the update had to read 3 additional buffers. There are no indexes on this updated column, and therefore Oracle has no index maintenance to do. One buffer is the table block to update (the TABLE ACCESS BY USER ROWID was a consistent get; the update needs the current version of the block).

The additional buffers are from the UNDO tablespace, for MVCC (Multi Version Concurrency Control). This is the first modification in my transaction, so it has to update the transaction table and the undo segment, which is why we see 2 additional buffers. Another update within the same transaction reads only two buffers in total:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID gpqv4k6m1q20y, child number 0
-------------------------------------
update /*+ */ demo1 set x=lpad('z',1000,'z') where rowid='AAAR4WAAMAAAAEaAAF'
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 1 | | 1 (100)| 0 |00:00:00.01 | 2 |
| 1 | UPDATE | DEMO1 | 1 | | | 0 |00:00:00.01 | 2 |
| 2 | TABLE ACCESS BY USER ROWID| DEMO1 | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 1 |
------------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
2 - (upd=2) ROWID[ROWID,10], "X"[VARCHAR2,1000]

Only the table blocks are read: one consistent read as of the beginning of the query (or of the transaction if in serializable isolation level) and one for the current block. Oracle has an optimization called In-Memory UNDO to avoid frequent access to undo blocks.

There are no further re-visits needed. Oracle may choose to come back at commit if it can be done quickly (few blocks still in buffer cache) but that’s not required. The block can stay like this for years without the need to read it again for cleanup. If another session has to read it, then cleanup may be done by this session.

Postgres UPDATE

Here is the same update in Postgres:

explain (analyze,verbose,costs,buffers) update demo1 set x=lpad('y',1000,'y') where ctid='(142,6)' ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on public.demo1 (cost=0.00..4.01 rows=1 width=46) (actual time=0.214..0.214 rows=0 loops=1)
Buffers: shared hit=6 dirtied=3
-> Tid Scan on public.demo1 (cost=0.00..4.01 rows=1 width=46) (actual time=0.009..0.009 rows=1 loops=1)
Output: n, a, 'yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy'::text, ctid
TID Cond: (demo1.ctid = '(142,6)'::tid)
Buffers: shared hit=1
Planning time: 0.405 ms
Execution time: 0.232 ms

The Tid Scan is the same as for the select. Then the update has read 5 blocks and modified 3 of them. The update in Postgres is processed as a delete+insert. Here is my guess about those numbers. The new version is inserted, in a new block if there is no free space in the same block. The old version is updated. And the index must be maintained. Those are 3 blocks to modify. Here, the row was directly accessed through its TID. But we must find the index entry. The row contains the index value, and then an index scan is possible: two block reads for this small index having one branch only.
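
Following that reasoning, here is one way the numbers could add up (again just a guess, not something the plan states explicitly):

Tid Scan to find the old version                : 1 hit
index descent to locate the index entry         : 2 hits (branch + leaf)
old heap block, revisited to mark the old tuple : 1 hit, dirtied
new heap block, receiving the new version       : 1 hit, dirtied
index leaf, updated to point to the new TID     : 1 hit, dirtied
                                                  total: 6 hits, 3 dirtied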

SELECT again

I said that with Oracle the row is updated in-place and doesn’t need further cleanup. If I run the same SELECT as the one I did before the UPDATE, I still have only one block to read:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8mj3ms08x0sfh, child number 0
-------------------------------------
select /*+ */ a from demo1 where rowid='AAAR4WAAMAAAAEaAAF'
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 (100)| 1 |00:00:00.01 | 1 |
| 1 | TABLE ACCESS BY USER ROWID| DEMO1 | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 1 |
-----------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "A"[NUMBER,22]

In Postgres, because the update was processed as insert+delete, running the same also reads only one block, but it returns no rows:

explain (analyze,verbose,costs,buffers) select a from demo1 where ctid='(142,6)' ;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Tid Scan on public.demo1 (cost=0.00..4.01 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
Output: a
TID Cond: (demo1.ctid = '(142,6)'::tid)
Buffers: shared hit=1
Planning time: 0.442 ms
Execution time: 0.028 ms

The new version is in another block, then the TID to find it is different:

select ctid from demo1 where n=1000;
ctid
----------
(1428,5)
(1 row)

There was not enough space for another version of the whole row within the same block. Free space was found in the last block (1428). Of course, this is why the index was updated even if the indexed column did not change: it had to address a different block.

Let’s query with the new TID:

explain (analyze,verbose,costs,buffers) select a from demo1 where ctid='(1428,5)' ;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Tid Scan on public.demo1 (cost=0.00..4.01 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=1)
Output: a
TID Cond: (demo1.ctid = '(1428,5)'::tid)
Buffers: shared hit=1
Planning time: 0.449 ms
Execution time: 0.023 ms

Only one buffer read. However, as we have seen with the Index Only Scan, there is a need for cleanup to avoid Heap Fetches, and the old tuples must be removed at some point, or the updated tables and indexes would grow forever.

There’s only one Postgres access path remaining. That’s for the last post of this series, which will include the table of contents.

 


Postgres vs. Oracle access paths IX – Tid Scan

Wed, 2017-08-23 15:29

In the previous post we have seen how Postgres and Oracle find the table row from the index entry, using the TID / ROWID. I’ll focus on this access path here, and with it I will have covered all Postgres access paths to table data.

Oracle ACCESS BY ROWID

I start with Oracle because we have already seen the TABLE ACCESS BY ROWID. I’ll decompose an index access to the table. The first step is getting the ROWID from the index entry:

SQL> select /*+ */ rowid from demo1 where n=1000;
 
ROWID
------------------
AAASPkAAMAAABIaAAF

The ROWID contains the data object ID (to be able to identify the segment and then the tablespace), the relative file number within the tablespace, the block number within this file, and the row number within the block. This can be stored in 10 bytes. In an index entry, except if this is a global index on a partitioned table, we don’t need the object ID (because there’s a one-to-one relationship between the table and the index objects) and then only 6 bytes are stored in the index entry.
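
As a side note, those components can be decoded directly with DBMS_ROWID — a quick sketch on the ROWID shown above:

select dbms_rowid.rowid_object('AAASPkAAMAAABIaAAF')       data_object_id,
       dbms_rowid.rowid_relative_fno('AAASPkAAMAAABIaAAF') rel_fno,
       dbms_rowid.rowid_block_number('AAASPkAAMAAABIaAAF') block_no,
       dbms_rowid.rowid_row_number('AAASPkAAMAAABIaAAF')   row_no
from dual;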

This is a simple index access and the output (projection) is the ROWID:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 32tsqy19ctmd4, child number 0
-------------------------------------
select /*+ */ rowid from demo1 where n=1000
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 (100)| 1 |00:00:00.01 | 2 |
|* 1 | INDEX UNIQUE SCAN| DEMO1_N | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 2 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("N"=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - ROWID[ROWID,10]

Now with the ROWID, I query a column from the table:
SQL> select /*+ */ a from demo1 where rowid='AAASPkAAMAAABIaAAF';
 
A
----------
1

And the plan is exactly the ‘TABLE ACCESS’ part we have seen in previous posts on index scans:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID c46nq5t0sru8q, child number 0
-------------------------------------
select /*+ */ a from demo1 where rowid='AAASPkAAMAAABIaAAF'
Plan hash value: 3196731035
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 (100)| 1 |00:00:00.01 | 1 |
| 1 | TABLE ACCESS BY USER ROWID| DEMO1 | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 1 |
-----------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "A"[NUMBER,22]

There’s no Predicate section visible here, but the access is done on the ROWID which contains the file number, block number, and row number. This is the fastest way to get one row: reading only one buffer.

Postgres Tid Scan

Same idea in Postgres where we can query the TID (Tuple ID):

select ctid from demo1 where n=1000 ;
ctid
---------
(142,6)
(1 row)

Because my table is stored in a file (no tablespace with multiple data files here) the TID contains only the block number and the row number within the block.
explain (analyze,verbose,costs,buffers) select ctid from demo1 where n=1000 ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using demo1_n on public.demo1 (cost=0.29..8.30 rows=1 width=6) (actual time=0.009..0.009 rows=1 loops=1)
Output: ctid
Index Cond: (demo1.n = 1000)
Buffers: shared hit=3
Planning time: 0.429 ms
Execution time: 0.023 ms

We have already seen the cost of this operation: a startup cost of 0.29 (116 startup operations at cpu_operator_cost=0.0025), 2 index pages read at random_page_cost=4, and 1 result row at cpu_tuple_cost=0.01 (note that the query planner does not count the cpu_index_tuple_cost here).
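
Putting it together with the default parameters (a quick check of the 0.29..8.30 figures in the plan):

select 116*0.0025                as startup_cost,
       116*0.0025 + 2*4 + 1*0.01 as total_cost;

This returns 0.29 and 8.30.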

Then here is the query using this TID:
explain (analyze,verbose,costs,buffers) select a from demo1 where ctid='(142,6)' ;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Tid Scan on public.demo1 (cost=0.00..4.01 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=1)
Output: a
TID Cond: (demo1.ctid = '(142,6)'::tid)
Buffers: shared hit=1
Planning time: 0.351 ms
Execution time: 0.017 ms

The cost estimation is very simple here: 1 seek()+read() at random_page_cost=4 and 1 result row at cpu_tuple_cost=0.01

Since the post on Index Only Scan, I’ve been working on a vacuumed table with no modifications. Now that I have the simplest access path, I’ll show it with an update, in the next post.

 


Improving Statspack Experience

Wed, 2017-08-23 14:41

A few months ago I published an article in the UKOUG OracleScene magazine on Improving the Statspack Experience: quick setup script, changing settings, filling Idle Events, etc. In the article, I used dbms_job to schedule the snapshots, because I have had this script for years and never took the time to rewrite it with dbms_scheduler. Thanks to Nicolas Jardot, here is the equivalent script using dbms_scheduler.

The idea is to run this script on each instance (when in RAC) so that each instance gets its own jobs calling statspack.snap and statspack.purge.

DECLARE
instno NUMBER;
snapjob VARCHAR2(30);
purgejob VARCHAR2(30);
BEGIN
select instance_number into instno from v$instance;
snapjob := 'PERFSTAT.STATSPACK_SNAP_' || instno;
purgejob := 'PERFSTAT.STATSPACK_PURGE_' || instno;
 
DBMS_SCHEDULER.CREATE_JOB (
job_name => snapjob,
job_type => 'PLSQL_BLOCK',
job_action => 'statspack.snap;',
number_of_arguments => 0,
start_date => systimestamp,
repeat_interval => 'FREQ=HOURLY;BYTIME=0000;BYDAY=MON,TUE,WED,THU,FRI,SAT,SUN',
end_date => NULL,
enabled => TRUE,
auto_drop => FALSE,
comments => 'Take hourly Statspack snapshot');
 
DBMS_SCHEDULER.CREATE_JOB (
job_name => purgejob,
job_type => 'PLSQL_BLOCK',
job_action => 'statspack.purge(i_num_days=>31,i_extended_purge=>true);',
number_of_arguments => 0,
start_date => systimestamp,
repeat_interval => 'FREQ=WEEKLY;BYTIME=120000;BYDAY=SUN',
end_date => NULL,
enabled => TRUE,
auto_drop => FALSE,
comments => 'Weekly purge Statspack snapshot');
 
DBMS_SCHEDULER.SET_ATTRIBUTE( name => snapjob, attribute => 'logging_level', value => DBMS_SCHEDULER.LOGGING_OFF );
DBMS_SCHEDULER.SET_ATTRIBUTE( name => snapjob, attribute => 'INSTANCE_ID', value=>instno);
DBMS_SCHEDULER.SET_ATTRIBUTE( name => snapjob, attribute => 'INSTANCE_STICKINESS', value=>TRUE);
 
DBMS_SCHEDULER.SET_ATTRIBUTE( name => purgejob, attribute => 'logging_level', value => DBMS_SCHEDULER.LOGGING_OFF );
DBMS_SCHEDULER.SET_ATTRIBUTE( name => purgejob, attribute => 'INSTANCE_ID', value=>instno);
DBMS_SCHEDULER.SET_ATTRIBUTE( name => purgejob, attribute => 'INSTANCE_STICKINESS', value=>TRUE);
END;
/
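
If you need to re-create the jobs later (for example to change the schedule), they can be dropped first — a minimal sketch using the names generated above for instance 1:

exec dbms_scheduler.drop_job('PERFSTAT.STATSPACK_SNAP_1');
exec dbms_scheduler.drop_job('PERFSTAT.STATSPACK_PURGE_1');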

I used the Oracle Cloud Service to quickly provision a two-node RAC database to validate this, and here I check the scheduling:

[oracle@rac-dg01-1 admin]$ alias sql='JAVA_HOME=$ORACLE_HOME/jdk bash $ORACLE_HOME/sqldeveloper/sqlcl/bin/sql'
[oracle@rac-dg01-1 admin]$ TWO_TASK=//10.31.143.86/pdb1.a511644.oraclecloud.internal sql sys/"Ach1z0#d" as sysdba
 
SQLcl: Release 12.2.0.1.0 RC on Wed Aug 23 18:57:12 2017
 
Copyright (c) 1982, 2017, Oracle. All rights reserved.
 
Connected to:
Oracle Database 12c EE Extreme Perf Release 12.2.0.1.0 - 64bit Production
 
SQL> set sqlformat ansiconsole
SQL> select job_name, state, enabled, next_run_date, instance_stickiness, instance_id from dba_scheduler_jobs where owner='PERFSTAT';
JOB_NAME STATE ENABLED NEXT_RUN_DATE INSTANCE_STICKINESS INSTANCE_ID
-------- ----- ------- ------------- ------------------- -----------
STATSPACK_SNAP_1 SCHEDULED TRUE 23-AUG-17 07.00.00.981193000 PM +00:00 TRUE 1
STATSPACK_PURGE_1 SCHEDULED TRUE 27-AUG-17 12.00.00.074939000 PM +00:00 TRUE 1
STATSPACK_SNAP_2 SCHEDULED TRUE 23-AUG-17 07.00.00.644681000 PM +00:00 TRUE 2
STATSPACK_PURGE_2 SCHEDULED TRUE 27-AUG-17 12.00.00.755685000 PM +00:00 TRUE 2

One hour later, the job has run on each instance:

JOB_NAME STATE ENABLED NEXT_RUN_DATE INSTANCE_STICKINESS INSTANCE_ID
-------- ----- ------- ------------- ------------------- -----------
STATSPACK_SNAP_1 SCHEDULED TRUE 23-AUG-17 08.00.00.325755000 PM +00:00 TRUE 1
STATSPACK_PURGE_1 SCHEDULED TRUE 27-AUG-17 12.00.00.074939000 PM +00:00 TRUE 1
STATSPACK_SNAP_2 SCHEDULED TRUE 23-AUG-17 08.00.00.644681000 PM +00:00 TRUE 2
STATSPACK_PURGE_2 SCHEDULED TRUE 27-AUG-17 12.00.00.755685000 PM +00:00 TRUE 2

Now I run spreport to see which instances have snapshots:

[oracle@rac-dg01-1 admin]$ TWO_TASK=//10.31.143.86/pdb1.a511644.oraclecloud.internal/cdb12 sqlplus sys/"Ach1z0#d" as sysdba @ spreport
 
Instances in this Statspack schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
DB Id Inst Num DB Name Instance Host
----------- -------- ------------ ------------ ------------
924704606 2 CDB1 cdb12 rac-dg01-2
924704606 1 CDB1 cdb11 rac-dg01-1
 
Using 924704606 for database Id
Using 2 for instance number

Here it is. dbms_job is deprecated. Let’s use dbms_scheduler.

 


Bequeath connect to PDB: set container in logon trigger?

Wed, 2017-08-23 00:54

There are only small changes when you go to the multitenant architecture, and one of them is that you must connect with a service name: you cannot connect directly to a PDB with a bequeath (aka local) connection. This post is about a workaround you may have in mind: create a common user and set a logon trigger to ‘set container’. I do not recommend it and you should really connect with a service. Here is an example.

Imagine that I have a user connecting with bequeath connection to a non-CDB, using user/password without a connection string, the database being determined by the ORACLE_SID. And I want to migrate to CDB without changing anything on the client connection configuration side. The best idea would be to use a service, explicitly or implicitly with TWO_TASK or LOCAL. But let’s imagine that you don’t want to change anything on the client side.

As we can connect only to the CDB$ROOT with a bequeath connection, we have to create a common user. Because the idea is not to change anything on the client configuration, and there’s very little chance that the existing user name starts with C##, I start by removing the mandatory prefix for common users.


SQL> show parameter common_user_prefix
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
common_user_prefix string
 
SQL> alter system set common_user_prefix='' scope=spfile;
System altered.
 
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
...

Then I create my common user:

SQL> create user MYOLDUSER identified by covfefe container=all;
User created.

This user must be able to connect to the CDB:

SQL> grant create session to MYOLDUSER container=current;
Grant succeeded.

And then I want it to switch immediately to PDB1 using a logon trigger:

SQL> create or replace trigger SET_CONTAINER_AT_LOGON after logon on database
2 when (user in ('MYOLDUSER'))
3 begin
4 execute immediate 'alter session set container=PDB1';
5 end;
6 /
Trigger created.

Once on PDB1 this user will have some privileges, and for the example I will grant him a default role:

SQL> alter session set container=PDB1;
Session altered.
 
SQL> create role MYROLE;
Role created.
 
SQL> grant MYROLE to MYOLDUSER container=current;
Grant succeeded.

The documentation says that “When you grant a role to a user, the role is granted as a default role for that user and is therefore enabled immediately upon logon”, so I don’t need to run:

SQL> alter user MYOLDUSER default role MYROLE;
User altered.

But the doc says ‘logon’, and technically I do not log on to PDB1: I just set container. However, if you test it, you will see that default roles are also set on ‘set container’. And anyway, we cannot set a role in a procedure, neither with ‘set role’ nor with dbms_session.set_role:

ORA-06565: cannot execute SET ROLE from within stored procedure

Then, I can now connect locally to the CDB$ROOT with this user:

SQL> connect MYOLDUSER/covfefe
Connected.

And I’m automatically switched to the PDB1:

SQL> show con_name
 
CON_NAME
------------------------------
PDB1

Issue #1: default roles

However the default roles are not set:

SQL> select * from session_roles;
 
no rows selected

I have to set the role once connected:

SQL> set role all;
Role set.
 
SQL> select * from session_roles;
 
ROLE
--------------------------------------------------------------------------------
MYROLE

This is probably not what we want when we cannot change anything on the application side. This is considered as a bug (Bug 25081564 : ALTER SESSION SET CONTAINER IN “ON LOGON TRIGGER” IS NOT WORKING) fixed in 18.1 (expected in Q1 2018) and there’s a patch for 12.1 and 12.2 https://updates.oracle.com/download/25081564.html

Issue #2: core dump

There’s another issue. If you run the same with SQLcl you have a core dump in the client library libclntsh.so on kpuSetContainerNfy

SQLcl: Release 17.2.0 Production on Tue Aug 22 22:00:52 2017
 
Copyright (c) 1982, 2017, Oracle. All rights reserved.
 
SQL> connect MYOLDUSER/covfefe
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fcaa172faf6, pid=31242, tid=140510230116096
#
# JRE version: Java(TM) SE Runtime Environment (8.0_91-b14) (build 1.8.0_91-b14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.91-b14 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libclntsh.so.12.1+0x11d8af6] kpuSetContainerNfy+0x66
#
# Core dump written. Default location: /media/sf_share/122/blogs/core or core.31242

There’s an SR opened for that. This is not a no-go because, the context being no change to the client part, sqlplus will probably be used anyway. However, that’s another point which shows that ‘set container’ in a logon trigger may have some implementation problems.

Issue #3: security

In my opinion, there is a bigger problem here. With sqlplus (or with sqlcl not using a local connection) I can connect to the CDB$ROOT and switch to PDB1. But look at all the commands above… where did I grant the ‘set container’ privilege to MYOLDUSER on the PDB1 container? Nowhere. MYOLDUSER has no create session and no set container privilege on PDB1, but is able to connect to PDB1 thanks to the logon trigger. Of course, the logon trigger is defined by a DBA who knows what he does. But in my opinion, it is not a good idea to bypass the privilege checking.

So what?

With no default roles, and a connection made without the right privileges, the security model is bypassed here. And disabling the common user prefix will raise other issues one day with plugging operations. So, in my opinion, this is not a solution to work around the need to connect with a service, especially in the context of a legacy application where you cannot change the way it connects: you just postpone the problems to bigger ones later.

The real solution is to connect to a service (and that’s not difficult even when you can’t change the code, with TWO_TASK environment variable).

 


Get trace file from server to client

Sun, 2017-08-20 15:20

The old way to get a user dump trace file, for sql_trace (10046), Optimizer compilation trace (10053), lock trace (10704), Optimizer execution trace (10507),… is to go to the trace directory on the server. But if you don’t have access to the server (as in the ☁) the modern (12cR2) way is to select from V$DIAG_TRACE_FILE_CONTENTS. Before everybody is on 12.2, I’m sharing here a sqlplus script that I have used for a long time to get the trace file to the client.
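
For reference, on 12.2 the dictionary-based way can be as simple as this (just a sketch: TRACE_FILENAME and PAYLOAD are columns of the view, the LINE_NUMBER ordering is my assumption and worth verifying on your version):

select payload
from v$diag_trace_file_contents
where trace_filename=(select substr(value,instr(value,'/',-1)+1) from v$diag_info where name='Default Trace File')
order by line_number;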

This is a script to run with sqlplus, passing as a parameter the name of the trace file you want to write to (because the original name on the server is not very meaningful). I close all cursors to get all info such as the STAT lines. I get the trace file name from v$diag_info. If a directory object exists for the trace directory, I use it. If not, I create one, named UDUMP, which I remove at the end. Then the file is read by utl_file and spooled to the client through dbms_output. At the end, I remove the trace file from the server.

Here is the script – comments welcome.

set serveroutput on feedback off verify off termout off linesize 1024 trimspool on echo off
spool &1..trc
declare
fd utl_file.file_type;
line varchar2(1024);
l_tracename varchar2(512);
l_directory_path all_directories.directory_path%TYPE;
l_directory_name all_directories.directory_name%TYPE;
l_directory_created boolean;
procedure t (msg in varchar2) is begin dbms_output.put_line(msg); end;
begin
/* use old parameter _cached_open_cursors to close all open cursors */
for r in (select 1 from v$session_cursor_cache where count>0) loop
dbms_session.set_close_cached_open_cursors(true);
rollback;
commit;
dbms_session.set_close_cached_open_cursors(false);
end loop;
/* get trace directory and trace file name */
select value into l_directory_path from v$diag_info where name='Diag Trace';
select substr(replace(value,l_directory_path,''),2) into l_tracename from v$diag_info where name='Default Trace File';
/* get directory name for it, or try to create it */
l_directory_created:=false;
begin
select directory_name into l_directory_name from all_directories where directory_path = l_directory_path and rownum=1;
exception
when no_data_found then
begin
l_directory_name:='UDUMP';
execute immediate 'create directory '||l_directory_name||' as '''||l_directory_path||'''';
l_directory_created:=true;
exception when others then
raise_application_error(-20000,'You must CREATE DIRECTORY '||l_directory_name||' as ''' ||l_directory_path||'''; or be granted CREATE DIRECTORY.');
end;
end;
/* opens the trace file */
begin
fd:=utl_file.fopen(l_directory_name,l_tracename,'R');
exception when others then
raise_application_error(-20001,'Impossible to open file: '||l_tracename||' in '||l_directory_name||' ( '||l_directory_path||' )');
end;
/* read the trace file and prints it */
begin
loop
begin
utl_file.get_line(fd,line);
dbms_output.put_line(line);
exception
when no_data_found then exit;
when others then
dbms_output.put_line('!!! error while reading file '||l_tracename||' in '||l_directory_name||' ( '||l_directory_path||' )');
end;
end loop;
/* close and remove the file from the server */
utl_file.fclose(fd);
utl_file.fremove(l_directory_name,l_tracename);
exception
when others then
raise_application_error(-20002,'Impossible to remove: '||l_tracename||' in '||l_directory_name||' ( '||l_directory_path||' )');
end;
begin
/* drop directory if created */
if l_directory_created then execute immediate 'drop directory '||l_directory_name; end if;
exception
when others then
raise_application_error(-20002,'Impossible to remove directory: '||l_directory_name||' ( '||l_directory_path||' )'||sqlerrm);
end;
end;
/
spool off
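
Typical usage looks like this (just an illustration; I assume the script is saved as gettrace.sql and the argument is the local file name without extension):

SQL> alter session set sql_trace=true;
SQL> select count(*) from dual;
SQL> alter session set sql_trace=false;
SQL> @gettrace mytrace

This leaves mytrace.trc on the client, thanks to the spool &1..trc at the top of the script.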

 


Postgres vs. Oracle access paths VIII – Index Scan and Filter

Sun, 2017-08-20 07:40

In the previous post we have seen a nice optimization to lower the consequences of bad correlation between the index and the table physical order: a bitmap, which may include false positives and then requires a ‘recheck’ of the condition, but with the goal of reading each page only once. Now we are back to the well-clustered table where we have seen two possible access paths: Index Only Scan when all the columns we need are in the index, and Index Scan when we select additional columns. Here is a case in the middle: the index does not have all the columns required by the select, but can eliminate all rows.

The table created is:

create table demo1 as select generate_series n , 1 a , lpad('x',1000,'x') x from generate_series(1,10000);
SELECT 10000
create unique index demo1_n on demo1(n);
CREATE INDEX
vacuum demo1;
VACUUM

Index Only Scan and Filter

I use only one column (N), which is indexed, in the SELECT clause and in the WHERE clause. And this WHERE clause is silly: in addition to the n<=1000 I've used in the previous posts to focus on 10% of rows, I add a condition which is always false: mod(n,100)=1000

explain (analyze,verbose,costs,buffers) select n from demo1 where n<=1000 and mod(n,100)=1000 ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Only Scan using demo1_n on public.demo1 (cost=0.29..38.78 rows=5 width=4) (actual time=0.276..0.276 rows=0 loops=1)
Output: n
Index Cond: (demo1.n <= 1000)
Filter: (mod(demo1.n, 100) = 1000)
Rows Removed by Filter: 1000
Heap Fetches: 0
Buffers: shared hit=5
Planning time: 0.454 ms
Execution time: 0.291 ms

Index Only Scan is used here because no other columns are needed. The n<=1000 is the access condition (Index Cond) doing a range scan on the index structure. The mod(n,100)=1000 is a filter predicate applied to the result of the index access (Filter), and we have the additional information that the 1000 rows selected by the access predicate have all been filtered out (Rows Removed by Filter). During the execution, 5 index buffers have been read for the range scan (branches + leaves). Because the table was vacuumed and not modified since, the visibility map knows that all rows are visible and there are no blocks to read from the table (Heap Fetches: 0).

Now I’ll select another column in order to see an Index Scan. We have seen in the previous post that the huge cost of index access is the access to the table. Filtering most of the rows from the index entries is the most common recommendation to optimize a query. And my example here is running the extreme case: a predicate on the indexed column removes all rows.

Index Scan and Filter

I’ve just changed the ‘select n’ to ‘select a’:


explain (analyze,verbose,costs,buffers) select a from demo1 where n<=1000 and mod(n,100)=1000 ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Index Scan using demo1_n on public.demo1 (cost=0.29..184.78 rows=5 width=4) (actual time=0.427..0.427 rows=0 loops=1)
Output: a
Index Cond: (demo1.n <= 1000)
Filter: (mod(demo1.n, 100) = 1000)
Rows Removed by Filter: 1000
Buffers: shared hit=147
Planning time: 0.434 ms
Execution time: 0.440 ms

I can understand that the cost is higher. The optimizer may not know that mod(n,100) will never be equal to 1000. Estimating 5 rows, as in the previous case, is ok for me. We see different Output (different SELECT clause) but same information about Index Cond, Filter, and Rows Removed (same WHERE clause). The estimation part looks good.

However, there’s something that I can’t understand. At execution time, we know that all rows can be eliminated before going to the table: we go to the table to get the value of A, but all rows were already filtered out from the index. At least that was the case with the Index Only Scan, and we know that the filter condition uses only values that are available from the index.

However, 147 blocks were read here. We have seen that the index scan reads 5 index pages, and then we can guess that 142 table pages have been read, exactly 10% of the pages from my correlated table. It seems that all rows have been read from the table before being filtered out. The Index Scan being one operation, the filter occurs at the end only. This is only my guess and I hope to get comments about that.

Oracle

With Oracle, the first query, selecting only indexed columns, is an INDEX RANGE SCAN similar to the Postgres one.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID fj36y2vph9u8f, child number 0
-------------------------------------
select /*+ */ n from demo1 where n<=1000 and mod(n,100)=1000
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 (100)| 0 |00:00:00.01 | 3 |
|* 1 | INDEX RANGE SCAN| DEMO1_N | 1 | 10 | 4 (0)| 0 |00:00:00.01 | 3 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("N"<=1000)
filter(MOD("N",100)=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

Oracle does not know either that the filter predicate mod(n,100)=1000 eliminates all rows. It estimates this kind of function-equality predicate with a generic 1% selectivity, applied after the access predicate which returns 10% of the rows (this one is calculated from statistics): 10000 x 10% x 1% gives the 10 estimated rows (E-Rows). 3 blocks were read: index branch + leaves.

Reading an additional column from the table does not change this INDEX RANGE SCAN operation but just adds one step to go to the table:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 1rpmvq3jj8hgq, child number 0
-------------------------------------
select /*+ */ a from demo1 where n<=1000 and mod(n,100)=1000
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 (100)| 0 |00:00:00.01 | 3 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO1 | 1 | 10 | 6 (0)| 0 |00:00:00.01 | 3 |
|* 2 | INDEX RANGE SCAN | DEMO1_N | 1 | 10 | 4 (0)| 0 |00:00:00.01 | 3 |
----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N"<=1000)
filter(MOD("N",100)=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "A"[NUMBER,22] 2 - "DEMO1".ROWID[ROWID,10]

With two operations, the filter removes rows from the output of the index range scan at line 2, and the table is accessed only for the rows that remain. There is no additional buffer read on step 1 when no rows remain. With Oracle, we build indexes to optimize the access predicates and we add columns to optimize the filter predicates. We can go further by adding all the projected columns and completely avoid the access to the table, but that is not always needed: if we can apply all the WHERE clause filters on the indexed columns, then the access to the table remains proportional to the result, and the end-user usually accepts a longer response time for a longer result. Index access response time is proportional to the result size.
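As an illustration only (a sketch, not something the original demo needs), an additional index that contains the filter expression and the projected column would let Oracle evaluate the whole query on the index entries:

-- hypothetical covering index: the access column (n), the filter expression (mod(n,100))
-- and the projected column (a) are all present in the index entries,
-- so rows can be filtered and projected without visiting the table at all
create index demo1_n_mod_a on demo1(n, mod(n,100), a);

Whether the optimizer actually uses it depends, as always, on its cost estimation.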

The decomposition into two operations is also convenient to see which column projection is done on the index result and which on the table result. Here the only output of the index range scan at line 2 is the ROWID, and the output of the table access at line 1 is the column we select. So we have two operations here: we have seen that the INDEX RANGE SCAN can run alone, and we will see in the next post that the TABLE ACCESS BY INDEX ROWID can also run alone.

So what?

I hope that Postgres experts will comment about the need to read the table pages even when all rows can be filtered out from the index. We can do something similar by re-writing the query, and then we can see that the access to the table is never executed:

explain (analyze,verbose,costs,buffers) select a from demo1 where n in (select n from demo1 where n<=1000 and mod(n,100)=1000 ) ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.57..76.35 rows=5 width=4) (actual time=0.285..0.285 rows=0 loops=1)
Output: demo1.a
Buffers: shared hit=5
-> Index Only Scan using demo1_n on public.demo1 demo1_1 (cost=0.29..34.78 rows=5 width=4) (actual time=0.284..0.284 rows=0 loops=1)
Output: demo1_1.n
Index Cond: (demo1_1.n <= 1000)
Filter: (mod(demo1_1.n, 100) = 1000)
Rows Removed by Filter: 1000
Heap Fetches: 0
Buffers: shared hit=5
-> Index Scan using demo1_n on public.demo1 (cost=0.29..8.30 rows=1 width=8) (never executed)
Output: demo1.n, demo1.a, demo1.x
Index Cond: (demo1.n = demo1_1.n)

But this involves a join, and join methods will deserve another series of blog posts. The next one on access paths will show the TABLE ACCESS BY INDEX ROWID equivalent, Tid Scan. Then I’ll have covered all access paths.

 

Cet article Postgres vs. Oracle access paths VIII – Index Scan and Filter est apparu en premier sur Blog dbi services.

Alfresco – Unable to see the content of folders with ‘+’ in the name

Wed, 2017-08-16 15:23

As you might know if you are following our blogs (see this one for example), we have been using Alfresco Community Edition internally for some years now and we also have a few customers using it. Today, I will present a small issue I faced with Alfresco, which wasn’t able to display the content of a few – very specific – folders… As you will see below, it was actually not related to Alfresco, but it is a good example.

 

For some background on this issue, an end-user contacted me saying that he wasn’t able to see the content of three folders in which he was sure there were at least some PDFs. So I checked and indeed, even as an Alfresco Admin, I wasn’t able to see the content of these folders using Share. The only common point I could see between these three folders is that they all contained a “+” in their name. I recently upgraded and migrated this Alfresco environment, so I was fairly sure this was somehow linked (the folders existed and were working before the upgrade+migration). For this blog, I will use a test folder named “Test+Folder” and I’m using an Apache HTTPD as a front end. I put this folder under the Shared folder:

URL: https://alfresco_server_01/share/page/context/shared/sharedfiles

 

Clicking on this folder’s name from the central screen or from the browser tree on the left side will result in the following screen:

URL: https://alfresco_server_01/share/page/context/shared/sharedfiles#filter=path|%2FTest%2BFolder|&page=1

 

As you can see above, the URL accessed is the correct one: it contains the “/Test+Folder” path (%2F is “/” and %2B is “+”). However, the screen is empty, just as if this path did not exist, and the breadcrumb shows “Shared Files” while it should show “Shared Files > Test+Folder”. So in summary, the Share UI isn’t able to display the content of this folder. For testing purposes, I accessed this folder using WebDAV and AOS, and these clients were able to see the content of the folder, but not Share. Like I mentioned above, this Alfresco environment is using Apache HTTPD as a front-end and then mod_jk for the communication with the Tomcat. You can take a look at this blog or the official documentation for more information on this setup.

 

Since the other clients were working properly, I tried to access Alfresco Share without going through the front-end (so accessing Tomcat directly) in order to ensure that the issue wasn’t with Alfresco itself. By doing so, I was able to see the PDFs. If the issue isn’t with Alfresco, then it should be with the front-end, and in particular with mod_rewrite and mod_jk in this case. The idea here is to check what these two mods are doing and then, using the access logs, compare the outcome with the initial request and with what is being transferred to Tomcat.

 

While doing this analysis, I found out that mod_jk was most probably the root cause of this issue. When mod_jk does its job, it decodes the URL’s encoded characters: %2F for “/”, %2B for “+”, %20 for a space, and so on. Then, once the rules have been processed, it will by default re-encode these special characters before transmitting the request to Tomcat. However, in the Tomcat access logs, the other special characters were indeed present in their encoded form, but it wasn’t the case for the “+” sign, which was shown as-is (so no %2B anymore).

This is an example (these are not the real requests exchanged in the background, but the URLs as seen in a web browser, so it gives a good illustration):

  • URL accessed:                    https://alfresco_server_01/share/page/context/shared/sharedfiles#filter=path|%2FTest%2BFolder|&page=1
  • URL decoded by mod_jk:  https://alfresco_server_01/share/page/context/shared/sharedfiles#filter=path|/Test+Folder|&page=1
  • URL sent to Tomcat:          https://alfresco_server_01/share/page/context/shared/sharedfiles#filter=path|%2FTest+Folder|&page=1

 

For some reason (it looks like a bug), mod_jk does not re-encode the “+” sign and this prevents the Tomcat in the back-end from completing the request properly. The behavior of mod_jk related to the URIs is managed using the JkOptions. This property can have the following values regarding the forwarded URI:

  • JkOptions     +ForwardURIProxy
    • Default value in version > mod_jk 1.2.23
    • The forwarded URI will be partially re-encoded after processing inside Apache and before forwarding to Tomcat
  • JkOptions     +ForwardURICompatUnparsed
    • Default value in version = mod_jk 1.2.23
    • The forwarded URI will be unparsed so no decoding nor re-encoding
  • JkOptions     +ForwardURICompat
    • Default value in version < mod_jk 1.2.23
    • The forwarded URI will be decoded by Apache
  • JkOptions     +ForwardURIEscaped
    • The forwarded URI will be the encoded form of the URI used by ForwardURICompat

 

I’m using a fairly recent version of mod_jk on this Alfresco environment, so the default value is “ForwardURIProxy”. Therefore, on paper, it should work properly since URIs will be decoded and re-encoded… However, we saw above that this works for everything but the “+” sign, which is not re-encoded.

 

To work around this issue, I just updated the JkOptions to “JkOptions     +ForwardURICompatUnparsed” in my Apache HTTPD configuration and, after a reload of the conf, I was able to access the content of the folder:

URL: https://alfresco_server_01/share/page/context/shared/sharedfiles#filter=path|%2FTest%2BFolder|&page=1
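For reference, the change itself is a single directive in the Apache HTTPD configuration; where exactly you set it (global mod_jk configuration or VirtualHost) depends on your own setup:

# switch mod_jk from the default ForwardURIProxy to ForwardURICompatUnparsed
# so that the URI is forwarded to Tomcat exactly as received, without decoding/re-encoding
JkOptions     +ForwardURICompatUnparsed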

 

Please note that “ForwardURICompatUnparsed” will always forward the original request URI, so rewriting URIs with mod_rewrite might not work properly; it all depends on how you configured the rewrite rules and what you need to do with them. Honestly, I don’t think this issue affects all mod_jk versions above 1.2.23, since I was using version 1.2.40 on RedHat before and never noticed such an issue, so it might be linked to a specific mod_jk version on a specific OS (Ubuntu?). So basically, if you don’t have this issue, I would suggest you keep the default JkOptions.

 

 

Cet article Alfresco – Unable to see the content of folders with ‘+’ in the name est apparu en premier sur Blog dbi services.

Replicating specific tables in PostgreSQL 10 Beta with mimeo

Tue, 2017-08-15 11:31

In this blog I am going to test the extension mimeo with PostgreSQL 10 beta. Mimeo is a replication extension for copying specific tables in one of several specialized ways from any number of source databases to a destination database where mimeo is installed.
In our configuration we are going to replicate data on the same server, but between 2 clusters running on different ports. The procedure is the same between different servers: pg_hba.conf just needs to be configured to allow the remote connection (a minimal sketch is shown after the connection details below).
Source
Hostname: pgservertools.localdomain (192.168.56.30)
Database: prima (port 5432)
Target
Hostname: pgservertools.localdomain (192.168.56.30)
Database: repl (port 5433)
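For example, a minimal pg_hba.conf entry on the source cluster could look like the following (the md5 authentication method is just an assumption, adapt it to your policy):

# pg_hba.conf of the source cluster (port 5432): allow the mimeo user
# to connect to the prima database from the destination host
host    prima    mimeo    192.168.56.30/32    md5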
The first thing is to install the extension on the destination server. For this we will use git to clone the extension repository onto the server.
[root@pgservertools ~]# yum install perl-Git.noarch
[root@pgservertools ~]# git clone git://github.com/omniti-labs/mimeo.git
Cloning into 'mimeo'...
remote: Counting objects: 1720, done.
remote: Total 1720 (delta 0), reused 0 (delta 0), pack-reused 1720
Receiving objects: 100% (1720/1720), 1.24 MiB | 429.00 KiB/s, done.
Resolving deltas: 100% (1094/1094), done.
[root@pgservertools ~]#

Then in the mimeo directory let’s run following commands

[root@pgservertools mimeo]# make
[root@pgservertools mimeo]# make install

If there is no error, we can create our two databases. The source database will be named prima and the target will be named repl.

[postgres@pgservertools postgres]$ psql
psql (10beta2)
Type "help" for help.
.
postgres=# show port;
port
------
5432
(1 row)
.
postgres=# create database prima;
CREATE DATABASE
postgres=#


postgres=# show port;
port
------
5433
(1 row)
.
postgres=# create database repl;
CREATE DATABASE
postgres=#

Now we have to install the extension mimeo in the destination database repl.
The extension mimeo requires the extension dblink. If this extension is not present, an error will be raised

repl=# create schema mimeo;
CREATE SCHEMA
.
repl=# create extension mimeo schema mimeo;
ERROR: required extension "dblink" is not installed
HINT: Use CREATE EXTENSION ... CASCADE to install required extensions too.
repl=#

The extension dblink should already be present with the standard installation. This can be verified by listing the files in the extension directory.

[root@pgservertools extension]# pwd
/usr/pgsql-10/share/extension
.
[root@pgservertools extension]# ls -l dblink*
-rw-r--r--. 1 root root 419 Jul 13 12:15 dblink--1.0--1.1.sql
-rw-r--r--. 1 root root 2832 Jul 13 12:15 dblink--1.1--1.2.sql
-rw-r--r--. 1 root root 6645 Jul 13 12:15 dblink--1.2.sql
-rw-r--r--. 1 root root 170 Jul 13 12:15 dblink.control
-rw-r--r--. 1 root root 2863 Jul 13 12:15 dblink--unpackaged--1.0.sql
[root@pgservertools extension]#

So re-executing the instruction with the cascade option will install the extension dblink as well.

repl=# create extension mimeo schema mimeo cascade;
NOTICE: installing required extension "dblink"
CREATE EXTENSION
repl=#

On the target database let’s create a user mimeo that we will use for the replication and let’s give him all required privileges. Superuser is not needed but will also work.

repl=# create user mimeo password 'root';
CREATE ROLE
repl=# GRANT USAGE ON SCHEMA mimeo TO mimeo;
GRANT
repl=# GRANT USAGE ON SCHEMA public TO mimeo;
GRANT
repl=# GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA mimeo TO mimeo;
GRANT
repl=#

On the source database let’s create the same user and give him the required privileges

prima=# create user mimeo password 'root';
CREATE ROLE
prima=# CREATE SCHEMA mimeo;
CREATE SCHEMA
prima=# ALTER SCHEMA mimeo OWNER TO mimeo;
ALTER SCHEMA
prima=#

Every source database needs to have its connection information stored in mimeo’s dblink_mapping_mimeo table on the destination database. You can have as many source databases as you need, which makes creating a central replication destination for many master databases easy. All data is pulled by the destination database, never pushed by the source.

repl=# INSERT INTO mimeo.dblink_mapping_mimeo (data_source, username, pwd)
VALUES ('host=192.168.56.30 port=5432 dbname=prima', 'mimeo', 'root');
INSERT 0 1
repl=#

On the source let’s create the table to be replicated and insert some data

prima=# create table article(idart int primary key, name varchar(20));
CREATE TABLE


prima=# table article;
idart | name
-------+-------
1 | Paper
2 | Food
(2 rows)
prima=#

Grant required privilege to mimeo

prima=# grant select on article to mimeo;
GRANT
prima=# grant trigger on article to mimeo;
GRANT
prima=#

Now we are ready to start the replication. We have three methods of replication:
-Snapshot replication
-Incremental replication
-DML replication
We will discuss only the snapshot and DML methods. Indeed, the incremental method can only replicate inserted and updated data; it will not replicate any deleted data. See the documentation here

Snapshot Replication
This method is the only one that replicates both data and structure changes (adding a column, …)
To initialize the table we use the function snapshot_maker (as we are using snapshot replication) and we pass as arguments the table to be replicated and the id of the dblink we want to use.

repl=# select * from mimeo.dblink_mapping_mimeo ;
data_source_id | data_source | username | pwd
----------------+-------------------------------------------+----------+------
1 | host=192.168.56.30 port=5432 dbname=prima | mimeo | root

So the following command is used to initialize the table

repl=# SELECT mimeo.snapshot_maker('public.article', 1);
NOTICE: attempting first snapshot
NOTICE: attempting second snapshot
NOTICE: Done
snapshot_maker
----------------
.
(1 row)
repl=#

And we can easily verify that the two tables are synchronized.

repl=# table article;
idart | name
-------+-------
1 | Paper
2 | Food
(2 rows)
repl=#

Now let’s insert new data in the source table and let’s see how to refresh the target table.

prima=# table article;
idart | name
-------+-------
1 | Paper
2 | Food
3 | Oil
4 | Books

On the target database we just have to use the refresh_snap function

repl=# SELECT mimeo.refresh_snap('public.article');
refresh_snap
--------------
.
(1 row)

And we see that the target table was updated.

repl=# table article;
idart | name
-------+-------
1 | Paper
2 | Food
3 | Oil
4 | Books
(4 rows)
repl=#

A refresh can be scheduled using crontab, for example every two minutes in my case:

[postgres@pgservertools ~]$ crontab -l
*/2 * * * * psql -p 5433 -d repl -c "SELECT mimeo.refresh_snap('public.article')";
[postgres@pgservertools ~]$

DML Replication
The snapshot method is easier to set up, but it is not recommended for large tables, as a table set up with this method has its entire contents refreshed every time it is run.
So for large tables, DML replication is recommended.
Let’s create a table customers on the source

prima=# create table customers(idcust int primary key, name varchar(30));
CREATE TABLE
.
prima=# insert into customers values(1,'Dbi');
INSERT 0 1
prima=# insert into customers values(2,'XZZ');
INSERT 0 1
.
prima=# table customers
prima-# ;
idcust | name
--------+------
1 | Dbi
2 | XZZ
(2 rows)
prima=#

And let’s grant required privileges to mimeo on customers

prima=# grant select on customers to mimeo;
GRANT
prima=# grant trigger on customers to mimeo;
GRANT
prima=#

On the target we use the function dml_maker to set up the replication and pull the initial data. We can see that we can even change the name of the destination table.

repl=# SELECT mimeo.dml_maker('public.customers', 1, p_dest_table := 'public.customers_repl');
NOTICE: Creating objects on source database (function, trigger & queue table)...
NOTICE: Pulling data from source...
dml_maker
-----------
.
(1 row)

We can verify that the table customers_repl is created

repl=# \d
List of relations
Schema | Name | Type | Owner
--------+----------------+-------+----------
public | article | view | postgres
public | article_snap1 | table | postgres
public | article_snap2 | table | postgres
public | customers_repl | table | postgres
(4 rows)

And that the data is replicated

repl=# select * from customers_repl ;
idcust | name
--------+------
1 | Dbi
2 | XZZ
(2 rows)

Now let’s insert new data again in the source table and see how to refresh the target

prima=# insert into customers values(3,'Linux');
INSERT 0 1
prima=# insert into customers values(4,'Unix');
INSERT 0 1
.
prima=# table customers
;
idcust | name
--------+-------
1 | Dbi
2 | XZZ
3 | Linux
4 | Unix
(4 rows)

On the target database we have to run the refresh_dml function

repl=# SELECT mimeo.refresh_dml('public.customers_repl');
refresh_dml
-------------
.
(1 row)


repl=# table customers_repl;
idcust | name
--------+-------
1 | Dbi
2 | XZZ
3 | Linux
4 | Unix
(4 rows)
repl=#

Like for the snapshot method, a crontab entry can be scheduled, as shown below.
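For example, with the same 2-minute interval as before (the schedule is just an example, adapt it to your needs):

*/2 * * * * psql -p 5433 -d repl -c "SELECT mimeo.refresh_dml('public.customers_repl')";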
Conclusion
In a previous blog, we saw that logical replication is now supported on PostgreSQL 10. But the extension mimeo can still be used.

 

Cet article Replicating specific tables in PostgreSQL 10 Beta with mimeo est apparu en premier sur Blog dbi services.

ODA X6 Small Medium Large and High Availability

Mon, 2017-08-14 10:46

There are 4 models of Oracle Database Appliance with the new ODA X6, which is for the moment the latest ODA hardware version. One is similar to the previous X5-2, and the 3 smaller ones are known as ODA Lite. They are already 1 year old; here is a small recap of the differences with links to more detail.

System

The ODA X6 are composed with Oracle Server X6-2:

  • ODA X6-2S has one server
  • ODA X6-2M has one server
  • ODA X6-2L has one server
  • ODA X6-2HA is a cluster of two servers with one shared storage shelf (DE3-24C). It can be expanded with one or two additional storage shelves.

Those servers have 4 USB 2.0 ports accessible (2 in front, 2 in back) and 2 internal USB ports.
They have one serial console port (SER MGT/RJ-45 connector) and one VGA DB-15 connector, easier to plug before you put the cable rails on. Resolution: 1,600 x 1,200 x 16 MB @ 60 Hz (note that the resolution is 1,024 x 768 when viewed remotely via Oracle ILOM).
Each server has 2 redundant power supplies (hot swappable).

Rack Unit
  • ODA X6-2S is 2U
  • ODA X6-2M is 2U
  • ODA X6-2L is 4U
  • ODA X6-2HA is 6U (up to 10U with storage expansions)
CPU

The processor of X6 servers is an Intel 2.2GHz 10-Core Xeon E5-2630 v4, 85W
Cache:
Level 1: 32 KB instruction and 32 KB data L1 cache per core
Level 2: 256 KB shared data and instruction L2 cache per core
Level 3: 25 MB shared inclusive L3 cache per processor

  • ODA X6-2S has 1 processor: 10 cores
  • ODA X6-2M has 2 processors: 20 cores
  • ODA X6-2L has 2 processors: 20 cores
  • ODA X6-2 HA has two servers with 2 processors each: 40 cores

The core factor of the processor is 0.5 and you can license the full capacity for Enterprise Edition: 5 licenses for one ODA X6-2S, 10 licenses for one ODA X6-2M or 2L, 20 licenses for ODA X6-2 HA. You can also run NUP licenses with a minimum of 125 NUP for one ODA X6-2S, 250 NUP for one ODA X6-2M or 2L, 500 NUP for ODA X6-2 HA.
Of course, you can do Capacity on Demand on all models, so the minimum is 1 processor license for ODA Lite (when only 2 cores are enabled) or 2 licenses for ODA HA because the nodes must be symmetric (except if you run only one node). In NUP, it means 25 NUP minimum for ODA Lite or 50 NUP for ODA HA.
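As a quick worked example (a hypothetical Capacity on Demand configuration, using the figures above):

ODA X6-2M with 8 cores enabled:
  8 cores x 0.5 core factor = 4 Enterprise Edition processor licenses
  or, in NUP: 4 licenses x 25 NUP = 100 NUP minimum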

On ODA Lite (2S, 2M, 2L) you can choose to run Standard Edition. Then you count the number of processors: 1 license for ODA X6-2S, 2 licenses for one ODA X6-2M or 2L. When in Standard Edition 2 you can license in NUP with the minimum of 10 NUP per server.

RAM

The X6 servers have 24 DIMM slots (12 per processor) and are partially populated with 32GB DIMMs (DDR4-2400).

  • ODA X6-2S has 4 slots populated: 128GB RAM. It can be expanded with additional 8 DIMMs for 384GB
  • ODA X6-2M has 8 slots populated: 256GB RAM. It can be expanded with additional 16 DIMMs for 512GB or 768GB which populates all the slots.
  • ODA X6-2L has 8 slots populated: 256GB RAM. It can be expanded with additional 16 DIMMs for 512GB or 768GB which populates all the slots.
  • ODA X6-2 HA has 8 slots populated per server: 256GB RAM. It can be expanded with additional 16 DIMMs for 512GB or 768GB which populates all the slots.
Network

ODA Lite:
2x 10GbE SFP+ (fiber) and 2x 10GBase-T (copper) ports for the ODA X6-2S (4x 10GBase-T for the 2M and 2L)
ODA HA:
InfiniBand interconnect, 4x 10GBase-T (copper) ports per server (bonded to provide two public interfaces), or optionally 2x 10GbE SFP+ (fiber) per server, in which case one pair of 10GBase-T will be used for the interconnect.

Storage

The servers have two 2.5″ 480GB SAS-3 SSDs in front, mirrored for the Operating System and the Oracle Database software

ODA X6-2S and 2M have additional two 2.5″ 3.2TB NVMe PCIe 3.0 SSD disks for DATA and RECO disk groups (6.4TB raw capacity, 2.4TB double-mirrored, 1.6TB triple-mirrored)
They are expandable to 12.8 TB with two additional disks.

ODA X6-2L has six 3.5-inch disk bays populated with 3.2TB SSDs (19.2TB raw capacity, 9.6TB double-mirrored, 6.4TB triple-mirrored)
They are expandable to 28.8 TB with two additional disks.

ODA X6-HA has a storage shelf, as the previous X5 model (similar but not compatible), but with SSDs and lower capacity for DATA: 10x 1.2TB SAS flash SSDs (1.6TB formatted to 1.2TB, over-provisioned for write performance because of garbage collection), which means 12TB raw capacity (6TB double-mirrored, 4TB triple-mirrored), expandable to 24TB in the same chassis. You can add another storage shelf with an additional 24TB.

Note that the latest ODA X5 had 16x 8TB disks, i.e. 128TB raw, so the X6-HA has higher performance but lower capacity.

Virtualization, High Availability

Virtualization with OVM is only for ODA X6-HA. Only one VM per node has the resources to run Oracle Databases. You can add additional ‘Guest VMs’ for your applications (not for databases as the I/O is optimal only in the ODA Base VM). You can define VLANs.

The ODA Lite (2S/2M/2L) cannot run applications on guest VMs as they are bare metal only. If you want to run applications on ODA Lite, you can now use KVM, but remember that you still need to license all activated cores for each product because KVM is only ‘soft partitioning’ for Oracle LMS.

High Availability (with RAC) is only for ODA X6-HA.
A standby database to another ODA is possible for all models: Data Guard when in Enterprise Edition, or a manual standby (we recommend Dbvisit standby; see one of our customer case studies: http://www.dbvisit.com/customers/case-study-das-protection-juridique/) when in Standard Edition.

Price

According to the Price List (http://www.oracle.com/us/corporate/pricing/exadata-pricelist-070598.pdf)

  • ODA X6-2S costs USD 18000
  • ODA X6-2M costs USD 24000
  • ODA X6-2L costs USD 40000
  • ODA X6-2 HA costs USD 72000
Full specifications

The full specification is available in the Oracle System Handbook:

A simple presentation is in the 3D view:

 

Cet article ODA X6 Small Medium Large and High Availability est apparu en premier sur Blog dbi services.

WebLogic – Cannot register for disconnect events on local server

Sun, 2017-08-13 03:15

When working with WebLogic, there will probably be a moment when you will ask yourself: damn, what is wrong? Nowadays, software is so complex that it is quite easy to introduce bugs or unwanted behaviors… In this blog, I will present a small thing that just blew my mind when I faced it: I had a fully working WLS on which I executed a WLST script in order to configure the SSL ciphers for the Admin Server as well as the Managed Servers. After this, the WLS wasn’t able to start anymore, but this had nothing to do with the WLST script itself: I was sure it didn’t contain any error and it did what I expected… Since we manage more than one hundred WebLogic Servers at this customer, and since the script has to update the ciphers of all Managed Servers/Domains of a single server in a generic way, an automated deployment was really necessary.

Among all these WLS, we have a few that contain two domains on the same server (with different IPs/DNS aliases of course). To handle this case, we had to use a variable that points to the Admin Server in order to connect to it and execute the WLST script against it. The name of the variable we defined is, of course, “ADMIN_URL”. This is the name used by WebLogic in the configuration files to point to the Admin Server, so what better name than this? Well actually, anything would have been better, since this is what caused this issue…

 

So let’s demonstrate this. First, I’m just trying to start the Admin Server without the variable defined:

[weblogic@weblogic_server_01 ~]$ cd $DOMAIN_HOME/bin
[weblogic@weblogic_server_01 bin]$ 
[weblogic@weblogic_server_01 bin]$ ./startWebLogic.sh
.
.
JAVA Memory arguments: -Xms256m -Xmx512m -XX:MaxPermSize=256m
.
CLASSPATH=$ORACLE_HOME/wlserver/server/lib/jcmFIPS.jar:$ORACLE_HOME/wlserver/server/lib/sslj.jar:$ORACLE_HOME/wlserver/server/lib/cryptoj.jar::$JAVA_HOME/lib/tools.jar:$ORACLE_HOME/wlserver/server/lib/weblogic_sp.jar:$ORACLE_HOME/wlserver/server/lib/weblogic.jar:$ORACLE_HOME/wlserver/../oracle_common/modules/net.sf.antcontrib_1.1.0.0_1-0b3/lib/ant-contrib.jar:$ORACLE_HOME/wlserver/modules/features/oracle.wls.common.nodemanager_2.0.0.0.jar:$ORACLE_HOME/wlserver/../oracle_common/modules/com.oracle.cie.config-wls-online_8.1.0.0.jar:$ORACLE_HOME/wlserver/common/derby/lib/derbyclient.jar:$ORACLE_HOME/wlserver/common/derby/lib/derby.jar:$ORACLE_HOME/wlserver/server/lib/xqrl.jar:$DOMAIN_HOME/lib/LB.jar:$DOMAIN_HOME/lib/LBJNI.jar:
.
PATH=$ORACLE_HOME/wlserver/server/bin:$ORACLE_HOME/wlserver/../oracle_common/modules/org.apache.ant_1.9.2/bin:$JAVA_HOME/jre/bin:$JAVA_HOME/bin:$DOMAIN_HOME/D2/lockbox:$DOMAIN_HOME/D2/lockbox/lib/native/linux_gcc34_x64:$JAVA_HOME/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/weblogic/bin
.
***************************************************
* To start WebLogic Server, use a username and    *
* password assigned to an admin-level user. For   *
* server administration, use the WebLogic Server  *
* console at http://hostname:port/console         *
***************************************************
starting weblogic with Java version:
java version "1.8.0_102"
Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
Starting WLS with line:
$JAVA_HOME/bin/java -server -Xms256m -Xmx512m -XX:MaxPermSize=256m -Dweblogic.Name=AdminServer -Djava.security.policy=$ORACLE_HOME/wlserver/server/lib/weblogic.policy -Dweblogic.ProductionModeEnabled=true -Ddomain.home=$DOMAIN_HOME -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.security.SSL.minimumProtocolVersion=TLSv1.2 -Dweblogic.security.disableNullCipher=true -Djava.security.egd=file:///dev/./urandom -Dweblogic.security.allowCryptoJDefaultJCEVerification=true -Dweblogic.nodemanager.ServiceEnabled=true -Djava.endorsed.dirs=$JAVA_HOME/jre/lib/endorsed:$ORACLE_HOME/wlserver/../oracle_common/modules/endorsed -da -Dwls.home=$ORACLE_HOME/wlserver/server -Dweblogic.home=$ORACLE_HOME/wlserver/server -Dweblogic.utils.cmm.lowertier.ServiceDisabled=true weblogic.Server
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
<Aug 5, 2017 1:37:30 PM UTC> <Info> <Security> <BEA-090906> <Changing the default Random Number Generator in RSA CryptoJ from ECDRBG128 to FIPS186PRNG. To disable this change, specify -Dweblogic.security.allowCryptoJDefaultPRNG=true.>
<Aug 5, 2017 1:37:30 PM UTC> <Notice> <WebLogicServer> <BEA-000395> <The following extensions directory contents added to the end of the classpath:
$DOMAIN_HOME/lib/LB.jar:$DOMAIN_HOME/lib/LBJNI.jar.>
<Aug 5, 2017 1:37:30 PM UTC> <Info> <WebLogicServer> <BEA-000377> <Starting WebLogic Server with Java HotSpot(TM) 64-Bit Server VM Version 25.102-b14 from Oracle Corporation.>
<Aug 5, 2017 1:37:31 PM UTC> <Info> <Management> <BEA-141107> <Version: WebLogic Server 12.1.3.0.0 Wed May 21 18:53:34 PDT 2014 1604337 >
<Aug 5, 2017 1:37:32 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to STARTING.>
<Aug 5, 2017 1:37:32 PM UTC> <Info> <WorkManager> <BEA-002900> <Initializing self-tuning thread pool.>
<Aug 5, 2017 1:37:32 PM UTC> <Info> <WorkManager> <BEA-002942> <CMM memory level becomes 0. Setting standby thread pool size to 256.>
<Aug 5, 2017 1:37:33 PM UTC> <Notice> <Log Management> <BEA-170019> <The server log file $DOMAIN_HOME/servers/AdminServer/logs/AdminServer.log is opened. All server side log events will be written to this file.>
<Aug 5, 2017 1:37:34 PM UTC> <Notice> <Security> <BEA-090082> <Security initializing using security realm myrealm.>
<Aug 5, 2017 1:37:36 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to STANDBY.>
<Aug 5, 2017 1:37:36 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to STARTING.>
Aug 5, 2017 1:37:36 PM weblogic.wsee.WseeCoreMessages logWseeServiceStarting
INFO: The Wsee Service is starting
<Aug 5, 2017 1:37:37 PM UTC> <Warning> <Munger> <BEA-2156227> <The "META-INF/application.xml" deployment descriptor for the "consoleapp" module is in DTD format. Overrides from the deployment plan will NOT be applied since only deployment descriptors in XML Schema format are supported.>
<Aug 5, 2017 1:37:37 PM UTC> <Warning> <Munger> <BEA-2156227> <The "WEB-INF/weblogic.xml" deployment descriptor for the "consolehelp" module is in DTD format. Overrides from the deployment plan will NOT be applied since only deployment descriptors in XML Schema format are supported.>
<Aug 5, 2017 1:37:38 PM UTC> <Notice> <Security> <BEA-090171> <Loading the identity certificate and private key stored under the alias mycert from the JKS keystore file $DOMAIN_HOME/certs/identity.jks.>
<Aug 5, 2017 1:37:38 PM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the JKS keystore file $DOMAIN_HOME/certs/trust.jks.>
<Aug 5, 2017 1:37:39 PM UTC> <Notice> <Log Management> <BEA-170027> <The server has successfully established a connection with the Domain level Diagnostic Service.>
<Aug 5, 2017 1:37:40 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to ADMIN.>
<Aug 5, 2017 1:37:40 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to RESUMING.>
<Aug 5, 2017 1:37:40 PM UTC> <Notice> <Server> <BEA-002613> <Channel "DefaultSecure" is now listening on weblogic_server_01:8443 for protocols iiops, t3s, ldaps, https.>
<Aug 5, 2017 1:37:40 PM UTC> <Notice> <WebLogicServer> <BEA-000329> <Started the WebLogic Server Administration Server "AdminServer" for domain "DOMAIN" running in production mode.>
<Aug 5, 2017 1:37:40 PM UTC> <Notice> <WebLogicServer> <BEA-000360> <The server started in RUNNING mode.>
<Aug 5, 2017 1:37:40 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to RUNNING.>

^C
<Aug 5, 2017 1:37:55 PM UTC> <Notice> <WebLogicServer> <BEA-000388> <JVM called the WebLogic Server shutdown hook. The server will force shutdown now.>
<Aug 5, 2017 1:37:55 PM UTC> <Notice> <WebLogicServer> <BEA-000396> <Server shutdown has been requested by <WLS Kernel>.>
<Aug 5, 2017 1:37:55 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SUSPENDING.>
<Aug 5, 2017 1:37:55 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to ADMIN.>
<Aug 5, 2017 1:37:55 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SHUTTING_DOWN.>
<Aug 5, 2017 1:37:55 PM UTC> <Notice> <Server> <BEA-002607> <Channel "DefaultSecure", listening on weblogic_server_01:8443, was shut down.>
[weblogic@weblogic_server_01 bin]$

 

As you can see above, the Admin Server is starting properly. Once in RUNNING state, I just stopped it (CTRL+C) to continue the demo.

The next step is to find the current value of “ADMIN_URL”. This variable is used in the files stopWebLogic.sh, stopManagedWebLogic.sh and startManagedWebLogic.sh: to be able to stop these components, the address of the Admin Server is needed. The funny thing is that two of these three files use the “t3s” protocol and the third one uses “https” (or t3/http if not SSL-enabled). Once you have this value, you define your environment variable using this exact same value (manual setup or more automatic setup):

[weblogic@weblogic_server_01 bin]$ echo $ADMIN_URL

[weblogic@weblogic_server_01 bin]$
[weblogic@weblogic_server_01 bin]$
[weblogic@weblogic_server_01 bin]$ grep -E 'ADMIN_URL="[th]' *.sh
startManagedWebLogic.sh:	ADMIN_URL="https://weblogic_server_01:8443"
stopManagedWebLogic.sh:		ADMIN_URL="t3s://weblogic_server_01:8443"
stopWebLogic.sh:			ADMIN_URL="t3s://weblogic_server_01:8443"
[weblogic@weblogic_server_01 bin]$
[weblogic@weblogic_server_01 bin]$ export ADMIN_URL="https://weblogic_server_01:8443"
[weblogic@weblogic_server_01 bin]$ # or
[weblogic@weblogic_server_01 bin]$ export ADMIN_URL=`grep -E 'ADMIN_URL="[th]' stopWebLogic.sh | sed 's,ADMIN_URL="\([^"]*\)",\1,'`
[weblogic@weblogic_server_01 bin]$
[weblogic@weblogic_server_01 bin]$ echo $ADMIN_URL
t3s://weblogic_server_01:8443
[weblogic@weblogic_server_01 bin]$

 

At this point, I defined the ADMIN_URL variable using the T3S protocol. One might think that this wouldn’t affect the start of our domain: we are just defining a variable that exists in the shell scripts of WebLogic, with the exact same value… But note that this variable isn’t even referenced in the file “startWebLogic.sh”… Once this is done, we can reproduce the issue. For that, simply try to start the Admin Server again:

[weblogic@weblogic_server_01 bin]$ ./startWebLogic.sh
.
.
JAVA Memory arguments: -Xms256m -Xmx512m -XX:MaxPermSize=256m
.
CLASSPATH=$ORACLE_HOME/wlserver/server/lib/jcmFIPS.jar:$ORACLE_HOME/wlserver/server/lib/sslj.jar:$ORACLE_HOME/wlserver/server/lib/cryptoj.jar::$JAVA_HOME/lib/tools.jar:$ORACLE_HOME/wlserver/server/lib/weblogic_sp.jar:$ORACLE_HOME/wlserver/server/lib/weblogic.jar:$ORACLE_HOME/wlserver/../oracle_common/modules/net.sf.antcontrib_1.1.0.0_1-0b3/lib/ant-contrib.jar:$ORACLE_HOME/wlserver/modules/features/oracle.wls.common.nodemanager_2.0.0.0.jar:$ORACLE_HOME/wlserver/../oracle_common/modules/com.oracle.cie.config-wls-online_8.1.0.0.jar:$ORACLE_HOME/wlserver/common/derby/lib/derbyclient.jar:$ORACLE_HOME/wlserver/common/derby/lib/derby.jar:$ORACLE_HOME/wlserver/server/lib/xqrl.jar:$DOMAIN_HOME/lib/LB.jar:$DOMAIN_HOME/lib/LBJNI.jar:
.
PATH=$ORACLE_HOME/wlserver/server/bin:$ORACLE_HOME/wlserver/../oracle_common/modules/org.apache.ant_1.9.2/bin:$JAVA_HOME/jre/bin:$JAVA_HOME/bin:$DOMAIN_HOME/D2/lockbox:$DOMAIN_HOME/D2/lockbox/lib/native/linux_gcc34_x64:$JAVA_HOME/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/weblogic/bin
.
***************************************************
* To start WebLogic Server, use a username and *
* password assigned to an admin-level user. For *
* server administration, use the WebLogic Server *
* console at http://hostname:port/console *
***************************************************
starting weblogic with Java version:
java version "1.8.0_102"
Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
Starting WLS with line:
$JAVA_HOME/bin/java -server -Xms256m -Xmx512m -XX:MaxPermSize=256m -Dweblogic.Name=AdminServer -Djava.security.policy=$ORACLE_HOME/wlserver/server/lib/weblogic.policy -Dweblogic.ProductionModeEnabled=true -Ddomain.home=$DOMAIN_HOME -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.security.SSL.minimumProtocolVersion=TLSv1.2 -Dweblogic.security.disableNullCipher=true -Djava.security.egd=file:///dev/./urandom -Dweblogic.security.allowCryptoJDefaultJCEVerification=true -Dweblogic.nodemanager.ServiceEnabled=true -Djava.endorsed.dirs=$JAVA_HOME/jre/lib/endorsed:$ORACLE_HOME/wlserver/../oracle_common/modules/endorsed -da -Dwls.home=$ORACLE_HOME/wlserver/server -Dweblogic.home=$ORACLE_HOME/wlserver/server -Dweblogic.management.server=t3s://weblogic_server_01:8443 -Dweblogic.utils.cmm.lowertier.ServiceDisabled=true weblogic.Server
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
<Aug 5, 2017 1:40:04 PM UTC> <Info> <Security> <BEA-090906> <Changing the default Random Number Generator in RSA CryptoJ from ECDRBG128 to FIPS186PRNG. To disable this change, specify -Dweblogic.security.allowCryptoJDefaultPRNG=true.>
<Aug 5, 2017 1:40:04 PM UTC> <Notice> <WebLogicServer> <BEA-000395> <The following extensions directory contents added to the end of the classpath:
$DOMAIN_HOME/lib/LB.jar:$DOMAIN_HOME/lib/LBJNI.jar.>
<Aug 5, 2017 1:40:04 PM UTC> <Info> <WebLogicServer> <BEA-000377> <Starting WebLogic Server with Java HotSpot(TM) 64-Bit Server VM Version 25.102-b14 from Oracle Corporation.>
<Aug 5, 2017 1:40:05 PM UTC> <Warning> <Security> <BEA-090924> <JSSE has been selected by default, since the SSLMBean is not available.>
<Aug 5, 2017 1:40:05 PM UTC> <Info> <Security> <BEA-090908> <Using the default WebLogic SSL Hostname Verifier implementation.>
<Aug 5, 2017 1:40:05 PM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the jks keystore file $ORACLE_HOME/wlserver/server/lib/DemoTrust.jks.>
<Aug 5, 2017 1:40:06 PM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the jks keystore file $JAVA_HOME/jre/lib/security/cacerts.>
<Aug 5, 2017 1:40:06 PM UTC> <Info> <Management> <BEA-141298> <Could not register with the Administration Server: java.rmi.RemoteException: [Deployer:149147]Could not reach the Administration Server through any of its URLs: "https://weblogic_server_01:8443".>
<Aug 5, 2017 1:40:06 PM UTC> <Info> <Management> <BEA-141107> <Version: WebLogic Server 12.1.3.0.0 Wed May 21 18:53:34 PDT 2014 1604337 >
<Aug 5, 2017 1:40:07 PM UTC> <Info> <Security> <BEA-090908> <Using the default WebLogic SSL Hostname Verifier implementation.>
<Aug 5, 2017 1:40:07 PM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the jks keystore file $ORACLE_HOME/wlserver/server/lib/DemoTrust.jks.>
<Aug 5, 2017 1:40:07 PM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the jks keystore file $JAVA_HOME/jre/lib/security/cacerts.>
<Aug 5, 2017 1:40:07 PM UTC> <Alert> <Management> <BEA-141151> <The Administration Server could not be reached at https://weblogic_server_01:8443.>
<Aug 5, 2017 1:40:07 PM UTC> <Info> <Configuration Management> <BEA-150018> <This server is being started in Managed Server independence mode in the absence of the Administration Server.>
<Aug 5, 2017 1:40:07 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to STARTING.>
<Aug 5, 2017 1:40:07 PM UTC> <Info> <WorkManager> <BEA-002900> <Initializing self-tuning thread pool.>
<Aug 5, 2017 1:40:07 PM UTC> <Info> <WorkManager> <BEA-002942> <CMM memory level becomes 0. Setting standby thread pool size to 256.>
<Aug 5, 2017 1:40:07 PM UTC> <Notice> <Log Management> <BEA-170019> <The server log file $DOMAIN_HOME/servers/AdminServer/logs/AdminServer.log is opened. All server side log events will be written to this file.>
<Aug 5, 2017 1:40:09 PM UTC> <Notice> <Security> <BEA-090082> <Security initializing using security realm myrealm.>
<Aug 5, 2017 1:40:10 PM UTC> <Error> <Configuration Management> <BEA-150000> <An error occurred while establishing a connection back to the Adminstration Server t3s://weblogic_server_01:8443 during startup. Since bootstrap succeeded, check that t3s://weblogic_server_01:8443 uniquely identifies the Administration Server.
javax.naming.ConfigurationException: Cannot register for disconnect events on local server
    at weblogic.server.channels.RemoteChannelServiceImpl.registerInternal(RemoteChannelServiceImpl.java:234)
    at weblogic.server.channels.RemoteChannelServiceImpl.registerForever(RemoteChannelServiceImpl.java:190)
    at weblogic.protocol.ConnectMonitorFactory.registerForever(ConnectMonitorFactory.java:54)
    at weblogic.management.provider.MSIService.registerForReconnectToAdminServer(MSIService.java:188)
    at weblogic.management.mbeanservers.compatibility.internal.CompatibilityMBeanServerService.start(CompatibilityMBeanServerService.java:183)
    Truncated. see log file for complete stacktrace
>
<Aug 5, 2017 1:40:10 PM UTC> <Notice> <Security> <BEA-090171> <Loading the identity certificate and private key stored under the alias myacert from the JKS keystore file $DOMAIN_HOME/certs/identity.jks.>
<Aug 5, 2017 1:40:10 PM UTC> <Notice> <Security> <BEA-090169> <Loading trusted certificates from the JKS keystore file $DOMAIN_HOME/certs/trust.jks.>
<Aug 5, 2017 1:40:12 PM UTC> <Critical> <WebLogicServer> <BEA-000362> <Server failed. Reason:

    There are 1 nested errors:

javax.naming.ConfigurationException: Cannot register for disconnect events on local server
    at weblogic.server.channels.RemoteChannelServiceImpl.registerInternal(RemoteChannelServiceImpl.java:234)
    at weblogic.server.channels.RemoteChannelServiceImpl.registerForever(RemoteChannelServiceImpl.java:190)
    at weblogic.protocol.ConnectMonitorFactory.registerForever(ConnectMonitorFactory.java:54)
    at weblogic.management.provider.MSIService.registerForReconnectToAdminServer(MSIService.java:188)
    at weblogic.management.mbeanservers.compatibility.internal.CompatibilityMBeanServerService.start(CompatibilityMBeanServerService.java:183)
    at weblogic.server.AbstractServerService.postConstruct(AbstractServerService.java:78)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.glassfish.hk2.utilities.reflection.ReflectionHelper.invoke(ReflectionHelper.java:1017)
    at org.jvnet.hk2.internal.ClazzCreator.postConstructMe(ClazzCreator.java:388)
    at org.jvnet.hk2.internal.ClazzCreator.create(ClazzCreator.java:430)
    at org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:456)
    at org.glassfish.hk2.runlevel.internal.AsyncRunLevelContext.findOrCreate(AsyncRunLevelContext.java:225)
    at org.glassfish.hk2.runlevel.RunLevelContext.findOrCreate(RunLevelContext.java:82)
    at org.jvnet.hk2.internal.Utilities.createService(Utilities.java:2488)
    at org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:98)
    at org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:87)
    at org.glassfish.hk2.runlevel.internal.CurrentTaskFuture$QueueRunner.oneJob(CurrentTaskFuture.java:1162)
    at org.glassfish.hk2.runlevel.internal.CurrentTaskFuture$QueueRunner.run(CurrentTaskFuture.java:1147)
    at org.glassfish.hk2.runlevel.internal.CurrentTaskFuture$UpOneLevel.run(CurrentTaskFuture.java:753)
    at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:553)
    at weblogic.work.ExecuteThread.execute(ExecuteThread.java:311)
    at weblogic.work.ExecuteThread.run(ExecuteThread.java:263)

>
<Aug 5, 2017 1:40:12 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FAILED.>
<Aug 5, 2017 1:40:12 PM UTC> <Error> <WebLogicServer> <BEA-000383> <A critical service failed. The server will shut itself down.>
<Aug 5, 2017 1:40:12 PM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SHUTTING_DOWN.>
[weblogic@weblogic_server_01 bin]$

 

So what is happening exactly? How can this simple variable definition prevent you from starting your Admin Server? If you take a look at the script “startWebLogic.sh”, you will not see any use of this variable, so you should be good, right? Well not really, because this script actually loads its environment using the well-known “setDomainEnv.sh” (in the same folder). The interesting part is in this second file… If you check the usage of “ADMIN_URL” in setDomainEnv.sh, you will see an “if-then-else” section:

[weblogic@weblogic_server_01 bin]$ grep -C2 "ADMIN_URL" setDomainEnv.sh
# Clustering support (edit for your cluster!)

if [ "${ADMIN_URL}" = "" ] ; then
        # The then part of this block is telling us we are either starting an admin server OR we are non-clustered
        CLUSTER_PROPERTIES=""
        export CLUSTER_PROPERTIES
else
        CLUSTER_PROPERTIES="-Dweblogic.management.server=${ADMIN_URL}"
        export CLUSTER_PROPERTIES
fi
[weblogic@weblogic_server_01 bin]$

 

With this small portion of code, you can actually understand the issue:

  • if the variable “ADMIN_URL” isn’t defined when loading the file setDomainEnv.sh, then WebLogic will suppose that you are starting an Admin Server or that you are in a non-clustered environment.
  • if the variable “ADMIN_URL” is defined when loading the file setDomainEnv.sh, then WebLogic will suppose that you are starting a server that is part of a clustered environment and will point it to the Admin Server at this URL

 

Therefore, when defining ADMIN_URL in the environment, we are actually (unintentionally) telling WebLogic that this is a clustered setup and that the Administration Server can be found at $ADMIN_URL… But this URL points to the local Admin Server that we are trying to start, and it is of course not running yet. Hence the failure to start…

If you carefully read the logs above, you can actually spot this small difference… When reproducing the error, if you take a look at the end of line 21 (the java command line), you will see that the JVM parameter “-Dweblogic.management.server=t3s://weblogic_server_01:8443″ is present, and that’s the reason for this issue.
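To avoid the problem in our own deployment scripts, the simplest options are to use a dedicated variable name for the WLST connection, or to make sure ADMIN_URL is not set in the shell that starts the Admin Server. A minimal sketch, where the variable name WLST_ADMIN_URL is just an example chosen for illustration:

# use a dedicated variable name in the deployment script instead of ADMIN_URL
WLST_ADMIN_URL=`grep -E 'ADMIN_URL="[th]' $DOMAIN_HOME/bin/stopWebLogic.sh | sed 's,ADMIN_URL="\([^"]*\)",\1,'`
# or, if ADMIN_URL was already exported, remove it from the environment before starting the Admin Server
unset ADMIN_URL
$DOMAIN_HOME/bin/startWebLogic.sh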

 

 

Cet article WebLogic – Cannot register for disconnect events on local server est apparu en premier sur Blog dbi services.

OUD 11.1.2.3 – How to recover a lost entry in a replicated OUD environment

Thu, 2017-08-10 04:57

By accident, it can happen that someone drops an OUD entry in your replicated environment and, of course, the entry is then deleted on all replicas as well. And besides that, you have no logical ldif export of your OUD. Not a good situation, but if you have done regular backups, there is a way out without resetting the whole OUD environment to an old timestamp.

The idea is to create a second, empty OUD instance, restore only the context which is needed, export the entry as ldif and import it again into your current replicated OUD.

Let’s suppose we have lost the following OUD entry:

Distinguished Name: cn=dbit122_ldap,cn=OracleContext,dc=william,dc=dbi,dc=com

To recover it, follow these steps:

  1. Create new OUD instance
  2. Create suffix
  3. Stop OUD instance asinst_2
  4. Restore the correct dc
  5. Start OUD instance asinst_2
  6. Export entry from asinst_2
  7. Import entry into asinst_1
  8. Cleanup asinst_2
1. Create new OUD instance
[dbafmw@dbidg01 ~]$ cd /u01/app/oracle/product/Middleware/11.1.2.3/Oracle_OUD1
[dbafmw@dbidg01 Oracle_OUD1]$
[dbafmw@dbidg01 Oracle_OUD1]$ ./oud-setup --cli --baseDN dc=dbi,dc=com --addBaseEntry --adminConnectorPort 5444 --ldapPort 2389 \
> --rootUserDN cn=Directory\ Manager --rootUserPasswordFile ~/.oudpwd \
> --ldapsPort 2636 --generateSelfSignedCertificate \
> --hostname dbidg01 --integration generic \
> --serverTuning -Xms2048m\ -Xmx2048m\ -d64\ -XX:+UseCompressedOops\ -server\ -XX:MaxTenuringThreshold=1\ -XX:+UseConcMarkSweepGC\ -XX:CMSInitiatingOccupancyFraction=55 \
> --offlineToolsTuning -Xms2048m\ -Xmx2048m\ -d64\ -XX:+UseCompressedOops\ -server\ -XX:+UseParallelGC\ -XX:+UseNUMA \
> --no-prompt --noPropertiesFile
OUD Instance location successfully created - /u01/app/oracle/product/Middleware/11.1.2.3/Oracle_OUD1/../asinst_2"

Oracle Unified Directory 11.1.2.3.170718
Please wait while the setup program initializes...

See /u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD/logs/oud-setup
for a detailed log of this operation.

Configuring Directory Server ..... Done.
Configuring Certificates ..... Done.
Creating Base Entry dc=dbi,dc=com ..... Done.
Preparing the server for Oracle integration ....... Done.
Starting Directory Server ....... Done.
Creating Net Services suffixes ..... Done.

To see basic server configuration status and configuration you can launch
/u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD/bin/status


The new OUD instance was started automatically.


[dbafmw@dbidg01 bin]$ /u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD/bin/status


>>>> Specify Oracle Unified Directory LDAP connection parameters

Administrator user bind DN [cn=Directory Manager]:

Password for user 'cn=Directory Manager':

          --- Server Status ---
Server Run Status:        Started
Open Connections:         1

          --- Server Details ---
Host Name:                dbidg01
Administrative Users:     cn=Directory Manager
Installation Path:
/u01/app/oracle/product/Middleware/11.1.2.3/Oracle_OUD1
Instance Path:
/u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD
Version:                  Oracle Unified Directory 11.1.2.3.170718
Java Version:             1.7.0_151
Administration Connector: Port 5444 (LDAPS)

          --- Connection Handlers ---
Address:Port : Protocol : State
-------------:----------:---------
--           : LDIF     : Disabled
0.0.0.0:161  : SNMP     : Disabled
0.0.0.0:1689 : JMX      : Disabled
0.0.0.0:2389 : LDAP     : Enabled
0.0.0.0:2636 : LDAPS    : Enabled

          --- Data Sources ---
Base DN:     cn=OracleContext
Backend ID:  OIDCompatibility
Entries:     26
Replication: Disabled

Base DN:     cn=OracleContext,dc=dbi,dc=com
Backend ID:  OracleContext0
Entries:     17
Replication: Disabled

Base DN:     cn=OracleSchemaVersion
Backend ID:  OIDCompatibility
Entries:     3
Replication: Disabled

Base DN:     cn=virtual acis
Backend ID:  virtualAcis
Entries:     0
Replication: Disabled

Base DN:     dc=dbi,dc=com
Backend ID:  userRoot
Entries:     1
Replication: Disabled

[dbafmw@dbidg01 bin]$
2. Create suffix
[dbafmw@dbidg01 bin]$ pwd
/u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD/bin

[dbafmw@dbidg01 bin]$ ./manage-suffix create --baseDN "dc=william,dc=dbi,dc=com" \
> --entries base-entry --integration Generic \
> --hostname localhost --port 5444 \
> --bindDN cn="Directory Manager" --bindPasswordFile ~/.oudpwd \
> --trustAll --no-prompt
Reading Configuration ..... Done.

Creating suffixes ..... Done.

Adding Data ..... Done.

Updating Oracle Integration ..... Done.
[dbafmw@dbidg01 bin]$
3. Stop OUD instance asinst_2
[dbafmw@dbidg01 bin]$ pwd
/u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD/bin
[dbafmw@dbidg01 bin]$ ./stop-ds
Stopping Server...

[10/Aug/2017:11:10:34 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=OIDCompatibility,cn=Workflow Elements,cn=config is now taken offline
[10/Aug/2017:11:10:34 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=OracleContext for dc=william\,dc=dbi\,dc=com,cn=Workflow elements,cn=config is now taken offline
[10/Aug/2017:11:10:35 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=OracleContext0,cn=Workflow elements,cn=config is now taken offline
[10/Aug/2017:11:10:35 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=userRoot,cn=Workflow Elements,cn=config is now taken offline
[10/Aug/2017:11:10:35 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=userRoot-0,cn=Workflow elements,cn=config is now taken offline
[10/Aug/2017:11:10:35 +0200] category=BACKEND severity=NOTICE msgID=9896306 msg=The backend cn=virtualAcis,cn=Workflow Elements,cn=config is now taken offline
[10/Aug/2017:11:10:35 +0200] category=CORE severity=NOTICE msgID=458955 msg=The Directory Server is now stopped
4. Restore the correct dc
[dbafmw@dbidg01 bin]$ pwd
/u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD/bin
[dbafmw@dbidg01 bin]$ ./restore --listBackups --backupDirectory="/u99/backup/OUD/20170810/OracleContext for dc=william,dc=dbi,dc=com"
Backup ID:          20170810085118Z
Backup Date:        10/Aug/2017:10:51:18 +0200
Is Incremental:     false
Is Compressed:      true
Is Encrypted:       false
Has Unsigned Hash:  false
Has Signed Hash:    false
Dependent Upon:     none

[dbafmw@dbidg01 bin]$ ./restore --dry-run --backupDirectory="/u99/backup/OUD/20170810/OracleContext for dc=william,dc=dbi,dc=com"
[10/Aug/2017:11:19:47 +0200] category=JEB severity=NOTICE msgID=8847444 msg=Verifying: 00000000.jdb

[dbafmw@dbidg01 bin]$ ./restore  --backupDirectory="/u99/backup/OUD/20170810/OracleContext for dc=william,dc=dbi,dc=com"
[10/Aug/2017:11:20:11 +0200] category=JEB severity=NOTICE msgID=8847445 msg=Restored: 00000000.jdb (size 114414)
5. Start OUD instance asinst_2
[dbafmw@dbidg01 bin]$ pwd
/u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD/bin
[dbafmw@dbidg01 bin]$ ./start-ds
[10/Aug/2017:11:20:30 +0200] category=CORE severity=INFORMATION msgID=132 msg=The Directory Server is beginning the configuration bootstrapping process
[10/Aug/2017:11:20:31 +0200] category=CORE severity=NOTICE msgID=458886 msg=Oracle Unified Directory 11.1.2.3.170718 (build 20170621135318Z, R1706210545) starting up
[10/Aug/2017:11:20:34 +0200] category=RUNTIME_INFORMATION severity=NOTICE msgID=20381717 msg=Installation Directory:  /u01/app/oracle/product/Middleware/11.1.2.3/Oracle_OUD1
[10/Aug/2017:11:20:34 +0200] category=RUNTIME_INFORMATION severity=NOTICE msgID=20381719 msg=Instance Directory:      /u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD
[10/Aug/2017:11:20:34 +0200] category=RUNTIME_INFORMATION severity=NOTICE msgID=20381713 msg=JVM Information: 1.7.0_151-b15 by Oracle Corporation, 64-bit architecture, 2130051072 bytes heap size
[10/Aug/2017:11:20:34 +0200] category=RUNTIME_INFORMATION severity=NOTICE msgID=20381714 msg=JVM Host: dbidg01, running Linux 4.1.12-94.5.7.el7uek.x86_64 amd64, 5986422784 bytes physical memory size, number of processors available 2
[10/Aug/2017:11:20:34 +0200] category=RUNTIME_INFORMATION severity=NOTICE msgID=20381715 msg=JVM Arguments: "-Xms2048m", "-Xmx2048m", "-XX:+UseCompressedOops", "-XX:MaxTenuringThreshold=1", "-XX:+UseConcMarkSweepGC", "-XX:CMSInitiatingOccupancyFraction=55", "-Dorg.opends.server.scriptName=start-ds"
[10/Aug/2017:11:20:34 +0200] category=ACCESS_CONTROL severity=INFORMATION msgID=12582978 msg=Added 11 Global Access Control Instruction (ACI) attribute types to the access control evaluation engine
[10/Aug/2017:11:20:34 +0200] category=BACKEND severity=INFORMATION msgID=9437595 msg=Local DB backend OracleContext0 does not specify the number of lock tables: defaulting to 97
[10/Aug/2017:11:20:34 +0200] category=BACKEND severity=INFORMATION msgID=9437594 msg=Local DB backend OracleContext0 does not specify the number of cleaner threads: defaulting to 24 threads
[10/Aug/2017:11:20:34 +0200] category=BACKEND severity=INFORMATION msgID=9437615 msg=Local DB backend OracleContext0 does not specify the percentage of the heap space to allocate to the database cache: defaulting to 35 percent
[10/Aug/2017:11:20:34 +0200] category=BACKEND severity=INFORMATION msgID=9437613 msg=Local DB backend OracleContext0 does not specify the size of the file handle cache: sizing automatically to use 100 file descriptors
[10/Aug/2017:11:20:35 +0200] category=JEB severity=NOTICE msgID=8847402 msg=The database backend cn=OracleContext0,cn=Workflow elements,cn=config containing 17 entries has started
[10/Aug/2017:11:20:35 +0200] category=ACCESS_CONTROL severity=INFORMATION msgID=12582962 msg=Added 5 Access Control Instruction (ACI) attribute types found in context "cn=OracleContext,dc=dbi,dc=com" to the access control evaluation engine
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437595 msg=Local DB backend virtualAcis does not specify the number of lock tables: defaulting to 97
[10/Aug/2017:11:20:35 +0200] category=JEB severity=NOTICE msgID=8847402 msg=The database backend cn=virtualAcis,cn=Workflow Elements,cn=config containing 0 entries has started
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437595 msg=Local DB backend userRoot-0 does not specify the number of lock tables: defaulting to 97
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437594 msg=Local DB backend userRoot-0 does not specify the number of cleaner threads: defaulting to 24 threads
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437615 msg=Local DB backend userRoot-0 does not specify the percentage of the heap space to allocate to the database cache: defaulting to 35 percent
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437613 msg=Local DB backend userRoot-0 does not specify the size of the file handle cache: sizing automatically to use 100 file descriptors
[10/Aug/2017:11:20:35 +0200] category=JEB severity=NOTICE msgID=8847402 msg=The database backend cn=userRoot-0,cn=Workflow elements,cn=config containing 1 entries has started
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437595 msg=Local DB backend OracleContext for dc=william,dc=dbi,dc=com does not specify the number of lock tables: defaulting to 97
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437594 msg=Local DB backend OracleContext for dc=william,dc=dbi,dc=com does not specify the number of cleaner threads: defaulting to 24 threads
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437615 msg=Local DB backend OracleContext for dc=william,dc=dbi,dc=com does not specify the percentage of the heap space to allocate to the database cache: defaulting to 35 percent
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437613 msg=Local DB backend OracleContext for dc=william,dc=dbi,dc=com does not specify the size of the file handle cache: sizing automatically to use 100 file descriptors
[10/Aug/2017:11:20:35 +0200] category=JEB severity=NOTICE msgID=8847402 msg=The database backend cn=OracleContext for dc=william\,dc=dbi\,dc=com,cn=Workflow elements,cn=config containing 18 entries has started
[10/Aug/2017:11:20:35 +0200] category=ACCESS_CONTROL severity=INFORMATION msgID=12582962 msg=Added 5 Access Control Instruction (ACI) attribute types found in context "cn=OracleContext,dc=william,dc=dbi,dc=com" to the access control evaluation engine
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437595 msg=Local DB backend userRoot does not specify the number of lock tables: defaulting to 97
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437594 msg=Local DB backend userRoot does not specify the number of cleaner threads: defaulting to 24 threads
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437615 msg=Local DB backend userRoot does not specify the percentage of the heap space to allocate to the database cache: defaulting to 35 percent
[10/Aug/2017:11:20:35 +0200] category=BACKEND severity=INFORMATION msgID=9437613 msg=Local DB backend userRoot does not specify the size of the file handle cache: sizing automatically to use 100 file descriptors
[10/Aug/2017:11:20:35 +0200] category=JEB severity=NOTICE msgID=8847402 msg=The database backend cn=userRoot,cn=Workflow Elements,cn=config containing 1 entries has started
[10/Aug/2017:11:20:36 +0200] category=BACKEND severity=INFORMATION msgID=9437595 msg=Local DB backend OIDCompatibility does not specify the number of lock tables: defaulting to 97
[10/Aug/2017:11:20:36 +0200] category=BACKEND severity=INFORMATION msgID=9437594 msg=Local DB backend OIDCompatibility does not specify the number of cleaner threads: defaulting to 24 threads
[10/Aug/2017:11:20:36 +0200] category=BACKEND severity=INFORMATION msgID=9437615 msg=Local DB backend OIDCompatibility does not specify the percentage of the heap space to allocate to the database cache: defaulting to 35 percent
[10/Aug/2017:11:20:36 +0200] category=BACKEND severity=INFORMATION msgID=9437613 msg=Local DB backend OIDCompatibility does not specify the size of the file handle cache: sizing automatically to use 100 file descriptors
[10/Aug/2017:11:20:36 +0200] category=JEB severity=NOTICE msgID=8847402 msg=The database backend cn=OIDCompatibility,cn=Workflow Elements,cn=config containing 29 entries has started
[10/Aug/2017:11:20:36 +0200] category=ACCESS_CONTROL severity=INFORMATION msgID=12582962 msg=Added 10 Access Control Instruction (ACI) attribute types found in context "cn=OracleContext" to the access control evaluation engine
[10/Aug/2017:11:20:36 +0200] category=ACCESS_CONTROL severity=INFORMATION msgID=12582962 msg=Added 1 Access Control Instruction (ACI) attribute types found in context "cn=OracleSchemaVersion" to the access control evaluation engine
[10/Aug/2017:11:20:36 +0200] category=EXTENSIONS severity=INFORMATION msgID=1048797 msg=DIGEST-MD5 SASL mechanism using a server fully qualified domain name of: dbidg01
[10/Aug/2017:11:20:36 +0200] category=CORE severity=INFORMATION msgID=731 msg=LDAP Connection Handler 0.0.0.0 port 2389 does not specify the number of request handler threads: sizing automatically to use 8 threads
[10/Aug/2017:11:20:36 +0200] category=CORE severity=INFORMATION msgID=731 msg=LDAP Connection Handler 0.0.0.0 port 2636 does not specify the number of request handler threads: sizing automatically to use 8 threads
[10/Aug/2017:11:20:36 +0200] category=CORE severity=INFORMATION msgID=720 msg=No worker queue thread pool size specified: sizing automatically to use 24 threads
[10/Aug/2017:11:20:36 +0200] category=PROTOCOL severity=NOTICE msgID=2556180 msg=Started listening for new connections on Administration Connector 0.0.0.0 port 5444
[10/Aug/2017:11:20:36 +0200] category=PROTOCOL severity=NOTICE msgID=2556180 msg=Started listening for new connections on LDAP Connection Handler 0.0.0.0 port 2389
[10/Aug/2017:11:20:36 +0200] category=PROTOCOL severity=NOTICE msgID=2556180 msg=Started listening for new connections on LDAP Connection Handler 0.0.0.0 port 2636
[10/Aug/2017:11:20:36 +0200] category=CORE severity=NOTICE msgID=458887 msg=The Directory Server has started successfully
[10/Aug/2017:11:20:36 +0200] category=CORE severity=NOTICE msgID=458891 msg=The Directory Server has sent an alert notification generated by class org.opends.server.core.DirectoryServer (alert type org.opends.server.DirectoryServerStarted, alert ID 458887):  The Directory Server has started successfully
6. Export entry from asinst_2

Now let’s check if we got our entry “cn=dbit122_ldap,cn=OracleContext,dc=william,dc=dbi,dc=com” back.

[dbafmw@dbidg01 bin]$ ./ldapsearch --hostname localhost --port 2389 --bindDN "cn=Directory Manager" \
> --bindPasswordFile ~/.oudpwd --baseDN "cn=OracleContext,dc=william,dc=dbi,dc=com" "(cn=dbit122_ldap)" objectclass orclNetDescString orclNetDescName orclVersion
dn: cn=dbit122_ldap,cn=OracleContext,dc=william,dc=dbi,dc=com
orclNetDescName: 000:cn=DESCRIPTION_0
orclNetDescString: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg
 01)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg02)(PORT=1521)))(CONNECT_DATA=
 (SERVICE_NAME=DBIT122_PRI)))
objectclass: orclNetService
objectclass: top

Looks good, so we can create the ldif now.

[dbafmw@dbidg01 bin]$ ./ldapsearch --hostname localhost --port 2389 --bindDN "cn=Directory Manager" --bindPasswordFile 
> ~/.oudpwd --baseDN "cn=OracleContext,dc=william,dc=dbi,dc=com" "(cn=dbit122_ldap)" objectclass 
> orclNetDescString orclNetDescName orclVersion > /tmp/dbit122.ldif

[dbafmw@dbidg01 bin]$ cat /tmp/dbit122.ldif
dn: cn=dbit122_ldap,cn=OracleContext,dc=william,dc=dbi,dc=com
orclNetDescName: 000:cn=DESCRIPTION_0
orclNetDescString: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg
 01)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg02)(PORT=1521)))(CONNECT_DATA=
 (SERVICE_NAME=DBIT122_PRI)))
objectclass: orclNetService
objectclass: top
7. Import the entry into asinst_1 (this is the replicated OUD environment)
[dbafmw@dbidg01 bin]$ pwd
/u01/app/oracle/product/Middleware/11.1.2.3/asinst_1/OUD/bin

[dbafmw@dbidg01 bin]$ ./ldapmodify --defaultAdd --filename /tmp/dbit122.ldif --hostname dbidg01 --port 1389 \
> --bindDN "cn=Directory Manager" --bindPasswordFile ~/.oudpwd
Processing ADD request for cn=dbit122_ldap,cn=OracleContext,dc=william,dc=dbi,dc=com
ADD operation successful for DN cn=dbit122_ldap,cn=OracleContext,dc=william,dc=dbi,dc=com
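
Before cleaning up, it may be worth re-running the same search against the replicated instance asinst_1 (port 1389, from the asinst_1/OUD/bin directory) to confirm the entry is really back. A quick sketch, reusing the options from above:

./ldapsearch --hostname dbidg01 --port 1389 --bindDN "cn=Directory Manager" \
  --bindPasswordFile ~/.oudpwd --baseDN "cn=OracleContext,dc=william,dc=dbi,dc=com" "(cn=dbit122_ldap)"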
8. Cleanup asinst_2
[dbafmw@dbidg01 OUD]$ pwd
/u01/app/oracle/product/Middleware/11.1.2.3/asinst_2/OUD
[dbafmw@dbidg01 OUD]$ ./uninstall --cli --remove-all

The server is currently running and must be stopped before uninstallation can
continue.
Stop the Server and permanently delete the files? (yes / no) [yes]: yes

Stopping Directory Server ..... Done.
Deleting Files under the Installation Path ..... Done.

Uninstall Completed Successfully.
See /tmp/oud-uninstall-4049143346007549356.log for a detailed log of this operation.
[dbafmw@dbidg01 OUD]$

That's it: we got our entry back and even cleaned up the leftovers from the temporary OUD instance asinst_2.

Conclusion

Losing entries from your replicated OUD environment is not the end of the world. If you have good backups, you can create a new OUD instance at any time and recover only the Context that you need.

 

The article OUD 11.1.2.3 – How to recover a lost entry in a replicated OUD environment appeared first on Blog dbi services.

Postgres vs. Oracle access paths VI – Index Scan

Wed, 2017-08-09 13:58

In the previous post my queries were still reading the indexed column only, from a table which had no modifications since the last vacuum, and then didn’t need to read table pages: it was Index Only Scan. However, we often need more columns than the ones that are in the index. Here is the Index Scan access path.

I’m continuing on the table that I’ve created in the first post of the series. I’ve run VACUUM (the lazy one, not the full one) and did not do any modification after that, as we have seen that Index Only Access is efficient only when there are no modifications.

create table demo1 as select generate_series n , 1 a , lpad('x',1000,'x') x from generate_series(1,10000);
SELECT 10000
create unique index demo1_n on demo1(n);
CREATE INDEX
vacuum demo1;
VACUUM

I have 10000 rows, a unique column N with decimal numbers, indexed and another column A which is not indexed.

Index Only Scan

I’ll now query one row, the one with N=1000.

explain (analyze,verbose,costs,buffers) select n from demo1 where n=1000 ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Only Scan using demo1_n on public.demo1 (cost=0.29..4.30 rows=1 width=4) (actual time=0.123..0.124 rows=1 loops=1)
Output: n
Index Cond: (demo1.n = 1000)
Heap Fetches: 0
Buffers: shared hit=3
Planning time: 0.625 ms
Execution time: 0.137 ms

It seems that the query planner estimates that it will read one block:

  • The startup cost of 0.29 as we have seen before
  • Read one index page, cost=4 (random_page_cost=4)
  • 1 result row to process, estimated at cpu_tuple_cost=0.01

As the index is a B*Tree with 30 pages, I expect to read at least one branch in addition to the leaf block. The execution has actually read 3 blocks (Buffers: shared hit=3). Here it seems that Postgres decides to ignore the branches and count only the leaf blocks.
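
As a quick cross-check, the planner constants involved in this estimate can be listed from pg_settings; a minimal sketch (the database name is an assumption on my side):

psql -d postgres -c "select name, setting from pg_settings where name in
  ('seq_page_cost','random_page_cost','cpu_tuple_cost','cpu_index_tuple_cost','cpu_operator_cost');"
# restating the estimate above: 0.29 (startup) + 1 page * 4 (random_page_cost) + 1 row * 0.01 (cpu_tuple_cost) = 4.30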

In Oracle, the estimation is cost=1 and the execution has read 2 blocks:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID gusay436hpzck, child number 0
-------------------------------------
select /*+ */ n from demo1 where n=1000
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 (100)| 1 |00:00:00.01 | 2 |
|* 1 | INDEX UNIQUE SCAN| DEMO1_N | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 2 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("N"=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

Both Oracle and Postgres read only the index here. This is the fastest access to one indexed column: no need to read the table because the column is in the index. The use case is quite limited here: just testing the existence of a value. I will now select another column than the one used in the where clause.

Select another column

I filter on N but now query the column A which is not in the index. The Index Only Scan changes to an Index Scan:

explain (analyze,verbose,costs,buffers) select a from demo1 where n=1000 ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using demo1_n on public.demo1 (cost=0.29..8.30 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=1)
Output: a
Index Cond: (demo1.n = 1000)
Buffers: shared hit=3
Planning time: 0.639 ms
Execution time: 0.030 ms

The cost is the same except that there is one additional page to read, which pushes it to cost=8.30:

  • The startup cost of 0.29 as we have seen before
  • Read one index page, and one table page: cost=8 (random_page_cost=4)
  • 1 result row to process, estimated at cpu_tuple_cost=0.01

In Oracle it is not a different operation. We still have the INDEX UNIQUE SCAN, but with an additional operation to read the table: TABLE ACCESS BY INDEX ROWID. The index entry returns the ROWID (the physical address of the table block, equivalent to the Postgres TID). And then we have the detail of the cost, and the execution buffer reads: one more block.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8q4tcxgk1n1vn, child number 0
-------------------------------------
select /*+ */ a from demo1 where n=1000
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 1 |00:00:00.01 | 3 |
| 1 | TABLE ACCESS BY INDEX ROWID| DEMO1 | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 3 |
|* 2 | INDEX UNIQUE SCAN | DEMO1_N | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 2 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N"=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "A"[NUMBER,22] 2 - "DEMO1".ROWID[ROWID,10]

The important thing here is in the predicate information, where we see the part of the where clause which is not a filter applied after the scan, but is used for optimal access by the index. It is displayed as access() in the Oracle execution plan:

access("N"=1000)

In the PostgreSQL execution plan, the same information is displayed as 'Index Cond':

Index Cond: (demo1.n = 1000)

Postgres Range Scan

That was retrieving only one row with an equality predicate on a unique index column. The index scan helps to get directly to the value because of the B*Tree structure. As the index is sorted, an inequality predicate can also use the index to find the rows in a range of values.

The Postgres plan looks the same, with Index Scan:

explain (analyze,verbose,costs,buffers) select a from demo1 where n<=1000 ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Index Scan using demo1_n on public.demo1 (cost=0.29..175.78 rows=1000 width=4) (actual time=0.029..0.780 rows=1000 loops=1)
Output: a
Index Cond: (demo1.n <= 1000)
Buffers: shared hit=147
Planning time: 1.019 ms
Execution time: 0.884 ms

Same plan but of course we have more index blocks to scan, and more rows to fetch from the table, which is why the cost is higher.

In order to understand the cost, I’ve changed the query planner constants one by one. Here is what I got:

  • (cost=0.29..33.78 rows=1000 width=4) when seq_page_cost=0 instead of 1, which means that it estimates (175.78-33.78)/1=142 sequential reads
  • (cost=0.29..159.78 rows=1000 width=4) when random_page_cost=0 instead of 4, which means that it estimates (175.78-159.78)/4=4 random reads
  • (cost=0.29..165.78 rows=1000 width=4) when cpu_tuple_cost=0 instead of 0.01, which means that it estimates (175.78-165.78)/0.01=1000 rows
  • (cost=0.29..170.78 rows=1000 width=4) when cpu_index_tuple_cost=0 instead of 0.005, which means that it estimates (175.78-170.78)/0.005=1000 index entries
  • (cost=0.00..173.00 rows=1000 width=4) when cpu_operator_cost=0 instead of 0.0025, which means that it estimates (175.78-173.00)/0.0025=1112 cpu operations (116 for initial cost + 996 to get all rows)
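
Each of these experiments can be reproduced by changing the constant at session level and re-running the explain; a minimal sketch for the first one (the database name is an assumption on my side):

psql -d postgres <<'SQL'
set seq_page_cost = 0;   -- session-level change, reverted when the session ends
explain select a from demo1 where n <= 1000 ;
SQL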

I understand the 4 random reads from the index pages. However, I expected random reads, and not sequential reads, to fetch the rows from the table. But this is a case where the clustering factor is very good: the rows have been inserted in the same order as the indexed column, which means that those reads from the table probably read consecutive pages.

In order to validate this guess, I’ve traced the system calls on Linux.
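
(A hedged sketch of how such a capture can be done, not necessarily the exact commands used here: get the PID of the backend serving the psql session with pg_backend_pid(), then attach strace to it from another shell. Replace <backend_pid> with the returned value; the trace file path is arbitrary.)

psql -d postgres -c "select pg_backend_pid();"
strace -e trace=open,lseek,read -o /tmp/pg_backend.trc -p <backend_pid>

Here is an excerpt of the trace: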

25734 open("base/12924/42427", O_RDWR) = 42
25734 lseek(42, 0, SEEK_END) = 11706368
25734 open("base/12924/42433", O_RDWR) = 43
25734 lseek(43, 0, SEEK_END) = 245760

The file descriptor 42 is my table (demo1) and the descriptor 43 is the index (demo1_n). The file name is in the open() call and it includes the file id:

select relname,relfilenode from pg_class where relname='demo1';
-[ RECORD 1 ]--+------
relname | demo1
relfilenode | 42427
 
select relname,relfilenode from pg_class where relname='demo1_n';
-[ RECORD 1 ]--+--------
relname | demo1_n
relfilenode | 42433

Then we see some random reads from the index (branches and first leaf):

25734 lseek(43, 0, SEEK_SET) = 0
25734 read(43, "100036037360374 b152"..., 8192) = 8192
25734 lseek(43, 24576, SEEK_SET) = 24576
25734 read(43, "121000836360374 35023720330237 "..., 8192) = 8192
25734 lseek(43, 8192, SEEK_SET) = 8192
25734 read(43, "13245t360374 211 340237 "..., 8192) = 8192

Then we see 53 reads from the table:

25734 lseek(42, 0, SEEK_SET) = 0
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
...

Only one lseek. The other reads are all single block (8k) I/O calls but without seek, which means that they are sequential. When relying on filesystem prefetching, this may avoid the latency for each I/O call.

Then the next leaf block from the index is read, and then 52 reads from the table (no lseek):

25734 read(43, "13245t360374 211 340237 "..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
...

And again, one index block and 38 contiguous table blocks:

25734 lseek(43, 32768, SEEK_SET) = 32768
25734 read(43, "13245t360374 211 340237 "..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
...

Here is the summary of the cost of 175.78:

  • The startup cost of 0.29 as we have seen before
  • Estimates 4 random reads (reading 1000 rows from a 30-page index which contains 10000 rows): cost=16 (random_page_cost=4)
  • Estimates 142 sequential reads: cost=142 (seq_page_cost=1)
  • 1000 index entries to process, estimated at cost=5 (cpu_index_tuple_cost=0.005)
  • 1000 result rows to process, estimated at cost=10 (cpu_tuple_cost=0.01)
  • about 1000 operators or functions estimated at cpu_operator_cost=0.0025

The very interesting thing here is that the query planner is totally aware of the clustering factor and uses sequential read estimation.
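
This correlation is visible in the optimizer statistics; a minimal sketch to check it (the database name is an assumption on my side):

psql -d postgres -c "select tablename, attname, correlation from pg_stats where tablename='demo1' and attname='n';"
# a correlation close to 1 (or -1) means that the heap order follows the index order, which is what allows this sequential estimation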

Oracle Range Scan

Here is the same query on the similar table on Oracle:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID a3gqx19xs9wxq, child number 0
-------------------------------------
select /*+ */ a from demo1 where n<=1000
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 147 (100)| 1000 |00:00:00.01 | 148 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO1 | 1 | 1000 | 147 (0)| 1000 |00:00:00.01 | 148 |
|* 2 | INDEX RANGE SCAN | DEMO1_N | 1 | 1000 | 4 (0)| 1000 |00:00:00.01 | 4 |
----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N"<=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "A"[NUMBER,22] 2 - "DEMO1".ROWID[ROWID,10]

The strace output shows calls to pread:

open("/u01/oradata/CDB1A/PDB/users01.dbf", O_RDWR|O_DSYNC) = 7
fcntl(7, F_SETFD, FD_CLOEXEC) = 0
fcntl(7, F_DUPFD, 256) = 258
fcntl(258, F_SETFD, FD_CLOEXEC) = 0
close(7) = 0
pread(258, "62422313G275"142532'!1?275""..., 8192, 2252800 ) = 8192
pread(258, "62422413C275"14x2432'!1?275""..., 8192, 2260992 ) = 8192
pread(258, "6242313v3362274"24b+1&!1354274""..., 8192, 24731648 ) = 8192
pread(258, "6242314v3362274"24e*1&!1354274""..., 8192, 24739840 ) = 8192
pread(258, "6242315v3362274"24d51&!1354274""..., 8192, 24748032 ) = 8192
pread(258, "6242316v3362274"24g41&!1354274""..., 8192, 24756224 ) = 8192
pread(258, "6242317v3362274"24f71&!1354274""..., 8192, 24764416 ) = 8192
pread(258, "6242320v3362274"24y71&!1354274""..., 8192, 24772608 ) = 8192

pread is similar to lseek()+read() here and, as far as I know, Linux detects when there is no need to seek, which allows prefetching as well. Oracle also has its own prefetching, but I’ll not go into the details here (read Timur Akhmadeev on the Pythian blog about this).

With Oracle, there is no need to run strace because all system calls are instrumented as ‘wait events’ and here is a trace:

PARSE #140375247563104:c=2000,e=1872,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=187737470,tim=53267437268
EXEC #140375247563104:c=0,e=147,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=187737470,tim=53267437481
WAIT #140375247563104: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=74022 tim=53267437532
WAIT #140375247563104: nam='db file sequential read' ela= 8 file#=12 block#=275 blocks=1 obj#=74023 tim=53267437679
WAIT #140375247563104: nam='db file sequential read' ela= 5 file#=12 block#=276 blocks=1 obj#=74023 tim=53267437785
WAIT #140375247563104: nam='db file sequential read' ela= 5 file#=12 block#=3019 blocks=1 obj#=74022 tim=53267437902
FETCH #140375247563104:c=0,e=368,p=3,cr=3,cu=0,mis=0,r=1,dep=0,og=1,plh=187737470,tim=53267437977
WAIT #140375247563104: nam='PGA memory operation' ela= 14 p1=0 p2=0 p3=0 obj#=74022 tim=53267438017
WAIT #140375247563104: nam='SQL*Net message from client' ela= 280 driver id=1413697536 #bytes=1 p3=0 obj#=74022 tim=53267438385
WAIT #140375247563104: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=74022 tim=53267438419
WAIT #140375247563104: nam='db file sequential read' ela= 3 file#=12 block#=3020 blocks=1 obj#=74022 tim=53267438443
WAIT #140375247563104: nam='PGA memory operation' ela= 7 p1=1114112 p2=2 p3=0 obj#=74022 tim=53267438475
WAIT #140375247563104: nam='db file sequential read' ela= 5 file#=12 block#=3021 blocks=1 obj#=74022 tim=53267438504
WAIT #140375247563104: nam='db file sequential read' ela= 3 file#=12 block#=3022 blocks=1 obj#=74022 tim=53267438532
WAIT #140375247563104: nam='db file sequential read' ela= 2 file#=12 block#=3023 blocks=1 obj#=74022 tim=53267438552
WAIT #140375247563104: nam='db file sequential read' ela= 3 file#=12 block#=3024 blocks=1 obj#=74022 tim=53267438576
WAIT #140375247563104: nam='db file sequential read' ela= 4 file#=12 block#=3025 blocks=1 obj#=74022 tim=53267438603
WAIT #140375247563104: nam='db file sequential read' ela= 26 file#=12 block#=3026 blocks=1 obj#=74022 tim=53267438647
WAIT #140375247563104: nam='db file sequential read' ela= 4 file#=12 block#=3027 blocks=1 obj#=74022 tim=53267438680
WAIT #140375247563104: nam='db file sequential read' ela= 2 file#=12 block#=3028 blocks=1 obj#=74022 tim=53267438699
WAIT #140375247563104: nam='db file sequential read' ela= 4 file#=12 block#=3029 blocks=1 obj#=74022 tim=53267438781
WAIT #140375247563104: nam='db file sequential read' ela= 3 file#=12 block#=3030 blocks=1 obj#=74022 tim=53267438807
WAIT #140375247563104: nam='db file sequential read' ela= 28 file#=12 block#=3031 blocks=1 obj#=74022 tim=53267438878
...

The name ‘sequential read’ does not mean the same as the Postgres ‘sequential read’. It only means single-block reads that are done one after the other, but they are actually random reads. However, looking at the block#, we can see that they actually read contiguous blocks.

In the end, because I have an index with a good clustering factor, and because I’m using the defaults on Linux without direct reads and asynchronous I/O, the execution is very similar to the Postgres one: read the few index blocks and follow the pointers to the 140 blocks of the table.

The cost estimation looks similar (same number) between Postgres and Oracle, but it is not the same unit. Postgres estimates the cost with sequential reads, but Oracle estimates the cost as random reads. In addition to that, Postgres, with its default planner parameters, gives more weight to CPU usage than Oracle does.

This is the good case of Index Access where we have a good clustering/correlation factor between the physical order of the table and the logical order of the index. The random reads finally behave as sequential reads because there is no seek() between them. You can imagine that in the next post I’ll try the same with a very bad clustering factor.

 

The article Postgres vs. Oracle access paths VI – Index Scan appeared first on Blog dbi services.

SQL Server on Linux: Introduction to DBFS experimental tool

Wed, 2017-08-09 08:14

A couple of months ago, Microsoft announced two additional command line tools for SQL Server: mssql-scripter and DBFS. The latter has drawn my attention because it exposes live data from SQL Server DMVs as virtual files in a virtual directory on the Linux operating system. Microsoft has probably taken another positive step in SQL Server’s adoption on Linux. Indeed, in the Linux world, we may get kernel performance and configuration data either directly from procfs or indirectly by using tools that read procfs in the background.

DBFS uses the FUSE filesystem module to expose DMVs and, according to the Microsoft blog, reading from each DMV file runs a live query, which ensures that you always get fresh data.

[mikedavem@sql server]$ rpm -qR $(rpm -qa | grep dbfs)
glibc
fuse
fuse-devel
freetds
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(PayloadIsXz) <= 5.2-1

After installing DBFS, let’s start the tool by using the following command line:

[mikedavem@sql mssql]$ dbfs -m /var/opt/mssql/dmv -c /var/opt/mssql/secrets/dmvtool.conf -v      
1: Processing entry for section server in configuration file:
SUCCESSFULLY added entry for server server.

The configuration file (-c parameter) stores sensitive information used to connect to the SQL Server instance (hostname, user and “insecure” password). You should consider protecting this file with suitable permissions.
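
A minimal hardening sketch (assuming the file belongs to the OS user that runs DBFS):

chown mikedavem:mikedavem /var/opt/mssql/secrets/dmvtool.conf
chmod 600 /var/opt/mssql/secrets/dmvtool.conf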

DBFS runs as a background process by default, but you may change this behavior by using the -f parameter at startup.

[mikedavem@sql mssql]$ ps -e | grep dbfs
  2673 ?        00:00:00 dbfs

At this stage, the negative point is that we have to start DBFS manually if the server restarts. I thought about wrapping DBFS in a cron job that runs at server startup, but my preference would be the possibility to control DBFS through systemd as a service.
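
As a sketch of what such an integration could look like in the meantime, here is a minimal systemd unit; the path to the dbfs binary and the use of -f to keep it in the foreground are assumptions on my side:

cat <<'EOF' | sudo tee /etc/systemd/system/dbfs.service
[Unit]
Description=DBFS - SQL Server DMVs exposed as virtual files
After=network.target

[Service]
# -f keeps dbfs in the foreground so that systemd can supervise it
ExecStart=/usr/bin/dbfs -f -m /var/opt/mssql/dmv -c /var/opt/mssql/secrets/dmvtool.conf
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable dbfs.service
sudo systemctl start dbfs.service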

Exploring the DMV mounted filesystem

In fact, every DMV is exposed in two formats (normal file and JSON file).

[mikedavem@sql server]$ pwd
/var/opt/mssql/dmv/server
[mikedavem@sql server]$ ls -I "*.json" | wc -l
494

DBFS exposes a lot of DMVs, as we notice above. We may also want to filter them by the categories described in the BOL by using grep.

For instance, AlwaysOn related DMVs …

[mikedavem@sql server]$ ls -I "*.json" | grep dm_hadr
dm_hadr_automatic_seeding
dm_hadr_auto_page_repair
dm_hadr_availability_group_states
dm_hadr_availability_replica_cluster_nodes
dm_hadr_availability_replica_cluster_states
dm_hadr_availability_replica_states
dm_hadr_cluster
dm_hadr_cluster_members
dm_hadr_cluster_networks
dm_hadr_database_replica_cluster_states
dm_hadr_database_replica_states
dm_hadr_instance_node_map
dm_hadr_name_id_map
dm_hadr_physical_seeding_stats
…

… or database related DMVs

[mikedavem@sql server]$ ls -I "*.json" | grep dm_db
dm_db_column_store_row_group_operational_stats
dm_db_column_store_row_group_physical_stats
dm_db_file_space_usage
dm_db_fts_index_physical_stats
dm_db_index_usage_stats
dm_db_log_space_usage
dm_db_mirroring_auto_page_repair
dm_db_mirroring_connections
dm_db_mirroring_past_actions
dm_db_missing_index_details
dm_db_missing_index_groups
dm_db_missing_index_group_stats
dm_db_partition_stats
dm_db_persisted_sku_features
dm_db_rda_migration_status
dm_db_rda_schema_update_status
dm_db_script_level
dm_db_session_space_usage
dm_db_task_space_usage 
…

The schema name prefix (sys) is not present in the DMV related file names.

Extracting data from DMV related files

Linux provides very powerful tools to consume data from files, like tr, cut, split, sort, join, cat, grep and awk, to cite a few of them. You may find some examples on GitHub, but I decided to get my own experience by attempting to address some usual DBA queries.

  • Extracting data from sys.dm_os_sys_info to retrieve CPU and memory information on the server, as well as the memory manager consumption

We may use the cut command to extract the required columns from the dm_os_sys_info file.

[mikedavem@sql server]$ cut -d$'\t' -f3,4,5,7,8,9,15,27 dm_os_sys_info
cpu_count       hyperthread_ratio       physical_memory_kb      committed_kb    committed_target_kb      visible_target_kb       scheduler_count virtual_machine_type_desc
4       4       3918848 207160  3914240 3914240 4       HYPERVISOR

But the above output is not very readable, and we may want to display the information differently (in columns rather than in rows). Using awk may be the best solution in this case:

[mikedavem@sql server]$ for ((i=1;i<$(cut -d$'\t' -f3,4,5,7,8,9,15,27 dm_os_sys_info | head -n 1 | wc -w);i++)); do awk '{print $'$i'}' <(cut -d$'\t' -f3,4,5,7,8,9,15,27 dm_os_sys_info) | tr '\n' ' ';echo;done | column -t
cpu_count			4
hyperthread_ratio	4
physical_memory_kb	3918848
committed_kb		207296
committed_target_kb	3914240
visible_target_kb	3914240
scheduler_count		4

 

  • Extracting information from sys.databases

A typical query I may see is to retrieve database information including name, id, state, recovery model and owner. Basically, we have to join two DMVs to get all the requested information: sys.databases and sys.server_principals. In the Linux world, you may use the join command to gather information from different files, but it requires pre-sorting each data file first.

But getting the desired output may be challenging. Indeed, we will probably first choose the grep / cut tools to extract only the rows and columns we want. But using such tools alone loses the column header context, and getting only column values from the DMV related files may be meaningless in this case, as shown below:

[mikedavem@sql server]$ join -1 2 -2 3 -t$'\t' -o 2.1,2.2,2.3,2.4,2.5,1.1 \
<(grep -Ev '^[NT|##]|SERVER_ROLE' server_principals | cut -d$'\t' -f 1,3 | sort -t$'\t' -k2) \
<(cut -d$'\t' -f 1,2,4,14,22 databases | sort -t$'\t' -k3) | column -t
model				3  01  ONLINE  FULL    sa
test				6  01  ONLINE  FULL    sa
ApplixEnterprise	5  01  ONLINE  SIMPLE  sa
master				1  01  ONLINE  SIMPLE  sa
msdb				4  01  ONLINE  SIMPLE  sa
tempdb				2  01  ONLINE  SIMPLE  sa

But preserving the column header may also become a concern if you want to sort column values. Indeed, let’s say you want to sort the above output by the database_id column value. You will quickly notice that the header line gets included by the sort operator. Definitely not the result we expect in this case. So, in order to meet our requirement, we may use the very powerful awk command again, as shown below:

[mikedavem@sql server]$ join --header -1 2 -2 3 -t$'\t' -o 2.1,2.2,2.3,2.4,2.5,1.1 \
<(head -n1 server_principals | cut -d$'\t' -f 1,3;grep -Ev '^[NT|##]|SERVER_ROLE' server_principals | cut -d$'\t' -f 1,3 | sort -t$'\t' -k2) \
<(head -n 1 databases | cut -d$'\t' -f 1,2,4,14,22;cut -d$'\t' -f 1,2,4,14,22 databases | sort -t$'\t' -k3) \
| awk 'NR<2{print $0;next}{print $0 | "sort -k2"}' | column -t
name				database_id  	owner_sid	state_desc	recovery_model_desc	name
master				1				01		ONLINE      	SIMPLE				sa
tempdb				2            	01		ONLINE      	SIMPLE				sa
model				3            	01		ONLINE      	FULL				sa
msdb				4            	01		ONLINE      	SIMPLE				sa
ApplixEnterprise	5            	01		ONLINE      	SIMPLE				sa
test				6            	01		ONLINE      	FULL				sa

We finally managed to display the desired output, but my feeling is that we worked hard for few results, and usually we have to deal with more DMVs and more complex joins than in the previous example, within the same query. Furthermore, we often have to aggregate data from DMVs to get relevant results, and I can easily imagine the additional effort needed to produce the corresponding scripts if we extend the scenarios we have to deal with.

There are probably other tools on Linux to make the task easier, but my opinion is that DBFS should include the ability to execute the custom queries that DBAs already use day-to-day, to go beyond the current capabilities. We didn’t cover the JSON format in this blog. It is another way to consume data from the DMV related files but, in my opinion, it targets developers more than DBAs.
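
For completeness, a heavily hedged sketch of what consuming the JSON variant could look like, assuming jq is installed and assuming the .json files expose the rows as objects keyed by column name (I have not verified the exact structure):

jq '.' /var/opt/mssql/dmv/server/databases.json | head -40      # inspect the structure first
jq '.. | objects | select(has("database_id")) | {name, database_id, state_desc}' /var/opt/mssql/dmv/server/databases.json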

Looking at the DBFS GitHub issues section, I noticed that I was not the first and only one to think about those enhancements. Anyway, I like Microsoft’s initiative to give us the ability to consume DMV related data from the command line in a Linux world, and I’m looking forward to the next features of this tool!

The article SQL Server on Linux: Introduction to DBFS experimental tool appeared first on Blog dbi services.

MariaDB – How to migrate quickly MySQL to MariaDB over different hosts with mydumper/myloader and ncat

Wed, 2017-08-09 07:27

A lot of possibilities exist to migrate MySQL to MariaDB. In this blog I would like to show a scenario where I migrate MySQL 5.7.19 with a lot of databases to MariaDB 10.2.7, which is on a different host. The tools I am using are mydumper/myloader and ncat.

Please be aware that mydumper does not come out of the box. It has to be installed beforehand, as explained in my previous blog: https://blog.dbi-services.com/mariadb-speed-up-your-logical-mariadb-backups-with-mydumper/

The same applies to ncat. It might not be on your system. However, the installation of ncat is quite simple. Just run

# yum install nmap-ncat

Ncat is just a networking utility which reads and writes data across networks from the command line. A quite underestimated tool from my point of view. And please take care: ncat does not encrypt your network traffic by default. That’s why it might be faster than copying via scp/sftp. Ncat operates in one of two primary modes: connect mode and listen mode. So, we need to install it on both hosts; in my case, on node mysql01 (connect mode) and on mysql02 (listen mode). If you want to learn more about ncat, take a look at the following web site: http://nmap.org/ncat
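
The two modes can be illustrated with a minimal file transfer (just a sketch; port 33333 is an arbitrary choice that I reuse in the migration below):

# on the receiving host (listen mode)
nc -4 -l 33333 > /tmp/received.tar.gz
# on the sending host (connect mode)
nc --send-only <receiving_host> 33333 < /tmp/tosend.tar.gz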

Ok. Let’s get started. First of all, we take a look at the current MySQL instance. As you can see in the output below, I do have a lot of databases that I want to migrate, e.g. air, dbi and a couple of sakila databases.

mysql@mysql01:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6] u | grep -A 1 mysqld6
mysqld6 is running        : 5.7.19 (mysql-5.7.19)
Port                      : 33006

mysqld6-(root@localhost) [(none)]> select @@version;
+------------+
| @@version  |
+------------+
| 5.7.19-log |
+------------+
1 row in set (0.00 sec)

mysqld6-(root@localhost) [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| air                |
| dbi                |
| mysql              |
| performance_schema |
| sakila             |
| sakila01           |
| sakila02           |
| sakila03           |
| sakila04           |
| sakila05           |
| sakila06           |
| sys                |
| wrs                |
+--------------------+
14 rows in set (0.00 sec)

MyDumper comes with a quite cool parameter called --regex. By using the --regex parameter, we can dump out all databases except the ones we don’t want, in an elegant way. E.g. we might not want to dump (mysql|test|information_schema|sys|performance_schema).

mysql@mysql01:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6] /u00/app/mysql/product/tools/mydumper/bin/mydumper --defaults-file=/u00/app/mysql/admin/mysqld6/.my.cnf \
--threads=6 --regex '^(?!(mysql|test|information_schema|sys|performance_schema))' \
--outputdir=/u99/mysqlbackup/mysqld6/mydumper_mysqld6
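
The same parameter can also be used the other way around, to dump only a whitelist of schemas; a hedged sketch, assuming (as documented for mydumper) that the regex is matched against the database.table string:

/u00/app/mysql/product/tools/mydumper/bin/mydumper --defaults-file=/u00/app/mysql/admin/mysqld6/.my.cnf \
--threads=6 --regex '^(sakila|dbi)\.' \
--outputdir=/u99/mysqlbackup/mysqld6/mydumper_mysqld6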

OK, back to our migration: we have now dumped out all databases and we can copy them over to the MariaDB host with ssh, ftp, rsync or other tools. Or we do it via ncat directly. To do so, we need to start ncat in listen mode (-l) on the destination host mysql02.

-- On host mysql02:

mysql@mysql02:/u00/app/mysql/ [mysqld6] cd /u99/mysqlbackup/mysqld6/mydumper_mysqld6
mysql@mysql02:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6]
mysql@mysql02:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6] nc -4 -l 33333 | tar -xzvp
...
this host is waiting now for network packets

On our source node, we start mydumper, pipe it to tar and send it to ncat.

On host mysql01:

mysql@mysql01:/u00/app/mysql/ [mysqld6] cd /u99/mysqlbackup/mysqld6/mydumper_mysqld6
mysql@mysql01:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6]
mysql@mysql01:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6] /u00/app/mysql/product/tools/mydumper/bin/mydumper --defaults-file=/u00/app/mysql/admin/mysqld6/.my.cnf \
--threads=6 --regex '^(?!(mysql|test|information_schema|sys|performance_schema))' | tar -czv -f - .  | nc --send-only mysql02 33333
./
./export-20170809-062635/
./export-20170809-062635/air-schema-create.sql
./export-20170809-062635/dbi-schema-create.sql
./export-20170809-062635/sakila-schema-create.sql
./export-20170809-062635/sakila01-schema-create.sql
./export-20170809-062635/sakila02-schema-create.sql
./export-20170809-062635/sakila03-schema-create.sql
./export-20170809-062635/sakila04-schema-create.sql
./export-20170809-062635/sakila05-schema-create.sql
./export-20170809-062635/sakila06-schema-create.sql
./export-20170809-062635/wrs-schema-create.sql
./export-20170809-062635/metadata
./export-20170809-062635/dbi.dbi_t.sql
./export-20170809-062635/sakila05.category.sql
./export-20170809-062635/sakila.actor.sql
./export-20170809-062635/sakila.address.sql
./export-20170809-062635/sakila.category.sql
./export-20170809-062635/sakila.city.sql
./export-20170809-062635/sakila.country.sql
./export-20170809-062635/sakila.customer.sql
./export-20170809-062635/sakila.film.sql
./export-20170809-062635/sakila.film_actor.sql
./export-20170809-062635/sakila.film_category.sql
./export-20170809-062635/sakila.film_text.sql
./export-20170809-062635/sakila.inventory.sql
./export-20170809-062635/sakila.language.sql
./export-20170809-062635/sakila.payment.sql
./export-20170809-062635/sakila.rental.sql
./export-20170809-062635/sakila.staff.sql
./export-20170809-062635/sakila.store.sql
./export-20170809-062635/sakila01.actor.sql
./export-20170809-062635/sakila01.address.sql
...
...

In the other terminal, you can see that the files are being received.

On host mysql02:

mysql@mysql02:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6] nc -4 -l 33333 | tar -xzvp
./
./export-20170809-062635/
./export-20170809-062635/air-schema-create.sql
./export-20170809-062635/dbi-schema-create.sql
./export-20170809-062635/sakila-schema-create.sql
./export-20170809-062635/sakila01-schema-create.sql
./export-20170809-062635/sakila02-schema-create.sql
./export-20170809-062635/sakila03-schema-create.sql
./export-20170809-062635/sakila04-schema-create.sql
./export-20170809-062635/sakila05-schema-create.sql
./export-20170809-062635/sakila06-schema-create.sql
./export-20170809-062635/wrs-schema-create.sql
./export-20170809-062635/metadata
./export-20170809-062635/dbi.dbi_t.sql
./export-20170809-062635/sakila05.category.sql
./export-20170809-062635/sakila.actor.sql
./export-20170809-062635/sakila.address.sql
./export-20170809-062635/sakila.category.sql
./export-20170809-062635/sakila.city.sql
./export-20170809-062635/sakila.country.sql
./export-20170809-062635/sakila.customer.sql
./export-20170809-062635/sakila.film.sql
./export-20170809-062635/sakila.film_actor.sql
./export-20170809-062635/sakila.film_category.sql
./export-20170809-062635/sakila.film_text.sql
./export-20170809-062635/sakila.inventory.sql
./export-20170809-062635/sakila.language.sql
./export-20170809-062635/sakila.payment.sql
./export-20170809-062635/sakila.rental.sql
./export-20170809-062635/sakila.staff.sql
./export-20170809-062635/sakila.store.sql
./export-20170809-062635/sakila01.actor.sql
./export-20170809-062635/sakila01.address.sql
...
...
...

mysql@mysql02:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6] ls -l
total 20
drwx------ 2 mysql mysql 16384 Aug  9 06:26 export-20170809-062635

That’s it. We now have all the files on our destination host mysql02. Now it’s time to import the data via myloader into the new MariaDB 10.2.7, which is empty at the moment.

mysql@mysql02:/u00/app/mysql/ [mysqld6] u | grep -A 1 mysqld6
mysqld6 is running        : 10.2.7-MariaDB (mariadb-10.2.7)
Port                      : 33006

mysql@mysql02:/u00/app/mysql/product/tools/mydumper-0.9.2/bin/ [mysqld6] mq
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 10.2.7-MariaDB-log MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysqld6-(root@localhost) [(none)]> select @@version;
+--------------------+
| @@version          |
+--------------------+
| 10.2.7-MariaDB-log |
+--------------------+
1 row in set (0.00 sec)

By default, the myloader tool starts 4 parallel threads and runs in verbose mode 2, which means that only warning messages are shown. In case you want it to be more verbose, you can specify --verbose=3.

mysql@mysql02:/u99/mysqlbackup/mysqld6/mydumper_mysqld6/ [mysqld6] /u00/app/mysql/product/tools/mydumper/bin/myloader \
--defaults-file=/u00/app/mysql/admin/mysqld6/.my.cnf --threads=6 --verbose=3 \
--directory /u99/mysqlbackup/mysqld6/mydumper_mysqld6/export-20170809-062635
** Message: 6 threads created
** Message: Creating database `air`
** Message: Creating table `air`.`muc`
** Message: Creating database `dbi`
** Message: Creating table `dbi`.`dbi_t`
** Message: Creating table `dbi`.`dbi_t2`
** Message: Creating database `sakila`
** Message: Creating table `sakila`.`actor`
** Message: Creating table `sakila`.`address`
** Message: Creating table `sakila`.`category`
** Message: Creating table `sakila`.`city`
** Message: Creating table `sakila`.`country`
** Message: Creating table `sakila`.`customer`
** Message: Creating table `sakila`.`film`
...
...

If you take a look at the process list, you should see 6 threads doing the work.

mysqld6-(root@localhost) [(none)]> show processlist;
...
...
| 30 | root        | localhost | sakila01 | Query   |    0 | Opening tables          | INSERT INTO `city` VALUES
(1,"A Corua (La Corua)",87,"2006-02-15 03:45:25"),
(2,"Abha",82,"2006-02-1 |    0.000 |
| 31 | root        | localhost | sakila   | Query   |    6 | update                  | INSERT INTO `film_actor` VALUES
(1,1,"2006-02-15 04:05:03"),
(1,23,"2006-02-15 04:05:03"),
(1,25,"20 |    0.000 |
| 32 | root        | localhost | sakila   | Query   |    4 | update                  | INSERT INTO `payment` VALUES
(1,1,1,76,2.99,"2005-05-25 11:30:37","2006-02-15 21:12:30"),
(2,1,1,573 |    0.000 |
| 33 | root        | localhost | sakila   | Query   |    3 | update                  | INSERT INTO `rental` VALUES
(1,"2005-05-24 22:53:30",367,130,"2005-05-26 22:04:30",1,"2006-02-15 20: |    0.000 |
| 34 | root        | localhost | sakila01 | Query   |    2 | update                  | INSERT INTO `address` VALUES
(1,"47 MySakila Drive",NULL,"Alberta",300,"","",">\n2]c |    0.000 |
| 35 | root        | localhost | sakila   | Query   |    4 | update                  | INSERT INTO `inventory` VALUES

That’s it. We have now migrated the data to MariaDB 10.2.7.

Conclusion

Many ways exist to migrate MySQL to MariaDB. Using mydumper/myloader in combination with ncat is just one of them. However, from my point of view, it is a quite cool one.

 

The article MariaDB – How to migrate quickly MySQL to MariaDB over different hosts with mydumper/myloader and ncat appeared first on Blog dbi services.

Postgres vs. Oracle access paths V – FIRST ROWS and MIN/MAX

Tue, 2017-08-08 08:58

We have seen how an index can help to avoid a sorting operation in the previous post. This avoids a blocking operation: the startup cost is minimal and the first rows can be immediately returned. This is often desired when displaying rows to the user screen. Here is more about Postgres startup cost, Oracle first_rows costing, and fetching first rows only.

Here is the execution plan we had in Oracle to get the values of N sorted. The cost for Oracle is the cost to read the index leaves: estimated at 46 random reads:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID dbck3rgnqbakg, child number 0
-------------------------------------
select /*+ */ n from demo1 where n is not null order by n
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 46 (100)| 10000 |00:00:00.01 | 48 |
| 1 | INDEX FULL SCAN | DEMO1_N | 1 | 10000 | 46 (0)| 10000 |00:00:00.01 | 48 |
---------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

In PostgreSQL, we have two costs (cost=0.29..295.29):

explain (analyze,verbose,costs,buffers) select n from demo1 where n is not null order by n ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using demo1_n on public.demo1 (cost=0.29..295.29 rows=10000 width=4) (actual time=0.194..2.026 rows=10000 loops=1)
Output: n
Index Cond: (demo1.n IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=30
Planning time: 1.190 ms
Execution time: 2.966 ms

I explained where the total cost (295.29) comes from:

  • The index on the column N has 30 blocks, which is estimated at cost=120 (random_page_cost=4)
  • We have 10000 index entries to process, estimated at cost=50 (cpu_index_tuple_cost=0.005)
  • We have 10000 result rows to process, estimated at cost=100 (cpu_tuple_cost=0.01)
  • We have evaluated 10000 ‘is not null’ conditions, estimated at cost=25 (cpu_operator_cost=0.0025)

But the Postgres EXPLAIN also shows the startup cost (0.29), which is the cost before returning the first rows (only a few cpu_operator_cost here).

From that, I can guess that fetching 1 row will have the following cost:

  • The startup cost of 0.29
  • Read the first index page, cost=4 (random_page_cost=4)
  • 1 index entry to process at cpu_index_tuple_cost=0.005
  • 1 result row to process, estimated at cpu_tuple_cost=0.01
  • 1 ‘is not null’ conditions, estimated at cpu_operator_cost=0.0025

This should be approximately cost=4.3075 for one row, roughly the cost to read one index page. We will see later that the query planner does not count this first index page.

Oracle First Rows

In Oracle, we have only the total cost in the execution plan, but we can estimate the cost to retrieve 1 row with the FIRST_ROWS(1) hint:


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0fjk9vv4g1q1w, child number 0
-------------------------------------
select /*+ first_rows(1) */ n from demo1 where n is not null order by
n
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 10000 |00:00:00.01 | 48 |
| 1 | INDEX FULL SCAN | DEMO1_N | 1 | 10000 | 2 (0)| 10000 |00:00:00.01 | 48 |
---------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

The cost here is small, estimated at 2 random reads (1 B*Tree branch and 1 leaf), which is sufficient to get the first row. Of course, I’ve estimated it for 1 row but I finally retrieved all rows (A-Rows=10000), reading all blocks (Buffers=48). However, my execution plan is optimized for fetching one row.

Fetch first rows

I can run the previous query and finally fetch only one row, but I can also explicitly filter the result to get one row only. If you use older versions of Oracle, you may have used the ‘rownum’ way of limiting rows, which implicitly adds the first_rows hint. Here I’m using the FETCH FIRST syntax and I need to explicitly add the FIRST_ROWS() hint to get the plan optimized for that.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 9bcm542sk64az, child number 0
-------------------------------------
select /*+ first_rows(1) */ n from demo1 where n is not null order by n fetch first 1 row only
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 1 |00:00:00.01 | 3 |
|* 1 | VIEW | | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 3 |
|* 2 | WINDOW NOSORT STOPKEY| | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 3 |
| 3 | INDEX FULL SCAN | DEMO1_N | 1 | 10000 | 2 (0)| 2 |00:00:00.01 | 3 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=1)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "N")<=1)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "from$_subquery$_002"."N"[NUMBER,22], "from$_subquery$_002"."rowlimit_$$_rownumber"[NUMBER,22] 2 - (#keys=1) "N"[NUMBER,22], "DEMO1".ROWID[ROWID,10], ROW_NUMBER() OVER ( ORDER BY "N")[22] 3 - "DEMO1".ROWID[ROWID,10], "N"[NUMBER,22]

The cost is the same, estimated at 2 random reads, but we see how Oracle implements FETCH FIRST: with window functions. And only one row has been fetched (A-Rows=1), reading 3 blocks (Buffers=3). Note that because the index is sorted, the window function is a NOSORT operation.
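For comparison, the pre-12c ROWNUM formulation would look roughly like the following sketch (not run here); the inline view is required so that the ROWNUM filter is applied after the ORDER BY:

select n from (
  select /*+ first_rows(1) */ n from demo1 where n is not null order by n
) where rownum <= 1;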

Postgres

I can run the same query on PostgreSQL and get the execution plan:

explain (analyze,verbose,costs,buffers) select n from demo1 where n is not null order by n fetch first 1 row only;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..0.31 rows=1 width=4) (actual time=0.124..0.124 rows=1 loops=1)
Output: n
Buffers: shared hit=3
-> Index Only Scan using demo1_n on public.demo1 (cost=0.29..295.29 rows=10000 width=4) (actual time=0.124..0.124 rows=1 loops=1)
Output: n
Index Cond: (demo1.n IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=3
Planning time: 0.576 ms
Execution time: 0.143 ms

Here, the total cost of the query is lower than the total cost of the Index Only Scan, because we know we will not read all index entries. The total cost of the query (0.31) is based on the startup cost (0.29) of the index access. I suppose there is 0.01 for the cpu_tuple_cost, but I expected to see the cost to get the first page, because we cannot get a row without reading the whole page. My guess is that Postgres divides the total cost (295) by the number of rows (10000) and uses that as a per-row estimation. This makes sense for a lot of rows but underestimates the cost to get the first page.

In order to validate my guess, I force a Seq Scan to have a higher cost and fetch first 5 rows:

explain (analyze,verbose,costs,buffers) select n from demo1 where n is not null fetch first 5 row only ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.76 rows=5 width=4) (actual time=0.026..0.029 rows=5 loops=1)
Output: n
Buffers: shared hit=1
-> Seq Scan on public.demo1 (cost=0.00..1529.00 rows=10000 width=4) (actual time=0.022..0.024 rows=5 loops=1)
Output: n
Filter: (demo1.n IS NOT NULL)
Buffers: shared hit=1
Planning time: 1.958 ms
Execution time: 0.057 ms

My guess gives ( 1529.00 / 10000 ) * 5 = 0.7645, which matches the cost estimated for the Limit operation (0.76). This approximation does not take the page granularity into account.
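If this interpolation guess is right, the Limit cost is a simple linear scaling of the child node cost. Here is a quick check of that arithmetic against the Seq Scan plan above (the formula is my assumption, not taken from the planner source):

-- assumed: limit_cost ~ child_startup + (child_total - child_startup) * fetched_rows / child_rows
select 0.00 + (1529.00 - 0.00) * 5 / 10000 as limit_cost_estimate;   -- 0.7645, shown as 0.76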

MIN/MAX

The “order by n fetch first 1 row only” finally reads only one index entry, the first one, and returns the indexed value. We can get the same value with a “select min(N)” and Oracle has a special operation for that:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 29bsqfg69nudp, child number 0
-------------------------------------
select /*+ */ min(n) from demo1
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 1 |00:00:00.01 | 2 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 2 |
| 2 | INDEX FULL SCAN (MIN/MAX)| DEMO1_N | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) MIN("N")[22]
2 - "N"[NUMBER,22]

This goes through the index branches (blevel=1 here: in this small index, the root is the only branch block) to the first leaf, in order to get the value from the first entry. This has read 2 blocks here. The same can be done to get the last index entry in case we “select max(N)”.

Postgres does not show a special operation for it, but a plan which is very similar to the one we have seen above when fetching the first row: an Index Only Scan with a Limit:


explain (analyze,verbose,costs,buffers) select min(n) from demo1 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.31..0.32 rows=1 width=4) (actual time=0.123..0.124 rows=1 loops=1)
Output: $0
Buffers: shared hit=3
InitPlan 1 (returns $0)
-> Limit (cost=0.29..0.31 rows=1 width=4) (actual time=0.121..0.121 rows=1 loops=1)
Output: demo1.n
Buffers: shared hit=3
-> Index Only Scan using demo1_n on public.demo1 (cost=0.29..295.29 rows=10000 width=4) (actual time=0.119..0.119 rows=1 loops=1)
Output: demo1.n
Index Cond: (demo1.n IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=3
Planning time: 0.415 ms
Execution time: 0.140 ms

If we look at the ‘Index Only Scan’ we see exactly what I had at the top of this post with “select n from demo1 where n is not null order by n”.

Above it, there’s the Limit clause which is exactly the same as the one with the “fetch 1 row only” because the query planner understands that getting the MIN(N) is the same as getting the first value from the ordered index on N.

This is processed as a non-correlated subquery (query block), also called an InitPlan. Its result ($0) is used by the Result node, with an additional cost of 0.01 for the cpu_tuple_cost of this additional step. I don’t really know the reason for this additional step here, but anyway, the cost is minimal. Basically, both Oracle and Postgres take advantage of the index structure to get the minimum – or first value – from the sorted index entries.
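In other words, the planner treats the aggregate roughly as if the query had been written with an ordered LIMIT subquery; a sketch of that equivalent formulation (the rewrite is done internally, you do not have to write it this way):

select (select n from demo1 where n is not null order by n limit 1) as min_n;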

In this series, I’m running very simple queries in order to show how it works. In this post, we reached the minimum: one column and one row. The next post will finally select one additional column, which is not in the index.

 

Cet article Postgres vs. Oracle access paths V – FIRST ROWS and MIN/MAX est apparu en premier sur Blog dbi services.

24 HOP French edition 2017 – Session videos are available

Mon, 2017-08-07 10:09

The 2nd edition of 24HOP French Edition 2017 is over and we had great sessions about SQL Server and various topics (SQL Server 2017 new features, Azure, PowerBI, High Availability, Linux, Hyper-convergence, Modeling…).


If you did not attend this event, you now have the opportunity to watch the videos of the different sessions. From my side, I had the chance to present SQL Server and High Availability on Linux.

Hope to see you next time!

 

Cet article 24 HOP French edition 2017 – Session videos are available est apparu en premier sur Blog dbi services.

MariaDB – Speed up your logical MariaDB backups with mydumper

Mon, 2017-08-07 07:08

Per default, MariaDB is shipped with a utility called mysqldump for logical backups. For more information, please take a look at the following link.

https://mariadb.com/kb/en/mariadb/mysqldump/

mysqldump has advantages: it is easy to use and it is shipped with the standard MariaDB installation, so no additional installation is needed. However, it also has some disadvantages: it is single-threaded and it writes to one big file, even with the latest version, which is MariaDB 10.2.7 at the moment.

In case you want to dump out your data very quickly, this can be your bottleneck. This is where mydumper comes into play. The main feature of mydumper is that you can parallelize it. The mydumper utility uses 4 parallel threads by default if not otherwise specified.

./mydumper --help | grep threads
  -t, --threads               Number of threads to use, default 4

Another cool feature is compression.

./mydumper --help | grep compress
  -c, --compress              Compress output files

The biggest disadvantage is that mydumper is not delivered out of the box. You have to compile it yourself. To do so, simply follow these steps:

Install the packages that are needed for the mydumper compilation

# yum install gcc gcc-c++ glib2-devel mysql-devel zlib-devel \
  pcre-devel openssl-devel cmake

Unzip and compile mydumper

mysql@mysql01:/u00/app/mysql/product/tools/ [mysqld1] unzip mydumper-master.zip
Archive:  mydumper-master.zip
e643528321f51e21a463156fbf232448054b955d
   creating: mydumper-master/
  inflating: mydumper-master/.bzrignore
  inflating: mydumper-master/CMakeLists.txt
  inflating: mydumper-master/README
  inflating: mydumper-master/binlog.c
  inflating: mydumper-master/binlog.h
   creating: mydumper-master/cmake/
   creating: mydumper-master/cmake/modules/
  inflating: mydumper-master/cmake/modules/CppcheckTargets.cmake
  inflating: mydumper-master/cmake/modules/FindGLIB2.cmake
  inflating: mydumper-master/cmake/modules/FindMySQL.cmake
  inflating: mydumper-master/cmake/modules/FindPCRE.cmake
  inflating: mydumper-master/cmake/modules/FindSphinx.cmake
  inflating: mydumper-master/cmake/modules/Findcppcheck.cmake
  inflating: mydumper-master/cmake/modules/Findcppcheck.cpp
  inflating: mydumper-master/common.h
  inflating: mydumper-master/config.h.in
   creating: mydumper-master/docs/
  inflating: mydumper-master/docs/CMakeLists.txt
   creating: mydumper-master/docs/_build/
  inflating: mydumper-master/docs/_build/conf.py.in
  inflating: mydumper-master/docs/_build/sources.cmake.in
  inflating: mydumper-master/docs/authors.rst
  inflating: mydumper-master/docs/compiling.rst
  inflating: mydumper-master/docs/examples.rst
  inflating: mydumper-master/docs/files.rst
  inflating: mydumper-master/docs/index.rst
  inflating: mydumper-master/docs/mydumper_usage.rst
  inflating: mydumper-master/docs/myloader_usage.rst
  inflating: mydumper-master/g_unix_signal.c
  inflating: mydumper-master/g_unix_signal.h
  inflating: mydumper-master/mydumper.c
  inflating: mydumper-master/mydumper.h
  inflating: mydumper-master/myloader.c
  inflating: mydumper-master/myloader.h
  inflating: mydumper-master/server_detect.c
  inflating: mydumper-master/server_detect.h

mysql@mysql01:/u00/app/mysql/product/tools/ [mysqld1] mv mydumper-master mydumper-0.9.2
mysql@mysql01:/u00/app/mysql/product/tools/ [mysqld1] cd mydumper-0.9.2
mysql@mysql01:/u00/app/mysql/product/tools/mydumper-0.9.2/ [mysqld1] cmake . -DCMAKE_INSTALL_PREFIX=/u00/app/mysql/product/tools/mydumper-0.9.2
-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Using mysql-config: /u00/app/mysql/product/mysql-5.6.37/bin/mysql_config
-- Found MySQL: /u00/app/mysql/product/mysql-5.6.37/include, /u00/app/mysql/product/mysql-5.6.37/lib/libmysqlclient.so;/usr/lib64/libpthread.so;/usr/lib64/libm.so;/usr/lib64/librt.so;/usr/lib64/libdl.so
-- Found ZLIB: /usr/lib64/libz.so (found version "1.2.7")
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.27.1")
-- checking for one of the modules 'glib-2.0'
-- checking for one of the modules 'gthread-2.0'
-- checking for module 'libpcre'
--   found libpcre, version 8.32
-- Found PCRE: /usr/include
-- ------------------------------------------------
-- MYSQL_CONFIG = /u00/app/mysql/product/mysql-5.6.37/bin/mysql_config
-- CMAKE_INSTALL_PREFIX = /u00/app/mysql/product/tools/mydumper-0.9.2
-- BUILD_DOCS = ON
-- WITH_BINLOG = OFF
-- RUN_CPPCHECK = OFF
-- Change a values with: cmake -D<Variable>=<Value>
-- ------------------------------------------------
--
-- Configuring done
-- Generating done
-- Build files have been written to: /u00/app/mysql/product/tools/mydumper-0.9.2

HINT: In case you don’t have Sphinx installed, you can use the -DBUILD_DOCS=OFF option. Sphinx is a documentation generator. For more information see http://sphinx-doc.org/
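In that case, the configuration step would look like this sketch (same install prefix as above, only the documentation build is switched off):

cmake . -DCMAKE_INSTALL_PREFIX=/u00/app/mysql/product/tools/mydumper-0.9.2 -DBUILD_DOCS=OFF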

mysql@mysql01:/u00/app/mysql/product/tools/mydumper-0.9.2/ [mysqld1] make
Scanning dependencies of target mydumper
[ 16%] Building C object CMakeFiles/mydumper.dir/mydumper.c.o
[ 33%] Building C object CMakeFiles/mydumper.dir/server_detect.c.o
[ 50%] Building C object CMakeFiles/mydumper.dir/g_unix_signal.c.o
Linking C executable mydumper
[ 50%] Built target mydumper
Scanning dependencies of target myloader
[ 66%] Building C object CMakeFiles/myloader.dir/myloader.c.o
Linking C executable myloader
[ 66%] Built target myloader
Scanning dependencies of target doc_sources
[ 66%] Built target doc_sources
Scanning dependencies of target doc_html
[ 83%] Building HTML documentation with Sphinx
/u00/app/mysql/product/tools/mydumper-0.9.2/docs/_sources/files.rst:39: WARNING: unknown option: mydumper --schemas
WARNING: html_static_path entry '/u00/app/mysql/product/tools/mydumper-0.9.2/docs/_static' does not exist
[ 83%] Built target doc_html
Scanning dependencies of target doc_man
[100%] Building manual page with Sphinx
[100%] Built target doc_man

mysql@mysql01:/u00/app/mysql/product/tools/mydumper-0.9.2/ [mysqld1] make install
[ 50%] Built target mydumper
[ 66%] Built target myloader
[ 66%] Built target doc_sources
[ 83%] Building HTML documentation with Sphinx
[ 83%] Built target doc_html
[100%] Building manual page with Sphinx
[100%] Built target doc_man
Install the project...
-- Install configuration: ""
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/bin/mydumper
-- Removed runtime path from "/u00/app/mysql/product/tools/mydumper-0.9.2/bin/mydumper"
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/bin/myloader
-- Removed runtime path from "/u00/app/mysql/product/tools/mydumper-0.9.2/bin/myloader"
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/authors.rst
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/compiling.rst
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/examples.rst
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/files.rst
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/index.rst
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/mydumper_usage.rst
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/myloader_usage.rst
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/authors.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_sources
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_sources/authors.txt
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_sources/compiling.txt
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_sources/examples.txt
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_sources/files.txt
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_sources/index.txt
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_sources/mydumper_usage.txt
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_sources/myloader_usage.txt
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/compiling.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/examples.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/files.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/index.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/mydumper_usage.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/myloader_usage.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/genindex.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/search.html
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/pygments.css
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/ajax-loader.gif
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/basic.css
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/comment-bright.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/comment-close.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/comment.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/doctools.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/down-pressed.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/down.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/file.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/jquery-1.11.1.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/jquery.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/minus.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/plus.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/searchtools.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/underscore-1.3.1.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/underscore.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/up-pressed.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/up.png
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/websupport.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/classic.css
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/sidebar.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/_static/default.css
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/.buildinfo
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/searchindex.js
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/doc/mydumper/html/objects.inv
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/man/man1/mydumper.1
-- Installing: /u00/app/mysql/product/tools/mydumper-0.9.2/share/man/man1/myloader.1
mysql@mysql01:/u00/app/mysql/product/tools/mydumper-0.9.2/ [mysqld1]

mysql@mysql01:/u00/app/mysql/product/tools/ [mysqld1] ln -s mydumper-0.9.2 mydumper
mysql@mysql01:/u00/app/mysql/product/tools/ [mysqld1]

If compiled correctly, you will see two new binaries created: mydumper and myloader.

mysql@mysql01:/u00/app/mysql/product/tools/mydumper/bin/ [mysqld1] ls -l
total 280
-rwxr-xr-x 1 mysql mysql 218808 Aug  7 07:25 mydumper
-rwxr-xr-x 1 mysql mysql  63448 Aug  7 07:25 myloader

And besides that, you will have the documentation compiled as html in the ../mydumper-0.9.2/share/doc/mydumper/html folder.


Ok. Let’s now see mysqldump vs. mydumper in action. My sample database is about 10G in size. Of course, the bigger the database, the bigger the benefit of mydumper will be.

First, we dump out all databases with mysqldump (without and with compression) and record the time.

-- no compression 

mysql@mysql01:/u00/app/mysql/ [mysqld1] mysqldump --version 
mysqldump  Ver 10.16 Distrib 10.2.7-MariaDB, for Linux (x86_64) 

mysql@mysql01:/u00/app/mysql/ [mysqld1] time mysqldump --defaults-file=/u00/app/mysql/admin/mysqld1/.my.cnf --single-transaction --all-databases > mysqldump.sql 

real    3m38.94s 
user    1m29.11s 
sys     0m11.85s 

mysql@mysql01:/u00/app/mysql/ [mysqld1] ls -lh mysqldump.sql     
-rw-r--r-- 1 mysql mysql 10G Aug  7 11:33 mysqldump.sql 

-- compression 

mysql@mysql01:/u00/app/mysql/ [mysqld1] time mysqldump --defaults-file=/u00/app/mysql/admin/mysqld1/.my.cnf --single-transaction --all-databases | gzip > mysqldump.sql.gz 

real    4m43.75s 
user    4m55.25s 
sys     0m10.65s 

mysql@mysql01:/u00/app/mysql/ [mysqld1] ls -lh mysqldump.sql.gz 
-rw-r--r-- 1 mysql mysql 3.1G Aug  7 11:55 mysqldump.sql.gz

The uncompressed dump took about 3 minutes 39 seconds (10G) and the compressed one about 4 minutes 44 seconds (3.1G).

Now we repeat it with mydumper.

-- no compression 

mysql@mysql01:/u00/app/mysql/ [mysqld1] time /u00/app/mysql/product/tools/mydumper/bin/mydumper --defaults-file=/u00/app/mysql/admin/mysqld1/.my.cnf --trx-consistency-only --threads=6 --outputdir=/mydump/mysqld1/mydumper_mysqld1 

real    1m22.44s 
user    0m41.17s 
sys     0m7.31s 

mysql@mysql01:/u00/app/mysql/ [mysqld1] du -hs /mydump/mysqld1/mydumper_mysqld1/ 
10G     mydumper_mysqld1/ 

-- compression 

mysql@mysql01:/u00/app/mysql/ [mysqld1] time /u00/app/mysql/product/tools/mydumper/bin/mydumper --defaults-file=/u00/app/mysql/admin/mysqld1/.my.cnf --trx-consistency-only --threads=6 --compress --outputdir=/mydump/mysqld1/mydumper_mysqld1 

real    3m4.99s 
user    3m54.94s 
sys     0m5.11s 

mysql@mysql01:/u00/app/mysql/ [mysqld1] du -hs /mydump/mysqld1/mydumper_mysqld1/ 
3.1G    mydumper_mysqld1/

With mydumper, the uncompressed dump took about 1 minute 22 seconds (10G) and the compressed one about 3 minutes 5 seconds (3.1G).

As you can see in the results, the uncompressed dump was about 3 times faster with mydumper. The compressed mydumper export was only about 30% faster. The reason for the compressed export being only about 30% faster might be that I have only 2 virtual CPUs assigned to my VM.
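The restore side was not measured here, but for completeness, a minimal myloader invocation could look like the following sketch (directory and thread count are taken from the examples above; connection options are omitted, check ./myloader --help for your build):

# restore the dump taken above, with 6 parallel threads
time /u00/app/mysql/product/tools/mydumper/bin/myloader \
  --directory=/mydump/mysqld1/mydumper_mysqld1 \
  --overwrite-tables --threads=6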

Conclusion

MyDumper is a great tool that can speed up your database exports quite dramatically. Take a look at it. It might be worth it.

 

 

 

Cet article MariaDB – Speed up your logical MariaDB backups with mydumper est apparu en premier sur Blog dbi services.

Postgres vs. Oracle access paths IV – Order By and Index

Sat, 2017-08-05 15:00

I realize that I’m talking about indexes in Oracle and Postgres, and didn’t mention yet the best website you can find about indexes, with concepts and examples for all RDBMS: http://use-the-index-luke.com. You will probably learn a lot about SQL design. Now let’s continue on execution plans with indexes.

As we have seen two posts ago, an index can be used even with a 100% selectivity (all rows), when we don’t filter any rows. Oracle has INDEX FAST FULL SCAN which is the fastest, reading blocks sequentially as they come. But this doesn’t follow the B*Tree leaves chain and does not return the rows in the order of the index. However, there is also the possibility to read the leaf blocks in the index order, with INDEX FULL SCAN and random reads instead of multiblock reads.
It is similar to the Index Only Scan of Postgres except that there is no need to get to the table to filter out uncommitted changes. Oracle reads the transaction table to get the visibility information, and goes to undo records if needed.

The previous post had a query with a ‘where n is not null’ predicate, to be sure that all rows have an entry in the Oracle index (which does not store entirely null keys), and we will continue on this by adding an order by.

For this post, I’ve increased the size of the column N in the Oracle table by adding 1/3 to each number. I did this for this post only, and for the Oracle table only. The index on N is now 45 blocks instead of 20. The reason is to show what happens when the cost of the ‘order by’ is high. I didn’t change the Postgres table because there is only one way to scan the index, where the result is always sorted.

Oracle Index Fast Full Scan vs. Index Full Scan


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID dbck3rgnqbakg, child number 0
-------------------------------------
select /*+ */ n from demo1 where n is not null order by n
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 46 (100)| 10000 |00:00:00.01 | 48 |
| 1 | INDEX FULL SCAN | DEMO1_N | 1 | 10000 | 46 (0)| 10000 |00:00:00.01 | 48 |
---------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

Index Full Scan, the random-read version of the index read, is chosen here by the Oracle optimizer because we want the result ordered by column N and the index can provide this without additional sorting.

We can force the optimizer to do multiblock reads, with INDEX_FFS hint:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID anqfbf5caat2a, child number 0
-------------------------------------
select /*+ index_ffs(demo1) */ n from demo1 where n is not null order
by n
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 82 (100)| 10000 |00:00:00.01 | 51 | | | |
| 1 | SORT ORDER BY | | 1 | 10000 | 82 (2)| 10000 |00:00:00.01 | 51 | 478K| 448K| 424K (0)|
| 2 | INDEX FAST FULL SCAN| DEMO1_N | 1 | 10000 | 14 (0)| 10000 |00:00:00.01 | 51 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) "N"[NUMBER,22] 2 - "N"[NUMBER,22]

The estimated cost is higher: the index read is cheaper (cost=14 instead of 46) but then the sort operation brings this to 82. We can see additional columns in the execution plan here because the sorting operation needs a workarea in memory (estimated 478K, actually 424K used during the execution). Note that the multiblock read has a few blocks of overhead (reads 51 blocks instead of 48) because it has to read the segment header to identify the extents to scan.

Postgres Index Only Scan

In PostgreSQL there’s only one way to scan indexes: random reads by following the chain of leaf blocks. This returns the rows in the order of the index and does not require an additional sort:


explain (analyze,verbose,costs,buffers) select n from demo1 where n is not null order by n ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using demo1_n on public.demo1 (cost=0.29..295.29 rows=10000 width=4) (actual time=0.125..1.277 rows=10000 loops=1)
Output: n
Index Cond: (demo1.n IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=30
Planning time: 0.532 ms
Execution time: 1.852 ms

In the previous posts, we have seen a cost of cost=0.29..270.29 for the Index Only Scan. Here we have an additional cost of 25 for the cpu_operator_cost because I’ve added the ‘where n is not null’. As the default constant is 0.0025, this is the query planner estimating the evaluation of this predicate for 10000 rows.
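A quick check of that arithmetic (assuming the default value of the constant):

show cpu_operator_cost;                            -- 0.0025 by default
select 270.29 + 10000 * 0.0025 as new_total_cost;  -- 295.29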

First Rows

The Postgres cost always shows two values. The first one is the startup cost: the cost just before being able to return the first row. Some operations have a very small startup cost, others have some blocking operations that must finish before sending their first result rows. Here, as we have no sort operation, the first row retrieved from the index can be returned immediately and the startup cost is small: 0.29
In Oracle you can see the initial cost by optimizing the plan to retrieve the first row, with the FIRST_ROWS() hint:


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0fjk9vv4g1q1w, child number 0
-------------------------------------
select /*+ first_rows(1) */ n from demo1 where n is not null order by
n
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 10000 |00:00:00.01 | 48 |
| 1 | INDEX FULL SCAN | DEMO1_N | 1 | 10000 | 2 (0)| 10000 |00:00:00.01 | 48 |
---------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

The actual number of blocks read (48) is the same as before because I finally fetched all rows, but the cost is small because it was estimated for retrieving only the first row. Of course, we can also tell Postgres or Oracle that we want only the first rows. This is for the next post.

Character strings

The previous example is an easy one because the column N is a number, and both Oracle and Postgres store numbers in a binary format that follows the same order as the numbers. But that’s different with character strings. If you are not in America, there is very little chance that the order you want to see follows the ASCII order. Here I’ve run a similar query but using the column X instead of N, which is a text (VARCHAR2 in Oracle):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID fsqk4fg1t47v5, child number 0
-------------------------------------
select /*+ */ x from demo1 where x is not null order by x
--------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2493 (100)| 10000 |00:00:00.27 | 1644 | 18 | | | |
| 1 | SORT ORDER BY | | 1 | 10000 | 2493 (1)| 10000 |00:00:00.27 | 1644 | 18 | 32M| 2058K| 29M (0)|
|* 2 | INDEX FAST FULL SCAN| DEMO1_X | 1 | 10000 | 389 (0)| 10000 |00:00:00.01 | 1644 | 18 | | | |
--------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("X" IS NOT NULL)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) NLSSORT("X",'nls_sort=''FRENCH''')[2000], "X"[VARCHAR2,1000] 2 - "X"[VARCHAR2,1000]

I have created an index on X, and as you can see it can be used to get all X values, but with an Index Fast Full Scan, the multiblock index-only access, which is fast but does not return rows in the order of the index. Then a sort operation is applied. I can force an Index Full Scan with the INDEX() hint, but the sort will still have to be done.

The reason can be seen in the Column Projection Information. My Oracle client application is running on a laptop where the OS is in French, and Oracle sorts according to what the end-user expects. This is National Language Support: an Oracle database can be accessed by users all around the world, and they will see ordered lists, date formats, decimal separators,… according to their country and language.

ORDER BY … COLLATE …

My databases have been created on a system which is in English. In Postgres we can get results sorted in French with the COLLATE option of ORDER BY:


explain (analyze,verbose,costs,buffers) select x from demo1 where x is not null order by x collate "fr_FR" ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=5594.17..5619.17 rows=10000 width=1036) (actual time=36.163..37.254 rows=10000 loops=1)
Output: x, ((x)::text)
Sort Key: demo1.x COLLATE "fr_FR"
Sort Method: quicksort Memory: 1166kB
Buffers: shared hit=59
-> Index Only Scan using demo1_x on public.demo1 (cost=0.29..383.29 rows=10000 width=1036) (actual time=0.156..1.559 rows=10000 loops=1)
Output: x, x
Index Cond: (demo1.x IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=52
Planning time: 0.792 ms
Execution time: 38.264 ms

Same idea here as in Oracle: there is an additional sort operation, which is a blocking operation that needs to be completed before being able to return the first row.

The detail of the cost is the following (a quick arithmetic check follows the list):

  • The index on the column X has 52 blocks which is estimated at cost=208 (random_page_cost=4)
  • We have 10000 index entries to process, estimated at cost=50 (cpu_index_tuple_cost=0.005)
  • We have 10000 result rows to process, estimated at cost=100 (cpu_tuple_cost=0.01)
  • We have evaluated 10000 ‘is not null’ conditions, estimated at cost=25 (cpu_operator_cost=0.0025)
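Adding these up, together with the 0.29 startup cost, gives back the total cost of the Index Only Scan (a quick check, assuming the default cost constants):

select 0.29 + (52 * 4) + (10000 * 0.005) + (10000 * 0.01) + (10000 * 0.0025) as index_only_scan_cost;
-- 0.29 + 208 + 50 + 100 + 25 = 383.29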

In Oracle we can use the same COLLATE syntax, but the name of the language is different, consistent across platforms rather than using the OS one:


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 82az4syppyndf, child number 0
-------------------------------------
select /*+ */ x from demo1 where x is not null order by x collate "French"
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2493 (100)| 10000 |00:00:00.28 | 1644 | | | |
| 1 | SORT ORDER BY | | 1 | 10000 | 2493 (1)| 10000 |00:00:00.28 | 1644 | 32M| 2058K| 29M (0)|
|* 2 | INDEX FAST FULL SCAN| DEMO1_X | 1 | 10000 | 389 (0)| 10000 |00:00:00.01 | 1644 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("X" IS NOT NULL)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) NLSSORT("X" COLLATE "French",'nls_sort=''FRENCH''')[2000], "X"[VARCHAR2,1000] 2 - "X"[VARCHAR2,1000]

In Oracle, we do not need to use the COLLATE option. The language can be set for the session (NLS_LANGUAGE=’French’) or from the environment (NLS_LANG=’=French_.’). Oracle can share cursors across sessions (to avoid wasting resources compiling and optimizing the same statements used by different sessions) but will not share execution plans among different NLS environments because, as we have seen, the plan can be different. Postgres does not have to manage that because each PREPARE statement does a full compilation and optimization. There is no cursor sharing in Postgres.
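As an illustration only (not run here), the session-level alternative to the COLLATE clause would be something like this sketch; after it, a plain ORDER BY X is expected to use the French linguistic sort:

alter session set nls_language='French';
select /*+ */ x from demo1 where x is not null order by x;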

Indexing for different languages

We have seen in the Oracle execution plan Column Projection Information that an NLSSORT operation is applied on the column to get a value that follows the collation order of the language. We have seen in the previous post that we can index a function on a column. Then we have the possibility to create an index for different languages. The following index will be used to avoid a sort for French users:

create index demo1_x_fr on demo1(nlssort(x,'NLS_SORT=French'));

Since 12cR2 we can create the same with the COLLATE syntax:

create index demo1_x_fr on demo1(x collate "French");

Both syntaxes create the same index, which can be used by queries with ORDER BY … COLLATE or by sessions that set the NLS_LANGUAGE:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 82az4syppyndf, child number 0
-------------------------------------
select /*+ */ x from demo1 where x is not null order by x collate "French"
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4770 (100)| 10000 |00:00:00.02 | 4772 |
|* 1 | TABLE ACCESS BY INDEX ROWID| DEMO1 | 1 | 10000 | 4770 (1)| 10000 |00:00:00.02 | 4772 |
| 2 | INDEX FULL SCAN | DEMO1_X_FR | 1 | 10000 | 3341 (1)| 10000 |00:00:00.01 | 3341 |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X" IS NOT NULL)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "X"[VARCHAR2,1000] 2 - "DEMO1".ROWID[ROWID,10], "DEMO1"."SYS_NC00004$"[RAW,2000]

There’s no sort operation here as the INDEX FULL SCAN returns the rows in order.

PostgreSQL has the same syntax:

create index demo1_x_fr on demo1(x collate "fr_FR");

and then the query can use this index and bypass the sort operation:

explain (analyze,verbose,costs,buffers) select x from demo1 where x is not null order by x collate "fr_FR" ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using demo1_x_fr on public.demo1 (cost=0.29..383.29 rows=10000 width=1036) (actual time=0.190..1.654 rows=10000 loops=1)
Output: x, x
Index Cond: (demo1.x IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=32 read=20
Planning time: 1.049 ms
Execution time: 2.304 ms
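The collation names available to Postgres depend on the operating system locales. If you are not sure which name to use, the pg_collation catalog can be queried; a small sketch (the exact list depends on your system):

select collname, collcollate from pg_collation where collname like 'fr%';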

Avoiding a sort operation can really improve the performance of queries in two ways: it saves the resources required by the sort operation (which will have to spill to disk when the workarea does not fit in memory) and it avoids a blocking operation, making it possible to return the first rows quickly.

We have seen how indexes can be used to access a subset of columns from a smaller structure, and how they can be used to access a sorted version of the rows. Future posts will show how the index access is used to quickly filter a subset of rows. But for the moment I’ll continue on this blocking operation. We have seen a lot of Postgres costs, and they have two values (startup cost and total cost). More on startup cost in the next post.

 

Cet article Postgres vs. Oracle access paths IV – Order By and Index est apparu en premier sur Blog dbi services.
