Yann Neuhaus

dbi services technical blog

Welcome to M|17

Wed, 2017-04-12 20:00


Welcome to MariaDB’s first user conference

This big event started on the 11th at 09:00 at the Conrad Hotel in New York, close to the One World Trade Center.
After the short registration process, where we received a full bag of goodies (mobile phone lens, Jolt charger, cap, note block, …),
we could choose between 3 workshops:
– Scaling and Securing MariaDB for High Availability
– MariaDB ColumnStore for High Performance Analytics
– Building Modern Applications with MariaDB

I decided to go to the first one, presented by Michael de Groot, technical consultant at MariaDB.
After a theoretical introduction to the MariaDB cluster technology and mechanisms (around 40 slides), we had to build a MariaDB cluster composed of 4 nodes from scratch, and I have to admit that this exercise was well prepared, as we just had to follow the instructions displayed on the screen.
By the end, around 12:30, almost everybody had deployed the MariaDB cluster and was able to use and manage it.

Afterwards, it was time to get lunch. A big buffet of salads and sandwiches was waiting for us.
It was really nice because we could meet people such as Peter Zaitsev (Percona’s CEO) in a cool and relaxed atmosphere.

After lunch, a keynote was delivered by MariaDB CEO Michael Howard in the biggest conference room of the hotel, where around 400 people were present.
He mainly talked about the strategic orientation of MariaDB in the Open Source world for the coming years.
Unfortunately the air conditioning was set too cold and a lot of people started sneezing, myself included; I had to keep my jacket on the whole time.

Then, a special guest speaker, Joan Tay Kim Choo, Executive Director of Technology Operations at DBS Bank, talked about their success story:
how they migrated all their databases from Oracle Enterprise and DB2 to MariaDB.

Roger Bodamer, MariaDB Chief Product Officer, then gave his own keynote session.
It was really interesting because he discussed how MariaDB will exploit the fundamental architectural changes in the cloud and how MariaDB will enable both OLTP and analytical use cases for enterprises at any scale.

Finally, at five, the Welcome Reception and Technology Pavilion started, in other words a small party.
Good music, good red wines (the Cabernet was really good), good atmosphere.
We could meet all the speakers and I had the chance to meet Michael Widenius, alias “Monty”, founder of MySQL, a great moment for me.
He graciously agreed to take pictures with me, several times, because the light was really bad.

Around 18:30, the party was almost over. I was still there, one of the last guests, finishing my glass of Cabernet and thinking of tomorrow, the second day of this event, and all the sessions I planned to see.

 


Failed to set logmining server parameter MAX_SGA_SIZE to value XX

Wed, 2017-04-12 08:44

When you see something like this in your GoldenGate error log when you try to start an extract:

2017-04-12 14:51:38  ERROR   OGG-02039  Oracle GoldenGate Capture for Oracle, extxxx.prm:  Failed to set logmining server parameter MAX_SGA_SIZE to value 24.
2017-04-12 14:51:38  ERROR   OGG-02042  Oracle GoldenGate Capture for Oracle, extxxx.prm:  OCI Error 23605.
2017-04-12 14:51:38  ERROR   OGG-01668  Oracle GoldenGate Capture for Oracle, extxxx.prm:  PROCESS ABENDING.

… then you should increase the streams_pool_size (maybe you need to increase the sga parameters as well):

SQL> show parameter streams

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
streams_pool_size                    big integer 23M

Go at least to 1GB and you should be fine.
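
A minimal sketch of the change (1GB is just the suggested starting point; this assumes the instance uses an spfile and that sga_target / memory_target leave enough room for the increase):

SQL> -- give the logmining server enough streams pool for its MAX_SGA_SIZE
SQL> alter system set streams_pool_size=1G scope=both;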

 


PostgreSQL 10 is just around the corner, time to dare the switch?

Wed, 2017-04-12 00:54

Some days ago Robert Haas published a great blog post about the features you can expect in the upcoming PostgreSQL 10 (probably in September this year). Besides what Robert describes in his post: do you still build your database infrastructure on proprietary software? The time to move forward is now, let me explain why:

What you can always hear when it is about replacing proprietary products with open source solutions is: it does not cost anything. Well, this is not entirely true. The software itself is free and, at least when it comes to PostgreSQL, you are free to do whatever you want. But this does not mean that you do not need to spend money when using open source software. You will still need to either hire people who will operate what you need, or spend some money for someone else to operate it (in the cloud or not, that does not matter). The big difference is:

  • You won’t need to purchase licenses, that is a fact.
  • Internal or external: when you compare the effort to operate a proprietary database with the time required to operate an open source database, you’ll save money for sure, as you’ll usually reduce complexity. The database is there to do its work, not to generate huge amounts of administration effort.
  • When you need specific features that are not there yet, you’ll need to get in touch with the community and try to convince them to implement them, or you implement them yourself, or you pay someone to implement them (all choices will cost some money).

So far for the money aspects. The real benefit you get when choosing PostgreSQL is that it does not lock you in. Of course, once you start using PostgreSQL your data is in PostgreSQL and you cannot just take it as it is and put it into another database. And of course, once you start implementing business logic inside the database you might feel that this locks you in again, but this is true for every product you use: once you start using it, you use it in the way the product works, and other products usually work in another way. The key point is that you are free to do whatever you want with it, and PostgreSQL tries to be as compliant with the SQL standard as possible. This is a complete change in thinking when you are used to working with the products of the big companies. PostgreSQL is developed by people around the globe who in turn work for various companies around the globe. But there is no company called PostgreSQL; nobody can “buy” PostgreSQL. It is a pure open source project, comparable to the Linux kernel development. Nobody can “buy” the Linux kernel, but everybody can build a business around it, like the various commercial Linux distributions do. The very same is true for PostgreSQL. The PostgreSQL product itself will always be free, check the PostgreSQL license.

What you do not get from PostgreSQL itself are the tools you need around it, e.g. for monitoring, backup/restore management or automating failover and failback. The tools are there of course, both open source products as well as commercial ones. The commercial ones usually require some kind of subscription (e.g. EnterpriseDB).

Another important point to know is that PostgreSQL is supported on many platforms, check the build farm to see what is currently tested and works. You are free to choose whatever platform you want to use: your company is mainly using Windows? Go and install PostgreSQL on Windows. Your main platform is FreeBSD? Go, install and use PostgreSQL on it.

But we need professional support! I know, you are used to working with the support organizations of the big companies and believe that only paid support is good support. If you want to (or are forced to), have a look here or contact us. There are plenty of companies which offer commercial support. In fact the official mailing lists provide outstanding support as well. Post your question to the mailing list which is the right one for your question and it will get answered pretty fast, trust me. If you can’t believe it: test it (but better search the archives before asking a question, maybe the answer is already there).

There are no conferences for PostgreSQL! Really? Have a look here. The next one in Switzerland is here.

I will not go into a feature discussion here. If you want to learn more about the features of PostgreSQL, search this blog or check the official documentation. There are tons of slides on SlideShare as well, and many, many videos. If you really want to know what is currently going on in PostgreSQL development, check the PostgreSQL commit fest which is currently in progress. This is the place where patches are maintained. Everything is transparent, and for every single patch you can check how the whole discussion started on the hackers mailing list, e.g. for declarative partitioning.

Think about it …

 


Documentum – Deactivation of a docbase without uninstallation

Sun, 2017-04-09 03:19

At some of our customers, we often install new docbases for development purposes which are used only for a short time, to avoid cross-team interactions/interferences and that kind of thing. Creating new docbases is quite easy with Documentum but it still takes some time (unless you use silent installations or Docker components). Therefore installing/removing docbases over and over can be a pain. For this reason, we often install new docbases but then we don’t uninstall them, we simply “deactivate” them. By deactivate I mean updating configuration files and scripts to act just as if this docbase had never been created in the first place. As said above, some docbases are there only temporarily but we might need them again in the near future, and therefore we don’t want to remove them completely.

In this blog, I will show you which files should be updated and how to simulate a “deactivation” so that the Documentum components will just act as if the docbase wasn’t there. I will describe the steps for the different applications of the Content Server including the JMS and Thumbnail Server, the Web Application Server (D2/DA for example), the Full Text Server and the ADTS.

In this blog, I will use a Documentum 7.2 environment on Linux of course (except for the ADTS…), which therefore uses JBoss 7.1.1 (for the JMS and xPlore 1.5). In all our environments we also have a custom script that can be used to stop or start all components installed on the host. Therefore in this blog, I will assume that you have a similar script (let’s say that this script is named “startstop”) which includes a variable named “DOCBASES=” that contains the list of docbases/repositories installed on the local Content Server (DOCBASES="DOCBASE1 DOCBASE2 DOCBASE3"). For the Full Text Server, this variable will be “INDEXAGENTS=” and it will contain the names of the Index Agents installed on the local FT (INDEXAGENTS="Indexagent_DOCBASE1 Indexagent_DOCBASE2 Indexagent_DOCBASE3"). If you don’t have such a script or if it is set up differently, then just adapt the steps below. I will put this custom startstop script at the following locations: $DOCUMENTUM/scripts/startstop on the Content Server and $XPLORE_HOME/scripts/startstop on the Full Text Server. A minimal sketch of such a script is shown below.
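
To make this more concrete, here is a minimal sketch of what such a custom startstop script could look like on the Content Server side (simplified and hypothetical; a real script would also handle the docbrokers, the JMS, error handling, etc.):

#!/bin/bash
# startstop - minimal sketch: start or stop all repositories listed in DOCBASES
DOCBASES="DOCBASE1 DOCBASE2 DOCBASE3"
case "$1" in
  start)
    for DOCBASE in $DOCBASES; do
      $DOCUMENTUM/dba/dm_start_${DOCBASE}
    done
    ;;
  stop)
    for DOCBASE in $DOCBASES; do
      $DOCUMENTUM/dba/dm_shutdown_${DOCBASE}
    done
    ;;
  *)
    echo "Usage: startstop {start|stop}"
    ;;
esac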

In the steps below, I will also assume that the docbase that needs to be deactivated is "DOCBASE1" and that we have two additional docbases installed on our environment ("DOCBASE2" and "DOCBASE3") that need to stay up & running. If you have some High Availability environments, then the steps below will apply to the Primary Content Server, but for Remote Content Servers you will need to adapt the names of the docbase start and shutdown scripts which are placed under $DOCUMENTUM/dba: the correct name for Remote CSs should be $DOCUMENTUM/dba/dm_shutdown_DOCBASE1_<ServiceName@RemoteCSs>.

 

1. Content Server

Ok so let’s start with the deactivation of the docbase on the Content Server. Obviously the first thing to do is to stop the docbase if it is running:

ps -ef | grep "docbase_name DOCBASE1 " | grep -v grep
$DOCUMENTUM/dba/dm_shutdown_DOCBASE1

 

Once done, and since we don’t want the docbase to be inadvertently restarted, we need to update the custom script that I mentioned above. In addition to that, we should also rename the docbase start script so that an installer won’t start the docbase either.

mv $DOCUMENTUM/dba/dm_start_DOCBASE1 $DOCUMENTUM/dba/dm_start_DOCBASE1_deactivated
vi $DOCUMENTUM/scripts/startstop
    ==> Duplicate the line starting with "DOCBASES=..."
    ==> Comment one of the two lines and remove the docbase DOCBASE1 from the list that isn't commented
    ==> In the end, you should have something like:
        DOCBASES="DOCBASE2 DOCBASE3"
        #DOCBASES="DOCBASE1 DOCBASE2 DOCBASE3"

 

Ok so now the docbase has been stopped and can’t be started anymore, so let’s look at all the clients that were able to connect to this docbase. If you have a monitoring running on the Content Server (using the crontab for example), don’t forget to disable the monitoring too, since the docbase isn’t running anymore. In the crontab, you can just comment the lines for example (using “crontab -e”). On the Java MethodServer (JMS) side, there are at least two applications you should take a look at (ServerApps and the ACS). To deactivate the docbase DOCBASE1 for these two applications, you should apply the following steps:

cd $DOCUMENTUM_SHARED/jboss7.1.1/server/DctmServer_MethodServer/deployments
vi ServerApps.ear/DmMethods.war/WEB-INF/web.xml
    ==> Comment the 4 lines related to DOCBASE1 as follow:
        <!--init-param>
            <param-name>docbase-DOCBASE1</param-name>
            <param-value>DOCBASE1</param-value>
        </init-param-->

vi acs.ear/lib/configs.jar/config/acs.properties
    ==> Reorder the “repository.name.X=” properties for DOCBASE1 to have the biggest number (X is a number which goes from 1 to 3 in this case since I have 3 docbases)
    ==> Reorder the “repository.acsconfig.X=” properties for DOCBASE1 to have the biggest number (X is a number which goes from 1 to 3 in this case since I have 3 docbases)
    ==> Comment the “repository.name.Y=” property with the biggest number (Y is the number for DOCBASE1 so should be 3 now)
    ==> Comment the “repository.acsconfig.Y=” property with the biggest number (Y is the number for DOCBASE1 so should be 3 now)
    ==> Comment the “repository.login.Y=” property with the biggest number (Y is the number for DOCBASE1 so should be 3 now)
    ==> Comment the “repository.password.Y=” property with the biggest number (Y is the number for DOCBASE1 so should be 3 now)

 

So what has been done above? In the file web.xml, there is a reference to all docbases that are configured for the applications. Therefore commenting these lines in the file simply prevents the JMS from trying to contact the docbase DOCBASE1, because it’s not running anymore. For the ACS, the update of the file acs.properties is a little bit more complex. What I usually do in this file is reorder the properties so that the docbases that aren’t running have the biggest index. Since we have DOCBASE1, DOCBASE2 and DOCBASE3, DOCBASE1 being the first docbase installed, it will have by default the index N°1 inside acs.properties (e.g.: repository.name.1=DOCBASE1.DOCBASE1 // repository.name.2=DOCBASE2.DOCBASE2 // …). Reordering the properties simply allows you to comment only the highest number (3 in this case) for all properties while keeping the numbers 1 and 2 enabled.
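
As an illustration (the values are just examples), the repository.name entries would end up looking like this after the reordering and commenting; the same pattern applies to repository.acsconfig, repository.login and repository.password:

repository.name.1=DOCBASE2.DOCBASE2
repository.name.2=DOCBASE3.DOCBASE3
#repository.name.3=DOCBASE1.DOCBASE1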

In addition to the above, you might also have a BPM (xCP) installed, in which case you also need to apply the following step:

vi bpm.ear/bpm.war/WEB-INF/web.xml
    ==> Comment the 4 lines related to DOCBASE1 as follow:
        <!--init-param>
            <param-name>docbase-DOCBASE1</param-name>
            <param-value>DOCBASE1</param-value>
        </init-param-->

 

Once the steps have been applied, you can restart the JMS using your preferred method. This is an example:

$DOCUMENTUM_SHARED/jboss7.1.1/server/stopMethodServer.sh
ps -ef | grep "MethodServer" | grep -v grep
nohup $DOCUMENTUM_SHARED/jboss7.1.1/server/startMethodServer.sh >> $DOCUMENTUM_SHARED/jboss7.1.1/server/nohup-JMS.out 2>&1 &

 

After the restart of the JMS, it won’t log any errors related to connection problems to DOCBASE1 anymore. For example, if you don’t update the ACS file (acs.properties), it will still try to project itself to all docbases and it will therefore fail for DOCBASE1.

The next component I want to describe isn’t a component that is installed by default on all Content Servers, but you might have it if you need document previews: the Thumbnail Server. Deactivating the docbase DOCBASE1 inside the Thumbnail Server is pretty easy too:

vi $DM_HOME/thumbsrv/conf/user.dat
    ==> Comment the 5 lines related to DOCBASE1:
        #[DOCBASE1]
        #user=dmadmin
        #local_folder=thumbnails
        #repository_folder=/System/ThumbnailServer
        #pfile.txt=/app/dctm/server/product/7.2/thumbsrv/conf/DOCBASE1/pfile.txt

sh -c "$DM_HOME/thumbsrv/container/bin/shutdown.sh"
ps -ef | grep "thumbsrv" | grep -v grep
sh -c "$DM_HOME/thumbsrv/container/bin/startup.sh"

 

If you don’t do that, the Thumbnail Server will try to contact all docbases configured in the “user.dat” file and, because of a bug in certain versions of the Thumbnail Server (see this blog for more information), your Thumbnail Server might even fail to start. Therefore commenting the lines related to DOCBASE1 inside this file is quite important.

 

2. Web Application Server

For the Web Application Server hosting your Documentum Administrator and D2/D2-Config clients, the steps are pretty simple: usually nothing or almost nothing has to be done. If you really want to be clean, then there might be a few things to do; it all depends on what you configured… In this part, I will consider that you are using non-exploded applications (which means: war files). I will put these WAR files under $WS_HOME/applications/. In case your applications are exploded (meaning your D2 is a folder and not a war file), then you don’t have to extract the files (no need to execute the jar commands). If you are using a Tomcat Application Server, then the applications will usually be exploded (folders) and placed under $TOMCAT_HOME/webapps/ (see the in-place edit sketch in the D2 part below).

 – D2:

If you defined the LoadOnStartup property for DOCBASE1, then you might need to execute the following commands to extract the file, comment the lines for DOCBASE1 inside it and update the file back into the war file:

jar -xvf $WS_HOME/applications/D2.war WEB-INF/classes/D2FS.properties
sed -i 's,^LoadOnStartup.DOCBASE1.\(username\|domain\)=.*,#&,' WEB-INF/classes/D2FS.properties
jar -uvf $WS_HOME/applications/D2.war WEB-INF/classes/D2FS.properties
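
For an exploded (Tomcat) deployment, the same change is a single in-place edit, without any jar repackaging. A sketch (the webapp folder name D2 is an assumption, adapt it to your deployment):

sed -i 's,^LoadOnStartup.DOCBASE1.\(username\|domain\)=.*,#&,' $TOMCAT_HOME/webapps/D2/WEB-INF/classes/D2FS.properties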

 

Also, if you defined which docbase should be the default one in D2 and this docbase is DOCBASE1, then you need to change the default docbase to DOCBASE2 or DOCBASE3. In my case, I will use DOCBASE2 as the new default docbase:

jar -xvf $WS_HOME/applications/D2.war WEB-INF/classes/config.properties
sed -i 's,^defaultRepository=.*,defaultRepository=DOCBASE2,' WEB-INF/classes/config.properties
jar -uvf $WS_HOME/applications/D2.war WEB-INF/classes/config.properties

 

Finally, if you are using Single Sign-On, you will have an SSO user. This is defined inside the d2fs-trust.properties file in recent versions of D2, while it was defined in the shiro.ini file before. Since I’m using D2 4.5, the commands would be:

jar -xvf $WS_HOME/applications/D2.war WEB-INF/classes/d2fs-trust.properties
sed -i 's,^DOCBASE1.user=.*,#&,' WEB-INF/classes/d2fs-trust.properties
jar -uvf $WS_HOME/applications/D2.war WEB-INF/classes/d2fs-trust.properties

 

 – D2-Config:

Usually nothing is needed. Only running docbases will be available through D2-Config.

 

 – DA:

Usually nothing is needed, unless you have specific customization for DA, in which case you probably need to take a look at the files under the “custom” folder.

 

3. Full Text Server

For the Full Text Server, the steps are also relatively easy. The only thing that needs to be done is to stop the Index Agent related to the docbase DOCBASE1 and prevent it from starting again. In our environments, since we sometimes have several docbases installed on the same Content Server and several Index Agents installed on the same Full Text Server, we need to differentiate the names of the Index Agents. We usually just add the name of the docbase at the end: Indexagent_DOCBASE1. So let’s start with stopping the Index Agent:

ps -ef | grep "Indexagent_DOCBASE1" | grep -v grep
$XPLORE_HOME/jboss7.1.1/server/stopIndexagent_DOCBASE1.sh

 

Once done, and since I use the global startstop script I mentioned earlier in this blog, the only remaining step is preventing the Index Agent from starting again, which can be done in the following way:

mv $XPLORE_HOME/jboss7.1.1/server/startIndexagent_DOCBASE1.sh $XPLORE_HOME/jboss7.1.1/server/startIndexagent_DOCBASE1.sh_deactivated
vi $XPLORE_HOME/scripts/startstop
    ==> Duplicate the line starting with "INDEXAGENTS=..."
    ==> Comment one of the two lines and remove the Index Agent related to DOCBASE1 from the list that isn't commented
    ==> In the end, you should have something like:
        INDEXAGENTS="Indexagent_DOCBASE2 Indexagent_DOCBASE3"
        #INDEXAGENTS="Indexagent_DOCBASE1 Indexagent_DOCBASE2 Indexagent_DOCBASE3"

 

If you have a monitoring running on the Full Text Server for this Index Agent, don’t forget to disable it.

 

4. ADTS

The last section of this blog will talk about the ADTS (Advanced Document Transformation Services), also called the Rendition Server. The ADTS is fairly similar to all other Documentum components: first you install the different binaries and then you can configure a docbase to use/be supported by the ADTS. By doing that, the ADTS updates some configuration files which therefore need to be updated again if you want to deactivate a docbase. As you know, the ADTS runs on a Windows Server, so I won’t show you commands to be executed in this section; I will just point you to the configuration files that need to be edited and what to update inside them. In this section, I will use %ADTS_HOME% as the folder under which the ADTS has been installed. It’s usually a good idea to install the ADTS on a specific/separate drive (not the OS drive) like D:\CTS\.

So the first thing to do is to prevent the different profiles for the docbase from being loaded:

Open the file "%ADTS_HOME%\CTS\config\CTSProfileService.xml"
    ==> Comment the whole "ProfileManagerContext" XML tag related to DOCBASE1
    ==> In the end, you should have something like:
        <!--ProfileManagerContext DocbaseName="DOCBASE1" ProcessExternally="false">
            <CTSServerProfile CTSProfileValue="%ADTS_HOME%\CTS\\docbases\\DOCBASE1\\config\\profiles\\lightWeightProfiles" CTSProfileName="lightWeightProfile"/>
            <CTSServerProfile CTSProfileValue="%ADTS_HOME%\CTS\\docbases\\DOCBASE1\\config\\profiles\\lightWeightSystemProfiles" CTSProfileName="lightWeightSystemProfile"/>
            <CTSServerProfile CTSProfileValue="%ADTS_HOME%\CTS\\docbases\\DOCBASE1\\config\\profiles\\heavyWeightProfiles" CTSProfileName="heavyWeightProfile"/>
            <CTSServerProfile CTSProfileValue="/System/Media Server/Profiles" CTSProfileName="lightWeightProfileFolder"/>
            <CTSServerProfile CTSProfileValue="/System/Media Server/System Profiles" CTSProfileName="lightWeightSystemProfileFolder"/>
            <CTSServerProfile CTSProfileValue="/System/Media Server/Command Line Files" CTSProfileName="heavyWeightProfileFolder"/>
            <CTSServerProfile CTSProfileValue="%ADTS_HOME%\CTS\docbases\DOCBASE1\config\temp_profiles" CTSProfileName="tempFileDir"/>
            <CTSServerProfile CTSProfileValue="ProfileSchema.dtd" CTSProfileName="lwProfileDTD"/>
            <CTSServerProfile CTSProfileValue="MP_PROPERTIES.dtd" CTSProfileName="hwProfileDTD"/>
            <ForClients>XCP</ForClients>
        </ProfileManagerContext-->

 

Once that is done, the queue processors need to be disabled too:

Open the file "%ADTS_HOME%\CTS\config\CTSServerService.xml"
    ==> Comment the two "QueueProcessorContext" XML tags related to DOCBASE1
    ==> In the end, you should have something like (I'm not displaying the whole XML tags since they are quite long...):
        <!--QueueProcessorContext DocbaseName="DOCBASE1">
            <CTSServer AttributeName="queueItemName" AttributeValue="dm_mediaserver"/>
            <CTSServer AttributeName="queueInterval" AttributeValue="10"/>
            <CTSServer AttributeName="maxThreads" AttributeValue="10"/>
            ...
            <CTSServer AttributeName="processOnlyParked" AttributeValue=""/>
            <CTSServer AttributeName="parkingServerName" AttributeValue=""/>
            <CTSServer AttributeName="notifyFailureMessageAdmin" AttributeValue="No"/>
        </QueueProcessorContext-->
        <!--QueueProcessorContext DocbaseName="DOCBASE1">
            <CTSServer AttributeName="queueItemName" AttributeValue="dm_autorender_win31"/>
            <CTSServer AttributeName="queueInterval" AttributeValue="10"/>
            <CTSServer AttributeName="maxThreads" AttributeValue="10"/>
            ...
            <CTSServer AttributeName="processOnlyParked" AttributeValue=""/>
            <CTSServer AttributeName="parkingServerName" AttributeValue=""/>
            <CTSServer AttributeName="notifyFailureMessageAdmin" AttributeValue="No"/>
        </QueueProcessorContext-->

 

After that, there is only one last configuration file to be updated and that’s the session manager, which is responsible for the errors printed during startup of the ADTS, because it defines which docbases the ADTS should try to contact, using which user/password, and how many retries should be performed:

Open the file "%ADTS_HOME%\CTS\config\SessionService.xml"
    ==> Comment the whole "LoginContext" XML tag related to DOCBASE1
    ==> In the end, you should have something like:
        <!--LoginContext DocbaseName="DOCBASE1" Domain="" isPerformanceLogRepository="false">
            <CTSServer AttributeName="userName" AttributeValue="adtsuser"/>
            <CTSServer AttributeName="passwordFile" AttributeValue="%ADTS_HOME%\CTS\docbases\DOCBASE1\config\pfile\mspassword.txt"/>
            <CTSServer AttributeName="maxConnectionRetries" AttributeValue="10"/>
        </LoginContext-->

 

Once the configuration files have been updated, simply restart the ADTS services for the changes to be applied.

 

And here we go, you should have a clean environment with one docbase less configured, without having to remove it on all servers. As a final note, if you ever want to reactivate the docbase, simply uncomment everything that was commented above, restore the default line in the custom “startstop” scripts and rename the Documentum start scripts back to their original names (without the “_deactivated”) on the Content Server and Full Text Server, as sketched below.
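
For the renaming part, the reactivation boils down to reversing the two mv commands used earlier (a sketch):

# Content Server: restore the docbase start script
mv $DOCUMENTUM/dba/dm_start_DOCBASE1_deactivated $DOCUMENTUM/dba/dm_start_DOCBASE1
# Full Text Server: restore the Index Agent start script
mv $XPLORE_HOME/jboss7.1.1/server/startIndexagent_DOCBASE1.sh_deactivated $XPLORE_HOME/jboss7.1.1/server/startIndexagent_DOCBASE1.sh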

 

 


Service “696c6f76656d756c746974656e616e74” has 1 instance(s).

Sat, 2017-04-08 02:53

Weird title, isn’t it? That was my reaction when I did my first ‘lsnrctl status’ in 12.2: weird service name… If you have installed 12.2 multitenant, then you have probably seen this strange service name registered in your listener, one per PDB. It is not a bug. It is an internal service used to connect to the remote PDB for features like Proxy PDB. This name is the GUID of the PDB, which makes this service independent of the name or the physical location of the PDB. You can use it to connect to the PDB, but you should not: it is an internal service name. But on a lab, let’s play with it.

CDB

I have two Container Databases on my system:

18:01:33 SQL> connect sys/oracle@//localhost/CDB2 as sysdba
Connected.
18:01:33 SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO

CDB2 has been created without any pluggable databases (except PDB$SEED of course).

18:01:33 SQL> connect sys/oracle@//localhost/CDB1 as sysdba
Connected.
18:01:33 SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO
4 PDB1 READ WRITE NO

CDB1 has one pluggable database PDB1.

PDB1 has its system files in /u01/oradata/CDB1/PDB1/ and I’ve a user tablespace datafiles elsewhere:

18:01:33 SQL> select con_id,file_name from cdb_data_files;
CON_ID FILE_NAME
------ -------------------------------------
1 /u01/oradata/CDB1/users01.dbf
1 /u01/oradata/CDB1/undotbs01.dbf
1 /u01/oradata/CDB1/system01.dbf
1 /u01/oradata/CDB1/sysaux01.dbf
4 /u01/oradata/CDB1/PDB1/undotbs01.dbf
4 /u01/oradata/CDB1/PDB1/sysaux01.dbf
4 /u01/oradata/CDB1/PDB1/system01.dbf
4 /u01/oradata/CDB1/PDB1/USERS.dbf
4 /var/tmp/PDB1USERS2.dbf

Both are registered to the same local listener:

SQL> host lsnrctl status
 
LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 07-APR-2017 18:01:33
 
Copyright (c) 1991, 2016, Oracle. All rights reserved.
 
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date 07-APR-2017 07:53:06
Uptime 0 days 10 hr. 8 min. 27 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Log File /u01/app/oracle/diag/tnslsnr/VM104/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=VM104)(PORT=1521)))
Services Summary...
Service "4aa269fa927779f0e053684ea8c0c27f" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "CDB1" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "CDB1XDB" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "CDB2" has 1 instance(s).
Instance "CDB2", status READY, has 1 handler(s) for this service...
Service "CDB2XDB" has 1 instance(s).
Instance "CDB2", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
The command completed successfully

Each container database registers its db_unique_name as a service: CDB1 and CDB2, with an XDB service for each: CDB1XDB and CDB2XDB, and each pluggable database also has its own service: PDB1 here. This is what we had in 12.1, but in 12.2 there is one more service with a strange name in hexadecimal: 4aa269fa927779f0e053684ea8c0c27f

Connect to PDB without a service name?

Want to know more about it? Let’s try to connect to it:

SQL> connect sys/oracle@(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=4aa269fa927779f0e053684ea8c0c27f))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.78.104)(PORT=1521))) as sysdba
Connected.
SQL> select sys_context('userenv','cdb_name'), sys_context('userenv','con_name'), sys_context('userenv','service_name') from dual;
 
SYS_CONTEXT('USERENV','CDB_NAME') SYS_CONTEXT('USERENV','CON_NAME') SYS_CONTEXT('USERENV','SERVICE_NAME')
--------------------------------- --------------------------------- -------------------------------------
CDB1 PDB1 SYS$USERS

With this service, I can connect to PDB1, but the service name I used in the connection string is not a real service:

SQL> select name from v$services;
 
NAME
----------------------------------------------------------------
pdb1
 
SQL> show parameter service
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string CDB1

The documentation says that SYS$USERS is the default database service for user sessions that are not associated with services so I’m connected to a PDB here without a service.

GUID

The internal service name is the GUID of the PDB, which identifies the container even after unplug/plug.

SQL> select pdb_id,pdb_name,con_uid,guid from dba_pdbs;
 
PDB_ID PDB_NAME CON_UID GUID
------ -------- ------- ----
4 PDB1 2763763322 4AA269FA927779F0E053684EA8C0C27F

Proxy PDB

This internal service has been introduced in 12cR2 for the Proxy PDB feature: access to a PDB through another one, so that you don’t have to change the connection string when you migrate the PDB to another server.

I’ll create a Proxy PDB in CDB2 to connect to PDB1 which is in CDB1. This is simple: create a database link for the creation of the Proxy PDB which I call PDB1PX1:

18:01:33 SQL> connect sys/oracle@//localhost/CDB2 as sysdba
Connected.
18:01:33 SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO
 
18:01:33 SQL> create database link CDB1 connect to system identified by oracle using '//localhost/CDB1';
Database link CDB1 created.
 
18:01:38 SQL> create pluggable database PDB1PX1 as proxy from PDB1@CDB1
file_name_convert=('/u01/oradata/CDB1/PDB1','/u01/oradata/CDB1/PDB1PX1');
 
Pluggable database PDB1PX1 created.
 
18:02:14 SQL> drop database link CDB1;
Database link CDB1 dropped.

The Proxy PDB clones the system tablespaces, and this is why I had to give a file_name_convert. Note that the user tablespace datafile is not cloned, so I don’t need to convert the ‘/var/tmp/PDB1USERS2.dbf’. The dblink is not needed anymore once the Proxy PDB is created, as it is used only for the clone of system tablespaces. The PDB is currently in mount.

18:02:14 SQL> connect sys/oracle@//localhost/CDB2 as sysdba
Connected.
18:02:14 SQL> show pdbs
 
CON_ID CON_NAME OPEN MODE RESTRICTED
------ -------- ---- ---- ----------
2 PDB$SEED READ ONLY NO
3 PDB1PX1 MOUNTED

The system tablespaces are there (I’m in 12.2 with local undo which is required for Proxy PDB feature)

18:02:14 SQL> select con_id,file_name from cdb_data_files;
 
CON_ID FILE_NAME
------ ---------
1 /u01/oradata/CDB2/system01.dbf
1 /u01/oradata/CDB2/sysaux01.dbf
1 /u01/oradata/CDB2/users01.dbf
1 /u01/oradata/CDB2/undotbs01.dbf

I open the PDB

18:02:19 SQL> alter pluggable database PDB1PX1 open;
Pluggable database PDB1PX1 altered.

connect

I have now 3 ways to connect to PDB1: with the PDB1 service, with the internal service, and through the Proxy PDB service.
I’ve tested the 3 ways:


18:02:45 SQL> connect demo/demo@//localhost/PDB1
18:02:56 SQL> connect demo/demo@//localhost/PDB1PX1
18:03:06 SQL> connect demo/demo@//localhost/4aa269fa927779f0e053684ea8c0c27f

and I’ve inserted each time into a DEMO table the information about my connection:
SQL> insert into DEMO select '&_connect_identifier' "connect identifier", current_timestamp "timestamp", sys_context('userenv','cdb_name') "CDB name", sys_context('userenv','con_name') "con name" from dual;

Here is the result:

connect identifier timestamp CDB name container name
------------------ --------- -------- --------------
//localhost/PDB1 07-APR-17 06.02.50.977839000 PM CDB1 PDB1
//localhost/PDB1PX1 07-APR-17 06.03.01.492946000 PM CDB1 PDB1
//localhost/4aa269fa927779f0e053684ea8c0c27f 07-APR-17 06.03.11.814039000 PM CDB1 PDB1

We are connected to the same database in all three cases. As this test is on the same server with the same listener, I can check what is logged in the listener log.

Here are the $ORACLE_BASE/diag/tnslsnr/$(hostname)/listener/alert/log.xml entries related to my connections.

//localhost/PDB1

When connecting directly to PDB1 the connection is simple:


<msg time='2017-04-07T18:02:45.644+02:00' org_id='oracle' comp_id='tnslsnr'
type='UNKNOWN' level='16' host_id='VM104'
host_addr='192.168.78.104' pid='1194'>
<txt>07-APR-2017 18:02:45 * (CONNECT_DATA=(SERVICE_NAME=PDB1)(CID=(PROGRAM=java)(HOST=VM104)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=27523)) * establish * PDB1 * 0
</txt>
</msg>

I am connecting with SQLcl which is java: (PROGRAM=java)

//localhost/PDB1PX1

When connecting through the Proxy PDB I see the connection to the proxy PDB1PX1:


<msg time='2017-04-07T18:02:56.058+02:00' org_id='oracle' comp_id='tnslsnr'
type='UNKNOWN' level='16' host_id='VM104'
host_addr='192.168.78.104' pid='1194'>
<txt>07-APR-2017 18:02:56 * (CONNECT_DATA=(SERVICE_NAME=PDB1PX1)(CID=(PROGRAM=java)(HOST=VM104)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=27524)) * establish * PDB1PX1 * 0
</txt>
</msg>

This is the java connection. But I can also see the connection to the remote PDB1 from the Proxy PDB


<msg time='2017-04-07T18:03:01.375+02:00' org_id='oracle' comp_id='tnslsnr'
type='UNKNOWN' level='16' host_id='VM104'
host_addr='192.168.78.104' pid='1194'>
<txt>07-APR-2017 18:03:01 * (CONNECT_DATA=(SERVICE_NAME=4aa269fa927779f0e053684ea8c0c27f)(CID=(PROGRAM=oracle)(HOST=VM104)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.78.104)(PORT=16787)) * establish * 4aa269fa927779f0e053684ea8c0c27f * 0
</txt>
</msg>

Here the program is (PROGRAM=oracle), which is a CDB2 instance process connecting to the remote CDB1 through the internal service.

//localhost/4aa269fa927779f0e053684ea8c0c27f

When I connect to the internal service, I see the same connection to PDB1’s GUID but from (PROGRAM=java) directly


<msg time='2017-04-07T18:03:06.671+02:00' org_id='oracle' comp_id='tnslsnr'
type='UNKNOWN' level='16' host_id='VM104'
host_addr='192.168.78.104' pid='1194'>
<txt>07-APR-2017 18:03:06 * (CONNECT_DATA=(SERVICE_NAME=4aa269fa927779f0e053684ea8c0c27f)(CID=(PROGRAM=java)(HOST=VM104)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=27526)) * establish * 4aa269fa927779f0e053684ea8c0c27f * 0
</txt>
</msg>

One more…

So each user PDB, in addition to the PDB name and the additional services you have defined, registers an additional internal service, whether the PDB is opened or closed. And the fun part is that the Proxy PDB also registers this additional service. Here is my listener status:


Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=VM104)(PORT=1521)))
Services Summary...
Service "4aa269fa927779f0e053684ea8c0c27f" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "4c96bda23b8e41fae053684ea8c0918b" has 1 instance(s).
Instance "CDB2", status READY, has 1 handler(s) for this service...
Service "CDB1" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "CDB1XDB" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "CDB2" has 1 instance(s).
Instance "CDB2", status READY, has 1 handler(s) for this service...
Service "CDB2XDB" has 1 instance(s).
Instance "CDB2", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
Instance "CDB1", status READY, has 1 handler(s) for this service...
Service "pdb1px1" has 1 instance(s).
Instance "CDB2", status READY, has 1 handler(s) for this service...
The command completed successfully

This “4c96bda23b8e41fae053684ea8c0918b” is the GUID of the Proxy PDB.
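
Connecting through this internal service of the Proxy PDB (a sketch by analogy with the earlier connect strings, presumably the connection behind the query below):

SQL> connect demo/demo@//localhost/4c96bda23b8e41fae053684ea8c0918b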

SQL> select sys_context('userenv','cdb_name'), sys_context('userenv','con_name'), sys_context('userenv','service_name') from dual;
 
SYS_CONTEXT('USERENV','CDB_NAME')
--------------------------------------------------------------------------------
SYS_CONTEXT('USERENV','CON_NAME')
--------------------------------------------------------------------------------
SYS_CONTEXT('USERENV','SERVICE_NAME')
--------------------------------------------------------------------------------
CDB1
PDB1
SYS$USERS

So that’s a fourth way to connect to PDB1: through the internal service of the Proxy PDB.

Then you can immediately imagine what I tried…

ORA-65280

Because the internal service name is used to connect through the Proxy PDB, can I create a proxy for the proxy?

18:03:32 SQL> create pluggable database PDB1PX2 as proxy from PDB1PX1@CDB2
2 file_name_convert=('/u01/oradata/CDB1/PDB1/PX1','/u01/oradata/CDB1/PDB1PX2');
 
Error starting at line : 76 File @ /media/sf_share/122/blogs/proxypdb.sql
In command -
create pluggable database PDB1PX2 as proxy from PDB1PX1@CDB2
file_name_convert=('/u01/oradata/CDB1/PDB1/PX1','/u01/oradata/CDB1/PDB1PX2')
Error report -
ORA-65280: The referenced pluggable database is a proxy pluggable database.

The answer is no. You cannot nest Proxy PDBs.

So what?

Don’t panic when looking at services registered in the listener. Those hexadecimal service names are expected in 12.2, with one per user PDB. You see them, but you have no reason to use them directly. You will use them indirectly when creating a Proxy PDB, which makes the location where users connect independent from the physical location of the PDB. Very interesting for migrations, because the client configuration is independent from the migration (think hybrid cloud). You can use this feature even without the multitenant option. Want to see all the multitenant architecture options available without the option? Look at the ITOUG Tech Day agenda

 


Trace files segmented in multiple parts as a workaround for bug 23300142

Fri, 2017-04-07 12:27

Today I visited a customer who had deleted a Data Guard configuration (i.e. a temporary Data Guard setup through the broker was deleted). LOG_ARCHIVE_DEST_STATE_2 on the primary database was set to DEFER temporarily. That resulted in trace files with the name *tt*.trc becoming huge (GBytes after a couple of days). Analysis showed that this was caused by bug 23300142 in 12.1.0.2. See My Oracle Support Note

Bug 23300142 - TT background process trace file message: async ignored current log: kcclenal clear thread open (Doc ID 23300142.8)

for details.
Unfortunately the bug does not have a workaround.
Due to the fact that the affected development databases (which were now normal single instances without Data Guard) could not be restarted, I searched for a temporary workaround to stop the trace files from growing further. Limiting the trace-file size on the database with

alter system set max_dump_file_size='100M';

did actually not always work to limit the file size. Here is an example of a huge trace file (over 5GB):


$ find . -name "*tt*.trc" -ls | tr -s " " | cut -d " " -f7-11 | sort -n
...
5437814195 Apr 7 10:46 ./xxxxxx_site1/XXXXXX/trace/XXXXXX_tt00_28304.trc

However, what came in handy was the uts-trace-segmentation feature of 12c. See Jonathan Lewis’ blog here:

https://jonathanlewis.wordpress.com/2016/01/26/trace-file-size

I.e. I left all DBs on max_dump_file_size=unlimited and set


SQL> alter system set "_uts_first_segment_size" = 52428800 scope=memory;
SQL> alter system set "_uts_trace_segment_size" = 52428800 scope=memory;

Unfortunately setting the limit to the tt-background-process alone does not work:


SQL> exec dbms_system.set_int_param_in_session(sid => 199, serial# => 44511, parnam => '_uts_trace_segment_size', intval => 52428800);
BEGIN dbms_system.set_int_param_in_session(sid => 199, serial# => 44511, parnam => '_uts_trace_segment_size', intval => 52428800); END;
 
*
ERROR at line 1:
ORA-44737: Parameter _uts_trace_segment_size did not exist.
ORA-06512: at "SYS.DBMS_SYSTEM", line 117
ORA-06512: at line 1

With the default setting of “_uts_trace_segments” (maximum number of trace segments) = 5, I could limit the maximum trace size for one DB to 250MB (50MB * 5). Below you can see only 4 files, because of 2 tests with earlier splittings of the trace file:


$ ls -ltr *_tt00_28304*.trc
-rw-r----- 1 oracle dba 52428964 Apr 7 14:14 XXXXXX_tt00_28304_3.trc
-rw-r----- 1 oracle dba 52428925 Apr 7 16:07 XXXXXX_tt00_28304_4.trc
-rw-r----- 1 oracle dba 52428968 Apr 7 17:12 XXXXXX_tt00_28304_5.trc
-rw-r----- 1 oracle dba 43887950 Apr 7 18:50 XXXXXX_tt00_28304.trc
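
If you also want to control the number of segments that are kept, the corresponding parameter can be set the same way (a sketch, memory-only like the settings above, so it does not survive an instance restart):

SQL> -- keep at most 5 segments of ~50MB each, i.e. roughly 250MB per trace file
SQL> alter system set "_uts_trace_segments" = 5 scope=memory;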

The feature of segmented trace-files may help a lot in situations like bug 23300142.

REMARK: Do not use underscore parameters in production environments without agreement from Oracle Support.

 


12cR2 DML monitoring and Statistics Advisor

Thu, 2017-04-06 15:40

Monitoring DML to get an idea of the activity on our tables is not new. The number of inserts/deletes/updates/truncates since the last stats gathering is tracked automatically. The statistics gathering job uses it to list and prioritize tables that need fresh statistics. This works for slow changes on tables. In 12.2 we have the Statistics Advisor that goes further, with a rule that detects volatile tables:

SQL> select * from V$STATS_ADVISOR_RULES where rule_id=14;
 
RULE_ID NAME RULE_TYPE DESCRIPTION CON_ID
------- ---- --------- ----------- ------
14 LockVolatileTable OBJECT Statistics for objects with volatile data should be locked 0

But to detect volatile tables, you need to track DML frequency at a finer grain. Let’s investigate what is new here in 12.2.

Statistics Advisor tracing

DBMS_STATS has its trace mode enabled as a global preference. It is not documented, but it works with powers of two. 12.1.0.2 introduced 262144 to trace system statistics gathering, so let’s try the next one: 524288

SQL> exec dbms_stats.set_global_prefs('TRACE',0+524288)
PL/SQL procedure successfully completed.

After a while, I grepped my trace directory for DBMS_STATS and found the MMON slave trace (ORCLA_m001_30694.trc here):

*** 2017-04-06T14:10:11.979283+02:00
*** SESSION ID:(81.2340) 2017-04-06T14:10:11.979302+02:00
*** CLIENT ID:() 2017-04-06T14:10:11.979306+02:00
*** SERVICE NAME:(SYS$BACKGROUND) 2017-04-06T14:10:11.979309+02:00
*** MODULE NAME:(MMON_SLAVE) 2017-04-06T14:10:11.979313+02:00
*** ACTION NAME:(Flush KSXM hash table action) 2017-04-06T14:10:11.979317+02:00
*** CLIENT DRIVER:() 2017-04-06T14:10:11.979320+02:00
 
...
 
DBMS_STATS: compute_volatile_flag: objn=74843, flag=0, new_flag=0, inserts_new=619, updates_new=0, deletes_new=0, inserts_old=619, updates_old=0, deletes_old=0, rowcnt=, rowcnt_loc=, stale_pcnt=10, gather=NO_GATHER, flag_result=0
DBMS_STATS: compute_volatile_flag: objn=74862, flag=0, new_flag=0, inserts_new=4393, updates_new=0, deletes_new=0, inserts_old=4393, updates_old=0, deletes_old=0, rowcnt=, rowcnt_loc=, stale_pcnt=10, gather=NO_GATHER, flag_result=0
DBMS_STATS: compute_volatile_flag: objn=74867, flag=1, new_flag=0, inserts_new=4861477, updates_new=584000, deletes_new=13475192, inserts_old=3681477, updates_old=466000, deletes_old=12885192, rowcnt=, rowcnt_loc=, stale_pcnt=10, gather=NO_GATHER, flag_result=1

Those entries appear every hour. Obviously, they are looking at some tables (by their object_id) and compute a new flag from an existing flag and statistics about new and old DML (inserts, updates, deletes). There’s a mention of row count and stale percentage. Obviously, the volatility of tables is computed every hour (mentions gather=NO_GATHER) or when we gather statistics (gather=GATHER). This goes beyond the DML monitoring from the previous releases, but is probably based on it.

Testing some DML

SQL> delete from DEMO;
10000 rows deleted.
 
SQL> insert into DEMO select rownum from xmltable('1 to 10000');
10000 rows created.
 
SQL> commit;
Commit complete.
 
SQL> select count(*) numrows from DEMO;
NUMROWS
----------
10000
 
SQL> update demo set n=n+1 where rownum <= 2000;
 
2000 rows updated.
 
SQL> insert into DEMO select rownum from xmltable('1 to 10000');
 
10000 rows created.

I deleted 10000 rows and inserted 10000, with a commit at the end. Then I updated 2000 rows and inserted 10000 again, without commit.

x$ksxmme

DML monitoring is done in memory: in order to see the changes in DBA_TAB_MODIFICATIONS, we need to flush it. But this in-memory information is visible in an X$ fixed view:

SQL> select * from X$KSXMME where objn=&object_id;
old 1: select * from X$KSXMME where objn=&object_id
new 1: select * from X$KSXMME where objn= 74867
 
ADDR INDX INST_ID CON_ID CHUNKN SLOTN OBJN INS UPD DEL DROPSEG CURROWS PAROBJN LASTUSED FLAGS
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
00007F526E0B81F0 0 1 0 64 256 74867 20000 2000 10000 0 2350000 0 1491467123 128

Here are my 10000 deletes + 10000 inserts + 2000 updates + 10000 inserts. Of course the uncommitted ones are there, because DML tracking does not keep the numbers per transaction in order to update later what is committed or not.

The proof is that when I rollback, the numbers do not change:

SQL> rollback;
Rollback complete.
 
SQL> select * from X$KSXMME where objn=&object_id;
old 1: select * from X$KSXMME where objn=&object_id
new 1: select * from X$KSXMME where objn= 74867
 
ADDR INDX INST_ID CON_ID CHUNKN SLOTN OBJN INS UPD DEL DROPSEG CURROWS PAROBJN LASTUSED FLAGS
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
00007F526DDF47F8 0 1 0 64 256 74867 20000 2000 10000 0 2350000 0 1491467123 128

Yes, there is an estimation of the current number of rows here, in real time. This is used to compare the changes with the total number, but you can also use it to see the progress of a big transaction, as it gives a view of uncommitted changes (see the sketch below).
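
For example, to watch the progress of a long-running bulk operation on one table from another session, something like this can be used (a sketch; X$ views require a SYSDBA connection and the owner/table names are just placeholders):

SQL> -- INS/UPD/DEL are the uncommitted DML counters, CURROWS the estimated current row count
SQL> select ins, upd, del, currows from X$KSXMME
     where objn = (select object_id from dba_objects where owner='DEMO' and object_name='DEMO');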

sys.mon_mods_all$

The table sys.mon_mods_all$ is what is behind DBA_TAB_MODIFICATIONS (not exactly, but that will be for another blog post) and you have to flush what’s in memory to see the latest changes there:

SQL> exec dbms_stats.flush_database_monitoring_info;
PL/SQL procedure successfully completed.
 
SQL> select * from sys.mon_mods_all$ where obj#=&object_id;
old 1: select * from sys.mon_mods_all$ where obj#=&object_id
new 1: select * from sys.mon_mods_all$ where obj#= 74867
 
OBJ# INSERTS UPDATES DELETES TIMESTAMP FLAGS DROP_SEGMENTS
---------- ---------- ---------- ---------- ------------------ ---------- -------------
74867 5581477 656000 13835192 06-APR 15:10:53 1 0

The flag 1 means that the table has been truncated since the latest stats gathering.
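
The same counters, plus the truncate information, are also exposed through the documented view (a sketch; flush the monitoring info first, as above):

SQL> select inserts, updates, deletes, truncated, timestamp
     from dba_tab_modifications
     where table_owner='DEMO' and table_name='DEMO';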

This is what we already knew from the previous releases. Nothing to do with the trace we see every hour in the MMON slave.

sys.optstat_snapshot$

What happens every hour is that a snapshot of sys.mon_mods_all$ is stored in sys.optstat_snapshot$:

SQL> select * from sys.optstat_snapshot$ where obj#=&object_id order by timestamp;
old 1: select * from sys.optstat_snapshot$ where obj#=&object_id order by timestamp
new 1: select * from sys.optstat_snapshot$ where obj#= 74867 order by timestamp
 
OBJ# INSERTS UPDATES DELETES FLAGS TIMESTAMP
---------- ---------- ---------- ---------- ---------- ------------------
74867 999 0 0 32 05-APR-17 17:27:01
74867 1997 0 0 32 05-APR-17 17:33:25
74867 1997 0 0 32 05-APR-17 17:33:31
74867 1997 0 0 32 05-APR-17 17:33:32
74867 80878 0 160 0 05-APR-17 18:59:37
74867 90863 0 210 0 05-APR-17 20:53:07
74867 10597135 0 410 0 05-APR-17 21:53:13
74867 10598134 0 410 32 05-APR-17 22:02:38
74867 38861 0 10603745 1 06-APR-17 08:17:58
74867 38861 0 10603745 1 06-APR-17 09:18:04
74867 581477 124000 11175192 1 06-APR-17 10:11:27
74867 1321477 230000 11705192 1 06-APR-17 11:09:50
74867 2481477 346000 12285192 1 06-APR-17 12:09:56
74867 3681477 466000 12885192 1 06-APR-17 01:10:04
74867 4861477 584000 13475192 1 06-APR-17 02:10:11
74867 5561477 654000 13825192 1 06-APR-17 03:10:19

You see snapshots every hour, the latest being 03:10, 02:10, 01:10, 12:09, 11:09, …
You also see additional snapshots at each statistics gathering. I ran dbms_stats.gather_table_stats several times at 17:27 and 17:33 the day before. Those snapshots are flagged 32.
The statistics were gathered again at 20:02 (the auto job) and I truncated the table after that, which is why the flag is 1.

dbms_stats_advisor.compute_volatile_flag

My guess is that there should be a flag for volatile tables here, because I’ve seen a trace for compute_volatile_flag in the MMON trace, so I enabled sql_trace for the MMON slave, and here is the query which takes the snapshot:

insert /* KSXM:TAKE_SNPSHOT */ into sys.optstat_snapshot$ (obj#, inserts, updates, deletes, timestamp, flags) (select m.obj#, m.inserts, m.updates, m.deletes, systimestamp, dbms_stats_advisor.compute_volatile_flag( m.obj#, m.flags, :flags, m.inserts, m.updates, m.deletes, s.inserts, s.updates, s.deletes, null, nvl(to_number(p.valchar), :global_stale_pcnt), s.gather) flags from sys.mon_mods_all$ m, (select si.obj#, max(si.inserts) inserts, max(si.updates) updates, max(si.deletes) deletes, decode(bitand(max(si.flags), :gather_flag), 0, 'NO_GATHER', 'GATHER') gather, max(si.timestamp) timestamp from sys.optstat_snapshot$ si, (select obj#, max(timestamp) ts from sys.optstat_snapshot$ group by obj#) sm where si.obj# = sm.obj# and si.timestamp = sm.ts group by si.obj#) s, sys.optstat_user_prefs$ p where m.obj# = s.obj#(+) and m.obj# = p.obj#(+) and pname(+) = 'STALE_PERCENT' and dbms_stats_advisor.check_mmon_policy_violation(rownum, 6, 2) = 0)

It reads the current values (from sys.mon_mods_all$) and the last values (from sys.optstat_snapshot$), reads the stale percentage parameter, and calls the dbms_stats_advisor.compute_volatile_flag function, which updates the flag with the one passed as :flags, probably adding the value 64 (see below) when the table is volatile (probably when the sum of DML is over the row count + stale percentage). The function is probably different when the snapshot comes from statistics gathering (‘GATHER’) or from DML monitoring (‘NO_GATHER’), because the number of rows is either absolute or relative to the previous one.

From the trace of bind variables, or simply from the dbms_stats trace, I can see all values:
DBMS_STATS: compute_volatile_flag: objn=74867, flag=1, new_flag=0, inserts_new=5701477, updates_new=668000, deletes_new=13895192, inserts_old=5701477, updates_old=668000, deletes_old=13895192, rowcnt=, rowcnt_loc=, stale_pcnt=10, gather=NO_GATHER, flag_result=1
DBMS_STATS: compute_volatile_flag: objn=74867, flag=1, new_flag=0, inserts_new=4861477, updates_new=584000, deletes_new=13475192, inserts_old=3681477, updates_old=466000, deletes_old=12885192, rowcnt=, rowcnt_loc=, stale_pcnt=10, gather=NO_GATHER, flag_result=1
DBMS_STATS: compute_volatile_flag: objn=74867, flag=1, new_flag=0, inserts_new=5561477, updates_new=654000, deletes_new=13825192, inserts_old=4861477, updates_old=584000, deletes_old=13475192, rowcnt=, rowcnt_loc=, stale_pcnt=10, gather=NO_GATHER, flag_result=1

The input flag is 1 and the output flag is 1. And I think that, whatever the number of DML we have, this is because the new_flag=0

This explains why I was not able to have snapshots flagged as volatile even when changing a lot of rows. Then how can the Statistics Advisor detect my volatile table?

Statistics Advisor

I’ve traced the statistics advisor

set long 100000 longc 10000
variable t varchar2(30)
variable e varchar2(30)
variable r clob
exec :t:= DBMS_STATS.CREATE_ADVISOR_TASK('my_task');
exec :e:= DBMS_STATS.EXECUTE_ADVISOR_TASK('my_task');
exec :r:= DBMS_STATS.REPORT_ADVISOR_TASK('my_task');
print r

No ‘LockVolatileTable’ rule has raised a recommendation, but I’ve seen a call to the DBMS_STATS.CHECK_VOLATILE function with an object_id as parameter.

dbms_stats_internal.check_volatile

In order to understand what the criteria are, I ran (with sql_trace) the function on my table:

SQL> select dbms_stats_internal.check_volatile(&object_id) from dual;
old 1: select dbms_stats_internal.check_volatile(&object_id) from dual
new 1: select dbms_stats_internal.check_volatile( 74867) from dual
 
DBMS_STATS_INTERNAL.CHECK_VOLATILE(74867)
------------------------------------------
F

I suppose ‘F’ is false, which explains why my table was not considered as volatile.

Here is the trace with binds:

PARSING IN CURSOR #140478915921360 len=191 dep=1 uid=0 oct=3 lid=0 tim=99947151021 hv=976524548 ad='739cb468' sqlid='1r3ujfwx39584'
SELECT SUM(CASE WHEN ISVOLATILE > 0 THEN 1 ELSE 0 END) FROM (SELECT OBJ#, BITAND(FLAGS, :B2 ) ISVOLATILE FROM OPTSTAT_SNAPSHOT$ WHERE OBJ# = :B1 ORDER BY TIMESTAMP DESC) O WHERE ROWNUM < :B3
END OF STMT
...
BINDS #140478915921360:
 
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1206001 frm=00 csi=00 siz=72 off=0
kxsbbbfp=7fc3cbe1c158 bln=22 avl=02 flg=05
value=64
Bind#1
oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1206001 frm=00 csi=00 siz=0 off=24
kxsbbbfp=7fc3cbe1c170 bln=22 avl=04 flg=01
value=74867
Bind#2
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1206001 frm=00 csi=00 siz=0 off=48
kxsbbbfp=7fc3cbe1c188 bln=22 avl=02 flg=01
value=24

So, here is what the algorithm looks like:

  1. sys.optstat_snapshot$ is read for the latest 24 snapshots (remember that we have snapshots every hour, plus one at each statistics gathering)
  2. ‘ISVOLATILE’ is 1 when the snapshot flags have bit 64 set. This is how I guessed that snapshots should be flagged with 64 by compute_volatile_flag.
  3. And finally, the number of ‘ISVOLATILE’ ones is summed.

So, it seems that the Statistics Advisor will raise a recommendation when the table has been flagged as volatile multiple times over the last 24 hours. How many times? Let’s guess:

SQL> insert into sys.optstat_snapshot$ select &object_id,0,0,0,64,sysdate from xmltable('1 to 12');
old 1: insert into sys.optstat_snapshot$ select &object_id,0,0,0,64,sysdate from xmltable('1 to 12')
new 1: insert into sys.optstat_snapshot$ select 74867,0,0,0,64,sysdate from xmltable('1 to 12')
 
12 rows created.
 
SQL> select dbms_stats_internal.check_volatile(&object_id) from dual;
old 1: select dbms_stats_internal.check_volatile(&object_id) from dual
new 1: select dbms_stats_internal.check_volatile( 74867) from dual
 
DBMS_STATS_INTERNAL.CHECK_VOLATILE(74867)
-----------------------------------------
F
 
SQL> rollback;
 
Rollback complete.

I’ve called the function after inserting various numbers of rows with flag=64 into sys.optstat_snapshot$, and with up to 12 snapshots the table is still not considered volatile.
Please remember that this is a lab, we are not expected to update the internal dictionary tables ourselves.

Now inserting one more:

SQL> insert into sys.optstat_snapshot$ select &object_id,0,0,0,64,sysdate from xmltable('1 to 13');
old 1: insert into sys.optstat_snapshot$ select &object_id,0,0,0,64,sysdate from xmltable('1 to 13')
new 1: insert into sys.optstat_snapshot$ select 74867,0,0,0,64,sysdate from xmltable('1 to 13')
 
13 rows created.
 
SQL> select dbms_stats_internal.check_volatile(&object_id) from dual;
old 1: select dbms_stats_internal.check_volatile(&object_id) from dual
new 1: select dbms_stats_internal.check_volatile( 74867) from dual
 
DBMS_STATS_INTERNAL.CHECK_VOLATILE(74867)
-----------------------------------------
T
 
SQL> rollback;
 
Rollback complete.

Good, I have a ‘T’ here for ‘true’. I conclude that the Statistics Advisor recommends locking the stats on a table when more than half of the last 24 hourly snapshots have encountered more than STALE_PERCENT modifications.
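If you want to check where a table stands against this threshold, here is a query adapted from the internal statement traced above. This is for a lab only (it reads an internal dictionary table, as SYS), and the bit value 64 is my guess from the bind variables:

select sum(case when bitand(flags,64) > 0 then 1 else 0 end) volatile_snapshots
  from (select flags from sys.optstat_snapshot$ where obj# = &object_id order by timestamp desc)
 where rownum < 25;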

So what?

My table was not considered volatile. None of the snapshots have been flagged as volatile. I’m quite sure that the amount of DML is sufficient, so I suppose that this is disabled by default and I don’t know how to enable it. What I want to see is compute_volatile_flag called with new_flag=64, so that snapshots are flagged when a large percentage of rows has been modified, and so that enough snapshots are flagged to be considered by the check_volatile function.
Even if it is enabled, I think that there are more cases where tables should have statistics locked. Even if a table is empty for 5 minutes per day, we must be sure that the statistics are not gathered at that time. And looking at the Statistics Advisor thresholds, this case is far from being detected.
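By the way, whether or not the advisor ever detects it, the remediation it recommends is something we can do ourselves today: gather the statistics at a moment when the table content is representative, then lock them. A minimal sketch (the table name is just a placeholder):

exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'MY_VOLATILE_TABLE');
exec dbms_stats.lock_table_stats(ownname=>user,tabname=>'MY_VOLATILE_TABLE');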
Final thought here: do you realize that you buy expensive software to detect the changes happening on your tables, guess how the tables are updated, and recommend (and even implement) a general best practice? Does it mean that, today, we put applications in production without any idea of what they do? Aren’t we supposed to design the application, document which tables are volatile and when they are loaded in bulk, and decide when to gather stats and lock them?

 

Cet article 12cR2 DML monitoring and Statistics Advisor est apparu en premier sur Blog dbi services.

OUD – Oracle Unified Directory 11.1.2.3, Oracle generates more and more LDAP lookups with every release

Thu, 2017-04-06 04:50

After installing OUD some time ago, I was doing some tests to see how it performs, and as long as I do LDAP searches on the command line it looks very good. For the record, I am running Oracle Unified Directory 11.1.2.3.170117 (latest PSU) and I use the OUD only for TNS name resolution, nothing else. However, Oracle clients do not connect with “ldapsearch”, they use “sqlplus”, and the TNS name is resolved automatically in the background.

I have the following ldap.ora and sqlnet.ora. Very simple, nothing special.

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT122] cat ldap.ora
DIRECTORY_SERVERS= (dbidg01:1389)
DEFAULT_ADMIN_CONTEXT = "dc=dbi,dc=com"
DIRECTORY_SERVER_TYPE = OID

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT122] cat sqlnet.ora
NAMES.DIRECTORY_PATH = (TNSNAMES,LDAP,EZCONNECT)

Here is a little quiz: how many LDAP search requests do you expect when you connect to a 12.2 database with the following command?

sqlplus system/manager@dbit122_ldap

Only one, right? Oracle looks up the TNS name dbit122_ldap in the OUD and retrieves the connect string. As soon as Oracle has the connect details, the OUD does not play any role anymore. If you do an ldapsearch from the 12.2 Oracle Home, this is exactly the case.

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT122] which ldapsearch
/u01/app/oracle/product/12.2.0/dbhome_1/bin/ldapsearch
oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT122] ldapsearch -v -h dbidg01 -p 1389 -b "cn=DBIT122_LDAP,cn=OracleContext,dc=dbi,dc=com" -s base "(objectclass=*)" "objectclass,orclNetDescString,orclNetDescName,orclVersion"
ldap_open( dbidg01, 1389 )
filter pattern: (objectclass=*)
returning: objectclass,orclNetDescString,orclNetDescName,orclVersion
filter is: ((objectclass=*))
cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com
1 matches

In the OUD access log, you can see it clearly: one connect, the bind, the search request and finally the disconnect. Exactly how it should be, and the etime is 1 millisecond. That’s the elapsed time spent processing the search request, which is very fast.

[dbafmw@dbidg01 logs]$ tail -40f /u01/app/oracle/product/Middleware/11.1.2.3/asinst_1/OUD/logs/access
...
[06/Apr/2017:10:46:49 +0200] CONNECT conn=877 from=192.168.56.203:21971 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:10:46:49 +0200] BIND REQ conn=877 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:10:46:49 +0200] BIND RES conn=877 op=0 msgID=1 result=0 authDN="" etime=0
[06/Apr/2017:10:46:49 +0200] SEARCH REQ conn=877 op=1 msgID=2 base="cn=DBIT122_LDAP,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:10:46:49 +0200] SEARCH RES conn=877 op=1 msgID=2 result=0 nentries=1 etime=1
[06/Apr/2017:10:46:49 +0200] UNBIND REQ conn=877 op=2 msgID=3
[06/Apr/2017:10:46:49 +0200] DISCONNECT conn=877 reason="Client Disconnect"

OK, let’s do the first test with Oracle 10.2.0.5. I know it is not supported; however, regarding LDAP searches it is a version where everything is fine. My test is very simple: just a sqlplus connection and then an exit. Nothing else.

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT102] sqlplus -V

SQL*Plus: Release 10.2.0.5.0 - Production

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT102] sqlplus system/manager@dbit122_ldap

SQL*Plus: Release 10.2.0.5.0 - Production on Thu Apr 6 11:00:02 2017

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

In the OUD access log I see, as expected, only one search request.

[dbafmw@dbidg01 logs]$ tail -40f /u01/app/oracle/product/Middleware/11.1.2.3/asinst_1/OUD/logs/access
...
[06/Apr/2017:11:01:18 +0200] CONNECT conn=879 from=192.168.56.203:21974 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:01:18 +0200] BIND REQ conn=879 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:01:18 +0200] BIND RES conn=879 op=0 msgID=1 result=0 authDN="" etime=0
[06/Apr/2017:11:01:18 +0200] SEARCH REQ conn=879 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:01:18 +0200] SEARCH RES conn=879 op=1 msgID=2 result=0 nentries=1 etime=2
[06/Apr/2017:11:01:18 +0200] UNBIND REQ conn=879 op=2 msgID=3
[06/Apr/2017:11:01:18 +0200] DISCONNECT conn=879 reason="Client Disconnect"

Let’s do the same now with 11.2.0.4. This time with a fully supported version. Yes, it still is. :-)

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT112] sqlplus -V

SQL*Plus: Release 11.2.0.4.0 Production

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT112] sqlplus system/manager@dbit122_ldap

SQL*Plus: Release 11.2.0.4.0 Production on Thu Apr 6 11:03:17 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Wowwwww … now I already see two search requests on the OUD. To be honest, I hadn’t expected that. One should be sufficient from my point of view.

[dbafmw@dbidg01 logs]$ tail -40f /u01/app/oracle/product/Middleware/11.1.2.3/asinst_1/OUD/logs/access
...
[06/Apr/2017:11:03:43 +0200] CONNECT conn=882 from=192.168.56.203:21979 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:03:43 +0200] BIND REQ conn=882 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:03:43 +0200] BIND RES conn=882 op=0 msgID=1 result=0 authDN="" etime=0
[06/Apr/2017:11:03:43 +0200] SEARCH REQ conn=882 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:03:43 +0200] SEARCH RES conn=882 op=1 msgID=2 result=0 nentries=1 etime=1
[06/Apr/2017:11:03:43 +0200] UNBIND REQ conn=882 op=2 msgID=3
[06/Apr/2017:11:03:43 +0200] DISCONNECT conn=882 reason="Client Disconnect"
[06/Apr/2017:11:03:43 +0200] CONNECT conn=883 from=192.168.56.203:21980 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:03:43 +0200] BIND REQ conn=883 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:03:43 +0200] BIND RES conn=883 op=0 msgID=1 result=0 authDN="" etime=1
[06/Apr/2017:11:03:43 +0200] SEARCH REQ conn=883 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:03:43 +0200] SEARCH RES conn=883 op=1 msgID=2 result=0 nentries=1 etime=2
[06/Apr/2017:11:03:43 +0200] UNBIND REQ conn=883 op=2 msgID=3
[06/Apr/2017:11:03:43 +0200] DISCONNECT conn=883 reason="Client Disconnect"

But just when you think it can’t get worse, do the same simple test with a 12.1.0.2 Oracle client.

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT121] sqlplus -V

SQL*Plus: Release 12.1.0.2.0 Production

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT121] sqlplus system/manager@dbit122_ldap

SQL*Plus: Release 12.1.0.2.0 Production on Thu Apr 6 11:06:18 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Thu Apr 06 2017 11:03:43 +02:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Incredible: it issues three LDAP search requests against the OUD for a simple sqlplus connection.

[dbafmw@dbidg01 logs]$ tail -40f /u01/app/oracle/product/Middleware/11.1.2.3/asinst_1/OUD/logs/access
...
[06/Apr/2017:11:06:41 +0200] CONNECT conn=887 from=192.168.56.203:21986 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:06:41 +0200] BIND REQ conn=887 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:06:41 +0200] BIND RES conn=887 op=0 msgID=1 result=0 authDN="" etime=0
[06/Apr/2017:11:06:41 +0200] SEARCH REQ conn=887 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:06:41 +0200] SEARCH RES conn=887 op=1 msgID=2 result=0 nentries=1 etime=1
[06/Apr/2017:11:06:41 +0200] UNBIND REQ conn=887 op=2 msgID=3
[06/Apr/2017:11:06:41 +0200] DISCONNECT conn=887 reason="Client Disconnect"
[06/Apr/2017:11:06:41 +0200] CONNECT conn=888 from=192.168.56.203:21987 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:06:41 +0200] BIND REQ conn=888 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:06:41 +0200] BIND RES conn=888 op=0 msgID=1 result=0 authDN="" etime=0
[06/Apr/2017:11:06:41 +0200] SEARCH REQ conn=888 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:06:41 +0200] SEARCH RES conn=888 op=1 msgID=2 result=0 nentries=1 etime=2
[06/Apr/2017:11:06:41 +0200] UNBIND REQ conn=888 op=2 msgID=3
[06/Apr/2017:11:06:41 +0200] DISCONNECT conn=888 reason="Client Disconnect"
[06/Apr/2017:11:06:41 +0200] CONNECT conn=889 from=192.168.56.203:21988 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:06:41 +0200] BIND REQ conn=889 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:06:41 +0200] BIND RES conn=889 op=0 msgID=1 result=0 authDN="" etime=1
[06/Apr/2017:11:06:41 +0200] SEARCH REQ conn=889 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:06:41 +0200] SEARCH RES conn=889 op=1 msgID=2 result=0 nentries=1 etime=2
[06/Apr/2017:11:06:41 +0200] UNBIND REQ conn=889 op=2 msgID=3
[06/Apr/2017:11:06:41 +0200] DISCONNECT conn=889 reason="Client Disconnect"

The last test is now with a 12cR2 client. Will it increase now to 4?

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT122] sqlplus system/manager@dbit122_ldap

SQL*Plus: Release 12.2.0.1.0 Production on Thu Apr 6 11:09:08 2017

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Last Successful login time: Thu Apr 06 2017 11:06:41 +02:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

No, it did not increase to 4. But with 12cR2 you will see, as with 12cR1, three search requests against the OUD.

[dbafmw@dbidg01 logs]$ tail -40f /u01/app/oracle/product/Middleware/11.1.2.3/asinst_1/OUD/logs/access
...
[06/Apr/2017:11:09:07 +0200] CONNECT conn=890 from=192.168.56.203:21990 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:09:07 +0200] BIND REQ conn=890 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:09:07 +0200] BIND RES conn=890 op=0 msgID=1 result=0 authDN="" etime=1
[06/Apr/2017:11:09:07 +0200] SEARCH REQ conn=890 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:09:07 +0200] SEARCH RES conn=890 op=1 msgID=2 result=0 nentries=1 etime=2
[06/Apr/2017:11:09:07 +0200] UNBIND REQ conn=890 op=2 msgID=3
[06/Apr/2017:11:09:07 +0200] DISCONNECT conn=890 reason="Client Disconnect"
[06/Apr/2017:11:09:07 +0200] CONNECT conn=891 from=192.168.56.203:21991 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:09:07 +0200] BIND REQ conn=891 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:09:07 +0200] BIND RES conn=891 op=0 msgID=1 result=0 authDN="" etime=0
[06/Apr/2017:11:09:07 +0200] SEARCH REQ conn=891 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:09:07 +0200] SEARCH RES conn=891 op=1 msgID=2 result=0 nentries=1 etime=1
[06/Apr/2017:11:09:07 +0200] UNBIND REQ conn=891 op=2 msgID=3
[06/Apr/2017:11:09:07 +0200] DISCONNECT conn=891 reason="Client Disconnect"
[06/Apr/2017:11:09:07 +0200] CONNECT conn=892 from=192.168.56.203:21992 to=192.168.56.201:1389 protocol=LDAP
[06/Apr/2017:11:09:07 +0200] BIND REQ conn=892 op=0 msgID=1 type=SIMPLE dn="" version=3
[06/Apr/2017:11:09:07 +0200] BIND RES conn=892 op=0 msgID=1 result=0 authDN="" etime=0
[06/Apr/2017:11:09:07 +0200] SEARCH REQ conn=892 op=1 msgID=2 base="cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[06/Apr/2017:11:09:07 +0200] SEARCH RES conn=892 op=1 msgID=2 result=0 nentries=1 etime=1
[06/Apr/2017:11:09:07 +0200] UNBIND REQ conn=892 op=2 msgID=3
[06/Apr/2017:11:09:07 +0200] DISCONNECT conn=892 reason="Client Disconnect"

So what is the reason for this big increase in LDAP searches? Instead of 1 request, a client does 3 with 12cR1 and 12cR2, and 2 with 11gR2. That is three times as many as with Oracle 10gR2 clients, which is enormous from my point of view: quite a lot of extra load on your OUD server when you upgrade your Oracle clients.

To make it short, I have no answer. It might be related to the old Oracle Names code, which still seems to be there. I have found errors in the client trace file regarding an A.SMD query. The A.SMD call comes from the old Oracle Names server, where you could do things like “NAMESCTL> QUERY DB920.oracle.com A.SMD”. But that is really a long time ago; the last Oracle Names server I saw was in 2002.

oracle@dbidg02:/u01/app/oracle/network/trc/ [DBIT122] cat 12.2_client.trc | grep A.SMD
(4144394624) [04-APR-2017 14:38:18:633] nnfttran: Error querying DBIT122_LDAP of attribute A.SMD errcode 408
(4144394624) [04-APR-2017 14:38:18:642] nnfttran: Error querying DBIT122_LDAP of attribute A.SMD errcode 408
(4144394624) [04-APR-2017 14:38:18:646] nnfttran: Error querying DBIT122_LDAP of attribute A.SMD errcode 408

If I take a look at my 12cR2 adapters, I have no Oracle Names compiled in. I don’t know if that is even possible with 12c.

oracle@dbidg03:/u01/app/oracle/network/admin/ [DBIT122] adapters | egrep -A 5 "Installed Oracle Net naming methods"
Installed Oracle Net naming methods are:

    Local Naming (tnsnames.ora)
    Oracle Directory Naming
    Oracle Host Naming
Conclusion

OK, what should I say … take care if you upgrade your clients to more recent versions, in case you use OUD to resolve your names. It might generate some extra load on your OUD servers, more and more with every release since 10gR2. By the way, I have opened an SR with Oracle, because this looks like a bug to me. I was very surprised that I was the first one facing this issue. I will keep you posted as soon as I have results. ;-)

 

Cet article OUD – Oracle Unified Directory 11.1.2.3, Oracle generates more and more LDAP lookups with every release est apparu en premier sur Blog dbi services.

Sharding with Oracle 12c R2 Part II : Scalability and Connections

Wed, 2017-04-05 12:12

In the previous blog we talked about system-managed sharding and saw how it is possible to create shard databases with Oracle 12c R2. Below is a reminder of the configuration we used.
VM sharddemo1: catalog
VM sharddemo2: shard
VM sharddemo3: shard
One of the characteristics of sharding is scalability, and in this blog we are going to add a new shard on a new server. The new configuration will be as follows:
VM sharddemo1: catalog
VM sharddemo2: shard
VM sharddemo3: shard
VM sharddemo4: shard — New added shard
We assume that you have already read the first part.

After adding the new shard, we will see how we connect in a shard environment.

First let’s confirm that we have only two shards running now (one on server sharddemo2 and one on server sharddemo3):

GDSCTL>config shard
Name Shard Group Status State Region Availability
---- ----------- ------ ----- ------ ------------
sh1 shgrp1 Ok Deployed region1 ONLINE
sh21 shgrp1 Ok Deployed region1 ONLINE

To add the new server sharddemo4 to the sharding environment, we first have to register the remote scheduler agent on the newly added server:

[oracle@sharddemo4 ~]$ which schagent
/u01/app/oracle/product/12.2.0.1/dbhome_1/bin/schagent


[oracle@sharddemo4 ~]$ echo welcome | schagent -registerdatabase sharddemo1 8080
Agent Registration Password ?
Oracle Scheduler Agent Registration for 12.2.0.1.2 Agent
Agent Registration Successful!
[oracle@sharddemo4 ~]$

After registration, let’s start the agent

[oracle@sharddemo4 ~]$ schagent -start
Scheduler agent started using port 16267

We must also create the corresponding directories for the database

[oracle@sharddemo4 ~]$ mkdir /u01/app/oracle/oradata
[oracle@sharddemo4 ~]$ mkdir /u01/app/oracle/fast_recovery_area

We can now launch gdsctl on sharddemo1 and connect with an administrator

[oracle@sharddemo1]$ gdsctl
GDSCTL: Version 12.2.0.1.0 - Production on Tue Mar 07 11:43:44 CET 2017
Copyright (c) 2011, 2016, Oracle. All rights reserved.
Welcome to GDSCTL, type "help" for information.
Current GSM is set to REGION1_DIRECTOR
GDSCTL>


GDSCTL>connect mygdsadmin/root
Catalog connection is established
GDSCTL>

As for existing shards, the new server sharddemo4 must be invited

GDSCTL>add invitednode sharddemo4

We also have to create the shard on sharddemo4. The shardgroup, destination and the credentials should be specified.

GDSCTL>create shard -shardgroup shgrp1 -destination sharddemo4 -credential oracle_cred
The operation completed successfully
DB Unique Name: sh41
GDSCTL>

And now we can deploy:

GDSCTL>deploy
deploy: examining configuration...
deploy: deploying primary shard 'sh41' ...
deploy: network listener configuration successful at destination 'sharddemo4'
deploy: starting DBCA at destination 'sharddemo4' to create primary shard 'sh41' ...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: DBCA primary creation job succeeded at destination 'sharddemo4' for shard 'sh41'
deploy: requesting Data Guard configuration on shards via GSM
deploy: shards configured; background operations in progress
The operation completed successfully
GDSCTL>

Running the command config shard, we can verify that a new sharded database sh41 was created:

GDSCTL>config shard
Name Shard Group Status State Region Availability
---- ----------- ------ ----- ------ ------------
sh1 shgrp1 Ok Deployed region1 ONLINE
sh21 shgrp1 Ok Deployed region1 ONLINE
sh41 shgrp1 Ok Deployed region1 ONLINE

Now, querying the 3 shards, we can see that the data of the sharded tables is automatically distributed between SH1, SH21 and SH41.
For example, we have a total of 15 rows in the catalog ORCLCAT for the sharded table CUSTOMERS, which should be distributed between the 3 sharded instances.

SQL> select name from v$database;
NAME
---------
ORCLCAT

SQL> select count(*) from customers;
COUNT(*)
----------
15

We have 3 rows for customer table in SH1

SQL> select name from v$database;
NAME
---------
SH1

SQL> select count(*) from customers;
COUNT(*)
----------
3

We have 3 rows for customer table in SH21

SQL> select name from v$database;
NAME
---------
SH21

SQL> select count(*) from customers;
COUNT(*)
----------
3

And 9 rows for customer table in SH41

SQL> select name from v$database;
NAME
---------
SH41

SQL> select count(*) from customers;
COUNT(*)
----------
9

Now that we have our sharded databases, how to connect? We have two ways to do it:
1- Connect to a shard by specifying a sharding_key
2- Connect to the shardcatalog via GDS$CATALOG service

For single-shard queries, we can connect to a shard with a given sharding_key using the shard director. For example, let’s display the data of the customers table in the SH41 database:

SQL> select firstname,lastname,custid from customers;
FIRSTNAME LASTNAME CUSTID
--------------- --------------- ----------------------------
Seane Tuger Seane.Tuger@localdomain
Seaneis Tugeris Seaneis.Tugeris@localdomain
Seanae Tugera Seanea.Tugera@localdomain
Sophiea Moralesa Sophiea.Moralesa@localdomain
Mourada Habiba Mourada.Habiba@localdomain
Michel Robert Michel.Robert@localdomain
Sophie Morales Sophie.Morales@localdomain
Joe Dalton Joe.Dalton@localdomain
Mourad Habib Mourad.Habib@localdomain
9 rows selected.

If we want to retrieve information about customer Seane Tuger, we can connect directly to shard SH41.
For this we first need to create a global service that runs on all shard databases. This global service will be used with a sharding key.

GDSCTL>add service -service my_service_shard_srvc
The operation completed successfully
GDSCTL>

Let’s start the service

GDSCTL>start service -service my_service_shard_srvc
The operation completed successfully


GDSCTL>config service
Name Network name Pool Started Preferred all
---- ------------ ---- ------- -------------
my_service_sha my_service_shard_srvc.cust_sd cust_sdb Yes Yes
rd_srvc b.oradbcloud
GDSCTL>

Now that the service is started, we can use the following connect string, which will route us directly to SH41 because of the SHARDING_KEY we specified.


[oracle@sharddemo2 admin]$ sqlplus user_shard/root@'(description=(address=(protocol=tcp)
(host=sharddemo1)(port=1571))(connect_data=(service_name=my_service_shard_srvc.cust_sdb.oradbcloud)
(region=region1)(SHARDING_KEY=Seane.Tuger@localdomain)))'
SQL*Plus: Release 12.2.0.1.0 Production on Tue Mar 7 14:34:48 2017
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Last Successful login time: Tue Mar 07 2017 13:41:56 +01:00
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select name from v$database;
NAME
---------
SH41

To perform cross-shard queries, we have to connect to the shardcatalog (coordinator database) using the GDS$CATALOG service (from any shard). GDS$CATALOG is a service automatically deployed in the shardcatalog.

[oracle@sharddemo1 ~]$ lsnrctl services
LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 07-MAR-2017 14:45:14
Copyright (c) 1991, 2016, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=sharddemo1.localdomain)(PORT=1521)))
Services Summary...
Service "GDS$CATALOG.oradbcloud" has 1 instance(s).
Instance "ORCLCAT", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:604 refused:0 state:ready
LOCAL SERVER

Via sqlplus we can connect like this:

[oracle@sharddemo2 admin]$ sqlplus user_shard/root@sharddemo1:1521/GDS\$CATALOG.oradbcloud
SQL*Plus: Release 12.2.0.1.0 Production on Tue Mar 7 14:48:05 2017
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Last Successful login time: Tue Mar 07 2017 14:47:58 +01:00
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select name from v$database;
NAME
---------
ORCLCAT

Now that we have seen how to connect, we can go further and see how Oracle manages queries in a sharded environment.
Let’s connect to the ORCLCAT database, run a SELECT that accesses multiple shards of a sharded table, and look at the execution plan. Such queries are called cross-shard queries (CSQ).

SQL> SELECT FirstName,LastName, geo, class FROM Customers WHERE class like '%free%';

Execution Plan
----------------------------------------------------------
Plan hash value: 2953441084

--------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)| Inst |IN-OUT|
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 0 (0)| | |
| 1 | SHARD ITERATOR | | | | |
| 2 | REMOTE | | | ORA_S~ | R->S |
--------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------

2 - EXPLAIN PLAN SET STATEMENT_ID='PLUS470045' INTO PLAN_TABLE@! FOR
SELECT "A1"."FIRSTNAME","A1"."LASTNAME","A1"."GEO","A1"."CLASS" FROM
"CUSTOMERS" "A1" WHERE "A1"."CLASS" LIKE '%free%' /*
coord_sql_id=bkpy0tbjnqu3k */ (accessing
'ORA_SHARD_POOL@ORA_MULTI_TARGET' )

We can see above that the query uses the dblink ORA_MULTI_TARGET to perform the CSQ on sharded tables. This dblink is created automatically when the GSM is configured to handle cross-shard querying.


SQL> select owner,DB_LINK,username,host from dba_db_links where owner='USER_SHARD';
OWNER DB_LINK USERNAME HOST
--------------- ----------------------------------- --------------- ---------------
USER_SHARD ORA_SHARD_POOL@ORA_MULTI_TARGET USER_SHARD GDS$CATALOG

Now let’s connect to SH1, query the duplicated table products, and look at the execution plan.

SQL> select * from products where productid>2;

Execution Plan
----------------------------------------------------------
Plan hash value: 1639127380

------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 158 | 0 (0)| 00:00:01 |
| 1 | MAT_VIEW ACCESS BY INDEX ROWID BATCHED| PRODUCTS | 1 | 158 | 0 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_C007361 | 1 | | 0 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

We can see that a materialized view is being used. This means that for duplicated tables Oracle uses materialized views to handle queries.
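If you want to verify this on a shard, the duplicated table should be visible as a materialized view in the dictionary. A query like the following (run on the shard, with the owner adapted to your sharded schema) should list it:

select owner,mview_name,refresh_method,last_refresh_date from dba_mviews where owner='USER_SHARD';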

Conclusion

In this article we talked about sharding scalability and how to connect to shards, using system-managed sharding. In coming articles, we will talk about other sharding methods like composite and user-defined sharding. We will also discuss how it is possible to combine sharding and Data Guard.

 

Cet article Sharding with Oracle 12c R2 Part II : Scalability and Connections est apparu en premier sur Blog dbi services.

OUD – Oracle Unified Directory 11.1.2.3 Tuning, It is not always the servers fault

Tue, 2017-04-04 03:41

The default configuration which is shipped with OUD is not meant to be ready for enterprise usage. The default settings of OUD are targeted at evaluators and developers who run equipment with limited resources, so it is quite likely that you will run into performance issues if you don’t change anything before going into production. OUD performance depends on a lot of things, like:

  • Network configuration/routing/firewalls/bonding
  • OUD version and configuration (Replication, TLS)
  • Java version and Java runtime memory configuration
  • DNS Lookup times
  • Name Service Cache Daemon
  • And many more …

However, it is not always the server’s fault. Sometimes the client is causing the issue. But how do I know if it is the client or the server? In the following example it takes about 10 seconds to resolve the connect string DBIT122_LDAP. That is enormous, far too long to be acceptable. Where is tnsping spending so much time?

oracle@dbidg02:/u01/app/oracle/network/admin/ [DBIT122] time tnsping DBIT122_LDAP

TNS Ping Utility for Linux: Version 12.2.0.1.0 - Production on 04-APR-2017 08:43:06

Copyright (c) 1997, 2016, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/network/admin/sqlnet.ora

Used LDAP adapter to resolve the alias
Attempting to contact (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg01)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DBIT122)))
OK (10 msec)

real    0m10.177s
user    0m0.017s
sys     0m0.018s

To rule out the server, just check the OUD access log, where you can see every LDAP request against the OUD.

[dbafmw@dbidg01 logs]$ tail -50f /u01/app/oracle/product/Middleware/11.1.2.3/asinst_1/OUD/logs/access
...
...
[04/Apr/2017:08:43:39 +0200] CONNECT conn=5 from=192.168.56.202:30826 to=192.168.56.201:1389 protocol=LDAP
[04/Apr/2017:08:43:39 +0200] BIND REQ conn=5 op=0 msgID=1 type=SIMPLE dn="" version=3
[04/Apr/2017:08:43:39 +0200] BIND RES conn=5 op=0 msgID=1 result=0 authDN="" etime=0
[04/Apr/2017:08:43:39 +0200] SEARCH REQ conn=5 op=1 msgID=2 base="cn=DBIT122_LDAP,cn=OracleContext,dc=dbi,dc=com" scope=base filter="(objectclass=*)" attrs="objectclass,orclNetDescString,orclNetDescName,orclVersion"
[04/Apr/2017:08:43:39 +0200] SEARCH RES conn=5 op=1 msgID=2 result=0 nentries=1 etime=2
[04/Apr/2017:08:43:39 +0200] UNBIND REQ conn=5 op=2 msgID=3
[04/Apr/2017:08:43:39 +0200] DISCONNECT conn=5 reason="Client Disconnect"
...
...

The important entry to look for is the etime after the search request. The etime field is the elapsed time in milliseconds which the server spent processing the request. In the above case it is 2 milliseconds, so quite fast. If you saw large elapsed times here, that would be a good indicator of issues on the server side.

Now that we know that the server is OK, let’s move to the client side. The first thing I try is to see how fast ldapsearch is. I am using the ldapsearch which comes with 12cR2, and I will use the same search criteria that tnsping uses to look up the connect string. The ldapsearch syntax from the OUD binaries differs a little bit from the ldapsearch syntax which is shipped with 12cR2. Why should Oracle make them the same? That would be too easy. ;-) OK, let’s check ldapsearch.

oracle@dbidg02:/u01/app/oracle/network/admin/ [DBIT122] time ldapsearch -v -h dbidg01 -p 1389 -b "cn=DBIT122_LDAP,cn=OracleContext,dc=dbi,dc=com" \
-s base "(objectclass=*)" "objectclass,orclNetDescString,orclNetDescName,orclVersion"

ldap_open( dbidg01, 1389 )
filter pattern: (objectclass=*)
returning: objectclass,orclNetDescString,orclNetDescName,orclVersion
filter is: ((objectclass=*))
cn=dbit122_ldap,cn=OracleContext,dc=dbi,dc=com
1 matches

real    0m0.020s
user    0m0.005s
sys     0m0.004s

I don’t see any issues here. My ldapsearch came back in the blink of an eye. So, where are the other 10 seconds? We need more information. We can either use strace or activate tracing on the client side. Something less known in the Oracle world is tnsping tracing, which can be activated too. My tnsping is slow, so I want only tnsping to be traced and nothing else. To do so, we need to specify two parameters in the sqlnet.ora file: TNSPING.TRACE_DIRECTORY and TNSPING.TRACE_LEVEL. The tnsping trace level can have 4 different values, like sqlnet tracing.

  • 0 or OFF – No Trace output
  • 4 or USER – User trace information
  • 10 or ADMIN – Administration trace information
  • 16 or SUPPORT – Worldwide Customer Support trace information

Because I want to have the full trace output, I go for level 16 which is the support tracing.

oracle@dbidg02:/u01/app/oracle/network/admin/ [DBIT122] cat sqlnet.ora | grep TNSPING

TNSPING.TRACE_DIRECTORY = /u01/app/oracle/network/trc
TNSPING.TRACE_LEVEL = SUPPORT

Ok. Let’s do it again and see the outcome.

oracle@dbidg02:/u01/app/oracle/network/admin/ [DBIT122] time tnsping DBIT122_LDAP

TNS Ping Utility for Linux: Version 12.2.0.1.0 - Production on 04-APR-2017 09:44:44

Copyright (c) 1997, 2016, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/network/admin/sqlnet.ora

Used LDAP adapter to resolve the alias
Attempting to contact (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg01)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DBIT122)))
OK (10 msec)

real    0m10.191s
user    0m0.013s
sys     0m0.016s
oracle@dbidg02:/u01/app/oracle/network/admin/ [DBIT122]

If we look at the trace file, we see that Oracle found 3 directory paths in the following order: TNSNAMES, EZCONNECT and LDAP.

[04-APR-2017 09:44:44:569] nnfgsrsp: Obtaining path parameter from names.directory_path or native_names.directory_path
[04-APR-2017 09:44:44:569] nnfgsrdp: entry
[04-APR-2017 09:44:44:569] nnfgsrdp: Setting path:
[04-APR-2017 09:44:44:569] nnfgsrdp: checking element TNSNAMES
[04-APR-2017 09:44:44:569] nnfgsrdp: checking element EZCONNECT
[04-APR-2017 09:44:44:569] nnfgsrdp: checking element LDAP

Switching to the TNSNAMES adapter is very fast, and Oracle sees immediately that the query is unsuccessful, so it switches to the next adapter.

[04-APR-2017 09:44:44:569] nnfgrne: Switching to TNSNAMES adapter
[04-APR-2017 09:44:44:569] nnftboot: entry
[04-APR-2017 09:44:44:569] nlpaxini: entry
[04-APR-2017 09:44:44:569] nlpaxini: exit
[04-APR-2017 09:44:44:569] nnftmlf_make_local_addrfile: entry
[04-APR-2017 09:44:44:569] nnftmlf_make_local_addrfile: construction of local names file failed
[04-APR-2017 09:44:44:569] nnftmlf_make_local_addrfile: exit
[04-APR-2017 09:44:44:569] nlpaxini: entry
[04-APR-2017 09:44:44:569] nlpaxini: exit
[04-APR-2017 09:44:44:569] nnftmlf_make_system_addrfile: entry
[04-APR-2017 09:44:44:569] nnftmlf_make_system_addrfile: system names file is /u01/app/oracle/network/admin/tnsnames.ora
[04-APR-2017 09:44:44:569] nnftmlf_make_system_addrfile: exit
[04-APR-2017 09:44:44:569] nnftboot: exit
[04-APR-2017 09:44:44:569] nnftrne: entry
[04-APR-2017 09:44:44:569] nnftrne: Original name: DBIT122_LDAP
[04-APR-2017 09:44:44:569] nnfttran: entry
[04-APR-2017 09:44:44:569] nnfttran: Error querying DBIT122_LDAP of attribute A.SMD errcode 408
[04-APR-2017 09:44:44:569] nnfgrne: Query unsuccessful, skipping to next adapter

Now, Oracle is switching to the EZCONNECT adapter.

[04-APR-2017 09:44:44:569] nnfgrne: Switching to EZCONNECT adapter
[04-APR-2017 09:44:44:569] nnfhboot: entry
[04-APR-2017 09:44:44:569] nnfhboot: exit
[04-APR-2017 09:44:44:569] snlinGetAddrInfo: entry
[04-APR-2017 09:44:54:664] snlinGetAddrInfo: getaddrinfo() failed with error -2
[04-APR-2017 09:44:54:664] snlinGetAddrInfo: exit
[04-APR-2017 09:44:54:665] snlinGetAddrInfo: entry
[04-APR-2017 09:44:54:727] snlinGetAddrInfo: getaddrinfo() failed with error -2
[04-APR-2017 09:44:54:727] snlinGetAddrInfo: exit
[04-APR-2017 09:44:54:727] nnfhrne: Error forming address for DBIT122_LDAP, errcode 406
[04-APR-2017 09:44:54:727] nnfgrne: Query unsuccessful, skipping to next adapter

OK, here we go. Between “snlinGetAddrInfo: entry” and “snlinGetAddrInfo: getaddrinfo() failed with error -2”, 10 seconds have gone by. Oracle thinks that DBIT122_LDAP is an easy connect string and tries to resolve the name, which fails.

So I need to switch the entries of the directory path in the sqlnet.ora file to NAMES.DIRECTORY_PATH = (TNSNAMES,LDAP,EZCONNECT). After doing that, tnsping comes back successfully and very fast.
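In other words, the relevant line of my sqlnet.ora now looks like this, so that the LDAP adapter is tried before EZCONNECT:

NAMES.DIRECTORY_PATH = (TNSNAMES,LDAP,EZCONNECT)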

oracle@dbidg02:/u01/app/oracle/network/admin/ [DBIT122] time tnsping DBIT122_LDAP

TNS Ping Utility for Linux: Version 12.2.0.1.0 - Production on 04-APR-2017 10:25:39

Copyright (c) 1997, 2016, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/network/admin/sqlnet.ora

Used LDAP adapter to resolve the alias
Attempting to contact (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg01)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DBIT122)))
OK (0 msec)

real    0m0.018s
user    0m0.007s
sys     0m0.006s
Conclusion

It is not always the OUD server’s fault when you hit performance issues. It might be the client as well, and it can have a severe impact.

 

Cet article OUD – Oracle Unified Directory 11.1.2.3 Tuning, It is not always the servers fault est apparu en premier sur Blog dbi services.

Can I do it with PostgreSQL? – 14 – optimizer hints

Tue, 2017-04-04 01:23

This is a question that comes up quite often: how can I use optimizer hints in PostgreSQL as I can in Oracle? Well, you can’t, and the reasons are these:

  • Poor application code maintainability: hints in queries require massive refactoring.
  • Interference with upgrades: today’s helpful hints become anti-performance after an upgrade.
  • Encouraging bad DBA habits: slap a hint on instead of figuring out the real issue.
  • Does not scale with data size: the hint that’s right when a table is small is likely to be wrong when it gets larger.
  • Failure to actually improve query performance: most of the time, the optimizer is actually right.
  • Interfering with improving the query planner: people who use hints seldom report the query problem to the project.


But this does not mean that you can’t influence the optimizer (or “planner” in PostgreSQL wording), it just does not work in the same way. Let’s have a look.

One of the reasons that the planner does not choose an index over a sequential scan is that the parameter effective_cache_size is not set properly. To understand what it does, you have to know that PostgreSQL works together with the operating system file cache/disk cache very well. It is not required, as you do in Oracle, to give most of the available memory of the server to the database. Usually you start with 25% of the total available memory and give that to PostgreSQL by setting the parameter shared_buffers to that value. When pages fall out of that region it is still likely that they are available in the disk cache and can be retrieved from there without going down to disk. And this is what effective_cache_size is about: setting this parameter does not consume more memory but tells PostgreSQL how big the total cache of the system really is, that is, shared_buffers plus disk cache. This is taken into consideration by the planner. A good starting point is 50 to 75% of the available memory. Let’s do a quick test to show how this behaves. Let’s generate some data:

postgres=# \! cat a.sql
drop table if exists t1;
create table t1 ( a int );
with generator as 
 ( select a.*
     from generate_series ( 1, 5000000 ) a
    order by random()
 )
insert into t1 ( a ) 
     select a
       from generator;
create index i1 on t1(a);
analyze verbose t1;
select * from pg_size_pretty ( pg_relation_size ('t1' ));
select * from pg_size_pretty ( pg_total_relation_size('t1'));
postgres=# \i a.sql
DROP TABLE
CREATE TABLE
INSERT 0 5000000
CREATE INDEX
psql:a.sql:12: INFO:  analyzing "public.t1"
psql:a.sql:12: INFO:  "t1": scanned 22124 of 22124 pages, containing 5000000 live rows and 0 dead rows; 30000 rows in sample, 5000000 estimated total rows
ANALYZE
 pg_size_pretty 
----------------
 173 MB
(1 row)
 pg_size_pretty 
----------------
 280 MB
(1 row)
postgres=# show shared_buffers ;
 shared_buffers 
----------------
 128MB
(1 row)

The table without the index (173MB) is already too big to fit into shared_buffers, and even bigger including the index (280MB). When we set effective_cache_size to a very low value, we get costs of 40.55 for the statement below (almost no disk cache assumed):

postgres=# SET effective_cache_size TO '1 MB';
SET
postgres=# explain SELECT * FROM t1 ORDER BY  a limit 10;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Limit  (cost=0.43..40.55 rows=10 width=4)
   ->  Index Only Scan using i1 on t1  (cost=0.43..20057243.41 rows=5000000 width=4)
(2 rows)

Setting this to a more realistic value decreases the costs because it is expected to find the index in the disk cache:

postgres=# SET effective_cache_size TO '5 GB';
SET
postgres=# explain SELECT * FROM t1 ORDER BY  a limit 10;
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Limit  (cost=0.43..0.87 rows=10 width=4)
   ->  Index Only Scan using i1 on t1  (cost=0.43..218347.46 rows=5000000 width=4)
(2 rows)

This is the first “hint” you can set to influence the optimizer/planner. But there are many others. What PostgreSQL allows you to do is to enable or disable features of the planner:

postgres=# select name from pg_settings where name like 'enable%';
         name         
----------------------
 enable_bitmapscan
 enable_hashagg
 enable_hashjoin
 enable_indexonlyscan
 enable_indexscan
 enable_material
 enable_mergejoin
 enable_nestloop
 enable_seqscan
 enable_sort
 enable_tidscan

Using the same data from above we could disable the index only scan:

postgres=# set enable_indexonlyscan=false;
SET
postgres=# explain (analyze,buffers) SELECT * FROM t1 ORDER BY  a limit 10;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..0.87 rows=10 width=4) (actual time=0.019..0.058 rows=10 loops=1)
   Buffers: shared hit=13
   ->  Index Scan using i1 on t1  (cost=0.43..218347.46 rows=5000000 width=4) (actual time=0.017..0.036 rows=10 loops=1)
         Buffers: shared hit=13
 Planning time: 0.057 ms
 Execution time: 0.084 ms
(6 rows)

postgres=# set enable_indexonlyscan=true;
SET
postgres=# explain (analyze,buffers) SELECT * FROM t1 ORDER BY  a limit 10;
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..0.87 rows=10 width=4) (actual time=0.025..0.072 rows=10 loops=1)
   Buffers: shared hit=13
   ->  Index Only Scan using i1 on t1  (cost=0.43..218347.46 rows=5000000 width=4) (actual time=0.023..0.048 rows=10 loops=1)
         Heap Fetches: 10
         Buffers: shared hit=13
 Planning time: 0.068 ms
 Execution time: 0.105 ms
(7 rows)

But the documentation clearly states: “If the default plan chosen by the optimizer for a particular query is not optimal, a temporary solution is to use one of these configuration parameters to force the optimizer to choose a different plan”. For testing and troubleshooting this can be handy.

Another way to influence the optimizer/planner is to set the planner cost constants:

 postgres=# select name from pg_settings where name like '%cost%' and name not like '%vacuum%';
         name         
----------------------
 cpu_index_tuple_cost
 cpu_operator_cost
 cpu_tuple_cost
 parallel_setup_cost
 parallel_tuple_cost
 random_page_cost
 seq_page_cost
(7 rows)

What they mean is pretty well documented, and how you need to set them (if you need to change them at all) depends on your hardware and application. There are others as well, such as the *collapse_limit* parameters and the parameters for the Genetic Query Optimizer.
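Just as an illustration (the right value depends on your storage, and this is a session-level experiment, not a recommendation), lowering random_page_cost tells the planner that random I/O is almost as cheap as sequential I/O, which typically makes index access more attractive, e.g. on SSD storage:

set random_page_cost = 1.1;
explain select * from t1 order by a limit 10;
reset random_page_cost;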

Conclusion: there are several ways to influence the optimizer/planner in PostgreSQL, it is just not by using hints.

 

Cet article Can I do it with PostgreSQL? – 14 – optimizer hints est apparu en premier sur Blog dbi services.

12cR2 DBCA, Automatic Memory Management, and -databaseType

Mon, 2017-04-03 15:52

This post explains the following error encountered when creating a 12.2 database with DBCA:
[DBT-11211] The Automatic Memory Management option is not allowed when the total physical memory is greater than 4GB.
or when creating the database directly with the installer:
[INS-35178]The Automatic Memory Management option is not allowed when the total physical memory is greater than 4GB.
If you used Automatic Memory Management (AMM) you will have to think differently and size the SGA and PGA separately.

ASMM

Automatic Shared Memory Management, or ASMM, is what you do when setting SGA_TARGET and not setting MEMORY_TARGET. Basically, you define the size of the SGA you want to allocate at startup and that will be available for the instance, most of it being buffer cache and shared pool. I’ll not go into the detail of SGA_TARGET and SGA_MAX_SIZE because on the most common platforms all of it is allocated at instance startup. Then, in addition to this shared area used by all instance processes, each process can allocate private memory, and you control this with PGA_AGGREGATE_TARGET.
The total size of SGA and PGA for all instances in a system must reside in physical memory for the simple reason that they are mostly used to avoid I/O (a large buffer cache avoids physical reads and optimizes physical writes, a large PGA avoids reads and writes to tempfiles).

AMM

Because you don’t always know how much to allocate to each (SGA and PGA), Oracle came up with a feature where you define the whole MEMORY_TARGET, and part of this is dynamically allocated to SGA or PGA. This is called Automatic Memory Management (AMM). It’s a good idea on paper: it is automatic, which means that you don’t have to think about it, and it is dynamic, which means that you don’t waste physical memory because of bad sizing.

But it is actually a bad idea when going to implementation, at least on the most common platforms.
SGA and PGA are different beasts that should not be put in the same cage:

  • SGA is big, static, shared, allocated once at startup
  • PGA is small chunks constantly allocated and deallocated, private to processes

First, it is not so easy because you have to size /dev/shm correctly or you will get the following at startup:
ORA-00845: MEMORY_TARGET not supported on this system
In addition to that, because the whole memory is prepared to contain the whole SGA you see misleading numbers in ‘show sga’.

Second, there are lots of bugs, resizing overhead, etc.

And finally, you cannot use large pages when you are in AMM, and in modern systems (lots of RAM, lots of processes) having all processes map the SGA with small 4k pages is a big overhead.

So, as long as you have more than a few GB on a system, you should avoid AMM and set SGA_TARGET and PGA_AGGREGATE_TARGET independently. Forget MEMORY_TARGET. Forget /dev/shm. Forget also the following documentation at http://docs.oracle.com/database/122/ADMIN/managing-memory.htm#ADMIN00207 which mentions that Oracle recommends that you enable the method known as automatic memory management.
Actually, AMM is not recommended for systems with more than a few GB of physical memory, and most systems have more than a few GB of physical memory. If you try to use AMM on a system with more than 4GB of physical memory, you get a warning in 12cR1 and an error in 12cR2:
[screenshot: CaptureAMM002]
I got this when trying to create a database with AMM on a system with more than 4GB of physical memory.

[screenshot: CaptureAMM001]
This does not depend on the size of MEMORY_TARGET you choose, or the size of /dev/shm, but only on the size of the available physical memory:
[oracle@VM104 ~]$ df -h /dev/shm
Filesystem Size Used Avail Use% Mounted on
tmpfs 3.9G 0 3.9G 0% /dev/shm
 
[oracle@VM104 ~]$ free -h
total used free shared buff/cache available
Mem: 7.8G 755M 5.0G 776M 2.1G 6.2G
Swap: 411M 0B 411M

No choice: it is a hard stop

If you are not convinced, then please have a look at MOS Doc ID 2244817.1 which explains this decision:

  • It is not something new: DBCA used to give a similar warning message, but in 12.2.0.1 it is an error message
  • Reason behind: because database creation sometimes fails, and in some cases the database won’t be functional after some time

So, do you want to create a database which may not be functional after some time?

So, what size for SGA and PGA?

Then, if you were thinking that AMM was cool, your next question now is: what size to allocate to SGA and PGA?

Don’t panic.

You are in this situation because you have several GB of RAM. Current servers have lots of memory. You don’t have to size it to the nearest 100MB. Start with some values and run with them. Look at the performance and the memory advisors. Are you doing too much physical I/O on tables where you expect data to be in cache? Then increase the SGA, and maybe set a minimum for the buffer cache. Do you see lots of hard parses because your application runs lots of statements and procedures? Then increase the SGA and maybe set a minimum for the shared pool. Do you run lots of analytic queries that full scan tables and have to hash and sort huge amounts of data? Then decrease the SGA and increase PGA_AGGREGATE_TARGET.
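The advisors are exposed in views such as V$SGA_TARGET_ADVICE and V$PGA_TARGET_ADVICE. A quick look like this (illustrative queries, to be read after a representative workload period) already tells you in which direction to move:

select sga_size,sga_size_factor,estd_db_time,estd_physical_reads from v$sga_target_advice order by sga_size;
select pga_target_for_estimate,pga_target_factor,estd_pga_cache_hit_percentage,estd_overalloc_count from v$pga_target_advice order by 1;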

Where to start?

If you don’t know where to start, look at the DBCA database types:

#-----------------------------------------------------------------------------
# Name : databaseType
# Datatype : String
# Description : used for memory distribution when memoryPercentage specified
# Valid values : MULTIPURPOSE|DATA_WAREHOUSING|OLTP
# Default value : MULTIPURPOSE
# Mandatory : NO
#-----------------------------------------------------------------------------

Those types define the ratio between SGA and PGA. Then why not start with what is recommended by Oracle?

I’ve created the 3 types of instances with the following:
dbca -silent -totalMemory 10000 -databaseType MULTIPURPOSE -generateScripts -scriptDest /tmp/MULT ...
dbca -silent -totalMemory 10000 -databaseType DATA_WAREHOUSING -generateScripts -scriptDest /tmp/DWHG ...
dbca -silent -totalMemory 10000 -databaseType OLTP -generateScripts -scriptDest /tmp/OLTP ...

And here are the settings generated by DBCA
$ grep target /tmp/*/init.ora
DWHG/init.ora:sga_target=6000m
DWHG/init.ora:pga_aggregate_target=4000m
MULT/init.ora:sga_target=7500m
MULT/init.ora:pga_aggregate_target=2500m
OLTP/init.ora:sga_target=8000m
OLTP/init.ora:pga_aggregate_target=2000m

Here is the summary:

                   SGA   PGA
OLTP               80%   20%
Multi-Purpose      75%   25%
Data Warehousing   60%   40%

(percentages are relative to each other here. Don’t use 100% of physical memory for the Oracle instances, because the system needs some memory as well)

This gives an idea where to start. Servers have lots of memory but you don’t have to use all of it. If you have a doubt, leave some free memory available for the filesystem cache. Usually, we recommend using direct I/O (filesystemio_options=setall) to avoid the filesystem overhead. But when you start and want to lower the risk of under-sizing the SGA or PGA, you may prefer to keep that second level of cache (filesystemio_options=async), which uses all the physical memory available. This may improve the reads from tempfiles in case your PGA is too small. This is just an idea, not a recommendation.

So what?

If you have a server with more than a few GB, then set SGA and PGA separately. Start with the ratios above, and then monitor performance and the advisors. Physical servers today have at least 32GB. Even with a small VM with 1GB for my labs, I prefer to set them separately, because in that case I want to be sure to have a minimum size for the buffer cache and shared pool. You may have lots of small VMs with 3GB and think about setting MEMORY_TARGET. But using large pages is a recommendation here because the hypervisor will have lots of memory to map, so ASMM is still the recommendation.
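As a sketch only (the values are illustrative and must be validated with the advisors), moving an instance from AMM to separate settings looks like this, followed by an instance restart:

alter system set sga_target=6G scope=spfile;
alter system set pga_aggregate_target=2G scope=spfile;
alter system set memory_target=0 scope=spfile;
-- memory_max_target may also have to be reset in the spfile before the restart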

Once you know the total size of all SGAs, look at Hugepagesize in /proc/meminfo, set the number of hugepages in /etc/sysctl.conf, run sysctl -p, and your instances will use the available large pages for the SGA.

 

Cet article 12cR2 DBCA, Automatic Memory Management, and -databaseType est apparu en premier sur Blog dbi services.

When automatic reoptimization plan is less efficient

Sun, 2017-04-02 05:05

11gR2 started to have the optimizer react at execution time when a misestimate is encountered. The next executions are then re-optimized with more accurate estimations, derived from the execution statistics. This was called cardinality feedback. Unfortunately, in rare cases we had a fast execution plan despite bad estimations, and better estimations led to a worse execution plan. This is rare, but even when 9999 queries are faster, the one that takes too long gives a bad perception of this optimizer feature.
This feature has been improved in 12cR1 with new names: auto-reoptimization and statistics feedback. I’m showing an example here in 12.1.0.2 without adaptive statistics (the 12.2 backport), and I’ve also disabled adaptive plans because they show the wrong numbers (similar to what I described in this post). I’ll show that at one point the re-optimization can go back to the initial plan, if it was the best in execution time.

V$SQL

Basically, here is what happened: the first execution was fast, but with actual numbers of rows far from the estimated ones. Auto-reoptimization kicks in for the next execution and gets a new plan, but with a longer execution time. The third execution is another re-optimization, leading to the same bad plan. Finally, starting with the 4th execution, the time is back to reasonable and we can see that the same plan as the first one is used:

SQL> select sql_id,child_number,plan_hash_value,is_reoptimizable,is_resolved_adaptive_plan,parse_calls,executions,elapsed_time/1e6
from v$sql where sql_id='b4rhzfw7d6vdp';
 
SQL_ID CHILD_NUMBER PLAN_HASH_VALUE I I PARSE_CALLS EXECUTIONS ELAPSED_TIME/1E6
------------- ------------ --------------- - - ----------- ---------- ----------------
b4rhzfw7d6vdp 0 1894156093 Y 1 1 .346571
b4rhzfw7d6vdp 1 955499861 Y 1 1 5.173733
b4rhzfw7d6vdp 2 955499861 Y 1 1 4.772258
b4rhzfw7d6vdp 3 1894156093 N 7 7 .5008

The scope of statistics feedback is not to get an optimal execution from the first execution. That requires accurate statistics, static or dynamic, and SQL Plan Directives are an attempt to get that. The goal of statistics feedback is to try to get a better plan rather than re-use one that is based on misestimates. But sometimes the better is the enemy of the good, and we have an example here with child cursors 1 and 2. The good thing is that we finally come back to an acceptable execution time, with a final plan that can be re-used without re-optimization.

What surprised me here is that the final plan has the same hash value as the initial one. Is it a coincidence that different estimations give the same plan? Or did the optimizer finally give up trying to find better?

V$SQL_REOPTIMIZATION_HINTS

In 12c the statistics feedback is exposed in V$SQL_REOPTIMIZATION_HINTS.

SQL> select sql_id,child_number,hint_text,client_id,reparse from v$sql_reoptimization_hints where sql_id='b4rhzfw7d6vdp';
 
SQL_ID CHILD_NUMBER HINT_TEXT CLIENT_ID REPARSE
------------- ------------ ---------------------------------------------------------------------------------------------------- ---------- ----------
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" ROWS=1517.000000 ) 1 1
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" MIN=1517.000000 ) 1 1
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" TABLE "DM_FOLDER_R1"@"SEL$1" ROWS=1517.000000 ) 1 1
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" TABLE "DM_SYSOBJECT_R2"@"SEL$1" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" ROWS=1517.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" MIN=1517.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" TABLE "DM_FOLDER_R1"@"SEL$1" ROWS=1517.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" TABLE "DM_SYSOBJECT_R2"@"SEL$1" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$582FA660" QUERY_BLOCK ROWS=1491.000000 ) 1 1

Child cursor 0 was re-optimized into cursor 1 with different numbers of rows for “DM_FOLDER_R1″ and “DM_SYSOBJECT_R2″.
Child cursor 1 has the same values, but an additional row count correction for a query block.

But we don’t see anything about cursor 2. It was re-optimizable, and was actually re-optimized into cursor 3 but no statistics corrections are displayed here.

Trace

As it is a reproducible case, I’ve run the same statement while tracing 10046, 10053 and 10507 (level 512) to get all information about SQL execution, optimizer compilation, and statistics feedback. For each child cursor, I’ll show the execution plan with estimated and actual numbers of rows (E-Rows and A-Rows) and then some interesting lines from the trace, mainly those returned by:
grep -E "KKSMEC|^atom_hint|^@"

Child cursor 0 – plan 1894156093 – 0.34 seconds

Plan hash value: 1894156093
----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 171 (100)| 1 |00:00:00.04 | 17679 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.04 | 17679 |
| 2 | NESTED LOOPS | | 1 | 1 | 116 (0)| 1491 |00:00:00.04 | 17679 |
| 3 | NESTED LOOPS | | 1 | 1 | 115 (0)| 1491 |00:00:00.04 | 17456 |
| 4 | NESTED LOOPS | | 1 | 49 | 17 (0)| 5648 |00:00:00.01 | 537 |
|* 5 | INDEX RANGE SCAN | D_1F0049A880000016 | 1 | 3 | 3 (0)| 1517 |00:00:00.01 | 13 |
| 6 | TABLE ACCESS BY INDEX ROWID BATCHED | DM_SYSOBJECT_R | 1517 | 16 | 10 (0)| 5648 |00:00:00.01 | 524 |
|* 7 | INDEX RANGE SCAN | D_1F0049A880000010 | 1517 | 71 | 2 (0)| 5648 |00:00:00.01 | 249 |
|* 8 | TABLE ACCESS BY INDEX ROWID | DM_SYSOBJECT_S | 5648 | 1 | 2 (0)| 1491 |00:00:00.03 | 16919 |
|* 9 | INDEX UNIQUE SCAN | D_1F0049A880000108 | 5648 | 1 | 1 (0)| 1491 |00:00:00.03 | 15428 |
| 10 | NESTED LOOPS SEMI | | 5648 | 2 | 25 (0)| 1491 |00:00:00.02 | 14828 |
| 11 | NESTED LOOPS | | 5648 | 7 | 18 (0)| 2981 |00:00:00.02 | 12869 |
| 12 | TABLE ACCESS BY INDEX ROWID BATCHED| DM_SYSOBJECT_R | 5648 | 71 | 4 (0)| 2981 |00:00:00.01 | 7747 |
|* 13 | INDEX RANGE SCAN | D_1F0049A880000010 | 5648 | 16 | 3 (0)| 2981 |00:00:00.01 | 6145 |
|* 14 | TABLE ACCESS BY INDEX ROWID | DM_SYSOBJECT_S | 2981 | 1 | 2 (0)| 2981 |00:00:00.01 | 5122 |
|* 15 | INDEX UNIQUE SCAN | D_1F0049A880000108 | 2981 | 1 | 1 (0)| 2981 |00:00:00.01 | 2140 |
|* 16 | INDEX UNIQUE SCAN | D_1F0049A880000145 | 2981 | 52759 | 1 (0)| 1491 |00:00:00.01 | 1959 |
|* 17 | INDEX UNIQUE SCAN | D_1F0049A880000142 | 1491 | 1 | 1 (0)| 1491 |00:00:00.01 | 223 |
----------------------------------------------------------------------------------------------------------------------------------------

Because of the low cardinality estimation on the index range scan at line 5 (E-Rows=3 vs. A-Rows=1517), the optimizer goes for NESTED LOOPS. This plan has a good execution time here because all blocks are in the buffer cache. Reading 17679 blocks from the buffer cache takes less than one second. It would have been much longer if those were physical I/Os.

This is a case where the optimizer detects a misestimate at execution time. Here is what is recorded in the trace:

Reparsing due to card est...
@=0x63a56820 type=3 nodeid=5 monitor=Y halias="DM_FOLDER_R1" loc="SEL$1" oname="SEL$F5BB74E1" act=1517 min=0 est=3 next=(nil)
Reparsing due to card est...
@=0x638fe2b0 type=5 nodeid=4 monitor=Y halias="" loc="SEL$F5BB74E1" onames="SEL$07BDC5B4"@"SEL$5" "SEL$2"@"SEL$5" act=5648 min=0 est=49 next=0x638fe250
Reparsing due to card est...
@=0x638fe4c0 type=5 nodeid=3 monitor=Y halias="" loc="SEL$F5BB74E1" onames="SEL$07BDC5B4"@"SEL$5" "SEL$2"@"SEL$5" "SEL$3"@"SEL$1" act=1491 min=0 est=1 next=0x638fe460
Reparsing due to card est...
@=0x638fe688 type=5 nodeid=2 monitor=Y halias="" loc="SEL$F5BB74E1" onames="SEL$07BDC5B4"@"SEL$5" "SEL$2"@"SEL$5" "SEL$3"@"SEL$1" "R_OBJECT_ID"@"SEL$1" act=1491 min=0 est=1 next=0x638fe5f8
kkocfbCheckCardEst [sql_id=b4rhzfw7d6vdp] reparse=y ecs=n efb=n ost=n fbs=n

Those are the misestimates which trigger re-optimization.

And here are all the statistics feedback entries:

*********** Begin Dump Context (kkocfbCheckCardEst) [sql_id=b4rhzfw7d6vdp cpcnt=0] ***********
@=0x638fe688 type=5 nodeid=2 monitor=Y halias="" loc="SEL$F5BB74E1" onames="DM_FOLDER_R1"@"SEL$1" "DM_SYSOBJECT_R2"@"SEL$1" "TE_"@"SEL$2" "LJ_"@"SEL$2" act=1491 min=0 est=1 next=0x638fe5f8
@=0x638fe5f8 type=3 nodeid=17 monitor=Y halias="LJ_" loc="SEL$2" oname="D_1F0049A880000142" act=0 min=1 est=1 next=0x638fe4c0
@=0x638fe4c0 type=5 nodeid=3 monitor=Y halias="" loc="SEL$F5BB74E1" onames="DM_FOLDER_R1"@"SEL$1" "DM_SYSOBJECT_R2"@"SEL$1" "TE_"@"SEL$2" act=1491 min=0 est=1 next=0x638fe460
@=0x638fe460 type=1 nodeid=8 monitor=Y halias="TE_" loc="SEL$2" act=0 min=1 est=1 next=0x638fe3d0
@=0x638fe3d0 type=3 nodeid=9 monitor=Y halias="TE_" loc="SEL$2" oname="D_1F0049A880000108" act=0 min=1 est=1 next=0x638fe2b0
@=0x638fe2b0 type=5 nodeid=4 monitor=Y halias="" loc="SEL$F5BB74E1" onames="DM_FOLDER_R1"@"SEL$1" "DM_SYSOBJECT_R2"@"SEL$1" act=5648 min=0 est=49 next=0x638fe250
@=0x638fe250 type=1 nodeid=6 monitor=Y halias="DM_SYSOBJECT_R2" loc="SEL$1" act=3 min=1 est=16 next=0x638fe1c0
@=0x638fe1c0 type=3 nodeid=7 monitor=Y halias="DM_SYSOBJECT_R2" loc="SEL$1" oname="D_1F0049A880000010" act=3 min=1 est=71 next=0x63a56820
@=0x63a56820 type=3 nodeid=5 monitor=Y halias="DM_FOLDER_R1" loc="SEL$1" oname="D_1F0049A880000016" act=1517 min=0 est=3 next=(nil)
*********** End Dump Context ***********

We also see some information about execution performance:

kkoarCopyCtx: [sql_id=b4rhzfw7d6vdp] origin=CFB old=0x63a565d0 new=0x7fe74e2153f0 copyCnt=1 copyClient=y
**************************************************************
kkocfbCopyBestEst: Best Stats
Exec count: 1
CR gets: 17679
CU gets: 0
Disk Reads: 0
Disk Writes: 0
IO Read Requests: 0
IO Write Requests: 0
Bytes Read: 0
Bytes Written: 0
Bytes Exchanged with Storage: 0
Bytes Exchanged with Disk: 0
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 51 (ms)
CPU Time: 51 (ms)
User I/O Time: 15 (us)
*********** Begin Dump Context (kkocfbCopyBestEst) **********
*********** End Dump Context ***********

They are labeled as ‘Best Stats’ because we had only one execution at that time.

Finally, the hints are dumped:

******** Begin CFB Hints [sql_id=b4rhzfw7d6vdp] xsc=0x7fe74e215748 ********
Dumping Hints
=============
atom_hint=(@=0x7fe74e21ebf0 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" TABLE "DM_SYSOBJECT_R2"@"SEL$1" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21e758 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21e3f0 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21dfd0 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" ROWS=1517.000000 ) )
atom_hint=(@=0x7fe74e21dc68 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" MIN=1517.000000 ) )
atom_hint=(@=0x7fe74e21d8c8 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" TABLE "DM_FOLDER_R1"@"SEL$1" ROWS=1517.000000 ) )
********** End CFB Hints **********

Those are exactly what we see in V$SQL_REOPTIMIZATION_HINTS.

This is all we see for this first execution. The next execution starts with:

KKSMEC: Invalidating old cursor 0 with hash val = 1894156093
KKSMEC: Produced New cursor 1 with hash val = 955499861

As a consequence of child cursor 0 being marked as reoptimizable, the next execution invalidates it and creates a new child cursor 1.

Child cursor 1 – new plan 955499861 – 5.17 seconds

Here is the new plan we see after that second execution:

Plan hash value: 955499861
------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 30996 (100)| 1 |00:00:04.58 | 102K| 101K| | | |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:04.58 | 102K| 101K| | | |
| 2 | VIEW | VM_NWVW_2 | 1 | 12039 | 30996 (1)| 1491 |00:00:04.58 | 102K| 101K| | | |
| 3 | HASH UNIQUE | | 1 | 12039 | 30996 (1)| 1491 |00:00:04.58 | 102K| 101K| 941K| 941K| 2597K (0)|
|* 4 | HASH JOIN RIGHT SEMI | | 1 | 12039 | 30490 (1)| 4132 |00:00:04.57 | 102K| 101K| 12M| 3867K| 14M (0)|
| 5 | TABLE ACCESS FULL | DM_DOCUMENT_S | 1 | 213K| 210 (1)| 213K|00:00:00.01 | 741 | 0 | | | |
|* 6 | HASH JOIN | | 1 | 36463 | 29665 (1)| 5622 |00:00:04.51 | 101K| 101K| 1405K| 1183K| 2026K (0)|
|* 7 | HASH JOIN | | 1 | 36463 | 18397 (1)| 5622 |00:00:02.23 | 65103 | 65050 | 940K| 940K| 1339K (0)|
|* 8 | HASH JOIN | | 1 | 2222 | 14489 (1)| 1499 |00:00:01.58 | 51413 | 51369 | 992K| 992K| 1377K (0)|
|* 9 | HASH JOIN | | 1 | 2222 | 14120 (1)| 1499 |00:00:01.46 | 50088 | 50057 | 3494K| 1598K| 4145K (0)|
|* 10 | TABLE ACCESS FULL | DM_SYSOBJECT_S | 1 | 39235 | 10003 (1)| 39235 |00:00:00.83 | 36385 | 36376 | | | |
|* 11 | HASH JOIN | | 1 | 24899 | 3920 (1)| 5648 |00:00:00.62 | 13703 | 13681 | 1199K| 1199K| 1344K (0)|
|* 12 | INDEX RANGE SCAN | D_1F0049A880000016 | 1 | 1517 | 12 (0)| 1517 |00:00:00.01 | 13 | 0 | | | |
|* 13 | TABLE ACCESS FULL| DM_SYSOBJECT_R | 1 | 646K| 3906 (1)| 646K|00:00:00.50 | 13690 | 13681 | | | |
| 14 | TABLE ACCESS FULL | DM_FOLDER_S | 1 | 431K| 367 (1)| 431K|00:00:00.04 | 1325 | 1312 | | | |
|* 15 | TABLE ACCESS FULL | DM_SYSOBJECT_R | 1 | 646K| 3906 (1)| 646K|00:00:00.51 | 13690 | 13681 | | | |
|* 16 | TABLE ACCESS FULL | DM_SYSOBJECT_S | 1 | 646K| 10000 (1)| 646K|00:00:02.14 | 36385 | 36376 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------
Note
-----
- statistics feedback used for this statement

The note makes it clear that the estimations come from the previous run (statistics feedback) and we see that for most operations E-Rows = A-Rows. With those, a new plan has been chosen, with complex view merging: VM_NWVW_2. You can find clues about those internal view names on Jonathan Lewis’ blog. Here, probably because the estimated number of rows is high, the subquery has been unnested. It is an ‘EXISTS’ subquery, which is transformed to a semi join and merged in order to apply a distinct at the end.

So, we have a different plan, which is supposed to be better because it has been costed with more accurate cardinalities. The goal of this post is not to detail the reason why the execution time is longer with a ‘better’ plan. If you look at the ‘Reads’ column you can see that the first execution read all blocks from the buffer cache but the second one had to do physical I/O for all of them. With nothing in the buffer cache, reading 101K blocks with multiblock reads may be faster than reading 17679 blocks, so the optimizer decision was not bad. I’ll have to check whether having most of the blocks in the buffer cache is what can be expected in real production life, as the behavior in UAT is different. Some people will stop here, say that cardinality feedback is bad, disable it or even set optimizer_index_cost_adj to get the nested loops, but things are more complex than that.

The important thing is that the optimizer doesn’t stop there and compares the new execution statistics with the previous one.

**************************************************************
kkocfbCompareExecStats : Current
Exec count: 1
CR gets: 102226
CU gets: 3
Disk Reads: 101426
Disk Writes: 0
IO Read Requests: 1633
IO Write Requests: 0
Bytes Read: 830881792
Bytes Written: 0
Bytes Exchanged with Storage: 830881792
Bytes Exchanged with Disk: 830881792
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 4586 (ms)
CPU Time: 1305 (ms)
User I/O Time: 3040 (ms)
**************************************************************
kkocfbCompareExecStats : Best
Exec count: 1
CR gets: 17679
CU gets: 0
Disk Reads: 0
Disk Writes: 0
IO Read Requests: 0
IO Write Requests: 0
Bytes Read: 0
Bytes Written: 0
Bytes Exchanged with Storage: 0
Bytes Exchanged with Disk: 0
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 51 (ms)
CPU Time: 51 (ms)
User I/O Time: 15 (us)
kkocfbCompareExecStats: improvement BG: 0.172935 CPU: 0.039555

The first execution, with ‘bad’ statistics, is still the best one and this new execution has an ‘improvement’ of 0.17, which means it is about 5 times slower.

Then in the trace we see again that re-optimization (reparsing) is considered:

Reparsing due to card est...
@=0x6a368338 type=5 nodeid=11 monitor=Y halias="" loc="SEL$582FA660" onames="SEL$608EC1F7"@"SEL$582FA660" "SEL$04458B50"@"SEL$582FA660" act=5648 min=0 est=24899 next=0x6a3682d8
Reparsing due to card est...
@=0x6a3687b0 type=5 nodeid=7 monitor=Y halias="" loc="SEL$582FA660" onames="SEL$608EC1F7"@"SEL$582FA660" "SEL$04458B50"@"SEL$582FA660" "SEL$FB0FE72C"@"SEL$33802F1B" "SEL$5"@"SEL$33802F1B" "SEL$07BDC5B4"@"SEL$636B5685" act=5622 min=0 est=36463 next=0x6a368750
Reparsing due to card est...
@=0x6a368990 type=5 nodeid=6 monitor=Y halias="" loc="SEL$582FA660" onames="SEL$608EC1F7"@"SEL$582FA660" "SEL$04458B50"@"SEL$582FA660" "SEL$FB0FE72C"@"SEL$33802F1B" "SEL$5"@"SEL$33802F1B" "SEL$07BDC5B4"@"SEL$636B5685" "SEL$FB0FE72C"@"SEL$4" act=5622 min=0 est=36463 next=0x6a368930
Reparsing due to card est...
@=0x6a368b90 type=5 nodeid=4 monitor=Y halias="" loc="SEL$582FA660" onames="SEL$608EC1F7"@"SEL$582FA660" "SEL$04458B50"@"SEL$582FA660" "SEL$FB0FE72C"@"SEL$33802F1B" "SEL$5"@"SEL$33802F1B" "SEL$07BDC5B4"@"SEL$636B5685" "SEL$FB0FE72C"@"SEL$4" "SEL$F5BB74E1"
@"SEL$4" act=4132 min=0 est=12039 next=0x6a368b30
Reparsing due to card est...
@=0x6a368d60 type=4 nodeid=3 monitor=Y halias="" loc="SEL$582FA660" act=1491 min=0 est=12039 next=0x6a368b90

An additional OPT_ESTIMATE hint is generated for the complex view merging query block:

atom_hint=(@=0x7fe74e21eb90 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" ROWS=1517.000000 ) )
atom_hint=(@=0x7fe74e21e7b0 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" MIN=1517.000000 ) )
atom_hint=(@=0x7fe74e21e470 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" TABLE "DM_FOLDER_R1"@"SEL$1" ROWS=1517.000000 ) )
atom_hint=(@=0x7fe74e21e050 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21dce8 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21da38 err=0 resol=0 used=0 token=1018 org=6 lvl=2 txt=OPT_ESTIMATE (@"SEL$582FA660" QUERY_BLOCK ROWS=1491.000000 ) )
atom_hint=(@=0x7fe74e21d600 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" TABLE "DM_SYSOBJECT_R2"@"SEL$1" MIN=3.000000 ) )

With this new cardinality estimation, the next execution will try to get a better plan, but it doesn’t change the optimizer’s choice and the new child cursor gets the same execution plan:
KKSMEC: Invalidating old cursor 1 with hash val = 955499861
KKSMEC: Produced New cursor 2 with hash val = 955499861

Child cursor 2 – plan 955499861 again – 4.77 seconds

This is the third execution:

Plan hash value: 955499861
------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 30996 (100)| 1 |00:00:04.19 | 102K| 101K| | | |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:04.19 | 102K| 101K| | | |
| 2 | VIEW | VM_NWVW_2 | 1 | 1491 | 30996 (1)| 1491 |00:00:04.19 | 102K| 101K| | | |
| 3 | HASH UNIQUE | | 1 | 1491 | 30996 (1)| 1491 |00:00:04.19 | 102K| 101K| 941K| 941K| 1355K (0)|
|* 4 | HASH JOIN RIGHT SEMI | | 1 | 12039 | 30490 (1)| 4132 |00:00:04.19 | 102K| 101K| 12M| 3867K| 14M (0)|
| 5 | TABLE ACCESS FULL | DM_DOCUMENT_S | 1 | 213K| 210 (1)| 213K|00:00:00.01 | 740 | 0 | | | |
|* 6 | HASH JOIN | | 1 | 36463 | 29665 (1)| 5622 |00:00:04.12 | 101K| 101K| 1405K| 1183K| 2021K (0)|
|* 7 | HASH JOIN | | 1 | 36463 | 18397 (1)| 5622 |00:00:03.39 | 65102 | 65050 | 940K| 940K| 1359K (0)|
|* 8 | HASH JOIN | | 1 | 2222 | 14489 (1)| 1499 |00:00:02.94 | 51412 | 51369 | 992K| 992K| 1331K (0)|
|* 9 | HASH JOIN | | 1 | 2222 | 14120 (1)| 1499 |00:00:01.04 | 50088 | 50057 | 3494K| 1598K| 4145K (0)|
|* 10 | TABLE ACCESS FULL | DM_SYSOBJECT_S | 1 | 39235 | 10003 (1)| 39235 |00:00:00.47 | 36385 | 36376 | | | |
|* 11 | HASH JOIN | | 1 | 24899 | 3920 (1)| 5648 |00:00:00.55 | 13703 | 13681 | 1199K| 1199K| 1344K (0)|
|* 12 | INDEX RANGE SCAN | D_1F0049A880000016 | 1 | 1517 | 12 (0)| 1517 |00:00:00.01 | 13 | 0 | | | |
|* 13 | TABLE ACCESS FULL| DM_SYSOBJECT_R | 1 | 646K| 3906 (1)| 646K|00:00:00.43 | 13690 | 13681 | | | |
| 14 | TABLE ACCESS FULL | DM_FOLDER_S | 1 | 431K| 367 (1)| 431K|00:00:01.82 | 1324 | 1312 | | | |
|* 15 | TABLE ACCESS FULL | DM_SYSOBJECT_R | 1 | 646K| 3906 (1)| 646K|00:00:00.33 | 13690 | 13681 | | | |
|* 16 | TABLE ACCESS FULL | DM_SYSOBJECT_S | 1 | 646K| 10000 (1)| 646K|00:00:00.60 | 36385 | 36376 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------
Note
-----
- statistics feedback used for this statement

Same plan and roughly the same execution time here. The tables are large and the SGA is small here.

*********** Begin Dump Context: best estimates ***********
 
**************************************************************
kkocfbCompareExecStats : Current
Exec count: 1
CR gets: 102224
CU gets: 3
Disk Reads: 101426
Disk Writes: 0
IO Read Requests: 1633
IO Write Requests: 0
Bytes Read: 830881792
Bytes Written: 0
Bytes Exchanged with Storage: 830881792
Bytes Exchanged with Disk: 830881792
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 4206 (ms)
CPU Time: 1279 (ms)
User I/O Time: 3084 (ms)
**************************************************************
kkocfbCompareExecStats : Best
Exec count: 1
CR gets: 17679
CU gets: 0
Disk Reads: 0
Disk Writes: 0
IO Read Requests: 0
IO Write Requests: 0
Bytes Read: 0
Bytes Written: 0
Bytes Exchanged with Storage: 0
Bytes Exchanged with Disk: 0
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 51 (ms)
CPU Time: 51 (ms)
User I/O Time: 15 (us)
kkocfbCompareExecStats: improvement BG: 0.172939 CPU: 0.040363

So where are we here? We had an execution which was based on bad estimations. Then there were two tries with good estimations, but because of different buffer cache behavior they were finally 5 times slower. Nothing else to try.

The good thing is that the optimizer admits it cannot do better and falls back to the best execution time, now considered as the best estimate:

kkocfbCheckCardEst: reparse using best estimates
...
kkocfbCopyCardCtx: No best stats found

We see no OPT_ESTIMATE hints here, which is the reason why there was nothing in V$SQL_REOPTIMIZATION_HINTS for cursor 2, but this cursor is still marked as re-optimizable and the next execution invalidates it:

KKSMEC: Invalidating old cursor 2 with hash val = 955499861
KKSMEC: Produced New cursor 3 with hash val = 1894156093

We see that we are back to the original plan, which is expected because the static statistics have not changed, and there is no statistics feedback this time.

Child cursor 3 – back to plan 1894156093 – 0.5 seconds

This is the plan that is used for all subsequent executions now.

Plan hash value: 1894156093
----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 171 (100)| 1 |00:00:00.04 | 17677 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.04 | 17677 |
| 2 | NESTED LOOPS | | 1 | 1 | 116 (0)| 1491 |00:00:00.04 | 17677 |
| 3 | NESTED LOOPS | | 1 | 1 | 115 (0)| 1491 |00:00:00.04 | 17454 |
| 4 | NESTED LOOPS | | 1 | 49 | 17 (0)| 5648 |00:00:00.01 | 536 |
|* 5 | INDEX RANGE SCAN | D_1F0049A880000016 | 1 | 3 | 3 (0)| 1517 |00:00:00.01 | 13 |
| 6 | TABLE ACCESS BY INDEX ROWID BATCHED | DM_SYSOBJECT_R | 1517 | 16 | 10 (0)| 5648 |00:00:00.01 | 523 |
|* 7 | INDEX RANGE SCAN | D_1F0049A880000010 | 1517 | 71 | 2 (0)| 5648 |00:00:00.01 | 249 |
|* 8 | TABLE ACCESS BY INDEX ROWID | DM_SYSOBJECT_S | 5648 | 1 | 2 (0)| 1491 |00:00:00.03 | 16918 |
|* 9 | INDEX UNIQUE SCAN | D_1F0049A880000108 | 5648 | 1 | 1 (0)| 1491 |00:00:00.03 | 15427 |
| 10 | NESTED LOOPS SEMI | | 5648 | 2 | 25 (0)| 1491 |00:00:00.02 | 14827 |
| 11 | NESTED LOOPS | | 5648 | 7 | 18 (0)| 2981 |00:00:00.02 | 12868 |
| 12 | TABLE ACCESS BY INDEX ROWID BATCHED| DM_SYSOBJECT_R | 5648 | 71 | 4 (0)| 2981 |00:00:00.01 | 7747 |
|* 13 | INDEX RANGE SCAN | D_1F0049A880000010 | 5648 | 16 | 3 (0)| 2981 |00:00:00.01 | 6145 |
|* 14 | TABLE ACCESS BY INDEX ROWID | DM_SYSOBJECT_S | 2981 | 1 | 2 (0)| 2981 |00:00:00.01 | 5121 |
|* 15 | INDEX UNIQUE SCAN | D_1F0049A880000108 | 2981 | 1 | 1 (0)| 2981 |00:00:00.01 | 2140 |
|* 16 | INDEX UNIQUE SCAN | D_1F0049A880000145 | 2981 | 52759 | 1 (0)| 1491 |00:00:00.01 | 1959 |
|* 17 | INDEX UNIQUE SCAN | D_1F0049A880000142 | 1491 | 1 | 1 (0)| 1491 |00:00:00.01 | 223 |
----------------------------------------------------------------------------------------------------------------------------------------

After a few tries to get a better plan, the optimizer finally switched back to the first one because it was the best in terms of response time (I don’t know exactly which execution statistics are used for this decision, elapsed time is just my guess here).

The interesting point here is to understand that you can see a reoptimized cursor without statistics feedback:

  • No rows for the previous cursor in V$SQL_REOPTIMIZATION_HINTS
  • No ‘statistics feedback’ note in the new cursor plan
  • Difference between E-Rows and A-Rows in the new plan (a query to spot such cursors is sketched right after this list)
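For illustration, here is a minimal sketch of such a check: re-optimizable child cursors for which no statistics feedback hints are stored, like child cursor 2 above (run it with a user that has access to the V$ views, and adapt the filters to your SQL_ID):

# Sketch: list re-optimizable child cursors that have no rows in V$SQL_REOPTIMIZATION_HINTS.
sqlplus -s / as sysdba <<'SQL'
set pagesize 100 linesize 200
select s.sql_id, s.child_number, s.plan_hash_value
from   v$sql s
where  s.is_reoptimizable = 'Y'
and    not exists (
         select 1
         from   v$sql_reoptimization_hints h
         where  h.sql_id = s.sql_id
         and    h.child_number = s.child_number
       );
SQL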
So what?

SQL optimization is a complex task and there is no such thing as an execution being ‘fast’ or ‘slow’, an execution plan being ‘good’ or ‘bad’, or an optimizer decision being ‘right’ or ‘wrong’. What is fast after several similar executions can be slow on a busy system because fewer blocks remain in cache. What is slow at a time when the storage is busy may be fast at another time of the day. What is fast with one single user may raise more concurrency contention on a busy system. Cardinality feedback is a reactive attempt to improve an execution plan. On average, things go better with it, but it is not abnormal that a few cases go wrong for a few executions. You can’t blame the optimizer for that, and fast conclusions or optimizer parameter tweaking are not sustainable solutions. And don’t forget that if your data model is well designed, then the critical queries should have one clear optimal access path which does not depend on a small difference in the estimated number of rows.

The only thing I can always conclude when I see cardinality feedback going wrong is that there is something to fix in the design of the data model, the statistics gathering and/or the query design. When statistics feedback gives a worse execution plan, it is the consequence of a combination of:

  • mis-estimation of cardinalities: bad, insufficient, or stale statistics
  • mis-estimation of response time: bad system statistics, untypical memory sizing, unrepresentative execution context
  • no clear optimal access path: sub-optimal indexing, lack of partitioning,…

It is a good thing to have auto-reoptimization coming back to the initial plan when nothing better has been observed. I would love to see more control over it. For example, a hint that sets a threshold of execution time below which the optimizer should not try to find better. I filed this idea in https://community.oracle.com/ideas/17514 and you can vote for it.

Update 2-APR-2017

I was not clear in this post, but this is the first time I observed this behavior (multiple re-optimizations and then back to the original plan), so I’m not sure about the reasons and the different conditions required. This was on 12.1.0.2 with the JAN17 PSU and the two Adaptive Statistics backports from 12cR2, adaptive plans set to false and no bind variables.

 

Cet article When automatic reoptimization plan is less efficient est apparu en premier sur Blog dbi services.

Can I do it with PostgreSQL? – 13 – create database

Sun, 2017-04-02 04:02

Following the last post about which tools you can use to create your PostgreSQL instance, in this post we’ll look at how you can do the “create database” part that you would do in Oracle when not using the Database Configuration Assistant. Of course you can create a database in PostgreSQL, but it is not the same as it is in Oracle. To actually create your Oracle database, you would do something like this once you have the instance started in nomount mode:

startup nomount pfile="/u01/app/oracle/admin/DB4/create/init.ora";
CREATE DATABASE "DB4"
MAXINSTANCES 8
MAXLOGHISTORY 1
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 1024
DATAFILE '/u02/oradata/DB4/system01DB4.dbf' SIZE 700M REUSE AUTOEXTEND ON NEXT  10240K MAXSIZE 2048M
EXTENT MANAGEMENT LOCAL
SYSAUX DATAFILE '/u02/oradata/DB4/sysaux01DB4.dbf' SIZE 600M REUSE AUTOEXTEND ON NEXT  10240K MAXSIZE 2048M
SMALLFILE DEFAULT TEMPORARY TABLESPACE TEMP TEMPFILE '/u02/oradata/DB4/temp01DB4.dbf' SIZE 20M REUSE AUTOEXTEND ON NEXT  640K MAXSIZE 2048M EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1024K
SMALLFILE UNDO TABLESPACE "UNDOTBS1" DATAFILE  '/u02/oradata/DB4/undotbs01DB4.dbf' SIZE 200M REUSE AUTOEXTEND ON NEXT  5120K MAXSIZE 2048M
CHARACTER SET AL32UTF8
NATIONAL CHARACTER SET AL16UTF16
LOGFILE GROUP 1 ('/u03/oradata/DB4/redog1m1DB4.dbf','/u04/oradata/DB4/redog1m2DB4.dbf') SIZE 50M,
GROUP 2 ('/u03/oradata/DB4/redog2m1DB4.dbf','/u04/oradata/DB4/redog2m2DB4.dbf') SIZE 50M,
GROUP 3 ('/u03/oradata/DB4/redog3m1DB4.dbf','/u04/oradata/DB4/redog3m2DB4.dbf') SIZE 50M,
GROUP 4 ('/u03/oradata/DB4/redog4m1DB4.dbf','/u04/oradata/DB4/redog4m2DB4.dbf') SIZE 50M,
GROUP 5 ('/u03/oradata/DB4/redog5m1DB4.dbf','/u04/oradata/DB4/redog5m2DB4.dbf') SIZE 50M,
GROUP 6 ('/u03/oradata/DB4/redog6m1DB4.dbf','/u04/oradata/DB4/redog6m2DB4.dbf') SIZE 50M
USER SYS IDENTIFIED BY "&&sysPassword" USER SYSTEM IDENTIFIED BY "&&systemPassword"
enable pluggable database
seed file_name_convert=('/u02/oradata/DB4/system01DB4.dbf','/u02/oradata/DB4/pdbseed/system01DB4.dbf','/u02/oradata/DB4/sysaux01DB4.dbf','/u02/oradata/DB4/pdbseed/sysaux01DB4.dbf','/u02/oradata/DB4/temp01DB4.dbf','/u02/oradata/DB4/pdbseed/temp01DB4.dbf','/u02/oradata/DB4/undotbs01DB4.dbf','/u02/oradata/DB4/pdbseed/undotbs01DB4.dbf') LOCAL UNDO ON;

Once this has completed, you start creating the catalog and installing the additional components that you need for your application (e.g. Oracle Text or Oracle Spatial). How does that work in PostgreSQL?

In the last post we had a very quick look at initdb. To create the PostgreSQL database cluster you have to use initdb, there is no other possibility. When you take a look at the options you can provide to initdb, there are not too many. The only mandatory parameter is “-D” or “–pgdata”. This tells initdb where you want to have the files created on disk:

postgres@pgbox:/home/postgres/ [pg962final] initdb -D /home/postgres/test/
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locales
  COLLATE:  en_US.UTF-8
  CTYPE:    en_US.UTF-8
  MESSAGES: en_US.UTF-8
  MONETARY: de_CH.UTF-8
  NUMERIC:  de_CH.UTF-8
  TIME:     en_US.UTF-8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/test ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /home/postgres/test/ -l logfile start

There are some important messages which got printed to the screen. The first two lines tell us that all the files will be owned by the operating system user which invoked initdb, postgres, and that the same user must be used to start the instance. Then it gets more interesting as it is about the default encoding/character set for the template database. When you do not specify what you want, you get the default of your operating system session:

 
postgres@pgbox:/home/postgres/ [pg962final] locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=de_CH.UTF-8
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=de_CH.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=de_CH.UTF-8
LC_NAME=de_CH.UTF-8
LC_ADDRESS=de_CH.UTF-8
LC_TELEPHONE=de_CH.UTF-8
LC_MEASUREMENT=de_CH.UTF-8
LC_IDENTIFICATION=de_CH.UTF-8
LC_ALL=

Of course you can override that by passing any of the supported character sets to initdb, e.g.:

 
postgres@pgbox:/home/postgres/ [pg962final] rm -rf test
postgres@pgbox:/home/postgres/ [pg962final] mkdir test
postgres@pgbox:/home/postgres/ [pg962final] initdb -D test --encoding=LATIN1 --locale=de_DE

You can control how sorting and the display for numbers, money and so on shall happen by specifying the various “–lc” parameters, e.g.:

 
postgres@pgbox:/home/postgres/ [pg962final] rm -rf test
postgres@pgbox:/home/postgres/ [pg962final] mkdir test
postgres@pgbox:/home/postgres/ [pg962final] initdb -D test --encoding=LATIN1 --locale=de_DE --lc-messages=en_US --lc-monetary=de_DE

PostgreSQL comes with built-in full text search and the line below the encoding stuff tells you that the default will be English. You can override this as well with the “–text-search-config” parameter of initdb.

Maybe the most important message is this one: “Data page checksums are disabled.”. This means that PostgreSQL will not use checksums to detect silent data corruption. Of course this introduces some overhead when enabled, but your data usually is important, isn’t it? You can enable this by using the “–data-checksums” switch of initdb, and this cannot be changed afterwards.
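As a small sketch (the directory name is just an example), enabling checksums at initialization time and verifying the setting afterwards could look like this:

# Sketch: initialize a new cluster with data checksums enabled and verify it with pg_controldata.
initdb -D /home/postgres/test_cksum --data-checksums
pg_controldata /home/postgres/test_cksum | grep -i checksum
# a non-zero "Data page checksum version" means checksums are enabled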

The last message we will look at for now is this one: “fixing permissions on existing directory /home/postgres/test … ok”. What does that mean? When you look at the permissions of the files and directories that got created by initdb you’ll notice that only the owner of the files and directories has permission (no permission for group and world):

drwx------. 19 postgres postgres      4096 Mar 31 11:07 test
postgres@pgbox:/home/postgres/ [pg962final] ls -al test/
total 56
drwx------. 19 postgres postgres  4096 Mar 31 11:07 .
drwx------. 10 postgres postgres  4096 Mar 31 10:51 ..
drwx------.  5 postgres postgres    38 Mar 31 11:07 base
drwx------.  2 postgres postgres  4096 Mar 31 11:07 global
drwx------.  2 postgres postgres    17 Mar 31 11:07 pg_clog
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_commit_ts
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_dynshmem
-rw-------.  1 postgres postgres  4468 Mar 31 11:07 pg_hba.conf
-rw-------.  1 postgres postgres  1636 Mar 31 11:07 pg_ident.conf
drwx------.  4 postgres postgres    37 Mar 31 11:07 pg_logical
drwx------.  4 postgres postgres    34 Mar 31 11:07 pg_multixact
drwx------.  2 postgres postgres    17 Mar 31 11:07 pg_notify
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_replslot
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_serial
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_snapshots
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_stat
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_stat_tmp
drwx------.  2 postgres postgres    17 Mar 31 11:07 pg_subtrans
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_tblspc
drwx------.  2 postgres postgres     6 Mar 31 11:07 pg_twophase
-rw-------.  1 postgres postgres     4 Mar 31 11:07 PG_VERSION
drwx------.  3 postgres postgres    58 Mar 31 11:07 pg_xlog
-rw-------.  1 postgres postgres    88 Mar 31 11:07 postgresql.auto.conf
-rw-------.  1 postgres postgres 22258 Mar 31 11:07 postgresql.conf

When you change that, PostgreSQL will refuse to start:

postgres@pgbox:/home/postgres/ [pg962final] chmod 770 test/
postgres@pgbox:/home/postgres/ [pg962final] pg_ctl -D test/ start
server starting
postgres@pgbox:/home/postgres/ [pg962final] FATAL:  data directory "/home/postgres/test" has group or world access
DETAIL:  Permissions should be u=rwx (0700).

Now that we have everything initialized on disk we are ready to start the instance:

postgres@pgbox:/home/postgres/ [pg962final] pg_ctl -D /home/postgres/test/ start
postgres@pgbox:/home/postgres/ [pg962final] LOG:  database system was shut down at 2017-03-31 11:07:05 CEST
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
postgres@pgbox:/home/postgres/ [pg962final] psql postgres
psql (9.6.2 dbi services build)
Type "help" for help.

postgres=

From now on you can create a database:

postgres=# create database mydb;
CREATE DATABASE
postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 mydb      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres

Check here if you want to know what the other databases are for. When you compare that to the Oracle “create database” statement, a PostgreSQL database is more like a pluggable database and you can create as many as you want. Inside the database you have the usual objects (schemata, tables, views, …). You can even change the encoding for new databases:

postgres=# create database mydb2 encoding='LATIN1' LC_COLLATE='de_CH.iso88591' LC_CTYPE='de_CH.iso88591' template=template0;
CREATE DATABASE
postgres=# \l
                                     List of databases
   Name    |  Owner   | Encoding |    Collate     |     Ctype      |   Access privileges   
-----------+----------+----------+----------------+----------------+-----------------------
 mydb      | postgres | UTF8     | en_US.UTF-8    | en_US.UTF-8    | 
 mydb2     | postgres | LATIN1   | de_CH.iso88591 | de_CH.iso88591 | 
 postgres  | postgres | UTF8     | en_US.UTF-8    | en_US.UTF-8    | 
 template0 | postgres | UTF8     | en_US.UTF-8    | en_US.UTF-8    | =c/postgres          +
           |          |          |                |                | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8    | en_US.UTF-8    | =c/postgres          +
           |          |          |                |                | postgres=CTc/postgres
(5 rows)

Important to know is that users are global, so you cannot create users inside databases, but of course you can grant access to databases to different users, and users can be owners of databases:

postgres=# alter database mydb owner to myusr;
ALTER DATABASE
postgres=# \l
                                     List of databases
   Name    |  Owner   | Encoding |    Collate     |     Ctype      |   Access privileges   
-----------+----------+----------+----------------+----------------+-----------------------
 mydb      | myusr    | UTF8     | en_US.UTF-8    | en_US.UTF-8    | 
 mydb2     | postgres | LATIN1   | de_CH.iso88591 | de_CH.iso88591 | 
 postgres  | postgres | UTF8     | en_US.UTF-8    | en_US.UTF-8    | 
 template0 | postgres | UTF8     | en_US.UTF-8    | en_US.UTF-8    | =c/postgres          +
           |          |          |                |                | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8    | en_US.UTF-8    | =c/postgres          +
           |          |          |                |                | postgres=CTc/postgres
(5 rows)
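For completeness, here is a minimal sketch of creating a dedicated user and granting it access to a database (myusr2 and the password are just placeholders I made up for the example):

# Sketch: create a global user and give it the right to connect to an existing database.
psql -U postgres -d postgres <<'SQL'
create user myusr2 with password 'change_me';
grant connect on database mydb to myusr2;
SQL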

How can you then install options into the databases? There are none, everything is included. What maybe comes closest to what Oracle calls options are extensions or modules. Some of them are provided by default and you can usually find them in the “share/extension” directory where you installed the PostgreSQL binaries:

postgres@pgbox:/u01/app/postgres/product/96/db_2/ [pg962final] ls
bin  include  lib  share
postgres@pgbox:/u01/app/postgres/product/96/db_2/ [pg962final] ls share/extension/
adminpack--1.0.sql                  hstore--1.3--1.4.sql                  pageinspect.control                      plperlu--unpackaged--1.0.sql
adminpack.control                   hstore--1.4.sql                       pageinspect--unpackaged--1.0.sql         plpgsql--1.0.sql
...

All of those can be installed per database, e.g.:

postgres@pgbox:/home/postgres/ [pg962final] psql postgres
psql (9.6.2 dbi services build)
Type "help" for help.

postgres=# \connect mydb
You are now connected to database "mydb" as user "postgres".
mydb=# create extension hstore;
CREATE EXTENSION
mydb=# \dx
                           List of installed extensions
  Name   | Version |   Schema   |                   Description                    
---------+---------+------------+--------------------------------------------------
 hstore  | 1.4     | public     | data type for storing sets of (key, value) pairs
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

mydb=# 

Others are not available by default and you usually need to download them from GitHub, e.g. cstore.

For an overview you can check the PostgreSQL Extension Network.

Conclusion: You need to use initdb to initialize all the files for your PostgreSQL instance on disk. Use pg_ctl to start your instance and then you are ready to create databases. Hope this helps for starting with PostgreSQL.

 

Cet article Can I do it with PostgreSQL? – 13 – create database est apparu en premier sur Blog dbi services.

Documentum – Increase the number of concurrent_sessions

Sat, 2017-04-01 07:01

In Documentum, there is a parameter named concurrent_sessions which basically defines how many sessions a Content Server can open simultaneously. This parameter is defined in the server.ini file (or server_<dm_server_config.object_name>.ini on a Remote Content Server) of each docbase and it has a default value of 100.

An empty Content Server with an IndexAgent and D2 installed (without any user using it) will usually take around 10 sessions for the jobs, for the searches, aso… As soon as there are users in the environment, the number of concurrent sessions will quickly grow and therefore, depending on how many users you have, you may (will) need to increase this limit. To be more precise, the concurrent_sessions controls the number of connections the server can handle concurrently. This number must take into account not only the number of users who are using the repository concurrently, but also the operations those users are executing. Some operations require a separate connection to complete. For example:

  • Issuing an IDfQuery.execute method with the readquery flag set to FALSE causes an internal connection request
  • Executing an apply method or an EXECUTE DQL statement starts another connection
  • When the agent exec executes a job, it generally requires two additional connections
  • Issuing a full-text query requires an additional connection
  • aso…

 

Do you already know how to increase the number of allowed concurrent sessions? I’m sure you do, it’s pretty easy:

  1. Calculate the appropriate number of concurrent sessions needed based on the information provided above
  2. Open the file $DOCUMENTUM/dba/config/DOCBASE/server.ini and replace “concurrent_sessions = 100″ with the desired value (“concurrent_sessions = 500″ for example)
  3. Restart the docbase DOCBASE using your custom script or the default Documentum scripts under $DOCUMENTUM/dba (steps 2 and 3 are sketched right after this list)
  4. Ensure that the Database used can handle the new number of sessions properly and see if you need to increase the sessions/processes for that
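As an illustration, here is a minimal sketch of steps 2 and 3 for a docbase named DOCBASE (500 is just an example value; dm_shutdown_DOCBASE and dm_start_DOCBASE are the default stop/start scripts):

# Sketch: bump concurrent_sessions to 500 and restart the docbase (run as the installation owner).
SERVER_INI=$DOCUMENTUM/dba/config/DOCBASE/server.ini
cp -p $SERVER_INI $SERVER_INI.$(date +%Y%m%d)                       # keep a backup first
sed -i 's/^concurrent_sessions *=.*/concurrent_sessions = 500/' $SERVER_INI
grep concurrent_sessions $SERVER_INI                                # verify the change
$DOCUMENTUM/dba/dm_shutdown_DOCBASE
$DOCUMENTUM/dba/dm_start_DOCBASE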

 

To know how many sessions are currently in use, it’s pretty simple: you can just execute the DQL “execute show_sessions”, but be aware that all sessions will be listed and that’s not exactly what we want. Therefore you need to keep only the ones with a dormancy_status of Active in the final count, otherwise the value will be wrong. The number of active sessions is not only linked to a docbase but also to a Content Server (to be more precise, it is only linked to a dm_server_config object). This means that if you have a High Availability environment, each Content Server (each dm_server_config object) will have its own number of active sessions. This also means that you need to take this into account when calculating how many concurrent sessions you need.
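A rough sketch of that count from the command line could look like the following (the idql credentials are placeholders and the grep pattern is an assumption about the output layout, so check the raw output first and adjust it):

# Sketch: run show_sessions through idql and keep only the sessions reported as Active.
idql DOCBASE -Udmadmin -Pxxx <<'EOF' > /tmp/show_sessions.out
execute show_sessions
go
EOF
grep -c " Active" /tmp/show_sessions.out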

For example if you calculated that you will need a total of 500 concurrent sessions (again it’s not the number of concurrent users!) for a docbase, then:

  • If you have only one Content Server, you will need to set “concurrent_sessions = 500″ on the server.ini file of this docbase.
  • If you have two Content Servers (two dm_server_config objects) for the docbase, then you can just set “concurrent_sessions = 275″ in each server.ini file. Yes, I know 2*275 isn’t really equal to 500, but that’s because each Content Server will need its internal sessions for the jobs, searches, aso… In addition to that, the Content Servers might need to talk to each other, so these 25 additional sessions wouldn’t really hurt.

 

Now, is the above procedure working for any value of concurrent_sessions? Well, the answer to this question is actually the purpose of this blog: yes and no. From a logical point of view, there is no restriction on this value, but from a technical point of view, there is… A few months ago at one of our customers, I was configuring a new application which had a requirement of 2150 concurrent sessions across a High Availability environment composed of two Content Servers. Based on the information provided above, I started the configuration with 1100 concurrent sessions on each Content Server to match the requirement. But then, when I tried to start the docbase again, I got the following error inside the docbase log file ($DOCUMENTUM/dba/log/DOCBASE.log):

***********************************************************

Program error: Illegal parameter value for concurrent_sessions: 1100

***********************************************************

Usage: ./documentum -docbase_name <docbase name> -init_file <filename> [-o<option>]

    -docbase_name : name of docbase
    -init_file    : name of server.ini file (including path) which
                    contains the server startup parameters

Usage: ./documentum -h

 

As you can see, the docbase refuses to start with a number of concurrent sessions set to 1100. What’s the reason behind that? There is an artificial limit set to 1020. This is actually mentioned in the documentation:
The maximum number of concurrent sessions is dependent on the operating system of the server host machine. The limit is:

  • 3275 for Windows systems
  • 1020 for Linux, Solaris, and AIX systems

 

So why is there a limit? Why 1020? This limit is linked to the FD_SETSIZE value. The documentation on FD_SETSIZE says the following:

An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. Moreover, POSIX requires fd to be a valid file descriptor.

 

Thus FD_SETSIZE doesn’t limit the number of file descriptors a process can open, but it does limit which descriptors can be used with the select() system call. Inside every UNIX process, for its PID, Documentum maintains a corresponding list (of pointers) of file descriptors. On UNIX based systems, every Documentum session is created as a separate process. Since the number of sessions that can be created directly depends on the number of file descriptors in the OS, each of these processes holds a list of file descriptors which takes a good chunk of physical memory. With this technical reasoning, the value 1020 has been set as the maximum number of concurrent sessions available by default in Documentum.

So basically this limit of 1020 has been set arbitrarily by EMC to stay within the default OS (kernel) value which is set to 1024 (this can be checked with “ulimit -Sa” for example). An EMC internal task (CS-40186) was opened to discuss this point and the possibility of increasing this maximum number. Since the current default limit is set only with regard to the default OS value of 1024, if this value is increased to 4096 for example (which was our case from the beginning), then there is no real reason to be stuck at 1020 on the Documentum side. The Engineering Team implemented a change in the binaries that allows changing the limit. This is done by adding the environment variable DM_FD_SETSIZE.

Therefore, to increase the concurrent sessions above 1020 (1100 in this example), and in addition to the steps already mentioned before, you also need to do the following (depending on your OS, you might need to update the .bashrc or .profile files instead):

echo "export DM_FD_SETSIZE=1200" >> ~/.bash_profile
source ~/.bash_profile
$DOCUMENTUM/dba/dm_start_DOCBASE

 

With this environment variable DM_FD_SETSIZE now set to 1200, we can use 1100 concurrent sessions without issue. The value that will be used for concurrent_sessions will be the one from the server.ini file; we just need to define a DM_FD_SETSIZE variable with a value equal to or bigger than what we want. Also, I didn’t mention the ulimit, but of course you also need to set the limits of your OS accordingly (on Linux this is usually done in the file /etc/security/limits.conf or inside any file under /etc/security/limits.d/).
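As an illustration, a minimal sketch of such limit entries for the installation owner could be the following (dmadmin, the file name and the values are only examples to be sized for your environment; run it as root and open a new session for the limits to be picked up):

# Sketch: raise the open files and processes limits for the Documentum installation owner.
cat >> /etc/security/limits.d/documentum.conf <<'EOF'
dmadmin   soft   nofile   4096
dmadmin   hard   nofile   4096
dmadmin   soft   nproc    4096
dmadmin   hard   nproc    4096
EOF
su - dmadmin -c "ulimit -Sa"    # check the new soft limits from a fresh session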

 

 

Cet article Documentum – Increase the number of concurrent_sessions est apparu en premier sur Blog dbi services.

Documentum – Wrong dfc versions after installation, upgrade or patch

Sat, 2017-04-01 06:09

If you are familiar with Documentum or if you have already read some of my blogs, then you probably already know that EMC sometimes has issues with libraries. In a previous blog (this one), I talked about the Lockbox versions which caused us an issue, and in this blog I will talk about DFC versions.

Whenever you install a CS patch or another patch, it will probably have its own DFC libraries, simply because EMC fixed something in them or because it was needed. Whenever you install D2, it will also have its own DFC libraries in the JMS and the WAR files. The problem is that the DFC libraries are everywhere… Each and every DFC client has its own DFC libraries which come when you install it, patch it, aso… Basically that’s not a wrong approach: it ensures that the components will work wherever they are installed and can always talk to the Content Server.

The problem here is that the DFC libraries change with almost every patch and therefore it is kind of complicated to keep a clean environment. It already happened to us that two different patches (CS and D2 for example), released on the exact same day, were using different DFC versions, and you will see below another example coming from the same package… You can live with a server having five different DFC versions, but this also means that whenever a bug impacts one of your DFC libraries, it will be hard to fix because you then need to deploy the next official patch, which is always a pain. It also multiplies the number of issues that impact your DFC versions since you are running several versions at the same time.

I’m not saying that you absolutely need to always use only the latest DFC version, but if you can properly and quickly perform the appropriate testing, I believe it can bring you something. A few weeks ago for example, one of the Application Teams we are supporting had an issue with some search functionalities in D2. This was actually caused by the DFC version bundled with D2 (DFC 7.2 P03 I think) and we solved this issue by simply using the DFC version coming from our Content Server (DFC 7.2 P05) which was only two patches above.

To quickly and efficiently see which versions of the DFC libraries you are using and where, you can use:

find <WHERE_TO_FIND> -type f -name dfc.jar -print0 | while read -d $'' file; do echo "DFC: `$JAVA_HOME/bin/java -cp "$file" DfShowVersion`  ===  Size: `ls -ks "$file" | awk '{print $1}'` kb  ===  Location: $file"; done

or

find <WHERE_TO_FIND> -type f -name dfc.jar -print0 | while IFS= read -r -d '' file; do echo "DFC: `$JAVA_HOME/bin/java -cp "$file" DfShowVersion`  ===  Size: `ls -ks "$file" | awk '{print $1}'` kb  ===  Location: $file"; done

 

You can execute these commands on a Content Server, an Application Server (note: the dfc.jar files might be inside the D2/DA war files if you aren’t using exploded deployments), a Full Text Server or any other Linux server for that matter. These commands handle spaces in the paths, even if normally you shouldn’t have any for the dfc files. To use them, you can just replace <WHERE_TO_FIND> with the base folder of your installation. This can be $DOCUMENTUM for a Content Server, $XPLORE_HOME for a Full Text Server, aso… Of course you still need to have the proper permissions to see the files, otherwise it will be quite useless to execute these commands.

A small example on a Content Server 7.3 (no patches are available yet) including xCP 2.3 P05 (End of February 2017 patch which is supposed to be for CS 7.3):

[dmadmin@content_server_01 ~]$ find $DOCUMENTUM -type f -name dfc.jar -print0 | while read -d $'' file; do echo "DFC: `$JAVA_HOME/bin/java -cp "$file" DfShowVersion`  ===  Size: `ls -ks "$file" | awk '{print $1}'` kb  ===  Location: $file"; done
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/product/7.3/install/composer/ComposerHeadless/plugins/com.emc.ide.external.dfc_1.0.0/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/temp/installer/wildfly/dctmutils/templates/dfc/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/dfc/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/modules/system/layers/base/emc/documentum/security/main/dfc.jar
DFC: 7.2.0210.0184  ===  Size: 15212 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/deployments/bpm.ear/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/deployments/ServerApps.ear/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/deployments/acs.ear/lib/dfc.jar
DFC: 7.2.0210.0184  ===  Size: 15212 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/tmp/vfs/deployment/deploymente7c710bab402b3f7/dfc.jar-7ac143a725d0471/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/tmp/vfs/deployment/deploymente7c710bab402b3f7/dfc.jar-bc760ece35b05a08/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $DOCUMENTUM/shared/wildfly9.0.1/server/DctmServer_MethodServer/tmp/vfs/deployment/deploymente7c710bab402b3f7/dfc.jar-35c79cfe4b79d974/dfc.jar

 

As you can see above, it looks like there are two different versions of the DFC library on this Content Server which has just been installed: one coming from the CS 7.3, which is therefore in 7.3 P00 build number 205, and another version which is still in 7.2 P21 build number 184. This second version has been put on the Content Server by the xCP 2.3 P05 installer. Using a 7.2 library on a 7.3 Content Server is a little bit ugly, but the good news is that they are both pretty recent versions since these two libraries were released almost at the same time (end of 2016/beginning of 2017). Therefore I don’t think it would be a big problem here, even if, as soon as the CS 7.3 P01 is out (normally at the end of this month), we will replace all dfc.jar files with the 7.3 P01 version.

Another example on a Full Text Server using xPlore 1.6 (same as before, no patches are available yet for xPlore 1.6) including one Primary Dsearch and two IndexAgents for DOCBASE1 and DOCBASE2:

[xplore@fulltext_server_01 ~]$ find $XPLORE_HOME -type f -name dfc.jar -print0 | while read -d $'' file; do echo "DFC: `$JAVA_HOME/bin/java -cp "$file" DfShowVersion`  ===  Size: `ls -ks "$file" | awk '{print $1}'` kb  ===  Location: $file"; done
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/temp/installer/wildfly/dctmutils/templates/dfc/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/dfc/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_Indexagent_DOCBASE1/tmp/vfs/deployment/deployment5417db9ca7307cfc/dfc.jar-aa1927b943be418f/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_Indexagent_DOCBASE1/deployments/IndexAgent.war/WEB-INF/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_Indexagent_DOCBASE2/tmp/vfs/deployment/deploymentbb9811e18d147b6a/dfc.jar-7347e3d3bbd8ffd/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_Indexagent_DOCBASE2/deployments/IndexAgent.war/WEB-INF/lib/dfc.jar
DFC: 7.3.0000.0196  ===  Size: 15220 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_PrimaryDsearch/tmp/vfs/deployment/deployment5fd2cff2d805ceb2/dfc.jar-29edda1355c549b8/dfc.jar
DFC: 7.3.0000.0196  ===  Size: 15220 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/server/DctmServer_PrimaryDsearch/deployments/dsearchadmin.war/WEB-INF/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/wildfly9.0.1/modules/system/layers/base/emc/documentum/security/main/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/watchdog/lib/dfc.jar
DFC: 7.3.0000.0205  ===  Size: 15236 kb  ===  Location: $XPLORE_HOME/setup/qbs/tool/lib/dfc.jar

 

Do you see something strange here? Because I do! This is a completely new xPlore 1.6 server which has just been installed and yet there are two different versions of the DFC libraries… It’s not a difference in the minor version but a difference in the build number! As you can see above, the PrimaryDsearch uses a DFC 7.3 P00 build number 196 while all the other DFC libraries are in 7.3 P00 build number 205 (just like the Content Server). The thing is that each xPlore module (IA, Dsearch, and so on) is built by a different xPlore team, so the team that packages the Dsearch libraries isn’t the same as the one that packages the IndexAgent libraries.

Since there is a difference here, it probably means that the Dsearch team built their package some days/weeks before the other teams (IA, CS, and so on) and therefore the DFC libraries included in the Dsearch are older… Is this an issue or not? According to EMC it isn’t, BUT I wouldn’t be so categorical. If EMC built this library 9 additional times, it wasn’t for nothing… There must be a reason behind those builds, so not having the latest build seems a little bit risky to me. Since this is just a sandbox environment, I will most probably wait for the P01 of xPlore 1.6, which will be released in a few days, and implement it to get an aligned DFC version for all components.

 

Have fun finding issues in the EMC releases :).

 

 

Cet article Documentum – Wrong dfc versions after installation, upgrade or patch est apparu en premier sur Blog dbi services.

Documentum – Thumbnail not working with TCS

Sat, 2017-04-01 04:41

A few months ago, right after a migration of around 0.5TB of documents, we enabled the TCS for one of our Applications. We were using a Content Server 7.2 P05 with the associated D2 4.5 P03. As already mentioned in a previous blog, D2 4.5 doesn’t handle the previews of the documents itself and therefore a Thumbnail Server was also used by this Application. The setup of the TCS for the document filestores went well without issue, but when we tried to do the same thing for the Thumbnail filestore, the previews weren’t working anymore.

Basically, when you configure the Thumbnail Server to use a TCS filestore, you need to request new renditions for the existing documents, otherwise they will continue to use the non-TCS filestore.

If you access D2 while inspecting the network traffic or using the browser dev/debug feature, you will find that D2 is building the Thumbnail preview URL using the value of the “dm_document.thumbnail_url”. This thumbnail_url is actually – by default – a concatenation of several things: thumbnail_url = base_url + path + store.

Therefore if you define:

  • the Thumbnail base_url to be “https://thumbnail_alias/thumbsrv/getThumbnail?”
  • the Thumbnail TCS filestore to be "trusted_thumbnail_store_01"
  • a test document 09012345801a8f56 with a file system path (get_path) to be: “00012345/80/00/10/4d.jpg”

Then D2 will retrieve a thumbnail_url that is:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01

 

When accessing this URL, if the filestore mentioned above is indeed a TCS filestore, this is the kind of logs that will be generated:

Jul 05, 2016 9:42:15 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_DOCBASE_THUMB] Retrieving docbase thumbnail...
Jul 05, 2016 9:42:15 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : the ETag received is the path - will return 304 response
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_NEW_REQUEST] New request: path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_DOCBASE_THUMB] Retrieving docbase thumbnail...
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_STORE] Retrieving storage area...
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : getting storage area for docbase: 00012345 store: trusted_thumbnail_store_01
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_CHECK_SECURITY] Checking security...
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : About to start reading files...
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Trusted content store found, will use DFC for decryption
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Content paramters are: format= null; page=null; page modifier=null
Jul 05, 2016 9:45:12 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: Object ID missing from request...

 

When using a non-TCS filestore, this URL retrieves the document preview properly (with the store name being "thumbnail_store_01" instead of "trusted_thumbnail_store_01", of course), but as you can see above, with a TCS filestore the Thumbnail Server tries to use the DFC for decryption and fails because the Object ID is missing in the request… With a non-TCS filestore, the Thumbnail Server retrieves the content directly from the file system, but that’s not possible with a TCS filestore because of the encryption. Therefore a TCS-enabled Thumbnail Server has to use the "getContent" method of the DFC and this method requires the Object ID. That’s the issue here.

When we faced this issue last summer, we contacted EMC because it was clearly a bug in how D2 constructs the URLs to request the preview, not in how the Thumbnail Server processes the requested URLs. After several days, the D2 engineering team provided a hotfix enabling D2 to supply the correct parameters. With the new hotfix deployed, D2 was able to generate the following URL:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

 

Note: In this case, we also had to specify the format (jpeg_lres). While waiting for the EMC hotfix, we performed some tests on our side, trying to add the missing parameters and to understand what the Thumbnail Server needs in the requests. At some point, we found out that the Thumbnail Server uses "jpeg_th" as the default value for the format if you don’t specify any. As far as I know, we have a pretty default Thumbnail configuration but we don’t have any "jpeg_th" format; only jpeg_lres and jpeg_story are used, so I don’t know where this jpeg_th is coming from. I believe that’s an issue of the Thumbnail Server itself, because EMC included the format in the D2 hotfix too after we mentioned it to them.

 

So using this new URL, the generated log files were:

Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_NEW_REQUEST] New request: path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_DOCBASE_THUMB] Retrieving docbase thumbnail...
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_RETRIEVE_STORE] Retrieving storage area...
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : getting storage area for docbase: 00012345 store: trusted_thumbnail_store_01
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : [DM_TS_T_CHECK_SECURITY] Checking security...
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : About to start reading files...
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Trusted content store found, will use DFC for decryption
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Content paramters are: format= jpeg_lres; page=null; page modifier=null
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : will be retrieving object 09012345801a8f56
Jul 12, 2016 6:29:53 AM org.apache.catalina.core.ApplicationContext log
INFO: getThumbnail: [DEBUG] : Session ok for 00012345

 

And it worked; the preview of the first page of this document was displayed. The hotfix seemed to work, so we started to deploy it in our DEV environments while waiting for a confirmation from the Application Team that it was also working from D2 (because of SoD, our admin accounts don’t have access to documents in D2). But after a few tests, the App Team found out that the hotfix was actually only partially working: only for the first page! Indeed EMC created this hotfix using a one-page document and never tried to retrieve the previews of a multi-page document. The thing with the Thumbnail Server is that if you have a 25-page document, then this document will have 25 previews. I already talked about that in a previous blog, so you can take a look at it for more information on how to manage that.

I will suppose that 09012345801a8f56 is a 3-page document. I gathered some information about this document and also got the paths of the thumbnail previews related to its 3 pages:

API> ?,c,select r_object_id, full_format, parent_id, content_size, full_content_size, set_time, set_file from dmr_content where any parent_id='09012345801a8f56'
r_object_id       full_format  parent_id         content_size  full_content_size  set_time           set_file
----------------  -----------  ----------------  ------------  -----------------  -----------------  --------------------------------------------------------------------
06012345801d95cc  jpeg_lres    09012345801a8f56  60467         60467              7/4/2016 13:22:39  C:\Users\SYS_AD~1\AppData\Local\Temp\batchFile353776780929575961.tar
	get_path => /data/DOCBASE/trusted_thumbnail_storage_01/00012345/80/00/10/4d.jpg

06012345801d95cd  jpeg_lres    09012345801a8f56  138862        138862             7/4/2016 13:22:39  C:\Users\SYS_AD~1\AppData\Local\Temp\batchFile353776780929575961.tar
	get_path => /data/DOCBASE/trusted_thumbnail_storage_01/00012345/80/00/10/4e.jpg

06012345801d95ce  jpeg_lres    09012345801a8f56  29596         29596              7/4/2016 13:22:39  C:\Users\SYS_AD~1\AppData\Local\Temp\batchFile353776780929575961.tar
	get_path => /data/DOCBASE/trusted_thumbnail_storage_01/00012345/80/00/10/4f.jpg

 

So here we have three different jpg files for the same "parent_id" (the document) and each one is a preview of a specific page. These previews have different sizes (content_size/full_content_size), therefore the previews should indeed be different!

With the information provided above, the preview URLs generated by D2 (using the new hotfix) are:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4e.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres


https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4f.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

 

Three different URLs for three different pages. That looks fine, but actually if you access these three URLs, you will always see the preview of the first page of the document… And it goes even beyond that: you can put any path in the URL and it will still show the first page of the document (09012345801a8f56). To confirm that, you can access the following URLs and see for yourself that they also display the same preview:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/zz.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/zz/zz/zz/zz.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres


https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres

 

Based on all the above information, you can understand that the Thumbnail Server works quite differently with a non-TCS and with a TCS filestore… For non-TCS filestores, the URL points directly to the actual file and that’s the reason why the full path is so important: it leads directly to the right file and each page has its own path. For TCS filestores, since all the files are encrypted, the Thumbnail Server can’t use the file path directly. Therefore it relies on the DFC to first decrypt the file and then return the result. That’s the reason why only the docbase ID is needed inside the "path" when using a TCS and why everything else in the path is completely useless. On the other hand, and as seen previously in this blog, you of course also need some additional information to specify which preview you want.

In addition to the parameters we saw above, and in order to uniquely identify the different renditions, page and/or page_modifier is also required with a TCS filestore. Both attributes are part of the dmr_content table: page is the position of the content when the object has multiple contents (generally it will be zero by default) while page_modifier uniquely identifies a rendition within the same page number and format for a document. If no/null page_modifier is passed in the URL, then the rendition with an empty page_modifier value is returned. If there isn’t any rendition in the docbase without a page_modifier value, then the one with the smallest page_modifier (in alphabetical order) is returned instead: that’s the reason why all the previous URLs always returned the preview of the first page only… In short, for non-TCS filestores the path pretty much does the job, but for TCS filestores we need additional parameters to uniquely identify the rendition.
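
To check which page_modifier values exist for the renditions of a document, you can simply query the dmr_content objects. This is just an illustrative DQL statement reusing the document ID from above, so adapt it to your own repository:

API> ?,c,select r_object_id, full_format, page_modifier from dmr_content where any parent_id='09012345801a8f56' and full_format='jpeg_lres'

The page_modifier values returned by this query are the ones that need to go into the Thumbnail URLs.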

 

So to summarize the investigation we did with EMC (because yes we found a definitive solution), the minimum elements that should be present in the Thumbnail URLs are:

  • non-TCS filestore:
    • Full path to the right file
    • Store used
  • TCS filestore:
    • Path containing at least the Docbase ID
    • Store used
    • Document ID (parent_id)
    • Format of the preview
    • page and/or page_modifier (from the dmr_content table)

 

In the end, EMC provided another hotfix which fixes the first one. The complete URLs are now generated by D2 and the previews work properly, even with a TCS Thumbnail filestore. Here are the examples for the 3-page document:

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4d.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres&page_modifier=000000001

https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4e.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres&page_modifier=000000002


https://thumbnail_alias/thumbsrv/getThumbnail?path=00012345/80/00/10/4f.jpg&store=trusted_thumbnail_store_01&did=09012345801a8f56&format=jpeg_lres&page_modifier=000000003

 

The definitive hotfix should normally be included in all the 2017 releases according to EMC.

 

 

Cet article Documentum – Thumbnail not working with TCS est apparu en premier sur Blog dbi services.

Can I do it with PostgreSQL? – 12 – dbca (database configuration assistant)

Fri, 2017-03-31 12:59

Short answer: No. Long answer: No. Every other possible answer: No. There is no such tool in PostgreSQL.
In Oracle you can use the database configuration assistant to create your Oracle database. But this is not a must, you can do the same using scripts. Either you use dbca to generate scripts based on the answers you gave on the various screens, or you use your own set of scripts to create an Oracle database (you can use dbca templates in addition, or invoke dbca in silent mode and pass your options on the command line). In PostgreSQL the steps to create an instance are totally different, no surprise here. It already starts with the wording: in PostgreSQL you do not create a database, you create a database cluster. Confused? This has nothing to do with multiple nodes or multiple instances as you know them from Oracle (e.g. Real Application Clusters). A database cluster simply means the files on disk (a collection of databases) that are managed by a set of background processes. Cluster just means: belonging together (there are many types of clusters, actually). I’ll use the term database instance from now on. So how can you create a database instance after you installed the PostgreSQL binaries?

When you use the graphical installer provided to the community by EnterpriseDB, a PostgreSQL instance is created automatically after the binaries have been installed. This creates a service on Windows and startup/shutdown scripts on Linux based operating systems (not sure about Mac OS). On Linux, when you install the standard way, the installer requires root privileges as it needs to add to the system configuration the scripts that start up and shut down PostgreSQL automatically when the server boots or goes down. Once you have started the installer it is just a matter of clicking through the screens:

(installer screenshots: pg_install_1 through pg_install_9, one per wizard screen)
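
As a side note, the same installer can also be run unattended for scripted deployments. The installer file name and the options below are assumptions based on the usual EnterpriseDB/BitRock installer switches, so double check them with --help before relying on them:

./postgresql-9.6.2-1-linux-x64.run --mode unattended --prefix /opt/PostgreSQL/9.6 --datadir /opt/PostgreSQL/9.6/data --superpassword "******"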

What happened in the background is that the PostgreSQL instance was created and started:

postgres@pgbox:/home/postgres/ [pg960final] ps -ef | grep postgres | egrep -v "ssh|bash|ps"
postgres  3412     1  0 07:30 ?        00:00:00 /opt/PostgreSQL/9.6/bin/postgres -D /opt/PostgreSQL/9.6/data
postgres  3413  3412  0 07:30 ?        00:00:00 postgres: logger process   
postgres  3415  3412  0 07:30 ?        00:00:00 postgres: checkpointer process   
postgres  3416  3412  0 07:30 ?        00:00:00 postgres: writer process   
postgres  3417  3412  0 07:30 ?        00:00:00 postgres: wal writer process   
postgres  3418  3412  0 07:30 ?        00:00:00 postgres: autovacuum launcher process   
postgres  3419  3412  0 07:30 ?        00:00:00 postgres: stats collector process   
postgres  3805  3579  0 07:56 pts/0    00:00:00 grep --color=auto postgres

In addition the installer created a systemd service (I am on CentOS) so your instance starts up and shuts down automatically with the server (and a postgres user and group, of course):

postgres@pgbox:/home/postgres/ [pg960final] systemctl status postgresql-9.6.service
● postgresql-9.6.service - PostgreSQL 9.6 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-9.6.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2017-03-31 07:30:01 CEST; 28min ago
  Process: 3410 ExecStart=/opt/PostgreSQL/9.6/bin/pg_ctl start -w -t ${TimeoutSec} -D /opt/PostgreSQL/9.6/data -l /opt/PostgreSQL/9.6/data/pg_log/startup.log (code=exited, status=0/SUCCESS)
 Main PID: 3412 (postgres)
   CGroup: /system.slice/postgresql-9.6.service
           ├─3412 /opt/PostgreSQL/9.6/bin/postgres -D /opt/PostgreSQL/9.6/data
           ├─3413 postgres: logger process   
           ├─3415 postgres: checkpointer process   
           ├─3416 postgres: writer process   
           ├─3417 postgres: wal writer process   
           ├─3418 postgres: autovacuum launcher process   
           └─3419 postgres: stats collector process   

From now on you can connect to the instance and start your journey:

postgres@pgbox:/home/postgres/ [pg960final] /opt/PostgreSQL/9.6/bin/psql postgres
Password: 
psql.bin (9.6.2)
Type "help" for help.

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(3 rows)

postgres=#

But this is not the only method for creating a PostgreSQL instance. On Linux most distributions provide PostgreSQL packages in their repositories which you can then install using yum (on RedHat based systems) or apt (on Debian based systems). There are others as well, such as YaST on SuSE, but yum and apt are the most popular, I believe. In my case, using the official CentOS repositories, I can use yum to check what CentOS provides:

postgres@pgbox:/home/postgres/ [pg960final] yum search postgresql
...
postgresql.i686 : PostgreSQL client programs
postgresql.x86_64 : PostgreSQL client programs
postgresql-contrib.x86_64 : Extension modules distributed with PostgreSQL
postgresql-devel.i686 : PostgreSQL development header files and libraries
postgresql-devel.x86_64 : PostgreSQL development header files and libraries
postgresql-docs.x86_64 : Extra documentation for PostgreSQL
postgresql-jdbc.noarch : JDBC driver for PostgreSQL
postgresql-jdbc-javadoc.noarch : API docs for postgresql-jdbc
postgresql-libs.i686 : The shared libraries required for any PostgreSQL clients
postgresql-libs.x86_64 : The shared libraries required for any PostgreSQL clients
postgresql-odbc.x86_64 : PostgreSQL ODBC driver
postgresql-plperl.x86_64 : The Perl procedural language for PostgreSQL
postgresql-plpython.x86_64 : The Python2 procedural language for PostgreSQL
postgresql-pltcl.x86_64 : The Tcl procedural language for PostgreSQL
postgresql-server.x86_64 : The programs needed to create and run a PostgreSQL server
postgresql-test.x86_64 : The test suite distributed with PostgreSQL
postgresql-upgrade.x86_64 : Support for upgrading from the previous major release of PostgreSQL
...

The packagers decided to split PostgreSQL into several packages as you can see above. If you only want to install the core server you would:

[root@centos7 ~] yum install postgresql-server.x86_64

In this case the instance will not be created automatically:

[root@centos7 ~] ps -ef | grep postgres
root      9981  4558  0 08:18 pts/0    00:00:00 grep --color=auto postgres

But, as with the graphical installer above, a systemd service was created for you:

[root@centos7 ~] systemctl list-unit-files | grep postgres
postgresql.service                            disabled

Then we can just enable and start the service?

[root@centos7 ~] systemctl enable postgresql.service
Created symlink from /etc/systemd/system/multi-user.target.wants/postgresql.service to /usr/lib/systemd/system/postgresql.service.
[root@centos7 ~] systemctl start postgresql.service
Job for postgresql.service failed because the control process exited with error code. See "systemctl status postgresql.service" and "journalctl -xe" for details.

Not really. What the installation with yum did not do for you is create the instance. The good thing with the CentOS provided packages is that you get a helper for that which is not there by default. For creating the instance you would:

-bash-4.2$ sudo postgresql-setup initdb
Initializing database ... OK

… and your instance got created but not yet started. To start it:

-bash-4.2$ sudo systemctl start postgresql.service
-bash-4.2$ psql postgres
psql (9.2.18)
Type "help" for help.

Quite easy. But here you can see one issue with the packages provided by the Linux distributions: what was installed in my case is PostgreSQL 9.2.18 and there are some points to consider with that. The latest PostgreSQL 9.2 release is 9.2.20 (as of today), so you miss 2 minor versions containing bug fixes. Even more important, PostgreSQL 9.2 will go out of support this September because it was released almost 5 years ago; the current release is 9.6.2 (supported until September 2021). Not a good choice for running a production server.

Luckily there is another possibility: The PostgreSQL community provides repositories as well. There is one for yum based systems and one for apt based systems. For getting the yum based repositories into CentOS you would (for PostgreSQL 9.6):

[root@centos7 ~] wget https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-centos96-9.6-3.noarch.rpm
[root@centos7 ~] yum localinstall pgdg-centos96-9.6-3.noarch.rpm 

When you check what is available now you’ll find this:

postgresql96.x86_64 : PostgreSQL client programs and libraries
postgresql96-contrib.x86_64 : Contributed source and binaries distributed with PostgreSQL
postgresql96-debuginfo.x86_64 : Debug information for package postgresql96
postgresql96-devel.x86_64 : PostgreSQL development header files and libraries
postgresql96-docs.x86_64 : Extra documentation for PostgreSQL
postgresql96-libs.x86_64 : The shared libraries required for any PostgreSQL clients
postgresql96-odbc.x86_64 : PostgreSQL ODBC driver
postgresql96-plperl.x86_64 : The Perl procedural language for PostgreSQL
postgresql96-plpython.x86_64 : The Python procedural language for PostgreSQL
postgresql96-pltcl.x86_64 : The Tcl procedural language for PostgreSQL
postgresql96-server.x86_64 : The programs needed to create and run a PostgreSQL server
postgresql96-tcl.x86_64 : A Tcl client library for PostgreSQL
postgresql96-tcl-debuginfo.x86_64 : Debug information for package postgresql96-tcl
postgresql96-test.x86_64 : The test suite distributed with PostgreSQL

Installation is the same as with the CentOS repositories above:

[root@centos7 ~] yum install postgresql96-server.x86_64
[root@centos7 ~] systemctl list-unit-files | grep postgres
postgresql-9.6.service    
[root@centos7 ~] systemctl enable postgresql-9.6.service
Created symlink from /etc/systemd/system/multi-user.target.wants/postgresql-9.6.service to /usr/lib/systemd/system/postgresql-9.6.service.
[root@centos7 ~] ls /usr/pgsql-9.6/bin/initdb ^C
[root@centos7 ~] su - postgres
Last login: Fri Mar 31 08:29:10 CEST 2017 on pts/1
-bash-4.2$ /usr/pgsql-9.6/bin/initdb
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/pgsql/9.6/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/pgsql-9.6/bin/pg_ctl -D /var/lib/pgsql/9.6/data -l logfile start

-bash-4.2$ sudo systemctl start postgresql-9.6.service
-bash-4.2$ /usr/pgsql-9.6/bin/psql postgres
psql (9.6.2)
Type "help" for help.

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(3 rows)

And ready. The packages provided by the PostgreSQL community are a much better choice than the ones provided by your distributions because here you get the latest ones.

What we did above as an extra step was calling a utility named "initdb". initdb is the tool you use to create the PostgreSQL instance after you installed the binaries. The graphical installer uses initdb in the background as well, as there is no other way to do it. When you compile and install PostgreSQL from source you also need to use initdb to create your PostgreSQL instance.
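
Just to give an idea, a manual initdb call with a few common options could look like this (paths reused from the yum based installation above; see initdb --help for the full list of options):

/usr/pgsql-9.6/bin/initdb -D /var/lib/pgsql/9.6/data --encoding=UTF8 --locale=en_US.UTF-8 --data-checksums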

Conclusion: Depending on how you install PostgreSQL onto your system the PostgreSQL instance is created automatically (when you use the installer provided by EnterpriseDB), almost automatically when you use the packages provided by your distribution and almost automatically when you use the packages provided directly by the PostgreSQL community. In the next post we’ll look at initdb in more detail.

 

Cet article Can I do it with PostgreSQL? – 12 – dbca (database configuration assistant) est apparu en premier sur Blog dbi services.

SQL Server on Linux and experimental Flexible Architecture

Fri, 2017-03-31 06:28

In this blog post let’s talk a little bit about SQL Server optimal installation on the Linux file system. This is probably one of the first considerations we will have as a database administrator. Actually basic installation of SQL Server can be resumed to use yum command and let the SQL Server installer place all related files on the file system. But in a production environment, the story will be different and we will have to follow best practices. In fact, Microsoft guidelines already exist in Windows world and concern mainly placement of system / user database files and tempdb as well.

But what about Linux? In my opinion, the game is not so different and existing database file placement rules also apply on Linux. We don’t work with drive letters on Linux but it doesn’t matter because in some cases we already use the same mount point concept.

How to start and how to achieve an optimal placement have been the questions in my head since the last meetup I attended about MongoDB, organized by my colleagues Medhi Bada and Eric Dusquesnoy.

During the meetup, they talked about installing MongoDB according to the OFA architecture initiated by Oracle, which has many aspects in common with the FHS on Unix systems. Another interesting thing I realized after discussing with my colleagues is that we propose the same base model for other DBMSs like MySQL or PostgreSQL (with some adjustments for their respective contexts).

In a nutshell, OFA architecture provides the following interesting things:

  • A logical layout for the databases that helps DBAs manage their systems. The idea here is to provide a predictable and consistent file naming structure to ease administration and to deal with issues such as adding hardware, installing a new instance, etc.
  • Preventing performance bottlenecks by distributing I/O across enough disks. For instance, we may separate groups of entities that will contend for disk resources. Think about separating data and log files, placing data files in different filegroups or dedicating tempdb to its own storage path.
  • Maximizing database reliability in case of disk failure

But you may wonder why we should extend the existing SQL Server best practices to the OFA standard.

First of all, let’s say that existing Microsoft best practices, in many respects, resembles to OFA. Then providing a uniform approach to customers (Oracle, MySQL, PostgreSQL, MongoDB and SQL Server) about best practices may help to contribute to a true immersion of SQL Server on Linux. From a customer perspective, we may provide guidelines on an already widely architecture standard known in Linux world. This is, at least, the way I see it.

So just to be clear, the next part of this blog is purely experimental and doesn’t pretend to replace any future guidelines provided by Microsoft.

Before getting to the heart of the matter, let’s look at the files associated with the mssql-server package. I continue to use the CentOS distribution for my demo but you can easily apply the same to other distributions.

[root@sql mssqlserver]# rpm -qpl mssql-server-14.0.405.200-1.x86_64.rpm
/opt/mssql/bin
/opt/mssql/bin/compress-dump.sh
/opt/mssql/bin/generate-core.sh
/opt/mssql/bin/mssql-conf
/opt/mssql/bin/paldumper
/opt/mssql/bin/sqlpackage
/opt/mssql/bin/sqlservr
/opt/mssql/lib
/opt/mssql/lib/libc++.so.1
/opt/mssql/lib/libc++abi.so.1
/opt/mssql/lib/libjemalloc.so.1
/opt/mssql/lib/libsqlvdi.so
/opt/mssql/lib/libunwind.so.8
/opt/mssql/lib/mssql-conf
/opt/mssql/lib/mssql-conf/checkinstall.sh
/opt/mssql/lib/mssql-conf/checkrunninginstance.sh
/opt/mssql/lib/mssql-conf/collations.txt
/opt/mssql/lib/mssql-conf/disableservice.sh
/opt/mssql/lib/mssql-conf/enableservice.sh
/opt/mssql/lib/mssql-conf/mssql-conf.py
/opt/mssql/lib/mssql-conf/mssqlconfhelper.py
/opt/mssql/lib/mssql-conf/mssqlsettings.py
/opt/mssql/lib/mssql-conf/mssqlsettingsmanager.py
/opt/mssql/lib/mssql-conf/sapassword.sh
/opt/mssql/lib/mssql-conf/set-collation.sh
/opt/mssql/lib/mssql-conf/startservice.sh
/opt/mssql/lib/mssql-conf/stopservice.sh
/opt/mssql/lib/sqlpackage.sfp
/opt/mssql/lib/sqlservr.sfp
/opt/mssql/lib/system.certificates.sfp
/opt/mssql/lib/system.common.sfp
/opt/mssql/lib/system.netfx.sfp
/opt/mssql/lib/system.sfp
/usr/lib/systemd/system/mssql-server.service
/usr/share/doc/mssql-server/LICENSE.TXT
/usr/share/doc/mssql-server/THIRDPARTYNOTICES.TXT
/usr/share/man/man1/mssql-conf.1.gz
/usr/share/man/man1/sqlpackage.1.gz
/usr/share/man/man1/sqlservr.1.gz

From the above output we can identify a bunch of files that are needed for the proper functioning of SQL Server. For example, we notice different script files (both bash and python scripts), libraries and sfp files (which are part of the new Drawbridge-based architecture?), man pages and finally the sqlservr binary.

Most of these files are installed in the main hierarchy /opt/mssql/ with the following specific items

  • /opt/mssql/bin/ (binary files and SQL Server crash dump generation scripts)
  • /opt/mssql/lib (sqlserver libraries and sfp files)
  • /opt/mssql/lib/mssql-conf (python and bash scripts)

In addition, installing a SQL Server instance reveals an additional hierarchy /var/opt/mssql with the following items:

  • data folder = default folder for database data and transaction log files. By the way, the system and tempdb database files are located here by default. At the time of writing, SQL Server is in CTP 1.4 and moving the system databases is not supported, but we may expect this limitation to be removed in the future
  • log folder = log files are stored here. We find logs related to the SQL Server engine (errorlog files), to the SQL Server agent (SQLAGENT.OUT), to the default SQL Server trace and to the HK engine extended events.
  • secret folder = contains the machine.key file used by the SQL Server engine, or other files that come with high-availability architectures, for instance to store pacemaker credentials.
  • mssql.conf file = stores the default locations for user database files and dump files.

The last one – mssql.conf – drew my attention. According to my tests, it is possible to modify this file manually without side effects, but take that with a pinch of salt at this stage.

So here is my flexible architecture proposal for SQL Server:

  • Binary structure
mkdir -p /u00/app/sqlserver
mkdir -p /u00/app/sqlserver/product/14/mssqlserver
mkdir -p /u00/app/sqlserver/local/dmk
mkdir -p /u00/app/sqlserver/local/mssqltools
mkdir -p /u00/app/sqlserver/admin/mssqlserver/etc
mkdir -p /u00/app/sqlserver/admin/mssqlserver/log
mkdir -p /u00/app/sqlserver/admin/mssqlserver/dump

ln -s /opt/mssql/bin /u00/app/sqlserver/product/14/mssqlserver
ln -s /var/opt/mssql/mssql.conf /u00/app/sqlserver/admin/mssqlserver/etc/
ln -s /var/opt/mssql/log/ /u00/app/sqlserver/admin/mssqlserver/log 

Product files consist of the SQL Server software supplied in the media / package provided by Microsoft.

Administrative files are files containing data about the instance, including SQL Server and SQL Agent error log files, default traces or HK engine extended event files, server process dump files and finally instance parameter files like the mssql.conf file.

  • Database structure
mkdir -p /u01/sqlserverdata/mssqlserver
mkdir -p /u02/sqlserverlog/mssqlserver
mkdir -p /u03/sqlservertempdb/mssqlserver
mkdir -p /u98/sqlserver/backup/mssqlserver

Then let’s configure the correct permissions on the new hierarchy folders

chown -R mssql:mssql /u01/sqlserverdata/mssqlserver
chmod -R 750 /u01/sqlserverdata/mssqlserver
chown -R mssql:mssql /u02/sqlserverlog/mssqlserver
chmod -R 750 /u02/sqlserverlog/mssqlserver
chown -R mssql:mssql /u03/sqlservertempdb/mssqlserver
chmod -R 750 /u03/sqlservertempdb/mssqlserver
chown -R mssql:mssql /u98/sqlserver/backup/mssqlserver
chmod -R 750 /u98/sqlserver/backup/mssqlserver

After configuring the permissions, let’s change the default path parameters with the mssql-conf utility

/opt/mssql/bin/mssql-conf set filelocation.defaultdatadir /u01/sqlserverdata/mssqlserver 
/opt/mssql/bin/mssql-conf set filelocation.defaultlogdir /u02/sqlserverlog/mssqlserver 
/opt/mssql/bin/mssql-conf set filelocation.defaultbackupdir /u98/sqlserver/backup/mssqlserver
/opt/mssql/bin/mssql-conf set filelocation.defaultdumpdir /u00/app/sqlserver/admin/mssqlserver/dump
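
Assuming the section.key naming of the parameters above maps directly to the file layout (an assumption on my side, so verify it on your own system), the mssql.conf file should now contain entries like:

[filelocation]
defaultdatadir = /u01/sqlserverdata/mssqlserver
defaultlogdir = /u02/sqlserverlog/mssqlserver
defaultbackupdir = /u98/sqlserver/backup/mssqlserver
defaultdumpdir = /u00/app/sqlserver/admin/mssqlserver/dump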

And let’s move tempdb database files by connecting to SQL Server (moving tempdb is supported since the CTP 1.4).

DECLARE @sql NVARCHAR(MAX) = N'';
DECLARE @path_name NVARCHAR(MAX) = N'/u03/sqlservertempdb/mssqlserver/';

USE tempdb;


SELECT 
	@sql += N'ALTER DATABASE [tempdb] MODIFY FILE (NAME = ' + name + ', FILENAME = ''' 
	        + @path_name + REVERSE(SUBSTRING(REVERSE(REPLACE(physical_name, 'C:\', '')), 1 , CHARINDEX('\', REVERSE(REPLACE(physical_name, 'C:\', ''))) - 1)) + ''
			+ ''')' + CHAR(13)
FROM sys.database_files;


--PRINT @sql

EXEC sp_executesql @sql;
GO

And finally let’s restart the SQL Server instance:

systemctl restart mssql-server

Moreover, we may also add the binary structure to the system path and change it when a new product is installed

echo 'PATH="$PATH:/u00/app/sqlserver/product/14/mssqlserver/bin"' >> ~/.bashrc
source ~/.bashrc

Invoking mssql-conf is pretty easy in this case as shown below:

[root@sql home]# mssql-conf [options]

Finally let’s have a look at the database files placement :

SELECT 
	DB_NAME(database_id) AS database_name,
	name AS logical_name,
	physical_name
FROM sys.master_files
ORDER BY database_id

(screenshot: query output showing the new database file placement)

Drive letters and Windows-style paths are still present for the system databases but no doubt this will disappear in a future release.

Bottom line

Keep in mind that the OFA architecture is very flexible and provides only guidelines and best practices. Therefore, we will be able to make adjustments over time as the product evolves. Please feel free to comment, it will always be appreciated!

See you

 

 

Cet article SQL Server on Linux and experimental Flexible Architecture est apparu en premier sur Blog dbi services.

GoldenGate 12.2: ERROR: Unable to create a new wallet

Thu, 2017-03-30 10:42

Today I had a very strange issue when setting up a brand new GoldenGate 12.2 installation. Installation went fine, patch went fine (PSU 170221) and then I started to configure GoldenGate. One of the first commands we usually do is:

GGSCI (xxxxx) 2> create wallet

I really never had any issues with this, but today I got:

GGSCI (psrdbap03n1) 2> create wallet

ERROR: Unable to create a new wallet.

What a great error message and it was not much more verbose in the GoldenGate logfile:

2017-03-30 15:20:06  INFO    OGG-02096  Oracle GoldenGate Command Interpreter for Oracle:  ERROR: Unable to create a new wallet.

So, what to do? My first thought was:

oracle@xxx]~[xxx]> ps -ef | grep ggscs
oracle    7683 30606  0 15:29 pts/2    00:00:00 grep ggscs
[oracle@xxx]~[xxx]> ps -ef | grep ggs
oracle    1261 24160  0 15:19 pts/0    00:00:00 ./ggsci
oracle    7698 30606  0 15:29 pts/2    00:00:00 grep ggs
[oracle@xxx]~[xxx]> strace -fp 1261

… and then run the command again and check the strace output. But there was not much to look at. GoldenGate tried to stat the wallet files but, as they did not exist yet, these calls failed, which is fine:

[pid  1261] stat("dirwlt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid  1261] stat("dirwlt/cwallet.sso", 0x7fff88bd2f30) = -1 ENOENT (No such file or directory)
[pid  1261] stat("dirwlt/cwallet.lck", 0x7fff88bd28f0) = -1 ENOENT (No such file or directory)
[pid  1261] open("dirwlt/cwallet.lck", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 6
[pid  1261] close(6)                    = 0
[pid  1261] open("dirwlt/cwallet.lck", O_RDWR) = 6

Some lines later:

[pid  1262] futex(0x1e68a64, FUTEX_WAIT_PRIVATE, 123, {0, 982142} 
[pid  1261] write(1, "ERROR: Unable to create a new wa"..., 37) = 37
[pid  1261] write(1, "\n", 1)    

And then many, many of these:

[pid  1262] futex(0x1e68a64, FUTEX_WAIT_PRIVATE, 163, {0, 63977012}) = -1 ETIMEDOUT (Connection timed out)
[pid  1262] futex(0x1e68a38, FUTEX_WAKE_PRIVATE, 1) = 0

In short, I had no clue what the issue was (the many futex calls did not help me). The next idea was to enable debugging for GoldenGate. This can be done by creating a file like this in the GoldenGate Home (a picture again, sorry, the xml tags destroy the formatting; let me know if you want to have the file):
(image: ogg-debug, the xml debug configuration file)

This will create a file called “gglog-ggsci.log” in the GoldenGate Home which contains more information about what was going on. But the only two lines which showed “ERROR” were these:

2017-03-30 16:09:31.015 ERROR|ggsec.CWallet                 |main| 1090 ggsec/CWallet.cpp        | initNZCtx: Error initializing the nz context. NZERROR 43490
2017-03-30 16:09:31.015 ERROR|ggsec.CWallet                 |main|  439 ggsec/CWallet.cpp        | create: Could not initialize network security context.

Hm, this does not help me either (MOS did not come up with a single note about this error). To make it short: this happened because I had the ORACLE_HOME environment variable set to the GoldenGate Home:

[oracle@xxxxxxx]/u01/app/ogg/ch_src/product/12.2.0.1.170221/via[p1viach1]> echo $ORACLE_HOME
/u01/app/ogg/ch_src/product/12.2.0.1.170221/via

As soon as I changed this to a database home everything was fine again:

[oracle@xxx]/u01/app/ogg/ch_src/product/12.2.0.1.170221/via[p1viach1]> export ORACLE_HOME=/u01/app/oracle/product/11.2.0.3/dbhome_1/
[oracle@xxx]/u01/app/ogg/ch_src/product/12.2.0.1.170221/via[p1viach1]> ./ggsci 

Oracle GoldenGate Command Interpreter for Oracle
Version 12.2.0.1.170221 OGGCORE_12.2.0.1.0OGGBP_PLATFORMS_170123.1033_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Jan 23 2017 21:54:15
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2017, Oracle and/or its affiliates. All rights reserved.

GGSCI (xxx) 1> create wallet

Created wallet at location 'dirwlt'.

Opened wallet at location 'dirwlt'.

GGSCI (xxx) 2> 

In case you ask yourself why I had ORACLE_HOME pointing to the GoldenGate Home: I applied the latest PSU just before and, before invoking opatch, I set ORACLE_HOME to the GoldenGate Home. When you do the same in GoldenGate 12.1, and even in the GoldenGate 12.2 base release, you will not hit this. Currently there is a bug open with Oracle, let’s see what happens :)
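
Until this is fixed, a small guard before starting ggsci can save some head scratching. This is only a sketch: OGG_HOME is a hypothetical variable pointing to your GoldenGate Home and the database home path is the one used above:

if [ "$ORACLE_HOME" = "$OGG_HOME" ]; then
  echo "ORACLE_HOME points to the GoldenGate Home, switching to the database home"
  export ORACLE_HOME=/u01/app/oracle/product/11.2.0.3/dbhome_1
fi
cd "$OGG_HOME" && ./ggsci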

 

Cet article GoldenGate 12.2: ERROR: Unable to create a new wallet est apparu en premier sur Blog dbi services.
