Yann Neuhaus

dbi services technical blog

WebLogic – SSO/Atn/Atz – 403 Forbidden, a first issue

Sat, 2017-12-02 02:00

In a previous blog, I explained how to enable the SSO/Atn/Atz (SSO/Authentication/Authorization) debug logs in order to troubleshoot an issue. In this blog, I will show the logs generated by an issue that I had to deal with last month at one of our customers. This issue will probably not occur very often, but it is a pretty funny one, so I wanted to share it!

The issue I will talk about in this blog happened on an environment configured with SAML2 SSO. With a fully working SAML2 SSO, the WebLogic Server hosting the application redirects the end-user to the IdP Partner (with a SAML2 request), which processes it and then redirects the end-user back to the WebLogic Server (with the SAML2 response), which processes the response and finally grants access to the application. In this case, both redirections were apparently happening properly, but then, for an unknown reason, the WebLogic Server was blocking access to the application with a “403 – Forbidden” message.

Obviously, the first thing I did was to enable the debug logs and then replicate the issue. These are the logs that I could see in the Managed Server log file:

<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <SAML2Servlet: Processing request on URI '/saml2/sp/acs/post'>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <getServiceTypeFromURI(): request URI is '/saml2/sp/acs/post'>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <getServiceTypeFromURI(): service URI is '/sp/acs/post'>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <getServiceTypeFromURI(): returning service type 'ACS'>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <Assertion consumer service: processing>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <get SAMLResponse from http request:PBvbnNSJ1cXMlFtZzaXM6bmHhtbG5zm46NhwOlJlOnNhbWxHbWxIb2Fc3wP6dGM6
U0FiB4bWxuczp4NTAwPSJ1cm46b2FzNTDoyLjA6YXNzZXJ0aW9uIaXM6bmFtZXM6
U0FdG9jb2wiIHhtbG5zOmRzaWc9Imh0dHA6Ly93NTDoyLjA6cHJvd3cudzMub3Jn
aHR0cDoa5vcmcvMjAwMS9W5zdGFuY2vL3d3dy53MyYTUxTY2hlbWEtUiIERlc3Rp
MWNxM2FjNzI1ZDjYmIVhNDM1Zlzc3VlSW5zdGFudD0ijhlNjc3OTkiIEMjAxNy0x
LzINpZyMiIHhtwMDAvMDkveG1sZHbG5DovL3d3dy53My5vczOmVuYz0iaHR0cmcv
MMS8wNC94bWxjAwlbmWxuczpzYW1sPMjIiB4bSJ1aXM6bmFtZXM6dcm46b2FzGM6
dGdXRlOM6U0FNZXM6YXR0cmliTDoyLjAAiIHhtbG5zOnhzaT6cHJvZmlslg1MD0i
bmF0aW9JodHRwczovL5ldS5uPSub3Zhc3BoY2hicy1DIyMinRpcyzdDIyM5uZXQ6
ODA4NSMvcG9zdCI9zYW1sMi9zcC3SUhwRHRuN3I1WH9hY3gSUQ9ImlkLUhhTTFha
Z3hBiIEmVzcWW5URXhybHJlG9uc2VUbz0RVFGbWt1VkRaNC0iXzB4ZluUGM1Mjk2
MS0xNw6SXNzdWjo0OTFZlcnNpVyIEZvo1MloiIlQxMb249IjIuMCI+PHNhbWcm1h
...
LTExIgTDEyPSIyMDE3LFjQ525PcLTE2VETExLTOk2VDOjUyWimdGVym90TEyOjU0
OjUyWiI+PHNh8c2FtbDpBdWRpZW5bWw6QXVkabj4jWVuY2VSZXN0cmljdGlvZT5T
c3NGVkVHJhb3b3JkUHJvdG1sOkF1dGhuQ29udGV4VjdnNwb3J0PC9zYWdENsYXNz
QXV0aG5gQXV0TdGF0LTExLTE2VDEaG5JZW1lbnQSIyMDE3bnN0YW50PyOjQ5OjUy
aEV25PcVucnhPSIyMDE3LTExEZmZ2IiBkFmdGVyLTEJWTTZXNzaW9uTm90T2VDEz
WivUWuZGV4PSJpZC13UlVMWGRYOXd6xWzc2lvbklRThFZDJwRDdIgU2VR210OUc0
dWJXYSUQ8L3NhRE1PQ19XlfREVWTQU1MMl9FbnbWw6QXVkaWVuY2U+RpdHlfPC9z
YWRpb24+P1sOkF1ZGllbHJpY3mPHNhNlUmVzdC9zYW1sOkNvbmRpdGlvbnM+bWw6
YXNzUzpjbYW1YXNpczpuIuMlmVmPnVybjpvDphYxzp0YzpTQU1MOjGFzc2VzOlBh
OjAXh0Pj0OjUyWiI+PHNsOkF1dGhuQhxzYW129udGV4bWw6QXV0aG5Db250ZdENs
UmVnRlepBdXRobkNvbzYWmPjwvc2FtbDF1dGhuU1sOkHQ+PC93RhdGVtZW50Pjwv
c2F9zY25zZtbDpBcW1scDpSZXNwb3NlcnRpb24+PCT4=
>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <BASE64 decoded saml message:<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" xmlns:dsig="http://www.w3.org/2000/09/xmldsig#" xmlns:enc="http://www.w3.org/2001/04/xmlenc#" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:x500="urn:oasis:names:tc:SAML:2.0:profiles:attribute:X500" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Destination="https://weblogic_server_01/saml2/sp/acs/post" ID="id-HpDtn7r5XxxAQFYnwSLXZmkuVgIHTExrlreEDZ4-" InResponseTo="_0x7258edc52961ccbd5a435fb13ac67799" IssueInstant="2017-11-12T12:23:42Z" Version="2.0"><saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://idp_partner_01/fed/idp</saml:Issuer><dsig:Signature><dsig:SignedInfo><dsig:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><dsig:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><dsig:Reference URI="#id-HpDtn7r5XxxAQFYnwSLXZmkuVgIHTExrlreEDZ4-"><dsig:Transforms><dsig:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><dsig:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/></dsig:Transforms><dsig:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><dsig:DigestValue>YGtUZvsfo3z51AsBo7UDhbd6Ts=</dsig:DigestValue></dsig:Reference></dsig:SignedInfo><dsig:SignatureValue>al8sJwbqzjh1qgM3Sj0QrX1aZjwyI...JB6l4jmj91BdQrYQ7VxFzvNLczZ2brJSdLLig==</dsig:SignatureValue><dsig:KeyInfo><dsig:X509Data><dsig:X509Certificate>MIwDQUg+nhYqGZ7pCgBQAwTTELMAkGA1UEBhMCQk1ZhQ...aATPRCd113tVqsvCkUwpfQ5zyUHaKw4FkXmiT2nzxxHA==</dsig:X509Certificate></dsig:X509Data></dsig:KeyInfo></dsig:Signature><samlp:Status><samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/></samlp:Status><saml:Assertion ID="id-0WrMNbOz6wsuZdFPhfjnw7WIXXQ6k89-1AgHZ9Oi" IssueInstant="2017-11-12T12:23:42Z" Version="2.0"><saml:Issuer 
Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://idp_partner_01/fed/idp</saml:Issuer><dsig:Signature><dsig:SignedInfo><dsig:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><dsig:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><dsig:Reference URI="#id-0WrMNbOz6wsuZdFPhfjnw7WIXXQ6k89-1AgHZ9Oi"><dsig:Transforms><dsig:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><dsig:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/></dsig:Transforms><dsig:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><dsig:DigestValue>7+jZtq8SpY3BKVaFjIFeEJm51cA=</dsig:DigestValue></dsig:Reference></dsig:SignedInfo><dsig:SignatureValue>GIlXt4B4rVFoDJRxidpZO73gXB68Dd+mcpoV9DKrjBBjLRz...zGTDcEYY2MG8FgtarZhVQGc4zxkkSg8GRT6Wng3NEuTUuA==</dsig:SignatureValue><dsig:KeyInfo><dsig:X509Data><dsig:X509Certificate>MIwDQUg+nhYqGZ7pCgBQAwTTELMAkGA1UEBhMCQk1ZhQ...aATPRCd113tVqsvCkUwpfQ5zyUHaKw4FkXmiT2nzxxHA==</dsig:X509Certificate></dsig:X509Data></dsig:KeyInfo></dsig:Signature><saml:Subject><saml:NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified">PATOU_MORGAN</saml:NameID><saml:SubjectConfirmation Method="urn:oasis:names:tc:SAML:2.0:cm:bearer"><saml:SubjectConfirmationData InResponseTo="_0x7258edc52961ccbd5a435fb13ac67799" NotOnOrAfter="2017-11-12T12:28:42Z" Recipient="https://weblogic_server_01/saml2/sp/acs/post"/></saml:SubjectConfirmation></saml:Subject><saml:Conditions NotBefore="2017-11-12T12:23:42Z" NotOnOrAfter="2017-11-12T12:28:42Z"><saml:AudienceRestriction><saml:Audience>SAML2_Entity_ID_01</saml:Audience></saml:AudienceRestriction></saml:Conditions><saml:AuthnStatement AuthnInstant="2017-11-12T12:23:42Z" SessionIndex="id-oX9wXdpGmt9GQlVffvY4hEIRULEd25nrxDzE8D7w" 
SessionNotOnOrAfter="2017-11-12T12:38:42Z"><saml:AuthnContext><saml:AuthnContextClassRef>urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport</saml:AuthnContextClassRef></saml:AuthnContext></saml:AuthnStatement></saml:Assertion></samlp:Response>>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <<samlp:Response> is signed.>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecurityAtn> <com.bea.common.security.internal.service.IdentityAssertionServiceImpl.assertIdentity(SAML2.Assertion.DOM)>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecurityAtn> <com.bea.common.security.internal.service.IdentityAssertionTokenServiceImpl.assertIdentity(SAML2.Assertion.DOM)>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2IdentityAsserterProvider: start assert SAML2 token>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2IdentityAsserterProvider: SAML2IdentityAsserter: tokenType is 'SAML2.Assertion.DOM'>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2Assert: Start verify assertion signature>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2Assert: The assertion is signed.>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2Assert: End verify assertion signature>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2Assert: Start verify assertion attributes>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2Assert: End verify assertion attributes>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2Assert: Start verify assertion issuer>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2Assert: End verify assertion issuer>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Atn> <SAML2Assert: Start verify assertion conditions>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecurityAtn> <com.bea.common.security.internal.service.IdentityAssertionTokenServiceImpl.assertIdentity - IdentityAssertionException>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <[Security:090377]Identity Assertion Failed, weblogic.security.spi.IdentityAssertionException: [Security:090377]Identity Assertion Failed, weblogic.security.spi.IdentityAssertionException: [Security:096537]Assertion is not yet valid (NotBefore condition).>
<Nov 12, 2017 12:23:41 PM UTC> <Debug> <SecuritySAML2Service> <exception info
javax.security.auth.login.LoginException: [Security:090377]Identity Assertion Failed, weblogic.security.spi.IdentityAssertionException: [Security:090377]Identity Assertion Failed, weblogic.security.spi.IdentityAssertionException: [Security:096537]Assertion is not yet valid (NotBefore condition).
        at com.bea.common.security.internal.service.IdentityAssertionServiceImpl.assertIdentity(IdentityAssertionServiceImpl.java:89)
        at sun.reflect.GeneratedMethodAccessor1410.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.bea.common.security.internal.utils.Delegator$ProxyInvocationHandler.invoke(Delegator.java:64)
		...
>

 

I cut some of the strings above (all the signatures, the SSL certificates, and so on) because they were really too big and not really important. What is important above is the Java exception. Indeed, the Identity Assertion failed because of the following: ‘Assertion is not yet valid (NotBefore condition)’. This message might seem a little bit mystical, but it actually points you right to the issue: the ‘NotBefore’ condition is causing the validation to fail.

So why is that? As I said above, with a SAML2 SSO you first have a request and then a response. For security reasons, there are some conditions that apply to them and that need to be fulfilled for the SSO to work. To understand that a little bit better, I took the decoded SAML2 response from the logs above and reformatted it as indented XML so it is more readable:

<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" xmlns:dsig="http://www.w3.org/2000/09/xmldsig#" xmlns:enc="http://www.w3.org/2001/04/xmlenc#" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:x500="urn:oasis:names:tc:SAML:2.0:profiles:attribute:X500" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Destination="https://weblogic_server_01/saml2/sp/acs/post" ID="id-HpDtn7r5XxxAQFYnwSLXZmkuVgIHTExrlreEDZ4-" InResponseTo="_0x7258edc52961ccbd5a435fb13ac67799" IssueInstant="2017-11-12T12:23:42Z" Version="2.0">
	<saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://idp_partner_01/fed/idp</saml:Issuer>
	<dsig:Signature>
		<dsig:SignedInfo>
			<dsig:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
			<dsig:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
			<dsig:Reference URI="#id-HpDtn7r5XxxAQFYnwSLXZmkuVgIHTExrlreEDZ4-">
				<dsig:Transforms>
					<dsig:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
					<dsig:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
				</dsig:Transforms>
				<dsig:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
				<dsig:DigestValue>YGtUZvsfo3z51AsBo7UDhbd6Ts=</dsig:DigestValue>
			</dsig:Reference>
		</dsig:SignedInfo>
		<dsig:SignatureValue>al8sJwbqzjh1qgM3Sj0QrX1aZjwyI...JB6l4jmj91BdQrYQ7VxFzvNLczZ2brJSdLLig==</dsig:SignatureValue>
		<dsig:KeyInfo>
			<dsig:X509Data>
				<dsig:X509Certificate>MIwDQUg+nhYqGZ7pCgBQAwTTELMAkGA1UEBhMCQk1ZhQ...aATPRCd113tVqsvCkUwpfQ5zyUHaKw4FkXmiT2nzxxHA==</dsig:X509Certificate>
			</dsig:X509Data>
		</dsig:KeyInfo>
	</dsig:Signature>
	<samlp:Status>
		<samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
	</samlp:Status>
	<saml:Assertion ID="id-0WrMNbOz6wsuZdFPhfjnw7WIXXQ6k89-1AgHZ9Oi" IssueInstant="2017-11-12T12:23:42Z" Version="2.0">
		<saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">https://idp_partner_01/fed/idp</saml:Issuer>
		<dsig:Signature>
			<dsig:SignedInfo>
				<dsig:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
				<dsig:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
				<dsig:Reference URI="#id-0WrMNbOz6wsuZdFPhfjnw7WIXXQ6k89-1AgHZ9Oi">
					<dsig:Transforms>
						<dsig:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
						<dsig:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
					</dsig:Transforms>
					<dsig:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
					<dsig:DigestValue>7+jZtq8SpY3BKVaFjIFeEJm51cA=</dsig:DigestValue>
				</dsig:Reference>
			</dsig:SignedInfo>
			<dsig:SignatureValue>GIlXt4B4rVFoDJRxidpZO73gXB68Dd+mcpoV9DKrjBBjLRz...zGTDcEYY2MG8FgtarZhVQGc4zxkkSg8GRT6Wng3NEuTUuA==</dsig:SignatureValue>
			<dsig:KeyInfo>
				<dsig:X509Data>
					<dsig:X509Certificate>MIwDQUg+nhYqGZ7pCgBQAwTTELMAkGA1UEBhMCQk1ZhQ...aATPRCd113tVqsvCkUwpfQ5zyUHaKw4FkXmiT2nzxxHA==</dsig:X509Certificate>
				</dsig:X509Data>
			</dsig:KeyInfo>
		</dsig:Signature>
		<saml:Subject>
			<saml:NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified">PATOU_MORGAN</saml:NameID>
			<saml:SubjectConfirmation Method="urn:oasis:names:tc:SAML:2.0:cm:bearer">
				<saml:SubjectConfirmationData InResponseTo="_0x7258edc52961ccbd5a435fb13ac67799" NotOnOrAfter="2017-11-12T12:28:42Z" Recipient="https://weblogic_server_01/saml2/sp/acs/post"/>
			</saml:SubjectConfirmation>
		</saml:Subject>
		<saml:Conditions NotBefore="2017-11-12T12:23:42Z" NotOnOrAfter="2017-11-12T12:28:42Z">
			<saml:AudienceRestriction>
				<saml:Audience>SAML2_Entity_ID_01</saml:Audience>
			</saml:AudienceRestriction>
		</saml:Conditions>
		<saml:AuthnStatement AuthnInstant="2017-11-12T12:23:42Z" SessionIndex="id-oX9wXdpGmt9GQlVffvY4hEIRULEd25nrxDzE8D7w" SessionNotOnOrAfter="2017-11-12T12:38:42Z">
			<saml:AuthnContext>
				<saml:AuthnContextClassRef>urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport</saml:AuthnContextClassRef>
			</saml:AuthnContext>
		</saml:AuthnStatement>
	</saml:Assertion>
</samlp:Response>
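The decode-and-reformat step shown above is easy to reproduce yourself when you want to inspect a SAMLResponse: it is just Base64 decoding followed by XML pretty-printing. Here is a minimal sketch in Python; the miniature sample payload is a hypothetical stand-in, not the real response from the logs:

```python
import base64
import xml.dom.minidom

def decode_saml_response(b64_message: str) -> str:
    """Base64-decode a SAMLResponse value and pretty-print the XML."""
    xml_bytes = base64.b64decode(b64_message)
    return xml.dom.minidom.parseString(xml_bytes).toprettyxml(indent="\t")

# Hypothetical miniature payload standing in for the real SAMLResponse value
sample = base64.b64encode(
    b'<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">'
    b'<samlp:Status/></samlp:Response>'
).decode()
print(decode_saml_response(sample))
```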

 

As you can see on the XML, there are two conditions that apply on the SAML2 response:

  • The response must not be used before the ‘NotBefore’ timestamp
  • The response must not be used on or after the ‘NotOnOrAfter’ timestamp (here NotBefore + 5 minutes)

In this case, NotBefore is set to ‘2017-11-12T12:23:42Z’, which is the current time on the IdP Partner server. However, you can see in the logs that the WebLogic Server hosting the application is actually one second behind this time (Nov 12, 2017 12:23:41 PM UTC). Therefore the NotBefore restriction applies, and the WebLogic Server hosting the application has no other choice than to return a ‘403 – Forbidden’ message, because the SAML2 response is NOT valid yet.
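To illustrate the failure, here is a simplified sketch of the condition check a Service Provider performs. This is only an illustration of the logic, not WebLogic's actual implementation:

```python
from datetime import datetime, timedelta, timezone

def check_conditions(not_before: str, not_on_or_after: str, now: datetime) -> str:
    """Evaluate the SAML2 <Conditions> validity window like a Service Provider does."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    nb = datetime.strptime(not_before, fmt).replace(tzinfo=timezone.utc)
    noa = datetime.strptime(not_on_or_after, fmt).replace(tzinfo=timezone.utc)
    if now < nb:
        return "Assertion is not yet valid (NotBefore condition)"
    if now >= noa:
        return "Assertion has expired (NotOnOrAfter condition)"
    return "Assertion is valid"

# The SP clock is one second behind the IdP, exactly as in the logs above
sp_now = datetime(2017, 11, 12, 12, 23, 41, tzinfo=timezone.utc)
print(check_conditions("2017-11-12T12:23:42Z", "2017-11-12T12:28:42Z", sp_now))
# One second later, the very same assertion passes the check
print(check_conditions("2017-11-12T12:23:42Z", "2017-11-12T12:28:42Z",
                       sp_now + timedelta(seconds=1)))
```

A one-second clock skew between the IdP and the SP is enough to fall on the wrong side of the NotBefore boundary, which is exactly what happened here.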

In this case, the NTP daemon (Network Time Protocol) on the IdP Partner Linux server was restarted and the time on this server was resynchronized, which solved the issue. Having a server living in the future can cause some interesting behaviors :).

 

 

This article WebLogic – SSO/Atn/Atz – 403 Forbidden, a first issue appeared first on Blog dbi services.

Transfer redo in async-mode to the Gold/Master copy of the Production DB for ACFS snapshots

Thu, 2017-11-30 04:58

If you store your databases on the ACFS cluster filesystem, you may use the Perl script gDBClone, provided on OTN, to clone databases or create snapshot databases. It is an interesting approach to create clones of the production DB in minutes, regardless of the production DB size. What you do is create a standby DB from your production DB on a separate cluster and use that standby DB as a gold/master copy for ACFS snapshots.

In a Production environment with Data Guard Broker we wanted to use that technique, but were confronted with an issue:

The production DB already had a physical standby DB with the Data Guard Broker running. The protection mode was MaxAvailability, which means redo transport in SYNC mode. The master/gold copy to take the snapshots from should receive the redo data in ASYNC mode. How to achieve that?

Two rather uncommon parameters in a Broker configuration help here:


ExternalDestination1
ExternalDestination2

With those parameters (which are available from 11.2.0.4 and 12.1.0.2 onwards) you can actually send your redo to a destination in async mode. The parameters are documented as follows:

The ExternalDestination1 configuration property is used to specify a redo transport destination that can receive redo data from the current primary database. To set up transport of redo data to the specified destination, the broker uses the values specified for this parameter to define a LOG_ARCHIVE_DEST_n initialization parameter on the primary database. The broker also monitors the health of the transport to the specified destination.

After a role change, the broker automatically sets up a LOG_ARCHIVE_DEST_n initialization parameter on the new primary database to ship redo data to the specified destination.

I.e. you can set the parameter the same way as LOG_ARCHIVE_DEST_n, but the following options are not allowed:

ALTERNATE
DELAY
LOCATION
MANDATORY
MAX_FAILURE
NET_TIMEOUT
SYNC
VALID_FOR

So let’s assume I created my DB GOLDCOP as a standby DB using the RMAN duplicate command:


RMAN> duplicate target database for standby from active database dorecover nofilenamecheck;

or alternatively using


# ./gDBClone clone -sdbname PRIM -sdbscan scoda7 -tdbname GOLDCOP -tdbhome OraDb11g_home1 -dataacfs /cloudfs -standby

In the broker configuration I added the DB GOLDCOP as follows:

DGMGRL> show configuration;
 
Configuration - MYPROD
 
Protection Mode: MaxAvailability
Databases:
PRIM - Primary database
STBY - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS
 
DGMGRL> edit configuration set property ExternalDestination1 = 'service=goldcop db_unique_name=GOLDCOP noaffirm async';
Property "externaldestination1" updated
DGMGRL> show configuration;
 
Configuration - MYPROD
 
Protection Mode: MaxAvailability
Databases:
PRIM - Primary database
STBY - Physical standby database
GOLDCOP - External destination 1
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS

Let’s check if I really do NOAFFIRM ASYNC redo transport on PRIM:

SQL> select DEST_NAME, DB_UNIQUE_NAME, AFFIRM, TRANSMIT_MODE from v$archive_dest where dest_id in (2,3);
 
DEST_NAME DB_UNIQUE_NAME AFF TRANSMIT_MOD
-------------------------------- ------------------------------ --- ------------
LOG_ARCHIVE_DEST_2 STBY YES PARALLELSYNC
LOG_ARCHIVE_DEST_3 GOLDCOP NO ASYNCHRONOUS

The external destination is not a database in the configuration:

DGMGRL> show database "GOLDCOP";
Object "GOLDCOP" was not found

But the transport to its destination is monitored by the Broker. I.e. when shutting down the DB GOLDCOP I do get an error:

DGMGRL> show configuration;
 
Configuration - MYPROD
 
Protection Mode: MaxAvailability
Databases:
PRIM - Primary database
Error: ORA-16778: redo transport error for one or more databases
 
STBY - Physical standby database
GOLDCOP - External destination 1
 
Fast-Start Failover: DISABLED
 
Configuration Status:
ERROR
 
DGMGRL> show instance "PRIM";
 
Instance 'PRIM' of database 'PRIM'
 
Instance Error(s):
ORA-16737: the redo transport service for standby database "GOLDCOP" has an error
 
Instance Status:
ERROR

As the external destination 1 is not “a database” in the Broker configuration, it actually does not matter whether the Broker is started (dg_broker_start=TRUE) on the external destination GOLDCOP or not.
To start applying redo on the external destination, you have to start managed recovery as you would without a Broker configuration:

alter database recover managed standby database using current logfile disconnect from session;

And redo real time apply is happening on GOLDCOP:

SQL> select name,value
2 from v$dataguard_stats
3 where name in ('apply lag','transport lag');
 
NAME VALUE
-------------------- --------------------
transport lag +00 00:00:00
apply lag +00 00:00:00
 
SQL>
SQL> select inst_id,process,pid,status,thread#,sequence#, block#
2 from gv$managed_standby
3 where process like 'MRP%';
 
INST_ID PROCESS PID STATUS THREAD# SEQUENCE# BLOCK#
---------- --------- ---------- ------------ ---------- ---------- ----------
1 MRP0 5155 APPLYING_LOG 1 50 420

To make the external destination self-managing, I set the archivelog deletion policy on GOLDCOP to

CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;

in RMAN, so that applied archive logs automatically become reclaimable in the fast recovery area. In addition, I set

fal_server='PRIM'

on GOLDCOP to ensure that archive gaps can be resolved.

The pros of the above configuration are:
– The GOLDCOP DB does not cause much overhead for my production DB (async redo transport).
– Decoupling my GOLDCOP DB from the primary (temporarily) is fast and easy:
edit configuration set property ExternalDestination1 = '';

REMARK: Of course I also get the other advantages of the gDBClone approach:
– A production copy on a separate cluster which serves as a gold copy to take snapshots from for testing or development purposes.
– Creating a snapshot database takes minutes regardless of the DB size.

Cons:
– I have to take care of starting managed standby database recovery on my GOLDCOP DB myself, i.e. the same as when running Data Guard without the Broker.

To create a snapshot DB I just do something like:


# ./gDBClone snap -sdbname GOLDCOP -tdbname PRODCOP1

Et voilà: a production copy in 2 minutes.

If PRODCOP1 is no longer needed I can delete it:


# ./gDBClone deldb -tdbname PRODCOP1

Besides using the configuration property ExternalDestination1, there are other possibilities in 12c to run a standby DB as a master copy for snapshots without affecting the production system (e.g. the support of cascaded standby DBs in the Broker), but I still think that the external destinations feature offers a good way to run a master copy.

 


How to rename an existing Fusion Middleware WebLogic Domain

Tue, 2017-11-28 23:54

Sometimes we need to rename an existing Fusion Middleware WebLogic domain. I was asked to do this on a Fusion Middleware Reports & Forms environment.
I took some time to check how this can be done and did some testing to confirm it works as expected. The difficulty is not the WebLogic domain itself, as a WebLogic domain can be created quickly, but redoing the complete configuration (SSL, logging settings, etc.) is time consuming, and what about the system components?

I used pack and unpack to rename the FMW WebLogic Domain.

Let’s say I wanted to rename a Fusion Middleware Forms & Reports WebLogic domain from fr_domain to fr_domain_new.

First I used pack to create the domain archive:

cd $MW_HOME/oracle_common/common/bin
./pack.sh -domain /u01/config/domains/fr_domain -template $HOME/fr_domain.jar -template_name full_fr_domain

Then, using unpack, I changed the domain directory path and thus the domain name:

./unpack.sh -domain /u01/config/domains/fr_domain_new -template /home/oracle/fr_domain.jar -user_name weblogic -password Welcome1 -server_start_mode prod -app_dir /u01/config/applications/fr_domain_new -java_home $JAVA_HOME

Of course the JAVA_HOME environment variable needs to be set before.

This simply worked, but I had to recreate the security files (boot.properties) for the Administration Server and the Managed Servers if needed, and those for the system components.

To create the security files for the system components, the Node Manager needs to be started:

export WEBLOGIC_DOMAIN_HOME=/u01/config/domains/fr_domain_new/
nohup ${WEBLOGIC_DOMAIN_HOME}/bin/startNodeManager.sh >> ${WEBLOGIC_DOMAIN_HOME}/nodemanager/nohup-NodeManager.out 2>&1 &

And then start the system components once with the storeUserConfig option, for example:

cd /u01/config/domains/fr_domain_new/bin 
./startComponent.sh ohs1 storeUserConfig
./startComponent.sh vm01_reportsServer storeUserConfig

This was for a simple WebLogic domain on a single machine. For clustered WebLogic domains installed on several hosts, pack and unpack need to be used again to distribute the WebLogic Managed Servers to the targeted machines.

As an example, to create the archive file for the Managed Servers to be installed on remote machines:

$MW_HOME/oracle_common/common/bin/pack.sh -managed=true -domain /u01/config/domains/fr_domain_new -template /home/oracle/fr_domain_new.jar -template_name fr_domain_new

 

 

 


impdp logtime=all metrics=y and 12cR2 parallel metadata

Tue, 2017-11-28 14:35

A quick post to show why you should always use LOGTIME=ALL METRICS=Y with Data Pump. Just look at this example, showing the timestamp in front of each line and a message about each task completed by the workers:

15-NOV-17 11:48:32.305: W-5 Processing object type DATABASE_EXPORT/SCHEMA/PROCACT_SCHEMA
15-NOV-17 11:48:34.439: W-13 Completed 28 PROCACT_SCHEMA objects in 1 seconds
15-NOV-17 11:48:34.439: W-13 Completed by worker 1 28 PROCACT_SCHEMA objects in 1 seconds
15-NOV-17 11:48:34.440: W-13 Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE
15-NOV-17 11:48:35.472: W-17 Startup took 70 seconds
15-NOV-17 11:48:35.596: W-18 Startup took 70 seconds
15-NOV-17 11:48:35.719: W-20 Startup took 70 seconds
15-NOV-17 11:48:35.841: W-19 Startup took 70 seconds
15-NOV-17 11:48:43.520: W-5 Completed 1714 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 1 39 TABLE objects in 2 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 2 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 3 85 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 4 111 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 5 25 TABLE objects in 1 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 6 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 7 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 8 111 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 9 89 TABLE objects in 5 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 10 74 TABLE objects in 4 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 11 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 12 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 13 34 TABLE objects in 2 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 14 111 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 15 108 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 16 90 TABLE objects in 4 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 17 82 TABLE objects in 4 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 18 40 TABLE objects in 3 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 19 97 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 20 53 TABLE objects in 3 seconds
15-NOV-17 11:48:44.266: W-1 Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA

Here, I was running an import with PARALLEL=20 and I can see exactly how many tables were processed by each worker. Note that it is ‘TABLE’ and not ‘TABLE_DATA’, which is the proof that 12cR2 can import metadata in parallel.
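Because every METRICS=Y line carries a worker number and an object count, this kind of log is easy to post-process. Here is a small, hypothetical helper (the regular expression is mine, derived from the lines above) that sums up the per-worker object counts:

```python
import re
from collections import Counter

# Pattern for the "Completed by worker N ..." lines produced with METRICS=Y
LINE = re.compile(
    r"Completed by worker (?P<worker>\d+) (?P<count>\d+) "
    r"(?P<otype>\S+) objects in (?P<secs>\d+) seconds"
)

def per_worker_counts(log_lines):
    """Sum object counts per worker from Data Pump METRICS=Y log lines."""
    totals = Counter()
    for line in log_lines:
        m = LINE.search(line)
        if m:
            totals[int(m.group("worker"))] += int(m.group("count"))
    return totals

sample = [
    "15-NOV-17 11:48:43.520: W-5 Completed by worker 1 39 TABLE objects in 2 seconds",
    "15-NOV-17 11:48:43.520: W-5 Completed by worker 2 113 TABLE objects in 7 seconds",
]
print(per_worker_counts(sample))
```

Fed with the full log above, this immediately shows whether the work was well balanced across the 20 workers.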

I see no reason not to always use LOGTIME=ALL METRICS=Y, and you will be happy to have this detail if something goes wrong.

 


Data Guard: Quiz on Snapshot Standby

Tue, 2017-11-28 07:51

In the Oracle documentation we can find this about snapshot standbys: “A snapshot standby database is a fully updatable standby database that provides disaster recovery and data protection benefits that are similar to those of a physical standby database.”

The concept of a snapshot standby is that we can temporarily convert a physical standby to an updatable database for different purposes and then convert it back to a physical standby. While the database is a snapshot standby, it can be used as a normal read-write database. Afterwards, flashback technology is used, combined with archived log files, to convert the snapshot standby back to a physical standby.
In this blog I have tested some common tasks on a snapshot standby, and I describe the results below.

Here is our configuration; Oracle 12.2 is used.
ORCL_SITE: Primary
ORCL_SITE1: Physical Standby
ORCL_SITE2: Physical Standby
ORCL_SITE3: Logical Standby


DGMGRL> show configuration;
.
Configuration - ORCL_DR
.
Protection Mode: MaxPerformance
Members:
ORCL_SITE - Primary database
ORCL_SITE1 - Physical standby database
ORCL_SITE2 - Physical standby database
ORCL_SITE3 - Logical standby database
.
Fast-Start Failover: DISABLED
.
Configuration Status:
SUCCESS (status updated 42 seconds ago)
.
DGMGRL>

The first question we can ask is which type of standby can be converted to a snapshot standby.

1- Can we convert a logical standby to a snapshot standby?
Let’s convert our logical standby to a snapshot standby:

DGMGRL> CONVERT DATABASE 'ORCL_SITE3' TO SNAPSHOT STANDBY;
Converting database "ORCL_SITE3" to a Snapshot Standby database, please wait...
Error: ORA-16831: operation not allowed on this member
.
Failed.
Failed to convert database "ORCL_SITE3"
DGMGRL>

Answer: No, we cannot convert a logical standby to a snapshot standby.

2- Can we convert a physical standby to a snapshot standby?
Let’s convert our physical standby to a snapshot standby:

DGMGRL> CONVERT DATABASE 'ORCL_SITE2' TO SNAPSHOT STANDBY;
Converting database "ORCL_SITE2" to a Snapshot Standby database, please wait...
Database "ORCL_SITE2" converted successfully
DGMGRL>

And we can verify the new status of the database ‘ORCL_SITE2’:

DGMGRL> show database 'ORCL_SITE2';
.
Database - ORCL_SITE2
.
Role: SNAPSHOT STANDBY
Transport Lag: 0 seconds (computed 1 second ago)
Apply Lag: 1 minute 33 seconds (computed 1 second ago)
Instance(s):
ORCL
.
Database Status:
SUCCESS
.
DGMGRL>
.

Answer: Yes, we can convert a physical standby to a snapshot standby.

Now that the physical standby has been converted to a snapshot standby, let’s continue our quiz.

3- Can we open a snapshot standby in read-only mode?
Let’s shut down our snapshot standby and open it in read-only mode:

SQL> startup open read only;
ORACLE instance started.
.
Total System Global Area 943718400 bytes
Fixed Size 8627440 bytes
Variable Size 348130064 bytes
Database Buffers 583008256 bytes
Redo Buffers 3952640 bytes
Database mounted.
Database opened.

The status is now open read only

SQL> select db_unique_name,database_role,open_mode from v$database;
.
DB_UNIQUE_NAME DATABASE_ROLE OPEN_MODE
------------------------------ ---------------- --------------------
ORCL_SITE2 SNAPSHOT STANDBY READ ONLY

Answer: Yes, a snapshot standby can be opened in READ ONLY mode.

4- Can we create a tablespace on a snapshot standby
Connected to the standby database, let's create a tablespace:

SQL> create tablespace mytab_snap datafile '/u01/app/oracle/oradata/ORCL/mytab_snap01.dbf' size 2M autoextend on maxsize 10M;
.
Tablespace created.

We can verify in the dba_tablespaces view:

SQL> select tablespace_name from dba_tablespaces;
.
TABLESPACE_NAME
------------------------------
SYSTEM
SYSAUX
UNDOTBS1
TEMP
USERS
MYTAB_SNAP
TAB_TEST

Answer: Yes, we can see that the tablespace MYTAB_SNAP was created.

5- Can we drop a tablespace on a snapshot standby
Let’s drop a tablespace
SQL> drop tablespace TAB_TEST including contents and datafiles;
drop tablespace TAB_TEST including contents and datafiles
*
ERROR at line 1:
ORA-38881: Cannot drop tablespace TAB_TEST on primary database due to
guaranteed restore points.

Answer: No, due to the guaranteed restore point we cannot drop a tablespace created on the primary on a snapshot standby.
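The ORA-38881 error comes from the guaranteed restore point that Oracle creates implicitly when the standby is converted. A minimal sketch to display it, assuming we are connected to the snapshot standby as SYSDBA (the restore point name is generated by Oracle):

```sql
-- Sketch: converting to a snapshot standby implicitly creates a
-- guaranteed restore point (typically named
-- SNAPSHOT_STANDBY_REQUIRED_<timestamp>).
select name, guarantee_flashback_database, scn, time
from v$restore_point;
```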

6- Can we extend a datafile on a snapshot standby
We are going to consider two types of datafiles.
• One from tablespace MYTAB_SNAP created on the snapshot standby
• Another from tablespace TAB_TEST created on the primary

FILE_NAME TABLESPACE_NAME
-------------------------------------------------- ------------------------------
/u01/app/oracle/oradata/ORCL/users01.dbf USERS
/u01/app/oracle/oradata/ORCL/undotbs01.dbf UNDOTBS1
/u01/app/oracle/oradata/ORCL/system01.dbf SYSTEM
/u01/app/oracle/oradata/ORCL/sysaux01.dbf SYSAUX
/u01/app/oracle/oradata/ORCL/stab_test.dbf TAB_TEST
/u01/app/oracle/oradata/ORCL/mytab_snap01.dbf MYTAB_SNAP

Let’s extend first the datafile created on the snapshot standby

SQL> alter database datafile '/u01/app/oracle/oradata/ORCL/mytab_snap01.dbf' resize 20M;
.
Database altered.

And then let’s do the same operation on the datafile created on the primary

SQL> alter database datafile '/u01/app/oracle/oradata/ORCL/stab_test.dbf' resize 20M;
.
Database altered.

We can verify the new sizes

SQL> select FILE_NAME,sum(BYTES) from dba_data_files group by FILE_NAME;
.
FILE_NAME SUM(BYTES)
-------------------------------------------------- ----------
/u01/app/oracle/oradata/ORCL/users01.dbf 5242880
/u01/app/oracle/oradata/ORCL/undotbs01.dbf 73400320
/u01/app/oracle/oradata/ORCL/system01.dbf 870318080
/u01/app/oracle/oradata/ORCL/sysaux01.dbf 587202560
/u01/app/oracle/oradata/ORCL/stab_test.dbf 20971520
/u01/app/oracle/oradata/ORCL/mytab_snap01.dbf 20971520
.
6 rows selected.

Answer: Yes, we can extend datafiles on a snapshot standby.

Just note that when the snapshot standby is converted back to a physical standby, the datafile is shrunk back to its previous size.
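The conversion back is not shown above; a minimal sketch with the broker, assuming we are connected with DGMGRL:

```sql
-- Sketch: convert the snapshot standby back to a physical standby.
-- Oracle flashes the database back to the implicit guaranteed restore
-- point (discarding changes made while it was a snapshot standby, such
-- as the MYTAB_SNAP tablespace) and then resumes redo apply.
CONVERT DATABASE 'ORCL_SITE2' TO PHYSICAL STANDBY;
```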

7- Can we reduce a datafile on a snapshot standby
Let’s now reduce the size of the datafile created on the snapshot

SQL> alter database datafile '/u01/app/oracle/oradata/ORCL/mytab_snap01.dbf' resize 5M;
.
Database altered.

And let’s do the same operation on the datafile created on the primary

SQL> alter database datafile '/u01/app/oracle/oradata/ORCL/stab_test.dbf' resize 5M;
alter database datafile '/u01/app/oracle/oradata/ORCL/stab_test.dbf' resize 5M
*
ERROR at line 1:
ORA-38883: Cannot shrink data file /u01/app/oracle/oradata/ORCL/stab_test.dbf
on primary database due to guaranteed restore points.

Answer: Only partially; we can reduce the size of datafiles created on the snapshot standby itself, but not of those created on the primary.

8- Can we do a switchover to a snapshot standby
As a snapshot standby is a physical standby that was converted, one may ask whether a switchover to a snapshot standby is possible.

DGMGRL> switchover to 'ORCL_SITE2';
Performing switchover NOW, please wait...
Error: ORA-16831: operation not allowed on this member
.
Failed.
Unable to switchover, primary database is still "ORCL_SITE"
DGMGRL>

Answer: No, we cannot do a switchover to a snapshot standby.

9- Can we do a failover to a snapshot standby
The same question can be asked about failover.

DGMGRL> connect sys/root@ORCL_SITE2
Connected to "ORCL_SITE2"
Connected as SYSDBA.
DGMGRL> failover to 'ORCL_SITE2';
Converting database "ORCL_SITE2" to a Physical Standby database, please wait...
Operation requires shut down of instance "ORCL" on database "ORCL_SITE2"
Shutting down instance "ORCL"...
Connected to "ORCL_SITE2"
Database closed.
Database dismounted.
ORACLE instance shut down.
Operation requires start up of instance "ORCL" on database "ORCL_SITE2"
Starting instance "ORCL"...
ORACLE instance started.
Database mounted.
Connected to "ORCL_SITE2"
Connected to "ORCL_SITE2"
Continuing to convert database "ORCL_SITE2" ...
Database "ORCL_SITE2" converted successfully
Performing failover NOW, please wait...
Failover succeeded, new primary is "ORCL_SITE2"
DGMGRL>

Answer: Yes, we can do a failover to a snapshot standby, but it takes longer than a failover to a physical standby. Indeed Oracle has to:

• Convert the snapshot standby back to a physical standby
• Apply the archived logs to the physical standby
• And then do the failover

Conclusion: In this blog we tried to explain some behaviors of snapshot standby databases. We hope this article helps.

 

This article Dataguard: QUIZZ on Snapshot Standby appeared first on the dbi services blog.

Are statistics immediately available after creating a table or an index in PostgreSQL?

Tue, 2017-11-28 06:32

While giving the last PostgreSQL DBA Essentials workshop this question came up: when we create a table or an index, are the statistics available automatically? To be more precise: when we create and load a table in one step and then create an index on that table, do we have the statistics available by default, or do we need to wait for autovacuum to kick in or analyze manually? Let's see…

First of all let's disable autovacuum so it does not kick off analyze in the background:

postgres=# \! ps -ef | grep autov | grep -v grep
postgres  1641  1635  0 07:08 ?        00:00:00 postgres: MY_CLUSTER: autovacuum launcher process   
postgres=# alter system set autovacuum=off;
ALTER SYSTEM
postgres=# select * from pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# \! ps -ef | grep autov | grep -v grep

Create and populate the table:

postgres=# \! cat a.sql
drop table if exists t;
create table t
as select a.*, md5(a::varchar) from generate_series(1,5000000) a;
postgres=# \i a.sql
psql:a.sql:1: NOTICE:  table "t" does not exist, skipping
DROP TABLE
SELECT 5000000

Create an index:

postgres=# create index i1 on t(a);
CREATE INDEX
postgres=# \d+ t
                                     Table "public.t"
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 a      | integer |           |          |         | plain    |              | 
 md5    | text    |           |          |         | extended |              | 
Indexes:
    "i1" btree (a)

Do we have statistics already? Let's check:

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 't'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

No, at least not for the table. What about the index?

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 'i1'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

No. Let's analyze:

postgres=# analyze t;
ANALYZE
postgres=# analyze i1;
WARNING:  skipping "i1" --- cannot analyze non-tables or special system tables
ANALYZE

Apparently we cannot analyze an index. What do we see now?

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 't'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
           0 |        4 |          -1
           0 |       33 |          -1
(2 rows)

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 'i1'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

We do see statistics for the table but not for the index. The reason is that “analyze” works on tables but not on indexes. For regular indexes there will be nothing in pg_statistic because that information would be redundant with the underlying table columns. But there will be statistics for function-based indexes:

postgres=# create index i2 on t(lower(a::text));
CREATE INDEX
postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 'i2'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

postgres=# analyze t;
ANALYZE
postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 'i2'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
           0 |       10 |          -1
(1 row)

So, when autovacuum is off we do not get statistics unless we kick off a manual analyze (which is not a surprise). What happens when autovacuum is on?

postgres=# alter system set autovacuum=on;
ALTER SYSTEM
postgres=# select * from pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)
postgres=# \i a.sql
DROP TABLE
SELECT 5000000
postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 't'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

Nope, same picture here. But a few seconds later:

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 't'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
           0 |        4 |          -1
           0 |       33 |          -1
(2 rows)

… statistics are there. Conclusion: when you require current statistics directly after loading a table you’d better kick off a manual analyze right after. Otherwise autovacuum will take care of it, but not immediately.
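To see whether a table has been analyzed at all (manually or by autovacuum), a quick sketch against pg_stat_user_tables:

```sql
-- Sketch: last_analyze/last_autoanalyze are NULL if statistics were
-- never collected for the table since the statistics were last reset.
select relname, last_analyze, last_autoanalyze
from pg_stat_user_tables
where relname = 't';
```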

 

This article Are statistics immediately available after creating a table or an index in PostgreSQL? appeared first on the dbi services blog.

DBVISIT from Oracle to Postgres

Mon, 2017-11-27 09:51

As I regularly work on Oracle and PostgreSQL, I decided to test the replication from Oracle to PostgreSQL using the Dbvisit Replicate tool.

Dbvisit Replicate does not use Oracle LogMiner or triggers but its own mining processes to get the changes when they are written to the redo logs. When a change appears in the redo log, an external file called PLOG is generated and transferred to the target.


The architecture is quite easy to understand: you have a MINE process on the source server, looking at the redo logs for changed data, and an APPLY process which applies the SQL on the target database.

The configuration is easy to implement but must not be underestimated :-)

My configuration is the following:

  • Oracle server named cloud13c, with PSI database version 12.2.0.1
  • Postgres server named pg_essentials_p1 with Postgres version 9.6

First, we create a user in the Postgres database:

postgres@pg_essentials_p1:/home/postgres/ [PG1] createuser -d -e -E -l -P -r -s dbvrep_admin
Enter password for new role: 
Enter it again: 
CREATE ROLE dbvrep_admin ENCRYPTED 
PASSWORD 'md5e3c4e8f1b4f8e388eef4fe890d6bdb36' SUPERUSER CREATEDB 
CREATEROLE INHERIT LOGIN;

We edit the configuration file /u02/pgdata/postgresql.conf in order to allow non-localhost connections:

postgres@pg1:/u02/pgdata/PG1/ [PG1] cp postgresql.conf postgresql.conf.save
postgres@pg1:/u02/pgdata/PG1/ [PG1] sed -i "s/^#\(
listen_addresses = '\)localhost'/\1*'\t/" postgresql.conf

We also enable connections on non-localhost address:

postgres@pg1:/u02/pgdata/PG1/ [PG1] cp pg_hba.conf 
pg_hba.conf.save
postgres@p1:/u02/pgdata/PG1/ [PG1] echo -e 
"host\tall\t\tall\t\t0.0.0.0/0\t\tmd5" >> pg_hba.conf

cat pg_hba.conf:

 # TYPE  DATABASE        USER            ADDRESS                 METHOD
# "local" is for Unix domain socket connections only
local   all             all                                     md5
# IPv4 local connections:
host    all             all             127.0.0.1/32            md5
host    all             all             192.168.22.201/24       md5
 # IPv6 local connections:
host    all             all             ::1/128                 md5
 host    all             barman          192.168.1.101/24       md5
host    replication     barman_streaming 192.168.1.101/24       md5
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     postgres                                md5
host    replication     postgres        127.0.0.1/32            md5
host    replication     postgres        ::1/128                 md5
host all       all       0.0.0.0/0      md5

We have to restart the postgres server:

postgres@pg1:/u02/pgdata/PG1/ [PG1] pgrestart
waiting for server to shut down.... done
server stopped
server starting
postgres@pg1:/u02/pgdata/PG1/ [PG1] 2017-07-17 13:52:52.350 CEST
 - 1 - 3106 - 
 - @ LOG:  redirecting log output to logging collector process
2017-07-17 13:52:52.350 CEST - 2 - 3106 - 
 - @ HINT:  Future log output will appear in directory 
"/u01/app/postgres/admin/PG1/pg_log".
 
postgres@pg_essentials_p1:/u02/pgdata/PG1/ [PG1] 
postgres@pg_essentials_p1:/u02/pgdata/PG1/ [PG1] alias | grep pgrestart
alias pgrestart='pg_ctl -D ${PGDATA} restart -m fast'

Then we install dbvisit replicate:

We download the dbvisit_replicate-2.9.00-el5.x86_64.rpm and we install it:

[root@localhost software]# rpm -ivh dbvisit_replicate-2.9.00-el5.x86_64.rpm 
Preparing...                       ################################# [100%]
Updating / installing...
   1:dbvisit_replicate-2.9.00-el5  ################################# [100%]

To make it work properly, I had to modify the sqlnet.ora file as follows in order to avoid the following error message:

ERR-11: Could not connect as dbvrep to database PSI, 
error is ORA-24327: need explicit attach
before authenticating a user (DBD ERROR: OCISessionBegin)
sqlnet.ora:
SQLNET.SQLNET_ALLOWED_LOGON_VERSION=11
SQLNET.ALLOWED_LOGON_VERSION_CLIENT =11
SQLNET.ALLOWED_LOGON_VERSION_SERVER =11

Before running dbvrep, be sure you can connect with psql from the Oracle server to the PostgreSQL server! I needed to install a Postgres client on the Oracle host and to define the PATH properly.
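A minimal sketch of such a connectivity check, run on the Oracle host, using the hostname and the replication user from this setup (adapt to your environment):

```shell
# Sketch: verify the Postgres client is in the PATH and that the
# target instance accepts remote connections from this host.
which psql
psql -h pg1 -U dbvrep_admin -d postgres -c "select version();"
```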

Finally, by running dbvrep on the Oracle server, you start the setup wizard and enter your configuration settings; the menu is quite easy to understand. The setup wizard consists of 4 steps:

– Step 1: describe databases

– Step 2: Replicate Pairs

– Step 3: Replicated tables

– Step 4: Process Configuration

oracle@localhost:/home/oracle/ora2pg/ [PSI] dbvrep
Initializing......done
Dbvisit Replicate version 2.9.02
Copyright (C) Dbvisit Software Limited. All rights reserved.
No DDC file loaded.
Run "setup wizard" to start the configuration wizard or try "help" 
to see all commands available.
dbvrep> setup wizard                                                                              
This wizard configures Dbvisit Replicate.
 
The setup wizard creates configuration scripts, which need to be run 
after the wizard ends. Nochanges to the databases are made before that.
 
The progress is saved every time a list of databases, replications, etc. 
is shown. It will bere-read if wizard is restarted and the same DDC 
name and script path is selected.
Run the wizard now? [Yes]                                                   
Accept end-user license agreement? (View/Yes/No) [No] yes                  
Before starting the actual configuration, some basic information is needed. The DDC name and
script path determines where all files created by the wizard go 
(and where to reread them ifwizard is rerun) and the license key 
determines which options are available for this
configuration.
(DDC_NAME) - Please enter a name for this replication: [] ora2pg                                  
(LICENSE_KEY) - Please enter your license key: [(trial)]                                          
Which Replicate edition do you want to trial (LTD/XTD/MAX): [MAX]                                 
(SETUP_SCRIPT_PATH) - Please enter a directory for location of 
configuration scripts on this     
machine: [/home/oracle/Documents/ora2pg] /home/oracle/ora2pg                                 
 
Network configuration files were detected on this system in these locations:
/u00/app/oracle/network/admin
/u00/app/oracle/product/12.2.0.1/dbhome_1/network/admin
(TNS_ADMIN) - Please enter TNS configuration directory for this machine:                          [/u00/app/oracle/network/admin]                                          
Read 2 described databases from previous wizard run.
 
Step 1 - Describe databases
========================================
The first step is to describe databases used in the replication. 
There are usually two of them
(source and target).
 
Following databases are now configured:
1: Oracle PSI, SYS/***, SYSTEM/***, dbvrep/***, USERS/TEMP, dbvrep/, 
ASM:No, TZ: +02:00
2: Postgres postgres, dbvrep_admin/***, dbvrep_admin/***, dbvrep/***, 
/, dbvrep/, ASM:n/a, TZ: 
Enter the number of the database to modify it, or "add", or "done": [done]                        
Read 1 replication pairs from previous wizard run.
 
Step 2 - Replication pairs
========================================
The second step is to set source and targets for each replication pair.
 
Enter number of replication pair to modify it, or "add", or "done": [done] 1                  
Do you want to "edit" the replication pair or "delete" it? [edit] edit                        
Let's configure the replication pair, selecting source and target.
Following databases are described:
1: PSI#DBVREP (Oracle)
2: DBNAME=POSTGRES;HOST=PG1#DBVREP (Postgres) 
(cannot be source: not an Oracle database)
Select source database: [1]                                                                   
Select target database: [2]                                                                   
Will limited DDL replication be enabled? (Yes/No) [Yes]                                       
Use fetcher to offload the mining to a different server? (Yes/No) [No]                        
Should where clauses (and Event Streaming) include all columns, 
not just changed and PK?      (Yes/No) [No]                                                                            
Would you like to encrypt the data across the network? (Yes/No) [No]                          
Would you like to compress the data across the network? (Yes/No) [No]                         
How long do you want to set the network timeouts. 
Recommended range between 60-300 seconds    [60]                                                                                       
Lock and copy the data initially one-by-one or at a single SCN?
one-by-one : Lock tables one by one and capture SCN
single-scn : One SCN for all tables
ddl-only   : Only DDL script for target objects
resetlogs  : Use SCN from last resetlogs operation
(standby activation, rman incomplete
recovery)
no-lock    : Do not lock tables. Captures previous SCN of oldest active 
transaction. Requires pre-requisite running of pre-all.sh script            (one-by-one/single-scn/ddl-only/resetlogs/no-lock) [single-scn] 
 
What data instantiation script to create?
ddl_file       : DDL file created (APPLY.sql)
ddl_run        : DDL is automatically executed on target
load           : All replicated data is created and loaded automatically
none                                                                                          (ddl_file/ddl_run/load/none) [ddl_run] ddl-file
 
 
Following replication pairs are now configured:
1: PSI (Oracle) ==> postgres (Postgres), DDL: Yes, fetcher: No, 
process suffix: (no suffix),
compression: No, encryption: No, network timeout: 60, prepare type: 
single-scn,:
ddl-run
Enter number of replication pair to modify it, or "add", or "done": [done]                        
Read 1 replication pairs from previous wizard run.
 
Step 3 - Replicated tables
========================================
The third step is to choose the schemas and tables to be replicated. 
If the databases arereachable, the tables are checked for existence, 
datatype support, etc., schemas are queried for tables. 
Note that all messages are merely hints/warnings and may be ignored 
if issues are rectified before the scripts are actually executed.
 
Following tables are defined for replication pairs:
1: PSI (Oracle) ==> postgres (Postgres), DDL: Yes, suffix: (no suffix), 
prepare: single-scn
  PSI(tables)
Enter number of replication pair to modify it, or "done": [done]                                  
Read 2 replication pairs from previous wizard run.
 
Step 4 - Process configuration
========================================
The fourth step is to configure the replication processes for each 
replication.
 
Following processes are defined:
1: MINE on PSI
  Host: cloud13c, SMTP: No, SNMP: No
2: APPLY on postgres
  Host: pg1, SMTP: No, SNMP: No
Enter number of process to modify it, or "done": [done] 1                                         
Fully qualified name of the server for the process (usually co-located 
with the database, unless  mine is offloaded using fetcher): [cloud13c]                                                     
Server type (Windows/Linux/Unix): [Linux]                                                         
Enable email notifications about problems? (Yes/No) [No]                                          
Enable SNMP traps/notifications about problems? (Yes/No) [No]                                     
Directory with DDC file and default where to create log files etc. 
(recommended: same as global   setting, if possible)? [/home/oracle/ora2pg]                                                    
Following settings were pre-filled with defaults or your reloaded settings:
----------------------------------------
[MINE_REMOTE_INTERFACE]: Network remote interface: cloud13c:7901 
[MINE_DATABASE]: Database TNS: PSI 
[TNS_ADMIN]: tnsnames.ora path: /u00/app/oracle/network/admin 
[MINE_PLOG]: Filemask for generated plogs: /home/oracle/ora2pg/mine/%S.%E 
(%S is sequence, %T thread, %F original filename (stripped extension), 
%P process type, %N process name, %E default extension)
[LOG_FILE]: General log file: /home/oracle/ora2pg/log/dbvrep_%N_%D.%E 
[LOG_FILE_TRACE]: Error traces: 
/home/oracle/ora2pg/log/trace/dbvrep_%N_%D_%I_%U.%E 
 
Checking that these settings are valid...
Do you want to change any of the settings? [No]                                                   
Following processes are defined:
1: MINE on PSI
  Host: cloud13c, SMTP: No, SNMP: No
2: APPLY on postgres
  Host: pg1, SMTP: No, SNMP: No
Enter number of process to modify it, or "done": [done] 2                                         
Fully qualified name of the server for the process (usually co-located 
with the database, unless  mine is offloaded using fetcher): [pg1]                                                          
Server type (Windows/Linux/Unix): [Linux]                                                         
Enable email notifications about problems? (Yes/No) [No]                                          
Enable SNMP traps/notifications about problems? (Yes/No) [No]                                     
Directory with DDC file and default where to create log files etc. 
(recommended: same as global   setting, if possible)? [/home/oracle/ora2pg]                                                    
Following settings were pre-filled with defaults or your reloaded settings:
----------------------------------------
[APPLY_REMOTE_INTERFACE]: Network remote interface: pg1:7902 
[APPLY_DATABASE]: Database Postgres connection string: dbname=postgres;
host=pg1 
[TNS_ADMIN]: tnsnames.ora path: /u00/app/oracle/network/admin 
[APPLY_SCHEMA]: Dbvisit Replicate database (schema): dbvrep 
[APPLY_STAGING_DIR]: Directory for received plogs: /home/oracle/ora2pg/apply 
[LOG_FILE]: General log file: /home/oracle/ora2pg/log/dbvrep_%N_%D.%E 
[LOG_FILE_TRACE]: Error traces: 
/home/oracle/ora2pg/log/trace/dbvrep_%N_%D_%I_%U.%E 
 
Checking that these settings are valid...
Do you want to change any of the settings? [No]                                                   
Following processes are defined:
1: MINE on PSI
  Host: cloud13c, SMTP: No, SNMP: No
2: APPLY on postgres
  Host: pg1, SMTP: No, SNMP: No
Enter number of process to modify it, or "done": [done]                                           
Created file /home/oracle/ora2pg/ora2pg-APPLY.ddc.
Created file /home/oracle/ora2pg/ora2pg-MINE.ddc.
Created file /home/oracle/ora2pg/config/ora2pg-setup.dbvrep.
Created file /home/oracle/ora2pg/config/ora2pg-dbsetup_PSI_DBVREP.sql.
Created file /home/oracle/ora2pg/config/ora2pg-dbsetup_DBNAME_POSTGRES_HOST_PG1_DBVREP.sql.
Created file /home/oracle/ora2pg/config/ora2pg-grants_PSI_DBVREP.sql.
Created file /home/oracle/ora2pg/config/ora2pg-grants_DBNAME_POSTGRES_HOST_PG1_DBVREP.sql.
Created file /home/oracle/ora2pg/config/ora2pg-onetime.ddc.
Created file /home/oracle/ora2pg/start-console.sh.
Created file /home/oracle/ora2pg/ora2pg-run-cloud13c.sh.
Created file /home/oracle/ora2pg/scripts/ora2pg-cloud13c-start-MINE.sh.
Created file /home/oracle/ora2pg/scripts/ora2pg-cloud13c-stop-MINE.sh.
Created file /home/oracle/ora2pg/scripts/ora2pg-cloud13c-dbvrep-MINE.sh.
Created file /home/oracle/ora2pg/scripts/systemd-dbvrep-MINE_ora2pg.service.
Created file /home/oracle/ora2pg/scripts/upstart-dbvrep-MINE_ora2pg.conf.
Created file /home/oracle/ora2pg/ora2pg-run-pg1.sh.
Created file /home/oracle/ora2pg/scripts/ora2pg-pg1-start-APPLY.sh.
Created file /home/oracle/ora2pg/scripts/ora2pg-pg1-stop-APPLY.sh.
Created file /home/oracle/ora2pg/scripts/ora2pg-pg1-dbvrep-APPLY.sh.
Created file /home/oracle/ora2pg/scripts/systemd-dbvrep-APPLY_ora2pg.service.
Created file /home/oracle/ora2pg/scripts/upstart-dbvrep-APPLY_ora2pg.conf.
Created file /home/oracle/ora2pg/Nextsteps.txt.
Created file /home/oracle/ora2pg/ora2pg-all.sh.
============================================================================
Dbvisit Replicate wizard completed
Script /home/oracle/ora2pg/ora2pg-all.sh created. 
This runs all the above created scripts. Please exit out of dbvrep, 
review and run script as current user to setup and start Dbvisit Replicate.
============================================================================
Optionally, the script can be invoked now by this wizard.
Run this script now? (Yes/No) [No]                                          dbvrep> exit

As asked at the end of the setup wizard, we run ora2pg-all.sh:

oracle@localhost:/home/oracle/ora2pg/ [PSI] . ora2pg-all.sh 
Setting up Dbvisit Replicate configuration
Configure database PSI...
This check fails if the DBID is not the expected one...
Ok, check passed.
Configure database dbname=postgres
Object grants for database PSI...
Object grants for database dbname=postgres
Setting up the configuration
Initializing......done
DDC loaded from database (0 variables).
Dbvisit Replicate version 2.9.02
Copyright (C) Dbvisit Software Limited. All rights reserved.
DDC file /home/oracle/ora2pg/config/ora2pg-onetime.ddc loaded.
MINE: Cannot determine Dbvisit Replicate dictionary version. (no
dictionary exists)
APPLY: Cannot determine Dbvisit Replicate dictionary version. (no
dictionary exists)
dbvrep> #clear the no-DDC-DB-available warning
dbvrep> process clear previous warnings
dbvrep> set ON_WARNING SKIP
Variable ON_WARNING set to SKIP for process *.
dbvrep> set ON_ERROR EXIT
Variable ON_ERROR set to EXIT for process *.
dbvrep> 
dbvrep> # Configuring default processes
dbvrep> choose process MINE
Process type MINE set to: MINE.
dbvrep> choose process APPLY
Process type APPLY set to: APPLY.
dbvrep> PROCESS SWITCH_REDOLOG
Redo log switch requested.
dbvrep> PROCESS SETUP MINE DROP DICTIONARY
0 dictionary objects dropped.
dbvrep> PROCESS SETUP MINE CREATE DICTIONARY
dbvrep> PROCESS SETUP MINE LOAD DICTIONARY
Oldest active transaction SCN: 2054212 (no active transaction)
Supplemental logging on database set.
dbvrep> PROCESS SETUP APPLY DROP DICTIONARY
0 dictionary objects dropped.
dbvrep> PROCESS SETUP APPLY CREATE DICTIONARY
dbvrep> PROCESS SETUP APPLY LOAD DICTIONARY
dbvrep> PROCESS SETUP PAIR MINE AND APPLY
Applier SCN set (start=2054228, current=2054228).
dbvrep> SET APPLY.INSTANTIATE_SCN NOW
Variable INSTANTIATE_SCN set to NOW for process APPLY.
dbvrep> SET MINE._PREPARE_SUPLOG_TYPE PK
Variable _PREPARE_SUPLOG_TYPE set to PK for process MINE.
dbvrep> EXCLUDE CREATE TABLE %.DBMS_TABCOMP_TEMP_UNCMP #Ignore tables
created by Compression Advisor
Exclude rule created.
dbvrep> EXCLUDE CREATE TABLE %.DBMS_TABCOMP_TEMP_CMP #Ignore tables
created by Compression Advisor
Exclude rule created.
dbvrep> EXCLUDE CREATE TABLE %.SCHEDULER$_% #Ignore tables created by
Oracle scheduler (also used by schema/full expdp/impdp)
Exclude rule created.
dbvrep> EXCLUDE CREATE TABLE %.CMP1$% #Ignore tables created by
Compression Advisor since 11.2.0.4
Exclude rule created.
dbvrep> EXCLUDE CREATE TABLE %.CMP2$% #Ignore tables created by
Compression Advisor since 11.2.0.4
Exclude rule created.
dbvrep> EXCLUDE CREATE TABLE %.CMP3$% #Ignore tables created by
Compression Advisor since 11.2.0.4
Exclude rule created.
dbvrep> EXCLUDE CREATE TABLE %.CMP4$% #Ignore tables created by
Compression Advisor since 11.2.0.4
Exclude rule created.
dbvrep> memory_set IGNORE_APPLY_DDL_DIFFERENCES Yes
Variable IGNORE_APPLY_DDL_DIFFERENCES set to YES for process *.
dbvrep> SET PREPARE_SCHEMA_EXCEPTIONS none
Variable PREPARE_SCHEMA_EXCEPTIONS set to none for process *.
dbvrep> PROCESS SUPPLEMENTAL LOGGING SCHEMA "PSI" ENABLE PRIMARY KEY
dbvrep> PROCESS SWITCH_REDOLOG
Redo log switch requested.
dbvrep> PROCESS WAIT_SCN_FLIP
Waited 1 seconds until scn_to_timestamp changed.
dbvrep> #single-scn instantiation: lock all tables and schemas
dbvrep> PROCESS LOCK SCHEMAS "PSI"
Locking all schemas.
...locked 2 of 2 tables from PSI schema.
Lock done.
dbvrep> #single-scn instantiation: unlock all tables and schemas, but
keep the SCN
dbvrep> PROCESS LOCK RELEASE LOCKS
Engine locks released.
dbvrep> 
dbvrep> #prepare the tables (we use OFFLINE as neither MINE nor APPLY
is running; with OFFLINE we won't wait on network timeout)
dbvrep> PREPARE OFFLINE SCHEMA "PSI"
Table PSI.EMPLOYE instantiated at SCN 2056800
Table PSI.OFFICE instantiated at SCN 2056800
dbvrep> 
dbvrep> #single-scn instantiation: unlock all tables and schemas,
forget the SCN (so it does not affect any further PREPARE statements)
dbvrep> PROCESS LOCK CLEAR SCN
dbvrep> PROCESS SWITCH_REDOLOG
Redo log switch requested.
dbvrep> #prepare script for instantiation
dbvrep> PROCESS PREPARE_DP WRITE DDL_FILE FILE
/home/oracle/ora2pg/APPLY.sql USERID SYSTEM/manager@PSI
File /home/oracle/ora2pg/APPLY.sql has been written successfully.
Created DDL script /home/oracle/ora2pg/APPLY.sql.
dbvrep> create ddcdb from ddcfile
DDC loaded into database (430 variables).
dbvrep> load ddcdb
DDC loaded from database (430 variables).
dbvrep> set ON_WARNING SKIP
Variable ON_WARNING set to SKIP for process *.
dbvrep> set ON_ERROR SKIP
Variable ON_ERROR set to SKIP for process *.
OK-0: Completed successfully.
WARN-1850: No DDC DB available, dictionary table does not exist.
These steps are required after the ora2pg-all.sh script runs:
 
1) Create the necessary directory(ies) on the servers:
cloud13c: /home/oracle/ora2pg
pg1: /home/oracle/ora2pg
 
2) Copy the DDC files to the server(s) where the processes will run:
pg1: /home/oracle/ora2pg/ora2pg-APPLY.ddc
cloud13c: /home/oracle/ora2pg/ora2pg-MINE.ddc
 
Ensure that the parameter TNS_ADMIN (in the ddc file) is pointing to the correct TNS_ADMIN path on each of the servers.
 
3) Review that path to dbvrep executable is correct in the run scripts:
/home/oracle/ora2pg/ora2pg-run-cloud13c.sh
/home/oracle/ora2pg/ora2pg-run-pg1.sh
 
4) Copy the run script to the server(s) where the processes will run:
cloud13c: /home/oracle/ora2pg/ora2pg-run-cloud13c.sh
pg1: /home/oracle/ora2pg/ora2pg-run-pg1.sh
 
5) Ensure firewall is open for listen interfaces 0.0.0.0:7902, 0.0.0.0:7901 used by the processes.
 
6) Make sure the data on apply are in sync as of time when setup was run.
Scripts for Data Pump/export/DDL were created as requested:
 
Create referenced database links (if any) before running the scripts.
/home/oracle/ora2pg/APPLY.sql
 
7) Start the replication processes on all servers:
cloud13c: /home/oracle/ora2pg/ora2pg-run-cloud13c.sh
pg1: /home/oracle/ora2pg/ora2pg-run-pg1.sh
 
8) Start the console to monitor the progress:
/home/oracle/ora2pg/start-console.sh

As explained, you have to copy two files to the postgres server: /home/oracle/ora2pg/ora2pg-APPLY.ddc and /home/oracle/ora2pg/ora2pg-run-pg1.sh

As I chose the ddl_only option, we first have to create the tables on the postgres server. In order to do this we can use the APPLY.sql file generated on the Oracle server.
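To load this DDL file on the PostgreSQL side, something like the following sketch can be used. The user and database names are assumptions to adapt to your environment, and the guard makes it degrade gracefully on hosts where psql or the file is not available:

```shell
# Hedged sketch: run the generated APPLY.sql against the target PostgreSQL
# database. User/database names are assumptions, adapt to your environment.
load_ddl() {
  if command -v psql >/dev/null 2>&1 && [ -f "$1" ]; then
    psql -U postgres -d postgres -f "$1"
  else
    echo "SKIP: psql or $1 not available on this host"
  fi
}
load_ddl /home/oracle/ora2pg/APPLY.sql
```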

The next step consists in running the MINE process on the Oracle server:

oracle@localhost:/home/oracle/ora2pg/ [PSI] . ora2pg-run-cloud13c.sh 
Initializing......done
DDC loaded from database (430 variables).
Dbvisit Replicate version 2.9.02
Copyright (C) Dbvisit Software Limited. All rights reserved.
DDC file /home/oracle/ora2pg/ora2pg-MINE.ddc loaded.
Starting process MINE...started

And we launch the APPLY process on the postgres server:

postgres@pg_essentials_p1:/home/oracle/ora2pg/ [PG1] . ora2pg-run-pg1.sh 
Initializing......done
DDC loaded from database (431 variables).
Dbvisit Replicate version 2.9.02
Copyright (C) Dbvisit Software Limited. All rights reserved.
DDC file /home/oracle/ora2pg/ora2pg-APPLY.ddc loaded.
Starting process APPLY...Created directory /home/oracle/ora2pg/ddc_backup
Created directory /home/oracle/ora2pg/log/
Created directory /home/oracle/ora2pg/log/trace/
Created directory /home/oracle/ora2pg/apply
started

Initially I had two tables in my PSI Oracle database belonging to the psi schema: EMPLOYE and OFFICE. I used the APPLY.sql script to create the tables in the postgres environment.

To visualize the activity we run start-console.sh on the Oracle server:

oracle@localhost:/home/oracle/ora2pg/ [PSI] . start-console.sh 
Initializing......done
DDC loaded from database (431 variables).
Dbvisit Replicate version 2.9.02
Copyright (C) Dbvisit Software Limited. All rights reserved.
 
| Dbvisit Replicate 2.9.02(MAX edition) - Evaluation License expires in 30 days
MINE is running. Currently at plog 120 and SCN 2060066 (11/07/2017 15:27:57).
APPLY is running. Currently at plog 120 and SCN 2060021 (11/07/2017 15:27:45).
Progress of replication ora2pg:MINE->APPLY: total/this execution
-------------------------------------------------------------------------------------------------
PSI.EMPLOYE/psi.employe:      100%  Mine:1/1             Unrecov:0/0         Applied:1/1         Conflicts:0/0       Last:07/11/2017 15:20:06/OK
PSI.OFFICE/psi.office:        100%  Mine:1/1             Unrecov:0/0         Applied:1/1         Conflicts:0/0       Last:07/11/2017 15:21:36/OK
-------------------------------------------------------------------------------------------------
2 tables listed.

And we can validate that each insert in the employe or office table is replicated on the postgres server:

From the postgres database:

(postgres@[local]:5432) [postgres] > select * from psi.employe;
 name  | salary 
-------+--------
 Larry |  10000
 Bill  |   2000
(2 rows)

From the Oracle server:

SQL> insert into employe values ('John', 50000);
 
1 row created.
 
SQL> commit;
 
Commit complete.

The console gives us the correct information:

/ Dbvisit Replicate 2.9.02(MAX edition) - Evaluation License expires in 30 days
MINE is running. Currently at plog 120 and SCN 2075526 (11/07/2017 16:44:17).
APPLY is running. Currently at plog 120 and SCN 2075494 (11/07/2017 16:44:08).
Progress of replication ora2pg:MINE->APPLY: total/this execution
-------------------------------------------------------------------------------------------------
PSI.EMPLOYE/psi.employe:      100%  Mine:3/3             Unrecov:0/0         Applied:3/3         Conflicts:0/0       Last:07/11/2017 16:18:41/OK
PSI.OFFICE/psi.office:        100%  Mine:3/3             Unrecov:0/0         Applied:3/3         Conflicts:0/0       Last:07/11/2017 15:37:02/OK
-------------------------------------------------------------------------------------------------
2 tables listed.

And the result is applied on the postgres database:
(postgres@[local]:5432) [postgres] > select * from psi.employe;
 name  | salary 
-------+--------
 Larry |  10000
 Bill  |   2000
 John  |  50000
(3 rows)
 
As we previously chose the single-scn and ddl-run options, we had to run the APPLY.sql script from the Oracle server in order to create the tables on the postgres side. Alternatively, in Step 2 of the configuration wizard, you can choose the load option (all replicated data is created and loaded automatically):
Lock and copy the data initially one-by-one or at a single SCN?
one-by-one : Lock tables one by one and capture SCN
single-scn : One SCN for all tables
ddl-only   : Only DDL script for target objects
resetlogs  : Use SCN from last resetlogs operation (standby activation, rman incomplete
recovery)
no-lock    : Do not lock tables. Captures previous SCN of oldest active transaction. Requires
pre-requisite running of pre-all.sh script                                                    (one-by-one/single-scn/ddl-only/resetlogs/no-lock) [single-scn] 
 
What data instantiation script to create?
ddl_file       : DDL file created (APPLY.sql)
ddl_run        : DDL is automatically executed on target
load           : All replicated data is created and loaded automatically
none                                                                                          (ddl_file/ddl_run/load/none) [ddl_run] load
Do you want to (re-)create the tables on target or keep them (they are already created)?      (create/keep) [keep] create

In this case you can visualize that each Oracle table is replicated to the Postgres server.

From the oracle server:

SQL> create table salary (name varchar2(10)); 
 
Table created.
 
SQL> insert into salary values ('Larry');
 
1 row created.
 
SQL> commit;
 
Commit complete.

The Dbvisit console displays the correct information:

\ Dbvisit Replicate 2.9.02(MAX edition) - Evaluation License expires in 30 days
MINE is running. Currently at plog 135 and SCN 2246259 (11/27/2017 14:44:24).
APPLY is running. Currently at plog 135 and SCN 2246237 (11/27/2017 14:44:18).
Progress of replication replic:MINE->APPLY: total/this execution
--------------------------------------------------------------------------------------
REP.SALARY:                   100%  Mine:1/1             Unrecov:0/0         Applied:1/1         Conflicts:0/0       Last:27/11/2017 14:01:25/OK
---------------------------------------------------------------------------------------------
1 tables listed.

From the postgres server:

(postgres@[local]:5432) [postgres] > select * from rep.salary;
 name  
-------
 Larry
(1 row)

The plog files generated on the postgres server contain the strings we need.

The plog files are generated on the postgres server in the directory /home/oracle/replic/apply:

-bash-4.2$ ls
122.plog.gz  124.plog.gz  126.plog  128.plog.gz  130.plog.gz  132.plog.gz  134.plog
123.plog.gz  125.plog.gz  127.plog  129.plog.gz  131.plog.gz  133.plog.gz  135.plog
-bash-4.2$ strings 135.plog | grep -l larry
-bash-4.2$ strings 135.plog | grep -i larry
Larry
-bash-4.2$ strings 135.plog | grep -i salary
SALARY
create table salary (name varchar2(10))
SALARY
SALARY
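The manual check above can be wrapped into a small helper that also looks inside the gzipped plogs. This is just a sketch (the directory and search string are placeholders; real plogs are binary, so in practice pipe the content through strings first as shown above):

```shell
# Sketch: look for a string in every plog of a directory, gzipped or not.
scan_plogs() {
  for f in "$1"/*.plog "$1"/*.plog.gz; do
    [ -e "$f" ] || continue
    case "$f" in
      *.gz) gunzip -c "$f" | grep -qi "$2" && echo "match: $f" ;;
      *)    grep -qi "$2" "$f" && echo "match: $f" ;;
    esac
  done
  return 0
}

# Demo with dummy files standing in for real plogs:
demo=$(mktemp -d)
printf 'create table salary (name varchar2(10))\n' > "$demo/135.plog"
gzip -c "$demo/135.plog" > "$demo/134.plog.gz"
scan_plogs "$demo" salary
```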

Despite some problems at the beginning of my tests, the replication from Oracle to PostgreSQL is working fine and fast. There are many more possibilities with Dbvisit Replicate that I will try to test in the following weeks.

The article DBVISIT from Oracle to Postgres appeared first on Blog dbi services.

WebLogic – SSO/Atn/Atz – How to enable debug logs

Sat, 2017-11-25 13:04

Let’s say that you have an application deployed on a WebLogic Server and that the SSO/Atn/Atz (SSO/Authentication/Authorization) isn’t working or is throwing a 4xx error code like ‘401 – Unauthorized’ or ‘403 – Forbidden’. How would you be able to know what is happening and how to solve it? Well the answer is always the same: enable the debug logs. It is just a matter of choosing the right scope and a way to do it because yes, it is possible to enable/disable them in several ways…

In this blog, I will be using a WebLogic Server configured with a SAML2 Single Sign-On (SSO) and hosting a Documentum D2 application. I will therefore enable and disable the debug logs for a Managed Server named ‘msD2-01’ and this MS is hosting D2, obviously (this is just for the background, it does not matter which application is running there!).

In the case of the SAML2 SSO, it’s pretty easy to recognize some patterns of errors: you just have to access the SSO URL and then closely watch the redirection from the WLS to the IdP Partner and then back to the WLS. Depending on when the issue appears, you can find out where the issue is. But in case you absolutely don’t know anything about all that or if you are completely new to this, you can and should always do the same thing: enable the debug logs.

 

I. Administration Console

WebLogic provides several ways to enable debug logs but the one that makes it really easy, even for beginners, is the WebLogic Administration Console. In this blog, I will only talk about SAML2 SSO, Authentication and Authorization. So enabling the debug logs for these three elements can be done using these steps:

  1. Login to the Admin Console using your weblogic account
  2. Navigate to the correct page: DOMAIN > Environment > Servers > ServerName (msD2-01 for me) > Debug
  3. Click on the ‘Lock & Edit’ button
  4. Expand the scope ‘weblogic’ and then ‘security’
  5. Check the correct checkboxes, depending on which element you want to enable:
    • atn
    • atz
    • ldap
    • saml
    • saml2
  6. Go back to the top and click on the ‘Enable’ button
  7. Click on the ‘Activate Changes’ button -> All changes have been activated. No restarts are necessary.

The correct page on the WebLogic Admin Console (step 2) is this one:

Debug

From this moment, you can check the Managed Server log file (its name is defined in the ‘Logging’ tab) which is $DOMAIN_HOME/servers/ServerName/logs/ServerName.log by default ($DOMAIN_HOME/servers/msD2-01/logs/msD2-01.log in my case) and it will start to log the messages for the elements that you enabled.

Do NOT enable too many elements! I usually only enable the SAML2 when I know it is linked to the SSO or atn+atz when it is more linked to how WebLogic manages the Authentication/Authorization. These are the three main scopes that you should work with when debugging an issue related to SSO/Authentication/Authorization.

Also, do not forget to disable the debug logs when you are done with your analysis. It is pretty obvious but it can store a lot of information in the log files so…

Please also note that if you expand the ‘atn’ scope for example, you will see that inside it, there is actually the ‘DebugSecurityAtn’, ‘DebugSecuritySAML2Atn’ and ‘DebugSecuritySAMLAtn’ -> these are attributes (not expandable). Therefore enabling the whole ‘atn’ will already activate some of the SAML2 debug logs. So there is a notion of group (=scope) on the Admin Console to simplify the selection that you can also find on other solutions below.

 

II. config.xml

If you already worked with WebLogic, you probably know the purpose of the config.xml file and you probably also know that it can be a little bit dangerous to mess with this file (not really if you know what you are doing). However it is still possible to do some configuration there, so at your own risk :).

By default, this file will NOT contain any information regarding the debug logging, unless you already enabled them at least once in the past using the Admin Console for example. So since there is, by default, no information regarding the debug logging, you need to add them manually if you want to configure the logging using the config.xml file. Be aware that the location where you put the configuration matters!

[weblogic@weblogic_server_01 ~]$ cd $DOMAIN_HOME/config
[weblogic@weblogic_server_01 config]$
[weblogic@weblogic_server_01 config]$ grep -C1 -E "debug|server>" config.xml
  </log>
  <server>
    <name>AdminServer</name>
--
    <custom-trust-key-store-pass-phrase-encrypted>{AES}QARbQAV1ul1u3tJcsGAhdATQrNJe1YlVnQmY9d1jWQFx4aM=</custom-trust-key-store-pass-phrase-encrypted>
  </server>
  <server>
    <name>msD2-01</name>
--
    </single-sign-on-services>
  </server>
  <production-mode-enabled>true</production-mode-enabled>
[weblogic@weblogic_server_01 config]$

 

With the example above, there are no debug information in the config.xml file so let’s add the SAML2, atn and atz logging. You can put these entries between the ‘<listen-port-enabled>’ and ‘<listen-address>’ parameters for example. That’s what it looks like afterwards:

[weblogic@weblogic_server_01 config]$ grep -C1 -E "debug|server>" config.xml
  </log>
  <server>
    <name>AdminServer</name>
--
    <custom-trust-key-store-pass-phrase-encrypted>{AES}QARbQAV1ul1u3tJcsGAhdATQrNJe1YlVnQmY9d1jWQFx4aM=</custom-trust-key-store-pass-phrase-encrypted>
  </server>
  <server>
    <name>msD2-01</name>
--
    <listen-port-enabled>false</listen-port-enabled>
    <server-debug>
      <debug-scope>
        <name>weblogic.security.atn</name>
        <enabled>true</enabled>
      </debug-scope>
      <debug-scope>
        <name>weblogic.security.atz</name>
        <enabled>true</enabled>
      </debug-scope>
      <debug-scope>
        <name>weblogic.security.saml2</name>
        <enabled>true</enabled>
      </debug-scope>
    </server-debug>
    <listen-address>10.10.10.10</listen-address>
--
    </single-sign-on-services>
  </server>
  <production-mode-enabled>true</production-mode-enabled>
[weblogic@weblogic_server_01 config]$

 

As you can see above, the debug-scope name that needs to be added in the config.xml file is simply the concatenation of the different elements you expanded in the WebLogic Administration Console separated by dots (E.g.: weblogic.security.saml2).
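Since the <debug-scope> entries all share the same shape, a tiny helper can generate them from the dotted scope name. This is just a sketch to illustrate the naming convention, not an official tool:

```shell
# Emit the config.xml <debug-scope> element for a given dotted scope name.
debug_scope_xml() {
  printf '      <debug-scope>\n        <name>%s</name>\n        <enabled>true</enabled>\n      </debug-scope>\n' "$1"
}
debug_scope_xml weblogic.security.saml2
```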

Above, I only talked about scopes again. If you want to set the attributes directly, it is also possible by adding this for example ‘<debug-security-saml2-service>true</debug-security-saml2-service>’ or ‘<debug-security-atz>true</debug-security-atz>’ but let’s try to keep it simple… I’m only showing the minimum to be able to debug issues but that is obviously not everything you can do.

The problem with configuring something in the config.xml: you need to restart the Managed Server for the change to be applied… Indeed, all changes are written to this file as soon as they are activated but this file is only loaded at the boot so a restart is needed, which makes this solution a little bit less effective when we are talking about debug logging because we usually want to enable/disable them on the fly!

 

III. WLST

You all know what the WLST is so let’s just get to it. This solution is probably the best one if you want to script/automate it because it’s the whole purpose of the WebLogic Scripting Tool, yeah yeah I know it is unbelievable ;)! Below I will not use the config and key files to connect to the AdminServer but keep in mind that these files contain the weblogic username and password encrypted so if you want to automate something, you will most probably need them to avoid clear text passwords! So first the connection to the WLST:

[weblogic@weblogic_server_01 config]$ source $MW_HOME/wlserver/server/bin/setWLSEnv.sh
CLASSPATH=$JAVA_HOME/lib/tools.jar:$MW_HOME/wlserver/modules/features/wlst.wls.classpath.jar:

PATH=$MW_HOME/wlserver/server/bin:$MW_HOME/wlserver/../oracle_common/modules/org.apache.ant_1.9.2/bin:$JAVA_HOME/jre/bin:$JAVA_HOME/bin:$JAVA_HOME/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/weblogic/bin:$MW_HOME/wlserver/../oracle_common/modules/org.apache.maven_3.2.5/bin

Your environment has been set.
[weblogic@weblogic_server_01 config]$
[weblogic@weblogic_server_01 config]$ java -Djava.security.egd=file:///dev/./urandom -Dwlst.offline.log=disable weblogic.WLST

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

wls:/offline> connect('weblogic','P4sSw0rd','t3s://weblogic_server_01:8443')
Connecting to t3s://weblogic_server_01:8443 with userid weblogic ...
<Nov 11, 2017 2:50:12 PM UTC> <Info> <Security> <BEA-090908> <Using the default WebLogic SSL Hostname Verifier implementation.>
Successfully connected to Admin Server "AdminServer" that belongs to domain "DOMAIN".

wls:/DOMAIN/serverConfig/> edit()
Location changed to edit tree.
This is a writable tree with DomainMBean as the root.
To make changes you will need to start an edit session via startEdit().
For more help, use help('edit').

wls:/DOMAIN/edit/> cd('Servers/msD2-01/ServerDebug/msD2-01')
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01> startEdit()
Starting an edit session ...
Started edit session, be sure to save and activate your changes once you are done.
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01 !> set('DebugSecurityAtn','true')
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01 !> set('DebugSecurityAtz','true')
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01 !> set('DebugSecuritySAML2Atn','true')
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01 !> set('DebugSecuritySAML2Service','true')
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01 !> set('DebugSecuritySAMLCredMap','true')
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01 !> save()
Saving all your changes ...
Saved all your changes successfully.
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01 !> activate()
Activating all your changes, this may take a while ...
The edit lock associated with this edit session is released once the activation is completed.
Activation completed
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01>

 

The commands above will enable the attributes only. In the first paragraph above (related to the Admin Console), I talked about groups (=scopes) and I actually only enabled the groups themselves in this first solution (scopes = ‘atn’, ‘atz’ and ‘saml2’). So if you enabled the scope ‘atn’ on the Admin Console for example, then you will probably see the attribute ‘DebugSecurityAtn’ (in the WLST session) set to false (unless you already changed it using the above WLST commands). If you want to enable the scopes directly, it is not at the same location. Here is an example for the SAML2 scope:

wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01> cd('DebugScopes/weblogic.security.saml2')
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01/DebugScopes/weblogic.security.saml2> ls()

-r--   DynamicallyCreated                           false
-rw-   Enabled                                      false
-r--   Id                                           0
-rw-   Name                                         weblogic.security.saml2
-rw-   Notes                                        null
-rw-   Tags                                         null
-r--   Type                                         DebugScope

-r-x   addTag                                       Boolean : String(tag)
-r-x   freezeCurrentValue                           Void : String(attributeName)
-r-x   getInheritedProperties                       String[] : String[](propertyNames)
-r-x   isInherited                                  Boolean : String(propertyName)
-r-x   isSet                                        Boolean : String(propertyName)
-r-x   removeTag                                    Boolean : String(tag)
-r-x   restoreDefaultValue                          Void : String(attributeName)
-r-x   unSet                                        Void : String(propertyName)

wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01/DebugScopes/weblogic.security.saml2> startEdit()
Starting an edit session ...
Started edit session, be sure to save and activate your changes once you are done.
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01/DebugScopes/weblogic.security.saml2 !> set('Enabled','true')
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01/DebugScopes/weblogic.security.saml2 !> ls()

-r--   DynamicallyCreated                           false
-rw-   Enabled                                      true
-r--   Id                                           0
-rw-   Name                                         weblogic.security.saml2
-rw-   Notes                                        null
-rw-   Tags                                         null
-r--   Type                                         DebugScope

-r-x   addTag                                       Boolean : String(tag)
-r-x   freezeCurrentValue                           Void : String(attributeName)
-r-x   getInheritedProperties                       String[] : String[](propertyNames)
-r-x   isInherited                                  Boolean : String(propertyName)
-r-x   isSet                                        Boolean : String(propertyName)
-r-x   removeTag                                    Boolean : String(tag)
-r-x   restoreDefaultValue                          Void : String(attributeName)
-r-x   unSet                                        Void : String(propertyName)

wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01/DebugScopes/weblogic.security.saml2 !> save()
Saving all your changes ...
Saved all your changes successfully.
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01/DebugScopes/weblogic.security.saml2 !> activate()
Activating all your changes, this may take a while ...
The edit lock associated with this edit session is released once the activation is completed.
Activation completed
wls:/DOMAIN/edit/Servers/msD2-01/ServerDebug/msD2-01/DebugScopes/weblogic.security.saml2> disconnect()
Disconnected from weblogic server: AdminServer
wls:/offline> exit()

Exiting WebLogic Scripting Tool.

[weblogic@weblogic_server_01 config]$

 

Pretty cool, right?
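One last note on scripting: the clear-text password in connect() can be avoided with the user configuration and key files mentioned at the beginning of this section. A hedged WLST sketch (the file paths are assumptions to adapt):

```
# One-time creation of the encrypted files (run while connected to the AdminServer):
#   storeUserConfig('/home/weblogic/.wlst/wls.config', '/home/weblogic/.wlst/wls.key')
# Later connections then need no clear-text password:
connect(userConfigFile='/home/weblogic/.wlst/wls.config',
        userKeyFile='/home/weblogic/.wlst/wls.key',
        url='t3s://weblogic_server_01:8443')
```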

 

IV. Command line

The last way to do this is via the command line but just like the config.xml, this will require a restart of the concerned Managed Server. Here are some examples of command line arguments that can be added to the JAVA_OPTIONS to enable the debug logs on the Attributes level:

  • -Dweblogic.debug.DebugSecurityAtn=true
  • -Dweblogic.debug.DebugSecurityAtz=true
  • -Dweblogic.debug.DebugSecuritySAML2Atn=true
  • -Dweblogic.debug.DebugSecuritySAML2Service=true
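These flags are usually appended to the JAVA_OPTIONS used by the Managed Server start scripts (e.g. in setUserOverrides.sh or setDomainEnv.sh, depending on your WebLogic version and setup). A minimal sketch:

```shell
# Append the SAML2 debug flags to the existing JAVA_OPTIONS (sketch; the
# variable is normally exported from the domain start scripts).
JAVA_OPTIONS="${JAVA_OPTIONS:-} -Dweblogic.debug.DebugSecuritySAML2Atn=true"
JAVA_OPTIONS="${JAVA_OPTIONS} -Dweblogic.debug.DebugSecuritySAML2Service=true"
export JAVA_OPTIONS
echo "$JAVA_OPTIONS"
```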

With all that, you should be well armed to face any issue!

 

 

The article WebLogic – SSO/Atn/Atz – How to enable debug logs appeared first on Blog dbi services.

Documentum – DFC traces setup & investigation

Sat, 2017-11-25 12:11

When working with Documentum, you will most probably have to enable the DFC traces one day or another and then work with these traces to analyze them. The purpose of this blog is simply to show how the DFC traces can be enabled, which tools can be used to quickly process them and what are the limitations of such things.

Enabling the DFC traces can be done very easily by updating the dfc.properties file of the client. This client can be a DA, D2, JMS, Index Agent, aso… The change is applied directly (if enable=true) and tracing is disabled by default (if the property is commented out or set to false). If you have a dfc.properties that is inside a war file (for DA/D2 for example) and that you deployed your application as a war file (not exploded), then disabling the tracing might need a restart of your application. To avoid that, you can have a dfc.properties inside the war file that just points to another one outside of the war file and then enabling/disabling the traces from this second file will work properly. There are a lot of options to customize how the traces should be generated. A first example with only a few properties that you can use and reuse every time you need traces:

dfc.tracing.enable=true
dfc.tracing.verbose=true
dfc.tracing.max_stack_depth=0
dfc.tracing.mode=compact
dfc.tracing.dir=/tmp/dfc_tracing

 

Another example with more properties to really specify what you want to see:

dfc.tracing.enable=true
dfc.tracing.verbose=true
dfc.tracing.max_stack_depth=4
dfc.tracing.include_rpcs=true
dfc.tracing.mode=standard
dfc.tracing.include_session_id=true
dfc.tracing.user_name_filter[0]=dmadmin
dfc.tracing.user_name_filter[1]=myuser
dfc.tracing.thread_name_filter[0]=Thread-3
dfc.tracing.thread_name_filter[1]=Thread-25
dfc.tracing.timing_style=milliseconds_from_start
dfc.tracing.dir=/tmp/dfc_tracing
dfc.tracing.file_prefix=mydfc
dfc.tracing.max_backup_index=10
dfc.tracing.max_file_size=104857600
...

 

All these properties are quite easy to understand even without explanation but you can probably find more information and all the possible options in the official Documentum documentation. It’s not the main purpose of this blog so I’m just mentioning a few properties to get started. By default, the name of the generated files will be something like “dfctrace.timestamp.log”, you can change that by setting the “dfc.tracing.file_prefix” for example. Adding and customizing the properties will change the display format and style inside the files so if you want to have a way to analyze these DFC traces, it is better to use more or less always the same set of options. For the example below, OTX asked me to use these properties only:

dfc.tracing.enable=true
dfc.tracing.verbose=true
dfc.tracing.max_stack_depth=4
dfc.tracing.include_rpcs=true
dfc.tracing.mode=compact
dfc.tracing.include_session_id=true
dfc.tracing.dir=/tmp/dfc_tracing

 

When you have your DFC traces, you need a way to analyze them. They are pretty much readable but it will be complicated to get something out of it without spending a certain amount of time – unless you already know what you are looking for – simply because there are a lot of information inside… For that purpose, Ed Bueche developed more than 10 years ago some AWK scripts to parse the DFC trace files: traceD6.awk and trace_rpc_histD6.awk. You can find these scripts at the following locations (all EMC links… so they might not be working at some point in the future):

As you can see above, it is not really maintained and the same scripts or a mix of several versions can be found at several locations so it can be a little bit confusing. All the old links are about the awk scripts but since 2013, there is now a python script too (also developed by Ed Bueche).

In this blog, I wanted to talk about the AWK scripts mainly. Earlier this month, I was working with OTX on some performance tuning tasks and for that, I gathered the DFC traces for several scenarios, in different log files, well separated, aso… Then, I provided them to OTX for the analysis. OTX came back to me a few minutes later saying that most of the traces were corrupted and asking me to regenerate them. I wasn’t quite OK with that simply because it takes time and because there were some testing in progress on this environment so gathering clean DFC traces for several scenarios would have forced the tests to be stopped, aso… (Ok ok you got me, I’m just lazy ;))

The content of the DFC traces looked correct to me and after a quick verification, I saw that OTX was using the AWK scripts (traceD6.awk and trace_rpc_histD6.awk) to analyze the logs but they were apparently getting an error. The files didn’t look corrupted to me so I mentioned to OTX that the issue might very well be with the AWK scripts they were using. They didn’t really listen to what I said and stayed focused on getting a new set of DFC traces. I already used these scripts but never really looked inside so it was the perfect reason to take some time for that:

[dmadmin@content_server_01 ~]$ cd /tmp/dfc_tracing/
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ ls -l trace* dfctrace.*.log
-rw-r-----. 1 dmadmin dmadmin 92661060 Nov 3 09:24 dfctrace.1510220481672.log
-rw-r-----. 1 dmadmin dmadmin 3240 Nov 4 14:10 traceD6.awk
-rw-r-----. 1 dmadmin dmadmin 7379 Nov 4 14:10 traceD6.py
-rw-r-----. 1 dmadmin dmadmin 5191 Nov 4 14:10 trace_rpc_histD6.awk
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ awk -f traceD6.awk < dfctrace.1510220481672.log > output_awk_1.log
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ wc -l output_awk_1.log
2 output_awk_1.log
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ awk -f trace_rpc_histD6.awk < dfctrace.1510220481672.log > output_awk_2.log
awk: trace_rpc_histD6.awk:203: (FILENAME=- FNR=428309) fatal: division by zero attempted
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ wc -l output_awk_2.log
4 output_awk_2.log
[dmadmin@content_server_01 dfc_tracing]$

 

As you can see above, the first script generated a log file that contains only 2 lines, so this is already suspicious even if there are no errors. The second script generated an error and its log file contains only 4 lines… The input DFC trace file has a size of 90MB so it’s clear that there is something wrong and that’s why OTX said that the DFC traces were corrupted. The error message shows the line (203) as the origin of the issue as well as a “division by zero attempted” message. This obviously means that somewhere on this line, there is a division and that the divisor is equal to 0 or at least not set at all. Since I love all kinds of UNIX scripting, I would rather fix the bug in the script than have to generate a new set of DFC traces (and a new set would still be impacted by the issue anyway…)! So checking inside the trace_rpc_histD6.awk file, line 203 is the following one:

[dmadmin@content_server_01 dfc_tracing]$ grep -n -C1 "TIME SPENT" trace_rpc_histD6.awk
202-    printf ("DURATION (secs):\t%17.3f\n", ((curr_tms - st_tms)) );
203:    printf ("TIME SPENT EXECUTING RPCs (secs):%8.3f (which is %3.2f percent of total time)\n", total_rpc_time, 100*total_rpc_time/(curr_tms - st_tms));
204-    printf ("Threads :\t%25d\n", thread_cnt);
[dmadmin@content_server_01 dfc_tracing]$

 

The only division on this line is the total time taken to execute the RPCs divided by the duration of the log file (timestamp of last message – first message). So the value of “curr_tms – st_tms” is 0. Potentially, it could be that both variables have the exact same value but since the first and last messages on the DFC traces don’t have the same timestamp, this isn’t possible and therefore both variables are actually 0 or not set. To check where these variables are defined, how and in which function:

[dmadmin@content_server_01 dfc_tracing]$ grep -n -C15 -E "curr_tms|st_tms" trace_rpc_histD6.awk | grep -E "curr_tms|st_tms|^[0-9]*[:-][^[:space:]]"
144-/ .RPC:/ {
159:                    st_tms = $1;
162:            curr_tms = $1;
175-}
177-/obtained from pool/ {
--
187-}
188-/.INFO: Session/ {
193-}
197-END {
202:    printf ("DURATION (secs):\t%17.3f\n", ((curr_tms - st_tms)) );
203:    printf ("TIME SPENT EXECUTING RPCs (secs):%8.3f (which is %3.2f percent of total time)\n", total_rpc_time, 100*total_rpc_time/(curr_tms - st_tms));
[dmadmin@content_server_01 dfc_tracing]$

 

This shows that the only location where these two variables are set is inside the matching pattern “/ .RPC:/” (st_tms is set to $1 only on the first execution). So this portion of code is never executed; in other words, this pattern is never found in the DFC trace file. Why is that? Well, that’s pretty simple: the DFC trace file contains a lot of RPC calls, but these lines never contain “ .RPC:”, there are always at least two dots (something like “ ..RPC:” or “ …RPC:” or “ ….RPC:”). The reason why there are several dots is simply that the number of dots reflects the depth at which the RPC is called… In this case, OTX asked us to use “dfc.tracing.max_stack_depth=4”, so this is what I did, and it is the reason why the AWK scripts cannot work by default: they need “dfc.tracing.max_stack_depth=0”, as written at the beginning of the scripts in the comment sections.
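Independently of the pattern fix, the END block of the script could also be hardened so it degrades gracefully instead of aborting. A minimal sketch (the variable names come from the script; the guard and the fallback message are my own addition):

```shell
# Simulate the failing END block: st_tms and curr_tms were never set because
# the "/ .RPC:/" pattern never matched, so the computed duration is 0 and the
# original printf would abort with "division by zero attempted".
awk 'BEGIN {
    total_rpc_time = 2.5           # value normally accumulated by the script
    duration = curr_tms - st_tms   # both unset -> duration is 0
    if (duration > 0)
        printf("TIME SPENT EXECUTING RPCs (secs):%8.3f (which is %3.2f percent of total time)\n", total_rpc_time, 100 * total_rpc_time / duration)
    else
        printf("TIME SPENT EXECUTING RPCs (secs):%8.3f (duration is zero, check the RPC pattern)\n", total_rpc_time)
}'
```

With such a guard, a non-matching pattern would still produce a readable hint instead of killing the whole report.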

So a simple way to fix the AWK scripts is to remove the space at the beginning of the pattern for both the traceD6.awk and trace_rpc_histD6.awk scripts and after doing that, it will work for all max_stack_depth values:

[dmadmin@content_server_01 dfc_tracing]$ grep -n ".RPC:/" *.awk
traceD6.awk:145:/ .RPC:/ {
trace_rpc_histD6.awk:144:/ .RPC:/ {
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ sed -i 's,/ .RPC:/,/.RPC:/,' *.awk
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ grep -n ".RPC:/" *.awk
traceD6.awk:145:/.RPC:/ {
trace_rpc_histD6.awk:144:/.RPC:/ {
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ awk -f traceD6.awk < dfctrace.1510220481672.log > output_awk_1.log
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ wc -lc output_awk_1.log
 1961 163788 output_awk_1.log
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ awk -f trace_rpc_histD6.awk < dfctrace.1510220481672.log > output_awk_2.log
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ wc -lc output_awk_2.log
 367 49050 output_awk_2.log
[dmadmin@content_server_01 dfc_tracing]$

 

That looks much, much better… Basically, the first script lists all RPCs with their thread, name and times, while the second script creates a sorted list of the queries that took the most time to execute, as well as a list of calls and occurrences per type/name.

The AWK and Python scripts, even if they are globally working, might have some issues with commas, parentheses and the like (again, it depends which dfc.tracing options you selected). This is why I mentioned above that there are actually both an AWK and a Python version of these scripts. Sometimes the AWK scripts will contain the right information, sometimes the Python version will, but in all cases the latter will run much faster. So if you want to work with these scripts, you will have to juggle a little bit:

[dmadmin@content_server_01 dfc_tracing]$ python traceD6.py dfctrace.1510220481672.log > output_py_1.log
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ wc -lc output_py_1.log
 1959 194011 output_py_1.log
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ python traceD6.py dfctrace.1510220481672.log -profile > output_py_2.log
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ wc -lc output_py_2.log
 342 65917 output_py_2.log
[dmadmin@content_server_01 dfc_tracing]$

 

As you can see, there are fewer lines in the Python output files, but that’s because some unnecessary headers have been removed in the Python version, so it’s actually normal. However, there are many more characters, which shows that, in this case, the extracted DQL queries contain more characters; it does not mean that these characters are actually part of the DQL queries. You will see below that there are references to “, FOR_UPDATE=F, BATCH_HINT=50, BOF_DQL=T]],50,true,true)” => this is NOT part of the DQL, but it is present in the output of the Python script while it is not in the AWK one:

[dmadmin@content_server_01 dfc_tracing]$ head -15 output_awk_1.log
analysis program version 2 based on DFC build 6.0.0.76
68354.130 & 0.005 & [http--0.0.0.0-9082-3] & EXEC_QUERY  select r_object_id from dm_sysobject where folder ('/Home') and object_name = 'Morgan Patou'
68354.135 & 0.000 & [http--0.0.0.0-9082-3] & multiNext
68354.136 & 0.005 & [http--0.0.0.0-9082-3] & SysObjFullFetch  0b0f12345004f0de
68354.165 & 0.002 & [http--0.0.0.0-9082-4] & EXEC_QUERY  select r_object_id from dm_user where user_name = 'Morgan Patou'
68354.167 & 0.000 & [http--0.0.0.0-9082-4] & multiNext
68354.167 & 0.002 & [http--0.0.0.0-9082-4] & IsCurrent
68354.170 & 0.003 & [http--0.0.0.0-9082-4] & EXEC_QUERY  SELECT COUNT(*) AS items FROM dm_group WHERE group_name = 'report_user' AND ANY i_all_users_names = 'Morgan Patou'
68354.173 & 0.001 & [http--0.0.0.0-9082-4] & multiNext
68354.175 & 0.003 & [http--0.0.0.0-9082-4] & EXEC_QUERY  select r_object_id from dm_sysobject where folder ('/myInsight') and object_name = 'myInsight.license'
68354.178 & 0.001 & [http--0.0.0.0-9082-4] & multiNext
68354.179 & 0.001 & [http--0.0.0.0-9082-4] & IsCurrent
68354.165 & 0.010 & [http--0.0.0.0-9082-3] & SysObjGetPermit
68354.175 & 0.006 & [http--0.0.0.0-9082-3] & SysObjGetXPermit
68354.181 & 0.006 & [http--0.0.0.0-9082-4] & MAKE_PULLER
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ head -15 output_py_1.log
68354.130 & 0.005 & [http--0.0.0.0-9082-3] & EXEC_QUERY & select r_object_id from dm_sysobject where folder ('/Home') and object_name = 'Morgan Patou', FOR_UPDATE=F, BATCH_HINT=50, BOF_DQL=T]],50,true,true)
68354.135 & 0.000 & [http--0.0.0.0-9082-3] & multiNext &
68354.136 & 0.005 & [http--0.0.0.0-9082-3] & SysObjFullFetch & 0b0f12345004f0de
68354.165 & 0.002 & [http--0.0.0.0-9082-4] & EXEC_QUERY & select r_object_id from dm_user where user_name = 'Morgan Patou', FOR_UPDATE=F, BATCH_HINT=50, BOF_DQL=T]],50,true,true)
68354.167 & 0.000 & [http--0.0.0.0-9082-4] & multiNext &
68354.167 & 0.002 & [http--0.0.0.0-9082-4] & IsCurrent & 110f123450001d07
68354.170 & 0.003 & [http--0.0.0.0-9082-4] & EXEC_QUERY & SELECT COUNT(*) AS items FROM dm_group WHERE group_name = 'report_user' AND ANY i_all_users_names = 'Morgan Patou', FOR_UPDATE=T, BATCH_HINT=50, BOF_DQL=T, FLUSH_BATCH=-1]],50,true,true)
68354.173 & 0.001 & [http--0.0.0.0-9082-4] & multiNext &
68354.175 & 0.003 & [http--0.0.0.0-9082-4] & EXEC_QUERY & select r_object_id from dm_sysobject where folder ('/myInsight') and object_name = 'myInsight.license', FOR_UPDATE=F, BATCH_HINT=50, BOF_DQL=T]],50,true,true)
68354.178 & 0.001 & [http--0.0.0.0-9082-4] & multiNext &
68354.179 & 0.001 & [http--0.0.0.0-9082-4] & IsCurrent & 090f123450023f63
68354.165 & 0.010 & [http--0.0.0.0-9082-3] & SysObjGetPermit & 0b0f12345004f0de
68354.175 & 0.006 & [http--0.0.0.0-9082-3] & SysObjGetXPermit & 0b0f12345004f0de
68354.181 & 0.006 & [http--0.0.0.0-9082-4] & MAKE_PULLER & null
68354.187 & 0.000 & [http--0.0.0.0-9082-4] & getBlock &
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$
[dmadmin@content_server_01 dfc_tracing]$ head -35 output_py_2.log

****** PROFILE OF rpc CALLS *****
     3.273           0.080              41      AUTHENTICATE_USER
     0.032           0.002              17      BEGIN_TRANS
     0.001           0.000              14      END_PUSH_V2
     0.202           0.012              17      END_TRANS
    21.898           0.071             310      EXEC_QUERY
     0.028           0.005               6      FETCH_CONTENT
     0.011           0.000              55      GET_ERRORS
     0.117           0.004              27      GET_LOGIN
     0.290           0.002             163      IsCurrent
     0.013           0.000              82      KILL_PULLER
     0.003           0.000              14      KILL_PUSHER
     0.991           0.012              82      MAKE_PULLER
     0.005           0.000              14      MAKE_PUSHER
     0.002           0.000               5      NEXT_ID_LIST
     0.083           0.002              38      NORPC
     0.015           0.005               3      RelationCopy
     0.446           0.032              14      SAVE
     0.274           0.014              20      SAVE_CONT_ATTRS
     0.140           0.010              14      START_PUSH
     0.134           0.045               3      SysObjCheckin
     0.048           0.016               3      SysObjCheckout
     2.199           0.009             240      SysObjFullFetch
     0.913           0.006             141      SysObjGetPermit
     0.764           0.005             141      SysObjGetXPermit
     0.642           0.046              14      SysObjSave
     0.033           0.000              82      getBlock
     1.454           0.004             399      multiNext

**** QUERY RESPONSE SORTED IN DESCENDING ORDER ****

10.317  select distinct wf.object_name as workflow_name, pr.object_name as process_name, i.name as Performer_Name, i.task_name as Task_Name, i.date_sent as Date_Task_Sent, i.actual_start_date as Date_Task_Acquired, wf.r_creator_name as Workflow_Initiator, cd.primary_group as "group", cd.subgroup as subgroup, cd.artifact_name as Artifact_Name, cd.object_name as document_name, cd.r_version_label as version_label, cd.title as Document_Title, cd.r_object_id as object_id from cd_common_ref_model(all) cd, dmi_package p, dmi_queue_item i, dm_workflow wf, dm_process pr
0.607   select r_object_id from dm_sysobject where folder ('/myInsight/Presentations/Standard Presentations/Graphical Reports') and object_name = 'FusionInterface.xsl', FOR_UPDATE=F, BATCH_HINT=50, BOF_DQL=T]],50,true,true)
0.505   select r_object_id from dm_sysobject where folder ('/myInsight/Presentations/Life Sciences') and object_name = 'Unique Templates.xsl', FOR_UPDATE=F, BATCH_HINT=50, BOF_DQL=T]],50,true,true)
[dmadmin@content_server_01 dfc_tracing]$

 

To conclude this blog on a more philosophical note: always question what other people ask you to do and think twice before doing the same thing over and over again. ;)

 

 

Cet article Documentum – DFC traces setup & investigation est apparu en premier sur Blog dbi services.

#DOAG2017

Fri, 2017-11-24 16:02


The discussions about the technologies we love. With Bryn about my tests on the MLE and the fact that I compared very different things, running a recursive function on different datatypes (integer vs. number). With Mike about the way RUs will be recommended and RURs reserved for very special cases. With Nigel about the ODC Database Ideas, with Stefan about what is documented or not, with… Discussions about community also, and user groups.

The trip, where meeting fellow speakers start in the plane,…

The dinners with ACEs, with Speakers, with friends…

The beers, thanks to Pieter & Philippe for sharing Belgian beers & cheese & mustard & celery salt & your good mood

The sessions of course. Kamil’s tool to show tablespace fragmentation visually, Jan’s comparison between Oracle and EDB, Philippe & Pieter technical view on GDPR, Adam’s research on NFS for his appliance,…

The party for sure,…

My session, and the very interesting questions I got… I was lucky to speak on the first day. And proud to speak on the Oak Table stream for the first time. I was happy to see many people already with a CDB and even in production. It is a slow adoption, but people come to it and finally notice that it is not a big change for the daily job.

And colleagues of course. This is the conference where dbi services has a booth and several speakers. We are passionate and like to share. At the booth, we did some demos of Dbvisit Standby 8, Orachrome Lighty, and also the OpenDB Appliance. We meet customers, or candidates, talk about the technologies we love, explain how we do our training workshops. It is also a great place to discuss among ourselves. Even if we have internal projects, and two ‘dbi xChange’ events every year, we are mainly at customers and have so much to share.

DOAG is an amazing conference. Intense time compressed into 3 days. This incredibly friendly ambiance is hard to quit at the end of the conference. Fortunately, persistence and durability are guaranteed thanks to Kamil’s snapshots:

Some of the speakers at #DOAG2017 party – @MDWidlake @BrynLite @ChandlerDBA @FranckPachot @RoelH @pioro @boliniak @chrisrsaxon @phurley @kmensah @lleturgez @DBAKevlar @oraesque @MikeDietrichDE @OracleSK – it was fun :) pic.twitter.com/Oe2l26QxSp

— Kamil Stawiarski (@ora600pl) November 23, 2017

#DOAG2017 speakers dinner was awesome! pic.twitter.com/cSsUaf6VPB

— Kamil Stawiarski (@ora600pl) November 22, 2017

When you see how Kamil highlights each personality with a simple camera, can you imagine what he can do when organizing a conference? Keep an eye on the POUG website.

 

Cet article #DOAG2017 est apparu en premier sur Blog dbi services.

DOAG 2017: avg_row_len with virtual columns

Fri, 2017-11-24 11:47

At the DOAG I attended the session “Top-level DB design for Big Data in ATLAS Experiment at CERN” given by Gancho Dimitrov. The presentation was actually very interesting. As part of Gancho’s improvement activities to reduce the space used by a table, he stored data in a 16-byte raw format (instead of a string representing hex values, which requires 36 bytes) and used virtual columns to compute the real hex string.

So the original value is e.g. 21EC2020-3AEA-4069-A2DD-08002B30309D, which is reduced to 16 Bytes by removing the ‘-‘ and converting the resulting hex-string to raw:

HEXTORAW(REPLACE('21EC2020-3AEA-4069-A2DD-08002B30309D', '-', ''))

The problem was that the long virtual columns added to the average row length statistic in Oracle. Here is a simple testcase:


create table cern_test
(
GUID0 RAW(16)
,GUID1 RAW(16)
,GUID2 RAW(16)
,GUID0_CHAR as (SUBSTR(RAWTOHEX(GUID0),1,8)||'-'||
SUBSTR(RAWTOHEX(GUID0),9,4)||'-'||
SUBSTR(RAWTOHEX(GUID0),13,4)||'-'||
SUBSTR(RAWTOHEX(GUID0),17,4)||'-'||
SUBSTR(RAWTOHEX(GUID0),21,12))
,GUID1_CHAR as (SUBSTR(RAWTOHEX(GUID1),1,8)||'-'||
SUBSTR(RAWTOHEX(GUID1),9,4)||'-'||
SUBSTR(RAWTOHEX(GUID1),13,4)||'-'||
SUBSTR(RAWTOHEX(GUID1),17,4)||'-'||
SUBSTR(RAWTOHEX(GUID1),21,12))
,GUID2_CHAR as (SUBSTR(RAWTOHEX(GUID2),1,8)||'-'||
SUBSTR(RAWTOHEX(GUID2),9,4)||'-'||
SUBSTR(RAWTOHEX(GUID2),13,4)||'-'||
SUBSTR(RAWTOHEX(GUID2),17,4)||'-'||
SUBSTR(RAWTOHEX(GUID2),21,12))
);
 
insert into cern_test (guid0,guid1,guid2)
select HEXTORAW('21EC20203AEA4069A2DD08002B30309D'),
HEXTORAW('31DC20203AEA4069A2DD08002B30309D'),
HEXTORAW('41CC20203AEA4069A2DD08002B30309D')
from xmltable('1 to 10000');
commit;
 
exec dbms_stats.gather_table_stats(user,'CERN_TEST',estimate_percent=>100,method_opt=>'FOR ALL COLUMNS SIZE 1');
 
select avg_row_len from tabs where table_name='CERN_TEST';
 
AVG_ROW_LEN
-----------
162
 
select sum(avg_col_len) from user_tab_columns
where table_name='CERN_TEST' and column_name in ('GUID0','GUID1','GUID2');
 
SUM(AVG_COL_LEN)
----------------
51
 
select sum(avg_col_len) from user_tab_columns
where table_name='CERN_TEST' and column_name in ('GUID0_CHAR','GUID1_CHAR','GUID2_CHAR');
 
SUM(AVG_COL_LEN)
----------------
111

The question is whether the computation of the average row length by Oracle is correct, i.e. should the physically non-existent virtual columns be considered?
Physically they do not take space, so physically the average row length in the example above is 51, but logically it is 162. Which is correct?

To answer that question it has to be checked what the average row length is used for. That information is not documented, but my assumption is that it’s actually only used for the calculation of Bytes required when doing a “select * “, i.e. all columns. That number however, may become important later on when calculating the memory required for e.g. a hash join.

Anyway, the basic question is how Oracle treats virtual columns in execution plans: does it compute the value of the virtual column when the table is accessed, or does it compute it when it needs it (e.g. when fetching the row or when needing it as a column to join on)? According to the “Bytes” number in the execution plan, the value is computed when the table is accessed:


SQL> explain plan for
2 select a.*, b.guid0 b_guid0 from cern_test a, cern_test b
3 where a.guid0_char=b.guid0_char;
 
Explained.
 
SQL> select * from table(dbms_xplan.display(format=>'+PROJECTION'));
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 3506643611
 
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100M| 20G| 267 (83)| 00:00:01 |
|* 1 | HASH JOIN | | 100M| 20G| 267 (83)| 00:00:01 |
| 2 | TABLE ACCESS FULL| CERN_TEST | 10000 | 527K| 23 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| CERN_TEST | 10000 | 1582K| 23 (0)| 00:00:01 |
--------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
1 - access("A"."GUID0_CHAR"="B"."GUID0_CHAR")
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 
1 - (#keys=1) "A"."GUID0_CHAR"[VARCHAR2,132], "GUID0"[RAW,16],
     "GUID0"[RAW,16], "GUID1"[RAW,16], "GUID2"[RAW,16]
2 - "GUID0"[RAW,16]
3 - "GUID0"[RAW,16], "GUID1"[RAW,16], "GUID2"[RAW,16]
 
23 rows selected.
 
SQL> select column_name, avg_col_len from user_tab_columns
2 where table_name='CERN_TEST' and column_name in ('GUID0','GUID0_CHAR');
 
COLUMN_NAME AVG_COL_LEN
----------------------------------- -----------
GUID0 17
GUID0_CHAR 37
 
SQL> select (10000*17)/1024 from dual;
 
(10000*17)/1024
---------------
166.015625
 
SQL> select ((10000*17)+(10000*37))/1024 from dual;
 
((10000*17)+(10000*37))/1024
----------------------------
527.34375

So according to the projection at step 2 of the plan we use B.GUID0 only, but the Bytes value of 527K considers both GUID0 and the virtual column GUID0_CHAR. So the calculation of Bytes is done when the table is accessed and not when the virtual column is actually needed (during the hash join).

In that regard the calculation of the avg_row_len by dbms_stats with the virtual columns considered is correct.

The only issue I see is old scripts that people wrote long ago, which try to compute the amount of data in a table based on its avg_row_len statistic, using something like


SELECT table_name, num_rows * avg_row_len actual_size_of_data
FROM user_tables order by 2;

If there are virtual columns in the table, such a select may return too high values for “actual_size_of_data”.
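A variant that would be closer to the physical size sums the avg_col_len of the real columns only. A sketch (it relies on the VIRTUAL_COLUMN flag in USER_TAB_COLS, and the result is still an approximation, not an exact segment size):

```sql
-- Sketch: approximate the data size from non-virtual columns only
SELECT t.table_name,
       t.num_rows * SUM(c.avg_col_len) AS approx_size_of_data
FROM   user_tables   t
JOIN   user_tab_cols c ON c.table_name = t.table_name
WHERE  c.virtual_column = 'NO'
GROUP  BY t.table_name, t.num_rows
ORDER  BY 2;
```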

REMARK: Using the old ANALYZE command to gather statistics results in a value for the avg_row_len, which considers only “real” columns. However, ANALYZE must not be used anymore of course.

 

Cet article DOAG 2017: avg_row_len with virtual columns est apparu en premier sur Blog dbi services.

DOAG 2017

Thu, 2017-11-23 08:00

As a consultant at dbi services I spent the last 2 years mainly on consolidation projects based on the Oracle Engineered Systems. So for me the talks about the new generation of the Oracle Database Appliance, the X7-2, were of course particularly interesting.

From my point of view Oracle took the right step and reduced the variety of ODA systems that still existed in the X6-2 generation (Small/Medium/Large and HA).

In the future the ODA X7-2 will be available in only 3 models (S, M and HA). It has to be said clearly that the smaller systems were upgraded in terms of performance and that the HA is finally available again in a configuration which allows the consolidation even of larger database and application systems:

  • the ODA X7-2 S as entry-level system with a 10-core CPU, up to 384 GB RAM and 12.8 TB NVMe storage
  • the ODA X7-2 M now rather corresponds to the X6 L systems, with 2×18 cores, up to 768 GB RAM and up to 51.2 TB NVMe storage
  • the ODA X7-2 HA is of course the flagship of the ODA class: 2 servers with 2×18 cores each, up to 768 GB RAM per server and various storage expansions up to 150 TB give you the old X5 feeling (or perhaps already the feeling of working with an Exadata)

The most interesting innovations for me are found less in the hardware than in the possible deployments of the systems:

  • all systems support SE/SE1/SE2/EE 11.2.0.4, 12.1.0.2 and 12.2.0.1
  • all systems support a virtualized setup, the small systems with KVM, the X7-2 HA with KVM and OVM; hard partitioning with KVM is not yet possible but is planned
  • on the X7-2 HA the storage expansions can be chosen between High Performance (SSD) and High Capacity (HDD), and even mixed configurations are possible with some restrictions

Savings were made on the network interfaces, however: there are now only 2 interfaces instead of the previous 4 on the X5-2 (besides the private interfaces for the interconnect). It is possible to configure an additional interface after the deployment, but only with 1 Gb. Although it is planned that VLAN configurations on the public interface will also become possible in a bare metal setup, Oracle could have provided two additional interfaces (especially for the HA), all the more since free slots are available.

The performance figures of the NVMe and SSD storage are highly interesting. Up to 100,000 IOPS are possible; on the HA I see the SAS bus rather than the SSDs as the limiting factor. What is really nice is that the storage for the redo logs was extended to 4×800 GB SSD; in the earlier systems you always had to be a bit economical here…

All in all I am looking forward to working with the X7-2, because Oracle delivers a good piece of hardware here that also remains reasonably priced.

 

Cet article DOAG 2017 est apparu en premier sur Blog dbi services.

Create index CONCURRENTLY in PostgreSQL

Wed, 2017-11-22 12:10

In PostgreSQL, when you create an index on a table, sessions that want to write to the table must by default wait until the index build has completed. There is a way around that, though, and in this post we’ll look at how you can avoid it.

As usual we’ll start with a little table:

postgres=# \! cat a.sql
drop table if exists t1;
create table t1 ( a int, b varchar(50));
insert into t1
select a.*, md5(a::varchar) from generate_series(1,5000000) a;
postgres=# \i a.sql
DROP TABLE
CREATE TABLE
INSERT 0 5000000

When you now create an index on that table and try to write the table at the same time from a different session that session will wait until the index is there (the screenshot shows the first session creating the index on the left and the second session doing the update on the right, which is waiting for the left one):
Selection_007

For production environments this is not something you want to happen, as it can block a lot of other sessions, especially when the table in question is heavily used. You can avoid that by using “create index concurrently”.

Selection_008

Using that syntax, writes to the table from other sessions will succeed while the index is being built. But, as clearly written in the documentation, the downside is that the table needs to be scanned twice, so more work needs to be done, which means more resource usage on your server. Other points need to be considered as well. When, for whatever reason, your index build fails (e.g. because you cancel the create index statement):

postgres=# create index concurrently i1 on t1(a);
^CCancel request sent
ERROR:  canceling statement due to user request

… you might expect the index not to be there at all, but this is not the case. When you try to create the index again right after the canceled statement, you’ll hit this:

postgres=# create index concurrently i1 on t1(a);
ERROR:  relation "i1" already exists

This does not happen when you do not create the index concurrently:

postgres=# create index i1 on t1(a);
^CCancel request sent
ERROR:  canceling statement due to user request
postgres=# create index i1 on t1(a);
CREATE INDEX
postgres=# 

The question is why this happens in the concurrent case but not in the “normal” case. The reason is simple: when you create an index the “normal” way, the whole build is done in one transaction. Because of this, the index does not exist when the transaction is aborted (the create index statement is canceled). When you build the index concurrently, there are multiple transactions involved: “In a concurrent index build, the index is actually entered into the system catalogs in one transaction, then two table scans occur in two more transactions”. So in this case:

postgres=# create index concurrently i1 on t1(a);
ERROR:  relation "i1" already exists

… the index is already stored in the catalog:

postgres=# create index concurrently i1 on t1(a);
^CCancel request sent
ERROR:  canceling statement due to user request
postgres=# select relname,relkind,relfilenode from pg_class where relname = 'i1';
 relname | relkind | relfilenode 
---------+---------+-------------
 i1      | i       |       32926
(1 row)

If you don’t take care of that you will have invalid indexes in your database:

postgres=# \d t1
                        Table "public.t1"
 Column |         Type          | Collation | Nullable | Default 
--------+-----------------------+-----------+----------+---------
 a      | integer               |           |          | 
 b      | character varying(50) |           |          | 
Indexes:
    "i1" btree (a) INVALID

You might think that this does no harm, but then consider this case:

-- in session one build a unique index
postgres=# create unique index concurrently i1 on t1(a);
-- then in session two violate the uniqueness after some seconds
postgres=# update t1 set a = 5 where a = 4000000;
UPDATE 1
-- the create index statement will fail in the first session
postgres=# create unique index concurrently i1 on t1(a);
ERROR:  duplicate key value violates unique constraint "i1"
DETAIL:  Key (a)=(5) already exists.

This is even worse as the index now really consumes space on disk:

postgres=# select relpages from pg_class where relname = 'i1';
 relpages 
----------
    13713
(1 row)

The index is invalid, of course and will not be used by the planner:

postgres=# \d t1
                        Table "public.t1"
 Column |         Type          | Collation | Nullable | Default 
--------+-----------------------+-----------+----------+---------
 a      | integer               |           |          | 
 b      | character varying(50) |           |          | 
Indexes:
    "i1" UNIQUE, btree (a) INVALID

postgres=# explain select * from t1 where a = 12345;
                              QUERY PLAN                              
----------------------------------------------------------------------
 Gather  (cost=1000.00..82251.41 rows=1 width=37)
   Workers Planned: 2
   ->  Parallel Seq Scan on t1  (cost=0.00..81251.31 rows=1 width=37)
         Filter: (a = 12345)
(4 rows)

But the index is still maintained:

postgres=# select relpages from pg_class where relname = 'i1';
 relpages 
----------
    13713
(1 row)
postgres=# insert into t1 select a.*, md5(a::varchar) from generate_series(5000001,6000000) a;
INSERT 0 1000000

postgres=# select relpages from pg_class where relname = 'i1';
 relpages 
----------
    16454
(1 row)

So now you have an index which cannot be used to speed up queries (which is bad), but which is still maintained when you write to the table (which is even worse, because you consume resources for nothing). The only way out of this is to drop and re-create the index:

postgres=# drop index i1;
DROP INDEX
-- potentially clean up any rows that violate the constraint and then
postgres=# create unique index concurrently i1 on t1(a);
CREATE INDEX
postgres=# \d t1
                        Table "public.t1"
 Column |         Type          | Collation | Nullable | Default 
--------+-----------------------+-----------+----------+---------
 a      | integer               |           |          | 
 b      | character varying(50) |           |          | 
Indexes:
    "i1" UNIQUE, btree (a)

postgres=# explain select * from t1 where a = 12345;
                          QUERY PLAN                           
---------------------------------------------------------------
 Index Scan using i1 on t1  (cost=0.43..8.45 rows=1 width=122)
   Index Cond: (a = 12345)
(2 rows)

Remember: when a create index operation fails in concurrent mode, make sure that you drop the index immediately.
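To spot such leftovers, the catalog can be queried directly; indexes whose concurrent build failed stay flagged as not valid in pg_index. A sketch:

```sql
-- Sketch: list indexes left invalid by a failed concurrent build
SELECT indexrelid::regclass AS index_name,
       indrelid::regclass  AS table_name
FROM   pg_index
WHERE  NOT indisvalid;
```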

One more thing to keep in mind: when you create an index concurrently and another session is already modifying the data, the create index command waits until that other operation completes:

-- first session inserts data without completing the transaction
postgres=# begin;
BEGIN
Time: 0.579 ms
postgres=# insert into t1 select a.*, md5(a::varchar) from generate_series(6000001,7000000) a;
INSERT 0 1000000
-- second sessions tries to build the index
postgres=# create unique index concurrently i1 on t1(a);

The create index operation will wait until that completes:

postgres=# select query,state,wait_event,wait_event_type from pg_stat_activity where state ='active';
                                query                                 | state  | wait_event | wait_event_t
----------------------------------------------------------------------+--------+------------+-------------
 create unique index concurrently i1 on t1(a);                        | active | virtualxid | Lock
 select query,state,wait_event,wait_event_type from pg_stat_activity; | active |            | 

… meaning that when someone forgets to end the transaction, the create index command will wait forever. The parameter idle_in_transaction_session_timeout gives you more control over that, but you still need to be aware of what is happening here.
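A sketch of how that parameter could be used (the 5-minute value is just an example; it aborts sessions that sit idle inside an open transaction for longer than the timeout):

```sql
-- Sketch: abort sessions that stay "idle in transaction" for more than 5 minutes
-- (set here at session level; it can also go into postgresql.conf)
set idle_in_transaction_session_timeout = '5min';
```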

Happy index creation :)

 

Cet article Create index CONCURRENTLY in PostgreSQL est apparu en premier sur Blog dbi services.

DOAG2017 my impressions

Wed, 2017-11-22 11:28

As every year at the end of November, the biggest European Oracle conference takes place in Nürnberg: #DOAG2017. This year is a little bit special, because the DOAG celebrates the 30th edition of the conference.

dbi services is present for the 5th time with a booth and 8 sessions at the DOAG.
During the last 2 days I have already followed many sessions, and I want to give you my impressions and feedback about the market trends.
Tuesday morning, as usual, the conference started with a keynote. Keynotes are often not very interesting, because they only repeat what was already communicated some weeks before at the Oracle OpenWorld conference. But this year it was not the case: I saw a very interesting session from Neil Sholay (Oracle) about technology and market shifts that will have an impact on our near future. For example, in the near future your running shoes will be made directly in the shop with a 3D printer, and your clothes will be made directly by a machine in the shop, which is 17 times faster than clothes made by a human.

After this nice introduction, I followed a very interesting session from Guido Schmutz (Trivadis) about Kafka, with a very nice live demo. I like to see live demos, but it is something I see less and less at the DOAG. At dbi services we try to have interesting live demos in each of our sessions. Later, after a short break, I was very curious to see how many people would follow the session from Jan Karremans (EDB) comparing Oracle to PostgreSQL, and as expected the room was full. I can therefore confirm that the interest in PostgreSQL sessions at the DOAG is very high, because today most Oracle DBAs, besides their usual tasks, also have to manage PostgreSQL databases.

This morning I followed a session from Mike Dietrich (Oracle) about the new Oracle database release model; as usual his session was very good, with several hundred participants.
The key message of the session: if you are still running Oracle database version 11.2.0.4, Mike's advice is to upgrade it very soon, because at the beginning of next year (in 5 weeks) you will enter the extended support period, with additional cost for the support. Last but not least, early this afternoon I saw the session “Cloud provider battle” from Manfred Klimke (Trevisto). The interest in this session was also very high, because I suppose most of the participants are not in the Cloud yet and don't know where they should go. During the session he presented a funny slide summarizing the available Cloud services with a pizza analogy, and I can confirm it reflects reality: “Dined at a restaurant” is the most expensive service.

As a conclusion of these 2 days: besides the Cloud, everything around Open Source is also a very important topic at the DOAG, which also hosts presentations from Oracle's competitors.

 

Cet article DOAG2017 my impressions est apparu en premier sur Blog dbi services.

12c Multitenant Internals: compiling system package from PDB

Wed, 2017-11-22 07:38

When I explain the multitenant internals, I show that all metadata about system procedures and packages is stored only in CDB$ROOT and is accessed from the PDBs through metadata links. I take an example with DBMS_SYSTEM, which has nothing in SOURCE$ of the PDB. But I show that we can compile it from the PDB. This is my way to prove that the session can access the system objects, internally switching the session to the root container when it needs to read SOURCE$. At the DOAG Conference I had a very interesting question about what happens exactly in CDB$ROOT: is the session really executing all the DML on the internal tables storing the compiled code of the procedure?

My first answer was something like ‘why not’, because the session in a PDB can internally switch and do modifications in CDB$ROOT. For example, even a local PDB DBA can change some ‘spfile’ parameters which are actually stored in CDB$ROOT. But then I realized that the question goes further: is the PDB session really compiling the DBMS_SYSTEM package in CDB$ROOT? Actually, there are some DDL statements that are transformed to ‘no-operation’ when executed on the PDB.

To see which ones are concerned, the best is to trace:

SQL> alter session set events='10046 trace name context forever, level 4';
Session altered.
SQL> alter session set container=PDB1;
Session altered.
SQL> alter package dbms_system compile;
Package altered.
SQL> alter session set events='10046 trace name context off';
Session altered.

I’ll not show the whole trace here. For sure I can see that the session switches to CDB$ROOT to read the source code of the package:

*** 2017-11-22T08:36:01.963680+01:00 (CDB$ROOT(1))
=====================
PARSING IN CURSOR #140650193204552 len=54 dep=1 uid=0 oct=3 lid=0 tim=5178881528 hv=696375357 ad='7bafeab8' sqlid='9gq78x8ns3q1x'
select source from source$ where obj#=:1 order by line
END OF STMT
PARSE #140650193204552:c=0,e=290,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=5178881527
EXEC #140650295606992:c=1000,e=287,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,plh=813480514,tim=5178881999
FETCH #140650295606992:c=0,e=35,p=0,cr=4,cu=0,mis=0,r=1,dep=2,og=4,plh=813480514,tim=5178882057
CLOSE #140650295606992:c=0,e=12,dep=2,type=3,tim=5178882104

That was my point about metadata links. But now about modifications.

As I need to see only the statements, I can use TKPROF to get them aggregated, but then the container switch – like (CDB$ROOT(1)) here – is ignored.

Here is a small AWK script I use to add the Container ID to the SQL ID so that it is visible and detailed into TKPROF output:

awk '/^[*]{3}/{con=$3}/^PARSING IN/{sub(/sqlid=./,"&"con" ")}{print > "con_"FILENAME }'
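For readers more comfortable with Python than AWK, the same transformation can be sketched like this (same logic: remember the container marker from the last `***` line and splice it in right after `sqlid='`; the function name is made up):

```python
import re

def tag_containers(lines):
    """Mimic the AWK one-liner above: carry the current container
    marker (the 3rd field of '***' lines) and insert it after
    "sqlid='" on PARSING IN lines, so TKPROF aggregates per container."""
    con = ""
    out = []
    for line in lines:
        if line.startswith("***"):
            con = line.split()[2]          # e.g. (CDB$ROOT(1))
        if line.startswith("PARSING IN"):
            # awk: sub(/sqlid=./, "&" con " ") -- append con after the match
            line = re.sub(r"sqlid=.", lambda m: m.group(0) + con + " ",
                          line, count=1)
        out.append(line)
    return out
```

Writing the result to a new file and running TKPROF on it gives the same per-container SQL IDs as the AWK version.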

Then I run TKPROF on the resulting file, with ‘sort=(execu)’ so that I have the modifications (insert/delete/update) first. The result starts with something like this:

SQL ID: (PDB1(3)) 1gfaj4z5hn1kf Plan Hash: 1110520934
 
delete from dependency$
where
d_obj#=:1

I know that dependencies are replicated into all containers (because table metadata is replicated into all containers), so I see the following tables modified in the PDB: DEPENDENCY$, ACCESS$, DIANA_VERSION$, and of course OBJ$.

But to answer the initial question, there are no modifications done in CDB$ROOT. Only SELECT statements there, on SOURCE$, SETTINGS$, CODEAUTH$, WARNING_SETTINGS$, …

So, probably, the updates have been transformed to no-op operations once the session is aware that the source is the same (same signature) and it just reads the compilation status.

Just as a comparison, tracing the same compilation when done on CDB$ROOT shows inserts/deletes/updates on ARGUMENT$, PROCEDUREINFO$, SETTINGS$, PROCEDUREPLSQL$, IDL_UB1$, IDL_SB4$, IDL_UB2$, IDL_CHAR$, … all the tables storing the compiled code.

So basically, when running DDL on metadata links in a PDB, not all the work is done in the CDB, especially not writing again what is already there (because you always upgrade the CDB$ROOT first). However, up to 12.2 we don’t see a big difference in time. This should change in 18c where the set of DDL to be run on the PDB will be pre-processed to avoid unnecessary operations.

 

Cet article 12c Multitenant Internals: compiling system package from PDB est apparu en premier sur Blog dbi services.

firewalld rules for Veritas Infoscale 7.3 with Oracle

Mon, 2017-11-20 06:30

You might wonder, but yes, Veritas is still alive, and there are customers that use it and are very happy with it. Recently we upgraded a large cluster from Veritas 5/RHEL5 to Veritas InfoScale 7.3/RHEL7, and I must say that the migration was straightforward and very smooth (when I have time I'll write another post specific to the migration). At some point during this project the requirement came up to enable the firewall on the Linux hosts, so we needed to figure out all the ports and then set up the firewall rules. This is how we did it…

The first step was to create a new zone because we did not want to modify any of the default zones:

root@:/home/oracle/ [] firewall-cmd --permanent --new-zone=OracleVeritas
root@:/home/oracle/ [] firewall-cmd --reload
success
root@:/home/oracle/ [] firewall-cmd --get-zones
OracleVeritas block dmz drop external home internal public trusted work

The ports required for Veritas InfoScale are documented here. This is the set of ports we defined:

##### SSH
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-service=ssh
##### Veritas ports
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=4145/udp            # vxio
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=4145/tcp            # vxio
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=5634/tcp            # xprtld
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=8199/tcp            # vras
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=8989/tcp            # vxreserver
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14141/tcp           # had
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14144/tcp           # notifier
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14144/udp           # notifier
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14149/tcp           # vcsauthserver
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14149/udp           # vcsauthserver
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14150/tcp           # CmdServer
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14155/tcp           # wac
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14155/udp           # wac
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14156/tcp           # steward
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=14156/udp           # steward
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=443/tcp             # Vxspserv
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=49152-65535/tcp     # vxio
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=49152-65535/udp     # vxio
#### Oracle ports
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=1521/tcp            # listener
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --permanent --add-port=3872/tcp            # cloud control agent
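Since these firewall-cmd calls are very repetitive, they could also be generated from a simple port map. Here is a minimal Python sketch: the zone name and a few ports are taken from the listing above, while the helper function itself is hypothetical and not part of firewalld:

```python
# Hypothetical helper: generate the repetitive firewall-cmd invocations
# from a map of port/protocol -> service comment.
PORTS = {
    "4145/udp": "vxio",
    "5634/tcp": "xprtld",
    "14141/tcp": "had",
    "49152-65535/tcp": "vxio",
    "1521/tcp": "listener",
}

def firewall_cmds(zone, ports):
    """Return the firewall-cmd command lines for a permanent zone setup."""
    return [
        f"firewall-cmd --zone={zone} --permanent --add-port={port}  # {svc}"
        for port, svc in ports.items()
    ]

for cmd in firewall_cmds("OracleVeritas", PORTS):
    print(cmd)
```

The generated lines can then be reviewed before being executed, which makes it easier to keep the port list in one place.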

Because we wanted the firewall only on the public network, but not on the interconnect we changed the interfaces for the zone:

root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --change-interface=bond0
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --change-interface=eth0
root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --change-interface=eth2

One additional step to make this active is to add the zone to the interface configuration (this is done automatically if the interfaces are under the control of NetworkManager):

root@:/home/oracle/ [] echo "ZONE=OracleVeritas" >> /etc/sysconfig/network-scripts/ifcfg-eth0
root@:/home/oracle/ [] echo "ZONE=OracleVeritas" >> /etc/sysconfig/network-scripts/ifcfg-eth2
root@:/home/oracle/ [] echo "ZONE=OracleVeritas" >> /etc/sysconfig/network-scripts/ifcfg-bond0

Restart the firewall service:

root@:/home/oracle/ [] systemctl restart firewalld

… and it should be active:

root@:/home/postgres/ [] firewall-cmd --get-active-zones
OracleVeritas
  interfaces: eth0 eth2 bond0
public
  interfaces: eth1 eth3

root@:/home/oracle/ [] firewall-cmd --zone=OracleVeritas --list-all
OracleVeritas (active)
  target: default
  icmp-block-inversion: no
  interfaces: bond0 eth0 eth2
  sources: 
  services: 
  ports: 4145/udp 4145/tcp 5634/tcp 8199/tcp 8989/tcp 14141/tcp 14144/tcp 14144/udp 14149/tcp 14149/udp 14150/tcp 14155/tcp 14155/udp 14156/tcp 14156/udp 443/tcp 49152-65535/tcp 49152-65535/udp 1521/tcp 3872/tcp
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules: 

Just for completeness: You can also directly check the configuration file for the zone:

root@:/home/oracle/ [] cat /etc/firewalld/zones/OracleVeritas.xml

Hope this helps …

 

Cet article firewalld rules for Veritas Infoscale 7.3 with Oracle est apparu en premier sur Blog dbi services.

Is it an index, a table or what?

Sun, 2017-11-19 10:54

A recent tweet from Kevin Closson outlined that in PostgreSQL it might be confusing whether something is an index or a table. Why is it like that? Let's have a look and start by re-building the example from Kevin:

For getting into the same situation Kevin described we need something like this:

postgres=# create table base4(custid int, custname varchar(50));
CREATE TABLE
postgres=# create index base4_idx on base4(custid);
CREATE INDEX

Assuming that we forgot we created such an index, come back later, and try to create it again, we get exactly the same behavior:

postgres=# create index base4_idx on base4(custid);
ERROR:  relation "base4_idx" already exists
postgres=# drop table base4_idx;
ERROR:  "base4_idx" is not a table
HINT:  Use DROP INDEX to remove an index.
postgres=# 

The keyword here is “relation”. In PostgreSQL a “relation” does not necessarily mean a table. What you need to know is that PostgreSQL stores everything that looks like a table/relation (i.e. has columns) in the pg_class catalog table. When we check our relations there:

postgres=# select relname from pg_class where relname in ('base4','base4_idx');
  relname  
-----------
 base4
 base4_idx
(2 rows)

… we can see that both the table and the index are treated as relations. The difference is here:

postgres=# \! cat a.sql
select a.relname 
     , b.typname
  from pg_class a
     , pg_type b 
 where a.relname in ('base4','base4_idx')
   and a.reltype = b.oid;
postgres=# \i a.sql
 relname | typname 
---------+---------
 base4   | base4
(1 row)

Indexes do not have an entry in pg_type, but tables do. What is even more interesting: the “base4” table is a type itself. This means that for every table you create, a composite type is created as well that describes the structure of the table. You can even link back to pg_class:

postgres=# select typname,typrelid from pg_type where typname = 'base4';
 typname | typrelid 
---------+----------
 base4   |    32901
(1 row)

postgres=# select relname from pg_class where oid = 32901;
 relname 
---------
 base4
(1 row)

When you want to know what kind of relation something is, the easiest way is to ask like this:

postgres=# select relname,relkind from pg_class where relname in ('base4','base4_idx');
  relname  | relkind 
-----------+---------
 base4     | r
 base4_idx | i
(2 rows)

… where:

  • r = ordinary table
  • i = index
  • S = sequence
  • t = TOAST table
  • m = materialized view
  • c = composite type
  • f = foreign table
  • p = partitioned table
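Client-side, these one-letter codes are easy to decode with a small lookup table. A minimal Python sketch (the mapping is the list above; the helper name is made up):

```python
# Decode pg_class.relkind letters (see the list above) client-side.
RELKIND = {
    "r": "ordinary table",
    "i": "index",
    "S": "sequence",
    "t": "TOAST table",
    "m": "materialized view",
    "c": "composite type",
    "f": "foreign table",
    "p": "partitioned table",
}

def decode_relkind(rows):
    """rows: (relname, relkind) tuples as returned by the query above."""
    return [(name, RELKIND.get(kind, "unknown")) for name, kind in rows]

print(decode_relkind([("base4", "r"), ("base4_idx", "i")]))
```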

Of course there are also catalog tables specific to tables and indexes, so you can double-check there as well. Knowing all this, the message is pretty clear:

postgres=# create index base4_idx on base4(custid);
ERROR:  relation "base4_idx" already exists
postgres=# drop relation base4_idx;
ERROR:  syntax error at or near "relation"
LINE 1: drop relation base4_idx;
             ^
postgres=# drop table base4_idx;
ERROR:  "base4_idx" is not a table
HINT:  Use DROP INDEX to remove an index.
postgres=# 

PostgreSQL is finally telling you that “base4_idx” is an index and not a table, which is fine. Of course you could think that PostgreSQL should do that on its own, but it is also true that when you want to drop something, you should be sure about what you really want to drop.

 

Cet article Is it an index, a table or what? est apparu en premier sur Blog dbi services.

Unstructured vs. structured

Sat, 2017-11-18 01:13

The title of this blog post was: “Tracing DBMS_RCVMAN for reclaimable archivelogs” until I started to write the conclusion…

In a previous post I mentioned that there’s a bug with archivelog deletion policy when you want to mention both the ‘BACKED UP … TIMES TO …’ and ‘APPLIED’ or ‘SHIPPED’ as conditions for archived logs to be reclaimable. I opened a SR, they didn’t even try to reproduce it (and I can guarantee you can reproduce it in 2 minutes on any currently supported version) so I traced it myself to understand the bug and suggest the fix.

I traced the DBMS_RCVMAN with Kernel Recovery Area function SQL Tracing:

SQL> alter session set events 'trace[kra_sql] disk high, memory disable';
SQL> dbms_backup_restore.refreshAgedFiles;
SQL> alter session set events 'trace[kra_sql] off';

I know that refreshAgedFiles checks for reclaimable files in the FRA, since there was an old bug where we had to run it manually on databases in mount state.

I compared the traces when changing the order of ‘APPLIED’ and ‘BACKED UP’ and found the following:

< *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: setRedoLogDeletionPolicy with policy = TO BACKED UP 1 TIMES TO DISK APPLIED ON ALL STANDBY
---
> *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: setRedoLogDeletionPolicy with policy = TO APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK
5340c5340
< *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: policy = TO BACKED UP 1 TIMES TO DISK APPLIED ON ALL STANDBY
---
> *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: policy = TO APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK
5343c5343
< *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: EXITING setRedoLogDeletionPolicy with policy = TO BACKED UP 1 TIMES TO DISK APPLIED ON ALL STANDBY with alldest = 1
---
> *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: EXITING setRedoLogDeletionPolicy with policy = TO APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK with alldest = 1
5350,5351c5350,5351
< *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: parseBackedUpOption devtype=DISK
< *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: parseBackedUpOption backed up conf - devtype=DISK , backups=1
---
> *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: parseBackedUpOption devtype=DISK
> *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: parseBackedUpOption backed up conf - devtype=DISK, backups=1
5363c5363
< *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: EXITING getBackedUpAl with TRUE
---
> *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: EXITING getBackedUpAl with key = 128 stamp = 958068130
5367c5367
< *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: EXITING getBackedUpFiles with: no_data_found
---
> *:KRA_SQL:kraq.c@1035:kraqgdbg(): DBGRCVMAN: EXITING getBackedUpFiles

You see at the top the difference in the way I mentioned the deletion policy. You see at the bottom that the first one (starting with ‘BACKED UP’) didn’t find archivelogs being backed up (no_data_found). But the second one (starting with ‘APPLIED’) mentioned the sequence# 128.

But if you look carefully, you see another difference in the middle: the “devtype=DISK” has an additional space before the comma in the first case.

So I traced a bit further, including SQL_TRACE and I found that the deletion policy is just using some INSTR and SUBSTR parsing on the deletion policy text to find the policy, the backup times, and the device type. For sure, looking for backups with DEVICE_TYPE=’DISK ‘ instead of ‘DISK’ will not find anything and this is the reason for the bug: no archived logs backed up means no archived log reclaimable.

If you look closer at DBMS_RCVMAN you will find that the device type is extracted with SUBSTR(:1, 1, INSTR(:1, ‘ ‘)) when the device type is followed by a space, which is the reason for this additional space. The correct extraction should be SUBSTR(:1, 1, INSTR(:1, ‘ ‘)-1), and this is what I suggested on the SR.
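The off-by-one is easy to reproduce outside the database. Here is a small Python sketch emulating the 1-based INSTR/SUBSTR semantics (the function names just mirror their SQL counterparts; this is an illustration, not the DBMS_RCVMAN code):

```python
def instr(s, sub):
    """1-based position of sub in s, 0 if absent (like SQL INSTR)."""
    return s.find(sub) + 1

def substr(s, start, length):
    """SQL SUBSTR with a 1-based start position."""
    return s[start - 1:start - 1 + length]

text = "DISK APPLIED ON ALL STANDBY"
buggy = substr(text, 1, instr(text, " "))      # 'DISK ' -- trailing space kept
fixed = substr(text, 1, instr(text, " ") - 1)  # 'DISK'
```

With the trailing space kept, the later lookup for backups on device type ‘DISK ’ matches nothing, which is exactly the no_data_found seen in the trace.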

So what?

Writing the conclusion made me change the title. Currently, a lot of people are advocating for unstructured data. Because it is easy (which rhymes with ‘lazy’). Store information as it comes and postpone the parsing into a more structured data type until you need to process it. This seems to be how the RMAN configuration is stored: as the text we entered. And it is parsed later with simple text functions such as INSTR(), SUBSTR(), and LIKE. But you can see how a little bug, such as reading one additional character, has big consequences. If you look at the archivelog deletion policy syntax, you have a 50% chance to run into this bug on a Data Guard configuration. The Recovery Area will fill up and your database will be blocked. The controlfile grows. Or you notice it before and run a ‘delete archivelog’ statement without knowing the reason, wasting space by removing recovery files from local storage which could have been kept longer. If the deletion policy was parsed immediately when entered, like SQL DDL or PL/SQL APIs, the issue would have been detected a long time ago. Structure and strong typing is the way to build robust applications.

 

Cet article Unstructured vs. structured est apparu en premier sur Blog dbi services.

CBO, FIRST_ROWS and VIEW misestimate

Thu, 2017-11-16 23:36

There are several bugs with the optimizer in FIRST_ROWS mode. Here is one I encountered during a 10.2.0.4 to 12.2.0.1 migration when a view had an ‘order by’ in its definition.

Here is the test case that reproduces the problem.

A big table:

SQL> create table DEMO1 (n constraint DEMO1_N primary key,x,y) as select 1/rownum,'x','y' from xmltable('1 to 1000000');
Table DEMO1 created.

with a view on it, and that view has an order by:

SQL> create view DEMOV as select * from DEMO1 order by n desc;
View DEMOV created.

and another table to join to:

SQL> create table DEMO2 (x constraint DEMO2_X primary key) as select dummy from dual;
Table DEMO2 created.

My query reads the view in a subquery, adds a call to a PL/SQL function, and joins the result with the other table:


SQL> explain plan for
select /*+ first_rows(10) */ *
from
( select v.*,dbms_random.value from DEMOV v)
where x in (select x from DEMO2)
order by n desc;
 
Explained.

You can see that I run it with FIRST_ROWS(10) because I actually want to fetch the top-10 rows when ordered by N. As N is a number, I have an index on it, and there are no nulls (it is the primary key), I expect to read the first 10 entries from the index, call the function for each of them, then nested-loop to the other table.

In the situation where I encountered it, this is what was done in 10g, but when migrated to 12c the query took very long because it called the PL/SQL function for millions of rows. Here is the plan in my example:


SQL> select * from dbms_xplan.display(format=>'+projection');
 
PLAN_TABLE_OUTPUT
-----------------
Plan hash value: 2046425878
 
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 | | 7 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 1 | 21 | | 7 (0)| 00:00:01 |
| 2 | VIEW | DEMOV | 902 | 17138 | | 7 (0)| 00:00:01 |
| 3 | SORT ORDER BY | | 968K| 17M| 29M| 6863 (1)| 00:00:01 |
| 4 | TABLE ACCESS FULL | DEMO1 | 968K| 17M| | 1170 (1)| 00:00:01 |
| 5 | VIEW PUSHED PREDICATE | VW_NSO_1 | 1 | 2 | | 0 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | DEMO2_X | 1 | 2 | | 0 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
6 - access("X"="V"."X")
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 
1 - (#keys=0) "V"."N"[NUMBER,22], "V"."X"[CHARACTER,1], "V"."Y"[CHARACTER,1]
2 - "V"."N"[NUMBER,22], "V"."X"[CHARACTER,1], "V"."Y"[CHARACTER,1]
3 - (#keys=1) INTERNAL_FUNCTION("N")[22], "X"[CHARACTER,1], "Y"[CHARACTER,1]
4 - "N"[NUMBER,22], "X"[CHARACTER,1], "Y"[CHARACTER,1]

A full table scan of the big table, with a call to the PL/SQL function for each row, and the sort operation on all rows. Then the top-10 rows are filtered and the nested loop operates on them. But you see the problem here: the cost of the full table scan and the ‘order by’ has been evaluated correctly, but the cost after the VIEW operation is minimized.

My interpretation (but it is just a quick guess) is that the rowset is marked as ‘sorted’, and then the optimizer considers that the cost to get the first rows is minimal (as if it were coming from an index). However, this just ignores the initial cost of producing this rowset.

I can force the plan that I want with a hint: an index full scan to avoid a sort and get the top-10 rows quickly:

SQL> explain plan for
select /*+ first_rows(10) INDEX_DESC(@"SEL$3" "DEMO1"@"SEL$3" ("DEMO1"."N")) */ *
from
( select v.*,dbms_random.value from DEMOV v)
where x in (select x from DEMO2)
order by n desc;
 
Explained.

This plan is estimated with a higher cost than the previous one, and this is why it was not chosen:

SQL> select * from dbms_xplan.display(format=>'+projection');
PLAN_TABLE_OUTPUT
Plan hash value: 2921908728
 
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 | 9 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 1 | 21 | 9 (0)| 00:00:01 |
| 2 | VIEW | DEMOV | 902 | 17138 | 9 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| DEMO1 | 968K| 17M| 8779 (1)| 00:00:01 |
| 4 | INDEX FULL SCAN DESCENDING| DEMO1_N | 968K| | 4481 (1)| 00:00:01 |
| 5 | VIEW PUSHED PREDICATE | VW_NSO_1 | 1 | 2 | 0 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | DEMO2_X | 1 | 2 | 0 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
6 - access("X"="V"."X")
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 
1 - (#keys=0) "V"."N"[NUMBER,22], "V"."X"[CHARACTER,1], "V"."Y"[CHARACTER,1]
2 - "V"."N"[NUMBER,22], "V"."X"[CHARACTER,1], "V"."Y"[CHARACTER,1]
3 - "N"[NUMBER,22], "X"[CHARACTER,1], "Y"[CHARACTER,1]
4 - "DEMO1".ROWID[ROWID,10], "N"[NUMBER,22]

This cost estimation is fine. The cost of getting all rows by index access is higher than with a full table scan, but the optimizer knows that the actual cost is proportional to the number of rows fetched, and it adjusts the cost accordingly. This is fine here because the VIEW has only non-blocking operations. The problem in the first plan, without the hint, was that the same arithmetic was done without realizing that the SORT ORDER BY is a blocking operation, not a permanent sorted structure, and must be completed before the first row can be returned.
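To make the costing argument concrete, here is a toy model in Python of how first-rows costing scales with the number of rows fetched, and why a blocking sort breaks that scaling. This is only an illustration with the cost figures from the plans above; it is not the real CBO formula:

```python
def first_rows_cost(full_cost, total_rows, n, blocking):
    """Toy model: a non-blocking row source can stop after n rows,
    so its cost scales as n/total_rows; a blocking operation (such as
    SORT ORDER BY) must complete before the first row is returned,
    so its full cost is paid regardless of n."""
    if blocking:
        return full_cost
    return full_cost * n / total_rows

# Index full scan (non-blocking): the top-10 cost shrinks proportionally.
idx = first_rows_cost(8779, 968_000, 10, blocking=False)
# Full scan + sort (blocking): the whole sort must be paid before row 1.
srt = first_rows_cost(6863, 968_000, 10, blocking=True)
```

The bug described above amounts to applying the non-blocking scaling to the plan that contains the blocking sort.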

In this example, as in the real case I encountered, the difference in cost is very small (7 versus 9 here), which means that the plan can be ok and then switch to the bad one (full scan, call the function for all rows, sort them) with a small change in statistics. Note that the plan was ok in 10g, but that may simply be related to the PGA settings and a different estimation of the cost of sorting.

 

Cet article CBO, FIRST_ROWS and VIEW misestimate est apparu en premier sur Blog dbi services.

A response to: What makes a community?

Thu, 2017-11-16 03:39

A recent tweet of mine resulted in Martin Widlake writing a really great blog post about what makes a community. Please read it before you continue reading this. There was another response from Stefan Koehler which is worth mentioning as well.

Both Martin and Stefan speak about Oracle communities, because this is where they are involved. At the beginning of Martin's post he writes: “Daniel was not specific about if this was a work/user group community or a wider consideration of society, …” and this was intentional. I don't think it really matters much whether we speak about a community around a product, a community that just comes together to drink beer and discuss the latest football results, or even a community as a family. At least the German translation “Gemeinschaft” applies to a family as well. This can be very few people (mother, father, kids) or more if we include brothers, sisters, grandmas and so on. But still the same rules that Martin outlines in his blog post apply: you'll always have people driving the community, such as organizing dinners (when we speak about families), organizing conferences (when we speak about technical communities) or organizing parties (when we talk about fun communities), or organizing whatever for whatever people make up the specific community. Then you'll always have the people willing to help (the people Martin describes as the people who share and/or talk), and you'll always have the people that consume/attend, which is good as well, because without them you'd have nobody to share with and to organize for.

We at dbi services are a community as well. As we work with various products, the community is not focused on a specific product (well, it is in the area of specific products, of course) but rather on building an environment we like to work in. The community here is tied to technology but detached from a single product. We share the same methodologies, the same passion, and have fun attending great parties that are organized mostly by the non-technical people in our company. In this case you could say: the non-technical people are the drivers of the community in the company, even if the company is very technical by its nature. And here we have the same situation again: some organize, some attend/consume and some share, but all are required (as Martin outlined in his post as well).

Of course I have to say something about the PostgreSQL community: because PostgreSQL is a real community project, the community around it is much more important than with other technical communities. I do not say that you do not need a community for vendor-controlled products, because when the vendor fails to build a community around its product, the product will fail as well. What I am saying is that the PostgreSQL community goes deeper, as the complete product is driven by the community. Of course there are companies that hire people working for the community, but they are not able to influence the direction if there is no agreement about that direction in the community. Sometimes this can make it very hard to progress, and a lot of points need to be discussed, but in the end I believe it is better to have something the majority agrees on. In the PostgreSQL community I think there are several drivers: for sure all the developers are drivers, and the people who take care of all the infrastructure (mailing lists, commitfests, …) are drivers as well. Basically everybody you see on the mailing lists answering questions is a driver, because they keep the community active. Then we have all the people you see in other communities as well: those who share and those who consume/attend. I think you get the point: an open source community is by its nature far more active than what you usually see for non-open-source communities, for one reason: it already starts with the developers and not with a community around a final product. You can be part of such a community from the very beginning, which means writing new features and patches.

Coming back to the original question: What makes a community? Beside what Martin outlined there are several other key points:

  • The direction of the community (no matter if technical or not) must be so that people want to be part of that
  • When we speak about a community around a product: you must identify yourself with the product. When the product goes in a direction you cannot support, for whatever reason, you'll leave sooner or later. The more people leave, the weaker the community
  • It must be easy to participate and to get help
  • A lot of people are willing to spend (free-) time to do stuff for the community
  • There must be a culture which respects you and everybody else
  • Maybe most important: A common goal and people that are able and willing to work together, even if this sometimes requires a lot of discussions

When you have all of these, the drivers, the people who share, and those that attend will come anyway, I believe.

 

Cet article A response to: What makes a community? est apparu en premier sur Blog dbi services.
