DBA Blogs

SQL Server 2016 : A New Security Feature – Always Encrypted

Pythian Group - Tue, 2016-04-12 08:15

Security. This word is so important when it comes to data, and there is a reason why. Every business has its vital data, and this data must be accessible only to those who are authorized. Back in 2009, I wrote an article on what measures to take when it comes to securing SQL Server.

There was a time when we needed a third-party tool to encrypt the data inside SQL Server. Later, Microsoft introduced Transparent Data Encryption (TDE), bundled with the release of SQL Server 2008. You may be wondering why it is so important to encrypt the data. Inside our database, the customer or application may have to enter and store sensitive information such as Social Security Numbers (SSNs) or financial/payment data, which should not be readable in plain text, even by a DBA. With this requirement, strong encryption and/or data masking come into the picture.

With the launch of SQL Server 2016 Release Candidate 0 (RC0), Microsoft has introduced two new features that are my personal favorites: 1) Always Encrypted and 2) Dynamic Data Masking. Today I am going to walk you through the Always Encrypted feature.

First and foremost, we need to have SQL Server 2016 RC0 installed so that we can test this feature. You can download RC0 here. Once you are ready with RC0, create a test database and a table with an encrypted column to store the sensitive data. There are a few prerequisites that I will list for you here, with a T-SQL sketch after the list. If you want, you can use the sample schema from MS.

  1. Create a sample database
  2. Create a Column Master Key
  3. Generate a self-signed certificate (well, you will need to install this certificate on the machine where the application will run)
  4. Configure a Column Encryption Key
  5. Create a test table with an Always Encrypted column
  6. Create an application to insert data into the sample table we created in the previous step
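
For orientation, here is a minimal T-SQL sketch of steps 2, 4 and 5. All object names are hypothetical, the certificate thumbprint placeholder stands for the certificate generated in step 3, and the ENCRYPTED_VALUE blob is elided because in practice it is generated by the SSMS wizard or PowerShell rather than typed by hand:

-- Step 2: column master key metadata pointing at a certificate
-- in the current user's store (thumbprint placeholder left as-is).
CREATE COLUMN MASTER KEY CMK_Demo
WITH (
    KEY_STORE_PROVIDER_NAME = 'MSSQL_CERTIFICATE_STORE',
    KEY_PATH = 'CurrentUser/My/<certificate thumbprint>'
);

-- Step 4: column encryption key, wrapped by the master key.
CREATE COLUMN ENCRYPTION KEY CEK_Demo
WITH VALUES (
    COLUMN_MASTER_KEY = CMK_Demo,
    ALGORITHM = 'RSA_OAEP',
    ENCRYPTED_VALUE = 0x016E /* elided; generated by tooling */
);

-- Step 5: test table with an Always Encrypted column
-- (deterministic encryption requires a BIN2 collation).
CREATE TABLE dbo.TestTable (
    Id  INT IDENTITY(1,1) PRIMARY KEY,
    SSN CHAR(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (
            COLUMN_ENCRYPTION_KEY = CEK_Demo,
            ENCRYPTION_TYPE = DETERMINISTIC,
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        ) NOT NULL
);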

I have created a sample app and a demo script for reference, which you can download here. Basically, what we have to remember is that we cannot insert values into an Always Encrypted column directly; we need to use the tool/app, and the data will always be encrypted when it goes into the database. This ensures that an intruder cannot read the data, as it travels to the database in ciphertext form.

Here is some further reading on this topic. Enjoy reading and testing an excellent feature of SQL Server 2016 RC0.

Categories: DBA Blogs

SQL On The Edge #9 – Azure SQL Database Threat Detection

Pythian Group - Mon, 2016-04-11 14:30

Despite SQL injection being well documented for several years now, every now and then we still run into clients that have had bad experiences because of injection attacks. If you’re not familiar, a SQL injection attack happens when an attacker exploits a vulnerability in how an application passes queries and data to the database, inserting their own malicious SQL code to be executed. If you want to see different examples and get the full details, the Wikipedia page is very comprehensive.
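
To make the mechanics concrete, here is a minimal hypothetical sketch (table and column names invented for illustration). When the application builds its statement by string concatenation, the attacker's input becomes part of the SQL itself:

-- The application intends to run:
--   SELECT * FROM users WHERE name = '<user input>';
-- With benign input (Alice), that is:
SELECT * FROM users WHERE name = 'Alice';
-- But malicious input of:  '; DROP TABLE users; --
-- turns it into two statements plus a trailing comment:
SELECT * FROM users WHERE name = ''; DROP TABLE users; --';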

Depending on how the application is configured, this kind of attack can go all the way from enabling attackers to see data they shouldn’t, to dropping an entire database if your application is allowed to do so. The fact that it’s an application-based vulnerability also means that prevention really depends on proper coding and testing of all inputs in the application. In other words, it can be very time-consuming to go back and plug all the holes if the application wasn’t securely built from the ground up.

Built-in Threat Detection

To address this issue, and as part of the ongoing security story of SQL Server, Microsoft has now invested in a feature called Database Threat Detection. When enabled, the service automatically scans the audit records generated from the database and flags any anomalies that it detects. There are many patterns of injection, so it makes sense to have a machine be the one reading all the SQL and flagging suspicious statements. MS is not disclosing the patterns or the algorithms, in an effort to make working around the detection more difficult.

What about on-premises?

This feature is currently only available on Azure SQL Db. However, we all know that Azure SQL Db is basically the testing ground for all major new features coming to the box product. I would not be surprised if threat detection eventually makes it to the on-premises product as well (my speculation though; nothing has been announced about this).

Pre-requisites
For this new feature you will need Azure SQL Db, and you will also need to have auditing enabled on the database. The current implementation works by analyzing the audit records, so it’s 100% reactive, nothing proactive. You will need a storage account as well, since that’s where the audit logs get stored. The portal will walk you through this whole process; we’ll see that in the demo video.

Current State
As I mentioned, right now this is more of a reactive tool, as it only lets you know after it has detected an anomaly. In the future, I would love to see a preventive configuration where one can specify a policy to completely block suspicious SQL from running. Sure, there can always be false alarms; however, if all the application query patterns are known, this number should be very low. If the database is open to ad-hoc querying, then a policy could block only the suspicious queries, or even shut down the database after several different alerts have been generated. The more flexible the configuration, the better, but in the end what I want to see is a move from alerting me to preventing the injection to begin with.

In the demo, I’m going to go through enabling Azure SQL threat detection, some basic injection patterns and what the alerts look like. Let’s check it out!

 

Categories: DBA Blogs

Enter the Exadata X6

Pythian Group - Mon, 2016-04-11 14:20

Data is exploding and Exadata is catching up. With the proliferation of cloud technology and in-memory databases, the Oracle Exadata X6-2 and X6-8 have it all. They seem to be an ideal platform for hyper-convergence in any data center running Oracle products.

Following are some of the salient features of Oracle X6:

  • The compute nodes have 22-core Intel Xeon E5-2699 v4 processors.
  • The memory is DDR4, 256 GB by default, expandable to 768 GB.
  • The local storage can now be upgraded to 8 drives from the default of 4.
  • The cell servers have ten-core Intel Xeon E5-2630 v4 processors.
  • The flash cache has become massive, reaching up to 12.8 TB per cell; a full rack has 179 TB of flash.
  • There is up to 1.7 PB of raw disk capacity per rack.
  • The InfiniBand network runs at 40 Gb/second. There is no change to InfiniBand from X5, where it became active/active.
  • On the software side, there are many improvements, but the one that caught my eye is the feature that lets Storage Indexes be moved along with the data when a disk hits a predictive or true failure. This surely will improve performance by a long way.

With X6, Exadata has surely come a long way forward.

Categories: DBA Blogs

New A-Team Mobile Persistence Accelerator (AMPA) for Mobile Application Framework

The recent Oracle MAF 2.3 release, already available on OTN, is a major update of MAF coming less than 6 months after the last major release. This release has several new & exciting features,...

We share our skills to maximize your revenue!
Categories: DBA Blogs

FBDA -- 6 : Some Bug Notes

Hemant K Chitale - Sun, 2016-04-10 09:27
Some MoS documents on FBDA Bugs

1.  Bug 16454223  :  Wrong Results  (more rows than expected)

2.  Bug 16898135  :  FBDA does not split partitions  (resulting in rows not being purged)

3.  Bug 18294320  :   ORA-01555 (ORA-2475) on SMON_SCN_TIME

4.  Bug 22456983  :   Limit on SMON_SCN_TIME affecting FBDA

5.  Document 2039070.1 :  Known Issues with Flashback Data Archive
.

Categories: DBA Blogs

Recent Blog Series on Compression

Hemant K Chitale - Sun, 2016-04-10 09:16
These have been my recent posts on Compression.

1.  1.  BASIC Table Compression  (Feb-16)

2.  1b.  More on BASIC Table Compression  (Feb-16)

3.  2.  Compressed Table Partitions  (Mar-16)

4.  3.  Index (Key) Compression  (Mar-16)

5.  4.  RMAN (BASIC) Compression  (Mar-16)

6.  5.  OLTP Compression  (Mar-16)
.
Categories: DBA Blogs

FBDA -- 5 : Testing AutoPurging

Hemant K Chitale - Sun, 2016-04-10 09:06
Tracking data changes after one row was added (ID_COLUMN=2000) on 06-Apr:

SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
06-APR-16 10.53.20.328132 PM +08:00

SQL> select scn_to_timestamp(startscn), scn_to_timestamp(endscn), count(*)
2 from sys_fba_hist_93250
3 group by scn_to_timestamp(startscn), scn_to_timestamp(endscn)
4 order by 1,2;

SCN_TO_TIMESTAMP(STARTSCN)
---------------------------------------------------------------------------
SCN_TO_TIMESTAMP(ENDSCN)
---------------------------------------------------------------------------
COUNT(*)
----------
02-APR-16 11.32.55.000000000 PM
02-APR-16 11.46.11.000000000 PM
450

02-APR-16 11.32.55.000000000 PM
03-APR-16 11.45.24.000000000 PM
550

02-APR-16 11.46.11.000000000 PM
03-APR-16 11.41.33.000000000 PM
5

02-APR-16 11.46.11.000000000 PM
03-APR-16 11.45.24.000000000 PM
445

03-APR-16 11.41.33.000000000 PM
03-APR-16 11.45.24.000000000 PM
5

03-APR-16 11.45.24.000000000 PM
04-APR-16 11.05.33.000000000 PM
1000

06-APR-16 10.40.43.000000000 PM
06-APR-16 10.42.54.000000000 PM
1


7 rows selected.

SQL>
SQL> select count(*) from sys_fba_tcrv_93250;

COUNT(*)
----------
1002

SQL>


More changes on 07-Apr


SQL> insert into test_fbda
2 select 3000, to_char(3000), trunc(sysdate)
3 from dual;

1 row created.

SQL> commit;

Commit complete.

SQL> update test_fbda
2 set date_inserted=date_inserted
3 where id_column=3000;

1 row updated.

SQL> delete test_fbda
2 where id_column < 1001 ;

1000 rows deleted.

SQL> commit;

Commit complete.

SQL>
SQL> l
1 select scn_to_timestamp(startscn) starttime, scn_to_timestamp(endscn) endtime, count(*)
2 from sys_fba_hist_93250
3 group by scn_to_timestamp(startscn), scn_to_timestamp(endscn)
4* order by 1,2,3
SQL> /

STARTTIME ENDTIME COUNT(*)
-------------------------------- -------------------------------- ----------
02-APR-16 11.32.55.000000000 PM 02-APR-16 11.46.11.000000000 PM 450
02-APR-16 11.32.55.000000000 PM 03-APR-16 11.45.24.000000000 PM 550
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.41.33.000000000 PM 5
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.45.24.000000000 PM 445
03-APR-16 11.41.33.000000000 PM 03-APR-16 11.45.24.000000000 PM 5
03-APR-16 11.45.24.000000000 PM 04-APR-16 11.05.33.000000000 PM 1000
04-APR-16 11.09.43.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
06-APR-16 10.40.43.000000000 PM 06-APR-16 10.42.54.000000000 PM 1
07-APR-16 10.27.35.000000000 PM 07-APR-16 10.28.03.000000000 PM 1
07-APR-16 10.28.03.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000

10 rows selected.

SQL>
SQL> l
1 select id_column, trunc(date_inserted), count(*)
2 from test_fbda
3 group by id_column, trunc(date_inserted)
4* order by 1
SQL> /

ID_COLUMN TRUNC(DAT COUNT(*)
---------- --------- ----------
2000 06-APR-16 1
3000 07-APR-16 1

SQL>


I see two new 1000-row sets (04-Apr and 07-Apr).  I should expect only one.

Now that rows for ID_COLUMN less than 1001 have been deleted on 07-Apr, we have to see if and when they get purged from the History table.


On 09-Apr:

SQL> insert into test_fbda
2 select 4000, to_char(4000),trunc(sysdate)
3 from dual;

1 row created.

SQL> commit;

Commit complete.

SQL> update test_fbda
2 set date_inserted=date_inserted
3 where id_column=4000;

1 row updated.

SQL> commit;

Commit complete.

SQL> l
1 select scn_to_timestamp(startscn) starttime, scn_to_timestamp(endscn) endtime, count(*)
2 from sys_fba_hist_93250
3 group by scn_to_timestamp(startscn), scn_to_timestamp(endscn)
4* order by 1,2,3
SQL> /

STARTTIME ENDTIME COUNT(*)
-------------------------------- -------------------------------- ----------
02-APR-16 11.32.55.000000000 PM 02-APR-16 11.46.11.000000000 PM 450
02-APR-16 11.32.55.000000000 PM 03-APR-16 11.45.24.000000000 PM 550
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.41.33.000000000 PM 5
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.45.24.000000000 PM 445
03-APR-16 11.41.33.000000000 PM 03-APR-16 11.45.24.000000000 PM 5
03-APR-16 11.45.24.000000000 PM 04-APR-16 11.05.33.000000000 PM 1000
04-APR-16 11.09.43.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
06-APR-16 10.40.43.000000000 PM 06-APR-16 10.42.54.000000000 PM 1
07-APR-16 10.27.35.000000000 PM 07-APR-16 10.28.03.000000000 PM 1
07-APR-16 10.28.03.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
09-APR-16 11.10.25.000000000 PM 09-APR-16 11.10.48.000000000 PM 1

11 rows selected.

SQL>
SQL> select * from user_flashback_archive
2 /

OWNER_NAME
------------------------------
FLASHBACK_ARCHIVE_NAME
--------------------------------------------------------------------------------
FLASHBACK_ARCHIVE# RETENTION_IN_DAYS
------------------ -----------------
CREATE_TIME
---------------------------------------------------------------------------
LAST_PURGE_TIME
---------------------------------------------------------------------------
STATUS
-------
SYSTEM
FBDA
1 3
02-APR-16 11.24.39.000000000 PM
02-APR-16 11.24.39.000000000 PM



SQL>


As of the morning of 10-Apr (after leaving the database instance running overnight):

SQL> select scn_to_timestamp(startscn) starttime, scn_to_timestamp(endscn) endtime, count(*)
2 from sys_fba_hist_93250
3 group by scn_to_timestamp(startscn), scn_to_timestamp(endscn)
4 order by 1,2,3
5 /

STARTTIME ENDTIME COUNT(*)
-------------------------------- -------------------------------- ----------
02-APR-16 11.32.55.000000000 PM 02-APR-16 11.46.11.000000000 PM 450
02-APR-16 11.32.55.000000000 PM 03-APR-16 11.45.24.000000000 PM 550
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.41.33.000000000 PM 5
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.45.24.000000000 PM 445
03-APR-16 11.41.33.000000000 PM 03-APR-16 11.45.24.000000000 PM 5
03-APR-16 11.45.24.000000000 PM 04-APR-16 11.05.33.000000000 PM 1000
04-APR-16 11.09.43.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
06-APR-16 10.40.43.000000000 PM 06-APR-16 10.42.54.000000000 PM 1
07-APR-16 10.27.35.000000000 PM 07-APR-16 10.28.03.000000000 PM 1
07-APR-16 10.28.03.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
09-APR-16 11.10.25.000000000 PM 09-APR-16 11.10.48.000000000 PM 1

11 rows selected.

SQL> select systimestamp from dual
2 /

SYSTIMESTAMP
---------------------------------------------------------------------------
10-APR-16 08.51.29.398107 AM +08:00

SQL>
SQL> select * from user_flashback_archive
2 /

OWNER_NAME
------------------------------
FLASHBACK_ARCHIVE_NAME
--------------------------------------------------------------------------------
FLASHBACK_ARCHIVE# RETENTION_IN_DAYS
------------------ -----------------
CREATE_TIME
---------------------------------------------------------------------------
LAST_PURGE_TIME
---------------------------------------------------------------------------
STATUS
-------
SYSTEM
FBDA
1 3
02-APR-16 11.24.39.000000000 PM
02-APR-16 11.24.39.000000000 PM



SQL>


So auto-purge of the data as of earlier days (02-Apr to 06-Apr) hasn't yet kicked in?  Let's try a manual purge.

SQL> alter flashback archive fbda purge before timestamp (sysdate-4);

Flashback archive altered.

SQL> select * from user_flashback_archive;

OWNER_NAME
------------------------------
FLASHBACK_ARCHIVE_NAME
--------------------------------------------------------------------------------
FLASHBACK_ARCHIVE# RETENTION_IN_DAYS
------------------ -----------------
CREATE_TIME
---------------------------------------------------------------------------
LAST_PURGE_TIME
---------------------------------------------------------------------------
STATUS
-------
SYSTEM
FBDA
1 3

05-APR-16 11.52.16.000000000 PM



SQL>
SQL> ! sleep 300
SQL> l
1 select scn_to_timestamp(startscn) starttime, scn_to_timestamp(endscn) endtime, count(*)
2 from sys_fba_hist_93250
3 group by scn_to_timestamp(startscn), scn_to_timestamp(endscn)
4* order by 1,2,3
SQL> /

STARTTIME ENDTIME COUNT(*)
-------------------------------- -------------------------------- ----------
02-APR-16 11.32.55.000000000 PM 02-APR-16 11.46.11.000000000 PM 450
02-APR-16 11.32.55.000000000 PM 03-APR-16 11.45.24.000000000 PM 550
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.41.33.000000000 PM 5
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.45.24.000000000 PM 445
03-APR-16 11.41.33.000000000 PM 03-APR-16 11.45.24.000000000 PM 5
03-APR-16 11.45.24.000000000 PM 04-APR-16 11.05.33.000000000 PM 1000
04-APR-16 11.09.43.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
06-APR-16 10.40.43.000000000 PM 06-APR-16 10.42.54.000000000 PM 1
07-APR-16 10.27.35.000000000 PM 07-APR-16 10.28.03.000000000 PM 1
07-APR-16 10.28.03.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
09-APR-16 11.10.25.000000000 PM 09-APR-16 11.10.48.000000000 PM 1

11 rows selected.

SQL>


Although USER_FLASHBACK_ARCHIVE shows that a purge till 05-Apr has been done (the 11:52pm timestamp is strange), I still see older rows in the History table.  The query on the active table does correctly exclude the rows that should no longer be available.


SQL> select scn_to_timestamp(startscn) starttime, scn_to_timestamp(endscn) endtime, count(*)
2 from sys_fba_hist_93250
3 group by scn_to_timestamp(startscn), scn_to_timestamp(endscn)
4 order by 1,2,3;

STARTTIME ENDTIME COUNT(*)
---------------------------------- ---------------------------------- ----------
02-APR-16 11.32.55.000000000 PM 02-APR-16 11.46.11.000000000 PM 450
02-APR-16 11.32.55.000000000 PM 03-APR-16 11.45.24.000000000 PM 550
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.41.33.000000000 PM 5
02-APR-16 11.46.11.000000000 PM 03-APR-16 11.45.24.000000000 PM 445
03-APR-16 11.41.33.000000000 PM 03-APR-16 11.45.24.000000000 PM 5
03-APR-16 11.45.24.000000000 PM 04-APR-16 11.05.33.000000000 PM 1000
04-APR-16 11.09.43.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
06-APR-16 10.40.43.000000000 PM 06-APR-16 10.42.54.000000000 PM 1
07-APR-16 10.27.35.000000000 PM 07-APR-16 10.28.03.000000000 PM 1
07-APR-16 10.28.03.000000000 PM 07-APR-16 10.28.03.000000000 PM 1000
09-APR-16 11.10.25.000000000 PM 09-APR-16 11.10.48.000000000 PM 1

11 rows selected.

SQL> select * from user_flashback_archive;

OWNER_NAME
------------------------------
FLASHBACK_ARCHIVE_NAME
------------------------------------------------------------------------------------------------------------------------------------
FLASHBACK_ARCHIVE# RETENTION_IN_DAYS CREATE_TIME
------------------ ----------------- ---------------------------------------------------------------------------
LAST_PURGE_TIME STATUS
--------------------------------------------------------------------------- -------
SYSTEM
FBDA
1 3
05-APR-16 11.52.16.000000000 PM


SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
10-APR-16 10.52.12.361412 PM +08:00

SQL> select count(*) from test_fbda as of timestamp (sysdate-3);

COUNT(*)
----------
2

SQL>
SQL> select partition_position, high_value
2 from user_tab_partitions
3 where table_name = 'SYS_FBA_HIST_93250'
4 order by 1;

PARTITION_POSITION HIGH_VALUE
------------------ --------------------------------------------------------------------------------
1 MAXVALUE

SQL>



Support Document 16898135.1 states that if Oracle isn't maintaining partitions for the History table, it may not be purging data properly.  Even an ALTER FLASHBACK ARCHIVE ... PURGE doesn't purge data (unless PURGE ALL is issued; see the sketch below).  I'd seen this behaviour in 11.2.0.4.  The bug is supposed to have been fixed in 12.1.0.2, but my 12.1.0.2 environment shows the same behaviour.  The fact that my test database has very little activity (very few SCNs being incremented) shouldn't matter.  The "Workaround" is, of course, unacceptable.
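
For reference, this is the only purge variant that actually removes the history rows in the affected versions.  It discards the entire archived history for the flashback archive, which is exactly why it is unacceptable as a workaround:

SQL> alter flashback archive fbda purge all;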
.
Categories: DBA Blogs

Cloud Integration Challenges and Opportunities

The rapid shift from on-premise applications to a hybrid mix of Software-as-a-Service (SaaS) and on-premise applications has introduced big challenges for companies attempting to simplify enterprise...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Installing Sample Databases To Get Started In Microsoft SQL Server

Pythian Group - Fri, 2016-04-08 08:20

Anyone interested in getting started in SQL Server will need some databases to work with. This article hopes to help the new and future DBA/Developer get started with a few databases.

There are several places to get sample databases, but anyone starting out in SQL Server should go to the Microsoft sample databases. The reason for this is that there are thousands of blogs/tutorials on the internet that use these databases as the basis for their examples.

The steps below detail how to get the sample databases and how to attach them to SQL Server to start working with them.

This blog assumes you have a version of SQL Server installed; if not, you can click here for a great tutorial.

These two databases are a great start to learning SQL Server from both a transactional and a data warehousing point of view.

  • Now that we have downloaded these 2 files, we will need to attach them one at a time. First, open SQL Server Management Studio and connect to your instance.
  • Expand the Object Explorer tree until you can right-click on the folder called Databases, and then left-click on Attach…

[Screenshot: Object Explorer]

  • Click the Add button, then navigate to and select the .mdf file (this is the database file you downloaded).

[Screenshot: the Attach Databases dialog]

  • There is one step that a lot of people getting started in SQL Server often miss. We have just attached a data file, but in order for SQL Server to bring the database online it also needs a log file, which we don’t have. The trick is that if we remove the log file entry from the Attach Databases window, SQL Server will automatically create a log file for us. To do this, simply select the log file and click Remove.

[Screenshot: removing the log file]

  • Finally, when your window looks like the one below, simply click OK to attach the database.

[Screenshot: the completed Attach Databases dialog]

  • Repeat steps 3 to 6 for the second database file and any others you wish to attach.
  • The databases are now online in SQL Server and ready to be used.

[Screenshot: the attached databases in Object Explorer]

And that’s it! You now have an OLTP database and a DW database for BI.
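
If you prefer a script to the GUI, the same attach-and-rebuild-log trick can be done in T-SQL. This is a sketch with a hypothetical database name and file path; adjust both to match your download location:

-- Attaches the data file; because no log file is listed,
-- FOR ATTACH_REBUILD_LOG makes SQL Server build a new empty log,
-- just like removing the log entry in the Attach Databases dialog.
CREATE DATABASE AdventureWorks2014
    ON (FILENAME = N'C:\Data\AdventureWorks2014_Data.mdf')
    FOR ATTACH_REBUILD_LOG;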

Below are links to some good starting tutorials and some additional databases.

Databases

CodePlex

Stack Overflow

Tutorials

W3Schools

Enjoy!

Categories: DBA Blogs


Oracle Live SQL: Explain Plan

Pythian Group - Thu, 2016-04-07 12:14

We’ve all encountered a situation where you want to check a simple query or some syntax for your SQL and don’t have a database around. Of course, most of us have at least a virtual machine for that, but it takes time to fire it up, and if you work on battery, it can leave you without power pretty quickly. Some time ago, Oracle began to offer a new service called “Oracle Live SQL”. It provides you with the ability to test a SQL query, procedure or function, and it has a code library containing a lot of examples and scripts. Additionally, you can store your own private scripts to re-execute them later. It’s a really great online tool, but it lacks some features. I tried to check the execution plan for my query but, unfortunately, it didn’t work:

explain plan for 
select * from test_tab_1 where pk_id<10;

ORA-02402: PLAN_TABLE not found

So, what can we do to make it work? The workaround is not perfect, but it works and can be used in some cases. We need to create our own plan table using the script from an installed Oracle database home, $ORACLE_HOME/rdbms/admin/utlxplan.sql. We can open the file and copy the CREATE TABLE statement for the plan table into the SQL worksheet in Live SQL. You can save the script in the Live SQL code library and make it private to reuse later, because you will need to recreate the table every time you log in to your environment again. So far so good. Is it enough? Let’s check.

explain plan for 
select * from test_tab_1 where pk_id<10;

Statement processed.

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
ERROR: an uncaught error in function display has happened; please contact Oracle support
       Please provide also a DMP file of the used plan table PLAN_TABLE
       ORA-00904: DBMS_XPLAN_TYPE_TABLE: invalid identifier


Ok, the package doesn’t work. I tried to create the types in my schema, but that didn’t work either. So for now dbms_xplan is not going to work for us, and we have to request the information directly from our plan table. It is maybe not as convenient, but it gives us enough and, don’t forget, you can save your script and just reuse it later. You don’t need to memorize the queries. Here is a simple example of how to get information about your last executed query from the plan table:

select parent_id, id, operation, plan_id, operation, options, object_name, object_type, cardinality, cost
from plan_table
where plan_id in (select max(plan_id) from plan_table)
order by 2;

PARENT_ID	ID	OPERATION	PLAN_ID	OPERATION	OPTIONS	OBJECT_NAME	OBJECT_TYPE	CARDINALITY	COST
 - 	0	SELECT STATEMENT	268	SELECT STATEMENT	 - 	 - 	 - 	9	49
0	1	TABLE ACCESS	268	TABLE ACCESS	FULL	TEST_TAB_1	TABLE	9	49

I tried a hierarchical query but didn't find it too useful in the Live SQL environment. Also, you may want to put a unique identifier on your query to find it more easily in the plan_table.

explain plan set statement_id='123qwerty' into plan_table for
select * from test_tab_1 where pk_id<10;

select parent_id, id, operation, plan_id, operation, options, object_name, object_type, cardinality, cost
from plan_table
where statement_id = '123qwerty'
order by id;

PARENT_ID	ID	OPERATION	PLAN_ID	OPERATION	OPTIONS	OBJECT_NAME	OBJECT_TYPE	CARDINALITY	COST
 - 	0	SELECT STATEMENT	272	SELECT STATEMENT	 - 	 - 	 - 	9	3
0	1	TABLE ACCESS	272	TABLE ACCESS	BY INDEX ROWID BATCHED	TEST_TAB_1	TABLE	9	3
1	2	INDEX	272	INDEX	RANGE SCAN	TEST_TAB_1_PK	INDEX	9	2

Now I have my plan_table script and query saved in Live SQL, and I reuse them whenever I want to check the plan for a query. I posted feedback about the ability to use dbms_xplan, and an Oracle representative replied promptly and assured me they are already working on implementing the dbms_xplan feature and many others, including the ability to run only a selected SQL statement in the SQL worksheet (like we do in SQL Developer). It sounds really good and promising, and is going to make the service even better. Stay tuned.

Categories: DBA Blogs

Which Cassandra version should you use for production?

Pythian Group - Thu, 2016-04-07 11:47
What version for a Production Cassandra Cluster?

tl;dr; Latest Cassandra 2.1.x

Long version:

A while ago, Eventbrite wrote:
“You should not deploy a Cassandra version X.Y.Z to production where Z <= 5.” (Full post).

And, in general, it is still valid up until today! Why “in general”? That post is old, and Cassandra has moved a lot since then. So today we can get a different set of sentences:

Just for the ones that don’t want to follow the links, and still pick 3.x for production use, read this:

“Under normal conditions, we will NOT release 3.x.y stability releases for x > 0.  That is, we will have a traditional 3.0.y stability series, but the odd-numbered bugfix-only releases will fill that role for the tick-tock series — recognizing that occasionally we will need to be flexible enough to release an emergency fix in the case of a critical bug or security vulnerability.

We do recognize that it will take some time for tick-tock releases to deliver production-level stability, which is why we will continue to deliver 2.2.y and 3.0.y bugfix releases.  (But if we do demonstrate that tick-tock can deliver the stability we want, there will be no need for a 4.0.y bugfix series, only 4.x tick-tock.)”

What about end of life?

Well, it is about stability; there are still a lot of clusters out there running 1.x and 2.0.x. And since it is open source software, you can always search in the community or even contribute.

If you still have doubts about which version, you can always contact us!

Categories: DBA Blogs


Partner Webcast – Oracle WebLogic Server 12.2.1 Multitenancy and Continuous Availability

As part of the latest major Oracle Fusion Middleware release, Oracle announced the largest release of Oracle WebLogic Server in a decade. Oracle WebLogic Server 12c, the world’s first cloud-native,...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Log Buffer #468: A Carnival of the Vanities for DBAs

Pythian Group - Wed, 2016-04-06 14:38

This Log Buffer Edition rounds up Oracle, SQL Server, and MySQL blog posts of the week.

Oracle:

When using strings such as "FREQ=DAILY; BYDAY=MON,TUE,WED,THU,FRI; BYHOUR=9,10" within the scheduler, sometimes it’s not readily apparent how this will translate to the actual dates and times of day that the scheduled activity will run. To help you understand, a nice little utility to use is EVALUATE_CALENDAR_STRING.
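
As a quick sketch of how that utility is typically called (the calendar string is the one quoted above; the rest is standard DBMS_SCHEDULER usage):

set serveroutput on
declare
  l_next timestamp with time zone;
begin
  dbms_scheduler.evaluate_calendar_string(
    calendar_string   => 'FREQ=DAILY; BYDAY=MON,TUE,WED,THU,FRI; BYHOUR=9,10',
    start_date        => null,
    return_date_after => systimestamp,
    next_run_date     => l_next);
  -- prints the next date/time the schedule would fire
  dbms_output.put_line('Next run: ' || to_char(l_next, 'DY DD-MON-YYYY HH24:MI'));
end;
/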

Most developers have struggled with wires in SOA composites. You may find yourself in a situation where a wire has been deleted. Some missing wires are restored by JDeveloper. Other missing wires have to be added manually, by simply re-connecting the involved adapters and components. Simple.

In-Memory Parallel Query and how it works in 12c.

Oracle recently launched a new family of offerings designed to enable organizations to easily move to the cloud and remove some of the biggest obstacles to cloud adoption. These first-of-a-kind services provide CIOs with new choices in where they deploy their enterprise software and a natural path to easily move business critical applications from on premises to the cloud.

Two Oracle Server X6-2 systems, using the Intel Xeon E5-2699 v4 processor, produced a world record x86 two-chip single application server SPECjEnterprise2010 benchmark result of 27,509.59 SPECjEnterprise2010 EjOPS. One Oracle Server X6-2 system ran the application tier and the second Oracle Server X6-2 system ran the database tier.

SQL Server:

To be able to make full use of the system catalog to find out more about a database, you need to be familiar with the metadata functions.

Powershell To Get Active Directory Users And Groups into SQL!

A code review is a serious business; an essential part of development. Whoever signs off on a code review agrees, essentially, that they would be able to support it in the future, should the original author of the code be unavailable to do it.

Change SQL Server Service Accounts with Powershell

Learn how to validate integer, string, file path, etc. input parameters in PowerShell as well as see how to test for invalid parameters.

MySQL:

The MySQL Utilities team has announced a new beta release of MySQL Utilities. This release includes a number of improvements for usability and stability, and a few enhancements.

In this webinar, we will discuss the practical aspects of migrating a database setup based on traditional asynchronous replication to multi-master Galera Cluster.

Docker has gained widespread popularity in recent years as a lightweight alternative to virtualization. It is ideal for building virtual development and testing environments. The solution is flexible and seamlessly integrates with popular CI tools.

How ProxySQL adds Failover and Query Control to your MySQL Replication Setup

Read-write split routing in MaxScale

Categories: DBA Blogs

Six things I learned about privacy from some of the world’s top IT pros

Pythian Group - Wed, 2016-04-06 07:32

 

Everyone has their own experiences with data privacy. Ask almost anyone, and they’ll be able to tell you how a retail company found some innovative way of collecting their personal data, or how Facebook displayed a suggested post or ad that was uncannily targeted at them based on personal information they did not purposely share.

 

The discussion about privacy can get pretty interesting when you get a group of CIOs and IT leaders in the same room to talk about it.  And that’s just what we did at Pythian’s Velocity of Innovation (Velocity) events in New York, San Francisco and Sydney. With their ears constantly to the ground for the latest trends and unparalleled insider IT knowledge, there’s no better group with whom to talk security and privacy than our Velocity panelists and attendees.

 

At Pythian we manage the revenue-generating systems for some of the world’s leading enterprises, and work with our clients on some of the most difficult security and privacy issues. But staying on top of the evolving threats to data privacy is a constant challenge, even for the most knowledgeable security experts. One of our biggest challenges is helping our clients use data responsibly to improve customer experiences, while protecting individual privacy and safeguarding personally identifiable information from malicious threats. So there’s always more we can learn. This is one of the reasons we keep listening to our customers and peers through various channels and events like our Velocity discussions.

 

Privacy is a high stakes game for companies that want to maximize their use of client data while adhering to legislation designed to protect personal information. In many countries, new, more rigorous laws are emerging to enable individuals to maintain more control over their personal information, and to help them understand when they are being monitored or when their personal information is recorded, shared or sold. These laws include the recently reformed EU data protection rules and PIPEDA in Canada. But companies are responding by finding clever ways of making it attractive for consumers to trade their data for services, or worse, by making it impossible to get service at all without giving up certain information.

With these issues in mind, we had some interesting discussions at each of our Velocity sessions. Here are six main privacy-related themes that emerged at these recent events:

 

  1. Privacy and security are not the same

Although they are related, they refer to different ideas. Privacy is a major goal of security, and relates to a consumer’s right to safeguard their personal information. It can involve vulnerable data such as social media data, customer response data, demographic data or other personal information. In general, privacy is the individual’s right to keep his or her data to himself or herself. By contrast, security refers to the protection of enterprise or government systems. Security may incorporate customer privacy as part of its agenda, but the two are not synonymous. According to Tim Sheedy, principal analyst with Forrester Research and a panelist at our Sydney event, security and privacy are different things. “But if you have a security breach, privacy becomes an issue,” he said. “So they’re closely linked.”

 

  2. Balance is key

Our panelists agreed that you have to balance protecting customer data with moving the business forward. But if you’re in the insurance business, like Cory Isaacson, San Francisco Velocity panelist and CTO at RMS, that’s a tough balance to strike. He has to worry about how his company’s data privacy measures up against industry standards and the regulations they have to meet. At the same time, he has to ensure that data security doesn’t paralyze innovation in the business. He says you have to be innovative in moving the business forward, while keeping up with standards and regulations.

“If you listen to your ISO auditor you’re never going to get out of bed in the morning,” Isaacson said. “If you’re faced with securing highly sensitive customer data, you have to incorporate it into the way you work every day instead of letting it slow you down.”

 

  3. Don’t be creepy

Companies have to walk a fine line between using personal data to enhance the customer experience and being intrusive for no good reason. Forrester’s Sheedy warns that perception is everything. “Don’t be creepy,” he said. “When you’re creepy, that’s when privacy becomes an issue. Don’t underestimate the ability of people to give away information for a discount or for a benefit. But the key is that there has to be a benefit. When there isn’t an upside, it becomes creepy for customers, and you have to wonder if you’ve gone too far.”

 

  4. People will share data in exchange for perks.

Our Sydney panelists and attendees exchanged ideas about new ways that companies are obtaining information about customers, behaviours and preferences. They talked about social media companies learning your behaviours by using more than just the data you share on your profile, including using tactics like tracking your location through GPS and triangulation between cell phone towers.

 

“When I visit Pythian’s head office in Ottawa, the nearby coffee shop has an app that asks if I want them to prepare my usual drink when it detects I’m nearby. Is that intrusive? No, because I opted in,” said Francisco Alvarez, vice president, APAC at Pythian.

 

A Marketing Magazine article reported that in a recent survey, a majority of consumers will share data for certain benefits: 80 percent said they would share data for rewards from a company, 79 percent would share data for cash back, and 77 percent would share data for coupons. The majority also said they would share personal data for location-based discounts (69 percent).

 

  5. Millennials have a higher tolerance for sharing data.

The Australian panelists talked about this extensively, citing personal experience. They expressed the same concern many of us feel about how much data their children’s generation is sharing online and with companies. One of our attendees had a child who was denied a prestigious scholarship because his digital footprint wasn’t favourable. But, Alvarez said, this type of consequence doesn’t seem to deter millennials from freely sharing very personal information. And when it comes to sharing information with businesses in exchange for perceived benefits, their tolerance is very high.

 

In fact, according to a survey conducted by the USC Annenberg Center for Digital Future and Bovitz Inc., millennials (individuals between the ages of 18 and 35) have a different attitude than Internet users 35 years and older when it comes to sharing their personal data online with businesses.  According to the study results, millennials were more likely than older respondents (35 years and older) to trade some of their personal information in exchange for more relevant advertising.

 

  6. New approaches for protecting customer data are on the horizon

In our San Francisco discussion, Pythian Chief Data Officer Aaron Lee cited Google’s new initiative for protecting customer data. The new approach involves placing corporate applications on the Internet itself and focusing security efforts on registered devices and user authentication.
“It’s a really simple idea,” Lee said. “In most companies the internal network is special. If you’re on the internal network you get stuff you can’t get from outside just because you’re sitting there on the network. That’s not the case anymore. With Google’s BeyondCorp initiative it is like there are no more internal networks. If you want to talk to my service, you treat it like you came in from the internet just like everybody else. That way I can apply a consistent approach to authentication, authorization, all that good stuff. It’s actually acting to reduce the surface of complexity. It’s an approach that takes the simplest way that we can secure these things, and treats them the same regardless of where they actually live,” he said.

Categories: DBA Blogs

DEFAULT_CACHE_SIZE mentioned in alert.log of an #Oracle database

The Oracle Instructor - Wed, 2016-04-06 03:53

Today, I got this message in my alert.log file:

Full DB Caching disabled: DEFAULT_CACHE_SIZE should be at least 709 MBs bigger than current size.

When I look at the datafile sizes and compare them with the buffer cache size, it shows:

 

SYS@cloudcdb > select name,bytes/1024/1024 as mb from v$sgainfo;

NAME                                                       MB
-------------------------------------------------- ----------
Fixed SGA Size                                     2,80265045
Redo Buffers                                       13,1953125
Buffer Cache Size                                        3296
In-Memory Area Size                                      2048
Shared Pool Size                                          736
Large Pool Size                                            32
Java Pool Size                                             16
Streams Pool Size                                           0
Shared IO Pool Size                                       208
Data Transfer Cache Size                                    0
Granule Size                                               16
Maximum SGA Size                                         6144
Startup overhead in Shared Pool                    181,258133
Free SGA Memory Available                                   0

14 rows selected.

SYS@cloudcdb > select sum(bytes)/1024/1024 as mb from v$datafile;

        MB
----------
      3675

It is true, the database doesn’t fit completely into the buffer cache; it is missing roughly the amount of space mentioned. There is no such parameter as DEFAULT_CACHE_SIZE, though.
What we have instead is DB_CACHE_SIZE. In order to fix the issue, I used this initialization parameter file to create a new spfile:

[oracle@uhesse-service2 dbs]$ cat initCLOUDCDB.ora
*.audit_file_dest='/u02/app/oracle/admin/CLOUDCDB/adump'
*.audit_trail='db'
*.compatible='12.1.0.2.0'
*.control_files='/u02/app/oracle/oradata/CLOUDCDB/control01.ctl','/u03/app/oracle/fra/CLOUDCDB/control02.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='CLOUDCDB'
*.db_recovery_file_dest='/u03/app/oracle/fra'
*.db_recovery_file_dest_size=10737418240
*.diagnostic_dest='/u02/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=CLOUDCDBXDB)'
*.enable_pluggable_database=true
*.open_cursors=300
*.processes=300
*.remote_login_passwordfile='EXCLUSIVE'
*.undo_tablespace='UNDOTBS1'
*.sga_target=6g
*.pga_aggregate_target=2g
*.inmemory_size=1g
*.db_cache_size=4g
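
For completeness, a minimal sketch of how the new spfile gets created from that pfile and picked up on the next startup (assuming the default $ORACLE_HOME/dbs locations, as in the listing above):

SQL> shutdown immediate
SQL> create spfile from pfile;
SQL> startup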

That reduced the size of the In-Memory Column Store to make room for the buffer cache. Now the database fits nicely into the buffer cache again:

SYS@cloudcdb > select name,bytes/1024/1024 as mb from v$sgainfo;

NAME                                                       MB
-------------------------------------------------- ----------
Fixed SGA Size                                     2,80265045
Redo Buffers                                       13,1953125
Buffer Cache Size                                        4256
In-Memory Area Size                                      1024
Shared Pool Size                                          800
Large Pool Size                                            32
Java Pool Size                                             16
Streams Pool Size                                           0
Shared IO Pool Size                                         0
Data Transfer Cache Size                                    0
Granule Size                                               16
Maximum SGA Size                                         6144
Startup overhead in Shared Pool                    181,290176
Free SGA Memory Available                                   0

14 rows selected.

Accordingly, the message in the alert.log now reads:
Buffer Cache Full DB Caching mode changing from FULL CACHING DISABLED to FULL CACHING ENABLED

Don’t get me wrong: I’m not arguing here against the In-Memory Option or in favor of Full Database Caching, or about whether it makes sense to use either of them or both. This post is just about clarifying the strange message in the alert.log that may confuse people.

And by the way, my demo database is running in the Oracle Cloud :-)


Tagged: 12c New Features
Categories: DBA Blogs

Redshift Table Maintenance: Vacuuming

Pythian Group - Tue, 2016-04-05 12:38
Overview

Part of the appeal of AWS’ Redshift is that it’s a managed service, which means lower administration costs. While you don’t have to hire a full-time DBA to make sure it runs smoothly (from Pythian’s experience it takes ~10-20 hours/month to manage Redshift), there are still some tasks that should be attended to in order to keep it happy:

  • Vacuuming
  • Analyzing
  • Skew analysis
  • Compression analysis
  • Query monitoring

Let us start with Vacuuming as the first topic of a series of deeper dives into this list.

Vacuuming is an integral part of performance maintenance in Redshift. Deletes and updates both flag the old data but don’t actually remove it, so if we’re doing those kinds of actions, vacuuming is needed to reclaim that space. Updates and deletes can be pretty big performance hits (a simple update can easily take 60 seconds on a 50-million-record table on a small cluster, so we’re looking at 20 minutes for a similar update on a 1-billion-record table), so we try to avoid them as much as we can on large tables. The space reclamation portion of the vacuum typically accounts for 10% of the time we see spent on the tables. We can use the SORT ONLY parameter to skip this phase, but we generally have no compelling reason to.

In addition, if tables have sort keys and table loads have not been optimized to sort as they insert, then vacuums are needed to re-sort the data, which can be crucial for performance. While loads into empty tables automatically sort the data, subsequent loads do not. We have seen query times drop by 80% from the implementation of vacuuming, but of course the impact varies with table usage patterns.

The biggest problem we face with vacuuming is the time it takes. While vacuuming does not block reads or writes, it can slow them considerably, as well as take significant resources from the cluster, and you can only vacuum one table at a time. Remember that resource utilization can be constrained through WLM queues. A typical pattern we see among clients is that a nightly ETL load will occur, then we will run vacuum and analyze processes, and finally open the cluster for daily reporting. The faster the vacuum process can finish, the sooner the reports can start flowing, so we generally allocate as many resources as we can.

Operations

Let us start with the process itself. It’s simple enough, and you can get syntax documentation from AWS. There’s not too much that’s tricky with the syntax, and for most use cases

VACUUM myschema.mytablename;

will suffice. Note that INTERLEAVED sort keys need the REINDEX parameter added for all re-indexing to occur; an example follows the query below. You can discern which tables have this set up by using the query:

select schemaname, tablename
from (select n.nspname as schemaname,
             c.relname as tablename,
             min(attsortkeyord) as min_sort
      from pg_namespace as n
      inner join pg_class as c on n.oid = c.relnamespace
      inner join pg_attribute as a on c.oid = a.attrelid
      where c.relkind = 'r'
        and abs(a.attsortkeyord) > 0
        and a.attnum > 0
      group by 1, 2) t
where min_sort < 0
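
For any tables that query returns, the vacuum command from above simply gains the extra keyword (table name hypothetical):

VACUUM REINDEX myschema.my_interleaved_table;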

In order to give the vacuum process more resources, we preface this command with

SET wlm_query_slot_count TO <N>;

where N is the maximum number of query slots we think we can get away with. If you’re not sure what that number should be (we’ll discuss WLM queues in another post), usually 5 is a safe number, though be warned that if the value of wlm_query_slot_count is larger than the number of available slots for the service class, the vacuum command will fail.
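
Putting the pieces together, a typical vacuum session looks something like this (5 slots assumed, per the guidance above):

set wlm_query_slot_count to 5;
vacuum myschema.mytablename;
set wlm_query_slot_count to 1;  -- hand the slots back when done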

Knowing when to vacuum is reasonably straightforward. Anytime after substantial inserts, updates, or deletes are made is always appropriate, but you can be more exacting by querying two tables:

select * from STL_ALERT_EVENT_LOG where Solution LIKE '%VACUUM command%'

and

select * from SVV_TABLE_INFO where unsorted > 8

The latter check works great for daily loads. We will often set the threshold at 8 (percent) immediately after the loads, then run another vacuum process in the evening with a lower threshold (4 percent) that addresses larger tables that take a fair amount of time to vacuum, since we want to avoid that situation in the morning.
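
A sketch of that daily check, listing the tables over the 8 percent threshold, worst first ("schema", "table", unsorted and size are standard SVV_TABLE_INFO columns):

select "schema", "table", unsorted, size
from svv_table_info
where unsorted > 8
order by unsorted desc;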

Last fall AWS built a nice tool to automate vacuums, the Analyze & Vacuum Schema Utility, that incorporates these queries. It works quite well, and we recommend it to our clients as a simple way to set up this maintenance. However, note that it does not automatically add the REINDEX parameter for those tables with INTERLEAVED sort keys. The code is all available, so it is easy enough to adjust for more custom filtering of tables (on fact_* and dim_*, for instance) within a schema.

AWS has built a very useful view, v_get_vacuum_details (and a number of others that you should explore if you haven’t already), in their Redshift Utilities repository that you can use to gain some insight into how long the process took and what it did. None of the system tables for vacuuming keep any historical information, which would be nice for tracking growing process times, but you can see recent activity in STL_QUERY, which gets purged to a history of 7 days. I recommend creating a simple process to track the vacuum data:

First create the table:

create table vacuum_history sortkey (xid) as select * from v_get_vacuum_details where processing_seconds > 0;

Then set up a cron process to populate:

0 18 * * * psql -h myRScluster -U myUser -p5439 -c "INSERT INTO vacuum_history SELECT * FROM v_get_vacuum_details WHERE xid > (SELECT MAX(xid) FROM vacuum_history) AND processing_seconds > 0;" &> /var/log/vacuum_history.log

Once you start to see tables taking an inordinate amount of time to vacuum, some additional intervention may be appropriate. Our team recently ran into a sizable table (3 billion records) that had been taking 3 hours to vacuum daily. An issue occurred where the table needed a partial reload of 2 billion rows. Once that finished, we ran a vacuum which kept going all afternoon. Checking SVV_VACUUM_PROGRESS, we could see that it would take almost 30 hours to complete. Note that restarting a stopped vacuum does not mean the process will pick up where it left off. Since this would have impacted the daily load performance, we killed the vacuum with "cancel <pid>" using the pid pulled from

select pid, text from SVV_QUERY_INFLIGHT where text like '%Vacuum%'

We then ran a deep copy (created a new version of the table and ran a SELECT INTO) which took about 5 hours. The load into an empty table triggers the correct sorting, so a subsequent vacuum took only a few minutes to complete.

Just a note on killing long-running vacuums: it sometimes doesn’t work, especially once the vacuum is in the initialize merge phase. We’ve found that continually issuing the cancel command while it’s in the sort phase is effective, but the point is to be wary of vacuuming large tables for the first time. Vacuums on large, unsorted tables write temporary data to disk, so there is also the potential to run out of disk and freeze the cluster, so be sure to always check that up to 3x the table size of disk space is available; a quick check is sketched below.
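
One way to run that pre-flight check, comparing a table's size against the cluster's free space (the table name is hypothetical; both svv_table_info.size and stv_partitions report in 1 MB blocks):

select
  (select size from svv_table_info where "table" = 'mytablename') as table_mb,
  (select sum(capacity - used) from stv_partitions)               as free_mb;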

There are a few simple strategies to prevent long running vacuums:

  • Load your data in SORTKEY order: The incoming data doesn’t have to be pre-ordered, just greater than existing data.
  • Vacuum often: A table with a small unsorted region vacuums faster than one with a large unsorted region.
  • If tables become too large to vacuum within a maintenance window, consider breaking them apart: We often see multi-billion record tables where the only data being queried is from the last month or two.
  • Deep copies can be a faster solution than vacuums.
Categories: DBA Blogs