Skip navigation.

Jonathan Lewis

Syndicate content Oracle Scratchpad
Just another Oracle weblog
Updated: 9 hours 40 min ago

Viewing Figures

Sun, 2014-08-03 17:05

The day has just started in Singapore – though it’s just coming up to midnight back home – and the view counter has reached 4,00,009 despite the fact that I’ve not been able to contribute much to the community for the last couple of months. Despite the temporary dearth of writing it’s time to have a little review to see what’s been popular and how things have changed in the 10 months it took to accumulate the last 500,000 views so here are some of the latest WordPress stats.

All time ratings AWR / Statspack 80,997 Updated from time to time NOT IN 51,673 February 2007 Cartesian Merge Join 39,265 December 2006 dbms_xplan in 10g 37,725 November 2006 Shrink Tablespace 31,184 November 2006 Ratings over the last year AWR / Statspack 13,905 Updated from time to time AWR Reports 9,494 February 2011 Shrink Tablespace 8,402 February 2010 Lock Modes 8,221 June 2010 NOT IN 6,388 February 2007

The figures for the previous half million views (opens in a new window) are very similar for most of the top 5 although “Analysing Statspack (1)” has been pushed from 5th place to 6th place in the all-time greats; and “Lock Modes” has swapped places with “NOT IN” in the annual ratings. As the annual WordPress summary says: “… your posts have staying powere, why not write more about …”.

The number of followers has gone up from about 2,500  to just over 3,900 but, as I said last time, I suspect that there’s a lot of double counting related to twitter.

 


Analogy – 2

Sun, 2014-08-03 06:41

I suggested a little while ago that thinking about the new in-memory columnar store as a variation on the principle of bitmap indexes was quite a good idea. I’ve had a couple of emails since then asking me to expand on the idea because “it’s wrong” – I will follow that one up as soon as I can, but in the meantime here’s another angle for connecting old technology with new technology:

It is a feature of in-memory column storage that the default strategy is to store all columns in memory. But it’s quite likely that you’ve got some tables where a subset of the columns are frequently accessed and other columns are rarely accessed and it might seem a waste of resources to keep all the columns in memory just for the few occasional queries. So the feature allows you to de-select columns with the “no inmemory({list of columns})” option – it’s also possible to use different degrees of compression for different columns, of course, which adds another dimension to design and planning – but that’s a thought for another day.

So where else do you see an example of being selective about where you put columns ?  Index Organized Tables (IOTs) – where you can choose to put popular columns in the index (IOT_TOP) segment, and the rest in the overflow segment, knowing that this can give you good performance for critical queries, but less desirable performance for the less important or less frequent queries. IOTs allow you to specify the (typically short) list of columns you want “in” – it might be quite nice if the same were true for the in-memory option, I can imagine cases where I would want to include a small set of columns and exclude a very large number of them (for reasons that bring me back to the bitmap index analogy).

 


Parallel Plans

Sun, 2014-07-27 13:39

I’ve popped this note to the top of the stack because I’ve added an index to Randolf Geist’s series on parallel execution skew, and included a reference his recent update to the XPLAN_ASH utility.

This is the directory for a short series I wrote discussing how to interpret parallel execution plans in newer versions of Oracle.

For other aspects of parallel execution, here are links to several articles by Randolf Geist, published on his own blog or on Oracle’s Technet:

One of the awkward problems you can encounter with parallel execution is data skew – which has the potential to make just one slave in a set do (almost) all the work hence reducing the performance to something close to serial execution times.  has written a series of articles on Parallel Skew that has been published by AllthingsOracle over the last few months.

And a summary posting from Randolf listing the 5 articles above, but also including a set of short videos on the topic.

 


Analogy

Sun, 2014-07-27 01:02

So 12.1.0.2 is out with a number of interesting new features, of which the most noisily touted is the “in-memory columnar storage” feature. As ever the key to making best use of a feature is to have an intuitive grasp of what it gives you, and it’s often the case that a good analogy helps you reach that level of understanding; so here’s the first thought I had about the feature during one of the briefing days run by Maria Colgan.

“In-memory columnar storage gives you bitmap indexes on OLTP systems without the usual disastrous locking side effects.”

Obviously the analogy isn’t perfect … but I think it’s very close:  for each column stored you use a compression technique to pack the values for a large number of rows into a very small space, and for each stored row you can derive the rowid by arithmetic.  In highly concurrent OLTP systems there’s still room for some contention as the session journals are applied to the globally stored compressed columns (but then, private redo introduces some effects of that sort anyway); and the “indexes” have to be created dynamically as tables are initially accessed (but that’s a startup cost, it’s timing can be controlled, and it’s basically limited to a tablescan).

Whatever the technical hand-waving it introduces – thinking of the in-memory thing as enabling real-time bitmaps ought to help you think of ways to make good use of the feature.

 

 


Deferrable RI – 2

Sun, 2014-07-13 12:46

A question came up on Oracle-L recently about possible locking anomalies with deferrable referential integrity constraints.

An update by primary key is taking a long time; the update sets several columns, one of which is the child end of a referential integrity constraint. A check on v$active_session_history shows lots of waits for “enq: TX – row lock contention” in mode 4 (share), and many of these waits also identify the current object as the index that has been created to avoid the “foreign key locking” problem on this constraint (though many of the waits show the current_obj# as -1). A possible key feature of the issue is that foreign key constraint is defined as “deferrable initially deferred”. The question is, could such a constraint result in TX/4 waits.

My initial thought was that if the constraint was deferrable it was unlikely, there would have to be other features coming into play.

Of course, when the foreign key is NOT deferrable it’s easy to set up cases where a TX/4 appears: for example you insert a new parent value without issuing a commit then I insert a new matching child, at that point my session will wait for your session to commit or rollback. If you commit my insert succeeds if you rollback my session raises an error (ORA-02291: integrity constraint (schema_name.constraint_name) violated – parent key not found). But if the foreign key is deferred the non-existence (or potential existence, or not) of the parent should matter.  If the constraint is deferrable, though, the first guess would be that you could get away with things like this so long as you fixed up the data in time for the commit.

I was wrong. Here’s a little example:


create table parent (
	id	number(4),
	name	varchar2(10),
	constraint par_pk primary key (id)
)
;

create table child(
	id_p	number(4)
		constraint chi_fk_par
		references parent
		deferrable initially deferred,
	id	number(4),
	name	varchar2(10),
	constraint chi_pk primary key (id_p, id)
)
;

insert into parent values (1,'Smith');
insert into parent values (2,'Jones');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

insert into child values(2,1,'Jack');
insert into child values(2,2,'Jill');

commit;

begin
	dbms_stats.gather_table_stats(user,'parent');
	dbms_stats.gather_table_stats(user,'child');
end;
/

pause Press return

update child set id_p = 3 where id_p = 2 and id = 2;

If you don’t do anything after the pause and before the insert then the update will succeed – but fail on a subsequent commit unless you insert parent 3 before committing. But if you take advantage of the pause to use another session to insert parent 3 first, the update will then hang waiting for the parent insert to commit or rollback – and what happens next may surprise you. Basically the deferrability doesn’t protect you from the side effects of conflicting transactions.

The variations on what can happen next (insert the parent elsewhere, commit or rollback) are interesting and left as an exercise.

I was slightly surprised to find that I had had a conversation about this sort of thing some time ago, triggered by a comment to an earlier post. If you want to read a more thorough investigation of the things that can happen and how deferrable RI works then there’s a good article at this URL.

 


SQL Plan Baselines

Sun, 2014-07-06 11:34

Here’s a thread from Oracle-L that reminded of an important reason why you still have to hint SQL sometimes (rather than following the mantra “if you can hint it, baseline it”).

I have a query that takes 77 seconds to optimize (it’s not a production query, fortunately, but one I engineered to make a point). I can enable sql plan baseline capture and create a baseline for it, and given the nature of the query I can be confident that the resulting plan will always be exactly the plan I want. If I have to re-optimize the query at any time  (because it runs once per hour, say, and is constantly being flushed from the library cache) how much time will the SQL plan baseline save for me ?

The answer is NONE.

The first thing that the optimizer does for a query with a stored sql plan baseline is to optimize it as if the baseline did not exist.

If I want to get rid of that 77 seconds I’ll have to extract (most of) the hints from the SQL Plan Baseline and write them into the query.  (Or, maybe, create a Stored Outline – except that they’re deprecated in the latest version of Oracle, and I’d have to check whether the optimizer used the same strategy with stored outlines or whether it applied the outline before doing any optimisation). Maybe we could do with a hint which forces the optimizer to attempt to use an existing, accepted SQL Baseline without attempting the initial optimisation pass.

 


Adjusting Histograms

Fri, 2014-07-04 13:32

This is a quick response to a question on an old blog post asking how you can adjust the high value if you’ve already got a height-balanced histogram in place. It’s possible that someone will come up with a tidier method, but this was just a quick sample I created and tested on 11.2.0.4 in a few minutes.  (Note – this is specifically for height-balanced histograms,  and it’s not appropriate for 12c which has introduced hybrid histograms that will require me to modify my “histogram faking” code a little).

rem
rem	Script:		adjust_histogram.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Jun 2014
rem	Purpose:
rem
rem	Last tested
rem		11.2.0.4
rem	Not tested
rem		12.1.0.1
rem		11.1.0.7
rem		10.2.0.5
rem	Outdated
rem		 9.2.0.8
rem		 8.1.7.4	no WITH subquery
rem
rem	Notes:
rem	Follow-on from a query on my blog about setting the high value
rem	when you have a histogram.  We could do this by hacking, or by
rem	reading the user_tab_histogram values and doing a proper prepare
rem

start setenv
set timing off

execute dbms_random.seed(0)

drop table t1;

begin
	begin		execute immediate 'purge recyclebin';
	exception	when others then null;
	end;

	begin
		dbms_stats.set_system_stats('MBRC',16);
		dbms_stats.set_system_stats('MREADTIM',10);
		dbms_stats.set_system_stats('SREADTIM',5);
		dbms_stats.set_system_stats('CPUSPEED',1000);
	exception
		when others then null;
	end;
/*
	begin		execute immediate 'begin dbms_stats.delete_system_stats; end;';
	exception	when others then null;
	end;

	begin		execute immediate 'alter session set "_optimizer_cost_model"=io';
	exception	when others then null;
	end;

	begin		execute immediate 'alter session set "_optimizer_gather_stats_on_load" = false';
	exception	when others then null;
	end;
*/

	begin		execute immediate  'begin dbms_space_admin.materialize_deferred_segments(''TEST_USER''); end;';
	exception	when others then null;
	end;

end;
/

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	trunc(sysdate,'YYYY') + trunc(dbms_random.normal * 100,1)	d1
from
	generator	v1,
	generator	v2
where
	rownum <= 1e4
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 32'
	);

end;
/

spool adjust_histogram.lst

prompt	==================
prompt	Current High Value
prompt	==================

select to_char(max(d1),'dd-Mon-yyyy hh24:mi:ss') from t1;

prompt	==============================
prompt	Initial Histogram distribution
prompt	==============================

select
	endpoint_number,
	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val,
	endpoint_value,
	lag(endpoint_value,1) over(order by endpoint_number) lagged_epv,
	endpoint_value -
		lag(endpoint_value,1) over(order by endpoint_number)  delta
from	user_tab_histograms
where
	table_name = 'T1'
and	column_name = 'D1'
;

rem
rem	Note - we can't simply overwrite the last srec.novals
rem	because that doesn't adjust the stored high_value.
rem	We have to make a call to prepare_column_values,
rem	which means we have to turn the stored histogram
rem	endpoint values into their equivalent date types.
rem

prompt	==================
prompt	Hacking the values
prompt	==================

declare

	m_distcnt		number;
	m_density		number;
	m_nullcnt		number;
	srec			dbms_stats.statrec;
	m_avgclen		number;

	d_array			dbms_stats.datearray := dbms_stats.datearray();
	ct			number;

begin

	dbms_stats.get_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'd1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	); 

	ct := 0;
	for r in (
		select	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val
		from	user_tab_histograms
		where	table_name = 'T1'
		and	column_name = 'D1'
		order by endpoint_number
	) loop

		ct := ct + 1;
		d_array.extend;
		d_array(ct) := r.d_val;
		if ct = 1 then
			srec.bkvals(ct) := 0;
		else
			srec.bkvals(ct) := 1;
		end if;

	end loop;

	d_array(ct) := to_date('30-Jun-2015','dd-mon-yyyy');

	dbms_stats.prepare_column_values(srec, d_array);

	dbms_stats.set_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'd1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	);
end;
/

prompt	============================
prompt	Final Histogram distribution
prompt	============================

select
	endpoint_number,
	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val,
	endpoint_value,
	lag(endpoint_value,1) over(order by endpoint_number) lagged_epv,
	endpoint_value -
		lag(endpoint_value,1) over(order by endpoint_number)  delta
from	user_tab_histograms
where
	table_name = 'T1'
and	column_name = 'D1'
;

spool off

doc

#


Philosophy 22

Thu, 2014-07-03 02:59

Make sure you agree on the meaning of the jargon.

If you had to vote would you say that the expressions “more selective” and “higher selectivity” are different ways of expressing the same idea, or are they exact opposites of each other ? I think I can safely say that I have seen people waste a ludicrous amount of time arguing past each other and confusing each other because they didn’t clarify their terms (and one, or both, parties actually misunderstood the terms anyway).

Selectivity is a value between 0 and 1 that represents the fraction of data that will be selected – the higher the selectivity the more data you select.

If a test is “more selective” then it is a harsher, more stringent, test and returns less data  (e.g. Oxford University is more selective than Rutland College of Further Education): more selective means lower selectivity.

If there’s any doubt when you’re in the middle of a discussion – drop the jargon and explain the intention.

Footnote

If I ask:  “When you say ‘more selective’ do you mean ….”

The one answer which is absolutely, definitely, unquestionably the wrong reply is: “No, I mean it’s more selective.”

 


Comparisons

Wed, 2014-07-02 10:09

Catching up (still) from the Trivadis CBO days, here’s a little detail which had never crossed my mind before.


where   (col1, col2) < (const1, const2)

This isn’t a legal construct in Oracle SQL, even though it’s legal in other dialects of SQL. The logic is simple (allowing for the usual anomaly with NULL): the predicate should evaluate to true if (col1 < const1), or if (col1 = const1 and col2 < const2). The thought that popped into my mind when Markus Winand showed a slide with this predicate on it – and then pointed out that equality was the only option that Oracle allowed for multi-column operators – was that, despite not enabling the syntax, Oracle does implement the mechanism.

If you’re struggling to think where, it’s in multi-column range partitioning: (value1, value2) belongs in the partition with high value (k1, k2) if (value1 < k1) or if (value1 = k1 and value2 < k2).