Jonathan Lewis

Oracle Scratchpad
Just another Oracle weblog

Not In CTAS

Sun, 2015-04-05 11:49

Everyone gets caught out some of the time with NOT IN.

NOT IN is not the opposite of IN.

This came up in a (fairly typical) question on OTN recently where someone had the task of “deleting 6M rows from a table of 18M”. A common, and perfectly reasonable, suggestion for dealing with a delete on this scale is to consider creating a replacement table holding the data you do want rather than deleting the data you don’t want.  In this case, however, the query for deleting the data looked like this:


DELETE FROM EI.CASESTATUS
     WHERE CASEID NOT IN (SELECT CASEID FROM DO.STG_CASEHEADER);

The suggested code for creating the kept data was this:


CREATE TABLE newTable as
  SELECT * FROM EI.CASESTATUS
     WHERE CASEID IN (SELECT CASEID FROM DO.STG_CASEHEADER);

You might get the same result sets in the two tables after this process – but it depends on the CASEID in both tables never being NULL. (You might think that a column with ID in the name probably ought to be a primary key, or a foreign key with a NOT NULL declaration, but then again there’s that STG_ in the subquery table that might be indicative of a “staging” table, so who knows what might happen if the data’s allowed to start life as dirty data.) Here’s a quick demo to prove the point. First some simple data creation – with an optional insert so that you’ve got two tests in one script – followed by the two strategies for identifying data:


drop table t3;
drop table t2;
drop table t1;

create table t1 (n1 number);
create table t2 (n1 number);

insert into t1 values(null);
insert into t1 values(1);
insert into t1 values(2);

/* two tests available, with or without a null in t2 */

-- insert into t2 values(null);
insert into t2 values(1);

commit;

-- gather stats here

set null n/a
delete from t1
where  t1.n1 not in (select t2.n1 from t2);

prompt  Remainder after delete

select  *  from t1;

rollback;

prompt  Selected on create

create table t3 as
select  *  from t1
where   t1.n1 in (select t2.n1 from t2);

select * from t3;

Then the two sets of output from running the test, first with the NULL insert into t2:


Remainder after delete

        N1
----------
n/a
         1
         2
Selected on create

        N1
----------
         1

We haven’t deleted ANY data from t1, even though we were probably hoping that the 2 would disappear – after all, it’s not in t2. The reason: a t1 row is deleted only if its equality comparison with every t2 row evaluates to FALSE, and since the comparison of 2 with NULL evaluates to NULL (not FALSE) the 2 row won’t be deleted. Similarly the comparison “t1.NULL = t2.anything” evaluates to NULL rather than FALSE, so the NULL row isn’t deleted either.
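
As a one-line aside (my addition, not part of the original script) you can see the three-valued logic at work without any test tables at all:


select 'found a match' from dual where 2 not in (1, null);

-- no rows selected: "2 != 1" is TRUE, but "2 != NULL" is NULL,
-- so the NOT IN predicate as a whole evaluates to NULL, never TRUE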

Still, perhaps the rewrite would have been okay for the data set where we don’t have a NULL in t2:


Remainder after delete

        N1
----------
n/a
         1
Selected on create

        N1
----------
         1

Oops – it still doesn’t produce matching results. This time the row with the 2 has disappeared from t1 in both cases – which might be closer to what the original OTN poster had hoped – but we still have a difference in the survival of the NULL from t1, for the reason given with the previous data set.
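
If the intention is that a NULL value should simply count as “no match”, a NOT EXISTS rewrite keeps the delete and the CTAS consistent with each other – a sketch against my little test tables, not the OP’s code:


delete from t1
where  not exists (select null from t2 where t2.n1 = t1.n1);

create table t3 as
select  *  from t1
where   exists (select null from t2 where t2.n1 = t1.n1);

With this pair the NULL row in t1 is treated as unmatched by both statements (so it’s deleted, and excluded from the CTAS) whether or not t2 contains a NULL; whether that’s the right business answer is, of course, a separate question.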

Footnote:

In passing, the execution plan subsequently supplied by the OP showed a “MERGE JOIN ANTI NA” with stg_caseheader (the subquery table) as the second table. The significance of the NA (Null-aware) is that it tells us that the join column in stg_caseheader definitely doesn’t have a NOT NULL constraint on it. (We can’t draw any conclusion about the join column in casestatus.)
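
As a purely hypothetical follow-up – if the staging load could be guaranteed never to supply a NULL caseid you could declare that fact, which means the optimizer no longer has to allow for NULLs on the subquery side of the comparison:


-- hypothetical: only safe if the load into the staging table can never produce a NULL caseid
alter table do.stg_caseheader modify (caseid not null);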


ANSI expansion

Fri, 2015-03-27 04:46

Here’s a quirky little bug that appeared on the OTN database forum in the last 24 hours; it shows up (in 12c, at least) as an issue that I can best demonstrate with the following cut-n-paste:


SQL> desc purple
 Name                                Null?    Type
 ----------------------------------- -------- ------------------------
 G_COLUMN_001                        NOT NULL NUMBER(9)
 P_COLUMN_002                                 VARCHAR2(2)

SQL> select p.*
  2  from GREEN g
  3    join RED r on g.G_COLUMN_001 = r.G_COLUMN_001
  4    join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001;
  join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001
       *
ERROR at line 4:
ORA-01792: maximum number of columns in a table or view is 1000

SQL> select p.g_column_001, p.p_column_002
  2  from GREEN g
  3    join RED r on g.G_COLUMN_001 = r.G_COLUMN_001
  4    join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001;

no rows selected

A query that requires “star-expansion” fails with ORA-01792, but if you explicitly expand the ‘p.*’ to list all the columns it represents, the optimizer is happy. (The posting also showed the same difference in behaviour when changing “select constant from {table join}” to “select (select constant from dual) from {table join}”.)

The person who highlighted the problem supplied code to generate the tables so you can repeat the tests very easily; one of the quick checks I did was to modify the code to produce tables with a much smaller number of columns and then expand the SQL to see what Oracle would have done with the ANSI syntax – so I worked with only 3 columns each in tables RED and GREEN.
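
If you don’t have the OP’s script to hand, a cut-down set of tables along the following lines is enough to reproduce the expansion – my own sketch: the column names come from the expanded text below, PURPLE follows the describe above, and the datatypes (and constraints) for the GREEN and RED columns are guesses:


create table green (
        g_column_001    number(9)       not null,
        g_column_002    varchar2(10),
        g_column_003    varchar2(10)
);

create table red (
        g_column_001    number(9)       not null,
        r_column__002   varchar2(10),
        r_column__003   varchar2(10)
);

create table purple (
        g_column_001    number(9)       not null,
        p_column_002    varchar2(2)
);

With tables like those in place, this is what I did to expand the query text: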

set serveroutput on
set long 20000

variable m_sql_out clob

declare
    m_sql_in    clob :=
                        '
                        select p.*
                        from GREEN g
                        join RED r on g.G_COLUMN_001 = r.G_COLUMN_001
                        join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001
                        ';
begin

    dbms_utility.expand_sql_text(
        m_sql_in,
        :m_sql_out
    );

end;
/

column m_sql_out wrap word
print m_sql_out

The dbms_utility.expand_sql_text() function is new to 12c, and you’ll need the execute privilege on the dbms_utility package to use it; but if you want to take advantage of it in 11g you can also find it (undocumented) in a package called dbms_sql2.
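
For what it’s worth, here’s a minimal sketch of the 11g version – the same anonymous block as above with just the package name swapped, on the (unverified) assumption that the undocumented procedure keeps the same name and parameter order as its 12c counterpart:


variable m_sql_out clob

declare
    m_sql_in    clob :=
                        '
                        select p.*
                        from GREEN g
                        join RED r on g.G_COLUMN_001 = r.G_COLUMN_001
                        join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001
                        ';
begin
    sys.dbms_sql2.expand_sql_text(
        m_sql_in,
        :m_sql_out
    );
end;
/

print m_sql_out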

Here’s the result of the expansion (you can see why I reduced the column count to 3):


M_SQL_OUT
--------------------------------------------------------------------------------
SELECT "A1"."G_COLUMN_001_6" "G_COLUMN_001","A1"."P_COLUMN_002_7" "P_COLUMN_002"
FROM  (SELECT "A3"."G_COLUMN_001_0" "G_COLUMN_001","A3"."G_COLUMN_002_1"
"G_COLUMN_002","A3"."G_COLUMN_003_2" "G_COLUMN_003","A3"."G_COLUMN_001_3"
"G_COLUMN_001","A3"."R_COLUMN__002_4" "R_COLUMN__002","A3"."R_COLUMN__003_5"
"R_COLUMN__003","A2"."G_COLUMN_001" "G_COLUMN_001_6","A2"."P_COLUMN_002"
"P_COLUMN_002_7" FROM  (SELECT "A5"."G_COLUMN_001"
"G_COLUMN_001_0","A5"."G_COLUMN_002" "G_COLUMN_002_1","A5"."G_COLUMN_003"
"G_COLUMN_003_2","A4"."G_COLUMN_001" "G_COLUMN_001_3","A4"."R_COLUMN__002"
"R_COLUMN__002_4","A4"."R_COLUMN__003" "R_COLUMN__003_5" FROM
"TEST_USER"."GREEN" "A5","TEST_USER"."RED" "A4" WHERE
"A5"."G_COLUMN_001"="A4"."G_COLUMN_001") "A3","TEST_USER"."PURPLE" "A2" WHERE
"A3"."G_COLUMN_001_0"="A2"."G_COLUMN_001") "A1"

Tidying this up:


SELECT
        A1.G_COLUMN_001_6 G_COLUMN_001,
        A1.P_COLUMN_002_7 P_COLUMN_002
FROM    (
        SELECT
                A3.G_COLUMN_001_0 G_COLUMN_001,
                A3.G_COLUMN_002_1 G_COLUMN_002,
                A3.G_COLUMN_003_2 G_COLUMN_003,
                A3.G_COLUMN_001_3 G_COLUMN_001,
                A3.R_COLUMN__002_4 R_COLUMN__002,
                A3.R_COLUMN__003_5 R_COLUMN__003,
                A2.G_COLUMN_001 G_COLUMN_001_6,
                A2.P_COLUMN_002 P_COLUMN_002_7
        FROM    (
                SELECT
                        A5.G_COLUMN_001 G_COLUMN_001_0,
                        A5.G_COLUMN_002 G_COLUMN_002_1,
                        A5.G_COLUMN_003 G_COLUMN_003_2,
                        A4.G_COLUMN_001 G_COLUMN_001_3,
                        A4.R_COLUMN__002 R_COLUMN__002_4,
                        A4.R_COLUMN__003 R_COLUMN__003_5
                FROM
                        TEST_USER.GREEN A5,
                        TEST_USER.RED A4
                WHERE
                        A5.G_COLUMN_001=A4.G_COLUMN_001
                ) A3,
                TEST_USER.PURPLE A2
        WHERE
                A3.G_COLUMN_001_0=A2.G_COLUMN_001
        ) A1

As you can now see, the A1 alias lists all the columns in GREEN, plus all the columns in RED, plus all the columns in PURPLE – totalling 3 + 3 + 2 = 8. (There is a little pattern of aliasing and re-aliasing that turns the join column RED.g_column_001 into G_COLUMN_001_3, making it look at first glance as if it has come from the GREEN table).

You can run a few more checks, increasing the number of columns in the RED and GREEN tables, but essentially once the total number of columns in those two tables goes over 998 the two extra columns from PURPLE push that intermediate inline view past the 1,000 column limit.

Here’s the equivalent expanded SQL if you identify the columns explicitly in the select list (even with several hundred columns in the RED and GREEN tables):


SELECT
        A1.G_COLUMN_001_2 G_COLUMN_001,
        A1.P_COLUMN_002_3 P_COLUMN_002
FROM    (
        SELECT
                A3.G_COLUMN_001_0 G_COLUMN_001,
                A3.G_COLUMN_001_1 G_COLUMN_001,
                A2.G_COLUMN_001 G_COLUMN_001_2,
                A2.P_COLUMN_002 P_COLUMN_002_3
        FROM    (
                SELECT
                        A5.G_COLUMN_001 G_COLUMN_001_0,
                        A4.G_COLUMN_001 G_COLUMN_001_1
                FROM
                        TEST_USER.GREEN A5,
                        TEST_USER.RED A4
                WHERE
                        A5.G_COLUMN_001=A4.G_COLUMN_001
                ) A3,
                TEST_USER.PURPLE A2
        WHERE
                A3.G_COLUMN_001_0=A2.G_COLUMN_001
        ) A1

As you can see, the critical inline view now holds only the original join columns and the columns required for the select list.

If you’re wondering whether this difference in expansion could affect execution plans, it doesn’t seem to; the 10053 trace file includes the following (cosmetically altered) output:


Final query after transformations:******* UNPARSED QUERY IS *******
SELECT
        P.G_COLUMN_001 G_COLUMN_001,
        P.P_COLUMN_002 P_COLUMN_002
FROM
        TEST_USER.GREEN   G,
        TEST_USER.RED     R,
        TEST_USER.PURPLE  P
WHERE
        G.G_COLUMN_001=P.G_COLUMN_001
AND     G.G_COLUMN_001=R.G_COLUMN_001

So it looks as if the routine to transform the syntax puts in a lot of redundant text, then the optimizer takes it all out again.

The problem doesn’t exist with traditional Oracle syntax, by the way; it’s an artefact of Oracle’s expansion of the ANSI syntax, and 11.2.0.4 is quite happy to handle the text generated by the ANSI transformation when there are well over 1,000 columns in the inline view.


ASH

Fri, 2015-03-27 03:41

There was a little conversation on Oracle-L about ASH (active session history) recently which I thought worth highlighting – partly because it raised a detail that I had got wrong until Tim Gorman corrected me a few years ago.

Once every second Oracle copies information about active sessions from v$session into the dynamic performance view v$active_session_history. (There are a couple of exceptions to this rule – for example, if a session has called dbms_lock.sleep() it will appear in v$session with status = ‘ACTIVE’, but it will not be recorded in v$active_session_history.) Each of these snapshots is referred to as a “sample” and may hold zero, one, or many rows.

The rows collected in every tenth sample are flagged for copying into the AWR where, once they’ve been copied into the underlying table, they can be seen in the view dba_hist_active_sess_history. This is where a common misunderstanding occurs: it is not every 10th row of v$active_session_history that gets flagged, it’s every 10th second; and if a sample happens to be empty that’s still the sample that is selected (which means there will be a gap in the output from dba_hist_active_sess_history). In effect dba_hist_active_sess_history holds copies of the information you’d get from v$session if you sampled it once every 10 seconds instead of once per second.

It’s possible to corroborate this through a fairly simple query, as the rows from v$active_session_history that are going to be dumped to the AWR are flagged as they are created:


select
        distinct case is_awr_sample when 'Y' then 'Y' end flag,
        sample_id,
        sample_time
from
        v$active_session_history
where
        sample_time > sysdate - 1/1440
order by
        2,1
;

F  SAMPLE_ID SAMPLE_TIME
- ---------- --------------------------------
     3435324 26-MAR-15 05.52.53.562 PM
     3435325 26-MAR-15 05.52.54.562 PM
     3435326 26-MAR-15 05.52.55.562 PM
     3435327 26-MAR-15 05.52.56.562 PM
     3435328 26-MAR-15 05.52.57.562 PM
     3435329 26-MAR-15 05.52.58.562 PM
     3435330 26-MAR-15 05.52.59.562 PM
     3435331 26-MAR-15 05.53.00.562 PM
Y    3435332 26-MAR-15 05.53.01.562 PM
     3435333 26-MAR-15 05.53.02.572 PM
     3435334 26-MAR-15 05.53.03.572 PM
     3435335 26-MAR-15 05.53.04.572 PM
     3435336 26-MAR-15 05.53.05.572 PM
     3435337 26-MAR-15 05.53.06.572 PM
     3435338 26-MAR-15 05.53.07.572 PM
     3435339 26-MAR-15 05.53.08.572 PM
     3435340 26-MAR-15 05.53.09.572 PM
     3435341 26-MAR-15 05.53.10.582 PM
Y    3435342 26-MAR-15 05.53.11.582 PM
     3435343 26-MAR-15 05.53.12.582 PM
     3435344 26-MAR-15 05.53.13.582 PM
     3435345 26-MAR-15 05.53.14.582 PM
     3435346 26-MAR-15 05.53.15.582 PM
     3435347 26-MAR-15 05.53.16.582 PM
     3435348 26-MAR-15 05.53.17.582 PM
     3435349 26-MAR-15 05.53.18.592 PM
     3435350 26-MAR-15 05.53.19.592 PM
     3435351 26-MAR-15 05.53.20.592 PM
Y    3435352 26-MAR-15 05.53.21.602 PM
     3435355 26-MAR-15 05.53.24.602 PM
     3435358 26-MAR-15 05.53.27.612 PM
     3435361 26-MAR-15 05.53.30.622 PM
     3435367 26-MAR-15 05.53.36.660 PM
     3435370 26-MAR-15 05.53.39.670 PM
     3435371 26-MAR-15 05.53.40.670 PM
     3435373 26-MAR-15 05.53.42.670 PM
     3435380 26-MAR-15 05.53.49.700 PM
     3435381 26-MAR-15 05.53.50.700 PM
Y    3435382 26-MAR-15 05.53.51.700 PM
     3435383 26-MAR-15 05.53.52.700 PM

40 rows selected.

As you can see, at the beginning of the output the samples have a sample_time that increases one second at a time (with a little slippage), and the flagged samples appear every 10 seconds at 5.53.01, 5.53.11 and 5.53.21. Then the instance becomes fairly idle and there are several samples taken over the next 20 seconds or so where we don’t capture any active sessions; in particular there are no rows in the samples for 5.53.31 and 5.53.41. Eventually the instance gets a little busy again and we see active sessions in consecutive samples for the last few seconds, and we can see that we’ve flagged the sample at 5.53.51 for dumping into the AWR.

You’ll notice that I seem to be losing about 1/100th second every few seconds; this is probably a side effect of virtualisation and having a little CPU-intensive work going on in the background. If you see periods where the one second gap in v$active_session_history or 10 second gap in dba_hist_active_sess_history has been stretched by several percent you can assume that the CPU was under pressure over that period. The worst case I’ve seen to date reported gaps of 12 to 13 seconds in dba_hist_active_sess_history.  The “one second” algorithm is “one second since the last snapshot was captured” so if the process that’s doing the capture doesn’t get to the top of the runqueue in a timely fashion the snapshots slip a little.
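
If you want to check your own system for that kind of slippage, a query along these lines (my sketch, not part of the original note) compares the gap in sample_id with the gap in sample_time between consecutive samples:


select
        sample_id,
        sample_id   - lag(sample_id)   over (order by sample_id)       id_gap,
        sample_time - lag(sample_time) over (order by sample_id)       time_gap
from    (
        select  distinct sample_id, sample_time
        from    v$active_session_history
        )
order by
        sample_id
;

Missing sample_ids just mean nothing was active at the time; but if time_gap is consistently larger than id_gap seconds the snapshots are slipping, which suggests the capture process was struggling for CPU.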

When the AWR snapshot is taken, the flagged rows from v$active_session_history are copied to the relevant AWR table. You can adjust the frequency of sampling for both v$active_session_history and dba_hist_active_sess_history, of course – there are hidden parameters to control both: _ash_sampling_interval (1,000 milliseconds) and _ash_disk_filter_ratio (10). There’s also a parameter controlling how much memory should be reserved in the shared pool to hold v$active_session_history: _ash_size (1048618 bytes per session in my case). The basic target is to keep one hour’s worth of data in memory, but if there’s no pressure for memory you can find that v$active_session_history holds more than the hour; conversely, if there’s heavy demand for memory and lots of continuously active sessions you may find that Oracle does “emergency flushes” of v$active_session_history between the normal AWR snapshots. I have heard of people temporarily increasing the memory and reducing the interval and ratio – but I haven’t yet felt the need to do it myself.
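
If you want to see the current values on your own system the usual query against the x$ structures will do it – a sketch only, you need to be connected as SYS, and (as always with hidden parameters) treat the results as information rather than an invitation to change them:


select
        i.ksppinm       parameter_name,
        v.ksppstvl      current_value
from
        x$ksppi         i,
        x$ksppcv        v
where
        i.indx = v.indx
and     i.ksppinm in ('_ash_sampling_interval', '_ash_disk_filter_ratio', '_ash_size')
;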

 


12c MView refresh

Thu, 2015-03-26 07:19

Some time ago I wrote a blog note describing a hack for refreshing a large materialized view with minimum overhead by taking advantage of a single-partition partitioned table. This note describes how Oracle 12c now gives you an official way of doing something similar – the “out of place” refresh.

I’ll start by creating a materialized view and a couple of indexes on the resulting underlying table, then show you three different calls to refresh the view. The materialized view is based on all_objects so it can’t be made available for query rewrite (ORA-30354: Query rewrite not allowed on SYS relations), and I haven’t created any materialized view logs so there’s no question of fast refreshes – but all I intend to do here is show you the relative impact of a complete refresh.


create materialized view mv_objects nologging
build immediate
refresh on demand
as
select
        *
from
        all_objects
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'mv_objects',
		method_opt 	 => 'for all columns size 1'
	);
end;
/

create index mv_obj_i1 on mv_objects(object_name) nologging compress;
create index mv_obj_i2 on mv_objects(object_type, owner, data_object_id) nologging compress 2;

This was a default install of 12c, so there were about 85,000 rows in the view. You’ll notice that I’ve created all the objects as “nologging” – this will have an effect on the work done during some of the refreshes.

Here are the three variants I used – all declared explicitly as complete refreshes:


begin
	dbms_mview.refresh(
		list			=> 'MV_OBJECTS',
		method			=> 'C',
		atomic_refresh		=> true
	);
end;
/

begin
	dbms_mview.refresh(
		list			=> 'MV_OBJECTS',
		method			=> 'C',
		atomic_refresh		=> false
	);
end;
/

begin
	dbms_mview.refresh(
		list			=> 'MV_OBJECTS',
		method			=> 'C',
		atomic_refresh		=> false,
		out_of_place		=> true
	);
end;
/

The first one (atomic_refresh=>true) is the one you have to use if you want to refresh several materialized views simultaneously and keep them self consistent, or if you want to ensure that the data doesn’t temporarily disappear if all you’re worried about is a single view. The refresh works by deleting all the rows from the materialized view then executing the definition to generate and insert the replacement rows before committing. This generates a lot of undo and redo – especially if you have indexes on the materialized view as these have to be maintained “row by row” and may leave users accessing and applying a lot of undo for read-consistency purposes. An example at a recent client site refreshed a table of 6.5M rows with two indexes, taking about 10 minutes to refresh, generating 7GB of redo as it ran, and performing 350,000 “physical reads for flashback new”. This strategy does not take advantage of the nologging nature of the objects – and as a side effect of the delete/insert cycle you’re likely to see the indexes grow to roughly twice their optimal size and you may see the statistic “recursive aborts on index block reclamation” climbing as the indexes are maintained.

The second option (atomic_refresh => false) is quick and efficient – but may result in wrong results showing up in any code that references the materialized view (whether explicitly or by rewrite). The session truncates the underlying table, sets any indexes on it unusable, then reloads the table with an insert /*+ append */. The append means you get virtually no undo generated, and if the table is declared nologging you get virtually no redo. In my case, the session then dispatched two jobs to rebuild the two indexes – and since the indexes were declared nologging the rebuilds generated virtually no redo. (I could have declared them with pctfree 0, which would also have made them as small as possible).

The final option is the 12c variant – the setting atomic_refresh => false is mandatory if we want  out_of_place => true. With these settings the session will create a new table with a name of the form RV$xxxxxx where xxxxxx is the hexadecimal version of the new object id, insert the new data into that table (though not using the /*+ append */ hint), create the indexes on that table (again with names like RV$xxxxxx – where xxxxxx is the index’s object_id). Once the new data has been indexed Oracle will do some name-switching in the data dictionary (shades of exchange partition) to make the new version of the materialized view visible. A quirky detail of the process is that the initial create of the new table and the final drop of the old table don’t show up in the trace file  [Ed: wrong, see comment #1] although the commands to drop and create indexes do appear. (The original table, though it’s dropped after the name switching, is not purged from the recyclebin.) The impact on undo and redo generation is significant – because the table is empty and has no indexes when the insert takes place the insert creates a lot less undo and redo than it would if the table had been emptied by a bulk delete – even though the insert is a normal insert and not an append; then the index creation honours my nologging definition, so produces very little redo. At the client site above, the redo generated dropped from 7GB to 200MB, and the time dropped to 200 seconds which was 99% CPU time.
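
If you want to repeat the comparison on your own system, a simple approach (a sketch, not part of the original test) is to capture a couple of session statistics immediately before and after each refresh call and compare the deltas:


select
        sn.name, ms.value
from
        v$mystat        ms,
        v$statname      sn
where
        ms.statistic# = sn.statistic#
and     sn.name in ('redo size', 'undo change vector size')
;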

Limitations, traps, and opportunities

The manuals say that the out of place refresh can only be used for materialized views that are joins or aggregates and, surprisingly, you actually can’t use the method on a view that simply extracts a subset of rows and columns from a single table.  There’s a simple workaround, though – join the table to DUAL (or some other single row table if you want to enable query rewrite).
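
Here’s a sketch of that workaround using my example (my code, not from the manuals) – the Cartesian join to dual adds no rows and no columns, but turns the definition into a “join” as far as the restriction is concerned:


create materialized view mv_objects nologging
build immediate
refresh on demand
as
select
        ao.*
from
        all_objects     ao,
        dual            d
;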

Because the out of place refresh does an ordinary insert into a new table the resulting table will have no statistics – you’ll have to add a call to gather them. (If you’ve previously been using non-atomic refreshes this won’t be a new problem.) The indexes will have up to date statistics, of course, because they will have been created after the table insert.
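
In other words, something like the following (just the call from the start of this article) tacked on after each out-of-place refresh:


begin
        dbms_stats.gather_table_stats(
                ownname         => user,
                tabname         => 'mv_objects',
                method_opt      => 'for all columns size 1'
        );
end;
/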

The big opportunity, of course, is to change a very expensive atomic refresh into a much cheaper out of place refresh – in some special cases. My client had to use the atomic_refresh=>true option in 11g because they couldn’t afford to leave the table truncated (empty) for the few minutes it took to rebuild; but they might be okay using the out_of_place => true with atomic_refresh=>false in 12c because:

  • the period when something might break is brief
  • if something does go wrong the users won’t get wrong (silently missing) results, they’ll get an Oracle error (probably ORA-08103: object no longer exists)
  • the application uses this particular materialized view directly (i.e. not through query rewrite), and the query plans are all quick, light-weight indexed access paths
  • most queries will probably run correctly even if they run through the moment of exchange

I don’t think we could guarantee that last statement – Oracle Corp. may not officially confirm it, and no number of demonstrations of queries succeeding would amount to a proof – but it does seem to hold in practice. Thanks to “cross-DDL read-consistency”, as it was called in 8i when partition-exchange appeared, and because the old objects still exist in the data files, a running query can keep on using the old location for an object after it has been replaced by a newer version – provided the query doesn’t hit a block that has been overwritten by a new object, or request a space management block that was zeroed out on the “drop”. If you want to make the mechanism as safe as possible you can help – put each relevant materialized view (along with its indexes) into its own tablespace so that the only thing that is going to overwrite an earlier version of the view is the stuff you create on the next refresh.
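
A sketch of that isolation for my example, assuming a dedicated tablespace has already been created (mv_ts is a made-up name):


create materialized view mv_objects nologging
tablespace mv_ts
build immediate
refresh on demand
as
select * from all_objects
;

create index mv_obj_i1 on mv_objects(object_name)
        tablespace mv_ts nologging compress;

create index mv_obj_i2 on mv_objects(object_type, owner, data_object_id)
        tablespace mv_ts nologging compress 2;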