Oracle Scratchpad

October 30, 2014

Quiz night

Filed under: CBO,Oracle — Jonathan Lewis @ 6:43 pm GMT Oct 30,2014

Here’s a little puzzle that came up on OTN recently.  (No prizes for following the URL to find the answer) (Actually, no prizes anyway). There’s more in the original code sample than was really needed, so although I’ve done a basic cut and paste from the original I’ve also eliminated a few lines of the text:


execute dbms_random.seed(0)

create table t
as
select rownum as id,
       100+round(ln(rownum/3.25+2)) aS val2,
       dbms_random.string('p',250) aS pad
from dual
connect by level <= 1000
order by dbms_random.value;

begin
  dbms_stats.gather_table_stats(ownname          => user,
                                tabname          => 'T',
                                method_opt       => 'for all columns size 254'
  );
end;
/

column endpoint_value format 9999
column endpoint_number format 999999
column frequency format 999999

select endpoint_value, endpoint_number,
       endpoint_number - lag(endpoint_number,1,0)
                  OVER (ORDER BY endpoint_number) AS frequency
from user_tab_histograms
where table_name = 'T'
and column_name = 'VAL2'
order by endpoint_number
;

alter session set optimizer_mode = first_rows_100;

explain plan set statement_id '101' for select * from t where val2 = 101;
explain plan set statement_id '102' for select * from t where val2 = 102;
explain plan set statement_id '103' for select * from t where val2 = 103;
explain plan set statement_id '104' for select * from t where val2 = 104;
explain plan set statement_id '105' for select * from t where val2 = 105;
explain plan set statement_id '106' for select * from t where val2 = 106;

select statement_id, cardinality from plan_table where id = 0;

The purpose of the method_opt in the gather_table_stats() call is to ensure we get a frequency histogram on val2; and the query against the user_tab_columns view should give the following result:


ENDPOINT_VALUE ENDPOINT_NUMBER FREQUENCY
-------------- --------------- ---------
           101               8         8
           102              33        25
           103             101        68
           104             286       185
           105             788       502
           106            1000       212

Given the perfect frequency histogram, the question then arises why the optimizer seems to calculate incorrect cardinalities for some of the queries; the output from the last query is as follows:


STATEMENT_ID                   CARDINALITY
------------------------------ -----------
101                                      8
102                                     25
103                                     68
104                                    100           -- expected prediction 185
105                                    100           -- expected prediction 502
106                                    100           -- expected prediction 212

I’ve disabled comments so that you can read the answer at OTN if you want to – but see if you can figure out the reason before reading it. (This reproduces on 11g and 12c – and probably on earlier versions all the way back to 9i).

I haven’t done anything extremely cunning with hidden parameters, materialized views, query rewrite, hidden function calls, virtual columns or any other very dirty tricks, by the way.

October 28, 2014

First Rows

Filed under: CBO,Hints,Oracle — Jonathan Lewis @ 7:01 am GMT Oct 28,2014

Following on from the short note I published about the first_rows optimizer mode yesterday here’s a note that I wrote on the topic more than 2 years ago but somehow forgot to publish.

I can get quite gloomy when I read some of the material that gets published about Oracle; not so much because it’s misleading or wrong, but because it’s clearly been written without any real effort being made to check whether it’s true. For example, a couple of days ago [ed: actually some time around May 2012] I came across an article about optimisation in 11g that seemed to be claiming that first_rows optimisation somehow “defaulted” to first_rows(1) , or first_rows_1, optimisation if you didn’t supply a final integer value.

For at least 10 years the manuals have described first_rows (whether as a hint or as a parameter value) as being available for backwards compatibility; so if it’s really just a synonym for first_rows_1 (or first_rows(1)) you might think that the manuals would actually mention this. Even if the manuals didn’t mention it you might just consider a very simple little test before making such a contrary claim, and if you did make such a test and found that your claim was correct you might actually demonstrate (or describe) the test so that other people could check your results.

It’s rather important, of course, that people realise (should it ever happen) that first_rows has silently changed into first_rows_1 because any code that’s using it for backwards compatibility might suddenly change execution path when you did the critical upgrade where the optimizer changed from “backwards compatibility” mode to “completely different optimisation strategy” mode. So here’s a simple check (run from 11.2.0.4 – to make sure I haven’t missed the switch):

begin
        dbms_stats.set_system_stats('MBRC',16);
        dbms_stats.set_system_stats('MREADTIM',10);
        dbms_stats.set_system_stats('SREADTIM',5);
        dbms_stats.set_system_stats('CPUSPEED',1000);
end;
/

create table t2 as
select
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
from    all_objects
where rownum <= 3000
;

begin
        dbms_stats.gather_table_stats(
                user,
                't2',
                method_opt => 'for all columns size 1'
 
        );
end;
/

create index t2_i1 on t2(n1);

SQL> select /*+ all_rows */ n2 from t2 where n1 = 15;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    15 |   120 |    12   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |    15 |   120 |    12   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N1"=15)

You’ll notice that I’ve created my data in a way that means I’ll have 15 rows with the value 15, scattered evenly through the table. As a result of the scattering the clustering_factor on my index is going to be similar to the number of rows in the table, and the cost of fetching all the rows by index is going to be relatively high. Using all_rows optimization Oracle has chosen a tablescan.

So what happens if I use the first_rows(1) hint, and how does this compare with using the first_rows hint ?

SQL> select /*+ first_rows(1) */ n2 from t2 where n1 = 15;

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     2 |    16 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2    |     2 |    16 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T2_I1 |       |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=15)

SQL> select /*+ first_rows */ n2 from t2 where n1 = 15;

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    15 |   120 |    16   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2    |    15 |   120 |    16   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T2_I1 |    15 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=15)

You might not find it too surprising to see that Oracle used the indexed access path in both cases. Does this mean that first_rows really means (or defaults to) first_rows(1) ?

Of course it doesn’t – you need only look at the estimated cost and cardinality to see this. The two mechanisms are clearly implemented through difference code paths. The first_rows method uses some heuristics to restrict the options it examines, but still gives us the estimated cost and cardinality of fetching ALL the rows using the path it has chosen. The first_rows(1) method uses arithmetic to decide on the best path for getting the first row, and adjusts the cost accordingly to show how much work it thinks it will have to do to fetch just that one row.

Of course, no matter how inane a comment may seem to be, there’s always a chance that it might be based on some (unstated) insight. Is there any way in which first_rows(n) and first_rows are related ? If so could you possibly manage to claim that this establishes a “default value” link?

Funnily enough there is a special case: if you try hinting with first_rows(0) – that’s the number zero – Oracle will use the old first_rows optimisation method – you can infer this from the cost and cardinality figures, or you can check the 10053 trace file, or use a call to dbms_xplan() to report the outline.  It’s an interesting exercise (left to the reader) to decide whether this is the lexical analyzer deciding to treat the “(0)” as a new – and meaningless – token following the token “first_rows”, or whether it is the optimizer recognising the lexical analyzer allowing “first_rows(0)” as a token which the optimizer is then required to treat as first_rows.

Mind you, if you only want the first zero rows of the result set there’s a much better trick you can use to optimise the query – don’t run the query.

 

October 27, 2014

First Rows

Filed under: CBO,Oracle — Jonathan Lewis @ 7:21 am GMT Oct 27,2014

I received an email earlier on this year asking me my opinion of the first_rows option for the optimizer mode. My correspondent was looking at a database with the following settings:

optimizer_mode=first_rows
_sort_elimination_cost_ratio=4

He felt that first_rows was a very old optimizer instruction that might cause suboptimal execution plans in it’s attempt to avoid blocking operations. As for the cost ratio, no-one seemed to be able to explain why it was there.

He was correct; I’ve written the first_rows option a few times in the past – it was left in for backwards compatibility, and reported as such from 9i onwards!

As for the _sort_elimination_cost_ratio – it’s (probably) there to work around the problems caused by first_rows optimisation when you have an ORDER BY clause that could be met by walking an index in competition with a WHERE clause that could be met from another index. Under first_rows the optimizer is highly likely to choose the index for the order by to avoid sorting; but the _sort_elimination_cost_ratio says:  “if the cost of using the index for the ORDER BY is more than N times the cost of using the other index for the WHERE clause then use an index on the WHERE clause and sort the result.”

 The fact that the parameter has been set so low in this case suggests that the end-user:
  1. set first_rows because “it’s an OLTP system” – a myth that even the manuals promoted
  2. found lots of queries taking silly index-driven execution plans because they’d use (say) a primary key index to access and discard vast amounts of  data in the required order, instead of picking up a small amount of data using a better choice of index and then sorting it.
I’ve said many times it in the past: you probably don’t need any first_rows(n) or first_rows_N optimisation, but if you’re using first_rows (i.e. the old option) you really ought to get away from it. Depending on the time you have for testing and your aversion to risk, you might go straight to all_rows, or switch to first_rows_10.  (First_rows_1 can be over-aggressive and introduce some of the same side effects as first_rows).

October 22, 2014

Deadlocks

Filed under: deadlocks,Locks,Oracle — Jonathan Lewis @ 6:34 pm GMT Oct 22,2014

A recent question on the OTN forum asked about narrowing down the cause of deadlocks, and this prompted me to set up a little example. Here’s a deadlock graph of a not-quite-standard type:


Deadlock graph:
                                          ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name                             process session holds waits  process session holds waits
TX-00040001-000008EC-00000000-00000000         50     249     X             48       9           X
TX-000A001F-000008BC-00000000-00000000         48       9     X             50     249           S


My session (the one that dumped the trace file) is 249, and I was blocked by session 9. The slight anomaly, of course, is that I was waiting on a TX lock in mode 4 (Share) rather than the more common mode 6 (eXclusive).

There are plenty of notes on the web these days to tell you that this wait relates in some way to a unique index (or some associated referential integrity) or an ITL wait. (Inevitably there are a couple of other less frequently occurring and less well documented reasons, such as waits for tablespaces to change state but I’m going to ignore those for now). The question is, how do I tell whether this example is related to uniqueness (indexing) or ITLs ? For recent versions of Oracle the answer is in the rest of the trace file which now hold the recent wait history for the session that dumped the trace file.

Reading down my trace file, past the line which says “Information for THIS session”, I eventually get to this:


    Current Wait Stack:
     0: waiting for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn<<16 | slot=0xa001f, sequence=0x8bc
        wait_id=80 seq_num=81 snap_id=1
 

So it didn’t take me long to find out I had an ITL problem (which should be a pretty rare occurrence in newer versions of Oracle); but there’s more:

...

    There is at least one session blocking this session.
      Dumping 1 direct blocker(s):
        inst: 1, sid: 9, ser: 40192
      Dumping final blocker:
        inst: 1, sid: 9, ser: 40192
    There are 2 sessions blocked by this session.
    Dumping one waiter:
      inst: 1, sid: 357, ser: 7531
      wait event: 'enq: TX - allocate ITL entry'

...

    Session Wait History:
        elapsed time of 0.000035 sec since current wait
     0: waited for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn&lt;&lt;16 | slot=0x5000c, sequence=0xa39
        wait_id=79 seq_num=80 snap_id=1
        wait times: snap=5.002987 sec, exc=5.002987 sec, total=5.002987 sec
        wait times: max=5.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000047 sec of elapsed time
     1: waited for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn&lt;&lt;16 | slot=0xa001f, sequence=0x8bc
        wait_id=78 seq_num=79 snap_id=1
        wait times: snap=1 min 4 sec, exc=1 min 4 sec, total=1 min 4 sec
        wait times: max=1 min 4 sec
        wait counts: calls=22 os=22
        occurred after 0.000032 sec of elapsed time

...
     8: waited for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn&lt;&lt;16 | slot=0x5000c, sequence=0xa39
        wait_id=71 seq_num=72 snap_id=1
        wait times: snap=5.001902 sec, exc=5.001902 sec, total=5.001902 sec
        wait times: max=5.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000042 sec of elapsed time
     9: waited for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn&lt;&lt;16 | slot=0xa001f, sequence=0x8bc
        wait_id=70 seq_num=71 snap_id=1
        wait times: snap=4.005342 sec, exc=4.005342 sec, total=4.005342 sec
        wait times: max=4.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000031 sec of elapsed time

...

    Sampled Session History of session 249 serial 3931
    ---------------------------------------------------

    The history is displayed in reverse chronological order.

    sample interval: 1 sec, max history 120 sec
    ---------------------------------------------------
      [9 samples,                                          11:14:50 - 11:14:58]
        waited for 'enq: TX - allocate ITL entry', seq_num: 81
          p1: 'name|mode'=0x54580004
          p2: 'usn&lt;= 8 sec (still in wait)
      [5 samples,                                          11:14:45 - 11:14:49]
        waited for 'enq: TX - allocate ITL entry', seq_num: 80
          p1: 'name|mode'=0x54580004
          p2: 'usn&lt;&lt;16 | slot'=0x5000c
          p3: 'sequence'=0xa39
          time_waited: 5.002987 sec (sample interval: 4 sec)
...

The little report that follows the initial wait state shows that the situation was a little messy – session 9 was my first and last blocker, but there was another session tangled up in the chain of waits, session 357.

Following this there’s a set of entries from my v$session_wait_history - and if you look carefully at the slot and sequence that appears on the second line of each wait you’ll notice that my waits have been alternating between TWO other sessions/transactions before I finally crashed.

Finally there’s a set of entries for my session extracted from v$active_session_history. (Question: I’m only allowed to query v$active_session_history if I’ve licensed the Diagnostic Pack – so should I shut my eyes when I get to this part of the trace file ;) This breakdown also shows my session alternating between waits on the two different blockers, giving me a pretty good post-event breakdown of what was going on around the time of the deadlock.

October 19, 2014

Plan depth

Filed under: 12c,Bugs,Execution plans,Oracle,subqueries — Jonathan Lewis @ 6:20 pm GMT Oct 19,2014

A recent posting on OTN reminded me that I haven’t been poking Oracle 12c very hard to see which defects in reporting execution plans have been fixed. The last time I wrote something about the problem was about 20 months ago referencing 11.2.0.3; but there are still oddities and irritations that make the nice easy “first child first” algorithm fail because the depth calculated by Oracle doesn’t match the level that you would get from a connect-by query on the underlying plan table. Here’s a simple fail in 12c:


create table t1
as
select
	rownum 			id,
	lpad(rownum,200)	padding
from	all_objects
where	rownum <= 2500
;

create table t2
as
select	* from t1
;

-- call dbms_stats to gather stats

explain plan for
select
	case mod(id,2)
		when 1 then (select max(t1.id) from t1 where t1.id <= t2.id)
		when 0 then (select max(t1.id) from t1 where t1.id >= t2.id)
	end id
from	t2
;

select * from table(dbms_xplan.display);

It ought to be fairly clear that the two inline scalar subqueries against t1 should be presented at the same level in the execution hierarchy; but here’s the execution plan you get from Oracle:

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  2500 | 10000 | 28039   (2)| 00:00:02 |
|   1 |  SORT AGGREGATE      |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL  | T1   |   125 |   500 |    11   (0)| 00:00:01 |
|   3 |    SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  4 |     TABLE ACCESS FULL| T1   |   125 |   500 |    11   (0)| 00:00:01 |
|   5 |  TABLE ACCESS FULL   | T2   |  2500 | 10000 |    11   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."ID"<=:B1)
   4 - filter("T1"."ID">=:B1)

As you can see, the immediate (default?) visual impression you get from the plan is that one of the subqueries is subordinate to the other. On the other hand if you check the id and parent_id columns from the plan_table you’ll find that lines 1 and 3 are both direct descendents of line 0 – so they ought to have the same depth. The plan below is what you get if you run the 8i query from utlxpls.sql against the plan_table.


SQL> select id, parent_id from plan_table;

        ID  PARENT_ID
---------- ----------
         0
         1          0
         2          1
         3          0
         4          3
         5          0

--------------------------------------------------------------------------------
| Operation                 |  Name    |  Rows | Bytes|  Cost  | Pstart| Pstop |
--------------------------------------------------------------------------------
| SELECT STATEMENT          |          |     2K|    9K|  28039 |       |       |
|  SORT AGGREGATE           |          |     1 |    4 |        |       |       |
|   TABLE ACCESS FULL       |T1        |   125 |  500 |     11 |       |       |
|  SORT AGGREGATE           |          |     1 |    4 |        |       |       |
|   TABLE ACCESS FULL       |T1        |   125 |  500 |     11 |       |       |
|  TABLE ACCESS FULL        |T2        |     2K|    9K|     11 |       |       |
--------------------------------------------------------------------------------

So next time you see a plan and the indentation doesn’t quite seem to make sense, perhaps a quick query to select the id and parent_id will let you check whether you’ve found an example where the depth calculation produces a misleading result.

 

Update 20th Oct 2014

A question via twitter – does the error also show up with dbms_xplan.display_cursor(), SQL tuning sets, AWR, etc. or is it just a defect of explain plan. Since the depth is (probably) a derived value for display purposes that Oracle doesn’t use internally for executing the plan I would be inclined to assume that the defect is universal, but I’ve only checked it through explain plan/display, and through execution/display_cursor().

 

 

 

October 17, 2014

sreadtim

Filed under: Oracle,Statistics,System Stats,Troubleshooting — Jonathan Lewis @ 1:22 pm GMT Oct 17,2014

Here’s a question that appeared in my email a few days ago:

 

Based on the formula: “sreadtim = ioseektim + db_block_size/iotrfrspeed” sreadtim should always bigger than ioseektim.

But I just did a query on my system, find it otherwise, get confused,

SQL> SELECT * FROM SYS.AUX_STATS$;<

SNAME                          PNAME                               PVAL1 PVAL2
------------------------------ ------------------------------ ---------- --------------------
SYSSTATS_INFO                  STATUS                                    COMPLETED
SYSSTATS_INFO                  DSTART                                    10-08-2014 10:45
SYSSTATS_INFO                  DSTOP                                     10-10-2014 10:42
SYSSTATS_INFO                  FLAGS                                   1
SYSSTATS_MAIN                  CPUSPEEDNW                     680.062427
SYSSTATS_MAIN                  IOSEEKTIM                              10
SYSSTATS_MAIN                  IOTFRSPEED                           4096
SYSSTATS_MAIN                  SREADTIM                            4.716
SYSSTATS_MAIN                  MREADTIM                            2.055
SYSSTATS_MAIN                  CPUSPEED                             1077
SYSSTATS_MAIN                  MBRC                                    4
SYSSTATS_MAIN                  MAXTHR                          956634112
SYSSTATS_MAIN                  SLAVETHR                           252928

How do we explain this ?

 

This question highlights two points – one important, the other only slightly less so.

The really important point is one of interpretation.  Broadly speaking we could reasonably say that the (typical) time required to perform a single block read is made up of the (typical) seek time plus the transfer time which, using the names of the statistics above, would indeed give us the relationship: sreadtim = ioseektim + db_block_size/iotfrspeed; but we have to remember that we are thinking of a simplified model of the world. The values that we capture for sreadtim include the time it takes for a request to get from Oracle to the O/S, through the various network software and hardware layers and back again, the formula ignores those components completely and, moreover, doesn’t allow for the fact that some “reads” could actually come from one of several caches without any physical disc access taking place; similarly we should be aware that the time for an actual I/O seek would vary dramatically with the current position  of the read head, the radial position of the target block, the speed and current direction of movement of the read head, and the rotational distance to the target block. The formula is not attempting to express a physical law, it is simply expressing an approximation that we might use in a first line estimate of performance.

In fact we can see in the figures above that multi-block reads (typically of 4 blocks)  were faster than single block reads on this hardware for the duration of the sampling period – and that clearly doesn’t fit with the simple view embedded in our formula of how disc drives work.  (It’s a fairly typical effect of SANs, of course, that large read requests make the SAN software start doing predictive read-ahead, so that the next read request from Oracle may find that the SAN has already loaded the required data into its cache.)

There is, however, the second point that these figures highlight – but you have to be in the know to spot the detail: whatever the complexities introduced by SAN caching, we’re notlooking at the right figures. The ioseektim and iotfrspeed shown here are the default values used by Oracle. It looks as if the user has called dbms_stats.gather_system_stats() with a 48 hour workload (8th Oct to 10th Oct), but hasn’t yet executed the procedure using the ‘noworkload’ option. Perhaps the ioseektim and iotfrspeed figures from a noworkload call would look a little more reasonable when compared with the 4.716 milliseconds of the workload single block read. There may still be a large gap between the model and the reality, but until the two sets of figures we’re using come from the same place we shouldn’t even think about comparing them.

October 1, 2014

Shrink Tablespace

Filed under: fragmentation,Infrastructure,Oracle — Jonathan Lewis @ 2:55 pm GMT Oct 1,2014

In a comment on my previous post on shrinking tablespaces Jason Bucata and Karsten Spang both reported problems with small objects that didn’t move to the start of the tablespace. This behaviour is inevitable with dictionary managed tablespaces (regardless of the size of the object), but I don’t think it’s likely to happen with locally managed tablespaces if they’ve been defined with uniform extent sizes. Jason’s comment made me realise, though, that I’d overlooked a feature of system allocated tablespaces that made it much harder to move objects towards the start of file. I’ve created a little demo to illustrate the point.

I created a new tablespace as locally managed, ASSM, and auto-allocate, then created a few tables or various sizes. The following minimal SQL query reports the resulting extents in block_id order, adding in a “boundary_1m” column which subtracts 128 blocks (1MB) from the block_id, then divides by 128 and truncates to show which “User Megabyte” in the file the extent starts in.  (Older versions of Oracle typically have an 8 block space management header, recent versions expanded this from 64KB to 1MB – possibly as a little performance aid to Exadata).


select
        segment_name, block_id, blocks , trunc((block_id - 128)/128) boundary_1M
from
        dba_extents where owner = 'TEST_USER'
order by
        block_id
;

SEGMENT_NAME               BLOCK_ID     BLOCKS BOUNDARY_1M
------------------------ ---------- ---------- -----------
T1                              128       1024           0
T1                             1152       1024           8
T2                             2176       1024          16
T2                             3200       1024          24
T3                             4224          8          32
T4                             4232          8          32
T5                             4352        128          33

As you can see t3 and t4 are small tables – 1 extent of 64KB each – and t5, which I created after t4, starts on the next 1MB boundary. This is a feature of auto-allocate: not only are extents (nearly) fixed to a small number of possible extent sizes, the larger extents are restricted to starting on 1MB boundaries and the 64KB extents are used preferentially to fill in odd-sized” holes. To show the impact of this I’m going to drop table t1 (at the start of file) to make some space.


SEGMENT_NAME               BLOCK_ID     BLOCKS BOUNDARY_1M
------------------------ ---------- ---------- -----------
T2                             2176       1024          16
T2                             3200       1024          24
T3                             4224          8          32
T4                             4232          8          32
T5                             4352        128          33

Now I’ll move table t3 – hoping that it will move to the start of file and use up some of the space left by t1. However there’s a 1MB area (at boundary 32) which is partially used,  so t3 moves into that space rather than creating a new “partly used” megabyte.


SEGMENT_NAME               BLOCK_ID     BLOCKS BOUNDARY_1M
------------------------ ---------- ---------- -----------
T2                             2176       1024          16
T2                             3200       1024          24
T4                             4232          8          32
T3                             4240          8          32
T5                             4352        128          33

It’s a little messy trying to clear up the tiny fragments and make them do what you want. In this case you could, for example, create a dummy table with storage(initial 64K next 64K minextents 14) to use up all the space in the partly used megabyte, then move t3 – which should go to the start of file – then move table t4 – which should go into the first partly-used MB (i.e. start of file) rather than taking up the hole left by t3.

Even for a trivial example it’s messy – imagine how difficult it can get to cycle through building and dropping suitable dummy tables and move objects in the right order when you’ve got objects with several small extents scattered through the file, and objects with a mixture of small extents and large extents.

September 29, 2014

12c Fixed Subquery

Filed under: Execution plans,Oracle,Tuning — Jonathan Lewis @ 4:18 pm GMT Sep 29,2014

It’s been about 8 months since I posted a little note about a “notable change in behaviour” of the optimizer when dealing with subqueries in the where clause that could be used to return a constant, e.g.:


select
	*
from	t1
where	id between (select 10001 from dual)
	   and     (select 90000 from dual)
;

There’s been a note at the start of the script ever since saying: Check if this is also true for any table with ‘select fixed_value from table where primary = constant’ I finally had a few minutes this morning (San Francisco time) to check – and it does, in both 11.2.0.4 and 12.1.0.2. With the t1 table from the previous article run the following:


drop table t2 purge;

create table t2 (
        n1 number(6) not null,
        n2 number(6) not null
);

alter table t2 add constraint t2_pk primary key(n1);

insert into t2 values(1,10000);
insert into t2 values(2,90000);

set autotrace traceonly explain

select * from t1
where   id between (select 10000 from t2 where n1 = 1)
           and     (select 90000 from t2 where n1 = 1)
;

set autotrace off

Instead of the historic 5% of 5% selectivity, the plan shows the optimizer predicting (approximately) the 80,000 rows that it will actually get:

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       | 80003 |  8828K|   218   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL | T1    | 80003 |  8828K|   218   (4)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN| T2_PK |     1 |    13 |     0   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN| T2_PK |     1 |    13 |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("ID"<= (SELECT 90000 FROM "T2" "T2" WHERE "N1"=1) AND
              "ID">= (SELECT 10000 FROM "T2" "T2" WHERE "N1"=1))
   2 - access("N1"=1)
   3 - access("N1"=1)

I can’t think it’s very likely that anyone has written SQL that looks like this – but I’m often surprised by what I see in the field, so if this style looks familiar and you’re still on 11.2.0.3 or lower, watch out for changes in execution plan on the upgrade to 11.2.0.4 or 12c.

September 24, 2014

Lock Time

Filed under: AWR,Locks,Oracle,Statspack — Jonathan Lewis @ 12:42 pm GMT Sep 24,2014

Here’s a little detail I was forced to re-learn yesterday; it’s one of those things where it’s easy to say “yes, obviously” AFTER you’ve had it explained so I’m going to start by posing it as a question. Here are two samples of PL/SQL that using locking to handle a simple synchronisation mechanism; one uses a table as an object that can be locked, the other uses Oracle’s dbms_lock package. I’ve posted the code for each fragment, and a sample of what you see in v$lock if two sessions execute the code one after the other:

Table locking – the second session to run this code will wait for the first session to commit or rollback:


begin
        lock table t1 in exclusive mode;
end;
/

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
00007FF409E57BF8 00007FF409E57C58         15 TM     157778          0          0          6         65          0
00007FF409E57BF8 00007FF409E57C58        125 TM     157778          0          6          0         91          1

Using dbms_lock.

variable m_handle       varchar2(255);

declare
        n1              number;
begin

        dbms_lock.allocate_unique(
                lockname        => 'Synchronize',
                lockhandle      => :m_handle
        );

        dbms_output.put_line(:m_handle);

        n1 := dbms_lock.request(
                lockhandle              => :m_handle,
                lockmode                => dbms_lock.x_mode,
                timeout                 => dbms_lock.maxwait,
                release_on_commit       => true         -- the default is false !!
        );

        dbms_output.put_line(n1);

end;
/

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
000000008ED8F738 000000008ED8F790         15 UL 1073742427          0          0          6         42          0
000000008ED902B0 000000008ED90308        125 UL 1073742427          0          6          0        103          1

The big question is this – although the two code fragments produce the same effects in terms of lock waits and the reports from v$lock, what’s the big difference in the way that they are reported in the AWR report.

The high-level difference appears in the Time Model stats. Here are two extracts showing the difference:

Using dbms_lock.


Statistic Name                                       Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time                                 65.4         99.9
PL/SQL execution elapsed time                            63.8         97.4

-> Captured SQL account for    2.8% of Total DB Time (s):              65
-> Captured PL/SQL account for   99.4% of Total DB Time (s):              65

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
            63.7              1         63.72   97.3     .0     .0 10u1qbw4a27sp
Module: SQL*Plus
declare n1 number; begin dbms_lock.allocate_unique
( lockname => 'Synchronize', lockhandle
 => :m_handle ); dbms_output.put_line(:m_handle); n
1 := dbms_lock.request( lockhandle => :m_handle,

Table locking method:

Time Model Statistics

Statistic Name                                       Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time                                 95.5         99.9
DB CPU                                                    0.9           .9
parse time elapsed                                        0.1           .1
hard parse elapsed time                                   0.1           .1
PL/SQL execution elapsed time                             0.1           .1

SQL ordered by Elapsed Time

-> Captured SQL account for   99.6% of Total DB Time (s):              96
-> Captured PL/SQL account for   98.7% of Total DB Time (s):              96

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
            93.9              1         93.88   98.3     .0     .0 8apkdghttmndx
Module: SQL*Plus
begin lock table t1 in exclusive mode; end;

            93.9              1         93.88   98.3     .0     .0 29fwr53agvbc0
Module: SQL*Plus
LOCK TABLE T1 IN EXCLUSIVE MODE

The time spent waiting for the table lock is reported purely as SQL execution time in the Time Model stats; but the time spent waiting for the user-defined lock is reported as SQL execution time AND as PL/SQL execution time. I had forgotten this yesterday so, as I skipped through the various headline figures of an hourly snapshot, I was amazed to see the Time Model stats reporting 33,000 seconds of PL/SQL and 66,000 seconds of SQL – how on earth do you manage to do that much PL/SQL on any Oracle system. (To add to the embarrassment, it had only been a few moments earlier that I’d gone through the Top 5 Timed Events and said something like: “you can ignore all time spent on ‘enq: UL – contention’, it’s probably just synchronisation code”.

In this case the SQL ordered by Elapsed Time gives you a strong visual clue about what’s going on – but it won’t always be that obvious.

Bottom Line:

PL/SQL execution time includes the time spent waitng for UL locks, don’t forget that you may need to subtract wait time for ‘enq: UL – contention’ from the PL/SQL time before you start to worry about how much actual work you’re doing in PL/SQL.

 

September 21, 2014

Partitioned Clusters

Filed under: 12c,clusters,Infrastructure,Oracle — Jonathan Lewis @ 7:28 pm GMT Sep 21,2014

In case you hadn’t noticed it, partitioning has finally reached clusters in 12c – specifically 12.1.0.2. They’re limited to hash clusters with range partitioning, but it may be enough to encourage more people to use the technology. Here’s a simple example of the syntax:


create cluster pt_hash_cluster (
        id              number(8,0),
        d_date          date,
        small_vc        varchar2(8),
        padding         varchar2(100)
)
-- single table
hashkeys 10000
hash is id
size 700
partition by range (d_date) (
        partition p2011Jan values less than (to_date('01-Feb-2011','dd-mon-yyyy')),
        partition p2011Feb values less than (to_date('01-Mar-2011','dd-mon-yyyy')),
        partition p2011Mar values less than (to_date('01-Apr-2011','dd-mon-yyyy'))
)
;

I’ve been waiting for them to appear ever since 11.2.0.1 and the TPC-C benchmark that Oracle did with them – they’ve been a long time coming (check the partition dates – that gives you some idea of when I wrote this example).

Just to add choice (a.k.a. confusion) 12.1.0.2 has also introduce attribute clustering so you can cluster data in single tables without creating clusters – but only while doing direct path loads or table moves. The performance intent is similar, though the technology and circumstances of use are different.

September 19, 2014

Shrink Tablespace

Filed under: fragmentation,Infrastructure,Oracle — Jonathan Lewis @ 12:10 pm GMT Sep 19,2014

A recent question on the OTN database forum raised the topic of returning free space in a tablespace to the operating system by rebuilding objects to fill the gaps near the start of files and leave the empty space at the ends of files so that the files could be resized downwards.

This isn’t a process that you’re likely to need frequently, but I have written a couple of notes about it, including a sample query to produce a map of the free and used space in a tablespace. While reading the thread, though, it crossed my mind that recent versions of Oracle introduced a feature that can reduce the amount of work needed to get the job done, so I thought I’d demonstrate the point here.

When you move a table its indexes become unusable and have to be rebuilt; but when an index becomes unusable, the more recent versions of Oracle will drop the segment. Here’s a key point – if the index becomes unusable because the table has been moved the segment is dropped only AFTER the move has completed. Pause a minute for thought and you realise that the smart thing to do before you move a table is to make its indexes unusable so that they release their space BEFORE you move the table. (This strategy is only relevant if you’ve mixed tables and indexes in the same tablespace and if you’re planning to do all your rebuilds into the same tablespace rather than moving everything into a new tablespace.)

Here are some outputs demonstrating the effect in a 12.1.0.2 database. I have created (and loaded) two tables in a tablespace of 1MB uniform extents, 8KB block size; then I’ve created indexes on the two tables. Running my ts_hwm.sql script I get the following results for that tablespace:


FILE_ID    BLOCK_ID   END_BLOCK OWNER      SEGMENT_NAME    SEGMENT_TYPE
------- ----------- ----------- ---------- --------------- ------------------
      5         128         255 TEST_USER  T1              TABLE
                256         383 TEST_USER  T2              TABLE
                384         511 TEST_USER  T1_I1           INDEX
                512         639 TEST_USER  T2_I1           INDEX
                640      65,535 free       free

Notice that it was a nice new tablespace, so I can see the two tables followed by the two indexes at the start of the tablespaces. If I now move table t1 and re-run the script this is what happens:


alter table t1 move;

FILE_ID    BLOCK_ID   END_BLOCK OWNER      SEGMENT_NAME    SEGMENT_TYPE
------- ----------- ----------- ---------- --------------- ------------------
      5         128         255 free       free
                256         383 TEST_USER  T2              TABLE
                384         511 free       free
                512         639 TEST_USER  T2_I1           INDEX
                640         767 TEST_USER  T1              TABLE
                768      65,535 free       free

Table t1 is now situated past the previous tablespace highwater mark and I have two gaps in the tablespace where t1 and the index t1_i1 used to be.

Repeat the experiment from scratch (drop the tables, purge, etc. to empty the tablespace) but this time mark the index unusable before moving the table and this is what happens:


FILE_ID    BLOCK_ID   END_BLOCK OWNER      SEGMENT_NAME    SEGMENT_TYPE
------- ----------- ----------- ---------- --------------- ------------------
      5         128         255 free       free
                256         383 TEST_USER  T2              TABLE
                384         511 TEST_USER  T1              TABLE
                512         639 TEST_USER  T2_I1           INDEX
                640      65,535 free       free

Table t1 has moved into the space vacated by index t1_i1, so the tablespace highwater mark has not moved up.

If you do feel the need to reclaim space from a tablespace by rebuilding objects, you can find that it’s quite hard to decide the order in which the objects should be moved/rebuilt to minimise the work you (or rather, Oracle) has to do; if you remember that any table you move will release its index space anyway and insert a step to mark those indexes unusable before you move the table you may find it’s much easier to work out a good order for moving the tables.

Footnote: I appreciate that some readers may already take advantage of the necessity of rebuilding indexes by dropping indexes before moving tables – but I think it’s a nice feature that we can now make them unusable and get the same benefit without introducing a risk of error when using a script to recreate an index we’ve dropped.

 

September 12, 2014

Unusual Deadlock

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 1:08 pm GMT Sep 12,2014

Prompted by a question on OTN I came up with a strategy for producing an ORA-00060 deadlock that DIDN’T produce a deadlock graph (because there isn’t one) and didn’t get reported in the alert log (at least, not when tested on 11.2.0.4). It’s a situation that shouldn’t arise in a production system because it’s doing the sorts of things that you shouldn’t do in a production system: but possibly if you’re trying to do some maintenance or upgrades while keeping the system live it could happen. Here’s the starting code:


drop procedure p2;
drop procedure p1;

drop table t1 purge;
create table t1 (n1 number);

insert into t1 values(1);
insert into t1 values(3);

create or replace procedure p1
as
begin
        update t1 set n1 = 2 where n1 = 1;
        dbms_lock.sleep(10);
        update t1 set n1 = 4 where n1 = 3;
end;
/

create or replace procedure p2
as

        procedure q
        is
                pragma autonomous_transaction;
        begin
                execute immediate 'drop procedure p1';
        end;

begin

        update t1 set n1 = 4 where n1 = 3;
        q;

end;
/

Of course you’re asking for trouble if you start doing DDL as part of your production code; and you’re asking for trouble if you start playing around with autonomous transaction; and if you use one to do the other you’re almost guaranteed to hit a problem eventually. All it takes in this case is a simple sequence of actions followed by a short wait (ca. 10 seconds):

Session 1: execute p1
Session 2: wait a couple of seconds, then execute p2

I got the following result from session 2:


BEGIN p2; END;

*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "TEST_USER.P2", line 8
ORA-06512: at "TEST_USER.P2", line 14
ORA-06512: at line 1

While this doesn’t show up in the alert log, I do get a trace file dumped for the session; in fact I got a trace file from both processes. The trace from the process that reported the deadlock started like this:


DEADLOCK DETECTED

  Performing diagnostic dump and signaling ORA-00060

  Complete deadlock information is located in the trace file of process (pid: 8, osid: 17861, DIA0)
    trace file: /u01/app/oracle/diag/rdbms/test/test/trace/test_dia0_17861.trc

-------------------------------------------------------------------------------

*** 2014-09-09 12:44:13.427
-------------------------------------------------------------------------------
HUNG PROCESS DIAGNOSTIC DUMP BEGIN:

    dump requested by process (pid: 8, osid: 17861, DIA0)
    trace file: /u01/app/oracle/diag/rdbms/test/test/trace/test_dia0_17861.trc
-------------------------------------------------------------------------------

The other process didn’t mention a deadlock, but started at the “HUNG PROCESS DIAGNOSTIC” line – the trace file had been triggered by the DIAG process. Of course, the DIAG process also dumped a trace file, summarising the situation; and this started with the following important note:


*** 2014-09-09 12:44:13.426
-------------------------------------------------------------------------------

DEADLOCK DETECTED (id=0xf678cfe3)

Chain Signature: 'enq: TX - row lock contention'<='library cache pin' (cycle)
Chain Signature Hash: 0x39f9868d

The rest of the trace file told me what the other two processes had been doing when the ORA-00060 was initiated, but the point I want to pick up here is that we have a very brief summary in the “Chain Signature” that tells us we’ve had a collision between some DML (enq: TX) and some sort of cursor-like problem (library cache pin) and not a simple data cross-over.

If you’ve looked at the original OTN posting you’ll see that the Chain Signature in that case is “Chain Signature: ‘library cache lock'<=’library cache pin’ (cycle)”, which indicates a collision restricted entirely to the library cache (the lock suggests someone is using a package/cursor while the pin suggests that another session is trying to destroy/(re)compile it – and they’re each trying to do it to each other’s package ! (In the past when I’ve a deadlock of this type it’s been reported as ORA-04020 rather than ORA-00060.) I still have to work out exactly how the OP got into their deadlock (especially in view of their latest comment) -but since I don’t have the right version of Oracle to hand, and it might be a version-specific bug, I don’t think I’m going to try.

 

September 9, 2014

Quiz Night

Filed under: Execution plans,Indexing,Oracle — Jonathan Lewis @ 6:46 pm GMT Sep 9,2014

I have a table with several indexes on it, and I have two versions of a query that I might run against that table. Examine them carefully, then come up with some plausible reason why it’s possible (with no intervening DDL, DML, stats collection, parameter fiddling etc., etc., etc.) for the second form of the query to be inherently more efficient than the first.


select
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 between 1 and 3
;

prompt  ===========
prompt  Split query
prompt  ===========

select
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 = 1
or      bit_1 > 1 and bit_1 <= 3
;

Update / Answers

I avoided giving any details about the data and indexes in this example as I wanted to allow free rein to readers’ imagination  – and I haven’t been disappointed with the resulting suggestions. The general principles of allowing more options to the optimizer, effects of partitioning, and effects of skew are all worth considering when the optimizer CAN’T use an execution path that you think makes sense.  (Note: I didn’t make it clear in my original question, but I wasn’t looking for cases where you could get a better path by hinting (or profiling) I was after cases where Oracle literally could not do what you wanted.)

The specific strategy I was thinking of when I posed the question was based on a follow-up to some experiments I had done with the cluster_by_rowid() hint. and (there was a little hint in the “several indexes” and more particularly the column name “bit_1″) I was looking at a data warehouse table with a number of bitmap indexes. So here’s the execution plan for the first version of the query  when there’s a simple bitmap index on bit_1.


------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   600 | 18000 |    96 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIT_TAB |   600 | 18000 |    96 |
|   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |       |
|*  3 |    BITMAP INDEX RANGE SCAN   | BT1     |       |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("BIT_1">=1 AND "BIT_1"<=3)

And here’s the plan for the second query:


------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   560 | 16800 |    91 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIT_TAB |   560 | 16800 |    91 |
|   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |       |
|   3 |    BITMAP OR                 |         |       |       |       |
|*  4 |     BITMAP INDEX SINGLE VALUE| BT1     |       |       |       |
|   5 |     BITMAP MERGE             |         |       |       |       |
|*  6 |      BITMAP INDEX RANGE SCAN | BT1     |       |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("BIT_1"=1)
   6 - access("BIT_1">1 AND "BIT_1"<=3)

Clearly the second plan is more complex than the first – moreover the added complexity had resulted in the optimizer getting a different cardinality estimate – but, with my data set, there’s a potential efficiency gain. Notice how lines 5 and 6 show a bitmap range scan followed by a bitmap merge: to do the merge Oracle has to “superimpose” the bitmaps for the different key values in the range scan to produce a single bitmap that it can then OR with the bitmap for bit_1 = 1 (“bitmap merge” is effectively the same as “bitmap or” except all the bitmaps come from the same index). The result of this is that when we convert to rowids the rowids are in table order. You can see the consequences in the ordering of the result set or, more importantly for my demo, in the autotrace statistics:


For the original query:
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        604  consistent gets
          0  physical reads
          0  redo size
      27153  bytes sent via SQL*Net to client
        777  bytes received via SQL*Net from client
         25  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        600  rows processed


For the modified query
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        218  consistent gets
          0  physical reads
          0  redo size
      26714  bytes sent via SQL*Net to client
        777  bytes received via SQL*Net from client
         25  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        600  rows processed

Note, particularly, the change in the number of consistent gets. Each table block I visited held two or three rows that I needed, in the first query I visit the data in order of (bit_1, rowid) and get each table block 3 time; in the second case I visit the data in order of rowid and only get each table block once (with a “buffer is pinned count” for subsequent rows from the same block).

Here’s the starting output from each query, I’ve added the rowid to the original select statements so that you can see the block ordering:


Original query
     BIT_1         ID SMALL_VC   ROWID
---------- ---------- ---------- ------------------
         1          2 2          AAAmeCAAFAAAAEBAAB
         1         12 12         AAAmeCAAFAAAAECAAB
         1         22 22         AAAmeCAAFAAAAEDAAB
         1         32 32         AAAmeCAAFAAAAEEAAB
         1         42 42         AAAmeCAAFAAAAEFAAB
         1         52 52         AAAmeCAAFAAAAEGAAB

Modified query
     BIT_1         ID SMALL_VC   ROWID
---------- ---------- ---------- ------------------
         1          2 2          AAAmeCAAFAAAAEBAAB
         2          3 3          AAAmeCAAFAAAAEBAAC
         3          4 4          AAAmeCAAFAAAAEBAAD
         1         12 12         AAAmeCAAFAAAAECAAB
         2         13 13         AAAmeCAAFAAAAECAAC
         3         14 14         AAAmeCAAFAAAAECAAD
         1         22 22         AAAmeCAAFAAAAEDAAB
         2         23 23         AAAmeCAAFAAAAEDAAC
         3         24 24         AAAmeCAAFAAAAEDAAD

By rewriting the query I’ve managed to force a “cluster by rowid” on the data access. Of course, the simpler solution would be to add the /*+ cluster_by_rowid() */ hint to the original query – but it doesn’t work for bitmap indexes, and when I found that it worked for B-tree indexes the next test I did was to try a single bitmap index, which resulted in my writing this note.

Footnote: I don’t really expect Oracle Corp. to modify their code to make the hint work with bitmaps, after all it’s only relevant in the special case of using a bitmap index with a range scan and no subsequent bitmap AND/OR/MINUS operations where it would be needed – and you’re not really expected to use a single bitmap index to access a table, we engineer bitmaps to take advantage of combinations.

September 8, 2014

ASSM Truncate.

Filed under: ASSM,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 11:34 am GMT Sep 8,2014

Here’s one that started off with a tweet from Kevin Closson, heading towards a finish that shows some interesting effects when you truncate large objects that are using ASSM. To demonstrate the problem I’ve set up a tablespace using system allocation of extents and automatic segment space management (ASSM).  It’s the ASSM that causes the problem, but it requires a mixture of circumstances to create a little surprise.


create
	tablespace test_8k_auto_assm
	datafile	-- OMF
	SIZE 1030M
	autoextend off
	blocksize 8k
	extent management local
	autoallocate
	segment space management auto
;

create table t1 (v1 varchar2(100)) pctfree 99 tablespace test_8k_auto_assm storage(initial 1G);

insert into t1 select user from dual;
commit;

alter system flush buffer_cache;

truncate table t1;

I’ve created a table with an initial definition of 1GB, which means that (in a clean tablespace) the autoallocate option will jump straight to extents of 64MB, with 256 table blocks mapped per bitmap block for a total of 32 bitmap blocks in each 64MB extent. Since I’m running this on 11.2.0.4 and haven’t included “segment creation immediate” in the definition I won’t actually see any extents until I insert the first row.

So here’s the big question – when I truncate this table (using the given command) how much work will Oracle have to do ?

Exchanging notes over twitter (140 char at a time) and working from a model of the initial state, it took a little time to get to understand what was (probably) happening and then produce this silly example – but here’s the output from a snapshot of v$session_event for the session across the truncate:


Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
local write wait                                    490           0          83.26        .170          13
enq: RO - fast object reuse                           2           0         104.90      52.451         105
db file sequential read                              47           0           0.05        .001           0
db file parallel read                                 8           0           0.90        .112           0
SQL*Net message to client                            10           0           0.00        .000           0
SQL*Net message from client                          10           0           0.67        .067         153
events in waitclass Other                             2           0           0.04        .018         109

The statistic I want to highlight is the number recorded against “local write wait”: truncating a table of one row we wait for 490 blocks to be written! We also have 8 “db file parallel read”  waits which, according to a 10046 trace file, were reading hundreds of blocks. (I think the most significant time in this test – the RO enqueue wait – may have been waiting for the database writer to complete the work needed for an object checkpoint, but I’m not sure of that.)

The blocks written were the space management bitmap blocks for the extent(s) that remained after the truncate – even the ones that referenced extents above the high water mark for the table. Since we had set the tables initial storage to 1GB, we had a lot of bitmap blocks. At 32 per extent and 16 extents (64MB * 16 = 1GB) we might actually expect something closer to 512 blocks, but actually Oracle had formatted the last extent with only 8 space management blocks. and the first extent had an extra 2 to cater for the level 2 bitmap lock and segment header block giving: 32 * 15 + 8 + 2 = 490.

As you may have seen above, the impact on the test that Kevin was doing was quite dramatic – he had set the initial storage to 128GB (lots of bitmap blocks), partitioned the table (more bitmap blocks) and was running RAC (so the reads were running into waits for global cache grants).

I had assumed that this type of behaviour happened only with the “reuse storage” option of the truncate command: and I hadn’t noticed before that it also appeared even if you didn’t reuse storage – but that’s probably because the effect applies only to the bit you keep, which may typically mean a relatively small first extent. It’s possible, then, that in most cases this is an effect that isn’t going to be particularly visible in production systems – but if it is, can you work around it ? Fortunately another tweeter asked the question “What happens if you ‘drop all storage?'” Here’s the result from adding that clause to my test case:


Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
enq: RO - fast object reuse                           1           0           0.08        .079           0
log file sync                                         1           0           0.03        .031           0
db file sequential read                              51           0           0.06        .001           0
SQL*Net message to client                            10           0           0.00        .000           0
SQL*Net message from client                          10           0           0.56        .056         123
events in waitclass Other                             3           0           0.87        .289         186


Looking good – if you don’t keep any extents you don’t need to make sure that their bitmaps are clean. (The “db file sequential read” waits are almost all about the data dictionary, following on from my “flush buffer cache”).

Footnote 1: the same effect appears in 12.1.0.2
Footnote 2: it’s interesting to note that the RO enqueue wait time seems to parallel the local write wait time: perhaps a hint that there’s some double counting going on. (To be investigated, one day).

September 4, 2014

Group By Bug

Filed under: 12c,Bugs,dbms_xplan,Execution plans,Oracle — Jonathan Lewis @ 5:11 pm GMT Sep 4,2014

This just in from OTN Database Forum – a surprising little bug with “group by elimination” exclusive to 12c.


alter session set nls_date_format='dd-Mon-yyyy hh24:mi:ss';

select
       /* optimizer_features_enable('12.1.0.1')*/
       trunc (ts,'DD') ts1, sum(fieldb) fieldb
from (
  select
        ts, max(fieldb) fieldb
  from (
  select trunc(sysdate) - 1/24 ts, 1 fieldb from dual
  union all
  select trunc(sysdate) - 2/24 ts, 2 fieldb from dual
  union all
  select trunc(sysdate) - 3/24 ts, 3 fieldb from dual
  union all
  select trunc(sysdate) - 4/24 ts, 4 fieldb from dual
  union all
  select trunc(sysdate) - 5/24 ts, 5 fieldb from dual
  )
  group by ts
)
group by trunc (ts,'DD')
/

You might expect to get one row as the answer – but this is the result I got, with the execution plan pulled from memory:


TS1                      FIELDB
-------------------- ----------
03-Sep-2014 00:00:00          1
03-Sep-2014 00:00:00          5
03-Sep-2014 00:00:00          4
03-Sep-2014 00:00:00          2
03-Sep-2014 00:00:00          3

-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    11 (100)|          |
|   1 |  HASH GROUP BY   |      |     5 |    60 |    11  (10)| 00:00:01 |
|   2 |   VIEW           |      |     5 |    60 |    10   (0)| 00:00:01 |
|   3 |    UNION-ALL     |      |       |       |            |          |
|   4 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   5 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   6 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

You’ll notice that I’ve got an “optimizer_features_enable()” comment in the code: if I change it into a hint I get the following (correct) result and plan:


TS1                      FIELDB
-------------------- ----------
03-Sep-2014 00:00:00         15

-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    12 (100)|          |
|   1 |  HASH GROUP BY   |      |     5 |    60 |    12  (17)| 00:00:01 |
|   2 |   VIEW           |      |     5 |    60 |    11  (10)| 00:00:01 |
|   3 |    HASH GROUP BY |      |     5 |    60 |    11  (10)| 00:00:01 |
|   4 |     VIEW         |      |     5 |    60 |    10   (0)| 00:00:01 |
|   5 |      UNION-ALL   |      |       |       |            |          |
|   6 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   9 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  10 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

Somehow 12.1.0.2 has managed to get confused by the combination of “group by ts” and “group by trunc(ts,’DD’)” and has performed “group-by elimination” when it shouldn’t have. If you use the ‘outline’ option for dbms_xplan.display_cursor() you’ll find that the bad result reports the hint elim_groupby(@sel$1), which leads to an alternative solution to hinting the optimizer_features level. Start the code like this:


select
       /*+ qb_name(main) no_elim_groupby(@main) */
       trunc (ts,'DD') ts1, sum(fieldb) fieldb
from  ...

The (no_)elim_groupby is a hint that appeared in v$sql_hints only in the 12.1.0.2.

Next Page »

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,306 other followers