Oracle Scratchpad

November 29, 2016

Delete/Insert #2

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 4:33 pm GMT Nov 29,2016

In the previous post I threw out a couple of options for addressing the requirement to transfer data from one table to another (“cut and paste” rather than just “copy”) without running into odd inconsistency errors. This triggered off a wonderful comment trail of alternatives based on how large the volume might be, how relaxed the concurrency requirements might be, and so on.

A comment by SydOracle1 picked up on my failure to get Oracle working with the “as of SCN” syntax because it kept reporting ORA-08187 and suggested a straightforward use of the VERSIONS strategy. I thought it was quite surprising that this could work given that “as of SCN” didn’t, so I whipped up a quick test to check it – adding a couple of little refinements to the supplied sample – and it worked.

create table t1
as
select	object_id, object_name, owner
from	all_objects
;

create table t2
as
select	* from t1
where	rownum = 0
;

alter table t1 add constraint t1_pk primary key(object_id);

execute dbms_stats.gather_table_stats(user,'t1')

execute dbms_lock.sleep(5)

That dbms_lock.sleep() is very important for the purposes of this demonstration; it has to be a few seconds, otherwise the references back to earlier SCNs could report the error: “ORA-01466: unable to read data – table definition has changed”. This is probably a side effect of the 3 second interval that Oracle uses when capturing SCN/timestamp information in the table smon_scn_time.
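As a quick sanity check that the SCN/timestamp mapping exists you can call the documented scn_to_timestamp() function, which depends on the same mapping data and raises ORA-08181 (“specified number is not a valid system change number”) if the SCN falls outside the recorded range:

select
        current_scn,
        scn_to_timestamp(current_scn)   scn_ts
from
        v$database
;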

So now we do the following:

  • find the current SCN,
  • start a transaction,
  • get the transaction id,
  • delete the data from the source table,
  • find the current SCN again,
  • insert into the target table the data that was deleted by our transaction in the interval spanned by the two SCNs.

 


rem
rem     insert_delete_2.sql
rem     Jonathan Lewis
rem     Nov 2016
rem

column current_scn new_value m_start_scn 
select to_char(current_scn,'FM999999999999999999999') current_scn from v$database;

variable m_tx_id varchar2(20)
exec :m_tx_id := dbms_transaction.local_transaction_id(true)

column xid new_value m_xid
select xid from v$transaction where xidusn || '.' || xidslot || '.' || xidsqn = :m_tx_id;

delete from t1 where owner = 'SYSTEM';
commit;

column current_scn new_value m_end_scn format 999999999999999999
select to_char(current_scn,'FM999999999999999999999') current_scn from v$database;
 
insert	into t2
select	r.*
from	t1  versions between scn &m_start_scn and &m_end_scn r
where 
	versions_operation = 'D'
and	versions_xid = '&m_xid'
and	owner = 'SYSTEM'
;

commit;

There are a couple of variants on getting the transaction ID – I decided to use a function call to start a transaction without doing any work rather than doing the delete and then finding the transaction id that the delete initiated (I could have linked v$session for my SID to v$transaction after the delete). Because of the choice I made I have to do a little bit of messing around in the subsequent code – the function call returns the transaction ID in the form 31.16.19111 (that’s undo segment, slot number, sequence) but the VERSIONS mechanism wants a transaction ID in its HEX form which, for the example shown, would be ‘1F001000A74A0000’. I could have converted the three part form to the other using a messy bit of to_char(,’XXXXXXXX’) code, but I was feeling a little lazy.
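For anyone who isn’t feeling lazy, here’s a sketch of that conversion – assuming (based only on matching the example values above) that on this platform the raw XID holds the undo segment and slot as 2-byte little-endian values followed by the sequence as a 4-byte little-endian value:

rem     Sketch: convert 31.16.19111 to the hex XID by byte-swapping each component

select
        substr(h_usn,3,2)  || substr(h_usn,1,2)  ||
        substr(h_slot,3,2) || substr(h_slot,1,2) ||
        substr(h_sqn,7,2)  || substr(h_sqn,5,2)  ||
        substr(h_sqn,3,2)  || substr(h_sqn,1,2)         xid_hex
from    (
        select
                to_char(31,   'FM0XXX')         h_usn,
                to_char(16,   'FM0XXX')         h_slot,
                to_char(19111,'FM0XXXXXXX')     h_sqn
        from    dual
        )
;

-- XID_HEX
-- ----------------
-- 1F001000A74A0000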

To my surprise I didn’t see any ORA-08187 errors – which made me look back at the notes I had jotted down on the couple of tests I’d initially tried to find out what I had been doing wrong. My first attempt did the insert first then tried to do the delete “as of SCN” and failed, so my second attempt tried to do the delete first just in case the problem related to using “as of SCN” in the middle of a transaction:


column current_scn new_value m_scn

prompt  ============
prompt  Insert first
prompt  ============

select to_char(current_scn,'FM999999999999999999') current_scn from V$database;

insert into t2
select  *
from    t1 as of scn &m_scn r
where
        owner = 'SQLTXADMIN'
;

delete from t1 as of scn &m_scn r
where owner = 'SQLTXADMIN'
;

-- ORA-08187: snapshot expression not allowed here

rollback;

prompt  ============
prompt  Delete first
prompt  ============

select to_char(current_scn,'FM999999999999999999') current_scn from V$database;

delete from t1 as of scn &m_scn r
where owner = 'SQLTXADMIN'
;

-- ORA-08187: snapshot expression not allowed here

insert into t2
select  *
from    t1 as of scn &m_scn r
where
        owner = 'SQLTXADMIN'
;

rollback;

Clearly “delete as of scn” is illegal.

Of course, if I’d gone a little further with this idea I might have tried starting with a delete that didn’t use “as of SCN”, and then the code would have succeeded. In fact, though, this wouldn’t be a perfect solution because it would leave a window for error: some other session might delete a relevant row between my query for the current SCN and my delete, which means my insert would re-insert a row deleted by another user.

The code could be modified, though, in its choice of SCN. Provided I started my transaction with the delete I could then query v$transaction for the start SCN of the transaction, and use that as the “as of” SCN for the insert:


delete from t1
where owner = 'SQLTXADMIN'
;

select
        to_char(
                start_scnw * power(2,32) + start_scnb,
                'FM999999999999999999'
        )       current_scn
from
        v$transaction
where   ses_addr = (
                select  saddr
                from    v$session
                where   sid = (
                        select  sid
                        from    V$mystat
                        where   rownum = 1
                )
        )
;

insert into t2
select  *
from    t1 as of scn &m_scn r
where
        owner = 'SQLTXADMIN'
;

By using the delete to initiate the transaction and set the SCN I think we block any window of inconsistency and, apart from the messy little bit of code that finds the transaction entry, we have an even simpler piece of code than the example given by SydOracle.

Collaboration – or just kicking ideas around – is a wonderful way to learn.

 

October 24, 2016

Anniversary OICA

Filed under: CBO,Oracle,Performance,Statistics,Troubleshooting — Jonathan Lewis @ 1:00 pm GMT Oct 24,2016

Happy anniversary to me!

On this day 10 years ago I published the first article in my blog. It was about the parameter optimizer_index_cost_adj (hence OICA), a parameter that has been a source of many performance problems and baffled DBAs over the years and, if you read my first blog posting and follow the links, a parameter that should almost certainly be left untouched.

It seems appropriate to mention it today because I recently found a blog posting (dated 3rd May 2013) on the official Oracle Blogs where the director for Primavera advises setting this parameter to 1 (and the optimizer_index_caching parameter to 90) for the Primavera P6 OLTP (PMDB) database. The recommendation is followed by a fairly typical “don’t blame me” warning, viz: “As with any changes that affect query optimization, it is paramount to TEST, TEST and TEST again. At least these settings are easily adjusted or change back to the original value”.

Here’s a thought, though: setting the optimizer_index_cost_adj to the extreme value 1 is a catastrophic change so don’t suggest it unless you are extremely confident that it’s almost certain to be the right thing to do. If you’re confident that it’s a good idea to reduce the parameter to a much smaller value than the default then suggest a range of values that varies from “ideal if it works, but high risk” to “low risk and mostly helpful”. Maybe a suggestion like: “Primavera P6 OLTP (PMDB) tends to work best with this parameter set to a value in the range of 1 to 15” would be a more appropriate comment from someone in a position of authority.

Here’s another thought: if you work for Oracle you could always contact the optimizer group to present them with your argument for the strategy and see what they think about it. Then you can include their opinion when you offer your suggestion.

For what it’s worth, here’s my opinion: as a general rule you shouldn’t be working around performance issues by fiddling with the optimizer_index_cost_adj; as a specific directive do not set it to 1. If you want to encourage Oracle to be enthusiastic about indexes in general then adjust the system statistics (preferably with a degree of truth). If you need to persuade Oracle that particular indexes are highly desirable then you can use dbms_stats.set_index_stats() to adjust the clustering_factor (and avg_data_blocks_per_key) of those indexes. If you are running 11.2.0.4 or later then you can use dbms_stats.set_table_prefs() to set the “table_cached_blocks” parameter for tables where you think Oracle should be particularly keen on using indexes but isn’t; and if your queries are suffering from bad cardinality estimates because of a pattern of multi-column filter predicates, create some column group (extended) statistics.
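For example, here’s a hedged sketch of the last three of those alternatives – the schema, object and column names (and the values) are hypothetical placeholders, so test the effects carefully before going anywhere near production:

rem     Sketch only - all names and values below are placeholders

begin
        -- 11.2.0.4+: allow for the table blocks being well cached
        dbms_stats.set_table_prefs(
                ownname => 'MY_SCHEMA',
                tabname => 'MY_TABLE',
                pname   => 'TABLE_CACHED_BLOCKS',
                pvalue  => '16'
        );

        -- make one specific index look more desirable to the optimizer
        dbms_stats.set_index_stats(
                ownname => 'MY_SCHEMA',
                indname => 'MY_INDEX',
                clstfct => 10000,       -- clustering_factor
                avgdblk => 1            -- avg_data_blocks_per_key
        );
end;
/

rem     Column group (extended) stats for correlated filter predicates

select
        dbms_stats.create_extended_stats('MY_SCHEMA','MY_TABLE','(COL1,COL2)')
from    dual
;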

Why am I so firmly set against setting the optimizer_index_cost_adj to 1? Because it doesn’t tell Oracle to “use indexes instead of doing tablescans”, it tells Oracle that every index is just about as good as every other index for almost any query. Here’s a pdf file of an article (formerly published on DBAZine and then on my old website) I wrote over twelve years ago explaining the issue. Various links in the article no longer work, and the data pattern was generated to display the problem in 8i and 9i, so you would need to modify the data to display the same effect in newer versions of Oracle – but the principle remains the same.

If you would like to see a slightly newer example of how the parameter causes problems, here’s a thread dated April 2012 from the OTN database forum where a SYS-recursive query caused a performance problem because the parameter was set to 1.

 

October 17, 2016

Fixed Stats

Filed under: Execution plans,Oracle,Statistics,Statspack,Troubleshooting — Jonathan Lewis @ 12:43 pm GMT Oct 17,2016

There are quite a lot of systems around the world that aren’t using the AWR (automatic workload repository) and ASH (active session history) tools to help them with troubleshooting because of the licensing requirement – so I’m still finding plenty of sites that are using Statspack. I recently came across a little oddity at one of these sites that I hadn’t noticed before: one of the Statspack snapshot statements was appearing fairly regularly in the Statspack report under the “SQL Ordered by Elapsed Time” section – even when the application had been rather busy and had generated lots of other work that was being reported. It was the following statement – the collection of file-level statistics:


select
       ts.name      tsname
     , df.name      filename
     , fs.phyrds
     , fs.phywrts
     , fs.readtim
     , fs.writetim
     , fs.singleblkrds
     , fs.phyblkrd
     , fs.phyblkwrt
     , fs.singleblkrdtim
     , fw.count     wait_count
     , fw.time      time
     , df.file#
  from x$kcbfwait   fw
     , v$filestat   fs
     , v$tablespace ts
     , v$datafile   df
 where ts.ts#    = df.ts#
   and fs.file#  = df.file#
   and fw.indx+1 = df.file#
;

The execution plan didn’t look very friendly, and the volume of I/O it generated (several hundred thousand disk reads) was surprising. The reason why the statement stood out so much in this case was that there was a fairly large number of files in the database (over 1,000) and the default execution plan was showing very bad cardinality estimates that resulted in highly inappropriate cartesian merge joins. At best the statement was taking around 2 minutes to run, at worst it was much, much worse.

This system was running 10g – also something which is still fairly common, though becoming much scarcer – which produced the following execution plan (which I’ve recreated on a much smaller system):

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |     1 |   535 |     2 (100)| 00:00:01 |
|*  1 |  HASH JOIN                    |                 |     1 |   535 |     2 (100)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN        |                 |     5 |  1190 |     0   (0)| 00:00:01 |
|   3 |    NESTED LOOPS               |                 |     1 |   199 |     0   (0)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN      |                 |     1 |   173 |     0   (0)| 00:00:01 |
|*  5 |      FIXED TABLE FULL         | X$KCCTS         |     1 |    43 |     0   (0)| 00:00:01 |
|   6 |      BUFFER SORT              |                 |     1 |   130 |     0   (0)| 00:00:01 |
|*  7 |       FIXED TABLE FULL        | X$KCFIO         |     1 |   130 |     0   (0)| 00:00:01 |
|*  8 |     FIXED TABLE FIXED INDEX   | X$KCCFE (ind:1) |     1 |    26 |     0   (0)| 00:00:01 |
|   9 |    BUFFER SORT                |                 |   100 |  3900 |     0   (0)| 00:00:01 |
|  10 |     FIXED TABLE FULL          | X$KCBFWAIT      |   100 |  3900 |     0   (0)| 00:00:01 |
|* 11 |   VIEW                        | GV$DATAFILE     |     1 |   297 |     1 (100)| 00:00:01 |
|  12 |    SORT ORDER BY              |                 |     1 |   957 |     1 (100)| 00:00:01 |
|  13 |     NESTED LOOPS              |                 |     1 |   957 |     0   (0)| 00:00:01 |
|  14 |      NESTED LOOPS             |                 |     1 |   647 |     0   (0)| 00:00:01 |
|  15 |       NESTED LOOPS            |                 |     1 |   371 |     0   (0)| 00:00:01 |
|* 16 |        FIXED TABLE FULL       | X$KCCFN         |     1 |   323 |     0   (0)| 00:00:01 |
|* 17 |        FIXED TABLE FIXED INDEX| X$KCVFH (ind:1) |     1 |    48 |     0   (0)| 00:00:01 |
|* 18 |       FIXED TABLE FIXED INDEX | X$KCCFE (ind:1) |     1 |   276 |     0   (0)| 00:00:01 |
|* 19 |      FIXED TABLE FULL         | X$KCCFN         |     1 |   310 |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TSTSN"="TS#" AND "K"."KCFIOFNO"="FILE#" AND "FILE#"="FW"."INDX"+1)
   5 - filter("TSTSN"<>(-1) AND "INST_ID"=USERENV('INSTANCE'))
   7 - filter("K"."INST_ID"=USERENV('INSTANCE'))
   8 - filter("F"."FEDUP"<>0 AND "F"."FENUM"="K"."KCFIOFNO")
  11 - filter("INST_ID"=USERENV('INSTANCE'))
  16 - filter("FN"."FNNAM" IS NOT NULL AND "FN"."FNTYP"=4 AND BITAND("FN"."FNFLG",4)<>4)
  17 - filter("FN"."FNFNO"="FH"."HXFIL")
  18 - filter("FE"."FEDUP"<>0 AND "FN"."FNFNO"="FE"."FENUM" AND
              "FE"."FEFNH"="FN"."FNNUM" AND "FE"."FETSN"<>(-1))
  19 - filter("FE"."FEPAX"<>65535 AND "FE"."FEPAX"<>0 AND "FE"."FEPAX"="FNAUX"."FNNUM"
              OR ("FE"."FEPAX"=0 OR "FE"."FEPAX"=65535) AND "FE"."FENUM"="FNAUX"."FNFNO" AND
              "FNAUX"."FNTYP"=4 AND "FNAUX"."FNNAM" IS NOT NULL AND BITAND("FNAUX"."FNFLG",4)<>4 AND
              "FE"."FEFNH"="FNAUX"."FNNUM")

Note particularly the two Cartesian merge joins and the very late filter at operation 1.

Note also the number of times the cardinality estimate is 1 – always a bit of a threat when the query gets complicated: “anything goes following a one for Rows”.

The easy (first thought) solution was simply to gather stats on all the fixed objects in this query:


begin
        dbms_stats.gather_table_stats('sys','x$kcbfwait',method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kccfe',   method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kccfn',   method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kccts',   method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kcfio',   method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kcvfh',   method_opt=>'for all columns size 1');
end;
/

The option to gather fixed objects stats individually with a call to dbms_stats.gather_table_stats() is not commonly known, but it does work.
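The blanket alternative, of course, is the single documented call that gathers stats for all the fixed objects in one pass – ideally run while the system is showing a representative workload:

execute dbms_stats.gather_fixed_objects_stats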

Here’s the plan (again from the small system) after stats collection:

--------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                 |    29 | 10411 |     3 (100)| 00:00:01 |
|*  1 |  HASH JOIN                     |                 |    29 | 10411 |     3 (100)| 00:00:01 |
|*  2 |   HASH JOIN                    |                 |     1 |   350 |     2 (100)| 00:00:01 |
|   3 |    MERGE JOIN CARTESIAN        |                 |    25 |  1325 |     0   (0)| 00:00:01 |
|   4 |     NESTED LOOPS               |                 |     4 |   148 |     0   (0)| 00:00:01 |
|*  5 |      FIXED TABLE FULL          | X$KCFIO         |   200 |  6200 |     0   (0)| 00:00:01 |
|*  6 |      FIXED TABLE FIXED INDEX   | X$KCCFE (ind:1) |     1 |     6 |     0   (0)| 00:00:01 |
|   7 |     BUFFER SORT                |                 |     7 |   112 |     0   (0)| 00:00:01 |
|*  8 |      FIXED TABLE FULL          | X$KCCTS         |     7 |   112 |     0   (0)| 00:00:01 |
|*  9 |    VIEW                        | GV$DATAFILE     |     1 |   297 |     1 (100)| 00:00:01 |
|  10 |     SORT ORDER BY              |                 |     1 |   316 |     1 (100)| 00:00:01 |
|  11 |      NESTED LOOPS              |                 |     1 |   316 |     0   (0)| 00:00:01 |
|  12 |       NESTED LOOPS             |                 |     1 |   248 |     0   (0)| 00:00:01 |
|  13 |        NESTED LOOPS            |                 |     1 |   226 |     0   (0)| 00:00:01 |
|* 14 |         FIXED TABLE FULL       | X$KCCFE         |     4 |   612 |     0   (0)| 00:00:01 |
|* 15 |         FIXED TABLE FIXED INDEX| X$KCCFN (ind:1) |     1 |    73 |     0   (0)| 00:00:01 |
|* 16 |        FIXED TABLE FIXED INDEX | X$KCVFH (ind:1) |     1 |    22 |     0   (0)| 00:00:01 |
|* 17 |       FIXED TABLE FULL         | X$KCCFN         |     1 |    68 |     0   (0)| 00:00:01 |
|  18 |   FIXED TABLE FULL             | X$KCBFWAIT      |   400 |  3600 |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("FILE#"="FW"."INDX"+1)
   2 - access("TSTSN"="TS#" AND "K"."KCFIOFNO"="FILE#")
   5 - filter("K"."INST_ID"=USERENV('INSTANCE'))
   6 - filter("F"."FEDUP"<>0 AND "F"."FENUM"="K"."KCFIOFNO")
   8 - filter("TSTSN"<>(-1) AND "INST_ID"=USERENV('INSTANCE'))
   9 - filter("INST_ID"=USERENV('INSTANCE'))
  14 - filter("FE"."FEDUP"<>0 AND "FE"."FETSN"<>(-1))
  15 - filter("FN"."FNTYP"=4 AND "FN"."FNNAM" IS NOT NULL AND BITAND("FN"."FNFLG",4)<>4
              AND "FN"."FNFNO"="FE"."FENUM" AND "FE"."FEFNH"="FN"."FNNUM")
  16 - filter("FN"."FNFNO"="FH"."HXFIL")
  17 - filter("FE"."FEPAX"<>65535 AND "FE"."FEPAX"<>0 AND "FE"."FEPAX"="FNAUX"."FNNUM" OR
              ("FE"."FEPAX"=0 OR "FE"."FEPAX"=65535) AND "FE"."FENUM"="FNAUX"."FNFNO" AND
              "FNAUX"."FNTYP"=4 AND "FNAUX"."FNNAM" IS NOT NULL AND BITAND("FNAUX"."FNFLG",4)<>4 AND
              "FE"."FEFNH"="FNAUX"."FNNUM")

Note the changes in cardinality estimates: they now look a little more realistic and we’re down to one cartesian merge join which (if you have a rough idea of what your X$ tables hold) still looks a little surprising at first sight but not completely unreasonable. A change of plan doesn’t necessarily mean much without the data and time behind it, of course, so here are the two sets of results from a 10g database with a handful of datafiles and tablespaces showing the Row Source Operation sections from the tkprof output before and after stats collection:

Before stats collection:

Rows     Row Source Operation
-------  ---------------------------------------------------
      6  HASH JOIN  (cr=0 pr=0 pw=0 time=1957860 us)
  16800   MERGE JOIN CARTESIAN (cr=0 pr=0 pw=0 time=33855662 us)
     42    NESTED LOOPS  (cr=0 pr=0 pw=0 time=73795 us)
   1400     MERGE JOIN CARTESIAN (cr=0 pr=0 pw=0 time=21555 us)
      7      FIXED TABLE FULL X$KCCTS (cr=0 pr=0 pw=0 time=3204 us)
   1400      BUFFER SORT (cr=0 pr=0 pw=0 time=7233 us)
    200       FIXED TABLE FULL X$KCFIO (cr=0 pr=0 pw=0 time=1210 us)
     42     FIXED TABLE FIXED INDEX X$KCCFE (ind:1) (cr=0 pr=0 pw=0 time=1859092 us)
  16800    BUFFER SORT (cr=0 pr=0 pw=0 time=67643 us)
    400     FIXED TABLE FULL X$KCBFWAIT (cr=0 pr=0 pw=0 time=2008 us)
      6   VIEW  GV$DATAFILE (cr=0 pr=0 pw=0 time=68087 us)
      6    SORT ORDER BY (cr=0 pr=0 pw=0 time=68065 us)
      6     NESTED LOOPS  (cr=0 pr=0 pw=0 time=65989 us)
      6      NESTED LOOPS  (cr=0 pr=0 pw=0 time=56632 us)
      6       NESTED LOOPS  (cr=0 pr=0 pw=0 time=47217 us)
      6        FIXED TABLE FULL X$KCCFN (cr=0 pr=0 pw=0 time=19830 us)
      6        FIXED TABLE FIXED INDEX X$KCVFH (ind:1) (cr=0 pr=0 pw=0 time=25568 us)
      6       FIXED TABLE FIXED INDEX X$KCCFE (ind:1) (cr=0 pr=0 pw=0 time=9849 us)
      6      FIXED TABLE FULL X$KCCFN (cr=0 pr=0 pw=0 time=9715 us)

After stats collection:


Rows     Row Source Operation
-------  ---------------------------------------------------
      6  HASH JOIN  (cr=0 pr=0 pw=0 time=196576 us)
      6   HASH JOIN  (cr=0 pr=0 pw=0 time=195829 us)
     42    MERGE JOIN CARTESIAN (cr=0 pr=0 pw=0 time=4390 us)
      6     NESTED LOOPS  (cr=0 pr=0 pw=0 time=7810 us)
    200      FIXED TABLE FULL X$KCFIO (cr=0 pr=0 pw=0 time=1224 us)
      6      FIXED TABLE FIXED INDEX X$KCCFE (ind:1) (cr=0 pr=0 pw=0 time=150150 us)
     42     BUFFER SORT (cr=0 pr=0 pw=0 time=1574 us)
      7      FIXED TABLE FULL X$KCCTS (cr=0 pr=0 pw=0 time=1353 us)
      6    VIEW  GV$DATAFILE (cr=0 pr=0 pw=0 time=41058 us)
      6     SORT ORDER BY (cr=0 pr=0 pw=0 time=41005 us)
      6      NESTED LOOPS  (cr=0 pr=0 pw=0 time=39399 us)
      6       NESTED LOOPS  (cr=0 pr=0 pw=0 time=34229 us)
      6        NESTED LOOPS  (cr=0 pr=0 pw=0 time=15583 us)
      6         FIXED TABLE FULL X$KCCFE (cr=0 pr=0 pw=0 time=1124 us)
      6         FIXED TABLE FIXED INDEX X$KCCFN (ind:1) (cr=0 pr=0 pw=0 time=15067 us)
      6        FIXED TABLE FIXED INDEX X$KCVFH (ind:1) (cr=0 pr=0 pw=0 time=18971 us)
      6       FIXED TABLE FULL X$KCCFN (cr=0 pr=0 pw=0 time=5581 us)
    400   FIXED TABLE FULL X$KCBFWAIT (cr=0 pr=0 pw=0 time=1615 us)

The execution time has dropped from about 2 seconds to less than 1/10th of a second – and all I’ve got is 6 or 7 files and tablespaces.  (Some of the “time=” values in the first plan are very odd, but the final time figure is about right.)

Generating an intermediate result set of 16,800 rows for a tiny number of files is not a good thing – just imagine how big that number would get with 1,000 files and a couple of hundred tablespaces.

I have to say that (for a couple of tiny databases) the 11.2.0.4 and 12.1.0.2 systems I checked this query on had no problem and immediately picked a sensible path. It’s possible that the definition of some of the v$ objects has actually changed or that the optimizer features have changed (some assistance from complex view merging, perhaps) – but if you are still running Statspack, even if it’s on 11g or 12c, then it’s worth checking from time to time how much work goes into executing the snapshot and seeing if you need some fixed object stats to make things a little more efficient.

Footnote:

Several years ago I wrote a short note about how Statspack actually captured its own execution time (from 10g onwards) and how you could run a report on it to check the run time. It’s worth running that report from time to time. I’ve recently updated that note to add the equivalent query against the AWR.

October 6, 2016

My session workload

Filed under: Oracle,Performance,Troubleshooting — Jonathan Lewis @ 1:19 pm GMT Oct 6,2016

My old website (www.jlcomp.demon.co.uk) will be disappearing in a couple of weeks – but there are a couple of timeless articles on it that are worth saving, and although the popularity of this one has probably been surpassed by Tanel Poder’s Snapper script, or other offerings by Tom Kyte or Adrian Billington, it’s still one of those useful little things to have around – it’s a package that takes a snapshot of your session stats.

The package depends on a view created in the SYS schema, and the package itself has to be installed in the SYS schema – which is why other strategies for collecting the information have become more popular; but if you want to have it handy, here are the two scripts:

rem
rem     Script:         c_mystats.sql
rem     Author:         Jonathan Lewis
rem     Dated:          March 2001
rem     Purpose:        Put names to v$mystat
rem
rem     Last tested
rem             12.1.0.2        -- naming issue
rem             11.2.0.4
rem             10.2.0.5
rem             10.1.0.4
rem              9.2.0.8
rem              8.1.7.4
rem
rem     Notes:
rem     Should be run by SYS - which means it has to be re-run
rem     on a full database export/import
rem
rem     It looks as if it is illegal to create a view with a
rem     name starting with v$ in the sys account as from 12c.
rem     (ORA-00999: invalid view name). Hence the JV$ name.
rem
rem     But you can create a public synonym starting "v$"
rem

create or replace view jv$my_stats
as
select
        /*+
                first_rows
                ordered
        */
        ms.sid,
        sn.statistic#,
        sn.name,
        sn.class,
        ms.value
from
        v$mystat        ms,
        v$statname      sn
where
        sn.statistic# = ms.statistic#
;

drop public synonym v$my_stats;
create public synonym v$my_stats for jv$my_stats;
grant select on v$my_stats to public;

rem
rem	Script:		snap_myst.sql
rem	Author:		Jonathan Lewis
rem	Dated:		March 2001
rem	Purpose:	Package to get snapshot start and delta of v$mystat
rem
rem	Last tested
rem		12.1.0.2
rem		11.2.0.4
rem		10.2.0.5
rem		10.1.0.4
rem		 9.2.0.8
rem		 8.1.7.4
rem
rem	Notes
rem	Has to be run by SYS to create the package
rem	Depends on view (j)v$my_stats (see c_mystats.sql)
rem
rem	Usage:
rem		set serveroutput on size 1000000 format wrapped
rem		set linesize 120
rem		set trimspool on
rem		execute snap_my_stats.start_snap
rem		-- do something
rem		execute snap_my_stats.end_snap
rem

create or replace package snap_my_stats as
	procedure start_snap;
	procedure end_snap (i_limit in number default 0);
end;
/

create or replace package body snap_my_stats as

cursor c1 is
	select 
		statistic#, 
		name,
		value
	from 
		v$my_stats
	where
		value != 0
	;


	type w_type is table of c1%rowtype index by binary_integer;
	w_list		w_type;
	empty_list	w_type;

	m_start_time	date;
	m_start_flag	char(1);
	m_end_time	date;

procedure start_snap is
begin

	m_start_time := sysdate;
	m_start_flag := 'U';
	w_list := empty_list;

	for r in c1 loop
		w_list(r.statistic#).value := r.value;
	end loop;

end start_snap;


procedure end_snap (i_limit in number default 0) 
is
begin

	m_end_time := sysdate;

	dbms_output.put_line('---------------------------------');

	dbms_output.put_line('Session stats - ' ||
				to_char(m_end_time,'dd-Mon hh24:mi:ss')
	);

	if m_start_flag = 'U' then
		dbms_output.put_line('Interval:-  '  || 
				trunc(86400 * (m_end_time - m_start_time)) ||
				' seconds'
		);
	else
		dbms_output.put_line('Since Startup:- ' || 
				to_char(m_start_time,'dd-Mon hh24:mi:ss')
		);
	end if;

	if (i_limit != 0) then
		dbms_output.put_line('Lower limit:-  '  || i_limit);
	end if;

	dbms_output.put_line('---------------------------------');

	dbms_output.put_line(
		rpad('Name',60) ||
		lpad('Value',18)
	);

	dbms_output.put_line(
		rpad('----',60) ||
		lpad('-----',18)
	);

	for r in c1 loop
		if (not w_list.exists(r.statistic#)) then
		    w_list(r.statistic#).value := 0;
		end if;

		if (
		       (r.value > w_list(r.statistic#).value + i_limit)
		) then
			dbms_output.put(rpad(r.name,60));
			dbms_output.put(to_char(
				r.value - w_list(r.statistic#).value,
					'9,999,999,999,990')
			);
			dbms_output.new_line;
		end if;
	end loop;

end end_snap;

begin
	select
		logon_time, 'S'
	into
		m_start_time, m_start_flag
	from
		v$session
	where
		sid = 	(
				select /*+ no_unnest */ sid 
				from v$mystat 
				where rownum = 1
			);

end snap_my_stats;
/

drop public synonym snap_my_stats;
create public synonym snap_my_stats for snap_my_stats;
grant execute on snap_my_stats to public;

One point to be cautious about with this package: do not embed it inside anonymous pl/sql blocks, e.g.

begin

        snap_my_stats.start_snap;

        -- some other code

        snap_my_stats.end_snap;

end;
/

There are some statistics in v$my_stats (v$mystat / v$sesstat) which are not updated until the end of a database call – and calling an anonymous pl/sql block counts as a single database call, so some of your statistics (for example “CPU used by this session”) will report misleading values.
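The safe pattern is to issue each call as a separate top-level statement, with the workload in between – for example:

execute snap_my_stats.start_snap

select count(*) from all_objects;

execute snap_my_stats.end_snap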

October 3, 2016

Kill CPU

Filed under: Oracle,Performance,Troubleshooting — Jonathan Lewis @ 8:58 am GMT Oct 3,2016

My old website (www.jlcomp.demon.co.uk) will be disappearing in a couple of weeks – but there are a couple of timeless articles on it that are worth saving, and a method for soaking up all the CPU on your system with a simple SQL statement against a small data set is, surely, one of them. Here, then, is a little script that I wrote (or, at least, formalised) 15 years ago to stress out a CPU:


rem
rem     Script:         kill_cpu.sql
rem     Author:         Jonathan Lewis
rem     Dated:          March 2001
rem     Purpose:        Exercise CPU and latches
rem
rem     Last tested
rem             12.1.0.2
rem             11.2.0.4
rem             11.1.0.7
rem             10.2.0.5
rem             10.1.0.4
rem              9.2.0.8
rem              8.1.7.4
rem
rem     Notes:
rem     The count(*) will return power(2,n-1)
rem     To run from Oracle 9 and later we have to set parameter
rem             _old_connect_by_enabled = true;
rem
rem     Base calculation (historical):
rem     ==============================
rem     Rule of thumb - ca. 10,000 logical I/Os per sec per 100 MHz of CPU.
rem     (Modern versions of Oracle on modern CPUs - about twice that, maybe
rem     due to the introduction of the "fastpath" consistent gets with the
rem     elimination of some logging activity that used to exist.)
rem
rem     With the value of 23 shown we do 6M buffer visits of which 4M
rem     are "consistent read gets", and 2M are "buffer is pinned count".
rem     (That's power(2,23-1) and power(2,23-2) respectively). For each
rem     row you add to the kill_cpu table you double the run-time.
rem
rem      This is an example of SQL that can take MUCH longer when run
rem      with rowsource_execution_statistics enabled. Mostly spent on 
rem      CPU calling the O/S timer. (On my last test, using a 12c VM
rem      the time jumped from 6 seconds - 23 rows - to 75 seconds when
rem      I set statistics_level to all; but half would be the effect of
rem      running through the VM.)
rem

drop table kill_cpu;

begin

        begin           execute immediate 'purge recyclebin';
        exception       when others then null;
        end;

        begin           execute immediate 'alter session set "_old_connect_by_enabled"=true';
        exception       when others then null;
        end;

end;
/

create table kill_cpu(n, primary key(n))
organization index
as
select  rownum n
from    all_objects
where   rownum <= 23
;
execute snap_my_stats.start_snap 

set timing on 
set serveroutput off

-- alter session set statistics_level = all;

spool kill_cpu

select  count(*) X
from    kill_cpu 
connect by 
        n > prior n
start with 
        n = 1
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

set serveroutput on
execute snap_my_stats.end_snap

spool off

set timing off
alter session set statistics_level = typical;

The calls to snap_my_stats use a package (owned by sys) that I wrote a long time ago for taking a snapshot of v$mystats; many people use Tanel Poder’s “Snapper” script or Tom Kyte’s script instead.

September 16, 2016

Stats collection time

Filed under: Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 12:58 pm GMT Sep 16,2016

Yesterday I posted a note about querying dba_optstat_operations to get a quick report of how long calls to dbms_stats had been taking, but said I had another script that helped to fill some of the gaps it left. One of my readers pointed out fairly promptly that 12c enhances the feature considerably, with a view dba_optstat_operation_tasks that (for example) lists all the tables processed during a single call to gather_schema_stats.
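For those already on 12c, a minimal sketch of a query against the new view might look like the following – I’m assuming the obvious column names here, so check the view definition on your version:

select
        target, target_type, status,
        start_time, end_time
from
        dba_optstat_operation_tasks
where
        opid = &m_opid          -- the id of one operation from dba_optstat_operations
order by
        start_time
;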

Well, I wrote my script years (if not decades) before 12c came out, so I’m going to publish it anyway.

It definitely falls into the quick and dirty category. It’s not trying to be clever, makes a critical assumption about the recent past, and isn’t a generally useful tool, but I’ve found it a quick way to highlight a possible problem with expensive stats collection routines (especially those written by 3rd parties who have disabled Oracle’s automatic job.)

The concept is very simple: list all the tables in the order they were last analyzed and assume that for any “obviously continuous” stretch of time the start of stats collection for one table coincided with the end of stats collection for the previous one. A little fiddling with the lag() function then allows you to report the time spent on gathering the stats.

rem
rem     stats_check.sql
rem
rem     Quick and dirty to get an idea of
rem     where the time went on the most
rem     recent stats collection job.
rem
rem     Weakness: It assumes that the start
rem     of stats collection of a table is the
rem     end of stats collection for the previous
rem     table.  This may be far from true.
rem

define m_schema = '&1'

set linesize 144
set pagesize 60
set trimspool on

column a_date format a20

spool stats_check

select
        table_name,
        blocks, num_rows, sample_size,
        round(100 * sample_size/nullif(num_rows,0),2)   sample_pct,
        to_char(last_analyzed,'dd-mon-yyyy hh24:mi:ss') a_date,
        to_char(
                round(1440 *
                        (
                        last_analyzed -
                        lag(last_analyzed, 1) over (order by last_analyzed)
                        ), 2
                ),'9,999.00'
        )                        a_time
from
        dba_tables
where
        owner = upper('&m_schema')
order by
        last_analyzed
;

spool off

It’s a very simple script – one detail that’s quite useful is the presence of the data about table size and sample size. But that’s really only the starting point. If you’re gathering stats on a table then most of the time might be spent on gathering stats for histograms or indexes, so there’s plenty of scope to add bits to the script to report number of histograms per table, number of indexes per table, and so on. Personally I tend to do “incremental” trouble-shooting and keep my scripts (and their run time) short, so I’d only worry about specific details if I saw a headline figure that seemed worth a little more effort.
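As a sketch of the sort of addition I mean – counting histograms and indexes per table from the standard dictionary views (untested against the output below, so treat it as a starting point):

select
        t.table_name,
        (
        select  count(*)
        from    dba_tab_columns c
        where   c.owner = t.owner
        and     c.table_name = t.table_name
        and     c.histogram != 'NONE'
        )       num_histograms,
        (
        select  count(*)
        from    dba_indexes i
        where   i.table_owner = t.owner
        and     i.table_name = t.table_name
        )       num_indexes
from
        dba_tables t
where
        t.owner = upper('&m_schema')
order by
        t.last_analyzed
;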

Here’s some (camouflaged) output – which happens to highlight one of the weaknesses of the approach:

TABLE_NAME              BLOCKS   NUM_ROWS SAMPLE_SIZE SAMPLE_PCT A_DATE                  A_TIME
--------------------- --------  --------- ----------- ---------- --------------------  --------
TABLE_AAA                    0          0           0            14-jul-2016 01:27:55       .00
TABLE_BBB              6320666  540356865   108071373         20 14-jul-2016 02:30:45     62.83
TABLE_CCC              9587372  930770535   186154107         20 14-jul-2016 03:36:13     65.47
TABLE_DDD              2272319  104698080    20939616         20 14-jul-2016 04:19:45     43.53
TABLE_EEE              4413696  258746720    51749344         20 14-jul-2016 04:49:13     29.47
TABLE_FFF                    0          0           0            14-jul-2016 05:02:28     13.25
TABLE_GGG              4377656  258740995    51748199         20 14-jul-2016 05:02:28       .00
TABLE_HHH                    0          0           0            14-jul-2016 05:02:28       .00
TABLE_III                   60         52          52        100 14-jul-2016 05:02:29       .02
TABLE_JJJ                   60        170         170        100 14-jul-2016 05:02:30       .02
TABLE_KKK                   60        100         100        100 14-jul-2016 05:02:30       .00
TABLE_LLL                   60       5548        5021       90.5 14-jul-2016 05:02:31       .02
TABLE_MMM                 4660     686575      137315         20 14-jul-2016 05:02:32       .02
...
TABLE_TTT                    0          0           0            14-jul-2016 05:02:38       .00
TABLE_UUU                   60        659         659        100 14-jul-2016 05:02:38       .00
TABLE_VVV               325324   18618685     3723737         20 14-jul-2016 05:04:14      1.60
TABLE_WWW                   60       1106        1106        100 14-jul-2016 05:06:01      1.78
TABLE_XXX                24001    2534810      506962         20 14-jul-2016 05:06:13       .20
TABLE_YYY                    0          0           0            14-jul-2016 05:06:20       .12
TABLE_ZZZ                    0          0           0            14-jul-2016 05:06:20       .00

You’ll notice that the zero row FFF table is reported to have taken 13.25 minutes for stats gathering, while the 4.3M block GGG table took no time at all. That’s the problem of sorting by the last_analyzed time when it’s accurate only to the second and some tables take less than a second to gather stats. Clearly these two time estimates are the wrong way round. (WWW and XXX look fairly suspect too, but their timestamps are different so there must be some other explanation – but the time is too small to worry about at the moment.)

Tables BBB and CCC are also an example of how variable the results are when you compare cost with time. Although the table and sample were nearly twice the size, table CCC has fewer histograms and indexes. The same variation shows up clearly between EEE and GGG (after allowing for the ordering error), and slightly less immediately between DDD and all the other large tables.

That brings me to one more threat of mis-interpretation (and a reason why I’m not terribly keen on publishing scripts that I’ve hacked together). If you see Oracle claiming (through this script) that it took no more than a couple of minutes to gather stats on a 5 billion row table it may be true – but maybe that stats collection was only related to gathering stats for a single, fairly new, partition in a partitioned table. There are, no doubt, several boundary cases that a simple script like this will have overlooked – and when I run it I’ll be at a client site with knowledge of the client database and I’ll spot the anomalies and omissions.

Update

One simple enhancement to the script above – indexes take time to analyze as well, and if you query only for tables the time spent analyzing the indexes on a table will be attributed to the NEXT table in the output. All we have to do is change the reference to dba_tables to be an inline view referencing dba_tables and dba_indexes in a UNION ALL:


select
        table_name, index_name,
        blocks, num_rows, sample_size,
        round(100 * sample_size/nullif(num_rows,0),2)   sample_pct,
        to_char(last_analyzed,'dd-mon-yyyy hh24:mi:ss') a_date,
        to_char(
                round(1440 *
                        (
                        last_analyzed -
                        lag(last_analyzed, 1) over (order by last_analyzed)
                        ), 2
                ),'9,999.99'
        )                        a_time
from
        (
        select
                owner, table_name, null as index_name,
                blocks, num_rows, sample_size, last_analyzed
        from
                dba_tables
        union all
        select
                table_owner, table_name, index_name,
                leaf_blocks, num_rows, sample_size, last_analyzed
        from
                dba_indexes
        )
where
        owner = upper('&m_schema')
order by
        last_analyzed
;

You’ll notice that I’ve left the ‘&m_schema’ outside the view so that I only have to supply a value once; the optimizer is smart enough to push this predicate inside the union all view so that the plan shows the two separate subqueries being operated only for the given schema.

Footnote

A reader has emailed me to point out that one of the cases where this script won’t help (or, at the least, would produce highly misleading results) is when the CONCURRENT preference has been set to true, allowing multiple jobs to run and gather stats on different objects at the same time.
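You can check the current setting of that preference with a call to dbms_stats.get_prefs():

select dbms_stats.get_prefs('CONCURRENT') concurrent_pref from dual;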

 

September 15, 2016

Stats time

Filed under: Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 9:41 am GMT Sep 15,2016

I don’t really remember how long it’s been since Oracle created an automatic log of how long a call to the dbms_stats package took, though it was probably some time in the 10g time-line. It wasn’t until it had been around for several years, though, that I wrote a little script (possibly prompted by a comment from Martin Widlake) that I’ve used occasionally since to see what’s been going on in the past, how variable stats collection times have been, and what unexpected dbms_stats calls an application may have been making. Here’s what it currently looks like:

rem
rem     Script:         optimizer_operations.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2010
rem     Purpose:
rem
rem     Last tested
rem             11.2.0.4
rem     Not tested
rem             12.1.0.2
rem             10.2.0.5
rem
rem     Notes:
rem     Simple script to list calls to dbms_stats showing
rem     what they did and how long it took.
rem
rem     Notice that gather_database_stats and gather_schema_stats
rem     don't tell you anything about individual tables.
rem

spool optimizer_operations

column start_time       format a35
column operation        format a28
column target           format a28

column duration         format a12
column max_duration     format a12

column utc_start_time   format a28
column first_utc_time   format a28

select
        operation,
        cast(sys_extract_utc(min(start_time)) as timestamp(0))                  first_utc_time,
        --
        to_char(extract(hour from max(end_time - start_time)),'FM00') ||
        ':' ||
        to_char(extract(minute from max(end_time - start_time)),'FM00') ||
        ':' ||
        to_char(round(extract(second from max(end_time - start_time)),2),'FM00.00')     max_duration,
        --
        count(*)        Frequency
from
        dba_optstat_operations
group by
        operation
order by
        operation
;

select
        operation,
        target,
        -- start_time,
        cast(sys_extract_utc(start_time) as timestamp(0))                               utc_start_time,
        to_char(extract(hour from (end_time - start_time)),'FM00') ||
        ':' ||
        to_char(extract(minute from (end_time - start_time)),'FM00') ||
        ':' ||
        to_char(round(extract(second from (end_time - start_time)),2),'FM00.00')        duration
from
        dba_optstat_operations
where
        end_time - start_time >= numtodsinterval(2,'SECOND')
and     target is not null
order by
        start_time
;

spool off

This version of the script first prints a summary showing how many calls to each operation have occurred, the worst case scenario for each, and when the first one happened – in case a recent change has introduced a problem. You might find it useful to include other bits of information, like the min(duration), in the first section; it’s not difficult to modify the SQL. The second part of the report is then just a list of the individual calls, with a minimum interesting time included.

And here’s some (camouflaged) output


OPERATION                    FIRST_UTC_TIME               MAX_DURATION  FREQUENCY
---------------------------- ---------------------------- ------------ ----------
copy_table_stats             14-JUL-16 11.12.55 PM        00:00:03.43         320
gather_database_stats(auto)  14-JUL-16 06.00.01 AM        01:13:17.25          67
gather_schema_stats          14-JUL-16 04.00.02 AM        05:16:01.22          35
gather_table_stats           14-JUL-16 01.15.59 AM        00:04:07.98         153


OPERATION                    TARGET                             UTC_START_TIME               DURATION
---------------------------- ---------------------------------- ---------------------------- ------------
gather_schema_stats          MY_SCHEMA                          01-AUG-16 04.00.02 AM        04:40:33.03
copy_table_stats             SYS.WRH$_SQLSTAT.WRH$_SQLSTA_24706 01-AUG-16 11.18.15 PM        00:00:02.04
                             69947_32443

gather_schema_stats          MY_SCHEMA                          02-AUG-16 04.00.02 AM        04:57:39.61
gather_schema_stats          MY_SCHEMA                          03-AUG-16 04.00.02 AM        04:40:10.88
gather_schema_stats          MY_SCHEMA                          04-AUG-16 04.00.02 AM        05:00:48.05
gather_schema_stats          MY_SCHEMA                          05-AUG-16 04.00.03 AM        05:07:42.33
gather_schema_stats          MY_SCHEMA                          06-AUG-16 04.00.02 AM        04:50:39.93
gather_table_stats           SYS.WRI$_ADV_ACTIONS               06-AUG-16 12.42.38 PM        00:00:03.01
gather_table_stats           SYS.WRI$_ADV_FINDINGS              06-AUG-16 12.42.41 PM        00:00:02.94
gather_table_stats           SYS.WRI$_ADV_RECOMMENDATIONS       06-AUG-16 12.42.46 PM        00:00:02.03
gather_table_stats           SYS.WRI$_ADV_SQLT_PLANS            06-AUG-16 12.42.49 PM        00:00:04.45
gather_schema_stats          MY_SCHEMA                          07-AUG-16 04.00.02 AM        04:48:57.23
gather_schema_stats          MY_SCHEMA                          08-AUG-16 04.00.01 AM        04:40:57.81
gather_table_stats           MY_SCHEMA.TABLE_A                  09-AUG-16 01.16.21 AM        00:00:03.61
gather_schema_stats          MY_SCHEMA                          09-AUG-16 04.00.03 AM        04:51:14.48
gather_table_stats           MY_SCHEMA.TABLE_A                  10-AUG-16 01.16.44 AM        00:00:02.07
gather_schema_stats          MY_SCHEMA                          10-AUG-16 04.00.02 AM        04:36:19.35
gather_schema_stats          MY_SCHEMA                          11-AUG-16 04.00.02 AM        04:43:30.77
copy_table_stats             SYS.WRH$_FILESTATXS.WRH$_FILEST_24 11-AUG-16 11.24.21 PM        00:00:02.65
                             70669947_32683

gather_schema_stats          MY_SCHEMA                          12-AUG-16 04.00.02 AM        04:42:15.92
gather_schema_stats          MY_SCHEMA                          12-AUG-16 04.00.02 AM        04:42:15.89
gather_schema_stats          MY_SCHEMA                          13-AUG-16 04.00.02 AM        04:41:06.68
gather_table_stats           SYS.WRI$_ADV_PARAMETERS            13-AUG-16 10.31.35 AM        00:00:03.88
gather_table_stats           SYS.WRI$_ADV_RATIONALE             13-AUG-16 10.31.39 AM        00:00:03.03
gather_table_stats           MY_SCHEMA.TABLE_B                  13-AUG-16 01.24.33 PM        00:04:07.98
gather_table_stats           MY_SCHEMA.TABLE_C.SYS_P12423       13-AUG-16 02.06.34 PM        00:03:22.33
copy_table_stats             SYS.WRH$_ACTIVE_SESSION_HISTORY.WR 13-AUG-16 11.03.35 PM        00:00:02.11
                             H$_ACTIVE_2470669947_32731


There is an important limitation with this script – it’s all very well seeing that gather_schema_stats has been running at a fairly uniform 4.5 to 5 hours, but it doesn’t tell us where the time went inside that call and it’s a call that could have been covering a lot of tables. There’s not a lot you can do about that from this set of data, but there’s another script I wrote a long time ago (before I wrote this one) that tried to do something about that question, and I’ll be publishing that tomorrow.

 

September 13, 2016

Securefile space

Filed under: fragmentation,Infrastructure,LOBs,Oracle,Troubleshooting — Jonathan Lewis @ 7:29 am GMT Sep 13,2016

Here’s a little script I hacked together a couple of years ago from a clone of a script I’d been using for checking space usage in the older types of segments. Oracle Corp. eventually put together a routine to peer inside securefile LOBs:

rem
rem	Script:		dbms_space_use_sf.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Dec 2013
rem	Purpose:	
rem
rem	Last tested 
rem		12.1.0.2
rem		11.2.0.4
rem	Not tested
rem		11.1.0.7
rem	Not relevant
rem		10.2.0.5
rem		 9.2.0.8
rem		 8.1.7.4
rem
rem	Notes:
rem	See also dbms_space_use.sql
rem
rem	11g introduced securefiles lobs and two overloads of 
rem	dbms_space_usage to report space used by their segments
rem
rem	Valid values for suoption are:
rem		SPACEUSAGE_EXACT (16): Computes space usage exhaustively
rem		SPACEUSAGE_FAST  (17): Retrieves values from in-memory statistics
rem
rem	This version allows for partitioned objects, could delete
rem	lines to parameter 4 and partition names to eliminate
rem	the complaints about substitution variables.
rem


define m_seg_owner	= &1
define m_seg_name	= &2
define m_seg_type	= '&3'
define m_part_name	= &4

define m_segment_owner	= &m_seg_owner
define m_segment_name	= &m_seg_name
define m_segment_type	= '&m_seg_type'
define m_partition_name = &m_part_name

@@setenv

execute snap_enqueues.start_snap
execute snap_events.start_snap
execute snap_my_stats.start_snap

spool dbms_space_use_sf

prompt	============
prompt	Secure files
prompt	============

declare
	wrong_ssm	exception;
	pragma exception_init(wrong_ssm, -10614);

	m_segment_size_blocks	number(12,0);
	m_segment_size_bytes	number(12,0);
	m_used_blocks		number(12,0);
	m_used_bytes		number(12,0);
	m_expired_blocks	number(12,0);
	m_expired_bytes		number(12,0);
	m_unexpired_blocks	number(12,0);
	m_unexpired_bytes	number(12,0);

begin
	dbms_space.space_usage(
		upper('&m_segment_owner'),
		upper('&m_segment_name'),
		upper('&m_segment_type'),
		suoption		=> dbms_space.spaceusage_exact,	
--		suoption		=> dbms_space.spaceusage_fast,
		segment_size_blocks	=> m_segment_size_blocks,
		segment_size_bytes	=> m_segment_size_bytes,
		used_blocks		=> m_used_blocks,
		used_bytes		=> m_used_bytes,
		expired_blocks		=> m_expired_blocks,
		expired_bytes		=> m_expired_bytes,
		unexpired_blocks	=> m_unexpired_blocks,
		unexpired_bytes		=> m_unexpired_bytes,
		partition_name		=> upper('&m_partition_name')
	);

	dbms_output.new_line;
	dbms_output.put_line(' Segment Blocks:   ' || to_char(m_segment_size_blocks,'999,999,990') || ' Bytes: ' || to_char(m_segment_size_bytes,'999,999,999,990')); 
	dbms_output.put_line(' Used Blocks:      ' || to_char(m_used_blocks,'999,999,990')         || ' Bytes: ' || to_char(m_used_bytes,'999,999,999,990')); 
	dbms_output.put_line(' Expired Blocks:   ' || to_char(m_expired_blocks,'999,999,990')      || ' Bytes: ' || to_char(m_expired_bytes,'999,999,999,990')); 
	dbms_output.put_line(' Unexpired Blocks: ' || to_char(m_unexpired_blocks,'999,999,990')    || ' Bytes: ' || to_char(m_unexpired_bytes,'999,999,999,990')); 

exception
	when wrong_ssm then
		dbms_output.put_line('Segment not ASSM');
end;
/

prompt	===============
prompt	Generic details
prompt	===============

declare
	m_total_blocks			number;
	m_total_bytes			number;
	m_unused_blocks			number;
	m_unused_bytes			number;
	m_last_used_extent_file_id	number;
	m_last_used_extent_block_id	number;
	m_last_used_block		number;
begin
	dbms_space.unused_space(
		segment_owner		=> upper('&m_segment_owner'),
		segment_name		=> upper('&m_segment_name'),
		segment_type		=> upper('&m_segment_type'),
		total_blocks		=> m_total_blocks,
		total_bytes 		=> m_total_bytes, 
		unused_blocks		=> m_unused_blocks,  
		unused_bytes		=> m_unused_bytes,
		last_used_extent_file_id 	=> m_last_used_extent_file_id, 
		last_used_extent_block_id	=> m_last_used_extent_block_id,
		last_used_block		=> m_last_used_block,
		partition_name		=> upper('&m_partition_name')
	);

	dbms_output.put_line('Segment Total blocks: ' || to_char(m_total_blocks,'999,999,990'));
	dbms_output.put_line('Object Unused blocks: ' || to_char(m_unused_blocks,'999,999,990'));

end;
/

-- execute snap_my_stats.end_snap
-- execute snap_events.end_snap
-- execute snap_enqueues.end_snap

spool off

Sample of output (from a slightly older version of the code):


============
Secure files
============

 Segment Blocks:    168960 Bytes: 1384120320
 Used Blocks:       151165 Bytes: 1238343680
 Expired Blocks     17795 Bytes: 145776640
 Unexpired Blocks   0 Bytes: 0

PL/SQL procedure successfully completed.

===============
Generic details
===============
Segment Total blocks: 168960
Object Unused blocks: 0

PL/SQL procedure successfully completed.


September 11, 2016

Space Usage

Filed under: fragmentation,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 6:09 pm GMT Sep 11,2016

Here’s a simple script that I’ve used for many years to check space usage inside segments. The comment about freelist groups may be out of date – I’ve not had to worry about that for a very long time. There is a separate script for securefile lobs.


rem
rem	Script:		dbms_space_use.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Nov 2002
rem	Purpose:	
rem
rem	Last tested 
rem		12.1.0.2
rem		11.2.0.4
rem	Not tested
rem		11.1.0.7
rem		10.2.0.5
rem		 9.2.0.8
rem	Not relevant
rem		 8.1.7.4
rem
rem	Notes:
rem	For accuracy in free space you (once) needed to set the
rem	scan limit; and for those rare objects cases where you 
rem	had defined multiple freelist groups you still have to
rem	work through each free list group in turn
rem
rem	For the ASSM calls:
rem		FS1	=> 0% - 25% free space
rem		FS2	=> 25% - 50% free space
rem		FS3	=> 50% - 75% free space
rem		FS4	=> 75% - 100% free space
rem		Bytes = blocks * block size
rem
rem	Expected errors:
rem		ORA-10614: Operation not allowed on this segment
rem			(MSSM segment, ASSM call)
rem		ORA-10618: Operation not allowed on this segment
rem			(ASSM segment, MSSM call)
rem		ORA-03200: the segment type specification is invalid
rem			(e.g. for LOBINDEX or LOBSEGMENT)
rem			11g - "LOB" is legal for LOB segments
rem			    - use "INDEX" for the LOBINDEX
rem
rem	For indexes
rem		Blocks are FULL or FS2 (re-usable)
rem
rem	Special case: LOB segments.
rem	The number of blocks reported by FS1 etc. is actually the
rem	number of CHUNKS in use (and they're full or empty). So 
rem	if your CHUNK size is not the same as your block size the
rem	total "blocks" used doesn't match the number of blocks 
rem	below the HWM.
rem
rem	The package dbms_space is created by dbmsspu.sql
rem	and the body is in prvtspcu.plb
rem
rem	11.2 overloads dbms_space.space_usage for securefile lobs
rem	See dbms_space_use_sf.sql
rem
rem	When supplying details about partitions the segment type
rem	can consist of two words (e.g. LOB PARTITION), these 
rem	must be surrounded by quotes to survive the script.
rem
rem	You might want to set up two versions of this code with
rem	all references to partitions removed from one of them
rem	or you have to keep pressing return to bypass the 
rem	requests for substitution variables
rem

define m_seg_owner	= &1
define m_seg_name	= &2
define m_seg_type	= '&3'
define m_part_name	= &4

define m_segment_owner	= &m_seg_owner
define m_segment_name	= &m_seg_name
define m_segment_type	= '&m_seg_type'
define m_partition_name = &m_part_name

@@setenv

spool dbms_space_use

prompt	===================
prompt	Freelist management
prompt	===================

declare
	wrong_ssm	exception;
	pragma exception_init(wrong_ssm, -10618);

	m_free	number(10);
begin
	dbms_space.free_blocks(
		segment_owner		=> upper('&m_segment_owner'),
		segment_name		=> upper('&m_segment_name'),
		segment_type		=> upper('&m_segment_type'),
		partition_name		=> upper('&m_partition_name'),
--		scan_limit		=> 50,
		freelist_group_id	=> 0,
		free_blks		=> m_free
	);
	dbms_output.put_line('Free blocks below HWM: ' || m_free);
exception
	when wrong_ssm then
		dbms_output.put_line('Segment not freelist managed');
end;
/


prompt	====
prompt	ASSM
prompt	====

declare
	wrong_ssm	exception;
	pragma exception_init(wrong_ssm, -10614);

	m_unformatted_blocks	number;
	m_unformatted_bytes	number;
	m_fs1_blocks		number;
	m_fs1_bytes		number;
	m_fs2_blocks		number;  
	m_fs2_bytes		number;

	m_fs3_blocks		number;
	m_fs3_bytes		number;
	m_fs4_blocks		number; 
	m_fs4_bytes		number;
	m_full_blocks		number;
	m_full_bytes		number;

begin
	dbms_space.SPACE_USAGE(
		segment_owner		=> upper('&m_segment_owner'),
		segment_name		=> upper('&m_segment_name'),
		segment_type		=> upper('&m_segment_type'),
		unformatted_blocks	=> m_unformatted_blocks,
		unformatted_bytes	=> m_unformatted_bytes, 
		fs1_blocks		=> m_fs1_blocks , 
		fs1_bytes		=> m_fs1_bytes,
		fs2_blocks		=> m_fs2_blocks,  
		fs2_bytes		=> m_fs2_bytes,
		fs3_blocks		=> m_fs3_blocks,  
		fs3_bytes		=> m_fs3_bytes,
		fs4_blocks		=> m_fs4_blocks,  
		fs4_bytes		=> m_fs4_bytes,
		full_blocks		=> m_full_blocks, 
		full_bytes		=> m_full_bytes,
		partition_name		=> upper('&m_partition_name')
	);


	dbms_output.new_line;
	dbms_output.put_line('Unformatted                   : ' || to_char(m_unformatted_blocks,'999,999,990') || ' / ' || to_char(m_unformatted_bytes,'999,999,999,990'));
	dbms_output.put_line('Freespace 1 (  0 -  25% free) : ' || to_char(m_fs1_blocks,'999,999,990') || ' / ' || to_char(m_fs1_bytes,'999,999,999,990'));
	dbms_output.put_line('Freespace 2 ( 25 -  50% free) : ' || to_char(m_fs2_blocks,'999,999,990') || ' / ' || to_char(m_fs2_bytes,'999,999,999,990'));
	dbms_output.put_line('Freespace 3 ( 50 -  75% free) : ' || to_char(m_fs3_blocks,'999,999,990') || ' / ' || to_char(m_fs3_bytes,'999,999,999,990'));
	dbms_output.put_line('Freespace 4 ( 75 - 100% free) : ' || to_char(m_fs4_blocks,'999,999,990') || ' / ' || to_char(m_fs4_bytes,'999,999,999,990'));
	dbms_output.put_line('Full                          : ' || to_char(m_full_blocks,'999,999,990') || ' / ' || to_char(m_full_bytes,'999,999,999,990'));

exception
	when wrong_ssm then
		dbms_output.put_line('Segment not ASSM');
end;
/


prompt	=======
prompt	Generic
prompt	=======

declare
	m_total_blocks			number;
	m_total_bytes			number;
	m_unused_blocks			number;
	m_unused_bytes			number;
	m_last_used_extent_file_id	number;
	m_last_used_extent_block_id	number;
	m_last_used_block		number;
begin
	dbms_space.unused_space(
		segment_owner		=> upper('&m_segment_owner'),
		segment_name		=> upper('&m_segment_name'),
		segment_type		=> upper('&m_segment_type'),
		total_blocks		=> m_total_blocks,
		total_bytes 		=> m_total_bytes, 
		unused_blocks		=> m_unused_blocks,  
		unused_bytes		=> m_unused_bytes,
		last_used_extent_file_id 	=> m_last_used_extent_file_id, 
		last_used_extent_block_id	=> m_last_used_extent_block_id,
		last_used_block		=> m_last_used_block,
		partition_name		=> upper('&m_partition_name')
	);

	dbms_output.put_line('Segment Total blocks: '  || to_char(m_total_blocks,'999,999,990'));
	dbms_output.put_line('Object Unused blocks: '  || to_char(m_unused_blocks,'999,999,990'));

end;
/

undefine 1
undefine 2
undefine 3
undefine 4

undefine m_seg_owner
undefine m_seg_name
undefine m_seg_type
undefine m_part_name

undefine m_segment_owner
undefine m_segment_name
undefine m_segment_type
undefine m_partition_name

spool off
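
Typical usage, with purely hypothetical names – note the quotes protecting the two-word segment type, and remember (see the notes above) that for a simple non-partitioned table you can just press return when prompted for the partition name:

@dbms_space_use test_user pt_hash "table partition" p1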



Here’s a sample of output (from a segment using ASSM):


===================
Freelist management
===================
Segment not freelist managed

PL/SQL procedure successfully completed.

====
ASSM
====

Unformatted                   :  132,385 / ############
Freespace 1 (  0 -  25% free) :        0 /            0
Freespace 2 ( 25 -  50% free) :        0 /            0
Freespace 3 ( 50 -  75% free) :        0 /            0
Freespace 4 ( 75 - 100% free) :        0 /            0
Full                          :   12,327 /  100,982,784

PL/SQL procedure successfully completed.

=======
Generic
=======
Segment Total blocks: 145920
Object Unused blocks: 0

PL/SQL procedure successfully completed.

(I’ve increased the length of the byte-count output since I produced that report ;)

July 8, 2016

DML and Bloom

Filed under: Bugs,Oracle,Partitioning,Troubleshooting — Jonathan Lewis @ 1:01 pm GMT Jul 8,2016

One of the comments on my recent posting about “Why use pl/sql bulk strategies over simple SQL” pointed out that it’s not just distributed queries that can change plans dramatically when you change from a simple select to “insert into … select …”; there’s a similar problem with queries that use Bloom filters – the filter disappears when you change from the query to the DML.

This seemed a little bizarre, so I did a quick search on MoS (using the terms “insert select Bloom Filter”) to check for known bugs and then tried to run up a quick demo. Here’s a summary of the related bugs that I found through my first simple search:

Bug 20112932 : BLOOM FILTER IS NOT CHOOSED WHEN EXECUTING INSERT SELECT STATEMENT
11.2.0.4 - Nov 2014: "not a bug".
This cites three more documents:
  Bug 13801198  BLOOM PRUNING/FILTER NOT USED IN DML STATEMENTS 
    11.2.0.3 - March 2012 - Duplicate, base bug: 18949550
    18949550: MERGE QUERY NOT USING BLOOM FILTER
    11.2.0.3 - June 2014: fixed in 12.2 (no patches yet)

  Bug 14325392  BLOOM FILTER NOT SUPPORTED IN PARTITION WISE JOIN 
    11.2.0.3 - July 2012 - fixed in 12.1 
    (but I think this is a PX filter, not a partition filter)

  Bug 17716301  BLOOM FILTER NOT USED FOR OUTER JOIN

    Aug 2014 - hidden bug note. Patches available for 11.2.0.3
    Note 1919508.1 describes the problem.

And here’s some code to demonstrate the problem – running 11.2.0.4 (with similar results on 12.1.0.2).


rem
rem     Script:         dml_bloom_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem

create table pt_hash (
        id,
        grp,
        small_vc,
        padding
)
nologging
pctfree 90 pctused 10
partition by hash(grp)
(
        partition p1,
        partition p2,
        partition p3,
        partition p4
)
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                          id,
        trunc(rownum/50)                grp,
        to_char(trunc(rownum/20))       small_vc,
        rpad('x',100)                   padding
from
        generator, generator
where
        rownum <= 1e5
;

create table t1
as
select
        rownum  id,
        rownum  n1,
        lpad(rownum,10) v1,
        rpad('x',100) padding
from
        dual
connect by
        rownum <= 4
;

create table target (
        n1      number,
        id      number
)
;

-- gather stats on t1 and pt_hash (for all columns size 1, otherwise default)

There’s nothing fancy in this code, just a hash partitioned table with 100,000 rows (and a lot of wasted space), and a “driving” table with 4 rows that I can use in a very simple join to the partitioned table. I’ve also got an empty table ready for an insert statement.

So here’s a query with its execution plan (with rowsource execution statistics) that (based on the content of the t1 table) should select from at most two of the four partitions. I’ve hinted a hash join which is where Bloom filtering can be very effective with partitioned table joins:


set serveroutput off
alter session set statistics_level = all;

select
        /*+
                leading(t1 h1)
                use_hash(h1)    no_swap_join_inputs(h1)
        */
        t1.n1,
        h1.id
from
        t1,
        pt_hash h1
where
        t1.id between 2 and 3
and     h1.grp = t1.n1
and     h1.id <= 100
;

select * from table(dbms_xplan.display_cursor(null,null,'outline allstats last cost partition'));

------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |      1 |        |  2115 (100)|       |       |      1 |00:00:00.16 |    8374 |   8367 |       |       |          |
|*  1 |  HASH JOIN                  |         |      1 |      3 |  2115   (2)|       |       |      1 |00:00:00.16 |    8374 |   8367 |  2440K|  2440K|  777K (0)|
|   2 |   PART JOIN FILTER CREATE   | :BF0000 |      1 |      3 |     2   (0)|       |       |      2 |00:00:00.01 |       2 |      0 |       |       |          |
|*  3 |    TABLE ACCESS FULL        | T1      |      1 |      3 |     2   (0)|       |       |      2 |00:00:00.01 |       2 |      0 |       |       |          |
|   4 |   PARTITION HASH JOIN-FILTER|         |      1 |    100 |  2113   (2)|:BF0000|:BF0000|     51 |00:00:00.16 |    8372 |   8367 |       |       |          |
|*  5 |    TABLE ACCESS FULL        | PT_HASH |      2 |    100 |  2113   (2)|:BF0000|:BF0000|     51 |00:00:00.16 |    8372 |   8367 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------

-- Predicates and Outline redacted

The most significant points to notice are:

  • Operation 2 – Part Join Filter Created: we’ve created a Bloom filter
  • Operation 4 – Partition Hash Join – Filter: we’ve used the Bloom filter, Pstart/Pstop = :BF0000
  • Operation 5 – Table Access Full: starts twice, once for each of the two partitions we need

And now we use the same SQL statement but insert the result set into the holding table:

insert into target(n1, id)
select
        /*+
                ordered
                use_hash(h1)    no_swap_join_inputs(h1)
        */
        t1.id,
        h1.id
from
        t1,
        pt_hash h1
where
        t1.id between 2 and 3
and     h1.grp = t1.n1
and     h1.id <= 100
;

select * from table(dbms_xplan.display_cursor(null,null,'outline allstats last cost partition'));

------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name    | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |         |      1 |        |  2115 (100)|       |       |      0 |00:00:00.02 |   16698 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |         |      1 |        |            |       |       |      0 |00:00:00.02 |   16698 |       |       |          |
|*  2 |   HASH JOIN              |         |      1 |      3 |  2115   (2)|       |       |      1 |00:00:00.02 |   16689 |  1969K|  1969K|  723K (0)|
|*  3 |    TABLE ACCESS FULL     | T1      |      1 |      3 |     2   (0)|       |       |      2 |00:00:00.01 |       2 |       |       |          |
|   4 |    PARTITION HASH ALL    |         |      1 |    100 |  2113   (2)|     1 |     4 |    100 |00:00:00.02 |   16687 |       |       |          |
|*  5 |     TABLE ACCESS FULL    | PT_HASH |      4 |    100 |  2113   (2)|     1 |     4 |    100 |00:00:00.02 |   16687 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------

-- Predicate and Outline sections redacted

In this case we don’t see any operation to create a Bloom filter, and operation 5 reports 4 starts, one for each partition, with the Pstart/Pstop showing 1 – 4. As a cross-check you can also see that the number of buffers visited has doubled and the time (though don’t trust it too much on such a small test) has increased.

To my surprise the outline information for the first plan didn’t report a Bloom filter hint (px_join_filter), but I tried adding one to the insert statement anyway – and nothing changed; however there is a “subquery pruning” mechanism that’s been available to the optimizer for a very long time now so I decided to try hinting that for both queries /*+ subquery_pruning(@sel$1 h1@sel$1 partition) */. This had no effect on the select statement, but produced the following plan for the insert:


-------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name    | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT          |         |      1 |        |  2115 (100)|       |       |      0 |00:00:00.01 |    8390 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL  |         |      1 |        |            |       |       |      0 |00:00:00.01 |    8390 |       |       |          |
|*  2 |   HASH JOIN               |         |      1 |      3 |  2115   (2)|       |       |      1 |00:00:00.01 |    8381 |  1969K|  1969K|  829K (0)|
|*  3 |    TABLE ACCESS FULL      | T1      |      1 |      3 |     2   (0)|       |       |      2 |00:00:00.01 |       2 |       |       |          |
|   4 |    PARTITION HASH SUBQUERY|         |      1 |    100 |  2113   (2)|KEY(SQ)|KEY(SQ)|     51 |00:00:00.01 |    8379 |       |       |          |
|*  5 |     TABLE ACCESS FULL     | PT_HASH |      2 |    100 |  2113   (2)|KEY(SQ)|KEY(SQ)|     51 |00:00:00.01 |    8377 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------------

As you can see operation 4 now reports Partition Hash Subquery – Oracle has “pre-queried” the driving table (you can see the statement in the 10046 trace file if you enable SQL trace) to discover which partitions it will have to visit – and operation 5 starts twice with a Pstart/Pstop of KEY(SQ).

Interestingly the hint didn’t end up in the Outline section of the execution plan, so this is an example of a query which you can’t fix by capturing the SQL Plan Baseline for one statement against the text for another – though you can hack the hint into an SQL Profile or (my preference in this case, I think) an SQL Patch.
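
For example, on 11.2 / 12.1 you could create such a patch through the undocumented dbms_sqldiag_internal interface. The following is only a sketch – the patch name is my invention, the sql_text has to match the problem statement exactly, and from 12.2 onwards the documented dbms_sqldiag.create_sql_patch is the appropriate call:

begin
        sys.dbms_sqldiag_internal.i_create_patch(
                sql_text  => q'{insert into target(n1, id)
select
        /*+
                ordered
                use_hash(h1)    no_swap_join_inputs(h1)
        */
        t1.id,
        h1.id
from
        t1,
        pt_hash h1
where
        t1.id between 2 and 3
and     h1.grp = t1.n1
and     h1.id <= 100}',
                hint_text => 'subquery_pruning(@sel$1 h1@sel$1 partition)',
                name      => 'pt_hash_subq_pruning'
        );
end;
/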

Footnote:

After publishing this note I started to browse through the comments on the article I’d linked to as a reference for subquery pruning, and discovered that an example of this insert/select issue appeared there with a surprising workaround, which was to use the “multi-table insert” (with just a single table). At the moment, though, I haven’t managed to get this working with this example.

Footnote 2:

I’ve just been reminded by email of another detail mentioned in the comments of the earlier post – this problem isn’t present for direct path (i.e. /*+ append */) inserts – and I have run a quick test on 12.1.0.2 to check that this is indeed the case (though the fix isn’t present in 11.2.0.4).  Don’t forget to check the listed Bug notes on MoS, in case there’s a back-port available for your version of Oracle.
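
For reference, a direct path version of the insert above would look like this:

insert /*+ append */ into target(n1, id)
select
        /*+
                ordered
                use_hash(h1)    no_swap_join_inputs(h1)
        */
        t1.id,
        h1.id
from
        t1,
        pt_hash h1
where
        t1.id between 2 and 3
and     h1.grp = t1.n1
and     h1.id <= 100
;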

 

May 16, 2016

Cursor_Sharing problem

Filed under: 12c,Bugs,Oracle,Troubleshooting — Jonathan Lewis @ 12:49 pm GMT May 16,2016

Here’s a possible bug (though maybe “not a bug”) that came up over the weekend on the OTN database forum. An application generating lots of “literal string” SQL was tested with cursor_sharing set to force. This successfully forced the use of bind variable substitution, but a particular type of simple insert statement started generating very large numbers of child cursors – introducing a lot of mutex waits and library cache contention. Here’s a (substituted) statement that was offered as an example of the problem:


INSERT INTO schema.tableName (column1,columns2,..)
VALUES (:"SYS_B_0",:"SYS_B_1",:"SYS_B_2",:"SYS_B_3",:"SYS_B_4",:"SYS_B_5",:"SYS_B_6",timestamp:"SYS_B_7",SYSTIMESTAMP,:"SYS_B_8")

Note particularly the slightly odd looking detail: timestamp:”SYS_B_7″; this is how bind variable substitution looks if you’ve used the “compact” ANSI mechanism for handling datetime literals. We were told, in fact, that the tables had only number, varchar2, and date columns – so it looks a little suspicious when we see timestamp values being inserted, but the implied coercion wasn’t the source of the problem. Here’s all it takes to see the problem (tested only on 12.1.0.2):


rem
rem     Script:         ansi_datetime_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2016
rem

create table t1 (
        n1      number(8,0),
        v1      varchar2(10),
        d1      date,
        t1      timestamp
);

insert into t1 values(-1,'x',sysdate, systimestamp);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

set serveroutput off
alter session set cursor_sharing = force;

prompt  ============================
prompt  Testing DATE literals
prompt  Expect to see child number 2
prompt  ============================

insert into t1 values(1, 'A', date'2016-01-01', null);
insert into t1 values(1, 'A', date'2016-01-02', null);
insert into t1 values(1, 'A', date'2016-01-03', null);

select * from table(dbms_xplan.display_cursor(null,null,'-note -plan_hash'));

prompt  ============================
prompt  Testing TIMESTAMP literals
prompt  Expect to see child number 2
prompt  ============================

insert into t1 values(1, 'A', null, timestamp'2016-01-01 00:00:00');
insert into t1 values(1, 'A', null, timestamp'2016-01-02 00:00:00');
insert into t1 values(1, 'A', null, timestamp'2016-01-03 00:00:00');

select * from table(dbms_xplan.display_cursor(null,null,'-note -plan_hash'));

prompt  ===============================
prompt  Need privilege to see this view
prompt  Pre-coded for the sql_ids above
prompt  ===============================

break on sql_id skip 1

select
        sql_id, child_number, hash_match_failed
from
        v$sql_shared_cursor
where
        sql_id in ('58udhcm270bhn', 'gssz5cbnt7mgn')
order by
        sql_id, child_number
;

A couple of points – there are several more tests in the script demonstrating things that do NOT cause multiple child cursors to appear. I probably didn’t cover all the options that I could have covered, but I hit a number of common cases to check that it wasn’t simply a case of cursor_sharing being severely broken in 12c. I’ve also allowed a side effect to demonstrate the presence of multiple child cursors rather than explicitly listing the child cursors. If the three statements (of each type) had produced shareable cursors then the child number reported by dbms_xplan.display_cursor() would have been zero in both cases. Here are the actual outputs:


SQL_ID  58udhcm270bhn, child number 2
-------------------------------------
insert into t1 values(:"SYS_B_0", :"SYS_B_1", date:"SYS_B_2", null)

-------------------------------------------------
| Id  | Operation                | Name | Cost  |
-------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |
|   1 |  LOAD TABLE CONVENTIONAL | T1   |       |
-------------------------------------------------


SQL_ID  gssz5cbnt7mgn, child number 2
-------------------------------------
insert into t1 values(:"SYS_B_0", :"SYS_B_1", null, timestamp:"SYS_B_2")


-------------------------------------------------
| Id  | Operation                | Name | Cost  |
-------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |
|   1 |  LOAD TABLE CONVENTIONAL | T1   |       |
-------------------------------------------------


And, having pre-coded the script with the SQL_IDs of the two guilty statements, here’s the output identifying the cause of the failure to share from v$sql_shared_cursor:


SQL_ID        CHILD_NUMBER H
------------- ------------ -
58udhcm270bhn            0 N
                         1 Y
                         2 Y

gssz5cbnt7mgn            0 N
                         1 Y
                         2 Y

There is a bug on MoS relating to timestamp columns and failure to share cursors – it doesn’t really look like the same problem but it could be related in some way: Bug 13407937 : HIGH VERSION COUNT FOR INSERT WITH CURSOR_SHARING=FORCE AND TIMESTAMP COLUMN. It’s described as “not a bug” :(

May 9, 2016

RI Locks

Filed under: deadlocks,Indexing,IOT,Locks,Oracle,trace files,Troubleshooting — Jonathan Lewis @ 12:24 pm GMT May 9,2016

RI = Referential Integrity: also known informally as parent/child integrity, and primary (or unique) key/foreign key checking.

I’m on a bit of a roll with things that I must have explained dozens or even hundreds of times in different environments without ever formally explaining them on my blog. Here’s a blog item I could have done with in response to a question that came up on the OTN database forum over the weekend.

What happens in the following scenario:


-- session 1

create table parent (
        id        number(8,0),
        constraint par_pk primary key(id)
);

create table child  (
        id_p      number(8,0) not null references parent,
        id_c      number(8,0) not null,
        constraint child_pk primary key(id_p, id_c)
)
;

insert into parent values(1);

-- session 2
insert into child values(1,1);

Since the parent row corresponding to the child row doesn’t (yet) seem to exist as far as session 2 is concerned you might expect session 2 to respond immediately with an error message like:

ERROR at line 1:
ORA-02291: integrity constraint (TEST_USER.SYS_C0017926) violated - parent key not found

In fact, although the end-user is not allowed to see the uncommitted parent row, the user’s process can see the uncommitted row and will wait until session 1 commits or rolls back – so if you examine v$lock for the current locks for the two sessions you’d see something like this:

select  sid, type, id1, id2, lmode, request, ctime, block
from    V$lock
where   sid in (select sid from V$session where username = 'TEST_USER')
and     type != 'AE'
order by
        sid, type desc
/

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------- -- ---------- ---------- ---------- ---------- ---------- ----------
         3 TX     327709      12584          6          0        283          1
           TM     143734          0          2          0        283          0
           TM     143732          0          3          0        283          0

       250 TX     589829      12877          6          0        240          0
           TX     327709      12584          0          4        240          0
           TM     143734          0          3          0        240          0
           TM     143732          0          3          0        240          0


7 rows selected.

In the above, SID 250 is session 2: it’s holding a transaction lock (TX) in mode 6 because it has acquired an undo segment and has generated some undo, it’s also waiting for a transaction lock in mode 4 (share) and – checking id1 and id2 – we can see that the transaction table entry it’s waiting for is held by session 3 in mode 6 (and we also note that the lock held by session 3 is marked as a blocker).

If session 3 commits (thus releasing the transaction lock) session 250 will continue processing the insert; if session 3 rolls back session 250 will raise error ORA-02291 and roll back its insert statement. (Note: if this were a multi-statement transaction it would only be the insert into child that would be rolled back; that’s another one of those details that is important but often isn’t stated explicitly, leaving people believing that the entire transaction would be rolled back.)
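
As a quick aside, if you didn’t already know which session was which, the classic self-join of v$lock – matching waiters to blockers on id1 and id2 – would identify the blocking session. A sketch:

select
        blocker.sid     blocking_sid,
        waiter.sid      waiting_sid,
        waiter.type,
        waiter.request
from
        v$lock  blocker,
        v$lock  waiter
where
        blocker.block   = 1
and     waiter.request  > 0
and     waiter.id1      = blocker.id1
and     waiter.id2      = blocker.id2
;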

Updates and deletes can produce the same effects. Imagine that we have just created the two tables, and then run the following:


-- session 1
insert into parent values(1);
commit;
delete from parent where id = 1;

-- session 2
insert into child values(1,1);

Again session 2 will wait for session 1 to commit or roll back. In this case if session 1 commits session 2 will raise Oracle error ORA-02291, if session 1 rolls back session 2 will continue with the insert.

Deadlocks

Whenever you can demonstrate a way of producing a wait chain you can also manage to produce a deadlock. Consider the following (starting, again, from empty tables):


-- (1) session 1
insert into parent values(1);

-- (2) session 2
insert into parent values(2);

-- (3) session 1
insert into child values(2,2);

-- (4) session 2
insert into child values(1,1);

Session 1 will start waiting for session 2 to commit (or roll back) at step 3, then session 2 will start to wait for session 1 at step 4 – with the result that session 1 will recognise the deadlock after about three seconds and roll back its last statement, raising exception ORA-00060 and dumping a trace file. (Note: session 1 will not, as many people think, roll back the entire transaction, it will only roll back the statement that allowed the deadlock to develop). Session 2 will still be waiting for session 1 to commit or roll back its insert into parent. Contrary to the popular claim, Oracle will not “resolve” the deadlock, it will simply break the deadlock leaving one session waiting for the other session to respond appropriately to the deadlock error.

For reference, here’s the deadlock graph (from a 12c trace file) produced by session 1 (SID = 3) for this demo:


Deadlock graph:
                                          ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name                             process session holds waits  process session holds waits
TX-00010017-000026C7-00000000-00000000          6       3     X             33     250           S
TX-000A000D-000026F8-00000000-00000000         33     250     X              6       3           S

session 3: DID 0001-0006-00000004       session 250: DID 0001-0021-00000041
session 250: DID 0001-0021-00000041     session 3: DID 0001-0006-00000004

Rows waited on:
  Session 3: no row
  Session 250: no row

When you see a deadlock graph with TX waits of type S (share, mode 4) it’s a very good bet that the wait has something to do with indexes – which may mean referential integrity as discussed here, but may mean collisions on primary keys, and may mean something to do with simple collisions on index-organized tables. You’ll notice that the “Rows waited on:” section shows no row – unfortunately in earlier versions of Oracle you may find a spurious row entry here because the wait information from some other (block) wait has been left in the relevant columns in v$session.

March 20, 2016

Index Speculation

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 10:32 pm GMT Mar 20,2016

There’s a current question on the OTN database forum as follows (with a little cosmetic adjustment):

I have a request for 3 indices as shown below. Does the 1st index suffice for 2 and 3?  Do I need all 3?

  • create index idx_atm_em_eff_ver_current_del on atm_xxx_salary (employee_key, effective_dt, salary_version_number, is_current, is_deleted);
  • create index idx_atm_em_ver_current on atm_xxx_salary (employee_key, salary_version_number, is_current);
  • create index idx_atm_sal_current_del on atm_xxx_salary (employee_key, is_deleted, is_current);

In the absence of any information about the data and the application the correct answer is: “How could we possibly tell?”

On the other hand there’s plenty of scope for intelligent speculation, and that’s an important skill to practise because when we’re faced with a large number of options and very little information we need to be able to make best-guess choices about which ones are most likely to be worth the effort of pursuing in detail. So if we have to make some guesses about this table and the set of indexes shown, are there any reasonable guesses we might make?

I’ve highlighted the table name and leading column for the first index. The table seems to be about salaries and the leading column seems to identify an employee. In fact we see that all three indexes start with the employee_key, and that may be what prompted the original question. Previous (“real-world”) experience tells me that employees are, generally, paid a salary and that salaries are likely to change (usually upwards) over time, and I note that another column in one of these indexes is effective_dt (date ?), and a third column (appearing in two of the indexes) is is_current.

This looks like a table of employee salaries recording their current and historic salaries, engineered with a little redundant information to make it easy to find the current salary. (Perhaps there’s a view of current_salary defined as is_current = ‘Y’ and is_deleted = ‘N’.)
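
If that guess were right the view might look something like this – entirely speculative, right down to the salary column, which doesn’t appear in any of the three indexes:

create or replace view current_salary as
select
        employee_key, effective_dt, salary_version_number, salary
from
        atm_xxx_salary
where
        is_current = 'Y'
and     is_deleted = 'N'
;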

It’s harder to speculate with any confidence on the columns is_deleted and salary_version_number. Why would a salary row be marked as deleted – is this something that happens when an employee leaves or an employee is deleted (or, following the pattern, has their is_deleted flag set to ‘Y’)? Why does a salary have a version number – does the table contain its own audit trail of errors and corrections? Perhaps a correction is effected by marking the incorrect entry as deleted and incrementing its version number to generate the version number for the correct entry. Possibly the notional primary key of the table is (employee_key, effective_dt, is_deleted, salary_version_number).

The level of complexity surrounding these two columns could send further speculation in completely the wrong direction, but let’s follow the line that these two columns see very little action – let’s assume that most of the data is not “deleted” and virtually none of the data needs “versioning”. How does this assumption help us with the original question?

The largest employer in the world is the US Department of Defense with 3.2 million employees (followed by the People’s Liberation Army of China with only 2.3 million employees), so an “employees” table is not really likely to be very big. How often does an employee have a salary review and change ? Would once per year be a reasonable figure to pluck from the air ? How many employees stay at the same company for 40 years – how many rows per employee would you end up with, and how scattered would they be through the salary table ?

Under any reasonable estimate it seems likely that if you created the first index (5 columns) then all the salary rows for a given employee are likely to be contained in a single leaf block, so if all the searches were driven by employee then that single index would allow exactly the correct set of table rows to be identified from one index leaf block access plus a little extra CPU.
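
(A rough sanity check of that claim, assuming an 8KB block size: 40 years of annual changes gives about 40 index entries per employee, and at a generous 50 bytes per entry that’s only about 2KB – comfortably inside a single leaf block.)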

Of course it’s possible that, with different circumstances, the size and clustering factor of the first index would be so much greater than the size and clustering factors of the other two that a query that would use one of the smaller indexes won’t use the larger index – but in this case the most significant contributor to the optimizer’s cost is likely to be the clustering_factor and, given our assumption of the slow appearance over time of the new salaries for an employee, the clustering factor of all three indexes is likely to be the same (probably very similar to the number of rows in the salary table).

Having got this far, it’s worth considering whether or not the salary table should actually be an index-organized table – it looks like an obvious candidate; how many other columns are there likely to be in a salary table ? At that point, of course, it’s worth thinking about other queries that might access the salary table without reference to the employees table; perhaps a secondary index on (is_current, employee_key) might be appropriate, but in the absence of any other information we’ve reached the point where speculation needs to be backed up by some hard facts.
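
Just to make the IOT idea concrete, here’s one possible shape for the table – entirely hypothetical: the column types are guesses and the salary column is my invention since it doesn’t appear in any of the indexes:

create table atm_xxx_salary (
        employee_key            number(8,0),
        effective_dt            date,
        salary_version_number   number(4,0),
        is_current              varchar2(1),
        is_deleted              varchar2(1),
        salary                  number(10,2),   -- hypothetical column
        constraint atm_sal_pk primary key(
                employee_key, effective_dt, is_deleted, salary_version_number
        )
)
organization index
;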

Bottom Line:

I wouldn’t guarantee that the first index makes the other two indexes redundant but it seems highly likely that it should and it’s probably worth spending some time looking at the requirements and numbers a little more closely – especially if you’re the US DoD or the Chinese People’s Liberation Army.


February 15, 2016

Connect By

Filed under: Execution plans,Hints,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 2:01 pm GMT Feb 15,2016

I received an email a couple of days ago that was a little different from usual – although the obvious answer was “it’s the data”. A connect by query with any one of several hundred input values ran in just a few seconds, but with one specific input it was still running 4,000 seconds later using the same execution plan – was this a bug ?

There’s nothing to suggest that it should be: with skewed data anything can happen. Even a single table access by exact index could take 1/100th of a second to return a result if there was only one row matching the requirement, and 1,000 seconds if there were 100,000 rows in 100,000 different table blocks (and the table was VERY big). The same scaling problem could be true of any type of query – and “connect by” queries can expose you to a massive impact because their run time can increase geometrically as the recursion takes place.

So it was easy to answer the question – no it’s (probably) not a bug, check the data for that one value.

Then I decided to build a simple model. The original email had a four-table join, but I just created a single table, and used a “no filtering” connect by which I had to hint. Here’s some code I ran on 11.2.0.4:


rem
rem     script: connect_by_skew.sql
rem     dated:  Feb 2016
rem     Last tested:
rem             12.1.0.2
rem

create table t1 nologging 
as
select 
        rownum id_p, 10 * rownum id
from
        all_objects
where 
        rownum <= 50000
;

execute dbms_stats.gather_table_stats(user,'t1', method_opt=>'for all columns size 1')

alter system flush shared_pool;

set serveroutput off
alter session set statistics_level = all;

select sum(ct) 
from    (
        select
                /*+ no_connect_by_filtering */
                count(id) ct
        from
                t1
        connect by
                id = 20 * prior id_p
        start with
                id_p = 1
        group by
                id
)
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

update t1 set id_p = 0
where   id_p = 1
;

update t1 set id_p = 1
where   id_p > 45000
;

select sum(ct) 
from    (
        select
                /*+ no_connect_by_filtering */
                count(id) ct
        from
                t1
        connect by
                id = 20 * prior id_p
        start with
                id_p = 1
        group by
                id
)
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

The sum() of the inline aggregate view emulates the original code – I don’t know what it was for; possibly it was a way of demonstrating the problem without producing a large output. I just copied it.

As you can see in my script every parent id (id_p) starts out unique, and if I look at the pattern of the raw data identified by the recursion from id_p = 1 (rather than looking at the result of the actual query) this is what I’d get:

      ID_P         ID
---------- ----------
         1         10
         2         20
         4         40
         8         80
        16        160
        32        320
        64        640
       128       1280
       256       2560
       512       5120
      1024      10240
      2048      20480
      4096      40960
      8192      81920
     16384     163840
     32768     327680

When I modify the data so that I have exactly 5,000 rows with id_p = 1 the initial data generation will be 80,000 rows of data. If you want to try setting id_p = 1 for more rows make sure you do it to rows where id_p is already greater than 32768 or you’ll run into Oracle error ORA-01436: CONNECT BY loop in user data.
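
For example, to push the test up to 10,000 special rows you could safely add something like this (a sketch – the range chosen stays above 32768):

update t1 set id_p = 1
where   id_p between 40001 and 45000
;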

Here’s the execution plan, with rowsource execution stats I got for the first query (running 11.2.0.4):


-----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |      |      1 |        |    32 (100)|      1 |00:00:00.44 |     103 |       |       |          |
|   1 |  SORT AGGREGATE                            |      |      1 |      1 |            |      1 |00:00:00.44 |     103 |       |       |          |
|   2 |   VIEW                                     |      |      1 |      2 |    32   (7)|     16 |00:00:00.44 |     103 |       |       |          |
|   3 |    HASH GROUP BY                           |      |      1 |      2 |    32   (7)|     16 |00:00:00.44 |     103 |  1519K|  1519K| 1222K (0)|
|*  4 |     CONNECT BY NO FILTERING WITH START-WITH|      |      1 |        |            |     16 |00:00:00.44 |     103 |       |       |          |
|   5 |      TABLE ACCESS FULL                     | T1   |      1 |  50000 |    31   (4)|  50000 |00:00:00.10 |     103 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------

As you can see, this took 0.44 seconds, generated the expected 16 rows (still visible up to operation 2) which it then counted. Oracle followed the same execution plan when I set 5,000 rows to the critical value – here’s the new run-time plan:


-----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |      |      1 |        |    32 (100)|      1 |00:05:39.25 |     103 |       |       |          |
|   1 |  SORT AGGREGATE                            |      |      1 |      1 |            |      1 |00:05:39.25 |     103 |       |       |          |
|   2 |   VIEW                                     |      |      1 |      2 |    32   (7)|   5015 |00:05:39.24 |     103 |       |       |          |
|   3 |    HASH GROUP BY                           |      |      1 |      2 |    32   (7)|   5015 |00:05:39.22 |     103 |  5312K|  2025K| 1347K (0)|
|*  4 |     CONNECT BY NO FILTERING WITH START-WITH|      |      1 |        |            |  80000 |00:05:38.56 |     103 |       |       |          |
|   5 |      TABLE ACCESS FULL                     | T1   |      1 |  50000 |    31   (4)|  50000 |00:00:00.09 |     103 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------

As expected, 80,000 rows generated (5,000 * 16), aggregated down to 5,015, then aggregated again to the one row result. Time to complete: 5 minutes 39 seconds – and it was all CPU time. It’s not entirely surprising – a single recursive descent (with startup overheads) took 0.44 seconds – presumably a fairly large fraction of that was startup, but even 0.1 seconds adds up if you do it 5,000 times.
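
(A quick bit of arithmetic supports that: 5 minutes 39 seconds is about 339 seconds, and 339 / 5,000 is roughly 0.068 seconds per descent – quite consistent with the 0.44 seconds for the single descent once you strip out its one-off costs.)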

Everybody knows that skewed data can produce extremely variable response times. With a deeper tree and more rows with the special value it wouldn’t be hard for the total run time of this query to get to the 4,000 seconds reported in the original email. (I also tried running with 10,000 rows set to 1 and the run time went up to 18 minutes – of which a large fraction was reading from the TEMPORARY tablespace because something had overflowed to disc).

Was there a solution ?

I don’t know – but I did suggest two options:
a) create a histogram on the data to show that there was one particular special value; since the code seemed to include literals perhaps the optimizer would notice the special case and choose a different plan (there’s a sketch of gathering such a histogram at the end of this note).
b) hint the code to use a different strategy – the hint would be /*+ connect_by_filtering */. Here’s the resulting execution plan:


---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |      1 |        |    95 (100)|      1 |00:00:06.50 |    1751 |       |       |          |
|   1 |  SORT AGGREGATE              |      |      1 |      1 |            |      1 |00:00:06.50 |    1751 |       |       |          |
|   2 |   VIEW                       |      |      1 |      2 |    95   (6)|   5015 |00:00:06.49 |    1751 |       |       |          |
|   3 |    HASH GROUP BY             |      |      1 |      2 |    95   (6)|   5015 |00:00:06.47 |    1751 |  5312K|  2025K| 1346K (0)|
|   4 |     CONNECT BY WITH FILTERING|      |      1 |        |            |  80000 |00:00:06.30 |    1751 |   337K|   337K|  299K (0)|
|*  5 |      TABLE ACCESS FULL       | T1   |      1 |      1 |    31   (4)|   5000 |00:00:00.01 |     103 |       |       |          |
|*  6 |      HASH JOIN               |      |     16 |      1 |    63   (5)|     15 |00:00:05.98 |    1648 |  1969K|  1969K|  741K (0)|
|   7 |       CONNECT BY PUMP        |      |     16 |        |            |     16 |00:00:00.01 |       0 |       |       |          |
|   8 |       TABLE ACCESS FULL      | T1   |     16 |  50000 |    31   (4)|    800K|00:00:01.49 |    1648 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

We get the result in 6.5 seconds! [UPDATE: but there’s a nice explanation for that – most of the time comes from the work done gathering rowsource execution statistics; with statistics_level set back to typical the run time dropped to 0.19 seconds.]
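
For reference, here’s how the histogram of option (a) might have been gathered – just a sketch, since I didn’t test whether the optimizer would actually take advantage of it for this query:

begin
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 't1',
                method_opt => 'for columns id_p size 254'
        );
end;
/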

February 3, 2016

Hinting

Filed under: Hints,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 1:04 pm GMT Feb 3,2016

This is just a little example of thinking about hinting for short-term hacking requirements. It’s the answer to a question that came up on the Oracle-L listserver  a couple of months ago (Oct 2015) and is a convenient demonstration of a principle that can often (not ALWAYS) be applied as a response to the problem: “I can make this query work quickly once, how do I make it work quickly when I make it part of a join ?”

The question starts with this query, which returns “immediately” for any one segment:


SELECT DE.TABLESPACE_NAME, DE.OWNER,DE.SEGMENT_NAME,
       MAX(DE.BYTES) LARGEST_EXTENT_BYTES
FROM dba_extents DE
WHERE 1=1
  AND DE.OWNER           = <owner>
  AND DE.SEGMENT_NAME    = <segment_name>
  AND DE.segment_type    = <segment_type>
  AND DE.tablespace_name = <tablespace_name>
  AND DE.partition_name  = <max_partition_name>
GROUP BY DE.TABLESPACE_NAME, DE.OWNER, DE.SEGMENT_NAME
;

But the email then goes on to explain: “I’ve got a table of values that I need to use as a list of inputs for this query, but when I do the join it takes ages to complete; how do I make the join run quickly?”

Here’s the initial code:


WITH SEGMENT_LIST AS
(
  select * from (
   SELECT /*+ materialize cardinality(100) */
           owner, segment_name, segment_type, tablespace_name,
           MAX(partition_name) MAX_PARTITION_NAME
   FROM my_custom_table
   GROUP BY owner, segment_name, segment_type, tablespace_name
  ) where rownum < 2
)
SELECT
       DE.TABLESPACE_NAME, DE.OWNER,DE.SEGMENT_NAME,
       MAX(DE.BYTES) LARGEST_EXTENT_BYTES
FROM SEGMENT_LIST SL, dba_extents DE
WHERE 1=1
  AND DE.OWNER           = SL.OWNER
  AND DE.SEGMENT_NAME    = SL.SEGMENT_NAME
  AND DE.segment_type    = SL.segment_type
  AND DE.tablespace_name = SL.tablespace_name
  AND DE.partition_name  = SL.max_partition_name
GROUP BY DE.TABLESPACE_NAME, DE.OWNER, DE.SEGMENT_NAME

What we’ve got is a query where the user’s reference table generates a list of segments (the rownum < 2 was a temporary test) and we want the detail query to run for each segment identified. The “for each segment” gives us a clue that what we want to see is a simple nested loop join, driven by the factored subquery, with the very efficient query above running as the “second table of the nested loop”.

What I failed to notice at the time is that the /*+ materialize */ hint was in the wrong place – it should have been placed after the outer (i.e. very first) select – and it’s possible that if it had been in the right place the user would have got the plan they wanted, especially in the later versions of Oracle. As it was I suggested that we merely needed to tell the optimizer to:

Visit the “tables” in the order (segment_list, dba_extents) and do a nested loop into dba_extents; but since both segment_list and dba_extents were views we needed to stop Oracle from trying to merge them and play silly games with distinct aggregate placement, etc., by including directives that the views should not be merged; and then we might need to explain to Oracle that it would have to push the join predicate between segment_list and dba_extents inside the dba_extents view.

In other words, a list of 4 hints, as shown below:


WITH SEGMENT_LIST AS
(
  select * from (
   SELECT /*+ materialize cardinality(100) */
           owner, segment_name, segment_type, tablespace_name,
           MAX(partition_name) MAX_PARTITION_NAME
   FROM my_custom_table
   GROUP BY owner, segment_name, segment_type, tablespace_name
  ) where rownum < 2
)
SELECT /*+
        leading(sl de)
        no_merge(sl)
        no_merge(de)
        push_pred(de)
        */
       DE.TABLESPACE_NAME, DE.OWNER,DE.SEGMENT_NAME,
       MAX(DE.BYTES) LARGEST_EXTENT_BYTES
FROM SEGMENT_LIST SL, dba_extents DE
WHERE 1=1
  AND DE.OWNER           = SL.OWNER
  AND DE.SEGMENT_NAME    = SL.SEGMENT_NAME
  AND DE.segment_type    = SL.segment_type
  AND DE.tablespace_name = SL.tablespace_name
  AND DE.partition_name  = SL.max_partition_name
GROUP BY DE.TABLESPACE_NAME, DE.OWNER, DE.SEGMENT_NAME

According to a follow-up email, this was sufficient.  The OP had actually tried variations on the leading() and use_nl() hints – but without the no_merge() hint the optimizer was probably rewriting the SQL in a way that put the hints out of context. It’s worth noting that the /*+ materialize */ hint is in the wrong place – it should be after the first appearance of the SELECT keyword in the factored subquery – and that probably added to the difficulty of getting the desired execution plan.
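
For the record, here’s the statement with the materialize hint moved to its correct position – a sketch only, not retested, since I don’t have the OP’s my_custom_table to run it against:

WITH SEGMENT_LIST AS
(
  select /*+ materialize */ * from (
   SELECT /*+ cardinality(100) */
           owner, segment_name, segment_type, tablespace_name,
           MAX(partition_name) MAX_PARTITION_NAME
   FROM my_custom_table
   GROUP BY owner, segment_name, segment_type, tablespace_name
  ) where rownum < 2
)
SELECT /*+
        leading(sl de)
        no_merge(sl)
        no_merge(de)
        push_pred(de)
        */
       DE.TABLESPACE_NAME, DE.OWNER, DE.SEGMENT_NAME,
       MAX(DE.BYTES) LARGEST_EXTENT_BYTES
FROM SEGMENT_LIST SL, dba_extents DE
WHERE 1=1
  AND DE.OWNER           = SL.OWNER
  AND DE.SEGMENT_NAME    = SL.SEGMENT_NAME
  AND DE.segment_type    = SL.segment_type
  AND DE.tablespace_name = SL.tablespace_name
  AND DE.partition_name  = SL.max_partition_name
GROUP BY DE.TABLESPACE_NAME, DE.OWNER, DE.SEGMENT_NAME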

For a production system I’d probably want to do something a little more sophisticated in terms of stability once I’d got the plan I wanted – but this looks like a DBA query used to run an ad hoc report, so perhaps this solution is good enough for the current requirement.

 
