If you’ve used Tanel Poder’s snapper script then you probably want to know about the latest release, and webinar coming up this Wednesday.
February 11, 2013
Optimisation ?
I was at a client site recently where one of the end-users seemed to have discovered a cunning strategy for optimising a critical SQL statement. His problem was that his query screen times out after 2 minutes, so any query he runs has to complete in less than two minutes or he doesn’t see the results. Unfortunately he had a particular query which took nearly 32 minutes from cold to complete – partly because it’s a seven-table join using ANSI OUTER joins, against tables ranging through the 10s of millions of rows and gigabytes of data – the (necessary) tablescan of the table that had to be first in the join order took 70 seconds alone.
But our intrepid user seems to have made an important discovery and engineered a solution to his performance problem. I think he’s noticed that when you run a query twice in a row the second execution is often faster than the first. I can’t think of any other reason why the same person would run the same query roughly every four minutes between 8:00 and 9:00 am every morning (and then do the same again around 5:00 in the afternoon).
Looking at the SQL Monitoring screen around 10:00 the first day I was on-site I noticed this query with a very pretty graphic effect of gradually shrinking blue bars as 32 minutes of I/O turned into 2 minutes of CPU over the course of 8 consecutive executions which reported run times something like: 32 minutes, 25 minutes, 18 minutes, 12 minutes, 6 minutes, 4 minutes, 2.1 minutes, 2 minutes.
It’s lucky (for that user) that the db_cache_size is 60GB. On the other hand this machine is one of those Solaris boxes that likes to pretend that it’s got 128 CPUs when really it’s only 16 cores with 8 lightweight threads per core – you don’t want anyone running a query that uses 2 solid CPU minute on one of those boxes because it’s taking out 1/16th of your CPU availability, while reporting a load of 1/128 of your CPUs.
Footnote: the query can be optimised (properly) – it accessed roughly 100M rows of data to return roughly 300 rows (with no aggregation), so we just need to do a little bit of work on precise access paths.
February 6, 2013
Delphix
If you’re a regular follower or my blog you may recall Kyle Hailey and the joint webinar we did nearly two years ago on “Visual SQL Tuning” covering an approach I’ve written about in the past and a product that he developed at Embarcadero to automate the work that I’d been doing by hand and eye.
Kyle has now moved on to Delphix, and has become involved with another really interesting piece of technology – database virtualization. How do you supply a terabyte sized database to five different development teams without using up 5TB of disc space ? Create an operating environment that keeps one master copy of the database while maintaining a set of (small) private files for each team that hold private copies of the blocks that have been changed by that team – and that’s just one feature of the product.
The product is sufficiently interesting (plus I have a healthy regard for Kyle’s opinions) that I’ve accepted an invitation to go over to California for a few days next month to experiment with it, see what it can do, try to stress it a bit and so on. The people at Delphix are so confident that I’ll be impressed that they’re going to let me do this and then write up a blog telling you how things went.
Have a browse around their documentation and if you’re interested add a suggestion to the comment telling me what you’d like me to test, and how, and I’ll see if I can fit it into my timetable (no promises – but if you come up with interesting ideas I’ll see what I can do).
January 28, 2013
Losing it
The example I gave last week showing how a SORT operation in an execution plan might include the work of resolving function calls in your SQL and might, therefore, be reporting much higher resource utilisation than expected reminded me of some problems I’ve had with gaps in execution plans in the past. So I thought I’d give a little demonstration of the way in which the completeness of execution plans can develop over time.
We’ll start with the same two tables I had in last week’s demo.
create table t1 as select rownum id, lpad(rownum,200) padding from all_objects where rownum <= 2500 ; create table t2 as select * from t1 ; begin dbms_stats.gather_table_stats( ownname => user, tabname =>'T1', method_opt => 'for all columns size 1' ); dbms_stats.gather_table_stats( ownname => user, tabname =>'T2', method_opt => 'for all columns size 1' ); end; /
Once we have the tables and stats, we can start running a few very simple queries – I have a sequence of three queries to demonstrate, showing the lovely progression of history:
select (select max(t1.id) from t1 where t1.id <= t2.id) id from t2 ; select id from t1 minus select (select max(t1.id) from t1 where t1.id <= t2.id) id from t2 ; select case mod(id,2) when 1 then (select max(t1.id) from t1 where t1.id <= t2.id) when 0 then (select max(t1.id) from t1 where t1.id >= t2.id) end id from t2 ;
The first, and simplest query, demonstrates 8i failing to produce the right plan – but 9i handles it correctly:
Plan from 8.1.7.4
Id Par Pos Ins Plan
---- ---- ---- ---- ----------------------------------------------------------------------
0 3 SELECT STATEMENT (all_rows) Cost (3,2500,10000)
1 0 1 2 TABLE ACCESS (analyzed) TEST_USER T2 (full) Cost (3,2500,10000)
Plan from 9.2.0.8
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2500 | 10000 | 12 (9)|
| 1 | SORT AGGREGATE | | 1 | 4 | |
|* 2 | TABLE ACCESS FULL | T1 | 125 | 500 | 12 (9)|
| 3 | TABLE ACCESS FULL | T2 | 2500 | 10000 | 12 (9)|
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T1"."ID"<=:B1)
The second query, including a section of SQL that 9i handled properly, shows an incomplete plan in 9i and 10g, but gets a complete plan in 11g. (The row estimate of 125 in line 5 of the second plan is the usual 5% estimate for a range-based predicate against an unknown value: 5% of 2,500 is 125).
Plan from 9.2.0.8 / 10.2.0.5
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2500 | 20000 | 26 (58)|
| 1 | MINUS | | | | |
| 2 | SORT UNIQUE | | 2500 | 10000 | 13 (16)|
| 3 | TABLE ACCESS FULL | T1 | 2500 | 10000 | 12 (9)|
| 4 | SORT UNIQUE | | 2500 | 10000 | 13 (16)|
| 5 | TABLE ACCESS FULL | T2 | 2500 | 10000 | 12 (9)|
-------------------------------------------------------------------------
Plan from 11.1.0.7
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2500 | 20000 | 25 (56)| 00:00:01 |
| 1 | MINUS | | | | | |
| 2 | SORT UNIQUE | | 2500 | 10000 | 12 (9)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T1 | 2500 | 10000 | 11 (0)| 00:00:01 |
| 4 | SORT AGGREGATE | | 1 | 4 | | |
|* 5 | TABLE ACCESS FULL| T1 | 125 | 500 | 11 (0)| 00:00:01 |
| 6 | SORT UNIQUE | | 2500 | 10000 | 12 (9)| 00:00:01 |
| 7 | TABLE ACCESS FULL| T2 | 2500 | 10000 | 11 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter("T1"."ID"<=:B1)
But if we start to hide subqueries inside CASE operators (decodes would do the same), 11g starts to get it a little wrong, as shown by the third example. The depth column of the plan_table can be calculated incorrectly, in this case giving the impression that line 6 is a descendent of line 5. (Funnily enough, 9i gets this example right because the code to display the plan from the plan_table uses the old “connect by” query on id and parent_id rather than the calculated depth column.
Plan from 11.1.0.7 / 11.2.0.3
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2500 | 20000 | 25 (56)| 00:00:01 |
| 1 | MINUS | | | | | |
| 2 | SORT UNIQUE | | 2500 | 10000 | 12 (9)| 00:00:01 |
| 3 | TABLE ACCESS FULL | T1 | 2500 | 10000 | 11 (0)| 00:00:01 |
| 4 | SORT AGGREGATE | | 1 | 4 | | |
|* 5 | TABLE ACCESS FULL | T1 | 125 | 500 | 11 (0)| 00:00:01 |
| 6 | SORT AGGREGATE | | 1 | 4 | | |
|* 7 | TABLE ACCESS FULL| T1 | 125 | 500 | 11 (0)| 00:00:01 |
| 8 | SORT UNIQUE | | 2500 | 10000 | 12 (9)| 00:00:01 |
| 9 | TABLE ACCESS FULL | T2 | 2500 | 10000 | 11 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter("T1"."ID"<=:B1) 7 - filter("T1"."ID">=:B1)
You might note, by the way, that in all these examples the estimated COST of the plans is very misleading. The optimizer has made no attempt to allow for the cost of the repeated execution of the scalar subqueries. This doesn’t really matter, of course, for very simple queries like this, but it could make a big difference if something of this sort were embedded in the middle of a more complex statement.
Next time you have to unravel the execution plan for a complex query with scalar subqueries floating around the place – there may be bits of the plan that you can’t see, or that aren’t doing quite what you think you’re being told. When interpretation gets tough make sure you track through the query and the plan to see if the plan is likely to be a complete and truthful representation of what the statement has to do.
January 25, 2013
Sorting
Here’s a little quirk of execution plans that came up recently on the Oak Table network. If you call a function in a query, and do some sorting with the results, where does the work of calling the function get reported in the execution plan if you trace the query or look at the in-memory rowsource execution stats. Let’s take a look at a simple example:
create table t1 as select rownum id, lpad(rownum,200) padding from all_objects where rownum <= 2500 ; create table t2 as select * from t1 ; -- collect stats create or replace function f (i_target in number) return number as m_target number; begin select max(id) into m_target from t1 where id <= i_target; return m_target; end; /
So I’ve got two tables with exactly the same data and a function that will do a full tablescan of t1 (which is going to be 75 blocks) and return the original input (assuming the input was between 1 and 2,500). Here’s the query I want to run (and it will return no rows), followed by the base execution plan.
select /*+ gather_plan_statistics */ id from t1 minus select f(id) from t2 ; select * from table(dbms_xplan.display_cursor(null,null,'basic +rows')); -------------------------------------------- | Id | Operation | Name | Rows | -------------------------------------------- | 0 | SELECT STATEMENT | | | | 1 | MINUS | | | | 2 | SORT UNIQUE | | 2500 | | 3 | TABLE ACCESS FULL| T1 | 2500 | | 4 | SORT UNIQUE | | 2500 | | 5 | TABLE ACCESS FULL| T2 | 2500 | --------------------------------------------
Because of the call to f() in the select against t2, I’m going to call the function 2,500 times and incur a load of buffer reads (2,500 * 75) doing so. Where will those buffer gets and the attendant CPU appear in the plan ? This example is by no means an exhaustive analysis of all the possible options when you include functions in your select list, but in this particular case the function isn’t called until we run the SORT UNIQUE operation at line 4:
select * from table(dbms_xplan.display_cursor(null,null,'allstats last projection'));
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.54 | 187K| | | |
| 1 | MINUS | | 1 | | 0 |00:00:00.54 | 187K| | | |
| 2 | SORT UNIQUE | | 1 | 2500 | 2500 |00:00:00.01 | 75 | 70656 | 70656 |63488 (0)|
| 3 | TABLE ACCESS FULL| T1 | 1 | 2500 | 2500 |00:00:00.01 | 75 | | | |
| 4 | SORT UNIQUE | | 1 | 2500 | 2500 |00:00:00.54 | 187K| 70656 | 70656 |63488 (0)|
| 5 | TABLE ACCESS FULL| T2 | 1 | 2500 | 2500 |00:00:00.01 | 75 | | | |
-----------------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - STRDEF[22]
2 - (#keys=1) "ID"[NUMBER,22]
3 - "ID"[NUMBER,22]
4 - (#keys=1) "F"("ID")[22]
5 - "ID"[NUMBER,22]
The 187K buffer gets for the function calls (and any other resources) appear at line 4 of the plan, when you might have expected them to appear in line5 as part of the tablescan. This observation can be confirmed by checking the column projection information – the output from line 5 is the “ID”, the output from line 4 includes “F”(“ID”). I don’t often look at the projection information, but it’s nice to know that sometimes it can give you some ideas of what’s going on when the row source execution stats don’t seem to be what you were expecting.
Footnote: The same sort of effect appears with scalar subqueries in the select lists, although in any recent versions of Oracle the plan for the scalar subquery appears in the main plan (although sometimes in a counter-intuitive position) and will give you a much better idea of where and why the work is being done.
January 24, 2013
Usage Stats
The video of the online chat that I had with Grant Fritchey about statistics in SQL Server and Oracle is now online. It went pretty well – according to the stats 537 people attended, although the peak concurrency I noticed was only 467 – of which there were still over 400 after an hour and a quarter.
At the start of the event, James Murtagh put up a quick poll to see how many attendees used SQL Server, how many used Oracle, and how many used both. There’s a degree of bias in the results, no doubt due to the self-selecting nature of the event, but I thought the results were interesting:
There were several questions outstanding by the end of the webinar, so James has sent a copy of them to both of us, and I’ve tried to give short answers below – but only for the questions that appear to be related to Oracle.
I’ve left the questions in their original form, but please note that the questions had to be typed but the audience in real-time into a tiny box as they were trying to listen to what we were saying – so some of the grammar and spelling is likely to be a little garbled.
Different Does Copy Stats Affect The Optimizer from the Collection of Stats in Oracle? Does Partition, Subpartition Level Copy Has Different Affects in Oracle?
In principle, the stats are just a set of numbers, that the optimizer reads and processes, it doesn’t care how they got into the data dictionary. There are a couple of details about partition stats, though, that give Oracle a clue about what to do when accumulating stats the next time you run dbms_stats() against the object. For some useful information about partitions and manipulating stats the best reference I have is still a set of articles by Doug Burns, which I’ve collated here with a few from other authors.
If i say 10% sample for a table, what data is considered for sampling, the existing data or the new one? I mean if the table is highly dml intensive, which data will be considered for sampling?
I guess this could be aimed at either Oracle or SQL Server. From the Oracle perspective – the sample is randomly selected across the full size of the table. You can choose to pick a sample of random rows, or a sample of random blocks.
So who are the Brent Ozar’s, Michelle Ufford’s, and Ola Hallengren’s of the Oracle world (and Grant Fritchey’s)? I’m aware of Ask Tom, but there seems to be far fewer people in the Oracle world who share their knowledge like the people I mentioned above. It looks like Red Gate is trying to help is this area, and it is greatly appreciated!
There seem to be quite a lot – being sure you’ve found one that actually produces correct and appropriate information is the problem, of course. You could do worse than start with the people on my blog roll, and bloggers who are members of the Oak Table network (whose aggregator is also listed on my blog roll.)
I think Red Gate is doing a very good job too – particularly in the way it’s trying to pull developers and dbas closer together, as well as pulling Oracle and SQL Server specialists together.
How does Oracle maintain stats on multiple columns as in correlated statistics? In SQL Server we have to create these manually depending on certain cases to give the optimizer better information with cardinality.
Oracle 11g introduced a feature known as “extended stats” which allows you to create stats on expressions and on column groups. Once you’ve created such an object it’s associated with a table as if it were a virtual column of that table, and any time you refresh the table stats the extended stats are refreshed at the same time. One of the suggestions I’ve made for this feature is to use it to create stats on the leading column combinations of indexes – which SQL Server does automatically.
How about collecting statistics for temporary tables and for collections?
Oracle 11g can use dynamic sampling to get some statistical information for a collection – with the usual limitation that the sampling is only done the first time a statement is optimised. Gathering stats for global temporary tabls (GTTs) is a bit of a problem because different people may use the same GTT for remarkably different quantities of data. Again, dynamic sampling could be used (and will be used by default from 10g onwards), but you could write some stats into the data dictionary for the GTT by creating a permanent table, populating it, collecting stats, then using the dbms_stats export and import function to copy the stats from the permanent table to the GTT. (If the GTT is declared as “on commit preserve rows” you can just load it with representative data then collect stats – but “on commit preserve” is less commonly used.)
Extended stats are great, but is there a way to determine which columns would benefit from extended stats? Most developers, or even users, don’t know, so would be nice to have (if you’ll forgive the pun) a statistical way of determining this.
Oracle recently came out with an advisor for column groups. There’s an article about it on the Optimizer Group blog.
Jonathan, what is the scale of clustering index in Oracle? Upper and lower limits?
Roughly speaking (and it is possible to created boundary cases). the clustering_factor for a B-tree index will be a value between the number of blocks in the table and the number of rows in the table. If the value is close to the number of blocks the optimizer thinks that the data is very well clustered and will favour using the index for relatively large numbers of rows, if the value is close to the number of rows in the table the optimizer thinks the data is very widely scattered and could ignore the index for even a relatively small number of rows.
The value is not intuitively meaningful, so Oracle also presents a transformed version of the clustering_factor in the form of the avg_data_blocks_per_key – which is the typical number of data block visits Oracle thinks it will have to make to collect all the rows for a typical key value.
I often see the bind problem together with histograms, which I see to often in OLTP database. I recommend not using histogram until otherwise proved. Do you agree?
Yes. Can be expensive to generate accurately, and can introduce instability when they are perfect representations of the data that was present when the histogram was created. It’s best to recognise the few cases where they can be very helpful then make sure that the application and the server (developer and DBA) co-operate to get the best effects from them.
On Oracle, when talking about partitioning, you have something called global statistics. This global statistics you can generate with DBMS_STAT, or you could have Oracle calculate it by using the partition statistics. What do you recommend Jonathan?
Oracle 11g has introduce the “approximate NDV (number of distinct values)” mechanism, and with that allows global stats (specifically the global NDV) to be calculatede incrementally for partitioned tables. This can be very helpful for systems that aren’t too comples, and don’t have too many partitions. As a general rule, though, I still think that any site that does something complex should probably work out a programmatic way of constructing stats, rather than letting Oracle gather them on partitioned tables.
Jonathan: Do you have an article that describes how you define your “perfect” histogram?
Yes – here’s one example.
How much improvement we would have to the queries when the system stats in Oracle is updated?
This is one of those questions which is impossible to answer. Essentially the introduction of system stats (when combined, if necessary, with removing the parameter db_file_multiblock_read_count from the parameter file) should give Oracle a better balance between indexed access paths with nested loops and tablescan paths with hash joins. However, as with any global change to the optimizer, some people are bound to find that a few things change for the worse.
I have a tongue-in-cheek guideline: whatever you do, the optimizer will be okay 99% of the time: when you fiddle with any parameters, or things like system stats, you’re just changing the 1% that you have to fix.
Oracle apex tracks session state in tables within its schema (eg wwv_flow_collection_members$) the data in these tables is extremely volatile – how frequently would you recommend collecting stats on tables like these where the contents can change completely over the course of a day eg from zero to hundreds of thousands of rows ? clearly an overnight gather wont help !
This is an excellent question. Every DBA and developer should be aware of tables in their systems that behave this way, because these are the tables where ANY automatic strategy for collecting stats is likely to cause problems. People often forget that your statistics are often wrong within a few seconds of collecting them – and in special cases the effect can be catastrophic.
The generic answer to this type of problem is to create statistics that tell the optimizer what you want it to think the data looks like. Alternatively, you have to include complete execution paths (through hints, or SQL Plan Baselines) in the SQL for the queries that are likely to be unstable. The target is a plan that is always “good enough” irrespective of the state of the data.
In SQL Server, if there is query with where clause for many columns,it uses all possible non clustered indexes.Is it true that oracle uses only one index per table unless there is bit map index
No. Oracle used to have an “AND_EQUAL” access path into a table to combine indexes, but since 9i that has been deprecated in favour of the “INDEX_COMBINE” access path, which combines index range scans on b-trees by converting the rowids produced from each index range scan into a string of bits, combining the bits, then converting back to rowids. There is no limit on the number of indexes that can be combined in this way.
(Oracle also has an “INDEX_JOIN” path, which allows it to combine results from as many indexes as necessary to create a result set without visiting the table at all).
Can we collect statistics on Temporary tables in Oracle?
See above – only if they are defined as ‘on commit preserve rows’, because the first step of stats collection is a commit – which leaves the table empty if it’s declared as ‘on commit delete rows’. (The word “delete” is a bad choice in this case, Oracle does delete the rows, it simply forgets the data segment)
Hello, How much consider the state of the network for the query optimization in distribuited databases? And what stadistics consider?
The Oracle optimizer does have a small component of cost allocated to the time it takes to move data across a network in a distributed query. I believe this feature appeared in 9i, and only if you enabled “CPU Costing” (aka System Stats). I haven’t looked into it very closely, but I think Oracle estimates the number of round trips it will take to move the data that has to be moved, and allows a small unit of time per round trip.
Question for Jonathan: From your experience what have parameter option for METHOD_OPT do feel has given you the best stats for performance?
I’m always in favour of doing something that is simple and “good enough” for most of the time – and then introducing a small amount of complexity where necessary. This, in part, is why I avoid histograms until I can show that they are needed, and then write code to create the right histogram. In terms of method_opt, this means “for all columns size 1″. (In 11g I would also let estimate_percent default to “auto_sample_size” because that allows the new, fast, accurate, “approximate NDV” to do its job.
After collecting state with this method_opt, I would run the code that rebuilt any histogram that had been overwritten by the most recent collection.
What sample % does the oracle daily job uses?
By default 10%, but in 11g you can change this at the database, schema, or object level using one of the “set_xxx_prefs” calls.
Jonathan, regarding the clustering factor: besides the obvious impact it has on physical IO, can’t it also have an influence on how much the query can benefit from pinned buffers. I remember that Oracle may keep buffers pinned between single buffer visits, but the number of buffers it may keep pinned is limited
The clustering_factor itself has no effect – but you’re correct in thinking that the order in which Oracle visits table blocks can make a difference to the amount of work done, even if all the blocks are buffered. This is, perhaps, one of the last steps in physically optimising the placement of your data that you might take to squeeze the maximum possible performance out of a system.
Taking a simplified example, if you had 10 rows scattered across 2 blocks A and B, and the index (for some reason) made you visit the blocks in the order ABABABABAB this would require more work than visiting them in the order AAAAABBBBB; in the first order Oracle would have to get and release each block 5 times, in the second case it would only have to get and release each block once, keeping it pinned for the other 4 rows.
Of course, the clustering_factor of the index that allows Oracle to do AAAAABBBBB will (almost certainly) be smaller than the clustering_factor of the index that allows Oracle to do ABABABABAB – but I would view the clustering_factor is an effect, not the cause.
Data compression is very popular nowadays in Oracle, compress data to save space in large databases, x3, x10, even x30 compression ratio. Does Oracle keep statistics if a row is compressed or not? does it matter to the optimizer.
Data compression doesn’t change the “logic” of statistics. From Oracle’s perspective you still have tables with rows and columns and indexes. The only difference is that you seem to have more rows per table. (There’s a five-part series on compression coming out on AllThingsOracle if you want to read up about the effects it can have.)
How to determine optimal block size in Oracle
There’s a simple answer to that question – if you need to ask, stick with the default (8KB on most platforms). There are few special cases where a different block size may give you a little performance edge, or where it’s worth mixing block sizes in a single database – but you really have to know what your application looks like to be sure it’s worth changing – and you could be in a position where you’re running in an environment that has had less field-testing than the default.
How do you ‘tell oracle to treat this col in this table as if it has this data volume, density etc’?
How do you override Oracle’s ‘incorrect’ scatter assumptions
The package dbms_stats has called to set_column_stats(), set_table_stats() and set_index_stats(). (See above for the reference to faking a histogram).
Is the process of collecting statistics (Oracle) always the same on each plattform (Unix or Windows) or exists differences?
It’s the same on both platforms.
Would scan density in SQLServer be analagous to cluster factor in Oracle?
I don’t think so. I believe scan density tells you how much data you have to pick up, while clustering_factor tries to tell you something about how much work it’s going to be to pick up all the different bits.
What are histograms and how they are add on to the statistics in oracle, to get very good plan
A histogram is, in effect, a picture of the data distribution, and can be particularly useful for columns with a small number of values with an extremely uneven data distribution – for example a flag column on a trades table that shows that almost all trades are closed with a tiny number that are in different state. By default Oracle will consider any columns that have been used in a where clause as a possible candidate for a histogram if you allow it to do its default collection – and this is often overkill.
Oracle Auto Stats runs and gather stats when data changes 10%. I have a large table that gets a small amount of rows added each day. We had two instances 10 months apart where the difference between actual row count and row count reported by statistics from a few weeks ago was 5-7 % and stats were not gathered. The execution plan went from using indexes to full table scan. Needed to manually gather stats to have optimizer use index again. How can I predict when to gather stats again?
There is no generic answer to this question. You have to understand the data and what it means to the application. In cases like this where the table is very large you may find that all you really need to do set use the set_column_stats() procedure to tell Oracle that the high-value for a date, or sequence-based, column has increased.
If you want to investigate in detail you could check the execution plan from day to day for a few weeks and watch the cardinality prediction changing as the query is reoptimised over time. This may give you a clue about why the plan can change, and how much the data has to change before you hit a break point.
I’ve seen a situation where parameter sniffing can cause problems with SQL Server mis-identifying the appropriate stats for any given query. We’ve solved this in the past by using techniques like establishing a local variable in a stored procedure and setting that value to the value of the parameter passed into the stored proc. Does Oracle have this same type of problem (parameter sniffing causing issues with assessing the proper stats)?
It’s called “bind variable peeking” in Oracle, and it does cause problems of stability. The type of workaround you’ve described wouldn’t quite work because Oracle would still see a bind variable. However, if you created a deterministic function to return the value then the optimizer wouldn’t be able to see the value and would have to use one of it’s fixed value estimates for selectivity.
The feature of SQL Server I’d like to see in Oracle for (at least some of the) cases like this is the hint that allows you to say: “optimize this query as if I’d supplied the value X”.
In context of oracle, do you perceive some danger in the common practice of “recalc every night”. Related to that can you comment on the changs in optimizer behaviour when histograms have been created.
Recalc (everything) every night can introduce instability – especially since Oracle will take automatic sample sizes. With 11g and the “approximate NDV” the instability and workload is likely to be reduced – provided you’re not collecting histograms. See further up the page for my comments on histograms.
Does each column have stats? or if a column is only used in a SELECT, not part of any index or WHERE, GROUP, etc., will it not have stats maintained?
In Oracle every column gets stats by default. Even if a column is not used in the where clause the number of non-null values and the length of the column content can make a difference to the work needed if Oracle wants to do a hash join or sort merge join – and column stats allow it to work out the total volume of data that needs to be sorted or hashed.
How do you mange/collect statistics for a new database application that is just launched and most of the transaction tables don’t have any data? Especially say for a trade or auction application… and there is a storm of activities on day one when the application is launched.
If there’s not much data then it might be quite quick and cheap the first few days to collect stats. If necessary (in 11g) you could collect “pending” stats a couple of times during the day, then switch them all from pending to active to minimize the time when you had a mixture of old new stats.
Sounds like this should be a weekly conversation for awhile
I agree – if only we could find the time – there’s lots we could talk about.
Can you address dynamic sampling in Oracle?
Big topic. Basically, when Oracle decides that certain rules apply it will sample some blocks from some of the tables in a query to see what the data looks like – and may then use the results of the sample to work out the execution plan. On the positive side this means that where there are multiple filter or join predicates on a table, or where Oracle has had to guess (e.g. because you have “function(col) = constant” then the optimizer has better information. On the minus side – the sampling only takes place once when the query is optimised – so an unlucky set of inputs may produce a plan that is bad for everyone else that wants to run that query.
Does it make sense to gather stats on oracle table after rebuilding one of the indexes of the same table
No
Jonathan what are your thoughts on Direct Path Reads in 11g
Outside the scope of the discussion. Any change will help some people and cause other people problems. So the most important thing is to be aware that it’s an option that the run-time engine (not the optimizer) might take.
Compression
Red Gate have asked me to write a few articles for their Oracle site, so I’ve sent them a short series on “traditional” compression in Oracle – which means I won’t be mentioning Exadata hybrid columnar compression (HCC a.k.a. EHCC). There will be five articles, published at the rate of one per week starting Tuesday (15th Jan). I’ll be supplying links for them as they are published.
January 22, 2013
Statistics
Updated 22nd Jan
I’ve previously advertised the fact that the latest online discussion that Red Gate has arranged for me to have with Grant Fritchey will be on 23rd Jan and we will be talking about statistics. If you’ve listened in to any of these talks in the past you’ll realise that they are completely unscripted; what you get is a couple of guys in a (virtual) pub comparing and contrasting their favourite database engines and trying to learn a little bit about how the other technology works.
It’s not completely unprepared, though; we’ve usually exchanged a couple of notes with bullet points about the ideas that might come up, and some sort of progression. That doesn’t mean that we’ll hit all the bullet points, or follow the order, or not end up talking about something completely unexpected, but it does mean that James Murtagh (our chairman for the event) has a few topics up his sleeve that he can prod us with if he thinks we are going too far off topic.
If you want some idea of how the conversation might go this time around, here’s a copy of an email I’ve just sent to James and Grant:
A couple of ideas for focal points for Wednesday
- Why do we need statistics
- What statistics are (or could be ) created
- How automatically (if at all)
- What does it cost
- How are statistics used
- What statistics do we need that aren’t collected
- What do we do if the statistics are missing
A few briefing comments for Oracle
a) The key reason for collecting stats is to allow the optimizer to work out how much data it will acquire as it steps through an execution plan, and how scattered that data is. The “clustering_factor” of an index is one of the most significant things we can learn about an index and the “num_distinct” for a column combined with the “num_rows” for a table.
b) Oracle doesn’t collect statistics automatically as the data changes (there is a small change coming in 12c) – but there is a default overnight job that checks for objects with stale statistics and refreshes them automatically – this can actually cause problems. There is also a package which allows us to be very flexible about collecting statistics, or even inventing statistics.
c) To handle skewed data distribution you can create a histogram of up to 254 buckets to describe the contents of a column.
d) Recent innovations (11g) allow Oracle to be very efficient at getting accurate num_distinct for a column – but histograms are very expensive to gather if you use a large sample, and potentially very inaccurate if you use a small sample (12c has introduced some great enhancements here in terms of performance of collection and quality of histogram.)
e) Recently we were allowed to create virtual columns and statistics on virtual columns. We can also create statistics on groups of columns in the same table. We can’t create statistics across tables – even though we can (for example) create bitmap join indexes. This would be very helpful – if the stats could be kept reasonably accurate fairly cheaply.
f) For indexes the only column-related stats we collect are the number of distinct keys – if we want to get stats on partial keys we have to manual define “column groups”.
g) If stats on a table are missing, Oracle can sample some blocks in the table to see what the data looks like – the sample tends to be small. We can change the sampling rules to deal with various problems that arise if we have functions applied to columns in where clauses, or need statistics about correlated columns.
h) We have tools and mechanisms that allow us to associate “statistical corrections” to queries. Technically we can create these by hand – although the method is undocumented and unsupported – officially we need to license the Diagnostic Pack and Performance Pack to let Oracle analyze the statement thoroughly and produce the corrections.
i) There are cases where the optimizer simply guesses because there is no way to produce a justifiable estimate of statistics behaviour across tables – e.g. select where exists (subquery) assumes a 1% “survival rate”. Select where table1_column > table2_column assumes a 5% survival rate on the Cartesian join.
Addendum
While I’m in Webinar advertising mode – Kyle Hailey has just published a blog item about a webinar the following Wednesday (30th Jan) about Database Virtualization with Delphix. They’ve got a product which could make it feasible for developers to do their coding and test on a full-sized, private, copy of a production database without having to create a copy of the production database for each developer. I’m quite like to get my hands on the product – although I don’t have the hardware to test it properly – but I thought I’d start by viewing the webinar, and maybe asking a few questions. There’s a stack of whitepapers and various technical documents available here if you think the strategy looks interesting.
Addendum 2 (22nd Jan)
Another webinar for next week – we’re getting spoiled for choice nowadays, but luckily it’s 29th Jan, so not colliding with mine or Kyle’s. This webinar comes from Karen Morton (fellow member of the Oak Table) through Embarcadero and is also about statistics – but it’s pure Oracle, and all about what the optimizer does with the statistics.
Another update for today: Grant Fritchey has published an article on Simple Talk (the SQL Server site managed by Red Gate) covering some features of Statistics in SQL Server, so we’ll probably spend some time browsing through some of the topics there in our comparison between Oracle and SQL Server.
January 17, 2013
dbms_xplan bug
Here’s a very long post (which is mainly an example) demonstrating a little bug in the “explain plan” functionality. It’s a variation of a bug which I thought had been fixed in 11g, but it still appears in some cases. Take a look at this execution plan, which comes from explaining “select * from dba_tab_cols” – the bit I want to emphasise is in lines 1 to 10:
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 21704 | | 2387 (1)| 00:00:29 |
| 1 | TABLE ACCESS BY INDEX ROWID | OBJ$ | 480 | 14400 | | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | I_OBJ1 | 192 | | | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID | OBJ$ | 480 | 14400 | | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | I_OBJ1 | 192 | | | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID | OBJ$ | 480 | 14400 | | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | I_OBJ1 | 192 | | | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID | OBJ$ | 480 | 14400 | | 3 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | I_OBJ1 | 192 | | | 2 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID| OBJ$ | 480 | 14400 | | 3 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | I_OBJ1 | 192 | | | 2 (0)| 00:00:01 |
|* 11 | HASH JOIN | | 480 | 1929K| | 5 (20)| 00:00:01 |
|* 12 | INDEX RANGE SCAN | I_OBJ1 | 480 | 12480 | | 2 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | USER$ | 59 | 117K| | 2 (0)| 00:00:01 |
| 14 | NESTED LOOPS OUTER | | 1 | 2084 | | 3 (0)| 00:00:01 |
| 15 | TABLE ACCESS BY INDEX ROWID | COL$ | 1 | 56 | | 2 (0)| 00:00:01 |
|* 16 | INDEX UNIQUE SCAN | I_COL3 | 1 | | | 1 (0)| 00:00:01 |
| 17 | TABLE ACCESS BY INDEX ROWID | ATTRCOL$ | 1 | 2028 | | 1 (0)| 00:00:01 |
|* 18 | INDEX UNIQUE SCAN | I_ATTRCOL1 | 1 | | | 0 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | ATTRCOL$ | 1 | 2028 | | 2 (0)| 00:00:01 |
|* 20 | INDEX UNIQUE SCAN | I_ATTRCOL1 | 1 | | | 1 (0)| 00:00:01 |
|* 21 | FILTER | | | | | | |
|* 22 | HASH JOIN RIGHT OUTER | | 72962 | 188M| | 2387 (1)| 00:00:29 |
| 23 | TABLE ACCESS FULL | USER$ | 59 | 118K| | 3 (0)| 00:00:01 |
|* 24 | HASH JOIN RIGHT OUTER | | 72962 | 45M| | 2384 (1)| 00:00:29 |
|* 25 | TABLE ACCESS FULL | OBJ$ | 583 | 30899 | | 205 (0)| 00:00:03 |
|* 26 | HASH JOIN RIGHT OUTER | | 72962 | 41M| | 2178 (1)| 00:00:27 |
| 27 | TABLE ACCESS FULL | COLTYPE$ | 2886 | 174K| | 275 (0)| 00:00:04 |
|* 28 | HASH JOIN | | 72962 | 37M| | 1902 (1)| 00:00:23 |
| 29 | TABLE ACCESS FULL | USER$ | 59 | 1770 | | 3 (0)| 00:00:01 |
|* 30 | HASH JOIN | | 72962 | 35M| | 1899 (1)| 00:00:23 |
| 31 | INDEX FAST FULL SCAN | I_USER2 | 59 | 3068 | | 2 (0)| 00:00:01 |
|* 32 | HASH JOIN RIGHT OUTER | | 72962 | 31M| 2976K| 1896 (1)| 00:00:23 |
| 33 | TABLE ACCESS FULL | HIST_HEAD$ | 16280 | 2782K| | 70 (0)| 00:00:01 |
| 34 | NESTED LOOPS | | 72962 | 19M| | 660 (1)| 00:00:08 |
| 35 | TABLE ACCESS FULL | OBJ$ | 47960 | 3840K| | 205 (0)| 00:00:03 |
| 36 | TABLE ACCESS CLUSTER | COL$ | 2 | 404 | | 1 (0)| 00:00:01 |
|* 37 | INDEX UNIQUE SCAN | I_OBJ# | 1 | | | 0 (0)| 00:00:01 |
|* 38 | TABLE ACCESS CLUSTER | TAB$ | 1 | 26 | | 2 (0)| 00:00:01 |
|* 39 | INDEX UNIQUE SCAN | I_OBJ# | 1 | | | 1 (0)| 00:00:01 |
| 40 | NESTED LOOPS | | 1 | 78 | | 3 (0)| 00:00:01 |
|* 41 | INDEX RANGE SCAN | I_OBJ4 | 1 | 39 | | 2 (0)| 00:00:01 |
|* 42 | INDEX RANGE SCAN | I_USER2 | 1 | 39 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("O"."OBJ#"=:B1)
4 - access("O"."OBJ#"=:B1)
6 - access("O"."OBJ#"=:B1)
8 - access("O"."OBJ#"=:B1)
10 - access("O"."OBJ#"=:B1)
11 - access("O"."OWNER#"="USER#")
12 - access("O"."OBJ#"=:B1)
16 - access("CL"."OBJ#"=:B1 AND "CL"."INTCOL#"=:B2-1)
18 - access("RC"."OBJ#"(+)=:B1 AND "RC"."INTCOL#"(+)=:B2-1)
filter("CL"."INTCOL#"="RC"."INTCOL#"(+))
20 - access("TC"."OBJ#"=:B1 AND "TC"."INTCOL#"=:B2)
21 - filter((("O"."TYPE#"=3 OR "O"."TYPE#"=4) OR "O"."TYPE#"=2 AND NOT EXISTS (SELECT 0
FROM "SYS"."TAB$" "T" WHERE "T"."OBJ#"=:B1 AND (BITAND("T"."PROPERTY",512)=512 OR
BITAND("T"."PROPERTY",8192)=8192))) AND ("O"."TYPE#"<>4 AND "O"."TYPE#"<>5 AND "O"."TYPE#"<>7
AND "O"."TYPE#"<>8 AND "O"."TYPE#"<>9 AND "O"."TYPE#"<>10 AND "O"."TYPE#"<>11 AND
"O"."TYPE#"<>12 AND "O"."TYPE#"<>13 AND "O"."TYPE#"<>14 AND "O"."TYPE#"<>22 AND
"O"."TYPE#"<>87 AND "O"."TYPE#"<>88 OR BITAND("U"."SPARE1",16)=0 OR ("O"."TYPE#"=4 OR
"O"."TYPE#"=5 OR "O"."TYPE#"=7 OR "O"."TYPE#"=8 OR "O"."TYPE#"=9 OR "O"."TYPE#"=10 OR
"O"."TYPE#"=11 OR "O"."TYPE#"=12 OR "O"."TYPE#"=13 OR "O"."TYPE#"=14 OR "O"."TYPE#"=22 OR
"O"."TYPE#"=87) AND (SYS_CONTEXT('userenv','current_edition_name')='ORA$BASE' AND
"U"."TYPE#"<>2 OR "U"."TYPE#"=2 AND "U"."SPARE2"=TO_NUMBER(SYS_CONTEXT('userenv','current_edit
ion_id')) OR EXISTS (SELECT 0 FROM SYS."USER$" "U2",SYS."OBJ$" "O2" WHERE "O2"."TYPE#"=88
AND "O2"."DATAOBJ#"=:B2 AND "U2"."TYPE#"=2 AND "O2"."OWNER#"="U2"."USER#" AND
"U2"."SPARE2"=TO_NUMBER(SYS_CONTEXT('userenv','current_edition_id'))))))
22 - access("OT"."OWNER#"="USER#"(+))
24 - access("AC"."TOID"="OT"."OID$"(+))
25 - filter("OT"."TYPE#"(+)=13)
26 - access("C"."OBJ#"="AC"."OBJ#"(+) AND "C"."INTCOL#"="AC"."INTCOL#"(+))
28 - access("O"."SPARE3"="U"."USER#")
30 - access("O"."OWNER#"="U"."USER#")
32 - access("C"."OBJ#"="H"."OBJ#"(+) AND "C"."INTCOL#"="H"."INTCOL#"(+))
37 - access("O"."OBJ#"="C"."OBJ#")
38 - filter(BITAND("T"."PROPERTY",512)=512 OR BITAND("T"."PROPERTY",8192)=8192)
39 - access("T"."OBJ#"=:B1)
41 - access("O2"."DATAOBJ#"=:B1 AND "O2"."TYPE#"=88)
42 - access("O2"."OWNER#"="U2"."USER#" AND "U2"."TYPE#"=2 AND
"U2"."SPARE2"=TO_NUMBER(SYS_CONTEXT('userenv','current_edition_id')))
filter("U2"."SPARE2"=TO_NUMBER(SYS_CONTEXT('userenv','current_edition_id')))
If you try reading this execution plan using the rule of thumb (ROT) “the first thing that happens is near the top and over to the right” you’ll run into problems because (a) that’s not a good guideline and (b) there are a number of scalar subqueries in the select list and some filter subqueries in the where clause confusing the issue. Here’s the definition of the view dba_tab_cols:
select u.name, o.name,
c.name,
decode(c.type#, 1, decode(c.charsetform, 2, 'NVARCHAR2', 'VARCHAR2'),
2, decode(c.scale, null,
decode(c.precision#, null, 'NUMBER', 'FLOAT'),
'NUMBER'),
8, 'LONG',
9, decode(c.charsetform, 2, 'NCHAR VARYING', 'VARCHAR'),
12, 'DATE',
23, 'RAW', 24, 'LONG RAW',
58, nvl2(ac.synobj#, (select o.name from obj$ o
where o.obj#=ac.synobj#), ot.name),
69, 'ROWID',
96, decode(c.charsetform, 2, 'NCHAR', 'CHAR'),
100, 'BINARY_FLOAT',
101, 'BINARY_DOUBLE',
105, 'MLSLABEL',
106, 'MLSLABEL',
111, nvl2(ac.synobj#, (select o.name from obj$ o
where o.obj#=ac.synobj#), ot.name),
112, decode(c.charsetform, 2, 'NCLOB', 'CLOB'),
113, 'BLOB', 114, 'BFILE', 115, 'CFILE',
121, nvl2(ac.synobj#, (select o.name from obj$ o
where o.obj#=ac.synobj#), ot.name),
122, nvl2(ac.synobj#, (select o.name from obj$ o
where o.obj#=ac.synobj#), ot.name),
123, nvl2(ac.synobj#, (select o.name from obj$ o
where o.obj#=ac.synobj#), ot.name),
178, 'TIME(' ||c.scale|| ')',
179, 'TIME(' ||c.scale|| ')' || ' WITH TIME ZONE',
180, 'TIMESTAMP(' ||c.scale|| ')',
181, 'TIMESTAMP(' ||c.scale|| ')' || ' WITH TIME ZONE',
231, 'TIMESTAMP(' ||c.scale|| ')' || ' WITH LOCAL TIME ZONE',
182, 'INTERVAL YEAR(' ||c.precision#||') TO MONTH',
183, 'INTERVAL DAY(' ||c.precision#||') TO SECOND(' ||
c.scale || ')',
208, 'UROWID',
'UNDEFINED'),
decode(c.type#, 111, 'REF'),
nvl2(ac.synobj#, (select u.name from "_BASE_USER" u, obj$ o
where o.owner#=u.user# and o.obj#=ac.synobj#),
ut.name),
c.length, c.precision#, c.scale,
decode(sign(c.null$),-1,'D', 0, 'Y', 'N'),
decode(c.col#, 0, to_number(null), c.col#), c.deflength,
c.default$, h.distcnt,
case when SYS_OP_DV_CHECK(o.name, o.owner#) = 1
then h.lowval
else null
end,
case when SYS_OP_DV_CHECK(o.name, o.owner#) = 1
then h.hival
else null
end,
h.density, h.null_cnt,
case when nvl(h.distcnt,0) = 0 then h.distcnt
when h.row_cnt = 0 then 1
when (h.bucket_cnt > 255
or
(h.bucket_cnt > h.distcnt
and h.row_cnt = h.distcnt
and h.density*h.bucket_cnt < 1))
then h.row_cnt
else h.bucket_cnt
end,
h.timestamp#, h.sample_size,
decode(c.charsetform, 1, 'CHAR_CS',
2, 'NCHAR_CS',
3, NLS_CHARSET_NAME(c.charsetid),
4, 'ARG:'||c.charsetid),
decode(c.charsetid, 0, to_number(NULL),
nls_charset_decl_len(c.length, c.charsetid)),
decode(bitand(h.spare2, 2), 2, 'YES', 'NO'),
decode(bitand(h.spare2, 1), 1, 'YES', 'NO'),
h.avgcln,
c.spare3,
decode(c.type#, 1, decode(bitand(c.property, 8388608), 0, 'B', 'C'),
96, decode(bitand(c.property, 8388608), 0, 'B', 'C'),
null),
decode(bitand(ac.flags, 128), 128, 'YES', 'NO'),
decode(o.status, 1, decode(bitand(ac.flags, 256), 256, 'NO', 'YES'),
decode(bitand(ac.flags, 2), 2, 'NO',
decode(bitand(ac.flags, 4), 4, 'NO',
decode(bitand(ac.flags, 8), 8, 'NO',
'N/A')))),
decode(c.property, 0, 'NO', decode(bitand(c.property, 32), 32, 'YES',
'NO')),
decode(c.property, 0, 'NO', decode(bitand(c.property, 8), 8, 'YES',
'NO')),
decode(c.segcol#, 0, to_number(null), c.segcol#), c.intcol#,
case when nvl(h.row_cnt,0) = 0 then 'NONE'
when (h.bucket_cnt > 255
or
(h.bucket_cnt > h.distcnt and h.row_cnt = h.distcnt
and h.density*h.bucket_cnt < 1))
then 'FREQUENCY'
else 'HEIGHT BALANCED'
end,
decode(bitand(c.property, 1024), 1024,
(select decode(bitand(cl.property, 1), 1, rc.name, cl.name)
from sys.col$ cl, attrcol$ rc where cl.intcol# = c.intcol#-1
and cl.obj# = c.obj# and c.obj# = rc.obj#(+) and
cl.intcol# = rc.intcol#(+)),
decode(bitand(c.property, 1), 0, c.name,
(select tc.name from sys.attrcol$ tc
where c.obj# = tc.obj# and c.intcol# = tc.intcol#)))
from sys.col$ c, sys."_CURRENT_EDITION_OBJ" o, sys.hist_head$ h, sys.user$ u,
sys.coltype$ ac, sys.obj$ ot, sys."_BASE_USER" ut
where o.obj# = c.obj#
and o.owner# = u.user#
and c.obj# = h.obj#(+) and c.intcol# = h.intcol#(+)
and c.obj# = ac.obj#(+) and c.intcol# = ac.intcol#(+)
and ac.toid = ot.oid$(+)
and ot.type#(+) = 13
and ot.owner# = ut.user#(+)
and (o.type# in (3, 4) /* cluster, view */
or
(o.type# = 2 /* tables, excluding iot - overflow and nested tables */
and
not exists (select null
from sys.tab$ t
where t.obj# = o.obj#
and (bitand(t.property, 512) = 512 or
bitand(t.property, 8192) = 8192))))
I’ve left the line numbers in for this one to make it easier to refer to bits of the code. Notice that there are several nvl2() calls in the select list that include the option to query table obj$ – these are at lines 11, 19, 23, 25 and 27; these are the five scalar subqueries represented by lines 1 to 10 of the execution plan, and given that they are all within the same decode() statement you can appreciate that the corresponding pairs of lines in the execution plan should all be at the same indentation rather than forming the steadily increasing cascade that they do. This is an error in the calculated depth column of the plan table – an error which I first mentioned a few years ago with an example from 10g adding a throwaway comment that I had found a similar bug in 11g – but I’d not got around to showing an example until today and this example from 11.2.0.3
Footnote:
As a little instruction in reading execution plans – the main section of the query starts at line 21, with lines 22 – 37 representing the list of tables in the “from” list – including two tables representing the view _CURRENT_EDITION_OBJ.
Lines 38 and 39 represent the “not exists” subquery against table tab$.
Lines 40 – 42 represent an existence subquery embedded in the definition of view _CURRENT_EDITION_OBJ.
Lines 11 – 13 represent the inline scalar subquery in the nvl2() call at line 40 of the view definition.
Lines 14 – 20 represent the scalar subqueries in the decode() call at line 99 of the view definition, and again we have a depth problem, because lines 19 and 20 represent the second scalar subquery in that decode and line 19 ought to be indented to the same level as line 14.
If you write some old-style code to query the plan table, using a connect by between the id and parent_id columns, you’ll find that the indentation is correct – here’s the output from the plan table when reported by the old utlxpls.sql script from Oracle 8i:
Plan Table -------------------------------------------------------------------------------- | Operation | Name | Rows | Bytes| Cost | Pstart| Pstop | -------------------------------------------------------------------------------- | SELECT STATEMENT | | 8 | 21K| 2387 | | | | TABLE ACCESS BY INDEX ROW|OBJ$ | 480 | 14K| 3 | | | | INDEX RANGE SCAN |I_OBJ1 | 192 | | 2 | | | | TABLE ACCESS BY INDEX ROW|OBJ$ | 480 | 14K| 3 | | | | INDEX RANGE SCAN |I_OBJ1 | 192 | | 2 | | | | TABLE ACCESS BY INDEX ROW|OBJ$ | 480 | 14K| 3 | | | | INDEX RANGE SCAN |I_OBJ1 | 192 | | 2 | | | | TABLE ACCESS BY INDEX ROW|OBJ$ | 480 | 14K| 3 | | | | INDEX RANGE SCAN |I_OBJ1 | 192 | | 2 | | | | TABLE ACCESS BY INDEX ROW|OBJ$ | 480 | 14K| 3 | | | | INDEX RANGE SCAN |I_OBJ1 | 192 | | 2 | | | | HASH JOIN | | 480 | 1M| 5 | | | | INDEX RANGE SCAN |I_OBJ1 | 480 | 12K| 2 | | | | TABLE ACCESS FULL |USER$ | 59 | 117K| 2 | | | | NESTED LOOPS OUTER | | 1 | 2K| 3 | | | | TABLE ACCESS BY INDEX RO|COL$ | 1 | 56 | 2 | | | | INDEX UNIQUE SCAN |I_COL3 | 1 | | 1 | | | | TABLE ACCESS BY INDEX RO|ATTRCOL$ | 1 | 1K| 1 | | | | INDEX UNIQUE SCAN |I_ATTRCOL | 1 | | 0 | | | | TABLE ACCESS BY INDEX ROW|ATTRCOL$ | 1 | 1K| 2 | | | | INDEX UNIQUE SCAN |I_ATTRCOL | 1 | | 1 | | | | FILTER | | | | | | | | HASH JOIN RIGHT OUTER | | 72K| 188M| 2387 | | | | TABLE ACCESS FULL |USER$ | 59 | 118K| 3 | | | | HASH JOIN RIGHT OUTER | | 72K| 45M| 2384 | | | | TABLE ACCESS FULL |OBJ$ | 583 | 30K| 205 | | | | HASH JOIN RIGHT OUTER | | 72K| 41M| 2178 | | | | TABLE ACCESS FULL |COLTYPE$ | 2K| 174K| 275 | | | | HASH JOIN | | 72K| 37M| 1902 | | | | TABLE ACCESS FULL |USER$ | 59 | 1K| 3 | | | | HASH JOIN | | 72K| 35M| 1899 | | | | INDEX FAST FULL SCA|I_USER2 | 59 | 2K| 2 | | | | HASH JOIN RIGHT OUT| | 72K| 31M| 1896 | | | | TABLE ACCESS FULL |HIST_HEAD | 16K| 2M| 70 | | | | NESTED LOOPS | | 72K| 19M| 660 | | | | TABLE ACCESS FULL|OBJ$ | 47K| 3M| 205 | | | | TABLE ACCESS CLUS|COL$ | 2 | 404 | 1 | | | | INDEX UNIQUE SCA|I_OBJ# | 1 | | 0 | | | | TABLE ACCESS CLUSTER |TAB$ | 1 | 26 | 2 | | | | INDEX UNIQUE SCAN |I_OBJ# | 1 | | 1 | | | | NESTED LOOPS | | 1 | 78 | 3 | | | | INDEX RANGE SCAN |I_OBJ4 | 1 | 39 | 2 | | | | INDEX RANGE SCAN |I_USER2 | 1 | 39 | 1 | | | --------------------------------------------------------------------------------
January 11, 2013
Quiz Night
Warning – this is a catch question, and I haven’t given you enough information to have any idea of the right answer; though, by telling you that I haven’t given you enough information to have any idea of the right answer, you now have some information that might help you to get closer to the right answer.
I have a simple heap table with no indexes. Immediately after flushing the buffer_cache I’ve run a query that looks ike this:
select max(column_ZZZ) from table_X;
The most significant session stats for this operation are as follows:
Name Value ---- --------- session logical reads 20,651 consistent gets 20,651 consistent gets direct 20,649 physical reads 655 physical reads cache 1 physical reads direct 654 Number of read IOs issued 6 no work - consistent read gets 20,649 table scan rows gotten 329,922 table scan blocks gotten 10,649 table fetch continued row 645 buffer is not pinned count 10,000
The instance is 11.2.0.3, so serial direct path reads have been used for the (necessary) tablescan. The tablespace is using 8KB blocks, 1MB uniform extent sizing, and manual (freelist) segment space management. The data in the table was created by a pl/sql loop of inserts with commits, there have been no updates, deletes, merges or rollbacks. Here’s the code (with one crtical detail hidden) that populated the table:
begin for i in 1..X loop insert into member(member_id, block_age_max_nbr) values (1,1); commit; end loop; end; / execute dbms_stats.gather_table_stats(user,'member',method_opt => 'for all columns size 1')
Roughly how many rows are there in the table ?
Update 13th Jan:
The answer is 10,000; the interesting observations are in this comment and the reply.
January 10, 2013
Over-indexing
This is the text of an article I published in the UKOUG magazine a few years ago, but it caught my eye while I was browsing through my knowledge base recently, and it’s still relevant. I haven’t altered the original apart from adding a couple of very brief comments in brackets [Ed: like this].
Over-indexing
One of the strengths of a relational database is that you should be able to throw any reasonable query (and even some unreasonable queries) at it and it will be able to return the right answer without being told how to navigate through the data.
There’s no guarantee, though, that you’ll get the answer quickly unless you’ve given the database some help by turning your logical model into a sensible physical implementation. Part of the physical implementation will be the choice of indexes – and this article reviews one of the commonest indexing issues that I see in OLTP systems
Costs and Benefits.
Why do we create indexes? There are three main reasons as far as Oracle is concerned. We need some indexes to help Oracle enforce uniqueness; we need some indexes for performance reasons to supply a high-precision access path to important data; and we may need some indexes to help Oracle enforce referential integrity constraints.
Theoretically, of course, we don’t need any indexes to ensure data correctness, so you could say that the only reason for having indexes is to improve performance. For example, you could enforce primary key constraints simply by locking tables on every insert, update or delete and then doing a tablescan to make sure that there is no conflicting data – but having an index “for” the primary key allows you to check just one block in the index without locking the table, so you get better performance and better concurrency. (In passing, an index that covers a primary key or unique constraint need not be a unique index, and the index definition need not be an exact match for the constraint – provided it starts with the relevant columns.)
So indexes can be a massive aid to performance and concurrency. But indexes have a price – you have to maintain them, usually in real-time, and that can cause problems. Roughly every two weeks [Ed: not quite so frequently in 2012], someone writes into the Oracle Forums with a question like: “I have a batch process that inserts 1 million rows of data into a table every night. But it’s very slow and I see lots of waits for ‘db file sequential read’ while it’s going on. Why is this happening?”
If you insert 1 million rows into a table you might have to create and (eventually) write something in the region of 25,000 table blocks (assuming a row size of about 200 bytes … 40 rows per block). If you have one index on that table then, in principle, you might have to do a random read to find an index leaf block for every single row you insert – and you may have to write the updated block out very soon afterwards to make space for the next random read. If the table is large and you have a couple of dozen indexes this “in principle” threat very soon becomes a real one.
There is a very important difference between (heap) tables and their indexes – a row can go into a table wherever the next empty space happens to be; a row has to go into an index in exactly the right place, and finding that place can be expensive. Things get worse if you are doing updates – you can update a row in the table “in situ”, but if you change the value of an indexed column you have to find an index entry for the old value, delete it, and then insert an index entry for the new value in the right place.
So don’t create indexes unless they really pay for their existence when you come to query the data – and when you do create an index make sure you maximise the payback.
The Commonest Error.
There are a number of subtle ways in which you can create too many indexes, but there is one error that is easy to spot – the foreign key index.
When you create a referential integrity constraint (foreign key) you do not need to create an index to support it and (unlike the unique constraint and primary key constraint) Oracle will not create such an index automatically. However, if you are going to update the primary key or unique key at the “parent” end of the referential integrity constraint Oracle will lock the “child” table to check for committed and uncommitted child rows if a suitable index does not exist. Consequently many people (and some application generators) automatically create a matching index for every foreign key they create.
Rule 1: don’t create “foreign key” indexes unless they are really necessary, or unless they are sufficiently useful that you would have created them any way even if there hadn’t been a referential integrity constraint to protect.
If you do have a foreign key constraint that needs an index, though, remember that the index does not have to be an exact match for the constraint – the critical feature is that it should start with the same columns (not necessarily in the same order) as the constraint definition. This means that you can always ‘add value’ to a foreign key index that might otherwise be just a technical overhead.
Rule 2: consider adding columns to foreign key indexes to make them useful to high precision queries.
Moreover, when you think about the parent/child relationship remember that you often see one parent row with several child rows – think orders/order_lines, or currency_codes/payments. In cases like this there may be an opportunity for saving quite a lot of space (and reducing the impact on the buffer cache) by compressing the index on some, or all, of the foreign key column(s).
Case Study.
Here’s a list of indexes (with a little camouflage) from a system I was looking at recently [Ed: some time in 2008]. It’s typical of the over-indexing problems that I see fairly frequently. I got this list by running a query similar to:
select
index_name, column_name
from
user_ind_columns
where
table_name = 'XXXXX'
order by
index_name, column_position
;
You could enhance this by query to use dba_ind_columns, of course, and add all sorts of extras by joining to dba_tab_columns, dba_indexes, and dba_ind_expressions to pick up details about mandatory columns, column types, uniqueness, index types, and functions involved in function-based indexes. In the case of my example, I used a slightly more sophisticated query to show the expressions used in function-based indexes – and here’s the list of indexes I found on one table:
ap_fun_fk_i ( fun_id ) ap_grp_fk_i ( grp_id ) ap_grp_fun_id_i( grp_id,fun_id ) ap_org_ap_i ( org_id,ap_id ) ap_org_fk_i ( org_id ) ap_per_ap_i ( per_id,ap_id ) ap_per_fk_i ( per_id ) ap_pk ( ap_id ) ap_ud_i ( trunc(update_date) )
This system had a convenient naming convention, PK in the name indicates the primary key, and FK indicates foreign keys. You can see immediately that we seem to have several redundant indexes as far as protecting foreign keys is concerned.
- AP_GRP_FK_I is made redundant by AP_GRP_FUN_ID_I,
- AP_ORG_FK_I is made redundant by AP_ORG_AP_I,
- AP_PER_FK_I is made redundant by AP_PER_AP_I.
On top of this, we might decide that since there are only a few organizations (org_id) we could benefit by compressing the index on (org_id, ap_id) on its first column. We might also compress the index on functions (fun_id) as well – although we might go one step further and decide that we will never delete functions or update the key values, and drop the index altogether.
Following the thought about compression, we could also consider compressing the index on (trunc(update_date)) – a fairly large amount of data gets updated each day, so the value is quite repetitive, similarly (grp_id, fun_id) is also fairly repetitive, so we might compress on both columns.
This leaves us with:
ap_grp_fun_id_i ( grp_id,fun_id ) compress 2 ap_org_ap_i ( org_id,ap_id ) compress 1 ap_per_ap_i ( per_id,ap_id ) ap_pk ( ap_id ) ap_ud_i ( trunc(update_date) ) compress 1
The reduction in the number of indexes is likely to reduce the undo and redo by about 40% as we insert (single row) data into this table, as well as reducing the possible I/O demand by the same sort of percentage. At the same time, the effect of the compression could (in this case) eliminate a further 10% to 15% of I/O related to this table because of the reduction in size of the indexes.
Further considerations.
The only remaining issue to consider is whether there is anything about the nature of our processing that suggests the need for housekeeping on these indexes – and the index on trunc(update_date) is an ideal candidate for consideration.
As we update data, we are going to delete entries from blocks in the left hand end of the index and insert them at the right hand end of the index. Depending on the rate and pattern of updates it is possible that a large number of blocks at the left hand end of the index will become close to empty – this could have a significant impact on the effectiveness of the buffer cache and might encourage us to use the coalesce command on the index every few days. (If this sounds familiar, I have written about this particular index in the past.)
Having thought about that, you might then wonder why we have an index on trunc(update_date) at all. A query that tries to find all rows updated on a certain date, or range of dates, may have to jump around all over the table to acquire a lot of data. Who wants to do this, and how often? It’s possible that the index exists for a particular report that runs just once per day – perhaps there is an argument for dropping this index as well.
Conclusion:
It’s very easy to create more indexes than you really need, and indexes can be very expensive to maintain. Sometimes you can spot “obviously” redundant indexes simply by inspecting index names and column names. A few reasonable guesses may allow you to drop some indexes and make others more efficient with a minimum of risk and effort.
January 7, 2013
Analysing Statspack 13
A recent (Jan 2013) post on the OTN database forum reported a performance problem on Oracle 9.2.0.6 (so no AWR), and posted a complete statspack report to one of the public file-sharing sites. It’s been some time since I did a quick run through the highlights of trouble-shooting with statspack, so I’ve picked out a few points from this one to comment on.
As usual, although this specific report is Statspack, the same analysis would have gone into looking at a more modern AWR report, although I will make a couple of comments at the end about the extra material that would have been available by default with the AWR report that would have helped us help the OP.
First, here are the Load Profile, and “Top 5″ parts of the report. To make sense of them, you also need to know that report covered a two-hour interval, and the machine had 12 CPUs; there’s also a little clue in the fact that this is 32-bit Oracle with a 600MB SGA (with the sga_max_size set to 1.6GB) running (alone) on a machine with 12GB of memory.
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 54,437.02 65,579.53
Logical reads: 2,053,336.52 2,473,626.15
Block changes: 383.39 461.86
Physical reads: 43,270.67 52,127.57
Physical writes: 46.63 56.18
User calls: 35.83 43.16
Parses: 25.53 30.76
Hard parses: 0.18 0.21
Sorts: 186.25 224.38
Logons: 0.08 0.10
Executes: 108.76 131.02
Transactions: 0.83
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 74,717 84.80
latch free 181,788 8,101 9.19
db file scattered read 47,546,136 2,893 3.28
db file sequential read 23,092,208 2,006 2.28
buffer busy waits 1,812,920 193 .22
-------------------------------------------------------------
Working from the top down, the first thing we notice is that 2M logical I/Os per second is quite busy, and that 43, 000 blocks read per second is pretty amazing – especially since the number of user calls and executes isn’t particularly large.
A cross check to the Top 5 then shows that a huge fraction of the work done by the database is CPU. At 2 hours and 12 cores we have 12 * 2 * 3600 = 86,400 CPU seconds available, and we’ve used 74,700 of them – and that’s into Cary Millsap’s “knee of the curve” for queueing problems. Since this isn’t the AWR, of course, we don’t have the OS Stats available, so we can’t see if the rest of the machines CPU is being used up outside the database – but looking at the db file read figures I’ll bet it is. If you have extreme CPU usage in an Oracle system, then latch contention problems are very likely to show up – and that’s our next highest figure – AWR would probably tells us which latch but, again based on the db file read figures, I’ll guess it’s the cache buffers chains latches.
So we look at the db file read figures and do the arithmetic that the AWR would show us to work out that we’re seeing multiblock read times of 0.06 milliseconds, and single block read times of 0.08 milliseconds. These have to be coming from the file system cache to be this fast. (Even then I think we may also seeing a timing problem (perhaps a measurement bug) that’s managing to lose wait time.) First thoughts, then – are that there is some code that is so inefficient that it’s doing far more physical I/O than it should; on the plus side the excess memory (12GB) means we’re being protected from the worst effects of disk I/O, but the caching effect means we’re able to burn up CPU at an astonishing rate to do the work.
So let’s look at the SQL to see where the CPU and disk activity is likely to be going. Here’s extract from the SQL ordered by Gets, and SQL Ordered by Reads. (We don’t have SQL ordered by Time in Statspack, of course):
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
5,636,117,159 28 201,289,898.5 38.1 ######## ######### 835260576
Module: RAXTRX
INSERT INTO RA_INTERFACE_ERRORS (INTERFACE_LINE_ID, MESSAGE_
TEXT, INVALID_VALUE) SELECT INTERFACE_LINE_ID, :b_err_msg6, '
trx_number='||T.TRX_NUMBER||','||'customer_trx_id='||TL.CUSTOMER
_TRX_ID FROM RA_INTERFACE_LINES_GT IL, RA_CUSTOMER_TRX_LINES TL,
RA_CUSTOMER_TRX T WHERE IL.REQUEST_ID = :b1 AND IL.INTERFAC
511,626,129 20 25,581,306.5 3.5 8155.90 15045.56 205460129
Module: JDBC Thin Client
select TRANSACTION_BRANCH , BRANCH_ADDRESS_1 , BRANCH_ADDRESS_2
, BRANCH_PHONES , BRANCH_FAX , BRANCH_TIN , TRX_NUMBER , TRX_NU
MBER_DISP , TRX_DATE , TRX_DATE_DISP , BILL_TO_CUSTOMER_ID , BIL
L_TO_CUSTOMER_NAME , SHIP_DATE_ACTUAL_DISP , SHIP_TO_NAME , SHIP
_TO_ADDRESS1 , SHIP_TO_ADDRESS2 , SHIP_TO_ADDRESS3 , SHIP_TO_ADD
CPU Elapsd
Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
281,953,754 20 14,097,687.7 90.5 8155.90 15045.56 205460129
Module: JDBC Thin Client
select TRANSACTION_BRANCH , BRANCH_ADDRESS_1 , BRANCH_ADDRESS_2
, BRANCH_PHONES , BRANCH_FAX , BRANCH_TIN , TRX_NUMBER , TRX_NU
MBER_DISP , TRX_DATE , TRX_DATE_DISP , BILL_TO_CUSTOMER_ID , BIL
L_TO_CUSTOMER_NAME , SHIP_DATE_ACTUAL_DISP , SHIP_TO_NAME , SHIP
_TO_ADDRESS1 , SHIP_TO_ADDRESS2 , SHIP_TO_ADDRESS3 , SHIP_TO_ADD
94,281,089 8,550 11,027.0 30.3 2924.20 4550.22 3761220362
Module: JDBC Thin Client
SELECT DELV_ADDR.ADDRESS1 FROM APPS.OE_ORDER_HEADERS_ALL OE_HEAD
ERS , APPS.RA_SITE_USES_ALL DELV_SITE , APPS.RA_ADDRESSES_ALL DE
LV_ADDR WHERE TO_CHAR(ORDER_NUMBER) = :B1 AND DELV_SITE.SITE_USE
_ID(+) = OE_HEADERS.DELIVER_TO_ORG_ID AND DELV_ADDR.ADDRESS_ID(+
) = DELV_SITE.ADDRESS_ID
94,011,901 8,544 11,003.3 30.2 2919.47 4527.54 803969757
Module: JDBC Thin Client
SELECT COALESCE(DELV_ADDR.CITY, DELV_ADDR.PROVINCE) FROM APPS.OE
_ORDER_HEADERS_ALL OE_HEADERS , APPS.RA_SITE_USES_ALL DELV_SITE
, APPS.RA_ADDRESSES_ALL DELV_ADDR WHERE TO_CHAR(ORDER_NUMBER) =
:B1 AND DELV_SITE.SITE_USE_ID(+) = OE_HEADERS.DELIVER_TO_ORG_ID
AND DELV_ADDR.ADDRESS_ID(+) = DELV_SITE.ADDRESS_ID
93,981,254 8,551 10,990.7 30.2 2919.13 4533.69 4273350236
Module: JDBC Thin Client
SELECT DELV_ADDR.ADDRESS2 FROM APPS.OE_ORDER_HEADERS_ALL OE_HEAD
ERS , APPS.RA_SITE_USES_ALL DELV_SITE , APPS.RA_ADDRESSES_ALL DE
LV_ADDR WHERE TO_CHAR(ORDER_NUMBER) = :B1 AND DELV_SITE.SITE_USE
_ID(+) = OE_HEADERS.DELIVER_TO_ORG_ID AND DELV_ADDR.ADDRESS_ID(+
) = DELV_SITE.ADDRESS_ID
The first statement reported by Gets is responsible for 500 millions buffer gets in the interval (and even then that’s 40% of the total – so there may be a couple more big ones that have fallen out of the library cache in the interim, perhap an hourly report on the previous hour would catch them). Look at the table that the SQL is inserting into – RA_INTERFACT_ERRORS. We might hope that an error table wasn’t going to receive many errors – so perhaps this is just an example of a statement with an insanely bad execution plan; clearly one to look at. Trouble is, though, that unless you set statspack to run at level 6 you don’t capture the execution plans – unlike the AWR which capture them by default. The batch took a couple of days to complete, though, so it would have been possible either to change the snapshot level, or to pull the execution plans from memory.
Looking at the SQL ordered by Reads, we see another example of the top statement being much more expensive than anything following it – but there’s an odd little pattern that caught my eye as I scanned the list. The next three statements look very similar to each other, and have similar statistics, and if you add up their stats the answer is very close to the figures for the first statement. Moreover, the report says that the top statement is responsible for 90% of the work – I think the top statement is actually calling the next three, so we need to look at those statements to see how to make them efficient in order to fix the first statement.
You might note, by the way, the capitals and the bind variables (:B1) in the next three statements – I suspect that there is a PL/SQL function in the first statement that is being called too many times. (20 executions of the first statement, 8,500 of the other three). Perhaps a change in the execution plan of the main query has produced a change in table order which has resulted in some predicate – one with a pl/sql function in it – being tested much more frequently than usual.
A quick cross-reference to the instance activity is in order:
physical reads 311,462,252 43,270.7 52,127.6 prefetched blocks 240,396,452 33,397.7 40,233.7 table fetch by rowid 12,348,226,523 1,715,508.0 2,066,648.8 table fetch continued row 12,047,845,308 1,673,776.8 2,016,375.8 table scan blocks gotten 361,692,971 50,249.1 60,534.4 table scan rows gotten 7,787,418,493 1,081,886.4 1,303,333.6 table scans (long tables) 32,041 4.5 5.4
A huge fraction of the physical reads seem to be for prefetched blocks – that a good indication that they are for tablescans (or index fast full scans), and when we compare rowid access with tablescan access we see that there are a lot of “long” tablescans (maybe the expensive three queries are all doing tablescans of a long table, that would account for 25,500 of those reported), with with 360M blocks scanned. Apart from anything else, all those block scans will be using a big chunk of CPU. The AWR would show us which segments were being scanned, of course, as would statspack if we were running it at level 7.
I’ve included the table fetch by rowid and table fetch continued row. Normally we should be concerned by “continued fetches” – but in this case I’m prepared to overlook them temporarily: it’s possible that after the critical SQL is corrected we may still see that most of the fetches by rowid turn into continued fetches, and at that point it will be worth worrying; but at present, even though they’re probably responsible for a serious fraction of the CPU usage, those continued fetches may be for rows that we’re not going to be fetching once the SQL is doing the right thing.
The last little detail, just for confirmation, comes from the latch activity part of the report; I’ve just selected the busiest 4 latches:
Pct Avg Wait Pct
Get Get Slps Time NoWait NoWait
Latch Requests Miss /Miss (s) Requests Miss
------------------------ -------------- ------ ------ ------ ------------ ------
cache buffers chains ############## 0.0 0.0 0 577,860,644 0.1
cache buffers lru chain 157,333 0.3 0.0 0 687,799,033 0.8
multiblock read objects 114,602,621 0.6 0.0 12 0
simulator hash latch 811,998,194 0.0 0.0 2 0
Get Spin &
Latch Name Requests Misses Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
cache buffers chains ############## 12,262,452 180,671 0/0/0/0/0
multiblock read objects 114,602,621 719,477 115 719369/101/7/0/0
simulator hash latch 811,998,194 3,889 19 3872/15/2/0/
As you can see, and as we had guessed, they’re all about the buffer cache – doing multiblock reads and loading/dumping blocks from the cache; and almost all of the sleeps are on the cache buffers chains.
Next Steps
As a quick and dirty which might help a bit – put more of the sga_max_size into the buffer cache – it doesn’t happen automatically in 9.2 – and that may reduce the interaction between the O/S and Oracle and save a bit of latching and CPU.
Address the three statements doing all the physical I/O, and check the path for the statement that cause them to be executed. Fix the statement that has done 40% of the buffer gets. In all cases the problem may simply be an execution plan change – but perhaps the machine has been hiding an inherent performance problem and on this occasion a small change in data (a few more errors, perhaps) resulted in the excess work generating a massive increase in workload and contention.
We do know, however, that there is a lot of work done on buffer gets that didn’t show up in this report – so run off the report for the previous hour and see if we can find a couple more big hitters. Once we’ve sorted out the worst SQL, we might then want to see if the volume of “continued rows” is high – and find out why they’re appearing. It’s possible that it’s due to a table with more than 254 columns, of course, or a long column, but we’ll worry about that only when we need to.
One final thought – it’s easy to fix the wrong thing when you’re looking at statspack reports. Although we can see some SQL that is “obviously” overloading the machine, maybe the business critical task that has to complete promptly is simply hidden because it does (relatively) little work. Unless we can fix the big problems extremely quickly, we ought to take a little time to check with the business and the developers that what we’re looking at really is the code that’s running the critical business task.
Longer term, of course, upgrade to 9.2.0.8 at least and then start running Statspack at level 7 for better information. Then upgrade some more.
Philosophy 20
It’s important to revisit the questions you think you’ve answered from time to time. You may find that your previous answer was wrong or incomplete; you may find that looking at your past answers may give you ideas for new questions.
I had this thought while staring out of the window earlier on today. When I’m working at home I spend most of my time in a room that looks onto my back garden - and I have five different bird feeders in the garden and a pair of binoculars by my computer. Today I was watching some (Eurasian) Jays that tend to appear fairly promptly when I put out a handful of peanuts.
There’s clearly some sort of pecking order among these jays (and I think there are two different families), and one of the jays is clearly very aggressive and tends to frighten off the others, but a common behaviour pattern when two are down is that the less aggressive jay hops a few steps away from the more aggressive one and turns its back.
For years I’ve assumed that this is just a typical “underdog” behaviour – i.e. “I’m not a threat, I can’t attack, I’m not even lookine g at you” – but today it suddenly dawned on me that there was another possibility that simply hadn’t crossed my mind: if you’re a bird and thinking about running away you won’t want to take off towards your opponent, the best direction to point in is the direction that’s going to move you away from trouble as quickly as possible.
My point, of course, is that it’s easy to believe that you understand something simply because you’ve accepted a reasonable explanation – coming back to the issue some time later may allow you to come up with other ideas, whether or not those ideas arise by you deliberately questioning your belief, or by an accident of intuition.
Footnote: If this was an example of Oracle behaviour I’d be doing some serious research on it by now; but my birdwatching is only for casual pleasure, so I’m not going to start trawling the internet for theses on Jay behaviour.
January 6, 2013
Blog advert
Just a quick note to say that I found a blog over the weekend with a number of interesting posts, so I thought I’d pass it on: http://www.bobbydurrettdba.com/
There’s a really cute example (complete with test case) of an optimizer bug (possibly only in 11.1) in the December archive: http://www.bobbydurrettdba.com/2012/12/04/index-causes-poor-performance-in-query-that-doesnt-use-it/
January 3, 2013
Skip Scan 2
Here’s a question that is NOT a trick question, it’s demonstrating an example of optimizer behaviour that might come as a surprise.
I have an index (addr_id0050, effective_date), the first column is numeric, the second is a date. Here’s a query with an execution plan that uses that index:
define m_date='30-Jan-2013'
select
small_vc
from t1
where
addr_id0050 between 24 and 26
and effective_date = to_date('&m_date', 'dd-mon-yyyy')
;
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 18 | 396 | 23 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 18 | 396 | 23 |
|* 2 | INDEX SKIP SCAN | T1_I0050 | 18 | | 5 |
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ADDR_ID050">=24 AND "EFFECTIVE_DATE"=TO_DATE(' 2013-01-30'
00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "ADDR_ID050"<=26)
filter("EFFECTIVE_DATE"=TO_DATE(' 2013-01-30 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))
So here’s the question – given that my where clause includes a predicate on the first column of the index that would allow an index range scan to take place, wouldn’t you expect Oracle to do a range scan, and how does a skip scan work in this case ?
To push the point a little further I have another column, also numeric, in the same table which appears in a similar index (addr_id2500, effective_date). Here’s the equivalent query with its execution plan.
select
small_vc
from t1
where
addr_id2500 between 24 and 26
and effective_date = to_date('&m_date', 'dd-mon-yyyy')
;
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 23 | 5 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 23 | 5 |
|* 2 | INDEX RANGE SCAN | T1_I2500 | 1 | | 4 |
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ADDR_ID2500">=24 AND "EFFECTIVE_DATE"=TO_DATE('
2013-01-30 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "ADDR_ID2500"<=26)
filter("EFFECTIVE_DATE"=TO_DATE(' 2013-01-30 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
In this case, with the same starting predicate (but referencing the alternative column), the optimizer produces the index range that we might expect.
The index skip scan isn’t just for cases where the first column of an index is missing from the list of predicates. The basic principle is that the optimizer has the option to do a small number of tiny range scans on an index when the alternative is to do a tablescan or a very large index range scan.
As you might guess from my column names, addr_id0050 holds 50 distinct values, so the range 24 to 26 actually accounts for 6% of the total volume of the index. On the other hand, there are over 800 distinct values for effective_date. There is even an index on (effective_date) but that’s going to pick up lots of rows that are widely scattered throughout the table then throw away 94% of them so Oracle has decided it’s too expensive to use.
So the optimizer has worked out that it can probe for (24, 30th Jan), (25, 30th Jan), and (26, 30th Jan) as the most efficient access path. Of course, it doesn’t know that it’s probing for exactly those values but statistically it assumes that there are only a few possible values that will show up if it probes the index on the first column – in effect using an “inlist iterator” on the first column without knowing in advance what it’s going to find in the list.
When the optimizer sees the query using addr_id2500 (which holds 2,500 distinct values) the arithmetic faviours the option we are familiar with. A simple range scan based on values between 24 and 26 is going to range through roughly l/800th of the index – which is a tiny number of leaf blocks (just one or two, in my case) so the optimizer decides to do that and check every index entry on the way to see if the effective_date holds a suitable value.
If you want to repeat the experiment, I was using 11.2.0.3 with 1MB uniform extents and freelist management. The SQL to create the table and indexes is as follows:
create table t1
as
with generator as (
select --+ materialize
rownum id
from all_objects
where rownum <= 3000
)
select
mod(rownum,2500) addr_id2500,
mod(rownum,50) addr_id0050,
trunc(sysdate) + trunc(mod(rownum,2501)/3) effective_date,
lpad(rownum,10,'0') small_vc,
rpad('x',100) padding
from
generator v1,
generator v2
where
rownum <= 250000
;
create index t1_i1 on t1(effective_date);
create index t1_i2500 on t1(addr_id2500, effective_date);
create index t1_i0050 on t1(addr_id0050, effective_date);
-- collect stats, no histograms, 11g auto sample size
I had disabled CPU costing to get repeatable results – depending on parameter settings whether you have system stats enabled or not then you may need to tweak the code to change the relative numbers of distinct values in the numeric columns before you see the switch between range scan and skip scan.

