I’ve been a little slow in the follow-up to my previous posting on possibly redundant indexes. Before going into the slightly more complex stuff, there’s another peripheral point (but a very important one) that’s worth raising about how clever the optimizer can be. Here’s some code for 11.2.0.4 to demonstrate the point:
create table t1 nologging as with generator as ( select --+ materialize rownum id from dual connect by level <= 1e4 ) select rownum id, trunc(sysdate,'MM') + (rownum-1)/1440 date_time, rpad('x',100) padding from generator v1, generator v2 where rownum <= 1e5 ; alter table t1 add ( date_only generated always as (trunc(date_time)) virtual ) ; begin dbms_stats.gather_table_stats( ownname => user, tabname =>'T1', method_opt => 'for all columns size 1' ); end; / create index t1_i1 on t1(date_only) nologging;
So, in a two-step process, I’ve got an indexed virtual column that holds the value of the date_time column truncated to just the date. Would you expect the optimizer to use the index to execute the following query efficiently:
select max(id) from t1 where date_time between sysdate-1 and sysdate ;
Note that the query references the real date_time column not the virtual column date_only, and it’s not using the expression that defines the index – yet the plan reads as follows:
----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 21 | 86 (2)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 21 | | | |* 2 | FILTER | | | | | | |* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1442 | 30282 | 86 (2)| 00:00:01 | |* 4 | INDEX RANGE SCAN | T1_I1 | 4306 | | 13 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(SYSDATE@!>=SYSDATE@!-1) 3 - filter("DATE_TIME"<=SYSDATE@! AND "DATE_TIME">=SYSDATE@!-1) 4 - access("T1"."DATE_ONLY">=TRUNC(SYSDATE@!-1) AND "T1"."DATE_ONLY"<=TRUNC(SYSDATE@!))
It’s a little odd that even though the optimizer in the newer versions of Oracle treats many simple expressions on sysdate as constants it still checks (operation 2) that “sysdate >= sysdate – 1” but perhaps that’s just a case of a piece of generic code that isn’t worth the risk or effort of changing.
The key point, of course, is that Oracle has managed to generate some extra predicates that allow it to use the “wrong” index to get a first approximation of the result set fairly efficiently, and then used the original predicate to reduce the approximation down to the correct result set.
If you want a quick sanity check on the access predicates used for operation 4:
- If date_time >= sysdate-1, then trunc(date_time) >= trunc(sysdate-1)
- If date_time <= sysdate, then trunc(date_time) <= trunc(sysdate)
This style of predicate manipulation also works numeric data types, but I think its greatest benefit (or convenience) is likely to come from date data types where the data has been created with a time component but there are frequent “date-only” queries. The days of creating two indexes as a workaround for handling generated code that wants to deal with both date_time and trunc(date_time) predicates should be numbered.
Footnote:
This enhancement probably appeared in 11.2.0.2, and I first saw it described in October 2013 in this blog note by Mohamed Houri; but 12c offers a delightful little enhancement – here’s what my table looks like in the 12c version of the code:
SQL> desc t1 Name Null? Type ----------------------------- -------- -------------------- ID NUMBER DATE_TIME DATE PADDING VARCHAR2(100) SQL>
Where’s the virtual column ? The 12c version of my code had a slightly different definition for it:
alter table t1 add ( date_only invisible generated always as (trunc(date_time)) virtual ) ;
The transformation still works even when the virtual column is invisible. So (subject to searching for anomalies, boundary conditions and bugs) it looks as if you can change the table definition, and get the benefits of two indexes for the price of one without the application realising that anything has changed.
Hi Jonathan.
I believe that you actually first saw it in your own blog :-)
Comment by Oren Nakdimon (@DBoriented) — August 30, 2015 @ 9:27 am BST Aug 30,2015 |
Oren,
Thanks for the reminder.
I’d tracked it back to something I’d said on Oracle-L, but hadn’t linked the model I’d created to that blog note. One problem of having published more than 1,000 items, I guess.
Comment by Jonathan Lewis — August 30, 2015 @ 10:02 am BST Aug 30,2015 |
Reading some of the commentary on the death of Oliver Sacks, I came across this quote from one of his essays:
It’s a fascinating observation, and one that I feel I could apply to myself quite often.
Comment by Jonathan Lewis — August 30, 2015 @ 6:40 pm BST Aug 30,2015 |
[…] ? Given my comments about the optimizer’s clever trick with indexes on trunc(date_column) in the second post in this series perhaps there’s scope here for getting rid of the dp_datetime index even though the simple […]
Pingback by Index Usage – 3 | Oracle Scratchpad — September 1, 2015 @ 5:52 pm BST Sep 1,2015 |
[…] Index Usage 2 – Using constraints to eliminate indexes […]
Pingback by Index Usage | Oracle Scratchpad — January 9, 2017 @ 12:22 pm GMT Jan 9,2017 |
[…] Using constraints to eliminate indexes […]
Pingback by Indexing Catalogue | Oracle Scratchpad — January 28, 2022 @ 3:32 pm GMT Jan 28,2022 |