Bugs | Oracle Scratchpad

December 21, 2023

Descending Bug

Filed under: Bugs,descending indexes,Execution plans,Indexing,Oracle — Jonathan Lewis @ 3:12 pm GMT Dec 21,2023

This is another example of defects in the code to handle descending columns in indexes, with the added feature that the problem is addressed somewhere between 19.4 and 19.10 (it’s present in 19.3, gone in 19.11) – which means that if you upgrade to a recent RU of from some of earlier versions some of your production code may return rows in a different order. On the plus side, it will be the correct order rather than the previously incorrect order. It’s likely that if your code was exhibiting this bug you would have noticed it before the code got to production, so this note is more a confirmation than a realistic warning of a future threat.

The bug originally showed up in a thread on the Oracle developer forum more than a year ago but I was prompted to finish and publish this note after seeing an article on deadlocks by Frank Pachot where the behaviour of his demonstration code could vary with version of Oracle because of this bug.

Here’s some code to create a demonstration data set:

rem
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2022
rem     Purpose:        
rem
rem     Last tested 
rem             19.11.0.0       Right order
rem             19.3.0.0        Wrong order
rem             12.2.0.1        Wrong order
rem
rem     Notes
rem     From 12.1.0.2 to ca. 19.3(++?) the optimizer loses a "sort order by" 
rem     operation when a "descending" index meets an in-list iterator.
rem     

create table t1 
as
with generator as (
        select  rownum id
        from    dual
        connect by
                level <= 1e4
)
select
        rownum                                  id,
        substr(dbms_random.string('U',6),1,6)   v1,
        rpad('x',100,'x')                       padding
from
        generator
/

alter table t1 modify v1 not null;

update t1 set v1 = 'BRAVO'      where id = 5000;
update t1 set v1 = 'MIKE'       where id = 1000;
update t1 set v1 = 'YANKEE'     where id = 9000;

create index t1_i1 on t1(v1 desc);

I’ve created a table with a column generated as short random strings, then set three rows scattered through that table to specific values, and created an index on that column – but the index is defined with the column descending.

(Reminder: if all the columns in an index are declared as descending that all you’ve done is waste space and introduce an opportunity for the optimizer to go wrong – descending columns in indexes only add value if the index uses a combination of ascending and descending columns).

Here’s a simple query – with the results when executed from SQL*Plus in 12.2.0.1. Note, particularly, the order by clause, the order of the results, and the body of the execution plan:

set serveroutput off

select  v1, id
from    t1
where   v1 in (
                'MIKE',
                'YANKEE',
                'BRAVO'
        ) 
order by 
        v1
;

select * from table(dbms_xplan.display_cursor(format=>'outline'));


V1                               ID
------------------------ ----------
YANKEE                         9000
MIKE                           1000
BRAVO                          5000

3 rows selected.


SQL_ID  6mpvantc0m4ap, child number 0
-------------------------------------
select v1, id from t1 where v1 in (   'MIKE',   'YANKEE',   'BRAVO'  )
order by  v1

Plan hash value: 4226741654

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |       |       |    22 (100)|          |
|   1 |  INLIST ITERATOR              |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |     3 |    33 |    22   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN DESCENDING| T1_I1 |    40 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

The most important point, of course, is that the result set is in the wrong order. It’s interesting to note that there is no “sort order by” operation and that the index range scan is described as “descending”. A brief pause for thought suggests that if you do a descending range scan of a “descending index” then the results ought to come out in ascending order which might explain why the optimizer thought it could eliminate the sort operation. However that thought isn’t necessarily valid since the “inlist iterator” means Oracle should be executing “column = constant” once for each value in the list, which would make the ascending/descending nature of the index fairly irrelevant (for this “single-column” example).

When I created the same data set and ran the same query on 19.11.0.0 I got exactly the same execution plan, including matching Predicate Information and Outline Data (apart from the db_version and optimizer_features_enable values, of course), but the result set was in the right order. (It was still wrong in a test against 19.3, so the fix must have appeared somewhere in the 19.4 to 19.11 range.)

Workaround

In this example one of the ways to work around the problem (in 12.2) was to add the index() hint (which is equivalent to the index_rs_asc() hint) to the query, resulting in the following plan (again identical in 12c and 19c):

SQL_ID  6x3ajwf41x91x, child number 0
-------------------------------------
select  /*+ index(t1 t1_i1) */  v1, id from t1 where v1 in (   'MIKE',
 'YANKEE',   'BRAVO'  ) order by  v1

Plan hash value: 1337030419

-----------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |       |       |    23 (100)|          |
|   1 |  SORT ORDER BY                        |       |     3 |    33 |    23   (5)| 00:00:01 |
|   2 |   INLIST ITERATOR                     |       |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     3 |    33 |    22   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | T1_I1 |    40 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

The “index range scan” operation is no longer “descending”, and we have a “sort order by” operation. You’ll note that, thanks to blocking sort operation the table access is now “batched”.

Best Guess

The way Oracle handles an IN-list is to start by reducing it to a sorted list of distinct items, before iterating through each item in turn. Then, if there is an order by clause that matches the order of the sorted in-list, and Oracle can walk the index in the right order then it can avoid a “sort order by” operation.

I’m guessing that there may be two separate optimizer strategies in the “descending columns” case that have collided and effectively cancelled each other out:

Hypothetical Strategy 1: If there is a “descending index” that can be range scanned for the data the in-list should be sorted in descending order before iterating. (There is a flaw in this suggestion – see below)
Hypothetical strategy 2: Because the query has an order by (ascending) clause the index scan should be in descending order to avoid a sort operation.

The flaw in the first suggestion is that the Predicate Information suggests that it’s not true. This is what you get in every case (though the operation number changes to 4 when the plan includes a “sort order by” operation):

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("T1"."SYS_NC00004$"=HEXTORAW('BDADBEA9B0FF') OR
              "T1"."SYS_NC00004$"=HEXTORAW('B2B6B4BAFF') OR
              "T1"."SYS_NC00004$"=HEXTORAW('A6BEB1B4BABAFF')))
       filter((SYS_OP_UNDESCEND("T1"."SYS_NC00004$")='BRAVO' OR
              SYS_OP_UNDESCEND("T1"."SYS_NC00004$")='MIKE' OR
              SYS_OP_UNDESCEND("T1"."SYS_NC00004$")='YANKEE'))

As you can see the values appearing in the access() predicate are the one’s complements of BRAVO, MIKE and YANKEE in that order; in no case was the order reversed, and previous experience says that predicates are used in the order they appear in the Predicate Information.

On the other hand, it’s arguable that the three predicate values should have been reported (in some form) at the inlist iterator operation – so this may be a case where the simplest strategy for presenting the plan doesn’t match the actual activity of the plan.

Post script

If I change the unhinted query to “order by v1 desc” the rows are reported in ascending order in 12.2.0.1, but in the correct descending order in 19.11.

Comments (3)

November 1, 2023

Descending max()

Filed under: Bugs,Function based indexes,Hints,Indexing,Oracle,Performance — Jonathan Lewis @ 2:54 pm GMT Nov 1,2023

I’ve written a few notes about problems with “descending” indexes in the past (the word is in quotes because it’s not the index that’s defined as descending, it’s a proper subset of the columns of the index). A silly example came up recently on twitter where the cost of an execution plan changed so dramatically that it had to be hinted when the query changed from “order by list of column names” to “order by list of numeric positions”.

Just to make things a little messier, the index was not just “descending” but the column that had been declared as descending was actually a to_char() of a date column. I won’t reproduce the example here as the tweet links to a pastebin drop of a self-contained example. What I have, instead, is an even simpler example of a “descending” index producing a most unsuitable plan – even when you try to hint around it.

Here’s the demonstration that I’ve run on 19.11 and 23.3:

rem
rem     Script:         ind_desc_max_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2023
rem
rem     Last tested 
rem             21.3.0.0
rem             19.11.0.0
rem

create table t1 
as 
select  * 
from    all_objects
where   rownum <= 10000
/
  
alter table t1 modify object_name not null;

-- create index t1_i1a on t1(object_name);
create index t1_i1d on t1(object_name desc);

execute dbms_stats.gather_table_stats(user,'t1',cascade=>true)

alter session set statistics_level=all;
set serveroutput off

select  max(object_name)
from    t1
/

select * from table(dbms_xplan.display_cursor(format=>'cost allstats last hint_report projection'));

set serveroutput on
alter session set statistics_level = typical;

A couple of details to mention:

The code includes lines to create two indexes, one ascending and one descending. When I run the code I create only one of them.
I gather stats after creating the index – this is for the benefit of the descending index only, which needs stats collected on the underlying hidden column definition that Oracle creates to support it.
There’s a call to define object_name as not null – this is for the benefit of 23c. In 19c the view has several columns which carry forward their underlying not null declarations. In 23c none of the view columns has a not null declaration.

If I create the ascending index 19c and 23c both produce the following plan:

Plan hash value: 1421318352

------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name   | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |        |      1 |        |     2 (100)|      1 |00:00:00.01 |       2 |
|   1 |  SORT AGGREGATE            |        |      1 |      1 |            |      1 |00:00:00.01 |       2 |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I1A |      1 |      1 |     2   (0)|      1 |00:00:00.01 |       2 |
------------------------------------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) MAX("OBJECT_NAME")[128]
   2 - "OBJECT_NAME"[VARCHAR2,128]

If I create the descending index the plan changes (19c and 23c behave the same way, the following plan is from 23c):

-------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |      1 |        |     8 (100)|      1 |00:00:00.01 |      49 |
|   1 |  SORT AGGREGATE       |        |      1 |      1 |            |      1 |00:00:00.01 |      49 |
|   2 |   INDEX FAST FULL SCAN| T1_I1D |      1 |  10000 |     8   (0)|  10000 |00:00:00.01 |      49 |
-------------------------------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) MAX(SYS_OP_UNDESCEND("T1"."SYS_NC00029$"))[128]
   2 - "T1"."SYS_NC00029$"[RAW,193]

There was a slight difference between versions – there are a couple of extra columns in the 23c view so the hidden column referenced in the Projection Information was sys_nc00027$ in 19c compared to sys_nc00029$ in 23c). I don’t know why the length is reported as 193 – I would have expected it to be 129 (since it’s going to hold the one’s-complement of the object_name and a trailing 0xFF byte).

The critical point, of course, is that the query is no longer using the special min/max path, it’s doing an index fast full scan, scanning through 49 buffers instead of accessing just the 2 buffers the min/max needed.

I added a no_index_ffs(t1) hint to see what would happen if I tried to block the bad path: Oracle did a tablescan; so I also added an index(t1) hint to see if that would help and got the following plan:

--------------------------------------------------------------------------------------------------
| Id  | Operation        | Name   | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |      1 |        |    46 (100)|      1 |00:00:00.01 |      46 |
|   1 |  SORT AGGREGATE  |        |      1 |      1 |            |      1 |00:00:00.01 |      46 |
|   2 |   INDEX FULL SCAN| T1_I1D |      1 |  10000 |    46   (0)|  10000 |00:00:00.01 |      46 |
--------------------------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) MAX(SYS_OP_UNDESCEND("T1"."SYS_NC00029$"))[128]
   2 - "T1"."SYS_NC00029$"[RAW,193]

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
   2 -  SEL$1 / "T1"@"SEL$1"
           -  index(t1)

As you can see from the Hint Report, the optimizer accepted and used my hint. But it hasn’t used the min/max optimisation, it’s done an index full scan, walking through 46 buffers in index order, which could well be more resource-intensive than the fast full scan.

I tried various tricks to get back the min/max optimisation, and I did finally manage to achieve it – but it’s not a realistic solution so don’t copy it. Here’s the SQL and plan:

select
        /*+ index(t1) */
        utl_raw.cast_to_varchar2(
                sys_op_undescend(
                        min(sys_op_descend(object_name))
                )
        )
from    t1
/

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |      1 |        |    46 (100)|      1 |00:00:00.01 |       2 |
|   1 |  SORT AGGREGATE             |        |      1 |      1 |            |      1 |00:00:00.01 |       2 |
|   2 |   FIRST ROW                 |        |      1 |  10000 |    46   (0)|      1 |00:00:00.01 |       2 |
|   3 |    INDEX FULL SCAN (MIN/MAX)| T1_I1D |      1 |  10000 |    46   (0)|      1 |00:00:00.01 |       2 |
-------------------------------------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) MIN("T1"."SYS_NC00029$")[193]
   2 - "T1"."SYS_NC00029$"[RAW,193]
   3 - "T1"."SYS_NC00029$"[RAW,193]

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
   3 -  SEL$1 / "T1"@"SEL$1"
           -  index(t1)

There is a “extra” first row operator in the plan but you can see that we’re back to a min/max optimisation accessing only 2 buffers to get the result we want. (And it was the same result).

For 23c the indexed column is the hidden column SYS_NC00029$, and I know that the optimizer will recognise the equivalence between the column and the expression sys_op_descend(object_name) so it’s been fooled into using the min/max optimisation on the index.
If I want the maximum object name I want the minimum sys_op_descend(object_name). Possibly the first row operation appears because the optimizer doesn’t know that the function call will always return a non-null result.
Once I’ve found the minimum I need to reverse the effect of the sys_op_descend() – which is where the sys_op_undescend() comes in, but that returns a raw value, so I’ve had to call a utl_raw function to convert the raw to a varchar2(). Watch out, though, because it’s going to be a maximum length varchar2().

If I can make it happen without even changing the shape of the query the optimizer ought to be able to make it happen – but there’s probably a little bit of generic index-usage code that’s currently getting in the way.

Summary

We’ve seen the pattern fairly often: indexes with a mix of ascending and descending columns can be very helpful in specific cases, but we shouldn’t be surprised when a mechanism that appears for “ascending only” indexes doesn’t work perfectly for an index with some descending columns.

Comments (5)

September 5, 2023

Case Study

Filed under: Bugs,Hints,Joins,Oracle — Jonathan Lewis @ 4:14 pm BST Sep 5,2023

A recent post on the Oracle SQL and PL/SQL forum posted the following query with a complaint that it produced the wrong results:

select  count(*) 
from    all_synonyms
left join 
        all_objects b 
on      (b.owner,object_name)=(select table_owner,table_name from dual)
;

This caused a little confusion to start with since the opening complaint was that the query sometimes produced the wrong results and sometimes produced Oracle error ORA-01799: a column may not be outer-joined to a subquery. The variation in behaviour appears, of course, because the restriction applied to older versions of Oracle but had been lifted by 12c.

The other point of confusion is that there’s no easy way to tell from the output that the result is wrong – especially when the count was in the thousands. This point was addressed by the OP supplying an image of the first rows (and a subset of the columns) for the query when “count(*)” was changed to just “*” (and switched from the “all_” views to the “dba_” views: the output for the columns dba_objects.owner and dba_objects.object_name reported the same pair of values for every row – in the case of the OP this was “SYS” and “DUAL”.

Investigation

You could attack this problem from several directions – you could just edit the query to avoid the problem (the structure is a little curious); you could investigate the 10053 trace file to see what the optimizer is doing every step of the way, or you could try to simplify the model and see if the problem still appears. Since I like to keep things simple (to start with, at least) I created table my_synonyms as a copy of the data in dba_synonyms, and my_objects as a copy of the data in dba_objects and created an index on my_objects(owner, object_name). So my query turned into:

select
        *
from    
        my_synonyms   syn
left join 
        my_objects    obj
on 
        (obj.owner, obj.object_name) = (
                select /*+ qb_name(dual_bug) */ syn.table_owner, syn.table_name from dual
        )
/

You’ll notice that I’ve given both tables a “meaningful” alias and used the aliases for every column, and I’ve also added a query block name (qb_name) to the subquery against dual – because I’d assumed that the subquery probably played a key role in messing the optimizer up (I was wrong) and I wanted to be able to track it easily.

The query produced the wrong results. In my case every row reported obj.object owner and obj.name as “SYS” / “PRINT_TABLE” – selecting 11,615 rows. The basic execution plan – pulled from memory using dbms_xplan.display_cursor() – looked like this:

----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |       |       | 46548 (100)|          |
|   1 |  MERGE JOIN OUTER                      |                 | 11615 |  6261K| 46548   (1)| 00:00:02 |
|   2 |   TABLE ACCESS FULL                    | MY_SYNONYMS     | 11615 |   805K|    18   (6)| 00:00:01 |
|   3 |   BUFFER SORT                          |                 |     1 |   481 | 46530   (1)| 00:00:02 |
|   4 |    VIEW                                | VW_LAT_881C048D |     1 |   481 |     4   (0)| 00:00:01 |
|   5 |     TABLE ACCESS BY INDEX ROWID BATCHED| MY_OBJECTS      |     1 |   132 |     4   (0)| 00:00:01 |
|*  6 |      INDEX RANGE SCAN                  | OBJ_I1          |     1 |       |     3   (0)| 00:00:01 |
|   7 |       FAST DUAL                        |                 |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("OBJ"."OWNER"= AND "OBJ"."OBJECT_NAME"=)

The optimizer had transformed the query into a form using a lateral view – and there have been bugs with lateral views in the past, particularly relating to decorrelation – so maybe the problem was there and a few tests with various hints or fix_controls, or optimizer parameter settings to disable certain features might identify the source of the problem; but before doing that I thought I’d just get a small result set to check the results and execution plan in detail by adding a predicate “rownum <= 12”.

After re-executing with rowsource execution stats enabled, and report the plan with Query Block Names and Object Aliases, Outline Data, and Column Projection Information I observed two critical changes:

The query produced results that looked as if they were likely to be correct – I no longer had a repeating, incorrect, object owner and name on every row.
The plan had changed (see following).

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |      1 |        |     12 |00:00:00.01 |      24 |
|*  1 |  COUNT STOPKEY                         |                 |      1 |        |     12 |00:00:00.01 |      24 |
|   2 |   NESTED LOOPS OUTER                   |                 |      1 |     12 |     12 |00:00:00.01 |      24 |
|   3 |    TABLE ACCESS FULL                   | MY_SYNONYMS     |      1 |     12 |      7 |00:00:00.01 |       3 |
|   4 |    VIEW                                | VW_LAT_881C048D |      7 |      1 |     12 |00:00:00.01 |      21 |
|   5 |     TABLE ACCESS BY INDEX ROWID BATCHED| MY_OBJECTS      |      7 |      1 |     12 |00:00:00.01 |      21 |
|*  6 |      INDEX RANGE SCAN                  | OBJ_I1          |      7 |      1 |     12 |00:00:00.01 |      17 |
|   7 |       FAST DUAL                        |                 |      7 |      1 |      7 |00:00:00.01 |       0 |
--------------------------------------------------------------------------------------------------------------------

There’s a “count stopkey” operation to handle the “rownum <= 12” predicate, of course; but most significantly the “merge join outer” operation has changed to a “nested loop outer” operation.

Repeating the test but with “rownum <= 1200” I still got results that looked correct, and still got a “nested loop outer”. A third attempt, with “rownum <= “30000” switched back to wrong results and a “merge join outer” (with a “count stopkey”). So – initial conclusion – it’s the implementation of the merge join that has gone wrong.

Resolution

If the only problem is the choice of join mechanism we can patch the code (by hand or with an sql_patch) to bypass the problem. Given the simple pattern that produces the problem I should be able to find two critical hints in the plan’s Outline Data, one which is a leading() hint that dictates the order (my_synonyms, vw_lat_xxxxxxxx) – though the user will presumably have a different name generated for their lateral view – and a use_merge() or use_nl() hint that dictates the mechanism to use to join from my_synonyms to vw_lat_xxxxxxxx.

In both plans the Outline Data showed me the following (cosmetically adjusted) leading() hint:

leading(@sel$d9e17d64 syn@sel$1 vw_lat_6b6b5ecb@sel$6b6b5ecb)

This was then followed by a use_nl() hint (again cosmetically adjusted) for the correct results:

use_nl(@sel$d9e17d64 vw_lat_6b6b5ecb@sel$6b6b5ecb)

and by a use_merge_cartesian() hint for the erroneous results (so it’s possibly just the “cartesian” bit that’s going wrong.:

use_merge_cartesian(@sel$d9e17d64 vw_lat_6b6b5ecb@sel$6b6b5ecb)

So, as a final test, I edited the query to include the leading() and use_nl() hints, and ran it.

select  /*+
                leading(@sel$d9e17d64 syn@sel$1 vw_lat_6b6b5ecb@sel$6b6b5ecb)
                use_nl(@sel$d9e17d64 vw_lat_6b6b5ecb@sel$6b6b5ecb)
        */
        *
from    
        my_synonyms syn
left join 
        my_objects obj
on 
        (obj.owner, obj.object_name) = (
                select /*+ qb_name(dual_bug) */ syn.table_owner, syn.table_name from dual
        )
/

The problem of the repeating “SYS” / “PRINT_TABLE” disappeared, and the total number of rows selected increased from 11,615 to 12,369 (mostly due to package and package bodies having the same owner and name and so doubling up a number of rows from the dba_synonyms table).

Summary

It’s almost always a good idea to simplify a messy problem.

Always check the execution plan, including Predicate Information, Outline Data and Query Block / Object Alias details.

Although I didn’t use the results in this case, executing with rowsource execution statistics enabled can be very helpful.

This case focused on a detail that looked as if a “newish” feature might be guilty; but the simple model suggested it’s an implementation detail of an old feature (possibly reached because the optimizer has applied a new feature). We can work around it very easily, and we can build a very simple test case to raise with Oracle in an SR.

Comments (5)

August 22, 2022

Encryption oddity

Filed under: Bugs,LOBs,Oracle,Troubleshooting — Jonathan Lewis @ 12:34 am BST Aug 22,2022

Here’s a strange problem (with possible workaround) that appeared in a thread on the Oracle developer forum a couple of days ago. It looks like the sort of problem that might be a memory overflow problem in a rarely use code path, given that it seems to need a combination of:

move LOB column
varchar2() declared with character semantics
transparent data encryption (TDE)

Here’s a simple script to generate a tiny data set that demonstrates the problem in my 19.11.0.0 system – but you’ll need to enable transparent data encryption if you want it to work (see footnote).

rem
rem     Script:         encryption_problem.sql
rem     Author:         Solomon Yakobson / Jonathan Lewis
rem     Dated:          August 2022
rem
rem     Last tested
rem             21.3.0.0 
rem             19.11.0.0
rem

drop table test_tbl_move purge;

create table test_tbl_move(
        encrypted_column varchar2(9 char) 
                encrypt using 'aes192' 'sha-1' no salt 
                constraint ttm_nn_ec not null,
        clob_column      clob
)
lob (clob_column) store as securefile
;

insert into test_tbl_move
values( '123456789', 'x')
;

commit;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'test_tbl_move',
                method_opt  => 'for all columns size 1'
        );
end;
/

alter table test_tbl_move move lob(clob_column) store as securefile;

The script creates a table with a clob column (enable storage in row by default), and a single encrypted varchar2() column declared using character semantics viz: “9 CHAR“.

We insert a row, gather stats – just in case – and then try to move the LOB storage (which also has to move the table). Would you expect a result like the following:

alter table test_tbl_move move lob(clob_column) store as securefile
*
ERROR at line 1:
ORA-12899: value too large for column ??? (actual: 27, maximum: 9)

That “maximum: 9” suggests that Oracle is complaining about encrypted_column – but why? Before worrying about that question, though, I wanted to pursue a different point: if you check the original post and compare the error message with the one above you’ll see that the “actual:” value was different. So I ran the entire test again, and again, and again; and by the time I had executed the entire script half a dozen times this is the list of error messages I had collected:

ORA-12899: value too large for column ??? (actual: 27, maximum: 9)
ORA-12899: value too large for column ??? (actual: 31, maximum: 9)
ORA-12899: value too large for column ??? (actual: 29, maximum: 9)
ORA-12899: value too large for column ??? (actual: 29, maximum: 9)
ORA-12899: value too large for column ??? (actual: 26, maximum: 9)
ORA-12899: value too large for column ??? (actual: 21, maximum: 9)

Your first thought might have been that Oracle is trying to copy an encrypted value while forgetting that it had been encrypted, but the variation in the actual lengths makes it look as if Oracle is injecting generating random data somehow (maybe through a pointer error) and generating garbage as a side-effect. Moreover if you know your encryption you’ll be suspicious of almost all the actual lengths reported because Oracle’s working is as follows:

Round the source length up to the next multiple of 16 bytes
aes192 encryption (the default encryption) will add 16 bytes (‘nomac’ and no salt)
adding a salt (default behaviour) will add a further 16 bytes
adding the ‘SHA-1’ integrity (the default) will add another 20 bytes

You can’t get an odd number of bytes as an encrypted value (unless, perhaps, the code thought it was reading a null-terminated character-string and there was a zero in the middle of it).

Workaround

You’ll see in the thread that Solomon Yakobson did a number of experiments to see what effects they had on the test case; but there was one experiment that he didn’t do that happened to be the first one I thought of. (There was a certain amount of luck in the choice, plus a bit of background suspicion from a couple of prior bugs I’d seen, plus it seemed to be the virtually the only thing that SY hadn’t tried).

Declaring a varchar2(x CHAR) is fairly rare – and with all the messing around with padding, encoding etc. going on, the code to handle multi-byte character sets might be a fruitful source of bugs. So I re-ran the test, but changed the declaration from varchar2(9 CHAR) to varchar2(9 [byte]), and Oracle happily ran my test to completion.

On its own this isn’t a complete workaround. If you’re running with a multi-byte character set a declaration using character semantics means Oracle allows you to store many more bytes than characters. Conversely, if you use byte semantics you will have to declare your column with a large enough byte count to store the number of (multi-byte) characters you really want – but then that could allow your users to insert more characters than you wanted (unless the character set was a fixed-width character set – but then you could waste a lot of space storing character strings – see this note about “in row” CLOB columns).

So, to use byte semantics with a character limit, you have to adopt a strategy that I once saw at a company running Peoplesoft (I assume it’s been changed since – it was a long time ago). They declared their varchar2() columns with far too many bytes (4 times the required character count) then added a check constraint on the length to restrict the number of characters. (In their case that resulted in tens of thousands of check constraints in the database with an undesirable overhead on dictionary cache latching and parse times).

Here’s an alternative declaration of the table that allows the alter table move command to work and still ensures the correct maximum number of characters in the varchar2() column:

create table test_tbl_move(
        encrypted_column varchar2(18)
                encrypt using 'aes192' 'sha-1' no salt
                constraint ttm_nn_ec not null 
                constraint ttm_ck_ec_len9 check (length(encrypted_column) <= 9),
        clob_column      clob
)
lob (clob_column) store as securefile
/

Table created.

SQL> insert into test_tbl_move values('0123456789','xxx');
insert into test_tbl_move values('0123456789','xxx')
*
ERROR at line 1:
ORA-02290: check constraint (TEST_USER.TTM_CK_EC_LEN9) violated

Footnotes

If, like me, the last time you’d played around with encryption was in 11g you’ll find that a lot has changed in setting it up – not only in the added requirements for pluggable databases but also with a new command framework. (As usual Tim Hall’s blog on the topic is a good starting point if you want to do a quick experiment in a sand box.)
The code sample include SHA-1 as the integrity algorithm – ‘NOMAC’ is the only alternative, and in any single table the same algorithm has to be used for all encrypted columns. (If you try to use SHA-1 on one column and NOMAC on another as you create the table Oracle will raise “ORA-28379: a different integrity algorithm has been chosen for the table”. More importantly – a note in the Oracle 21c reference manual states that SHA-1 is deprecated from that version onwards and advises moving from TDE column encryption to TDE tablespace encryption.

Comments (3)

July 18, 2022

drop t/s bug

Filed under: Bugs,LOBs,Oracle,Partitioning,Troubleshooting — Jonathan Lewis @ 9:17 am BST Jul 18,2022

A recent thread on the MOS database admin forum (needs an account) demonstrated a number of little issues with bugs, debugging, complex syntax, and the never-ending list of “not quite complete” code that shows up when features (in this case LOBs and partitioning) collide.

It’s a silly little thing, but one to remind you that you always have to ask “What have I forgotten?”, “Does my test suite include every case I need to test?”

In this case we had been given a model where, after creating a composite partitioned table with a LOB column using a statement that referenced 3 different tablespaces, the OP had

moved one component of this complex structure to a fourth tablespace
dropped the tablespace that had been defined as the holder of the moved component
renamed the fourth tablespace to match the name of the dropped tablespace
called dbms_metadata.get_ddl() to generate a new table definition.

The result of this was that the generated statement included a reference to a tablespace with the unexpected name of “_$deleted$51$0”.

Sidenote: The name is an example of the value stored in ts$.name when you drop a tablespace. The row in ts$ is not deleted (or marked for deletion), instead ts$.online$ is set to 3 and ts$.name is set to reflect the tablespace number with a name of the format “_$deleted${ts$.ts#}$0″.

Here, with some cosmetic changes to the “create table” statement and with a preamble to create enough tablespaces and quotas, is the model supplied by the OP (if you’re thinking of running it make sure you read it carefully first):

rem
rem     Script:         drop_ts_pt_bug.sql
rem     Author:         Jean-François56 / Jonathan Lewis
rem     Dated:          July 2022
rem
rem     Last tested 
rem             19.11.0.0
rem             11.2.0.4
rem 

connect sys/sys as sysdba

prompt  =================================================
prompt  Clearing the stage - some steps may report errors
prompt  =================================================

drop table test_user.test purge;

drop tablespace jf1 including contents and datafiles;
drop tablespace jf2 including contents and datafiles;
drop tablespace jf3 including contents and datafiles;
drop tablespace jf4 including contents and datafiles;

create tablespace jf1 datafile size 100m;
create tablespace jf2 datafile size 100m;
create tablespace jf3 datafile size 100m;
create tablespace jf4 datafile size 100m;

alter user test_user quota unlimited on jf1;
alter user test_user quota unlimited on jf2;
alter user test_user quota unlimited on jf3;
alter user test_user quota unlimited on jf4;

prompt  =======================
prompt  Connecting to test user
prompt  =======================

connect test_user/test

drop table test purge;

create table test(
       idarchive                number(10,0),
       data                     blob,
       partition_date           date,
       customer                 number(10,0),
       prefix_archive_key       varchar2(5)
)
partition by range (partition_date)
subpartition by list (customer)
(
partition p1 
        values less than (to_date('2008-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss', 'nls_calendar=gregorian'))
        tablespace jf1
        lob (data) store as basicfile (tablespace jf2 chunk 16384)
        (
        subpartition sp values (1) 
                tablespace jf3
                lob (data) store as basicfile (tablespace jf2)
        )
);

alter table test move subpartition sp lob(data) store as (tablespace jf4);


connect sys/sys as sysdba

drop tablespace jf2 including contents and datafiles;
alter tablespace jf4 rename to jf2;


connect test_user/test

set long 20000
set longchunksize 20000

set linesize 132
column text_line format a128

begin
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'PRETTY',             true);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'TABLESPACE',         true);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'SEGMENT_ATTRIBUTES', true);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'STORAGE',            false);
end;
/

select dbms_metadata.get_ddl('TABLE','TEST') text_line from dual
/

The first thing I did after reading the posting was a quick search on MOS, using the four search terms: drop tablespace rename deleted. It was a lucky choice because on the first page of results from the Knowledge Base I found:

Renaming a Tablespace to An Already Dropped one Changes The Tablespace Name To "_$DELETED" (Doc ID 1937848.1)

Conveniently the notes in this document said: “This is caused by bug 18136584”,and that bug (and the problem) is labelled as “The bug is fixed in 12.2”. Unfortunately the OP was running 11.2.0.4, but that’s okay because the note also said: “Backport is feasible. However, A simple workaround is available.” So I thought I’d create the test case and check the workaround – which I why I’ve got the script.

It just so happened that I prepared, debugged and ran the script (without the workaround) on 19.11 before bothering to start up a VM with 11.2.0.4 – and I got the following output from my call to dbms_metadata.get_ddl():

  CREATE TABLE "TEST_USER"."TEST"
   (	"IDARCHIVE" NUMBER(10,0),
	"DATA" BLOB,
	"PARTITION_DATE" DATE,
	"CUSTOMER" NUMBER(10,0),
	"PREFIX_ARCHIVE_KEY" VARCHAR2(5)
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  TABLESPACE "TEST_8K_ASSM"
 LOB ("DATA") STORE AS SECUREFILE (
  ENABLE STORAGE IN ROW CHUNK 8192
  NOCACHE LOGGING  NOCOMPRESS  KEEP_DUPLICATES )
  PARTITION BY RANGE ("PARTITION_DATE")
  SUBPARTITION BY LIST ("CUSTOMER")
 (PARTITION "P1"  VALUES LESS THAN (TO_DATE(' 2008-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  TABLESPACE "JF1"
 LOB ("DATA") STORE AS BASICFILE (
  TABLESPACE "_$deleted$36$0" ENABLE STORAGE IN ROW CHUNK 16384 RETENTION
  NOCACHE LOGGING )
 ( SUBPARTITION "SP"  VALUES (1) SEGMENT CREATION IMMEDIATE
 LOB ("DATA") STORE AS BASICFILE (
  TABLESPACE "JF2" )
  TABLESPACE "JF3"
 NOCOMPRESS ) )

If you check line 18 (highlighted) you’ll see that even in 19.11 you can end up with generated statement that references “deleted” tablespaces – and here’s a funny little side effect (cut-n-paste from SQL*Plus – but your deletion number will probably be different):

SQL> drop table test;
drop table test
           *
ERROR at line 1:
ORA-00959: tablespace '_$deleted$36$0' does not exist

SQL> drop table test purge;

Table dropped.

So (part of) the bug is still present in 19.11; and on the plus side that means I can examine the workaround to see what it is and how it works. This is what the note says: “Execute below command once you finished with dropping and renaming the tablespace”:

alter table <table_name> modify default attributes tablespace <old_tablespace_name>;

That’s not actually going to do anything for us – but it’s an important clue to how we might be able to fix things; it also suggests why the bug “fixed” in 12.2 isn’t quite fixed in 19c – someone missed a bit of the code path: maybe the bit about LOBs, or maybe the bit about composite partitioned tables, or maybe (very precisely) the bit about partition-level default values for LOBs in composite partitioned tables. (And who knows what might be missing if we start looking at index-organized tables, nested table, and other complicated structures.)

Taking the clue from the suggested workaround, here are three possible fixes to try:

alter table test modify default attributes tablespace jf2;

alter table test modify default attributes                  lob(data) (tablespace jf2);

alter table test modify default attributes for partition p1 lob(data) (tablespace jf2);

The third of these options is the one that “works” – and the word is in quote marks because all I mean is that the generated SQL uses JF2 as the tablespace name rather than the “deleted” tablespace name – I make no guarantee about how future behaviour might vary from past behaviour after this change, and it’s likely to depend on exactly how you’re expecting to add and move partitions and subpartitions anyway.

Follow-up

The problem (which means, possibly, the omitted code path) comes from the need for handling default storage clauses. When we’re handling composite partitioning the only segments that come into existence are subpartition segments – but you can still specify physical (and logical) storage information at the table and partition level with the inference (possibly not stated explicitly) that any table-level metadata should be the default metadata for the partition level and any partition-level metadata should be the default metadata for subpartitions. What does this mean in terms of our original table creation script and the subsequent call to dbms_metadata?

You’ll notice that I’ve highlighted lines 8 – 11 in the output above from dbms_metadata.

Line 8 references tablespace test_8k_assm: I didn’t include a default tablespace at the table level for the table segments, but that’s the tablespace that happened to be my default tablespace when I ran the script.
Lines 9 – 11 define a default LOB storage clause with no specified tablespace and using securefiles (which is the default LOB storage for 19c). Again I didn’t specify anything about a table-level LOB in my original definition.
The rest of the generated definition has, apart from the “deleted” tablespace, reproduced my original definition – including the 16KB declaration of chunk size for the partition and the lack of specified chunksize for the subpartition.

So questions to think about:

what chunk size is / would be used in the subpartition – is it silently picking up the value specified for the partition, or is it silently picking up the default for the table, or is it simply using the “absolute” default of 1 block?
what happens if I execute a simple “add subpartition” on the existing p1 partition? Where will the subpartition be stored and what will its storage details look like.
What will I see if I execute a simple “add partition” to add a new partition to the table. Will I also get a physical subpartition and if so where will it be and what will its storage clause look like.
What would the dbms_metadata output have looked like if I had had a table-level LOB definition that specified tablespace jf2?
What side effects might appear if I extended the definition to interval partitioning, with automatic list subpartitions, and inserted a row that needed a new partition and subpartition?!

Underlying all these detailed questions, of course, is the specification for the maintenance work that the DBA is expected to handle, viz:

what is the defined strategy for adding new partitions and subpartitions to the table,
what is the strategy for aging out old partitions and subpartitions.
are there any plans about grouping partitions into “age-related” tablespaces
are tablespaces going to be renamed to transport them to another database

It’s possible that the anomaly in this note only showed up because the OP was experimenting with options, and maybe the ultimate production code will be based on a strategy that means the anomaly will never appear. It’s possible that the anomaly is already in the production system but only became visible when someone decided to think about archiving out old partitions and the archival code started raising errors. Playing around with models to discover what happens is time well spent; and modelling the full production life cycle before going live is a critical activity.

Some answers

To find out what we’ve actually got from the original create table statement we can query the views (user/all/dba/cdb):

USER_TABLES
USER_PART_TABLES
USER_TAB_PARTITIONS
USER_TAB_SUBPARTITIONS

USER_LOBS
USER_PART_LOBS
USER_LOB_PARTITIONS
USER_LOB_SUBPARTITIONS

USER_SEGMENTS

For the chunk sizes we find: user_lobs.chunk = 8192, user_lob_partitions.chunk = 2 (blocks?) and user_lob_subpartitions.chunk = 16,384 (bytes?). We also have user_part_lobs.def_chunk = 1 (block?). So the explicit p1 partition chunk size has cascaded down to its subpartition.

Side note: who needs consistency! You might also want to remember that when you call dbms_space.space_usage() procedure that appeared in 11g to report the space used in a (secure file) LOB, the “blocks” input parameters actually return chunks and (for CLOBS) the “bytes” input parameters actually return character counts.

For the tablespaces we find: user_lobs.tablespace_name= TEST_8K_ASSM, user_lob_partitions.tablespace_name = _$deleted$36$0 and user_lob_subpartitions.tablespace_name = JF2. We also have user_part_lobs.def_tablespace_name is null.

What happens if I try to add a subpartition to the existing p1 partition without first applying the fix:

SQL> alter table test modify partition p1 add subpartition sp2 values(2);
alter table test modify partition p1 add subpartition sp2 values(2)
            *
ERROR at line 1:
ORA-00959: tablespace '_$deleted$36$0' does not exist

Now try again but apply the fix before adding the subpartition:

SQL> alter table test modify default attributes for partition p1 lob(data) (tablespace jf2);

Table altered.

SQL> alter table test modify partition p1 add subpartition sp2 values(2);

Table altered.

Checking the data dictionary for the effects of changing the default attribute we find that user_lob_partitions.tablespace_name is now JF2, which has then been taken on by the new subpartition.

What about adding a new partition:

SQL> alter table test add partition p2 values less than (to_date('01-Jan-2010'));

Table altered.

SQL> select partition_name, chunk , tablespace_name from user_lob_partitions order by partition_name;

PARTITION_NAME              CHUNK TABLESPACE_NAME
---------------------- ---------- ------------------------------
P1                              2 JF2
P2                              1

SQL> select lob_partition_name, subpartition_name, chunk , tablespace_name from user_lob_subpartitions order by 1,2;

LOB_PARTITION_NAME   SUBPARTITION_NAME           CHUNK TABLESPACE_NAME
-------------------- ---------------------- ---------- ------------------------------
SYS_LOB_P20609       SP                          16384 JF2
SYS_LOB_P20609       SP2                         16384 JF2
SYS_LOB_P20622       SYS_SUBP20621                8192 TEST_8K_ASSM

The new partition has no tablespace_name, but it has automatically generated a subpartition (values (default)), which has climbed the tree to the table-level to set the tablespace for the LOB, and that had defaulted to the tablespace of the table itself, which was the user default tablespace of TEST_8K_ASSM. Maybe we should have modified the “default attributes lob(data)” at some point so that the user_part_lobs.def_tablespace_name was not null.

I’ll finish with just one more comment – you’ve seen how messy things can get and how much detail could be overlooked when handling marginally complex composite partitioned table. Do you really think that interval partitioning and automatic list partitioning are really going to mean you don’t have to worry about partition maintenance code? Possibly. If you plan to have one huge tablespace for all the bits and never have to worry about backing up and restoring that tablespace you will be able to forget about all the housekeeping code, but realistically you’ll need to know how to check and change the metadata and rename, move or otherwise manipulate segments so make sure you know what’s going to happen so that you don’t have to work it out when everyone’s running around in panic mode.

Comments (3)

May 31, 2022

Join View delete

Filed under: ANSI Standard,Bugs,CBO,Join Elimination,Oracle,Transformations — Jonathan Lewis @ 5:39 pm BST May 31,2022

I’ve written a couple of notes about deleting from join views and (ultimately) the fact that if you have muliple key-preserved tables in join view then the first key preserved table in the from clause is the one where the delete is applied. The topic came up on the Oracle developer forum again today, and closed with a funny little quirk.

Here’s a little data model followed by one query and two deletes that I’ve just run on Oracle 19.11.0.0 and 21.3.0.0. The query and first delete feature a join using “ANSI” syntax, the second delete translates the join into “traditional” Oracle syntax.

rem
rem     Script:         delete_join_2.sql       
rem     Dated:          May 2022
rem     Author:         J P Lewis
rem
rem     Last tested
rem             21.3.0.0
rem             19.11.0.0
rem

create table t_parent (
        id      number 
                        constraint t_parent_pk primary key,
        val     varchar2(10)
);

create table t_child (
        id      number 
                        constraint t_child_fk references t_parent(id),
        val     varchar2(10)
);

alter table t_child add constraint t_child_pk primary key(id);

insert into t_parent values(1,rpad('parent',10));
insert into t_child  values(1,rpad('child', 10));

commit;

execute dbms_stats.gather_table_stats(user,'t_parent',cascade=>true)
execute dbms_stats.gather_table_stats(user,'t_child', cascade=>true)

set autotrace on explain

select t_child.* from t_parent join t_child on t_child.id = t_parent.id;

delete
        (select t_child.* from t_parent join t_child on t_child.id = t_parent.id);

delete (select t_child.* from t_parent, t_child where t_child.id = t_parent.id);

set autotrace off

The table definitions and content (with a little modification) come from the forum posting, and I have to admit that the “one-to-one” referential integrity from parent to child is a little unusual (though occasionally a very effective strategy). I’ve also enabled autotrace to get the (probable) execution plans for the three statements, but in the script on my test machine I’ve also enabled the CBO (10053) trace to pick up a few extra details.

The query is there to show us that the join will return a single row, and the row content will be from the t_child table. What happens, though, when we try to use that query as an inline view to the delete.

It is an updateable join view – and both tables are, in fact, key-preserved thanks to the primary key constraints at both ends of the join. Here’s what we get (with the execution plans removed from the output):

        ID VAL
---------- ----------
         1 child

1 row selected.



delete
*
ERROR at line 1:
ORA-02292: integrity constraint (TEST_USER.T_CHILD_FK) violated - child record found


1 row deleted.

Line 5 tells us that we selected one row – as expected.

Line 11 tells us that we’ve raised an error because we’re trying to break the foreign key constraint on the t_child table – and that error shouldn’t be a surprise because the t_parent table appears first in the from clause, so that’s the table the optimizer wants to delete rows from.

However, line 15 tells us that if we change to “traditional ” Oracle syntax the delete works – so which of the two delete statements is displaying buggy behaviour?

Execution plans

There’s a clue in the execition plans (when we get them – autotrace doesn’t give a plan for the delete that raised the error). So here are the plans for the select and the “traditional” delete in that order:

----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |    14 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| T_CHILD |     1 |    14 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------



-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | DELETE STATEMENT |            |     1 |     3 |     1   (0)| 00:00:01 |
|   1 |  DELETE          | T_CHILD    |       |       |            |          |
|   2 |   INDEX FULL SCAN| T_CHILD_PK |     1 |     3 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------

With assistance from the foreign key constraint, Join Elimination has occurred in both cases leaving t_child as the candidate for deletion. You’ll also note that the optimizer has recognised that while the view’s select list is “all columns” it only needs to optimize for “select rowid” to drive the delete, so produces an index-only access path to get the data it needs.

If we use explain plan, or check the 10053 trace file, for the ANSI delete that failed we’ll find the following:

-----------------------------------------------------------------------------------
| Id  | Operation           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT    |             |     1 |     6 |     1   (0)| 00:00:01 |
|   1 |  DELETE             | T_PARENT    |       |       |            |          |
|   2 |   NESTED LOOPS      |             |     1 |     6 |     1   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN  | T_CHILD_PK  |     1 |     3 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN| T_PARENT_PK |     1 |     3 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$4396EC5C
   3 - SEL$4396EC5C / T_CHILD@SEL$1
   4 - SEL$4396EC5C / T_PARENT@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T_CHILD"."ID"="T_PARENT"."ID")

There’s no Join Elimination, and Oracle is following the rule that “the first key preserved table in the from clause“ should be the target of the deletes.

Where’s the bug?

This is a question that doesn’t have an obvious answer.

The traditional syntax gives you the result you probably wanted but it’s deleting from the second table in the from clause of the join view – so is that a documentation bug or a coding bug because if the documentation is correct the statement should have raised the error that the ANSI syntax gave.

Conversely, is there an optimizer bug (or limitation) because table elimination could have been used in the ANSI case but wasn’t, although you could say that the ANSI code is doing the right thing (following the documentation) by raising an error and that the traditional syntax is doing the wrong thing.

If we search the CBO (10053) traces of the select and the ANSI join, looking for sections where the optimizer considers Join Elimination, the results are helpful:

grep -nT "^JE" or19_ora_30861_*.trc >temp.txt

With a little cosmetic editing here’s the output from the trace of the select statement:

or19_ora_30861_select.trc:  78  JE - Join Elimination
or19_ora_30861_select.trc: 807  JE:[V2]   Considering Join Elimination on query block SEL$1 (#0)
or19_ora_30861_select.trc: 813  JE:[V2] Query block (0x7fd3a152f5c0) before join elimination:
or19_ora_30861_select.trc: 816  JE:[V2]: Try to eliminate T_CHILD by ref. join elim using NTSID
or19_ora_30861_select.trc: 817  JE:[V2]: Cannot eliminate T_CHILD by ref. join elim - no constr. based join pred(s)
or19_ora_30861_select.trc: 818  JE:[V2]: Try to eliminate T_PARENT by ref. join elim using PRIMARY(ID) <- FOREIGN(ID)
or19_ora_30861_select.trc: 819  JE:[V2]: Cannot eliminate T_PARENT by ref. join elim - predicate column refs.
or19_ora_30861_select.trc: 820  JE:[V2]: Try to eliminate T_PARENT by ref. join elim using NTSID
or19_ora_30861_select.trc: 821  JE:[V2]: Cannot eliminate T_PARENT by ref. join elim - no constr. based join pred(s)
or19_ora_30861_select.trc: 822  JE:[V2] Query block (0x7fd3a152f5c0) after join elimination:

or19_ora_30861_select.trc: 844  JE:[V2]   Considering Join Elimination on query block SEL$58A6D7F6 (#0)
or19_ora_30861_select.trc: 850  JE:[V2] Query block (0x7fd39c09ee60) before join elimination:
or19_ora_30861_select.trc: 853  JE:[V2]: Try to eliminate T_CHILD by ref. join elim using NTSID
or19_ora_30861_select.trc: 854  JE:[V2]: Cannot eliminate T_CHILD by ref. join elim - no constr. based join pred(s)
or19_ora_30861_select.trc: 855  JE:[V2]: Try to eliminate T_PARENT by ref. join elim using PRIMARY(ID) <- FOREIGN(ID)
or19_ora_30861_select.trc: 856  JE:[V2]: Can eliminate T_PARENT by ref. join elim using PRIMARY(ID) <- FOREIGN(ID)
or19_ora_30861_select.trc: 857  JE:[V2] Eliminate table: T_PARENT (T_PARENT)
or19_ora_30861_select.trc: 858  JE:[V2] Query block (0x7fd39c09ee60) after join elimination:

We can see that the optimizer has considered join elimination twice (lines 2 and 12), once for each of two query blocks (sel$1 – a query block embedded in the text – and sel$58a6d7f6, a query block name generated during some query transformation).

At lines 5 and 6 we can see that Oracle failed to take advantage of the referential integrity constraint to eliminate t_parent with the explanation “predicate column refs”; but at lines 16 and 17 we see that there’s a change of heart and the optimizer does take advantage of the referential integrity to eliminate t_parent. So let’s look at the text of the “query block before join elimination” in both cases and see what’s changed:

First pass
----------
SELECT 
        "T_PARENT"."ID"  "QCSJ_C000000000300000",
        "T_PARENT"."VAL" "QCSJ_C000000000300002",
        "T_CHILD"."ID"   "QCSJ_C000000000300001",
        "T_CHILD"."VAL"  "QCSJ_C000000000300003" 
FROM 
        "TEST_USER"."T_PARENT" "T_PARENT",
        "TEST_USER"."T_CHILD" "T_CHILD" 
WHERE 
        "T_CHILD"."ID"="T_PARENT"."ID"


Second Pass
-----------
SELECT 
        "T_CHILD"."ID" "ID",
        "T_CHILD"."VAL" "VAL" 
FROM 
        "TEST_USER"."T_PARENT" "T_PARENT",
        "TEST_USER"."T_CHILD" "T_CHILD" 
WHERE 
        "T_CHILD"."ID"="T_PARENT"."ID"

The first pass seems to have expanded the two tables in the intial JOIN, ignoring the limited select list, so join elimination is blocked by the presence of the (non-PK) column t_parent.val (aliased as qcsj_c000000000300002, but when we get to the 2nd pass the select list has been limited to the minimum necessary set of columns that would be needed by a pure select statement and join elimination is possible.

The clue to how/why this has happened is the in the few lines between the two passes; the trace file also reports: “CVM: Merging SPJ view SEL$1 (#0) into SEL$2 (#0)” and when we look for the point where the optimizer registered query block sel$2 it turns out to have one item in its from clause, an object called: from$_subquery$_003 (which tells us where the 3 came from in all those column aliases QCSJ_C0000000003nnnnn. In effect it looks like the optimizer started by rewriting the ANSI select in Oracle syntax as (with a little poetic licence for clarity):

select  child_id, child_val, parent_id
from    (
         select  t_child.*, t_parent.*
         from    t_child, t_parent
         where   t_child.id = t_parent.id
        ) from$_subquery$_003
/

So the first pass was the optimizer trying to do join elimination on the inner query block and the second pass happened after the inner block was merged with the outer block and the resulting query block (called sel$58a6d7f6) allowed the join elimination to take place

So what happened with the delete statement. Here’s the result of the call to grep:

or19_ora_30861_delete.trc:  90  JE - Join Elimination
or19_ora_30861_delete.trc: 837  JE:[V2]   Considering Join Elimination on query block SEL$1 (#0)
or19_ora_30861_delete.trc: 843  JE:[V2] Query block (0x7fd3a152f758) before join elimination:
or19_ora_30861_delete.trc: 846  JE:[V2]: Try to eliminate T_CHILD by ref. join elim using NTSID
or19_ora_30861_delete.trc: 847  JE:[V2]: Cannot eliminate T_CHILD by ref. join elim - no constr. based join pred(s)
or19_ora_30861_delete.trc: 848  JE:[V2]: Try to eliminate T_PARENT by ref. join elim using PRIMARY(ID) <- FOREIGN(ID)
or19_ora_30861_delete.trc: 849  JE:[V2]: Cannot eliminate T_PARENT by ref. join elim - predicate column refs.
or19_ora_30861_delete.trc: 850  JE:[V2]: Try to eliminate T_PARENT by ref. join elim using NTSID
or19_ora_30861_delete.trc: 851  JE:[V2]: Cannot eliminate T_PARENT by ref. join elim - no constr. based join pred(s)
or19_ora_30861_delete.trc: 852  JE:[V2] Query block (0x7fd3a152f758) after join elimination:

That’s the lot – there is no attempt to do join elimination on a querye block called sel$58a6d7f6, though a few lines after this extract the trace file reports:

CVM:   Merging SPJ view SEL$1 (#0) into SEL$2 (#0)
Registered qb: SEL$58A6D7F6 0x9c0a3038 (VIEW MERGE SEL$2; SEL$1; SEL$2)

That’s exactly the same view merging that was present in the select – but the optimizer doesn’t attempt join elimination before moving on to report, a couple of lines later:

CVM:   Merging SPJ view SEL$58A6D7F6 (#0) into DEL$1 (#0)
Registered qb: SEL$4396EC5C 0x9c09dde0 (VIEW MERGE DEL$1; SEL$58A6D7F6; DEL$1)

It looks like this is a case of the optimizer missing a cycle in its transform/optimze loopback. The delete went through exactly the same process as the select, but failed to optimize the intermediate query block (which happened to be called from$_subquery$_004 in this case, being one layer deeper thanks to the extra level introduced by the delete).

Workaround

I tried two or three things to get the ANSI-style code to work without changing the table order in the from clause – extra layers of subqueries, explicit aliases, but Oracle kept coming back to a point where it had the full projection of all the columns in the two tables. (This reminded me of an old bug in ANSI select expansion that was fixed in 12cR2. It made me wonder if this was a code path where the same bug had been overlooked – it also made me wonder if there would be any problems if the two tables in the join totalled more than 1,000 columns – but that’s a test for another day.)

I did eventually find a fix (that didn’t involved switching the table order in the inline view):

delete
        (select t_child.id from t_parent join t_child on t_child.id = t_parent.id);

Note that I now have only the primary key of t_child in the select

The section in the CBO trace file for Join Elimination looked like this:

JE:[V2] Query block (0x7f93996231f8) before join elimination:
SQL:******* UNPARSED QUERY IS *******
SELECT "T_PARENT"."ID" "QCSJ_C000000000400000","T_CHILD"."ID" "QCSJ_C000000000400001" FROM "TEST_USER"."T_PARENT" "T_PARENT","TEST_USER"."T_CHILD" "T_CHILD" WHERE "T_CHILD"."ID"="T_PARENT"."ID"
JE:[V2]: Try to eliminate T_CHILD by ref. join elim using NTSID
JE:[V2]: Cannot eliminate T_CHILD by ref. join elim - no constr. based join pred(s)
JE:[V2]: Try to eliminate T_PARENT by ref. join elim using PRIMARY(ID) <- FOREIGN(ID)
JE:[V2]: Can eliminate T_PARENT by ref. join elim using PRIMARY(ID) <- FOREIGN(ID)
JE:[V2] Eliminate table: T_PARENT (T_PARENT)
JE:[V2] Replaced column: T_PARENT.ID with column: T_CHILD.ID
JE:[V2] Query block (0x7f93996231f8) after join elimination:
SQL:******* UNPARSED QUERY IS *******
SELECT "T_CHILD"."ID" "QCSJ_C000000000400000","T_CHILD"."ID" "QCSJ_C000000000400001" FROM "TEST_USER"."T_CHILD" "T_CHILD"
Registered qb: SEL$E703A888 0x996231f8 (JOIN REMOVED FROM QUERY BLOCK SEL$1; SEL$1; "T_PARENT"@"SEL$1")

You may decide that this is an obvious workaround, but it’s not particularly helpful. In the general case I would expect to see a stored view that joined the two tables, with the expectation that this type of delete was just one of many operations the view was used for. Rewriting a view to use the increasingly common ANSI syntax shouldn’t result in some statements crashing “for no apparent reason”.

The last laugh

Given that the select and delete both included a merge of sel$1 into sel$2 to produce a query block called sel$58a6d7f6 I wondered what would happen if I blocked the merge:

SQL>  delete  /*+  no_merge (@sel$1) */
  2          (select t_child.id from t_parent join t_child on t_child.id = t_parent.id);
	(select t_child.id from t_parent join t_child on t_child.id = t_parent.id)
        *
ERROR at line 2:
ORA-01752: cannot delete from view without exactly one key-preserved table

Isn’t that the error message that should have been coming out the entire time – according to the orginal documentation for updateable join views?

Comments (5)

May 5, 2022

Lag/Lead slow

Filed under: Bugs,Oracle,Performance,Problem Solving,Troubleshooting,Tuning — Jonathan Lewis @ 10:05 am BST May 5,2022

This note is about a surprising performance difference between the lead() and lag() analytic functions (which31 turns out to be due to the behaviour of the nth_value() function) when the option to “ignore nulls” is included in their use (jump to conclusion). The detail I’ll be writing about was highlighted in a thread on the Oracle developer forum about a requirement to add a number of analytic columns to a dataset of 156 million rows using a statement of the following shape:

create table tb_target_ignore
as
select
        pat_id,
        visit_date_dt,
        ed_ucc_dt,
        lag (ed_ucc_dt ignore nulls, 1) over (partition by pat_id order by visit_date_dt) as prev_ed_ucc_dt,
        lead(ed_ucc_dt ignore nulls, 1) over (partition by pat_id order by visit_date_dt) as next_ed_ucc_dt,
        row_number() over (partition by pat_id order by visit_date_dt) as row_num
from
        tb_source
;

You’ll notice that I’ve introduced a row_number(), and both a lead() and a lag() of a column called ed_ucc_dt. All three analytic columns use the same partitioning and ordering, though, so Oracle will only be doing one “window sort” in the execution plan. Part of the performance problem, of course, was that with 156M rows of a couple of dozen existing columns and adding a dozen new columns, the workload due to sorting was bound to be very large, so there were various suggestions of how to minimise that part of the workload.

However Solomon Yakobson pointed out that the code was using the “ignore nulls” option and there was a bug in 11g that made lead() and lag() very slow when this option was used. He subsequently reported that this defect was still present in 19c, restricted it to just the lag() function, and cited a MOS document ID referencing the problem: LAG Function is slow when using ignore nulls (Doc ID 2811596.1). The implication of the MOS note is that we shouldn’t expect this to change.

A follow-up posting by User_H3J7U gave us a reason for the slowness of the lag() function by running a sample query through dbms_utility.expand_sql(). Oracle rewrites the query to use variants of the nth_value() function when you use “ignore nulls”, but rewrites it to use variants of first_value() when you aren’t using the “ignore nulls” option. This isn’t a complete explanation of why lag() should be slow while lead() is not – but it’s a significant pointer towards a possible implementation issue and is a good clue about working around the problem. So let’s build a model of the situation.

The basic model

rem
rem     Script:         lag_ignore_nulls.sql
rem     Author:         Jonathan Lewis / Sam P
rem     Dated:          May 2022
rem     Purpose:     
rem
rem     Last tested
rem             19.11.0.0
rem

create table tb_source (
        pat_id,
        visit_date_dt,
        ed_ucc_dt
)
as
with generator as (
        select rownum id
        from    dual
        connect by
                level <= 1e4    --> comment to avoid wordpress format issue
)
select
        g1.id,
        to_date('01-Apr-2022') + dbms_random.value(0,100),
        to_date('01-Apr-2022') + dbms_random.value(5,105)
--      to_date(null)
from
        generator g1,
        generator g2
where
        g2.id <= 20     --> comment to avoid wordpress format issue
order by
        dbms_random.value
/


spool lag_ignore_nulls.lst

set serveroutput off
set timing on

prompt  ======================
prompt  Without "ignore nulls"
prompt  (My time 0.61 seconds)
prompt  ======================

create table tb_target_no_ignore
as
select
        pat_id,
        visit_date_dt,
        ed_ucc_dt,
        lag (ed_ucc_dt, 1) over (partition by pat_id order by visit_date_dt) as prev_ed_ucc_dt,
        lead(ed_ucc_dt, 1) over (partition by pat_id order by visit_date_dt) as next_ed_ucc_dt,
        row_number() over (partition by pat_id order by visit_date_dt) as row_num
from
        tb_source
;

prompt  ======================
prompt  With "ignore nulls"
prompt  (My time 0.88 seconds)
prompt  ======================

create table tb_target_ignore
as
select
        pat_id,
        visit_date_dt,
        ed_ucc_dt,
        lag (ed_ucc_dt ignore nulls, 1) over (partition by pat_id order by visit_date_dt) as prev_ed_ucc_dt,
        lead(ed_ucc_dt ignore nulls, 1) over (partition by pat_id order by visit_date_dt) as next_ed_ucc_dt,
        row_number() over (partition by pat_id order by visit_date_dt) as row_num
from
        tb_source
;

I’ve created a source table with 200,000 rows, consisting of 10,000 pat_id values, and 20 rows per pat_id. The 20 rows for a pat_id (probably) each have a different visit_date_dt and a different ed_ucc_dt.

After creating the data set I’ve created two more tables using the lead() and lag() functions to generate a previous (lag) and next (lead) ed_ucc_dt for each row, partitioning by pat_id, ordering by visit_date_dt. One statement includes the “ignore nulls” option the other doesn’t and, as you can see, the time to create the “no ignore” table was 0.61 seconds while the time to create the “ignore null” table was 0.88 seconds.

The variation isn’t dramatic – but this is just 200,000 rows, in memory, with only a few columns and only two columns added through lead and lag.

After the baseline test I tweaked the statement that created the table with the “ignore nulls” option to get three more times.

With neither lead() nor lag() the time was 0.29 seconds
With just the lead() column the time was 0.46 seconds – an increase of 0.17 seconds
With just the lag() column the time was 0.71 seconds – an increase of 0.42 seconds

You might note that 0.29 + 0.17 + 0.42 = 0.88 (the time I got for adding both columns) – it’s a little lucky that it looks like a perfect match, but even matching within a couple of hundredths of a second would be have been a nice detail. It certainly seems that lag() – with my test data – consumes more resources than lead() for a pair of operationd that look as if they should produce the same workloads.

Internal Rewrite

The next step was to check what the internal rewrite of the code looked like, so I passed the select part of the statements (the procedure won’t accepts “create as select”) through dbms_utility.expand_sql() and reformatted the results. Here are the two rewritten statements – first without “ignore nulls”:

select
        a1.pat_id pat_id,
        a1.visit_date_dt visit_date_dt,
        a1.ed_ucc_dt ed_ucc_dt,
        decode(
                count(*) over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between 1 preceding and 1 preceding
                        ),a
                 1,     first_value(a1.ed_ucc_dt) over (
                                partition by a1.pat_id order by a1.visit_date_dt
                                rows between 1 preceding and 1 preceding
                        ),
                        null
        ) prev_ed_ucc_dt,
        decode(
                count(*) over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between 1 following and 1 following
                        ),a
                 1,     first_value(a1.ed_ucc_dt) over (
                                partition by a1.pat_id order by a1.visit_date_dt
                                rows between 1 following and 1 following
                        ),
                        null
        ) next_ed_ucc_dt,
        row_number() over (partition by a1.pat_id order by a1.visit_date_dt) row_num
from
        test_user.tb_source a1

The code looks a little long and messy, but that’s mainly because just about everything it does happens twice. The lag() function (prev column) turns into a first_value() function that looks at the row preceding the current row in the partition (rows between 1 preceding and 1 preceding). However it first has to count over the same clause to see if a row exists, and then either report its value or report a null – hence the structure decode(count(), 1, first_value(), null)

Note: the full lag() function call is: “lag(expression, offset, default)” where the offset (number of rows to lag) defaults to 1 and the default is the value you want reported when there is no matching row, and defaults to null.

The call to lead() basically does the same thing, but uses (rows between 1 following and 1 following) to access the next row in the partition.

On the other hand this is the SQL that Oracle generates when we include the “ignore nulls” clause (which means Oracle can’t restrict the row range to just one preceding or following row):

select
        a1.pat_id pat_id,
        a1.visit_date_dt visit_date_dt,
        a1.ed_ucc_dt ed_ucc_dt,
        nvl(
                nth_value(a1.ed_ucc_dt, 1) from last ignore nulls over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between unbounded preceding and 1 preceding
                ),
                 null
        ) prev_ed_ucc_dt,
        nvl(
                nth_value(a1.ed_ucc_dt, 1)           ignore nulls over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between 1 following and unbounded following
                ),
                null
        ) next_ed_ucc_dt,
        row_number() over (partition by a1.pat_id order by a1.visit_date_dt) row_num
from
        test_user.tb_source a1

Both lag() and lead() turn into nth_value() with a second parameter of 1 (i.e. nth == 1st … which makes you wonder why Oracle isn’t using first_value()), and we can also see the “ignore nulls” still being used.

The lag() call now uses the range (rows between unbounded preceding and 1 preceding) i.e. everything from the start of partition to the previous row, while the lead() call uses the range (rows between 1 following and unbounded following) i.e. from the next row to the end of partition.

The other important detail to note is that the translation of the lag() call also includes the clause “from last” – in other words we want the first row when reading the partition in reverse order, and that might have something to do with the extra time it takes to operate the (translated) lag() function.

Workaround

Oracle is using a generic nth_value() to translate a generic lead()/lag(), but we’re in the special case where we know n = 1, which means we (and Oracle) could use first_value()/last_value(). It’s perfectly reasonable for Oracle’s internal code to avoid special cases if it makes no difference to performance, of course, but maybe in this case we could imitate Oracle’s rewrite to get some benefit.

Step 1 – change nth_value() to the appropriate first/last.
Step 2 – get rid of the “from last” which won’t be needed with last_value()
Step 3 – move the “ignore nulls” to the spot that Oracle wants to see it with first/last

Here’s the resulting SQL – I’ve left the nvl(count, expression, null) in place, but if you wanted a null as the default return value for the original lead()/lag() calls you could simplify the code a little further.

create table tb_target 
as
select
        a1.pat_id pat_id,
        a1.visit_date_dt visit_date_dt,
        a1.ed_ucc_dt ed_ucc_dt,
        nvl(
                last_value(a1.ed_ucc_dt ignore nulls) over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between unbounded preceding and 1 preceding
                ),
                 null
        ) prev_ed_ucc_dt,
        nvl(
                first_value(a1.ed_ucc_dt ignore nulls) over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between 1 following and unbounded following
                ),
                null
        ) next_ed_ucc_dt,
        row_number() over (partition by a1.pat_id order by a1.visit_date_dt) row_num
from
        test_user.tb_source a1
/

Run-time: 0.61 seconds.
Run-time with just last_value() / preceding: 0.47 seconds
Run time with just first_value() / following: 0.43 seconds

There still seems to be a little discrepancy between accessing to the preceding data compared to accessing the following data but there’s a much better balance than before.

One more edit – taking out the nvl() construct because the original lead()/lag() calls didn’t have a non-null default supplied:

create table tb_target 
as
select
        a1.pat_id pat_id,
        a1.visit_date_dt visit_date_dt,
        a1.ed_ucc_dt ed_ucc_dt,
--
        last_value(a1.ed_ucc_dt ignore nulls) over (
                partition by a1.pat_id order by a1.visit_date_dt
                rows between unbounded preceding and 1 preceding
        ) prev_ed_ucc_dt,
--
        first_value(a1.ed_ucc_dt ignore nulls) over (
                partition by a1.pat_id order by a1.visit_date_dt
                rows between 1 following and unbounded following
        ) next_ed_ucc_dt,
--
        row_number() over (partition by a1.pat_id order by a1.visit_date_dt) row_num
from
        test_user.tb_source a1
/

With this simplification the time dropped by a further couple of hundredths of a second hovering between 0.57 and 0.58 seconds.

There was one last detail about the test code that should be mentioned – do the changes in code still produce the same results? As a quick and dirty check I ran the following query after each variant of creating the tb_target table (tb_target_ignore is the table created using the original lead()/lag() code with “ignore nulls”):

select  *
from   (
        select * from tb_target minus select * from tb_target_ignore
        union all
        select * from tb_target_ignore minus select * from tb_target
        )
/

The result was always zero rows.

Hypothesis

I raised the idea that the difference in timing for the lead() and lag() functions might have something to do with the volume of data that Oracle could be processing to find the one row it needed.

My thinking was that for the lead() rewrite – the call to nth_value(ed_ucc_dt,1) – would simply be looking at the next row in the partition (if it existed) because my data has no nulls that neeed to be ignored, while the rewrite of the lag() function with its “from last” requirement could be making Oracle re-read the entire preceding section of the partition before starting to process it backwards.

As a possible check to see if this was a viable hypothesis I ran one more test – visible in the initial declaration of tb_source – I created the data with ed_ucc_dt set to null in every single row, so that Oracle would be forced to process from the current position to whichever end of the partition was relevant regardless of whether it was calling lead() or lag().

With this change in place the timing for the lead() only and lag() only statements were nearly identical – which is a weak support for the hypothesis.

And once I’d done that test the next obvious test was to see what happened if I increased size of each partition (using non-null values for ed_ucc_dt) to see if larger partitions would increase the difference between the forward and backward tests. To do this I changed the script to create the tb_source table to produce 5,000 pat_id value with 40 rows per pat_id by changing the where clause to:

where
        g2.id <= 40     --> comment to avoid wordpress format issue
and     g1.id <= 5e3    --> comment to avoid wordpress format issue

With this change in place the timings for the original form of the lead()/lag() statement were:

With both lead() and lag() in place the time was 1.05 seconds
With neither lead() nor lag() the time was 0.25 seconds
With just the lead() column the time was 0.41 seconds – an increase of 0.16 seconds
With just the lag() column the time was 0.98 seconds – an increase of 0.73 seconds

So the lag() time (ballpark figures) nearly doubles as the partition size doubles but the lead() time stays pretty much the same.

The results of these two tests do tend to suggest that the generic nth_value() implementation can do some short-circuiting when working “forwards”, using a mechanism that isn’t available when the “from last” clause requires it to work “backwards”.

Writing the previous paragraph prompted me to do one last test – it wouldn’t produce the same results, of course, but I ought to check the performance when I moved the “from last” clause out of the “prev”/lag() column expression into the “next”/lead() column expression in Oracle’s original translation to confirm that the problem was associated with the “from last” and not with the choice of “preceding” or “following” in the row range section of the over() clause. (It was the “from last” that made the difference.)

tl;dr

If you’re using the lag() or lead() functions with “ignore nulls” on a very large dataset you may find that you can rewrite the code with first_value() or last_value() calls that use less CPU. The benefit is probably only significant on fairly large data sets, and may be particularly noticeable for cases where the same over() clause is used many times and the partition sizes are more than a couple of dozen rows each.

The potential for excess CPU usage comes from the effect of a generic internal rewrite using the nth_value() function with the “from last” clause even for the special case where your lead()/lag() have an offset (i.e. n) of 1 which would allow for the use of first_value()/last_value().

To get the correct rewrite you can use dbms_utility.expand_sql() to generate a suitable statement from which you can extract and edit the relevant pieces of text.

Footnote

A further point raised by Solomon Jakobson was that in the generic case (where you can’t use first_value() / last_value() to rewrite the SQL because you want to use an offset greater than one) it’s worth noting that the following two expressions are effectively identical:

        nvl(
                nth_value(a1.ed_ucc_dt, 2) from last ignore nulls over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between unbounded preceding and 1 preceding
                ),
                 null
        ) prev_ed_ucc_dt

        nvl(
                nth_value(a1.ed_ucc_dt, 2)           ignore nulls over (
                        partition by a1.pat_id order by a1.visit_date_dt desc
                        rows between 1 following and unbounded following
                ),
                 null
        ) prev_ed_ucc_dt
/

Note particularly that to avoid the issue of scanning the data “from last”, I’ve changed the ordering (in line 11) to descending and I’ve changed the preceding of line 4 to following in line 12 while reversing the positions of 1 and unbounded. (The same strategy can be applied for the lag() function in the original code).

In general it’s likely that lead() is will be more efficient than lag() so this change could make a big difference in performance. It’s worth noting, however, that if you’ve been using both lead() and lag() as my example did then your execution plan will (truthfully) show two “window sort” operations and the second (i.e. lower number in the plan) window sort will also be sorting the columns added by the first window sort. Depending on the nature of your data the additional sort might increase the workload by more than the benefit you get from eliminating a lag().

In an example created by Solomon Jakobson the effect of using lag( ,5) was catastrophic and the change that introduced the extra sort to get a lead( ,5) made a huge difference (40 seconds down to 0.15 seconds – and I haven’t made a mistake in the decimal point there); in my original test case the improvement I got from applying the same strategy was small (0.88 seconds down to 0.86 seconds). The critical difference between the test cases was that one had a single partition of 20,000 rows, the other had (10,000) partitions of only 20 rows.

(Footnote to the footnote – it did cross my mind that if you had a performance catastrophe with the simple lag(,n) and with the extra sort from reversing the order to use a lead(,n) would Oracle do something clever if you found a way to rewrite the query with a cascade of inline-views that all used last_value() – but I decided I wasn’t going to look at that option until someone paid me to. More seriously I did wonder about the option for using match_recognize to bypass the lag() problem, and may get around to thinking about that more carefully at some point.)

Comments (2)

May 4, 2022

redefinition error

Filed under: Bugs,Infrastructure,Oracle — Jonathan Lewis @ 12:22 pm BST May 4,2022

Here’s a note about a data error generated by using (possibly mis-using) the dbms_redefinition package (jump to conclusion). The original code to demonstrate the problem comes from a note on the Oracle Developer forum, and was brought to my attention by a tweet from Daniel Stein.

The critical feature of the demo is that we can end up with a column containing nulls despite being declared NOT NULL (and I don’t mean by that a simple “is not null” check constraint – which is not quite the same as a NOT NULL declaration).

Here’s the first part of a script, mostly copied from the forum post, that I’ve I’ve been running on 19.11.0.0:

rem
rem     Script:         redef_bug.sql
rem     Author:         Jonathan Lewis / Sebastian (User_6AT2M)
rem     Dated:          May 2022
rem
rem     Last tested 
rem             19.11.0.0
rem

create table test_nulls (
        id      number (8)      constraint tn_pk primary key,
        text    varchar2 (25)   constraint tn_nn_tx not null
)
/

create unique index i_test_nulls on test_nulls(text);

insert into test_nulls (id, text) 
select rownum, 'some text '||rownum from dual connect by level <= 50;

-- create an empty copy, without the constraints or indexes

create table test_nulls_interim (
        id      number (8),
        text    varchar2 (25)
);

begin
        dbms_redefinition.start_redef_table(
                uname           => user,
                orig_table      => 'test_nulls',
                int_table       => 'test_nulls_interim',
                col_mapping     => 'id id, substr(text, id, 2) text'
        );
end;
/

The script creates a table with a primary key declared on an id column, a not null declaration on a text column and a unique index on the text column, then populates the table with 50 rows that have the form (N, “some text N”) where N is a number between 1 and 50.

After creating an empty copy of the table with no constraints or indexes I start an online redefinition – modifying the content of the text column as part of the redefinition. If you check the col_mapping carefully you will realise that when id reaches 13 the result from the substr() function becomes null.

This is where the trouble starts. If I now call dbms_redefition.copy_table_dependents() to add the original constraints and indexes to the interim table what’s Oracle going to do about the not null declaration on the text column?

declare
        error_ct pls_integer;
begin
        dbms_redefinition.copy_table_dependents(
                uname           => user,
                orig_table      => 'test_nulls',
                int_table       => 'test_nulls_interim',
                num_errors      => error_ct,
                copy_indexes    => 1,
                copy_constraints=> true,
                ignore_errors   => false,
--
                copy_triggers   => false,
                copy_privileges => false,
                copy_statistics => false,
                copy_mvlog      => false
        );  

        dbms_output.put_line('error count: '||error_ct);
end;
/

begin
        dbms_redefinition.finish_redef_table(user, 'test_nulls', 'test_nulls_interim');
end;
/

drop table TEST_NULLS_INTERIM purge;

I’ve exposed all the parameters to the copy_table_dependents() procedure call in my code, and you can see that I’ve chosen to copy only the constraints and indexes, and I don’t want to ignore errors.

The PL/SQL anonymous block terminates successfully, doesn’t report any errors, and outputs an error count of zero. So let’s see what we’ve got as the final result of the redefinition.

column search_condition_vc format a20
break on table_name skip 1 on index_name
set echo on

select
         table_name, index_name, column_name 
from 
        user_ind_columns 
order by 
        1,2,column_position
/

desc test_nulls

select count(*) from test_nulls;
select /*+ full(test_nulls) */ count(*) from test_nulls;

select
        constraint_name,
        constraint_type,
        search_condition_vc,
        status,
        deferrable,
        deferred,
        validated
from
        user_constraints
where
        table_name = 'TEST_NULLS'
/

set echo off

Here are the results – my login.sql has a lot of column format commands so your results may look a lot messier if you run this bit of code. Here’s the echoed output:

SQL> select
  2           table_name, index_name, column_name
  3  from
  4          user_ind_columns
  5  order by
  6          1,2,column_position
  7  /

TABLE_NAME                INDEX_NAME           COLUMN_NAME
------------------------- -------------------- --------------------
TEST_NULLS                I_TEST_NULLS         TEXT
                          TN_PK                ID


2 rows selected.

SQL> 
SQL> desc test_nulls
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 ID                                                                       NOT NULL NUMBER(8)
 TEXT                                                                     NOT NULL VARCHAR2(25)

SQL> 
SQL> select count(*) from test_nulls;

  COUNT(*)
----------
        12

1 row selected.

SQL> select /*+ full(test_nulls) */ count(*) from test_nulls;

  COUNT(*)
----------
        50

1 row selected.

SQL> 
SQL> select
  2          constraint_name,
  3          constraint_type,
  4          search_condition_vc,
  5          status,
  6          deferrable,
  7          deferred,
  8          validated
  9  from
 10          user_constraints
 11  where
 12          table_name = 'TEST_NULLS'
 13  /

CONSTRAINT_NAME      C SEARCH_CONDITION_VC  STATUS   DEFERRABLE     DEFERRED  VALIDATED
-------------------- - -------------------- -------- -------------- --------- -------------
TN_NN_TX             C "TEXT" IS NOT NULL   ENABLED  NOT DEFERRABLE IMMEDIATE VALIDATED
TN_PK                P                      ENABLED  NOT DEFERRABLE IMMEDIATE VALIDATED

Both indexes have arrived; both columns have NOT NULL declarations.

When you count the number of rows in the table it’s only 12 – unless you force a full tablescan in which case it’s 50. This happens because the NOT NULL declaration of column text allows the optimizer to use an index-only plan on the index i_test_nulls and there are 38 nulls in the table that don’t appear in the index.

The check on user_constraints shows that both the primary key constraint and the “is not null” check constraint are enabled and validated.

Conclusion

There is code in the copy_table_dependents() procedure that is defective, deficient and dangerous. Clearly I’ve done something that I probably shouldn’t have done (and, historically, I wouldn’t have done) but the code should still have protected me from an error that leaves the database in a state that is not internally consistent and can produce incorrect results.

In my example the copy_table_dependents() procedure has allowed me to add a NOT NULL declaration to a column after the col_mapping parameter of start_table_redef() has resulted in rows that have nulls in that column. There may be variants on this theme that allow other types of error to appear.

Notes

If you want to repeat this test and try some variations on the theme you will need a few non-trivial privileges (which can be granted through a role). For the basic dbms_redefinition package you will need: execute on dbms_redefinition, select any table, create any table, alter any table, lock any table, drop any table; and to execute the copy_table_dependents() procedure you will also need create any index, create any trigger.

The copy_table_dependents() procedure appeared (I think) in the 10g time-line; prior to that you had to sort out all the constraints and dependencies “by hand” – which you would probably manage online through the sync_interim_table() procedure (viz: “create an index, sync interim table, create next index, sync etc.) before calling the finish_redef_table() procedure. That being the case my immediate response to this issue was that if you don’t want the not null declaration on text then you can have to exclude the copy_constraints option when copying the table dependants; if you did want the not null declaration then you should have included it in the initial definition of the interim table because the start_redef_table() call would then have failed, raising:

ORA-12008: error in materialized view or zonemap refresh path
ORA-01400: cannot insert NULL into ("{schema}"."TEST_NULLS_INTERIM"."TEXT")

Comments (4)

March 28, 2022

Drop column bug

Filed under: Bugs,Infrastructure,Oracle — Jonathan Lewis @ 11:29 am BST Mar 28,2022

In the previous note about a problem dropping virtual columns the “guilty party” that made it impossible to drop any columns was based on a complex data type owned by the MDSYS (Spatial) schema. This note demonstrates the same problem with a very simple example created from scratch in an ordinary user schema.

rem
rem     Script:         object_virtual_col.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2022
rem
rem     Last tested 
rem             19.11.0.0
rem

create type point_type as object(x_coord number, y_coord number);
/

create or replace function my_point(inpoint in point_type)
return point_type
deterministic as 
begin
        return inpoint;
end;
/

show errors

create table t1 (id, n1, p1, n2, v1, padding)
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        rownum                          n1,
        point_type(rownum, rownum)      p1,
        rownum                          n2,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1
where
        rownum <= 100   -- > comment to avoid WordPress format issue
;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

alter table t1 add constraint t1_pk primary key(id);

So I’ve declared a type “point” which is an object with two attributes of type number, and I’ve created a function that takes a point as its input parameter and returns a point. Then I’ve created a table which includes a column of type point.

Let’s start with a little reminder of what a pain it is to use even simple object types correctly. What’s going to happen with the following three SQL statements:

select    p1.x_coord from t1    where rownum <= 4;
select t1.p1.x_coord from t1    where rownum <= 4;
select t1.p1.x_coord from t1 t1 where rownum <= 4;

The first two will fail – the first one shouldn’t be too surprising, the second does seem a little unreasonable:

ORA-00904: "P1"."X_COORD": invalid identifier
ORA-00904: "T1"."P1"."X_COORD": invalid identifier

So let’s try adding some virtual columns to pick out the X value:

alter table t1 add x_val generated always as (p1.x_coord) virtual;
alter table t1 add c_val generated always as (cast(p1.x_coord as number)) virtual;

The first call will fail (ORA-54016: Invalid column expression was specified) but the second will succeed. What if we try to hide out point column behind a call to our function:

alter table t1 add fp_val generated always as (my_point(p1)) virtual;
alter table t1 add fx_val generated always as (my_point(p1).x_coord) virtual;

Again the first call will fail (ORA-54004: resultant data type of virtual column is not supported) but that’s a documented restriction – a user-defined type may not be used as the type of a virtual column and I wasn’t at that point trying to return just the one attribute.

The second call, however, will succeed. So I can’t create a virtual column p1.x_coord, but I can create a virtual column my_point(p1).x_coord.

We now have two virtual columns that should return the required values, so that’s do a quick check with a couple of simple queries – cut and paste:

SQL> select fx_val "my_point(p1).x_coord" from t1 where rownum <= 4;

my_point(p1).x_coord
--------------------
                   1
                   2
                   3
                   4

4 rows selected.

SQL> select c_val  "cast(p1.x_coord as -" from t1 where rownum <= 4;

cast(p1.x_coord as -
--------------------
                   1
                   2
                   3
                   4

4 rows selected.

Finally we’ll finish off by demonstrating that I’ve just created a problem that no-one will notice until long after I’ve left the site (maybe):

SQL> alter table t1 drop column n1;
alter table t1 drop column n1
 *
ERROR at line 1:
ORA-00904: "TEST_USER"."MY_POINT": invalid identifier

After creating (and using successfully) the virtual column that calls my function, I can’t drop any of the columns in the table.

Summary

The manuals have a stated restriction for virtual columns that they cannot be a user-defined type, and this restriction seems to carry forward to an attribute of a user-defined type unless the attribute has been cast() to a base type.

The same restriction seems to apply to functions returning a user-defined type, but not to the individual attributes of the returned value – it is not necessary to cast() them to a base type. However, if you (accidentally) take advantage of this relaxation of the restriction you will be unable to drop any columns from the table in the future.

Comments (1)

March 24, 2022

Drop column bug

Filed under: Bugs,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 11:07 am GMT Mar 24,2022

Here’s a problem that appeared recently on the Orace Developer forum showing one of the classic symptons of new features namely that “mix and match” often runs into problems. This example has been a long time coming so “new” is something of a misnomer, but the alternative source of unlucky collisions is in the “rare” features – in this case Spatial. (It’s possible that the problem is not restricted to just Spatial but it probably does require a somewhat exotic data type.)

The problem appeared in a thread on the Oracle Developer Forum from someone who was trying to drop a pair of columns and finding that the statement failed with a surprising choice of error: ORA-00904: Invalid Identifier. The surprising thing about this error was that the named identifier was clearly not invalid. The suspicion that that this was an example of “new features colliding” was that the columns to be dropped were virtual columns based on a real column of the table that had been declared as an object type defined in the MDSYS (Spatial) schema.

Conveniently the author had supplied a short, simple, script to demonstrate the issue, so I copied it and modified it a bit to do a few tests around the edges of the problem. Here’s the code that I used to start my investigation:

rem
rem     Script:         drop_col_bug.sql
rem     Author:         Jonathan Lewis/User_77G7L
rem     Dated:          Mar 2022
rem     Purpose:        
rem
rem     Last tested 
rem             21.3.0.0        Still broken
rem             19.11.0.0
rem

create table xxx (
        v1              varchar2(10),
        n1              number,
        shape           mdsys.sdo_geometry,
        x_value         number generated always as (mdsys.sdo_geom.sdo_pointonsurface(shape,0.005).sdo_point.x) virtual,
        y_value         number generated always as (mdsys.sdo_geom.sdo_pointonsurface(shape,0.005).sdo_point.y) virtual,
        v2              varchar2(10),
        n2              number,
        n3              number
)
segment creation immediate
;

insert into xxx(v1, n1, v2, n2, n3) values('z',1,'y',2,3);
update xxx  set
        shape = sdo_geometry(
                2003,                           -- two-dimensional polygon
                null,
                null,
                sdo_elem_info_array(1,1003,3),  -- one rectangle (1003 = exterior)
                sdo_ordinate_array(1,1, 5,7)    -- only 2 points needed to define rectangle
        )
;


commit;

alter table xxx drop (x_value, y_value) ;

The modifications I made from the original code are:

I’ve removed a couple of redundant sets of parentheses from the virtual column definitions
I’ve added a few columns before, in between, and after the virtual columns
I’ve used “segment creation immediate”
I’ve inserted a row into the table

The last two are simply to ensure that I have data segments and at least one item for the drop to work on – just in case it’s a run-time problem being reported as a parse time issue.

The extra columns are to test whether the type and position of the column I drop affects the outcome, and the change in parentheses is purely aesthetic.

Here’s the result of the attempt to drop the virtual columns:

alter table xxx drop (x_value, y_value)
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier

This is a little strange since I have used the packaged function mdsys.sdo_geom.sdo_pointonsurface() to define the virtual columns and Oracle didn’t complain when I created the column. (Part of the reason I had reduced the original parentheses was to check that the compiler hadn’t got confused by an excess of paretheses).

As a quick “what if” test I tried using the alternative syntax for drop column that you can use with just one column:

SQL> alter table xxx drop column x_value;
alter table xxx drop column x_value
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier

What about trying to set the column unused before dropping all unused columns?

SQL> alter table xxx set unused column x_value;
alter table xxx set unused column x_value
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier

So is the problem restricted to the virtual columns – what happens if I try to drop a column from the end of the table, what about the one between the two virtual columns, how about a column that appears before even the shape column? Nothing changes:

SQL> alter table xxx drop column v1;
alter table xxx drop column v1
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier


SQL> alter table xxx set unused column v1;
alter table xxx set unused column v1
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier

What if I have only one of the virtual columns? No difference.

What if I don’t have either of the virtual columns? Finally I can drop any column I like from the table (including the shape column). Not that that’s much use to the user.

You can’t set unused or drop any columns in the table thanks to an error that looks as if it’s associated with the definition of those virtual columns.

Workaround

Is there any way to bypass the problem and still store the information we need (until we want to drop it). Let’s start by taking a look at the way Oracle has used our table definition to create column definitions, just in case that gives us a clue:

select 
        column_id id, segment_column_id seg_id, internal_column_id int_id, 
        column_name, data_type, data_default  
from 
        user_tab_cols 
where 
        table_name = 'XXX' 
order by 
        column_id,
        internal_column_id
/

        ID     SEG_ID     INT_ID COLUMN_NAME          DATA_TYPE                 DATA_DEFAULT
---------- ---------- ---------- -------------------- ------------------------- --------------------------------------------------------------------------------
         1          1          1 V1                   VARCHAR2
         2          2          2 N1                   NUMBER
         3          3          3 SHAPE                SDO_GEOMETRY
         3          4          4 SYS_NC00004$         NUMBER
         3          5          5 SYS_NC00005$         NUMBER
         3          6          6 SYS_NC00006$         NUMBER
         3          7          7 SYS_NC00007$         NUMBER
         3          8          8 SYS_NC00008$         NUMBER
         3          9          9 SYS_NC00009$         SDO_ELEM_INFO_ARRAY
         3         10         10 SYS_NC00010$         SDO_ORDINATE_ARRAY
         4                    11 X_VALUE              NUMBER                    (("MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE"("SHAPE",0.005))."SDO_POINT")."X"
         5                    12 Y_VALUE              NUMBER                    (("MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE"("SHAPE",0.005))."SDO_POINT")."Y"
         6         11         13 V2                   VARCHAR2
         7         12         14 N2                   NUMBER
         8         13         15 N3                   NUMBER

15 rows selected.

There’s quite a lot going on there in terms of columns hidden behind the sdo_geometry type. In fact internal columns 9 and 10 might prompt you to look for other objects like table types or LOBs:

SQL> select column_name, segment_name, index_name from user_lobs where table_name = 'XXX';

COLUMN_NAME                    SEGMENT_NAME                   INDEX_NAME
------------------------------ ------------------------------ ------------------------------
"SHAPE"."SDO_ELEM_INFO"        SYS_LOB0000100168C00009$$      SYS_IL0000100168C00009$$
"SHAPE"."SDO_ORDINATES"        SYS_LOB0000100168C00010$$      SYS_IL0000100168C00010$$

2 rows selected.

But the interesting detail is the data_default column for our two virtual columns – which have more parentheses than the original definitions. Perhaps the storage of the expression has gone wrong (as happened in an older version of Oracle with case expressions) and is causing the ORA-00904 error to appear. So let’s try selecting data from the table using the expression stored in data dictionary:

select
        ((MDSYS.SDO_GEOM.SDO_POINTONSURFACE(SHAPE,0.005)).SDO_POINT).X          old_x,
          mdsys.sdo_geom.sdo_pointonsurface(shape,0.005).sdo_point.x            new_x,
        ((MDSYS.SDO_GEOM.SDO_POINTONSURFACE(SHAPE,0.005)).SDO_POINT).Y          old_y
from
        xxx
/

     OLD_X      NEW_X      OLD_Y
---------- ---------- ----------
         1          1          1

1 row selected.

No syntax error there – as far as a simple select is concerned. I’ve included my tidier format for the x_value column aligned with the resulting stored value (with all the double quotes removed – though I’ve also tested it with the quotes in place) – and the only significant visual difference is the number of parentheses, so maybe that’s a clue. In particular we note that the error reports “MDSYS”.”SDO_GEOM”.”SDO_POINTONSURFACE” as the invalid identifier and the first time an extra (close) parenthesis appears is just after that function call. Maybe (for no good reason) the code path involved with handling column data during a drop/set unused call is getting confused by parentheses. So let’s try to reduce the complexity of the expression by hiding it inside a local function.

First attempt – create a function to return an sdo_point_type and define the virtual columns to expose the X and Y values from the point:

create or replace function my_point(
        inshape         in mdsys.sdo_geometry,
        intolerance     in number
)
return mdsys.sdo_point_type
deterministic
as
begin
        return mdsys.sdo_geom.sdo_pointonsurface(inshape, intolerance).sdo_point;
end;
/

        x_value         number generated always as (my_point(shape,0.005).x) virtual,
        y_value         number generated always as (my_point(shape,0.005).y) virtual,

This approach still produces an ORA-00904, though the invalid identifier becomes “TEST_USER”.”MY_POINT”.

Second attempt – two separate functions, one for the x value, one for the y value:

create or replace function my_x(
        inshape         in mdsys.sdo_geometry,
        intolerance     in number
)
return number
deterministic
as
begin
        return   mdsys.sdo_geom.sdo_pointonsurface(inshape, intolerance).sdo_point.x;
end;
/

show errors

create or replace function my_y(
        inshape         in mdsys.sdo_geometry,
        intolerance     in number
)
return number
deterministic
as
begin
        return   mdsys.sdo_geom.sdo_pointonsurface(inshape, intolerance).sdo_point.y;
end;
/

        x_value         number generated always as (my_x(shape, 0.005)) virtual,
        y_value         number generated always as (my_y(shape, 0.005)) virtual,

This worked so, finally, I looked at the SQL Language reference manual to see if there were any restrictions on virtual columns that might explain the problem I had had with all the previous definitions (and, yes, I know I should have done that right after the first failure) and I found the following:

The virtual column cannot be an Oracle supplied data type, a user-defined type, or LOB or LONG RAW.

None of my virtual column definitions returned an Oracle supplied data type or a user-defined data type. But would the restriction also apply to single attributes of such a data type, or has the complexity of spatial data types found a hole in the code? And the answer to that question is a whole new blog note waiting to be finish (because I’d forgotten what a pain it was to use object types in simple SQL.)

Comments (1)

March 22, 2022

Upgrade Surprise

Filed under: 19c,Bugs,Oracle,Transformations,Upgrades — Jonathan Lewis @ 10:04 am GMT Mar 22,2022

Here’s a little surprise that showed up in the most recent (March 2022) article that I sent to Simpletalk for the series on transformations. I had been using 19c (19.11.0.0) to create and run my little demos but the editor had used 12.2.0.1 to check the examples and questioned a comment I had made about a “default plan”.

Here’s the query in question. I was using the emp and dept tables from the Scott schema to demonstrate a point about subquery execution:

rem
rem     Script:         unnest_demo_simpletalk_3.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Feb 2022
rem 

break on deptno skip 1

select
        /*+ 
                qb_name(main)
                gather_plan_statistics 
        */
        *
from    emp e1
where   e1.sal + nvl(e1.comm,0) > (
                select  /*+ qb_name(subq) */
                        avg(e2.sal + nvl(e2.comm,0))
                from    emp e2
                where   e2.deptno = e1.deptno
        )
order by
        e1.deptno, e1.empno
/

As you can see, I’ve used a correlated aggregate subquery to report all employees who earned more than the average for their department, where “earnings” is calculated as the sum of salary and commission.

Here’s the plan I got when I ran this query under 19c:

------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      6 |00:00:00.01 |      24 |       |       |          |
|   1 |  SORT ORDER BY       |      |      1 |      1 |      6 |00:00:00.01 |      24 |  2048 |  2048 | 2048  (0)|
|*  2 |   FILTER             |      |      1 |        |      6 |00:00:00.01 |      24 |       |       |          |
|   3 |    TABLE ACCESS FULL | EMP  |      1 |     14 |     14 |00:00:00.01 |       6 |       |       |          |
|   4 |    SORT AGGREGATE    |      |      3 |      1 |      3 |00:00:00.01 |      18 |       |       |          |
|*  5 |     TABLE ACCESS FULL| EMP  |      3 |      5 |     14 |00:00:00.01 |      18 |       |       |          |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("E1"."SAL"+NVL("E1"."COMM",0)>)
   5 - filter("E2"."DEPTNO"=:B1)

When my editor ran the query on 12.2.0.1, and when I started up an instance of 12.2.0.1 and ran the query, the plan looked like this:

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      6 |00:00:00.02 |      29 |      6 |       |       |          |
|*  1 |  FILTER              |      |      1 |        |      6 |00:00:00.02 |      29 |      6 |       |       |          |
|   2 |   SORT GROUP BY      |      |      1 |      4 |     14 |00:00:00.02 |      29 |      6 |  2048 |  2048 | 2048  (0)|
|*  3 |    HASH JOIN         |      |      1 |     70 |     70 |00:00:00.02 |      29 |      6 |  1922K|  1922K| 1053K (0)|
|   4 |     TABLE ACCESS FULL| EMP  |      1 |     14 |     14 |00:00:00.01 |       7 |      6 |       |       |          |
|   5 |     TABLE ACCESS FULL| EMP  |      1 |     14 |     14 |00:00:00.01 |       7 |      0 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      DB_VERSION('12.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$9E18A930")
      MERGE(@"SEL$AA0D0E02" >"SEL$B4BE209F")
      OUTLINE(@"SEL$B4BE209F")
      UNNEST(@"SUBQ")
      OUTLINE(@"SEL$AA0D0E02")
      OUTLINE(@"SEL$D6166863")
      OUTLINE(@"SUBQ")
      OUTLINE(@"MAIN")
      FULL(@"SEL$9E18A930" "E2"@"SUBQ")
      FULL(@"SEL$9E18A930" "E1"@"MAIN")
      LEADING(@"SEL$9E18A930" "E2"@"SUBQ" "E1"@"MAIN")
      USE_HASH(@"SEL$9E18A930" "E1"@"MAIN")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("E1"."SAL"+NVL("E1"."COMM",0)>SUM("E2"."SAL"+NVL("E2"."COMM",0))/COUNT("E2"."SAL"+NVL("E2"."COMM",0)))
   3 - access("E2"."DEPTNO"="E1"."DEPTNO")

(I’ve added in a request for the ‘outline’ in the display_cursor() format.) The correlated subquery has been unnested and the resulting inline view has been subject to complex view merging. It was only at this point that I realised that the 19c plan was a little surprising and not what I should have expected.

After checking that the configuration and statistics (including the system stats) were the same on the two instances I re-ran the query on 12c with the /*+ no_unnest */ hint to make it use the plan that 19c had produced and I found (as expected) that the plan with filter subquery produced by 19c had a higher cost than the unnesting plan produced by 12c.

Next I re-ran the query on 19c with the /*+ unnest */ hint to make it use the plan that 12c had produced – but it didn’t! 19c “ignored” the hint and carried on using the filter subquery plan. It did, however, supply the following extra information when I added the ‘hint_report’ option to the to the display_cursor() format:

Total hints for statement: 3 (U - Unused (1))
---------------------------------------------------------------------------

   1 -  MAIN
           -  qb_name(main)

   4 -  SUBQ
         U -  unnest / Failed basic validity checks
           -  qb_name(subq)

The code in 19c thinks that it would be illegal to unnest the subquery that 12c was unnesting (does this mean that some people may be getting wrong results in 12c). So I checked the 10053 (CBO) trace file to see if there was any further information there that would “explain” the problem. This is what I found:

SU: Checking validity for Null Accepting Semi JoinUnnesting for query block MAIN(#1)
SU:   Checking validity of unnesting subquery SUBQ (#2)
SU:     SU bypassed: More than 1 column in connect condition.
SU:     SU bypassed: Failed basic validity checks.
SU:   Validity checks failed.

The reference to “Null accepting” looks a little suspect but prompted a few experiments (modifying the data to eliminate nulls, adding not null declarations to columns, simplifying the query etc.) that suggested that the problem was essentially that the optimizer did not want to unnest when the comparison was with the expression (sal + comm) regardless of the operator, and even when all the relevant columns had been populated, declared not null, and the nvl() function had been removed.

It doesn’t seem reasonable in this case, but possibly the block is a generic strategy to avoid errors in some more subtle cases, and perhaps the block will be refined and relaxed in future versions. (Or maybe it’s a bug that the wrong test is being called at this point – there was nothing in the query requiring “null acceptance” by the time I got through the last test.)

I did find a workaround that you could use to avoid any code rewrite:

alter table emp add nvl_earnings 
        invisible 
        generated always as (sal + nvl(comm,0)) 
        virtual
;

There’s seems to be no good reason why this should work – but it does. The subquery unnests and the Predicate Information in the plan doesn’t give any clue that it’s using a virtual column.

Summary:

When you upgrade from 12c there are some queries involving correlated subqueries that no longer unnest the subquery. This may have a significant impact on performance and it may not be possible to bypass the problem unless you rewrite the query to do a manual unnest although I did find a virtual column workaround for my specific example. So far I’ve tested the query on 19.11.0.0 and 21.3.0.0, the behaviour is the same in both versions.

Footnote:

After failing to find anything on MOS about the problem I emailed a draft of this note to Nigel Bayliss at Oracle – who did find a promising match on MOS.

The failure to unnest may be the consequence of the fix for bug 30593046: “A query having a scalar subquery returned a wrong result”. The fix was introduced in 19.9.0.0 but was too restrictive, leading to the creation of bug 33325981: “Query Loses Subquery Unnesting From Plan in 19.9 and Above”.

The fix for 33325981 was distributed in 19.13.0.0 and 21.4.0.0 (plus a couple of earlier RURs, with patches available for various versions back to 12.2.0.1). Unfortunately the newer fix still doesn’t go far enough in reducing the restrictions and my example still doesn’t unnest.

Make sure you check any code that depends on “expression-based” subquery unnesting before you upgrade to 19.9, as it may change plan, which may make a difference to performance and a requirement for a workaround.

Comments (2)

January 1, 2022

Happy New Year

Filed under: 12c,Bugs,CBO,Execution plans,Oracle,Transformations — Jonathan Lewis @ 12:01 am GMT Jan 1,2022

Here’s an entertaining little bug that appeared on the oracle-l list server just in time to end one year and start another in a suitable way. The thread starts with an offering from Matthias Rogel (shown below with some cosmetic changes) to be run on Oracle 12.2.0.1:

rem
rem     Script:         group_by_bug.sql
rem     Author:         Matthias Rogel  / Jonathan Lewis
rem     Dated:          Dec 2021
rem
rem     Last tested 
rem             19.11.0.0       Fixed
rem             12.2.0.1        Fail
rem

create table t as (
        select date'2021-12-30' d from dual 
        union all 
        select date'2021-12-31'   from dual
);

select extract(year from d), count(*) from t group by extract(year from d);

alter table t add primary key(d);
select extract(year from d), count(*) from t group by extract(year from d);

This doesn’t look particularly exciting – I’ve created a table with two rows holding two dates in the same year, then counted the number of rows for “each” year before and after adding a primary key on the date column. Pause briefly to think about what the results might look like …

…

Table created.


EXTRACT(YEARFROMD)   COUNT(*)
------------------ ----------
              2021          2

1 row selected.


Table altered.


EXTRACT(YEARFROMD)   COUNT(*)
------------------ ----------
              2021          1
              2021          1

2 rows selected.

After adding the primary key (with its unique index) the result changes to something that is clearly (for this very simple data set) wrong.

At this point I offered a hypothetical reason why Oracle might be producing the wrong result, but Tim Gorman was one step ahead of me and supplied a bug reference from MOS: Wrong Result Using GROUP BY with EXTRACT Function Against DATE (Doc ID 2629968.1)

The MOS document describes this as a bug introduced in the upgrade from 12.1.0.2 to 12.2.0.1, demonstrates the error with the extract() function applied to a date, and supplies three possible workarounds (but not the workaround or explanation I supplied in my response on oracle-l).

The document also pointed to a further bug note that described how the problem also appeared with the combination of the to_char() function applied to a date column with a unique indexes: 12.2 Wrong Results from Query with GROUP BY Clause Based on To_char Function of Unique Index Columns (Doc ID 2294763.1) with a further suggestion for applying a patch (see MOS Doc ID: 26588069.8) or upgrading to 18.1 (where the bug has been fixed).

Matthias Rogel supplied a follow-up demonstrating the problem with to_char(), which prompted me to create an example showing that it wasn’t just about dates – which I’ve tidied up below (reminder, all results on this page are from 12.2.0.1):

create  table t1 as 
select  round(rownum/10,1) n1 
from    all_objects 
where   rownum <= 10
;


select n1 from t1 order by n1;
select n1, count(*) from t1 group by n1 order by n1;

column nch format A3

select to_char(n1,'99') nch, count(*) from t1 group by to_char(n1,'99') order by 1,2;

select * from table(dbms_xplan.display_cursor(format =>'outline'));

alter table t1 add constraint t1_pk primary key(n1);
select to_char(n1,'99') nch , count(*) from t1 group by to_char(n1,'99') order by 1,2;

select * from table(dbms_xplan.display_cursor(format =>'outline'));

As before I’ve created a simple table, and populated it with a few rows of data. THe first two queries are there to show you the data (0.1 to 1.0 by steps of 0.1), and show that aggregating the raw data produces one row per value.

I’ve then repeated the aggregation query, but converted each value to a character string that effectively rounds the value to an integer. Here are the two sets of results, before and after adding the primary key.

NCH   COUNT(*)
--- ----------
  0          4
  1          6

2 rows selected.

Table altered.

NCH   COUNT(*)
--- ----------
  0          1
  0          1
  0          1
  0          1
  1          1
  1          1
  1          1
  1          1
  1          1
  1          1

10 rows selected.

Again, the introduction of the primary key constraint on the column results in wrong results. In this example, though I’ve pulled the execution plans from memory along with their outlines, and this is what the two plans look like.

SQL_ID  gt5a14jb0g4n0, child number 0
-------------------------------------
select to_char(n1,'99') nch, count(*) from t1 group by to_char(n1,'99')
order by 1,2

Plan hash value: 2808104874

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |     4 (100)|          |
|   1 |  SORT ORDER BY      |      |    10 |    30 |     4  (50)| 00:00:01 |
|   2 |   HASH GROUP BY     |      |    10 |    30 |     4  (50)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   |    10 |    30 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      DB_VERSION('12.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "T1"@"SEL$1")
      USE_HASH_AGGREGATION(@"SEL$1")
      END_OUTLINE_DATA
  */


SQL_ID  4fxxtmrh8cpzp, child number 0
-------------------------------------
select to_char(n1,'99') nch , count(*) from t1 group by
to_char(n1,'99') order by 1,2

Plan hash value: 1252675504

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |       |       |     2 (100)|          |
|   1 |  SORT ORDER BY   |       |    10 |    30 |     2  (50)| 00:00:01 |
|   2 |   INDEX FULL SCAN| T1_PK |    10 |    30 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      DB_VERSION('12.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$9BB7A81A")
      ELIM_GROUPBY(@"SEL$47952E7A")
      OUTLINE(@"SEL$47952E7A")
      ELIM_GROUPBY(@"SEL$1")
      OUTLINE(@"SEL$1")
      INDEX(@"SEL$9BB7A81A" "T1"@"SEL$1" ("T1"."N1"))
      END_OUTLINE_DATA
  */

In the absence of the primary key (index) Oracle does a full tablescan, then hash group by, then sort order by. When the primary key is put in place Oracle does an index full scan (which is legal because the index must contain all the data thanks to the not null declaration inherent in a primary key) and a sort order by without any group by.

You might wonder if the problem arises because Oracle assumes the indexed path somehow means the aggregation doesn’t apply – but with a /*+ full(t1) */ hint in place and a full tablescan in the plan the aggregation step is still missing — and if you look at the Outline Data section of the plan you can see that this is explicitly demanded by the hint(s): /*+ elim_groupby() */

My hypothesis (before I read the bug note) was that the optimizer had picked up the primary key declaration and seen that n1 was unique and therefore allowed the aggregating group by to be eliminated, but failed to “notice” that the to_char() – or extract() in the date example – meant that the assumption of uniqueness was no longer valid. To work around this problem very locally I simply added the hint /*+ no_elim_groupby */ (with no query block specified) to the query – and got the correct results.

Footnote

There is an interesting side note to this example (though not one that I would want to see used in a production system – this comment is for interest only). If you look at the Outline Data for the plan when there was no primary key you’ll notice that the only outline_leaf() is named sel$1 whereas in the plan with the primary key sel$1 appears as an outline() and the only outline_leaf() is named sel$9bb7a81a. As “outline leaf” is a query block that was used by the optimizer in constructing the final plan, while an “outline” is an intermediate query block that was examined before being transformed into another query block. So this difference in the Outline Data tells us that the problem appears thanks to a transformation that did not happen when there was no index – so what would our execution plan look like if the only hint we used in the query was /*+ outline_leaf(@sel$1) */ ?

SQL_ID  apgu34hc3ap7f, child number 0
-------------------------------------
select /*+ outline_leaf(@sel$1) */ to_char(n1,'99') nch , count(*) from
t1 group by to_char(n1,'99') order by 1,2

Plan hash value: 3280011052

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |     3 (100)|          |
|   1 |  SORT ORDER BY    |       |    10 |    30 |     3  (67)| 00:00:01 |
|   2 |   HASH GROUP BY   |       |    10 |    30 |     3  (67)| 00:00:01 |
|   3 |    INDEX FULL SCAN| T1_PK |    10 |    30 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      DB_VERSION('12.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX(@"SEL$1" "T1"@"SEL$1" ("T1"."N1"))
      USE_HASH_AGGREGATION(@"SEL$1")
      END_OUTLINE_DATA
  */

This posting was scheduled to launch at 00:01 GMT on 1st January 2022. Happy new year – just be careful that you don’t try to extract() or to_char() it if you’re running 12.2.0.1 unless you’ve applied patch 26588069.

Comments (5)

August 21, 2021

GTT LOBs

Filed under: Bugs,Infrastructure,Oracle — Jonathan Lewis @ 1:53 pm BST Aug 21,2021

Searching my blog recently for some details about a temporary space problem I came across a draft I’d written a few years ago about LOB columns in global temporary tables. It was a summary of an exchange from OTN that did a good job of capturing my general air of scepticism when reviewing database results. The basic problem came from a user who had discovered that when they included a LOB column in a global temporary table the LOBINDEX segment “was created in the SYSTEM tablespace”.

This note is probably not going to be of much practical benefit to many people – because no-one should be using an version old enough to display the anomaly – but it does demonstrate a principle and a pattern of thinking, so here goes – 5 years late.

Like the “Earn $50M by helping me steal $100M” email the claim from the OP seemed a little suspect. My basic approach to Oracle is: “if it looks unreasonable it probably isnt’ true so test it”, so I did. Here’s the first bit of SQL (which I ran on 12.1.0.2 on an empty schema):


rem
rem     Script:         gtt_lobs.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2016
rem     Purpose:
rem
rem     Last tested
rem             12.1.0.2
rem             11.2.0.4

create global temporary table gtt1(
        id      number          not null,
        v1      varchar2(10)    not null,
        c1      clob,
        constraint gtt1_pk primary key(id)
)
-- tablespace gtt_temp
;

select
        table_name, index_name, tablespace_name
from
        user_indexes
;

TABLE_NAME           INDEX_NAME                       TABLESPACE_NAME
-------------------- -------------------------------- ------------------------------
GTT1                 GTT1_PK
GTT1                 SYS_IL0000168694C00003$$         SYSTEM

Sure enough, the query against user_indexes says the LOBINDEX for the LOB in the global temporary table will be created in the SYSTEM tablespace! This is clearly ridiculous – so I’m not going to believe it until I’ve actually confirmed it. I tend to trust the data dictionary rather more than I trust the manuals and MOS – but the data dictionary is just a bunch of tables and a bit of SQL that’s been written by someone else. so even the data dictionary can be wrong. It’s easy enough to test:

insert into gtt1 values(1,'a',rpad('x',4000,'x'));

select
        username, tablespace, segtype, segfile#, segblk#
from
        v$tempseg_usage
;

USERNAME                       TABLESPACE                      SEGTYPE     SEGFILE#    SEGBLK#
------------------------------ ------------------------------- --------- ---------- ----------
TEST_USER                      TEMP                            DATA             201     261632
TEST_USER                      TEMP                            INDEX            201     261760
TEST_USER                      TEMP                            LOB_DATA         201     262016
TEST_USER                      TEMP                            INDEX            201     261888

Whatever the data dictionary says, the actual segment (at file 201, block 261888 – it would be nice if the type were LOB_INDEX rather than just INDEX, but that still hasn’t been fixed, even in 21.3.0.0) has been created in the temporary tablespace. Checking the definiton of the dba_indexes view, I came to the conclusion that the property column of the sys.ind$ table hadn’t had the bit set to show that it’s an index associated with a temporary table; as a result the SQL in the view definition reports tablespace 0 (SYSTEM) rather than decoding the zero to a null.

You’ll note that there’s a commented reference to “tablespace gtt_temp” in my original table creation statement. It’s not terribly well known but in recent versions of Oracle you can associate a global temporary table with a specific temporary tablespace – this gives you some scope for easily monitoring the impact of particular global temporary tables on the total I/O resource usage. After re-running the test but specifying this tablespace as the location for the GTT I got the following results:


TABLE_NAME           INDEX_NAME                       TABLESPACE_NAME
-------------------- -------------------------------- ------------------------------
GTT1                 GTT1_PK                          GTT_TEMP
GTT1                 SYS_IL0000262251C00003$$         GTT_TEMP

USERNAME                       TABLESPACE                      SEGTYPE     SEGFILE#    SEGBLK#
------------------------------ ------------------------------- --------- ---------- ----------
TEST_USER                      GTT_TEMP                        DATA             202        512
TEST_USER                      GTT_TEMP                        INDEX            202        384
TEST_USER                      GTT_TEMP                        LOB_DATA         202        128
TEST_USER                      GTT_TEMP                        INDEX            202        256

The anomaly disappears – everything is reported as being linked to the gtt_temp tablespace (and that includes the table itself, as reported in view user_tables, though I haven’t included that query in the test or results).

Footnote:

A few months after the first draft of this note I happened to rediscover it (but still failed to publish it) and after a quick search on MOS found the following bug, reported as fixed in 12.2, with a backport currently available to 11.2.0.3 (the document has since been hidden, and the only remaining evidence is a patch for AIX):

Bug 18897516 : GLOBAL TEMPORARY TABLE WITH LOBCOLUMNS CREATE LOB INDEX IN SYSTEM SCHEMA

Interestingly the description says: “lob index created in system tablespace” rather than “lob index incorrectly reported as being in system tablespace”. You do have to be very careful with how you describe things if you don’t want to cause confusion – this “problem” isn’t a space management threat, it’s just an irritating reporting error.

Comments (2)

July 20, 2021

Hex tip

Filed under: 19c,Bugs,Oracle,Troubleshooting — Jonathan Lewis @ 5:40 pm BST Jul 20,2021

A surprising amount of the work I do revolves around numbers; and once I’m outside the realm of the optimizer (i.e. getting away from simple arithmetic), one of the bits of playing with numbers that I do most often is conversion – usually decimal to hexadecimal, sometimes decimal to binary.

Here’s an example of how this helped me debug an Oracle error a few days ago. We start with someone trying to purge data from aud$ using the official dbms_audit_mgmt package, first setting the package’s db_delete_batch_size parameter to the value 100,000 then calling dbms_audit_mgmt.clean_audit_trail.

In theory this should have deleted (up to) 100,000 rows from aud$ starting from the oldest data. In practice it tried to delete far more rows, generating vast amounts of undo and redo, and locking up resources in the undo tablespace for ages. The SQL statement doing all the work looked like the following (after a little cosmetic work):

DELETE FROM SYS.AUD$ 
WHERE  DBID = 382813123 
AND    NTIMESTAMP# < to_timestamp('2020-12-17 00:00:00', 'YYYY-MM-DD HH24:MI:SS.FF')
AND    ROWNUM <= 140724603553440

That’s a rather large number in the rownum predicate, much larger than the expected 100,000. Whenever I am puzzled by very large numbers in places I’m not expecting to see them one of the first things I do to poke it around is to convert it to hexadecimal. (Although it seems a fairly random thing to do it doesn’t take very long and it produces an interesting result fairly frequently.)

140724603553440 (dec) = 0x7FFD000186A0

You may not think that the resulting hex number is very interesting – but there’s a string of zeros in the middle that is asking for a little extra poking. So let’s convert the last 8 digit (starting with those 3 zeros) back to decimal.

0x000186A0 = 100,000 (dec)

There’s an interesting coincidence – we’ve got back to the 100,000 that the OP had set as the db_delete_batch_size. Is this really a coincidence or does it tell us something about a bug? That’s easy enough to test, just try setting a couple of different values for the parameter and see if this affects the rownum predicate in a consistent fashion. Here are the results from two more test values:

1,000,000 ==> 140733194388032 (dec) = 0x7FFF000F4240 .... 0x000F4240 = 1,000,000 (dec)
   50,000 ==> 140728898470736 (dee) = 0x7FFE0000C350 .... 0x0000C350 =    50,000 (dec)

The top 4 digits (2 bytes) have changed, but the bottom 8 digits (4 bytes) do seem to hold the db_delete_batch_size requested. At this point I felt that we were probably seeing some sort of pointer error in a C library routine. If you examine the file $ORACLE_HOME/rdbms/admin/prvtamgt.plb) you’ll find that one of the few readable lines says:

CREATE OR REPLACE LIBRARY audsys.dbms_audit_mgmt_lib wrapped

My guess was that there were probably a couple of external C routines involved, with PL/SQL wrappers in the public package; and that there was a mismatch between the declarations in C and the declarations in the PL/SQL.

Update (after twitter exchange)

It turns out that I wasn’t quite right, but I was in the right Olympic stadium. This is now (unpublished) bug 33136016, and if you’ve been seeing unexpected work patterns when purging the audit trail after upgrading to 19c or later then there may be a patch for you in the not too distant future.

Update (Oct 2023)

This bug is now visible in MOS (and probably has been for some time) as: Doc ID 33136016.8 – incorrect handling of batch size in dbms_audit_mgmt.clean_audit_trail reported as “first fixed” in 19.19 and 23.1 (and no patches / back-ports for earlier releases yet reported).

Comments (2)

March 8, 2021

Join Elimination redux

Filed under: Bugs,CBO,Join Elimination,Oracle,Transformations — Jonathan Lewis @ 12:58 pm GMT Mar 8,2021

This note is a followup to a post from a few years back (originally dating further back to 2012) where I described an inconsistency that appeared when join elimination and deferrable constraints collided. The bug resurfacted recently in a new guise in a question on the Oracle Developer forum with a wonderful variation on the symptons that ultimately gave a good clue to underlying issue. The post included a complete working example of the anomaly, but I’ll demonstrate it using a variation of my 2012/2017 code. We start with a pair of tables with referential integrity defined between them:

rem
rem     Script:         join_eliminate_bug_3.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Feb 2021
rem
rem     Last tested 
rem             19.11.0.0
rem             19.8.0.0 (LiveSQL)
rem

create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
        deferrable initially immediate
)
;

create table child(
        id_p    number(4)       
                constraint chi_fk_par
                references parent,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id) 
)
;

insert into parent values (1,'Smith');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

commit;

begin
        dbms_stats.gather_table_stats(user,'child');
        dbms_stats.gather_table_stats(user,'parent');
end;
/

You’ll notice that I’ve created the primary key constraint on parent as “deferrable initially immediate”. So let’s write some code that defers the constraint, inserts some duplicate data executes a join between the two tables:

set serveroutput off
set constraint par_pk deferred;

insert into parent (id,name) values (1,'Smith');

select
        /*+ initially immediate  PK */
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);

Since I’ve now got two rows with id = 1 in parent the query ought to return duplicates for every row in child where id_p = 1, but it doesn’t. Here’s the output from the query and the execution plan:

     ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          2 Sally

2 rows selected.


PLAN_TABLE_OUTPUT
--------------------------------------------------
SQL_ID  gy6h8td4tmdpg, child number 0
-------------------------------------
select  /*+ initially immediate  PK */  chi.* from  child chi,  parent
par where  par.id = chi.id_p

Plan hash value: 2406669797

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS FULL| CHILD |     2 |    24 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

The optimizer has applied “join elimination” to take parent out of the transformed query, so no duplicates. Arguably this is the wrong result.

Let’s roll back the insert and change the experiment – let’s change the constraint on the parent primary key so that it’s still deferrable, but initially deferred then repeat the insert and query:

rollback;
alter table child drop constraint chi_fk_par;
alter table parent drop constraint par_pk;

alter table parent add constraint par_pk primary key (id) deferrable initially deferred;
alter table child add constraint chi_fk_par foreign key(id_p) references parent;

insert into parent (id,name) values (1,'Smith');

select
        /*+ initially deferred  PK */
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);

In this case we don’t need to “set constraint par_pk deferred”, it’s implicitly deferred by definition and will only be checked when we commit any transaction. Would you expect this to make any difference to the result? This is what we get:

      ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          1 Simon
         1          2 Sally
         1          2 Sally

4 rows selected.


PLAN_TABLE_OUTPUT
--------------------------------------------------
SQL_ID  8gvn3mzr8uv0h, child number 0
-------------------------------------
select  /*+ initially deferred  PK */  chi.* from  child chi,  parent
par where  par.id = chi.id_p

Plan hash value: 1687613841

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |       |       |     2 (100)|          |
|   1 |  NESTED LOOPS      |        |     2 |    30 |     2   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| CHILD  |     2 |    24 |     2   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | PAR_PK |     1 |     3 |     0   (0)|          |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PAR"."ID"="CHI"."ID_P")

When the parent primary key is initially deferred then join elimination doesn’t take place – so we get two copies of each child row in the output. (This is still true even if we add the “rely” option to the parent primary key constraint).

Debug Analysis

As I said at the top of the article, this does give us a clue about the source of the bug. A check of the dictionary table cdef$ (constraint definitions) shows the following notes for column cdef$.defer:

  defer         number,                     /* 0x01 constraint is deferrable */
                                              /* 0x02 constraint is deferred */
                                /* 0x04 constraint has been system validated */
                                 /* 0x08 constraint name is system generated */
etc...

With my examples the “initially immediate” constraint reported defer = 5, for the “initially deferred” constraint it reported the value 7. It looks as if the optimizer code to handle join elimination look only at the static definition of the constraint (bit 0x02) and doesn’t consider the possibility that if bit 0x01 is set it should also check the session state to see if the constraint has been temporarily deferred.

Conclusion

If you are going to implement deferrable constraints be very careful about tracking exactly how you use them, and be aware that if you execute arbitrary queries in mid-transaction then you may find that the results are not exactly what you expect. In fact, though it’s not demonstrated here, different forms of SQL to that should express the same requirement may not give the same results.

Update (May 2021)

This buggy behaviour is still present in 19.11.0.0

Comments (3)

December 21, 2023

Workaround

Best Guess

Post script

November 1, 2023

Summary

September 5, 2023

Investigation

Resolution

Summary

August 22, 2022

Workaround

Footnotes

July 18, 2022

Follow-up

Some answers

May 31, 2022

Execution plans

Where’s the bug?

Workaround

The last laugh

May 5, 2022

The basic model

Internal Rewrite

Workaround

Hypothesis

tl;dr

Footnote

May 4, 2022

Conclusion

Notes

March 28, 2022

Summary

March 24, 2022

Workaround

March 22, 2022

Summary:

Footnote:

January 1, 2022

Footnote

August 21, 2021

Footnote:

July 20, 2021

Update (after twitter exchange)

Update (Oct 2023)

March 8, 2021

Debug Analysis

Conclusion

Update (May 2021)

Click on the motto below

Search this blog

Categories

Special Links

Recent Posts

Recent Comments

Popular articles

Popular References

Archives

Blogroll

Posts by RSS