Following on from a Twitter reference and an update to an old posting about a side effect of constraints on the work done inserting data, I decided to have a closer look at the more general picture of default values and inserts. Here’s a script that I’ve tested against 11.2.0.4, 12.1.0.2, 12.2.0.1 and, most recently, 18.3.0.0 (original installs, no patches applied in any case):
rem
rem     Script:         defaults_cost.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2017
rem

create table t1 (
        column1  varchar2(10),
        column2  varchar2(10),
        column3  varchar2(10),
        column4  varchar2(10),
        column32 varchar2(32) default 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
        column33 varchar2(33) default 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
        virtual1 generated always as (
                column4 || column3 || column2 || column1
        ) virtual
)
segment creation immediate
;

execute dbms_output.put_line(dbms_stats.create_extended_stats(user,'t1','(column1 || column2 || column3 || column4)'))
execute dbms_output.put_line(dbms_stats.create_extended_stats(user,'t1','(column1,column2,column3)'))

create or replace function plsqlfunction_with_30char_name(
        i_in    varchar2
)
return varchar
deterministic
is
begin
        return initcap(i_in);
end;
/

create index t1_i1 on t1(substr(plsqlfunction_with_30char_name(column1),1,10));
When you create a function-based index you get a hidden virtual column supporting the index expression; when you create extended stats (of either type) you get a hidden virtual column holding the extension definition; and when you create any type of virtual column, including a “real” virtual column, you get a data dictionary entry holding the column name and the expression definition. All these options use the “data_default” column from user_tab_cols to display the defining information – as we can see when we run the following query:
select
        column_name, data_default
from
        user_tab_cols
where
        table_name = 'T1'
order by
        column_id
;

COLUMN_NAME                      DATA_DEFAULT
-------------------------------- --------------------------------------------------------------------------------
COLUMN1
COLUMN2
COLUMN3
COLUMN4
COLUMN32                         'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
COLUMN33                         'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
VIRTUAL1                         "COLUMN4"||"COLUMN3"||"COLUMN2"||"COLUMN1"
SYS_STUDAN97VB_XDKRTR_NPFAB80P   "COLUMN1"||"COLUMN2"||"COLUMN3"||"COLUMN4"
SYS_STUTCBJ8OFFSY1D9ZXRYZ0W3W#   SYS_OP_COMBINED_HASH("COLUMN1","COLUMN2","COLUMN3")
SYS_NC00010$                     SUBSTR("TEST_USER"."PLSQLFUNCTION_WITH_30CHAR_NAME"("COLUMN1"),1,10)
Apart from the special cases I’ve just listed, you’ll also see the “default values” I specified for column32 and column33. You’ll notice that I’ve supplied a 30 character string as the default for column32 and a 31 character string as the default for column33: this is a convenience that means the space used in data_default (which is a long column) matches the number embedded in the column name once you include the two single quotes in the character count – 32 and 33 characters respectively.
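Since data_default is a legacy long column you can’t apply length() to it directly in SQL, but you can fetch it into a PL/SQL varchar2 and measure it there. Here’s a minimal sketch (an illustration only, not part of the original script) that reports the stored length of each non-null default or expression – bear in mind that the stored text can pick up trailing whitespace, which would push the counts up slightly:

set serveroutput on

declare
        v_default       varchar2(4000);
begin
        for r in (
                select  column_name
                from    user_tab_cols
                where   table_name = 'T1'
                order by
                        internal_column_id
        ) loop
                -- fetch the LONG into a varchar2 so that length() works
                select  data_default
                into    v_default
                from    user_tab_cols
                where   table_name  = 'T1'
                and     column_name = r.column_name;

                if v_default is not null then
                        dbms_output.put_line(
                                rpad(r.column_name, 32) || length(v_default)
                        );
                end if;
        end loop;
end;
/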
Having set my data up I’m going to emulate a bad application that uses lots of literal string SQL and leaves Oracle to fill in the default values (and, of course, derive the various virtual values it might need).
alter session set events '10046 trace name context forever, level 4';

begin
        for i in 1..10 loop
                execute immediate '
                        insert into t1 (column1, column2, column3, column4)
                        values( ' || i || ', ' || i || ', ' || i || ', ' || i || ')'
                ;
                commit;
        end loop;
end;
/

alter session set events '10046 trace name context off';
This code generates and executes 10 distinct insert statements that populate column1 through to column4 only. But you’ll notice the call to enable SQL tracing – and here’s the interesting bit of the output from applying tkprof to the trace file:
SQL ID: 47r1y8yn34jmj Plan Hash: 2191121161

select default$
from
 col$ where rowid=:1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       50      0.00       0.00          0          0          0           0
Execute     50      0.00       0.00          0          0          0           0
Fetch       50      0.00       0.00          0        100          0          50
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      150      0.00       0.00          0        100          0          50
The summary is the same for all the versions of Oracle I tested*** – we’ve queried for a default value from col$ five times for each row we insert! (Technically that’s five times for each insert statement we’ve had to (hard-)parse; the anomaly wouldn’t appear if we had been using a bind-variable method and reusing the insert statement.)
*** There is one difference from 12.2 onwards – the number of parse calls reported for the statement was 1 rather than 50 but, judging by the various cursor cache hit stats, that may be due to a change in accounting rather than a change in workload.
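If you want to check the cursor cache activity for yourself, one simple (though approximate) approach is to snapshot the relevant session statistics before and after the test loop. The statistic names below are the ones I’d expect to see in these versions, but check v$statname on your own system:

select  sn.name, ms.value
from    v$mystat ms, v$statname sn
where   sn.statistic# = ms.statistic#
and     sn.name in (
                'parse count (total)',
                'parse count (hard)',
                'session cursor cache hits',
                'session cursor cache count'
        )
order by
        sn.name
;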
Check the table definition: there are two “real” defaults and four expressions due to the various virtual columns – so why five calls per insert and not six? The answer lies in the length of the actual value involved: if the text that appears in the (long) data_default column is 32 characters or shorter it will be stored in the dictionary cache (rowcache), but only one of our six values is that short, so Oracle looks up the other five on each insert (hard parse).
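As a quick cross-check – a sketch I’ll leave you to run rather than a tested result – you could shrink the default for column33 so that its text, including the quotes, drops to 32 characters, then repeat the insert loop. If the 32-character limit really is the deciding factor the tkprof summary for the col$ query should drop from 5 executions per hard parse to 4:

alter table t1 modify (
        column33 default 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
);

-- 30 x's plus the two single quotes = 32 characters, so this
-- default should now fit in the dictionary cache.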
This is a little strange on two counts: first, given the enormous memories that have been available for years, and the extreme usefulness of virtual columns and extended stats, it seems something of an oversight to limit the memory available to the cache that holds such critical definitions; secondly, we don’t need to evaluate the virtual columns (unless they are indexed) or extended stats on inserts, so why is Oracle querying the definitions anyway? [Possible answer: perhaps Oracle is using generic code that allows for column check constraints – which might exist on users’ virtual columns – and hasn’t catered for bypassing system-generated virtual columns.]
A key point to remember, before you start worrying too much about the impact of the execution count for this query against col$, is that it’s only relevant to “hard” parses – so you’re only likely to notice it if you have a large number of different “literal string” inserts that should be using bind variables; and that means you’ve probably got a more important coding defect to address before you worry about the extra impact caused by this particular call. Nevertheless there are a couple of bug reports on MoS that have been raised against this query and, after writing this note, I did a quick Google search for the critical SQL_ID and found (among others) this production example from Franck Pachot.
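For completeness, here’s what the well-behaved version of my test loop might look like – just a sketch, but because the statement text is now constant it will be hard parsed only once, so the lookups against col$ happen once for the whole loop rather than five times per insert:

begin
        for i in 1..10 loop
                -- constant statement text: one hard parse, reused thereafter
                execute immediate '
                        insert into t1 (column1, column2, column3, column4)
                        values( :b1, :b2, :b3, :b4 )'
                using i, i, i, i;
                commit;
        end loop;
end;
/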
Hello, Jonathan
Following your script, I tested it on Oracle 12.2.0.1.0 for Windows x64. Unfortunately my trace file – formatted using TKPROF – shows the following,
I found that the ‘Parse’ count is only 1, which seems a little different from yours. By the way, your analysis says the statement misses in the library cache during parse, and my trace file shows the same.
However, when I tested 11.2.0.4.0 I got the same effect as your trace file.
Best Regards
Quanwen Zhao
Comment by quanwenzhao — January 3, 2018 @ 7:58 am GMT
Quanwen Zhao,
Thanks for taking the trouble to run the test. I did point out, however, that 12.2 recorded just one parse call.
Comment by Jonathan Lewis — January 3, 2018 @ 9:22 am GMT
Hi, Jonathan
Thanks for the reminder.
I also ran another test: based on your table t1 I supplied a 32 character string as the default for column32 and a 33 character string as the default for column33, so that each default completely fills its column. In that case the ‘Parse’ count is 6 for each row we insert on 11.2.0.4.0; my trace file, formatted by TKPROF, is as follows,
and on 12.2.0.1.0,it’s such as,
Best Regards
Quanwen Zhao
Comment by quanwenzhao — January 4, 2018 @ 7:03 am GMT
[…] which means the default value won’t be cached in the dictionary cache (v$rowcache) – and this might introduce a parsing overhead that you would prefer to […]
Pingback by data_default | Oracle Scratchpad — February 1, 2021 @ 4:51 pm GMT