Oracle Scratchpad

September 29, 2022

Case Study

Filed under: Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 6:27 pm BST Sep 29,2022

A recent question on the Oracle Developer Community forum asked for help with a statement that was taking a long time to run. The thread included the results from a trace file that had been passed through tkprof so we have the query and the actual execution plan with some rowsource execution stats.

Here’s the query – extracted from the tkprof output:

SELECT DISTINCT
       pll.po_line_id,
       ploc.line_location_id,
       (SELECT ptl.line_type
          FROM apps.po_line_types_tl ptl
         WHERE ptl.line_type_id = pll.line_type_id AND ptl.LANGUAGE = 'US')
           "Line_Type",
       ploc.quantity_accepted,
       NULL
           release_approved_date,
       NULL
           release_date,
       NULL
           release_hold_flag,
       NULL
           release_type,
       DECODE (ploc.po_release_id, NULL, NULL, ploc.quantity)
           released_quantity,
       (SELECT items.preprocessing_lead_time
          FROM apps.mtl_system_items_b items
         WHERE     items.inventory_item_id = pll.item_id
               AND items.organization_id = ploc.SHIP_TO_ORGANIZATION_ID)
           "PreProcessing_LT",
       (SELECT items.full_lead_time
          FROM apps.mtl_system_items_b items
         WHERE     items.inventory_item_id = pll.item_id
               AND items.organization_id = ploc.SHIP_TO_ORGANIZATION_ID)
           "Processing_LT",
       (SELECT items.postprocessing_lead_time
          FROM apps.mtl_system_items_b items
         WHERE     items.inventory_item_id = pll.item_id
               AND items.organization_id = ploc.SHIP_TO_ORGANIZATION_ID)
           "PostProcessing_LT",
       ploc.firm_status_lookup_code,
       NVL (
           (SELECT pla.promised_date
              FROM apps.po_line_locations_archive_all pla
             WHERE     pla.po_header_id = pha.po_header_id
                   AND pla.po_line_id = pll.po_line_id
                   AND pla.line_location_id = ploc.line_location_id
                   AND pla.revision_num =
                       (SELECT MIN (revision_num)
                          FROM apps.po_line_locations_archive_all plla2
                         WHERE     plla2.promised_date IS NOT NULL
                               AND plla2.line_location_id =
                                   ploc.line_location_id)),
           ploc.promised_date)
           "Original_Promise_Date",
       (SELECT items.long_description
          FROM apps.mtl_system_items_tl items
         WHERE     items.inventory_item_id = pll.item_id
               AND items.organization_id IN
                       (SELECT fin.inventory_organization_id
                          FROM apps.financials_system_params_all fin
                         WHERE fin.org_id = pha.org_id)
               AND items.LANGUAGE = 'US')
           "Item_Long_Description",
       NVL (ploc.approved_flag, 'N')
           approved_code,
       pvs.country
           "Supplier_Site_Country",
       pll.note_to_vendor,
         NVL (ploc.quantity, 0)
       - NVL (ploc.quantity_cancelled, 0)
       - NVL (ploc.quantity_received, 0) * ploc.price_override
           "Shipment_Amount",
       ploc.attribute4
           "PO_Ship_Date",
       (SELECT meaning
          FROM apps.fnd_lookup_values
         WHERE     lookup_type = 'SHIP_METHOD'
               AND lookup_code = ploc.attribute9
               AND language = 'US')
           "Ship_Method",
       (SELECT prla.note_to_receiver
          FROM apps.po_req_distributions_all  prda
               INNER JOIN apps.po_requisition_lines_all prla
                   ON prda.requisition_line_id = prla.requisition_line_id
         WHERE prda.distribution_id = pdi.req_distribution_id)
           "Note_To_Receiver",
       DECODE (pha.USER_HOLD_FLAG, 'Y', 'Y', pll.USER_HOLD_FLAG)
           "Hold_Flag",
       (SELECT ABC_CLASS_NAME
          FROM APPS.MTL_ABC_ASSIGNMENT_GROUPS  ASG
               INNER JOIN APPS.MTL_ABC_ASSIGNMENTS ASSI
                   ON ASG.ASSIGNMENT_GROUP_ID = ASSI.ASSIGNMENT_GROUP_ID
               INNER JOIN APPS.MTL_ABC_CLASSES classes
                   ON ASSI.ABC_CLASS_ID = classes.ABC_CLASS_ID
         WHERE     ASG.organization_id = ploc.SHIP_TO_ORGANIZATION_ID
               AND ASG.ASSIGNMENT_GROUP_NAME = 'MIN ABC Assignment'
               AND ASSI.inventory_item_id = pll.item_id)
           ABCClass,
       (SELECT CONCATENATED_SEGMENTS AS charge_accountsfrom
          FROM apps.gl_code_combinations_kfv gcc
         WHERE gcc.code_combination_id = pdi.code_combination_id)
           AS charge_accounts
  FROM apps.po_headers_all         pha,
       apps.po_lines_all           pll,
       apps.po_line_locations_all  ploc,
       apps.po_distributions_all   pdi,
       apps.per_all_people_f       papf,
       apps.AP_SUPPLIERS           pv,
       apps.AP_SUPPLIER_SITES_ALL  pvs,
       apps.AP_SUPPLIER_CONTACTS   pvc,
       apps.ap_terms               apt,
       apps.po_lookup_codes        plc1,
       apps.po_lookup_codes        plc2,
       apps.hr_locations           hlv_line_ship_to,
       apps.hr_locations           hlv_ship_to,
       apps.hr_locations           hlv_bill_to,
       apps.hr_organization_units  hou,
       apps.hr_locations_no_join   loc,
       apps.hr_locations_all_tl    hrl1,
       apps.hr_locations_all_tl    hrl2
 WHERE     1 = 1
       AND pll.po_header_id(+) = pha.po_header_id
       AND ploc.po_line_id(+) = pll.po_line_id
       AND pdi.line_location_id(+) = ploc.line_location_id
       AND ploc.shipment_type IN ('STANDARD', 'PLANNED')
       AND papf.person_id(+) = pha.agent_id
       AND TRUNC (SYSDATE) BETWEEN papf.effective_start_date
                               AND papf.effective_end_date
       AND papf.employee_number IS NOT NULL
       AND pv.vendor_id(+) = pha.vendor_id
       AND pvs.vendor_site_id(+) = pha.vendor_site_id
       AND pvc.vendor_contact_id(+) = pha.vendor_contact_id
       AND apt.term_id(+) = pha.terms_id
       AND plc1.lookup_code(+) = pha.fob_lookup_code
       AND plc1.lookup_type(+) = 'FOB'
       AND plc2.lookup_code(+) = pha.freight_terms_lookup_code
       AND plc2.lookup_type(+) = 'FREIGHT TERMS'
       AND hlv_line_ship_to.location_id(+) = ploc.ship_to_location_id
       AND hlv_ship_to.location_id(+) = pha.ship_to_location_id
       AND hlv_bill_to.location_id(+) = pha.bill_to_location_id
       AND hou.organization_id = pha.org_id
       AND hou.location_id = loc.location_id(+)
       AND hrl1.location_id(+) = pha.ship_to_location_id
       AND hrl1.LANGUAGE(+) = 'US'
       AND hrl2.location_id(+) = pha.bill_to_location_id
       AND hrl2.LANGUAGE(+) = 'US'
       AND hou.organization_id IN (2763)
       AND NVL (pha.closed_code, 'OPEN') IN ('OPEN', 'CLOSED')
       AND NVL (pll.closed_code, 'OPEN') IN ('OPEN', 'CLOSED')
       AND NVL (ploc.cancel_flag, 'N') = 'N'
       AND pha.authorization_status IN
               ('APPROVED', 'REQUIRES REAPPROVAL', 'IN PROCESS')

As you can see there are 10 inline scalar subqueries (highlighted) in the query with a select distinct to finish off the processing of an 18 table join. That’s a lot of scalar subqueries so it’s worth asking whether the code should be rewritten to use joins (though in newer vesions of Oracle some of the subqueries might be transformed to outer joins anyway – but the OP is using 11.2.0.4). We also know that a distinct is sometimes a hint that the code has a logic error that has been “fixed” by eliminating duplicates.

Ignoring those points, let’s consider the execution plan from the tkprof output which (with a tiny bit of extra formatting) is as follows:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.46       1.75          0          3          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    50346    279.02    1059.39     179103   30146895          0      755164
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    50348    279.49    1061.14     179103   30146898          0      755164

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 678  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         9          9          9  TABLE ACCESS BY INDEX ROWID PO_LINE_TYPES_TL (cr=20 pr=0 pw=0 time=680 us cost=2 size=32 card=1)
         9          9          9   INDEX UNIQUE SCAN PO_LINE_TYPES_TL_U1 (cr=11 pr=0 pw=0 time=323 us cost=1 size=0 card=1)(object id 63682480)

    576365     576365     576365  TABLE ACCESS BY INDEX ROWID MTL_SYSTEM_ITEMS_B (cr=2267756 pr=28 pw=0 time=22598079 us cost=4 size=13 card=1)
    576365     576365     576365   INDEX UNIQUE SCAN MTL_SYSTEM_ITEMS_B_U1 (cr=1720936 pr=0 pw=0 time=4644552 us cost=3 size=0 card=1)(object id 42812859)

    576365     576365     576365  TABLE ACCESS BY INDEX ROWID MTL_SYSTEM_ITEMS_B (cr=2267747 pr=0 pw=0 time=2442479 us cost=4 size=13 card=1)
    576365     576365     576365   INDEX UNIQUE SCAN MTL_SYSTEM_ITEMS_B_U1 (cr=1720936 pr=0 pw=0 time=1238342 us cost=3 size=0 card=1)(object id 42812859)

    576365     576365     576365  TABLE ACCESS BY INDEX ROWID MTL_SYSTEM_ITEMS_B (cr=2267743 pr=0 pw=0 time=2029190 us cost=4 size=14 card=1)
    576365     576365     576365   INDEX UNIQUE SCAN MTL_SYSTEM_ITEMS_B_U1 (cr=1720932 pr=0 pw=0 time=967729 us cost=3 size=0 card=1)(object id 42812859)

    672743     672743     672743  TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ARCHIVE_ALL (cr=5507736 pr=163043 pw=0 time=535914552 us cost=3 size=27 card=1)
    672743     672743     672743   INDEX UNIQUE SCAN PO_LINE_LOCATIONS_ARCHIVE_U1 (cr=4560824 pr=163043 pw=0 time=533161038 us cost=2 size=0 card=1)(object id 42811947)
    755121     755121     755121    SORT AGGREGATE (cr=3540960 pr=163043 pw=0 time=530079821 us)
   1040963    1040963    1040963     TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ARCHIVE_ALL (cr=3540960 pr=163043 pw=0 time=534243973 us cost=5 size=15 card=1)
   1776649    1776649    1776649      INDEX RANGE SCAN PO_LINE_LOCATIONS_ARCHIVE_U1 (cr=1123074 pr=6392 pw=0 time=37128373 us cost=3 size=0 card=2)(object id 42811947)

    587486     587486     587486  TABLE ACCESS BY INDEX ROWID MTL_SYSTEM_ITEMS_TL (cr=3436629 pr=3564 pw=0 time=64125044 us cost=5 size=34 card=1)
    587486     587486     587486   INDEX RANGE SCAN MTL_SYSTEM_ITEMS_TL_U1 (cr=2852930 pr=869 pw=0 time=45628505 us cost=4 size=0 card=1)(object id 136492495)
         1          1          1    TABLE ACCESS BY INDEX ROWID FINANCIALS_SYSTEM_PARAMS_ALL (cr=645351 pr=0 pw=0 time=5743158 us cost=2 size=10 card=1)
    322268     322268     322268     INDEX SKIP SCAN FINANCIALS_SYSTEM_PARAMS_U1 (cr=323083 pr=0 pw=0 time=5104895 us cost=1 size=0 card=1)(object id 42770563)

        10         10         10  TABLE ACCESS BY INDEX ROWID FND_LOOKUP_VALUES (cr=51 pr=1 pw=0 time=3620 us cost=5 size=60 card=1)
        20         20         20   INDEX RANGE SCAN FND_LOOKUP_VALUES_X99 (cr=31 pr=1 pw=0 time=2133 us cost=4 size=0 card=1)(object id 42759866)

    634276     634276     634276  NESTED LOOPS  (cr=3540930 pr=5535 pw=0 time=181518759 us cost=5 size=28 card=1)
    634276     634276     634276   TABLE ACCESS BY INDEX ROWID PO_REQ_DISTRIBUTIONS_ALL (cr=1631471 pr=5253 pw=0 time=65405333 us cost=3 size=12 card=1)
    634276     634276     634276    INDEX UNIQUE SCAN PO_REQ_DISTRIBUTIONS_U1 (cr=994522 pr=5252 pw=0 time=31023194 us cost=2 size=0 card=1)(object id 42788583)
    634276     634276     634276   TABLE ACCESS BY INDEX ROWID PO_REQUISITION_LINES_ALL (cr=1909459 pr=282 pw=0 time=115275921 us cost=2 size=16 card=1)
    634276     634276     634276    INDEX UNIQUE SCAN PO_REQUISITION_LINES_U1 (cr=944449 pr=268 pw=0 time=12285440 us cost=1 size=0 card=1)(object id 42789681)

    511989     511989     511989  NESTED LOOPS  (cr=3533763 pr=6 pw=0 time=8999321 us cost=5 size=55 card=1)
    511989     511989     511989   NESTED LOOPS  (cr=2850293 pr=6 pw=0 time=7086027 us cost=4 size=45 card=1)
    576055     576055     576055    TABLE ACCESS BY INDEX ROWID MTL_ABC_ASSIGNMENT_GROUPS (cr=612378 pr=0 pw=0 time=2002832 us cost=2 size=29 card=1)
    576055     576055     576055     INDEX UNIQUE SCAN MTL_ABC_ASSIGNMENT_GROUPS_U2 (cr=36323 pr=0 pw=0 time=951307 us cost=1 size=0 card=1)(object id 42783622)
    511989     511989     511989    TABLE ACCESS BY INDEX ROWID MTL_ABC_ASSIGNMENTS (cr=2237915 pr=6 pw=0 time=4672006 us cost=3 size=16 card=1)
    511989     511989     511989     INDEX UNIQUE SCAN MTL_ABC_ASSIGNMENTS_U1 (cr=1551490 pr=4 pw=0 time=2533524 us cost=2 size=0 card=1)(object id 42757737)
    511989     511989     511989   TABLE ACCESS BY INDEX ROWID MTL_ABC_CLASSES (cr=683470 pr=0 pw=0 time=1488045 us cost=1 size=10 card=1)
    511989     511989     511989    INDEX UNIQUE SCAN MTL_ABC_CLASSES_U1 (cr=171481 pr=0 pw=0 time=693745 us cost=0 size=0 card=1)(object id 42789694)

     13320      13320      13320  TABLE ACCESS BY INDEX ROWID GL_CODE_COMBINATIONS (cr=34801 pr=0 pw=0 time=802675 us cost=3 size=49 card=1)
     13320      13320      13320   INDEX UNIQUE SCAN GL_CODE_COMBINATIONS_U1 (cr=21481 pr=0 pw=0 time=397344 us cost=2 size=0 card=1)(object id 42775044)


    755164     755164     755164  HASH UNIQUE (cr=30147018 pr=179103 pw=0 time=1058922684 us cost=749257 size=197349453 card=482517)
    768890     768890     768890   HASH JOIN  (cr=7289842 pr=6926 pw=0 time=244582512 us cost=696202 size=197349453 card=482517)
    140451     140451     140451    TABLE ACCESS FULL PER_ALL_PEOPLE_F (cr=38207 pr=0 pw=0 time=313692 us cost=18484 size=13278261 card=428331)
    768890     768890     768890    NESTED LOOPS OUTER (cr=7251635 pr=6926 pw=0 time=242897348 us cost=672652 size=30016980 card=79410)
    755121     755121     755121     NESTED LOOPS OUTER (cr=5538283 pr=6031 pw=0 time=154841427 us cost=443987 size=28382903 card=78623)
    755121     755121     755121      NESTED LOOPS OUTER (cr=5508916 pr=6031 pw=0 time=153523676 us cost=443982 size=18184959 card=51809)
    755121     755121     755121       NESTED LOOPS OUTER (cr=5386279 pr=6031 pw=0 time=151985656 us cost=443978 size=11642422 card=34142)
    755121     755121     755121        NESTED LOOPS  (cr=5090949 pr=6031 pw=0 time=139220421 us cost=375644 size=11574138 card=34142)
    792959     792959     792959         NESTED LOOPS  (cr=1747964 pr=134 pw=0 time=64597738 us cost=109035 size=19934760 card=73560)
    254919     254919     254919          HASH JOIN OUTER (cr=315780 pr=6 pw=0 time=14811187 us cost=29121 size=5413350 card=22650)
    254919     254919     254919           NESTED LOOPS OUTER (cr=286919 pr=0 pw=0 time=12395919 us cost=13792 size=5209500 card=22650)
    254919     254919     254919            HASH JOIN RIGHT OUTER (cr=107134 pr=0 pw=0 time=12153146 us cost=13790 size=3868572 card=17426)
      3834       3834       3834             VIEW  HR_LOCATIONS (cr=3913 pr=0 pw=0 time=15826 us cost=125 size=360 card=60)
      3834       3834       3834              NESTED LOOPS  (cr=3913 pr=0 pw=0 time=15055 us cost=125 size=1080 card=60)
      3834       3834       3834               TABLE ACCESS FULL HR_LOCATIONS_ALL (cr=262 pr=0 pw=0 time=11211 us cost=125 size=304 card=38)
      3834       3834       3834               INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=3651 pr=0 pw=0 time=6183 us cost=0 size=20 card=2)(object id 42783719)
    254919     254919     254919             HASH JOIN RIGHT OUTER (cr=103221 pr=0 pw=0 time=11917174 us cost=13666 size=3764016 card=17426)
      3834       3834       3834              VIEW  HR_LOCATIONS (cr=3898 pr=0 pw=0 time=14651 us cost=125 size=360 card=60)
      3834       3834       3834               NESTED LOOPS  (cr=3898 pr=0 pw=0 time=14267 us cost=125 size=1080 card=60)
      3834       3834       3834                TABLE ACCESS FULL HR_LOCATIONS_ALL (cr=247 pr=0 pw=0 time=9532 us cost=125 size=304 card=38)
      3834       3834       3834                INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=3651 pr=0 pw=0 time=9539 us cost=0 size=20 card=2)(object id 42783719)
    254919     254919     254919              HASH JOIN RIGHT OUTER (cr=99323 pr=0 pw=0 time=11817243 us cost=13541 size=3659460 card=17426)
        45         45         45               INDEX RANGE SCAN FND_LOOKUP_VALUES_U1 (cr=21 pr=0 pw=0 time=614 us cost=4 size=49 card=1)(object id 63685210)
    254919     254919     254919               HASH JOIN RIGHT OUTER (cr=99302 pr=0 pw=0 time=11729251 us cost=13537 size=2805586 card=17426)
        59         59         59                INDEX RANGE SCAN FND_LOOKUP_VALUES_U1 (cr=20 pr=0 pw=0 time=445 us cost=4 size=49 card=1)(object id 63685210)
    254919     254919     254919                NESTED LOOPS  (cr=99282 pr=0 pw=0 time=11653162 us cost=13533 size=1951712 card=17426)
         1          1          1                 NESTED LOOPS OUTER (cr=116 pr=0 pw=0 time=113273 us cost=3 size=40 card=1)
         1          1          1                  NESTED LOOPS  (cr=113 pr=0 pw=0 time=113227 us cost=2 size=32 card=1)
         1          1          1                   INDEX UNIQUE SCAN HR_ALL_ORGANIZATION_UNTS_TL_PK (cr=110 pr=0 pw=0 time=113164 us cost=1 size=17 card=1)(object id 63680720)
         1          1          1                   TABLE ACCESS BY INDEX ROWID HR_ALL_ORGANIZATION_UNITS (cr=3 pr=0 pw=0 time=59 us cost=1 size=15 card=1)
         1          1          1                    INDEX UNIQUE SCAN HR_ORGANIZATION_UNITS_PK (cr=2 pr=0 pw=0 time=7 us cost=0 size=0 card=1)(object id 42789144)
         1          1          1                  TABLE ACCESS BY INDEX ROWID HR_LOCATIONS_ALL (cr=3 pr=0 pw=0 time=42 us cost=1 size=8 card=1)
         1          1          1                   INDEX UNIQUE SCAN HR_LOCATIONS_PK (cr=2 pr=0 pw=0 time=7 us cost=0 size=0 card=1)(object id 42797079)
    254919     254919     254919                 TABLE ACCESS BY INDEX ROWID PO_HEADERS_ALL (cr=99166 pr=0 pw=0 time=11505632 us cost=13530 size=1254672 card=17426)
    255397     255397     255397                  INDEX SKIP SCAN PO_HEADERS_ALL_X3 (cr=1753 pr=0 pw=0 time=725236 us cost=352 size=0 card=37674)(object id 42773719)
    254883     254883     254883            INDEX UNIQUE SCAN AP_TERMS_TL_U1 (cr=179785 pr=0 pw=0 time=183291 us cost=0 size=8 card=1)(object id 42798416)
    482528     482528     482528           TABLE ACCESS FULL AP_SUPPLIER_SITES_ALL (cr=28861 pr=6 pw=0 time=227983 us cost=13727 size=4323123 card=480347)
    792959     792959     792959          TABLE ACCESS BY INDEX ROWID PO_LINES_ALL (cr=1432184 pr=128 pw=0 time=53002963 us cost=5 size=96 card=3)
    793375     793375     793375           INDEX RANGE SCAN PO_LINES_U2 (cr=504726 pr=20 pw=0 time=17603112 us cost=2 size=0 card=5)(object id 42755253)
    755121     755121     755121         TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ALL (cr=3342985 pr=5897 pw=0 time=71357938 us cost=4 size=68 card=1)
   1138558    1138558    1138558          INDEX RANGE SCAN PO_LINE_LOCATIONS_N15 (cr=1707311 pr=5830 pw=0 time=37903421 us cost=3 size=0 card=2)(object id 63697005)
    723002     723002     723002        VIEW PUSHED PREDICATE  HR_LOCATIONS (cr=295330 pr=0 pw=0 time=11391536 us cost=2 size=2 card=1)
    723002     723002     723002         NESTED LOOPS  (cr=295330 pr=0 pw=0 time=11004720 us cost=2 size=18 card=1)
    723002     723002     723002          INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=146911 pr=0 pw=0 time=1391389 us cost=1 size=10 card=1)(object id 42783719)
    723002     723002     723002          TABLE ACCESS BY INDEX ROWID HR_LOCATIONS_ALL (cr=148419 pr=0 pw=0 time=9233363 us cost=1 size=8 card=1)
    723002     723002     723002           INDEX UNIQUE SCAN HR_LOCATIONS_PK (cr=117800 pr=0 pw=0 time=836734 us cost=0 size=0 card=1)(object id 42797079)
    755119     755119     755119       INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=122637 pr=0 pw=0 time=829404 us cost=0 size=20 card=2)(object id 42783719)
    755121     755121     755121      INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=29367 pr=0 pw=0 time=716408 us cost=0 size=20 card=2)(object id 42783719)
    768883     768883     768883     TABLE ACCESS BY INDEX ROWID PO_DISTRIBUTIONS_ALL (cr=1713352 pr=895 pw=0 time=75314769 us cost=3 size=17 card=1)
    768883     768883     768883      INDEX RANGE SCAN PO_DISTRIBUTIONS_N1 (cr=1096671 pr=874 pw=0 time=24392643 us cost=2 size=0 card=1)(object id 42782429)

The plan is a bit long, but you may recall that a query with scalar subqueries in the select list reports the plans for each of the separate scalar subqueries before reporting the main query block – and I’ve inserted blank lines in the output above to improve the visibility of the individual blocks / scalar subqueries.

An odd little detail of this tkprof output was that there was no report of the wait information recorded against the query, though the following information appeared as the summary for the trace file, giving us a very good idea of the wait events for the individual query:

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        6      0.85       2.14          0          6          0           0
Execute      6      0.00       0.00          0          7        104          85
Fetch    50358    279.03    1059.39     179103   30146895          0      755329
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    50370    279.88    1061.54     179103   30146908        104      755414

Misses in library cache during parse: 3

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                   50363        0.00          0.00
  SQL*Net message from client                 50362      157.17        227.70
  row cache lock                                141        0.03          0.67
  library cache lock                             77        0.01          0.21
  library cache pin                              75        0.01          0.27
  Disk file operations I/O                      791        0.00          0.01
  gc current block 3-way                     835881        0.15        305.35
  gc current block 2-way                     471360        0.24        144.04
  KJC: Wait for msg sends to complete            40        0.00          0.00
  gc cr multi block request                       8        0.00          0.00
  gc current block congested                  10014        0.03          4.23
  gc cr block 3-way                           20215        0.06          4.69
  gc current grant busy                          20        0.00          0.00
  gc cr grant 2-way                          165010        0.07         25.13
  db file sequential read                    179103        0.05        196.31
  gc cr grant congested                         729        0.19          0.36
  gc current block busy                       71431        0.05        118.15
  gc cr block 2-way                            1800        0.01          0.31
  latch free                                      3        0.00          0.00
  gc cr block congested                         197        0.01          0.06
  latch: cache buffers chains                    45        0.00          0.00
  latch: gc element                              15        0.00          0.00
  gc cr block busy                               15        0.02          0.07
  latch: object queue header operation            1        0.00          0.00
  KSV master wait                                 2        0.00          0.00
  ASM file metadata operation                     1        0.00          0.00
  SQL*Net more data to client                     1        0.00          0.00
  gc current grant 2-way                          6        0.00          0.00

An important initial observation is that the query returned 750,000 rows in 50,000 fetches (all figures rounded for convenience) and that’s consistent with the SQL*Plus default arraysize of 15. So there might be a little time saved by setting the arraysize to a larger value (but only a few 10s of seconds – based on the 227 seconds total minus the 157 second maximum wait for the “SQL*Net message from client” figures and there may be some benefit of increasing the SQL*Net SDU_SIZE at the same time). Critically, though, we should ask “why do you want a query to return 750,000 rows?”, and “how fast do you think is ‘reasonable’?” You’ll also note from the “gc” waits that the system is based on RAC with at least 3 nodes – and RAC is always a suspect when you see unexpected time spent in a query.

Where in the driving query block does most of the time go between the last hash join (line 62) and the hash unique (line 61) – it’s in the query block whose plan starts at line 28 where we see 163,000 physical blocks read (pr=) and 535 seconds (time= microseconds) of which 6,400 blocks come from the index range scan operation at line 32 but most comes from line 31 fetching 1 million rows (by index rowid) from table po_lines_locations_archive_all.

    672743     672743     672743  TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ARCHIVE_ALL (cr=5507736 pr=163043 pw=0 time=535914552 us cost=3 size=27 card=1)
    672743     672743     672743   INDEX UNIQUE SCAN PO_LINE_LOCATIONS_ARCHIVE_U1 (cr=4560824 pr=163043 pw=0 time=533161038 us cost=2 size=0 card=1)(object id 42811947)
    755121     755121     755121    SORT AGGREGATE (cr=3540960 pr=163043 pw=0 time=530079821 us)
   1040963    1040963    1040963     TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ARCHIVE_ALL (cr=3540960 pr=163043 pw=0 time=534243973 us cost=5 size=15 card=1)
   1776649    1776649    1776649      INDEX RANGE SCAN PO_LINE_LOCATIONS_ARCHIVE_U1 (cr=1123074 pr=6392 pw=0 time=37128373 us cost=3 size=0 card=2)(object id 42811947)

This part of the workload comes from 672,743 executions of the subquery starting at line 36 of the original query text:

           (SELECT pla.promised_date
              FROM apps.po_line_locations_archive_all pla
             WHERE     pla.po_header_id = pha.po_header_id
                   AND pla.po_line_id = pll.po_line_id
                   AND pla.line_location_id = ploc.line_location_id
                   AND pla.revision_num =
                       (SELECT MIN (revision_num)
                          FROM apps.po_line_locations_archive_all plla2
                         WHERE     plla2.promised_date IS NOT NULL
                               AND plla2.line_location_id =
                                   ploc.line_location_id))

If we want to improve the performance of this query with a minimum of re-engineering, recoding and risk then a good point to start would be to examine this query block in isolation and see if there is a simple, low-cost way of improving its efficiency. (Note: this may not be a route to optimising the whole query “properly”, but it may give a quick win that is “good enough”.)

We could go a little further down this route of optimising the scalar subqueries by looking at the time spent in each of them in turn. Taking out the top line of each of the separate sections of the plan and extracting just the pr, pw and time values (which I’ll scale back from microseconds to seconds) we get the following

pr=      0      pw=0    time=   0
pr=     28      pw=0    time=  23
pr=      0      pw=0    time=   2
pr=      0      pw=0    time=   2
pr= 163043      pw=0    time= 536
pr=   3564      pw=0    time=  64
pr=      1      pw=0    time=   0
pr=   5535      pw=0    time= 182
pr=      6      pw=0    time=   9
pr=      0      pw=0    time=   1

The 8th scalar subquery (line 42 in the plan, line 75 in the query) gives us an opportunity to reduce the run time by 182 seconds, so might be worth a little investment in programmer time.

The 6th subquery (line 34 in the plan, line 49 in the query) adds only 64 seconds to the run time, so we might be less inclined to do anything about it.

You might note that the 2nd, 3rd and 4th subqueries are against the same table with the same predicate to get three different columns – this group is the “obvious” choice for recoding as a single join rather than three separate subqueries, but if you look at the total times of the three subqueries the “extra” two executions add only two seconds each to the total time – so although the this scalar subquery coding pattern is undesirable, it’s not necessarily going to be worth expending the effort to rewrite it in this case.

If you’re wondering, by the way, why different subqueries are reporting different numbers of rows returned (and each one should return at most one row on each execution), there are two reasons for any subquery to be reporting fewer than the 768,890 rows reported by the basic driving hash join:

  • first – an execution may simply return no rows,
  • secondly – there may be some benefits from scalar subquery caching.

One of the nice details about newer versions of Oracle is that the “starts” statistic is also reported in the trace/tkprof output so you would be able to see how much your query had benefited from scalar subquery caching.

If we add together the time reported by each of the scalar subquery sections of the plan the total time reported is approximately 819 seconds. Cross-checking with the difference in the times reported for operations 61 and 62 (hash unique of hash join) we see: 1,059 seconds – 245 seconds = 814 seconds. This is a good match (allowing for the accumulation of a large number of small errors) for the 819 seconds reported in the subqueries – so the hash unique isn’t a significant part of the query even though it has virtually no effect on the volume of data. You’ll note that it didn’t spill to disc (pw = 0) but completed in memory.

Summary

I’ve written a quick note on this query because the coding style was undesirable and the execution plan quite lengthy. I’ve reviewed how the style of the SQL is echoed in the shape of the plan. I’ve then pursued the idea of optimising the code “piece-wise” to see if there were any opportunities for improving the performance “enough” without going through the effort of a complete redesign of the query. [Update: One of the participants in the thread is currently walking through the mechanics of manually transforming the most expensive scalar subquery into an outer join by unnesting.]

Given the information in the Rowsource Execution from the trkprof output it proved easy to identify where the largest amounts of times appeared that might be reduced by very localised optimsation.

In passing I pointed out the option for reducing the time spent on network traffic by increasing the array fetch size, and increasing the SDU_SIZE for the SQL*Net messages to client.

Footnote (addendum)

I made a passing reference to the waits that told us that the user was running RAC. These waits merit a few follow-up comments.

The numbers for “gc” waits are high. Of particular note are the 71,000 waits and 118 seconds waited on “gc current block busy” which wave a big red flag telling us that there’s too much DML modifying the same object(s) from multiple nodes at the same time. (The even larger numbers for the “gc current block 2/3- way” say the same, but “busy” really emphasises the “hot-spot” aspect of the problem.)

Ideally we would like to see exactly where in the execution plan the bulk of those waits is occurring and, since the OP has been able to supply a trace file for the query, it’s possible that the query can be re-run to produce the SQL Monitor report (if the OP is suitably licenced) that summarises the active session history (ASH) for each line of the plan.

If the ASH data were available for a run of the report we could then do some analysis of parameter values recorded in v$active_session_history to see if that supplied further information. Unfortunately the view v$event_name doesn’t tell us what the parameter values mean for most of the “gc current%” waits, but a couple of the ones that do have descriptions report parameter1 as the file# and parameter2 as the block#, so maybe that’s true for many of them. (For some of the waits parameter1 is listed as the “le” (lock element), which doesn’t really help very much.)

Another diagnostic that could be very helpful is to take a snapshot of the session activity stats (v$sesstat)for the session as this might tell us that part of the load comes from “unexpected” work going on. In particular if we do an analysis of the “db file sequential read” waits we may find that many of the waits are for blocks in the undo tablespace, which would prompt us to examine the session stats to see what they tell us through the “% – undo records applied” statistics.

As a final comment – linking the “gc” activity back to my comment “That’s a lot of scalar subqueries so it’s worth asking whether the code should be rewritten to use joins” – if you add an extra table to a query with a simple join condition to add columns from that table to the query select list then Oracle can pin various index blocks; if you replace the join with a scalar subquery (which means you’re after just one column from one row each time) then Oracle has to pin and release all the index blocks on each call to the subquery. The benefit of the subquery approach is that scalar subquery caching may mean the subquery is rarely executed (check the highly suggestive stats in the plan for the first and seventh scalar subquery blocks – lines 16 and 39); the downside to the subquery approach is that you may end up spending a lot more time in buffer cache handling which, for RAC, includes the global cache (gc) management.

September 12, 2022

Dumping redo

Filed under: Infrastructure,Oracle,redo,Troubleshooting — Jonathan Lewis @ 10:05 am BST Sep 12,2022

In the past I’ve sometimes had to dump the contents of the redo log to a trace file when I needed to find out what work Oracle was doing behing the scenes. To minimise the volume dumped by the “alter system dump logfile” command and make it easier to find the bit I wanted to see I used to “switch logfile” just before (and sometimes just after) the statement I was investigating.

With the advent of pluggable databases the “switch logfile” command now raises Oracle error: “ORA-65040: operation not allowed from within a pluggable database”, so I had to change the strategy. This is just a brief note (echoing a footnote to an older note) of the approach I now use:

column current_scn new_value start_scn
select to_char(current_scn,'9999999999999999') current_scn from v$database;

-- do something interesting here

column current_scn new_value end_scn
select to_char(current_scn,'9999999999999999') current_scn from v$database;

alter session set tracefile_identifier='sometextyoulike';

alter system dump redo scn min &start_scn scn max &end_scn ;
alter session set tracefile_identifier='';

The list of options for the dump has been extended since I published the note on dumping the log file, and now (19.11.0.0) allows the following options (using c notation for the type of the variables you supply to each parameter):

 rdba min  %d rdba max  %d tablespace_no  %d
 dba min  %u  %u dba max  %u  %u
 securefile_dba  %u  %u
 length  %d
 time min  %d
 time max  %d
 layer  %d
 opcode  %d
 scn min  %llu
 scn max  %llu
 xid  %d  %d  %d
 objno  %u
 con_id  %d
 skip corruption


If you try to restrict the dump on objno (object id) or xid (transaction id) then the trace file will skip any redo records generated by private threads / in-memory undo and report the text: “Skipping IMU Redo Record: cannot be filtered by XID/OBJNO”

The tablespace_no option can only be used when both rdba min and rdba max (rolback data block address range) have been specified.

The con_id option may only be legal when used to specify a PDB from the CDB

Remember – when you dump redo you get just the redo for your session; there is some scope for being selective, but the starting point would be all the redo for the PDB you’re working from.

September 9, 2022

Parallel Default

Filed under: Oracle,Parallel Execution,Troubleshooting — Jonathan Lewis @ 10:25 am BST Sep 9,2022

“Why did my query go parallel?”

It’s a question that crops up from time to time, usually followed by a list of reasons why it shouldn’t have gone parallel – no hints in the query, table is not declared parallel, parallel_degree_policy is set to manual etc.

When the question appeared recently on the Oracle developer forum it turned out that the table in question was declared as “parallel (degree default)”, which prompted the OP to ask the question: “is parallel = default not equivalent to parallel = 1”.

The answer to the question is that the two options are not equivalent – but that’s not the point of this note. Here’s a little script to test the claim:

drop table t1 purge;

create table t1 pctfree 90 as select * from all_objects where rownum <= 50000;

select degree, instances from user_tables where table_name = 'T1';

explain plan for select sum(object_id) from t1;
select * from table(dbms_xplan.display);

alter table t1 parallel (degree default);
select degree, instances from user_tables where table_name = 'T1';

explain plan for select sum(object_id) from t1;
select * from table(dbms_xplan.display);


I’ve created a table in the simplest possible way, but picked a fixed number of rows (to help reproducibility) and – because parallel is usually about “big” objects – I’ve left a lot of empty space (90%) in each block.

Then I’ve checked the execution plan for a very simple query that can only do a full tablescan, with the two declarations of parallelism set.

Here are the outputs of the 4 queries I’ve run:

DEGREE                                   INSTANCES
---------------------------------------- ----------------------------------------
         1                                        1

1 row selected.


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     5 |  1275   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 50000 |   244K|  1275   (2)| 00:00:01 |
---------------------------------------------------------------------------

9 rows selected.


DEGREE                                   INSTANCES
---------------------------------------- ----------------------------------------
   DEFAULT                                        1

1 row selected.



PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3110199320

----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |     5 |   350   (0)| 00:00:01 |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |     5 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |     5 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |     5 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          | 50000 |   244K|   350   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL| T1       | 50000 |   244K|   350   (0)| 00:00:01 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 4 because of degree limit

17 rows selected.


Clearly “parallel default” does not have the same effect as “parallel 1”. Any time you’ve got a query unexpectedly running parallel it’s possible that some table (or index on the table) has been created with a parallel degree of default. (More commonly, someone may have rebuilt an index “parallel N” to get the job done more quickly then forgotten to alter the index back to parallel 1 – or noparallel – afterwards.)

The point of this note, though, is that there are some questions you should not ask until you’ve spent a few minutes thinking about how you might create a model that gives you the answer. There are several reasons for this

  • The more you do it, the better and faster you get at modelling and understanding – and sometimes you really need to model a complex problem because you’re not allowed to show anything that looks like production in public.
  • If the simple model seems to disagree with the behaviour you see in production it may give you some clues about where to look in the production system for the source of the difference.
  • If the answer isn’t what you thought it would be you can change the question you put publicly to: “I thought Oracle would do X but it did Y; here’s how I tested, is there a flaw in the test?”

It took about 5 minutes for me to run up this demo – that might seem a bit quick but I’ve had a lot of practice (and it took a lot longer to write the note) – and it was, in this case, a waste of my time because I knew the answer; but I often run up little models before responding to questions on the forums or listservers because while I often think I know what the answer “ought” to be I do like to check before I say something that might be incorrect.

August 22, 2022

Encryption oddity

Filed under: Bugs,LOBs,Oracle,Troubleshooting — Jonathan Lewis @ 12:34 am BST Aug 22,2022

Here’s a strange problem (with possible workaround) that appeared in a thread on the Oracle developer forum a couple of days ago. It looks like the sort of problem that might be a memory overflow problem in a rarely use code path, given that it seems to need a combination of:

  • move LOB column
  • varchar2() declared with character semantics
  • transparent data encryption (TDE)

Here’s a simple script to generate a tiny data set that demonstrates the problem in my 19.11.0.0 system – but you’ll need to enable transparent data encryption if you want it to work (see footnote).

rem
rem     Script:         encryption_problem.sql
rem     Author:         Solomon Yakobson / Jonathan Lewis
rem     Dated:          August 2022
rem
rem     Last tested
rem             21.3.0.0 
rem             19.11.0.0
rem

drop table test_tbl_move purge;

create table test_tbl_move(
        encrypted_column varchar2(9 char) 
                encrypt using 'aes192' 'sha-1' no salt 
                constraint ttm_nn_ec not null,
        clob_column      clob
)
lob (clob_column) store as securefile
;

insert into test_tbl_move
values( '123456789', 'x')
;

commit;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'test_tbl_move',
                method_opt  => 'for all columns size 1'
        );
end;
/

alter table test_tbl_move move lob(clob_column) store as securefile;

The script creates a table with a clob column (enable storage in row by default), and a single encrypted varchar2() column declared using character semantics viz: “9 CHAR“.

We insert a row, gather stats – just in case – and then try to move the LOB storage (which also has to move the table). Would you expect a result like the following:

alter table test_tbl_move move lob(clob_column) store as securefile
*
ERROR at line 1:
ORA-12899: value too large for column ??? (actual: 27, maximum: 9)

That “maximum: 9” suggests that Oracle is complaining about encrypted_column – but why? Before worrying about that question, though, I wanted to pursue a different point: if you check the original post and compare the error message with the one above you’ll see that the “actual:” value was different. So I ran the entire test again, and again, and again; and by the time I had executed the entire script half a dozen times this is the list of error messages I had collected:

ORA-12899: value too large for column ??? (actual: 27, maximum: 9)
ORA-12899: value too large for column ??? (actual: 31, maximum: 9)
ORA-12899: value too large for column ??? (actual: 29, maximum: 9)
ORA-12899: value too large for column ??? (actual: 29, maximum: 9)
ORA-12899: value too large for column ??? (actual: 26, maximum: 9)
ORA-12899: value too large for column ??? (actual: 21, maximum: 9)

Your first thought might have been that Oracle is trying to copy an encrypted value while forgetting that it had been encrypted, but the variation in the actual lengths makes it look as if Oracle is injecting generating random data somehow (maybe through a pointer error) and generating garbage as a side-effect. Moreover if you know your encryption you’ll be suspicious of almost all the actual lengths reported because Oracle’s working is as follows:

  • Round the source length up to the next multiple of 16 bytes
  • aes192 encryption (the default encryption) will add 16 bytes (‘nomac’ and no salt)
  • adding a salt (default behaviour) will add a further 16 bytes
  • adding the ‘SHA-1’ integrity (the default) will add another 20 bytes

You can’t get an odd number of bytes as an encrypted value (unless, perhaps, the code thought it was reading a null-terminated character-string and there was a zero in the middle of it).

Workaround

You’ll see in the thread that Solomon Yakobson did a number of experiments to see what effects they had on the test case; but there was one experiment that he didn’t do that happened to be the first one I thought of. (There was a certain amount of luck in the choice, plus a bit of background suspicion from a couple of prior bugs I’d seen, plus it seemed to be the virtually the only thing that SY hadn’t tried).

Declaring a varchar2(x CHAR) is fairly rare – and with all the messing around with padding, encoding etc. going on, the code to handle multi-byte character sets might be a fruitful source of bugs. So I re-ran the test, but changed the declaration from varchar2(9 CHAR) to varchar2(9 [byte]), and Oracle happily ran my test to completion.

On its own this isn’t a complete workaround. If you’re running with a multi-byte character set a declaration using character semantics means Oracle allows you to store many more bytes than characters. Conversely, if you use byte semantics you will have to declare your column with a large enough byte count to store the number of (multi-byte) characters you really want – but then that could allow your users to insert more characters than you wanted (unless the character set was a fixed-width character set – but then you could waste a lot of space storing character strings – see this note about “in row” CLOB columns).

So, to use byte semantics with a character limit, you have to adopt a strategy that I once saw at a company running Peoplesoft (I assume it’s been changed since – it was a long time ago). They declared their varchar2() columns with far too many bytes (4 times the required character count) then added a check constraint on the length to restrict the number of characters. (In their case that resulted in tens of thousands of check constraints in the database with an undesirable overhead on dictionary cache latching and parse times).

Here’s an alternative declaration of the table that allows the alter table move command to work and still ensures the correct maximum number of characters in the varchar2() column:

create table test_tbl_move(
        encrypted_column varchar2(18)
                encrypt using 'aes192' 'sha-1' no salt
                constraint ttm_nn_ec not null 
                constraint ttm_ck_ec_len9 check (length(encrypted_column) <= 9),
        clob_column      clob
)
lob (clob_column) store as securefile
/

Table created.

SQL> insert into test_tbl_move values('0123456789','xxx');
insert into test_tbl_move values('0123456789','xxx')
*
ERROR at line 1:
ORA-02290: check constraint (TEST_USER.TTM_CK_EC_LEN9) violated

Footnotes

  1. If, like me, the last time you’d played around with encryption was in 11g you’ll find that a lot has changed in setting it up – not only in the added requirements for pluggable databases but also with a new command framework. (As usual Tim Hall’s blog on the topic is a good starting point if you want to do a quick experiment in a sand box.)
  2. The code sample include SHA-1 as the integrity algorithm – ‘NOMAC’ is the only alternative, and in any single table the same algorithm has to be used for all encrypted columns. (If you try to use SHA-1 on one column and NOMAC on another as you create the table Oracle will raise “ORA-28379: a different integrity algorithm has been chosen for the table”. More importantly – a note in the Oracle 21c reference manual states that SHA-1 is deprecated from that version onwards and advises moving from TDE column encryption to TDE tablespace encryption.

July 18, 2022

drop t/s bug

Filed under: Bugs,LOBs,Oracle,Partitioning,Troubleshooting — Jonathan Lewis @ 9:17 am BST Jul 18,2022

A recent thread on the MOS database admin forum (needs an account) demonstrated a number of little issues with bugs, debugging, complex syntax, and the never-ending list of “not quite complete” code that shows up when features (in this case LOBs and partitioning) collide.

It’s a silly little thing, but one to remind you that you always have to ask “What have I forgotten?”, “Does my test suite include every case I need to test?”

In this case we had been given a model where, after creating a composite partitioned table with a LOB column using a statement that referenced 3 different tablespaces, the OP had

  1. moved one component of this complex structure to a fourth tablespace
  2. dropped the tablespace that had been defined as the holder of the moved component
  3. renamed the fourth tablespace to match the name of the dropped tablespace
  4. called dbms_metadata.get_ddl() to generate a new table definition.

The result of this was that the generated statement included a reference to a tablespace with the unexpected name of “_$deleted$51$0”.

Sidenote: The name is an example of the value stored in ts$.name when you drop a tablespace. The row in ts$ is not deleted (or marked for deletion), instead ts$.online$ is set to 3 and ts$.name is set to reflect the tablespace number with a name of the format “_$deleted${ts$.ts#}$0″.

Here, with some cosmetic changes to the “create table” statement and with a preamble to create enough tablespaces and quotas, is the model supplied by the OP (if you’re thinking of running it make sure you read it carefully first):

rem
rem     Script:         drop_ts_pt_bug.sql
rem     Author:         Jean-François56 / Jonathan Lewis
rem     Dated:          July 2022
rem
rem     Last tested 
rem             19.11.0.0
rem             11.2.0.4
rem 

connect sys/sys as sysdba

prompt  =================================================
prompt  Clearing the stage - some steps may report errors
prompt  =================================================

drop table test_user.test purge;

drop tablespace jf1 including contents and datafiles;
drop tablespace jf2 including contents and datafiles;
drop tablespace jf3 including contents and datafiles;
drop tablespace jf4 including contents and datafiles;

create tablespace jf1 datafile size 100m;
create tablespace jf2 datafile size 100m;
create tablespace jf3 datafile size 100m;
create tablespace jf4 datafile size 100m;

alter user test_user quota unlimited on jf1;
alter user test_user quota unlimited on jf2;
alter user test_user quota unlimited on jf3;
alter user test_user quota unlimited on jf4;

prompt  =======================
prompt  Connecting to test user
prompt  =======================

connect test_user/test

drop table test purge;

create table test(
       idarchive                number(10,0),
       data                     blob,
       partition_date           date,
       customer                 number(10,0),
       prefix_archive_key       varchar2(5)
)
partition by range (partition_date)
subpartition by list (customer)
(
partition p1 
        values less than (to_date('2008-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss', 'nls_calendar=gregorian'))
        tablespace jf1
        lob (data) store as basicfile (tablespace jf2 chunk 16384)
        (
        subpartition sp values (1) 
                tablespace jf3
                lob (data) store as basicfile (tablespace jf2)
        )
);

alter table test move subpartition sp lob(data) store as (tablespace jf4);


connect sys/sys as sysdba

drop tablespace jf2 including contents and datafiles;
alter tablespace jf4 rename to jf2;


connect test_user/test

set long 20000
set longchunksize 20000

set linesize 132
column text_line format a128

begin
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'PRETTY',             true);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'TABLESPACE',         true);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'SEGMENT_ATTRIBUTES', true);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'STORAGE',            false);
end;
/

select dbms_metadata.get_ddl('TABLE','TEST') text_line from dual
/


The first thing I did after reading the posting was a quick search on MOS, using the four search terms: drop tablespace rename deleted. It was a lucky choice because on the first page of results from the Knowledge Base I found:

Renaming a Tablespace to An Already Dropped one Changes The Tablespace Name To "_$DELETED" (Doc ID 1937848.1)

Conveniently the notes in this document said: “This is caused by bug 18136584”,and that bug (and the problem) is labelled as “The bug is fixed in 12.2”. Unfortunately the OP was running 11.2.0.4, but that’s okay because the note also said: “Backport is feasible. However, A simple workaround is available.” So I thought I’d create the test case and check the workaround – which I why I’ve got the script.

It just so happened that I prepared, debugged and ran the script (without the workaround) on 19.11 before bothering to start up a VM with 11.2.0.4 – and I got the following output from my call to dbms_metadata.get_ddl():

  CREATE TABLE "TEST_USER"."TEST"
   (	"IDARCHIVE" NUMBER(10,0),
	"DATA" BLOB,
	"PARTITION_DATE" DATE,
	"CUSTOMER" NUMBER(10,0),
	"PREFIX_ARCHIVE_KEY" VARCHAR2(5)
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  TABLESPACE "TEST_8K_ASSM"
 LOB ("DATA") STORE AS SECUREFILE (
  ENABLE STORAGE IN ROW CHUNK 8192
  NOCACHE LOGGING  NOCOMPRESS  KEEP_DUPLICATES )
  PARTITION BY RANGE ("PARTITION_DATE")
  SUBPARTITION BY LIST ("CUSTOMER")
 (PARTITION "P1"  VALUES LESS THAN (TO_DATE(' 2008-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  TABLESPACE "JF1"
 LOB ("DATA") STORE AS BASICFILE (
  TABLESPACE "_$deleted$36$0" ENABLE STORAGE IN ROW CHUNK 16384 RETENTION
  NOCACHE LOGGING )
 ( SUBPARTITION "SP"  VALUES (1) SEGMENT CREATION IMMEDIATE
 LOB ("DATA") STORE AS BASICFILE (
  TABLESPACE "JF2" )
  TABLESPACE "JF3"
 NOCOMPRESS ) )

If you check line 18 (highlighted) you’ll see that even in 19.11 you can end up with generated statement that references “deleted” tablespaces – and here’s a funny little side effect (cut-n-paste from SQL*Plus – but your deletion number will probably be different):

SQL> drop table test;
drop table test
           *
ERROR at line 1:
ORA-00959: tablespace '_$deleted$36$0' does not exist

SQL> drop table test purge;

Table dropped.

So (part of) the bug is still present in 19.11; and on the plus side that means I can examine the workaround to see what it is and how it works. This is what the note says: “Execute below command once you finished with dropping and renaming the tablespace”:

alter table <table_name> modify default attributes tablespace <old_tablespace_name>;

That’s not actually going to do anything for us – but it’s an important clue to how we might be able to fix things; it also suggests why the bug “fixed” in 12.2 isn’t quite fixed in 19c – someone missed a bit of the code path: maybe the bit about LOBs, or maybe the bit about composite partitioned tables, or maybe (very precisely) the bit about partition-level default values for LOBs in composite partitioned tables. (And who knows what might be missing if we start looking at index-organized tables, nested table, and other complicated structures.)

Taking the clue from the suggested workaround, here are three possible fixes to try:

alter table test modify default attributes tablespace jf2;

alter table test modify default attributes                  lob(data) (tablespace jf2);

alter table test modify default attributes for partition p1 lob(data) (tablespace jf2);

The third of these options is the one that “works” – and the word is in quote marks because all I mean is that the generated SQL uses JF2 as the tablespace name rather than the “deleted” tablespace name – I make no guarantee about how future behaviour might vary from past behaviour after this change, and it’s likely to depend on exactly how you’re expecting to add and move partitions and subpartitions anyway.

Follow-up

The problem (which means, possibly, the omitted code path) comes from the need for handling default storage clauses. When we’re handling composite partitioning the only segments that come into existence are subpartition segments – but you can still specify physical (and logical) storage information at the table and partition level with the inference (possibly not stated explicitly) that any table-level metadata should be the default metadata for the partition level and any partition-level metadata should be the default metadata for subpartitions. What does this mean in terms of our original table creation script and the subsequent call to dbms_metadata?

You’ll notice that I’ve highlighted lines 8 – 11 in the output above from dbms_metadata.

  • Line 8 references tablespace test_8k_assm: I didn’t include a default tablespace at the table level for the table segments, but that’s the tablespace that happened to be my default tablespace when I ran the script.
  • Lines 9 – 11 define a default LOB storage clause with no specified tablespace and using securefiles (which is the default LOB storage for 19c). Again I didn’t specify anything about a table-level LOB in my original definition.
  • The rest of the generated definition has, apart from the “deleted” tablespace, reproduced my original definition – including the 16KB declaration of chunk size for the partition and the lack of specified chunksize for the subpartition.

So questions to think about:

  • what chunk size is / would be used in the subpartition – is it silently picking up the value specified for the partition, or is it silently picking up the default for the table, or is it simply using the “absolute” default of 1 block?
  • what happens if I execute a simple “add subpartition” on the existing p1 partition? Where will the subpartition be stored and what will its storage details look like.
  • What will I see if I execute a simple “add partition” to add a new partition to the table. Will I also get a physical subpartition and if so where will it be and what will its storage clause look like.
  • What would the dbms_metadata output have looked like if I had had a table-level LOB definition that specified tablespace jf2?
  • What side effects might appear if I extended the definition to interval partitioning, with automatic list subpartitions, and inserted a row that needed a new partition and subpartition?!

Underlying all these detailed questions, of course, is the specification for the maintenance work that the DBA is expected to handle, viz:

  • what is the defined strategy for adding new partitions and subpartitions to the table,
  • what is the strategy for aging out old partitions and subpartitions.
  • are there any plans about grouping partitions into “age-related” tablespaces
  • are tablespaces going to be renamed to transport them to another database

It’s possible that the anomaly in this note only showed up because the OP was experimenting with options, and maybe the ultimate production code will be based on a strategy that means the anomaly will never appear. It’s possible that the anomaly is already in the production system but only became visible when someone decided to think about archiving out old partitions and the archival code started raising errors. Playing around with models to discover what happens is time well spent; and modelling the full production life cycle before going live is a critical activity.

Some answers

To find out what we’ve actually got from the original create table statement we can query the views (user/all/dba/cdb):

USER_TABLES
USER_PART_TABLES
USER_TAB_PARTITIONS
USER_TAB_SUBPARTITIONS

USER_LOBS
USER_PART_LOBS
USER_LOB_PARTITIONS
USER_LOB_SUBPARTITIONS

USER_SEGMENTS

For the chunk sizes we find: user_lobs.chunk = 8192, user_lob_partitions.chunk = 2 (blocks?) and user_lob_subpartitions.chunk = 16,384 (bytes?). We also have user_part_lobs.def_chunk = 1 (block?). So the explicit p1 partition chunk size has cascaded down to its subpartition.

Side note: who needs consistency! You might also want to remember that when you call dbms_space.space_usage() procedure that appeared in 11g to report the space used in a (secure file) LOB, the “blocks” input parameters actually return chunks and (for CLOBS) the “bytes” input parameters actually return character counts.

For the tablespaces we find: user_lobs.tablespace_name= TEST_8K_ASSM, user_lob_partitions.tablespace_name = _$deleted$36$0 and user_lob_subpartitions.tablespace_name = JF2. We also have user_part_lobs.def_tablespace_name is null.

What happens if I try to add a subpartition to the existing p1 partition without first applying the fix:

SQL> alter table test modify partition p1 add subpartition sp2 values(2);
alter table test modify partition p1 add subpartition sp2 values(2)
            *
ERROR at line 1:
ORA-00959: tablespace '_$deleted$36$0' does not exist

Now try again but apply the fix before adding the subpartition:

SQL> alter table test modify default attributes for partition p1 lob(data) (tablespace jf2);

Table altered.

SQL> alter table test modify partition p1 add subpartition sp2 values(2);

Table altered.

Checking the data dictionary for the effects of changing the default attribute we find that user_lob_partitions.tablespace_name is now JF2, which has then been taken on by the new subpartition.

What about adding a new partition:

SQL> alter table test add partition p2 values less than (to_date('01-Jan-2010'));

Table altered.

SQL> select partition_name, chunk , tablespace_name from user_lob_partitions order by partition_name;

PARTITION_NAME              CHUNK TABLESPACE_NAME
---------------------- ---------- ------------------------------
P1                              2 JF2
P2                              1

SQL> select lob_partition_name, subpartition_name, chunk , tablespace_name from user_lob_subpartitions order by 1,2;

LOB_PARTITION_NAME   SUBPARTITION_NAME           CHUNK TABLESPACE_NAME
-------------------- ---------------------- ---------- ------------------------------
SYS_LOB_P20609       SP                          16384 JF2
SYS_LOB_P20609       SP2                         16384 JF2
SYS_LOB_P20622       SYS_SUBP20621                8192 TEST_8K_ASSM

The new partition has no tablespace_name, but it has automatically generated a subpartition (values (default)), which has climbed the tree to the table-level to set the tablespace for the LOB, and that had defaulted to the tablespace of the table itself, which was the user default tablespace of TEST_8K_ASSM. Maybe we should have modified the “default attributes lob(data)” at some point so that the user_part_lobs.def_tablespace_name was not null.

I’ll finish with just one more comment – you’ve seen how messy things can get and how much detail could be overlooked when handling marginally complex composite partitioned table. Do you really think that interval partitioning and automatic list partitioning are really going to mean you don’t have to worry about partition maintenance code? Possibly. If you plan to have one huge tablespace for all the bits and never have to worry about backing up and restoring that tablespace you will be able to forget about all the housekeeping code, but realistically you’ll need to know how to check and change the metadata and rename, move or otherwise manipulate segments so make sure you know what’s going to happen so that you don’t have to work it out when everyone’s running around in panic mode.

July 8, 2022

ORA-00054 pt.2

Filed under: Infrastructure,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 11:03 am BST Jul 8,2022

This is the follow-up to an initial post that covered some details of using the errorstack and ksq traces as and aid to finding the cause of an intermittent ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired. We were (hypothetically) looking at a scenario where a batch-like process would occasionally fail raising this error and leaving us to deal with an error that we could reproduce on demand.

Recapping the previous article, we saw that we could set a system-wide call to dump an errorstack at level 1 whenever the error occurred, producing a trace file containing the statement that had raised the error along with its SQL ID and a call stack that would allow us to find the object_id of the specific object that had caused the lock conflict. Once we had the SQL ID we then had the option to set a system-wide call to dump the ksq (Kernel Service Enqueues) trace whenever that statement was executed [but see footnote 1]. Whenever the statement succeeded this would give us a complete listing of the locks (enqueues) needed by the statement, and when the statement failed (due to ORA-00054) we would be able to see very clearly where the breakdown had occurred.

alter system set events '54 trace name errorstack level 1'; 
alter system set events 'trace[ksq][SQL:3afvh3rtqqwyg] disk=highest';

Neither option, however, would tell us anything about the competing session or about the SQL that caused the competing lock to come into existence; all we could hope for was some hint about why the ORA-00054 and little clue about the part of the application that was causing the conflict.

Using the SystemState

The basic systemstate dump tends to be rather large – for starters, it’s going to include a lot of information about currently open cursors for every (user) session – so it’s not something you really want to make frequent use of, and you don’t want it to be triggered frequently. But if you have an occasional (and critical) batch failure due to an intermittent locking problem then you can issue a call like:

alter system set events '00054 trace name systemstate level 2, lifetime 1';

The “lifetime 1” ensures that a session will only dump a systemstate once in its lifetime – which may be necessary to ensure the system isn’t overloaded with by larger numbers of system state dumps begin generated in a very short time interval. You may need to allow for more than just 1 dump per session, though.

In fact, since I want to dump both the errorstack and the systemstate when the ORA-00054 occurrs the critical three lines in my model were as follows

alter system set events '54 trace name errorstack level 1; name systemstate level 2, lifetime 1';

alter table child enable novalidate constraint chi_fk_par;

alter system set events '54 trace name systemstate off; name errorstack off';

So what do you get if you make this call and then try to re-enable a foreign key constraint when the parent table is locked. In my very small system, with just a couple of live sessions, and shortly after instance startup my tracefile was about 1.5MB and 16,000 lines in size, so not something to read through without a little filtering.

From the Call Stack Trace produced by the errorstack dump I could see that the first argument to ktaiam (and the associated function calls) was 00001EA61.This told me that I would find at least one session holding a lock identified as TM-0001EA61 so that’s the text I searched for next. Note the little trap: the value reported in the call stack has an extra leading zero. I found the TM enqueue 11,000 lines further down the file in a “State Object (SO:)” of type “DML Lock”:

        SO: 0x9cf0e708, type: DML lock (83), map: 0x9bac7b98
            state: LIVE (0x4532), flags: 0x0
            owner: 0x9cf648d8, proc: 0xa0eed9f0
            link: 0x9cf0e728[0x9cf64948, 0x9cf64948]
            conid: 3, conuid: 3792595, SGA version=(1,0), pg: 0
        SOC: 0x9bac7b98, type: DML lock (83), map: 0x9cf0e708
             state: LIVE (0x99fc), flags: INIT (0x1)
        DML LOCK: tab=125537 flg=11 chi=0
                  his[0]: mod=6 spn=348
2022-07-06 23:34:49.285*:ksq.c@10787:ksqdmc(): Enqueue Dump        (enqueue) TM-0001EA61-00000000-0039DED3-00000000  DID: ksqlkdid: 0001-0029-0000001F

        lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  res_flag: 0x16
        mode: X, lock_flag: 0x20, lock: 0x9bac7bc0, res: 0x9e7ec5e8
        own: 0xa086dc70, sess: 0xa086dc70, proc: 0xa05617d0, prv: 0x9e7ec5f8
        SGA version=(1,0)

In my case there was only one holder for this lock, but in a live system the same object could be locked by many users.

Having found a state object for the lock I had to identify the process holding this lock – which means searching backwards to find the parent of this state object, then its parent, and so on. Here, in the order I found them (which is the reverse order they appear in the file) are the lines I found:

        SO: 0x9cf0e708, type: DML lock (83), map: 0x9bac7b98
      SO: 0x9cf648d8, type: transaction (85), map: 0x9bbebdc8
      SO: 0x8df2ac88, type: LIBRARY OBJECT LOCK (118), map: 0x63693c60
      SO: 0x8df4fa78, type: LIBRARY OBJECT LOCK (118), map: 0x81af83e8
    SO: 0xa0efc020, type: session (4), map: 0xa086dc70
  SO: 0xa0eed9f0, type: process (2), map: 0xa05617d0

So the DML lock is owned by a transaction, which is owned by a session (with a couple of “library object lock” state objects “in the way”) which is owned by a process. You may find other “extraneous” lines on the way but the key detail to note is the hierarchical pattern of state objects – keep going until you’re reached the session and process state objects.

Once we’ve got this information we need to search the systemstate dump for any cursors that this session/process has open to see if we can find something that looks like a statement that could have created the lock, so we need to search for state objects of type “LIBRARY OBJECT LOCK” with the correct owner information.

Technically they would have to appear in the trace file between the process state object we’ve found and the next process state object listed in the file, but it would be a little tedious to do this seacrh with a text editor so I switched from using vi to using grep – and here’s a search condition that will identify and print part of each state object owned by this session and process:

grep -B+2 -A+13 "owner: 0xa0efc020, proc: 0xa0eed9f0" or19_ora_19369.trc >temp.txt

The hexadecimal value following “owner: “ is from the session state object, the value following “proc: “ is from the process state object. or19_ora_19369.trc is my trace file, and each time I’ve found a matching line I’ll write the 2 lines before it, the line itself, and 13 lines after it to the file temp.txt.

In my example I found 29 state objects owned by the process, of which 25 were of type “LIBRARY OBJECT LOCK” – and I’ve reported two of them below:

      SO: 0x8df36d38, type: LIBRARY OBJECT LOCK (118), map: 0x7b54afa0
          state: LIVE (0x4532), flags: 0x1
          owner: 0xa0efc020, proc: 0xa0eed9f0
          link: 0x8df36d58[0x8df496c8, 0x8df31608]
          child list count: 0, link: 0x8df36da8[0x8df36da8, 0x8df36da8]
          conid: 3, conuid: 3792595, SGA version=(1,0), pg: 0
      SOC: 0x7b54afa0, type: LIBRARY OBJECT LOCK (118), map: 0x8df36d38
           state: LIVE (0x99fc), flags: INIT (0x1)

      LibraryObjectLock:  Address=0x7b54afa0 Handle=0x82967ac0 Mode=N
        CanBeBrokenCount=1 Incarnation=1 ExecutionCount=1

        User=0xa086dc70 Session=0xa086dc70 ReferenceCount=1
        Flags=CNB/[0001] SavepointNum=155 Time=07/06/2022 23:29:53
      LibraryHandle:  Address=0x82967ac0 Hash=5ce74124 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
        ObjectName:  Name=lock table parent in exclusive mode


      SO: 0x8defb228, type: LIBRARY OBJECT LOCK (118), map: 0x944f8880
          state: LIVE (0x4532), flags: 0x1
          owner: 0xa0efc020, proc: 0xa0eed9f0
          link: 0x8defb248[0x96fbe5e8, 0x8df496c8]
          child list count: 0, link: 0x8defb298[0x8defb298, 0x8defb298]
          conid: 3, conuid: 3792595, SGA version=(1,0), pg: 0
      SOC: 0x944f8880, type: LIBRARY OBJECT LOCK (118), map: 0x8defb228
           state: LIVE (0x99fc), flags: INIT (0x1)

      LibraryObjectLock:  Address=0x944f8880 Handle=0x896af550 Mode=N
        CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0
        Context=0x7f125ecf34b8
        User=0xa086dc70 Session=0xa08664b8 ReferenceCount=1
        Flags=[0000] SavepointNum=0 Time=07/06/2022 23:29:53
      LibraryHandle:  Address=0x896af550 Hash=0 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
        Name:  Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=3

Details to note:

  • The last line I’ve selected from the first state object looks like a good candidate SQL statement for creating the blocking lock. The line above it, showing “Hash=5ce74124”, gives us the hexadecimal equivalent of the v$sql.hash_value for this statement.
  • I believe the last line of the second state object is telling us that the associated statement has been flushed from the library cache but I’m not sure that I’m interpreting that correctly. You’ll notice though that the line does gives us a suitable namespace and type for something to do with a SQL or PL/SQL cursor (and a hash value of zero – so if it is/was a (Pl/)SQL statement that’s the clue that it’s not in memory any more).

Comparing these two state objects, the things I want to find with minimal hassle are (I hope) the lines that start with the text “ObjectName” that appear one line after a line holding the text “Hash=” followed by anything but a zero, from state objects labelled “LIBRARY OBJECT LOCK”. Here’s a one-line (wrapped) grep command to do that, followed by the (slightly re-formatted) results I got from my trace file:

grep -B+2 -A+13 "owner: 0xa0efc020, proc: 0xa0eed9f0" or19_ora_19369.trc |
                grep -A+15 "SO:.*LIBRARY OBJECT LOCK" |
                grep -A+1  "Hash=[^0]" |
                grep "ObjectName"


        ObjectName:  Name=lock table parent in exclusive mode 
        ObjectName:  Name=select pctfree_stg, pctused_stg, size_stg,initial_stg, next_stg, minext_stg, maxext_stg,
                     maxsiz_stg, lobret_stg,mintim_stg, pctinc_stg, initra_stg, maxtra_stg, optimal_stg,
                     maxins_stg,frlins_stg, flags_stg, bfp_stg, enc_stg, cmpflag_stg, cmplvl_stg,imcflag_stg, 
                     ccflag_stg, flags2_stg from deferred_stg$  where obj# =:1 

It looks like I’ve got the information I need (or, at least) a good clue about why my batch session raised an ORA-00054; and, in a real system the other open cursors reported for this session might give me enough information to work out where the problem is coming from.

Warnings

The first warning is just a reminder that there may have been multiple sessions/processes holding locks on table, so don’t stop after finding the first occurrence of the TM-xxxxxxxx lock, check to see if there are any more and repeat the search for its owning process and owned Library Object Locks.

The second warning is that all this work may not give you an answer. A session may have locked a table ages ago and still have an active transaction open; if you’re unlucky the statement that produced the lock may have been flushed from the library cache. A comment I made in 2009 about finding the locking SQL is just as relevant here for the systemstate dump immediately after the ORA-00054 as it was when I first wrote about querying v$sql all those years ago. You may get lucky, and this prompt dumping of the systemstate may make you luckier, but there’s no guarantee you’ll find the guilty statement.

Furthermore, the state objects that I’ve been looking at are “LIBRARY OBJECT LOCK” state objects – these are the things that linke to a cursor that’s held open by the session (i.e. things you’d see in v$open_cursor) so if session introduced a table lock then closed the cursor (and hasn’t commited) the table will still be locked but the systemstate won’t have a state object for the statement that locked the table. For example when I created and executed the following procedure to lock the table using an “execute immediate” I found a state object for the procedure call, but I didn’t find a state object for the “lock table” statement:

create or replace procedure lock_p as
begin
        execute immediate 'lock table PARENT in exclusive mode';
end;
/

On the other hand when I created the procedure with embedded SQL I found state objects for both the procedure call and the SQL statement.

create or replace procedure lock_p as
begin
        LOCK TABLE test_user.PARENT in exclusive mode;
mode';
end;
/

In passing, the text of the “ObjectName:” you find for the procedure call varies depending on whether you “execute lock_p” or “call lock_p()” from SQL*Plus. The former shows up as “BEGIN lock_p; END;” and the latter as “call lock_p()”

The second warning is to remember that there may have been multiple sessions/processes holding locks on table, so don’t stop after finding the first occurrence of the TM-xxxxxxxx lock, check to see if there are any more and repeat the search for its owning process and owned Library Object Locks.

Conclusion

If you need to track down the cause of an intermittent locking problem that results in an Oracle error ORA-00054 then enabling a system-wide dump of the systemstate (level 2 is sufficient) on error 54 may allows you to find out what everyone else was doing around the time of the problem.

If you don’t already know which object is the locked object that’s the direct cause of the ORA-00054 then enabling the errorstack trace at the same time will allow you to find the object_id of the object, so that you can then find the the processes/sessions that are holding a DML (TM-) lock with the correct id.

For each State Object for the relevant DML lock you can track backwards up the trace file, and then use the address of each pair of session and process state objects to find all of their “open cursor”/”library open lock” state objects, and check the “ObjectName” of each to see the SQL or PL/SQL text. This may give you the information you need to identify where/how the application is going wrong.

Cursors close, and cursors that are still open (but not pinned) can be flushed from the library cache, and a lock may have been placed by a cursor whose text is no longer available, or not part of the systemstate dump, so this method is not perfect – however, since the systemstate dump takes place the instance the error occurs it doess improve your chances that the problem statement is still available and reported.

Footnote 1

Trouble-shooting when the problem is not reproducible on demand often puts you in a position where you have to make a trade-off between information gained and overheads required. Dumping an errorstack for every occurrence of an ORA-00054 is probably a small overhead since (you hope that) you don’t generate thousands of locking problems per hour . In the unlikely case that the errors occur very frequently to every session that connects you might be able to limit the overhead by adding the “lifetime” clause to the call, e.g:

alter system set events '00054 trace name errorstack level 1 , lifetime 5';

This would result in every session being able to dump the trace only on the first 5 occasions it triggered the error.

On the other hand, I can think of no effective way of limiting the ksq trace (beyond restricting its action to a specific SQL_ID). If the problem statement executes a couple of times in each batch run, and there are only a few batch runs per day, then the overhead will be small when you’re trying to find the details of a problem that happens one a week. But if the problem statement runs thousands of times in each batch run then it would probably be very expensive to enable the ksq trace to catch an intermittent error.

July 6, 2022

ORA-00054 pt.1

Filed under: Infrastructure,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 8:20 am BST Jul 6,2022

If you don’t recognise the number the relevant extract from the oraus.msg file is:

00054, 00000, "resource busy and acquire with NOWAIT specified or timeout expired"
// *Cause:  Interested resource is busy.
// *Action: Retry if necessary or increase timeout.

The error is typically the result of application code trying to do some DDL on an object that is locked in an incompatible mode by some other session, and when it shows up in the log from some batch process it can be difficult to find out what was going on at the same time to cause the problem because the error message tells you nothing about the blocker.

Nenad Noveljic has just published a note discussing what you can do to trouble-shoot this type of problem, but I thought I’d write up a note on one of the ways I’d address the problem, in this case starting from a specific question on the Oracle-L list-server.

Statement of problem

A piece of application code disables the foreign key constraints on a table, inserts (using “insert as select” with the /*+ append */ hint) a very large volume of data (tens to hundreds of millions of rows), then executes a pl/sql loop to re-enable, in novalidate mode, all the foreign key constraints on that table.

****Sample code *****
INSERT /*+Append*/ INTO TAB1 (c1,c2.....)
        SELECT (ROWNUM + sqq_key) ,col1, col2, col3..... from....tab2,
tab3, tab4.....;
COMMIT;

FOR I IN (
        SELECT TABLE_NAME, CONSTRAINT_NAME 
        FROM ALL_CONSTRAINTS 
        WHERE TABLE_NAME = v_table_nm 
        AND CONSTRAINT_TYPE = 'R' 
        AND STATUS = 'DISABLED'
) LOOP
   EXECUTE IMMEDIATE ('ALTER TABLE ' || v_table_nm || ' ENABLE NOVALIDATE CONSTRAINT '|| I.CONSTRAINT_NAME);
 END LOOP I;

From time to time one of the calls to re-enable a constraint fails raising ORA-00054, so the OP had set event 54 to do a systemstate dump to see if that would help identify the cause of the error:

alter system set events '54 trace name systemstate level 266, lifetime 1';

Picking through the resulting trace file, though, the OP got the impression that the session was blocking itself, leading to a worry that somehow the “commit;” wasn’t releasing locks properly so that the lock due to the insert was blocking the lock needed for the “enable constraint”.

Trouble-shooting

Reading the key question “is the commit not working properly?” my first thought was “it’s almost guaranteeable that the commit is doing what it’s supposed to do”; and I had no intention of reading a systemstate dump (or reading the bits that had been extracted by someone who had (almost guaranteeably) misinterpreted it).

Where you start trouble-shooting does depend to a degree on how much you already know about what’s going on. The OP, for example, already knew that the error appeared in response to one of the “alter table” commands and was also able to identify which constraint had caused the error to appear – but if you don’t even have that information how do you begin?

Since Oracle is raising an error (and one that probably doesn’t occur very frequently) you could just set the system to dump an errorstack every time the error occurred. (For a repeatable test you might use “alter session”, for a randomly occurring event you might have to “alter system” unless you were able to modify the batch code itself to issue its own “alter session” at the right point.) To minimise the size of the trace file level 1 should suffice, at least to begin with:

alter system set events '54 trace name errorstack level 1';

Here’s the start of the trace information that I produced by setting this event and trying to enable a constraint when I knew the call would be blocked by a competing lock:

dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
<error barrier> at 0x7ffcdfb0dd20 placed dbkda.c@296
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
----- Current SQL Statement for this session (sql_id=3afvh3rtqqwyg) -----
alter table child enable novalidate constraint chi_fk_par

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst1()+95         call     kgdsdst()            7FFCDFB0D180 000000002
                                                   7FFCDFB074B0 ? 7FFCDFB075C8 ?
                                                   000000000 000000082 ?

Key points: the report starts with the SQL that triggered the error and gives us its SQL_ID. Before doing anything else, then, let’s consider what locks might be necessary for the constraint to be enabled (noting, particularly, that in this case the constraint is being enabled with the “novalidate” option. The OP suspects that the problem appears because the session can’t acquire a lock on the child table – but maybe there are other locks involved.

Let’s model the scenario with a parent/child table and referential integrity constraint and see what locks appear as we try to enable the constraint with “novalidate” . But we want to find out what locking goes on when there are no problems with competing sessions and no errors raised. You’ll note that I used the pause command in mid-script so that I could connect through another session if I wanted to introduce some competing locks:

rem
rem     Script:         errorstack.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2022
rem

create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
)
;

create table child(
        id_p    number(4)       constraint chi_fk_par references parent,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id) 
)
;

alter table child disable constraint chi_fk_par;

pause Press return

-- alter system set events '54 trace name errorstack level 1'; 
alter system set events 'trace[ksq][SQL:3afvh3rtqqwyg] disk=highest';

alter table child enable novalidate constraint chi_fk_par;

alter system set events 'trace[ksq][SQL:3afvh3rtqqwyg] off';
-- alter system set events '54 trace name errorstack off'; 

This is using the “new” trace mechanism, tracing the “ksq” (Kernel Service Enqueues) component of the RDBMS library (“oradebug doc component rdbms”), restricted to tracing only when the current SQL statement has a specific SQL_ID.

If you examine the trace file you’ll find lots of lines referencing the source file ksq.c, with call names like ksqgtlctx (get lock context ?) and ksqcli (clear lock information?). I’m just going to grep out the lines that contain the text “mode=”:

2022-07-05 10:44:13.642*:ksq.c@9175:ksqgtlctx(): *** TM-0001E8C8-00000000-0039DED3-00000000 mode=4 flags=0x401 why=173 timeout=0 ***
2022-07-05 10:44:13.642*:ksq.c@9175:ksqgtlctx(): *** TM-0001E8C6-00000000-0039DED3-00000000 mode=4 flags=0x401 why=173 timeout=0 ***
2022-07-05 10:44:13.644*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000005-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.644*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000006-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.646*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000007-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.646*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000008-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.646*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000009-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.647*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-0000000A-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.647*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-0000000B-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.648*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-0000000C-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.649*:ksq.c@9175:ksqgtlctx(): *** TX-00090014-000022C2-0039DED3-00000000 mode=6 flags=0x401 why=176 timeout=0 ***
2022-07-05 10:44:13.649*:ksq.c@9175:ksqgtlctx(): *** TM-00000061-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.651*:ksq.c@9175:ksqgtlctx(): *** TM-00000049-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.651*:ksq.c@9175:ksqgtlctx(): *** TM-00000004-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.652*:ksq.c@9175:ksqgtlctx(): *** TM-0000001F-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.653*:ksq.c@9175:ksqgtlctx(): *** TM-00000012-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.654*:ksq.c@9175:ksqgtlctx(): *** RC-00000002-00000012-0039DED3-00000000 mode=4 flags=0x10401 why=294 timeout=5 ***
2022-07-05 10:44:13.654*:ksq.c@9175:ksqgtlctx(): *** RC-00000002-0000001F-0039DED3-00000000 mode=4 flags=0x10401 why=294 timeout=5 ***
2022-07-05 10:44:13.657*:ksq.c@9175:ksqgtlctx(): *** TM-00004887-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.657*:ksq.c@9175:ksqgtlctx(): *** TM-0001E643-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.657*:ksq.c@9175:ksqgtlctx(): *** TX-0003001C-0000223A-0039DED3-00000000 mode=6 flags=0x401 why=176 timeout=0 ***

  • The first two locks are table locks in mode 4 (share) on the child and parent tables respectively (The values 0x0001E8C8 and 0x0001E8C6 are the hexadecimal equivalents of the object_ids).
  • The next 8 locks in mode 6 (exclusive) are something to do with the child table (same object_id appearing) and a check in v$lock_type tells us they’re something to do with compression.
  • Then we see a TX (transaction) lock in mode 6, 5 TM locks on very low (data dictionary) object_ids in mode 3, two RC (result cache) locks in mode 4.
  • Finally there’s two TM locks and a TX lock – and the table locks are for the aud$unified table and one of its partitions.

It’s probably safe to ignore the locking related to the recursive transactions (especially in since the locks show non-zero timeouts); I don’t know what the ZH locks are about but the increasing nature of the second component of the lock id suggests that they’re not likely to be a problem (even though they have to be acquired without a timeout).

The thing that catches my eye is that we have to lock both the child and the parent – and until I did this test I wasn’t certain that for a “novalidate” constraint there would be any need for a data lock on the parent – though a rowcache lock to check for the legality of the constraint definition would make sense.

So maybe the problem isn’t about the child table, possibly it’s about the parent. I’m going to rerun the whole test again, enabling the ksq trace and the errorstack, and in the pause that’s in the script I’m going to lock the parent table from another session before enabling the constraint. From the new trace file I’m going to show you more lines about some of the TM locks (which will be for new object_ids since the code drops and recreates tables) and then a few more lines from the error stack.

First the locking information:

2022-07-05 11:20:54.686*:ksq.c@9100:ksqgtlctx(): ksqtgtlctx: PDB mode 
2022-07-05 11:20:54.688*:ksq.c@9175:ksqgtlctx(): *** TM-0001E8D4-00000000-0039DED3-00000000 mode=4 flags=0x401 why=173 timeout=0 ***
...
2022-07-05 11:20:54.688*:ksq.c@9851:ksqgtlctx(): ksqgtlctx: updated lock mode, mode:4 req:0
2022-07-05 11:20:54.688*:ksq.c@9960:ksqgtlctx(): SUCCESS

2022-07-05 11:20:54.689*:ksq.c@9100:ksqgtlctx(): ksqtgtlctx: PDB mode 
2022-07-05 11:20:54.689*:ksq.c@9175:ksqgtlctx(): *** TM-0001E8D2-00000000-0039DED3-00000000 mode=4 flags=0x401 why=173 timeout=0 ***
...
2022-07-05 11:20:54.689*:ksq.c@9001:ksqcmi(): returns 51
2022-07-05 11:20:54.689*:ksq.c@9948:ksqgtlctx(): FAILURE: returns 51

Note how we can see here that it’s the attempt to lock the parent (lower object id:0x0001E8D2) that makes Oracle raise the error. Notice, by the way, that internally it’s raising error 51 (which is “timeout occurred while waiting for a resource”) not error 54.

And here’s a section of the error stack – quite a long way down:

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
...
ktagetg0()+929       call     ktaiam()             00001E8D2 000000004 000000000
                                                   7FFF26A8BC38 ? 000000000 ?
                                                   000000000 ?
ktagetp_internal()+  call     ktagetg0()           00001E8D2 ? 000000004 ?
141                                                000000004 ? 7FFF26A8BC38 ?
                                                   000000000 ? 000000000 ?
ktagetg_ddlX()+323   call     ktagetp_internal()   00001E8D2 000000000 000000004
                                                   000000000 ? 000000000
                                                   F7A4F57D00000000
ktagetg_ddl()+30     call     ktagetg_ddlX()       00001E8D2 ? 000000000 ?
                                                   000000004 ? 000000000 ?
                                                   000000000 ? 000000000
kkdllk0()+1551       call     ktagetg_ddl()        00001E8D2 ? 000000000 ?
                                                   000000004 ? 000000000 ?
                                                   000000000 ? 000000000 ?

You’ll notice in the list of arguments for these calls (which relate to getting locks for DDL) that the value 00001E8D2 (the object_id) and 000000004 (requested lock mode) keep appearing. So simply setting errorstack to level 1 will give you the SQL statement that caused the ORA-00054, and you will be able to find the object_id that Oracle was unable to lock and the attempted lock mode.

Conclusion

It can be a little tricky to track down the source of Oracle error ORA-00054 when it appears unpredictably and cannot be reproduced on demand, but there are two options that help you to get started.

The error is about attempting to lock something – so if you set the system to dump an error stack (even at only level 1) when the error occurs you should be able to find the object_id of the object that is the source of the problem and the lock mode being requested. It’s probably the case that this will be a small overhead when run at the system level since you probably don’t generate lock timeouts very often.

If you are lucky enough to know the SQL ID of statement that runs into the problem you can enable ksq (lock) tracing for that specific SQL statement and that will make it a lot easier to see exactly which lock attempt failed. If you have no idea of the SQL_ID, then ksq tracing for the whole system will probably be too much of an overhead to leave in place. The benefit of the ksq trace is that if you don’t know what locking your application code needs you will be able to see all the locks involved, and simply knowing what locks are involved may be enough to point you in the right direction.

Note: (which I haven’t tested) if the guilty SQL is called from inside a package, then using the SQL_ID of the package call may result in ksq tracing for every statement call inside the package call, and that might be a bearable overhead.

Footnote:

Although this allows us to discover where the locking conflict appeared, it doesn’t tell us what the blocking session did to get in our way. In the next installment I’ll describe how we can drill through through the systemstate dump to find out (if the information is still there) what the other session was doing to cause the problem.

Footnote 2:

It’s worth mentioning that in some cases of locking it can be a good idea to use the “wait N” (for a small value of N) option in your code as a wait of a few seconds may allow you to find some clues about blockers in the ASH (v$active_session_history / dba_hist_active_sess_history) information when a timeout occurs. In this specific case, though, I don’t think there’s a variant of the syntax that would allow you to do something like “alter table modify constraint …. wait 5”.

Update (a few minutes after publishing)

If you check comment #1 below you’ll see that Alexander Chervinskiy has supplied a method for getting the recursive locks to wait for a limited period by setting the parameter ddl_lock_timeout to a small value (in seconds). This can be done at the session or at the system level.

June 20, 2022

ORA-29857

Filed under: Oracle,Problem Solving,trace files,Troubleshooting — Jonathan Lewis @ 3:30 pm BST Jun 20,2022

This is another little “case study” on solving a problem. It’s based on a question on the MOSC DBA Forum (needs an account) about trying to drop a tablespace and seeing the error “ORA-29857: domain indexes and/or secondary objects exist in the tablespace” when (apparently) there are no domain indexes or secondary objects in the tablespace according to the queries that have been used to try and locate the problem objects.

Not being completely familiar with all the possibilities that might be relevant to that particular error, but having a vague memory of seeing an oddity once before, I thought I’d do a quick check on a couple of possible causes.

  • Case 1, which I assumed ought to raise the error because the tablespace held some objects that met the requirements for the error: I created a table in a tablespace called users and, for the domain indexes a couple of Context (Oracle Text) indexes in a tablespace called test_8k_assm. Then I tried to drop the tablespace holding the context indexes. Here’s the relevant cut-n-paste (19.11.0.0):
SQL> drop tablespace test_8k_assm including contents and datafiles;
drop tablespace test_8k_assm including contents and datafiles
*
ERROR at line 1:
ORA-29857: domain indexes and/or secondary objects exist in the tablespace

  • Case 2, which I thought would drop the tablespace because the model didn’t meet the requirement of the error message: I swapped the locations of the table and its indexes, creating the table in the tablespace called test_8k_assm and the Context indexes in the tablespace called users then I tried to drop the tablespace holding only the table. Here’s the relevant cut-n-paste (19.11.0.0):
SQL> drop tablespace test_8k_assm including contents and datafiles;
drop tablespace test_8k_assm including contents and datafiles
*
ERROR at line 1:
ORA-29857: domain indexes and/or secondary objects exist in the tablespace

So, the error message is misleading and, possibly, this might explain why the queries the OP had posted weren’t reporting any guilty objects. Whatever the error message says, the domain indexes that you need to deal with may not be in the tablespace you want to drop.

I also have a question-mark over “secondary objects” – I don’t know what objects (other than the bits and pieces that make up domain indexes) are deemed to be secondary.

Investigating

It’s at this point that the purpose of the blog note starts to appear – I don’t know the answer to the OP’s question and I’ve discovered a gap in my knowledge so I’m willing to spend a little time extending my knowledge and trying to give a little help to the OP.

We can start with the view dba_secondary_objects (based on secobj$ from script dcore.bsq) which tells us that there are two possible sources of secondary objects – index types and statistics types:

SQL> select dbms_metadata.get_ddl('VIEW','DBA_SECONDARY_OBJECTS') view_text from dual;

-- with a little cosmetic tidying:

VIEW_TEXT
-------------------------------------------------------------------------------------------
CREATE OR REPLACE FORCE NONEDITIONABLE VIEW "SYS"."DBA_SECONDARY_OBJECTS" ("INDEX_OWNER", "INDEX_NAME", "SECONDARY_OBJECT_OWNER", "SECONDARY_OBJECT_NAME", "SECONDARY_OBJDATA_TYPE")
AS
select u.name, o.name, u1.name, o1.name, 
       decode(s.spare1, 0, 'FROM INDEXTYPE',1, 'FROM STATISTICS TYPE')
from   sys.user$ u, sys.obj$ o, sys.user$ u1, sys.obj$ o1, sys.secobj$ s
where  s.obj# = o.obj# and o.owner# = u.user# and
       s.secobj# = o1.obj#  and  o1.owner# = u1.user#;

With this view definition I’d be tempted to do a little drilling down, with queries that return increasing amounts of data, and then pick subsets to expand to higher precision, e.g:

select  secondary_objdata_type, count(*)
from    dba_secondary_objects
group by
        secondary_objdata_type
/

Are there any secondary objects that aren’t domain indexes – if there are a few then query the view for just those to find out what they are and who owns them.

select  *
from    dba_secondary_objects
where   secondary_objdata_type = 'FROM STATISTICS TYPE'
/

In fact if the total number of secondary objects was small I might report the details about all of them and try to track them down, or if there was only one schema using that tablespace I might restrict my queries to the subset of secondary objects owned by that schema:

select  distinct index_owner, index_name
from    dba_secondary_objects
where   secondary_object_owner = 'TEST_USER'
;

The OP had a rather larger task than my little model – there were 14,499 segments in the tablespace, and a query for (segment_owner, segment_type, count(*)) returned 332 rows – so had taken a similar approach to mine to try and minimise the amount of work needed, starting with the following queries (which I’ve enhanced slightly and given a cosmetic makeover):

prompt  ==================================
prompt  Domain indexes (in the tablespace)
prompt  ==================================

select  owner, index_name, table_owner, table_name, tablespace_name
from    dba_indexes 
where   index_type = 'DOMAIN'
-- and  tablespace_name = 'TEST_8K_ASSM'        -- error: domain indexes don't have tablespaces
;

prompt  ==========================================
prompt  Domain indexes on tables in the tablespace
prompt  ==========================================

select  t.table_name, t.tablespace_name, i.index_name, i.tablespace_name
from    dba_indexes i, dba_tables t
where   t.table_name = i.table_name
and     t.owner = i.owner
and     i.index_type = 'DOMAIN'
and     t.tablespace_name = 'TEST_8K_ASSM'
;

prompt  ===================================================
prompt  Secondary objects that are tables in the tablespace
prompt  ===================================================

select  index_owner, index_name, secondary_object_owner, secondary_object_name, secondary_objdata_type
from    dba_secondary_objects
where   secondary_object_name in (
                select table_name 
                from dba_tables 
                where tablespace_name = 'TEST_8K_ASSM'
        )
;

prompt  ======================================================
prompt  Indexes in the tablespace that are secondary objects ?
prompt  ======================================================

select  index_owner, index_name, secondary_object_owner, secondary_object_name, secondary_objdata_type
from    dba_secondary_objects
where   index_name in (
                select index_name 
                from dba_indexes 
                where tablespace_name = 'TEST_8K_ASSM'
        )
;

The first query had an error – domain indexes do not report a tablespace_name in dba_indexes, so this query inevitably returned no rows.

The second query would report any domain indexes on tables that were in the given tablespace – which would be useful for the simple case where everything was created in the user’s default tablespace.

The third query reports any tables in the given tablespace that are the physical tables instantiating the “logical” domain index – which is particularly important for my example of the indexes being in the given tablespace when the table itself was in some other tablespace.

The final query is trying to report domain indexes that have corresponding physical objects in the given tablespace – but this isn’t going to help (at least with my Context indexes) because the index_name in dba_secondary_indexes will be a name that appears in dba_indexes as index_type = ‘DOMAIN’ and won’t have a value stored in the tablespace_name column.

Here’s the output I get (preceded by a full report of the view dba_secondary_objects in my otherwise empty little database) when I had the table in test_8k_assm and the indexes in tablespace users.

SQL> select * from dba_secondary_objects;

INDEX_OWNER     INDEX_NAME           SECONDARY_OBJECT_OWNER           SECONDARY_OBJECT_NAME            SECONDARY_OBJDATA_TY
--------------- -------------------- -------------------------------- -------------------------------- --------------------
TEST_USER       SH_IX_NAME           TEST_USER                        DR$SH_IX_NAME$I                  FROM INDEXTYPE
TEST_USER       SH_IX_NAME           TEST_USER                        DR$SH_IX_NAME$K                  FROM INDEXTYPE
TEST_USER       SH_IX_NAME           TEST_USER                        DR$SH_IX_NAME$N                  FROM INDEXTYPE
TEST_USER       SH_IX_NAME           TEST_USER                        DR$SH_IX_NAME$U                  FROM INDEXTYPE
TEST_USER       SH_IX_HANDLE         TEST_USER                        DR$SH_IX_HANDLE$I                FROM INDEXTYPE
TEST_USER       SH_IX_HANDLE         TEST_USER                        DR$SH_IX_HANDLE$K                FROM INDEXTYPE
TEST_USER       SH_IX_HANDLE         TEST_USER                        DR$SH_IX_HANDLE$N                FROM INDEXTYPE
TEST_USER       SH_IX_HANDLE         TEST_USER                        DR$SH_IX_HANDLE$U                FROM INDEXTYPE

8 rows selected.

==================================
Domain indexes (in the tablespace)
==================================

OWNER           INDEX_NAME           TABLE_OWNER     TABLE_NAME                TABLESPACE_NAME
--------------- -------------------- --------------- ------------------------- ------------------------------
TEST_USER       SH_IX_NAME           TEST_USER       SCRIPT_HANDLES
TEST_USER       SH_IX_HANDLE         TEST_USER       SCRIPT_HANDLES

2 rows selected.

==========================================
Domain indexes on tables in the tablespace
==========================================

TABLE_NAME                TABLESPACE_NAME                INDEX_NAME           TABLESPACE_NAME
------------------------- ------------------------------ -------------------- ------------------------------
SCRIPT_HANDLES            TEST_8K_ASSM                   SH_IX_NAME
SCRIPT_HANDLES            TEST_8K_ASSM                   SH_IX_HANDLE

2 rows selected.

===================================================
Secondary objects that are tables in the tablespace
===================================================

no rows selected

=====================================================
Indexes in the tablespace that have secondary objects
=====================================================

no rows selected

As you can see I have two Context indexes, each represented by 4 secondary tables (and that’s not the complete story because those tables have their own indexes and a couple of them have LOB columns, and one of the secondary tables is actually an index-organized table).

And this is the result when the base table is in tablespace users and the indexes are in test_8k_assm:

==================================
Domain indexes (in the tablespace)
==================================

OWNER           INDEX_NAME           TABLE_OWNER     TABLE_NAME                TABLESPACE_NAME
--------------- -------------------- --------------- ------------------------- ------------------------------
TEST_USER       SH_IX_NAME           TEST_USER       SCRIPT_HANDLES
TEST_USER       SH_IX_HANDLE         TEST_USER       SCRIPT_HANDLES

2 rows selected

==========================================
Domain indexes on tables in the tablespace
==========================================

no rows selected

===================================================
Secondary objects that are tables in the tablespace
===================================================

INDEX_OWNER     INDEX_NAME           SECONDARY_OBJECT_OWNER           SECONDARY_OBJECT_NAME            SECONDARY_OBJDATA_TY
--------------- -------------------- -------------------------------- -------------------------------- --------------------
TEST_USER       SH_IX_NAME           TEST_USER                        DR$SH_IX_NAME$I                  FROM INDEXTYPE
TEST_USER       SH_IX_NAME           TEST_USER                        DR$SH_IX_NAME$K                  FROM INDEXTYPE
TEST_USER       SH_IX_NAME           TEST_USER                        DR$SH_IX_NAME$U                  FROM INDEXTYPE
TEST_USER       SH_IX_HANDLE         TEST_USER                        DR$SH_IX_HANDLE$I                FROM INDEXTYPE
TEST_USER       SH_IX_HANDLE         TEST_USER                        DR$SH_IX_HANDLE$K                FROM INDEXTYPE
TEST_USER       SH_IX_HANDLE         TEST_USER                        DR$SH_IX_HANDLE$U                FROM INDEXTYPE

6 rows selected

=====================================================
Indexes in the tablespace that have secondary objects
=====================================================

no rows selected

So one of the middle two queries ought to return some rows if there are any Context indexes in the tablespace or on tables in the tablespace. (And for the final option – i.e. the table and its Context indexes being in the tablespace – both queries will return rows). The OP got no rows returned on all 4 queries.

What next?

It looks like the OP must have some secondary objects that aren’t playing by the rules if they’re “from index types”, or there are some secondary objects that are “from statistics types” and don’t get picked up by the queries. There is a view dba_ustat for “user stats types” which has interesting column names, but at this point I decided to follow a strategy that I’ve used many times in the past when Oracle raises an unexpected error on a repeatable process: repeat the process but enable SQL trace and see what Oracle was doing just before it raised the error.

Here’s the critical line in the trace file I produced; with a couple of the precding lines:

CLOSE #140708355472384:c=1860,e=1512,dep=1,type=0,tim=31701605233
EXEC #140708354845632:c=68018,e=68006,p=0,cr=1276,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=31701605472
ERROR #140708354845632:err=29857 tim=31701605485

A search backwards up the trace file showed that cursor #140708354845632 was for the “drop tablespace” statement. Just above the ERROR and EXEC lines we see cursor #140708355472384 closing. It’s reasonably likely (though not 100% certain) that this is the cursor representing the last SQL statement that Oracle executed before raising the error so searching backwards for “PARSING IN CURSOR #140708355472384” was my next step (and if that statement didn’t look promising I’d have search backwards from the ERROR line to find the cursor number from the most recent “EXEC” line). In this case (with my data) the CLOSE took me to the following (cosmetically cleaned) SQL:

select null from  (
             select o.obj# from obj$ o, tab$ t      where t.ts#=:1 and t.obj#=o.obj# and            (decode(bitand(o.flags, 16), 16, 1, 0)=1)    
union all    select o.obj# from obj$ o, tabpart$ tp      where tp.ts#=:1 and tp.obj#=o.obj# and            (decode(bitand(o.flags, 16), 16, 1, 0)=1)    
union all    select o.obj# from obj$ o, tabsubpart$ tsp      where tsp.ts#=:1 and tsp.obj#=o.obj# and            (decode(bitand(o.flags, 16), 16, 1, 0)=1)    
union all    select o.obj# from obj$ o, ind$ i      where i.ts#=:1 and i.obj#=o.obj# and            (decode(bitand(o.flags, 16), 16, 1, 0)=1)    
union all    select o.obj# from obj$ o, indpart$ ip      where ip.ts#=:1 and ip.obj#=o.obj# and            (decode(bitand(o.flags, 16), 16, 1, 0)=1)    
union all    select o.obj# from obj$ o, indsubpart$ isp      where isp.ts#=:1 and isp.obj#=o.obj# and            (decode(bitand(o.flags, 16), 16, 1, 0)=1)    
union all    select o.obj# from obj$ o, clu$ c      where c.ts#=:1 and c.obj#=o.obj# and            (decode(bitand(o.flags, 16), 16, 1, 0)=1)    
union all    select i.obj# from ind$ i, tab$ t, tabpart$ tp      where tp.ts#=:1 and tp.bo# = t.obj# and t.obj#=i.bo# and i.type#=9    
union all    select i.obj# from ind$ i, tab$ t, tabcompart$ tcp,tabsubpart$ tsp      where tsp.ts#=:1 and tsp.pobj#=tcp.obj# and tcp.bo#=t.obj# and            t.obj#=i.bo# and i.type#=9    
union all    select i.obj# from ind$ i, tab$ t      where t.ts#=:1 and t.obj#=i.bo# and i.type#=9
)

Every single line of the UNION ALL is addressing the same tablespace by number (:1) looking for objects where bit 5 is set in the obj$.flags column (i.e. secondary object) or indexes where the ind$.type# is 9 (“cooperative index method”).

That looks like a good bet for being the code that Oracle uses to check if there are any objects that should block the drop, so I changed the bind variable to the tablespace number (from v$tablespace) changed the “select null” to “select obj#” and stuck the resulting query block into an IN subquery against dba_objects (this may not be the most efficient of queries):

select 
        secondary, owner, object_name, object_type, object_id
from    dba_objects
where   object_id in (
                select obj# from  (
                                  select o.obj# from obj$ o, tab$ t where t.ts#=6 and t.obj#=o.obj# and (decode(bitand(o.flags, 16), 16, 1, 0)=1)
                        union all ...
                        union all select i.obj# from ind$ i, tab$ t where t.ts#=6 and t.obj#=i.bo# and i.type#=9
                )
        )
order by secondary, owner, object_name
/

S OWNER           OBJECT_NAME                      OBJECT_TYPE              OBJECT_ID
- --------------- -------------------------------- ----------------------- ----------
N TEST_USER       SH_IX_HANDLE                     INDEX                       123956
N TEST_USER       SH_IX_NAME                       INDEX                       123944
Y TEST_USER       DR$SH_IX_HANDLE$I                TABLE                       123957
Y TEST_USER       DR$SH_IX_HANDLE$K                TABLE                       123960
Y TEST_USER       DR$SH_IX_HANDLE$KD               INDEX                       123966
Y TEST_USER       DR$SH_IX_HANDLE$KR               INDEX                       123967
Y TEST_USER       DR$SH_IX_HANDLE$U                TABLE                       123963
Y TEST_USER       DR$SH_IX_HANDLE$X                INDEX                       123965
Y TEST_USER       DR$SH_IX_NAME$I                  TABLE                       123945
Y TEST_USER       DR$SH_IX_NAME$K                  TABLE                       123948
Y TEST_USER       DR$SH_IX_NAME$KD                 INDEX                       123954
Y TEST_USER       DR$SH_IX_NAME$KR                 INDEX                       123955
Y TEST_USER       DR$SH_IX_NAME$U                  TABLE                       123951
Y TEST_USER       DR$SH_IX_NAME$X                  INDEX                       123953
Y TEST_USER       SYS_C0016528                     INDEX                       123952
Y TEST_USER       SYS_C0016536                     INDEX                       123964
Y TEST_USER       SYS_IL0000123945C00006$$         INDEX                       123947
Y TEST_USER       SYS_IL0000123957C00006$$         INDEX                       123959
Y TEST_USER       SYS_IOT_TOP_123949               INDEX                       123950
Y TEST_USER       SYS_IOT_TOP_123961               INDEX                       123962

20 rows selected.

This is the list I got when the base table and the Context indexes were all created in the tablespace I wanted to drop – notice that the “logical” indexes (the first two items, flagged as secondary=’N’) are reported along with all the physical objects relating to the context indexes.

When I created the indexes in the target tablespace but the base table in an alternative tablespace the query reported only the physical objects making up the Context indexes; when I created the base table in the target tablespace and the Context indexes in an alternative tablespace the query reported only the “logical” indexes.

Conclusion

As a last resort, when the simple queries you find in the manuals or on MOS (or elsewhere on the internet) don’t seem to supply all the answers the strategy of finding and hacking the query that Oracle runs just before raising the error is a good generic method for getting a better handle on what’s causing the problem.

In this example of Oracle error ORA-29758 you can get a list of the objects (logical or physical or both) that are blocking a “drop tablespace” command and that makes it easier to find out what else you need to drop before you can drop the tablespace.

Bear in mind that I’ve just demonstrated the approach while experimenting with a couple of Context indexes, where the information I needed was available through a couple of ordinary data dictionary views – I only dived down into the internals because the data dictionary views weren’t giving the OP a complete answer.

I don’t know what was in the OP’s database to make this problem happen (it may simply be that the database was version 11.2.0.4 while mine was 19.11.0.0 and there’s a bug that’s been fixed since 11g), so the OP may find that after they’ve cleared all domain indexes indicated by this query they may still see ORA-29758 and find that there’s another query that Oracle uses to check for other conditions (maybe those “from statistics type” secondary objects) that also needs to be hijacked and subverted to complete the job. I know that when I traced a successful “drop tablespace” there were 13 more statements after the one I’ve shown that looked as if they might be checking for the (non-)existence of other types of object before the tablespace could be dropped.

Update

The OP has reported back that following this strategy they’ve chased the problem through (for their system) three further sets of objects, with the following error messages:

ORA-23515: materialized views and/or their indices exist in the tablespace

ORA-25183: index-organized table top index segment is in a different tablespace

ORA-02449: unique/primary keys in table referenced by foreign keys

June 17, 2022

Case Study

Filed under: Execution plans,Oracle,Parallel Execution,Performance,Problem Solving,Troubleshooting,Tuning — Jonathan Lewis @ 1:25 pm BST Jun 17,2022

The question “How do you trouble-shoot a performance problem” came up in an online session I did for the AIOUG (All-India OUG) today. It’s a very broad question and the only possible answers are either extremely generic, or very specific – so here’s a specific example that I talked about to give some idea of the processes I go through. It’s an example from the Oracle-l list server asking the (paraphrased) question:

I’ve got a parallel query that take 24 seconds to return 2,500 rows for 2018. .The same query for 2019 should return a similar amount of data but consumes a lot of TEMP space before failing; it takes 45 minutes to complete if I remove the parallel hint. The only difference between the two queries is a change to a single predicate: “opclf.year_number = to_number(‘YYYY’)” and the statistics are up to date – what’s going on?

The ease of diagnosing a problem is dependent on the ease of access to all the right information, and you don’t always know initially what the right information might be. In this case the OP had started by posting to github the SQL Monitor reports for the two queries, which were automatically generated since the queries were using parallel execution.

Click here to show/hide the first few sections for the 2019 output
SQL Text
------------------------------
SELECT /*+ PARALLEL(8) */ DISTINCT rd.document_id AS doc_id ,'firm_id' AS criteria_key ,opf.ultimate_parent_firm_id AS series_id ,period_number FROM ( SELECT /*+ PARALLEL(8) */ DISTINCT rd.document_id ,rd.client_role_id ,( CASE WHEN MONTHS_BETWEEN(TO_DATE('04/28/2022', 'MM/DD/YYYY'), TO_DATE('04/01/2017', 'MM/DD/YYYY')) > 12 THEN TRUNC(dc.date_value, 'YEAR') ELSE TRUNC(dc.date_value, 'MONTH') END ) period_number FROM REPORT_ENGINE.date_code dc ,REPORT_ENGINE.lit_fact_bd rd INNER JOIN
report_engine.firm FIRM ON rd.firm_id = FIRM.firm_id WHERE dc.date_value BETWEEN TO_DATE('04/01/2017', 'MM/DD/YYYY') AND TO_DATE('04/28/2022', 'MM/DD/YYYY') AND rd.publication_date_id = dc.date_id AND rd.year_number = to_number('2019') AND (FIRM.ultimate_parent_firm_id IN (to_number('252094'))) ) rd INNER JOIN report_engine.opposing_counsel op ON rd.client_role_id = op.legal_role_id INNER JOIN report_engine.lit_fact_bd opclf ON opclf.document_id = rd.document_id AND op.opposing_counsel_role_id
= opclf.client_role_id AND opclf.year_number = to_number('2019') INNER JOIN report_engine.firm opf ON opclf.firm_id = opf.firm_id AND opf.firm_id >= 1000 WHERE 1 = 1

Global Information
------------------------------
 Status              :  EXECUTING
 Instance ID         :  1
 Session             :  SYS (647:28741)
 SQL ID              :  3cjg20q2vw41f
 SQL Execution ID    :  16777216
 Execution Started   :  06/09/2022 05:08:24
 First Refresh Time  :  06/09/2022 05:08:25
 Last Refresh Time   :  06/09/2022 05:08:27
 Duration            :  221s
 Module/Action       :  sqlplus@c111dhw (TNS V1-V3)/-
 Service             :  SYS$USERS
 Program             :  sqlplus@c111dhw (TNS V1-V3)

Global Stats
====================================================================
| Elapsed |   Cpu   |    IO    |  Other   | Buffer | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) |  Gets  | Reqs  | Bytes |
====================================================================
|     222 |     179 |       28 |       15 |    49M | 11624 |   2GB |
====================================================================

Parallel Execution Details (DOP=8 , Servers Allocated=16)
=====================================================================================================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    |  Other   | Buffer | Write | Write |         Wait Events         |
|                |       |         | Time(s) | Time(s) | Waits(s) | Waits(s) |  Gets  | Reqs  | Bytes |         (sample #)          |
=====================================================================================================================================
| PX Coordinator | QC    |         |    0.26 |    0.26 |          |     0.00 |  12144 |       |     . |                             |
| p000           | Set 1 |       1 |         |         |          |          |        |       |     . |                             |
| p001           | Set 1 |       2 |         |         |          |          |        |       |     . |                             |
| p002           | Set 1 |       3 |         |         |          |          |        |       |     . |                             |
| p003           | Set 1 |       4 |         |         |          |          |        |       |     . |                             |
| p004           | Set 1 |       5 |         |         |          |          |        |       |     . |                             |
| p005           | Set 1 |       6 |         |         |          |          |        |       |     . |                             |
| p006           | Set 1 |       7 |         |         |          |          |        |       |     . |                             |
| p007           | Set 1 |       8 |         |         |          |          |        |       |     . |                             |
| p008           | Set 2 |       1 |     220 |     177 |       28 |       15 |    48M | 11624 |   2GB | direct path write temp (28) |
| p009           | Set 2 |       2 |         |         |          |          |        |       |     . |                             |
| p010           | Set 2 |       3 |         |         |          |          |        |       |     . |                             |
| p011           | Set 2 |       4 |    1.71 |    1.70 |          |     0.01 |   595K |       |     . |                             |
| p012           | Set 2 |       5 |         |         |          |          |        |       |     . |                             |
| p013           | Set 2 |       6 |         |         |          |          |        |       |     . |                             |
| p014           | Set 2 |       7 |         |         |          |          |        |       |     . |                             |
| p015           | Set 2 |       8 |         |         |          |          |        |       |     . |                             |
=====================================================================================================================================

You’ll note that I haven’t got as far as the body of the execution plan yet, and I’ve highlighted line 44 – a line in the middle of the summary of activity for the parallel execution processes. There are 8 servers in each of two sets (we’re running parallel 8) and the line I’ve highlighted is the first server of the second set. The reason I’ve highlighted it is that it’s the one server that’s doing almost all the work – effectively the query (at some point in the plan) is running serially.

So, a first thought, maybe we’ve just been unlucky and running parallel 7 (say) would result in a better distribution of data across parallel servers and allow each of 7 processes to do whatever they had to do to a much smaller amount of data. Maybe a change to the distribution method (pq_distribute() hint) would spread the workload more evenly. In either case “N” smaller workload might still use a lot of TEMP, but possibly no individual process would fail, and the job would complete nearly N times as quickly.

Note: the 2018 Monitor report showed an equivalent skew in the data distribution, but the amount of work needed was much smaller with a read/write load of only 143MB compared to 2GB for the 2019 report. The OP did try running parallel 7, but with no change to the overall effect.

Let’s take a quick glance at the plan body (click to view/hide)
SQL Plan Monitoring Details (Plan Hash Value=1862006233)
=========================================================================================================================================================================================================================
| Id    |                     Operation                      |           Name            |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Write | Write | Mem | Temp | Activity |        Activity Detail       |
|       |                                                    |                           | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes |     |      |   (%)     |         (# samples)         |
=========================================================================================================================================================================================================================
|     0 | SELECT STATEMENT                                   |                           |         |      |           |        |     3 |          |       |       |     |      |   |                             |
|     1 |   PX COORDINATOR                                   |                           |         |      |           |        |     3 |          |       |       |     |      |   |                             |
|     2 |    PX SEND QC (RANDOM)                             | :TQ10003                  |     781 |  153 |           |        |       |          |       |       |     |      |   |                             |
|     3 |     HASH UNIQUE                                    |                           |     781 |  153 |           |        |       |          |       |       |     |      |   |                             |
|     4 |      PX RECEIVE                                    |                           |     781 |  151 |           |        |       |          |       |       |     |      |   |                             |
|     5 |       PX SEND HASH                                 | :TQ10002                  |     781 |  151 |           |        |     2 |          |       |       |     |      |   |                             |
|  -> 6 |        BUFFER SORT                                 |                           |     781 |  153 |       219 |     +3 |     2 |        0 | 11624 |   2GB | 61M |   2G |    26.21 | Cpu (26)                     |
|       |                                                    |                           |         |      |           |        |       |          |       |       |     |      |   | direct path write temp (28) |
|  -> 7 |         NESTED LOOPS                               |                           |     781 |  151 |       219 |     +3 |     2 |      39M |       |       |     |      |   |                             |
|  -> 8 |          NESTED LOOPS                              |                           |     781 |  151 |       219 |     +3 |     2 |      39M |       |       |     |      |     0.49 | Cpu (1)                      |
|  -> 9 |           NESTED LOOPS                             |                           |     777 |   44 |       219 |     +3 |     2 |      39M |       |       |     |      |   |                             |
| -> 10 |            NESTED LOOPS                            |                           |      41 |   26 |       219 |     +3 |     2 |     6463 |       |       |     |      |   |                             |
| -> 11 |             HASH JOIN                              |                           |      41 |   21 |       219 |     +3 |     2 |     6463 |       |       |  6M |      |   |                             |
|    12 |              BUFFER SORT                           |                           |         |      |         1 |     +3 |     2 |    36855 |       |       |     |      |   |                             |
|    13 |               PX RECEIVE                           |                           |      87 |   19 |         1 |     +3 |     2 |    36855 |       |       |     |      |   |                             |
|    14 |                PX SEND HASH                        | :TQ10000                  |      87 |   19 |         1 |     +3 |     1 |    38694 |       |       |     |      |   |                             |
|    15 |                 NESTED LOOPS                       |                           |      87 |   19 |         1 |     +3 |     1 |    38694 |       |       |     |      |   |                             |
|    16 |                  NESTED LOOPS                      |                           |      87 |   19 |         1 |     +3 |     1 |    38694 |       |       |     |      |   |                             |
|    17 |                   TABLE ACCESS BY INDEX ROWID      | FIRM                      |       1 |    2 |         1 |     +3 |     1 |       43 |       |       |     |      |   |                             |
|    18 |                    INDEX RANGE SCAN                | FIRM_ULT_PARENT_FIRM_IDX1 |       1 |    1 |         1 |     +3 |     1 |       43 |       |       |     |      |   |                             |
|    19 |                   PARTITION RANGE SINGLE           |                           |         |      |         1 |     +3 |    43 |    38694 |       |       |     |      |   |                             |
|    20 |                    BITMAP CONVERSION TO ROWIDS     |                           |         |      |         1 |     +3 |    43 |    38694 |       |       |     |      |   |                             |
|    21 |                     BITMAP INDEX SINGLE VALUE      | LIT_FACT_BD_IDX09         |         |      |         1 |     +3 |    43 |       49 |       |       |     |      |   |                             |
|    22 |                  TABLE ACCESS BY LOCAL INDEX ROWID | LIT_FACT_BD               |      63 |   19 |         3 |     +1 | 38694 |    38694 |       |       |     |      |     0.49 | Cpu (1)                      |
|    23 |              PX RECEIVE                            |                           |      20 |    2 |         1 |     +3 |     2 |        2 |       |       |     |      |   |                             |
|    24 |               PX SEND HASH                         | :TQ10001                  |      20 |    2 |           |        |       |          |       |       |     |      |   |                             |
|    25 |                PX BLOCK ITERATOR                   |                           |      20 |    2 |           |        |       |          |       |       |     |      |   |                             |
|    26 |                 TABLE ACCESS FULL                  | OPPOSING_COUNSEL          |      20 |    2 |           |        |       |          |       |       |     |      |   |                             |
| -> 27 |             TABLE ACCESS BY INDEX ROWID            | DATE_CODE                 |       1 |      |       219 |     +3 |  6465 |     6463 |       |       |     |      |   |                             |
| -> 28 |              INDEX UNIQUE SCAN                     | PK_DATE_CODE              |       1 |      |       219 |     +3 |  6465 |     6465 |       |       |     |      |   |                             |
| -> 29 |            PARTITION RANGE SINGLE                  |                           |      19 |      |       219 |     +3 |  6465 |      39M |       |       |     |      |   |                             |
| -> 30 |             TABLE ACCESS BY LOCAL INDEX ROWID      | LIT_FACT_BD               |      19 |      |       220 |     +2 |  6465 |      39M |       |       |     |      |    35.92 | Cpu (74)                     |
| -> 31 |              INDEX RANGE SCAN                      | LIT_FACT_BD_IDX20         |       1 |      |       219 |     +3 |  6465 |      39M |       |       |     |      |     9.22 | Cpu (19)                     |
| -> 32 |           INDEX UNIQUE SCAN                        | PK_FIRM                   |       1 |      |       219 |     +3 |   39M |      39M |       |       |     |      |    10.68 | Cpu (22)                     |
| -> 33 |          TABLE ACCESS BY INDEX ROWID               | FIRM                      |       1 |      |       219 |     +3 |   39M |      39M |       |       |     |      |    16.99 | Cpu (35)                     |
===================================================================================================================================================================

You can see from the “->” symbols at the left hand side of the plan that this report was generated while the plan was still running. The thing that leaps out as you glance down the page is the value in the “Rows (Actual)” column at operations 7-9 (which show the rowsources generated by some nested loop joins) and operations 29, 32 and 33 of the plan that tell us something about how those rowsources were generated.

Operation 29 has executed (Execs) 6,465 so far, producing a total of 39M rows, and operations 32 and 33 have both executed 39M times each producing a total of 39M rows by index unique scan.

The plan for the 2018 data was similar though the join order for DATE_CODE, LIT_FACT_BD and FIRM was different (and it was the join to LIT_FACT_BD that increased the row count dramatically – so hinting it to be the last table in the join might help a bit), but the largest rowcount for the 2018 query was only 3M rows, not the 39M that had appeared after only 6,465 rows of a possible driving 39,855 in the 2019 query.

So it’s not surprising that the query could take much longer for 2019. It’s not the volume of output that matters, it’s the volume of input (or, more accurately, throughput or intermediate) data that matters.

Let’s think about that volume, though: the 2018 plan generated 3M rows and then crunched them down to 2,500 rows and the 2019 plan was supposed to produce a similar sized output (from 39M+ rows). Could we rewrite the query in some way that made it do some intermediate aggregation so that the volume of data to be aggregated was never enormous?

Let’s take a look at the plan from the 2018 Monitor report (click to show/hide)
SQL Plan Monitoring Details (Plan Hash Value=472871521)
=======================================================================================================================================================================================================================================
| Id |                     Operation                      |           Name            |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  | Write | Write |  Mem  | Temp  | Activity |       Activity Detail       |
|    |                                                    |                           | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | Reqs  | Bytes | (Max) | (Max) |   (%)     |        (# samples)         |
=======================================================================================================================================================================================================================================
|  0 | SELECT STATEMENT                                   |                           |         |      |         1 |    +24 |    17 |     2613 |      |       |       |       |       |       |   |                            |
|  1 |   PX COORDINATOR                                   |                           |         |      |         1 |    +24 |    17 |     2613 |      |       |       |       |       |       |   |                            |
|  2 |    PX SEND QC (RANDOM)                             | :TQ10003                  |       1 |   39 |         1 |    +24 |     8 |     2613 |      |       |       |       |       |       |   |                            |
|  3 |     HASH UNIQUE                                    |                           |       1 |   39 |         9 |    +16 |     8 |     2613 |      |       |       |       |    9M |       |     6.90 | Cpu (2)                     |
|  4 |      PX RECEIVE                                    |                           |       1 |   38 |         9 |    +16 |     8 |       3M |      |       |       |       |       |       |   |                            |
|  5 |       PX SEND HASH                                 | :TQ10002                  |       1 |   38 |        12 |    +14 |     8 |       3M |      |       |       |       |       |       |     3.45 | Cpu (1)                     |
|  6 |        BUFFER SORT                                 |                           |       1 |   39 |        23 |     +2 |     8 |       3M | 4584 | 143MB |   703 | 143MB |  151M |  151M |    34.48 | Cpu (2)                     |
|    |                                                    |                           |         |      |           |        |       |          |      |       |       |       |       |       |   | direct path read temp (6)  |
|    |                                                    |                           |         |      |           |        |       |          |      |       |       |       |       |       |   | direct path write temp (2) |
|  7 |         NESTED LOOPS                               |                           |       1 |   38 |        15 |     +2 |     8 |       3M |      |       |       |       |       |       |   |                            |
|  8 |          NESTED LOOPS                              |                           |       1 |   38 |        15 |     +2 |     8 |       3M |      |       |       |       |       |       |   |                            |
|  9 |           NESTED LOOPS                             |                           |       1 |   38 |        15 |     +2 |     8 |       3M |      |       |       |       |       |       |   |                            |
| 10 |            NESTED LOOPS                            |                           |       1 |   38 |        15 |     +2 |     8 |       3M |      |       |       |       |       |       |   |                            |
| 11 |             HASH JOIN                              |                           |      41 |   21 |        15 |     +2 |     8 |    19334 |      |       |       |       |    7M |       |   |                            |
| 12 |              BUFFER SORT                           |                           |         |      |        13 |     +2 |     8 |    19233 |      |       |       |       |    1M |       |   |                            |
| 13 |               PX RECEIVE                           |                           |      89 |   19 |        13 |     +2 |     8 |    19233 |      |       |       |       |       |       |   |                            |
| 14 |                PX SEND HASH                        | :TQ10000                  |      89 |   19 |         1 |     +1 |     1 |    19233 |      |       |       |       |       |       |   |                            |
| 15 |                 NESTED LOOPS                       |                           |      89 |   19 |         1 |     +1 |     1 |    19233 |      |       |       |       |       |       |   |                            |
| 16 |                  NESTED LOOPS                      |                           |      89 |   19 |         1 |     +1 |     1 |    19233 |      |       |       |       |       |       |   |                            |
| 17 |                   TABLE ACCESS BY INDEX ROWID      | FIRM                      |       1 |    2 |         1 |     +1 |     1 |       43 |      |       |       |       |       |       |   |                            |
| 18 |                    INDEX RANGE SCAN                | FIRM_ULT_PARENT_FIRM_IDX1 |       1 |    1 |         1 |     +1 |     1 |       43 |      |       |       |       |       |       |   |                            |
| 19 |                   PARTITION RANGE SINGLE           |                           |         |      |         1 |     +1 |    43 |    19233 |      |       |       |       |       |       |   |                            |
| 20 |                    BITMAP CONVERSION TO ROWIDS     |                           |         |      |         1 |     +1 |    43 |    19233 |      |       |       |       |       |       |   |                            |
| 21 |                     BITMAP INDEX SINGLE VALUE      | LIT_FACT_BD_IDX09         |         |      |         1 |     +1 |    43 |       51 |      |       |       |       |       |       |   |                            |
| 22 |                  TABLE ACCESS BY LOCAL INDEX ROWID | LIT_FACT_BD               |      64 |   19 |         1 |     +1 | 19233 |    19233 |      |       |       |       |       |       |   |                            |
| 23 |              PX RECEIVE                            |                           |      20 |    2 |        15 |     +2 |     8 |       20 |      |       |       |       |       |       |   |                            |
| 24 |               PX SEND HASH                         | :TQ10001                  |      20 |    2 |         1 |    +14 |     8 |       20 |      |       |       |       |       |       |   |                            |
| 25 |                PX BLOCK ITERATOR                   |                           |      20 |    2 |         1 |    +14 |     8 |       20 |      |       |       |       |       |       |   |                            |
| 26 |                 TABLE ACCESS FULL                  | OPPOSING_COUNSEL          |      20 |    2 |         1 |    +14 |     3 |       20 |      |       |       |       |       |       |   |                            |
| 27 |             PARTITION RANGE SINGLE                 |                           |       1 |      |        15 |     +2 | 19334 |       3M |      |       |       |       |       |       |   |                            |
| 28 |              TABLE ACCESS BY LOCAL INDEX ROWID     | LIT_FACT_BD               |       1 |      |        16 |     +1 | 19334 |       3M |      |       |       |       |       |       |    17.24 | Cpu (5)                     |
| 29 |               INDEX RANGE SCAN                     | LIT_FACT_BD_IDX20         |       1 |      |        15 |     +2 | 19334 |       3M |      |       |       |       |       |       |   |                            |
| 30 |            TABLE ACCESS BY INDEX ROWID             | DATE_CODE                 |       1 |      |        15 |     +2 |    3M |       3M |      |       |       |       |       |       |    10.34 | Cpu (3)                     |
| 31 |             INDEX UNIQUE SCAN                      | PK_DATE_CODE              |       1 |      |        16 |     +1 |    3M |       3M |      |       |       |       |       |       |     6.90 | Cpu (2)                     |
| 32 |           INDEX UNIQUE SCAN                        | PK_FIRM                   |       1 |      |        23 |     +2 |    3M |       3M |      |       |       |       |       |       |     6.90 | Cpu (2)                     |
| 33 |          TABLE ACCESS BY INDEX ROWID               | FIRM                      |       1 |      |        16 |     +1 |    3M |       3M |      |       |       |       |       |       |    13.79 | Cpu (4)                     |
=======================================================================================================================================================================================================================================

We see from operations 3 – 7 that the 3M rows generated from the nested loop joins pass up through a buffer sort operation before being crunched down to 2,613 rows. It’s probably the buffer sort (which is buffering but not sorting) that has mostly passed through a single server and spilled to disc in the 2019 report. We just don’t want that 39M+ rows ever to exist.

So how easy will it be to change the SQL (click to view/hide)
SELECT
        /*+ PARALLEL(8) */
        DISTINCT rd.document_id AS doc_id
        ,'firm_id' AS criteria_key
        ,opf.ultimate_parent_firm_id AS series_id
        ,period_number
FROM (
        SELECT
                /*+ PARALLEL(8) */
                DISTINCT rd.document_id
                ,rd.client_role_id
                ,(
                        CASE 
                                WHEN MONTHS_BETWEEN(TO_DATE('04/28/2022', 'MM/DD/YYYY'), TO_DATE('04/01/2017', 'MM/DD/YYYY')) > 12
                                        THEN TRUNC(dc.date_value, 'YEAR')
                                ELSE TRUNC(dc.date_value, 'MONTH')
                                END
                        ) period_number
        FROM REPORT_ENGINE.date_code dc
                ,REPORT_ENGINE.lit_fact_bd rd
        INNER JOIN report_engine.firm FIRM ON rd.firm_id = FIRM.firm_id
        WHERE dc.date_value BETWEEN TO_DATE('04/01/2017', 'MM/DD/YYYY')
                        AND TO_DATE('04/28/2022', 'MM/DD/YYYY')
                AND rd.publication_date_id = dc.date_id
                AND rd.year_number = to_number('2019')
                AND (FIRM.ultimate_parent_firm_id IN (to_number('252094')))
        ) rd
INNER JOIN report_engine.opposing_counsel op ON rd.client_role_id = op.legal_role_id
INNER JOIN report_engine.lit_fact_bd opclf ON opclf.document_id = rd.document_id
        AND op.opposing_counsel_role_id = opclf.client_role_id
        AND opclf.year_number = to_number('2019')
INNER JOIN report_engine.firm opf ON opclf.firm_id = opf.firm_id
        AND opf.firm_id >= 1000
WHERE 1 = 1;

Lines 7-10 and 27 tell us we alredy have an inline view where we’re doing a “select distinct” and, unwinding the mix of “Oracle” and “ANSI” syntax, we can see that it joins DATE_CODE, LIT_FACT_BD and FIRM, and we know that one of those tables explodes the intermediate data size to something enormous. So it looks like the original author of this code had already worked out that the query needed to aggregate early.

Checking back to the original plans we note that there’s only one “hash unique” operation, and there’s no sign of a “view” operation, so maybe the performance problem is a result of the optimizer suddenly deciding it can do complex view merging with this inline view, and perhaps all we need to do is add the hint /*+ no_merge */ to the inline view and see what happens.

Here’s the plan after adding the hint (click to hide/vew)
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -----
| Id  | Operation                                            | Name                      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -----
|   0 | SELECT STATEMENT                                     |                           |       |       |   314 (100)|          |       |       |        |      | |
|   1 |  PX COORDINATOR                                      |                           |       |       |            |          |       |       |        |      | |
|   2 |   PX SEND QC (RANDOM)                                | :TQ10005                  |    23 |  2599 |   314   (1)| 00:00:06 |       |       |  Q1,05 | P->S | QC (RAN D)  |
|   3 |    HASH UNIQUE                                       |                           |    23 |  2599 |   314   (1)| 00:00:06 |       |       |  Q1,05 | PCWP | |
|   4 |     PX RECEIVE                                       |                           |    23 |  2599 |   314   (1)| 00:00:06 |       |       |  Q1,05 | PCWP | |
|   5 |      PX SEND HASH                                    | :TQ10004                  |    23 |  2599 |   314   (1)| 00:00:06 |       |       |  Q1,04 | P->P | HASH |
|   6 |       HASH UNIQUE                                    |                           |    23 |  2599 |   314   (1)| 00:00:06 |       |       |  Q1,04 | PCWP | |
|   7 |        NESTED LOOPS                                  |                           |    23 |  2599 |   313   (1)| 00:00:06 |       |       |  Q1,04 | PCWP | |
|   8 |         NESTED LOOPS                                 |                           |    23 |  2599 |   313   (1)| 00:00:06 |       |       |  Q1,04 | PCWP | |
|   9 |          NESTED LOOPS                                |                           |    23 |  2323 |   310   (1)| 00:00:06 |       |       |  Q1,04 | PCWP | |
|* 10 |           HASH JOIN                                  |                           |   388 | 21340 |   148   (1)| 00:00:03 |       |       |  Q1,04 | PCWP | |
|  11 |            PX RECEIVE                                |                           |    20 |   160 |     2   (0)| 00:00:01 |       |       |  Q1,04 | PCWP | |
|  12 |             PX SEND BROADCAST                        | :TQ10002                  |    20 |   160 |     2   (0)| 00:00:01 |       |       |  Q1,02 | P->P | BROADCA ST  |
|  13 |              PX BLOCK ITERATOR                       |                           |    20 |   160 |     2   (0)| 00:00:01 |       |       |  Q1,02 | PCWC | |
|* 14 |               TABLE ACCESS FULL                      | OPPOSING_COUNSEL          |    20 |   160 |     2   (0)| 00:00:01 |       |       |  Q1,02 | PCWP | |
|  15 |            VIEW                                      |                           |   835 | 39245 |   146   (1)| 00:00:03 |       |       |  Q1,04 | PCWP | |
|  16 |             HASH UNIQUE                              |                           |   835 | 63460 |   146   (1)| 00:00:03 |       |       |  Q1,04 | PCWP | |
|  17 |              PX RECEIVE                              |                           |   835 | 63460 |   145   (0)| 00:00:03 |       |       |  Q1,04 | PCWP | |
|  18 |               PX SEND HASH                           | :TQ10003                  |   835 | 63460 |   145   (0)| 00:00:03 |       |       |  Q1,03 | P->P | HASH |
|* 19 |                HASH JOIN BUFFERED                    |                           |   835 | 63460 |   145   (0)| 00:00:03 |       |       |  Q1,03 | PCWP | |
|  20 |                 BUFFER SORT                          |                           |       |       |            |          |       |       |  Q1,03 | PCWC | |
|  21 |                  PX RECEIVE                          |                           |   835 | 52605 |   136   (0)| 00:00:03 |       |       |  Q1,03 | PCWP | |
|  22 |                   PX SEND HASH                       | :TQ10000                  |   835 | 52605 |   136   (0)| 00:00:03 |       |       |        | S->P | HASH |
|  23 |                    NESTED LOOPS                      |                           |   835 | 52605 |   136   (0)| 00:00:03 |       |       |        |      | |
|  24 |                     NESTED LOOPS                     |                           |   835 | 52605 |   136   (0)| 00:00:03 |       |       |        |      | |
|  25 |                      TABLE ACCESS BY INDEX ROWID     | FIRM                      |     1 |    12 |     2   (0)| 00:00:01 |       |       |        |      | |
|* 26 |                       INDEX RANGE SCAN               | FIRM_ULT_PARENT_FIRM_IDX1 |     1 |       |     1   (0)| 00:00:01 |       |       |        |      | |
|  27 |                      PARTITION RANGE SINGLE          |                           |       |       |            |          |    30 |    30 |        |      | |
|  28 |                       BITMAP CONVERSION TO ROWIDS    |                           |       |       |            |          |       |       |        |      | |
|* 29 |                        BITMAP INDEX SINGLE VALUE     | LIT_FACT_BD_IDX09         |       |       |            |          |    30 |    30 |        |      | |
|* 30 |                     TABLE ACCESS BY LOCAL INDEX ROWID| LIT_FACT_BD               |   598 | 30498 |   136   (0)| 00:00:03 |    30 |    30 |        |      | |
|  31 |                 PX RECEIVE                           |                           |  1854 | 24102 |     9   (0)| 00:00:01 |       |       |  Q1,03 | PCWP | |
|  32 |                  PX SEND HASH                        | :TQ10001                  |  1854 | 24102 |     9   (0)| 00:00:01 |       |       |  Q1,01 | P->P | HASH |
|  33 |                   PX BLOCK ITERATOR                  |                           |  1854 | 24102 |     9   (0)| 00:00:01 |       |       |  Q1,01 | PCWC | |
|* 34 |                    TABLE ACCESS FULL                 | DATE_CODE                 |  1854 | 24102 |     9   (0)| 00:00:01 |       |       |  Q1,01 | PCWP | |
|  35 |           PARTITION RANGE SINGLE                     |                           |     1 |    46 |     0   (0)|          |    30 |    30 |  Q1,04 | PCWP | |
|* 36 |            TABLE ACCESS BY LOCAL INDEX ROWID         | LIT_FACT_BD               |     1 |    46 |     0   (0)|          |    30 |    30 |  Q1,04 | PCWP | |
|* 37 |             INDEX RANGE SCAN                         | LIT_FACT_BD_IDX20         |     1 |       |     0   (0)|          |    30 |    30 |  Q1,04 | PCWP | |
|* 38 |          INDEX UNIQUE SCAN                           | PK_FIRM                   |     1 |       |     0   (0)|          |       |       |  Q1,04 | PCWP | |
|  39 |         TABLE ACCESS BY INDEX ROWID                  | FIRM                      |     1 |    12 |     0   (0)|          |       |       |  Q1,04 | PCWP | |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -----

Note particularly that operations 15 and 16 tell us that we’ve forced the optimizer into treating the inline view as a separate query block and we now have two aggregation steps, one inside the view, and another after joining FIRM (again) and LIT_FACT_BD (again) to the inline view.

Unfortunately the plan shown here is pulled from memory using dbms_xplan.display_cursor() after execution, so it include the various parallel executoin colums (TQ, IN-OUT, PQ Distrib), but doesn’t have the rowsource execution stats enabled so we can’t see the actual workload and data volume, but in the words of the OP: “adding no_merge hint did the trick and now the SQL is just executing fine”.

Summary

The steps for solving the performance problems of a specific SQL statement are very fluid. For a long-running or parallel statement the SQL Monitor report will automatically be created (though there are limits on the size of the plan that may disable the feature) and this is the easiest source of useful information, though you might also need to pull the execution plan from v$sql_plan to get details about parallel execution and partitioning at the same time.

If you’re not licensed for the diagnostic and performance packs, though, enabling SQL Trace to get the plan and waits gets you a lot of infomation, and querying (g)v$pq_tqstat immediately after running the query can fill in the parallel traffic details.

In this example the SQL Monitor report showed a highly skewed distribution, which might have been fixable by changing the PQ distribution, or even by simply changing the degree of parallelism.

Further examination of the report showed that the query generated an enormous rowsource which it then crunched down to a small result set. Comparing the 2018 and 2019 plans (which were not quite identical but were sufficiently similar to justify comparison) the same skew and explosion of rowsource were visible in both, though the data size involved in the 2018 plan made it seem that the plan was a “good” one which really it wasn’t.

The obvious target for tuning was to see if the explosion in volume could be reduced or eliminated by writing the query with some intermediate non-mergeable view(s), and it turned out that the query had been written with that intent in its original form but without a hint to block complex view merging. After adding the hint the performance was acceptable.

May 5, 2022

Lag/Lead slow

Filed under: Bugs,Oracle,Performance,Problem Solving,Troubleshooting,Tuning — Jonathan Lewis @ 10:05 am BST May 5,2022

This note is about a surprising performance difference between the lead() and lag() analytic functions (which31 turns out to be due to the behaviour of the nth_value() function) when the option to “ignore nulls” is included in their use (jump to conclusion). The detail I’ll be writing about was highlighted in a thread on the Oracle developer forum about a requirement to add a number of analytic columns to a dataset of 156 million rows using a statement of the following shape:

create table tb_target_ignore
as
select
        pat_id,
        visit_date_dt,
        ed_ucc_dt,
        lag (ed_ucc_dt ignore nulls, 1) over (partition by pat_id order by visit_date_dt) as prev_ed_ucc_dt,
        lead(ed_ucc_dt ignore nulls, 1) over (partition by pat_id order by visit_date_dt) as next_ed_ucc_dt,
        row_number() over (partition by pat_id order by visit_date_dt) as row_num
from
        tb_source
;

You’ll notice that I’ve introduced a row_number(), and both a lead() and a lag() of a column called ed_ucc_dt. All three analytic columns use the same partitioning and ordering, though, so Oracle will only be doing one “window sort” in the execution plan. Part of the performance problem, of course, was that with 156M rows of a couple of dozen existing columns and adding a dozen new columns, the workload due to sorting was bound to be very large, so there were various suggestions of how to minimise that part of the workload.

However Solomon Yakobson pointed out that the code was using the “ignore nulls” option and there was a bug in 11g that made lead() and lag() very slow when this option was used. He subsequently reported that this defect was still present in 19c, restricted it to just the lag() function, and cited a MOS document ID referencing the problem: LAG Function is slow when using ignore nulls (Doc ID 2811596.1). The implication of the MOS note is that we shouldn’t expect this to change.

A follow-up posting by User_H3J7U gave us a reason for the slowness of the lag() function by running a sample query through dbms_utility.expand_sql(). Oracle rewrites the query to use variants of the nth_value() function when you use “ignore nulls”, but rewrites it to use variants of first_value() when you aren’t using the “ignore nulls” option. This isn’t a complete explanation of why lag() should be slow while lead() is not – but it’s a significant pointer towards a possible implementation issue and is a good clue about working around the problem. So let’s build a model of the situation.

The basic model

rem
rem     Script:         lag_ignore_nulls.sql
rem     Author:         Jonathan Lewis / Sam P
rem     Dated:          May 2022
rem     Purpose:     
rem
rem     Last tested
rem             19.11.0.0
rem

create table tb_source (
        pat_id,
        visit_date_dt,
        ed_ucc_dt
)
as
with generator as (
        select rownum id
        from    dual
        connect by
                level <= 1e4    --> comment to avoid wordpress format issue
)
select
        g1.id,
        to_date('01-Apr-2022') + dbms_random.value(0,100),
        to_date('01-Apr-2022') + dbms_random.value(5,105)
--      to_date(null)
from
        generator g1,
        generator g2
where
        g2.id <= 20     --> comment to avoid wordpress format issue
order by
        dbms_random.value
/


spool lag_ignore_nulls.lst

set serveroutput off
set timing on

prompt  ======================
prompt  Without "ignore nulls"
prompt  (My time 0.61 seconds)
prompt  ======================

create table tb_target_no_ignore
as
select
        pat_id,
        visit_date_dt,
        ed_ucc_dt,
        lag (ed_ucc_dt, 1) over (partition by pat_id order by visit_date_dt) as prev_ed_ucc_dt,
        lead(ed_ucc_dt, 1) over (partition by pat_id order by visit_date_dt) as next_ed_ucc_dt,
        row_number() over (partition by pat_id order by visit_date_dt) as row_num
from
        tb_source
;

prompt  ======================
prompt  With "ignore nulls"
prompt  (My time 0.88 seconds)
prompt  ======================

create table tb_target_ignore
as
select
        pat_id,
        visit_date_dt,
        ed_ucc_dt,
        lag (ed_ucc_dt ignore nulls, 1) over (partition by pat_id order by visit_date_dt) as prev_ed_ucc_dt,
        lead(ed_ucc_dt ignore nulls, 1) over (partition by pat_id order by visit_date_dt) as next_ed_ucc_dt,
        row_number() over (partition by pat_id order by visit_date_dt) as row_num
from
        tb_source
;

I’ve created a source table with 200,000 rows, consisting of 10,000 pat_id values, and 20 rows per pat_id. The 20 rows for a pat_id (probably) each have a different visit_date_dt and a different ed_ucc_dt.

After creating the data set I’ve created two more tables using the lead() and lag() functions to generate a previous (lag) and next (lead) ed_ucc_dt for each row, partitioning by pat_id, ordering by visit_date_dt. One statement includes the “ignore nulls” option the other doesn’t and, as you can see, the time to create the “no ignore” table was 0.61 seconds while the time to create the “ignore null” table was 0.88 seconds.

The variation isn’t dramatic – but this is just 200,000 rows, in memory, with only a few columns and only two columns added through lead and lag.

After the baseline test I tweaked the statement that created the table with the “ignore nulls” option to get three more times.

  • With neither lead() nor lag() the time was 0.29 seconds
  • With just the lead() column the time was 0.46 seconds – an increase of 0.17 seconds
  • With just the lag() column the time was 0.71 seconds – an increase of 0.42 seconds

You might note that 0.29 + 0.17 + 0.42 = 0.88 (the time I got for adding both columns) – it’s a little lucky that it looks like a perfect match, but even matching within a couple of hundredths of a second would be have been a nice detail. It certainly seems that lag() – with my test data – consumes more resources than lead() for a pair of operationd that look as if they should produce the same workloads.

Internal Rewrite

The next step was to check what the internal rewrite of the code looked like, so I passed the select part of the statements (the procedure won’t accepts “create as select”) through dbms_utility.expand_sql() and reformatted the results. Here are the two rewritten statements – first without “ignore nulls”:

select
        a1.pat_id pat_id,
        a1.visit_date_dt visit_date_dt,
        a1.ed_ucc_dt ed_ucc_dt,
        decode(
                count(*) over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between 1 preceding and 1 preceding
                        ),a
                 1,     first_value(a1.ed_ucc_dt) over (
                                partition by a1.pat_id order by a1.visit_date_dt
                                rows between 1 preceding and 1 preceding
                        ),
                        null
        ) prev_ed_ucc_dt,
        decode(
                count(*) over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between 1 following and 1 following
                        ),a
                 1,     first_value(a1.ed_ucc_dt) over (
                                partition by a1.pat_id order by a1.visit_date_dt
                                rows between 1 following and 1 following
                        ),
                        null
        ) next_ed_ucc_dt,
        row_number() over (partition by a1.pat_id order by a1.visit_date_dt) row_num
from
        test_user.tb_source a1


The code looks a little long and messy, but that’s mainly because just about everything it does happens twice. The lag() function (prev column) turns into a first_value() function that looks at the row preceding the current row in the partition (rows between 1 preceding and 1 preceding). However it first has to count over the same clause to see if a row exists, and then either report its value or report a null – hence the structure decode(count(), 1, first_value(), null)

Note: the full lag() function call is: “lag(expression, offset, default)” where the offset (number of rows to lag) defaults to 1 and the default is the value you want reported when there is no matching row, and defaults to null.

The call to lead() basically does the same thing, but uses (rows between 1 following and 1 following) to access the next row in the partition.

On the other hand this is the SQL that Oracle generates when we include the “ignore nulls” clause (which means Oracle can’t restrict the row range to just one preceding or following row):

select
        a1.pat_id pat_id,
        a1.visit_date_dt visit_date_dt,
        a1.ed_ucc_dt ed_ucc_dt,
        nvl(
                nth_value(a1.ed_ucc_dt, 1) from last ignore nulls over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between unbounded preceding and 1 preceding
                ),
                 null
        ) prev_ed_ucc_dt,
        nvl(
                nth_value(a1.ed_ucc_dt, 1)           ignore nulls over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between 1 following and unbounded following
                ),
                null
        ) next_ed_ucc_dt,
        row_number() over (partition by a1.pat_id order by a1.visit_date_dt) row_num
from
        test_user.tb_source a1

Both lag() and lead() turn into nth_value() with a second parameter of 1 (i.e. nth == 1st … which makes you wonder why Oracle isn’t using first_value()), and we can also see the “ignore nulls” still being used.

The lag() call now uses the range (rows between unbounded preceding and 1 preceding) i.e. everything from the start of partition to the previous row, while the lead() call uses the range (rows between 1 following and unbounded following) i.e. from the next row to the end of partition.

The other important detail to note is that the translation of the lag() call also includes the clause “from last” – in other words we want the first row when reading the partition in reverse order, and that might have something to do with the extra time it takes to operate the (translated) lag() function.

Workaround

Oracle is using a generic nth_value() to translate a generic lead()/lag(), but we’re in the special case where we know n = 1, which means we (and Oracle) could use first_value()/last_value(). It’s perfectly reasonable for Oracle’s internal code to avoid special cases if it makes no difference to performance, of course, but maybe in this case we could imitate Oracle’s rewrite to get some benefit.

  • Step 1 – change nth_value() to the appropriate first/last.
  • Step 2 – get rid of the “from last” which won’t be needed with last_value()
  • Step 3 – move the “ignore nulls” to the spot that Oracle wants to see it with first/last

Here’s the resulting SQL – I’ve left the nvl(count, expression, null) in place, but if you wanted a null as the default return value for the original lead()/lag() calls you could simplify the code a little further.

create table tb_target 
as
select
        a1.pat_id pat_id,
        a1.visit_date_dt visit_date_dt,
        a1.ed_ucc_dt ed_ucc_dt,
        nvl(
                last_value(a1.ed_ucc_dt ignore nulls) over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between unbounded preceding and 1 preceding
                ),
                 null
        ) prev_ed_ucc_dt,
        nvl(
                first_value(a1.ed_ucc_dt ignore nulls) over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between 1 following and unbounded following
                ),
                null
        ) next_ed_ucc_dt,
        row_number() over (partition by a1.pat_id order by a1.visit_date_dt) row_num
from
        test_user.tb_source a1
/

  • Run-time: 0.61 seconds.
  • Run-time with just last_value() / preceding: 0.47 seconds
  • Run time with just first_value() / following: 0.43 seconds

There still seems to be a little discrepancy between accessing to the preceding data compared to accessing the following data but there’s a much better balance than before.

One more edit – taking out the nvl() construct because the original lead()/lag() calls didn’t have a non-null default supplied:

create table tb_target 
as
select
        a1.pat_id pat_id,
        a1.visit_date_dt visit_date_dt,
        a1.ed_ucc_dt ed_ucc_dt,
--
        last_value(a1.ed_ucc_dt ignore nulls) over (
                partition by a1.pat_id order by a1.visit_date_dt
                rows between unbounded preceding and 1 preceding
        ) prev_ed_ucc_dt,
--
        first_value(a1.ed_ucc_dt ignore nulls) over (
                partition by a1.pat_id order by a1.visit_date_dt
                rows between 1 following and unbounded following
        ) next_ed_ucc_dt,
--
        row_number() over (partition by a1.pat_id order by a1.visit_date_dt) row_num
from
        test_user.tb_source a1
/

With this simplification the time dropped by a further couple of hundredths of a second hovering between 0.57 and 0.58 seconds.

There was one last detail about the test code that should be mentioned – do the changes in code still produce the same results? As a quick and dirty check I ran the following query after each variant of creating the tb_target table (tb_target_ignore is the table created using the original lead()/lag() code with “ignore nulls”):

select  *
from   (
        select * from tb_target minus select * from tb_target_ignore
        union all
        select * from tb_target_ignore minus select * from tb_target
        )
/

The result was always zero rows.

Hypothesis

I raised the idea that the difference in timing for the lead() and lag() functions might have something to do with the volume of data that Oracle could be processing to find the one row it needed.

My thinking was that for the lead() rewrite – the call to nth_value(ed_ucc_dt,1) – would simply be looking at the next row in the partition (if it existed) because my data has no nulls that neeed to be ignored, while the rewrite of the lag() function with its “from last” requirement could be making Oracle re-read the entire preceding section of the partition before starting to process it backwards.

As a possible check to see if this was a viable hypothesis I ran one more test – visible in the initial declaration of tb_source – I created the data with ed_ucc_dt set to null in every single row, so that Oracle would be forced to process from the current position to whichever end of the partition was relevant regardless of whether it was calling lead() or lag().

With this change in place the timing for the lead() only and lag() only statements were nearly identical – which is a weak support for the hypothesis.

And once I’d done that test the next obvious test was to see what happened if I increased size of each partition (using non-null values for ed_ucc_dt) to see if larger partitions would increase the difference between the forward and backward tests. To do this I changed the script to create the tb_source table to produce 5,000 pat_id value with 40 rows per pat_id by changing the where clause to:

where
        g2.id <= 40     --> comment to avoid wordpress format issue
and     g1.id <= 5e3    --> comment to avoid wordpress format issue

With this change in place the timings for the original form of the lead()/lag() statement were:

  • With both lead() and lag() in place the time was 1.05 seconds
  • With neither lead() nor lag() the time was 0.25 seconds
  • With just the lead() column the time was 0.41 seconds – an increase of 0.16 seconds
  • With just the lag() column the time was 0.98 seconds – an increase of 0.73 seconds

So the lag() time (ballpark figures) nearly doubles as the partition size doubles but the lead() time stays pretty much the same.

The results of these two tests do tend to suggest that the generic nth_value() implementation can do some short-circuiting when working “forwards”, using a mechanism that isn’t available when the “from last” clause requires it to work “backwards”.

Writing the previous paragraph prompted me to do one last test – it wouldn’t produce the same results, of course, but I ought to check the performance when I moved the “from last” clause out of the “prev”/lag() column expression into the “next”/lead() column expression in Oracle’s original translation to confirm that the problem was associated with the “from last” and not with the choice of “preceding” or “following” in the row range section of the over() clause. (It was the “from last” that made the difference.)

tl;dr

If you’re using the lag() or lead() functions with “ignore nulls” on a very large dataset you may find that you can rewrite the code with first_value() or last_value() calls that use less CPU. The benefit is probably only significant on fairly large data sets, and may be particularly noticeable for cases where the same over() clause is used many times and the partition sizes are more than a couple of dozen rows each.

The potential for excess CPU usage comes from the effect of a generic internal rewrite using the nth_value() function with the “from last” clause even for the special case where your lead()/lag() have an offset (i.e. n) of 1 which would allow for the use of first_value()/last_value().

To get the correct rewrite you can use dbms_utility.expand_sql() to generate a suitable statement from which you can extract and edit the relevant pieces of text.

Footnote

A further point raised by Solomon Jakobson was that in the generic case (where you can’t use first_value() / last_value() to rewrite the SQL because you want to use an offset greater than one) it’s worth noting that the following two expressions are effectively identical:

        nvl(
                nth_value(a1.ed_ucc_dt, 2) from last ignore nulls over (
                        partition by a1.pat_id order by a1.visit_date_dt
                        rows between unbounded preceding and 1 preceding
                ),
                 null
        ) prev_ed_ucc_dt

        nvl(
                nth_value(a1.ed_ucc_dt, 2)           ignore nulls over (
                        partition by a1.pat_id order by a1.visit_date_dt desc
                        rows between 1 following and unbounded following
                ),
                 null
        ) prev_ed_ucc_dt
/

Note particularly that to avoid the issue of scanning the data “from last”, I’ve changed the ordering (in line 11) to descending and I’ve changed the preceding of line 4 to following in line 12 while reversing the positions of 1 and unbounded. (The same strategy can be applied for the lag() function in the original code).

In general it’s likely that lead() is will be more efficient than lag() so this change could make a big difference in performance. It’s worth noting, however, that if you’ve been using both lead() and lag() as my example did then your execution plan will (truthfully) show two “window sort” operations and the second (i.e. lower number in the plan) window sort will also be sorting the columns added by the first window sort. Depending on the nature of your data the additional sort might increase the workload by more than the benefit you get from eliminating a lag().

In an example created by Solomon Jakobson the effect of using lag( ,5) was catastrophic and the change that introduced the extra sort to get a lead( ,5) made a huge difference (40 seconds down to 0.15 seconds – and I haven’t made a mistake in the decimal point there); in my original test case the improvement I got from applying the same strategy was small (0.88 seconds down to 0.86 seconds). The critical difference between the test cases was that one had a single partition of 20,000 rows, the other had (10,000) partitions of only 20 rows.

(Footnote to the footnote – it did cross my mind that if you had a performance catastrophe with the simple lag(,n) and with the extra sort from reversing the order to use a lead(,n) would Oracle do something clever if you found a way to rewrite the query with a cascade of inline-views that all used last_value() – but I decided I wasn’t going to look at that option until someone paid me to. More seriously I did wonder about the option for using match_recognize to bypass the lag() problem, and may get around to thinking about that more carefully at some point.)

April 13, 2022

Adaptive Joins

Filed under: CBO,Hints,Oracle,Troubleshooting — Jonathan Lewis @ 1:53 pm BST Apr 13,2022

There’s a question on the Oracle Forums at the moment about a query that’s taking a long time to parse. Even after being reduced to one prebuilt (currently remote) table with two non-correlated outer joins to it the parse time is several hundred seconds. This seems fairly bizarre – I have seen some very long parse times from Oracle, but 10 minutes for 3 tables is well over the top; it did remind me, though of a note I started a few years ago of a 4 table join taking 4 seconds to parse, so I thought I’d present the query, the plan, and a little chat on debugging. Here’s the query:

select
        /*+ parallel(t4,3) */
        t1.small_vc,
        t2.small_vc,
        t3.small_vc,
        count(t4.small_vc)
from
        t4,     
        t1,     
        t2,     
        t3
where
        t1.id = t4.id1
and     t2.id = t4.id2
and     t3.id = t4.id3
and     t1.small_vc in (1,2,3)
and     t2.small_vc in (1,2,3,4)
and     t3.small_vc in (1,2,3,4,5)
group by
        t1.small_vc,
        t2.small_vc,
        t3.small_vc
;

I’m expecting a simple cascade of hash joins, with t1, t2 and t3 – the “small” tables – turning into “build” tables, then t4 – the “large” table – passing through each of them in turn until the penultimate rowsource is aggregated.

Here’s the execution plan — which looks pretty much as I expected it to – but there’s something wrong about it that isn’t visible in the output. Why is the query (plan) saying it took 0.07 seconds to complete (A-time), returning only 60 rows, when my SQL*Plus session didn’t return any data for 4 seconds

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Starts | E-Rows | Cost (%CPU)| E-Time   |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |      1 |        |   300 (100)|          |        |      |            |     60 |00:00:00.07 |       5 |      0 |       |       |          |
|   1 |  PX COORDINATOR                |          |      1 |        |            |          |        |      |            |     60 |00:00:00.07 |       5 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)          | :TQ10004 |      0 |      1 |   300   (4)| 00:00:01 |  Q1,04 | P->S | QC (RAND)  |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   3 |    HASH GROUP BY               |          |      3 |      1 |   300   (4)| 00:00:01 |  Q1,04 | PCWP |            |     60 |00:00:00.01 |       0 |      0 |  1394K|  1394K|     3/0/0|
|   4 |     PX RECEIVE                 |          |      3 |      1 |   300   (4)| 00:00:01 |  Q1,04 | PCWP |            |    180 |00:00:00.01 |       0 |      0 |       |       |          |
|   5 |      PX SEND HASH              | :TQ10003 |      0 |      1 |   300   (4)| 00:00:01 |  Q1,03 | P->P | HASH       |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   6 |       HASH GROUP BY            |          |      3 |      1 |   300   (4)| 00:00:01 |  Q1,03 | PCWP |            |    180 |00:00:00.14 |    6114 |   6018 |  1394K|  1394K|     3/0/0|
|*  7 |        HASH JOIN               |          |      3 |   8460 |   299   (4)| 00:00:01 |  Q1,03 | PCWP |            |  12600 |00:00:00.14 |    6114 |   6018 |  2171K|  2171K|     3/0/0|
|   8 |         JOIN FILTER CREATE     | :BF0000  |      3 |     22 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |     75 |00:00:00.02 |       0 |      0 |       |       |          |
|   9 |          PX RECEIVE            |          |      3 |     22 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |     75 |00:00:00.02 |       0 |      0 |       |       |          |
|  10 |           PX SEND BROADCAST    | :TQ10000 |      0 |     22 |     2   (0)| 00:00:01 |  Q1,00 | S->P | BROADCAST  |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|  11 |            PX SELECTOR         |          |      3 |        |            |          |  Q1,00 | SCWC |            |     25 |00:00:00.01 |       3 |      0 |       |       |          |
|* 12 |             TABLE ACCESS FULL  | T3       |      1 |     22 |     2   (0)| 00:00:01 |  Q1,00 | SCWP |            |     25 |00:00:00.01 |       3 |      0 |       |       |          |
|* 13 |         HASH JOIN              |          |      3 |  27322 |   297   (4)| 00:00:01 |  Q1,03 | PCWP |            |  12600 |00:00:00.10 |    6114 |   6018 |  2171K|  2171K|     3/0/0|
|  14 |          JOIN FILTER CREATE    | :BF0001  |      3 |     21 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |     72 |00:00:00.01 |       0 |      0 |       |       |          |
|  15 |           PX RECEIVE           |          |      3 |     21 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |     72 |00:00:00.01 |       0 |      0 |       |       |          |
|  16 |            PX SEND BROADCAST   | :TQ10001 |      0 |     21 |     2   (0)| 00:00:01 |  Q1,01 | S->P | BROADCAST  |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|  17 |             PX SELECTOR        |          |      3 |        |            |          |  Q1,01 | SCWC |            |     24 |00:00:00.01 |       3 |      0 |       |       |          |
|* 18 |              TABLE ACCESS FULL | T2       |      1 |     21 |     2   (0)| 00:00:01 |  Q1,01 | SCWP |            |     24 |00:00:00.01 |       3 |      0 |       |       |          |
|* 19 |          HASH JOIN             |          |      3 |  92953 |   294   (3)| 00:00:01 |  Q1,03 | PCWP |            |  12600 |00:00:00.10 |    6114 |   6018 |  2171K|  2171K|     3/0/0|
|  20 |           JOIN FILTER CREATE   | :BF0002  |      3 |     19 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |     63 |00:00:00.01 |       0 |      0 |       |       |          |
|  21 |            PX RECEIVE          |          |      3 |     19 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |     63 |00:00:00.01 |       0 |      0 |       |       |          |
|  22 |             PX SEND BROADCAST  | :TQ10002 |      0 |     19 |     2   (0)| 00:00:01 |  Q1,02 | S->P | BROADCAST  |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|  23 |              PX SELECTOR       |          |      3 |        |            |          |  Q1,02 | SCWC |            |     21 |00:00:00.01 |       3 |      0 |       |       |          |
|* 24 |               TABLE ACCESS FULL| T1       |      1 |     19 |     2   (0)| 00:00:01 |  Q1,02 | SCWP |            |     21 |00:00:00.01 |       3 |      0 |       |       |          |
|  25 |           JOIN FILTER USE      | :BF0000  |      3 |    343K|   289   (2)| 00:00:01 |  Q1,03 | PCWP |            |  19517 |00:00:00.09 |    6114 |   6018 |       |       |          |
|  26 |            JOIN FILTER USE     | :BF0001  |      3 |    343K|   289   (2)| 00:00:01 |  Q1,03 | PCWP |            |  19517 |00:00:00.09 |    6114 |   6018 |       |       |          |
|  27 |             JOIN FILTER USE    | :BF0002  |      3 |    343K|   289   (2)| 00:00:01 |  Q1,03 | PCWP |            |  19517 |00:00:00.09 |    6114 |   6018 |       |       |          |
|  28 |              PX BLOCK ITERATOR |          |      3 |    343K|   289   (2)| 00:00:01 |  Q1,03 | PCWC |            |  19517 |00:00:00.09 |    6114 |   6018 |       |       |          |
|* 29 |               TABLE ACCESS FULL| T4       |     48 |    343K|   289   (2)| 00:00:01 |  Q1,03 | PCWP |            |  19517 |00:00:00.06 |    6114 |   6018 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
  12 - SEL$1 / T3@SEL$1
  18 - SEL$1 / T2@SEL$1
  24 - SEL$1 / T1@SEL$1
  29 - SEL$1 / T4@SEL$1

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      OPT_PARAM('_fix_control' '16923858:5')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "T1"@"SEL$1")
      FULL(@"SEL$1" "T4"@"SEL$1")
      FULL(@"SEL$1" "T2"@"SEL$1")
      FULL(@"SEL$1" "T3"@"SEL$1")
      LEADING(@"SEL$1" "T1"@"SEL$1" "T4"@"SEL$1" "T2"@"SEL$1" "T3"@"SEL$1")
      USE_HASH(@"SEL$1" "T4"@"SEL$1")
      USE_HASH(@"SEL$1" "T2"@"SEL$1")
      USE_HASH(@"SEL$1" "T3"@"SEL$1")
      PQ_DISTRIBUTE(@"SEL$1" "T4"@"SEL$1" BROADCAST NONE)
      PX_JOIN_FILTER(@"SEL$1" "T4"@"SEL$1")
      PQ_DISTRIBUTE(@"SEL$1" "T2"@"SEL$1" NONE BROADCAST)
      PX_JOIN_FILTER(@"SEL$1" "T2"@"SEL$1")
      PQ_DISTRIBUTE(@"SEL$1" "T3"@"SEL$1" NONE BROADCAST)
      PX_JOIN_FILTER(@"SEL$1" "T3"@"SEL$1")
      SWAP_JOIN_INPUTS(@"SEL$1" "T2"@"SEL$1")
      SWAP_JOIN_INPUTS(@"SEL$1" "T3"@"SEL$1")
      GBY_PUSHDOWN(@"SEL$1")
      USE_HASH_AGGREGATION(@"SEL$1")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - access("T3"."ID"="T4"."ID3")
  12 - filter((TO_NUMBER("T3"."SMALL_VC")=1 OR TO_NUMBER("T3"."SMALL_VC")=2 OR TO_NUMBER("T3"."SMALL_VC")=3 OR TO_NUMBER("T3"."SMALL_VC")=4 OR TO_NUMBER("T3"."SMALL_VC")=5))
  13 - access("T2"."ID"="T4"."ID2")
  18 - filter((TO_NUMBER("T2"."SMALL_VC")=1 OR TO_NUMBER("T2"."SMALL_VC")=2 OR TO_NUMBER("T2"."SMALL_VC")=3 OR TO_NUMBER("T2"."SMALL_VC")=4))
  19 - access("T1"."ID"="T4"."ID1")
  24 - filter((TO_NUMBER("T1"."SMALL_VC")=1 OR TO_NUMBER("T1"."SMALL_VC")=2 OR TO_NUMBER("T1"."SMALL_VC")=3))
  29 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER_LIST(SYS_OP_BLOOM_FILTER(:BF0002,"T4"."ID1"),SYS_OP_BLOOM_FILTER(:BF0001,"T4"."ID2"),SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID3")))


One possible trouble-shooting step is simply to re-run the query, taking a snapshot of the session activity stats (v$mystat) and the session events (v$session_event) to see what they tell you (if anything) – here’s the critical extract from the session stats:

Name                                      Value
----                                      -----
CPU used when call started                  423
CPU used by this session                    429
parse time cpu                              411
parse time elapsed                          419

Most of the time is parse time, spent on the CPU. (If this had been a much larger scale problem and had occurred in the recent past I might have looked at ASH (v$active_session_hsitory) for any samples for the correct SQL_ID, and seen the problem revealed in the in_parse, in_hard_parse columns.

So let’s enable event 10053 and run the query again – but since it’s “only” 4 seconds, let’s tweak the timer option to report any step that took longer than 0.1 seconds. The default timer setting is a whole second (10^6 microseconds), so we set the fix-control to 5 to get 0.1 seconds (10^5 microseconds).

alter session set "_fix_control"='16923858:5';
alter session set events '10053 trace name context forever';

-- run the query, find the trace file

 grep TIMER or19_ora_23370.trc 

Here’s the output from the call to grep: it looks like group by placement (GBP) is causing a problem.

TIMER:  GBP: costing SEL$1 cpu: 0.303 sec elapsed: 0.309 sec
TIMER: GBP: iteration (#1) SEL$1 cpu: 0.303 sec elapsed: 0.309 sec
TIMER:  GBP: costing SEL$1565E019 cpu: 0.293 sec elapsed: 0.298 sec
TIMER: GBP: iteration (#2) SEL$1565E019 cpu: 0.294 sec elapsed: 0.299 sec
TIMER:  GBP: costing SEL$23EAFE84 cpu: 0.528 sec elapsed: 0.533 sec
TIMER: GBP: iteration (#3) SEL$23EAFE84 cpu: 0.528 sec elapsed: 0.533 sec
TIMER:  GBP: costing SEL$B5D97CA0 cpu: 0.533 sec elapsed: 0.540 sec
TIMER: GBP: iteration (#4) SEL$B5D97CA0 cpu: 0.534 sec elapsed: 0.540 sec
TIMER:  GBP: costing SEL$6C9B46B6 cpu: 0.531 sec elapsed: 0.531 sec
TIMER: GBP: iteration (#5) SEL$6C9B46B6 cpu: 0.531 sec elapsed: 0.532 sec
TIMER:  GBP: costing SEL$ED1298E3 cpu: 0.522 sec elapsed: 0.523 sec
TIMER: GBP: iteration (#8) SEL$ED1298E3 cpu: 0.523 sec elapsed: 0.524 sec
TIMER:  GBP: costing SEL$5968095A cpu: 0.523 sec elapsed: 0.523 sec
TIMER: GBP: iteration (#9) SEL$5968095A cpu: 0.524 sec elapsed: 0.523 sec
TIMER:  GBP: costing SEL$4CA81688 cpu: 0.525 sec elapsed: 0.527 sec
TIMER: GBP: iteration (#12) SEL$4CA81688 cpu: 0.526 sec elapsed: 0.528 sec
TIMER: Group-By Placement SEL$1 cpu: 3.766 sec elapsed: 3.793 sec
TIMER: Cost-Based Transformations (Overall) SEL$1 cpu: 3.769 sec elapsed: 3.795 sec
TIMER: Access Path Analysis (Final) SEL$1 cpu: 0.288 sec elapsed: 0.289 sec
TIMER: SQL Optimization (Overall) SEL$1 cpu: 4.072 sec elapsed: 4.108 sec

If you check further up the page, though, you’ll see in the Outline Information that Oracle has not used group by placement (it has done a “group by pushdown” but that’s different, and relates to aggregation in parallel execution. So one quick hack we could try is to add the hint /*+ no_place_group_by(@sel$1) */ to the query just to see what happens – and here’s the effect on the parse time:

Name                                      Value
----                                      -----
parse time cpu                               33
parse time elapsed                           34

Problem solved – provided we can get the hint into the code (by hand, or SQL Patch, etc.) But the question still remains: where did the time go? The trace file was fairly long (375,000 lines for the original, compared to 32,000 for the hinted) but a rapid scan seemed in order – and something very quickly caught my attention. It was pretty easy to spot because something big and nasty had happened 8 times.

The answer was in “Adaptive Plans”, which (mostly) get flagged with the label “AP:” in the 10053 trace file, for example:

AP: Computing costs for inflection point at min value 0.00
AP: Using binary search for inflection point search
AP: Costing Join for ADM inflection point at card 0.00
AP: Costing Join for ADM inflection point at card 0.00

You can see here that the optimizer is searching for an “inflection point”, that means it’s “thinking about” an adaptive join, and searching for the number of rows where a switch between a nested loop join and a hash join makes sense.

Notice particularly the comment about “using binary search”. After calculating the cost of the “best” join using the current estimates of cardinality for the incoming rowsource the optimizer starts calculating the possible costs (nested loop or hash, it ignores merge) for a set of alternative cardinalities until it finds the cardinality where the nested loop join and hash join have the same cost. That’s the number of rows that Oracle will use at run time to decide whether it should switch from its initial selection of join method to the alternative.

If the initial join selection was a hash join (i.e. large volume startegy) Oracle will use an alternative cardinality of 1 to start its list of estimates – on the assumption that that would produce a nested loop plan and then keep doubling the estimate until the resulting plan switched to a hash join, then close in on the break point by halving and doubling the remaining gap between the NL estimate and the HJ estimate.

If the initial join selection was a nested loop join Oracle will use a “worst case scenario” for the incoming estimate (acting, broadly speaking, as if every filter at that point had had a selectivity of 1 – i.e. 100% of the available data), and start by halving the estimate. This is fine when the starting estimate is (for example) for the first table in the join order and the esimate is just a few million rows.

Here’s the first estimate in my case (you’ll need a wide screen, or very small print):

6772:AP: Computing costs for inflection point at max value 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00 

After you’ve halved that number about 1,000 times you’re down to values in the region of a few thousand. And when you have to cost a couple of joins every time you halve, and when you’ve gone through the whole process 8 times that’s a lot of CPU.

In fact, with the no_place_group_by hint there was still one point where the optimizer did this adaptive join work – which probably accounts for most of the final 0.33 CPU seconds – but it didn’t show up in any separately reported timer events.

Of course the final little test of this parse time hypothesis is to add the hint /*+ no_adaptive_plan */ – so I did that, and the parse time was still about 0.3 seconds! Investigation of the 10053 trace showed that even with the hint in place the optimizer still went through that one huge binary chop – but when it had worked out the inflection point it printed the message:

AP: Adaptive plans bypassed for query block SEL$1 due to disabled by hint (no_adaptive_plan)
AP: Adaptive joins bypassed for query block SEL$1 due to adaptive plans disabled

According to the hint report the hint was valid, however, so that behaviour looks a little buggy. It then occurred to me that maybe I could have hinted /*+ no_adaptive_plan(@sel$1) */ – and that worked properly with the trace reporting:

AP: Checking validity for query block SEL$1, sqlid=7fjtvwktcmsgq
AP: Adaptive plans bypassed for query block SEL$1 due to disabled by hint (no_adaptive_plan)
AP: Adaptive joins bypassed for query block SEL$1 due to adaptive plans disabled

Had I not realised that this was possible I have fallen back on the hint /*+ opt_param() */ to set the parameter optimizer_adaptive_plans to false for the query in which case the following message (and no other “AP:” message) appeared near the start of the trace:

AP: Adaptive joins bypassed for query block SEL$1 due to disabled by adaptive join parameter

If you want to experiment on your own system here’s the script to create the data – the script name reflects the fact that I found this example by accident while working on something completely different:

rem
rem     Script:         12c_vector_transform_c.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2015
rem     Purpose:        
rem
rem     Last tested 
rem             19.11.0.0
rem             12.2.0.1
rem             12.1.0.2

drop table t1;
drop table t2;
drop table t3;
drop table t4;
purge recyclebin;

create table t1 
as
select
        rownum          id,
        to_char(mod(rownum,10)) small_vc,
        rpad('x',100)   padding
from
        all_objects
where
        rownum <= 70
;

alter table t1 
        add constraint t1_pk primary key(id)
;

create table t2
as
select
        rownum          id,
        to_char(mod(rownum,12)) small_vc,
        rpad('x',100)   padding
from
        all_objects
where
        rownum <= 70
;

alter table t2
        add constraint t2_pk primary key(id)
;

create table t3
as
select
        rownum          id,
        to_char(mod(rownum,14)) small_vc,
        rpad('x',100)   padding
from
        all_objects
where
        rownum <= 70
;

alter table t3
        add constraint t3_pk primary key(id)
;

create table t4
nologging
as
select
        t1.id                   id1,
        t2.id                   id2,
        t3.id                   id3,
        rpad(rownum,10)         small_vc,
        rpad('x',100)           padding
from
        t1, t2, t3
;

spool 12c_vector_transform_c

set timing on
set linesize 255
set trimspool on
set pagesize 90
set arraysize 100
set serveroutput off

execute snap_my_stats.start_snap

alter session set statistics_level = all;

alter session set "_fix_control"='16923858:5';
alter session set events '10053 trace name context forever';

select
        /*+ 
--              opt_param('optimizer_adaptive_plans' 'false')
--              no_adaptive_plan
--              no_adaptive_plan(@sel$1)
--              no_place_group_by(@sel$1)
                parallel(t4,3)
        */
        t1.small_vc,
        t2.small_vc,
        t3.small_vc,
        count(t4.small_vc)
from
        t4,     
        t1,     
        t2,     
        t3
where
        t1.id = t4.id1
and     t2.id = t4.id2
and     t3.id = t4.id3
and     t1.small_vc in (1,2,3)
and     t2.small_vc in (1,2,3,4)
and     t3.small_vc in (1,2,3,4,5)
group by
        t1.small_vc,
        t2.small_vc,
        t3.small_vc
;

select * from table(dbms_xplan.display_cursor(null,null,'cost outline allstats all note -bytes'));

alter session set events '10053 trace name context off';

set serveroutput on
execute snap_my_stats.end_snap

spool off

April 7, 2022

Parallel Inactivity

Filed under: Oracle,Parallel Execution,Performance,Troubleshooting — Jonathan Lewis @ 5:08 pm BST Apr 7,2022

This is a simple script that I wrote more than 20 years ago (for 8.1.7.4 apparently) to get the session wait event figures from parallel query slaves as they were running. I’m posting it now because there’s a question on the Oracle Database Forum where it would be useful as a way for checking for performance problems in a long-running CTAS.

Oracle gives us the view v$px_session to link together the query co-ordinator with the parallel query slaves, and it gives us v$px_sesstat to report the linked session activity stats, but it doesn’t gives us a view to link together the session events for the multiple sessions, so this query simply joins v$px_session to v$session_events to produce that result:

rem
rem     Script:         px_waits.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2001
rem     Purpose:        Report events for PX execution
rem
rem     Last tested:
rem             19.3.0.0
rem
rem     Notes:
rem     Note that the co-ordinator can be on a different instance
rem     in parallel server systems.  To cater for an oddity of the
rem     qcinst report, we need the current instance number in case
rem     the QC is on the current machine.
rem
rem     If the co-ordinator is present, then it appears last on the list
rem     for a server group.
rem

column instance_number new_value m_inst
select instance_number from v$instance;

set linesize 120
set pagesize 60
set trimspool on
set tab off
set verify off

spool px_waits

break -
        on qcsid skip 1 -
        on server_group -
        on degree -
        on server_set -
        on sid -
        on server# skip 1

column event format a32

column  qcsid           format a6       heading 'Coord'
column  server_group    format 999      heading 'Grp'
column  degree          format a5       heading 'Deg'   noprint
column  server_set      format 999      heading 'Set'   noprint
column  server#         format 999      heading 'Sno'
column  sid             format 9999     heading 'SID'
column  name            format a32      heading 'Statistic'
column  value           format 99,999,999       heading 'value'

select
        ss.qcsid || '/' || nvl(ss.qcinst_id,&m_inst)            qcsid,
        ss.server_group,
        decode(degree,
                null,null,
                ss.degree || '/' || ss.req_degree
        )       degree,
        ss.server_set,
        ss.server#,
        ss.sid,
        se.event,
        se.total_waits,
        se.time_waited
from
        v$px_session            ss,
        v$session_event         se
where
        se.sid = ss.sid
order by
        ss.qcsid,
        ss.server_group,
        ss.server_set,
        ss.server#,
        se.event
;

clear columns
clear breaks
set verify on

spool off

See also: Parallel Activity for the session activity stats report.

Parallel Activity

Filed under: Oracle,Parallel Execution,Performance,Troubleshooting — Jonathan Lewis @ 4:49 pm BST Apr 7,2022

This is a simple script that I wrote more than 20 years ago (for 8.1.7.4 apparently) to get the session activity from parallel query slaves as they were running. I’m posting it now because there’s a question on the Oracle Database Forum where it would be useful as a way for checking for a skewed data distribution in a long-running query.

Oracle gives use the view v$px_sesstat which connects PX slaves with their query co-ordinator (QC) and reports the session activity for all of them. The view only report statistics numbersm though, so this little query joins the view to v$statname to report the names. It eliminates stats where the value is zero, and orders by QC (though you could add a line to restrict the query to a single QC), parallel server group, then process, then statistic number.

rem
rem     Script:         px_stats.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2001
rem     Purpose:        Report statistics for PX slaves on the fly
rem
rem     Last tested
rem             19.3.0.0
rem     Notes:
rem     Note that the co-ordinator can be on a different instance
rem     in parallel server systems.  To cater for an oddity of the
rem     qcinst report, we need the current instance number in case
rem     the QC is on the current machine.
rem
rem     If the co-ordinator is present, then it appears last on the list
rem     for a server group.
rem
rem     This code really needs to be enhanced to do a proper job on OPS/RAC
rem     by accessing gv$px_sesstat, and handling sids and instances correctly
rem
rem     The user running this query has to have SELECT privileges 
rem     on the views v$instance, v$statname and v$px_sesstat
rem

column instance_number new_value m_inst
select instance_number from v$instance;

spool px_stats

set tab off
set pagesize 60
set linesize 156
set trimspool on
set verify off

break -
        on qcsid skip 1 -
        on server_group -
        on degree -
        on server_set -
        on sid -
        on server# skip 1


column  qcsid           format a6               heading 'Coord'
column  server_group    format 999              heading 'Grp'
column  degree          format a5               heading 'Deg'
column  server_set      format 999              heading 'Set'
column  server#         format 999              heading 'Sno'
column  sid             format 9999             heading 'SID'
column  name            format a52              heading 'Statistic'
column  value           format 99,999,999,999   heading 'value'


select
        st.qcsid || '/' || nvl(st.qcinst_id,&m_inst)            qcsid,
        st.server_group,
        decode(degree,
                null,null,
                st.degree || '/' || st.req_degree
        )       degree,
        st.server_set,
        st.server#,
        st.sid,
        sn.name,
        st.value
from
        v$px_sesstat    st,
        v$statname      sn
where
        sn.statistic# = st.statistic#
and     st.value != 0
order by
        st.qcsid,
        st.server_group,
        st.server_set,
        st.server#,
        st.statistic#
;

clear columns
clear breaks
set verify on

spool off

See also: Parallel Inactivity for a simple script reporting wait events

March 24, 2022

Drop column bug

Filed under: Bugs,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 11:07 am GMT Mar 24,2022

Here’s a problem that appeared recently on the Orace Developer forum showing one of the classic symptons of new features namely that “mix and match” often runs into problems. This example has been a long time coming so “new” is something of a misnomer, but the alternative source of unlucky collisions is in the “rare” features – in this case Spatial. (It’s possible that the problem is not restricted to just Spatial but it probably does require a somewhat exotic data type.)

The problem appeared in a thread on the Oracle Developer Forum from someone who was trying to drop a pair of columns and finding that the statement failed with a surprising choice of error: ORA-00904: Invalid Identifier. The surprising thing about this error was that the named identifier was clearly not invalid. The suspicion that that this was an example of “new features colliding” was that the columns to be dropped were virtual columns based on a real column of the table that had been declared as an object type defined in the MDSYS (Spatial) schema.

Conveniently the author had supplied a short, simple, script to demonstrate the issue, so I copied it and modified it a bit to do a few tests around the edges of the problem. Here’s the code that I used to start my investigation:

rem
rem     Script:         drop_col_bug.sql
rem     Author:         Jonathan Lewis/User_77G7L
rem     Dated:          Mar 2022
rem     Purpose:        
rem
rem     Last tested 
rem             21.3.0.0        Still broken
rem             19.11.0.0
rem

create table xxx (
        v1              varchar2(10),
        n1              number,
        shape           mdsys.sdo_geometry,
        x_value         number generated always as (mdsys.sdo_geom.sdo_pointonsurface(shape,0.005).sdo_point.x) virtual,
        y_value         number generated always as (mdsys.sdo_geom.sdo_pointonsurface(shape,0.005).sdo_point.y) virtual,
        v2              varchar2(10),
        n2              number,
        n3              number
)
segment creation immediate
;

insert into xxx(v1, n1, v2, n2, n3) values('z',1,'y',2,3);
update xxx  set
        shape = sdo_geometry(
                2003,                           -- two-dimensional polygon
                null,
                null,
                sdo_elem_info_array(1,1003,3),  -- one rectangle (1003 = exterior)
                sdo_ordinate_array(1,1, 5,7)    -- only 2 points needed to define rectangle
        )
;


commit;

alter table xxx drop (x_value, y_value) ;

The modifications I made from the original code are:

  • I’ve removed a couple of redundant sets of parentheses from the virtual column definitions
  • I’ve added a few columns before, in between, and after the virtual columns
  • I’ve used “segment creation immediate”
  • I’ve inserted a row into the table

The last two are simply to ensure that I have data segments and at least one item for the drop to work on – just in case it’s a run-time problem being reported as a parse time issue.

The extra columns are to test whether the type and position of the column I drop affects the outcome, and the change in parentheses is purely aesthetic.

Here’s the result of the attempt to drop the virtual columns:

alter table xxx drop (x_value, y_value)
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier

This is a little strange since I have used the packaged function mdsys.sdo_geom.sdo_pointonsurface() to define the virtual columns and Oracle didn’t complain when I created the column. (Part of the reason I had reduced the original parentheses was to check that the compiler hadn’t got confused by an excess of paretheses).

As a quick “what if” test I tried using the alternative syntax for drop column that you can use with just one column:

SQL> alter table xxx drop column x_value;
alter table xxx drop column x_value
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier

What about trying to set the column unused before dropping all unused columns?

SQL> alter table xxx set unused column x_value;
alter table xxx set unused column x_value
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier

So is the problem restricted to the virtual columns – what happens if I try to drop a column from the end of the table, what about the one between the two virtual columns, how about a column that appears before even the shape column? Nothing changes:

SQL> alter table xxx drop column v1;
alter table xxx drop column v1
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier


SQL> alter table xxx set unused column v1;
alter table xxx set unused column v1
  *
ERROR at line 1:
ORA-00904: "MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE": invalid identifier

What if I have only one of the virtual columns? No difference.

What if I don’t have either of the virtual columns? Finally I can drop any column I like from the table (including the shape column). Not that that’s much use to the user.

You can’t set unused or drop any columns in the table thanks to an error that looks as if it’s associated with the definition of those virtual columns.

Workaround

Is there any way to bypass the problem and still store the information we need (until we want to drop it). Let’s start by taking a look at the way Oracle has used our table definition to create column definitions, just in case that gives us a clue:

select 
        column_id id, segment_column_id seg_id, internal_column_id int_id, 
        column_name, data_type, data_default  
from 
        user_tab_cols 
where 
        table_name = 'XXX' 
order by 
        column_id,
        internal_column_id
/

        ID     SEG_ID     INT_ID COLUMN_NAME          DATA_TYPE                 DATA_DEFAULT
---------- ---------- ---------- -------------------- ------------------------- --------------------------------------------------------------------------------
         1          1          1 V1                   VARCHAR2
         2          2          2 N1                   NUMBER
         3          3          3 SHAPE                SDO_GEOMETRY
         3          4          4 SYS_NC00004$         NUMBER
         3          5          5 SYS_NC00005$         NUMBER
         3          6          6 SYS_NC00006$         NUMBER
         3          7          7 SYS_NC00007$         NUMBER
         3          8          8 SYS_NC00008$         NUMBER
         3          9          9 SYS_NC00009$         SDO_ELEM_INFO_ARRAY
         3         10         10 SYS_NC00010$         SDO_ORDINATE_ARRAY
         4                    11 X_VALUE              NUMBER                    (("MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE"("SHAPE",0.005))."SDO_POINT")."X"
         5                    12 Y_VALUE              NUMBER                    (("MDSYS"."SDO_GEOM"."SDO_POINTONSURFACE"("SHAPE",0.005))."SDO_POINT")."Y"
         6         11         13 V2                   VARCHAR2
         7         12         14 N2                   NUMBER
         8         13         15 N3                   NUMBER

15 rows selected.

There’s quite a lot going on there in terms of columns hidden behind the sdo_geometry type. In fact internal columns 9 and 10 might prompt you to look for other objects like table types or LOBs:

SQL> select column_name, segment_name, index_name from user_lobs where table_name = 'XXX';

COLUMN_NAME                    SEGMENT_NAME                   INDEX_NAME
------------------------------ ------------------------------ ------------------------------
"SHAPE"."SDO_ELEM_INFO"        SYS_LOB0000100168C00009$$      SYS_IL0000100168C00009$$
"SHAPE"."SDO_ORDINATES"        SYS_LOB0000100168C00010$$      SYS_IL0000100168C00010$$

2 rows selected.

But the interesting detail is the data_default column for our two virtual columns – which have more parentheses than the original definitions. Perhaps the storage of the expression has gone wrong (as happened in an older version of Oracle with case expressions) and is causing the ORA-00904 error to appear. So let’s try selecting data from the table using the expression stored in data dictionary:

select
        ((MDSYS.SDO_GEOM.SDO_POINTONSURFACE(SHAPE,0.005)).SDO_POINT).X          old_x,
          mdsys.sdo_geom.sdo_pointonsurface(shape,0.005).sdo_point.x            new_x,
        ((MDSYS.SDO_GEOM.SDO_POINTONSURFACE(SHAPE,0.005)).SDO_POINT).Y          old_y
from
        xxx
/

     OLD_X      NEW_X      OLD_Y
---------- ---------- ----------
         1          1          1

1 row selected.


No syntax error there – as far as a simple select is concerned. I’ve included my tidier format for the x_value column aligned with the resulting stored value (with all the double quotes removed – though I’ve also tested it with the quotes in place) – and the only significant visual difference is the number of parentheses, so maybe that’s a clue. In particular we note that the error reports “MDSYS”.”SDO_GEOM”.”SDO_POINTONSURFACE” as the invalid identifier and the first time an extra (close) parenthesis appears is just after that function call. Maybe (for no good reason) the code path involved with handling column data during a drop/set unused call is getting confused by parentheses. So let’s try to reduce the complexity of the expression by hiding it inside a local function.

First attempt – create a function to return an sdo_point_type and define the virtual columns to expose the X and Y values from the point:

create or replace function my_point(
        inshape         in mdsys.sdo_geometry,
        intolerance     in number
)
return mdsys.sdo_point_type
deterministic
as
begin
        return mdsys.sdo_geom.sdo_pointonsurface(inshape, intolerance).sdo_point;
end;
/

        x_value         number generated always as (my_point(shape,0.005).x) virtual,
        y_value         number generated always as (my_point(shape,0.005).y) virtual,

This approach still produces an ORA-00904, though the invalid identifier becomes “TEST_USER”.”MY_POINT”.

Second attempt – two separate functions, one for the x value, one for the y value:

create or replace function my_x(
        inshape         in mdsys.sdo_geometry,
        intolerance     in number
)
return number
deterministic
as
begin
        return   mdsys.sdo_geom.sdo_pointonsurface(inshape, intolerance).sdo_point.x;
end;
/

show errors

create or replace function my_y(
        inshape         in mdsys.sdo_geometry,
        intolerance     in number
)
return number
deterministic
as
begin
        return   mdsys.sdo_geom.sdo_pointonsurface(inshape, intolerance).sdo_point.y;
end;
/

        x_value         number generated always as (my_x(shape, 0.005)) virtual,
        y_value         number generated always as (my_y(shape, 0.005)) virtual,

This worked so, finally, I looked at the SQL Language reference manual to see if there were any restrictions on virtual columns that might explain the problem I had had with all the previous definitions (and, yes, I know I should have done that right after the first failure) and I found the following:

  • The virtual column cannot be an Oracle supplied data type, a user-defined type, or LOB or LONG RAW.

None of my virtual column definitions returned an Oracle supplied data type or a user-defined data type. But would the restriction also apply to single attributes of such a data type, or has the complexity of spatial data types found a hole in the code? And the answer to that question is a whole new blog note waiting to be finish (because I’d forgotten what a pain it was to use object types in simple SQL.)

February 1, 2022

Troubleshooting catalogue

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 11:31 am GMT Feb 1,2022

This is a list of all articles I’ve written that describe some troubleshooting ideas. Each entry has a date stamp and a short note of the contents. The articles are generally listed most-recent first.

I am currently working backwards through my entire blog adding articles to catalogues as I go, so any individual catalogue may grow very slowly.

  • Parallel Default (Sept 2022): Why did my query run parallel? A possible reason and a simple demo.
  • Hash Joins and functions (July 2022): tracking down why a trim() function in a hash join predicate used 1,800 seconds of CPU.
  • Pagination Costs (July 2022): select with offset N next M can change plans as N and/or M vary.
    • Fetch Fetch and FBIs (June 2019): a error of omission (with workaround) in the optimizer’s treatment of FBIs with Fetch First / Top-N (last tested 19.11)
  • LOB Space (July 2022): considerations when comparing dbms_space.space_usage() with sum(dbms_lob.getlength())
  • Index Wildcard (July 2022): Solving a problem of a case where a special case fix control had a bad side effect on a cursor_sharing damage-limitation strategy.
  • ORA-00054 pt.1 (July 2022): Trouble-shooting with errorstack and ksq trace
  • ORA-39857 (June 2022): using a “drop tablespace” problem to describe a generic method for finding out why Oracle is not doing what you thought it would.
  • Case Study (June 2022): an example of solving a performance problem by reading an SQL Monitor report
  • Adaptive Plans (April 2022): what they can cost and how block then selectively
  • PL/SQL Warning (April 2022): an aid to pre-empt trouble-shooting – automatic warnings of pl/sql coding flaws
  • Finding SQL (Aug 2021): currently includes ideas on finding “who has been using up all the temporary space”.
  • Memory allocation (Aug 2021): SGA target, PGA target, Memory target: where’s the memory going.
  • Cleaning the audit trail (July 2021): playing about with hexadecimal pinpoints a bug in dbms_audit_mgmt.clean_audit_trail
  • Disappearing NOT NULL constraints (July 2021): the effects of the link between primary keys and not null declarations
  • v$resource_limit (March 2021): reference note and simple script
  • Dates and formats (Dec 2020): It’s important to use date datatypes, and check for type-conversion threats.
  • Java Names upgrade (Dec 2020): how an upgrade from 12c to 19c caused an error in “home-grown” auditing.
  • Interval partition serialising (Feb 2020): Why did a parallel insert into an interval partitioned table serialise through one PX process – conjecture and workaround.
  • Interval Partitioning load threat (Feb 2018): a big insert as select, create as select to load/create an interval partitioned table can produce a MASSIVE sort.
  • v$lock efficiency (May 2013): the benefit of collecting stats on individual x$ structures, and how to avoid dbms_stats.gather_fixed_object_stats
  • Old Plan_table (Jan 2010): why dbms_xplan.display() might have a note: ‘PLAN_TABLE’ is old version
  • PGA Memory (June 2009): checking process memory and process memory detail.
Next Page »

Website Powered by WordPress.com.