Here’s a question on OTN from a SQL Server user that should prompt a few interesting ideas. Re-arranged and paraphrased it goes something like this:
In SQL Server I can write the following code:
DECLARE @Counter INT SELECT @Counter = 0 UPDATE TempDB.dbo.TransactionDetail SET @Counter = AccountRunningCount = @Counter + 1 FROM TempDB.dbo.TransactionDetail WITH (TABLOCKX)What I want to do is more like this:
DECLARE @Total INT = 0 UPDATE StringOutput set @Total = SumOfLength = @Total + ColLength OPTION (MAXDOP 1)How do I do something similar in Oracle ?
I suspect that there may be an error in the second piece of code if it is to run on SQL Server since it doesn’t have a “FROM StringOutput” to match the “FROM TempDB.dbo.TransactionDetail” that appears in the first piece – but perhaps it’s implied by the reference to StringOutput in the UPDATE clause.
To an Oracle user, the set clause is particularly strange – but this is a feature of how T-SQL (roughly the equivalent of PL/SQL) can work for SQL Server. This set clause allows each row to modify, and be modified by, a variable. So, for each row, this double assignment first sets the column sumoflength to the value of the variable total plus the value of the column collength, then it sets the variable total to this new value of sumoflength. The net effect is that the column sumoflength acquires a running total of collength as you read through the table. (I am basing my explanation on “reasonable guesswork” combined with the title of the thread which was: “Update Failed for Sum of previous row and current row”.
So how do you do this in Oracle ? The simple answer is “you shouldn’t, but if you do the strategy has to be completely different because the tools are different”.
The reason for the “shouldn’t” is that there is no implicit order in a select statement unless it includes an “order by” clause – so any thoughts of “previous” or “next” for data in a table are erroneous. Assume then that we define an order through a unique, non-null column – we would then look at the SQL Server code and wonder whether its internal implementation was effectively a row by row mechanism, or whether there was some cunning array based processing taking place (perhaps like the trick that PL/SQL uses to turn an implicit cursor for loop into array fetches of 100 rows at a time).
My first thoughts about the task (assuming we actually do want to do it) point me to analytic functions as a way of producing the data – but I fear I may have to use PL/SQL because of the strong flavour of row-based activity that the code suggests. Let’s start with the first stab at an analtyic approach:
create table t1 as with generator as ( select --+ materialize rownum id from dual connect by level <= 1e4 ) select rownum id, rownum n1, cast(null as number(9)) running_tot from generator v1 where rownum <= 1e3 ; alter table t1 add constraint t1_pk primary key(id); -- collect stats
This gives us a primary key for ordering, a column to generate a running total for, and a column to store the running total – currently null. Let’s write a query that shows us the data we ultimately want to store in the table:
select
id,
n1,
-- running_tot,
sum(n1) over (order by id) running_tot
from
t1
;
ID N1 RUNNING_TOT
---------- ---------- -----------
1 1 1
2 2 3
3 3 6
4 4 10
5 5 15
6 6 21
7 7 28
8 8 36
9 9 45
10 10 55
Since we can calculate the right information in a select statement it would be nice if we could turn that statement into an updatable view; unfortunatley it doesn’t work.
update
(
select
id,
n1,
running_tot,
sum(n1) over (order by id) running_tot_calc
from
t1
)
set
running_tot = running_tot_calc
;
(
*
ERROR at line 2:
ORA-01732: data manipulation operation not legal on this view
That doesn’t seem entirely reasonable, but we can take the code and wrap it in pl/sql as a cursor for loop and (left as an exercise to the reader) we could even make the code more complex to do array updates rather than using the simplest code to do row by row updates:
declare cursor c1 is select id, n1, running_tot, sum(n1) over (order by id) running_tot_calc from t1 for update of running_tot ; begin for r1 in c1 loop update t1 set running_tot = r1.running_tot_calc where current of c1; end loop; commit; end; /
For the small data set I had generated, this runs pretty quickly, and gets the right answer. I find it slightly surprising, though, that I’m allowed to use the “select for update”/”current of” approach when a straight update isn’t legal.
Of course we might consider going for the full, non-procedural, array update – and luckily we’ve got a primary key that will help us to generate an updatable join view, or a simple, update-only, merge.
merge into t1 od using ( select id, n1, running_tot, sum(n1) over (order by id) running_tot_calc from t1 ) nd on (nd.id = od.id) when matched then update set od.running_tot = nd.running_tot_calc ;
I won’t show you the updatable join view version – because it didn’t work – I got the “non-key-preserved” error ORA-01779. I think the optimizer is having some trouble (or being a little too clever) with the analytic sum(). Again, with the small data set, this update was pretty quick. For larger data sets, the choice of general strategy is left to the reader.
I’d still be interested to know, though, whether the SQL Server code was internally array based or operates row by row.

Your comment about the “shouldn’t” and table order might not apply to SQL Server because the default table structure over there is their version of an IOT. At least that is what I remember from the All Things Oracle webinar you did.So wouldn’t that enable them to do the previous/next type logic without worrying about the order so much?
Comment by Tony — October 26, 2012 @ 10:29 pm UTC Oct 26,2012 |
Tony,
I believe your suggestion about why the “next/previous” problem wouldn’t occur to many SQL Server programmers is correct. That doesn’t mean they shouldn’t worry about it, though, it just means they’re slightly less likely to think of it as a problem.
Comment by Jonathan Lewis — December 13, 2012 @ 2:19 pm UTC Dec 13,2012 |
I think “select for update” is prefer since
1.”merge” result could be inconsistent(for example with insert and update in another session)
2. “select for update” optimized and stabilized with “buffer sort” in 11.2.
3. we can optimize case with “for update” cursor with bulk operations(fetch limit+for all)
And would be great if “window sort” executed after locking in next oracle versions like subqueries and functions now, which is not used for sorting into “order by” clause.
Also i want to ask: has the “for update” mechanism for array locking like “array update”(redo array) ?
Comment by Sayan Malakshinov — October 26, 2012 @ 11:45 pm UTC Oct 26,2012 |
Sayan,
Very useful points – especially the reminder about write consistency.
In this case, of course, we’re going to be updating every row in the table, so I should have suggested “lock table in exclusive mode” as a starting step – which would be much more efficient than select for update.
Comment by Jonathan Lewis — December 13, 2012 @ 2:21 pm UTC Dec 13,2012 |
I think this will work:
update t1 set running_tot = ( select sum( t1_1.n1) from t1 t1_1 where t1_1.id <= t1.id );Not the fastest query in the world, but should work.
Comment by Lev Erusalimskiy — October 29, 2012 @ 12:03 am UTC Oct 29,2012 |
Besides the fact that your query is much heavier (too many current gets), is much more important that it is dangerous because of possible inconsistencies result.
Just simple example with your query:
--creating test table with 10 rows, id and n1 = 1..10: create table t1 as select level id ,level n1 ,cast(null as int) running_tot from dual connect by level<=10; Table created. ----------------------- -- in one session we updating n1 to 100 where id=5: update t1 set n1=100 where id=5; 1 row updated. ----------------------- -- in another we updating running_total with your query: update t1 set running_tot = ( select sum( t1_1.n1) from t1 t1_1 where t1_1.id <= t1.id ); -- wait for unlocking.. ----------------------- -- now committing in first session: commit; Commit complete. ----------------------- -- second session unlocked: 10 rows updated. -- let's see what we got: select * from t1; ID N1 RUNNING_TOT ---------- ---------- ----------- 1 1 1 2 2 3 3 3 6 4 4 10 5 100 15 6 6 21 7 7 28 8 8 36 9 9 45 10 10 55 10 rows selected.Comment by Sayan Malakshinov — October 29, 2012 @ 6:58 pm UTC Oct 29,2012 |
I agree that this query is not the fastest, but I don’t think it is more dangerous than any other update statement
- it sees committed data as of the begging of the query.
Buy the way, how did you add this beautiful formatting to your post?
Comment by Lev Erusalimskiy — October 30, 2012 @ 3:05 am UTC Oct 30,2012 |
Lev, your are wrong. If you repeat my example with another simple query:
or change your query with adding one “dummy” predicate:
update t1 set running_tot = ( select sum( t1_1.n1) from t1 t1_1 where t1_1.id <= t1.id and t1_1.n1*0 = t1.n*0 );you would see that applied new values.
This behaviour described in articles about “write consistency” and mini-rollbacks(statement restarts).
About posting:
http://jonathanlewis.wordpress.com/comments-on-comment/
http://en.support.wordpress.com/code/posting-source-code/
Comment by Sayan Malakshinov — October 30, 2012 @ 5:34 am UTC Oct 30,2012