<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Redo</title>
	<atom:link href="http://jonathanlewis.wordpress.com/2011/08/19/redo-2/feed/" rel="self" type="application/rss+xml" />
	<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/</link>
	<description>Just another Oracle weblog</description>
	<lastBuildDate>Sun, 26 May 2013 02:13:39 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: A NoCOUG to Remember</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-53860</link>
		<dc:creator><![CDATA[A NoCOUG to Remember]]></dc:creator>
		<pubDate>Fri, 01 Mar 2013 12:54:40 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-53860</guid>
		<description><![CDATA[[...] included a demonstration of how Oracle actually fails the ACID test, which he blogged about afterward, definitely give it a [...]]]></description>
		<content:encoded><![CDATA[<p>[...] included a demonstration of how Oracle actually fails the ACID test, which he blogged about afterward, definitely give it a [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Martin Maletinsky</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-46026</link>
		<dc:creator><![CDATA[Martin Maletinsky]]></dc:creator>
		<pubDate>Wed, 11 Apr 2012 21:42:20 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-46026</guid>
		<description><![CDATA[Hello Tony,

Thanks a lot for your detailed reply (I didn&#039;t expect a reply that quickly). Yes, it is the redo associated with the change to the transaction slot in the undo table space that I was referring to when I used the term &quot;commit record&quot;, and it is this change that I wanted to be written to disk before the transaction table slot is updated. And yes again, the steps you list are exactly what I had in mind - only I hadn&#039;t thought about pinning the undo segment header. Is this pinning strictly necessary - wouldn&#039;t it inhibit concurrency significantly (if you pin it in exclusive mode), or did you just intend to pin it in shared mode to prevent it from being evicted from the cache before it&#039;s updated?

However, I understand that Jonathan was saying this approach might lead to inconsistencies as well:
&quot;[...] and only lets them see the change after your session has been posted by lgwr (and even that leaves a window for inconsistent recovery if the instance crashes after the write has happened but before your session has been posted – see my reply to Martin Berger [...]&quot; in Jonathan&#039;s update to his original post
&quot;[...] Even if we stopped session 3 from seeing the data until session 1 agreed that the transaction had been made durable we could be in a position where the write to disc had occurred but instance failure meant that session 1 had not seen the acknowledgement; [...]&quot; in Jonathan&#039;s reply to Martin Berger.

I am aware that the approach amounts to open heart surgery; however, it is my understanding that the problem to be fixed comes close to a serious heart attack (see my comment 27 above). I don&#039;t know the Oracle source code but it seems hard to imagine that the problem can be fixed without significant modifications in central areas of the code.

kind regards
Martin]]></description>
		<content:encoded><![CDATA[<p>Hello Tony,</p>
<p>Thanks a lot for your detailed reply (I didn&#8217;t expect a reply that quickly). Yes, it is the redo associated with the change to the transaction slot in the undo table space that I was referring to when I used the term &#8220;commit record&#8221;, and it is this change that I wanted to be written to disk before the transaction table slot is updated. And yes again, the steps you list are exactly what I had in mind &#8211; only I hadn&#8217;t thought about pinning the undo segment header. Is this pinning strictly necessary &#8211; wouldn&#8217;t it inhibit concurrency significantly (if you pin it in exclusive mode), or did you just intend to pin it in shared mode to prevent it from being evicted from the cache before it&#8217;s updated?</p>
<p>However, I understand that Jonathan was saying this approach might lead to inconsistencies as well:<br />
&#8220;[...] and only lets them see the change after your session has been posted by lgwr (and even that leaves a window for inconsistent recovery if the instance crashes after the write has happened but before your session has been posted – see my reply to Martin Berger [...]&#8221; in Jonathan&#8217;s update to his original post<br />
&#8220;[...] Even if we stopped session 3 from seeing the data until session 1 agreed that the transaction had been made durable we could be in a position where the write to disc had occurred but instance failure meant that session 1 had not seen the acknowledgement; [...]&#8221; in Jonathan&#8217;s reply to Martin Berger.</p>
<p>I am aware that the approach amounts to open heart surgery; however, it is my understanding that the problem to be fixed comes close to a serious heart attack (see my comment 27 above). I don&#8217;t know the Oracle source code but it seems hard to imagine that the problem can be fixed without significant modifications in central areas of the code.</p>
<p>kind regards<br />
Martin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tonyhasler</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-46025</link>
		<dc:creator><![CDATA[tonyhasler]]></dc:creator>
		<pubDate>Wed, 11 Apr 2012 21:20:49 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-46025</guid>
		<description><![CDATA[Martin,

I know it is a bit unusual for somebody other than the blog owner to reply to comments but I am pretty sure I understand your confusion so with apologies to Jonathan....

The key point is your statement &quot;...after receiving confirmation from the log writer that the redo has been written to disk&quot;

What redo are we talking about?  The redo associated with the changes to the permanent table blocks or the redo associated with the change to the transaction slot in the undo table space?  If you mean the former then that doesn&#039;t work because the changes will be rolled back after crash recovery.  The thing that makes a transaction durable after a crash is the logging of the update to the transaction slot.

So the way to implement what you are suggesting is:

- Pin the undo segment header in cache
- Log a change to the transaction slot without actually making the change to the transaction slot in memory
- Get a confirmation from LGWR that the change is on disk
- Actually make the change to the transaction slot in memory.
- Unpin the undo segment header
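
The five steps above can be sketched as a toy model. This is only an illustration of the ordering, not Oracle code: LogWriter, UndoSegmentHeader and commit() are hypothetical names, and the "confirmation from LGWR" is simplified to a synchronous flush.

```python
import threading


class LogWriter:
    """Hypothetical stand-in for LGWR: redo is durable once flush() returns."""

    def __init__(self):
        self.buffer = []    # redo copied into the log buffer, not yet on disk
        self.on_disk = []   # redo already written to the log file

    def log(self, record):
        self.buffer.append(record)

    def flush(self):
        self.on_disk.extend(self.buffer)
        self.buffer.clear()


class UndoSegmentHeader:
    """Hypothetical undo segment header holding the transaction table slots."""

    def __init__(self):
        self.pin = threading.Lock()   # assumed exclusive pin on the header
        self.slots = {}               # transaction id -> slot state


def commit(txn_id, header, lgwr, events):
    with header.pin:                                   # 1. pin the header
        events.append("pinned")
        lgwr.log(("commit", txn_id))                   # 2. log the slot change only
        events.append("logged, slot=" + header.slots[txn_id])
        lgwr.flush()                                   # 3. wait for the write
        events.append("on disk")
        header.slots[txn_id] = "COMMITTED"             # 4. change slot in memory
        events.append("slot=COMMITTED")
    events.append("unpinned")                          # 5. pin released on exit


header = UndoSegmentHeader()
header.slots["TA1"] = "ACTIVE"
lgwr = LogWriter()
events = []
commit("TA1", header, lgwr, events)
print(events)
```

In this sketch no other session can see the slot as COMMITTED before the commit record is on disk, which is the whole point of the reordering; the cost is that the header stays pinned for the duration of the log write.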

Of course, there are variations and there are complications should the client process crash or hang but you get the basic idea.

This approach would work but violates a couple of fundamental aspects of Oracle architecture.  As Jonathan explains in his book the process of generating redo is fundamentally tied into the process of making block changes.

Despite the open heart surgery on the database engine that your suggestion requires, I believe that something like this would need to be done should Oracle ever be forced to fix this problem by somebody such as a major client or the powers that be at TPC.]]></description>
		<content:encoded><![CDATA[<p>Martin,</p>
<p>I know it is a bit unusual for somebody other than the blog owner to reply to comments but I am pretty sure I understand your confusion so with apologies to Jonathan&#8230;.</p>
<p>The key point is your statement &#8220;&#8230;after receiving confirmation from the log writer that the redo has been written to disk&#8221;</p>
<p>What redo are we talking about?  The redo associated with the changes to the permanent table blocks or the redo associated with the change to the transaction slot in the undo table space?  If you mean the former then that doesn&#8217;t work because the changes will be rolled back after crash recovery.  The thing that makes a transaction durable after a crash is the logging of the update to the transaction slot.</p>
<p>So the way to implement what you are suggesting is:</p>
<p>- Pin the undo segment header in cache<br />
- Log a change to the transaction slot without actually making the change to the transaction slot in memory<br />
- Get a confirmation from LGWR that the change is on disk<br />
- Actually make the change to the transaction slot in memory.<br />
- Unpin the undo segment header</p>
<p>Of course, there are variations and there are complications should the client process crash or hang but you get the basic idea.</p>
<p>This approach would work but violates a couple of fundamental aspects of Oracle architecture.  As Jonathan explains in his book the process of generating redo is fundamentally tied into the process of making block changes.</p>
<p>Despite the open heart surgery on the database engine that your suggestion requires, I believe that something like this would need to be done should Oracle ever be forced to fix this problem by somebody such as a major client or the powers that be at TPC.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Martin Maletinsky</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-46024</link>
		<dc:creator><![CDATA[Martin Maletinsky]]></dc:creator>
		<pubDate>Wed, 11 Apr 2012 21:18:43 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-46024</guid>
		<description><![CDATA[Hello

I have two questions related to the solution you suggest to the problem (in the update you added to the initial post). As in my previous post, let me use “commit record” as a shorthand for “the change vector describing the transaction table slot update”.

1) How do you deal with the scenario where the session owning the transaction terminates after it has posted LGWR but before it could update the transaction table slot from ‘commit waiting on lgwr’ to ‘committed’? Will PMON have to wait for LGWR to write the corresponding commit record to disk and then do the update (or roll back the transaction in case the log writer crashes)?

--

2) In the fourth (last) bullet point you mention a consistency problem. If I understand it correctly, the problem arises because multiple committing sessions might update their transaction table slots from ‘commit waiting on lgwr’ to ‘committed’ in a different (temporal) order than the (physical) order of the corresponding commit records in the redo log file.

a) Is my understanding of the consistency problem correct?

b) (if yes to a) Why does this different ordering cause a consistency problem - could you please sketch a scenario showing how it can lead to inconsistent data?

c) If my understanding is correct and the different ordering does cause a consistency problem, am I correct in assuming that by &quot;waiting (or watching) for it to acknowledge&quot; you mean that the background process you mention waits for each woken process to update its transaction table slot before waking up the next process on the list (otherwise OS scheduling could mess up the order of the transaction table updates even though the processes are woken up in the right order)?

d) Point c) brought me to another thought regarding the current COMMIT mechanism. As I understand it, multiple sessions could update their transaction table slots (from ACTIVE to COMMITTED) in a different (temporal) order than the (physical) order of the corresponding commit records in the redo log file. This could happen as a result of OS scheduling; let me sketch the scenario:
t1: Session 1 is about to commit Transaction TA1 and copies the commit record CR1 into the log buffer
t2: Session 1 is taken off the CPU by the OS scheduler
t3: Session 2 is about to commit Transaction TA2 and copies the commit record CR2 into the log buffer
t4: Session 2 updates the transaction table slot TTS2 from ACTIVE to COMMITTED
t5: Session 2 posts the log writer
t6: the log writer writes the redo log buffer to the redo log file (and thus the commit records CR1, CR2 in this order)
t7: Session 1 is put on the CPU again by the OS scheduler
t8: Session 1 updates the transaction table slot TTS1 from ACTIVE to COMMITTED (and doesn&#039;t post the log writer because the record has already been written)
Now you have CR2 after CR1 in the redo log file but TTS1 was updated after TTS2 - can this lead to a consistency problem as well?
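
The interleaving t1 to t8 can be replayed as a small deterministic script. It is only a toy model of the scenario as I describe it (the helper names are made up, and nothing here reflects Oracle internals); it just makes the two orderings visible side by side.

```python
log_buffer = []      # commit records copied by the sessions, in copy order
redo_log_file = []   # what the log writer has written to disk
slot_updates = []    # order in which slots were set to COMMITTED
slots = {"TTS1": "ACTIVE", "TTS2": "ACTIVE"}


def copy_commit_record(record):
    """A session copies its commit record into the log buffer."""
    log_buffer.append(record)


def update_slot(slot):
    """A session marks its transaction table slot as committed."""
    slots[slot] = "COMMITTED"
    slot_updates.append(slot)


def lgwr_write():
    """The log writer writes the whole buffer to the log file, in buffer order."""
    redo_log_file.extend(log_buffer)
    log_buffer.clear()


copy_commit_record("CR1")   # t1: session 1 copies CR1
                            # t2: session 1 is taken off the CPU
copy_commit_record("CR2")   # t3: session 2 copies CR2
update_slot("TTS2")         # t4: session 2 updates TTS2
lgwr_write()                # t5/t6: session 2 posts lgwr, lgwr writes CR1, CR2
                            # t7: session 1 is back on the CPU
update_slot("TTS1")         # t8: session 1 updates TTS1

print(redo_log_file)   # ['CR1', 'CR2']
print(slot_updates)    # ['TTS2', 'TTS1']
```

The redo log file has CR1 before CR2, while the in-memory slots were updated in the opposite order.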

Thanks for your help to understand this matter
kind regards
Martin]]></description>
		<content:encoded><![CDATA[<p>Hello</p>
<p>I have two questions related to the solution you suggest to the problem (in the update you added to the initial post). As in my previous post, let me use “commit record” as a shorthand for “the change vector describing the transaction table slot update”.</p>
<p>1) How do you deal with the scenario where the session owning the transaction terminates after it has posted LGWR but before it could update the transaction table slot from ‘commit waiting on lgwr’ to ‘committed&#8217;? Will PMON have to wait for LGWR to write the corresponding commit record to disk and then do the update (or roll back the transaction in case the log writer crashes)?</p>
<p>&#8211;</p>
<p>2) In the fourth (last) bullet point you mention a consistency problem. If I understand it correctly, the problem arises because multiple committing sessions might update their transaction table slots from ‘commit waiting on lgwr’ to ‘committed&#8217; in a different (temporal) order than the (physical) order of the corresponding commit records in the redo log file.</p>
<p>a) Is my understanding of the consistency problem correct?</p>
<p>b) (if yes to a) Why does this different ordering cause a consistency problem &#8211; could you please sketch a scenario showing how it can lead to inconsistent data?</p>
<p>c) If my understanding is correct and the different ordering does cause a consistency problem, am I correct in assuming that by &#8220;waiting (or watching) for it to acknowledge&#8221; you mean that the background process you mention waits for each woken process to update its transaction table slot before waking up the next process on the list (otherwise OS scheduling could mess up the order of the transaction table updates even though the processes are woken up in the right order)?</p>
<p>d) Point c) brought me to another thought regarding the current COMMIT mechanism. As I understand it, multiple sessions could update their transaction table slots (from ACTIVE to COMMITTED) in a different (temporal) order than the (physical) order of the corresponding commit records in the redo log file. This could happen as a result of OS scheduling; let me sketch the scenario:<br />
t1: Session 1 is about to commit Transaction TA1 and copies the commit record CR1 into the log buffer<br />
t2: Session 1 is taken off the CPU by the OS scheduler<br />
t3: Session 2 is about to commit Transaction TA2 and copies the commit record CR2 into the log buffer<br />
t4: Session 2 updates the transaction table slot TTS2 from ACTIVE to COMMITTED<br />
t5: Session 2 posts the log writer<br />
t6: the log writer writes the redo log buffer to the redo log file (and thus the commit records CR1, CR2 in this order)<br />
t7: Session 1 is put on the CPU again by the OS scheduler<br />
t8: Session 1 updates the transaction table slot TTS1 from ACTIVE to COMMITTED (and doesn&#8217;t post the log writer because the record has already been written)<br />
Now you have CR2 after CR1 in the redo log file but TTS1 was updated after TTS2 &#8211; can this lead to a consistency problem as well?</p>
<p>Thanks for your help to understand this matter<br />
kind regards<br />
Martin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Martin Maletinsky</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-46023</link>
		<dc:creator><![CDATA[Martin Maletinsky]]></dc:creator>
		<pubDate>Wed, 11 Apr 2012 20:12:01 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-46023</guid>
		<description><![CDATA[Hello

After reading about this problem (originally in your book &quot;Oracle Core&quot;) my first thought was the problem could easily be fixed by reordering the sequence of events at commit time, i.e. the session only updates the transaction table slot after receiving confirmation from the log writer that the redo has been written to disk.

From your comment in the update you added as well as from your answer to Martin Berger&#039;s post which you refer to in this update I understand that my approach would open another possibility for inconsistency - however, I don&#039;t understand why. 
To keep the text more readable, let me use &quot;commit record&quot; as a shorthand for &quot;the change vector describing the transaction table slot update&quot;. You say inconsistency could result if the instance crashed after the log writer has written the commit record to disk and before the session executing the transaction is informed about that write. However, if the instance crashes, I don&#039;t see what difference it makes whether the session managed to update the transaction table slot before the crash or not. The timing of this update is not deterministic anyway (even without a crash it may be delayed arbitrarily as a result of scheduling). Moreover, the update affects volatile memory and is lost after the crash anyway - it will, however, be recovered on restart of the database from the redo that was written to disk.

I see an issue, if the session terminates after the redo has been written to disk and before the session managed to update the transaction table slot. In this case PMON will roll back the open transaction and it is therefore necessary to prevent the transaction from being recovered from the redo logs after a subsequent instance crash. However, couldn&#039;t this be solved by extending the cleanup work performed by PMON upon termination of the session,  e.g. by having the PMON modify (delete) the commit record that was previously written to disk? I am aware that this would contradict the current philosophy of writing sequentially to the redo log and not having to modify redo data once it has been written in the past, however I can&#039;t see a fundamental reason against it. 

Thanks for any elucidation on this matter
kind regards
Martin]]></description>
		<content:encoded><![CDATA[<p>Hello</p>
<p>After reading about this problem (originally in your book &#8220;Oracle Core&#8221;) my first thought was the problem could easily be fixed by reordering the sequence of events at commit time, i.e. the session only updates the transaction table slot after receiving confirmation from the log writer that the redo has been written to disk.</p>
<p>From your comment in the update you added as well as from your answer to Martin Berger&#8217;s post which you refer to in this update I understand that my approach would open another possibility for inconsistency &#8211; however, I don&#8217;t understand why.<br />
To keep the text more readable, let me use &#8220;commit record&#8221; as a shorthand for &#8220;the change vector describing the transaction table slot update&#8221;. You say inconsistency could result if the instance crashed after the log writer has written the commit record to disk and before the session executing the transaction is informed about that write. However, if the instance crashes, I don&#8217;t see what difference it makes whether the session managed to update the transaction table slot before the crash or not. The timing of this update is not deterministic anyway (even without a crash it may be delayed arbitrarily as a result of scheduling). Moreover, the update affects volatile memory and is lost after the crash anyway &#8211; it will, however, be recovered on restart of the database from the redo that was written to disk.</p>
<p>I see an issue, if the session terminates after the redo has been written to disk and before the session managed to update the transaction table slot. In this case PMON will roll back the open transaction and it is therefore necessary to prevent the transaction from being recovered from the redo logs after a subsequent instance crash. However, couldn&#8217;t this be solved by extending the cleanup work performed by PMON upon termination of the session,  e.g. by having the PMON modify (delete) the commit record that was previously written to disk? I am aware that this would contradict the current philosophy of writing sequentially to the redo log and not having to modify redo data once it has been written in the past, however I can&#8217;t see a fundamental reason against it. </p>
<p>Thanks for any elucidation on this matter<br />
kind regards<br />
Martin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Martin Maletinsky</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-46022</link>
		<dc:creator><![CDATA[Martin Maletinsky]]></dc:creator>
		<pubDate>Wed, 11 Apr 2012 20:01:55 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-46022</guid>
		<description><![CDATA[Hello,

From what I understand, this behaviour means that any information that gets out of the database (whether read by a human user or communicated to another system) is unreliable in the sense that it may not survive an instance crash. If this is correct, it sounds like a serious issue to me, as also mentioned by Tony Hasler in his comment dated August 19, 2011. The potential consequences seem to be dramatic, as pointed out in Tony Hasler&#039;s blog &quot;Why is ACID important?&quot; (i.e. confirmed business transactions lost after an instance crash, messages sent out twice due to an instance crash). This, and the fact that Oracle makes a lot of effort to ensure the ACID properties in other situations (such as disabling the PL/SQL COMMIT optimization for distributed transactions), leads me to the following two questions:
1) Is the problem really as bad as it sounds to me or are there mitigating circumstances which I am ignoring?
2) Is there any official statement from Oracle Corp. regarding this matter since it was discussed publicly more than 6 months ago?

Thanks for any reply
kind regards
Martin]]></description>
		<content:encoded><![CDATA[<p>Hello,</p>
<p>From what I understand, this behaviour means that any information that gets out of the database (whether read by a human user or communicated to another system) is unreliable in the sense that it may not survive an instance crash. If this is correct, it sounds like a serious issue to me, as also mentioned by Tony Hasler in his comment dated August 19, 2011. The potential consequences seem to be dramatic, as pointed out in Tony Hasler&#8217;s blog &#8220;Why is ACID important?&#8221; (i.e. confirmed business transactions lost after an instance crash, messages sent out twice due to an instance crash). This, and the fact that Oracle makes a lot of effort to ensure the ACID properties in other situations (such as disabling the PL/SQL COMMIT optimization for distributed transactions), leads me to the following two questions:<br />
1) Is the problem really as bad as it sounds to me or are there mitigating circumstances which I am ignoring?<br />
2) Is there any official statement from Oracle Corp. regarding this matter since it was discussed publicly more than 6 months ago?</p>
<p>Thanks for any reply<br />
kind regards<br />
Martin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-45043</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Wed, 15 Feb 2012 02:31:57 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-45043</guid>
		<description><![CDATA[Hemant,

As you say, it&#039;s similar; but I don&#039;t think it&#039;s really the same because it looks like the (very old) problem of two RAC nodes operating briefly at different SCNs.  Alas, the speed of light is not infinite - and even smaller when not passing through a vacuum.]]></description>
		<content:encoded><![CDATA[<p>Hemant,</p>
<p>As you say, it&#8217;s similar; but I don&#8217;t think it&#8217;s really the same because it looks like the (very old) problem of two RAC nodes operating briefly at different SCNs.  Alas, the speed of light is not infinite &#8211; and even smaller when not passing through a vacuum.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hemant K Chitale</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-45022</link>
		<dc:creator><![CDATA[Hemant K Chitale]]></dc:creator>
		<pubDate>Tue, 14 Feb 2012 06:21:44 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-45022</guid>
		<description><![CDATA[Something similar when using AQs in RAC  : Oracle Support article : &quot;Query returns 0 rows after successful row  insert/message enqueue when using Tibco Application in a RAC environment [ID 1412774.1]&quot;]]></description>
		<content:encoded><![CDATA[<p>Something similar when using AQs in RAC  : Oracle Support article : &#8220;Query returns 0 rows after successful row  insert/message enqueue when using Tibco Application in a RAC environment [ID 1412774.1]&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ariq</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-42479</link>
		<dc:creator><![CDATA[Ariq]]></dc:creator>
		<pubDate>Mon, 21 Nov 2011 14:39:40 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-42479</guid>
		<description><![CDATA[Useful Blog

Thanks]]></description>
		<content:encoded><![CDATA[<p>Useful Blog</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: laimis (lnd)</title>
		<link>http://jonathanlewis.wordpress.com/2011/08/19/redo-2/#comment-41640</link>
		<dc:creator><![CDATA[laimis (lnd)]]></dc:creator>
		<pubDate>Tue, 06 Sep 2011 08:27:38 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=6962#comment-41640</guid>
		<description><![CDATA[&gt; An interesting, but hardly novel, approach.
&gt; Essentially, rather than keeping rows locked until the lazy committed transaction is durable the page containing the row (or rows) is marked DIRTIED_BY_LC_XACT. What is the difference?

Exactly. I only wanted to search the CS field for answers. Oracle is hardly the first to face this problem. It is important to identify it and try to apply some computer science to it. If no answer is found then maybe someone can do some :)

Personally I am interested in whether the WAIT by readers can be eliminated. That was always what Oracle was proud of - readers do not have to wait.
Sai suggested an elegant thing - turning the clock back for some SQL commands:

http://sai-oracle.blogspot.com/2011/08/is-oracle-acid-compliant-database.html

To me that looks very much like using a flashback query.

I just think a few things about this approach need to be investigated:
- locking. Commit not only makes transactions visible, it releases locks too.
- somehow it doesn&#039;t feel obvious to me that only some SQL commands (excluding DML) are subject to this FlushSCN. This thought is connected to the release of locks. It looks a bit suspicious that one type of command (even within the same transaction) may see later data than others.
- performance. That seems a minor issue actually.

Actually, this excellent blog by Saibabu Devabhaktuni (Sai) makes one think about how a SQL command (a query) obtains the CurrentSCN and its relationship to the CommitSCN. First, it looks like this is a serialization point, a protected critical section with respect to (at least) two operations: a query getting the CurrentSCN and a commit getting the CommitSCN (which is done by the user process I assume, not lgwr itself).
Then the need for the CommitSCN to be in the redo stream dictates the ordering of actions: first modify the in-memory structures (rollback blocks) to mark the transaction committed (and immediately visible to others, with locks released) and then flush the redo to disk.

This requirement leads almost directly to those two approaches we discussed here:

- locking and waiting for flush to complete

- adjusting time: using something other than the CurrentSCN for the QuerySCN. Saibabu Devabhaktuni&#039;s idea uses MVCC to the full and for that reason it looks very attractive and consistent with &quot;readers do not have to wait&quot;. It only has to be formally verified, and to me the correct way to think about it is to look more closely at these key timestamps:

-CommitSCN,
-FlushSCN,
-QuerySCN.
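
A toy model of how the three timestamps might relate (purely an illustration of the idea under my own assumptions, not Sai's design and not Oracle code; the Database class and its methods are made up): a query takes the FlushSCN rather than the CurrentSCN as its QuerySCN, so it never waits and never sees a commit whose redo is not yet on disk.

```python
class Database:
    """Hypothetical multi-version store with separate current and flush SCNs."""

    def __init__(self):
        self.current_scn = 0   # advanced by every commit
        self.flush_scn = 0     # highest SCN whose redo is assumed to be on disk
        self.versions = {}     # key -> list of (commit_scn, value), oldest first

    def commit(self, key, value):
        """Make a change visible in memory and return its CommitSCN."""
        self.current_scn += 1
        self.versions.setdefault(key, []).append((self.current_scn, value))
        return self.current_scn

    def flush(self):
        """Stand-in for the log writer catching up with all commits so far."""
        self.flush_scn = self.current_scn

    def query(self, key):
        """Read as of QuerySCN = FlushSCN, ignoring commits not yet flushed."""
        query_scn = self.flush_scn
        visible = [v for scn, v in self.versions.get(key, []) if scn <= query_scn]
        return visible[-1] if visible else None


db = Database()
db.commit("balance", 100)
db.flush()
db.commit("balance", 250)        # committed in memory, redo not yet flushed
before = db.query("balance")     # still sees the last durable version
db.flush()
after = db.query("balance")      # now sees the newer version
print(before, after)             # 100 250
```

The sketch deliberately ignores the open questions listed above, in particular lock release and whether DML should use the same QuerySCN.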

just my 2 cents....]]></description>
		<content:encoded><![CDATA[<p>&gt; An interesting, but hardly novel, approach.<br />
&gt; Essentially, rather than keeping rows locked until the lazy committed transaction is durable the page containing the row (or rows) is marked DIRTIED_BY_LC_XACT. What is the difference? </p>
<p>Exactly. I only wanted to search the CS field for answers. Oracle is hardly the first to face this problem. It is important to identify it and try to apply some computer science to it. If no answer is found then maybe someone can do some :)</p>
<p>Personally I am interested in whether the WAIT by readers can be eliminated. That was always what Oracle was proud of &#8211; readers do not have to wait.<br />
Sai suggested an elegant thing &#8211; turning the clock back for some SQL commands:</p>
<p><a href="http://sai-oracle.blogspot.com/2011/08/is-oracle-acid-compliant-database.html" rel="nofollow">http://sai-oracle.blogspot.com/2011/08/is-oracle-acid-compliant-database.html</a></p>
<p>To me that looks very much like using a flashback query. </p>
<p>I just think a few things about this approach need to be investigated:<br />
- locking. Commit not only makes transactions visible, it releases locks too.<br />
- somehow it doesn&#8217;t feel obvious to me that only some SQL commands (excluding DML) are subject to this FlushSCN. This thought is connected to the release of locks. It looks a bit suspicious that one type of command (even within the same transaction) may see later data than others.<br />
- performance. That seems a minor issue actually.</p>
<p>Actually, this excellent blog by Saibabu Devabhaktuni (Sai) makes one think about how a SQL command (a query) obtains the CurrentSCN and its relationship to the CommitSCN. First, it looks like this is a serialization point, a protected critical section with respect to (at least) two operations: a query getting the CurrentSCN and a commit getting the CommitSCN (which is done by the user process I assume, not lgwr itself).<br />
Then the need for the CommitSCN to be in the redo stream dictates the ordering of actions: first modify the in-memory structures (rollback blocks) to mark the transaction committed (and immediately visible to others, with locks released) and then flush the redo to disk.</p>
<p>This requirement leads almost directly to those two approaches we discussed here:</p>
<p>- locking and waiting for flush to complete</p>
<p>- adjusting time: using something other than the CurrentSCN for the QuerySCN. Saibabu Devabhaktuni&#8217;s idea uses MVCC to the full and for that reason it looks very attractive and consistent with &#8220;readers do not have to wait&#8221;. It only has to be formally verified, and to me the correct way to think about it is to look more closely at these key timestamps:</p>
<p>-CommitSCN,<br />
-FlushSCN,<br />
-QuerySCN.</p>
<p>just my 2 cents&#8230;.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
