Thursday, February 16, 2012

Can a rollback be delayed?

Hi all, thanks for reading.

Bit of a puzzler...

A customer of ours has reported a intermittent error using one of our applications to receive some stock.

Under normal circumstances the application updates about 10 tables in a single transaction, inserting to some, updating others. After the stock receipt they print off a Crystal report against the receipt data.

The customer is saying that (very) occasionally they perform a receipt (no errors) and print off the report (no problems) yet when they come to access the data some time later it is almost as if no receipt has been made.

I have seen their data and can see what they mean. None of the expected data updates seem to have taken place. And yet there is the report sat there, indicating that it must have...

Looking closely at the data I can see breaks in the sequencing in certain identity column-carrying tables, during the period when the stock receipt was made. To all intents and purposes, therefore, it looks as though a rollback has occurred.

Is this possible? Could a Crystal report show uncommitted data which is then rolled back? How 'long' can a rollback take? Can it be initiated in some other way?

NB: We have done all the obvious things like:
- checking they have received against the correct DB;
- checking that no app or procedure can remove data in this way.
- verified that the report is reporting from the correct place.
etc.

The 'missing' data is so perfectly removed, my instinct says it must be a rollback but I can't see how this can be (yet).

All suggestions gratefully received - how could I track down whether this was occurring? Or is it my fevered imagination?

Many thanks!!!!

pmb

ps: According to our records, this customer is running SQL 7 SP3.The short answer is "it depends".

The longer (less infuriating) answer is that in order for the rollback to remove the transaction, the transaction must be open during the entire time. This will have the effect that no other SPID can access any of the modified data until the transaction is committed or rolled back (i.e. a blocking problem). OK, sounds like you knew that already, but the practical upshot of all that is that the situation you describe is technically possible provided it all happens on the same connection. If the report is generated from the exact same place (and connection) as the transaction occurred on, then it is possible if one more piece falls exactly into place. The connection must end abnormally without committing the changes. An abnormal end to a connection is generally treated as a rollback, and would explain the situation above, but as I said before, if this was the case, you would very likely see a lot more blocking problems than lost data problems.|||Thanks for the response!

The transactions in our apps are automatically bracketed by 'begin'/'commit transaction' calls as part of screen handling architecture. (ie: the user presses an on-screen button, a 'begin transaction' is fired off, the guts of the program does its SQL business and the final step then fires the 'commit transaction'). Any SQL errors encountered betweentimes are intercepted and displayed to screen (with a 'rollback transaction' performed before program abort).

To the best of my knowledge, no errors are encountered during the stock receipt process, so it looks like the commit goes ok. Following the receipt, the user then spins up our Crystal front-end and runs the relevant report. I would have thought that this Crystal report step would be treated as a completely separate SQL process and thus the report would only be able 'see' the amended data from the previous process once it was committed.

Is there some kind of 'dirty read' (or even 'dirty write') SQL setting that could be happening here? Assuming such a setting exists and is in use here, how would I find out? And how would such logic operate with a rollback?|||It is possible that Crystal Reports is connecting and setting their session to READ_UNCOMMITTED, or using nolock hints, but that would not make a lot of sense (at least to me). You should be able to find out if Crystal is doing that, by running a few Profiler traces.

My money is still on something/someone is deleting the data. It may be via Query Analyzer, rather than through your application, or even an MS Office application accessing the tables directly. From what you say about the architecture of the application, it sounds like the commit/rollback is instantaneous, and does not wait for any user interaction.|||Thanks, fellow drone. Much appreciated. I'll look into the read uncommitted side of things.

On the basis that a report *could* 'see' uncommitted data, I'm still left with the issue of why the rollback is occurring. That's my own problem of course (assuming there is no possibility of this being a SQL server-level issue).

So, a question - if I wrote a trigger to store off transaction details to a logging table (say, writing off @.@.IDENTITY) to 'prove' the reciept was taking place), if a rollback occurs on the triggering table, will my logging table updates also be rolled back?

I'm assuming the answer to the above is 'YES', so am open to suggestions on how to track the problem. (NB: Running Profiler not really an option. On average the issue occurs once a fortnight!)|||That trigger idea made me wonder about something.

create table test1
(col1 int identity (1, 1),
col2 varchar(10))

insert into test1 (col2) values ('hello')
select * from test1
begin transaction

insert into test1 (col2) values ('hi')

select * from test1

begin transaction
update test1
set col2 = 'bye'
commit transaction

select * from test1

rollback transaction
select * from test1

drop table test1

A rollback statement will rollback any transactions committed within any transaction before it. Maybe you should have something check @.@.trancount and save that to a more durable location (say a text log?) Maybe there is some problem when the planets are aligned just right, you have some transaction still going from somewhere else. Still, I would expect that to cause huge blocking problems before lost data problems.

No comments:

Post a Comment