Tell me how wrong and stupid I am about T-SQL transactions



  • The app I'm working on has the potential to open nested transactions and close them in the wrong order. (We're not sure if it actually does this or not, but it has the potential to do it due to how its structured.) My boss's boss added in an except that is thrown (right now) any time a nested transaction happens-- but that's not really the problem here. And his debug code fires so often it's practically useless because devs always end up turning it off.

    So what I'd like to do is fix the check so that it only identifies situations where nested transactions are closed in the wrong order. (To be specific, I mean the application opens Trans A, then Trans B, then closes Trans A before closing Trans B.) This is a typical WebAPI application running on ASP.net 4.5 and IIS 7.0.

    My "solution" in psuedo-code:

    1. Create a static Stack of transactions in our class that wraps the whole SQL Connection transaction thing (let's call it CustomTransaction). Marked as ThreadStatic.

    2. Every time a transaction is opened, push the current CustomTransaction class to the top of the stack.

    3. Every time a transaction is closed, pop the top of the stack and check to see if it's equal to the current CustomTransaction--

      a. If so, done, no problem
      b. If not, check if it's a debug build and throw an exception if so
      
    4. Also keep a counter of the "deepest" transaction level for the SQL stats.

    So, tell me how stupid this idea is! Specifically, does this interact with SQL connection pooling in any weird undesirable way? Also for a web server, is ThreadStatic sufficient, or will IIS run multiple different page requests in a single thread? (I don't think it will...) What is something I'm not even thinking of that will break this code horribly?


  • ♿ (Parody)

    That sounds about like how I'd imagine approaching the problem.


  • Garbage Person

    The correct solution is to shoot the person that architected you into that hole and then do exactly that.


  • ♿ (Parody)

    Well, fair enough. I've never felt the need for nested transactions and all the rest, but...sometimes you gotta play the WTFs you're dealt.


  • FoxDev

    Nested transactions are TRWTF… but you can't control that in this scenario. So I'd say your solution is pretty much on the money.



  • I don't see the problem being nested transactions as a feature. One of our users might perform an action that requires a Blah to be created but a Blah contains a Fleb. The helper functions that create those each open a transaction, because creating them involves multiple inserts and it would be a bigger WTF if they didn't. Both classes need their transactions because a different operation might require only creating a Fleb without a Blah.



  • OK well I'm gonna start implementing this after lunch, so speak now or forever hear my WTFs.



  • Tell me how wrong and stupid I am about T-SQL transactions

    You're wrong and fucking stupid.

    Well, actually you're pretty much on the best track - it's obviously a stack problem, and I don't think the solution can be much improved (unless you can somehow manage the transactions to guarantee the order? dunno).

    But hey, since you asked.

    As for threading, ASP.NET / IIS seems to do weird stuff with it when you use asynchronous stuff, so you can end up jumping between threads, and probably having another request assigned to the same thread in the meantime. Again, not an expert, but it's probably worth checking out a bit more.

    @blakeyrat said:

    One of our users might perform an action that requires a Blah to be created but a Blah contains a Fleb. The helper functions that create those each open a transaction, because creating them involves multiple inserts and it would be a bigger WTF if they didn't. Both classes need their transactions because a different operation might require only creating a Fleb without a Blah.

    But I suppose you can't end "constructing Blah" until Fleb gets fully constructed in this case? I'm still not sure how you end up with out-of-order transactions, but I'm wondering if you're not actually pissing on a forest fire.



  • @Maciejasjmj said:

    I'm still not sure how you end up with out-of-order transactions, but I'm wondering if you're not actually pissing on a forest fire.

    I have no idea if the code is doing that at all, as I've already stated.

    I just know the check my boss put in for it is:

    1. Incorrect (giving a false-positive for about 50,000 harmless nested transactions)
    2. Annoying, enough that it gets turned-off during development, which makes it altogether useless.


  • Either don't use transactions at all, or start a single transaction before the first action and end after the last. TransactionScope was invented for exactly this situation.



  • @Jaime said:

    Either don't use transactions at all,

    That's a bad idea.

    @Jaime said:

    or start a single transaction before the first action and end after the last. TransactionScope was invented for exactly this situation.

    I don't have the time or inclination to refactor the entire codebase.



  • @blakeyrat said:

    I have no idea if the code is doing that at all, as I've already stated.

    I just know the check my boss put in for it is:1) Incorrect (giving a false-positive for about 50,000 harmless nested transactions)2) Annoying, enough that it gets turned-off during development, which makes it altogether useless.

    Put a proper check in which logs out-of-order transactions somewhere, then run a load simulator on the API? Same goes for the thread staticness and all other issues - given the high enough load, you'll probably end up running into all sorts of problems.


  • FoxDev

    @blakeyrat said:

    speak now or forever hear my WTFs

    To be fair, we'd be hearing your WTFs even if we didn't speak up. But then that's what this site is for, talking about WTFs 😄



  • @Maciejasjmj said:

    Put a proper check in which logs out-of-order transactions somewhere,

    Read the OP.

    @Maciejasjmj said:

    then run a load simulator on the API?

    It's not load that's the problem, it's breadth. There are probably 300+ API endpoints, most of which will write more than one database object (the specific amount varying with what's passed-in.)

    @Maciejasjmj said:

    Same goes for the thread staticness and all other issues - given the high enough load, you'll probably end up running into all sorts of problems.

    I was hoping someone could simply provide a link that definitively said, "IIS runs one page view per thread at a time, EVER." Which I'm 99% sure it does, but I can't find any guarantee.



  • @blakeyrat said:

    Read the OP.

    Yeah, as I said, good idea. Don't know how your prod environment looks like, but maybe you'll be able to get away with logging there too.

    @blakeyrat said:

    It's not load that's the problem, it's breadth. There are probably 300+ API endpoints, most of which will write more than one database object (the specific amount varying with what's passed-in.)

    foreach (var endpoint in Endpoints)... I mean, you can just throw an exception in a debug build and hope that someone hits that one endpoint with the specific request which causes transactions to go out of whack, but I'd personally start doing some actual testing for that.

    @blakeyrat said:

    I was hoping someone could simply provide a link that definitively said, "IIS runs one page view per thread at a time, EVER." Which I'm 99% sure it does, but I can't find any guarantee.

    From my link:

    If all the modules and handlers in the pipeline are synchronous, the request will execute on a single CLR ThreadPool thread.

    which seems to imply it also won't be preempted out of that thread until it finishes. If you do have asynchronous stuff along the way, though, I think it throws all the single thread guarantees out of the window due to the pooling.



  • There is precisely one controller which uses async, and it doesn't trigger the existing nested transaction check at all, so I should be good there.

    If all the modules and handlers in the pipeline are synchronous, the request will execute on a single CLR ThreadPool thread.

    Right I already knew that, but I'm looking at the opposite/converse of that statement. ("A single ThreadPool thread will execute only one request at a time.")

    The more I think about it, the more I'm thinking, "well how could it possibly execute more than one request at a time on a single thread?" so maybe it's a non-issue, but like I said, I'd like to see a guarantee in the documentation somewhere.



  • @blakeyrat said:

    The more I think about it, the more I'm thinking, "well how could it possibly execute more than one request at a time on a single thread?" so maybe it's a non-issue, but like I said, I'd like to see a guarantee in the documentation somewhere.

    Yeah, I thought that's pretty much a given, but opinions differ.

    How about just slapping it into HttpRequestMessage properties, or however WebAPI manages request contexts?



  • @Maciejasjmj said:

    How about just slapping it into HttpRequestMessage properties, or however WebAPI manages request contexts?

    Because the layer that deals with the database isn't only used by the WebAPI.

    Thanks for that article, though, it looks helpful.



  • @blakeyrat said:

    Because the layer that deals with the database isn't only used by the WebAPI.

    Further reading, here in context of ASP.NET. In short, "fuck if anybody knows, it kinda seems like you can be preempted between lifecycle events and not in the middle of the action, but it's not like it's documented anywhere".

    At that point, I'm wondering if a static ConcurrentDictionary mapping some sort of IDs assigned at the beginning of request to the stack objects is feasible, which probably means deep shit now.



  • Holy shit:

    What the output shows is that - for the second request - the BeginRequest events in the HttpModule pipeline and the page constructor fire on one thread, but the Page_Load fires on another. The second thread has had the HttpContext migrated from the first, but not the CallContext or the ThreadStatic's (NB: since HttpContext is itself stored in CallContext, this means ASP.Net is explicitly migrating the HttpContext across).

    That pretty much makes adding a System.Web reference to our data layer as the only possibility. Ugh.

    (It also means the current check, which uses ThreadStatic, is wrong.)



  • @blakeyrat said:

    That pretty much makes adding a System.Web reference to our data layer as the only possibility. Ugh.

    Either that, or you roll your own context manager shared between threads, it seems.



  • It looks like the CallContext class added in .net 4 is my answer here. It has a mode where it can be used if a thread "migrates".



  • Here's what I ended up with, the problem is I have no idea if it actually works-- time to create this bug on purpose!

        private void StoreTransactionOrder()
        {
            if (!Debugger.IsAttached)
            {
                return;
            }
    
            // Add this transaction to the per-thread stack of open transactions
            var transactionStack = (Stack<CustomTransaction>)CallContext.LogicalGetData("transactionStack");
    
            if (transactionStack == null)
            {
                transactionStack = new Stack<CustomTransaction>();
            }
    
            transactionStack.Push(this);
    
            if (transactionStack.Count > _deepestNestedTransaction)
            {
                _deepestNestedTransaction = transactionStack.Count;
            }
    
            CallContext.LogicalSetData("transactionStack", transactionStack);
        }
    
        private void CheckTransactionOrder()
        {
            if (!Debugger.IsAttached)
            {
                return;
            }
            
            var transactionStack = (Stack<CustomTransaction>)CallContext.LogicalGetData("transactionStack");
    
            if (transactionStack.IsNullOrEmpty())
            {
                throw new NotSupportedException("SQL transaction closed before being opened.");
            }
    
            var lastTransaction = transactionStack.Pop();
    
            if (lastTransaction != this)
            {
                throw new NotSupportedException("SQL parent transaction closed before nested transaction.");
            }
        }
    

    EDIT: she works! At least if I manually trigger the error in a console app.



  • @blakeyrat said:

    It looks like the CallContext class added in .net 4 is my answer here

    From Maciejasjmj's reference:

    What the output shows is that - for the second request - the BeginRequest events in the HttpModule pipeline and the page constructor fire on one thread, but the Page_Load fires on another. The second thread has had the HttpContext migrated from the first, but not the CallContext or the ThreadStatic's (NB: since HttpContext is itself stored in CallContext, this means ASP.Net is explicitly migrating the HttpContext across). Let's just spell this out again:
    Looks like CallContext doesn't solve the threading problem.


  • Yeah, I believe the LogicalGetData() / LogicalSetData() as opposed to GetData() / SetData() are the part that does the magic? Kinda? No idea.

    Set the ThreadPool size to one and try to knock the app out with API requests is my best solution.



  • @Jaime said:

    Looks like CallContext doesn't solve the threading problem.

    From my understanding reading the CallContext documentation, when a thread is migrated there's a "LogicalCallContext" that gets migrated along with it. (The normal CallContext does not. Yes this is weird and confusing and probably also dumb.)

    So I'm being careful to use the Logical calls to the CallContext.



  • @Maciejasjmj said:

    Set the ThreadPool size to one and try to knock the app out with API requests is my best solution.

    Yeah I might do testing like that tomorrow; right now I did a write-up and put in a pull request to see if the other back-end devs here have any opinion on it. I don't want to bother a more thorough test if it's just going to be kiboshed.



  • @blakeyrat said:

    From my understanding reading the CallContext documentation, when a thread is migrated there's a "LogicalCallContext" that gets migrated along with it. (The normal CallContext does not. Yes this is weird and confusing and probably also dumb.)

    This MSDN Blog post talks about the issue of threading and it uses LogicalGetData. The fact that he doesn't simply suggest to use LogicalGetData and be done with it suggests that he had a threading problem even when using that method.



  • This is hairy. For example, look at this SO answer:

    Which says that LogicalGetData does travel around as the thread does. But the article he links to as a source doesn't load for me, so.

    John Skeet also implies this works in his comments in this post:

    Hm.

    BTW a co-worker looked at my code and told me my CallContext stuff should be put in a singleton. Huh? I asked why, and he said, "so if Microsoft changes the behavior of CallContext you only need to change the code in one place." WTF.


  • Discourse touched me in a no-no place

    @blakeyrat said:

    BTW a co-worker looked at my code and told me my CallContext stuff should be put in a singleton. Huh? I asked why, and he said, "so if Microsoft changes the behavior of CallContext you only need to change the code in one place." WTF.

    It's late. I read that as putting it in a simpleton. That's logically wrong. Or morally wrong (though possibly satisfying in this case).


  • Banned

    What you really need is a database context construct, then you don't have to debug through all this stuff.



  • I don't see how that addresses the problem-- your solution still relies on having some kind of thread-local context class. (You're using HttpContext.) Well, great, but... uh... I don't got one. At least not one that exists in both the ASP.net and the non-ASP.net applications using this exact same data layer.



  • BTW this blog post: http://blog.stephencleary.com/2013/04/implicit-async-context-asynclocal.html I've seen referenced by many people on this issue also states you should only use Immutable objects, which my code is not doing. Hm. So it might be broken there.

    This whole thing is confusing beyond belief.



  • This is why I explicitly pass connection (and if necessary transaction) parameters to methods that do database actions. IoC, Aspect-Oriented Programming and context-style connections all introduce problems when trying to control transactions. You can also run into problems with two methods using the same connection simultaneously. The latter was mostly done away with when MARS was introduced.



  • Yes yes I appreciate the preachy-ness, but I don't have the time or inclination to refactor this entire app. The purpose of this change is to identify real genuine problems with data integrity, not to waste time dicking around with (and possibly introducing bugs into) working code.



  • @blakeyrat said:

    dicking around with (and possibly introducing bugs into) working code.

    Well you still have to poke around that code to have it use your stack, I suppose?

    Actually, some IoC-thingy might just work - wrap the stack into an object which throws when you add or remove wrong stuff to it, construct it before you delegate work to the data layer and pass it around as a parameter.



  • @Maciejasjmj said:

    Actually, some IoC-thingy might just work - wrap the stack into an object which throws when you add or remove wrong stuff to it, construct it before you delegate work to the data layer and pass it around as a parameter.

    I feel like I'm talking Martian here.

    How do I do that without touching every single one of the 300 or so API endpoints separately?



  • Your architecture is built on an incorrect assumption (the assumption that IIS won't do nasty things to threads). There are no good answers.

    Actually... how about self-hosting the WebAPI service?



  • @blakeyrat said:

    How do I do that without touching every single one of the 300 or so API endpoints separately?

    Okay then, construct it just before you start work in the DB layer. I have no idea how your app is structured, but since you've made your Store/Check methods private, I think there's a single class in which all the start/end transaction events occur? Try somewhere around there?

    Look, I'm just giving you the idea that function locals are generally exempt from threading problems. What you do with it... well, you're the programmer.



  • I know that. I don't understand how it helps me. There could be 50 CustomTransaction classes created at various points, none of which know about any others. How do function variables help me? The only thing they all have in common is being in the same thread.

    Maybe I am stupid. Ok; fine. Either help Dumbo McDummyface solve the problem, or don't. Cryptic clues are not appreciated.


  • Banned

    I get that you don't want to redesign it all. that said non-asp.net stuff can still have the same concept of context using thread local storage or whatever other backend you need to provide. you are right though, that is describing a mountain of change not a debugging trick.

    To debug this stuff I would actually hit it from SQL profiler and intercept into all calls made to the db adding some level of diagnostics in a comment. So you can see:

    -- process id: 123 -- app name: something
    select * from bla
    

    There are a few .net sql profilers out there that may be able to give you this kind of visibility afaik, been a while though.



  • We don't have a way of running through every codepath. The idea is we install this to point out errors (if any exist) and fix them as we see them.



  • @blakeyrat said:

    There could be 50 CustomTransaction classes created at various points, none of which know about any others.

    Okay well that sucks, I had a bit of a different idea about your architecture. Never mind.

    Just thought of something else - you need it only for debugging purposes, right? I'd need to test it, but the vibe I get from all the articles is that thread jumping occurs rather rarely - so why not store a thread ID with the stack in the whateverContext, and on pop check if it still matches - if it doesn't, that means we got jumped and the data we've got is meaningless, so we don't bother with an exception.

    Might not be perfect, but might be good enough?



  • @blakeyrat said:

    nested transactions

    My understanding was that there ain't no such thing.



  • Well, about half the articles I found say what I have works. Since, like you say, the code is only run when a debugger is attached, I think I'll declare what I have good.



  • I know that of course. How does it help my problem? Oh? You're just trying to point out how smart you are? Well I'm very impressed, you are numero uno smarty man. Now do you have any actual help to offer?



  • @blakeyrat said:

    Since, like you say, the code is only run when a debugger is attached, I think I'll declare what I have good.

    Yep, sounds about right. I'd still add at least a loop over the stack checking if this is anywhere there to avoid false positives, but whatever works for you.



  • @blakeyrat said:

    Now do you have any actual help to offer?

    It depends on how the app talks to the database. Are we dealing with ad-hoc queries, or is everything neatly arranged inside sprocs? Are transactions initiated in client code, or within the ad-hoc queries/stored procedures? Are savepoints used?

    If business process A requires executing business process B, but process B can be run independently, then process B either has to use something like this template to respect its caller, or process A has to check @@trancount after calling process B to make sure the rug hasn't been pulled out from under it.

    If every operation gets to its COMMIT without a hitch, and @@trancount = 0 at the end, everything's fine. The problem cases are when any process issues a rollback.

    Keeping a stack of transactions might help catch open transactions, but it seems like it'd be easier just to check @@trancount and throw if it doesn't match the expected value at that point. If it's 1 at the end of an operation, or if it's 0 in the middle of an operation, you know there's a problem.



  • @Groaner said:

    but it seems like it'd be easier just to check @@trancount and throw if it doesn't match the expected value at that point. If it's 1 at the end of an operation, or if it's 0 in the middle of an operation, you know there's a problem.

    Doesn't meet his needs. Consider this

    void Withdraw(int accountNum, decimal amount)
    {
      // begin transaction
      // write entry to account ledger
      // decrement account balance
      // end transaction
    }
    
    void Deposit(int accountNum, decimal amount)
    {
      // begin transaction
      // write entry to account ledger
      // increment account balance
      // end transaction
    }
    
    void Transfer(int sourceAccountNum, int destAccountNum, decimal amount)
    {
      // begin transaction
      Withdraw(sourceAccountNum, amount);
      Deposit(destAccountNum, amount);
      // end transaction
    }
    

    Neither Withdraw() nor Deposit() know what their @@trancount should be since it will be 0 if the operation is a stand-alone operation, but 1 if it is called as part of a Transfer().

    Whether you use sprocs, ad-hoc queries, client transactions, or garden gnomes, writing reusable code means that an operation will be sometimes be called within an outer transaction and sometime called independently. You provided a solution that is appropriate for an application with all business code that affects data inside sprocs, but not everyone writes code like that. Blakeyrat is looking for a solution that works where the business logic is implemented where god intended - in the business layer of the application. Telling him to reimplement will get you the same answer he gave me - "No, I'm not going to write the entire application differently". The difference is that I suggested he use a different pattern, and I was hoping he saw the possibility of maybe introducing a second pattern and switching to it over time if it worked better for him. You are suggesting he move all of his code to another platform and write it in another language. And a much crappier language.



  • @Jaime said:

    Neither Withdraw() nor Deposit() know what their @@trancount should be since it will be 0 if the operation is a stand-alone operation, but 1 if it is called as part of a Transfer().

    You can also store @@trancount in a variable before it gets incremented with the first BEGIN TRANSACTION and then compare it to the value at exit.

    @Jaime said:

    Whether you use sprocs, ad-hoc queries, client transactions, or garden gnomes, writing reusable code means that an operation will be sometimes be called within an outer transaction and sometime called independently.

    This is true.

    @Jaime said:

    You provided a solution that is appropriate for an application with all business code that affects data inside sprocs, but not everyone writes code like that.

    This is also probably true.

    @Jaime said:

    Blakeyrat is looking for a solution that works where the business logic is implemented where god intended - in the business layer of the application.

    If it's talking to a SQL database, at some point, there's going to be SQL involved. The business layer need not concern itself with transactions, but I would expect the data layer to know something about them and know when something's gone wrong.

    @Jaime said:

    Telling him to reimplement will get you the same answer he gave me - "No, I'm not going to write the entire application differently". The difference is that I suggested he use a different pattern, and I was hoping he saw the possibility of maybe introducing a second pattern and switching to it over time if it worked better for him.

    Any proposals here are going to require changing some amount of code. Adding a couple checks on @@trancount to methods isn't very different from adding a transaction stack to them.

    @Jaime said:

    You are suggesting he move all of his code to another platform and write it in another language.

    Not necessarily. If the data layer provides some way of getting the transaction count for the current session, it could be done there instead. If value in = value out, everything's good. If value in != value out, we have a problem. Creating a transaction stack to keep track of that feels like reinventing the wheel, unless there are complicated rules for partial rollbacks/savepoints/etc.

    @Jaime said:

    And a much crappier language.

    T-SQL is a terrible general-purpose language. But it's damn good for pushing data around.


Log in to reply