SQL Server Service Broker – Error Handling


This post is part of a series on this blog that will explore SQL Server Service Broker, a native messaging and queueing technology built into the SQL Server Database Engine.

Previous posts:

In this installment, we discuss error handling in Service Broker applications as well as a noir-sounding concept called the poison message.

Poison Messages

At the end of last week’s post I raised a particular scenario.  When receiving a message from the queue while inside a transaction, if we experience an error and the transaction rolls back, the message gets put back in the queue.  And the next time we read from the queue, we’ll receive that same message again.  So, what happens if we meet up with that same error again? And again?  And again?  Are you picking up what I’m layin’ down?

This type of situation, a message that can never be processed successfully, is known as a poison message.  The name kind of makes it sound like there’s a problem with the message itself.  And there might be.  Perhaps the message format is wrong for what the receiving code was expecting.  But maybe the problem is with the receiving code itself.  Regardless of what causes the poison message, it has to be dealt with.

SQL Server has a built-in mechanism for handling poison messages.  If a transaction that receives a message rolls back 5 times, SQL Server will disable the queue.  So that means that all processing that depends on that queue will cease.  Nice, huh?  Because of this, it behooves you to make sure you include proper error handling in your message processing code.  And how exactly you handle errors will depend on several factors:

  • Should the message processing be retried?  If the error was a deadlock, then retrying is appropriate, because it should eventually succeed.
  • Should the whole transaction be rolled back or just part of it?  You can use savepoints to rollback all logic except the receive, for instance.
  • Should the conversation be ended?  In the case of a poison message, it’s common to commit the receipt of the message (to remove it from the queue) and end the conversation with an error.  This notifies the sender service that something went wrong on the receiver side.  However, if the sender doesn’t care about the success or failure of the receiver’s processing, you may choose to log the error and commit the transaction without ending the conversation.
  • What logic does the sender need to perform if the receiver gets an error?  This is where things can get sticky.  It’s relatively easy to code error handling on the receiver side.  But what if there’s logic on the sender side that needs to be undone to complete the “rollback”?  Now we need to include error handling in the receiving code that notifies the sending service of the error and we need receiving code on the sender service that will handle the error and perform any necessary logic.

Best Practices

Asynchronous applications can get pretty complex.  I know you were probably hoping for some example code on proper error handling.  But the thing is, so much is dependent on how your specific implementation has to function.  What I can share are some best practices.  Here are some of my recommendations, in no particular order:

  • Map it out before writing one bit of code – If you don’t have a clear picture of how your application logic flows, you simply won’t code an efficient and robust app.  This should be a no-brainer, but even I’ve fallen victim to the urge to start coding before I’ve mapped out a clear picture of the logical flow.  You’ve heard “measure twice, cut once”, well this is the developer’s version of that.
  • Do validation on the sending side – Does your receiver assume the message will be in xml format?  Make sure you’re casting your message as xml at the sender.  Does your receiver require a specific format?  Consider using a specific xml schema.  Performing as much validation as possible on the sender side not only helps prevent a lot of receiver errors, it also lessens the workload of the receiving service, which means better message processing performance.
  • Keep it simple – The less complex your receiving code, the less opportunity there is for errors.  If this is a one-way application, you might even consider something as simple as receiving messages from the queue and inserting them into another table to await subsequent processing.  But even with something this simple…
  • Always include TRY/CATCH blocks in your receiving procedure.

Conclusion

I’d love to hear from others who are using Service Broker.  How do you handle errors in your application?  Any tips or tricks to share?  Leave your thoughts in the comments.

Also recommended:


Leave a comment

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

4 thoughts on “SQL Server Service Broker – Error Handling

  • Scott

    Great article Collen, I have used SB in anger with only one client and we had an issue with Service Broke bloat. Because it seems to be an under used feature of SQL Server finding troubleshooting documentation on it is quite tricky. Anyway SB bloat is where some internal messages don’t have a message expiry set so the default is like a BILLION seconds (or 70 years in the future!), so you find your database growing day by day (in my client’s case 4GB per day!) but we weren’t seeing optimize jobs taking longer such as re-indexing as this is internal SQL Engine data tables – however the daily backup was certainly taking longer!

    Then if you try and remove the messages with a basic delete this can cause further issues and mess up the SB entirely you have to close the conversation/message softly then remove – and not easy in a production environment, we had to batch it overnight.

    Thankfully I found some information here (thank you Remus!) http://rusanu.com/2005/12/20/troubleshooting-dialogs/

    Rgds,

    Scott

  • Scott

    Great article Colleen, I have used SB in anger with only one client and we had an issue with Service Broke bloat. Because it seems to be an under used feature of SQL Server finding troubleshooting documentation on it is quite tricky. Anyway SB bloat is where some internal messages don’t have a message expiry set so the default is like a BILLION seconds (or 70 years in the future!), so you find your database growing day by day (in my client’s case 4GB per day!) but we weren’t seeing optimize jobs taking longer such as re-indexing as this is internal SQL Engine data tables – however the daily backup was certainly taking longer!

    Then if you try and remove the messages with a basic delete this can cause further issues and mess up the SB entirely you have to close the conversation/message softly then remove – and not easy in a production environment, we had to batch it overnight.

    Thankfully I found some information here (thank you Remus!) http://rusanu.com/2005/12/20/troubleshooting-dialogs/

    Rgds,

    Scott