I recently got an email from yet another developer about an application that would sporadically die for no apparent reason. The application had logic to retry automatically, and would then run without error. Given the intermittent nature of the problem, and the fact that the process was terminating, rather than hanging or timing out, to me it smacked of deadlocking. So I decided to turn on trace flags 1204 and 3605 to confirm my suspicions.
DBCC TRACEON (1204, 3605, -1)
Flag 1204 records deadlock information. Flag 3605 sends it to the error log. And flag -1 turns the trace on for all sessions, not just the current session starting the trace.
I enabled the trace flags and sure enough, deadlocks.
Once a deadlock occurs and the trace information is written to the errorlog, actually making sense of it can be tricky. I’ve found that, rather than trying to view it through Management Studio, make a copy of the actual ERRORLOG file and view it in Notepad or Wordpad. It makes the information much more readable. Let’s take a look at a typical deadlock trace output.
2011-02-25 16:00:45.96 spid4s Deadlock encountered …. Printing deadlock information
2011-02-25 16:00:45.96 spid4s Wait-for graph
2011-02-25 16:00:45.96 spid4s
2011-02-25 16:00:45.96 spid4s Node:1
2011-02-25 16:00:45.96 spid4s PAGE: 75:1:5994649 CleanCnt:3 Mode:IX Flags: 0x2
2011-02-25 16:00:45.96 spid4s Grant List 0:
2011-02-25 16:00:45.96 spid4s Owner:0x48A1FCE0 Mode: IX Flg:0x0 Ref:0 Life:02000000 SPID:981 ECID:0 XactLockInfo: 0x4173350C
2011-02-25 16:00:45.96 spid4s SPID: 981 ECID: 0 Statement Type: UPDATE Line #: 1
2011-02-25 16:00:45.96 spid4s Input Buf: Language Event: UPDATE MyTable SET MyPatApplication = NULL WHERE MyPatApplication = ” AND MySendNotice = -1
2011-02-25 16:00:45.96 spid4s Requested By:
2011-02-25 16:00:45.96 spid4s ResType:LockOwner Stype:’OR’Xdes:0x1E202E00 Mode: S SPID:1122 BatchID:0 ECID:0 TaskProxy:(0x196BC378) Value:0x4a7f5d80 Cost:(0/4232)
2011-02-25 16:00:45.96 spid4s
2011-02-25 16:00:45.96 spid4s Node:2
2011-02-25 16:00:45.96 spid4s OBJECT: 75:1886017850:0 CleanCnt:2 Mode:X Flags: 0x0
2011-02-25 16:00:45.96 spid4s Grant List 1:
2011-02-25 16:00:45.96 spid4s Owner:0x1EC887A0 Mode: X Flg:0x0 Ref:22 Life:02000000 SPID:1122 ECID:0 XactLockInfo: 0x1E202E24
2011-02-25 16:00:45.97 spid4s SPID: 1122 ECID: 0 Statement Type: SELECT Line #: 1
2011-02-25 16:00:45.97 spid4s Input Buf: Language Event: SELECT MyTable.MyFolderID, CLName1, MDesc, MNarrative, MArrangement FROM MyTable
2011-02-25 16:00:45.97 spid4s Requested By:
2011-02-25 16:00:45.97 spid4s ResType:LockOwner Stype:’OR’Xdes:0x417334E8 Mode: IX SPID:981 BatchID:0 ECID:0 TaskProxy:(0x4AD92378) Value:0x131f63c0 Cost:(0/28112)
2011-02-25 16:00:45.97 spid4s
2011-02-25 16:00:45.97 spid4s Victim Resource Owner:
2011-02-25 16:00:45.97 spid4s ResType:LockOwner Stype:’OR’Xdes:0x1E202E00 Mode: S SPID:1122 BatchID:0 ECID:0 TaskProxy:(0x196BC378) Value:0x4a7f5d80 Cost:(0/4232)
The first thing we see is that this deadlock occurred between 2 processes, Node 1 and Node 2. Deadlocks can happen between more than 2 processes as well, and I’ll show an example of that a little later. For now, we’ll focus on Node 1 of this deadlock. I’m not going to go into every bit of information, but I’ll show you the juicy bits.
2011-02-25 16:00:45.96 spid4s PAGE: 75:1:5994649 CleanCnt:3 Mode:IX Flags: 0x2
2011-02-25 16:00:45.96 spid4s Grant List 0:
2011-02-25 16:00:45.96 spid4s Owner:0x48A1FCE0 Mode: IX Flg:0x0 Ref:0 Life:02000000 SPID:981 ECID:0 XactLockInfo: 0x4173350C
2011-02-25 16:00:45.96 spid4s SPID: 981 ECID: 0 Statement Type: UPDATE Line #: 1
2011-02-25 16:00:45.96 spid4s Input Buf: Language Event: UPDATE MyTable SET MyPatApplication = NULL WHERE MyPatApplication = ” AND MySendNotice = -1
2011-02-25 16:00:45.96 spid4s Requested By:
2011-02-25 16:00:45.96 spid4s ResType:LockOwner Stype:’OR’Xdes:0x1E202E00 Mode: S SPID:1122 BatchID:0 ECID:0 TaskProxy:(0x196BC378) Value:0x4a7f5d80 Cost:(0/4232)
In the first line, “PAGE: 75:1:5994649” is the resource this process had locked. This means the lock was held on the object located on page 1:5994649 in database 75. We can figure out what object this equates to by querying sys.databases to find out which database is number 75, and using the dbcc page command, dbcc page (75, 1, 5994649, 3), to get the id of the object on that page. Then we query sys.objects to find the name of the object. Also in the first line we see that the lock Mode on this resource was “IX” or intent exclusive.
In the fourth line, we can see that this Node was SPID 981, and it was performing an update statement. On the next line we see the exact statement that SPID 981 was executing at the time.
Beginning with the 6th line, we see information about the process that was requesting a lock on the resource in question:
2011-02-25 16:00:45.96 spid4s Requested By:
2011-02-25 16:00:45.96 spid4s ResType:LockOwner Stype:’OR’Xdes:0x1E202E00 Mode: S SPID:1122 BatchID:0 ECID:0 TaskProxy:(0x196BC378) Value:0x4a7f5d80 Cost:(0/4232)
Here we see that SPID 1122 was requesting a shared lock on the same resource owned by SPID 981.
The information on Node 2 follows the same pattern:
2011-02-25 16:00:45.96 spid4s OBJECT: 75:1886017850:0 CleanCnt:2 Mode:X Flags: 0x0
2011-02-25 16:00:45.96 spid4s Grant List 1:
2011-02-25 16:00:45.96 spid4s Owner:0x1EC887A0 Mode: X Flg:0x0 Ref:22 Life:02000000 SPID:1122 ECID:0 XactLockInfo: 0x1E202E24
2011-02-25 16:00:45.97 spid4s SPID: 1122 ECID: 0 Statement Type: SELECT Line #: 1
2011-02-25 16:00:45.97 spid4s Input Buf: Language Event: SELECT MyTable.MyFolderID, CLName1, MDesc, MNarrative, MArrangement FROM MyTable
2011-02-25 16:00:45.97 spid4s Requested By:
2011-02-25 16:00:45.97 spid4s ResType:LockOwner Stype:’OR’Xdes:0x417334E8 Mode: IX SPID:981 BatchID:0 ECID:0 TaskProxy:(0x4AD92378) Value:0x131f63c0 Cost:(0/28112)
SPID 1122 had an exclusive lock on the object located on page 1886017850:0 in database 75 while executing a SELECT statement. And SPID 981 was requesting an intent-exclusive lock on that same resource.
In the final section:
2011-02-25 16:00:45.97 spid4s Victim Resource Owner:
2011-02-25 16:00:45.97 spid4s ResType:LockOwner Stype:’OR’Xdes:0x1E202E00 Mode: S SPID:1122 BatchID:0 ECID:0 TaskProxy:(0x196BC378) Value:0x4a7f5d80 Cost:(0/4232)
We see that SQL Server chose SPID 1122 to be the victim of this deadlock and terminated the process. Generally speaking, SQL Server will choose to terminate a select statement over an insert/update/delete statement.
Two-node deadlocks are pretty straightforward to read once you get the hang of it. Deadlocks with more nodes can be confusing, but just follow the chain. For example:
2011-02-25 16:00:45.95 spid4s Deadlock encountered …. Printing deadlock information
2011-02-25 16:00:45.95 spid4s Wait-for graph
2011-02-25 16:00:45.95 spid4s
2011-02-25 16:00:45.95 spid4s Node:1
2011-02-25 16:00:45.95 spid4s PAGE: 75:1:5994649 CleanCnt:3 Mode:IX Flags: 0x2
2011-02-25 16:00:45.95 spid4s Grant List 0:
2011-02-25 16:00:45.95 spid4s Owner:0x48A1FCE0 Mode: IX Flg:0x0 Ref:0 Life:02000000 SPID:981 ECID:0 XactLockInfo: 0x4173350C
2011-02-25 16:00:45.95 spid4s SPID: 981 ECID: 0 Statement Type: UPDATE Line #: 1
2011-02-25 16:00:45.95 spid4s Input Buf: Language Event: UPDATE MyTable SET MyPatApplication = NULL WHERE MyPatApplication = ” AND MySendNotice = -1
2011-02-25 16:00:45.95 spid4s Requested By:
2011-02-25 16:00:45.95 spid4s ResType:LockOwner Stype:’OR’Xdes:0x1E202E00 Mode: S SPID:1122 BatchID:0 ECID:0 TaskProxy:(0x196BC378) Value:0x4a7f5d80 Cost:(0/4232)
2011-02-25 16:00:45.95 spid4s
2011-02-25 16:00:45.95 spid4s Node:2
2011-02-25 16:00:45.95 spid4s OBJECT: 75:1886017850:0 CleanCnt:3 Mode:X Flags: 0x0
2011-02-25 16:00:45.95 spid4s Wait List:
2011-02-25 16:00:45.95 spid4s Owner:0x53909C00 Mode: IS Flg:0x2 Ref:1 Life:00000000 SPID:977 ECID:0 XactLockInfo: 0x44EA8794
2011-02-25 16:00:45.95 spid4s SPID: 977 ECID: 0 Statement Type: SELECT Line #: 1
2011-02-25 16:00:45.95 spid4s Input Buf: Language Event: (@P1 nvarchar(31))SELECT * FROM MyTable WHERE MyFolderID = @P1
2011-02-25 16:00:45.95 spid4s Requested By:
2011-02-25 16:00:45.95 spid4s ResType:LockOwner Stype:’OR’Xdes:0x417334E8 Mode: IX SPID:981 BatchID:0 ECID:0 TaskProxy:(0x4AD92378) Value:0x131f63c0 Cost:(0/28112)
2011-02-25 16:00:45.95 spid4s
2011-02-25 16:00:45.95 spid4s Node:3
2011-02-25 16:00:45.95 spid4s OBJECT: 75:1886017850:0 CleanCnt:3 Mode:X Flags: 0x0
2011-02-25 16:00:45.95 spid4s Grant List 1:
2011-02-25 16:00:45.95 spid4s Owner:0x1EC887A0 Mode: X Flg:0x0 Ref:22 Life:02000000 SPID:1122 ECID:0 XactLockInfo: 0x1E202E24
2011-02-25 16:00:45.95 spid4s SPID: 1122 ECID: 0 Statement Type: SELECT Line #: 1
2011-02-25 16:00:45.95 spid4s Input Buf: Language Event: SELECT MyTable.MyFolderID, CLName1, MDesc, MNarrative, MArrangement FROM MyTable
2011-02-25 16:00:45.95 spid4s Requested By:
2011-02-25 16:00:45.95 spid4s ResType:LockOwner Stype:’OR’Xdes:0x44EA8770 Mode: IS SPID:977 BatchID:0 ECID:0 TaskProxy:(0x294C2378) Value:0x53909c00 Cost:(0/0)
2011-02-25 16:00:45.95 spid4s
2011-02-25 16:00:45.95 spid4s Victim Resource Owner:
2011-02-25 16:00:45.95 spid4s ResType:LockOwner Stype:’OR’Xdes:0x44EA8770 Mode: IS SPID:977 BatchID:0 ECID:0 TaskProxy:(0x294C2378) Value:0x53909c00 Cost:(0/0)
Here SPID 981 had something that SPID 1122 wanted, SPID 1122 had something that SPID 977 wanted, and SPID 977 had something that SPID 981 wanted. As a result, SPID 977 was killed.
In the end, the developer was able to eliminate a lot of deadlocks by adding NOLOCK to his select statements, and his end-users are much happier.