Dodging Deadlocks With Indexes

Jan 14, 2019

A lot of people don’t realise that some deadlocks can be removed entirely with the introduction of a new index. The most commonly talked about deadlock solutions that I see are …

  • Switch the order locks are taken out in your queries (If it’s possible or makes sense)
  • Reduce the chance of the offending queries running at the same time
  • Add error handling to the app to auto retry when deadlocks do occur

All of the above are good practices and should be done but I also wanted to cover here how some deadlocks can be solved with the introduction of an index. The ways in which an index can help are…

  • Speed up the query, causing locks to be held for less time and reduce the chance of deadlock
  • Covering indexes remove the need for the underlying clustered index to be touched on selects. If your deadlock is being caused by shared locks from a select on a small subset of the full rows then you may be able to move these locks to a new index whereby the two offending queries don’t need locks on the same data.

Let’s look at a demo of how a covering index can help, if you want to follow along you’ll need a copy of the Stack Overflow Database.

To simulate the deadlock run these queries in two separate tabs in the following order…

Tab 1Tab 2
BEGIN TRAN
/*EXCLUSIVE LOCK ON VOTES*/
UPDATE [Votes] SET BountyAmount = 10 WHERE id = 1
BEGIN TRAN
/*EXCLUSIVE LOCK ON USERS*/
UPDATE [Users] SET Age = 10 WHERE Id = 1	
/*Attempt to take a bunch of shared locks on users*/
SELECT TOP 100
 Location,
 COUNT(*)
FROM
 [Users]
GROUP BY [Location]
ORDER BY COUNT(*) DESC
/* Attempt to take a bund of shared locks on votes*/
SELECT TOP 100 
 UserId,
 COUNT(*)
FROM
 [Votes]
GROUP BY UserId
ORDER BY COUNT(*) DESC	

Boom, Deadlock!

Now let’s take a step back and see how a couple of indexes can completely avoid this deadlock. First ROLLBACK both existing transactions.

Both our queries UPDATE statements end up with an Exclusive Key lock on their respective tables clustered indexes. This can be seen by running…

BEGIN TRAN
UPDATE [Votes] SET BountyAmount = 10 WHERE id = 1

SELECT
   OBJECT_NAME(p.object_id) Object,
   i.name IndexName,
   l.resource_type ObjectType,
   l.request_mode LockLType,
   l.request_status LockStatus
FROM sys.dm_tran_current_transaction t
   INNER JOIN sys.dm_tran_locks l ON l.request_owner_id = t.transaction_id
   INNER JOIN sys.partitions p ON p.hobt_id = l.resource_associated_entity_id
   LEFT JOIN sys.indexes i ON i.object_id = p.object_id AND i.index_id = p.index_id
ROLLBACK

Key lock transaction

Notice that the Exclusive Key lock is on the clustered key (pk_votes_id) of the votes table.

So to avoid the deadlock we need to make sure the respective select query we ran in Tab2 against the Votes table doesn’t block on this key lock. If we refer back to our SELECT query…

SELECT TOP 100 
 UserId,
 COUNT(*)
FROM
 [Votes]
GROUP BY UserId
ORDER BY COUNT(*) DESC

We can see the only field our query touches is UserId and that UserId is not changed at all in our Update statement. What this means is that if we create an index on UserId that index will not be updated or locked as part of our UPDATE statement and our SELECT query can use that index to run lock free…

CREATE NONCLUSTERED INDEX ndx_votes__user ON Votes(UserId)

Just to prove that our update won’t touch this new index lets run it again and check it still only takes one key lock on the Clustered index with no locks on our new NonClustered index.

BEGIN TRAN
UPDATE [Votes] SET BountyAmount = 10 WHERE id = 1

SELECT
   OBJECT_NAME(p.object_id) Object,
   i.name IndexName,
   l.resource_type ObjectType,
   l.request_mode LockLType,
   l.request_status LockStatus
FROM sys.dm_tran_current_transaction t
   INNER JOIN sys.dm_tran_locks l ON l.request_owner_id = t.transaction_id
   INNER JOIN sys.partitions p ON p.hobt_id = l.resource_associated_entity_id
   LEFT JOIN sys.indexes i ON i.object_id = p.object_id AND i.index_id = p.index_id
ROLLBACK

Key lock transaction

Great! Our new index is lock free and our SELECT query can now use it without being blocked.

The same can be done with our query against the Users table, again the read query is only touching one field (Location) which is not used in our update…

CREATE NONCLUSTERED INDEX ndx_user_location ON Users(Location)

With these two new indexes if you then run the queries in two tabs again you’ll notice there is no longer a deadlock and even no blocking between them.

Winning! As with all things you need to find the tool that works best for your situation. This isn’t a one stop deadlock fix but rather just another tool in your toolkit to use when it makes sense.

Introducing ChaosLoad

Jan 11, 2019

Scripted Simulation of SQL Server Loads

When blogging, presenting or testing an idea, one issue I constantly have is that my local SQL Server used for all these things has no real load on it making it hard to test how things I do work concurrently or under stress. I used to use Adam Machanic’s SQLQueryStress and this was great for running single scripts but made it hard to make more realistic loads.

Enter ChaosLoad (Out of Chaos comes order and all that)…

ChaosLoad allows you to define a JSON template of the load you want to run that looks something like this…

{
   "ConnectionString": "Server=localhost\\sql2019;Database=WideWorldImporters;Trusted_Connection=True;",
   "Templates": [
      {
         "ScriptPath": "Scripts\\Demo1\\HammerTime.sql",
         "Sleep": 100,
         "Threads": 50,
         "RunCount": 1000
      },
      {
         "ScriptPath": "Scripts\\Demo1\\FindUser.sql",
         "Sleep": 0,
         "Threads": 5,
         "RunCount": 5
      }
   ]
}

You can define any number of scripts and have them execute a set number of times on a specified number of threads with a specified interval after each run. In the above example HammerTime.Sql runs 100 times on 50 threads and pauses for 50ms after each run, FindUser.Sql runs 5 times on 5 threads with no pause between each execution.

The source can be found on the ChaosLoad GitHub Repository, to run from source you’ll need to first install .Net Core SDK 2.1+, alternatively you can download the Windows release binaries from GitHub and run without installing anything.

Assuming you’ve cloned the source then you can run ChaosLoad from command line by going to the directory the project is in and running

dotnet run PathToYourJson.Json

Or if you’ve downloaded the released version from GitHub…

ChaosLoad.exe PathToYourJson.exe

You can script some pretty cool labs with this to use in training sessions, For example hide a rogue query doing a crazy memory hogging sort in a bunch of innocent queries and play find the problem.

If you want a quick example you can run the one in the sample folder that can be run against the StackOverflow database. It looks like this…

{
   "ConnectionString": "Server=localhost\\sql2017;Database=StackOVerflow2013;Trusted_Connection=True;",
   "Templates": [{
      "ScriptPath": "Scripts\\Demo1\\GetAllUsersEndWithG.sql",
      "Sleep": 0,
      "Threads": 5,
      "RunCount": 10
   },
   {
      "ScriptPath": "Scripts\\Demo1\\GetCountPosts.sql",
      "Sleep": 0,
      "Threads": 5,
      "RunCount": 10
   },
   {
      "ScriptPath": "Scripts\\Demo1\\GetMostPopularBadge.sql",
      "Sleep": 10,
      "Threads": 5,
      "RunCount": 10
   }]
}
/*GetMostPopularBadge.sql*/
SELECT TOP 1
    COUNT(*), [Name]
FROM dbo.Badges
GROUP BY [name]
ORDER BY COUNT(*) DESC

/*GetAllUsersEndWithG.sql*/
SELECT *
FROM [users]
WHERE DisplayName LIKE '%G'

/*GetCountPosts.Sql*/
SELECT COUNT(*)
FROM Posts

Before I run this the server has nothing going on…

Empty sp_whoisactive

Let’s then fire it up…

Starting From Console

Now our server looks like this…

Loaded sp_whoisactive

Minor disclaimer, This project was written in an hour or so late one evening, it’s pretty scrappy and whilst I fully intend to go back and tidy up there is a good chance I’ll never get to it ;) Feel free to submit pull requests.

Waiter Waiter There's an Index in my Index

Jan 9, 2019

I’ve done a few posts on Clustered and Non-Clustered indexes before, what I’ve not however covered and something that is often not thought about is how SQL Server links the NonClustered and Clustered indexes together for lookups.

In Summary, it works like this…

  • If a table has no Clustered index (A Heap) then each record in each Non-Clustered index will store the Row Identifier (RID), this is used as a pointer back to the heap, any data required that is not in the Non-Clustered index will use this pointer to go back to the heap and get the additional fields.
  • If a table does have a Clustered index then the Non-Clustered indexes do not use the RID as a pointer, instead they use the fields in the Clustered index as the pointer. This is a little surprising to learn at first as for this to work it means that every Non-Clustered index also has the Clustered index fields included in it for it to be able to perform lookups back to the Clustered index.

Lets demo the above points with a quick sample…

CREATE TABLE NonClusteredOnly
(
   Id INT IDENTITY,
   Name NVARCHAR(100),
   INDEX ndx_nonclusteredonly_name NONCLUSTERED(Name)
)

CREATE TABLE ClusteredOnly
(
   Id INT IDENTITY PRIMARY KEY,
   Name NVARCHAR(100)
)

CREATE TABLE ClusteredAndNonClustered
(
   Id INT IDENTITY PRIMARY KEY,
   Name NVARCHAR(100),
   INDEX ndx_clusteredandnonclustered_name NONCLUSTERED(Name)
)

INSERT INTO NonClusteredOnly (Name)
SELECT specific_name
FROM msdb.information_schema.routines 
WHERE routine_type = 'PROCEDURE'

INSERT INTO ClusteredOnly (Name)
SELECT specific_name
FROM msdb.information_schema.routines 
WHERE routine_type = 'PROCEDURE'

INSERT INTO ClusteredAndNonClustered (Name)
SELECT specific_name
FROM msdb.information_schema.routines 
WHERE routine_type = 'PROCEDURE'

After running that you’ll have 3 tables and about 500 records in each (Will vary depending on which version of SQL Server you’re running).

Let’s first look at what happens if we filter on the name column and return both ID and Name from ClusteredOnly…

SELECT id,[Name] FROM ClusteredOnly
WHERE [Name] LIKE 'sp_verify_job%'

Clustered Only Plan

As you probably expected we have no index that can help with our predicate so it does a full scan on the Clustered index.

Now let’s run that same query on our table with only a single Non-Clustered index…

SELECT id,[Name] FROM NonClusteredOnly
WHERE [Name] LIKE 'sp_verify_job%'

NonClustered Only Plan

In this case, we’re using the Non-Clustered index to seek on our predicate but because that index doesn’t have ID in it and our query wants the ID field it then uses the RID (Record Identifier) to go back to the heap and get that data in that field.

It gets more interesting when we run that same query on our table with a Clustered index on ID and Non-Clustered on Name, the obvious thing you’d expect to see is probably seek on the Non-Clustered then a lookup to the Clustered index to get the ID field, However…

SELECT id,[Name] FROM ClusteredAndNonClustered
WHERE [Name] LIKE 'sp_verify_job%'

NonClustered and Clustered Plan

You can see here we’ve got both ID and Name from the Non-Clustered index even though it only has name in it. This is possible because as I mentioned above all Non-Clustered indexes include the Clustered index columns in them as a way to point back to the Clustered index, in this case we’re not referencing anything that’s not either in the Non-Clustered or Clustered index so no lookup is needed.

Now that we’ve demonstrated that each record has a pointer back to the Clustered index and that pointer is the fields in the Clustered index you may now be wondering how that works if the Clustered Index is not unique?

Let’s setup another test to demo this…

CREATE TABLE NonUniqueClusteredAndNonClustered
(
   ID INT,
   [Name] NVARCHAR(100),
   [LookupText] NVARCHAR(100),
   INDEX ndx_clustered CLUSTERED (Id),
   INDEX ndx_nonclustered NONCLUSTERED(Name)
)

INSERT INTO NonUniqueClusteredAndNonClustered (Id,Name,LookupText)
SELECT 1,specific_name,'Test'
FROM msdb.information_schema.routines 
WHERE routine_type = 'PROCEDURE'

We now have our Clustered index on ID and every single value in that field is 1, we also have a new field LookupText that is not included in any index. Knowing what we’ve covered above we know in order to get LookupText from a seek on the Non-Clustered index we’d need to use the Clustered index keys to perform a lookup on the Clustered Index to the record in question, But… This can’t work when the pointer is not unique can it?…

SELECT id,[Name],LookupText FROM NonUniqueClusteredAndNonClustered
WHERE [Name] LIKE 'sp_verify_job%'

Non Unique Clustered Index Plan

How can it be possible perform a lookup from the Non-Clustered to the Clustered when we have no unique pointer? e.g the pointer is 1 for every single record. Well in this case SQL Server see’s the key is not unique and at the point where it inserts the data is appends a unique 4 byte value to the Clustered Index for each row allowing for the pointer to work. You will never actually see this unique key as it is only used behind the scenes for SQL Server to do things like facilitate these lookups.

Summary

Some things to take away from this…

  • You don’t need to include Clustered index fields in your Non-Clustered indexes as they are already there.
  • You don’t need to add additional indexes to avoid lookups from Non-Clustered to Clustered when the lookup fields are already defined on the Clustered Index.
  • Lookups will never happen from a Non-Clustered index to a field defined on the Clustered Index.
  • This can also work in reverse, for any record in a Clustered index you can find the matching record in each Non-Clustered index. This is how things like Deletes and Updates work, e.g Update Clustered Index then lookup each matching record in Non-Clustered indexes and update them too.

subscribe via RSS