Side note:
I originally started this post to show with examples all the scenarios where you can get minimal logging copying data from one table to another in SQL Server. I’m running SQL Server 2017 CU7 and it quickly turned out that I couldn’t get minimal logging to happen in all the scenarios the docs say it should work. Rather than not showing these scenarios, I’m still going to list the examples for them and just note where it didn’t behave as expected.
Under certain conditions SQL Server can switch to minimal logging, where it logs at a much higher level (extent/page) rather than per row. In this post I’m specifically looking at copying data from one table to another and getting the benefit of minimal logging. For SQL Server to use minimal logging a number of things have to be true…
Recovery model must be Simple or Bulk Logged
Table must be either
A heap and insert done with TABLOCK
A heap + nonclustered index with TABLOCK and either trace flag 610 or SQL Server 2016+
A clustered index with no data and TABLOCK
A clustered index with no data and either trace flag 610 or SQL Server 2016+
A clustered index with data and trace flag 610 or SQL Server 2016+
A clustered index with nonclustered indexes and TABLOCK and trace flag 610 or SQL Server 2016+
A LOT more information on the above can be found in the following articles from Microsoft
Even with everything above being true the optimizer can still choose plans that don’t minimally log. For example, if you insert into an empty clustered index and the source data isn’t in the same order as the clustered index then SQL Server will not minimally log. If you need an insert to use minimal logging it’s always best to test it first and not just assume you’re going to get it.
Below I’m going to cover a number of examples of the above conditions and show ways to check if you are getting minimal logging. I’m running all the examples in this post on SQL Server 2017 CU7 so trace flag 610 is not needed for me, but if you’re on a version before 2016 you’ll need to enable this trace flag to get the same results.
For the examples in this post I’m going to be using the Stack Overflow database because it’s a nice size and has a schema that is easily understandable. You can download various sized versions from BrentOzar.com
I’ve written the following stored procedure to return log information for a given query. If you’re following along create this in the Stack Overflow database…
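The original procedure isn’t reproduced here, so below is a minimal sketch of what GetLogUseStats might look like. The parameter names, the output shape, and the use of the undocumented sys.fn_dblog function are my assumptions, not the author’s actual implementation.

```sql
/*
  Hypothetical sketch of GetLogUseStats: runs the supplied insert,
  then reports log growth and per-operation log record counts for the
  target table. Assumes SIMPLE recovery so CHECKPOINT clears the
  reusable portion of the log before measuring.
*/
CREATE OR ALTER PROCEDURE dbo.GetLogUseStats
(
    @Sql       NVARCHAR(MAX),  -- insert statement to measure
    @Schema    SYSNAME,
    @Table     SYSNAME,
    @ClearData BIT = 0         -- 1 = empty the target table first
)
AS
BEGIN
    IF @ClearData = 1
        EXEC ('TRUNCATE TABLE ' + QUOTENAME(@Schema) + '.' + QUOTENAME(@Table));

    CHECKPOINT; -- in SIMPLE recovery this truncates the inactive log

    DECLARE @LogMbBefore DECIMAL(18, 4) =
        (SELECT used_log_space_in_bytes / 1024.0 / 1024.0
         FROM sys.dm_db_log_space_usage);

    EXEC sp_executesql @Sql;

    DECLARE @LogMbAfter DECIMAL(18, 4) =
        (SELECT used_log_space_in_bytes / 1024.0 / 1024.0
         FROM sys.dm_db_log_space_usage);

    SELECT @LogMbAfter - @LogMbBefore AS LogGrowthMB;

    -- Log record counts per operation/context for the table's allocation units
    SELECT Operation, Context, COUNT(*) AS LogRecordCount
    FROM sys.fn_dblog(NULL, NULL)
    WHERE AllocUnitName LIKE @Schema + '.' + @Table + '%'
    GROUP BY Operation, Context
    ORDER BY COUNT(*) DESC;
END
```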
GetLogUseStats will allow us to pass in an insert query and a table name, then return all the log information that happened on the table and its associated indexes. For this post I’ll mainly be passing 1 in for the ClearData parameter to force each query to run on an empty table. This is designed to be run on a development server; for accuracy, try to run it on a database that nothing else is using, or log records written by other queries will skew the results.
To keep the examples simple, we’re just going to look at ways to insert all the records in the posts table into a second table. If you’re following along once you’ve restored Stack Overflow database set it to simple recovery model (Bulk would also work). I’ve trimmed the Posts table down to 500k records to keep things running fast on my laptop but you can adjust this to suit your machine.
create an empty table that we’re going to copy the posts into…
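One simple way to create that empty copy (the structure is assumed from the Stack Overflow Posts table):

```sql
-- SELECT INTO with TOP 0 copies the column structure but no rows,
-- leaving us with an empty heap and no indexes.
SELECT TOP 0 *
INTO dbo.PostsDestination
FROM dbo.Posts;
```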
So at this point we have a heap with no nonclustered indexes.
I’m going to run through each of the points above that support minimal logging, but before we do, let’s first do an insert into the heap with no TABLOCK, which as we can see from the bullet points above will not minimally log…
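Using the GetLogUseStats procedure (the parameter names here are my assumptions, as the original procedure isn’t shown), the non-TABLOCK insert might be run like this:

```sql
-- No TABLOCK hint: this insert should be fully logged, even in SIMPLE recovery
EXEC dbo.GetLogUseStats
    @Sql = N'INSERT INTO dbo.PostsDestination
             SELECT * FROM dbo.Posts ORDER BY Id',
    @Schema = 'dbo',
    @Table = 'PostsDestination',
    @ClearData = 1;
```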
We’ve caused the log file to grow 0.1701mb and we can see we have a large number of individual LOP_INSERT_ROWS records in our log file. As suspected this query was not minimally logged. Follow through the next example to see what a minimally logged query looks like.
Heap With No Indexes and TABLOCK
Let’s try this again but this time using TABLOCK to take a table lock on our PostsDestination table which as we know should get us a minimally logged insert..
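The same measurement with a TABLOCK hint on the destination (parameter names again assumed):

```sql
-- WITH (TABLOCK) on a heap with no indexes: minimal logging expected
EXEC dbo.GetLogUseStats
    @Sql = N'INSERT INTO dbo.PostsDestination WITH (TABLOCK)
             SELECT * FROM dbo.Posts ORDER BY Id',
    @Schema = 'dbo',
    @Table = 'PostsDestination',
    @ClearData = 1;
```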
Our log growth has gone from 0.1701mb to 0.0026mb, down by about a factor of 60 (these are low numbers even on the non minimally logged example as I shrunk my dataset). Also notice we have hardly any LOP_INSERT_ROWS records as we’re now working at the page level rather than the row level.
Heap With NonClustered Index and TABLOCK (Trace Flag 610 or 2016+)
This is where things get interesting. According to the docs you get minimal logging on a nonclustered index when using TABLOCK if you either enable trace flag 610 or are on SQL Server 2016+. As mentioned, I’m running SQL Server 2017 CU7, so let’s see what happens….
First create a nonclustered index on our table…
I’m explicitly setting ALLOW_PAGE_LOCKS to on here even though it’s the default just to prove the index will allow page locking as without it the insert can’t be minimally logged.
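The index creation might look like this (the index name is my own choice):

```sql
-- Nonclustered index on the heap; ALLOW_PAGE_LOCKS is the default
-- but is stated explicitly because minimal logging needs page locking
CREATE NONCLUSTERED INDEX ndx_PostsDestination_Id
    ON dbo.PostsDestination (Id)
    WITH (ALLOW_PAGE_LOCKS = ON);
```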
Then let’s run our insert again with TABLOCK…
Looking at this we can see the log file grew by 0.1718mb and log records for both the heap and the index have a high number of LOP_INSERT_ROWS records. Not only did we not minimally log the insert into the index, we also lost minimal logging on the heap. This is completely unexpected from reading the documentation. I’m not sure if I’ve missed something here, but as far as I can see the query above meets all the criteria for a minimally logged insert.
Empty Clustered Index With TABLOCK
If you still have it drop the nonclustered index from the above example and create a clustered index…
Then lets run our TABLOCK insert again…
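Sketched out, the two steps might look like this (index name and GetLogUseStats parameter names are assumptions):

```sql
-- Swap the nonclustered index for a clustered one on the empty table
DROP INDEX IF EXISTS ndx_PostsDestination_Id ON dbo.PostsDestination;
CREATE CLUSTERED INDEX ndx_PostsDestination_Id ON dbo.PostsDestination (Id);

-- Empty clustered index + TABLOCK: minimal logging expected.
-- ORDER BY Id matches the clustered key, which matters for minimal logging.
EXEC dbo.GetLogUseStats
    @Sql = N'INSERT INTO dbo.PostsDestination WITH (TABLOCK)
             SELECT * FROM dbo.Posts ORDER BY Id',
    @Schema = 'dbo',
    @Table = 'PostsDestination',
    @ClearData = 1;
```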
Woop, back into charted territory, minimal logging is back! The log grew by 0.0013mb (our smallest yet) with only 3 LOP_INSERT_ROWS records.
Empty Clustered Index With Trace 610 Or 2016+
Now let’s try again without TABLOCK. Before we do that though, let’s drop and recreate the index to keep things clean…
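Something like the following (names again assumed):

```sql
-- Reset to an empty clustered index
DROP INDEX IF EXISTS ndx_PostsDestination_Id ON dbo.PostsDestination;
CREATE CLUSTERED INDEX ndx_PostsDestination_Id ON dbo.PostsDestination (Id);

-- Same insert, but this time with no TABLOCK hint
EXEC dbo.GetLogUseStats
    @Sql = N'INSERT INTO dbo.PostsDestination
             SELECT * FROM dbo.Posts ORDER BY Id',
    @Schema = 'dbo',
    @Table = 'PostsDestination',
    @ClearData = 1;
```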
Failure number 2! 0.0251mb log growth and lots of log operations at the LOP_INSERT_ROWS level. This is another reason you should always check your assumptions about minimal logging if you’re relying on it happening.
Clustered Index With Data and Trace 610 OR 2016+
Moving along, let’s recreate the index again…
Now let’s insert records with an Id < 100
Now let’s try inserting where the Id is > 100 with TABLOCK and without clearing down the table first…
Notice the @ClearData=0.
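The two inserts might be sketched like this (parameter names assumed):

```sql
-- Seed the clustered index with the first block of rows...
INSERT INTO dbo.PostsDestination
SELECT * FROM dbo.Posts WHERE Id < 100 ORDER BY Id;

-- ...then measure the second insert against the now non-empty table.
-- @ClearData = 0 keeps the seeded rows in place.
EXEC dbo.GetLogUseStats
    @Sql = N'INSERT INTO dbo.PostsDestination WITH (TABLOCK)
             SELECT * FROM dbo.Posts WHERE Id > 100 ORDER BY Id',
    @Schema = 'dbo',
    @Table = 'PostsDestination',
    @ClearData = 0;
```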
Failure number 3, high log growth with many LOP_INSERT_ROWS log records.
Clustered Index With NonClustered Indexes and TABLOCK Or Trace 610/SQL 2016+
Let’s first recreate our clustered index and add nonclustered index to the mix…
Now let’s try our insert again
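Both steps together might look like this (index names and the choice of OwnerUserId for the nonclustered key are my assumptions):

```sql
-- Recreate the clustered index and add a nonclustered index on top
DROP INDEX IF EXISTS ndx_PostsDestination_Id ON dbo.PostsDestination;
CREATE CLUSTERED INDEX ndx_PostsDestination_Id ON dbo.PostsDestination (Id);
CREATE NONCLUSTERED INDEX ndx_PostsDestination_OwnerUserId
    ON dbo.PostsDestination (OwnerUserId);

-- Empty clustered index + nonclustered index + TABLOCK
EXEC dbo.GetLogUseStats
    @Sql = N'INSERT INTO dbo.PostsDestination WITH (TABLOCK)
             SELECT * FROM dbo.Posts ORDER BY Id',
    @Schema = 'dbo',
    @Table = 'PostsDestination',
    @ClearData = 1;
```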
Failure number 4! High log growth and inserted log records at a row level.
So every scenario involving a nonclustered index failed to minimally log, as did every scenario where TABLOCK was supposedly not required and every scenario where the table already contained data.
These results weren’t as I’d expected but this was still an interesting exercise. I’m sure I’m missing something here, but after rereading the documentation several times I really can’t see what. If anyone does have any ideas on the above scenarios drop me a message in the comments.
When testing/debugging TSQL it’s common to use the print statement throughout to see what was happening where, in much the same way that other languages use things like console.log.
One of the big issues with print is that it buffers and you can’t control when it gets written to the output, stopping you from seeing where a query is whilst it’s running.
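A small demonstration of the buffering (the delays are just there to simulate long-running steps):

```sql
-- Neither message appears in the messages tab until the batch finishes
PRINT 'Step 1 starting';
WAITFOR DELAY '00:00:10'; -- simulate a long-running step
PRINT 'Step 2 starting';
WAITFOR DELAY '00:00:10';
```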
You’ll see that neither print statement writes anything to the messages tab until the query has finished. This makes the print statement useless for gauging what a query is currently doing.
RAISERROR to the rescue: if you raise a message with an informational severity it behaves much like print, in that it won’t change the way the query runs and you can output messages. RAISERROR also has a NOWAIT option you can use to cause it to return the output straight away and avoid the buffering problems that print has.
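The same demonstration rewritten with RAISERROR:

```sql
-- Severity 0 = informational; WITH NOWAIT flushes the message immediately
RAISERROR ('Step 1 starting', 0, 1) WITH NOWAIT;
WAITFOR DELAY '00:00:10'; -- simulate a long-running step
RAISERROR ('Step 2 starting', 0, 1) WITH NOWAIT;
WAITFOR DELAY '00:00:10';
```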
In the above example we’re using a severity of zero to make sure it acts as an info message and not an actual error, this will cause it to work just like print even when wrapped in a try block. To prove this…
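A sketch of that proof:

```sql
BEGIN TRY
    -- Severity 0 is informational, so control never transfers to the catch
    RAISERROR ('Just an informational message', 0, 1) WITH NOWAIT;
    PRINT 'Still in the try block';
END TRY
BEGIN CATCH
    PRINT 'Never reached';
END CATCH
```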
Since SQL Server 2005 we’ve had TRY/CATCH syntax in SQL Server to handle errors. If an error occurs inside a try block, execution leaves the try block and enters the catch block. It’s inside the catch block that you can do any error handling (ROLLBACKs, logging, data fixes etc.) and then either rethrow the error or swallow it so nothing higher up knows anything went wrong. Until SQL Server 2012 throwing an error involved the use of the RAISERROR command. See below for a simple example of this…
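A minimal sketch of the pre-2012 rethrow pattern (the divide by zero is just a convenient way to force an error):

```sql
BEGIN TRY
    SELECT 1 / 0; -- force a divide-by-zero error
END TRY
BEGIN CATCH
    -- Rebuild the error from the error functions, then raise it again
    DECLARE @Msg      NVARCHAR(2048) = ERROR_MESSAGE(),
            @Severity INT            = ERROR_SEVERITY(),
            @State    INT            = ERROR_STATE();
    -- ROLLBACKs, logging, data fixes etc. would go here
    RAISERROR (@Msg, @Severity, @State);
END CATCH
```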
A couple of things to notice here…
It’s a bit of a pain to rethrow an exception as we must first call the error functions to construct a new error based on the original.
The line number that comes back in the exception is the line number of the RAISERROR call not the line number the actual error occurred on.
Both of the above are improved in SQL Server 2012 and I’ll cover that shortly. First let’s quickly look at throwing custom exceptions with RAISERROR…
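For example:

```sql
-- Custom error: message text, severity 16, state 1
RAISERROR ('Something went wrong', 16, 1);
```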
The first parameter we’re passing in is the message text we want our error to contain. You can instead pass in the ID of a system error message and have it use that.
The second parameter is the severity level, more info on these can be found here. In our case we’re using 16 which is a severity that is classified as an error rather than info but not so serious it won’t allow us to handle it.
The last parameter is state, if we’re raising the same error in multiple places we would give each one a different state to show where it came from.
One thing to beware of with RAISERROR is that if it’s not in a try block, execution will continue after your RAISERROR line…
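For instance:

```sql
PRINT 'Before the error';
RAISERROR ('Something went wrong', 16, 1);
PRINT 'After the error'; -- still runs: there's no TRY block to transfer control
```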
If you run this you’ll see both prints are run. If this was inside a try block only the first print would run and then control would switch to the catch block.
TRY/CATCH THROW Post SQL Server 2012
Life got a whole load easier with the THROW statement. All the code we wrote with raise error to bubble the correct errors up is no longer needed. Instead we can just call THROW from inside our catch statement and the error will be re-thrown.
We could rewrite the first example in this post using the 2012 syntax like this…
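A sketch of the THROW version:

```sql
BEGIN TRY
    SELECT 1 / 0; -- force a divide-by-zero error
END TRY
BEGIN CATCH
    -- ROLLBACKs, logging, data fixes etc. would go here
    THROW; -- rethrows the original error, keeping the original line number
END CATCH
```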
You’ll notice this approach as well as being cleaner also displays the correct line number in the error message.
Like RAISERROR you can also use throw to throw custom errors. For example…
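```sql
-- Custom error: error number, message, state
THROW 50001, 'Something went wrong', 1;
```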
The first argument is the error number and must be between 50000 and 2147483647
The second argument is the error message
The third is the state
Whereas RAISERROR will continue executing the batch when outside of a try block, THROW will not…
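For example:

```sql
PRINT 'Before the error';
THROW 50001, 'Something went wrong', 1;
PRINT 'After the error'; -- never runs: THROW ends the batch
```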
The second print won’t run as THROW will end the batch unless you catch it.
Check State Of Transaction In Catch
Before we call COMMIT or ROLLBACK in a catch we first need to check there is an active transaction to call them on. If an error was raised outside of a transaction, calling ROLLBACK in the catch will cause another error. One of the ways we can handle this is to check that the count of open transactions (@@TRANCOUNT) is greater than zero.
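A sketch of that guard:

```sql
BEGIN TRY
    BEGIN TRANSACTION;
    SELECT 1 / 0; -- force an error inside the transaction
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION; -- only roll back if a transaction is open
    THROW;
END CATCH
```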
If you set XACT_ABORT to on then things behave slightly differently. Normally non-fatal errors will fail the statement, but if you turn this on the whole batch will fail and the transaction will be changed to a state where it can’t be committed. In this case you’ll still have a transaction count greater than zero, but if you call commit on it then it will fail…
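The following sketch commits in the catch to make the difference visible (a divide by zero stands in for a real failing statement):

```sql
SET XACT_ABORT OFF;
BEGIN TRY
    BEGIN TRANSACTION;
    SELECT 1 / 0; -- non-fatal error: fails the statement, not the batch
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        COMMIT TRANSACTION; -- succeeds here; errors if XACT_ABORT is ON
END CATCH
```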
I know we’re not actually changing any data here but it will allow us to finish the transaction by calling commit in our catch. If you now run that again but with SET XACT_ABORT ON you’ll see that it errors and won’t allow the commit.
You can see by turning this on we can enforce no data be saved in transactions where any single statement has an error.
If you ever want to perform a commit in a catch (an odd scenario) then it’s not enough to just check @@TRANCOUNT; you need to instead check XACT_STATE to make sure the transaction is in a state that is allowed to be committed. Some error severities will put the transaction in a state where it’s marked as read-only and no commits can be performed. XACT_STATE returns one of three values…
1 - There is an active transaction that can be committed.
0 - There are no transactions
-1 - There is an active transaction that is in a state that can’t be committed. This will automatically be rolled back when the batch finishes.
Let’s change the above query to use XACT_STATE…
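A sketch of the XACT_STATE version:

```sql
SET XACT_ABORT ON;
BEGIN TRY
    BEGIN TRANSACTION;
    SELECT 1 / 0; -- with XACT_ABORT ON this dooms the transaction
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF XACT_STATE() = -1
        ROLLBACK TRANSACTION; -- doomed transaction: rollback is the only option
    ELSE IF XACT_STATE() = 1
        COMMIT TRANSACTION;   -- transaction is still committable
END CATCH
```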
Here we’re using XACT_ABORT ON to force any error to put the transaction in an uncommittable state. If you run this code it will fall into the rollback in the catch block because of this.
Another thing to note: just as RAISERROR continues execution of the batch when there is no TRY/CATCH, the same behaviour occurs with XACT_ABORT ON. RAISERROR does not honour XACT_ABORT, whereas THROW does…
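For example:

```sql
SET XACT_ABORT ON;
PRINT 'Before the error';
RAISERROR ('Something went wrong', 16, 1); -- ignores XACT_ABORT
PRINT 'After the error'; -- still runs; a THROW here would end the batch
```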
Both prints here will run even with SET XACT_ABORT ON. If you want a batch to fail on error then use THROW. In fact you should pretty much always use THROW unless you have a good reason to want the RAISERROR behavior.
I mentioned above that with XACT_ABORT OFF (The default) some levels of error will not fail the batch but instead just fail the statement. To see this run the following..
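A sketch of that behaviour (the #Demo temp table and its column are my own names for illustration):

```sql
SET XACT_ABORT OFF;
CREATE TABLE #Demo (Val VARCHAR(10) NOT NULL);
INSERT INTO #Demo (Val) VALUES (NULL);  -- fails: NOT NULL violation
INSERT INTO #Demo (Val) VALUES ('One'); -- still runs and succeeds
SELECT * FROM #Demo;                    -- returns the single 'One' row
DROP TABLE #Demo;
```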
The first insert will fail as it violates the NOT NULL constraint, however the second one will still run and succeed, as can be seen from the select returning a single row. If you run it again with XACT_ABORT ON you’ll see that it fails at the batch level and everything gets rolled back. However, outside of an explicit transaction everything before the line that errors will still commit fine…
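Sketched out (again using a hypothetical #Demo table):

```sql
SET XACT_ABORT ON;
CREATE TABLE #Demo (Val VARCHAR(10) NOT NULL);
INSERT INTO #Demo (Val) VALUES ('One'); -- auto-commits: no explicit transaction
INSERT INTO #Demo (Val) VALUES (NULL);  -- fails and ends the batch
```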
This will insert a single record of value ‘One’. If you want the batch to ROLLBACK before the error as well then you need to wrap it all in a transaction…
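The wrapped version might look like this (same hypothetical #Demo table):

```sql
SET XACT_ABORT ON;
CREATE TABLE #Demo (Val VARCHAR(10) NOT NULL);
BEGIN TRANSACTION;
    INSERT INTO #Demo (Val) VALUES ('One');
    INSERT INTO #Demo (Val) VALUES (NULL); -- fails and dooms the transaction
COMMIT TRANSACTION; -- never reached; the whole transaction rolls back
```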
This will insert zero records because one of the statements in the batch had an error.