In response to my last post, I got a very interesting comment from Philip:
“I would love to see a way to add logic that will verify at least X backup files for a given DB remain. Specifically, in my organization, I would like to delete all but the most recent one– even if that one is three weeks old because I took that particular DB offline or detached for one reason or another. As it is, I must now sort by file name, then delete all but the oldest one manually.”
He raises a good point. Generally speaking, when I take a backup that I know I have to keep, I rename it with a .keep extension so it doesn’t get cleaned up with everything else. Well, that works great when I know that that’s a backup I have to keep. But what about Philip’s example, suppose I take a particular database offline? If I don’t specifically rename that last backup, it will be gone in a week or whatever my retention period happens to be. Or maybe I backup different databases at different intervals (DatabaseX nightly, and DatabaseY weekly) and I’d just like to retain the last 3 backups. Regardless of the reason(s) behind it, I thought it was a worthy enhancement, so I came up with some code that should work.
The first thing I need to do is find the most recent n backups. For that, I use the backupset table in msdb. So, let’s say I want to list the last 3 backups for each database.
select database_name, backup_start_date from ( SELECT database_name, backup_start_date, row_number() over (partition by database_name order by backup_start_date desc) as bkup_number FROM msdb.dbo.backupset ) as top_n where bkup_number <= 3
Now, all we need to do is change that query slightly to get us the min backup date.
select database_name, min(backup_start_date) from ( SELECT database_name, backup_start_date, row_number() over (partition by database_name order by backup_start_date desc) as bkup_number FROM msdb.dbo.backupset ) as top_n where bkup_number <= 3 group by database_name
And that date is the cutoff date for deleting backup files for that particular database. So, for example, if we want to retain the most recent 3 backups, we know we can delete any AdventureWorks backups older than 2012-01-22 07:03:41.000.
Now that we know how to get the cutoff date for each database, let’s put it together in the procedure. (Alternatively, you can make this a separate procedure altogether, up to you.) To accomodate this new option, I added a possible unit type: F (for files). So if we execute
exec [maint].[sp_clean_backupdir2] 3, 'F', 'bak'
we mean to delete all but the last 3 backup files for each database. This new parameter required a minor change to my initialization logic to account for the new possible value.
SELECT @unit2 = CASE @unit WHEN 'H' THEN 'hh' WHEN 'D' THEN 'dd' WHEN 'W' THEN 'ww' WHEN 'M' THEN 'mm' WHEN 'Y' THEN 'yy' WHEN 'F' THEN 'ff' ELSE @unit END ,@period = CASE WHEN @period > 0 and @unit <> 'F' THEN @period*-1 ELSE @period END IF @unit2 NOT IN ('hh', 'dd', 'ww', 'mm', 'yy', 'ff') RAISERROR (N'Invalid interval unit specified. Accepted values are H,D,W,M,Y, or F.' ,16 ,1)
The major change required meant adding some conditional logic. If the unit parameter isn’t ‘F’, we’ll carry on as we did before. However, if it is ‘F’, then we need to use the logic I described above to get the @deldate for each database. We also go through each individual database’s subfolder(s), rather than just a blanket delete on the BackupPath(s) including all subfolders.
IF @unit <> 'F' /* ... the old logic ... */ ELSE BEGIN DECLARE paths CURSOR FOR SELECT BackupPath, database_name, min(backup_start_date) FROM maint.BackupPaths CROSS JOIN ( SELECT database_name, backup_start_date, row_number() over (partition by database_name order by backup_start_date desc) as bkup_number FROM msdb.dbo.backupset ) as top_n where bkup_number = @period group by BackupPath, database_name OPEN paths FETCH paths INTO @path, @database_name, @deldate WHILE @@FETCH_STATUS = 0 BEGIN SELECT @deldate = DATEADD(mi, -1, @deldate) SELECT @chardate = CONVERT(NVARCHAR(19),@deldate,126) SELECT @path = @path + '\' + @database_name EXECUTE master.dbo.xp_delete_file 0 -- 0 = backup file, 1 = report file ,@path -- base folder ,@fileext -- file extension ,@chardate -- older than this date ,0 -- 1 = include first-level subfolders, 0 = don't include subfolders FETCH paths INTO @path, @database_name, @deldate END CLOSE paths DEALLOCATE paths END
And there you go. I hope this helps, Philip. Give it a try and let me know how it works for you. Disclaimer: I whipped it together pretty quickly, so be sure to test it thoroughly before using it in a production environment!