TSQL – Large table cleanup with minimal locks

0
184

Your SQL Server production database must be available 7×24 but it contains millions of obsolete rows which hurt performance.  How do you delete these rows without the lock contention and business impact of a standard delete?  In this tip we look at how you can delete massive number of rows in small minimal lock batches.

Solution

Quick OLTP is critical to a business’s success so DBAs need to mindful of data contention caused by application cleanup jobs.  One approach is to select a huge number of keys in one pass then break the cleanup into small batches with commits and sleep windows to minimize contention.  I recommend enabling the cleanup in a SQL job and run at low system usage time.

This is accomplished with the T-SQL code below.  You would customize the definition of obsolete rows (I call deadheads) to your shop. In this case a sale without money and no contact is obsolete.  Historical cleanup is a common use of this routine.  This version works with multiple tables in this case a parent table and its two child tables.  The column SalesID relates the tables.

Two layers of temporary tables are used: the outer table to contain all the obsolete primary keys for this run; the inner table to contain a subset of the same.  The number of keys placed in the temporary tables is best determined by altering the number of rows until the highest rows deleted per minute is reached.  I started with 1,000 rows in the outer table and 100 in the inner table.  I slowly scaled up to 500,000 rows in the outer table and 5,000 rows in the inner table.  25,316 deletions per minute was the peak value.  Beyond 500,000 rows in the outer table the deletion per minute value declined on my server.

Since the application tables are defined with primary keys using the identity property the rows are inserted with ever increasing primary key values making it easy to pick up the oldest rows via SELECT top.  Your case may be based upon a date column so alter the SELECT top 500000 accordingly.

The code loops through the 500,000 row outer table loading 5,000 keys values into the inner table.  Those 5,000 rows are then deleted from the application tables and outer table. The last step is to truncating the inner table.  This loop continues until all 500,000 outer table rows are deleted.  I find keeping the inner table small (5000) minimizes lock contention but you will need to adjust this number to your shop.

If desired use the WAITFOR command to pause processing and allow other applications to process without contending with cleanup.

At any point the job can be cancelled resulting in a brief rollback only for the last inner table set of keys. Committed deletes are not rolled back.

The job can be restarted from the top without issue and can be run as many times a day as needed.  I run this job two times a night on tables needing a quick cleanup then pull back to once a week to maintain optimal performance.

USE [msdb]
GO
/****** Object: StoredProcedure [dbo].[usp_ObsoleteRowCleanup] Script Date: 12/18/2011 14:00:38 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

/********************************************************************

File Name: usp_ ObsoleteRowCleanup.sql
Applies to: SQL Server 2005/8
Purpose: To cleanup obsolete rows.
Prerequisite: None.
Assumption: Removal of obsolete rows improves performance
Parameters: None
Returns: 0 = Succesful, 1 = Errors.
Author: Edward A. Polley
Spec
— Definition of deadhead.
Select COUNT(*) from SalesLead L, Transaction T
WHERE L.SalesID = T.SalesID
AND L.ContactId IS NULL
AND T.MoneyIn = 0

********************************************************************/
CREATE PROCEDURE [dbo].[usp_ ObsoleteRowCleanup]
–Will delete first 500000 deadheads
–Delete is from child tables: SalesZip and Transaction then parent SalesLead

AS

SET NOCOUNT ON
SET QUOTED_IDENTIFIER OFF

DECLARE
@DeadHeadCnt int,
@Toprow int

BEGIN
PRINT ‘Building Outer Temp Tbl ‘
PRINT getdate()

CREATE TABLE #OuterTemp(SalesID bigint) — hold all deadhead keys
CREATE INDEX IX_OuterTemp_SalesId on #OuterTemp (SalesID)

CREATE TABLE #InnerTemp(SalesID bigint) — hold subset of deadhead keys
CREATE INDEX IX_InnerTemp_SalesID on #InnerTemp (SalesID)

INSERT INTO #OuterTemp — select first 500,000 keys
Select TOP 500000 L.SalesID from YourDB.dbo.SalesLead L (Nolock),
YourDB.dbo.Transaction (Nolock) T
WHERE L.SalesID = T.SalesID
AND L.LeadContactId IS NULL -–substitute your condition
AND T.MoneyIn = 0
OPTION (MAXDOP 8); — allow parallel processing
— mainline
Select @DeadHeadCnt = count(*)From #OuterTemp
PRINT ‘Number of Dead Heads’
PRINT @DeadHeadCnt

While @DeadHeadCnt > 0
BEGIN
PRINT ‘Begins deletes ‘
PRINT getdate()
PRINT ‘Top row ‘
SELECT @toprow = min(SalesID) From #OuterTemp — ever incr key
PRINT @toprow

BEGIN TRANSACTION
INSERT INTO #InnerTemp — experiment with this value
Select TOP 5000 SalesID from #OuterTemp

— Delete from application tables by processing small intertemp
DELETE from YourDB.dbo.SalesZIP where SalesID in
(Select SalesID from #InnerTemp (nolock) ); –innertemp

DELETE from YourDB.dbo.Transaction where SalesID in
(Select SalesID from #InnerTemp (nolock) ) ;

DELETE from YourDB.dbo.SalesLead WHERE SalesID in
(Select SalesID from #InnerTemp (nolock) );

— Delete from outer table
DELETE from #OuterTemp WHERE SalesID in
(Select SalesID from #InnerTemp (nolock) ) ;

Select @DeadHeadCnt = count(*)From #OuterTemp — reset loop cnt
TRUNCATE TABLE #InnerTemp — cleanup working table

COMMIT

–WAITFOR DELAY ’00:00:01’ — adjust as needed

PRINT ‘End Deletes’
END
— Destroy all temporary tables.
IF OBJECT_ID(‘tempdb..#OuterTemp’) IS NOT NULL
DROP TABLE #OuterTemp

IF OBJECT_ID(‘tempdb..#INNERTEMP’) IS NOT NULL
DROP TABLE #InnerTemp
END
GO

LEAVE A REPLY

Please enter your comment!
Please enter your name here

12 + nineteen =