Index Defrag Script
Previously I posted that it's important to keep your indexes in shape with weekly or (preferably) nightly defrags. Below is a script I use to execute nightly defrags in SQL 2005 Enterprise. I can't claim complete credit... I believe this is a script I originally found on Microsoft and heavily modified to meet my needs.
You may want to modify the script if you're doing much with XML or LOB's. Also keep in mind that this is hitting the sys.dm_db_index_physical_stats table to view fragmentation information, which can be resource intensive.
If you're not familiar with index defragmentation, check out "Alter Index" on Books Online.
IF Not Exists(SELECT OBJECT_ID FROM sys.tables WHERE [name] = N'dba_indexDefragLog') BEGIN CREATE TABLE dbo.dba_indexDefragLog ( indexDefrag_id INT IDENTITY(1,1) not null , objectID INT not null , objectName NVARCHAR(130) not null , indexID INT not null , indexName NVARCHAR(130) not null , partitionNumber SMALLINT not null , fragmentation FLOAT not null , dateTimeStart DATETIME not null , durationSeconds INT not null CONSTRAINT PK_indexDefragLog PRIMARY KEY CLUSTERED (indexDefrag_id) ); PRINT 'dba_indexDefragLog Table Created'; END IF OBJECTPROPERTY(OBJECT_ID('dbo.dba_indexDefrag_sp'), N'IsProcedure') IS Null BEGIN EXECUTE ('Create Procedure dbo.dba_indexDefrag_sp As Print ''Hello World!''') RAISERROR('Procedure dba_indexDefrag_sp created.', 10, 1); END; Go SET ANSI_Nulls ON; SET Ansi_Padding ON; SET Ansi_Warnings ON; SET ArithAbort ON; SET Concat_Null_Yields_Null ON; SET NOCOUNT ON; SET Numeric_RoundAbort OFF; SET Quoted_Identifier ON; Go ALTER PROCEDURE dbo.dba_indexDefrag_sp /* Declare Parameters */ @minFragmentation FLOAT = 10.0 /* in percent, will not defrag if fragmentation less than specified */ , @rebuildThreshold FLOAT = 30.0 /* in percent, greater than @rebuildThreshold will result in rebuild instead of reorg */ , @onlineRebuild BIT = 1 /* 1 = online rebuild; 0 = offline rebuild */ , @executeSQL BIT = 1 /* 1 = execute; 0 = print command only */ , @tableName VARCHAR(4000) = Null /* Option to specify a table name */ , @printCommands BIT = 0 /* 1 = print commands; 0 = do not print commands */ , @defragDelay CHAR(8) = '00:00:05' /* time to wait between defrag commands */ AS /*********************************************************************** Name: dba_indexDefrag_sp Author: Michelle F. Ufford Purpose: Defrags all indexes for the current database Notes: CAUTION: Monitor transaction log when executing for the first time! @minFragmentation defaulted to 10%, will not defrag if fragmentation if less than specified @rebuildThreshold defaulted to 30% as recommended by Microsoft in BOL; > than 30% will result in rebuild instead @onlineRebuild 1 = online rebuild; 0 = offline rebuild @executeSQL 1 = execute the SQL generated by this proc; 0 = print command only @tableName Specify if you only want to defrag indexes for a specific table @printCommands 1 = print commands to screen; 0 = do not print commands @defragDelay time to wait between defrag commands; gives the server a little time to catch up Called by: SQL Agent Job or DBA Date Initials Description ------------------------------------------------------------------- 2008-10-27 MFU Initial Release *********************************************************************** Exec dbo.dba_indexDefrag_sp @executeSQL = 0 , @printCommands = 1; ***********************************************************************/ SET NOCOUNT ON; SET XACT_Abort ON; BEGIN /* Declare our variables */ DECLARE @objectID INT , @indexID INT , @partitionCount BIGINT , @schemaName NVARCHAR(130) , @objectName NVARCHAR(130) , @indexName NVARCHAR(130) , @partitionNumber SMALLINT , @partitions SMALLINT , @fragmentation FLOAT , @sqlCommand NVARCHAR(4000) , @rebuildCommand NVARCHAR(200) , @dateTimeStart DATETIME , @dateTimeEnd DATETIME , @containsLOB BIT; /* Just a little validation... */ IF @minFragmentation Not Between 0.00 And 100.0 SET @minFragmentation = 10.0; IF @rebuildThreshold Not Between 0.00 And 100.0 SET @rebuildThreshold = 30.0; IF @defragDelay Not Like '00:[0-5][0-9]:[0-5][0-9]' SET @defragDelay = '00:00:05'; /* Determine which indexes to defrag using our user-defined parameters */ SELECT OBJECT_ID AS objectID , index_id AS indexID , partition_number AS partitionNumber , avg_fragmentation_in_percent AS fragmentation , 0 AS 'defragStatus' /* 0 = unprocessed, 1 = processed */ INTO #indexDefragList FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(@tableName), NULL , NULL, N'Limited') WHERE avg_fragmentation_in_percent > @minFragmentation And index_id > 0 OPTION (MaxDop 1); /* Create a clustered index to boost performance a little */ CREATE CLUSTERED INDEX CIX_temp_indexDefragList ON #indexDefragList(objectID, indexID, partitionNumber); /* Begin our loop for defragging */ WHILE (SELECT COUNT(*) FROM #indexDefragList WHERE defragStatus = 0) > 0 BEGIN /* Grab the most fragmented index first to defrag */ SELECT TOP 1 @objectID = objectID , @fragmentation = fragmentation , @indexID = indexID , @partitionNumber = partitionNumber FROM #indexDefragList WHERE defragStatus = 0 ORDER BY fragmentation DESC; /* Look up index information */ SELECT @objectName = QUOTENAME(o.name) , @schemaName = QUOTENAME(s.name) FROM sys.objects AS o INNER Join sys.schemas AS s ON s.schema_id = o.schema_id WHERE o.OBJECT_ID = @objectID; SELECT @indexName = QUOTENAME(name) FROM sys.indexes WHERE OBJECT_ID = @objectID And index_id = @indexID And type > 0; /* Determine if the index is partitioned */ SELECT @partitionCount = COUNT(*) FROM sys.partitions WHERE OBJECT_ID = @objectID And index_id = @indexID; /* Look for LOBs */ SELECT TOP 1 @containsLOB = column_id FROM sys.columns WITH (NoLock) WHERE [OBJECT_ID] = @objectID And (system_type_id In (34, 35, 99) -- 34 = image, 35 = text, 99 = ntext Or max_length = -1); -- varbinary(max), varchar(max), nvarchar(max), xml /* See if we should rebuild or reorganize; handle thusly */ IF @fragmentation < @rebuildThreshold And @partitionCount <= 1 SET @sqlCommand = N'Alter Index ' + @indexName + N' On ' + @schemaName + N'.' + @objectName + N' ReOrganize'; IF @fragmentation >= @rebuildThreshold And IsNull(@containsLOB, 0) = 0 -- Cannot rebuild if the table has one or more LOB And @partitionCount <= 1 BEGIN /* We should always rebuild online if possible (SQL 2005 Enterprise) */ IF @onlineRebuild = 0 SET @rebuildCommand = N' Rebuild With (Online = Off, MaxDop = 1)'; ELSE SET @rebuildCommand = N' Rebuild With (Online = On, MaxDop = 1)'; SET @sqlCommand = N'Alter Index ' + @indexName + N' On ' + @schemaName + N'.' + @objectName + @rebuildCommand; END; /* If our index is partitioned, we should always reorganize */ IF @partitionCount > 1 SET @sqlCommand = N'Alter Index ' + @indexName + N' On ' + @schemaName + N'.' + @objectName + N' ReOrganize' + N' Partition = ' + CAST(@partitionNumber AS NVARCHAR(10)); -- no MaxDop needed, single threaded operation /* Are we executing the SQL? If so, do it */ IF @executeSQL = 1 BEGIN /* Grab the time for logging purposes */ SET @dateTimeStart = GETDATE(); EXECUTE (@sqlCommand); SET @dateTimeEnd = GETDATE(); /* Log our actions */ INSERT INTO dbo.dba_indexDefragLog ( objectID , objectName , indexID , indexName , partitionNumber , fragmentation , dateTimeStart , durationSeconds ) SELECT @objectID , @objectName , @indexID , @indexName , @partitionNumber , @fragmentation , @dateTimeStart , DATEDIFF(SECOND, @dateTimeStart, @dateTimeEnd); /* Just a little breather for the server */ WAITFOR Delay @defragDelay; /* Print if specified to do so */ IF @printCommands = 1 PRINT N'Executed: ' + @sqlCommand; END ELSE /* Looks like we're not executing, just print the commands */ BEGIN IF @printCommands = 1 PRINT @sqlCommand; END /* Update our index defrag list when we've finished with that index */ UPDATE #indexDefragList SET defragStatus = 1 WHERE objectID = @objectID And indexID = @indexID And partitionNumber = @partitionNumber; END /* When everything is done, make sure to get rid of our temp table */ DROP TABLE #indexDefragList; SET NOCOUNT OFF; RETURN 0 END Go SET Quoted_Identifier OFF SET ANSI_Nulls ON Go
20081117 UPDATE: Check out my Updated Index Defrag Script for SQL 2005 and 2008.
Tips for Large Data Stores
I'm currently working on projects that involve the collection of massive amounts of data (i.e. terabyte-class, billion-row tables). One of the challenges of collecting so much data is the ability to report on the data quickly. To this end, I plan to spend the next several blog posts discussing tips for designing and managing large relational data stores. These are primarily design practices that have performed well for me within the various environments in which I've worked.
Partitioning
• Partitioning is the segregation of a single logical table into separate, identical physical structures. Basically, every table is stored in at least one partition; "partitioned tables" are merely comprised of many partitions.
• Implemented correctly, partitioning can have dramatic improvement on read performance, index maintenance cost reduction, storage of large fact tables on separate disks, and data archiving.
• If you're not familiar with partitioning, check out Kimberly Tripp's excellent white paper, Partitioned Tables and Indexes in SQL Server 2005.
• I cannot rave enough about the impact partitioning has had within our environment. To take one of the most extreme examples of improvement, the execution time of one BI report decreased from 2.5+ hours to 20 minutes after we implemented partitioning. While not all stored procedures experienced such phenomenal improvement, improvement in the range of 30% - 60% was very common.
• While partitioning does have many benefits, there are some negatives. Namely, more expensive writes (I've noticed around 10%), increased duration of queries that span many partitions (i.e. queries on long time spans, such as a year), and increased maintenance needs.
• My rule of thumb is to partition any table with growth rates > 10mm records per week or with regular archiving needs.
• Much more to come on partitioning in the near future.
Indexing
• Any experienced DBA can tell you that effective index management is critical.
• Ensure your stored procedures are using index seeks instead of scans. If scans cannot be avoided, consider using partitioning with aligned indexes to limit the amount of pages scanned.
• Periodically look for un-used indexes and remove them to improve write performance.
• Look for missing indexes, create one or two, and evaluate. Try composite indexes and included columns as a way to limit the amount of indexes you need to create.
• Defrag your indexes! Nightly, if you can get away with it.
Stored Procedure Tuning
• When querying on large amounts of data, try breaking your queries up into individual components and storing in temporary tables, then performing the joins.
• When querying on large date ranges, try looping through the days and inserting the data into a temp table. This can be especially beneficial when querying partitioned tables.
• Create indexes on your temporary tables!
• Utilize index seeks whenever possible.
• If permitted to use dirty reads and your environment's isolation level is not defaulted to read uncommitted, use With (NoLock) hints.
• Temporary tables can be partitioned! This is usually not necessary but can sometimes be beneficial.
• Try replacing table variables with temporary tables. My rule of thumb is this: if I'm performing joins on the temporary data or if I have more than 100 records, use a temporary table. Table variables can decrease stored procedure recompiles but can sometimes have a negative impact on performance. When in doubt, test each method and evaluate which works better for your needs.
• *Always* include the partitioning key when querying partitioned tables.


