Index Defrag Script, v4.0
In my blog post, "Index Defrag Script Updates - Beta Testers Needed", I stated "I'll hopefully have the new version online in just a few days." That was dated January 26th. I had every intention of following through with it, too, but something came up:

My daughter, Chloe Lynn, was born on February 10th. She's a happy, healthy baby girl who consumes all of my free time and already has both her parents wrapped around her adorable little finger. So while I apologize for the delay in posting the latest version, I hope you can understand and forgive me.
Alrighty, back to SQL stuff! This version of the script has been significantly overhauled from previous versions. Here's a full synopsis of the changes and enhancements:
- There's now a time limit option so you have more control over how long your defrags run. This time limit is checked *before* a defrag is begun, so a defrag that starts just inside the limit can still run past it (e.g. a large index).
- I've added a static table for managing the index defrag scans. This way, you can start and stop the defrag process without the need to rescan. This is especially useful for VLDBs or any environment where you're unable to complete the defrags in one operation.
- Just in case you want to perform a rescan, even if there are still indexes left to defrag from your last scan, there's a parameter to force it.
- There's now an option to sort by page count, range scan count, or fragmentation level. Range scan count is the default, as indexes with a high number of range scans benefit the most from being defragged. You can also specify whether you want to sort by ASC or DESC.
- There are now min and max parameters for page counts. This is useful for a) ignoring indexes of 1 extent (8 pages) or less (as recommended by Microsoft) and b) scheduling index operations by size. For instance, you may want to defrag your small indexes during business hours but leave your big indexes for evening or weekend hours.
- There's now a parameterized option for sorting in TEMPDB. This may reduce execution time and will prevent unnecessary database file size inflation during defrags. NOTE: Make sure you have enough free space in TEMPDB prior to enabling this option.
- I moved the SQL statement output to display before execution so you can see what's currently executing.
- I've added a debug output of the parameters selected. I've added additional validation to the start of the script, so this will help show you if an invalid value was submitted and overwritten.
- I've added new columns to the log table to show what command is being executed and what error, if any, occurred when trying to execute.
- I've added try/catch logic to handle errors during execution; this way, a single error will not stop the whole script.
- The script will now force a rebuild for indexes with allow_page_locks = off.
- For those who use partitioning, you can now exclude the right-most populated partition from the defrag operation. This won't be applicable for all partitioning schemes, but for sliding-window scenarios (one of the most common schemes), it'll reduce contention on the partition that's being actively written to.
- I've fixed a bug where tables with LOB indexes may have more than one record returned from sys.dm_db_index_physical_stats.
- For various reasons, I've removed the option to rebuild stats.
Also, if you have a previous version of the script installed, this version will rename those tables, since there have been some changes made to them.
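To make the time limit and page count options a little more concrete, here's a rough sketch of how two scheduled calls might look. The 1,000-page cut-off and the time limits are made-up values for illustration, not recommendations; the parameter names are the ones the procedure declares.

```sql
/* Daytime job: small indexes only; stop starting new defrags after an hour.
   The 1000-page cut-off and 60-minute limit are illustrative assumptions. */
EXECUTE dbo.dba_indexDefrag_sp
      @executeSQL        = 1
    , @minPageCount      = 8
    , @maxPageCount      = 1000
    , @timeLimit         = 60
    , @defragOrderColumn = 'range_scan_count'
    , @defragSortOrder   = 'DESC';

/* Evening or weekend job: everything above the cut-off.
   With @forceRescan = 0, this reuses the saved scan if indexes are still waiting. */
EXECUTE dbo.dba_indexDefrag_sp
      @executeSQL        = 1
    , @minPageCount      = 1001
    , @maxPageCount      = Null
    , @timeLimit         = 480
    , @forceRescan       = 0;
```

Because the scan results live in dba_indexDefragStatus, the second call picks up where the first left off as long as there are still indexes waiting to be defragged; once everything has a defrag date, the next call rescans.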
FAQ:
I often receive the same questions about this script, so allow me to answer them here:
"I keep running the script, but my index is still fragmented. Why?"
This is most likely a very small index. Here's what Microsoft has to say:
"In general, fragmentation on small indexes is often not controllable. The pages of small indexes are stored on mixed extents. Mixed extents are shared by up to eight objects, so the fragmentation in a small index might not be reduced after reorganizing or rebuilding the index. For more information about mixed extents, see Understanding Pages and Extents."
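If you want to confirm that's what you're seeing, a quick check of the page count usually settles it. This is only a sketch: dbo.myTable is a placeholder, and the query should be run in the database that owns the index.

```sql
/* dbo.myTable is a hypothetical table name; substitute your own. */
SELECT  OBJECT_NAME(ps.[object_id])         AS tableName
      , i.name                              AS indexName
      , ps.partition_number
      , ps.avg_fragmentation_in_percent
      , ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.myTable'), Null, Null, 'LIMITED') AS ps
Join sys.indexes AS i
    ON  ps.[object_id] = i.[object_id]
    And ps.index_id    = i.index_id
WHERE ps.index_id > 0   -- ignore heaps
ORDER BY ps.page_count;
```

An index that shows up here with only a handful of pages falls into the "small index" category Microsoft describes, and the script skips it by default through @minPageCount.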
"What database should I create it in?" or "Can I create this in the MASTER database?"
It's up to you where you create it. You could technically create it in the MASTER database, but I recommend creating a utility database for your DBA administrative tasks.
"Can I run this against a SharePoint database?"
I've never tried personally, but I've been told it runs just fine.
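If you'd rather look before you leap on an unfamiliar database, the procedure can print its commands instead of running them. A minimal sketch, assuming a database called MySharePointContentDB (a made-up name):

```sql
/* Preview only: nothing is defragged when @executeSQL = 0.
   'MySharePointContentDB' is a hypothetical database name. */
EXECUTE dbo.dba_indexDefrag_sp
      @executeSQL    = 0
    , @printCommands = 1
    , @DATABASE      = 'MySharePointContentDB'
    , @forceRescan   = 1;
```

Keep in mind that the fragmentation scan still runs in this mode, and that @forceRescan = 1 throws away any earlier scan stored in dba_indexDefragStatus. Without it, the procedure would print commands for whatever indexes are still pending from the previous scan, regardless of database.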
"What are the minimum requirements to run this script?" or "Will this run on SQL Server 2000 instances?"
You need to be on SQL Server 2005 SP2 or higher.
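If you're not sure what an instance is running, something along these lines will tell you. It uses only the built-in SERVERPROPERTY function; the column aliases are my own.

```sql
SELECT  SERVERPROPERTY('ProductVersion')    AS productVersion   -- 9.x = SQL Server 2005
      , SERVERPROPERTY('ProductLevel')      AS productLevel     -- e.g. RTM, SP1, SP2
      , SERVERPROPERTY('Edition')           AS edition;
```

The edition matters too, since the script only attempts online rebuilds and MAXDOP restrictions on Enterprise, Enterprise Evaluation, and Developer editions.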
Without further ado, here's the script:
/* Scroll down to see the notes, disclaimers, and licensing information */

DECLARE @indexDefragLog_rename          VARCHAR(128)
    , @indexDefragExclusion_rename      VARCHAR(128)
    , @indexDefragStatus_rename         VARCHAR(128);

SELECT @indexDefragLog_rename       = 'dba_indexDefragLog_obsolete_' + CONVERT(VARCHAR(10), GETDATE(), 112)
    , @indexDefragExclusion_rename  = 'dba_indexDefragExclusion_obsolete_' + CONVERT(VARCHAR(10), GETDATE(), 112)
    , @indexDefragStatus_rename     = 'dba_indexDefragStatus_obsolete_' + CONVERT(VARCHAR(10), GETDATE(), 112);

IF Exists(SELECT [object_id] FROM sys.tables WHERE [name] = 'dba_indexDefragLog')
    EXECUTE sp_rename dba_indexDefragLog, @indexDefragLog_rename;

IF Exists(SELECT [object_id] FROM sys.tables WHERE [name] = 'dba_indexDefragExclusion')
    EXECUTE sp_rename dba_indexDefragExclusion, @indexDefragExclusion_rename;

IF Exists(SELECT [object_id] FROM sys.tables WHERE [name] = 'dba_indexDefragStatus')
    EXECUTE sp_rename dba_indexDefragStatus, @indexDefragStatus_rename;
Go

CREATE TABLE dbo.dba_indexDefragLog
(
      indexDefrag_id    INT IDENTITY(1,1)   Not Null
    , databaseID        INT                 Not Null
    , databaseName      NVARCHAR(128)       Not Null
    , objectID          INT                 Not Null
    , objectName        NVARCHAR(128)       Not Null
    , indexID           INT                 Not Null
    , indexName         NVARCHAR(128)       Not Null
    , partitionNumber   SMALLINT            Not Null
    , fragmentation     FLOAT               Not Null
    , page_count        INT                 Not Null
    , dateTimeStart     DATETIME            Not Null
    , dateTimeEnd       DATETIME            Null
    , durationSeconds   INT                 Null
    , sqlStatement      VARCHAR(4000)       Null
    , errorMessage      VARCHAR(1000)       Null

    CONSTRAINT PK_indexDefragLog_v40
        PRIMARY KEY CLUSTERED (indexDefrag_id)
);

PRINT 'dba_indexDefragLog Table Created';

CREATE TABLE dbo.dba_indexDefragExclusion
(
      databaseID        INT                 Not Null
    , databaseName      NVARCHAR(128)       Not Null
    , objectID          INT                 Not Null
    , objectName        NVARCHAR(128)       Not Null
    , indexID           INT                 Not Null
    , indexName         NVARCHAR(128)       Not Null
    , exclusionMask     INT                 Not Null
        /* 1=Sunday, 2=Monday, 4=Tuesday, 8=Wednesday, 16=Thursday, 32=Friday, 64=Saturday */

    CONSTRAINT PK_indexDefragExclusion_v40
        PRIMARY KEY CLUSTERED (databaseID, objectID, indexID)
);

PRINT 'dba_indexDefragExclusion Table Created';

CREATE TABLE dbo.dba_indexDefragStatus
(
      databaseID        INT
    , databaseName      NVARCHAR(128)
    , objectID          INT
    , indexID           INT
    , partitionNumber   SMALLINT
    , fragmentation     FLOAT
    , page_count        INT
    , range_scan_count  BIGINT
    , schemaName        NVARCHAR(128)   Null
    , objectName        NVARCHAR(128)   Null
    , indexName         NVARCHAR(128)   Null
    , scanDate          DATETIME
    , defragDate        DATETIME        Null
    , printStatus       BIT             DEFAULT(0)
    , exclusionMask     INT             DEFAULT(0)

    CONSTRAINT PK_indexDefragStatus_v40
        PRIMARY KEY CLUSTERED(databaseID, objectID, indexID, partitionNumber)
);

PRINT 'dba_indexDefragStatus Table Created';

IF OBJECTPROPERTY(OBJECT_ID('dbo.dba_indexDefrag_sp'), N'IsProcedure') = 1
BEGIN
    DROP PROCEDURE dbo.dba_indexDefrag_sp;
    PRINT 'Procedure dba_indexDefrag_sp dropped';
END;
Go

CREATE PROCEDURE dbo.dba_indexDefrag_sp

    /* Declare Parameters */
      @minFragmentation     FLOAT           = 10.0
        /* in percent, will not defrag if fragmentation less than specified */
    , @rebuildThreshold     FLOAT           = 30.0
        /* in percent, greater than @rebuildThreshold will result in rebuild instead of reorg */
    , @executeSQL           BIT             = 1
        /* 1 = execute; 0 = print command only */
    , @defragOrderColumn    NVARCHAR(20)    = 'range_scan_count'
        /* Valid options are: range_scan_count, fragmentation, page_count */
    , @defragSortOrder      NVARCHAR(4)     = 'DESC'
        /* Valid options are: ASC, DESC */
    , @timeLimit            INT             = 720 /* defaulted to 12 hours */
        /* Optional time limitation; expressed in minutes */
    , @DATABASE             VARCHAR(128)    = Null
        /* Option to specify a database name; null will return all */
    , @tableName            VARCHAR(4000)   = Null  -- databaseName.schema.tableName
        /* Option to specify a table name; null will return all */
    , @forceRescan          BIT             = 0
        /* Whether or not to force a rescan of indexes; 1 = force, 0 = use existing scan, if available */
    , @scanMode             VARCHAR(10)     = N'LIMITED'
        /* Options are LIMITED, SAMPLED, and DETAILED */
    , @minPageCount         INT             = 8
        /* MS recommends > 1 extent (8 pages) */
    , @maxPageCount         INT             = Null
        /* NULL = no limit */
    , @excludeMaxPartition  BIT             = 0
        /* 1 = exclude right-most populated partition; 0 = do not exclude; see notes for caveats */
    , @onlineRebuild        BIT             = 1
        /* 1 = online rebuild; 0 = offline rebuild; only in Enterprise */
    , @sortInTempDB         BIT             = 1
        /* 1 = perform sort operation in TempDB; 0 = perform sort operation in the index's database */
    , @maxDopRestriction    TINYINT         = Null
        /* Option to restrict the number of processors for the operation; only in Enterprise */
    , @printCommands        BIT             = 0
        /* 1 = print commands; 0 = do not print commands */
    , @printFragmentation   BIT             = 0
        /* 1 = print fragmentation prior to defrag; 0 = do not print */
    , @defragDelay          CHAR(8)         = '00:00:05'
        /* time to wait between defrag commands */
    , @debugMode            BIT             = 0
        /* display some useful comments to help determine if/where issues occur */

AS
/*********************************************************************************
    Name:       dba_indexDefrag_sp

    Author:     Michelle Ufford, http://sqlfool.com

    Purpose:    Defrags one or more indexes for one or more databases

    Notes:

    CAUTION: TRANSACTION LOG SIZE SHOULD BE MONITORED CLOSELY WHEN DEFRAGMENTING.
             DO NOT RUN UNATTENDED ON LARGE DATABASES DURING BUSINESS HOURS.

      @minFragmentation     defaulted to 10%, will not defrag if fragmentation
                            is less than that

      @rebuildThreshold     defaulted to 30% as recommended by Microsoft in BOL;
                            greater than 30% will result in rebuild instead

      @executeSQL           1 = execute the SQL generated by this proc;
                            0 = print command only

      @defragOrderColumn    Defines how to prioritize the order of defrags.  Only
                            used if @executeSQL = 1.
                            Valid options are:
                            range_scan_count = count of range and table scans on the
                                               index; in general, this is what benefits
                                               the most from defragmentation
                            fragmentation    = amount of fragmentation in the index;
                                               the higher the number, the worse it is
                            page_count       = number of pages in the index; affects
                                               how long it takes to defrag an index

      @defragSortOrder      The sort order of the ORDER BY clause.
                            Valid options are ASC (ascending) or DESC (descending).

      @timeLimit            Optional, limits how much time can be spent performing
                            index defrags; expressed in minutes.

                            NOTE: The time limit is checked BEFORE an index defrag
                                  is begun, thus a long index defrag can exceed the
                                  time limitation.

      @database             Optional, specify specific database name to defrag;
                            If not specified, all non-system databases will
                            be defragged.

      @tableName            Specify if you only want to defrag indexes for a
                            specific table, format = databaseName.schema.tableName;
                            if not specified, all tables will be defragged.

      @forceRescan          Whether or not to force a rescan of indexes.  If set
                            to 0, a rescan will not occur until all indexes have
                            been defragged.  This can span multiple executions.
                            1 = force a rescan
                            0 = use previous scan, if there are indexes left to defrag

      @scanMode             Specifies which scan mode to use to determine
                            fragmentation levels.  Options are:
                            LIMITED  - scans the parent level; quickest mode,
                                       recommended for most cases.
                            SAMPLED  - samples 1% of all data pages; if less than
                                       10k pages, performs a DETAILED scan.
                            DETAILED - scans all data pages.  Use great care with
                                       this mode, as it can cause performance issues.

      @minPageCount         Specifies how many pages must exist in an index in order
                            to be considered for a defrag.  Defaulted to 8 pages, as
                            Microsoft recommends only defragging indexes with more
                            than 1 extent (8 pages).

                            NOTE: The @minPageCount will restrict the indexes that
                                  are stored in dba_indexDefragStatus table.

      @maxPageCount         Specifies the maximum number of pages that can exist in
                            an index and still be considered for a defrag.  Useful
                            for scheduling small indexes during business hours and
                            large indexes for non-business hours.

                            NOTE: The @maxPageCount will restrict the indexes that
                                  are defragged during the current operation; it
                                  will not prevent indexes from being stored in the
                                  dba_indexDefragStatus table.  This way, a single
                                  scan can support multiple page count thresholds.

      @excludeMaxPartition  If an index is partitioned, this option specifies whether
                            to exclude the right-most populated partition.  Typically,
                            this is the partition that is currently being written to in
                            a sliding-window scenario.  Enabling this feature may reduce
                            contention.  This may not be applicable in other types of
                            partitioning scenarios.  Non-partitioned indexes are
                            unaffected by this option.
                            1 = exclude right-most populated partition
                            0 = do not exclude

      @onlineRebuild        1 = online rebuild;
                            0 = offline rebuild

      @sortInTempDB         Specifies whether to defrag the index in TEMPDB or in the
                            database the index belongs to.  Enabling this option may
                            result in faster defrags and prevent database file size
                            inflation.
                            1 = perform sort operation in TempDB
                            0 = perform sort operation in the index's database

      @maxDopRestriction    Option to specify a processor limit for index rebuilds

      @printCommands        1 = print commands to screen;
                            0 = do not print commands

      @printFragmentation   1 = print fragmentation to screen;
                            0 = do not print fragmentation

      @defragDelay          Time to wait between defrag commands; gives the
                            server a little time to catch up

      @debugMode            1 = display debug comments; helps with troubleshooting
                            0 = do not display debug comments

    Called by:  SQL Agent Job or DBA

    ----------------------------------------------------------------------------
    DISCLAIMER:
    This code and information are provided "AS IS" without warranty of any kind,
    either expressed or implied, including but not limited to the implied
    warranties of merchantability and/or fitness for a particular purpose.
    ----------------------------------------------------------------------------
    LICENSE:
    This index defrag script is free to download and use for personal, educational,
    and internal corporate purposes, provided that this header is preserved.
    Redistribution or sale of this index defrag script, in whole or in part, is
    prohibited without the author's express written consent.
    ----------------------------------------------------------------------------
    Date        Initials    Version Description
    ----------------------------------------------------------------------------
    2007-12-18  MFU         1.0     Initial Release
    2008-10-17  MFU         1.1     Added @defragDelay, CIX_temp_indexDefragList
    2008-11-17  MFU         1.2     Added page_count to log table
                                    , added @printFragmentation option
    2009-03-17  MFU         2.0     Provided support for centralized execution
                                    , consolidated Enterprise & Standard versions
                                    , added @debugMode, @maxDopRestriction
                                    , modified LOB and partition logic
    2009-06-18  MFU         3.0     Fixed bug in LOB logic, added @scanMode option
                                    , added support for stat rebuilds (@rebuildStats)
                                    , support model and msdb defrag
                                    , added columns to the dba_indexDefragLog table
                                    , modified logging to show "in progress" defrags
                                    , added defrag exclusion list (scheduling)
    2009-08-28  MFU         3.1     Fixed read_only bug for database lists
    2010-04-20  MFU         4.0     Added time limit option
                                    , added static table with rescan logic
                                    , added parameters for page count & SORT_IN_TEMPDB
                                    , added try/catch logic and additional debug options
                                    , added options for defrag prioritization
                                    , fixed bug for indexes with allow_page_lock = off
                                    , added option to exclude right-most partition
                                    , removed @rebuildStats option
                                    , refer to http://sqlfool.com for full release notes
*********************************************************************************
    Example of how to call this script:

        Exec dbo.dba_indexDefrag_sp
              @executeSQL           = 1
            , @printCommands        = 1
            , @debugMode            = 1
            , @printFragmentation   = 1
            , @forceRescan          = 1
            , @maxDopRestriction    = 1
            , @minPageCount         = 8
            , @maxPageCount         = Null
            , @minFragmentation     = 1
            , @rebuildThreshold     = 30
            , @defragDelay          = '00:00:05'
            , @defragOrderColumn    = 'page_count'
            , @defragSortOrder      = 'DESC'
            , @excludeMaxPartition  = 1
            , @timeLimit            = Null;
*********************************************************************************/

SET NOCOUNT ON;
SET XACT_Abort ON;
SET Quoted_Identifier ON;

BEGIN

    BEGIN Try

        /* Just a little validation... */
        IF @minFragmentation IS Null
            Or @minFragmentation Not Between 0.00 And 100.0
                SET @minFragmentation = 10.0;

        IF @rebuildThreshold IS Null
            Or @rebuildThreshold Not Between 0.00 And 100.0
                SET @rebuildThreshold = 30.0;

        IF @defragDelay Not Like '00:[0-5][0-9]:[0-5][0-9]'
            SET @defragDelay = '00:00:05';

        IF @defragOrderColumn IS Null
            Or @defragOrderColumn Not In ('range_scan_count', 'fragmentation', 'page_count')
                SET @defragOrderColumn = 'range_scan_count';

        IF @defragSortOrder IS Null
            Or @defragSortOrder Not In ('ASC', 'DESC')
                SET @defragSortOrder = 'DESC';

        IF @scanMode Not In ('LIMITED', 'SAMPLED', 'DETAILED')
            SET @scanMode = 'LIMITED';

        IF @debugMode IS Null
            SET @debugMode = 0;

        IF @forceRescan IS Null
            SET @forceRescan = 0;

        IF @sortInTempDB IS Null
            SET @sortInTempDB = 1;

        IF @debugMode = 1 RAISERROR('Undusting the cogs and starting up...', 0, 42) WITH NoWait;

        /* Declare our variables */
        DECLARE   @objectID                 INT
                , @databaseID               INT
                , @databaseName             NVARCHAR(128)
                , @indexID                  INT
                , @partitionCount           BIGINT
                , @schemaName               NVARCHAR(128)
                , @objectName               NVARCHAR(128)
                , @indexName                NVARCHAR(128)
                , @partitionNumber          SMALLINT
                , @fragmentation            FLOAT
                , @pageCount                INT
                , @sqlCommand               NVARCHAR(4000)
                , @rebuildCommand           NVARCHAR(200)
                , @dateTimeStart            DATETIME
                , @dateTimeEnd              DATETIME
                , @containsLOB              BIT
                , @editionCheck             BIT
                , @debugMessage             NVARCHAR(4000)
                , @updateSQL                NVARCHAR(4000)
                , @partitionSQL             NVARCHAR(4000)
                , @partitionSQL_Param       NVARCHAR(1000)
                , @LOB_SQL                  NVARCHAR(4000)
                , @LOB_SQL_Param            NVARCHAR(1000)
                , @indexDefrag_id           INT
                , @startDateTime            DATETIME
                , @endDateTime              DATETIME
                , @getIndexSQL              NVARCHAR(4000)
                , @getIndexSQL_Param        NVARCHAR(4000)
                , @allowPageLockSQL         NVARCHAR(4000)
                , @allowPageLockSQL_Param   NVARCHAR(4000)
                , @allowPageLocks           INT
                , @excludeMaxPartitionSQL   NVARCHAR(4000);

        /* Initialize our variables */
        SELECT @startDateTime = GETDATE()
            , @endDateTime = DATEADD(MINUTE, @timeLimit, GETDATE());

        /* Create our temporary tables */
        CREATE TABLE #databaseList
        (
              databaseID        INT
            , databaseName      VARCHAR(128)
            , scanStatus        BIT
        );

        CREATE TABLE #processor
        (
              [INDEX]           INT
            , Name              VARCHAR(128)
            , Internal_Value    INT
            , Character_Value   INT
        );

        CREATE TABLE #maxPartitionList
        (
              databaseID        INT
            , objectID          INT
            , indexID           INT
            , maxPartition      INT
        );

        IF @debugMode = 1 RAISERROR('Beginning validation...', 0, 42) WITH NoWait;

        /* Make sure we're not exceeding the number of processors we have available */
        INSERT INTO #processor
        EXECUTE xp_msver 'ProcessorCount';

        IF @maxDopRestriction IS Not Null And @maxDopRestriction > (SELECT Internal_Value FROM #processor)
            SELECT @maxDopRestriction = Internal_Value
            FROM #processor;

        /* Check our server version; 1804890536 = Enterprise, 610778273 = Enterprise Evaluation, -2117995310 = Developer */
        IF (SELECT SERVERPROPERTY('EditionID')) In (1804890536, 610778273, -2117995310)
            SET @editionCheck = 1 -- supports online rebuilds
        ELSE
            SET @editionCheck = 0; -- does not support online rebuilds

        /* Output the parameters we're working with */
        IF @debugMode = 1
        BEGIN

            SELECT @debugMessage = 'Your selected parameters are... Defrag indexes with fragmentation greater than ' + CAST(@minFragmentation AS VARCHAR(10)) + '; Rebuild indexes with fragmentation greater than ' + CAST(@rebuildThreshold AS VARCHAR(10)) + '; You' + CASE WHEN @executeSQL = 1 THEN ' DO' ELSE ' DO NOT' END + ' want the commands to be executed automatically; You want to defrag indexes in ' + @defragSortOrder + ' order of the ' + UPPER(@defragOrderColumn) + ' value; You have' + CASE WHEN @timeLimit IS Null THEN ' not specified a time limit;' ELSE ' specified a time limit of ' + CAST(@timeLimit AS VARCHAR(10)) END + ' minutes; ' + CASE WHEN @DATABASE IS Null THEN 'ALL databases' ELSE 'The ' + @DATABASE + ' database' END + ' will be defragged; ' + CASE WHEN @tableName IS Null THEN 'ALL tables' ELSE 'The ' + @tableName + ' table' END + ' will be defragged; We' + CASE WHEN Exists(SELECT TOP 1 * FROM dbo.dba_indexDefragStatus WHERE defragDate IS Null) And @forceRescan <> 1 THEN ' WILL NOT' ELSE ' WILL' END + ' be rescanning indexes; The scan will be performed in ' + @scanMode + ' mode; You want to limit defrags to indexes with' + CASE WHEN @maxPageCount IS Null THEN ' more than ' + CAST(@minPageCount AS VARCHAR(10)) ELSE ' between ' + CAST(@minPageCount AS VARCHAR(10)) + ' and ' + CAST(@maxPageCount AS VARCHAR(10)) END + ' pages; Indexes will be defragged' + CASE WHEN @editionCheck = 0 Or @onlineRebuild = 0 THEN ' OFFLINE;' ELSE ' ONLINE;' END + ' Indexes will be sorted in' + CASE WHEN @sortInTempDB = 0 THEN ' the DATABASE' ELSE ' TEMPDB;' END + ' Defrag operations will utilize ' + CASE WHEN @editionCheck = 0 Or @maxDopRestriction IS Null THEN 'system defaults for processors;' ELSE CAST(@maxDopRestriction AS VARCHAR(2)) + ' processors;' END + ' You' + CASE WHEN @printCommands = 1 THEN ' DO' ELSE ' DO NOT' END + ' want to print the ALTER INDEX commands; You' + CASE WHEN @printFragmentation = 1 THEN ' DO' ELSE ' DO NOT' END + ' want to output fragmentation levels; You want to wait ' + @defragDelay + ' (hh:mm:ss) between defragging indexes; You want to run in' + CASE WHEN @debugMode = 1 THEN ' DEBUG' ELSE ' SILENT' END + ' mode.';

            RAISERROR(@debugMessage, 0, 42) WITH NoWait;

        END;

        IF @debugMode = 1 RAISERROR('Grabbing a list of our databases...', 0, 42) WITH NoWait;

        /* Retrieve the list of databases to investigate */
        INSERT INTO #databaseList
        SELECT database_id
            , name
            , 0 -- not scanned yet for fragmentation
        FROM sys.databases
        WHERE name = IsNull(@DATABASE, name)
            And [name] Not In ('master', 'tempdb') -- exclude system databases
            And [state] = 0 -- state must be ONLINE
            And is_read_only = 0;  -- cannot be read_only

        /* Check to see if we have indexes in need of defrag; otherwise, re-scan the database(s) */
        IF Not Exists(SELECT TOP 1 * FROM dbo.dba_indexDefragStatus WHERE defragDate IS Null)
            Or @forceRescan = 1
        BEGIN

            /* Truncate our list of indexes to prepare for a new scan */
            TRUNCATE TABLE dbo.dba_indexDefragStatus;

            IF @debugMode = 1 RAISERROR('Looping through our list of databases and checking for fragmentation...', 0, 42) WITH NoWait;

            /* Loop through our list of databases */
            WHILE (SELECT COUNT(*) FROM #databaseList WHERE scanStatus = 0) > 0
            BEGIN

                SELECT TOP 1 @databaseID = databaseID
                FROM #databaseList
                WHERE scanStatus = 0;

                SELECT @debugMessage = '  working on ' + DB_NAME(@databaseID) + '...';

                IF @debugMode = 1
                    RAISERROR(@debugMessage, 0, 42) WITH NoWait;

                /* Determine which indexes to defrag using our user-defined parameters */
                INSERT INTO dbo.dba_indexDefragStatus
                (
                      databaseID
                    , databaseName
                    , objectID
                    , indexID
                    , partitionNumber
                    , fragmentation
                    , page_count
                    , range_scan_count
                    , scanDate
                )
                SELECT
                      ps.database_id AS 'databaseID'
                    , QUOTENAME(DB_NAME(ps.database_id)) AS 'databaseName'
                    , ps.[object_id] AS 'objectID'
                    , ps.index_id AS 'indexID'
                    , ps.partition_number AS 'partitionNumber'
                    , SUM(ps.avg_fragmentation_in_percent) AS 'fragmentation'
                    , SUM(ps.page_count) AS 'page_count'
                    , os.range_scan_count
                    , GETDATE() AS 'scanDate'
                FROM sys.dm_db_index_physical_stats(@databaseID, OBJECT_ID(@tableName), Null , Null, @scanMode) AS ps
                Join sys.dm_db_index_operational_stats(@databaseID, OBJECT_ID(@tableName), Null , Null) AS os
                    ON ps.database_id = os.database_id
                    And ps.[object_id] = os.[object_id]
                    And ps.index_id = os.index_id
                    And ps.partition_number = os.partition_number
                WHERE avg_fragmentation_in_percent >= @minFragmentation
                    And ps.index_id > 0 -- ignore heaps
                    And ps.page_count > @minPageCount
                    And ps.index_level = 0 -- leaf-level nodes only, supports @scanMode
                GROUP BY ps.database_id
                    , QUOTENAME(DB_NAME(ps.database_id))
                    , ps.[object_id]
                    , ps.index_id
                    , ps.partition_number
                    , os.range_scan_count
                OPTION (MaxDop 2);

                /* Do we want to exclude right-most populated partition of our partitioned indexes? */
                IF @excludeMaxPartition = 1
                BEGIN

                    SET @excludeMaxPartitionSQL = '
                        Select ' + CAST(@databaseID AS VARCHAR(10)) + ' As [databaseID]
                            , [object_id]
                            , index_id
                            , Max(partition_number) As [maxPartition]
                        From ' + DB_NAME(@databaseID) + '.sys.partitions
                        Where partition_number > 1
                            And [rows] > 0
                        Group By object_id
                            , index_id;';

                    INSERT INTO #maxPartitionList
                    EXECUTE sp_executesql @excludeMaxPartitionSQL;

                END;

                /* Keep track of which databases have already been scanned */
                UPDATE #databaseList
                SET scanStatus = 1
                WHERE databaseID = @databaseID;

            END

            /* We don't want to defrag the right-most populated partition, so
               delete any records for partitioned indexes where partition = Max(partition) */
            IF @excludeMaxPartition = 1
            BEGIN

                DELETE ids
                FROM dbo.dba_indexDefragStatus AS ids
                Join #maxPartitionList AS mpl
                    ON ids.databaseID = mpl.databaseID
                    And ids.objectID = mpl.objectID
                    And ids.indexID = mpl.indexID
                    And ids.partitionNumber = mpl.maxPartition;

            END;

            /* Update our exclusion mask for any index that has a restriction on the days it can be defragged */
            UPDATE ids
            SET ids.exclusionMask = ide.exclusionMask
            FROM dbo.dba_indexDefragStatus AS ids
            Join dbo.dba_indexDefragExclusion AS ide
                ON ids.databaseID = ide.databaseID
                And ids.objectID = ide.objectID
                And ids.indexID = ide.indexID;

        END

        SELECT @debugMessage = 'Looping through our list... there are ' + CAST(COUNT(*) AS VARCHAR(10)) + ' indexes to defrag!'
        FROM dbo.dba_indexDefragStatus
        WHERE defragDate IS Null
            And page_count Between @minPageCount And IsNull(@maxPageCount, page_count);

        IF @debugMode = 1 RAISERROR(@debugMessage, 0, 42) WITH NoWait;

        /* Begin our loop for defragging */
        WHILE (SELECT COUNT(*)
               FROM dbo.dba_indexDefragStatus
               WHERE (
                         (@executeSQL = 1 And defragDate IS Null)
                      Or (@executeSQL = 0 And defragDate IS Null And printStatus = 0)
                     )
                And exclusionMask & POWER(2, DATEPART(weekday, GETDATE())-1) = 0
                And page_count Between @minPageCount And IsNull(@maxPageCount, page_count)) > 0
        BEGIN

            /* Check to see if we need to exit our loop because of our time limit */
            IF IsNull(@endDateTime, GETDATE()) < GETDATE()
            BEGIN
                RAISERROR('Our time limit has been exceeded!', 11, 42) WITH NoWait;
            END;

            IF @debugMode = 1 RAISERROR('  Picking an index to beat into shape...', 0, 42) WITH NoWait;

            /* Grab the index with the highest priority, based on the values submitted;
               Look at the exclusion mask to ensure it can be defragged today */
            SET @getIndexSQL = N'
                Select Top 1
                      @objectID_Out         = objectID
                    , @indexID_Out          = indexID
                    , @databaseID_Out       = databaseID
                    , @databaseName_Out     = databaseName
                    , @fragmentation_Out    = fragmentation
                    , @partitionNumber_Out  = partitionNumber
                    , @pageCount_Out        = page_count
                From dbo.dba_indexDefragStatus
                Where defragDate Is Null '
                    + CASE WHEN @executeSQL = 0 THEN 'And printStatus = 0' ELSE '' END + '
                    And exclusionMask & Power(2, DatePart(weekday, GetDate())-1) = 0
                    And page_count Between @p_minPageCount and IsNull(@p_maxPageCount, page_count)
                Order By ' + @defragOrderColumn + ' ' + @defragSortOrder;

            SET @getIndexSQL_Param = N'@objectID_Out        int OutPut
                                     , @indexID_Out         int OutPut
                                     , @databaseID_Out      int OutPut
                                     , @databaseName_Out    nvarchar(128) OutPut
                                     , @fragmentation_Out   int OutPut
                                     , @partitionNumber_Out int OutPut
                                     , @pageCount_Out       int OutPut
                                     , @p_minPageCount      int
                                     , @p_maxPageCount      int';

            EXECUTE sp_executesql @getIndexSQL
                , @getIndexSQL_Param
                , @p_minPageCount       = @minPageCount
                , @p_maxPageCount       = @maxPageCount
                , @objectID_Out         = @objectID OUTPUT
                , @indexID_Out          = @indexID OUTPUT
                , @databaseID_Out       = @databaseID OUTPUT
                , @databaseName_Out     = @databaseName OUTPUT
                , @fragmentation_Out    = @fragmentation OUTPUT
                , @partitionNumber_Out  = @partitionNumber OUTPUT
                , @pageCount_Out        = @pageCount OUTPUT;

            IF @debugMode = 1 RAISERROR('  Looking up the specifics for our index...', 0, 42) WITH NoWait;

            /* Look up index information */
            SELECT @updateSQL = N'Update ids
                Set schemaName = QuoteName(s.name)
                    , objectName = QuoteName(o.name)
                    , indexName = QuoteName(i.name)
                From dbo.dba_indexDefragStatus As ids
                Inner Join ' + @databaseName + '.sys.objects As o
                    On ids.objectID = o.object_id
                Inner Join ' + @databaseName + '.sys.indexes As i
                    On o.object_id = i.object_id
                    And ids.indexID = i.index_id
                Inner Join ' + @databaseName + '.sys.schemas As s
                    On o.schema_id = s.schema_id
                Where o.object_id = ' + CAST(@objectID AS VARCHAR(10)) + '
                    And i.index_id = ' + CAST(@indexID AS VARCHAR(10)) + '
                    And i.type > 0
                    And ids.databaseID = ' + CAST(@databaseID AS VARCHAR(10));

            EXECUTE sp_executesql @updateSQL;

            /* Grab our object names */
            SELECT @objectName  = objectName
                , @schemaName   = schemaName
                , @indexName    = indexName
            FROM dbo.dba_indexDefragStatus
            WHERE objectID = @objectID
                And indexID = @indexID
                And databaseID = @databaseID;

            IF @debugMode = 1 RAISERROR('  Grabbing the partition count...', 0, 42) WITH NoWait;

            /* Determine if the index is partitioned */
            SELECT @partitionSQL = 'Select @partitionCount_OUT = Count(*)
                                    From ' + @databaseName + '.sys.partitions
                                    Where object_id = ' + CAST(@objectID AS VARCHAR(10)) + '
                                        And index_id = ' + CAST(@indexID AS VARCHAR(10)) + ';'
                , @partitionSQL_Param = '@partitionCount_OUT int OutPut';

            EXECUTE sp_executesql @partitionSQL, @partitionSQL_Param, @partitionCount_OUT = @partitionCount OUTPUT;

            IF @debugMode = 1 RAISERROR('  Seeing if there are any LOBs to be handled...', 0, 42) WITH NoWait;

            /* Determine if the table contains LOBs */
            SELECT @LOB_SQL = ' Select @containsLOB_OUT = Count(*)
                                From ' + @databaseName + '.sys.columns With (NoLock)
                                Where [object_id] = ' + CAST(@objectID AS VARCHAR(10)) + '
                                    And (system_type_id In (34, 35, 99)
                                        Or max_length = -1);'
                                /*  system_type_id --> 34 = image, 35 = text, 99 = ntext
                                    max_length = -1 --> varbinary(max), varchar(max), nvarchar(max), xml */
                , @LOB_SQL_Param = '@containsLOB_OUT int OutPut';

            EXECUTE sp_executesql @LOB_SQL, @LOB_SQL_Param, @containsLOB_OUT = @containsLOB OUTPUT;

            IF @debugMode = 1 RAISERROR('  Checking for indexes that do not allow page locks...', 0, 42) WITH NoWait;

            /* Determine if page locks are allowed; for those indexes, we need to always rebuild */
            SELECT @allowPageLockSQL = 'Select @allowPageLocks_OUT = Count(*)
                                        From ' + @databaseName + '.sys.indexes
                                        Where object_id = ' + CAST(@objectID AS VARCHAR(10)) + '
                                            And index_id = ' + CAST(@indexID AS VARCHAR(10)) + '
                                            And Allow_Page_Locks = 0;'
                , @allowPageLockSQL_Param = '@allowPageLocks_OUT int OutPut';

            EXECUTE sp_executesql @allowPageLockSQL, @allowPageLockSQL_Param, @allowPageLocks_OUT = @allowPageLocks OUTPUT;

            IF @debugMode = 1 RAISERROR('  Building our SQL statements...', 0, 42) WITH NoWait;

            /* If there's not a lot of fragmentation, or if we have a LOB, we should reorganize */
            IF (@fragmentation < @rebuildThreshold Or @containsLOB >= 1 Or @partitionCount > 1)
                And @allowPageLocks = 0
            BEGIN

                SET @sqlCommand = N'Alter Index ' + @indexName + N' On ' + @databaseName + N'.'
                                    + @schemaName + N'.' + @objectName + N' ReOrganize';

                /* If our index is partitioned, we should always reorganize */
                IF @partitionCount > 1
                    SET @sqlCommand = @sqlCommand + N' Partition = '
                                    + CAST(@partitionNumber AS NVARCHAR(10));

            END
            /* If the index is heavily fragmented and doesn't contain any partitions or LOBs,
               or if the index does not allow page locks, rebuild it */
            ELSE IF (@fragmentation >= @rebuildThreshold Or @allowPageLocks <> 0)
                And IsNull(@containsLOB, 0) != 1 And @partitionCount <= 1
            BEGIN

                /* Set online rebuild options; requires Enterprise Edition */
                IF @onlineRebuild = 1 And @editionCheck = 1
                    SET @rebuildCommand = N' Rebuild With (Online = On';
                ELSE
                    SET @rebuildCommand = N' Rebuild With (Online = Off';

                /* Set sort operation preferences */
                IF @sortInTempDB = 1
                    SET @rebuildCommand = @rebuildCommand + N', Sort_In_TempDB = On';
                ELSE
                    SET @rebuildCommand = @rebuildCommand + N', Sort_In_TempDB = Off';

                /* Set processor restriction options; requires Enterprise Edition */
                IF @maxDopRestriction IS Not Null And @editionCheck = 1
                    SET @rebuildCommand = @rebuildCommand + N', MaxDop = ' + CAST(@maxDopRestriction AS VARCHAR(2)) + N')';
                ELSE
                    SET @rebuildCommand = @rebuildCommand + N')';

                SET @sqlCommand = N'Alter Index ' + @indexName + N' On ' + @databaseName + N'.'
                                + @schemaName + N'.' + @objectName + @rebuildCommand;

            END
            ELSE
                /* Print an error message if any indexes happen to not meet the criteria above */
                IF @printCommands = 1 Or @debugMode = 1
                    RAISERROR('We are unable to defrag this index.', 0, 42) WITH NoWait;

            /* Are we executing the SQL?  If so, do it */
            IF @executeSQL = 1
            BEGIN

                SET @debugMessage = 'Executing: ' + @sqlCommand;

                /* Print the commands we're executing if specified to do so */
                IF @printCommands = 1 Or @debugMode = 1
                    RAISERROR(@debugMessage, 0, 42) WITH NoWait;

                /* Grab the time for logging purposes */
                SET @dateTimeStart = GETDATE();

                /* Log our actions */
                INSERT INTO dbo.dba_indexDefragLog
                (
                      databaseID
                    , databaseName
                    , objectID
                    , objectName
                    , indexID
                    , indexName
                    , partitionNumber
                    , fragmentation
                    , page_count
                    , dateTimeStart
                    , sqlStatement
                )
                SELECT
                      @databaseID
                    , @databaseName
                    , @objectID
                    , @objectName
                    , @indexID
                    , @indexName
                    , @partitionNumber
                    , @fragmentation
                    , @pageCount
                    , @dateTimeStart
                    , @sqlCommand;

                SET @indexDefrag_id = SCOPE_IDENTITY();

                /* Wrap our execution attempt in a try/catch and log any errors that occur */
                BEGIN Try

                    /* Execute our defrag! */
                    EXECUTE sp_executesql @sqlCommand;
                    SET @dateTimeEnd = GETDATE();

                    /* Update our log with our completion time */
                    UPDATE dbo.dba_indexDefragLog
                    SET dateTimeEnd = @dateTimeEnd
                        , durationSeconds = DATEDIFF(SECOND, @dateTimeStart, @dateTimeEnd)
                    WHERE indexDefrag_id = @indexDefrag_id;

                END Try
                BEGIN Catch

                    /* Update our log with our error message */
                    UPDATE dbo.dba_indexDefragLog
                    SET dateTimeEnd = GETDATE()
                        , durationSeconds = -1
                        , errorMessage = Error_Message()
                    WHERE indexDefrag_id = @indexDefrag_id;

                    IF @debugMode = 1
                        RAISERROR('  An error has occurred executing this command! Please review the dba_indexDefragLog table for details.'
                            , 0, 42) WITH NoWait;

                END Catch

                /* Just a little breather for the server */
                WAITFOR Delay @defragDelay;

                UPDATE dbo.dba_indexDefragStatus
                SET defragDate = GETDATE()
                    , printStatus = 1
                WHERE databaseID        = @databaseID
                  And objectID          = @objectID
                  And indexID           = @indexID
                  And partitionNumber   = @partitionNumber;

            END
            ELSE
            /* Looks like we're not executing, just printing the commands */
            BEGIN
                IF @debugMode = 1 RAISERROR('  Printing SQL statements...', 0, 42) WITH NoWait;

                IF @printCommands = 1 Or @debugMode = 1
                    PRINT IsNull(@sqlCommand, 'error!');

                UPDATE dbo.dba_indexDefragStatus
                SET printStatus = 1
                WHERE databaseID        = @databaseID
                  And objectID          = @objectID
                  And indexID           = @indexID
                  And partitionNumber   = @partitionNumber;
            END

        END

        /* Do we want to output our fragmentation results? */
        IF @printFragmentation = 1
        BEGIN

            IF @debugMode = 1 RAISERROR('  Displaying a summary of our action...', 0, 42) WITH NoWait;

            SELECT databaseID
                , databaseName
                , objectID
                , objectName
                , indexID
                , indexName
                , partitionNumber
                , fragmentation
                , page_count
                , range_scan_count
            FROM dbo.dba_indexDefragStatus
            WHERE defragDate >= @startDateTime
            ORDER BY defragDate;

        END;

    END Try
    BEGIN Catch

        SET @debugMessage = Error_Message() + ' (Line Number: ' + CAST(Error_Line() AS VARCHAR(10)) + ')';
        PRINT @debugMessage;

    END Catch;

    /* When everything is said and done, make sure to get rid of our temp table */
    DROP TABLE #databaseList;
    DROP TABLE #processor;
    DROP TABLE #maxPartitionList;

    IF @debugMode = 1 RAISERROR('DONE!  Thank you for taking care of your indexes!  :)', 0, 42) WITH NoWait;

    SET NOCOUNT OFF;
    RETURN 0
END
You can also download it here: dba_indexDefrag_sp_v40_public.txt
I've had this latest version in production on terabyte-size databases running SQL Server 2005 and 2008 Enterprise editions for the last 3 months, where it runs nightly without issue. I've also had numerous beta testers report success in their environments, too. But to be safe, make sure to keep an eye on it the first time it runs to ensure you understand the impact on your server.
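For that first run, a query along these lines can help you see what actually happened. It's only a sketch: it assumes the dbo.dba_indexDefragLog table that the procedure writes to, uses only columns that appear in the script's INSERT and UPDATE statements, and the one-day window is an arbitrary choice.

```sql
/* Review what the defrag job did over the last day, slowest operations first */
SELECT databaseName
    , objectName
    , indexName
    , partitionNumber
    , fragmentation
    , page_count
    , dateTimeStart
    , dateTimeEnd
    , durationSeconds   /* the proc sets this to -1 when a command fails */
    , errorMessage
    , sqlStatement
FROM dbo.dba_indexDefragLog
WHERE dateTimeStart >= DATEADD(DAY, -1, GETDATE())
ORDER BY durationSeconds DESC;
```

Rows with a populated errorMessage are the ones to investigate first.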
Enjoy!
Michelle
Filtered Indexes Work-Around
Recently, I needed to create a stored procedure that queried a rather large table. The table has a filtered index on a date column, and it covers the query. However, the Query Optimizer was not using the index, which was increasing the execution time (not to mention IO!) by at least 10x. This wasn't the first time I've had the Optimizer fail to use a filtered index. Normally when this happens, I use a table hint to force the filtered index -- after I verify that it is indeed faster, of course. However, since this was a stored procedure, I was receiving the following error message whenever I tried to execute the proc:
Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.
SQL Server would not allow me to execute the stored procedure using the filtered index hint. If I removed the hint, it executed, but it used a different, non-covering and far more expensive index. For those of you not familiar with this issue, allow me to illustrate the problem.
First, create a table to play with and populate it with some bogus data:
CREATE TABLE dbo.filteredIndexTest
(
      myID   INT IDENTITY(1,3)
    , myDate SMALLDATETIME
    , myData CHAR(100)

    CONSTRAINT PK_filteredIndexTest PRIMARY KEY CLUSTERED(myID)
);

SET NOCOUNT ON;

DECLARE @DATE SMALLDATETIME = '2010-01-01';

WHILE @DATE < '2010-02-01'
BEGIN

    INSERT INTO dbo.filteredIndexTest
    (
          myDate
        , myData
    )
    SELECT @DATE
        , 'Date: ' + CONVERT(VARCHAR(20), @DATE, 102);

    SET @DATE = DATEADD(MINUTE, 1, @DATE);

END;

SELECT COUNT(*) FROM dbo.filteredIndexTest;
It looks like this will generate 44,640 rows of test data... plenty enough for our purposes. Now, let's create our filtered index and write a query that will use it:
CREATE NONCLUSTERED INDEX IX_filteredIndexTest_1
    ON dbo.filteredIndexTest(myDate)
    Include (myData)
    WHERE myDate >= '2010-01-27';

SELECT DISTINCT myData
FROM dbo.filteredIndexTest
WHERE myDate >= '2010-01-28';
If you look at the execution plan for this query, you'll notice that the Optimizer is using the filtered index. Perfect! Now let's parameterize it.
DECLARE @myDate1 SMALLDATETIME = '2010-01-28';

SELECT DISTINCT myData
FROM dbo.filteredIndexTest
WHERE myDate >= @myDate1;
Uh oh. Looking at the execution plan, we see that SQL Server is no longer using the filtered index. Instead, it's scanning the clustered index! Why is this? There's actually a good explanation for it. The reason is that I could, in theory, pass a date to my parameter that fell outside of the filtered date range. If that's the case, then SQL Server could not utilize the filtered index. Personally, I think it's a bug and SQL Server should identify whether or not a filtered index could be used based on the actual value submitted, but... that's a whole other blog post.
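If you'd rather not rely on eyeballing execution plans alone, the index usage DMV offers a rough cross-check. This is a sketch rather than part of my original test: it reads sys.dm_db_index_usage_stats for the test table, so you can run it before and after each query and watch whether the seek count on IX_filteredIndexTest_1 or the scan count on the clustered index goes up. The counters reset when the instance restarts, and an index that hasn't been touched yet shows NULLs.

```sql
/* Which index on the test table is actually being used? */
SELECT i.name AS indexName
    , i.has_filter
    , ius.user_seeks
    , ius.user_scans
    , ius.last_user_seek
    , ius.last_user_scan
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS ius
    ON ius.database_id = DB_ID()
    AND ius.[object_id] = i.[object_id]
    AND ius.index_id = i.index_id
WHERE i.[object_id] = OBJECT_ID(N'dbo.filteredIndexTest');
```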
So what can we do? Well, dynamic SQL may be able to help us out in this case. Let's give it a go. First, let's try parameterized dynamic SQL.
DECLARE @mySQL1 NVARCHAR(2000)
    , @myParam NVARCHAR(2000) = '@p_myDate2 smalldatetime'
    , @myDate2 SMALLDATETIME = '2010-01-28';

SET @mySQL1 = 'Select Distinct myData From dbo.filteredIndexTest Where myDate >= @p_myDate2';

EXECUTE SP_EXECUTESQL @mySQL1, @myParam, @p_myDate2 = @myDate2;
Looking at the execution plan, we see we're still scanning on the clustered index. This is because the parameterized dynamic SQL resolves to be the exact same query as the one above it. Let's try unparameterized SQL instead:
DECLARE @mySQL2 NVARCHAR(2000)
    , @myDate3 SMALLDATETIME = '2010-01-28';

SET @mySQL2 = 'Select Distinct myData From dbo.filteredIndexTest Where myDate >= '''
    + CAST(@myDate3 AS VARCHAR(20)) + '''';

EXECUTE SP_EXECUTESQL @mySQL2;

-- Drop Table dbo.filteredIndexTest;
Voila! We have a seek on our filtered index. Why? Because the statement resolves to be identical to our first query, where we hard-coded the date value in the WHERE clause.
Now, I want to stress this fact: you should always, ALWAYS use parameterized dynamic SQL whenever possible. Not only is it safer, but it's also faster, because it can reuse cached plans. But sometimes you just cannot accomplish the same tasks with it. This is one of those times. If you do end up needing to use unparameterized dynamic SQL as a work-around, please make sure you're validating your input, especially if you're interfacing with any sort of external source.
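Here's a minimal sketch of what that validation might look like when the date arrives as a string from somewhere you don't control. The variable names (@userInput, @safeDate, @mySQL3) are made up for illustration. The idea is to force the input through a typed SMALLDATETIME variable first, then build the literal from the typed value rather than from the raw string, so nothing but a date can end up in the statement.

```sql
DECLARE @userInput NVARCHAR(30) = N'2010-01-28'   /* hypothetical external input */
    , @safeDate SMALLDATETIME
    , @mySQL3 NVARCHAR(2000);

IF ISDATE(@userInput) = 1
BEGIN

    /* The cast fails outright if the value is outside the smalldatetime range,
       which also stops the statement from being built */
    SET @safeDate = CAST(@userInput AS SMALLDATETIME);

    /* Style 126 produces an ISO 8601 literal that does not depend on language settings */
    SET @mySQL3 = N'Select Distinct myData From dbo.filteredIndexTest Where myDate >= '''
        + CONVERT(VARCHAR(19), @safeDate, 126) + N''';';

    EXECUTE sp_executesql @mySQL3;

END
ELSE
    RAISERROR('Invalid date supplied; statement not executed.', 16, 1);
```

Because the literal is rebuilt from the typed variable, the statement should resolve the same way as the hard-coded example, but do confirm that in the execution plan.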
There's an even easier work-around for this problem that Dave (http://www.crappycoding.com) shared with me: recompile.
Adding "Option (Recompile)" to the end of your statements will force the Optimizer to re-evaluate which index will best meet the needs of your query every time the statement is executed. More importantly, it evaluates the plan based on the actual values passed to the parameter... just like in our hard-coded and dynamic SQL examples. Let's see it in action:
DECLARE @myDate4 SMALLDATETIME = '2010-01-28';

SELECT DISTINCT myData
FROM dbo.filteredIndexTest
WHERE myDate >= @myDate4
OPTION (RECOMPILE);

DECLARE @myDate5 SMALLDATETIME = '2010-01-20';

SELECT DISTINCT myData
FROM dbo.filteredIndexTest
WHERE myDate >= @myDate5
OPTION (RECOMPILE);
If we look at the execution plans for the 2 queries above, we see that the first query seeks on the filtered index, and the second query scans the clustered index. The second query cannot be satisfied by the filtered index because we limited the index to dates on or after 1/27/2010, and this query asks for dates starting 1/20/2010.
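Since my original problem was inside a stored procedure, here's a sketch of how the recompile hint might look there. The procedure name is hypothetical, and I'm assuming the hint behaves the same for a procedure parameter as it does for the local variables above; check the plan on your own build before relying on it.

```sql
CREATE PROCEDURE dbo.filteredIndexTest_get_sp   /* hypothetical name */
    @myDate SMALLDATETIME
AS
BEGIN

    SET NOCOUNT ON;

    /* The statement-level hint recompiles only this query, not the whole proc */
    SELECT DISTINCT myData
    FROM dbo.filteredIndexTest
    WHERE myDate >= @myDate
    OPTION (RECOMPILE);

END
GO

/* Inside the filtered range, then outside of it */
EXECUTE dbo.filteredIndexTest_get_sp @myDate = '2010-01-28';
EXECUTE dbo.filteredIndexTest_get_sp @myDate = '2010-01-20';
```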
There are, of course, trade-offs associated with each approach, so use whichever one best meets your needs. Do you have another work-around for this issue? If so, please let me know.
Update:
Alex Kuznetsov (http://www.simple-talk.com/author/alex-kuznetsov/) shared this method too:
DECLARE @myDate1 SMALLDATETIME = '2010-01-28';

SELECT DISTINCT myData
FROM dbo.filteredIndexTest
WHERE myDate = @myDate1
    AND myDate >= '2010-01-27';
Like the other examples, this will result in an index seek on the filtered index. Basically, by explicitly declaring the start date of your filter, you're letting the Optimizer know that the filtered index can satisfy the request, regardless of the parameter value passed. Thanks for the tip, Alex!
#PASSAwesomeness
Allen Kinsel on Twitter (@sqlinsaneo) recently started a new Twitter tag, #PASSAwesomeness, about all of the cool things about PASS Summit. I really like the tag, so I'm going to blatantly steal, er, borrow it for this post.
First, and long overdue, I want to give a brief recap of the East Iowa SQL Saturday. On October 17th, our local PASS chapter, 380PASS, sponsored our first ever SQL Saturday at the University of Iowa in Iowa City. By all accounts, the event was a great success! We had 90 attendees, 11 speakers, and 21 sessions. We received numerous compliments on the quality of the speakers, the niceness of the facilities, and the abundance of food. Not too shabby for our first time hosting the event, if I do say so myself.
I'd like to thank all of our wonderful speakers, especially those who traveled from out of town and out of state, for making this event such a success. I'd also like to thank our amazing volunteers for helping put this all together. Lastly, but certainly not least, I'd like to thank our generous sponsors, without whom this event would not be possible. Because this event went so smoothly and was so well received in the community, we've already started planning our next big SQL event! In the meantime, don't forget to check out our monthly 380PASS meetings to tide you over.
I'd also like to take a moment to discuss the PASS Summit. Unless you're a DBA who's been living under a rock, you've probably heard of the PASS Summit. If you *have* been living under a rock -- and hey, I'm not poking fun, I used to live under a rock, too! -- then what you need to know is that the Summit is the largest SQL Server conference in the world. It's a gathering of Microsoft developers and SQL Server gurus; the rest of us show up to try to absorb as much from them as possible. Since I've recently moved to the Business Intelligence team, I'm extremely excited to delve into the amazing amount of BI content offered.
I'm also deeply honored to be presenting at the Summit this year on some of the performance tuning techniques I've used with great success in my production environments. The session is titled "Super Bowl, Super Load - A Look At Performance Tuning for VLDB's." If you're interested in performance tuning or VLDB (very large database) topics, consider stopping by to catch my session. From what I can tell, I'll be presenting on Tuesday from 10:15am - 11:30am in room(s?) 602-604.
If you read my blog, or if we've ever interacted in any way on the internet -- Twitter, LinkedIn, e-mails, blog comments, etc. -- please stop by and say "hi"! Aside from all of the awesome SQL Server content, I'm really looking forward to meeting as many new folks as possible.
And on that note...
Getting to meet all of the amazing SQL Server professionals out there who have inspired and encouraged me in so many ways #PASSAwesomeness
Partitioning Tricks
For those of you who are using partitioning, or who are considering using partitioning, allow me to share some tips with you.
Easy Partition Staging Tables
Switching partitions (or more specifically, hobts) in and out of a partitioned table requires the use of a staging table. The staging table has very specific requirements: it must be completely identical to the partitioned table, including indexing structures, and it must have a check constraint that limits data to the partitioning range. Thanks to my co-worker Jeff, I've recently started using the SQL Server Partition Management tool on CodePlex. I haven't used the automatic partition switching feature -- frankly, using any sort of data modification tool in a production environment makes me nervous -- but I've been using the scripting option to create staging tables in my development environment, which I then copy to production for use. It's nothing you can't do yourself, but it does make the whole process easy and painless, plus it saves you from annoying typos. But be careful when using this tool to just create the table and check constraints automatically, because you may need to...
Add Check Constraints After Loading Data
Most of the time, I add the check constraint when I create the staging table, then I load data and perform the partition switch. However, for some reason, I was receiving the following error:
.Net SqlClient Data Provider: Msg 4972, Level 16, State 1, Line 1
ALTER TABLE SWITCH statement failed. Check constraints or partition function of source table 'myStagingTable' allows values that are not allowed by check constraints or partition function on target table 'myDestinationTable'.
This drove me crazy. I confirmed my check constraints were correct, that I had the correct partition number, and that all schema and indexes matched identically. After about 30 minutes of this, I decided to drop and recreate the constraint. For some reason, it fixed the issue. Repeat tests produced the same results: the check constraint needed to be added *after* data was loaded. This error is occurring on a SQL Server 2008 SP1 box; to be honest, I'm not sure what's causing the error, so if you know, please leave me a comment. But I figured I'd share so that anyone else running into this issue can hopefully save some time and headache.
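For anyone who wants to see the order of operations spelled out, here's a small, self-contained sketch. Only the two table names come from the error message above; the columns, the partition function and scheme (demoSwitch_pf, demoSwitch_ps) and the boundary dates are invented for illustration, and it needs an edition that supports partitioning. It shows the sequence that worked for me, not an explanation of why the other order failed.

```sql
/* Partition 1: < 2010-01-01, partition 2: January 2010, partition 3: >= 2010-02-01 */
CREATE PARTITION FUNCTION demoSwitch_pf (SMALLDATETIME)
    AS RANGE RIGHT FOR VALUES ('2010-01-01', '2010-02-01');

CREATE PARTITION SCHEME demoSwitch_ps
    AS PARTITION demoSwitch_pf ALL TO ([PRIMARY]);

CREATE TABLE dbo.myDestinationTable
(
      myDate SMALLDATETIME NOT NULL
    , myData CHAR(100)     NOT NULL
    , CONSTRAINT PK_myDestinationTable PRIMARY KEY CLUSTERED (myDate)
) ON demoSwitch_ps(myDate);

/* Same structure, same filegroup as the target partition, no check constraint yet */
CREATE TABLE dbo.myStagingTable
(
      myDate SMALLDATETIME NOT NULL
    , myData CHAR(100)     NOT NULL
    , CONSTRAINT PK_myStagingTable PRIMARY KEY CLUSTERED (myDate)
) ON [PRIMARY];

/* 1. Load the data */
INSERT INTO dbo.myStagingTable (myDate, myData)
VALUES ('2010-01-15', 'January row');

/* 2. Add the check constraint AFTER the load */
ALTER TABLE dbo.myStagingTable WITH CHECK
    ADD CONSTRAINT CK_myStagingTable_myDate
    CHECK (myDate >= '2010-01-01' AND myDate < '2010-02-01');

/* 3. Confirm the target partition number, then switch */
SELECT $PARTITION.demoSwitch_pf('2010-01-15') AS targetPartition;   /* returns 2 */

ALTER TABLE dbo.myStagingTable
    SWITCH TO dbo.myDestinationTable PARTITION 2;
```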
Replicating Into Partitioned and Non-Partitioned Tables
Recently, we needed to replicate a non-partitioned table to two different destinations. We wanted to use partitioning for Server A, which has 2008 Enterprise; Server B, which is on 2005 Standard, could not take advantage of partitioning. The solution was really easy: create a pre-snapshot and post-snapshot script for the publication, then modify them to handle each server group differently. Using pseudo-code, it looked something like this:
/* Identify which servers get the partitioned version */
IF @@SERVERNAME In ('yourServerNameList')
BEGIN

    /* Create your partitioning function if necessary */
    IF Not Exists(SELECT * FROM sys.partition_functions WHERE name = 'InsertPartitionFunction')
        CREATE PARTITION FUNCTION InsertPartitionFunction (SMALLDATETIME)
            AS RANGE RIGHT FOR VALUES ('insertValues');

    /* Create your partitioning scheme if necessary; it depends on the function above */
    IF Not Exists(SELECT * FROM sys.partition_schemes WHERE name = 'InsertPartitionScheme')
        CREATE PARTITION SCHEME InsertPartitionScheme
            AS PARTITION InsertPartitionFunction ALL TO ([PRIMARY]);

    /* Create a partitioned version of your table */
    CREATE TABLE [dbo].[yourTableName]
    (
        [yourTableSchema]
    ) ON InsertPartitionScheme([partitioningKey]);

END
ELSE
BEGIN

    /* Create a non-partitioned version of your table */
    CREATE TABLE [dbo].[yourTableName]
    (
        [yourTableSchema]
    ) ON [PRIMARY];

END
You could also use an edition check instead of a server name check, if you prefer. The post-snapshot script basically looked the same, except you create partitioned indexes instead.
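As a rough sketch of the edition check, SERVERPROPERTY('EngineEdition') returns 3 for the Enterprise engine (which also covers Developer and Evaluation editions). The PRINT statements below are just placeholders for the two CREATE TABLE branches shown above.

```sql
/* Branch on engine edition instead of a hard-coded server name list */
IF CAST(SERVERPROPERTY('EngineEdition') AS INT) = 3
BEGIN
    PRINT 'Enterprise-class engine: create the partitioned version of the table here.';
END
ELSE
BEGIN
    PRINT 'Other edition: create the non-partitioned version of the table here.';
END
```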
Compress Old Partitions
Did you know you can set different compression levels for individual partitions? It's true! I've just completed doing this on our largest partitioned table. Here's how:
/* Apply compression to your partitioned table */
ALTER TABLE dbo.yourTableName
Rebuild Partition = All
WITH
(
      Data_Compression = Page ON Partitions(1 TO 9)
    , Data_Compression = ROW  ON Partitions(10 TO 11)
    , Data_Compression = NONE ON Partitions(12)
);

/* Apply compression to your partitioned index */
ALTER INDEX YourPartitionedIndex
    ON dbo.yourTableName
    Rebuild Partition = All
    WITH
    (
          Data_Compression = Page ON Partitions(1 TO 9)
        , Data_Compression = ROW  ON Partitions(10 TO 11)
        , Data_Compression = NONE ON Partitions(12)
    );

/* Apply compression to your unpartitioned index */
ALTER INDEX YourUnpartitionedIndex
    ON dbo.yourTableName
    Rebuild WITH (Data_Compression = ROW);
A couple of things to note. In all of our proof-of-concept testing, we found that compression significantly reduced query execution time, reads (IO), and storage. However, CPU usage also increased significantly. The results were more dramatic, both good and bad, with page compression versus row compression. Still, for our older partitions, which aren't queried regularly, it made sense to turn on page compression. The newer partitions receive row compression, and the newest partitions, which are still queried very regularly by routine processes, were left completely uncompressed. This seems to strike a nice balance in our environment, but of course, results will vary depending on how you use your data.
Something to be aware of is that compressing your clustered index does *not* compress your non-clustered indexes; those are separate operations. Lastly, for those who are curious, it took us about 1 minute to apply row compression and about 7 minutes to apply page compression to partitions averaging 30 million rows.
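To confirm what each partition ended up with, and to get a feel for the savings before committing to a rebuild, something like the following should work. It's a sketch that reuses the dbo.yourTableName placeholder from above; sys.partitions and sp_estimate_data_compression_savings are both standard in SQL Server 2008, though the estimate procedure requires an edition that supports compression.

```sql
/* Current compression setting for every partition of every index on the table */
SELECT i.name AS indexName
    , p.index_id
    , p.partition_number
    , p.data_compression_desc
    , p.rows
FROM sys.partitions AS p
JOIN sys.indexes AS i
    ON i.[object_id] = p.[object_id]
    AND i.index_id = p.index_id
WHERE p.[object_id] = OBJECT_ID(N'dbo.yourTableName')
ORDER BY p.index_id
    , p.partition_number;

/* Estimate the savings of page compression across all indexes and partitions */
EXECUTE sp_estimate_data_compression_savings
      @schema_name      = 'dbo'
    , @object_name      = 'yourTableName'
    , @index_id         = NULL
    , @partition_number = NULL
    , @data_compression = 'PAGE';
```

The first query is also a quick way to see that the non-clustered indexes keep their own compression settings after the clustered index is rebuilt.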
Looking for more information on table partitioning? Check out my overview of partitioning, my example code, and my article on indexing on partitioned tables.
Monitoring Process for Performance Counters
Recently I needed to create a process to monitor performance counters over a short period of time. We were going to implement a change and we wanted to compare performance before and after to see if there was any impact.
To do this, I first created a couple of tables. One table is used to actually store the monitored values. The second table is used for configuration; you insert only the counters you want to monitor.
/* Create the table to store our logged perfmon counters */
CREATE TABLE dbo.dba_perfCounterMonitor
(
      capture_id    INT IDENTITY(1,1)   Not Null
    , captureDate   SMALLDATETIME       Not Null
    , objectName    NVARCHAR(128)       Not Null
    , counterName   NVARCHAR(128)       Not Null
    , instanceName  NVARCHAR(128)       Not Null
    , VALUE         FLOAT(6)            Not Null
    , valueType     NVARCHAR(10)        Not Null

    CONSTRAINT PK_dba_perfCounterMonitor
        PRIMARY KEY CLUSTERED(capture_id)
);

/* Create the table that controls which counters we're going to monitor */
CREATE TABLE dbo.dba_perfCounterMonitorConfig
(
      objectName    NVARCHAR(128)   Not Null
    , counterName   NVARCHAR(128)   Not Null
    , instanceName  NVARCHAR(128)   Null
);
If you leave the instanceName NULL in the config table, it'll monitor all instances. Now we're going to insert some sample performance counters into the config table. The counters you're interested in can, and likely will, vary.
/* Insert some perfmon counters to be monitored */
INSERT INTO dbo.dba_perfCounterMonitorConfig
SELECT 'SQLServer:Buffer Manager', 'Page Life Expectancy', Null UNION All
SELECT 'SQLServer:Locks', 'Lock Requests/sec', Null UNION All
SELECT 'SQLServer:Locks', 'Lock Waits/sec', Null UNION All
SELECT 'SQLServer:Locks', 'Lock Wait Time (ms)', Null UNION All
SELECT 'SQLServer:Buffer Manager', 'Page reads/sec', Null UNION All
SELECT 'SQLServer:Buffer Manager', 'Page writes/sec', Null UNION All
SELECT 'SQLServer:Buffer Manager', 'Buffer cache hit ratio', Null UNION All
SELECT 'SQLServer:Databases', 'Transactions/sec', 'AdventureWorks' UNION All
SELECT 'SQLServer:General Statistics', 'Processes blocked', Null;
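Because the proc joins the config table to sys.dm_os_performance_counters on the object, counter, and instance names, a config row that matches nothing is silently ignored. Here's a quick sanity check, offered as a sketch rather than part of the original process. One thing it tends to catch: on a named instance the object names start with 'MSSQL$YourInstanceName:' rather than 'SQLServer:'.

```sql
/* Config rows that do not match any counter on this instance */
SELECT pcml.objectName
    , pcml.counterName
    , pcml.instanceName
FROM dbo.dba_perfCounterMonitorConfig AS pcml
WHERE Not Exists
(
    SELECT *
    FROM sys.dm_os_performance_counters AS dopc
    WHERE dopc.[object_name] = pcml.objectName
        And dopc.counter_name = pcml.counterName
        And dopc.instance_name = IsNull(pcml.instanceName, dopc.instance_name)
);
```

An empty result means every configured counter will be picked up.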
Now let's create our proc. This proc will run for a specified time period and will *average* the counters over that time. I personally take snapshots every 15 seconds for 4 minutes; I have a scheduled task that runs this every 5 minutes. It's not perfect, but it gives me a good idea of what's happening on the server.
CREATE PROCEDURE dbo.dba_perfCounterMonitor_sp

        /* Declare Parameters */
          @samplePeriod     INT     = 240           /* how long to sample, in seconds */
        , @sampleRate       CHAR(8) = '00:00:15'    /* how frequently to sample, as hh:mm:ss */
        , @displayResults   BIT     = 0             /* display the results when done */
AS
/*********************************************************************************
    Name:       dba_perfCounterMonitor_sp

    Author:     Michelle Ufford, http://sqlfool.com

    Purpose:    Monitors performance counters.  Uses the dba_perfCounterMonitorConfig
                table to manage which perf counters to monitor.

                @samplePeriod - specifies how long the process will try to monitor
                                performance counters; in seconds.

                @sampleRate - how long in between samples; in hh:mm:ss format.

                The average value over the sample period is then logged to the
                dba_perfCounterMonitor table.

    Notes:      There are 3 basic types of performance counter calculations:

                Value/Base: these calculations require 2 counters. The value
                            counter (cntr_type = 537003264) has to be divided
                            by the base counter (cntr_type = 1073939712).

                Per Second: these counters store cumulative values; the
                            value must be compared at 2 different times to
                            calculate the difference (cntr_type = 272696576).

                Point In Time: these counters show what the value of the
                               counter is at the current point-in-time
                               (cntr_type = 65792).  No calculation is
                               necessary to derive the value.

    Called by:  DBA

    Date        User    Description
    ----------------------------------------------------------------------------
    2009-09-04  MFU     Initial Release
*********************************************************************************
    Exec dbo.dba_perfCounterMonitor_sp
          @samplePeriod     = 60
        , @sampleRate       = '00:00:01'
        , @displayResults   = 1;
*********************************************************************************/

SET NOCOUNT ON;
SET XACT_Abort ON;
SET Ansi_Padding ON;
SET Ansi_Warnings ON;
SET ArithAbort ON;
SET Concat_Null_Yields_Null ON;
SET Numeric_RoundAbort OFF;

BEGIN

    /* Declare Variables */
    DECLARE @startTime DATETIME
        , @endTime DATETIME
        , @iteration INT;

    SELECT @startTime = GETDATE()
        , @iteration = 1;

    DECLARE @samples TABLE
    (
          iteration     INT             Not Null
        , objectName    NVARCHAR(128)   Not Null
        , counterName   NVARCHAR(128)   Not Null
        , instanceName  NVARCHAR(128)   Not Null
        , cntr_value    FLOAT           Not Null
        , base_value    FLOAT           Null
        , cntr_type     BIGINT          Not Null
    );

    BEGIN Try

        /* Start a new transaction */
        BEGIN TRANSACTION;

        /* Grab all of our counters */
        INSERT INTO @samples
        SELECT @iteration
            , RTRIM(dopc.OBJECT_NAME)
            , RTRIM(dopc.counter_name)
            , RTRIM(dopc.instance_name)
            , dopc.cntr_value
            , (SELECT cntr_value
               FROM sys.dm_os_performance_counters AS dopc1
               WHERE dopc1.OBJECT_NAME = pcml.objectName
                   And dopc1.counter_name = pcml.counterName + ' base'
                   And dopc1.instance_name = IsNull(pcml.instanceName, dopc.instance_name))
            , dopc.cntr_type
        FROM sys.dm_os_performance_counters AS dopc
        Join dbo.dba_perfCounterMonitorConfig AS pcml
            ON dopc.OBJECT_NAME = pcml.objectName
                And dopc.counter_name = pcml.counterName
                And dopc.instance_name = IsNull(pcml.instanceName, dopc.instance_name);

        /* During our sample period, grab our counter values and store the results */
        WHILE GETDATE() < DATEADD(SECOND, @samplePeriod, @startTime)
        BEGIN

            SET @iteration = @iteration + 1;

            INSERT INTO @samples
            SELECT @iteration
                , RTRIM(dopc.OBJECT_NAME)
                , RTRIM(dopc.counter_name)
                , RTRIM(dopc.instance_name)
                , dopc.cntr_value
                , (SELECT cntr_value
                   FROM sys.dm_os_performance_counters AS dopc1
                   WHERE dopc1.OBJECT_NAME = pcml.objectName
                       And dopc1.counter_name = pcml.counterName + ' base'
                       And dopc1.instance_name = IsNull(pcml.instanceName, dopc.instance_name))
                , dopc.cntr_type
            FROM sys.dm_os_performance_counters AS dopc
            Join dbo.dba_perfCounterMonitorConfig AS pcml
                ON dopc.OBJECT_NAME = pcml.objectName
                    And dopc.counter_name = pcml.counterName
                    And dopc.instance_name = IsNull(pcml.instanceName, dopc.instance_name);

            /* Wait for a small delay */
            WAITFOR Delay @sampleRate;

        END;

        /* Grab our end time for calculations */
        SET @endTime = GETDATE();

        /* Store the average of our point-in-time counters */
        INSERT INTO dbo.dba_perfCounterMonitor
        (
              captureDate
            , objectName
            , counterName
            , instanceName
            , VALUE
            , valueType
        )
        SELECT @startTime
            , objectName
            , counterName
            , instanceName
            , AVG(cntr_value)
            , 'value'
        FROM @samples
        WHERE cntr_type = 65792
        GROUP BY objectName
            , counterName
            , instanceName;

        /* Store the average of the value vs the base for cntr_type = 537003264 */
        INSERT INTO dbo.dba_perfCounterMonitor
        (
              captureDate
            , objectName
            , counterName
            , instanceName
            , VALUE
            , valueType
        )
        SELECT @startTime
            , objectName
            , counterName
            , instanceName
            , AVG(cntr_value)/AVG(IsNull(base_value, 1))
            , 'percent'
        FROM @samples
        WHERE cntr_type = 537003264
        GROUP BY objectName
            , counterName
            , instanceName;

        /* Compare the first and last values for our cumulative, per-second counters */
        INSERT INTO dbo.dba_perfCounterMonitor
        (
              captureDate
            , objectName
            , counterName
            , instanceName
            , VALUE
            , valueType
        )
        SELECT @startTime
            , objectName
            , counterName
            , instanceName
            , (MAX(cntr_value) - MIN(cntr_value)) / DATEDIFF(SECOND, @startTime, @endTime)
            , 'value'
        FROM @samples
        WHERE cntr_type = 272696576
        GROUP BY objectName
            , counterName
            , instanceName;

        /* Should we display the results of our most recent execution? */
        IF @displayResults = 1
            SELECT captureDate
                , objectName
                , counterName
                , instanceName
                , VALUE
                , valueType
            FROM dbo.dba_perfCounterMonitor WITH (NoLock)
            WHERE captureDate = CAST(@startTime AS SMALLDATETIME)
            ORDER BY objectName
                , counterName
                , instanceName;

        /* If you have an open transaction, commit it */
        IF @@TRANCOUNT > 0
            COMMIT TRANSACTION;

    END Try
    BEGIN Catch

        /* Whoops, there was an error... rollback! */
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;

        /* Return an error message and log it */
        EXECUTE dbo.dba_logError_sp;

    END Catch;

    SET NOCOUNT OFF;
    RETURN 0;
END
Go
Like I said, it's not perfect, but it gets the job done.
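Once you've collected data on both sides of your change, a comparison query along these lines gets at the original goal. It's a sketch: @changeDate is a hypothetical cut-over time that you'd replace with your own, and it simply averages each logged counter before and after that point.

```sql
DECLARE @changeDate SMALLDATETIME = '2009-09-04 12:00';   /* hypothetical: when the change went live */

SELECT objectName
    , counterName
    , instanceName
    , valueType
    , AVG(CASE WHEN captureDate <  @changeDate THEN [value] END) AS avgBefore
    , AVG(CASE WHEN captureDate >= @changeDate THEN [value] END) AS avgAfter
FROM dbo.dba_perfCounterMonitor
GROUP BY objectName
    , counterName
    , instanceName
    , valueType
ORDER BY objectName
    , counterName
    , instanceName;
```

A NULL in either column just means no samples were captured on that side of the cut-over.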
Getting an error about dba_logError_sp? Take a look at my error handling proc.
Overhead in Non-Unique Clustered Indexes
I've received a couple of questions regarding my article, Performance Considerations of Data Types, and the overhead associated with non-unique clustered indexes. I started to respond via e-mail, but my response was so long I decided to turn it into a blog post instead.
I should start by clarifying that non-unique clustered indexes do not necessarily consume more space and overhead; it depends on the data stored. If you have duplicate clustered key values, the first instance of the value will be handled as though it were unique. Any subsequent values, however, will incur overhead to manage the uniquifier that SQL Server adds to maintain row uniqueness. The same overhead is incurred in non-clustered indexes, adding to the overall expense of this approach.
I think it helps to actually look at the data, so let's walk through a few different common scenarios. We'll create a table with a unique clustered index, a table with a non-unique clustered index but no duplicates, and a table with duplicate key values.
Also, a little warning that I started to write this in SQL Server 2008, and since I'm on a 2008 kick, I decided to leave it that way. You can modify this pretty easily to work in 2005, if necessary.
USE sandbox;
Go

/* Unique, clustered index, no duplicate values */
CREATE TABLE dbo.uniqueClustered
(
      myDate    DATE        Not Null
    , myNumber  INT         Not Null
    , myColumn  CHAR(995)   Not Null
);

CREATE UNIQUE CLUSTERED INDEX CIX_uniqueClustered
    ON dbo.uniqueClustered(myDate);

/* Non-unique clustered index, but no duplicate values */
CREATE TABLE dbo.nonUniqueNoDups
(
      myDate    DATE        Not Null
    , myNumber  INT         Not Null
    , myColumn  CHAR(995)   Not Null
);

CREATE CLUSTERED INDEX CIX_nonUniqueNoDups
    ON dbo.nonUniqueNoDups(myDate);

/* Non-unique clustered index, duplicate values */
CREATE TABLE dbo.nonUniqueDuplicates
(
      myDate    DATE        Not Null
    , myNumber  INT         Not Null
    , myColumn  CHAR(995)   Not Null
);

CREATE CLUSTERED INDEX CIX_nonUniqueDuplicates
    ON dbo.nonUniqueDuplicates(myDate);
I'm going to use the date data type in 2008 for my clustered index key. To ensure uniqueness for the first two tables, I'll iterate through 20 years' worth of dates. This is typical of what you may see in a data mart, where you'd have one record with an aggregation of each day's data. For the table with duplicate values, I'm going to insert the same date for each row.
/* Populate some test data */
SET NOCOUNT ON;

DECLARE @myDate DATE = '1990-01-01'
    , @myNumber INT = 1;

WHILE @myDate < '2010-01-01'
BEGIN

    INSERT INTO dbo.uniqueClustered
    SELECT @myDate, @myNumber, 'data';

    INSERT INTO dbo.nonUniqueNoDups
    SELECT @myDate, @myNumber, 'data';

    INSERT INTO dbo.nonUniqueDuplicates
    SELECT '2009-01-01', @myNumber, 'data';

    SELECT @myDate = DATEADD(DAY, 1, @myDate)
        , @myNumber += 1;

END;
After running the above script, each table should have 7,305 records. This is obviously pretty small for a table, but it'll serve our purposes. Now let's take a look at the size of our tables:
/* Look at the details of our indexes */

/* Unique, clustered index, no duplicate values */
SELECT 'unique' AS 'type', page_count, avg_page_space_used_in_percent, record_count
    , min_record_size_in_bytes, max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'uniqueClustered'), Null, Null, N'Detailed')
WHERE index_level = 0

UNION All

/* Non-unique clustered index, but no duplicate values */
SELECT 'non-unique, no dups', page_count, avg_page_space_used_in_percent, record_count
    , min_record_size_in_bytes, max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'nonUniqueNoDups'), Null, Null, N'Detailed')
WHERE index_level = 0

UNION All

/* Non-unique clustered index, duplicate values */
SELECT 'duplicates', page_count, avg_page_space_used_in_percent, record_count
    , min_record_size_in_bytes, max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'nonUniqueDuplicates'), Null, Null, N'Detailed')
WHERE index_level = 0;
Here are the results:
type                 page_count  avg_page_space_used_in_percent  record_count  min_record_size_in_bytes  max_record_size_in_bytes
-------------------  ----------  ------------------------------  ------------  ------------------------  ------------------------
unique               914         99.8055102545095                7305          1009                      1009
non-unique, no dups  914         99.8055102545095                7305          1009                      1009
duplicates           1044        88.066036570299                 7305          1009                      1017
I want to point out a couple of things. First, there is no difference in the number of pages between the non-unique clustered index with no duplicates ([nonUniqueNoDups]) and the unique clustered index ([uniqueClustered]). The table with duplicate clustered key values, however, requires 14% more pages to store the same amount of data. Secondly, the [max_record_size_in_bytes] of the [nonUniqueDuplicates] table is 8 bytes more than that of the other two. We'll discuss why in a minute.
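The page counts follow directly from the record sizes, and the arithmetic is easy to check. The sketch below assumes the standard page layout (8,096 usable bytes on an 8 KB page after the 96-byte header, plus a 2-byte slot array entry per record) and, for simplicity, treats every record in the duplicates table as 1,017 bytes even though the first one is 1,009.

```sql
DECLARE @pageBytes INT = 8096   /* usable bytes per page after the 96-byte header */
    , @slotBytes INT = 2        /* slot array entry per record */
    , @records   INT = 7305;

SELECT @pageBytes / (1009 + @slotBytes) AS rowsPerPage_1009                 /* integer division: 8 */
    , @pageBytes / (1017 + @slotBytes) AS rowsPerPage_1017                  /* integer division: 7 */
    , CEILING(@records / 8.0) AS pages_at8PerPage                           /* 914 */
    , CEILING(@records / 7.0) AS pages_at7PerPage                           /* 1044 */
    , CAST((1044 - 914) * 100.0 / 914 AS DECIMAL(5,2)) AS pctMorePages;     /* 14.22 */
```

In other words, those 8 extra bytes are just enough to push the eighth record off of each page, which is where the 14% comes from.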
Now let's take a look at the actual data pages. For this, I'm going to use my page internals proc.
EXECUTE dbo.dba_viewPageData_sp
      @databaseName = 'sandbox'
    , @tableName    = 'sandbox.dbo.uniqueClustered'
    , @indexName    = 'CIX_uniqueClustered';
I'm not going to post the entire results here, but I want to draw your attention to "m_slotCnt = 8", which is near the top of the page. That means 8 records are stored on this page. Also, when you look near the end of the first record (Slot 0), you should see the following results:
Slot 0 Offset 0x60 Length 1009
Record Type = PRIMARY_RECORD    Record Attributes = NULL_BITMAP
Record Size = 1009
Memory Dump @0x00A9C060
00000000: 1000ee03 c3150b01 00000064 61746120 †..î.Ã......data
[...]
000003F0: 00†††††††††††††††††††††††††††††††††††.

Slot 0 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 1990-01-01
Slot 0 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 1
Slot 0 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data
Now let's look at the table that has a non-unique clustered index but no duplicates:
EXECUTE dbo.dba_viewPageData_sp
      @databaseName = 'sandbox'
    , @tableName    = 'sandbox.dbo.nonUniqueNoDups'
    , @indexName    = 'CIX_nonUniqueNoDups';
The m_slotCnt value is also 8 for this page. This time, let's glance at the first and second records (Slot 0 and Slot 1, respectively):
Slot 0 Offset 0x60 Length 1009
Record Type = PRIMARY_RECORD    Record Attributes = NULL_BITMAP
Record Size = 1009
Memory Dump @0x62FDC060
00000000: 1000ee03 c3150b01 00000064 61746120 †..î.Ã......data
[...]
000003F0: 00†††††††††††††††††††††††††††††††††††.

Slot 0 Column 0 Offset 0x0 Length 4 Length (physical) 0
UNIQUIFIER = 0
Slot 0 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 1990-01-01
Slot 0 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 1
Slot 0 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data

Slot 1 Offset 0x451 Length 1009
Record Type = PRIMARY_RECORD    Record Attributes = NULL_BITMAP
Record Size = 1009
Memory Dump @0x62FDC451
00000000: 1000ee03 c4150b02 00000064 61746120 †..î.Ä......data
[...]
000003F0: 00†††††††††††††††††††††††††††††††††††.

Slot 1 Column 0 Offset 0x0 Length 4 Length (physical) 0
UNIQUIFIER = 0
Slot 1 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 1990-01-02
Slot 1 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 2
Slot 1 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data
We now see a new addition to the row, "UNIQUIFIER = 0." This is SQL Server's way of managing row uniqueness internally. You'll notice that, because the clustered key values are unique, the UNIQUIFIER is set to 0 and the row size is still 1009; for all intents and purposes, the UNIQUIFIER is not consuming any space.
Update: The DBCC God himself, Paul Randal, explained that non-dupes actually have a NULL UNIQUIFIER, which DBCC PAGE displays as a 0. Thanks for explaining, Paul! I wondered about that but chalked it up to SQL voodoo.
Now let's look at our final case, a non-unique clustered index with duplicate key values:
EXECUTE dbo.dba_viewPageData_sp
      @databaseName = 'sandbox'
    , @tableName    = 'sandbox.dbo.nonUniqueDuplicates'
    , @indexName    = 'CIX_nonUniqueDuplicates';
Here's where things get interesting. The m_slotCnt value is now 7, which means we're now storing one fewer record per page. Let's look at the details:
Slot 0 Offset 0x60 Length 1009
Record Type = PRIMARY_RECORD    Record Attributes = NULL_BITMAP
Record Size = 1009
Memory Dump @0x00A9C060
00000000: 1000ee03 df300b01 00000064 61746120 †..î.ß0.....data
[...]
000003F0: 00†††††††††††††††††††††††††††††††††††.

Slot 0 Column 0 Offset 0x0 Length 4 Length (physical) 0
UNIQUIFIER = 0
Slot 0 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 2009-01-01
Slot 0 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 1
Slot 0 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data

Slot 1 Offset 0x451 Length 1017
Record Type = PRIMARY_RECORD    Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Record Size = 1017
Memory Dump @0x00A9C451
00000000: 3000ee03 df300b02 00000064 61746120 †0.î.ß0.....data
[...]
000003F0: 000100f9 03010000 00†††††††††††††††††...ù.....

Slot 1 Column 0 Offset 0x3f5 Length 4 Length (physical) 4
UNIQUIFIER = 1
Slot 1 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 2009-01-01
Slot 1 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 2
Slot 1 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data
The first record, Slot 0, looks exactly the same as in the previous table; the UNIQUIFIER is 0 and the row size is 1009. The second record (Slot 1), however, now has a UNIQUIFIER value of 1 and a row size of 1017. Notice that the "Record Attributes" of Slot 1 are also different, with the addition of "VARIABLE_COLUMNS." This is because the UNIQUIFIER is stored as a variable-length column. The extra 8 bytes of overhead break down to 4 bytes to store the UNIQUIFIER, 2 bytes to store the variable column offset, and 2 bytes to store the variable column count. The tables we created used all fixed-length columns; you may notice some minor overhead differences if your table already contains variable-length columns.
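As a rough sanity check on those slot counts, the arithmetic can be sketched in T-SQL. This is only a back-of-the-envelope calculation: it assumes 8,096 bytes of row space per page (the 8,192-byte page minus its 96-byte header) plus a 2-byte slot array entry per row, and it treats every row on a page as the same size. The variable names are made up for illustration.

```sql
/* Hypothetical sanity check; variable names are illustrative only */
DECLARE @usableBytes INT, @slotEntry INT, @recordCount INT;

SET @usableBytes = 8096;   /* 8192-byte page minus the 96-byte page header */
SET @slotEntry   = 2;      /* slot array entry stored for each row */
SET @recordCount = 7305;   /* record_count reported by the DMV above */

SELECT @usableBytes / (1009 + @slotEntry) AS rows_per_page_1009   /* integer division: 8 */
     , @usableBytes / (1017 + @slotEntry) AS rows_per_page_1017   /* integer division: 7 */
     , CEILING(@recordCount / 8.0)        AS pages_at_8_rows      /* 914 */
     , CEILING(@recordCount / 7.0)        AS pages_at_7_rows;     /* 1044 */
```

Even when the first row on a page is a 1009-byte record without a UNIQUIFIER, an eighth row still doesn't fit (1,011 + 7 x 1,019 = 8,144 bytes), which lines up with the page counts the DMV reported.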
To summarize, there is indeed a difference in the page structure between a unique clustered index and a non-unique clustered index; however, there's only a possible performance and space impact when storing duplicate clustered key values. So there you go, more detail than you ever wanted to know about clustered indexes and uniqueness!
Performance Considerations of Data Types
I've just finished my first real content for the PASS Performance SIG. I decided to write on "Performance Considerations of Data Types," as I think this is one of the easiest and most overlooked topics in performance tuning. Here's a summary:
Selecting inappropriate data types, especially on large tables with millions or billions of rows, can have significant performance implications. In this article, I’ll explain why and offer suggestions on how to select the most appropriate data type for your needs. The primary focus will be on common data types in SQL Server 2005 and 2008, but I’ll also discuss some aspects of clustered indexes and column properties. Most importantly, I’ll show some examples of common data-type misuse.
If you're interested in this content, you can find it here: Performance Considerations of Data Types.
Special thanks to Paul Randal and Paul Nielsen for providing me with technical reviews and great feedback. You guys are awesome!
Thanks also to Mladen Prajdic and Jeremiah Peschka for their great input. You guys are awesome, too!
A Look at Missing Indexes
Tim Ford (@SQLAgentMan) recently blogged about his Top 5 SQL Server Indexing Best Practices. I thought it was a good list, and it inspired this blog post. I've recently been doing a little index spring cleaning, and I thought some people may be interested in the process I go through. So, here it is... ~~a journey through madness~~ an overview of my general missing index process.
I start with my trusty dba_missingIndexStoredProc table. If this table sounds completely foreign to you, check out my post, Find Missing Indexes In Stored Procs. Basically, I have a process that runs every night, scanning the XML of every query plan on the server to find procs that are possibly missing indexes. I then log the details for later action.
So I take a look at my table, and I find 8 stored procedures that are possibly missing indexes. Clicking on the XML link will show me the logged query plan:
Right-clicking on the "Missing Index" description will give me the details of the recommended index:
Here's an example of what SQL Server will return for you:
/*
Missing Index Details from ExecutionPlan2.sqlplan
The Query Processor estimates that implementing the following index could improve the query cost by 85.7327%.
*/

/*
USE [msdb]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[sysjobhistory] ([job_id])
INCLUDE ([instance_id],[step_id],[sql_message_id],[sql_severity],
[run_status],[run_date],[run_time],[run_duration],[operator_id_emailed],
[operator_id_netsent],[operator_id_paged],[retries_attempted],[server])
GO
*/
I now compare the details of this proposed index to the missing index DMV suggestions, using this query:
SELECT t.name AS 'affected_table'
    , 'Create NonClustered Index IX_' + t.name + '_missing_'
        + CAST(ddmid.index_handle AS VARCHAR(10))
        + ' On ' + ddmid.statement
        + ' (' + IsNull(ddmid.equality_columns, '')
        + CASE WHEN ddmid.equality_columns IS Not Null
                And ddmid.inequality_columns IS Not Null
               THEN ','
               ELSE ''
          END
        + IsNull(ddmid.inequality_columns, '')
        + ')'
        + IsNull(' Include (' + ddmid.included_columns + ');', ';') AS sql_statement
    , ddmigs.user_seeks
    , ddmigs.user_scans
    , CAST((ddmigs.user_seeks + ddmigs.user_scans) * ddmigs.avg_user_impact AS INT) AS 'est_impact'
    , ddmigs.last_user_seek
FROM sys.dm_db_missing_index_groups AS ddmig
INNER JOIN sys.dm_db_missing_index_group_stats AS ddmigs
    ON ddmigs.group_handle = ddmig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS ddmid
    ON ddmig.index_handle = ddmid.index_handle
INNER JOIN sys.tables AS t
    ON ddmid.object_id = t.object_id
WHERE ddmid.database_id = DB_ID()
    --AND t.name = 'myTableName'
ORDER BY CAST((ddmigs.user_seeks + ddmigs.user_scans) * ddmigs.avg_user_impact AS INT) DESC;
I usually find the data in both places, but not always. One reason is that the missing index DMVs only hold data collected since the last SQL Server restart. So if I'm taking a look at this DMV on Monday and I just rebooted on Sunday, I may not have enough history to give me meaningful recommendations. This is just something to be aware of.
What I'm looking for in this DMV is the number of user_seeks and the est_impact. Also, if I haven't rebooted my server in a while, I take a look at last_user_seek so I can determine whether or not it's still accurate.
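To get a feel for how much history is behind those numbers, one quick check is to see when the instance last started. The sketch below leans on the fact that tempdb is recreated at every startup, so its create_date is a reasonable approximation of the start time; the column aliases are my own.

```sql
/* tempdb is rebuilt at every instance start, so its create_date
   approximates when the missing index DMVs started collecting */
SELECT create_date                              AS approx_instance_start
     , DATEDIFF(DAY, create_date, GETDATE())    AS days_of_dmv_history
FROM sys.databases
WHERE name = 'tempdb';
```

If days_of_dmv_history is small, low user_seeks values may simply mean the server hasn't been up long enough, not that the index isn't needed.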
Next, I take a look at my existing indexes using Kimberly Tripp's sp_helpindex2 system stored proc. I use her proc instead of sp_helpindex because I need to see included columns.
If you're wondering why I'm looking at existing indexes, it's because I'm looking for indexes that can be modified slightly to accommodate my missing index needs. By "modified slightly," I mean that I'd only want to make a change to an existing index if it did not drastically change the size or composition of the index, e.g. adding one or two narrow columns as included columns. I do NOT mean making changes that double the size of your index; in those cases, you'd probably be better off creating a brand new index.
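Here's a minimal sketch of what a "slight modification" could look like. Everything in it is hypothetical (the demoOrders table, its columns, and the index name exist only for this example); the point is simply that DROP_EXISTING lets you redefine an index in place with one extra narrow included column.

```sql
/* Hypothetical demo table and index; names are for illustration only */
CREATE TABLE dbo.demoOrders
(
      orderId    INT           NOT NULL
    , customerId INT           NOT NULL
    , orderDate  SMALLDATETIME NOT NULL
    , statusCode TINYINT       NOT NULL
);

/* The existing index */
CREATE NONCLUSTERED INDEX IX_demoOrders_customerId
    ON dbo.demoOrders (customerId)
    INCLUDE (orderDate);

/* Same key, one additional narrow included column, rebuilt in place */
CREATE NONCLUSTERED INDEX IX_demoOrders_customerId
    ON dbo.demoOrders (customerId)
    INCLUDE (orderDate, statusCode)
    WITH (DROP_EXISTING = ON);

/* Clean up the demo table */
DROP TABLE dbo.demoOrders;
```

Adding a 1-byte TINYINT as an included column barely changes the index size, which is the kind of change I have in mind; adding a wide VARCHAR column would be a different story.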
Looking at existing indexes is actually a pretty critical part of the puzzle. If I have a proc that only gets called a few times an hour and could benefit from a better index, I may not create that index if it means adding a wide, expensive index to a busy table. If I can make a small modification to an existing index, then there's a greater chance I'll make the change and cover my query.
At this point, I should have enough information to start making decisions. I was going to write out the path I normally take when making decisions, but I thought, "Hey! What a great time for a diagram." So here you go:
Disclaimer: I'm *not* a Visio wizard, so if I butchered the use of certain symbols in my diagram, please let me know so I can a) fix it, and b) learn from it!
It's hard to really put all of the decision paths into a single, small diagram like this. There are a lot of variables that I'm not even touching here. But I think this is a fairly good "generic" representation of the path I take. When I hit an "end" process, it means I don't create the missing index at this time. Maybe in the future, it'll become necessary, but I prefer to err on the side of fewer indexes.
So there you have it, a brief look at my missing index process. Hopefully someone finds it helpful.
T-SQL Bitwise Operations
I've seen bit-product columns from time to time, mostly in SQL Server 2000 system tables, but it's never been something I've had to work with. And when I've needed to, I've known how to figure out which options are selected, e.g. a bit product of 9 means options 8 and 1 are selected. If you've ever taken a look at the [status] column on the sysdatabases table (SQL 2000), you'll know what I'm talking about.
What I've never known how to do, until recently, was calculate these options programmatically. That's why, when I noticed the [freq_interval] on the sysschedules table was a bit-product column, I decided to spend a little time figuring it out. Fortunately for me, a couple of my awesome co-workers, Jeff M. and Jason H., have worked with this before and were able to explain it to me. And, it turns out, it's actually quite easy.
Let me back up a few steps in case you're not familiar with this topic. If you check out the Books Online entry for the sysschedules table (2005), you'll notice the following statement:
freq_interval is one or more of the following:
1 = Sunday
2 = Monday
4 = Tuesday
8 = Wednesday
16 = Thursday
32 = Friday
64 = Saturday
When I looked at the actual value in the table, the schedule had a [freq_interval] value of 42, which is the sum of the bit values for the days selected.
If there were more than 7 options, the bit values would continue to double, i.e. 128, 256, etc. And regardless of how many bit values you select, any given sum maps back to exactly one combination of options, because the sum of all previous bit values is always one less than the next bit value:
1 + 2 = 3
1 + 2 + 4 = 7
1 + 2 + 4 + 8 = 15
Knowing this, I'm able to retrieve the values manually: I start with the highest bit value that does not exceed 42, then subtract it; I repeat until I'm left with 0.
So...
42 - 32 = 10
10 - 8 = 2
2 - 2 = 0
That means my job is scheduled to run on Fridays (32), Wednesdays (8), and Mondays (2).
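The manual subtraction above can be sketched as a small T-SQL loop. This is just an illustration of the by-hand method, not something you'd need in practice; the variable names are my own, and it assumes the highest possible bit value is 64 (Saturday).

```sql
/* Illustrative only: walk the bit values from highest to lowest,
   subtracting each one that fits into what's left */
DECLARE @remaining INT, @bit INT;

SET @remaining = 42;   /* the freq_interval value */
SET @bit       = 64;   /* highest bit value in use (Saturday) */

WHILE @bit >= 1
BEGIN
    IF @bit <= @remaining
    BEGIN
        PRINT CAST(@remaining AS VARCHAR(10)) + ' - '
            + CAST(@bit AS VARCHAR(10)) + ' = '
            + CAST(@remaining - @bit AS VARCHAR(10));

        SET @remaining = @remaining - @bit;
    END

    SET @bit = @bit / 2;   /* integer division; 1 / 2 = 0 ends the loop */
END
```

For 42, this prints the same three steps as above: 42 - 32 = 10, 10 - 8 = 2, and 2 - 2 = 0.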
Now how do I do this with T-SQL? SQL Server provides an operator specifically for this task: the bitwise AND operator (&). For now, I'm going to skip the "why" behind this and just get to the practical application. If you're interested in the "why," let me know and I'll write a follow-up post on binary and logical AND and OR operations.
For example, to use the bitwise AND to find out which days are selected...
SELECT 42 & 1  AS 'Sunday'
    , 42 & 2  AS 'Monday'
    , 42 & 4  AS 'Tuesday'
    , 42 & 8  AS 'Wednesday'
    , 42 & 16 AS 'Thursday'
    , 42 & 32 AS 'Friday'
    , 42 & 64 AS 'Saturday';
... will return ...
Sunday      Monday      Tuesday     Wednesday   Thursday    Friday      Saturday
----------- ----------- ----------- ----------- ----------- ----------- -----------
0           2           0           8           0           32          0
If the result is not equal to zero, then that day is selected. Easy as key lime pie, right?
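Applied to the real table, the same test might look something like the sketch below. It assumes you only care about weekly schedules (freq_type = 8), since [freq_interval] means something different for other frequency types; the abbreviations and the scheduled_days alias are my own.

```sql
/* A sketch: decode the day-of-week bits for weekly schedules */
SELECT name
     , freq_interval
     , CASE WHEN (freq_interval & 1)  <> 0 THEN 'Sun ' ELSE '' END
     + CASE WHEN (freq_interval & 2)  <> 0 THEN 'Mon ' ELSE '' END
     + CASE WHEN (freq_interval & 4)  <> 0 THEN 'Tue ' ELSE '' END
     + CASE WHEN (freq_interval & 8)  <> 0 THEN 'Wed ' ELSE '' END
     + CASE WHEN (freq_interval & 16) <> 0 THEN 'Thu ' ELSE '' END
     + CASE WHEN (freq_interval & 32) <> 0 THEN 'Fri ' ELSE '' END
     + CASE WHEN (freq_interval & 64) <> 0 THEN 'Sat ' ELSE '' END AS scheduled_days
FROM msdb.dbo.sysschedules
WHERE freq_type = 8;   /* 8 = weekly */
```

A schedule with a [freq_interval] of 42 would come back as "Mon Wed Fri".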
Now let's take it a step further and create our own working example. Let's say we're going to track the characteristics of various objects in a single bit-product column (note: this is not necessarily the best way to accomplish this in the real world, but it's a good illustration). First, set up a table to use in our example. This table will have a column, [attributes], which will hold the sum of our bit values.
CREATE TABLE myTable
(
      id         INT IDENTITY(1,1)
    , item       VARCHAR(10)
    , attributes INT
);

INSERT INTO myTable
SELECT 'Broccoli', 200 UNION All
SELECT 'Tomato', 193 UNION All
SELECT 'Car', 276 UNION All
SELECT 'Ball', 292;
Next, we're going to create a table variable that holds characteristics and their values. We'll then join these two tables together to see which attributes exist for each item.
DECLARE @statusLookup TABLE
(
      attribute INT
    , value     VARCHAR(10)
);

INSERT INTO @statusLookup
SELECT 1, 'Red' UNION All
SELECT 4, 'Blue' UNION All
SELECT 8, 'Green' UNION All
SELECT 16, 'Metal' UNION All
SELECT 32, 'Plastic' UNION All
SELECT 64, 'Plant' UNION All
SELECT 128, 'Edible' UNION All
SELECT 256, 'Non-Edible';

SELECT a.item, b.value
FROM myTable a
Cross Join @statusLookup b
WHERE a.attributes & b.attribute <> 0
ORDER BY a.item
    , b.value;
You should get this result:
item       value
---------- ----------
Ball       Blue
Ball       Non-Edible
Ball       Plastic
Broccoli   Edible
Broccoli   Green
Broccoli   Plant
Car        Blue
Car        Metal
Car        Non-Edible
Tomato     Edible
Tomato     Plant
Tomato     Red
Great, now we know broccoli is edible! Let's apply a little XML to clean up the results...
SELECT a.item
    , REPLACE( REPLACE( REPLACE((
          SELECT value
          FROM @statusLookup AS b
          WHERE a.attributes & b.attribute <> 0
          ORDER BY b.value
          FOR XML Raw)
      , '"/><row value="', ', '), '<row value="', ''), '"/>', '') AS 'attributes'
FROM myTable a
ORDER BY a.item;
item       attributes
---------- ------------------------------
Ball       Blue, Non-Edible, Plastic
Broccoli   Edible, Green, Plant
Car        Blue, Metal, Non-Edible
Tomato     Edible, Plant, Red
Voila! There you have it, how to use the bitwise AND (&) operator to retrieve multiple values from a bit-product column. Pretty neat stuff!
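The same operator also works as a filter. Here's a small self-contained sketch that reuses the sample values from above in a table variable (@items is a name I made up for this example) to find rows with a given attribute, or with a combination of attributes.

```sql
/* Same sample data as above, in a table variable so this runs on its own */
DECLARE @items TABLE
(
      item       VARCHAR(10)
    , attributes INT
);

INSERT INTO @items
SELECT 'Broccoli', 200 UNION All
SELECT 'Tomato', 193 UNION All
SELECT 'Car', 276 UNION All
SELECT 'Ball', 292;

/* Anything flagged Edible (128): returns Broccoli and Tomato */
SELECT item
FROM @items
WHERE attributes & 128 <> 0
ORDER BY item;

/* Anything that is both Blue (4) and Non-Edible (256): returns Ball and Car */
SELECT item
FROM @items
WHERE attributes & (4 + 256) = (4 + 256)
ORDER BY item;
```

Note the difference between the two predicates: comparing to zero with a multi-bit mask means "any of these bits," while comparing to the mask itself means "all of these bits."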
Special thanks to Jeff M. and Jason H. for their assistance.
Happy Coding!
Michelle Ufford (aka SQLFool)
Source: http://sqlfool.com/2009/02/bitwise-operations/
Ramblings on Super Bowl and PASS
Super Bowl 2009
As many of you know, I'm a DBA at GoDaddy.com, which had 2 commercials in this year's Super Bowl. If you saw the commercials during the game or went to our website for the "internet only" versions, let me know; I have no control over the content of the ads, but I'm still interested in your opinions. But comments on ad content aside, the commercials continue to prove very effective for driving traffic to our website and, in turn, generating income. (Don't believe me? Read this and this article on finance.yahoo.com).
We typically get some pretty large spikes in the minutes immediately following a commercial airing, and this year was no exception! We spent quite a bit of time throughout the year tuning our systems to support Super Bowl traffic, especially in the few weeks preceding the big game. By all accounts, this year's efforts have paid off; our database servers exceeded expectations. I don't think I'm allowed to go into specifics, but I can mention some server stats. During the spikes, my primary server reached 27k transactions per second with no timeouts and very good response times. In fact, I estimate we decreased our recovery time by around 80% compared to last year.
Why do I mention all of this? Well, there's the bragging aspect, of course. But more importantly, I bring it up to give credence to some of the performance tuning articles I've written in the past, like:
- Regularly defrag your indexes
- Evaluate the effectiveness of your indexes
- Partition your large tables
- Use TVP or XML for bulk inserts
- Use non-aligned indexes for single record look-ups (partitioning)
Keep in mind, there's rarely a "magic bullet" for performance tuning, and what worked for me may not work for you. If you have any questions, please feel free to leave me a comment or send me an e-mail, and I'll do my best to respond.
If you're interested in more information on effective performance tuning, make sure to check out the Performance Tuning Section on SQLServerPedia.com.
I380 PASS
I've been pleasantly surprised with the number of inquiries I've received regarding the I380 PASS Chapter (serving the East Iowa area of Cedar Rapids and Iowa City), so I'll continue to post updates to my blog.
As I've mentioned before, we're now officially a PASS Chapter, and we're currently in the planning stages of our first meeting. We have one confirmed key sponsor, Quest Software (woot!), and we're speaking with a couple of other possible sponsors. Side note: if you're interested in sponsoring our group, I'd love to hear from you! E-mail me at michelle @ sqlfool dot com.
We're currently planning to have meetings on the second Tuesday of every month, with our first meeting on Tuesday, March 10th 2009. We have a confirmed speaker but not a confirmed topic, and we're actively working on a meeting location. Please keep in mind that all of these details are subject to change.
If you're in the area and would like to attend, or know someone who should attend, please drop me a line!