Index Interrogation Script

As a continuation of the last post on indexes, here’s a script I wrote for index interrogation.  This stored procedure was the result of repeatedly having to lookup information on included columns.

If ObjectProperty(Object_ID('dbo.dba_indexLookup_sp'), 
        N'IsProcedure') Is Null
Begin
    Execute ('Create Procedure dbo.dba_indexLookup_sp 
        As Print ''Hello World!''')
    RaisError('Procedure dbo.dba_indexLookup_sp created.', 10, 1);
End;
Go
 
Set ANSI_Nulls On;
Set Ansi_Padding On;
Set Ansi_Warnings On;
Set ArithAbort On;
Set Concat_Null_Yields_Null On;
Set NoCount On;
Set Numeric_RoundAbort Off;
Set Quoted_Identifier On;
Go
 
Alter Procedure dbo.dba_indexLookup_sp
 
        /* Declare Parameters */
        @tableName  varchar(128)  =  Null
As
/***********************************************************************
    Name:       dba_indexLookup_sp
 
    Author:     Michelle F. Ufford
 
    Purpose:    Retrieves index information for the specified table.
 
    Notes:      If the tableName is left null, it will return index 
                information for all tables and indexes.
 
    Called by:  DBA
 
    Date        User   Description
    --------------------------------------------------------------------
    2008-10-28  MFU    Initial Release
************************************************************************
    Exec dbo.dba_indexLookup_sp
        @tableName = 'myTableName';
***********************************************************************/
 
Set NoCount On;
Set XACT_Abort On;
 
Begin
 
    Declare @objectID int;
 
    If @tableName Is Not Null
        Set @objectID = Object_ID(@tableName);
 
    With indexCTE(partition_scheme_name
                , partition_function_name
                , data_space_id)
    As (
        Select sps.name
            , spf.name
            , sps.data_space_id
        From sys.partition_schemes As sps
        Join sys.partition_functions As spf
            On sps.function_id = spf.function_id
    )
 
    Select st.name As 'table_name'
        , IsNull(ix.name, '') As 'index_name'
        , ix.object_id
        , ix.index_id
		, Cast(
            Case When ix.index_id = 1 
                    Then 'clustered' 
                When ix.index_id =0
                    Then 'heap'
                Else 'nonclustered' End
			+ Case When ix.ignore_dup_key <> 0 
                Then ', ignore duplicate keys' 
                    Else '' End
			+ Case When ix.is_unique <> 0 
                Then ', unique' 
                    Else '' End
			+ Case When ix.is_primary_key <> 0 
                Then ', primary key' Else '' End As varchar(210)
            ) As 'index_description'
        , IsNull(Replace( Replace( Replace(
            (   
                Select c.name As 'columnName'
                From sys.index_columns As sic
                Join sys.columns As c 
                    On c.column_id = sic.column_id 
                    And c.object_id = sic.object_id
                Where sic.object_id = ix.object_id
                    And sic.index_id = ix.index_id
                    And is_included_column = 0
                Order By sic.index_column_id
                For XML Raw)
                , '"/><row columnName="', ', ')
                , '<row columnName="', '')
                , '"/>', ''), '')
            As 'indexed_columns'
        , IsNull(Replace( Replace( Replace(
            (   
                Select c.name As 'columnName'
                From sys.index_columns As sic
                Join sys.columns As c 
                    On c.column_id = sic.column_id 
                    And c.object_id = sic.object_id
                Where sic.object_id = ix.object_id
                    And sic.index_id = ix.index_id
                    And is_included_column = 1
                Order By sic.index_column_id
                For XML Raw)
                , '"/><row columnName="', ', ')
                , '<row columnName="', '')
                , '"/>', ''), '')
            As 'included_columns'
        , IsNull(cte.partition_scheme_name, '') 
            As 'partition_scheme_name'
        , Count(partition_number) As 'partition_count'
        , Sum(rows) As 'row_count'
    From sys.indexes As ix
    Join sys.partitions As sp
        On ix.object_id = sp.object_id
        And ix.index_id = sp.index_id
    Join sys.tables As st
        On ix.object_id = st.object_id
    Left Join indexCTE As cte
        On ix.data_space_id = cte.data_space_id
    Where ix.object_id = IsNull(@objectID, ix.object_id)
    Group By st.name
        , IsNull(ix.name, '')
        , ix.object_id
        , ix.index_id
		, Cast(
            Case When ix.index_id = 1 
                    Then 'clustered' 
                When ix.index_id =0
                    Then 'heap'
                Else 'nonclustered' End
			+ Case When ix.ignore_dup_key <> 0 
                Then ', ignore duplicate keys' 
                    Else '' End
			+ Case When ix.is_unique <> 0 
                Then ', unique' 
                    Else '' End
			+ Case When ix.is_primary_key <> 0 
                Then ', primary key' Else '' End As varchar(210)
            )
        , IsNull(cte.partition_scheme_name, '')
        , IsNull(cte.partition_function_name, '')
    Order By table_name
        , index_id;
 
    Set NoCount Off;
    Return 0;
End
Go

Index Defrag Script

UPDATE: This script has been significantly updated. Please view the latest version at http://sqlfool.com/2011/06/index-defrag-script-v4-1/.

Previously I posted that it’s important to keep your indexes in shape with weekly or (preferably) nightly defrags. Below is a script I use to execute nightly defrags in SQL 2005 Enterprise. I can’t claim complete credit… I believe this is a script I originally found on Microsoft and heavily modified to meet my needs.

You may want to modify the script if you’re doing much with XML or LOB’s. Also keep in mind that this is hitting the sys.dm_db_index_physical_stats table to view fragmentation information, which can be resource intensive.

If you’re not familiar with index defragmentation, check out “Alter Index” on Books Online.

If Not Exists(Select object_id From sys.tables 
                Where [name] = N'dba_indexDefragLog')
Begin
 
    Create Table dbo.dba_indexDefragLog
    (
          indexDefrag_id    int identity(1,1)   not null
        , objectID          int                 not null
        , objectName        nvarchar(130)       not null
        , indexID           int                 not null
        , indexName         nvarchar(130)       not null
        , partitionNumber   smallint            not null
        , fragmentation     float               not null
        , dateTimeStart     datetime            not null
        , durationSeconds   int                 not null
        Constraint PK_indexDefragLog 
            Primary Key Clustered (indexDefrag_id)
    );
 
    Print 'dba_indexDefragLog Table Created';
End
 
If ObjectProperty(Object_ID('dbo.dba_indexDefrag_sp'), 
        N'IsProcedure') Is Null
Begin
    Execute ('Create Procedure dbo.dba_indexDefrag_sp As 
                Print ''Hello World!''')
    RaisError('Procedure dba_indexDefrag_sp created.', 10, 1);
End;
Go
 
Set ANSI_Nulls On;
Set Ansi_Padding On;
Set Ansi_Warnings On;
Set ArithAbort On;
Set Concat_Null_Yields_Null On;
Set NoCount On;
Set Numeric_RoundAbort Off;
Set Quoted_Identifier On;
Go
 
Alter Procedure dbo.dba_indexDefrag_sp
 
    /* Declare Parameters */
      @minFragmentation     float           = 10.0  
        /* in percent, will not defrag if fragmentation 
           less than specified */
    , @rebuildThreshold     float           = 30.0  
        /* in percent, greater than @rebuildThreshold 
           will result in rebuild instead of reorg */
    , @onlineRebuild        bit             = 1     
        /* 1 = online rebuild; 0 = offline rebuild */
    , @executeSQL           bit             = 1     
        /* 1 = execute; 0 = print command only */
    , @tableName            varchar(4000)   = Null  
        /* Option to specify a table name */
    , @printCommands        bit             = 0     
        /* 1 = print commands; 0 = do not print commands */
    , @defragDelay          char(8)         = '00:00:05'
        /* time to wait between defrag commands */
As
/***********************************************************************
    Name:       dba_indexDefrag_sp
 
    Author:     Michelle F. Ufford
 
    Purpose:    Defrags all indexes for the current database
 
    Notes:
 
    CAUTION: Monitor transaction log when executing for the first time!
 
      @minFragmentation     defaulted to 10%, will not defrag if 
                            fragmentation if less than specified 
 
      @rebuildThreshold     defaulted to 30% as recommended by 
                            Microsoft in BOL;
                            > than 30% will result in rebuild instead
 
      @onlineRebuild        1 = online rebuild; 
                            0 = offline rebuild
 
      @executeSQL           1 = execute the SQL generated by this proc; 
                            0 = print command only
 
      @tableName            Specify if you only want to defrag indexes 
                            for a specific table
 
      @printCommands        1 = print commands to screen; 
                            0 = do not print commands
 
      @defragDelay          time to wait between defrag commands; 
                            gives the server a little time to catch up 
 
    Called by:  SQL Agent Job or DBA
 
    Date        Initials	Description
    -------------------------------------------------------------------
    2008-10-27  MFU         Initial Release
***********************************************************************
    Exec dbo.dba_indexDefrag_sp
	      @executeSQL    = 0
        , @printCommands = 1;
***********************************************************************/
 
Set NoCount On;
Set XACT_Abort On;
 
Begin
 
    /* Declare our variables */
    Declare   @objectID         int
            , @indexID          int
            , @partitionCount   bigint
            , @schemaName       nvarchar(130)
            , @objectName       nvarchar(130)
            , @indexName        nvarchar(130)
            , @partitionNumber  smallint
            , @partitions       smallint
            , @fragmentation    float
            , @sqlCommand       nvarchar(4000)
            , @rebuildCommand   nvarchar(200)
            , @dateTimeStart    datetime
            , @dateTimeEnd      datetime
            , @containsLOB      bit;
 
    /* Just a little validation... */
    If @minFragmentation Not Between 0.00 And 100.0
        Set @minFragmentation = 10.0;
 
    If @rebuildThreshold Not Between 0.00 And 100.0
        Set @rebuildThreshold = 30.0;
 
    If @defragDelay Not Like '00:[0-5][0-9]:[0-5][0-9]'
        Set @defragDelay = '00:00:05';
 
    /* Determine which indexes to defrag using our 
       user-defined parameters */
    Select
          Object_ID AS objectID
        , index_id AS indexID
        , partition_number AS partitionNumber
        , avg_fragmentation_in_percent AS fragmentation
        , 0 As 'defragStatus' 
            /* 0 = unprocessed, 1 = processed */
    Into #indexDefragList
    From sys.dm_db_index_physical_stats 
        (DB_ID(), Object_Id(@tableName), NULL , NULL, N'Limited')
    Where avg_fragmentation_in_percent > @minFragmentation 
        And index_id > 0
    Option (MaxDop 1);
 
    /* Create a clustered index to boost performance a little */
    Create Clustered Index CIX_temp_indexDefragList
        On #indexDefragList(objectID, indexID, partitionNumber);
 
    /* Begin our loop for defragging */
    While (Select Count(*) From #indexDefragList 
            Where defragStatus = 0) > 0
    Begin
 
        /* Grab the most fragmented index first to defrag */
        Select Top 1 
              @objectID         = objectID
            , @fragmentation    = fragmentation
            , @indexID          = indexID
            , @partitionNumber  = partitionNumber
        From #indexDefragList
        Where defragStatus = 0
        Order By fragmentation Desc;
 
        /* Look up index information */
        Select @objectName = QuoteName(o.name)
             , @schemaName = QuoteName(s.name)
        From sys.objects As o
        Inner Join sys.schemas As s 
            On s.schema_id = o.schema_id
        Where o.object_id = @objectID;
 
        Select @indexName = QuoteName(name)
        From sys.indexes 
        Where object_id = @objectID
            And index_id = @indexID
            And type > 0;
 
        /* Determine if the index is partitioned */
        Select @partitionCount = Count(*)
        From sys.partitions
        Where object_id = @objectID 
            And index_id = @indexID;
 
        /* Look for LOBs */
        Select Top 1 
            @containsLOB = column_id
        From sys.columns With (NoLock) 
        Where  
            [object_id] = @objectID
            And (system_type_id In (34, 35, 99) 
            -- 34 = image, 35 = text, 99 = ntext
                    Or max_length = -1); 
            -- varbinary(max), varchar(max), nvarchar(max), xml
 
        /* See if we should rebuild or reorganize; handle thusly */
        If @fragmentation < @rebuildThreshold And @partitionCount <= 1
            Set @sqlCommand = N'Alter Index ' + @indexName + N' On ' 
                + @schemaName + N'.' + @objectName + N' ReOrganize';
 
        If @fragmentation >= @rebuildThreshold 
            And IsNull(@containsLOB, 0) = 0 
                -- Cannot rebuild if the table has one or more LOB
            And @partitionCount <= 1
        Begin
 
            /* We should always rebuild online if possible 
                (SQL 2005 Enterprise) */
            If @onlineRebuild = 0 
                Set @rebuildCommand = N' Rebuild With 
                    (Online = Off, MaxDop = 1)';
            Else
                Set @rebuildCommand = N' Rebuild With 
                    (Online = On, MaxDop = 1)';
 
            Set @sqlCommand = N'Alter Index ' + @indexName + N' On ' 
                + @schemaName + N'.' + @objectName + @rebuildCommand;
        End;
 
        /* If our index is partitioned, we should always reorganize */
        If @partitionCount > 1
            Set @sqlCommand = N'Alter Index ' + @indexName + N' On ' 
                + @schemaName + N'.' + @objectName + N' ReOrganize'
                + N' Partition = ' 
                + Cast(@partitionNumber As nvarchar(10)); 
                -- no MaxDop needed, single threaded operation
 
        /* Are we executing the SQL?  If so, do it */
        If @executeSQL = 1
        Begin
 
            /* Grab the time for logging purposes */
            Set @dateTimeStart  = GetDate();
            Execute (@sqlCommand);
            Set @dateTimeEnd  = GetDate();
 
            /* Log our actions */
            Insert Into dbo.dba_indexDefragLog
            (
                  objectID
                , objectName
                , indexID
                , indexName
                , partitionNumber
                , fragmentation
                , dateTimeStart
                , durationSeconds
            )
            Select
                  @objectID
                , @objectName
                , @indexID
                , @indexName
                , @partitionNumber
                , @fragmentation
                , @dateTimeStart
                , DateDiff(second, @dateTimeStart, @dateTimeEnd);
 
            /* Just a little breather for the server */
            WaitFor Delay @defragDelay;
 
            /* Print if specified to do so */
            If @printCommands = 1
                Print N'Executed: ' + @sqlCommand;
        End
        Else
        /* Looks like we're not executing, just print 
            the commands */
        Begin
            If @printCommands = 1
                Print @sqlCommand;
        End
 
        /* Update our index defrag list when we've
            finished with that index */
        Update #indexDefragList
        Set defragStatus = 1
        Where objectID         = @objectID
          And indexID          = @indexID
          And partitionNumber  = @partitionNumber;
 
    End
 
    /* When everything is done, make sure to get rid of 
        our temp table */
    Drop Table #indexDefragList;
 
    Set NoCount Off;
    Return 0
End
Go
 
Set Quoted_Identifier Off 
Set ANSI_Nulls On
Go

20081117 UPDATE: Check out my Updated Index Defrag Script for SQL 2005 and 2008.

Tips for Large Data Stores

I’m currently working on projects that involve the collection of massive amounts of data (i.e. terabyte-class, billion-row tables). One of the challenges of collecting so much data is the ability to report on the data quickly.  To this end, I plan to spend the next several blog posts discussing tips for designing and managing large relational data stores.  These are primarily design practices that have performed well for me within the various environments in which I’ve worked.

Partitioning

• Partitioning is the segregation of a single logical table into separate, identical physical structures. Basically, every table is stored in at least one partition; “partitioned tables” are merely comprised of many partitions.

• Implemented correctly, partitioning can have dramatic improvement on read performance, index maintenance cost reduction, storage of large fact tables on separate disks, and data archiving.

• If you’re not familiar with partitioning, check out Kimberly Tripp’s excellent white paper, Partitioned Tables and Indexes in SQL Server 2005.

• I cannot rave enough about the impact partitioning has had within our environment. To take one of the most extreme examples of improvement, the execution time of one BI report decreased from 2.5+ hours to 20 minutes after we implemented partitioning. While not all stored procedures experienced such phenomenal improvement, improvement in the range of 30% – 60% was very common.

• While partitioning does have many benefits, there are some negatives. Namely, more expensive writes (I’ve noticed around 10%), increased duration of queries that span many partitions (i.e. queries on long time spans, such as a year), and increased maintenance needs.

• My rule of thumb is to partition any table with growth rates > 10mm records per week or with regular archiving needs.

• Much more to come on partitioning in the near future. :)

Indexing

• Any experienced DBA can tell you that effective index management is critical.

• Ensure your stored procedures are using index seeks instead of scans. If scans cannot be avoided, consider using partitioning with aligned indexes to limit the amount of pages scanned.

• Periodically look for un-used indexes and remove them to improve write performance.

• Look for missing indexes, create one or two, and evaluate. Try composite indexes and included columns as a way to limit the amount of indexes you need to create.

• Defrag your indexes! Nightly, if you can get away with it.

Stored Procedure Tuning

• When querying on large amounts of data, try breaking your queries up into individual components and storing in temporary tables, then performing the joins.

• When querying on large date ranges, try looping through the days and inserting the data into a temp table. This can be especially beneficial when querying partitioned tables.

• Create indexes on your temporary tables!

• Utilize index seeks whenever possible.

• If permitted to use dirty reads and your environment’s isolation level is not defaulted to read uncommitted, use With (NoLock) hints.

• Temporary tables can be partitioned! This is usually not necessary but can sometimes be beneficial.

• Try replacing table variables with temporary tables. My rule of thumb is this: if I’m performing joins on the temporary data or if I have more than 100 records, use a temporary table. Table variables can decrease stored procedure recompiles but can sometimes have a negative impact on performance. When in doubt, test each method and evaluate which works better for your needs.

• *Always* include the partitioning key when querying partitioned tables.

Scientific Notation with IsNumeric()

Be careful when using the built-in IsNumeric() function. When presented with a character string formatted as “number+D+number” or “number+E+number”, the function will return a value of 1 (true). This is because SQL Server evaluates [number][d,e][number] as scientific notation.

For example,

[cc lang="tsql"]
Declare @dNotation char(10)
, @eNotation char(10)
, @standardText char(5);

Select @dNotation = ’1d1′
, @eNotation = ’1e1′
, @standardText = ‘abcde’;

Select IsNumeric(@dNotation) As ‘dNotation’
, IsNumeric(@eNotation) As ‘eNotation’
, IsNumeric(@standardText) As ‘standardText’;
[/cc]

… will return:

dNotation   eNotation   standardText
————————————————-
1           1            0

I’ve seen a couple of different ways to approach this, and each have their merits. The most easy and elegant solution I’ve seen came from Dave C. — PatIndex(‘%[^0123456789]%’, columnName):

[cc lang="tsql"]
Select PatIndex(‘%[^0123456789]%’, @dNotation)
, PatIndex(‘%[^0123456789]%’, @eNotation)
, PatIndex(‘%[^0123456789]%’, @standardText);
[/cc]

I’ve also seen Replace() and user-defined functions used with great effect.

Thanks to Dave C. for sharing his solution! :)

HTH!

Michelle

Permissions Script

Let’s get this blog thing started!  :)

I sometimes find that appropriate permissions are not always assigned to stored procedures during creation.  This usually occurs when a proc is scripted and re-deployed in a hurry, or when a non-DBA decides to play in my sandbox.  I use this rather simple script to quickly find and grant all missing permissions.

Declare @schema_owner varchar(20);
Set @schema_owner = 'dbo';
 
Select [name] As 'storedProcedure', 'Grant Execute On '
    + @schema_owner + '.' + [name]
    + ' To [insertDatabaseRoleHere];' As 'sqlCode'
From sys.objects With (NoLock)
Where [name] Not In (
    Select o.name
    From sys.database_permissions p With (NoLock)
    Inner Join sys.objects o With (NoLock)
        On p.major_id = o.object_id
    Inner Join sys.database_principals u With (NoLock)
        On u.principal_id = p.grantee_principal_id
    Where u.name = '[insertDatabaseRoleHere]'
)
And [type] = 'P';

This is especially useful for when you have a common database role that all stored procedures need to belong to; for example, infrastructures that share role membership to allow applications to execute procs.