A Look at Missing Indexes

Tim Ford (@SQLAgentMan) recently blogged about his Top 5 SQL Server Indexing Best Practices. I thought it was a good list, and it inspired this blog post. I’ve recently been doing a little index spring cleaning, and I thought some people may be interested in the process I go through. So, here it is… a journey through madness an overview of my general missing index process.

I start with my trusty dba_missingIndexStoredProc table. If this table sounds completely foreign to you, check out my post, Find Missing Indexes In Stored Procs. Basically, I have a process that runs every night, scanning the XML of every query plan on the server to find procs that are possibly missing indexes. I then log the details for later action.

So I take a look at my table, and I find 8 stored procedures that are possibly missing indexes. Clicking on the XML link will show me the logged query plan:

Procs With Missing Indexes

Procs With Missing Indexes

Right clicking on the “Missing Index” description will give me the details of the recommended index:

Missing Index Details

Missing Index Details

Here’s an example of what SQL Server will return for you:

/*
Missing Index Details from ExecutionPlan2.sqlplan
The Query Processor estimates that implementing the following index 
could improve the query cost by 85.7327%.
*/
 
/*
USE [msdb]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[sysjobhistory] ([job_id])
INCLUDE ([instance_id],[step_id],[sql_message_id],[sql_severity],
[run_status],[run_date],[run_time],[run_duration],[operator_id_emailed], 
[operator_id_netsent],[operator_id_paged],[retries_attempted],[server])
GO
*/

I now compare the details of this proposed index to the missing index DMV suggestions, using this query:

SELECT t.name AS 'affected_table'
    , 'Create NonClustered Index IX_' + t.name + '_missing_' 
        + CAST(ddmid.index_handle AS VARCHAR(10))
        + ' On ' + ddmid.STATEMENT 
        + ' (' + IsNull(ddmid.equality_columns,'') 
        + CASE WHEN ddmid.equality_columns IS Not Null 
            And ddmid.inequality_columns IS Not Null THEN ',' 
                ELSE '' END 
        + IsNull(ddmid.inequality_columns, '')
        + ')' 
        + IsNull(' Include (' + ddmid.included_columns + ');', ';'
        ) AS sql_statement
    , ddmigs.user_seeks
    , ddmigs.user_scans
    , CAST((ddmigs.user_seeks + ddmigs.user_scans) 
        * ddmigs.avg_user_impact AS INT) AS 'est_impact'
    , ddmigs.last_user_seek
FROM sys.dm_db_missing_index_groups AS ddmig
INNER JOIN sys.dm_db_missing_index_group_stats AS ddmigs
    ON ddmigs.group_handle = ddmig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS ddmid 
    ON ddmig.index_handle = ddmid.index_handle
INNER JOIN sys.tables AS t
    ON ddmid.OBJECT_ID = t.OBJECT_ID
WHERE ddmid.database_id = DB_ID()
    --AND t.name = 'myTableName' 
ORDER BY CAST((ddmigs.user_seeks + ddmigs.user_scans) 
    * ddmigs.avg_user_impact AS INT) DESC;

I usually find the data in both places, but not always. One reason why is because the missing index DMV will only store data since your last reboot. So if I’m taking a look at this DMV on Monday and I just rebooted on Sunday, I may not have enough history to give me meaningful recommendations. This is just something to be aware of.

What I’m looking for in this DMV is the number of user_seeks and the est_impact. Also, if I haven’t rebooted my server in a while, I take a look at last_user_seek so I can determine whether or not it’s still accurate.

Next, I take a look at my existing indexes using Kimberly Tripp’s sp_helpindex2 system stored proc. I use her proc instead of sp_helpindex because I need to see included columns.

If you’re wondering why I’m looking at existing indexes, the reason is because I’m looking for indexes that can be modified slightly to accommodate my missing index needs. By “modified slightly,” I mean that I’d only want to make a change to an existing index if it did not drastically change the size or composition of an index, i.e. adding one or two narrow columns as included columns. I do NOT mean making changes that double the size of your index; in those cases, you’d probably be better off creating a brand new index.

Looking at existing indexes is actually a pretty critical part of the puzzle. If I have a proc that only gets called a few times an hour and could benefit from a better index, I may not create that index if it means adding a wide, expensive index to a busy table. If I can make a small modification to an existing index, then there’s a greater chance I’ll make the change and cover my query.

At this point, I should have enough information to start making decisions. I was going to write out the path I normally take when making decisions, but I thought, “Hey! What a great time for a diagram.” So here you go:

Decision Path

Decision Path

Disclaimer: I’m *not* a Visio wizard, so if I butchered the use of certain symbols in my diagram, please let me know so I can a) fix it, and b) learn from it!

It’s hard to really put all of the decision paths into a single, small diagram like this. There’s a lot of variables that I’m not even touching here. But I think this is a fairly good “generic” representation of the path I take. When I hit an “end” process, it means I don’t create the missing index at this time. Maybe in the future, it’ll become necessary, but I prefer to err on the side of less indexes.

So there you have it, a brief look at my missing index process. Hopefully someone finds it helpful. :)

Find Missing Indexes in Stored Procs with T-SQL

This post should probably be called “The Power of Twitter.” I’ve said it before, I’ll say it again: I love Twitter. I mostly follow SQL Server people, so it’s become a great source of new information and help when I feel like banging my head against the wall when I get stumped.

So last week, Jonathan Kehayias (@jmkehayias) posted a link to a missing index script on the MSDN forums. Jonathan’s script is modified from something he put together for a question posed by Jeremiah Peschka (@peschkaj).

Jonathan’s script intrigued me. I’d never tried to search a query plan’s XML before, and it certainly presents some interesting possibilities. After dealing with a missing index in production a few weeks ago (caused by an index change), I thought it’d be a great idea to put a regular monitor in place.

Everything was going well until I ran into a problem where I couldn’t get the proc name returned. The results spanned numerous databases; I had the object_id and database_id, but I wanted to store the proc name instead. I tried several different methods, including sp_msforeachdb and sp_executeSQL, and while I had a working solution, it was a little more clunky than I liked. So I asked my awesome followers on Twitter for any tips and within minutes I had half a dozen responses. In the end, @MladenPrajdic solved my problem with a pretty simple solution: put ‘Use ?;’ at the start of my sp_msforeachdb statement. Thanks again, Mladen!

I’ve now had this process running on my server for a few days now, with good success. The stored procedure below will return the database name, proc name, and query plan XML for any stored proc with a missing index. This is a centralized proc that will store the results in a table for later action. If you click on the XML, you should see the actual query plan with the missing index details. Because this looks at cached query plans, your best bet is to run it fairly regularly (maybe daily) to increase your chances of catching any problem procs.

/* Create a stored procedure skeleton */
If ObjectProperty(Object_ID('dbo.dba_missingIndexStoredProc_sp'), N'IsProcedure') Is Null
Begin
    Execute ('Create Procedure dbo.dba_missingIndexStoredProc_sp As Print ''Hello World!''')
    RaisError('Procedure dba_missingIndexStoredProc_sp created.', 10, 1);
End;
Go
 
/* Drop our table if it already exists */
If Exists(Select Object_ID From sys.tables Where [name] = N'dba_missingIndexStoredProc')
Begin
    Drop Table dbo.dba_missingIndexStoredProc
    Print 'dba_missingIndexStoredProc table dropped!';
End
 
/* Create our table */
Create Table dbo.dba_missingIndexStoredProc
(
      missingIndexSP_id int Identity(1,1)   Not Null
    , databaseName      varchar(128)        Not Null
    , databaseID        int                 Not Null
    , objectName        varchar(128)        Not Null
    , objectID          int                 Not Null
    , query_plan        xml                 Not Null
    , executionDate     smalldatetime       Not Null
 
    Constraint PK_missingIndexStoredProc
        Primary Key Clustered(missingIndexSP_id)
);
 
Print 'dba_missingIndexStoredProc Table Created';
 
/* Configure our settings */
Set ANSI_Nulls On;
Set Quoted_Identifier On;
Go
 
Alter Procedure dbo.dba_missingIndexStoredProc_sp
 
        /* Declare Parameters */
            @lastExecuted_inDays    int = 7
          , @minExecutionCount      int = 7
          , @logResults             bit = 1
          , @displayResults         bit = 0
 
As
/*********************************************************************************
    Name:       dba_missingIndexStoredProc_sp
 
    Author:     Michelle Ufford, http://sqlfool.com
 
    Purpose:    Retrieves stored procedures with missing indexes in their
                cached query plans.
 
                @lastExecuted_inDays = number of days old the cached query plan
                                       can be to still appear in the results;
                                       the HIGHER the number, the longer the
                                       execution time.
 
                @minExecutionCount = minimum number of executions the cached
                                     query plan can have to still appear 
                                     in the results; the LOWER the number,
                                     the longer the execution time.
 
                @logResults = store results in dba_missingIndexStoredProc
 
                @displayResults = return results to the caller
 
    Notes:      This is not 100% guaranteed to catch all missing indexes in
                a stored procedure.  It will only catch it if the stored proc's
                query plan is still in cache.  Run regularly to help minimize
                the chance of missing a proc.
 
    Called by:  DBA and/or SQL Agent Job
 
    Date        User    Description
    ----------------------------------------------------------------------------
    2009-03-02  MFU     Initial Release for public consumption
*********************************************************************************
    Exec dbo.dba_missingIndexStoredProc_sp
          @lastExecuted_inDays  = 30
        , @minExecutionCount    = 5
        , @logResults           = 1
        , @displayResults       = 1;
*********************************************************************************/
 
Set NoCount On;
Set XACT_Abort On;
Set Ansi_Padding On;
Set Ansi_Warnings On;
Set ArithAbort On;
Set Concat_Null_Yields_Null On;
Set Numeric_RoundAbort Off;
 
Begin
 
    /* Declare Variables */
    Declare @currentDateTime smalldatetime;
 
    Set @currentDateTime = GetDate();
 
    Declare @plan_handles Table
    (
        plan_handle     varbinary(64)   Not Null
    );
 
    Create Table #missingIndexes
    (
          databaseID    int             Not Null
        , objectID      int             Not Null
        , query_plan    xml             Not Null
 
        Constraint PK_temp_missingIndexes Primary Key Clustered
        (
            databaseID, objectID
        )
    );
 
    Begin Try
 
        /* Perform some data validation */
        If @logResults = 0 And @displayResults = 0
        Begin
 
            /* Log the fact that there were open transactions */
            Execute dbo.dba_logError_sp
                  @errorType            = 'app'
                , @app_errorProcedure   = 'dba_missingIndexStoredProc_sp'
                , @app_errorMessage     = '@logResults = 0 and @displayResults = 0; no action taken, exiting stored proc.'
                , @forceExit            = 1
                , @returnError          = 1;  
 
        End;
 
        Begin Transaction;
 
        /* Retrieve distinct plan handles to minimize dm_exec_query_plan lookups */
        Insert Into @plan_handles
        Select Distinct plan_handle
        From sys.dm_exec_query_stats
        Where last_execution_time > DateAdd(day, -@lastExecuted_inDays, @currentDateTime)
            And execution_count > @minExecutionCount;
 
        With xmlNameSpaces (
            Default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
        )
 
        /* Retrieve our query plan's XML if there's a missing index */
        Insert Into #missingIndexes
        Select deqp.[dbid]
            , deqp.objectid
            , deqp.query_plan 
        From @plan_handles As ph
        Cross Apply sys.dm_exec_query_plan(ph.plan_handle) As deqp 
        Where deqp.query_plan.exist('//MissingIndex') = 1
            And deqp.objectid Is Not Null;
 
        /* Do we want to store the results of our process? */
        If @logResults = 1
        Begin
            Insert Into dbo.dba_missingIndexStoredProc
            Execute sp_msForEachDB 'Use ?; 
                                    Select ''?''
                                        , mi.databaseID
                                        , Object_Name(o.object_id)
                                        , o.object_id
                                        , mi.query_plan
                                        , GetDate()
                                    From sys.objects As o 
                                    Join #missingIndexes As mi 
                                        On o.object_id = mi.objectID 
                                    Where databaseID = DB_ID();';
 
        End
        /* We're not logging it, so let's display it */
        Else
        Begin
            Execute sp_msForEachDB 'Use ?; 
                                    Select ''?''
                                        , mi.databaseID
                                        , Object_Name(o.object_id)
                                        , o.object_id
                                        , mi.query_plan
                                        , GetDate()
                                    From sys.objects As o 
                                    Join #missingIndexes As mi 
                                        On o.object_id = mi.objectID 
                                    Where databaseID = DB_ID();';
        End;
 
        /* See above; this part will only work if we've 
           logged our data. */
        If @displayResults = 1 And @logResults = 1
        Begin
            Select *
            From dbo.dba_missingIndexStoredProc
            Where executionDate >= @currentDateTime;
        End;
 
        /* If you have an open transaction, commit it */
        If @@TranCount > 0
            Commit Transaction;
 
    End Try
    Begin Catch
 
        /* Whoops, there was an error... rollback! */
        If @@TranCount > 0
            Rollback Transaction;
 
        /* Return an error message and log it */
        Execute dbo.dba_logError_sp;
 
    End Catch;
 
    /* Clean-Up! */
    Drop Table #missingIndexes;
 
    Set NoCount Off;
    Return 0;
End
Go

Not know what “EXECUTE dbo.dba_logError_sp;” is doing? Check out my blog post on Error Handling in T-SQL.

HTH!

Michelle

Source: http://sqlfool.com/2009/03/find-missing-indexes/