Find Missing Indexes in Stored Procs with T-SQL

This post should probably be called “The Power of Twitter.” I’ve said it before, I’ll say it again: I love Twitter. I mostly follow SQL Server people, so it’s become a great source of new information and help when I feel like banging my head against the wall when I get stumped.

So last week, Jonathan Kehayias (@jmkehayias) posted a link to a missing index script on the MSDN forums. Jonathan’s script is modified from something he put together for a question posed by Jeremiah Peschka (@peschkaj).

Jonathan’s script intrigued me. I’d never tried to search a query plan’s XML before, and it certainly presents some interesting possibilities. After dealing with a missing index in production a few weeks ago (caused by an index change), I thought it’d be a great idea to put a regular monitor in place.

Everything was going well until I ran into a problem where I couldn’t get the proc name returned. The results spanned numerous databases; I had the object_id and database_id, but I wanted to store the proc name instead. I tried several different methods, including sp_msforeachdb and sp_executeSQL, and while I had a working solution, it was a little more clunky than I liked. So I asked my awesome followers on Twitter for any tips and within minutes I had half a dozen responses. In the end, @MladenPrajdic solved my problem with a pretty simple solution: put ‘Use ?;’ at the start of my sp_msforeachdb statement. Thanks again, Mladen!

I’ve now had this process running on my server for a few days now, with good success. The stored procedure below will return the database name, proc name, and query plan XML for any stored proc with a missing index. This is a centralized proc that will store the results in a table for later action. If you click on the XML, you should see the actual query plan with the missing index details. Because this looks at cached query plans, your best bet is to run it fairly regularly (maybe daily) to increase your chances of catching any problem procs.

/* Create a stored procedure skeleton */
If ObjectProperty(Object_ID('dbo.dba_missingIndexStoredProc_sp'), N'IsProcedure') Is Null
    Execute ('Create Procedure dbo.dba_missingIndexStoredProc_sp As Print ''Hello World!''')
    RaisError('Procedure dba_missingIndexStoredProc_sp created.', 10, 1);
/* Drop our table if it already exists */
If Exists(Select Object_ID From sys.tables Where [name] = N'dba_missingIndexStoredProc')
    Drop Table dbo.dba_missingIndexStoredProc
    Print 'dba_missingIndexStoredProc table dropped!';
/* Create our table */
Create Table dbo.dba_missingIndexStoredProc
      missingIndexSP_id int Identity(1,1)   Not Null
    , databaseName      varchar(128)        Not Null
    , databaseID        int                 Not Null
    , objectName        varchar(128)        Not Null
    , objectID          int                 Not Null
    , query_plan        xml                 Not Null
    , executionDate     smalldatetime       Not Null
    Constraint PK_missingIndexStoredProc
        Primary Key Clustered(missingIndexSP_id)
Print 'dba_missingIndexStoredProc Table Created';
/* Configure our settings */
Set ANSI_Nulls On;
Set Quoted_Identifier On;
Alter Procedure dbo.dba_missingIndexStoredProc_sp
        /* Declare Parameters */
            @lastExecuted_inDays    int = 7
          , @minExecutionCount      int = 7
          , @logResults             bit = 1
          , @displayResults         bit = 0
    Name:       dba_missingIndexStoredProc_sp
    Author:     Michelle Ufford,
    Purpose:    Retrieves stored procedures with missing indexes in their
                cached query plans.
                @lastExecuted_inDays = number of days old the cached query plan
                                       can be to still appear in the results;
                                       the HIGHER the number, the longer the
                                       execution time.
                @minExecutionCount = minimum number of executions the cached
                                     query plan can have to still appear 
                                     in the results; the LOWER the number,
                                     the longer the execution time.
                @logResults = store results in dba_missingIndexStoredProc
                @displayResults = return results to the caller
    Notes:      This is not 100% guaranteed to catch all missing indexes in
                a stored procedure.  It will only catch it if the stored proc's
                query plan is still in cache.  Run regularly to help minimize
                the chance of missing a proc.
    Called by:  DBA and/or SQL Agent Job
    Date        User    Description
    2009-03-02  MFU     Initial Release for public consumption
    Exec dbo.dba_missingIndexStoredProc_sp
          @lastExecuted_inDays  = 30
        , @minExecutionCount    = 5
        , @logResults           = 1
        , @displayResults       = 1;
Set NoCount On;
Set XACT_Abort On;
Set Ansi_Padding On;
Set Ansi_Warnings On;
Set ArithAbort On;
Set Concat_Null_Yields_Null On;
Set Numeric_RoundAbort Off;
    /* Declare Variables */
    Declare @currentDateTime smalldatetime;
    Set @currentDateTime = GetDate();
    Declare @plan_handles Table
        plan_handle     varbinary(64)   Not Null
    Create Table #missingIndexes
          databaseID    int             Not Null
        , objectID      int             Not Null
        , query_plan    xml             Not Null
        Constraint PK_temp_missingIndexes Primary Key Clustered
            databaseID, objectID
    Begin Try
        /* Perform some data validation */
        If @logResults = 0 And @displayResults = 0
            /* Log the fact that there were open transactions */
            Execute dbo.dba_logError_sp
                  @errorType            = 'app'
                , @app_errorProcedure   = 'dba_missingIndexStoredProc_sp'
                , @app_errorMessage     = '@logResults = 0 and @displayResults = 0; no action taken, exiting stored proc.'
                , @forceExit            = 1
                , @returnError          = 1;  
        Begin Transaction;
        /* Retrieve distinct plan handles to minimize dm_exec_query_plan lookups */
        Insert Into @plan_handles
        Select Distinct plan_handle
        From sys.dm_exec_query_stats
        Where last_execution_time > DateAdd(day, -@lastExecuted_inDays, @currentDateTime)
            And execution_count > @minExecutionCount;
        With xmlNameSpaces (
            Default ''
        /* Retrieve our query plan's XML if there's a missing index */
        Insert Into #missingIndexes
        Select deqp.[dbid]
            , deqp.objectid
            , deqp.query_plan 
        From @plan_handles As ph
        Cross Apply sys.dm_exec_query_plan(ph.plan_handle) As deqp 
        Where deqp.query_plan.exist('//MissingIndex') = 1
            And deqp.objectid Is Not Null;
        /* Do we want to store the results of our process? */
        If @logResults = 1
            Insert Into dbo.dba_missingIndexStoredProc
            Execute sp_msForEachDB 'Use ?; 
                                    Select ''?''
                                        , mi.databaseID
                                        , Object_Name(o.object_id)
                                        , o.object_id
                                        , mi.query_plan
                                        , GetDate()
                                    From sys.objects As o 
                                    Join #missingIndexes As mi 
                                        On o.object_id = mi.objectID 
                                    Where databaseID = DB_ID();';
        /* We're not logging it, so let's display it */
            Execute sp_msForEachDB 'Use ?; 
                                    Select ''?''
                                        , mi.databaseID
                                        , Object_Name(o.object_id)
                                        , o.object_id
                                        , mi.query_plan
                                        , GetDate()
                                    From sys.objects As o 
                                    Join #missingIndexes As mi 
                                        On o.object_id = mi.objectID 
                                    Where databaseID = DB_ID();';
        /* See above; this part will only work if we've 
           logged our data. */
        If @displayResults = 1 And @logResults = 1
            Select *
            From dbo.dba_missingIndexStoredProc
            Where executionDate >= @currentDateTime;
        /* If you have an open transaction, commit it */
        If @@TranCount > 0
            Commit Transaction;
    End Try
    Begin Catch
        /* Whoops, there was an error... rollback! */
        If @@TranCount > 0
            Rollback Transaction;
        /* Return an error message and log it */
        Execute dbo.dba_logError_sp;
    End Catch;
    /* Clean-Up! */
    Drop Table #missingIndexes;
    Set NoCount Off;
    Return 0;

Not know what “EXECUTE dbo.dba_logError_sp;” is doing? Check out my blog post on Error Handling in T-SQL.




Bulk Inserts with XML

Last week, I blogged about how to perform one-to-many inserts with table-valued parameters, a feature new in 2008. For those who do not yet have 2008 or will not have it in the near future, it may still be beneficial to use XML for bulk inserts.

Here’s a pretty simple example of how to accomplish this:

/* Create some tables to work with */
CREATE TABLE dbo.orders
      order_id      INT IDENTITY(1,1)   Not Null
    , orderDate     DATETIME            Not Null
    , customer_id   INT                 Not Null
    CONSTRAINT PK_orders
        PRIMARY KEY CLUSTERED(order_id)
CREATE TABLE dbo.orderDetails
      orderDetail_id    INT IDENTITY(1,1)   Not Null
    , order_id          INT                 Not Null
    , lineItem          INT                 Not Null
    , product_id        INT                 Not Null
    CONSTRAINT PK_orderDetails
        PRIMARY KEY CLUSTERED(orderDetail_id)
    CONSTRAINT FK_orderDetails_orderID
        FOREIGN KEY(order_id)
        REFERENCES dbo.orders(order_id)
/* Create a new procedure using an XML parameter */
CREATE PROCEDURE dbo.insert_orderXML_sp
      @orderDate        DATETIME
    , @customer_id      INT
    , @orderDetailsXML  XML
    DECLARE @myOrderID INT;
    INSERT INTO dbo.orders
        , customer_id    
        , @customer_id
    SET @myOrderID = SCOPE_IDENTITY();
    INSERT INTO dbo.orderDetails
        , lineItem
        , product_id
    SELECT @myOrderID
         , myXML.value('./@lineItem', 'int')
         , myXML.value('./@product_id', 'int')
    FROM @orderDetailsXML.nodes('/orderDetail') As nodes(myXML);
/* Call our stored procedure */
EXECUTE dbo.insert_orderXML_sp
      @orderDate = '2008-01-01'
    , @customer_id = 101
    , @orderDetailsXML = 
        '<orderDetail lineItem="1" product_id="123" />
         <orderDetail lineItem="2" product_id="456" />
         <orderDetail lineItem="3" product_id="789" />
         <orderDetail lineItem="4" product_id="246" />
         <orderDetail lineItem="5" product_id="135" />';
/* Check our data */
SELECT * FROM dbo.orders;
SELECT * FROM dbo.orderDetails;
/* Clean up our mess */
DROP PROCEDURE insert_orderXML_sp;
DROP TABLE dbo.orderDetails;
DROP TABLE dbo.orders;

I’ve found that this is more efficient when performing large parent/child inserts, i.e. 1 parent record to 100 child records. Keep in mind that there’s a point where doing an insert with XML is more expensive than using a traditional INSERT stored procedure. I haven’t run any tests yet to help define what that threshold is… more to come on this in the near future.

More on the Nodes() Method can be found here in Books Online:

Update: I’ve just learned that the “value” keyword is case-sensitive. Apparently my code box plug-in was defaulting “value” to “VALUE.” :)

Here’s the error message you’ll get if you don’t have “value” in lower-case:
Cannot find either column “myXML” or the user-defined function or aggregate “myXML.VALUE”, or the name is ambiguous.