Tim Ford (@SQLAgentMan) recently blogged about his Top 5 SQL Server Indexing Best Practices. I thought it was a good list, and it inspired this blog post. I’ve recently been doing a little index spring cleaning, and I thought some people may be interested in the process I go through. So, here it is… a journey through madness an overview of my general missing index process.
I start with my trusty dba_missingIndexStoredProc table. If this table sounds completely foreign to you, check out my post, Find Missing Indexes In Stored Procs. Basically, I have a process that runs every night, scanning the XML of every query plan on the server to find procs that are possibly missing indexes. I then log the details for later action.
So I take a look at my table, and I find 8 stored procedures that are possibly missing indexes. Clicking on the XML link will show me the logged query plan:
Right clicking on the “Missing Index” description will give me the details of the recommended index:
Here’s an example of what SQL Server will return for you:
/* Missing Index Details from ExecutionPlan2.sqlplan The Query Processor estimates that implementing the following index could improve the query cost by 85.7327%. */ /* USE [msdb] GO CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[sysjobhistory] ([job_id]) INCLUDE ([instance_id],[step_id],[sql_message_id],[sql_severity], [run_status],[run_date],[run_time],[run_duration],[operator_id_emailed], [operator_id_netsent],[operator_id_paged],[retries_attempted],[server]) GO */ |
I now compare the details of this proposed index to the missing index DMV suggestions, using this query:
SELECT t.name AS 'affected_table' , 'Create NonClustered Index IX_' + t.name + '_missing_' + CAST(ddmid.index_handle AS VARCHAR(10)) + ' On ' + ddmid.STATEMENT + ' (' + IsNull(ddmid.equality_columns,'') + CASE WHEN ddmid.equality_columns IS Not Null And ddmid.inequality_columns IS Not Null THEN ',' ELSE '' END + IsNull(ddmid.inequality_columns, '') + ')' + IsNull(' Include (' + ddmid.included_columns + ');', ';' ) AS sql_statement , ddmigs.user_seeks , ddmigs.user_scans , CAST((ddmigs.user_seeks + ddmigs.user_scans) * ddmigs.avg_user_impact AS INT) AS 'est_impact' , ddmigs.last_user_seek FROM sys.dm_db_missing_index_groups AS ddmig INNER JOIN sys.dm_db_missing_index_group_stats AS ddmigs ON ddmigs.group_handle = ddmig.index_group_handle INNER JOIN sys.dm_db_missing_index_details AS ddmid ON ddmig.index_handle = ddmid.index_handle INNER JOIN sys.tables AS t ON ddmid.OBJECT_ID = t.OBJECT_ID WHERE ddmid.database_id = DB_ID() --AND t.name = 'myTableName' ORDER BY CAST((ddmigs.user_seeks + ddmigs.user_scans) * ddmigs.avg_user_impact AS INT) DESC; |
I usually find the data in both places, but not always. One reason why is because the missing index DMV will only store data since your last reboot. So if I’m taking a look at this DMV on Monday and I just rebooted on Sunday, I may not have enough history to give me meaningful recommendations. This is just something to be aware of.
What I’m looking for in this DMV is the number of user_seeks and the est_impact. Also, if I haven’t rebooted my server in a while, I take a look at last_user_seek so I can determine whether or not it’s still accurate.
Next, I take a look at my existing indexes using Kimberly Tripp’s sp_helpindex2 system stored proc. I use her proc instead of sp_helpindex because I need to see included columns.
If you’re wondering why I’m looking at existing indexes, the reason is because I’m looking for indexes that can be modified slightly to accommodate my missing index needs. By “modified slightly,” I mean that I’d only want to make a change to an existing index if it did not drastically change the size or composition of an index, i.e. adding one or two narrow columns as included columns. I do NOT mean making changes that double the size of your index; in those cases, you’d probably be better off creating a brand new index.
Looking at existing indexes is actually a pretty critical part of the puzzle. If I have a proc that only gets called a few times an hour and could benefit from a better index, I may not create that index if it means adding a wide, expensive index to a busy table. If I can make a small modification to an existing index, then there’s a greater chance I’ll make the change and cover my query.
At this point, I should have enough information to start making decisions. I was going to write out the path I normally take when making decisions, but I thought, “Hey! What a great time for a diagram.” So here you go:
Disclaimer: I’m *not* a Visio wizard, so if I butchered the use of certain symbols in my diagram, please let me know so I can a) fix it, and b) learn from it!
It’s hard to really put all of the decision paths into a single, small diagram like this. There’s a lot of variables that I’m not even touching here. But I think this is a fairly good “generic” representation of the path I take. When I hit an “end” process, it means I don’t create the missing index at this time. Maybe in the future, it’ll become necessary, but I prefer to err on the side of less indexes.
So there you have it, a brief look at my missing index process. Hopefully someone finds it helpful. 🙂