Index Definition Audit Script

Recently, I needed to audit indexes on two different servers. I wanted to compare things like index keys and included columns, partitioning keys, unique constraints, clustered indexes and primary keys, and filter index definitions. Basically, I wanted to make sure that the indexing of two databases on two different servers were completely in sync. To do this, I wrote the following scripts. The first script will audit a single database or even a single table. The second script once more makes use of Aaron Bertrand’s sp_foreachdb procedure to iterate through every database on a server.

To do a quick and easy compare, I dumped the results to a single table on each server and used Red Gate’s SQL Data Compare to find the differences.

This is another metadata script, so it should be fairly lightweight. That said, all of the usual disclaimers apply:

  • This script will only run on SQL 2008 or newer because of the filtered index component. See Patrick’s solution in the comments below for a mod that works in 2005 too.
  • It worked for me, but YMMV depending on editions, collations, creative settings, etc.
  • I don’t currently have access to any instances older than SQL Server 2012, so please respond with any backwards compatibility issues.
  • If you don’t have sp_foreachdb installed and don’t want to install it, you can replace it with sp_msforeachdb, but be aware databases may be skipped

Enjoy. :)

Single-Database Version

WITH indexCTE AS
(
    SELECT st.object_id                                                                         AS objectID
        , st.name                                                                               AS tableName
        , si.index_id                                                                           AS indexID
        , si.name                                                                               AS indexName
        , si.type_desc                                                                          AS indexType
        , sc.column_id                                                                          AS columnID
        , sc.name + CASE WHEN sic.is_descending_key = 1 THEN ' DESC' ELSE '' END                AS columnName
        , sic.key_ordinal                                                                       AS ordinalPosition
        , CASE WHEN sic.is_included_column = 0 AND key_ordinal > 0 THEN sc.name ELSE NULL END   AS indexKeys
        , CASE WHEN sic.is_included_column = 1 THEN sc.name ELSE NULL END                       AS includedColumns
        , sic.partition_ordinal                                                                 AS partitionOrdinal
        , CASE WHEN sic.partition_ordinal > 0 THEN sc.name ELSE NULL END                        AS partitionColumns
        , si.is_primary_key                                                                     AS isPrimaryKey
        , si.is_unique                                                                          AS isUnique
        , si.is_unique_constraint                                                               AS isUniqueConstraint
        , si.has_filter                                                                         AS isFilteredIndex
        , COALESCE(si.filter_definition, '')                                                    AS filterDefinition
    FROM sys.tables                         AS st
    INNER JOIN sys.indexes                  AS si 
        ON si.object_id =   st.object_id
    INNER JOIN sys.index_columns            AS sic 
	    ON sic.object_id=si.object_id
        AND sic.index_id=si.index_id 
    INNER JOIN sys.columns                  AS sc 
	    ON sc.object_id = sic.object_id 
	    and sc.column_id = sic.column_id
) 
 
SELECT DISTINCT 
      @@SERVERNAME                                      AS ServerName
    , DB_NAME()                                         AS DatabaseName
    , tableName
    , indexName
    , indexType
    , STUFF((
            SELECT ', ' + indexKeys
                FROM indexCTE
            WHERE objectID = cte.objectID
                AND indexID = cte.indexID
                AND indexKeys IS NOT NULL 
            ORDER BY ordinalPosition
                FOR XML PATH(''), 
      TYPE).value('.','varchar(max)'),1,1,'')           AS indexKeys
    , COALESCE(STUFF((
            SELECT ', ' + includedColumns
                FROM indexCTE
            WHERE objectID = cte.objectID
                AND indexID = cte.indexID
                AND includedColumns IS NOT NULL 
            ORDER BY columnID
                FOR XML PATH(''), 
      TYPE).value('.','varchar(max)'),1,1,''), '')      AS includedColumns
    , COALESCE(STUFF((
            SELECT ', ' + partitionColumns
                FROM indexCTE
            WHERE objectID = cte.objectID
                AND indexID = cte.indexID
                AND partitionColumns IS NOT NULL 
            ORDER BY partitionOrdinal
                FOR XML PATH(''), 
      TYPE).value('.','varchar(max)'),1,1,''), '')      AS partitionKeys
    , isPrimaryKey
    , isUnique
    , isUniqueConstraint
    , isFilteredIndex
    , FilterDefinition
FROM indexCTE AS cte
--WHERE tableName = 'SalesOrderDetail'
ORDER BY tableName
    , indexName;

Multi-Database Version

IF OBJECT_ID('tempdb..#IndexAudit') IS NOT NULL
    DROP TABLE #IndexAudit;
 
CREATE TABLE #IndexAudit
(
      serverName                SYSNAME
    , databaseName              SYSNAME
    , tableName                 VARCHAR(128)
    , indexName                 VARCHAR(128)
    , indexType                 NVARCHAR(60)
    , indexKeys                 VARCHAR(8000)
    , includedColumns           VARCHAR(8000)
    , partitionColumns          VARCHAR(8000)
    , isPrimaryKey              BIT
    , isUnique                  BIT
    , isUniqueConstraint        BIT
    , isFilteredIndex           BIT
    , FilterDefinition          VARCHAR(8000)
);
 
EXECUTE sp_foreachdb 'USE ?;
WITH indexCTE AS
(
    SELECT st.object_id                                                                         AS objectID
        , st.name                                                                               AS tableName
        , si.index_id                                                                           AS indexID
        , si.type_desc                                                                          AS indexType
        , si.name                                                                               AS indexName
        , sc.column_id                                                                          AS columnID
        , sc.name + CASE WHEN sic.is_descending_key = 1 THEN '' DESC'' ELSE '''' END            AS columnName
        , sic.key_ordinal                                                                       AS ordinalPosition
        , CASE WHEN sic.is_included_column = 0 AND key_ordinal > 0 THEN sc.name ELSE NULL END   AS indexKeys
        , CASE WHEN sic.is_included_column = 1 THEN sc.name ELSE NULL END                       AS includedColumns
        , sic.partition_ordinal                                                                 AS partitionOrdinal
        , CASE WHEN sic.partition_ordinal > 0 THEN sc.name ELSE NULL END                        AS partitionColumns
        , si.is_primary_key                                                                     AS isPrimaryKey
        , si.is_unique                                                                          AS isUnique
        , si.is_unique_constraint                                                               AS isUniqueConstraint
        , si.has_filter                                                                         AS isFilteredIndex
        , COALESCE(si.filter_definition, '''')                                                  AS filterDefinition
    FROM sys.tables                         AS st
    INNER JOIN sys.indexes                  AS si 
        ON si.object_id =   st.object_id
    INNER JOIN sys.index_columns            AS sic 
	    ON sic.object_id=si.object_id
        AND sic.index_id=si.index_id 
    INNER JOIN sys.columns                  AS sc 
	    ON sc.object_id = sic.object_id 
	    and sc.column_id = sic.column_id
) 
 
INSERT INTO #IndexAudit
SELECT DISTINCT 
      @@SERVERNAME                                              AS ServerName
    , DB_NAME()                                                 AS DatabaseName
    , tableName
    , indexName
    , indexType
    , STUFF((
            SELECT '', '' + indexKeys
                FROM indexCTE
            WHERE objectID = cte.objectID
                AND indexID = cte.indexID
                AND indexKeys IS NOT NULL 
            ORDER BY ordinalPosition
                FOR XML PATH(''''), 
      TYPE).value(''.'',''varchar(max)''),1,1,'''')             AS indexKeys
    , COALESCE(STUFF((
            SELECT '', '' + includedColumns
                FROM indexCTE
            WHERE objectID = cte.objectID
                AND indexID = cte.indexID
                AND includedColumns IS NOT NULL 
            ORDER BY columnID
                FOR XML PATH(''''), 
      TYPE).value(''.'',''varchar(max)''),1,1,''''), '''')      AS includedColumns
    , COALESCE(STUFF((
            SELECT '', '' + partitionColumns
                FROM indexCTE
            WHERE objectID = cte.objectID
                AND indexID = cte.indexID
                AND partitionColumns IS NOT NULL 
            ORDER BY partitionOrdinal
                FOR XML PATH(''''), 
      TYPE).value(''.'',''varchar(max)''),1,1,''''), '''')      AS partitionKeys
    , isPrimaryKey
    , isUnique
    , isUniqueConstraint
    , isFilteredIndex
    , FilterDefinition
FROM indexCTE AS cte
ORDER BY tableName
    , indexName;
';
 
-- For multi-server testing, dump results to a temp table and compare tables
SELECT *
FROM #IndexAudit
WHERE databaseName NOT IN ('tempdb', 'master', 'msdb', 'model')
ORDER BY serverName
    , databaseName
    , tableName
    , indexName;

Example Results

ServerName   DatabaseName       tableName            indexName                      indexType    indexKeys                                                    includedColumns                                    partitionKeys isPrimaryKey isUnique isUniqueConstraint isFilteredIndex FilterDefinition
------------ ------------------ -------------------- ------------------------------ ------------ ------------------------------------------------------------ -------------------------------------------------- ------------- ------------ -------- ------------------ --------------- --------------------------------
SQLFOOL\2012 AdventureWorks2012 SalesOrderDetail     AK_SalesOrderDetail_rowguid    NONCLUSTERED  rowguid                                                                                                                      0            1        0                  0                                               
SQLFOOL\2012 AdventureWorks2012 SalesOrderDetail     FIX_SalesOrderDetail_1         NONCLUSTERED  CarrierTrackingNumber, SalesOrderID, ProductID               OrderQty, SpecialOfferID, LineTotal, rowguid                    0            0        0                  1               ([ModifiedDate]>='2006-01-01')  
SQLFOOL\2012 AdventureWorks2012 SalesOrderDetail     IX_SalesOrderDetail_ProductID  NONCLUSTERED  ProductID                                                                                                                    0            0        0                  0                                               
SQLFOOL\2012 AdventureWorks2012 SalesOrderDetail     PK_SalesOrderDetail_SalesOrder CLUSTERED     SalesOrderID, SalesOrderDetailID                                                                                             1            1        0                  0

Accelerating ETL Data Retrieval

As I’ve mentioned previously, SQL Server 2012 Integration Services Design Patterns is now available on Amazon. PASS has invited the ENTIRE author team to give a pre-conference training session at Summit 2012. Precons are all-day training events and are held on the days before Summit. Our precon will be held on Monday, November 5th. I’ve attended several precons at Summit, and in my opinion, they provide unparalleled depth and access to the presenters. I expect our precon will be no different.

So what will I be presenting on? Good question. I’m going to talk about Accelerating ETL Data Retrieval. Specifically, I’ll discuss strategies for retrieving data for full loads vs. incremental loads, and I’ll explore some strategies for retrieving data from large tables. To clarify “large,” since everyone has a different definition, I’ve successfully employed these strategies on tables with 10 billion rows.

Now you might be reading this and thinking, “Hey, that doesn’t sound like SSIS!” Well… yes and no. While I will be discussing some strategies within SSIS, most of what I discuss will take place at the database layer. In fact, most of my content could be useful for any ETL tool. I still think it’s a good topic for this precon for three reasons. First, most ETL developers I talk with — that is, people who specialize in and are primarily tasked with ETL — overlook the data retrieval step. I frequently hear, “I’m pulling all of the data, so it doesn’t matter, I can’t get the data any faster than SELECT * FROM.” That’s not always true, and I’ll explain why in the precon. Secondly, having a thorough understanding of data retrieval techniques is important. Data retrieval is one of the most — if not the single most — expensive components of ETL. Lastly, I think there’s a huge knowledge gap in this area. While there is a lot of discussion about query optimization, it rarely has to do with the type of query patterns that ETL requires.

So that’s what I’ll be talking about and why. What do you think? Worthwhile content?

Also, this is just one of the things that will be covered in our SSIS Design Patterns precon. The rest of the author team — Andy Leonard, Matt Masson, Tim Mitchell, and Jessica Moss — is putting together great content that includes package execution, parent/child relationships, expressions and scripting, dynamic configurations, error handling, data flow internals, scalability and parallelism, automation, dynamic package generation, data warehousing patterns, and more!

As an aside, I am honored to announce that I have been re-awarded as Microsoft MVP for SQL Server. My husband took this picture of my trophy for me, which I think turned out really cool. :)

Go Daddy Insiders

I don’t hide the fact that I work at Go Daddy. All discussions of advertising methods aside, it’s a great company to work for. Not only am I treated well as an employee, I also get to work in a world-class technical environment. However, the marketing campaigns tend to steal the spotlight. As a result, few people are aware of technology that it takes to be the #1 hosting provider in the world. Some examples of little-known facts about Go Daddy:

  • 10 billion DNS queries answered daily
  • Over 35,000 servers & 100,000 square feet of state-of-the-art global data centers
  • 25 petabytes — yes, petabytes! — of networked data storage

Pretty cool, huh? Go Daddy has launched a new blog called Inside Go Daddy as a way to share all the nitty gritty details of what it takes to support this kind of environment. Here’s a blurb from the site:

This is your inside source for what’s going on with Go Daddy’s tech experts. You’ll get insight and opinions from Go Daddy’s tech leaders on industry topics, company projects & open source initiatives … the leading edge, unconventional, “behind-the-scenes” information you won’t find anywhere else. It’s not PR, it’s not executive talk, it’s the story straight from Go Daddy’s developers, engineers & IT personnel.

Shockingly, I’ve signed up to blog about database scalability. ;) I’ve just started a new series that explores the tuning and design changes required to support 27k transactions per second during the airing of Go Daddy’s Super Bowl commercials. Those who attended my Summit 2009 session might recognize some of the high-level content, but this series will explore the topics in depth and with never-before-revealed detail. My first article, Scaling the Database: Data Types, is now live.

If you find the content helpful or interesting, please share the article or leave a comment. My employer monitors blog traffic, and we have a bit of a contest going on to see what topics get the most hits. Quite frankly, it’d be cool if the SQL Server topics outperformed the NoSQL topics. ;)

Also, I’ll entertain topic requests, so if there’s something you’re just dying to know about what we do or how we do it, let me know. :)