Digging Around in SQL Internals – View Page Data

So lately I’ve been geeking out on SQL internals. My most recent find involves a couple of undocumented toys, DBCC Page and sys.system_internals_allocation_units.

DBCC Page will actually display the contents of a page. I found a blog post by Paul Randal, back when he still worked for Microsoft, describing the DBCC Page command. Here’s a summary:

dbcc page ( {‘dbname’ | dbid}, filenum, pagenum [, printopt={0|1|2|3} ])

The filenum and pagenum parameters are taken from the page IDs that come from various system tables and appear in DBCC or other system error messages. A page ID of, say, (1:354) has filenum = 1 and pagenum = 354.

The printopt parameter has the following meanings:

* 0 – print just the page header
* 1 – page header plus per-row hex dumps and a dump of the page slot array (unless its a page that doesn’t have one, like allocation bitmaps)
* 2 – page header plus whole page hex dump
* 3 – page header plus detailed per-row interpretation

The per-row interpretation work for all page types, including allocation bitmaps.

We’ll come back to actually viewing a page in just a minute. Now that we know we can view page contents, how do we find out which pages contain the data we want to look at? This is where sys.system_internals_allocation_units can help. The sys.system_internals_allocation_units DMV is just like sys.allocation_units, except it has a few additional columns of interest: [first_page], [root_page], and [first_iam_page]. The query below will return the filenum and pagenum values, for use in the DBCC Page command, for a specific table.

Select Object_Name(p.object_id) As 'tableName'
    , i.name As 'indexName'
    , p.partition_number
    , au.type_desc
    , Convert (varchar(6),
      Convert (int, SubString (au.first_page, 6, 1) +
         SubString (au.first_page, 5, 1))) +
   ':' + Convert (varchar(20),
      Convert (int, SubString (au.first_page, 4, 1) +
         SubString (au.first_page, 3, 1) +
         SubString (au.first_page, 2, 1) +
         SubString (au.first_page, 1, 1))) As 'firstPage'
    , Convert (varchar(6),
      Convert (int, SubString (au.root_page, 6, 1) +
         SubString (au.root_page, 5, 1))) +
   ':' + Convert (varchar(20),
      Convert (int, SubString (au.root_page, 4, 1) +
         SubString (au.root_page, 3, 1) +
         SubString (au.root_page, 2, 1) +
         SubString (au.root_page, 1, 1))) As 'rootPage'
    , Convert (varchar(6),
      Convert (int, SubString (au.first_iam_page, 6, 1) +
         SubString (au.first_iam_page, 5, 1))) +
   ':' + Convert (varchar(20),
      Convert (int, SubString (au.first_iam_page, 4, 1) +
         SubString (au.first_iam_page, 3, 1) +
         SubString (au.first_iam_page, 2, 1) +
         SubString (au.first_iam_page, 1, 1))) As 'firstIAM_page'
From sys.indexes As i
Join sys.partitions As p
    On i.object_id = p.object_id
    And i.index_id = p.index_id
Join sys.system_internals_allocation_units As au
    On p.hobt_id = au.container_id
Where Object_Name(p.object_id) = 'ProductReview'
Order By tableName;

Note: the conversion code was borrowed from one of Kimberley Tripp’s posts, see the References section for the link.

Running the preceding query in the AdventureWorks database will produce the following:

tableName  indexName                           partition_number type_desc    firstPage  rootPage  firstIAM_page
---------- ----------------------------------- ---------------- ------------ ---------- --------- --------------
ProductRev PK_ProductReview_ProductReviewID    1                IN_ROW_DATA  1:770      1:773     1:771
ProductRev IX_ProductReview_ProductID_Name     1                IN_ROW_DATA  1:911      1:914     1:912

Great, now we have a starting place! Let’s now take DBCC Page out for a spin and see what it can do. I’m going to use the [firstPage] value for the IX_ProductReview_ProductID_Name index. Remember, the value preceding the colon (:) is your file number, and the value after it is your page number.

DBCC TraceOn (3604);
DBCC Page (AdventureWorks, 1, 911, 3);
DBCC TraceOff (3604);

You should get back something like the following. (Note: for brevity’s sake, I’m only displaying part of the results)

Allocation Status
 
GAM (1:2) = ALLOCATED                SGAM (1:3) = NOT ALLOCATED           
PFS (1:1) = 0x60 MIXED_EXT ALLOCATED   0_PCT_FULL                         DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED            
FileId PageId      Row    Level  ProductID (key) ReviewerName (key)                                 ProductReviewID (key) Comments
------ ----------- ------ ------ --------------- -------------------------------------------------- --------------------- -------------------------------------------------------------------------
1      911         0      0      709             John Smith                                         1                     I can't believe I'm singing the praises of a pair of socks, but...(79025483b74e)

Let’s take a moment to consider why we’re seeing this. You’ll notice there’s 3 columns with (key) next to the name: [ProductID], [ReviewerName], and [ProductReviewID]. There’s one column without (key): [Comments]. This is exactly what I was expecting to see. Why? If you take a look at the details of IX_ProductReview_ProductID_Name, you’ll notice it’s a non-clustered index on only [ProductID, ReviewerName], with an included column, [Comments]. You’ll also notice the clustered index on Production.ProductReview is [ProductReviewID]. Remember, the composition of a non-clustered index includes the clustered index value as a sort of row pointer.

Not sure how to view the details of an index, like included columns? Check out my Index Interrogation Script.

I was interested in the ordering of the data. I wasn’t sure whether I’d find [ProductReviewID] first or last. But after looking at several of these examples, I’ve always found the clustered index to be nestled behind the non-clustered index columns and before the included columns (as exampled above).

That’s all I have for now. Try it out, play with the different printopt values, look at different types of pages, and have fun. Oh, and if anyone has a link to a blog post or a book that explains each of the items in the page header, I’d greatly appreciate it. :)

Happy Digging!

Michelle

References:

Source: http://sqlfool.com/2009/02/view_page_data/

Fragmentation on Replicated Tables

I’ve heard more than one person state that they do not need to defrag their tables, because they’re replicated and the source tables are defragged regularly. Let’s set this myth straight.

Repeat after me: defragmentation processes are not replicated.

Good. Thank you.

For those non-believers, let me prove it:

SET NOCOUNT ON;
 
/* Create publication db */
CREATE DATABASE sandbox_publisher;
GO
 
/* Create subscription db */
CREATE DATABASE sandbox_subscriber;
GO
 
/* Navigate to publisher db */
USE sandbox_publisher;
GO
 
/* Create a table to play with */
CREATE TABLE myTable
(
      myID UNIQUEIDENTIFIER
    , myColumn VARCHAR(2000)
 
    CONSTRAINT PK_myTable PRIMARY KEY CLUSTERED (myID)
);
/* NOTE: please never, ever use this
   type of clustered index in real-life.
   The whole purpose is to generate
   substantial fragmentation for 
   illustrative purposes. */
 
/* Populate it with some non-sequential
   records to generate fragmentation */
DECLARE @iteration INT;
SET @iteration = 0;
 
WHILE @iteration < 1000
BEGIN
    INSERT INTO myTable
    SELECT NewID(), REPLICATE('A', 2000);
 
    SET @iteration = @iteration + 1;
END
 
/* Now let's verify that we really did
   make a mess of our clustered index */
SELECT avg_fragmentation_in_percent
    , page_count
FROM sys.dm_db_index_physical_stats 
(DB_ID(), OBJECT_ID(N'MyTable'), NULL , NULL, N'Limited') 
OPTION (MaxDop 1);
 
/* Results:
 
avg_fragmentation_in_percent page_count
---------------------------- --------------------
99.0157480314961             508
 
99% fragmented, not too shabby!  Your results 
may vary slightly, based on the guid values
generated by NewID().
*/
 
/* Now, let's set up replication */
USE sandbox_publisher;
GO
 
/* Enable publication */
EXECUTE SP_REPLICATIONDBOPTION
      @dbname = N'sandbox_publisher'
    , @optname = N'publish'
    , @VALUE = N'true';
GO
 
/* Create our publication */
EXECUTE SP_ADDPUBLICATION
      @publication = N'myTestPublication' 
    , @sync_method = N'concurrent'
    , @description = N'Test Publication'
    , @status = N'active'
    , @independent_agent = N'true'
    , @immediate_sync = N'true'
    , @retention = 0
GO
 
/* Create our snapshot agent */
EXECUTE SP_ADDPUBLICATION_SNAPSHOT
      @publication = N'myTestPublication'
    , @frequency_type = 1;
 
/* Add our table as an article */
EXECUTE SP_ADDARTICLE
      @publication = N'myTestPublication'
    , @article = N'myTable'
    , @source_owner = N'dbo'
    , @source_object = N'myTable'
    , @type = N'logbased'
    , @destination_table = N'myTable'
    , @destination_owner = N'dbo'
    , @ins_cmd = N'CALL sp_MSins_dbomyTable'
    , @del_cmd = N'CALL sp_MSdel_dbomyTable'
    , @upd_cmd = N'MCALL sp_MSupd_dbomyTable';
GO
 
/* Generate an initial snapshot */
EXECUTE sp_startpublication_snapshot
      @publication = N'myTestPublication';
 
/* Create our subscription */
EXECUTE SP_ADDSUBSCRIPTION 
      @publication = N'myTestPublication'
    , @subscriber = N'YourServerName'
    , @destination_db = N'sandbox_subscriber';
 
/* At this point, basic transactional replication 
   should be configured.  You should now have a 
   copy of myTable in your subscription database. */
 
USE sandbox_subscriber;
GO
 
/* Let's take a look at fragmentation on the 
   subscription database. */
SELECT avg_fragmentation_in_percent
    , page_count
FROM sys.dm_db_index_physical_stats 
(DB_ID(), OBJECT_ID(N'MyTable'), NULL , NULL, N'Limited') 
OPTION (MaxDop 1);
 
/* Results:
 
avg_fragmentation_in_percent page_count
---------------------------- --------------------
0.29940119760479             334
 
Less than 1% fragmented!  Why is this?  During
initialization, a copy of the schema and data
is sent to the subscriber.  The data is sorted
before insertion into the destination table.
*/
 
USE sandbox_publisher;
GO
 
/* Let's insert another 1000 records and see
   what happens... */
DECLARE @iteration INT;
SET @iteration = 0;
 
WHILE @iteration < 1000
BEGIN
    INSERT INTO sandbox_publisher.dbo.myTable
    SELECT NewID(), REPLICATE('A', 2000);
 
    SET @iteration = @iteration + 1;
END
 
SELECT avg_fragmentation_in_percent
    , page_count
FROM sys.dm_db_index_physical_stats 
(DB_ID(), OBJECT_ID(N'MyTable'), NULL , NULL, N'Limited') 
OPTION (MaxDop 1);
 
/* Results:
 
avg_fragmentation_in_percent page_count
---------------------------- --------------------
98.6193293885602             1014
 
The publisher is still very fragmented 
(as expected) */
 
USE sandbox_subscriber;
GO
 
SELECT avg_fragmentation_in_percent
    , page_count
FROM sys.dm_db_index_physical_stats 
(DB_ID(), OBJECT_ID(N'MyTable'), NULL , NULL, N'Limited') 
OPTION (MaxDop 1);
 
/* Results:
 
avg_fragmentation_in_percent page_count
---------------------------- --------------------
95.4034729315628             979
 
But now the subscriber is fragmented too!  This
is because each individual transaction is 
replicated, so out-of-order inserts affect
both the publishing and the subcribing 
databases equally.
*/
 
/* Finally, let's rebuild our index on the 
   publisher and see what happens */
USE sandbox_publisher;
GO
 
ALTER INDEX PK_myTable ON myTable Rebuild;
 
SELECT avg_fragmentation_in_percent
    , page_count
FROM sys.dm_db_index_physical_stats 
(DB_ID(), OBJECT_ID(N'MyTable'), NULL , NULL, N'Limited') 
OPTION (MaxDop 1);
 
/* Results:
 
avg_fragmentation_in_percent page_count
---------------------------- --------------------
0                            667
 
0% fragmentation in the publishing table!  
What about our subscribing table?
*/
 
USE sandbox_subscriber;
GO
 
SELECT avg_fragmentation_in_percent
    , page_count
FROM sys.dm_db_index_physical_stats 
(DB_ID(), OBJECT_ID(N'MyTable'), NULL , NULL, N'Limited') 
OPTION (MaxDop 1);
 
/* Results:
 
avg_fragmentation_in_percent page_count
---------------------------- --------------------
95.4034729315628             979
 
As you can see, the subcribing table was
completely unaffected by the defrag
activities on the publisher.
*/

So there you have it, proof that defrag operations are not replicated! Now go defrag your replicated tables. :)

Oh, and if you need help defragging your indexes, check out my index defrag script (2005, 2008).

Creating a 60 GB Index

Recently, I needed to create an index on a 1.5 billion row table. I’ve created some large indexes before, but this was the largest, so I thought I’d share my experience in case anyone was interested.

The plan was to create the following index:

Create NonClustered Index IX_indexName_unpartitioned
    On databaseName.dbo.tableName(columnList)
    Include (includedColumnList)
    With (MaxDop = 1, Online = On, Sort_In_TempDB = On)
    On [Primary];

This is an unpartitioned, non-clustered index being built on a partitioned table. Normally, when you build an aligned partitioned index, the index creation process requires less memory and has less noticeable impact on the system, because each partition is handled individually, one at a time. But as this is an unpartitioned (unaligned) index, each partition was built concurrently, requiring more memory and causing a greater impact on performance. Because of this, I needed to restrict the process to MaxDop 1; otherwise, the server would suffer because of too much memory pressure.

I chose Sort_In_TempDB = On because:

  • I’m building this index online on a very busy table and cannot afford to impact normal oeprations. By using Sort_In_TempDB = On, index transactions are separated from user transactions, allowing the user transaction log to be truncated. *
  • TempDB is on a different volume and therefore should reduce the duration of the operation.
  • The recovery for the user database is full, and the recovery for the TempDB is simple. Sorting in TempDB would minimize logging.

* Note: the transaction log for the user database still grew at a much faster rate than normal and had to be closely monitored during this operation to ensure enough free space remained.

The size of the indexed columns is 25 bytes. So I ran my calculations and came up with 36gb space requirement. We increased TempDB to 50gb and gave it a go. An hour later… ERROR. The process terminated because there would not be enough space free in TempDB to complete the operation. Obviously, my calculations were incorrect. After speaking with Chris Leonard, a man who is way too smart for his own good, I realized I had not included my clustered index in the size calculations. Doh.

Re-running my estimates, here’s what I came up with:

Index Size 25   bytes
Clustered Index Size 16   bytes
Records per Page 197
Est. Rows 1,575,000,000
Est. Number of Pages 7,995,000
Space Requirements 59   GB

 

Obviously, 50gb of free space just wasn’t going to cut it. I decided to give TempDB a little wiggle room and bumped up the space in TempDB to 70gb (not as easy as you’d imagine, I had to requisition more space on the SAN), then re-ran the operation. Success! The process completed in 3 hours and 24 minutes. There was a mild increase in CPU, but no applications or users experienced any issues.

For those interested in the particulars: this was used for a single-record look-up and could not be filtered by the partitioning key. The non-partitioned version of this index has 80% less reads and 11% less CPU than its partitioned counterpart.

If you’re interested in learning more about indexes, here’s some recommended reading:

Index Clean-Up Scripts

I’ve been spending a lot of time lately looking at indexing in my production environments… dropping un-used ones, adding missing ones, and fine-tuning the ones I already have. I thought I’d share some of the scripts I’ve been using to accomplish this.

Here’s the script I use to find any un-used indexes. This relies heavily on the sys.dm_db_index_usage_stats DMV (2005+). This query will also return the SQL statements needed to drop the indexes for convenience. This does NOT mean you should necessarily drop the index. This is only a guide and a starting point; only you know how your application is used and whether SQL Server’s recommendations make sense.

Un-Used Indexes Script

Declare @dbid int
    , @dbName varchar(100);
 
Select @dbid = DB_ID()
    , @dbName = DB_Name();
 
With partitionCTE (object_id, index_id, row_count, partition_count) 
As
(
    Select [object_id]
        , index_id
        , Sum([rows]) As 'row_count'
        , Count(partition_id) As 'partition_count'
    From sys.partitions
    Group By [object_id]
        , index_id
) 
 
Select Object_Name(i.[object_id]) as objectName
        , i.name
        , Case 
            When i.is_unique = 1 
                Then 'UNIQUE ' 
            Else '' 
          End + i.type_desc As 'indexType'
        , ddius.user_seeks
        , ddius.user_scans
        , ddius.user_lookups
        , ddius.user_updates
        , cte.row_count
        , Case When partition_count > 1 Then 'yes' 
            Else 'no' End As 'partitioned?'
        , Case 
            When i.type = 2 And i.is_unique_constraint = 0
                Then 'Drop Index ' + i.name 
                    + ' On ' + @dbName 
                    + '.dbo.' + Object_Name(ddius.[object_id]) + ';'
            When i.type = 2 And i.is_unique_constraint = 1
                Then 'Alter Table ' + @dbName 
                    + '.dbo.' + Object_Name(ddius.[object_ID]) 
                    + ' Drop Constraint ' + i.name + ';'
            Else '' 
          End As 'SQL_DropStatement'
From sys.indexes As i
Inner Join sys.dm_db_index_usage_stats ddius
    On i.object_id = ddius.object_id
        And i.index_id = ddius.index_id
Inner Join partitionCTE As cte
    On i.object_id = cte.object_id
        And i.index_id = cte.index_id
Where ddius.database_id = @dbid
Order By 
    (ddius.user_seeks + ddius.user_scans + ddius.user_lookups) Asc
    , user_updates Desc;

This next script relies on several DMV’s (2005+) that identify missing indexes. While this is good information, the index recommendations do not always make sense and/or sometimes overlap. Also, these DMV’s store data since the SQL Server was last restarted, so if it’s been a while since your server was rebooted, this data may be out of date. This script also provides a SQL statement, in case you do decide to create the index, but it doesn’t take into consideration advanced parameters (i.e. sort_in_tempDB, Online, MaxDop, etc.) and only provides a basic create statement. Nonetheless, it’s another good starting point.

Missing Index Script

Select t.name As 'affected_table'
    , 'Create NonClustered Index IX_' + t.name + '_missing_' 
        + Cast(ddmid.index_handle As varchar(10))
        + ' On ' + ddmid.statement 
        + ' (' + IsNull(ddmid.equality_columns,'') 
        + Case When ddmid.equality_columns Is Not Null 
            And ddmid.inequality_columns Is Not Null Then ',' 
                Else '' End 
        + IsNull(ddmid.inequality_columns, '')
        + ')' 
        + IsNull(' Include (' + ddmid.included_columns + ');', ';'
        ) As sql_statement
    , ddmigs.user_seeks
    , ddmigs.user_scans
    , Cast((ddmigs.user_seeks + ddmigs.user_scans) 
        * ddmigs.avg_user_impact As int) As 'est_impact'
    , ddmigs.last_user_seek
From sys.dm_db_missing_index_groups As ddmig
Inner Join sys.dm_db_missing_index_group_stats As ddmigs
    On ddmigs.group_handle = ddmig.index_group_handle
Inner Join sys.dm_db_missing_index_details As ddmid 
    On ddmig.index_handle = ddmid.index_handle
Inner Join sys.tables As t
    On ddmid.object_id = t.object_id
Where ddmid.database_id = DB_ID()
    And Cast((ddmigs.user_seeks + ddmigs.user_scans) 
        * ddmigs.avg_user_impact As int) > 100
Order By Cast((ddmigs.user_seeks + ddmigs.user_scans) 
    * ddmigs.avg_user_impact As int) Desc;

Indexing for Partitioned Tables

So you’ve partitioned your table, and now you’re ready to performance tune. As with any table, indexing is a great place to start. And if you’re like most people new to partitioning, you probably created all of your indexes on the partitioned scheme, either by design or unintentionally.

Let’s take a step back here and discuss what a partitioned index is. A partitioned index, like a partitioned table, separates data into different physical structures (partitions) under one logical name. Specifically, each partition in a nonclustered index contains its own B-tree structure with a subset of rows, based upon the partitioning scheme. By default, an unpartitioned nonclustered index has just one partition.

Keep in mind, when you create an index on a partitioned table, i.e.

Create NonClustered Index IX_myIndex
    On dbo.myTable(myColumn);

… you are creating the index on the partitioned scheme by default. In order to create a NON-partitioned index on that same table, you would need to explicitly declare “On [FileGroup]“, i.e.

Create NonClustered Index IX_myIndex
    On dbo.myTable(myColumn)
    On [Primary];

 

But should you partition your index? That depends on how you use it. In fact, most environments will probably want to use a mix of partitioned and non-partitioned indexes. I’ve found that that partitioned indexes perform better when aggregating data or scanning partitions. Conversely, you’ll probably find that, if you need to locate a single, specific record, nothing performs better than a non-partitioned index on that column.

Let’s walk through some examples and see how they perform. I’ll bring back my trusty ol’ orders table for this.

/* Create a partition function. */
Create Partition Function 
    [test_monthlyDateRange_pf] (datetime)
    As Range Right For Values
    ('2009-01-01', '2009-01-08', '2009-01-15'
    , '2009-01-22', '2009-01-29');
Go
 
/* Associate the partition function with a partition scheme. */
Create Partition Scheme test_monthlyDateRange_ps
    As Partition test_monthlyDateRange_pf
    All To ([Primary]);
Go
 
/* Create a partitioned table. */
Create Table dbo.orders
(
      order_id  int Identity(1,1)   Not Null
    , orderDate datetime            Not Null
    , orderData smalldatetime       Not Null
 
    Constraint PK_orders Primary Key Clustered
    (
        order_id
      , orderDate
    )
) On test_monthlyDateRange_ps(orderDate);
Go
 
/* Create some records to play with. */
Set NoCount On;
 
Declare @endDate datetime = '2009-01-01';
 
While @endDate < '2009-02-01'
Begin
 
    Insert Into dbo.orders
    Select @endDate, @endDate;
 
    Set @endDate = DATEADD(minute, 1, @endDate);
 
End;
 
Set NoCount Off;
 
 
/* Let’s create an aligned, partitioned index. */
Create NonClustered Index IX_orders_aligned
    On dbo.orders(order_id)
    On test_monthlyDateRange_ps(orderDate); 
    /* you don't actually need to declare the last
       line of this unless you want to create the
       index on a different partitioning scheme.   */
 
/* Now let’s create an unpartitioned index. */
Create NonClustered Index IX_orders_unpartitioned
    On dbo.orders(order_id)
    On [Primary];

 

Now that we have both a partitioned and an unpartitioned index, let’s take a look at our sys.partitions table:

/* Let's take a look at our index partitions */
Select i.name
    , i.index_id
    , p.partition_number
    , p.rows
From sys.partitions As p
Join sys.indexes As i
    On p.object_id = i.object_id 
   And p.index_id = i.index_id
Where p.object_id = object_id('orders')
Order By i.index_id, p.partition_number;

 

sys.partitions

sys.partitions

As expected, both of our partitioned indexes, PK_orders and IX_orders_aligned, have 6 partitions, with a subset of rows on each partition. Our unpartitioned non-clustered index, IX_orders_unpartitioned, on the other hand has just 1 partition containing all of the rows.

Now that we have our environment set up, let’s run through some different queries and see the performance impact of each type of index.

/* Query 1, specific record look-up, covered */
Select order_id, orderDate
From dbo.orders With (Index(IX_orders_aligned))
Where order_id = 25000;
 
Select order_id, orderDate
From dbo.orders With (Index(IX_orders_unpartitioned))
Where order_id = 25000;

 

Query 1

Query 1

The unpartitioned index performs significantly better when given a specific record to look-up. Now let’s try the same query, but utilizing a scan instead of a seek:

/* Query 2, specific record look-up, uncovered */
Select order_id, orderDate, orderData
From dbo.orders With (Index(IX_orders_aligned))
Where order_id = 30000;
 
Select order_id, orderDate, orderData
From dbo.orders With (Index(IX_orders_unpartitioned))
Where order_id = 30000;

 

Query 2

Query 2

Again we see that the non-partitioned index performs better with the single-record look-up. This can lead to some pretty dramatic performance implications. So when *would* we want to use a partitioned index? Two instances immediately pop to mind. First, partition switching can only be performed when all indexes on a table are aligned. Secondly, partitioned indexes perform better when manipulating large data sets. To see this in action, let’s try some simple aggregation…

/* Query 3, aggregation */
Select Cast(Round(Cast(orderdate As float), 0, 1) As smalldatetime) As 'order_date'
    , Count(*)
From dbo.orders With (Index(IX_orders_aligned))
Where orderDate Between '2009-01-01' And '2009-01-07 23:59'
Group By Cast(Round(Cast(orderdate As float), 0, 1) As smalldatetime) 
Order By Cast(Round(Cast(orderdate As float), 0, 1) As smalldatetime);
 
Select Cast(Round(Cast(orderdate As float), 0, 1) As smalldatetime) As 'order_date'
    , Count(*)
From dbo.orders With (Index(IX_orders_unpartitioned))
Where orderDate Between '2009-01-01' And '2009-01-07 23:59'
Group By Cast(Round(Cast(orderdate As float), 0, 1) As smalldatetime) 
Order By Cast(Round(Cast(orderdate As float), 0, 1) As smalldatetime);

 

Query 3

Query 3

As you can see, partitioned indexes perform better when aggregating data. This is just a simple example, but the results can be even more dramatic in a large production environment. This is one of the reasons why partitioned tables and indexes are especially beneficial in data warehouses.

So now you have a general idea of what a partitioned index is and when to use a partitioned index vs a non-partitioned index. Ultimately, your indexing needs will depend largely on the application and how the data is used. When in doubt, test, test, test! So to recap…

  • Specify “On [FileGroup]“ to create an unpartitioned index on a partitioned table
  • Consider using non-partitioned indexes for single-record look-ups
  • Use partitioned indexes for multiple records and data aggregations
  • To enable partition switching, all indexes on the table must be aligned.

For more information on partitioning, check out my other partitioning articles:

Partitioning Example
Partitioning 101
Tips for Large Data Stores

SQL Tweaks and Tools That Make My Life Easier

It still surprises me how many people don’t know about some of the very things that make my job so much easier. So this next post is dedicated to sharing some of the tweaks and tools I’ve run across that will help anyone who works with SQL:

 

Indexes
Anyone who uses included columns is probably well aware of the frustrations that can come from having to look up information on which columns are included. I wrote a stored procedure, dba_indexLookup_sp, to help me with this, before discovering sp_helpindex2. If you haven’t heard of sp_helpindex2, it’s a re-write of sp_helpindex by Kimberly Tripp. You can find it on Kimberly’s blog. The main difference is Kimberly’s is a system stored procedure (mine is not) and my version returns partitioning information (Kimberly’s does not). Check both out and use whichever one meets your needs best.

 

KeyBoard ShortCuts

In SQL Server Management Studio (SSMS), click on:
    Tools –> Options… –> Environment –> Keyboard

Keyboard Shortcuts

Keyboard Shortcuts

For your copying convenience:

Ctrl+3   Select Top 100 * From
Ctrl+4   sp_tables @table_owner = ‘dbo’
Ctrl+5   sp_columns
Ctrl+6   sp_stored_procedures @sp_owner = ‘dbo’
Ctrl+7   sp_spaceused
Ctrl+8   sp_helptext
Ctrl+9   dba_indexLookup_sp or sp_helpindex2

Please note that these settings will not take effect until you open a new query window. Here’s an example of how you could use this: use Ctrl+4 to find a list of tables, then copy one into your query window; to view a sample of that table’s data, highlight the table name (I usually double-click on it) and press Ctrl+3. It’s a thing of beauty. Oh, and you may want to remove/change the schema filters if you use schemas other than dbo.

 

Query Execution Settings

After having one too many issues arise from non-DBA’s connecting to the production environment to run a devastating ad hoc, I’ve had all of our developers and analysts adopt the following settings. The only thing difference between my setting and theirs is that I have “Set Statistics IO” selected. FYI – you can also make these same setting changes in Visual Studio.

In SQL Server Management Studio (SSMS), click on:
    Tools –> Options… –> Query Execution –> SQL Server –> Advanced

Query Execution Settings

Query Execution Settings

 

Copy Behavior
This next tip actually has nothing to do with SQL Server, and can be done with any Microsoft product. However, I just learned about it a few weeks ago and already I use it quite frequently.

Holding down “Alt” while you drag your mouse will change your selection behavior to block selection.

Block Selection

Block Selection

 

Please note: The following tools requires SQL 2008 Management Studio. These tools will also work when you connect SQL 2008 SSMS to a 2005 instance.

 

Object Detail Explorer

Finally, there’s a reason to use the Object Detail Explorer! My favorite use is to quickly find the table size and row counts of all the tables in a database. If these options are not currently available, you may just need to right click on the column headers and add it to the display.

Object Detail Explorer

Object Detail Explorer

 

Missing Indexes

And lastly, when using SSMS 2008 to execute Display Estimated Query Plan (Ctrl+L), it will show you if you’re missing any indexes. This will even work if you connect SSMS 2008 to SQL 2005!

Missing Index

Missing Index

That pretty much covers it for now. HTH! :)

Michelle

Updated Index Defrag Script (2005, 2008)

Thanks to everyone who left a comment or sent me an e-mail regarding the Index Defrag Script. I’ve received some great feedback and requests for features. I’ve also had some questions regarding how to use it, which I will answer at the end of this post.

Changes include:
– separate version for both Enterprise and Standard editions
    - Standard edition removes partitioning and online options
– output option to see fragmentation levels
– page_count added to the log table

I’ve also verified that this script works well in SQL 2008.

Enterprise Version:

IF EXISTS(SELECT OBJECT_ID FROM sys.tables
                WHERE [name] = N'dba_indexDefragLog')
BEGIN
 
    DROP TABLE dbo.dba_indexDefragLog;
 
    PRINT 'dba_indexDefragLog table dropped!';
 
END
 
CREATE TABLE dbo.dba_indexDefragLog
(
      indexDefrag_id    INT IDENTITY(1,1)   NOT NULL
    , objectID          INT                 NOT NULL
    , objectName        NVARCHAR(130)       NOT NULL
    , indexID           INT                 NOT NULL
    , indexName         NVARCHAR(130)       NOT NULL
    , partitionNumber   smallint            not null
    , fragmentation     FLOAT               NOT NULL
    , page_count        INT                 NOT NULL
    , dateTimeStart     DATETIME            NOT NULL
    , durationSeconds   INT                 NOT NULL
    CONSTRAINT PK_indexDefragLog
        PRIMARY KEY CLUSTERED (indexDefrag_id)
);
 
PRINT 'dba_indexDefragLog Table Created';
 
IF OBJECTPROPERTY(OBJECT_ID('dbo.dba_indexDefrag_sp'),
        N'IsProcedure') IS Null
BEGIN
    EXECUTE ('Create Procedure dbo.dba_indexDefrag_sp
        As Print ''Hello World!''');
    RAISERROR('Procedure dba_indexDefrag_sp created.'
        , 10, 1);
END;
Go
 
SET ANSI_Nulls ON;
SET Ansi_Padding ON;
SET Ansi_Warnings ON;
SET ArithAbort ON;
SET Concat_Null_Yields_Null ON;
SET NOCOUNT ON;
SET Numeric_RoundAbort OFF;
SET Quoted_Identifier ON;
Go
 
ALTER PROCEDURE dbo.dba_indexDefrag_sp
 
    /* Declare Parameters */
      @minFragmentation     FLOAT           = 10.0
        /* in percent, will not defrag if fragmentation
           less than specified */
    , @rebuildThreshold     FLOAT           = 30.0
        /* in percent, greater than @rebuildThreshold
           will result in rebuild instead of reorg */
    , @onlineRebuild        bit             = 1     
        /* 1 = online rebuild; 0 = offline rebuild */
    , @executeSQL           BIT             = 1
        /* 1 = execute; 0 = print command only */
    , @tableName            VARCHAR(4000)   = Null 
        /* Option to specify a table name */
    , @printCommands        BIT             = 0
        /* 1 = print commands; 0 = do not print commands */
    , @printFragmentation   BIT             = 0
        /* 1 = print fragmentation prior to defrag; 
           0 = do not print */
    , @defragDelay          CHAR(8)         = '00:00:05'
        /* time to wait between defrag commands */
AS
/********************************************************************
    Name:       dba_indexDefrag_sp
 
    Author:     Michelle F. Ufford
 
    Purpose:    Defrags all indexes for the current database
 
    Notes:      This script was designed for SQL Server 2005
                Enterprise Edition.
 
    CAUTION: Monitor transaction log if executing for the first time!
 
      @minFragmentation     defaulted to 10%, will not defrag if
                            fragmentation if less than specified.
 
      @rebuildThreshold     defaulted to 30% as recommended by
                            Microsoft in BOL;
                            > than 30% will result in rebuild instead
 
      @onlineRebuild        1 = online rebuild; 
                            0 = offline rebuild
 
      @executeSQL           1 = execute the SQL generated by this proc;
                            0 = print command only
 
      @tableName            Specify if you only want to defrag indexes
                            for a specific table
 
      @printCommands        1 = print commands to screen;
                            0 = do not print commands
 
      @printFragmentation   1 = print fragmentation to screen;
                            0 = do not print fragmentation
 
      @defragDelay          time to wait between defrag commands;
                            gives the server some time to catch up
 
    Called by:  SQL Agent Job or DBA
 
    Date        Initials  Description
    ----------------------------------------------------------------
    2008-10-27  MFU       Initial Release
    2008-11-17  MFU       Added page_count to log table
                          , added @printFragmentation option
********************************************************************
    Exec dbo.dba_indexDefrag_sp
          @executeSQL         = 1
        , @printCommands      = 1
        , @minFragmentation   = 0
        , @printFragmentation = 1;
********************************************************************/
 
SET NOCOUNT ON;
SET XACT_Abort ON;
 
BEGIN
 
    /* Declare our variables */
    DECLARE   @objectID         INT
            , @indexID          INT
            , @partitionCount   BIGINT
            , @schemaName       NVARCHAR(130)
            , @objectName       NVARCHAR(130)
            , @indexName        NVARCHAR(130)
            , @partitionNumber  SMALLINT
            , @partitions       SMALLINT
            , @fragmentation    FLOAT
            , @pageCount        INT
            , @sqlCommand       NVARCHAR(4000)
            , @rebuildCommand   NVARCHAR(200)
            , @dateTimeStart    DATETIME
            , @dateTimeEnd      DATETIME
            , @containsLOB      BIT;
 
    /* Just a little validation... */
    IF @minFragmentation Not Between 0.00 And 100.0
        SET @minFragmentation = 10.0;
 
    IF @rebuildThreshold Not Between 0.00 And 100.0
        SET @rebuildThreshold = 30.0;
 
    IF @defragDelay Not Like '00:[0-5][0-9]:[0-5][0-9]'
        SET @defragDelay = '00:00:05';
 
    /* Determine which indexes to defrag using our
       user-defined parameters */
    SELECT
          OBJECT_ID AS objectID
        , index_id AS indexID
        , partition_number AS partitionNumber
        , avg_fragmentation_in_percent AS fragmentation
        , page_count 
        , 0 AS 'defragStatus'
            /* 0 = unprocessed, 1 = processed */
    INTO #indexDefragList
    FROM sys.dm_db_index_physical_stats
        (DB_ID(), OBJECT_ID(@tableName), NULL , NULL, N'Limited')
    WHERE avg_fragmentation_in_percent > @minFragmentation
        And index_id > 0
    OPTION (MaxDop 1);
 
    /* Create a clustered index to boost performance a little */
    CREATE CLUSTERED INDEX CIX_temp_indexDefragList
        ON #indexDefragList(objectID, indexID, partitionNumber);
 
    /* Begin our loop for defragging */
    WHILE (SELECT COUNT(*) FROM #indexDefragList
            WHERE defragStatus = 0) > 0
    BEGIN
 
        /* Grab the most fragmented index first to defrag */
        SELECT TOP 1
              @objectID         = objectID
            , @fragmentation    = fragmentation
            , @indexID          = indexID
            , @partitionNumber  = partitionNumber
            , @pageCount        = page_count
        FROM #indexDefragList
        WHERE defragStatus = 0
        ORDER BY fragmentation DESC;
 
        /* Look up index information */
        SELECT @objectName = QUOTENAME(o.name)
             , @schemaName = QUOTENAME(s.name)
        FROM sys.objects AS o
        Inner Join sys.schemas AS s
            ON s.schema_id = o.schema_id
        WHERE o.OBJECT_ID = @objectID;
 
        SELECT @indexName = QUOTENAME(name)
        FROM sys.indexes
        WHERE OBJECT_ID = @objectID
            And index_id = @indexID
            And type > 0;
 
        /* Determine if the index is partitioned */
        SELECT @partitionCount = COUNT(*)
        FROM sys.partitions
        WHERE OBJECT_ID = @objectID
            And index_id = @indexID;
 
        /* Look for LOBs */
        SELECT TOP 1
            @containsLOB = column_id
        FROM sys.columns WITH (NOLOCK)
        WHERE 
            [OBJECT_ID] = @objectID
            And (system_type_id In (34, 35, 99)
            -- 34 = image, 35 = text, 99 = ntext
                    Or max_length = -1);
            -- varbinary(max), varchar(max), nvarchar(max), xml
 
        /* See if we should rebuild or reorganize; handle thusly */
        IF @fragmentation < @rebuildThreshold And @partitionCount <= 1
            SET @sqlCommand = N'Alter Index ' + @indexName + N' On '
                + @schemaName + N'.' + @objectName + N' ReOrganize';
 
        IF @fragmentation >= @rebuildThreshold
            And IsNull(@containsLOB, 0) = 0
                -- Cannot rebuild if the table has one or more LOB
            And @partitionCount <= 1
        BEGIN
 
            /* We should always rebuild online if possible
                (SQL 2005 Enterprise) */
            IF @onlineRebuild = 0
                SET @rebuildCommand = N' Rebuild With
                    (Online = Off, MaxDop = 1)';
            ELSE
                SET @rebuildCommand = N' Rebuild With
                    (Online = On, MaxDop = 1)';
 
            SET @sqlCommand = N'Alter Index ' + @indexName + N' On '
                + @schemaName + N'.' + @objectName + @rebuildCommand;
        END;
 
        /* If our index is partitioned, we should always reorganize */
        IF @partitionCount > 1
            SET @sqlCommand = N'Alter Index ' + @indexName + N' On '
                + @schemaName + N'.' + @objectName + N' ReOrganize'
                + N' Partition = '
                + CAST(@partitionNumber AS NVARCHAR(10));
                -- no MaxDop needed, single threaded operation
 
        /* Are we executing the SQL?  If so, do it */
        IF @executeSQL = 1
        BEGIN
 
            /* Grab the time for logging purposes */
            SET @dateTimeStart  = GETDATE();
            EXECUTE (@sqlCommand);
            SET @dateTimeEnd  = GETDATE();
 
            /* Log our actions */
            INSERT INTO dbo.dba_indexDefragLog
            (
                  objectID
                , objectName
                , indexID
                , indexName
                , partitionNumber
                , fragmentation
                , page_count
                , dateTimeStart
                , durationSeconds
            )
            SELECT
                  @objectID
                , @objectName
                , @indexID
                , @indexName
                , @partitionNumber
                , @fragmentation
                , @pageCount
                , @dateTimeStart
                , DATEDIFF(SECOND, @dateTimeStart, @dateTimeEnd);
 
            /* Just a little breather for the server */
            WAITFOR Delay @defragDelay;
 
            /* Print if specified to do so */
            IF @printCommands = 1
                PRINT N'Executed: ' + @sqlCommand;
        END
        ELSE
        /* Looks like we're not executing, just print
            the commands */
        BEGIN
            IF @printCommands = 1
                PRINT @sqlCommand;
        END
 
        /* Update our index defrag list when we've
            finished with that index */
        UPDATE #indexDefragList
        SET defragStatus = 1
        WHERE objectID         = @objectID
          And indexID          = @indexID
          And partitionNumber  = @partitionNumber;
 
    END
 
    /* Do we want to output our fragmentation results? */
    If @printFragmentation = 1
        Select idl.objectID
            , o.name As 'tableName'
            , idl.indexID
            , i.name As 'indexName'
            , idl.fragmentation
            , idl.page_count
        From #indexDefragList As idl
        Join sys.objects AS o
            On idl.objectID = o.object_id
        Join sys.indexes As i
            On idl.objectID = i.object_id
            And idl.indexID = i.index_id;
 
    /* When everything is done, make sure to get rid of
        our temp table */
    DROP TABLE #indexDefragList;
 
    SET NOCOUNT OFF;
    RETURN 0
END
Go

 

Standard Version:

IF EXISTS(SELECT OBJECT_ID FROM sys.tables
                WHERE [name] = N'dba_indexDefragLog')
BEGIN
 
    DROP TABLE dbo.dba_indexDefragLog;
 
    PRINT 'dba_indexDefragLog table dropped!';
 
END
 
CREATE TABLE dbo.dba_indexDefragLog
(
      indexDefrag_id    INT IDENTITY(1,1)   NOT NULL
    , objectID          INT                 NOT NULL
    , objectName        NVARCHAR(130)       NOT NULL
    , indexID           INT                 NOT NULL
    , indexName         NVARCHAR(130)       NOT NULL
    , fragmentation     FLOAT               NOT NULL
    , page_count        INT                 NOT NULL
    , dateTimeStart     DATETIME            NOT NULL
    , durationSeconds   INT                 NOT NULL
    CONSTRAINT PK_indexDefragLog
        PRIMARY KEY CLUSTERED (indexDefrag_id)
);
 
PRINT 'dba_indexDefragLog Table Created';
 
IF OBJECTPROPERTY(OBJECT_ID('dbo.dba_indexDefragStandard_sp'),
        N'IsProcedure') IS Null
BEGIN
    EXECUTE ('Create Procedure dbo.dba_indexDefragStandard_sp
        As Print ''Hello World!''');
    RAISERROR('Procedure dba_indexDefragStandard_sp created.'
        , 10, 1);
END;
Go
 
SET ANSI_Nulls ON;
SET Ansi_Padding ON;
SET Ansi_Warnings ON;
SET ArithAbort ON;
SET Concat_Null_Yields_Null ON;
SET NOCOUNT ON;
SET Numeric_RoundAbort OFF;
SET Quoted_Identifier ON;
Go
 
ALTER PROCEDURE dbo.dba_indexDefragStandard_sp
 
    /* Declare Parameters */
      @minFragmentation     FLOAT           = 10.0
        /* in percent, will not defrag if fragmentation
           less than specified */
    , @rebuildThreshold     FLOAT           = 30.0
        /* in percent, greater than @rebuildThreshold
           will result in rebuild instead of reorg */
    , @executeSQL           BIT             = 1
        /* 1 = execute; 0 = print command only */
    , @tableName            VARCHAR(4000)   = Null 
        /* Option to specify a table name */
    , @printCommands        BIT             = 0
        /* 1 = print commands; 0 = do not print commands */
    , @printFragmentation   BIT             = 0
        /* 1 = print fragmentation prior to defrag; 
           0 = do not print */
    , @defragDelay          CHAR(8)         = '00:00:05'
        /* time to wait between defrag commands */
AS
/********************************************************************
    Name:       dba_indexDefragStandard_sp
 
    Author:     Michelle F. Ufford
 
    Purpose:    Defrags all indexes for the current database
 
    Notes:      This script was designed for SQL Server 2005
                Standard edition.
 
    CAUTION: Monitor transaction log if executing for the first time!
 
      @minFragmentation     defaulted to 10%, will not defrag if
                            fragmentation if less than specified.
 
      @rebuildThreshold     defaulted to 30% as recommended by
                            Microsoft in BOL;
                            > than 30% will result in rebuild instead
 
      @executeSQL           1 = execute the SQL generated by this proc;
                            0 = print command only
 
      @tableName            Specify if you only want to defrag indexes
                            for a specific table
 
      @printCommands        1 = print commands to screen;
                            0 = do not print commands
 
      @printFragmentation   1 = print fragmentation to screen;
                            0 = do not print fragmentation
 
      @defragDelay          time to wait between defrag commands;
                            gives the server some time to catch up
 
    Called by:  SQL Agent Job or DBA
 
    Date        Initials  Description
    ----------------------------------------------------------------
    2008-10-27  MFU       Initial Release
    2008-11-17  MFU       Added page_count to log table
                          , added @printFragmentation option
********************************************************************
    Exec dbo.dba_indexDefragStandard_sp
          @executeSQL         = 1
        , @printCommands      = 1
        , @minFragmentation   = 0
        , @printFragmentation = 1;
********************************************************************/
 
SET NOCOUNT ON;
SET XACT_Abort ON;
 
BEGIN
 
    /* Declare our variables */
    DECLARE   @objectID         INT
            , @indexID          INT
            , @schemaName       NVARCHAR(130)
            , @objectName       NVARCHAR(130)
            , @indexName        NVARCHAR(130)
            , @fragmentation    FLOAT
            , @pageCount        INT
            , @sqlCommand       NVARCHAR(4000)
            , @rebuildCommand   NVARCHAR(200)
            , @dateTimeStart    DATETIME
            , @dateTimeEnd      DATETIME
            , @containsLOB      BIT;
 
    /* Just a little validation... */
    IF @minFragmentation Not Between 0.00 And 100.0
        SET @minFragmentation = 10.0;
 
    IF @rebuildThreshold Not Between 0.00 And 100.0
        SET @rebuildThreshold = 30.0;
 
    IF @defragDelay Not Like '00:[0-5][0-9]:[0-5][0-9]'
        SET @defragDelay = '00:00:05';
 
    /* Determine which indexes to defrag using our
       user-defined parameters */
    SELECT
          OBJECT_ID AS objectID
        , index_id AS indexID
        , avg_fragmentation_in_percent AS fragmentation
        , page_count 
        , 0 AS 'defragStatus'
            /* 0 = unprocessed, 1 = processed */
    INTO #indexDefragList
    FROM sys.dm_db_index_physical_stats
        (DB_ID(), OBJECT_ID(@tableName), NULL , NULL, N'Limited')
    WHERE avg_fragmentation_in_percent > @minFragmentation
        And index_id > 0
    OPTION (MaxDop 1);
 
    /* Create a clustered index to boost performance a little */
    CREATE CLUSTERED INDEX CIX_temp_indexDefragList
        ON #indexDefragList(objectID, indexID);
 
    /* Begin our loop for defragging */
    WHILE (SELECT COUNT(*) FROM #indexDefragList
            WHERE defragStatus = 0) > 0
    BEGIN
 
        /* Grab the most fragmented index first to defrag */
        SELECT TOP 1
              @objectID         = objectID
            , @fragmentation    = fragmentation
            , @indexID          = indexID
            , @pageCount        = page_count
        FROM #indexDefragList
        WHERE defragStatus = 0
        ORDER BY fragmentation DESC;
 
        /* Look up index information */
        SELECT @objectName = QUOTENAME(o.name)
             , @schemaName = QUOTENAME(s.name)
        FROM sys.objects AS o
        Inner Join sys.schemas AS s
            ON s.schema_id = o.schema_id
        WHERE o.OBJECT_ID = @objectID;
 
        SELECT @indexName = QUOTENAME(name)
        FROM sys.indexes
        WHERE OBJECT_ID = @objectID
            And index_id = @indexID
            And type > 0;
 
        /* Look for LOBs */
        SELECT TOP 1
            @containsLOB = column_id
        FROM sys.columns WITH (NOLOCK)
        WHERE 
            [OBJECT_ID] = @objectID
            And (system_type_id In (34, 35, 99)
            -- 34 = image, 35 = text, 99 = ntext
                    Or max_length = -1);
            -- varbinary(max), varchar(max), nvarchar(max), xml
 
        /* See if we should rebuild or reorganize; handle thusly */
        IF @fragmentation < @rebuildThreshold 
            Or IsNull(@containsLOB, 0) > 0 
            -- Cannot rebuild if the table has one or more LOB
            SET @sqlCommand = N'Alter Index ' + @indexName + N' On '
                + @schemaName + N'.' + @objectName + N' ReOrganize;'
        ELSE
            SET @sqlCommand = N'Alter Index ' + @indexName + N' On '
                + @schemaName + N'.' + @objectName +  ' Rebuild '
                + 'With (MaxDop = 1)'; -- minimize impact on server
 
        /* Are we executing the SQL?  If so, do it */
        IF @executeSQL = 1
        BEGIN
 
            /* Grab the time for logging purposes */
            SET @dateTimeStart  = GETDATE();
            EXECUTE (@sqlCommand);
            SET @dateTimeEnd  = GETDATE();
 
            /* Log our actions */
            INSERT INTO dbo.dba_indexDefragLog
            (
                  objectID
                , objectName
                , indexID
                , indexName
                , fragmentation
                , page_count
                , dateTimeStart
                , durationSeconds
            )
            SELECT
                  @objectID
                , @objectName
                , @indexID
                , @indexName
                , @fragmentation
                , @pageCount
                , @dateTimeStart
                , DATEDIFF(SECOND, @dateTimeStart, @dateTimeEnd);
 
            /* Just a little breather for the server */
            WAITFOR Delay @defragDelay;
 
            /* Print if specified to do so */
            IF @printCommands = 1
                PRINT N'Executed: ' + @sqlCommand;
        END
        ELSE
        /* Looks like we're not executing, just print
            the commands */
        BEGIN
            IF @printCommands = 1
                PRINT @sqlCommand;
        END
 
        /* Update our index defrag list when we've
            finished with that index */
        UPDATE #indexDefragList
        SET defragStatus = 1
        WHERE objectID  = @objectID
          And indexID   = @indexID;
 
    END
 
    /* Do we want to output our fragmentation results? */
    IF @printFragmentation = 1
        SELECT idl.objectID
            , o.name As 'tableName'
            , idl.indexID
            , i.name As 'indexName'
            , idl.fragmentation
            , idl.page_count
        FROM #indexDefragList AS idl
        JOIN sys.objects AS o
            ON idl.objectID = o.object_id
        JOIN sys.indexes As i
            ON idl.objectID = i.object_id
            AND idl.indexID = i.index_id;
 
    /* When everything is done, make sure to get rid of
        our temp table */
    DROP TABLE #indexDefragList;
 
    SET NOCOUNT OFF;
    RETURN 0
END
Go

 

For those who are having troubles with this script…

1) “Not all of my indexes were defragged!” or “Nothing happened when I executed this script.”

This script will only defrag those indexes that surpass the specified threshold. If you’re not seeing your index in the output, try executing this:

    Exec dbo.dba_indexDefrag_sp
          @executeSQL    = 0
        , @printCommands = 1
        , @minFragmentation = 0
        , @printFragmentation = 1;

Check to see what your index’s fragmentation level is. Maybe it’s not as fragmented as you feared. :)

2) “My indexes are still fragmented after running this script.”

To quote The Powers That Be (aka Microsoft)…

“In general, fragmentation on small indexes is often not controllable. The pages of small indexes are stored on mixed extents. Mixed extents are shared by up to eight objects, so the fragmentation in a small index might not be reduced after reorganizing or rebuilding the index.” — Reorganizing and Rebuilding Indexes

3) “Can I use this in my production environment?”

That really depends on your environment. I’ve successfully used this in some very large production environments. However, I wouldn’t exactly recommend executing the script in the middle of a business day on a billion+ row, heavily fragmented, unpartitioned table, either. :)

If you’re not sure what the impact will be, execute the commands-only version of the script…

    Exec dbo.dba_indexDefrag_sp
	      @executeSQL    = 0
        , @printCommands = 1
        , @printFragmentation = 1;

… then execute the statements one at a time. Make sure you monitor tempdb and the transaction log to ensure you don’t have any space issues.

If you have any additional questions or suggestions for this script, leave me a comment and I’ll be happy to help. :)

Index Interrogation Script

As a continuation of the last post on indexes, here’s a script I wrote for index interrogation.  This stored procedure was the result of repeatedly having to lookup information on included columns.

If ObjectProperty(Object_ID('dbo.dba_indexLookup_sp'), 
        N'IsProcedure') Is Null
Begin
    Execute ('Create Procedure dbo.dba_indexLookup_sp 
        As Print ''Hello World!''')
    RaisError('Procedure dbo.dba_indexLookup_sp created.', 10, 1);
End;
Go
 
Set ANSI_Nulls On;
Set Ansi_Padding On;
Set Ansi_Warnings On;
Set ArithAbort On;
Set Concat_Null_Yields_Null On;
Set NoCount On;
Set Numeric_RoundAbort Off;
Set Quoted_Identifier On;
Go
 
Alter Procedure dbo.dba_indexLookup_sp
 
        /* Declare Parameters */
        @tableName  varchar(128)  =  Null
As
/***********************************************************************
    Name:       dba_indexLookup_sp
 
    Author:     Michelle F. Ufford
 
    Purpose:    Retrieves index information for the specified table.
 
    Notes:      If the tableName is left null, it will return index 
                information for all tables and indexes.
 
    Called by:  DBA
 
    Date        User   Description
    --------------------------------------------------------------------
    2008-10-28  MFU    Initial Release
************************************************************************
    Exec dbo.dba_indexLookup_sp
        @tableName = 'myTableName';
***********************************************************************/
 
Set NoCount On;
Set XACT_Abort On;
 
Begin
 
    Declare @objectID int;
 
    If @tableName Is Not Null
        Set @objectID = Object_ID(@tableName);
 
    With indexCTE(partition_scheme_name
                , partition_function_name
                , data_space_id)
    As (
        Select sps.name
            , spf.name
            , sps.data_space_id
        From sys.partition_schemes As sps
        Join sys.partition_functions As spf
            On sps.function_id = spf.function_id
    )
 
    Select st.name As 'table_name'
        , IsNull(ix.name, '') As 'index_name'
        , ix.object_id
        , ix.index_id
		, Cast(
            Case When ix.index_id = 1 
                    Then 'clustered' 
                When ix.index_id =0
                    Then 'heap'
                Else 'nonclustered' End
			+ Case When ix.ignore_dup_key <> 0 
                Then ', ignore duplicate keys' 
                    Else '' End
			+ Case When ix.is_unique <> 0 
                Then ', unique' 
                    Else '' End
			+ Case When ix.is_primary_key <> 0 
                Then ', primary key' Else '' End As varchar(210)
            ) As 'index_description'
        , IsNull(Replace( Replace( Replace(
            (   
                Select c.name As 'columnName'
                From sys.index_columns As sic
                Join sys.columns As c 
                    On c.column_id = sic.column_id 
                    And c.object_id = sic.object_id
                Where sic.object_id = ix.object_id
                    And sic.index_id = ix.index_id
                    And is_included_column = 0
                Order By sic.index_column_id
                For XML Raw)
                , '"/><row columnName="', ', ')
                , '<row columnName="', '')
                , '"/>', ''), '')
            As 'indexed_columns'
        , IsNull(Replace( Replace( Replace(
            (   
                Select c.name As 'columnName'
                From sys.index_columns As sic
                Join sys.columns As c 
                    On c.column_id = sic.column_id 
                    And c.object_id = sic.object_id
                Where sic.object_id = ix.object_id
                    And sic.index_id = ix.index_id
                    And is_included_column = 1
                Order By sic.index_column_id
                For XML Raw)
                , '"/><row columnName="', ', ')
                , '<row columnName="', '')
                , '"/>', ''), '')
            As 'included_columns'
        , IsNull(cte.partition_scheme_name, '') 
            As 'partition_scheme_name'
        , Count(partition_number) As 'partition_count'
        , Sum(rows) As 'row_count'
    From sys.indexes As ix
    Join sys.partitions As sp
        On ix.object_id = sp.object_id
        And ix.index_id = sp.index_id
    Join sys.tables As st
        On ix.object_id = st.object_id
    Left Join indexCTE As cte
        On ix.data_space_id = cte.data_space_id
    Where ix.object_id = IsNull(@objectID, ix.object_id)
    Group By st.name
        , IsNull(ix.name, '')
        , ix.object_id
        , ix.index_id
		, Cast(
            Case When ix.index_id = 1 
                    Then 'clustered' 
                When ix.index_id =0
                    Then 'heap'
                Else 'nonclustered' End
			+ Case When ix.ignore_dup_key <> 0 
                Then ', ignore duplicate keys' 
                    Else '' End
			+ Case When ix.is_unique <> 0 
                Then ', unique' 
                    Else '' End
			+ Case When ix.is_primary_key <> 0 
                Then ', primary key' Else '' End As varchar(210)
            )
        , IsNull(cte.partition_scheme_name, '')
        , IsNull(cte.partition_function_name, '')
    Order By table_name
        , index_id;
 
    Set NoCount Off;
    Return 0;
End
Go

Index Defrag Script

UPDATE: This script has been significantly updated. Please view the latest version at http://sqlfool.com/2011/06/index-defrag-script-v4-1/.

Previously I posted that it’s important to keep your indexes in shape with weekly or (preferably) nightly defrags. Below is a script I use to execute nightly defrags in SQL 2005 Enterprise. I can’t claim complete credit… I believe this is a script I originally found on Microsoft and heavily modified to meet my needs.

You may want to modify the script if you’re doing much with XML or LOB’s. Also keep in mind that this is hitting the sys.dm_db_index_physical_stats table to view fragmentation information, which can be resource intensive.

If you’re not familiar with index defragmentation, check out “Alter Index” on Books Online.

If Not Exists(Select object_id From sys.tables 
                Where [name] = N'dba_indexDefragLog')
Begin
 
    Create Table dbo.dba_indexDefragLog
    (
          indexDefrag_id    int identity(1,1)   not null
        , objectID          int                 not null
        , objectName        nvarchar(130)       not null
        , indexID           int                 not null
        , indexName         nvarchar(130)       not null
        , partitionNumber   smallint            not null
        , fragmentation     float               not null
        , dateTimeStart     datetime            not null
        , durationSeconds   int                 not null
        Constraint PK_indexDefragLog 
            Primary Key Clustered (indexDefrag_id)
    );
 
    Print 'dba_indexDefragLog Table Created';
End
 
If ObjectProperty(Object_ID('dbo.dba_indexDefrag_sp'), 
        N'IsProcedure') Is Null
Begin
    Execute ('Create Procedure dbo.dba_indexDefrag_sp As 
                Print ''Hello World!''')
    RaisError('Procedure dba_indexDefrag_sp created.', 10, 1);
End;
Go
 
Set ANSI_Nulls On;
Set Ansi_Padding On;
Set Ansi_Warnings On;
Set ArithAbort On;
Set Concat_Null_Yields_Null On;
Set NoCount On;
Set Numeric_RoundAbort Off;
Set Quoted_Identifier On;
Go
 
Alter Procedure dbo.dba_indexDefrag_sp
 
    /* Declare Parameters */
      @minFragmentation     float           = 10.0  
        /* in percent, will not defrag if fragmentation 
           less than specified */
    , @rebuildThreshold     float           = 30.0  
        /* in percent, greater than @rebuildThreshold 
           will result in rebuild instead of reorg */
    , @onlineRebuild        bit             = 1     
        /* 1 = online rebuild; 0 = offline rebuild */
    , @executeSQL           bit             = 1     
        /* 1 = execute; 0 = print command only */
    , @tableName            varchar(4000)   = Null  
        /* Option to specify a table name */
    , @printCommands        bit             = 0     
        /* 1 = print commands; 0 = do not print commands */
    , @defragDelay          char(8)         = '00:00:05'
        /* time to wait between defrag commands */
As
/***********************************************************************
    Name:       dba_indexDefrag_sp
 
    Author:     Michelle F. Ufford
 
    Purpose:    Defrags all indexes for the current database
 
    Notes:
 
    CAUTION: Monitor transaction log when executing for the first time!
 
      @minFragmentation     defaulted to 10%, will not defrag if 
                            fragmentation if less than specified 
 
      @rebuildThreshold     defaulted to 30% as recommended by 
                            Microsoft in BOL;
                            > than 30% will result in rebuild instead
 
      @onlineRebuild        1 = online rebuild; 
                            0 = offline rebuild
 
      @executeSQL           1 = execute the SQL generated by this proc; 
                            0 = print command only
 
      @tableName            Specify if you only want to defrag indexes 
                            for a specific table
 
      @printCommands        1 = print commands to screen; 
                            0 = do not print commands
 
      @defragDelay          time to wait between defrag commands; 
                            gives the server a little time to catch up 
 
    Called by:  SQL Agent Job or DBA
 
    Date        Initials	Description
    -------------------------------------------------------------------
    2008-10-27  MFU         Initial Release
***********************************************************************
    Exec dbo.dba_indexDefrag_sp
	      @executeSQL    = 0
        , @printCommands = 1;
***********************************************************************/
 
Set NoCount On;
Set XACT_Abort On;
 
Begin
 
    /* Declare our variables */
    Declare   @objectID         int
            , @indexID          int
            , @partitionCount   bigint
            , @schemaName       nvarchar(130)
            , @objectName       nvarchar(130)
            , @indexName        nvarchar(130)
            , @partitionNumber  smallint
            , @partitions       smallint
            , @fragmentation    float
            , @sqlCommand       nvarchar(4000)
            , @rebuildCommand   nvarchar(200)
            , @dateTimeStart    datetime
            , @dateTimeEnd      datetime
            , @containsLOB      bit;
 
    /* Just a little validation... */
    If @minFragmentation Not Between 0.00 And 100.0
        Set @minFragmentation = 10.0;
 
    If @rebuildThreshold Not Between 0.00 And 100.0
        Set @rebuildThreshold = 30.0;
 
    If @defragDelay Not Like '00:[0-5][0-9]:[0-5][0-9]'
        Set @defragDelay = '00:00:05';
 
    /* Determine which indexes to defrag using our 
       user-defined parameters */
    Select
          Object_ID AS objectID
        , index_id AS indexID
        , partition_number AS partitionNumber
        , avg_fragmentation_in_percent AS fragmentation
        , 0 As 'defragStatus' 
            /* 0 = unprocessed, 1 = processed */
    Into #indexDefragList
    From sys.dm_db_index_physical_stats 
        (DB_ID(), Object_Id(@tableName), NULL , NULL, N'Limited')
    Where avg_fragmentation_in_percent > @minFragmentation 
        And index_id > 0
    Option (MaxDop 1);
 
    /* Create a clustered index to boost performance a little */
    Create Clustered Index CIX_temp_indexDefragList
        On #indexDefragList(objectID, indexID, partitionNumber);
 
    /* Begin our loop for defragging */
    While (Select Count(*) From #indexDefragList 
            Where defragStatus = 0) > 0
    Begin
 
        /* Grab the most fragmented index first to defrag */
        Select Top 1 
              @objectID         = objectID
            , @fragmentation    = fragmentation
            , @indexID          = indexID
            , @partitionNumber  = partitionNumber
        From #indexDefragList
        Where defragStatus = 0
        Order By fragmentation Desc;
 
        /* Look up index information */
        Select @objectName = QuoteName(o.name)
             , @schemaName = QuoteName(s.name)
        From sys.objects As o
        Inner Join sys.schemas As s 
            On s.schema_id = o.schema_id
        Where o.object_id = @objectID;
 
        Select @indexName = QuoteName(name)
        From sys.indexes 
        Where object_id = @objectID
            And index_id = @indexID
            And type > 0;
 
        /* Determine if the index is partitioned */
        Select @partitionCount = Count(*)
        From sys.partitions
        Where object_id = @objectID 
            And index_id = @indexID;
 
        /* Look for LOBs */
        Select Top 1 
            @containsLOB = column_id
        From sys.columns With (NoLock) 
        Where  
            [object_id] = @objectID
            And (system_type_id In (34, 35, 99) 
            -- 34 = image, 35 = text, 99 = ntext
                    Or max_length = -1); 
            -- varbinary(max), varchar(max), nvarchar(max), xml
 
        /* See if we should rebuild or reorganize; handle thusly */
        If @fragmentation < @rebuildThreshold And @partitionCount <= 1
            Set @sqlCommand = N'Alter Index ' + @indexName + N' On ' 
                + @schemaName + N'.' + @objectName + N' ReOrganize';
 
        If @fragmentation >= @rebuildThreshold 
            And IsNull(@containsLOB, 0) = 0 
                -- Cannot rebuild if the table has one or more LOB
            And @partitionCount <= 1
        Begin
 
            /* We should always rebuild online if possible 
                (SQL 2005 Enterprise) */
            If @onlineRebuild = 0 
                Set @rebuildCommand = N' Rebuild With 
                    (Online = Off, MaxDop = 1)';
            Else
                Set @rebuildCommand = N' Rebuild With 
                    (Online = On, MaxDop = 1)';
 
            Set @sqlCommand = N'Alter Index ' + @indexName + N' On ' 
                + @schemaName + N'.' + @objectName + @rebuildCommand;
        End;
 
        /* If our index is partitioned, we should always reorganize */
        If @partitionCount > 1
            Set @sqlCommand = N'Alter Index ' + @indexName + N' On ' 
                + @schemaName + N'.' + @objectName + N' ReOrganize'
                + N' Partition = ' 
                + Cast(@partitionNumber As nvarchar(10)); 
                -- no MaxDop needed, single threaded operation
 
        /* Are we executing the SQL?  If so, do it */
        If @executeSQL = 1
        Begin
 
            /* Grab the time for logging purposes */
            Set @dateTimeStart  = GetDate();
            Execute (@sqlCommand);
            Set @dateTimeEnd  = GetDate();
 
            /* Log our actions */
            Insert Into dbo.dba_indexDefragLog
            (
                  objectID
                , objectName
                , indexID
                , indexName
                , partitionNumber
                , fragmentation
                , dateTimeStart
                , durationSeconds
            )
            Select
                  @objectID
                , @objectName
                , @indexID
                , @indexName
                , @partitionNumber
                , @fragmentation
                , @dateTimeStart
                , DateDiff(second, @dateTimeStart, @dateTimeEnd);
 
            /* Just a little breather for the server */
            WaitFor Delay @defragDelay;
 
            /* Print if specified to do so */
            If @printCommands = 1
                Print N'Executed: ' + @sqlCommand;
        End
        Else
        /* Looks like we're not executing, just print 
            the commands */
        Begin
            If @printCommands = 1
                Print @sqlCommand;
        End
 
        /* Update our index defrag list when we've
            finished with that index */
        Update #indexDefragList
        Set defragStatus = 1
        Where objectID         = @objectID
          And indexID          = @indexID
          And partitionNumber  = @partitionNumber;
 
    End
 
    /* When everything is done, make sure to get rid of 
        our temp table */
    Drop Table #indexDefragList;
 
    Set NoCount Off;
    Return 0
End
Go
 
Set Quoted_Identifier Off 
Set ANSI_Nulls On
Go

20081117 UPDATE: Check out my Updated Index Defrag Script for SQL 2005 and 2008.

Tips for Large Data Stores

I’m currently working on projects that involve the collection of massive amounts of data (i.e. terabyte-class, billion-row tables). One of the challenges of collecting so much data is the ability to report on the data quickly.  To this end, I plan to spend the next several blog posts discussing tips for designing and managing large relational data stores.  These are primarily design practices that have performed well for me within the various environments in which I’ve worked.

Partitioning

• Partitioning is the segregation of a single logical table into separate, identical physical structures. Basically, every table is stored in at least one partition; “partitioned tables” are merely comprised of many partitions.

• Implemented correctly, partitioning can have dramatic improvement on read performance, index maintenance cost reduction, storage of large fact tables on separate disks, and data archiving.

• If you’re not familiar with partitioning, check out Kimberly Tripp’s excellent white paper, Partitioned Tables and Indexes in SQL Server 2005.

• I cannot rave enough about the impact partitioning has had within our environment. To take one of the most extreme examples of improvement, the execution time of one BI report decreased from 2.5+ hours to 20 minutes after we implemented partitioning. While not all stored procedures experienced such phenomenal improvement, improvement in the range of 30% – 60% was very common.

• While partitioning does have many benefits, there are some negatives. Namely, more expensive writes (I’ve noticed around 10%), increased duration of queries that span many partitions (i.e. queries on long time spans, such as a year), and increased maintenance needs.

• My rule of thumb is to partition any table with growth rates > 10mm records per week or with regular archiving needs.

• Much more to come on partitioning in the near future. :)

Indexing

• Any experienced DBA can tell you that effective index management is critical.

• Ensure your stored procedures are using index seeks instead of scans. If scans cannot be avoided, consider using partitioning with aligned indexes to limit the amount of pages scanned.

• Periodically look for un-used indexes and remove them to improve write performance.

• Look for missing indexes, create one or two, and evaluate. Try composite indexes and included columns as a way to limit the amount of indexes you need to create.

• Defrag your indexes! Nightly, if you can get away with it.

Stored Procedure Tuning

• When querying on large amounts of data, try breaking your queries up into individual components and storing in temporary tables, then performing the joins.

• When querying on large date ranges, try looping through the days and inserting the data into a temp table. This can be especially beneficial when querying partitioned tables.

• Create indexes on your temporary tables!

• Utilize index seeks whenever possible.

• If permitted to use dirty reads and your environment’s isolation level is not defaulted to read uncommitted, use With (NoLock) hints.

• Temporary tables can be partitioned! This is usually not necessary but can sometimes be beneficial.

• Try replacing table variables with temporary tables. My rule of thumb is this: if I’m performing joins on the temporary data or if I have more than 100 records, use a temporary table. Table variables can decrease stored procedure recompiles but can sometimes have a negative impact on performance. When in doubt, test each method and evaluate which works better for your needs.

• *Always* include the partitioning key when querying partitioned tables.