Effective Clustered Indexing

My first Simple-Talk article was published yesterday! I’m pretty excited about it and wanted to share the link. In the article, I give an overview of how clustered and nonclustered indexes work, and I demonstrate why clustered index best practices — narrow, unique, static, and ever-increasing — are important design considerations.

You can find the article on Simple-Talk’s website at:
http://www.simple-talk.com/sql/learn-sql-server/effective-clustered-indexes/

Please let me know your thoughts! :)

Index Interrogation for SQL Server 2008

I had previously posted an index interrogation script for SQL Server 2005. I’ve updated that script for 2008; namely, it includes filtered index definitions. For anyone interested:

Declare @objectID int = Object_ID('Sales.SalesOrderHeader');
 
With indexCTE(partition_scheme_name
            , partition_function_name
            , data_space_id)
As (
    Select sps.name
        , spf.name
        , sps.data_space_id
    From sys.partition_schemes As sps
    Join sys.partition_functions As spf
        On sps.function_id = spf.function_id
)
 
Select st.name As 'table_name'
    , IsNull(ix.name, '') As 'index_name'
    , ix.object_id
    , ix.index_id
	, Cast(
        Case When ix.index_id = 1 
                Then 'clustered' 
            When ix.index_id =0
                Then 'heap'
            Else 'nonclustered' End
		+ Case When ix.ignore_dup_key <> 0 
            Then ', ignore duplicate keys' 
                Else '' End
		+ Case When ix.is_unique <> 0 
            Then ', unique' 
                Else '' End
		+ Case When ix.is_primary_key <> 0 
            Then ', primary key' Else '' End As varchar(210)
        ) As 'index_description'
    , IsNull(Replace( Replace( Replace(
        (   
            Select c.name As 'columnName'
            From sys.index_columns As sic
            Join sys.columns As c 
                On c.column_id = sic.column_id 
                And c.object_id = sic.object_id
            Where sic.object_id = ix.object_id
                And sic.index_id = ix.index_id
                And is_included_column = 0
            Order By sic.index_column_id
            For XML Raw)
            , '"/><row columnName="', ', ')
            , '<row columnName="', '')
            , '"/>', ''), '')
        As 'indexed_columns'
    , IsNull(Replace( Replace( Replace(
        (   
            Select c.name As 'columnName'
            From sys.index_columns As sic
            Join sys.columns As c 
                On c.column_id = sic.column_id 
                And c.object_id = sic.object_id
            Where sic.object_id = ix.object_id
                And sic.index_id = ix.index_id
                And is_included_column = 1
            Order By sic.index_column_id
            For XML Raw)
            , '"/><row columnName="', ', ')
            , '<row columnName="', '')
            , '"/>', ''), '')
        As 'included_columns'
    , ix.filter_definition
    , IsNull(cte.partition_scheme_name, '') As 'partition_scheme_name'
    , Count(partition_number) As 'partition_count'
    , Sum(rows) As 'row_count'
From sys.indexes As ix
Join sys.partitions As sp
    On ix.object_id = sp.object_id
    And ix.index_id = sp.index_id
Join sys.tables As st
    On ix.object_id = st.object_id
Left Join indexCTE As cte
    On ix.data_space_id = cte.data_space_id
Where ix.object_id = IsNull(@objectID, ix.object_id)
Group By st.name
    , IsNull(ix.name, '')
    , ix.object_id
    , ix.index_id
	, Cast(
        Case When ix.index_id = 1 
                Then 'clustered' 
            When ix.index_id =0
                Then 'heap'
            Else 'nonclustered' End
		+ Case When ix.ignore_dup_key <> 0 
            Then ', ignore duplicate keys' 
                Else '' End
		+ Case When ix.is_unique <> 0 
            Then ', unique' 
                Else '' End
		+ Case When ix.is_primary_key <> 0 
            Then ', primary key' Else '' End As varchar(210)
        )
    , ix.filter_definition
    , IsNull(cte.partition_scheme_name, '')
    , IsNull(cte.partition_function_name, '')
Order By table_name
    , index_id;

You may need to create some indexes to see this in AdventureWorks:

Create NonClustered Index IX_Sales_SalesOrderHeader_filtered_2005
    On Sales.SalesOrderHeader(AccountNumber)
    Include (CustomerID, SalesPersonID)
    Where OrderDate >= '2005-01-01'
        And OrderDate < '2006-01-01';
table_name           index_name                               object_id   index_id    index_description                   indexed_columns      included_columns               filter_definition                                            partition_scheme_name partition_count row_count
-------------------- ---------------------------------------- ----------- ----------- ----------------------------------- -------------------- ------------------------------ ------------------------------------------------------------ --------------------- --------------- --------------------
SalesOrderHeader     PK_SalesOrderHeader_SalesOrderID         1010102639  1           clustered, unique, primary key      SalesOrderID                                        NULL                                                                               1               31465
SalesOrderHeader     AK_SalesOrderHeader_rowguid              1010102639  2           nonclustered, unique                rowguid                                             NULL                                                                               1               31465
SalesOrderHeader     AK_SalesOrderHeader_SalesOrderNumber     1010102639  3           nonclustered, unique                SalesOrderNumber                                    NULL                                                                               1               31465
SalesOrderHeader     IX_SalesOrderHeader_CustomerID           1010102639  5           nonclustered                        CustomerID                                          NULL                                                                               1               31465
SalesOrderHeader     IX_SalesOrderHeader_SalesPersonID        1010102639  6           nonclustered                        SalesPersonID                                       NULL                                                                               1               31465
SalesOrderHeader     IX_Sales_SalesOrderHeader_filtered_2005  1010102639  13          nonclustered                        AccountNumber        CustomerID, SalesPersonID      ([OrderDate]>='2005-01-01' AND [OrderDate]<'2006-01-01')                           1               1379

Undocumented Function in SQL 2008

If you’ve been following my blog for a little while, you’ll know that I’m a fan of SQL Server internals. There’s a lot that can be learned or better understood by rolling up your sleeves and getting into the nitty-gritty of data pages (i.e. see my post on Overhead in Non-Unique Clustered Indexes). So imagine how happy I was when my co-worker Jeff shared an undocumented function with me today that retrieves the file number, page number, and slot number of a single record. Very cool! Well, at least to me. So now let’s see how you can use it.

The fn_physLocCracker function can be called in the following way:

Select Top 100 plc.*, soh.SalesOrderID
From Sales.SalesOrderHeader As soh
Cross Apply sys.fn_physLocCracker (%%physloc%%) As plc;

Results (just a sample):

file_id     page_id     slot_id     SalesOrderID
----------- ----------- ----------- ------------
1           14032       0           43659
1           14032       1           43660
1           14032       2           43661
1           14032       3           43662
1           14032       4           43663

If you look at the sp_helptext for sys.fn_physLocCracker, %%physloc%% is apparently a virtual column that contains information on where the record is stored. In fact, you can even append %%physloc%% to your column list if you want to see how the information is stored. But for our purposes, we now have a file number, page number, and slot number. What do we do with it?

Well, you can use the investigation proc I wrote to retrieve the actual data page:

Execute dba_viewPageData_sp 
      @databaseName = 'AdventureWorks'
    , @fileNumber = 1
    , @pageNumber = 14032;

Results (just a sample):

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4
 
SalesOrderID = 43659                 
 
Slot 0 Column 2 Offset 0x8 Length 1 Length (physical) 1
 
RevisionNumber = 1                   
 
Slot 0 Column 3 Offset 0x9 Length 8 Length (physical) 8
 
OrderDate = 2001-07-01 00:00:00.000  
 
Slot 0 Column 4 Offset 0x11 Length 8 Length (physical) 8
 
DueDate = 2001-07-13 00:00:00.000

Neat, huh? So why would you use it to look up the data page and file number when you can just pass the table name and index name to my proc and retrieve data pages? Well, my investigation proc will retrieve data pages for any index type — the fn_physLocCracker function will only retrieve data for the clustered index — but it will not retrieve the data page for a specific record. So just something to be aware of.

That’s all for now. Back to the #24HoursOfPASS! :)

Overhead in Non-Unique Clustered Indexes

I’ve received a couple of questions regarding my article, Performance Considerations of Data Types, and the overhead associated with non-unique clustered indexes. I started to respond via e-mail, but my response was so long I decided to turn it into a blog post instead. ;)

I should start by clarifying that non-unique clustered indexes do not necessarily consume more space and overhead; it depends on the data stored. If you have duplicate clustered key values, the first instance of the value will be handled as though it were unique. Any subsequent values, however, will incur overhead to manage the uniquifier that SQL Server adds to maintain row uniqueness. This same overhead is also incurred in non-clustered indexes, too, adding to the overall expense of this approach.

I think it helps to actually look at the data, so let’s walk through a few different common scenarios. We’ll create a table with a unique clustered index, a table with a non-unique clustered index but no duplicates, and a table with duplicate key values.

Also, a little warning that I started to write this in SQL Server 2008, and since I’m on a 2008 kick, I decided to leave it that way. You can modify this pretty easily to work in 2005, if necessary.

Use sandbox;
Go
 
/* Unique, clustered index, no duplicate values */
Create Table dbo.uniqueClustered
(
      myDate    date        Not Null
    , myNumber  int         Not Null
    , myColumn  char(995)   Not Null
);
 
Create Unique Clustered Index CIX_uniqueClustered
    On dbo.uniqueClustered(myDate);
 
 
/* Non-unique clustered index, but no duplicate values */
Create Table dbo.nonUniqueNoDups
(
      myDate    date        Not Null
    , myNumber  int         Not Null
    , myColumn  char(995)   Not Null
);
 
Create Clustered Index CIX_nonUniqueNoDups
    On dbo.nonUniqueNoDups(myDate);
 
 
/* Non-unique clustered index, duplicate values */
Create Table dbo.nonUniqueDuplicates
(
      myDate    date        Not Null
    , myNumber  int         Not Null
    , myColumn  char(995)   Not Null
);
 
Create Clustered Index CIX_nonUniqueDuplicates
    On dbo.nonUniqueDuplicates(myDate);

I’m going to use the date data type in 2008 for my clustered index key. To ensure uniqueness for the first two tables, I’ll iterate through a few years’ worth of dates. This is typical of what you may see in a data mart, where you’d have one record with an aggregation of each day’s data. For the table with duplicate values, I’m going to insert the same date for each row.

/* Populate some test data */
Set NoCount On;
Declare @myDate date = '1990-01-01'
    , @myNumber int = 1;
 
While @myDate < '2010-01-01'
Begin
 
    Insert Into dbo.uniqueClustered
    Select @myDate, @myNumber, 'data';
 
    Insert Into dbo.nonUniqueNoDups
    Select @myDate, @myNumber, 'data';
 
    Insert Into dbo.nonUniqueDuplicates
    Select '2009-01-01', @myNumber, 'data';
 
    Select @myDate = DateAdd(day, 1, @myDate)
        , @myNumber += 1;
 
End;

After running the above script, each table should have 7,305 records. This is obviously pretty small for a table, but it’ll serve our purposes. Now let’s take a look at the size of our tables:

/* Look at the details of our indexes */
 
/* Unique, clustered index, no duplicate values */
Select 'unique' As 'type', page_count, avg_page_space_used_in_percent, record_count
    , min_record_size_in_bytes, max_record_size_in_bytes
From sys.dm_db_index_physical_stats(DB_ID(), Object_ID(N'uniqueClustered'), Null, Null, N'Detailed') 
Where index_level = 0
Union All
/* Non-unique clustered index, but no duplicate values */
Select 'non-unique, no dups', page_count, avg_page_space_used_in_percent, record_count
    , min_record_size_in_bytes, max_record_size_in_bytes
From sys.dm_db_index_physical_stats(DB_ID(), Object_ID(N'nonUniqueNoDups'), Null, Null, N'Detailed') 
Where index_level = 0
Union All
/* Non-unique clustered index, duplicate values */
Select 'duplicates', page_count, avg_page_space_used_in_percent, record_count
    , min_record_size_in_bytes, max_record_size_in_bytes
From sys.dm_db_index_physical_stats(DB_ID(), Object_ID(N'nonUniqueDuplicates'), Null, Null, N'Detailed') 
Where index_level = 0;

Here’s the results:

type                page_count           avg_page_space_used_in_percent record_count         min_record_size_in_bytes max_record_size_in_bytes
------------------- -------------------- ------------------------------ -------------------- ------------------------ ------------------------
unique              914                  99.8055102545095               7305                 1009                     1009
non-unique, no dups 914                  99.8055102545095               7305                 1009                     1009
duplicates          1044                 88.066036570299                7305                 1009                     1017

I want to point out a couple of things. First, there is no difference in the number of pages between the non-unique clustered index with no duplicates ([nonUniqueNoDups]) and the unique clustered index ([uniqueClustered]). The table with duplicate clustered key values, however, requires 14% more pages to store the same amount of data. Secondly, the [max_record_size_in_bytes] of the [nonUniqueDuplicates] table is 8 bytes more than that of the other two. We’ll discuss why in a minute.

Now let’s take a look at the actual data pages. For this, I’m going to use my page internals proc.

EXECUTE dbo.dba_viewPageData_sp
      @databaseName = 'sandbox'
    , @tableName    = 'sandbox.dbo.uniqueClustered'
    , @indexName    = 'CIX_uniqueClustered';

I’m not going to post the entire results here, but I want to draw your attention to “m_slotCnt = 8″, which is near the top of the page. That means 8 records are stored on this page. Also, when you look near the end of the first record (Slot 0), you should see the following results:

Slot 0 Offset 0x60 Length 1009
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP     Record Size = 1009
Memory Dump @0x00A9C060
00000000:   1000ee03 c3150b01 00000064 61746120 †..î.Ã......data 
[...]
000003F0:   00†††††††††††††††††††††††††††††††††††.  
 
Slot 0 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 1990-01-01                  
 
Slot 0 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 1                         
 
Slot 0 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data

Now let’s look at the table that has a non-unique clustered index but no duplicates:

Execute dbo.dba_viewPageData_sp
      @databaseName = 'sandbox'
    , @tableName    = 'sandbox.dbo.nonUniqueNoDups'
    , @indexName    = 'CIX_nonUniqueNoDups';

The m_slotCnt count is also 8 for this page. This time, let’s glance at the first and second records (Slot 0 and 1 respectively):

Slot 0 Offset 0x60 Length 1009
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP     Record Size = 1009
Memory Dump @0x62FDC060
00000000:   1000ee03 c3150b01 00000064 61746120 †..î.Ã......data    
[...]
000003F0:   00†††††††††††††††††††††††††††††††††††.                        
 
Slot 0 Column 0 Offset 0x0 Length 4 Length (physical) 0
UNIQUIFIER = 0                       
 
Slot 0 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 1990-01-01                  
 
Slot 0 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 1                         
 
Slot 0 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data
 
 
Slot 1 Offset 0x451 Length 1009
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP     Record Size = 1009
Memory Dump @0x62FDC451
 
00000000:   1000ee03 c4150b02 00000064 61746120 †..î.Ä......data          
[...]
000003F0:   00†††††††††††††††††††††††††††††††††††.                        
 
Slot 1 Column 0 Offset 0x0 Length 4 Length (physical) 0
UNIQUIFIER = 0                       
 
Slot 1 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 1990-01-02                  
 
Slot 1 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 2                         
 
Slot 1 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data

We now see a new addition to the row, “UNIQUIFIER = 0.” This is SQL Server’s way of managing row uniqueness internally. You’ll notice that, because the clustered key values are unique, the UNIQUIFIER is set to 0 and the row size is still 1009; for all intents and purposes, the UNIQUIFIER is not consuming any space.

Update: The DBCC God himself, Paul Randal, explained that non-dupes actually have a NULL UNIQUIFIER, which DBCC PAGE displays as a 0. Thanks for explaining, Paul! I wondered about that but chalked it up to SQL voodoo.

Now let’s look at our final case, a non-unique clustered index with duplicate key values:

Execute dbo.dba_viewPageData_sp
      @databaseName = 'sandbox'
    , @tableName    = 'sandbox.dbo.nonUniqueDuplicates'
    , @indexName    = 'CIX_nonUniqueDuplicates';

Here’s where things get interesting. The m_slotCnt value is now 7, which means we’re now storing 1 record less per page. Let’s look at the details:

Slot 0 Offset 0x60 Length 1009
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP     Record Size = 1009
Memory Dump @0x00A9C060
00000000:   1000ee03 df300b01 00000064 61746120 †..î.ß0.....data  
[...]
000003F0:   00†††††††††††††††††††††††††††††††††††.                        
 
Slot 0 Column 0 Offset 0x0 Length 4 Length (physical) 0
UNIQUIFIER = 0                       
 
Slot 0 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 2009-01-01                  
 
Slot 0 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 1                         
 
Slot 0 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data
 
 
Slot 1 Offset 0x451 Length 1017
 
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS
Record Size = 1017                   
Memory Dump @0x00A9C451
00000000:   3000ee03 df300b02 00000064 61746120 †0.î.ß0.....data  
[...]
000003F0:   000100f9 03010000 00†††††††††††††††††...ù.....                
 
Slot 1 Column 0 Offset 0x3f5 Length 4 Length (physical) 4
UNIQUIFIER = 1                       
 
Slot 1 Column 1 Offset 0x4 Length 3 Length (physical) 3
myDate = 2009-01-01                  
 
Slot 1 Column 2 Offset 0x7 Length 4 Length (physical) 4
myNumber = 2                         
 
Slot 1 Column 3 Offset 0xb Length 995 Length (physical) 995
myColumn = data

The first record, Slot 0, looks exactly the same as in the previous table; the UNIQUIFIER is 0 and the row size is 1009. The second record (Slot 1), however, now has a UNIQUIFIER value of 1 and the row size is 1017. If you notice, the “Record Attributes” of Slot 1 are also different, with the addition of “VARIABLE_COLUMNS.” This is because the UNIQUIFIER is stored as a variable column. The extra 8 bytes of overhead break down to 4 bytes to store the UNIQUIFIER, 2 bytes to store the variable column offset, and 2 bytes to store the variable count. The tables we created used all fixed-length columns; you may notice some minor overhead differences if your table already contains variable columns.

To summarize, there is indeed a difference in the page structure between a unique clustered index and a non-unique clustered index; however, there’s only a possible performance and space impact when storing duplicate clustered key values. So there you go, more detail than you ever wanted to know about clustered indexes and uniqueness!

Page Internals – Investigation Proc

As many of you know, I like to crawl around in page internals in my free time. It can be very enlightening, or just a good check to make sure that what you think is happening, is actually happening. To help with this process, I’ve created myself a little stored procedure that I can simply pass a few parameters to and have it return the page data for me. So for those who don’t have anything better to do who are as interested in page internals as I am, here’s my proc:

Create Procedure dbo.dba_viewPageData_sp
 
        /* Declare Parameters */
          @databaseName varchar(128)
        , @tableName    varchar(128)    = Null -- database.schema.tableName
        , @indexName    varchar(128)    = Null
        , @fileNumber   int             = Null
        , @pageNumber   int             = Null
        , @printOption  int             = 3    -- 0, 1, 2, or 3
        , @pageType     char(4)         = 'Leaf' -- Leaf, Root, or IAM
 
As
/*********************************************************************************
    Name:       dba_viewPageData_sp
 
    Author:     Michelle Ufford
 
    Purpose:    Retrieves page data for the specified table/page.
 
    Notes:      Can pass either the table name or the pageID, but must pass one, or
                you'll end up with no results. 
                If the table name is passed, it will return the first page.
 
        @tableName must be '<databaseName>.<schemaName>.<tableName>' in order to
            function correctly for cross-database joins.  
 
        @printOption can be one of following values:
            0 - print just the page header
            1 - page header plus per-row hex dumps and a dump of the page slot array
            2 - page header plus whole page hex dump
            3 - page header plus detailed per-row interpretation
 
        Page Options borrowed from: 
        https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/10/625659.aspx
 
        @pageType must be one of the following values:
            Leaf - returns the first page of the leaf level of your index or heap
            Root - returns the root page of your index
            IAM - returns the index allocation map chain for your index or heap
 
        Conversions borrowed from:
        http://sqlskills.com/blogs/paul/post/Inside-The-Storage-Engine-
        sp_AllocationMetadata-putting-undocumented-system-catalog-views-to-work.aspx
 
    Called by:  DBA
 
    Date        User    Description
    ----------------------------------------------------------------------------
    2009-05-06  MFU     Initial release for public consumption
*********************************************************************************
    Exec dbo.dba_viewPageData_sp
          @databaseName = 'AdventureWorks'
        , @tableName    = 'AdventureWorks.Sales.SalesOrderDetail'
        , @indexName    = 'IX_SalesOrderDetail_ProductID'
        --, @fileNumber   = 1
        --, @pageNumber   = 38208
        , @printOption  = 3
        , @pageType     = 'Root';
*********************************************************************************/
 
Set NoCount On;
Set XACT_Abort On;
Set Ansi_Padding On;
Set Ansi_Warnings On;
Set ArithAbort On;
Set Concat_Null_Yields_Null On;
Set Numeric_RoundAbort Off;
 
Begin
 
    Declare @fileID         int
        , @pageID           int
        , @sqlStatement     nvarchar(1200)
        , @sqlParameters    nvarchar(255)
        , @errorMessage     varchar(100);
 
    Begin Try
 
        If @fileNumber Is Null And @pageNumber Is Null And @tableName Is Null
        Begin
            Set @errorMessage = 'You must provide either a file/page number, or a table name!';
            RaisError(@errorMessage, 16, 1);
        End;
 
        If @pageType Not In ('Leaf', 'Root', 'IAM')
        Begin
            Set @errorMessage = 'You have entered an invalid page type; valid options are "Leaf", "Root", or "IAM"';
            RaisError(@errorMessage, 16, 1);
        End;
 
        If @fileNumber Is Null Or @pageNumber Is Null
        Begin
 
            Set @sqlStatement = 
            Case When @pageType = 'Leaf' Then
                'Select Top 1 @p_fileID = Convert (varchar(6), Convert (int, 
                    SubString (au.first_page, 6, 1) +
                    SubString (au.first_page, 5, 1)))
                , @p_pageID = Convert (varchar(20), Convert (int, 
                     SubString (au.first_page, 4, 1) +
                     SubString (au.first_page, 3, 1) +
                     SubString (au.first_page, 2, 1) +
                     SubString (au.first_page, 1, 1)))'
            When @pageType = 'Root' Then
                'Select Top 1 @p_fileID = Convert (varchar(6), Convert (int, 
                    SubString (au.root_page, 6, 1) +
                    SubString (au.root_page, 5, 1)))
                , @p_pageID = Convert (varchar(20), Convert (int, 
                     SubString (au.root_page, 4, 1) +
                     SubString (au.root_page, 3, 1) +
                     SubString (au.root_page, 2, 1) +
                     SubString (au.root_page, 1, 1)))'
            When @pageType = 'IAM' Then
                'Select Top 1 @p_fileID = Convert (varchar(6), Convert (int, 
                    SubString (au.first_iam_page, 6, 1) +
                    SubString (au.first_iam_page, 5, 1)))
                , @p_pageID = Convert (varchar(20), Convert (int, 
                     SubString (au.first_iam_page, 4, 1) +
                     SubString (au.first_iam_page, 3, 1) +
                     SubString (au.first_iam_page, 2, 1) +
                     SubString (au.first_iam_page, 1, 1)))'
            End + 
            'From ' + QuoteName(ParseName(@databaseName, 1)) + '.sys.indexes AS i
            Join ' + QuoteName(ParseName(@databaseName, 1)) + '.sys.partitions AS p
                On i.[object_id] = p.[object_id]
                And i.index_id = p.index_id
            Join ' + QuoteName(ParseName(@databaseName, 1)) + '.sys.system_internals_allocation_units AS au
                On p.hobt_id = au.container_id
            Where p.[object_id] = Object_ID(@p_tableName)
                And au.first_page > 0x000000000000 ' 
                + Case When @indexName Is Null 
                    Then ';' 
                    Else 'And i.name = @p_indexName;' End;
 
            Set @sqlParameters = '@p_tableName varchar(128)
                                , @p_indexName varchar(128)
                                , @p_fileID int OUTPUT
                                , @p_pageID int OUTPUT';
 
            Execute sp_executeSQL @sqlStatement
                        , @sqlParameters
                        , @p_tableName = @tableName
                        , @p_indexName = @indexName
                        , @p_fileID = @fileID OUTPUT
                        , @p_pageID = @pageID OUTPUT;
 
            End
            Else
            Begin
                Select @fileID = @fileNumber
                    , @pageID = @pageNumber;
            End;
 
        DBCC TraceOn (3604);
        DBCC Page (@databaseName, @fileID, @pageID, @printOption);
        DBCC TraceOff (3604);
 
    End Try
    Begin Catch
 
        Print @errorMessage;
 
    End Catch;
 
    Set NoCount Off;
    Return 0;
End
Go

This proc does have cross-database support, i.e. you can install it in your DBA database and use it to investigate data in other databases. Here’s an example…

    Exec dbo.dba_viewPageData_sp
          @databaseName = 'AdventureWorks'
        , @tableName    = 'AdventureWorks.Sales.SalesOrderDetail'
        , @indexName    = 'IX_SalesOrderDetail_ProductID';

… will return a nice data page:

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
 
PAGE: (1:11000)
 
BUFFER:
 
BUF @0x0391F140
 
bpage = 0x0C0C0000                   bhash = 0x00000000                   bpageno = (1:11000)
bdbid = 7                            breferences = 0                      bUse1 = 35177
bstat = 0x1c00009                    blog = 0x21212159                    bnext = 0x00000000

[waits for the "oohs" and "aahs" to subside...]

I also give you the option to specify a specific page, in case you want to follow the page trail (i.e. m_nextPage). I’m not really providing support for partitions, although I do have a little dirty piece of code to return a hobt with data if possible (i.e. “first_page > 0×000000000000″).

Update: Special thanks to Jeremiah Peschka and Adam Machanic for showing me
QUOTENAME(PARSENAME(@databaseName, 1))! :)

Also, thanks to Paul Randal for his excellent blog posts on this very topic! Check out his blog post on DBCC Page. The conversion code was borrowed from his sp_AllocationMetadata proc.

Update 2: I’ve made some quick changes based on some feedback from Twitter. Thanks for the suggestions and hopefully you enjoy the updates.

Happy crawling!

Page Splitting & Rollbacks

So while I was at the grocery store last night, my mind wandered to SQL Server. This time, I was pondering what happens to a page split if the transaction is rolled back. I didn’t know the answer, but my guess was that the page split remained, since it would be less expensive for SQL Server to leave the data where it was sitting. Also, in theory, if the page split occurred once, it could need to occur again, so why undergo that same expense twice?

I decided to run a simple test to see what would happen. First, I created a really wide table and tossed 4 rows into it:

Create Table myTable
(
      id            int Primary Key
    , wideColumn    char(2000) 
);
 
Insert Into myTable
Select 1, Replicate('A', 2000) Union All
Select 2, Replicate('B', 2000) Union All
Select 4, Replicate('C', 2000) Union All
Select 5, Replicate('D', 2000);

I left an open spot for id=3, so I can force a page split. Next, I looked at the page data using the script I posted in February.

Here’s what I saw:

Before

Before

Pay attention to the 2 items boxed in red. m_slotCnt tells us how many records are on the page, and m_nextPage tells us the address of the next page. As you may have guessed, a m_nextPage value of 0:0 means you’re looking at the last page allocated to the object.

Now let’s insert a record, roll it back, and see what happens:

Begin Transaction;
 
Insert Into myTable
Select 3, Replicate('E', 2000);
 
Rollback Transaction;

I ran my DBCC Page command again and here’s what I saw:

After

After

As you can see, m_slotCnt is now 2, and m_nextPage is no longer 0:0 (although your actual page number will probably be different than mine). If I pull up the new page, I find my 2 relocated records, id’s 4 and 5.

So what’s this all mean? In short, page splits are NOT reversed when a transaction is rolled back. Why should you care? Well, you probably wouldn’t care much, unless you roll back a lot of transactions. But this is also a good thing to keep in mind if you have to abort a really large insert or update; if you don’t plan to re-execute the aborted script, you may want to defrag your indexes to fix the splits.

Source: http://sqlfool.com/2009/04/page-splitting-rollbacks/