Estimating Rows per Page

Ever wonder how many rows you store per page? Me too. So here’s the query I use to investigate this:

Select object_name(i.object_id) As 'tableName'
    , i.name As 'indexName'
    , i.type_desc
    , Max(p.partition_number) As 'partitions'
    , Sum(p.rows) As 'rows'
    , Sum(au.data_pages) As 'dataPages'
    , Sum(p.rows) / Sum(au.data_pages) As 'rowsPerPage'
From sys.indexes As i
Join sys.partitions As p
    On i.object_id = p.object_id
    And i.index_id = p.index_id
Join sys.allocation_units As au
    On p.hobt_id = au.container_id
Where object_name(i.object_id) Not Like 'sys%'
    And au.type_desc = 'IN_ROW_DATA'
Group By object_name(i.object_id)
    , i.name
    , i.type_desc
Having Sum(au.data_pages) > 100
Order By rowsPerPage;

What does this tell you? Well, the more rows you can fit on a page, the less IO you need to consume to retrieve those rows. It’s also a good way to improve your buffer cache hit ratio (i.e. retrieve data from memory instead of disk, which is more efficient). So take a good look at those first few rows… do you have a small number of [rowsPerPage] but a large number of [rows]? If so, it may be time to look at re-designing your tables.

Happy Coding!

Michelle Ufford (aka SQLFool)

Source: http://sqlfool.com/2009/02/estimating-rows-per-page/

0saves
If you enjoyed this post, please consider leaving a comment or subscribing to the RSS feed to have future articles delivered to your feed reader.
Tagged , , . Bookmark the permalink.

3 Responses to Estimating Rows per Page

  1. If you’re joining on sys.allocation_units and a partition has lob data, then the rows are being counted twice, but only data pages for in_row_data is being counted. This can throw off the rowsPerPages.

    To fix it you could add the filter
    au.type_desc = ‘IN_ROW_DATA’

    or better, use a dmv:
    SELECT OBJECT_NAME(i.OBJECT_ID) AS ‘tableName’
    , i.name AS ‘indexName’
    , i.type_desc
    , MAX(p.partition_number) AS ‘partitions’
    , SUM(p.row_count) AS ‘rows’
    , SUM(p.in_row_data_page_count) AS ‘dataPages’
    , SUM(p.row_count) / SUM(p.in_row_data_page_count) AS ‘rowsPerPage’
    FROM sys.indexes AS i
    JOIN sys.dm_db_partition_stats AS p
    ON i.OBJECT_ID = p.OBJECT_ID
    And i.index_id = p.index_id
    WHERE OBJECT_NAME(i.OBJECT_ID) Not Like ‘sys%’
    GROUP BY OBJECT_NAME(i.OBJECT_ID)
    , i.name
    , i.type_desc
    HAVING SUM(p.in_row_data_page_count) > 100
    ORDER BY rowsPerPage;

    And for *further* understanding of the structure of a database, I like to use SQL Internals Viewer by Danny Gould: http://www.sqlinternalsviewer.com It tells you where everything is down to the very last bit.

  2. Thanks, Michael! I appreciate the comments. I’ve updated my script to include type_desc = ‘IN_ROW_DATA’, and I just downloaded SQL Internals Viewer. It looks very interesting. :)

  3. Relanium says:

    The results make no sense.
    When I multply the dataPages by 8192, I don’t arrive at the physical table size. Why?

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>