Indexing for Partitioned Tables
So you’ve partitioned your table, and now you’re ready to performance tune. As with any table, indexing is a great place to start. And if you’re like most people new to partitioning, you probably created all of your indexes on the partitioned scheme, either by design or unintentionally.
Let’s take a step back here and discuss what a partitioned index is. A partitioned index, like a partitioned table, separates data into different physical structures (partitions) under one logical name. Specifically, each partition in a nonclustered index contains its own B-tree structure with a subset of rows, based upon the partitioning scheme. By default, an unpartitioned nonclustered index has just one partition.
Keep in mind, when you create an index on a partitioned table, i.e.
CREATE NONCLUSTERED INDEX IX_myIndex ON dbo.myTable(myColumn);
… you are creating the index on the partitioned scheme by default. In order to create a NON-partitioned index on that same table, you would need to explicitly declare “On [FileGroup]“, i.e.
CREATE NONCLUSTERED INDEX IX_myIndex ON dbo.myTable(myColumn) ON [PRIMARY];
But should you partition your index? That depends on how you use it. In fact, most environments will probably want to use a mix of partitioned and non-partitioned indexes. I’ve found that that partitioned indexes perform better when aggregating data or scanning partitions. Conversely, you’ll probably find that, if you need to locate a single, specific record, nothing performs better than a non-partitioned index on that column.
Let’s walk through some examples and see how they perform. I’ll bring back my trusty ol’ orders table for this.
/* Create a partition function. */ CREATE Partition FUNCTION [test_monthlyDateRange_pf] (DATETIME) AS Range RIGHT FOR VALUES ('2009-01-01', '2009-01-08', '2009-01-15' , '2009-01-22', '2009-01-29'); Go /* Associate the partition function with a partition scheme. */ CREATE Partition Scheme test_monthlyDateRange_ps AS Partition test_monthlyDateRange_pf All TO ([PRIMARY]); Go /* Create a partitioned table. */ CREATE TABLE dbo.orders ( order_id INT IDENTITY(1,1) Not Null , orderDate DATETIME Not Null , orderData SMALLDATETIME Not Null CONSTRAINT PK_orders PRIMARY KEY CLUSTERED ( order_id , orderDate ) ) ON test_monthlyDateRange_ps(orderDate); Go /* Create some records to play with. */ SET NOCOUNT ON; DECLARE @endDate DATETIME = '2009-01-01'; WHILE @endDate < '2009-02-01' BEGIN INSERT INTO dbo.orders SELECT @endDate, @endDate; SET @endDate = DATEADD(MINUTE, 1, @endDate); END; SET NOCOUNT OFF; /* Let’s create an aligned, partitioned index. */ CREATE NONCLUSTERED INDEX IX_orders_aligned ON dbo.orders(order_id) ON test_monthlyDateRange_ps(orderDate); /* you don't actually need to declare the last line of this unless you want to create the index on a different partitioning scheme. */ /* Now let’s create an unpartitioned index. */ CREATE NONCLUSTERED INDEX IX_orders_unpartitioned ON dbo.orders(order_id) ON [PRIMARY];
Now that we have both a partitioned and an unpartitioned index, let’s take a look at our sys.partitions table:
/* Let's take a look at our index partitions */ SELECT i.name , i.index_id , p.partition_number , p.ROWS FROM sys.partitions AS p Join sys.indexes AS i ON p.OBJECT_ID = i.OBJECT_ID And p.index_id = i.index_id WHERE p.OBJECT_ID = OBJECT_ID('orders') ORDER BY i.index_id, p.partition_number;
As expected, both of our partitioned indexes, PK_orders and IX_orders_aligned, have 6 partitions, with a subset of rows on each partition. Our unpartitioned non-clustered index, IX_orders_unpartitioned, on the other hand has just 1 partition containing all of the rows.
Now that we have our environment set up, let’s run through some different queries and see the performance impact of each type of index.
/* Query 1, specific record look-up, covered */ SELECT order_id, orderDate FROM dbo.orders WITH (INDEX(IX_orders_aligned)) WHERE order_id = 25000; SELECT order_id, orderDate FROM dbo.orders WITH (INDEX(IX_orders_unpartitioned)) WHERE order_id = 25000;
The unpartitioned index performs significantly better when given a specific record to look-up. Now let’s try the same query, but utilizing a scan instead of a seek:
/* Query 2, specific record look-up, uncovered */ SELECT order_id, orderDate, orderData FROM dbo.orders WITH (INDEX(IX_orders_aligned)) WHERE order_id = 30000; SELECT order_id, orderDate, orderData FROM dbo.orders WITH (INDEX(IX_orders_unpartitioned)) WHERE order_id = 30000;
Again we see that the non-partitioned index performs better with the single-record look-up. This can lead to some pretty dramatic performance implications. So when *would* we want to use a partitioned index? Two instances immediately pop to mind. First, partition switching can only be performed when all indexes on a table are aligned. Secondly, partitioned indexes perform better when manipulating large data sets. To see this in action, let’s try some simple aggregation…
/* Query 3, aggregation */ SELECT CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME) AS 'order_date' , COUNT(*) FROM dbo.orders WITH (INDEX(IX_orders_aligned)) WHERE orderDate Between '2009-01-01' And '2009-01-07 23:59' GROUP BY CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME) ORDER BY CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME); SELECT CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME) AS 'order_date' , COUNT(*) FROM dbo.orders WITH (INDEX(IX_orders_unpartitioned)) WHERE orderDate Between '2009-01-01' And '2009-01-07 23:59' GROUP BY CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME) ORDER BY CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME);
As you can see, partitioned indexes perform better when aggregating data. This is just a simple example, but the results can be even more dramatic in a large production environment. This is one of the reasons why partitioned tables and indexes are especially beneficial in data warehouses.
So now you have a general idea of what a partitioned index is and when to use a partitioned index vs a non-partitioned index. Ultimately, your indexing needs will depend largely on the application and how the data is used. When in doubt, test, test, test! So to recap…
- Specify “On [FileGroup]“ to create an unpartitioned index on a partitioned table
- Consider using non-partitioned indexes for single-record look-ups
- Use partitioned indexes for multiple records and data aggregations
- To enable partition switching, all indexes on the table must be aligned.
For more information on partitioning, check out my other partitioning articles:
Partitioning Example
Partitioning 101
Tips for Large Data Stores
Comments
Tell me what you're thinking...
and oh, if you've enjoyed the post, please consider subscribing to my blog. ![]()









