Index Defrag Script Updates – Beta Testers Needed

Update: Wow! I’ve received a ton of responses to my request for beta testers. Thank you all! The SQL Community is really amazing. I’ll hopefully have the new version online in just a few days. :)

Over the last few months, I’ve received many great comments and suggestions regarding my Index Defrag Script v3.0. I’ve just recently had time to implement most of these suggestions, plus some other things that I thought would be useful. :)

Here’s some of what you can look forward to shortly:

  • Probably the single most requested feature, the new version of the script allows you to set a time limit for index defrags.
  • There’s now a static table for managing the status of index defrags. This way, when your time limit is reached, you can pick up where you left off the next day, without the need to rescan indexes.
  • There’s now an option to prioritize defrags by range scan counts, fragmentation level, or page counts.
  • For those using partitioning, there is now an option to exclude the right-most populated partition from defrags (in theory, the one you’re writing to in a sliding-window scenario).
  • Options such as page count limits and SORT_IN_TEMPDB are now parameterized.
  • I’ve enhanced error logging.
  • … and more!

Right now, I’m looking for a few folks who are willing to beta test the script. If you’re interested, please send me an e-mail at michelle at sqlfool dot com with the editions of SQL Server you can test this on (i.e. 2005 Standard, 2008 Enterprise, etc.).

Thank you! :)

Replication Bug with Partitioned Tables

Recently, we came across a bug in SQL Server 2005 on one of our production servers. Apparently, if you execute an ALTER TABLE statement on a replicated table with more than 128 partitions, the log reader will fail. A relatively obscure bug, I know. Microsoft has recognized this as a confirmed bug, but I couldn’t find it anywhere on the intertubes, thus the inspiration for this blog post. Microsoft’s official solution for this issue is to upgrade to SQL Server 2008.

For various reasons, we were unable to execute an upgrade at the time. And since this was a 2 terabyte database, we wanted to come up with a solution that wouldn’t involve reinitializing the entire publication. Our quick-fix while we were troubleshooting the issue was to create a linked server to the production box. Not ideal, I know, but it worked in a pinch and minimized exposure of the issue. Fortunately for us, we were able to solve the problem on the publication database pretty easily. All of the affected partition functions had empty partitions created several months in the future, so we simply merged any empty partition ranges for future dates. Our solution to our now-out-of-date subscribers was to apply static row filtering to any table with more than 100 million records. While this would introduce some overhead with the replication of these tables, it would allow us a much faster recovery time. We decided to use the start of the most recent hour as our filtering criteria, just to give us a “clean” filter, so we had to delete data from any table where we were going to apply the filter. After that, it was simply a matter of resuming replication.

All things considered, it took us a little over a day to recover from the issue. Most of that time was spent troubleshooting the problem and identifying a workable solution; actual execution of the changes was pretty quick. Moral of the story? Upgrade to SQL Server 2008. :)

#PASSAwesomeness

Allen Kinsel on Twitter (@sqlinsaneo) recently started a new Twitter tag, #PASSAwesomeness, about all of the cool things about PASS Summit. I really like the tag, so I’m going to blatantly steal borrow it for this post. :)

First, and long overdue, I want to give a brief recap of the East Iowa SQL Saturday. On October 17th, our local PASS chapter, 380PASS, sponsored our first ever SQL Saturday at the University of Iowa in Iowa City. By all accounts, the event was a great success! We had 90 attendees, 11 speakers, and 21 sessions. We received numerous compliments on the quality of the speakers, the niceness of the facilities, and the abundance of food. Not too shabby for our first time hosting the event, if I do say so myself. :)

I’d like to thank all of our wonderful speakers, especially those who traveled from out of town and out of state, for making this event such a success. I’d also like to thank our amazing volunteers for helping put this all together. Lastly, but certainly not least, I’d like to thank our generous sponsors, without whom this event would not be possible. Because this event went so smoothly and was so well received in the community, we’ve already started planning our next big SQL event! In the meantime, don’t forget to check out our monthly 380PASS meetings to tide you over.

I’d also like to take a moment to discuss the PASS Summit. Unless you’re a DBA who’s been living under a rock, you’ve probably heard of the PASS Summit. If you *have* been living under a rock — and hey, I’m not poking fun, I used to live under a rock, too! — then what you need to know is that the Summit is the largest SQL Server conference in the world. It’s a gathering of Microsoft developers and SQL Server gurus; the rest of us show up to try to absorb as much from them as possible. Since I’ve recently moved to the Business Intelligence team, I’m extremely excited to delve into the amazing amount of BI content offered.

I’m also deeply honored to be presenting at the Summit this year on some of the performance tuning techniques I’ve used with great success in my production environments. The session is titled, Super Bowl, Super Load – A Look At Performance Tuning for VLDB’s. If you’re interested in performance tuning or VLDB (very large database) topics, consider stopping by to catch my session. From what I can tell, I’ll be presenting on Tuesday from 10:15am – 11:30am in room(s?) 602-604.

If you read my blog, or if we’ve ever interacted in any way on the internet — Twitter, LinkedIn, e-mails, blog comments, etc. — please stop by and say “hi”! Aside from all of the awesome SQL Server content, I’m really looking forward to meeting as many new folks as possible.

And on that note…

Getting to meet all of the amazing SQL Server professionals out there who have inspired and encouraged me in so many ways #PASSAwesomeness

Partitioning Tricks

For those of you who are using partitioning, or who are considering using partitioning, allow me to share some tips with you.

Easy Partition Staging Tables

Switching partitions (or more specifically, hobts) in and out of a partitioned table requires the use of a staging table. The staging table has very specific requirements: it must be completely identical to the partitioned table, including indexing structures, and it must have a check constraint that limits data to the partitioning range. Thanks to my co-worker Jeff, I’ve recently started using the SQL Server Partition Management tool on CodePlex. I haven’t used the automatic partition switching feature — frankly, using any sort of data modification tool in a production environment makes me nervous — but I’ve been using the scripting option to create staging tables in my development environment, which I then copy to production for use. It’s nothing you can’t do yourself, but it does make the whole process easy and painless, plus it saves you from annoying typos. But be careful when using this tool to just create the table and check constraints automatically, because you may need to…

Add Check Constraints After Loading Data

Most of the time, I add the check constraint when I create the staging table, then I load data and perform the partition switch. However, for some reason, I was receiving the following error:

.Net SqlClient Data Provider: Msg 4972, Level 16, State 1, Line 1
ALTER TABLE SWITCH statement failed. Check constraints or partition function of source table ‘myStagingTable’ allows values that are not allowed by check constraints or partition function on target table ‘myDestinationTable’.

This drove me crazy. I confirmed my check constraints were correct, that I had the correct partition number, and that all schema and indexes matched identically. After about 30 minutes of this, I decided to drop and recreate the constraint. For some reason, it fixed the issue. Repeat tests produced the same results: the check constraint needed to be added *after* data was loaded. This error is occurring on a SQL Server 2008 SP1 box; to be honest, I’m not sure what’s causing the error, so if you know, please leave me a comment. But I figured I’d share so that anyone else running into this issue can hopefully save some time and headache. :)

Replicating Into Partitioned and Non-Partitioned Tables

Recently, we needed to replicate a non-partitioned table to two different destinations. We wanted to use partitioning for Server A, which has 2008 Enterprise; Server B, which is on 2005 Standard, could not take advantage of partitioning. The solution was really easy: create a pre-snapshot and post-snapshot script for the publication, then modify to handle each server group differently. Using pseudo-code, it looked something like this:

/* Identify which servers get the partitioned version */
If @@ServerName In ('yourServerNameList') 
Begin
 
    /* Create your partitioning scheme if necessary */
    If Not Exists(Select * From sys.partition_schemes Where name = 'InsertPartitionScheme')
        CREATE PARTITION SCHEME InsertPartitionScheme 
            AS PARTITION InsertPartitionFunction ALL TO ([PRIMARY]);    
 
    /* Create your partitioning function if necessary */
    If Not Exists(Select * From sys.partition_functions Where name = 'InsertPartitionFunction')
        CREATE PARTITION FUNCTION InsertPartitionFunction (smalldatetime) 
            AS RANGE RIGHT FOR VALUES ('insertValues');    
 
    /* Create a partitioned version of your table */
    CREATE TABLE [dbo].[yourTableName] (
	    [yourTableSchema]
    ) ON InsertPartitionScheme([partitioningKey]);
 
End
Else
Begin
 
    /* Create a non-partitioned version of your table */
    CREATE TABLE [dbo].[yourTableName] (
	    [yourTableSchema]
    ) ON [Primary];
 
End

You could also use an edition check instead of a server name check, if you prefer. The post-snapshot script basically looked the same, except you create partitioned indexes instead.

Compress Old Partitions

Did you know you can set different compression levels for individual partitions? It’s true! I’ve just completed doing this on our largest partitioned table. Here’s how:

/* Apply compression to your partitioned table */
Alter Table dbo.yourTableName
Rebuild Partition = All
With 
(
      Data_Compression = Page On Partitions(1 to 9)
    , Data_Compression = Row  On Partitions(10 to 11) 
    , Data_Compression = None On Partitions(12)
);
 
/* Apply compression to your partitioned index */
Alter Index YourPartitionedIndex
    On dbo.yourTableName
    Rebuild Partition = All
    With 
    (
      Data_Compression = Page On Partitions(1 to 9)
    , Data_Compression = Row  On Partitions(10 to 11) 
    , Data_Compression = None On Partitions(12)
    );
 
/* Apply compression to your unpartitioned index */
Alter Index YourUnpartitionedIndex
    On dbo.yourTableName
    Rebuild With (Data_Compression = Row);

A couple of things to note. In all of our proof-of-concept testing, we found that compression significantly reduced query execution time, reads (IO), and storage. However, CPU was also increased significantly. The results were more dramatic, both good and bad, with page compression versus row compression. Still, for our older partitions, which aren’t queried regularly, it made sense to turn on page compression. The newer partitions receive row compression, and the newest partitions, which are still queried very regularly by routine processes, were left completely uncompressed. This seems to strike a nice balance in our environment, but of course, results will vary depending on how you use your data.

Something to be aware of is that compressing your clustered index does *not* compress your non-clustered indexes; those are separate operations. Lastly, for those who are curious, it took us about 1 minute to apply row compression and about 7 minutes to apply page compression to partitions averaging 30 million rows.

Looking for more information on table partitioning? Check out my overview of partitioning, my example code, and my article on indexing on partitioned tables.

Why I’m Blogging Less

I’ve received a few questions asking why I’ve been blogging less frequently, and even one inquiry after my health. Rest assured, I’m completely fine. But there are 2 perfectly good reasons why I’ve been blogging less these days.

East Iowa SQL Saturday:

I’m the event organizer for East Iowa SQL Saturday, which is eating up a lot of my free time. If you haven’t yet heard about our SQL Saturday event, let me give you a brief overview. It’s a FREE, one-day training event geared toward SQL Server professionals and anyone who wants to learn more about SQL Server. We have 22 sessions planned covering a variety of topics, from Business Intelligence to Disaster Recovery to SQL Server 2008 topics. And if you’re a .NET developer, we also have some .NET-related presentations, including PowerShell and MVC.

We’re very fortunate to have snagged an excellent set of speakers. Jessica Moss, Louis Davidson, Timothy Ford, Jason Strate, and Alex Kuznetsov are just a few of the great speakers we have lined up.

There’s only a handful of spots left, so if you’re interested in attending, you should register soon. To find out more details about the speakers and sessions, or to register, be sure to check out our website at http://sqlsaturday.380pass.org.

The Other Reason:

baby_uff

Yes, that’s right, I’m with child. Expecting. Eating for two. Bun in the oven. In the family way. You get the idea.

So when I’m not at work, planning SQL Saturday, or playing Civilization Revolution, I’m sleeping. For those who remotely care, I’m due around Super Bowl time in February 2010.

2010: The Year I Make Contact

2010: The Year I Make Contact

Rest assured, this blog isn’t going away. And hopefully once I get through SQL Saturday and then PASS Summit, I’ll have more free time again. :)

Bored this summer?

Bored this summer? Do you like to help others? Do you have too much free time? Do you find yourself thinking, “Man, I really should spend more time indoors.” If you answered “yes” to all any of these questions, then have I got a proposition for you!

What could be more fun than getting second-degree burns at the waterpark, you ask? Volunteering on the PASS Performance SIG! That’s right, we’re looking for a few good women and men to join our ranks as content contributors. Specifically, we’re looking for people to write articles and/or host LiveMeeting events on performance-related topics. Not a performance expert? This can be a great way for you to learn more.

In case I scared you off in my opening paragraph, let me assure you that it really does not take that much time to be a volunteer. Just 3-4 hours a month can be a huge help. We’re also looking for contributors of all experience levels, so if you’re only comfortable writing intro-level articles, that’s definitely okay.

Oh, and while I’m begging for volunteers, we’re still looking for speakers for the SQL Saturday in East Iowa. :)

If you’re interested in either, then please send me an e-mail at michelle at sqlfool dot com for more information.

Primary Key vs Unique Constraint

Recently, I encountered a table that needed to have the definition of a clustered index altered. It just so happens that the clustered index and the primary key were one and the same, a pretty common occurrence. However, when we went to modify the index, it failed.

The following entry in Books Online for CREATE INDEX explains why:

If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.

Let’s test this, shall we?

/* Create a table with a clustered primary key */
Create Table dbo.myTable
(
      myID      int identity(1,1)   Not Null
    , myDate    smalldatetime       Not Null
    , myNumber  int                 Not Null
 
    Constraint CIX_myTable 
        Primary Key Clustered (myDate, myID)
);
 
/* Insert some data */
Insert Into myTable
Select '2009-01-01', 100 Union All
Select '2009-02-01', 200 Union All
Select '2009-01-05', 300;
 
/* Try to alter the index - FAIL */
Create Clustered Index CIX_myTable
    On myTable(myID, myDate, myNumber)
        With (Drop_Existing = On);
 
/* Drop the clustered primary key */
Alter Table dbo.myTable
    Drop Constraint CIX_myTable;
 
/* Add a unique clustered index */
Create Unique Clustered Index CIX_myTable 
    On myTable(myDate, myID);
 
/* Add a unique constraint */
Alter Table myTable
    Add Constraint Unique_myTable
        Unique (myDate);
 
/* Try to alter the index - SUCCESS */
Create Clustered Index CIX_myTable
    On myTable(myID, myDate, myNumber)
        With (Drop_Existing = On);
 
/* Add a primary key constraint */
Alter Table myTable
    Add Constraint PK_myTable
        Primary Key (myID, myDate);
 
/* Try to alter the index - SUCCESS */
Create Clustered Index CIX_myTable
    On myTable(myID, myDate)
        With (Drop_Existing = On);
 
/* Clean-Up */
Drop Table myTable;

The only instance that actually fails is the PRIMARY KEY constraint. The unique clustered index is able to be modified successfully, even when a unique constraint is applied to the table. So either I’m misunderstanding BOL, or BOL is mistaken. Either way, I’m then left with the following question: is there any reason to actually use a primary key when a unique index serves the same purpose and offers greater flexibility?

Questions, comments, and explanations are welcome. :)

SQL Rap Contest Results!

Wow! We ended up with 19 SQL rap submissions! The response was more than I could’ve hoped for. Parodies ranged from Coolio to Beyonce, Akon to home-spun rhymes, and everyone single one was awesome. Brent, Ward, and I have closely reviewed each and every submission. We laughed, and then we laughed some more, and now we’re ready to name a winner.

But first… did you know that GoDaddy.com is hiring? I’ve been working for this company for 2 years, and I *love* it. Truly. I can honestly say it’s the best place I’ve ever worked. If you’re looking for a challenging job and a great work environment, go take a look at some of the open positions at GoDaddy.com/Jobs. [/shameless plug]

And now… the winner of the 32GB iPod Touch, generously donated by GoDaddy.com for this silly little contest, is… Steve Jones! Steve’s rap, DBA’s Delight, is set to the tune of ‘Rapper’s Delight’ by The Sugarhill Gang and has so many SQL features dropped it’s almost educational. Almost.

THANK YOU to everyone who submitted a rap! Hopefully you had as much fun writing them as we had reading them. I’ll be following up in the next few days with a full listing of all SQL raps, once I figure out how to organize it! If you’ve posted it to your blog, please leave me a comment with the URL. And if for some reason, you don’t want me to post your submission, that’s okay too, just let me know.

Oh, what’s that? You want to see Steve’s winning entry NOW? Oh, if you insist…

DBA Delight” by Steve Jones
(This is set to the tune of “Rapper’s Delight” by The Sugarhill Gang. If you’re not familiar with the song, go to Playlist.com and search for it; it really adds to the whole experience!)

Now what you hear is not a parse
I’m rapping to the code
’cause me, my tweeps, and the devs
are gonna try to gen a load.
See I am the D-B-A
and I’d like to say hello
to the devs, the PMs, and the boss
that make me want to explode

But first I gotta, query, query
with intellisense and a brand new de-bug-ger
Say lock, don’t block,
with the hints to make those latches stop
Well so far you’ve seen me code, but I brought 2 men along
And next on the mic is dev Mike
C’mon Mike, show those keys

Check out the C-L-R, and the C-T-E,
and the rest is S-Q-L
you see I go by the link “Code-2-Supreme”
and now I’ll show you why
You see I know my keys
and identities, both char and I-N-T
I got primary and foreign keys
and DDs to back them up.
I got Assemblies, Partitioned entities
And a queue to make to grin
I’ve got a package I can integrate
to find that MAX or MIN!
Master PM, am you on?
It’s now your turn to spin a yarn

Well it’s locks and it’s blocks and it’s mis-matched socks
and the delays don’t stop until I take stock
I’m the PM you hate, when you’re late
I’ve Project and Excel
I’ve got every estimate that you made
Before this project went to hell!
So get to work, and get it done
Server, instance, database, too
I need them all up running, humming along
and returning results so true.
DBA, it’s on you
Show us what you gonna do!

I got a new quad core
sixty-four bit bits
and memory up to the rim
Gonna add some S-S-Ds for speed
and pile on the load for all my sims.
Got Agent jobs, A DR plan,
and procedures for every ‘ject
No access I don’t let you have
and auditing you won’t expect.
Clus-ter-ing, Log shipping,
mirroring across the land
I’ve got Queues in place, just in case
You break some fiber strands

Have you ever went over a friend’s shop to work
and the code just ain’t no good?
The cursors are slow, the procedures long
and a box you wouldn’t patch if you could!
So he asks what to do, to speed things up
and you sit stunned for a sec
The boss wants to help, a little consult
for some work without a spec.
Then you say, that’s it, I got to leave this place,
Don’t care what these people think
I’m just asking for trouble without benefits
and a database on the blink
So you bust out the door, hit the road
go back to your own job
check your server, see green lights
stop that head before it throbs
you text your friend two weeks later
to see how he has been
He says sorry about that job
but DBA, we’re still friends.

Chainblogging: Deserted Islands Have WiFi?

Jason Massie tagged me in the latest round of ChainBlogging. This one was started by Tim Ford (SQLAgentMan) and asks:

So You’re On A Deserted Island With WiFi and you’re still on the clock at work. Okay, so not a very good situational exercise here, but let’s roll with it; we’ll call it a virtual deserted island. Perhaps what I should simply ask is if you had a month without any walk-up work, no projects due, no performance issues that require you to devote time from anything other than a wishlist of items you’ve been wanting to get accomplished at work but keep getting pulled away from I ask this question: what would be the top items that would get your attention?

Become a BI Jedi Master

I’ve mentioned before on Twitter that I’ve recently accepted a position on the BI team. So it stands to reason that one of my major goals now is to dig into BI. While I can muck my way through as a Padawan, I want to become a true BI Jedi Grand Master. This means not just learning the tools (i.e. BIDS), but learning how to best utilize BI to deliver fast, jaw-dropping BI results. This includes SSAS, MDX, data mining, and anything else BI-related. I want to be able to easily answer the how, what, why, and when. And did I mention I want it to be *fast*?

Service Broker

I’ve played around with this a little bit. For a while, I even toyed with the idea of rolling my own version of replication with Service Broker because of the inability to swap out partitions on a replicated table in SQL 2005. But I still haven’t had a good excuse to build and deploy Service Broker to production, so my experience and understanding is not as high as I’d like it to be.

Upgrade to SQL Server 2008

I’d like to upgrade a couple of our servers to SQL Server 2008 and immediately start applying some of the new functionality. I’d set up Resource Governor to help manage some of those pesky, run-away ad-hoc queries that users like to run. I’d start looking at opportunities to replace old indexes with sexy new filtered indexes, saving space and improving performance. And I’d like to see if MERGE really lives up to the hype in a production environment.

Clean Up My E-mail Inbox

This could take me a month alone. Seriously. Okay, maybe not seriously, but it could at least take a good week.

I’m tagging some of my favorite bloggers:

East Iowa SQL Saturday – Call for Speakers!

The Call for Speakers is now open for the East Iowa SQL Saturday! This is our first time hosting a SQL Saturday, and there’s a lot of excitement and interest from our local SQL Server folks. There’s some interest from local speakers, but we’ll probably also need to pull in speakers from outside of Cedar Rapids and Iowa City to support the number of tracks and attendees we plan to have.

The event will be held on October 10th, 2009 at the University of Iowa in Iowa City. Special thanks to Russ Allen for his help with securing a location for us!

For those who are not in the immediate area but would be willing to travel, here’s some general travel times from major cities in the area:

  • Chicago – 3.5 hours
  • Omaha – 3.5 hours
  • Milwaukee – 4 hours
  • Kansas City – 4.5 hours
  • Minneapolis – 5 hours
  • St. Louis – 5 hours
  • Indianapolis – 6 hours
  • Columbus, OH – 9 hours (hint, hint, Jeremiah!)

So if you’re in the general area, please, PLEASE consider speaking at our SQL Saturday. Pretty please with sugar on top. :)

You can find out a little more information and submit sessions at our SQL Saturday website.