See you in Seattle!
My Summit abstract was accepted! I'm still a little surprised, but I'm also excited (okay, and a little nervous) to once more be presenting at the PASS Summit. If you'll be at Summit this year -- and I really hope you are, as it's well worth the time and cost -- then please make sure to say "hi" if you see me wandering around. Aside from the *excellent* content, my favorite thing about Summit is getting to meet so many great people.
In other news, I've once more switched roles within GoDaddy. For the half dozen folks who've been following my blog from the beginning, you may remember that I originally started out on the traffic team working with tuning and VLDBs, then took an opportunity to switch to the BI team to learn more about OLAP. Recently, a new team has been formed under the BI branch that's tasked with developing a massive hybrid data warehouse (by hybrid, I mean half OLTP and half OLAP). "How massive is it?" Well, it's SO massive, we're expecting to store petabytes of data when everything is said and done. I'm happy to say I'll be on this new team. So yes, that means we have an opening for an OLAP developer. We're also hiring SQL Server DBAs. We have offices in Cedar Rapids, Denver, and the Phoenix area. Send me an e-mail at michelle at sqlfool dot com if you're interested in learning more about this great job opportunity and company.
Lastly, I want to announce that SQL Saturday 50 is now open for registration! SQL Saturday 50 will be held in Iowa City, IA on Saturday, September 18th. We're almost at 50% of our attendance capacity, so if you're interested in attending, please register soon.
That's it for now. I promise that my next blog post will be uber technical.
SQL Saturday #50 – Call for Speakers
The Call for Speakers is now open for SQL Saturday #50, the East Iowa SQL Saturday event! This is our second time hosting a SQL Saturday, and we're hoping to build upon the success of last year's event. We're looking for a wide variety of topics on SQL Server and related technologies (e.g. PowerShell, R2, LINQ, etc.). We've also had several requests for intro-level topics, such as beginning disaster recovery and basic performance tuning. If you're even remotely thinking about speaking, please submit an abstract!
Last year we had about 100 folks attend from surrounding areas. This year, we're shooting for 125 attendees, which would max out our facility's capacity. Not sure how far away Iowa City is? It may be closer than you think. Allow me to rehash my travel times from last year's plea for speakers:
- Chicago – 3.5 hours
- Omaha – 3.5 hours
- Milwaukee – 4 hours
- Kansas City – 4.5 hours
- Minneapolis – 5 hours
- St. Louis – 5 hours
- Indianapolis – 6 hours
The event will be held on September 18th at the University of Iowa in Iowa City. You can find more information, including an abstract submission form, on our event website at http://sqlsaturday.com/50/eventhome.aspx.
Oh, and if you do make it to our SQL Saturday event, please make sure to stop me and say "hi!"
DELETE 5_Useless_Things FROM [SQL Server]
It's been a while since I've been caught up in a round of chainblogging, the blogosphere's version of a Facebook meme. This time, Denis Gobo tagged me in a post started by Paul Randal. Paul asked us to list the "top-5 things in SQL Server we all wish would just be removed from the product once and for all." I reviewed other posts, and the good and bad news is that they already listed several of the same things I would have. The good news is I'm apparently not alone; the bad news is that means I need to come up with something original! So while these wouldn't necessarily be the *first* 5 on my list, they'd still be on the list nevertheless:
Default Autogrowth Options
Okay, so I lied. I'm not completely original. Yes, I know Paul Randal also commented on this one. While I said I would try to come up with only original ones, this one just has to be repeated. I've actually seen this option overlooked in production environments, resulting in thousands of VLFs. It's just a terrible default, and it needs to be changed.
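For anyone who wants to check their own servers, here's a minimal sketch of how you might count the VLFs in a log and replace the default growth setting with a fixed increment. The database name, the logical file name, and the 512MB increment are placeholders made up for illustration, and DBCC LOGINFO is an undocumented command, so treat this as a starting point rather than a recommendation.

```sql
/* Hypothetical names: yourDatabase and yourDatabase_log are placeholders */
USE yourDatabase;

/* Returns one row per virtual log file (VLF) in the current database's log */
DBCC LOGINFO;

/* Replace the default growth setting with a fixed increment.
   The size is an assumption; pick one that suits your workload. */
ALTER DATABASE yourDatabase
MODIFY FILE (NAME = N'yourDatabase_log', FILEGROWTH = 512MB);
```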
Edit Top 200 Rows
This "feature" is just asking for trouble. Any DBA who is managing a SQL Server database should understand how to actually write insert/update/delete statements. Maybe leave the option available in SQL Express, but please remove it from SQL Server Standard & Enterprise.
Debug
There's nothing wrong with the Debug option, but I think it should be removed as a default option for the toolbar. It's easily mistaken for "Execute," which I've seen more than one DBA do on occasion.
PIVOT
I understand the need to pivot your data, but let's face it. PIVOT is a clunky, expensive SQL operation. Let's move the presentation tasks to the presentation layer (.NET), and reserve the database layer for what it does best.
Update: By popular demand, I have removed PIVOT from this list. Who am I to argue with such fine folks?
Cursors
Okay, okay, I know I can't actually get rid of this, BUT I think it gets abused way too much. Set-based operations, anyone?
Alrighty, now it's my turn to tag! I'm not sure if they've already been hit, but I'm tagging:
Index Defrag Script Updates – Beta Testers Needed
Update: Wow! I've received a ton of responses to my request for beta testers. Thank you all! The SQL Community is really amazing. I'll hopefully have the new version online in just a few days.
Over the last few months, I've received many great comments and suggestions regarding my Index Defrag Script v3.0. I've just recently had time to implement most of these suggestions, plus some other things that I thought would be useful.
Here's some of what you can look forward to shortly:
- Probably the single most requested feature, the new version of the script allows you to set a time limit for index defrags.
- There's now a static table for managing the status of index defrags. This way, when your time limit is reached, you can pick up where you left off the next day, without the need to rescan indexes.
- There's now an option to prioritize defrags by range scan counts, fragmentation level, or page counts.
- For those using partitioning, there is now an option to exclude the right-most populated partition from defrags (in theory, the one you're writing to in a sliding-window scenario).
- Options such as page count limits and SORT_IN_TEMPDB are now parameterized.
- I've enhanced error logging.
- ... and more!
Right now, I'm looking for a few folks who are willing to beta test the script. If you're interested, please send me an e-mail at michelle at sqlfool dot com with the editions of SQL Server you can test this on (e.g. 2005 Standard, 2008 Enterprise, etc.).
Thank you!
Replication Bug with Partitioned Tables
Recently, we came across a bug in SQL Server 2005 on one of our production servers. Apparently, if you execute an ALTER TABLE statement on a replicated table with more than 128 partitions, the log reader will fail. A relatively obscure bug, I know. Microsoft has recognized this as a confirmed bug, but I couldn't find it anywhere on the intertubes, thus the inspiration for this blog post. Microsoft's official solution for this issue is to upgrade to SQL Server 2008.
For various reasons, we were unable to execute an upgrade at the time. And since this was a 2 terabyte database, we wanted to come up with a solution that wouldn't involve reinitializing the entire publication. Our quick-fix while we were troubleshooting the issue was to create a linked server to the production box. Not ideal, I know, but it worked in a pinch and minimized exposure of the issue. Fortunately for us, we were able to solve the problem on the publication database pretty easily. All of the affected partition functions had empty partitions created several months in the future, so we simply merged any empty partition ranges for future dates. Our solution to our now-out-of-date subscribers was to apply static row filtering to any table with more than 100 million records. While this would introduce some overhead with the replication of these tables, it would allow us a much faster recovery time. We decided to use the start of the most recent hour as our filtering criteria, just to give us a "clean" filter, so we had to delete data from any table where we were going to apply the filter. After that, it was simply a matter of resuming replication.
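To make the partition merge step a little more concrete, here's a rough sketch of how empty partitions can be spotted and then merged. The table name, partition function name, and boundary value are all hypothetical; this is an illustration of the technique, not the script used in the incident above.

```sql
/* Hypothetical names: dbo.yourTableName and yourPartitionFunction */

/* Row counts per partition for the heap or clustered index;
   partitions showing 0 rows are candidates for merging */
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.[object_id] = OBJECT_ID('dbo.yourTableName')
  AND p.index_id IN (0, 1)
ORDER BY p.partition_number;

/* Remove one boundary value, which merges the two partitions on either side of it.
   The date is a placeholder and must match an existing boundary value. */
ALTER PARTITION FUNCTION yourPartitionFunction()
MERGE RANGE ('2010-06-01');
```

Each MERGE RANGE call removes a single boundary, so getting back under a partition limit means repeating it once per empty range.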
All things considered, it took us a little over a day to recover from the issue. Most of that time was spent troubleshooting the problem and identifying a workable solution; actual execution of the changes was pretty quick. Moral of the story? Upgrade to SQL Server 2008.
#PASSAwesomeness
Allen Kinsel on Twitter (@sqlinsaneo) recently started a new Twitter tag, #PASSAwesomeness, about all of the cool things about PASS Summit. I really like the tag, so I'm going to blatantly ~~steal~~ borrow it for this post.
First, and long overdue, I want to give a brief recap of the East Iowa SQL Saturday. On October 17th, our local PASS chapter, 380PASS, sponsored our first ever SQL Saturday at the University of Iowa in Iowa City. By all accounts, the event was a great success! We had 90 attendees, 11 speakers, and 21 sessions. We received numerous compliments on the quality of the speakers, the niceness of the facilities, and the abundance of food. Not too shabby for our first time hosting the event, if I do say so myself.
I'd like to thank all of our wonderful speakers, especially those who traveled from out of town and out of state, for making this event such a success. I'd also like to thank our amazing volunteers for helping put this all together. Lastly, but certainly not least, I'd like to thank our generous sponsors, without whom this event would not be possible. Because this event went so smoothly and was so well received in the community, we've already started planning our next big SQL event! In the meantime, don't forget to check out our monthly 380PASS meetings to tide you over.
I'd also like to take a moment to discuss the PASS Summit. Unless you're a DBA who's been living under a rock, you've probably heard of the PASS Summit. If you *have* been living under a rock -- and hey, I'm not poking fun, I used to live under a rock, too! -- then what you need to know is that the Summit is the largest SQL Server conference in the world. It's a gathering of Microsoft developers and SQL Server gurus; the rest of us show up to try to absorb as much from them as possible. Since I've recently moved to the Business Intelligence team, I'm extremely excited to delve into the amazing amount of BI content offered.
I'm also deeply honored to be presenting at the Summit this year on some of the performance tuning techniques I've used with great success in my production environments. The session is titled, Super Bowl, Super Load - A Look At Performance Tuning for VLDB's. If you're interested in performance tuning or VLDB (very large database) topics, consider stopping by to catch my session. From what I can tell, I'll be presenting on Tuesday from 10:15am - 11:30am in room(s?) 602-604.
If you read my blog, or if we've ever interacted in any way on the internet -- Twitter, LinkedIn, e-mails, blog comments, etc. -- please stop by and say "hi"! Aside from all of the awesome SQL Server content, I'm really looking forward to meeting as many new folks as possible.
And on that note...
Getting to meet all of the amazing SQL Server professionals out there who have inspired and encouraged me in so many ways #PASSAwesomeness
Partitioning Tricks
For those of you who are using partitioning, or who are considering using partitioning, allow me to share some tips with you.
Easy Partition Staging Tables
Switching partitions (or more specifically, hobts) in and out of a partitioned table requires the use of a staging table. The staging table has very specific requirements: it must be completely identical to the partitioned table, including indexing structures, and it must have a check constraint that limits data to the partitioning range. Thanks to my co-worker Jeff, I've recently started using the SQL Server Partition Management tool on CodePlex. I haven't used the automatic partition switching feature -- frankly, using any sort of data modification tool in a production environment makes me nervous -- but I've been using the scripting option to create staging tables in my development environment, which I then copy to production for use. It's nothing you can't do yourself, but it does make the whole process easy and painless, plus it saves you from annoying typos. But be careful when using this tool to just create the table and check constraints automatically, because you may need to...
Add Check Constraints After Loading Data
Most of the time, I add the check constraint when I create the staging table, then I load data and perform the partition switch. However, for some reason, I was receiving the following error:
.Net SqlClient Data Provider: Msg 4972, Level 16, State 1, Line 1
ALTER TABLE SWITCH statement failed. Check constraints or partition function of source table 'myStagingTable' allows values that are not allowed by check constraints or partition function on target table 'myDestinationTable'.
This drove me crazy. I confirmed my check constraints were correct, that I had the correct partition number, and that all schema and indexes matched identically. After about 30 minutes of this, I decided to drop and recreate the constraint. For some reason, it fixed the issue. Repeat tests produced the same results: the check constraint needed to be added *after* data was loaded. This error is occurring on a SQL Server 2008 SP1 box; to be honest, I'm not sure what's causing the error, so if you know, please leave me a comment. But I figured I'd share so that anyone else running into this issue can hopefully save some time and headache.
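For reference, the load-then-constrain order described above might look something like the sketch below. The column name myDate, the constraint name, the June 2009 range, and partition number 6 are all assumptions for illustration. It also assumes myDate is declared NOT NULL and that the target partition is empty.

```sql
/* Hypothetical names and values: myDate, CK_myStagingTable_Range,
   the June 2009 range, and partition 6 are placeholders */

/* 1. Load the staging table first (load step omitted) */

/* 2. Add the check constraint after the data is in place;
      WITH CHECK validates the rows that are already there */
ALTER TABLE dbo.myStagingTable WITH CHECK
ADD CONSTRAINT CK_myStagingTable_Range
CHECK (myDate >= '2009-06-01' AND myDate < '2009-07-01');

/* 3. Look up which partition holds a value in that range */
SELECT $PARTITION.InsertPartitionFunction(CAST('2009-06-01' AS SMALLDATETIME)) AS partitionNumber;

/* 4. Switch the staging table into that partition */
ALTER TABLE dbo.myStagingTable
SWITCH TO dbo.myDestinationTable PARTITION 6;
```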
Replicating Into Partitioned and Non-Partitioned Tables
Recently, we needed to replicate a non-partitioned table to two different destinations. We wanted to use partitioning for Server A, which has 2008 Enterprise; Server B, which is on 2005 Standard, could not take advantage of partitioning. The solution was really easy: create a pre-snapshot and post-snapshot script for the publication, then modify to handle each server group differently. Using pseudo-code, it looked something like this:
    /* Identify which servers get the partitioned version */
    IF @@SERVERNAME In ('yourServerNameList')
    BEGIN

        /* Create your partitioning function first, if necessary;
           the partitioning scheme depends on it */
        IF Not Exists(SELECT * FROM sys.partition_functions WHERE name = 'InsertPartitionFunction')
            CREATE PARTITION FUNCTION InsertPartitionFunction (SMALLDATETIME)
                AS RANGE RIGHT FOR VALUES ('insertValues');

        /* Create your partitioning scheme if necessary */
        IF Not Exists(SELECT * FROM sys.partition_schemes WHERE name = 'InsertPartitionScheme')
            CREATE PARTITION SCHEME InsertPartitionScheme
                AS PARTITION InsertPartitionFunction ALL TO ([PRIMARY]);

        /* Create a partitioned version of your table */
        CREATE TABLE [dbo].[yourTableName]
        (
            [yourTableSchema]
        ) ON InsertPartitionScheme([partitioningKey]);

    END
    ELSE
    BEGIN

        /* Create a non-partitioned version of your table */
        CREATE TABLE [dbo].[yourTableName]
        (
            [yourTableSchema]
        ) ON [PRIMARY];

    END
You could also use an edition check instead of a server name check, if you prefer. The post-snapshot script basically looked the same, except you create partitioned indexes instead.
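As a rough sketch of the edition check, SERVERPROPERTY('EngineEdition') returns 3 for the Enterprise engine (which also covers the Developer and Evaluation editions) and 2 for Standard. The PRINT statements below are just stand-ins for the two CREATE TABLE branches.

```sql
/* EngineEdition 3 = Enterprise engine (Enterprise, Developer, Evaluation) */
IF CAST(SERVERPROPERTY('EngineEdition') AS INT) = 3
BEGIN
    /* Placeholder for the partitioned branch */
    PRINT 'Create the partitioned version of the table here.';
END
ELSE
BEGIN
    /* Placeholder for the non-partitioned branch */
    PRINT 'Create the non-partitioned version of the table here.';
END
```

The upside over a server name list is that nothing needs to be updated when a subscriber is added or renamed.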
Compress Old Partitions
Did you know you can set different compression levels for individual partitions? It's true! I've just completed doing this on our largest partitioned table. Here's how:
    /* Apply compression to your partitioned table */
    ALTER TABLE dbo.yourTableName
    Rebuild Partition = All
    WITH
    (
          Data_Compression = Page ON Partitions(1 TO 9)
        , Data_Compression = ROW  ON Partitions(10 TO 11)
        , Data_Compression = NONE ON Partitions(12)
    );

    /* Apply compression to your partitioned index */
    ALTER INDEX YourPartitionedIndex
        ON dbo.yourTableName
        Rebuild Partition = All
        WITH
        (
              Data_Compression = Page ON Partitions(1 TO 9)
            , Data_Compression = ROW  ON Partitions(10 TO 11)
            , Data_Compression = NONE ON Partitions(12)
        );

    /* Apply compression to your unpartitioned index */
    ALTER INDEX YourUnpartitionedIndex
        ON dbo.yourTableName
        Rebuild WITH (Data_Compression = ROW);
A couple of things to note. In all of our proof-of-concept testing, we found that compression significantly reduced query execution time, reads (IO), and storage. However, CPU was also increased significantly. The results were more dramatic, both good and bad, with page compression versus row compression. Still, for our older partitions, which aren't queried regularly, it made sense to turn on page compression. The newer partitions receive row compression, and the newest partitions, which are still queried very regularly by routine processes, were left completely uncompressed. This seems to strike a nice balance in our environment, but of course, results will vary depending on how you use your data.
Something to be aware of is that compressing your clustered index does *not* compress your non-clustered indexes; those are separate operations. Lastly, for those who are curious, it took us about 1 minute to apply row compression and about 7 minutes to apply page compression to partitions averaging 30 million rows.
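If you'd like to gauge the impact before rebuilding anything, here's a minimal sketch that estimates the savings for a single partition and then lists the compression setting currently applied to each partition. The table name and partition number are placeholders, and index_id 1 assumes the table has a clustered index.

```sql
/* Hypothetical names: dbo.yourTableName; partition 12 is a placeholder */

/* Estimate the effect of page compression on one partition
   of the clustered index (index_id 1) */
EXEC sp_estimate_data_compression_savings
      @schema_name      = 'dbo'
    , @object_name      = 'yourTableName'
    , @index_id         = 1
    , @partition_number = 12
    , @data_compression = 'PAGE';

/* Show the compression setting currently applied
   to each partition of each index on the table */
SELECT i.name AS indexName
    , p.partition_number
    , p.data_compression_desc
    , p.rows
FROM sys.partitions AS p
JOIN sys.indexes AS i
    ON i.[object_id] = p.[object_id]
    AND i.index_id = p.index_id
WHERE p.[object_id] = OBJECT_ID('dbo.yourTableName')
ORDER BY i.index_id, p.partition_number;
```

The second query is also a quick way to confirm that the non-clustered indexes really were left uncompressed after the table rebuild.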
Looking for more information on table partitioning? Check out my overview of partitioning, my example code, and my article on indexing on partitioned tables.
Why I’m Blogging Less
I've received a few questions asking why I've been blogging less frequently, and even one inquiry after my health. Rest assured, I'm completely fine. But there are 2 perfectly good reasons why I've been blogging less these days.
East Iowa SQL Saturday:
I'm the event organizer for East Iowa SQL Saturday, which is eating up a lot of my free time. If you haven't yet heard about our SQL Saturday event, let me give you a brief overview. It's a FREE, one-day training event geared toward SQL Server professionals and anyone who wants to learn more about SQL Server. We have 22 sessions planned covering a variety of topics, from Business Intelligence to Disaster Recovery to SQL Server 2008 topics. And if you're a .NET developer, we also have some .NET-related presentations, including PowerShell and MVC.
We're very fortunate to have snagged an excellent set of speakers. Jessica Moss, Louis Davidson, Timothy Ford, Jason Strate, and Alex Kuznetsov are just a few of the great speakers we have lined up.
There's only a handful of spots left, so if you're interested in attending, you should register soon. To find out more details about the speakers and sessions, or to register, be sure to check out our website at http://sqlsaturday.380pass.org.
The Other Reason:
Yes, that's right, I'm with child. Expecting. Eating for two. Bun in the oven. In the family way. You get the idea.
So when I'm not at work, planning SQL Saturday, or playing Civilization Revolution, I'm sleeping. For those who remotely care, I'm due around Super Bowl time in February 2010.

2010: The Year I Make Contact
Rest assured, this blog isn't going away. And hopefully once I get through SQL Saturday and then PASS Summit, I'll have more free time again.
Bored this summer?
Bored this summer? Do you like to help others? Do you have too much free time? Do you find yourself thinking, "Man, I really should spend more time indoors." If you answered "yes" to ~~all~~ any of these questions, then have I got a proposition for you!
Sorry, guys, not that kind of proposition
What could be more fun than getting second-degree burns at the waterpark, you ask? Volunteering on the PASS Performance SIG! That's right, we're looking for a few good women and men to join our ranks as content contributors. Specifically, we're looking for people to write articles and/or host LiveMeeting events on performance-related topics. Not a performance expert? This can be a great way for you to learn more.
In case I scared you off in my opening paragraph, let me assure you that it really does not take that much time to be a volunteer. Just 3-4 hours a month can be a huge help. We're also looking for contributors of all experience levels, so if you're only comfortable writing intro-level articles, that's definitely okay.
Oh, and while I'm begging for volunteers, we're still looking for speakers for the SQL Saturday in East Iowa.
If you're interested in either, then please send me an e-mail at michelle at sqlfool dot com for more information.
Primary Key vs Unique Constraint
Recently, I encountered a table that needed to have the definition of a clustered index altered. It just so happens that the clustered index and the primary key were one and the same, a pretty common occurrence. However, when we went to modify the index, it failed.
The following entry in Books Online for CREATE INDEX explains why:
If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.
Let's test this, shall we?
    /* Create a table with a clustered primary key */
    CREATE TABLE dbo.myTable
    (
          myID      INT IDENTITY(1,1)   Not Null
        , myDate    SMALLDATETIME       Not Null
        , myNumber  INT                 Not Null
        , CONSTRAINT CIX_myTable PRIMARY KEY CLUSTERED (myDate, myID)
    );

    /* Insert some data */
    INSERT INTO dbo.myTable (myDate, myNumber)
    SELECT '2009-01-01', 100 UNION All
    SELECT '2009-02-01', 200 UNION All
    SELECT '2009-01-05', 300;

    /* Try to alter the index - FAIL */
    CREATE CLUSTERED INDEX CIX_myTable
        ON dbo.myTable(myID, myDate, myNumber)
        WITH (Drop_Existing = ON);

    /* Drop the clustered primary key */
    ALTER TABLE dbo.myTable
    DROP CONSTRAINT CIX_myTable;

    /* Add a unique clustered index */
    CREATE UNIQUE CLUSTERED INDEX CIX_myTable
        ON dbo.myTable(myDate, myID);

    /* Add a unique constraint */
    ALTER TABLE dbo.myTable
    ADD CONSTRAINT Unique_myTable UNIQUE (myDate);

    /* Try to alter the index - SUCCESS */
    CREATE CLUSTERED INDEX CIX_myTable
        ON dbo.myTable(myID, myDate, myNumber)
        WITH (Drop_Existing = ON);

    /* Add a primary key constraint */
    ALTER TABLE dbo.myTable
    ADD CONSTRAINT PK_myTable PRIMARY KEY (myID, myDate);

    /* Try to alter the index - SUCCESS */
    CREATE CLUSTERED INDEX CIX_myTable
        ON dbo.myTable(myID, myDate)
        WITH (Drop_Existing = ON);

    /* Clean-Up */
    DROP TABLE dbo.myTable;
The only instance that actually fails is the PRIMARY KEY constraint. The unique clustered index is able to be modified successfully, even when a unique constraint is applied to the table. So either I'm misunderstanding BOL, or BOL is mistaken. Either way, I'm then left with the following question: is there any reason to actually use a primary key when a unique index serves the same purpose and offers greater flexibility?
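One way to dig into this is to look at which index actually backs each constraint. The sketch below queries sys.indexes for the test table; it needs to be run before the clean-up step, while dbo.myTable still exists. A constraint added with ALTER TABLE ... ADD CONSTRAINT gets its own index named after the constraint, so the output may show that Unique_myTable and PK_myTable are enforced by separate indexes rather than by CIX_myTable, which would be consistent with the BOL wording.

```sql
/* List every index on the test table and whether it enforces a constraint.
   Run this before the clean-up step, while dbo.myTable still exists. */
SELECT i.name AS indexName
    , i.type_desc
    , i.is_unique
    , i.is_primary_key
    , i.is_unique_constraint
FROM sys.indexes AS i
WHERE i.[object_id] = OBJECT_ID('dbo.myTable')
ORDER BY i.index_id;
```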
Questions, comments, and explanations are welcome.