String Parsing Function

This handy little script parses a string and returns the results as a table. I know there are a ton of string parsing functions out there, but I thought I’d add to the list. ;)

The basic logic of it (using a recursive CTE) is derived from a forum post I found years ago. The table-valued UDF, the variable delimiter, etc. are all things I added. So while I can’t claim complete credit, I still thought it’d be worthwhile to share. Plus, this will be used in another script I’ll be posting soon. :)

/* Let's create our parsing function... */
CREATE FUNCTION dbo.dba_parseString_udf
(
          @stringToParse VARCHAR(8000)  
        , @delimiter     CHAR(1)
)
RETURNS @parsedString TABLE (stringValue VARCHAR(128))
AS
/*********************************************************************************
    Name:       dba_parseString_udf
 
    Author:     Michelle Ufford, http://sqlfool.com
 
    Purpose:    This function parses string input using a variable delimiter.
 
    Notes:      Two common delimiter values are space (' ') and comma (',')
 
    Date        Initials    Description
    ----------------------------------------------------------------------------
    2011-05-20  MFU         Initial Release
*********************************************************************************
Usage:
    SELECT *
    FROM dbo.dba_parseString_udf(<string>, <delimiter>);
 
Test Cases:
 
    1.  multiple strings separated by space
        SELECT * FROM dbo.dba_parseString_udf('  aaa  bbb  ccc ', ' ');
 
    2.  multiple strings separated by comma
        SELECT * FROM dbo.dba_parseString_udf(',aaa,bbb,,,ccc,', ',');
*********************************************************************************/
BEGIN
 
    /* Declare variables */
    DECLARE @trimmedString  VARCHAR(8000);
 
    /* We need to trim our string input in case the user entered extra spaces */
    SET @trimmedString = LTRIM(RTRIM(@stringToParse));
 
    /* Let's create a recursive CTE to break down our string for us */
    WITH parseCTE (StartPos, EndPos)
    AS
    (
        SELECT 1 AS StartPos
            , CHARINDEX(@delimiter, @trimmedString + @delimiter) AS EndPos
        UNION ALL
        SELECT EndPos + 1 AS StartPos
            , CharIndex(@delimiter, @trimmedString + @delimiter , EndPos + 1) AS EndPos
        FROM parseCTE
        WHERE CHARINDEX(@delimiter, @trimmedString + @delimiter, EndPos + 1) <> 0
    )
 
    /* Let's take the results and stick it in a table */  
    INSERT INTO @parsedString
    SELECT SUBSTRING(@trimmedString, StartPos, EndPos - StartPos)
    FROM parseCTE
    WHERE LEN(LTRIM(RTRIM(SUBSTRING(@trimmedString, StartPos, EndPos - StartPos)))) > 0
    OPTION (MaxRecursion 8000);
 
    RETURN;   
END
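
Here’s a quick sketch of how I typically use the function once it’s created: since it returns a table, you can CROSS APPLY it against a delimited column. The table and column names below are made up purely for the example.

/* Parse a comma-delimited column for every row in a table
   (dbo.someTable and its columns are hypothetical) */
SELECT t.someKey
    , parsed.stringValue
FROM dbo.someTable AS t
CROSS APPLY dbo.dba_parseString_udf(t.delimitedList, ',') AS parsed;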

11-Word Warning

Tom LaRock posted a new Meme Monday challenge: “Write a SQL blog post in 11 words or less.”

Donabel Santos tagged me, and I couldn’t resist the challenge. So here’s my entry:

Hasty coding, error prone. No backups, data loss. Company for sale.

This was inspired by the recent spate of stories I’ve heard about companies that have failed because they did not properly manage their data and databases.

I don’t know who’s been tagged or not, so I’m tagging some of my SQL Saturday Chicago friends:

Identity Columns: Are You Nearing The Limits?

I use identity columns frequently. After all, identity columns make great clustering keys. But when you use identity columns, it’s important to check how many values you have left before you reach the limit of your data type. An identity column has a fixed range of values it can use, based upon whether you specified tinyint, smallint, int, or bigint when you defined the column. If you reach this limit, your inserts will begin to fail (cue the Chernobyl-like SQL meltdown). I just finished an audit of my tables and thought I’d share the script. I should warn you that this script is *not* perfect… namely, it doesn’t handle negative integer values very elegantly. It also doesn’t know if you started your seed at zero, approached your max positive limit, then reseeded to the negative limit (see my “quick and dirty fix” tip at the end of this article).
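
If you just want to spot-check a single table before running the full audit below, IDENT_CURRENT will give you the last identity value generated for it (the table name here is just a placeholder):

/* Quick spot-check of a single table's last identity value */
Select IDENT_CURRENT('dbo.orders') As [last_value];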

/* Define how close we are to the value limit
   before we start throwing up the red flag.
   The higher the value, the closer to the limit. */
Declare @threshold decimal(3,2) = .85;
 
/* Create a temp table */
Create Table #identityStatus
(
      database_name     varchar(128)
    , table_name        varchar(128)
    , column_name       varchar(128)
    , data_type         varchar(128)
    , last_value        bigint
    , max_value         bigint
);
 
/* Use an undocumented command to run a SQL statement
   in each database on a server */
Execute sp_msforeachdb '
    Use [?];
    Insert Into #identityStatus
    Select ''?'' As [database_name]
        , Object_Name(id.object_id, DB_ID(''?'')) As [table_name]
        , id.name As [column_name]
        , t.name As [data_type]
        , Cast(id.last_value As bigint) As [last_value]
        , Case 
            When t.name = ''tinyint''   Then 255 
            When t.name = ''smallint''  Then 32767 
            When t.name = ''int''       Then 2147483647 
            When t.name = ''bigint''    Then 9223372036854775807
          End As [max_value]
    From sys.identity_columns As id
    Join sys.types As t
        On id.system_type_id = t.system_type_id
    Where id.last_value Is Not Null';
 
/* Retrieve our results and format it all prettily */
Select database_name
    , table_name
    , column_name
    , data_type
    , last_value
    , Case 
        When last_value < 0 Then 100
        Else (1 - Cast(last_value As float(4)) / max_value) * 100 
      End As [percentLeft]
    , Case 
        When Cast(last_value As float(4)) / max_value >= @threshold
            Then 'warning: approaching max limit'
        Else 'okay'
        End As [id_status]
From #identityStatus
Order By percentLeft;
 
/* Clean up after ourselves */
Drop Table #identityStatus;

If you find yourself quickly approaching your max limit and need to implement a quick and dirty fix, you can reseed your identity column. Of course, this only works if you started at zero instead of the actual lower, negative limit.
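
For example, if an int identity column started at 1 and is closing in on 2,147,483,647, a DBCC CHECKIDENT reseed down to the negative limit buys you the other half of the range (the table name below is just a placeholder):

/* Reseed a hypothetical table to the bottom of the int range;
   subsequent inserts will count up from there */
DBCC CHECKIDENT ('dbo.myBigTable', RESEED, -2147483648);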

Effective Clustered Indexing

My first Simple-Talk article was published yesterday! I’m pretty excited about it and wanted to share the link. In the article, I give an overview of how clustered and nonclustered indexes work, and I demonstrate why clustered index best practices — narrow, unique, static, and ever-increasing — are important design considerations.

You can find the article on Simple-Talk’s website at:
http://www.simple-talk.com/sql/learn-sql-server/effective-clustered-indexes/

Please let me know your thoughts! :)

SQL Agent Job Script

This blog post is brought to you by the awesomeness that is the SQL Server Twitter community. Contributions were made by several awesome Tweeples, including Denny Cherry, Kendra Little, Ken Simmons, and Magnus Ahlkvist, among others.

What started this is something you’ve all probably run into at one time or another. We’re currently having some resource contention on our server when a ton of processes kick off and overlap in the morning. Now, I have a script that I’ve used in the past for monitoring SQL Agent jobs, but this time I wanted to add some historical run-time information. I know the sysjobhistory table contains the information I need, but it has some… let’s say, creative ways of storing the data. Opinions on the reasons why vary.

Regardless of the reason, I needed to do some conversions. Denny shared with us the msdb.dbo.agent_datetime function to convert run_date and run_time into an actual datetime value. I have to say, this certainly cleans up the code quite a bit! Then Magnus shared a method to convert run_duration into seconds, which he modified from a script on SQLServerCentral. I was able to use these two tidbits to update my old script. You can now run this script to get back a list of all enabled jobs, a script that will disable them, a breakdown of the schedule, and a historical run-time average.
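
As a quick sanity check on that run_duration conversion, here’s the math on a made-up value: run_duration stores elapsed time as an HHMMSS integer, so 13542 means 1 hour, 35 minutes, 42 seconds.

/* run_duration is stored as HHMMSS in an int; 13542 = 01:35:42 = 5742 seconds */
Declare @run_duration int = 13542;
Select (@run_duration / 10000) * 3600   /* hours   -> seconds */
    + (@run_duration / 100 % 100) * 60  /* minutes -> seconds */
    + (@run_duration % 100) As 'runTime_inSec';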

/*  How many days do you want to include in your run-time average?
    Recent values tend to be more useful. */
Declare @daysToAverage smallint = 30;
 
Declare @avgRunTime Table
(
      job_id      uniqueidentifier
    , avgRunTime  int
);
 
/* We need to parse the schedule into something we can understand */
Declare @weekDay Table (
      mask       int
    , maskValue  varchar(32)
);
 
Insert Into @weekDay
Select 1, 'Sunday'  Union All
Select 2, 'Monday'  Union All
Select 4, 'Tuesday'  Union All
Select 8, 'Wednesday'  Union All
Select 16, 'Thursday'  Union All
Select 32, 'Friday'  Union All
Select 64, 'Saturday';
 
/* First, let's get our run-time average */
Insert Into @avgRunTime
Select job_id
    , Avg((run_duration/10000) * 3600 + (run_duration/100%100)*60 + run_duration%100) As 'avgRunTime' /* convert HHMMSS to seconds */
From msdb.dbo.sysjobhistory
Where step_id = 0 -- only grab our total run-time
    And run_status = 1 -- only grab successful executions
    And msdb.dbo.agent_datetime(run_date, run_time) >= DateAdd(day, -@daysToAverage, GetDate())
Group By job_id;
 
/* Now let's get our schedule information */
With myCTE
As(
    Select sched.name As 'scheduleName'
        , sched.schedule_id
        , jobsched.job_id
        , Case When sched.freq_type = 1 Then 'Once' 
            When sched.freq_type = 4 
                And sched.freq_interval = 1 
                    Then 'Daily'
            When sched.freq_type = 4 
                Then 'Every ' + Cast(sched.freq_interval As varchar(5)) + ' days'
            When sched.freq_type = 8 Then 
                Replace( Replace( Replace(( 
                    Select maskValue 
                    From @weekDay As x 
                    Where sched.freq_interval & x.mask <> 0 
                    Order By mask For XML Raw)
                , '"/><row maskValue="', ', '), '<row maskValue="', ''), '"/>', '') 
                + Case When sched.freq_recurrence_factor <> 0 
                        And sched.freq_recurrence_factor = 1 
                            Then '; weekly' 
                    When sched.freq_recurrence_factor <> 0 Then '; every ' 
                + Cast(sched.freq_recurrence_factor As varchar(10)) + ' weeks' End
            When sched.freq_type = 16 Then 'On day ' 
                + Cast(sched.freq_interval As varchar(10)) + ' of every '
                + Cast(sched.freq_recurrence_factor As varchar(10)) + ' months' 
            When sched.freq_type = 32 Then 
                Case When sched.freq_relative_interval = 1 Then 'First'
                    When sched.freq_relative_interval = 2 Then 'Second'
                    When sched.freq_relative_interval = 4 Then 'Third'
                    When sched.freq_relative_interval = 8 Then 'Fourth'
                    When sched.freq_relative_interval = 16 Then 'Last'
                End + 
                Case When sched.freq_interval = 1 Then ' Sunday'
                    When sched.freq_interval = 2 Then ' Monday'
                    When sched.freq_interval = 3 Then ' Tuesday'
                    When sched.freq_interval = 4 Then ' Wednesday'
                    When sched.freq_interval = 5 Then ' Thursday'
                    When sched.freq_interval = 6 Then ' Friday'
                    When sched.freq_interval = 7 Then ' Saturday'
                    When sched.freq_interval = 8 Then ' Day'
                    When sched.freq_interval = 9 Then ' Weekday'
                    When sched.freq_interval = 10 Then ' Weekend'
                End
                + Case When sched.freq_recurrence_factor <> 0 
                        And sched.freq_recurrence_factor = 1 Then '; monthly'
                    When sched.freq_recurrence_factor <> 0 Then '; every ' 
                + Cast(sched.freq_recurrence_factor As varchar(10)) + ' months' End
            When sched.freq_type = 64 Then 'StartUp'
            When sched.freq_type = 128 Then 'Idle'
          End As 'frequency'
        , IsNull('Every ' + Cast(sched.freq_subday_interval As varchar(10)) + 
            Case When sched.freq_subday_type = 2 Then ' seconds'
                When sched.freq_subday_type = 4 Then ' minutes'
                When sched.freq_subday_type = 8 Then ' hours'
            End, 'Once') As 'subFrequency'
        , Replicate('0', 6 - Len(sched.active_start_time)) 
            + Cast(sched.active_start_time As varchar(6)) As 'startTime'
        , Replicate('0', 6 - Len(sched.active_end_time)) 
            + Cast(sched.active_end_time As varchar(6)) As 'endTime'
        , Replicate('0', 6 - Len(jobsched.next_run_time)) 
            + Cast(jobsched.next_run_time As varchar(6)) As 'nextRunTime'
        , Cast(jobsched.next_run_date As char(8)) As 'nextRunDate'
    From msdb.dbo.sysschedules As sched
    Join msdb.dbo.sysjobschedules As jobsched
        On sched.schedule_id = jobsched.schedule_id
    Where sched.enabled = 1
)
 
/* Finally, let's look at our actual jobs and tie it all together */
Select job.name As 'jobName'
    , sched.scheduleName
    , sched.frequency
    , sched.subFrequency
    , SubString(sched.startTime, 1, 2) + ':' 
        + SubString(sched.startTime, 3, 2) + ' - ' 
        + SubString(sched.endTime, 1, 2) + ':' 
        + SubString(sched.endTime, 3, 2) 
        As 'scheduleTime' -- HH:MM
    , SubString(sched.nextRunDate, 1, 4) + '/' 
        + SubString(sched.nextRunDate, 5, 2) + '/' 
        + SubString(sched.nextRunDate, 7, 2) + ' ' 
        + SubString(sched.nextRunTime, 1, 2) + ':' 
        + SubString(sched.nextRunTime, 3, 2) As 'nextRunDate'
      /* Note: the sysjobschedules table refreshes every 20 min, 
        so nextRunDate may be out of date */
    , 'Execute msdb.dbo.sp_update_job @job_id = ''' 
        + Cast(job.job_id As char(36)) + ''', @enabled = 0;' As 'disableScript'
    , art.avgRunTime As 'avgRunTime_inSec' -- in seconds
    , (art.avgRunTime / 60) As 'avgRunTime_inMin' -- convert to minutes
From msdb.dbo.sysjobs As job
Join myCTE As sched
    On job.job_id = sched.job_id
Left Join @avgRunTime As art
    On job.job_id = art.job_id
Where job.enabled = 1 -- do not display disabled jobs
Order By nextRunDate;

If this doesn’t do exactly what you want, check out SQLJobVis, which Ken recommended. It’s a free tool that helps you visualize the run history of your SQL Agent jobs.

Disposable Indexes

Today I had to run an ad hoc query on an 8.5 billion row table. The table had a dozen columns of a variety of data types and was clustered on a bigint identity. There were no other indexes on the table. My query involved a join to a smaller table with a date range restriction. Without an adequate index to use, SQL Server was going to be forced to scan this 8.5 billion row table. Now, I don’t have much patience for long-running queries. I want to run the ad hoc, e-mail the results, and forget about it. But short of adding a nonclustered index, which would take a very long time to build and probably require additional space requisitioned from the SAN team, what could I do? Enter disposable indexes. Now, you might be asking yourself, “What the frilly heck does she mean by a disposable index? Is that new in Denali?” No, dear reader. I am actually referring to filtered indexes, which are available in SQL Server 2008 and 2008 R2. I call them “disposable” because I create them to significantly speed up ad hoc queries, then I drop them when I’m done.

Here, allow me to demonstrate using the AdventureWorks2008R2 database. Although the tables are smaller, this query is very similar in structure to what I needed to run today.

Select Count(Distinct sod.SalesOrderID) As 'distinctCount'
From AdventureWorks2008R2.Sales.SalesOrderDetail As sod
Join AdventureWorks2008R2.Production.Product As p
    On sod.ProductID = p.ProductID
Where sod.ModifiedDate Between '2008-01-01' And '2008-07-31'
    And p.MakeFlag = 0;

Now, let’s take a look at the type of indexes we currently have available:

Select name, has_filter, filter_definition
From sys.indexes
Where object_id = object_id('Sales.SalesOrderDetail');
name                                                    has_filter filter_definition
-----------------------------------------------------------------------------------------
PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID     0          NULL
AK_SalesOrderDetail_rowguid                             0          NULL
IX_SalesOrderDetail_ProductID                           0          NULL
 
(3 row(s) affected)

We need an index on ModifiedDate and ProductID, which it doesn’t look like we have currently. Without this, we’re going to end up scanning on the clustered index. That means SQL Server will have to evaluate each and every single row in the table to see if the row matches the criteria of our query. Not pretty, and certainly not fast. So instead, let’s create a filtered index on date. But we can greatly speed up the time it takes to create our filtered index by doing a little investigating upfront and finding a range of clustering key values that will cover the query. Doing this allows SQL Server to seek on the clustered index, greatly reducing the amount of reads necessary to create our filtered index. So let’s see this in action. First, let’s find out the current max value of the table:

Select Max(SalesOrderDetailID) As 'maxID'
From AdventureWorks2008R2.Sales.SalesOrderDetail;
maxID
-----------
121317

Now we get to do a little guessing. Let’s go back and see what date we get if we look at half of the records:

Select SalesOrderDetailID, ModifiedDate
From AdventureWorks2008R2.Sales.SalesOrderDetail
Where SalesOrderDetailID = (121317/2);
SalesOrderDetailID ModifiedDate
------------------ -----------------------
60658              2007-11-01 00:00:00.000

Okay, SalesOrderDetailID 60658 gets us back to 11/1/2007. That’s a little too far. Let’s see how a SalesOrderDetailID value of 75000 does…

Select SalesOrderDetailID, ModifiedDate
From AdventureWorks2008R2.Sales.SalesOrderDetail
Where SalesOrderDetailID = 75000;
SalesOrderDetailID ModifiedDate
------------------ -----------------------
75000              2007-12-27 00:00:00.000

Okay, SalesOrderDetailID 75000 takes us back to 12/27/2007. That’s close enough to 1/1/2008 for my purposes. Of course, depending on the size of the table, in real life it may make sense to try to get closer to the value you’re looking for. But for now, this will do. And because we’re looking for data through the “current date” (7/31/2008 in the AdventureWorks2008R2 database), we already know our outer limit is 121317.

Now let’s take these ranges and create a filtered index that will cover our query:

Create Nonclustered Index IX_SalesOrderDetail_filtered
    On Sales.SalesOrderDetail(ModifiedDate, ProductID)
    Include (SalesOrderID)
    Where SalesOrderDetailID >= 75000
      And SalesOrderDetailID <  121317;

By having this range identified, SQL Server can perform a seek on the clustered index to create the nonclustered index on just the subset of records that you need for your query. Remember that 8.5 billion row table I mentioned earlier? I was able to create a filtered index that covered my query in 10 seconds. Yes, that’s right… 10 SECONDS.

The last thing we need to do is include the filtered index’s predicate in our ad hoc query, so the optimizer knows the filtered index covers the rows we’re after. It also doesn’t hurt to explicitly tell SQL Server to use the filtered index if you’re absolutely sure it’s the best index for the job.

Select Count(Distinct sod.SalesOrderID) As 'distinctCount'
From AdventureWorks2008R2.Sales.SalesOrderDetail As sod With (Index(IX_SalesOrderDetail_filtered))
Join AdventureWorks2008R2.Production.Product As p
    On sod.ProductID = p.ProductID
Where sod.ModifiedDate Between '2008-01-01' And '2008-07-31'
    And p.MakeFlag = 0
    And sod.SalesOrderDetailID >= 75000
    And sod.SalesOrderDetailID <  121317;
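
And because the index really is disposable, once the results are e-mailed off, cleanup is a single statement:

/* Drop the disposable index now that the ad hoc request is done */
Drop Index IX_SalesOrderDetail_filtered
    On Sales.SalesOrderDetail;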

That’s all there is to it. Using this method, I was able to complete my ad hoc request in 40 seconds: 10 seconds to create the filtered index and 30 seconds to actually execute the ad hoc query. Of course, it also took a couple of minutes to write the query, look at existing indexes, and search for the correct identity values. All in all, from the time I received the request to the time I sent the e-mail was about 5 minutes. All because of disposable filtered indexes. How’s that for some SQL #awesomesauce? :)

Yet Another PASS Summit Recap & Thoughts on PDW

The SQL blogosphere has been lit up with PASS Summit recaps.

I debated about whether or not to write my own post, until I remembered that this blog serves as a mini-journal for me too. I have a notoriously poor memory–my husband likes to say that my CPU and memory are good, but I must have an unusual clustering strategy–so maybe this blog post will be a good pointer for me when I start prepping for next year’s Summit. ;)

This was definitely the best PASS Summit conference ever. While there will always be opportunities to do things better–improvement is a never-ending process–it was clear that the organizers of this event listened to the feedback they had received the previous year. One of the best changes? Backpacks. These were very useful, as evidenced by their presence everywhere. Nice job, organizers!

My absolute favorite thing about Summit is the chance to meet and reconnect with so many amazing SQL folks. There were entirely too many people to list out, but some highlights include meeting Crys Manson, Jorge Segarra, and Karen Lopez for the first time. I also had a chance encounter with Ola Hallengren in the Sheraton elevator. Apparently we were only staying a few rooms apart this year. We ended up having a couple of really great discussions about index fragmentation, the differences between our scripts, and things we’d like to see changed in future releases of SQL Server.

I had the opportunity to sit on the panel at the WIT luncheon. All of the women on the panel were amazing, and I was honored just to be sitting at the same table as them. I was especially pleased to meet Nora Denzel, a Senior Vice President at Intuit. Intelligent, confident, and witty, she is a great role model for young technical women, myself included. I can only hope that some of her gumption rubbed off on me due to our close proximity. :) After the event, I was pleasantly surprised by how many folks–men and women both–came up to me to tell me how much they enjoyed it. Thanks to the WIT VC for organizing another great event!

The lightning talk sessions were a new feature this year, and I think I like them. The format is 7 speakers, each presenting on a topic for 5 quick minutes. Watching these sessions is kind of like skipping right to the center of a Tootsie Pop: all content and no fluff. The standout lightning talk presentation for me was Adam Machanic’s. It was beautifully rehearsed and choreographed. Nice job, Adam!

Another of the many highlights of the week was meeting the Microsoft execs. In addition to meeting Ted Kummert, Mark Souza, and Donald Farmer–all very nice gentlemen–I had the opportunity to speak at length with Jose Blakely about Parallel Data Warehouse (PDW). PDW, formerly codenamed Madison, was officially launched at Summit. Jose was kind enough to explain the PDW architecture, both where it came from and the vision for where it’s going. I’d attempt to regurgitate it here, but I think the probability of me misquoting would be high.

Suffice it to say, this technology has me excited. Why? Quite frankly, I think PDW will do for data warehousing what SQL Server did for databases, and what Analysis Services did for BI: make it affordable. With a compelling cost-per-terabyte, an attractive scale-out approach, and an entry point at under $1 million, we’ll see more small-to-midsized companies implementing data warehousing and business intelligence. This is good news for those of us looking for an affordable data warehouse solution and for those of us who make our living with SQL Server. And for those of you who might suggest that few companies need a data warehouse that can support multiple terabytes of data, I’d like to point out that just 3 or 4 years ago, 100 GB was considered a lot of data.

I spent most of my week digging into the PDW architecture. It’s not all roses–it’s a first release and, as such, is immature compared to the much older and more established data warehouse systems–but again, it has a lot going for it, not least of all its easy integration within a SQL Server environment and the relatively low cost. We’re currently investigating this as a possible data warehouse solution for our business intelligence environment, so expect to see more from me about PDW as I learn more about it.

Not attending PASS Summit? Watch LIVE streaming events FOR FREE!

If you’ve not yet heard, the annual PASS Summit is less than 2 weeks away. This is the largest SQL Server and Business Intelligence conference _in the world_, sponsored by Microsoft and Dell. The return on investment of attending this conference is pretty huge, and I highly recommend you attend if you can swing it.

I am once more fortunate to be attending and presenting at the Summit. Here’s where you can find me speaking throughout the week:

Tuesday at 3PM
Lightning Talk – Page Compression
This year, PASS has decided to try something new. A daily Lightning Talk session will be held where speakers present for 5 quick minutes on interesting SQL topics. I’ll be presenting on Tuesday with 6 amazingly talented speakers. My topic is page compression — what it is, how to do it, and (most importantly, of course) how it affects performance.

Wednesday at 11:30am in the ballroom
Women-In-Technology (WIT) Luncheon
I’ll be speaking on this year’s WIT luncheon panel, which is sponsored by GoDaddy.com. Contrary to common misconception, the luncheon is NOT just for women. In fact, men are encouraged to attend! If memory serves, last year’s luncheon had about 300 attendees, with a good mix of both genders. This year’s topic is focused on the recruitment, retention, and advancement of WIT. If you’re worried that this event will end up being a feminist bitch-fest, rest assured that’s most definitely not the case. I’ve always found the WIT events I’ve attended to be informative and thought-provoking. Plus, free lunch! :)

Thursday at 2:30PM (room 3AB)
Heaps of Trouble, Clusters of Glory – A Look At Index Internals
You can click the link above to read my abstract, but in short, I’ll be taking attendees on a journey through indexes. You’ll come away with a much better understanding of the internal structures of indexes, which should help DBAs with database design and performance tuning.

If you’re not able to attend in person, Summit does sell DVDs of the event afterwards, which are well worth the investment. But this year, to make the event more accessible to the community, PASS and Dell have teamed up to present live streaming of the keynotes and WIT luncheon sessions.

Here’s details of the keynotes from the PASS press release:

Ted Kummert, Senior Vice President of the Business Platform Division at Microsoft Corp., will kick off PASS Summit on November 9 by highlighting the continued innovation across Microsoft’s business and information platform. Kummert will explore Microsoft’s key technical investments, as well as Mission Critical applications and the accessibility of Business Intelligence.

Quentin Clark, General Manager of Database Systems Group at Microsoft Corp., will showcase the next version of SQL Server on November 10 and will share how features in this upcoming product milestone continue to deliver on Microsoft’s Information Platform vision. Clark will also demonstrate how developers can leverage new industry-leading tools with powerful features for data developers and a unified database development experience.

David DeWitt, Technical Fellow, Data and Storage Platform Division at Microsoft Corp., will be discussing SQL query optimization and address why it is difficult to always execute good plans in his highly anticipated technical keynote. DeWitt will also cover new technologies that offer the promise of better plans in future releases of SQL Server.

While all of the keynotes are interesting and definitely worth watching, I cannot recommend the David DeWitt keynote highly enough. His keynote last Summit was outstanding. It was technical, thought-provoking, and one of the best things of last year’s Summit.

You can find more information and register for the PASS Summit 2010 live streaming events at their website, www.sqlpass.org/LiveKeynotes

If you ARE attending Summit, make sure to swing by and say “hi” or message me via Twitter to see if there’s a time we can meet up. Meeting people is one of my favorite things about Summit. :)

Metadata for Table Valued Parameters

Table-valued parameters (TVPs) are a great feature, new in SQL Server 2008, that allows you to pass an entire dataset into a procedure or function. Previously, the most common way of doing this was by passing and parsing XML. As I’ve previously posted, TVPs perform an astounding 94% faster than singleton inserts and 75% faster than XML inserts. But for some reason, TVPs still aren’t widely used and understood. In this post, I’ll walk you through how to use them and how to query the metadata for TVPs.

I’ve previously posted about what TVPs are and how to use them. But in honor of Halloween this week, I’ve updated my demo script:

/* Create some tables to work with */
CREATE TABLE dbo.orders
(
      order_id      INT IDENTITY(1,1)   Not Null
    , orderDate     DATE                Not Null
    , customer      VARCHAR(20)         Not Null
 
    CONSTRAINT PK_orders
        PRIMARY KEY CLUSTERED(order_id)
);
 
CREATE TABLE dbo.orderDetails
(
      orderDetail_id    INT IDENTITY(1,1)   Not Null
    , order_id          INT                 Not Null
    , lineItem          INT                 Not Null
    , product           VARCHAR(20)         Not Null
 
    CONSTRAINT PK_orderDetails
        PRIMARY KEY CLUSTERED(orderDetail_id)
 
    CONSTRAINT FK_orderDetails_orderID
        FOREIGN KEY(order_id)
        REFERENCES dbo.orders(order_id)
);
 
 
/* Create our new table types */
CREATE TYPE dbo.orderTable AS TABLE 
( 
      orderDate     DATE
    , customer      VARCHAR(20)
);
GO
 
CREATE TYPE dbo.orderDetailTable AS TABLE 
( 
      lineItem      INT
    , product       VARCHAR(20)
);
GO
 
 
/* Create a new procedure using a table-valued parameter */
CREATE PROCEDURE dbo.insert_orderTVP_sp
      @myOrderTable orderTable READONLY
    , @myOrderDetailTable orderDetailTable READONLY
AS
BEGIN
 
    SET NOCOUNT ON;
 
    DECLARE @myOrderID INT;
 
    INSERT INTO dbo.orders
    SELECT orderDate
        , customer
    FROM @myOrderTable;
 
    SET @myOrderID = SCOPE_IDENTITY();
 
    INSERT INTO dbo.orderDetails
    SELECT @myOrderID
        , lineItem
        , product
    FROM @myOrderDetailTable;
 
    SET NOCOUNT OFF;
 
END
GO
 
 
/* Call our new proc! */
DECLARE @myTableHeaderData AS orderTable
    , @myTableDetailData AS orderDetailTable;
 
INSERT INTO @myTableHeaderData
(orderDate, customer)
VALUES (GETDATE(), 'Zombie');
 
INSERT INTO @myTableDetailData
(lineItem, product)
SELECT 10, 'Brains' UNION ALL
SELECT 20, 'More Brains';
 
EXECUTE dbo.insert_orderTVP_sp 
      @myTableHeaderData
    , @myTableDetailData;
 
DELETE FROM @myTableHeaderData;
DELETE FROM @myTableDetailData;
 
INSERT INTO @myTableHeaderData
(orderDate, customer)
VALUES (GETDATE(), 'Vampire');
 
INSERT INTO @myTableDetailData
(lineItem, product)
SELECT 10, 'Blood Type O+' UNION ALL
SELECT 20, 'Blood Type B-' UNION ALL
SELECT 30, 'Blood Type AB+' UNION ALL
SELECT 40, 'Blood Type A+';
 
EXECUTE dbo.insert_orderTVP_sp 
      @myTableHeaderData
    , @myTableDetailData;
 
 
/* Check our data */
SELECT * FROM dbo.orders;
SELECT * FROM dbo.orderDetails;

Once you’ve run this, you should see the following data:

order_id    orderDate  customer
----------- ---------- --------------------
1           2010-10-28 Zombie
2           2010-10-28 Vampire
 
(2 row(s) affected)
 
orderDetail_id order_id    lineItem    product
-------------- ----------- ----------- --------------------
1              1           10          Brains
2              1           20          More Brains
3              2           10          Blood Type O+
4              2           20          Blood Type B-
5              2           30          Blood Type AB+
6              2           40          Blood Type A+
 
(6 row(s) affected)

Now that we’ve successfully created a couple of table types to support our TVPs, how do we go back and find out which objects we’ve created? You can query the sys.types catalog view to find out. Just search for system_type_id 243, which identifies the record as a table type.

/* Let's check out our new data types */
SELECT name
    , system_type_id
    , is_table_type
FROM sys.types
WHERE system_type_id = 243;
GO
name                 system_type_id is_table_type
-------------------- -------------- -------------
orderTable           243            1
orderDetailTable     243            1
 
(2 row(s) affected)

Even better yet, you can use the sys.table_types catalog view. This gives us the same information as sys.types but also gives us the type_table_object_id, which we’ll need shortly.

SELECT name
    , system_type_id
    , is_table_type
    , type_table_object_id
FROM sys.table_types;
name                 system_type_id is_table_type type_table_object_id
-------------------- -------------- ------------- --------------------
orderTable           243            1             917578307
orderDetailTable     243            1             933578364
 
(2 row(s) affected)

What if you need to look up the table type definition? You can do this using the type_table_object_id and joining to sys.columns.

SELECT tt.name AS 'table_type_name'
    , c.name AS 'column_name'
    , t.name AS 'data_type'
FROM sys.table_types AS tt
JOIN sys.columns AS c
    ON tt.type_table_object_id = c.object_id
JOIN sys.types As t
    ON c.system_type_id = t.system_type_id;
table_type_name      column_name     data_type
-------------------- --------------- ---------------
orderTable           orderDate       date
orderDetailTable     lineItem        int
orderTable           customer        varchar
orderDetailTable     product         varchar
 
(4 row(s) affected)

And last, but certainly not least, how do we see if any procs are currently using the table types? SQL Server 2008 makes this easy for us with the sys.dm_sql_referencing_entities dynamic management function.

SELECT referencing_schema_name, referencing_entity_name, referencing_id
FROM sys.dm_sql_referencing_entities ('dbo.orderTable', 'TYPE');
referencing_schema_name referencing_entity_name referencing_id
----------------------- ----------------------- --------------
dbo                     insert_orderTVP_sp      949578421
 
(1 row(s) affected)

If you’re wondering how to implement SQL Server TVPs in your .NET code, well… I can’t tell you how to do it, but I can point you to a place that can. Stephen Forte has a post that explains how easy it is to do.

So now that you have a better understanding of how to work with TVPs, why don’t you go implement one in your environment and see for yourself just how awesome it is? :)

Oh, and Happy Halloween!

Rename Database Objects En Masse

Ever need to rename all objects in a database? Yeah, okay… it doesn’t happen very often, but when it does, it can be time consuming. This recently came up as something I needed to do. When you consider all the tables involved and you add in defaults, indexes, and foreign keys, well… you can imagine how the number of objects adds up quickly. After doing a few tedious renames, it occurred to me that I could write a script to generate the rename scripts for me. Work smarter, not harder, right? For anyone interested, here’s the script.

Select name
    , [object_id] 
    , Case 
        When [type] = 'U' Then 'Execute sp_rename N''' + name + ''', N''old_' + name + ''''
        When [type] IN ('D', 'PK', 'F') Then 
            'Execute sp_rename N''' + name + ''', N''old_' + name + ''', N''OBJECT'''
        End As 'renameScript'
    , Case When parent_object_id > 0 Then 0 Else 1 End As 'sortOrder'
From sys.objects
Where [type] In ('D', 'PK', 'U', 'F')
Union ALL
Select i.name
    , o.[object_id]
    , 'Execute sp_rename N''' + o.name + '.' + i.name + ''', N''old_' + i.name + ''', N''INDEX''' As 'renameScript'
    , 0 As 'sortOrder'
From sys.indexes As i
JOIN sys.objects As o
    On i.object_id = o.object_id
Where i.is_primary_key = 0 -- exclude PKs, we take care of that above
    AND i.type <> 0 -- exclude heaps
    AND o.type Not In ('S', 'IT') -- exclude system & internal tables
Order By sortOrder;
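
To give you an idea of what it produces, for a hypothetical table named widget with a default constraint and a nonclustered index, the generated statements would look something like this (the table rename sorts last):

Execute sp_rename N'DF_widget_createDate', N'old_DF_widget_createDate', N'OBJECT'
Execute sp_rename N'widget.IX_widget_name', N'old_IX_widget_name', N'INDEX'
Execute sp_rename N'widget', N'old_widget'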

Be forewarned that I only tested this on a couple of databases, but it seemed to run without problem for tables, indexes, primary keys, defaults, and foreign keys. The sortOrder column is there only to ensure that table renames are performed last. Otherwise, your index renames would fail. This will only run on SQL Server 2005 or 2008. If you have any problems with the script, please let me know. :)