The End of an Era – SQLFool scripts are now open-sourced

Hello, dear readers. It’s been a while.

For those who haven’t yet heard the news, I have switched to the dark side. I’m still kicking ass at GoDaddy and I’m still technical, but I now lead a team of big data engineers. As you can imagine, this means that I spend the majority of my time with Hadoop and its related ecosystem (Pig, Hive, Spark, etc.). This is very different from the performance-focused SQL Server role I had when I first created this blog. For those of you who make your living primarily or solely with SQL Server, please rest assured… SQL Server is still awesome, is still relevant, and still has a place in most enterprise data platform strategies. In fact, I’m presently working on egress automation of our Hadoop data warehouse to SQL Server for serving BI reports. Unfortunately, my change in technology focus means that my SQL Server blogging has largely come to an end, and with it, new posts on this blog. But while this blog may end, my blogging activities do not. For those interested, I have launched a new blog that chronicles my adventures in a non-Microsoft world with Hadoop, open source, and development on a Mac. 🙂

But back to the reason for this post. Recently, I saw a trackback comment from SQLBlimp that listed my index defrag script as #3 on its SQL Server Helpfulness Hall of Fame. This filled me with mixed emotions; while it made me happy that the script has been so helpful to so many people, I couldn’t help but feel sad because I knew I would no longer be keeping it updated. I have spent countless hours on this script, which has won numerous awards and has been mentioned in blogs, presentations, online publications, & books. It’s also still extremely popular; I stopped paying attention to its metrics when it broke 100,000 downloads 3 years ago. Similarly, I’ve been surprised to observe that this blog, which broke 1 million visits in April 2014, now receives more web traffic today than it ever did at the height of my blogging activities.

For this reason, I have made two decisions. First, I will keep this blog online so that people can continue to stumble across blog posts that, while a couple of years old, are still largely relevant.

Secondly, I have decided to open source all of my SQL Server procedures & scripts on GitHub. I haven’t yet had a chance to upload all of my scripts, but I started with the most popular ones (based on traffic stats). These scripts are available in a single repository to make downloads easy. And, if you are so inclined, feel free to contribute back to the repo. For the index defrag script in particular, I have some automated regression tests that I can run against any pull requests to aid in merge approvals.

One final comment. This may be the end of my SQLFool era, but I hope to maintain the friendships I’ve made along this incredible journey. I wholeheartedly believe that the SQL Server community is the best damn technical community in the world. Thank you ALL for your camaraderie, friendship, help, support, and awesomeness. Keep kickin’ ass, my friends. 🙂



SqlDev PowerShell 102: Working with Data and Variables

In my previous post, SqlDev PowerShell 101: Getting Started with PowerShell, I walked through how to configure and use PowerShell to run basic commands. This post will cover working with data and variables in PowerShell. Specifically, I’m going to demonstrate how to connect to a database (i.e. SQL Server) using Invoke-Sqlcmd, retrieve data, store the data in a variable, and work with that variable.

Note: Invoke-Sqlcmd does require the use of the sqlps module. If you don’t have it installed yet, check here for more information.

From a new PowerShell window, write the following command:

Invoke-Sqlcmd -Query "SELECT DB_NAME();"


Tip: You can auto-complete “Invoke-Sqlcmd” after typing “Invoke-S” and hitting TAB on your keyboard.

What’s happening here? Well, because no database instance was specified, the cmdlet is connecting to the default SQL Server instance on the local computer and running the specified query. “Column1” is returned just as it would be in SQL Server Management Studio (SSMS), because no column name was specified. You can see this by running:

Invoke-Sqlcmd -Query "SELECT DB_NAME() AS MyDatabaseName;"


But what if you don’t have a local database instance, or you want to connect to another database? No problem. Just add the -ServerInstance parameter:

Invoke-Sqlcmd -Query "SELECT DB_NAME() AS MyDatabaseName;" -ServerInstance "."

“.” refers to the local, default SQL Server instance. If you want to connect to a named or remote instance, just replace this value with your server name.

Tip: Hit the “Up” arrow to retrieve the previous results, then just edit the command.


So that’s all pretty easy so far, but honestly, if all I needed to do was run a simple query, I’d do it in SSMS. No, the real power of PowerShell is the .NET Framework. So let’s see this in action by assigning our results to a variable. A variable is defined using the dollar sign ($) and a name; defining a type is optional. Let’s do that now.

$DbName = Invoke-Sqlcmd -Query "SELECT DB_NAME() AS MyDatabaseName;" -ServerInstance "."


In order to view the value stored in the variable, just append a new command to the previous line using a semicolon (;) and the variable’s name.

$DbName = Invoke-Sqlcmd -Query "SELECT DB_NAME() AS MyDatabaseName;" -ServerInstance "."; $DbName


Optionally, you could also type the variable’s name and press Enter.


$DbName

Now let’s run a query that returns multiple columns:

$DbDetails = Invoke-Sqlcmd -Query "SELECT database_id, name, state_desc, recovery_model_desc FROM sys.databases WHERE name = 'master';" -ServerInstance "."; $DbDetails


We didn’t define a data type, so let’s see what data type PowerShell assigned for us. To do this, we’ll use the GetType method:


$DbDetails.GetType()

You can see that the variable is defined as a system object; specifically, it’s a data row. That’s important, because it determines how we’ll interact with the variable. For instance, what if we wanted to reference the value stored in the variable, not the variable as an object? You can do this by referencing the column stored in the object directly, i.e.


$DbDetails.name
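The same object-versus-value distinction exists in most data-access APIs. As a quick aside (a Python sqlite3 sketch, purely illustrative and not part of the original post), a query result is a row object whose type you can inspect, and you pull a single value out by naming its column:

```python
import sqlite3

# An in-memory database stands in for the SQL Server instance here.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows come back as Row objects, much like DataRow
row = conn.execute("SELECT 1 AS database_id, 'master' AS name").fetchone()

print(type(row).__name__)  # inspect the object's type, like calling GetType()
print(row["name"])         # reference the column directly to get the value itself
```

The point is the same in both languages: the variable holds a row *object*, and you reach into it by column name when you want a scalar value.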

Now let’s run the same query, but remove the database criteria from the WHERE statement:

$DbDetailsAll = Invoke-Sqlcmd -Query "SELECT database_id, name, state_desc, recovery_model_desc FROM sys.databases;" -ServerInstance "."; $DbDetailsAll


Your results will vary, of course, but the important thing is that you should have more than 1 row returned.

Now let’s return just the database names again:


$DbDetailsAll.name

Hopefully you’re starting to see how retrieving an array of values and accessing just a single attribute could be useful.

Let’s now iterate through our list of databases and perform an action. In this example, we’ll just iterate through each database and retrieve the count of tables. To do this, we’ll need to use a Foreach loop.

Foreach ($DatabaseRow In $DbDetailsAll) {$TableCount = Invoke-Sqlcmd -Query "SELECT COUNT(*) AS TableCount FROM sys.tables;" -ServerInstance "." -Database $DatabaseRow.name; $TableCount }


This is the biggest statement we’ve written yet, so let’s take a minute to break it down.

We’re starting the loop by calling Foreach, then we’re specifying how we’re iterating through the array. As each row in the array comes into focus, we’re storing the row in the $DatabaseRow variable that we’ve just defined. Everything specified inside the curly brackets ({}) is executing for each row in the array. We’re then connecting to the local instance and changing the context of the database to be the value currently stored in $DatabaseRow. The results are stored in a new variable, $TableCount, which is then outputted to the screen.

It’s actually a pretty short piece of code, given all that it’s doing. But it’s difficult to understand which TableCount maps to which table, so let’s add that information to our output:

Foreach ($DatabaseRow In $DbDetailsAll) {$TableCount = Invoke-Sqlcmd -Query "SELECT '$DatabaseRow.name' AS DatabaseName, COUNT(*) AS TableCount FROM sys.tables;" -ServerInstance "." -Database $DatabaseRow.name; $TableCount }


Woah, that’s not what we want! PowerShell is displaying the object type, not the value. In order to actually retrieve the value from an object variable, we’re going to store the database name in a string variable first, then reference that newly-defined variable.

Foreach ($DatabaseRow In $DbDetailsAll) {$DatabaseName = $DatabaseRow.name; $TableCount = Invoke-Sqlcmd -Query "SELECT '$DatabaseName' AS DatabaseName, COUNT(*) AS TableCount FROM sys.tables;" -ServerInstance "." -Database $DatabaseName; $TableCount }


Voila! Now we have a list of databases on a server and the associated count of tables in each database.

If you’re wondering why you’d ever actually use this PowerShell script, well… you probably wouldn’t. But it’s important to have these building blocks before going into more advanced concepts. In my next post, we’ll be creating a config table of SQL Server instances, connecting to each server, retrieving details about missing indexes, and writing those results out to disk. That should be a little more useful. 😉

Until then… happy scripting! 🙂

SqlDev PowerShell 101: Getting Started with PowerShell

This is the first in a new series focused on PowerShell for SQL developers. That means using PowerShell to create and/or modify table schema, stored procedures, and indexes. In other words, automating code development and database tuning processes. This is mostly what I’ve been using PowerShell for so far, and I have to say, I am LOVING it! It’s allowing me to save *days* of coding and validating.

Disclaimer: I am NOT a PowerShell expert, just a newly converted enthusiast who wants to save you from the headaches I myself have suffered while trying to learn PoSh. So I’ll mostly focus on the “how to’s” instead of the “why’s” in this series. 🙂

I’m going to start at the very beginning and assume you have never opened PowerShell before in your life. Basically, I’m going to assume you’re where I was just a few short months ago. So let us start by finding the application. 😉

If you’re using Windows 7 or later, the good news is that PowerShell comes pre-installed on the OS. We just need to configure it to allow the execution of scripts. Click on the Start icon –> “Search programs and files” –> type “PowerShell.”

Searching for PowerShell


The first time we open PowerShell, we will need to run it as Administrator for elevated permissions. To do this, right-click on the Windows PowerShell icon that should now be on your Taskbar, then click on “Run as Administrator.” While you’re there, also add it to your Taskbar and/or Start Menu for easy retrieval in the future.

Running PowerShell as Administrator


Now we need to change the Execution Policy from “Restricted” — which prevents the execution of PowerShell scripts — to something a little more accessible. For the purposes of this post, we’re going to set it to “RemoteSigned,” with the understanding that this may pose some security risks. You can read more about Execution Policy settings here: Using the Get-ExecutionPolicy Cmdlet.

Similarly, you can run this command inside of PowerShell to get more details:

Get-Help About_Signing

Now that we have PowerShell open in Administrator mode, execute this command:

Set-ExecutionPolicy RemoteSigned

You will be prompted to confirm the change. Type “Y”. You can then confirm that the change has taken place by typing:

Get-ExecutionPolicy

Changing Execution Policy

Alrighty, now we’re ready to write some PowerShell!

Go ahead and close your PowerShell window, then re-open as you normally would (not in Administrator mode; we don’t need elevated permissions any longer).

Let’s start with some simple commands. For this post, I have scripted all objects from AdventureWorks to a new directory called C:\MyCode\. We can take a look at the files in the directory using a cmdshell command that’s probably very familiar: Dir. Go ahead and run that command now.

Running Dir in PowerShell


So does that mean PowerShell supports all the old cmdshell commands that we know and love (to hate)? Well, yes and no. The PowerShell devs, forward-thinking folks that they are, knew that having support for cmdshell syntax would probably ease the transition for many people, so they created aliases. You can see this by running the Get-Alias cmdlet:

Get-Alias Dir

What we see is that, under the covers, Dir is nothing more than an alias for Get-ChildItem. So let’s switch to using Get-ChildItem from here on out.

One thing that took me a while to discover, but which is very helpful, is the auto-complete functionality inherent in PowerShell. To use this, just start typing a command, such as “Get-Ch”, and hit the Tab key. As you press Tab, PowerShell will iterate through the list of available cmdlets / switches that are available in the current context. Keep hitting Tab until you get to Get-ChildItem, then execute that command.



Hmm, those results look familiar. That’s a good sign. But what else can we do with Get-ChildItem? Let’s find out by using the Get-Help cmdlet:

Get-Help Get-ChildItem

Type “Get-ChildItem -” (make sure to include the hyphen, or you’ll be tabbing through way more than you’d like) and press Tab to iterate through the list of available switches. Stop when you get to the -Recurse switch. If you go too far, don’t worry… you can hit Shift+Tab to go in the reverse direction. The -Recurse switch allows Get-ChildItem to execute against child folders too, so let’s add it to our command.

Get-ChildItem -Recurse

So far, so good. Now what else can we do?

At this point, we need to discuss piping. Piping is a way to stack multiple cmdlets together to produce a single, final, perfect result. It’s one of the features that adds the “power” in PowerShell. You can read more about piping here: Piping and the Pipeline in Windows PowerShell.

So let’s build our first pipe. We know how to return all the files in a directory, but how do we *search* the directory for files that contain a specific keyword? It turns out it’s actually not that hard… we just need to use the Select-String cmdlet. To do this, press the Up arrow on your keyboard to retrieve the previously executed command, then type a pipe (|) before adding the Select-String cmdlet:

Get-ChildItem -Recurse | Select-String -Pattern "SalesOrderDetail"

So what’s this doing? Well, it’s iterating through each of the files listed in the directory and subdirectories, then calling the Select-String cmdlet to find files that contain the word “SalesOrderDetail.” It’s even showing us what line the pattern is found on. Pretty awesome, huh? But I think we can do better.

One of the best things about PowerShell is the sheer amount of documentation available. Case in point: TechNet has a quick reference of PowerShell 2.0 cmdlets. Looking at the list, some interesting cmdlets catch my eye, including Select-Object and Get-Unique. If we pipe these cmdlets to our existing command, we get something like this:

Get-ChildItem -Recurse | Select-String -Pattern "SalesOrderDetail" | Sort-Object | Select-Object Path | Get-Unique -AsString

This returns a very easy-to-consume list of files that contain the pattern “SalesOrderDetail.” Imagine using this to search through hundreds of files and thousands of lines of code to find dependencies. Powerful PowerShell indeed.
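To make the shape of that pipeline concrete outside of PowerShell, here is a rough Python equivalent (purely illustrative; the function name is my own invention, not something from the original post):

```python
from pathlib import Path

def files_containing(root: str, pattern: str) -> list[str]:
    """Recurse through root (Get-ChildItem -Recurse), keep files whose text
    contains pattern (Select-String), and return each matching path once
    (Select-Object Path | Get-Unique)."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and pattern in path.read_text(errors="ignore"):
            hits.append(str(path))
    return hits
```

Each stage of the PowerShell pipe maps onto one step of the loop, which is a good way to reason about what any pipeline is doing under the covers.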

So that’s it for this first post and introduction to PowerShell. If you’re looking for more resources, here are some PowerShell-related sites I’ve found especially helpful:

Happy Scripting! 🙂

Conversion Issues Upgrading to PowerShell 3.0

This post probably has a small audience, but there were a handful of conversion issues I ran into when I upgraded to PowerShell 3.0. It was difficult finding posts related to the errors I encountered, so I’m posting my experiences in the hopes that it helps someone else down the road.

The first issue I encountered was calling PowerShell from xp_cmdshell on a remote SQL Server.

In PowerShell 2.0, I used this syntax:

EXEC xp_cmdshell 'powershell.exe -Command "get-executionpolicy"'
# PoSh 2.0 Results: the execution policy is returned as expected
# PoSh 3.0 Results:
'powershell.exe' is not recognized as an internal or external command,
operable program or batch file.

To fix this issue, I had to fully qualify the PowerShell directory:

EXEC xp_cmdshell 'c:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -Command "get-executionpolicy"'
# PoSh 3.0 Results

Another issue I ran into was calling the Teradata .NET Data Provider. In PowerShell 2.0, I was calling the Invoke-SqlStmt function I previously posted to connect from a Windows server to a remote Teradata server.

PS X:\> .\Invoke-SqlStmt.ps1 -C TDATA -S MyTDServerName -Q "SELECT * FROM DBC.Tables;" -Verbose

Everything worked great. However, once I upgraded to PowerShell 3.0, I was getting this error:

Exception calling "GetFactory" with "1" argument(s): "Unable to find the requested .Net Framework Data Provider.  It may not be installed."
At line:1 char:1
+ $Factory = [System.Data.Common.DbProviderFactories]::GetFactory("Teradata.Client ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : ArgumentException

I couldn’t find anything that would explain this behavior, and I confirmed that everything still worked great when run from PowerShell 2.0. Ultimately, I decided to install the Teradata .NET Data Provider locally, which solved the problem. I never found anything that explained why a remote adapter worked previously but only a local adapter works now; I don’t fully understand the problem, but I was able to solve it.

Here’s the syntax to install the Teradata .NET Data Provider remotely using xp_cmdshell.

EXEC xp_cmdshell 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -Command "X:\drop\tdnetdp__windows_indep. /s /v"/qn"'

Note that this code is installing the 13.11 adapter, so you may need to change the filename if you’re installing a different version.

And to leave things on a positive note, one awesome improvement of PowerShell 3.0 over 2.0 is the redirection operators. I’m a big fan of logging and verbose messaging, but in 2.0, I had to put extra steps into the code to log these messages. In 3.0, I’ve been able to log my verbose output simply by appending “*> directory” to my commands, i.e.

EXEC xp_cmdshell 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -Command "X:\MyAwesomeScript.ps1 -Verbose *> X:\MyAwesomeLog.txt"'

I think that is just so cool. You can read more about the different redirection operators by running this in PowerShell:

get-help redirect

Happy Scripting! 🙂

Invoke-SqlStmt – PowerShell for SQL Server & Teradata Queries

I’ve been using PowerShell quite a bit the last few months, and I finally feel I’m to the point where I can start sharing some of what I’ve learned.

I started a blog post that walks through using PowerShell for database development and automation, and I quickly realized that a lot of what I want to show is dependent upon the first function I wrote, Invoke-SqlStmt. And by “wrote,” I mean “blatantly borrowed from Chad Miller’s Invoke-SqlCmd2 script” 😉 . Thank you, Chad! The main difference is that my function supports connecting to either Teradata or SQL Server, depending on the parameters passed.

One thing worth mentioning is that I spent a lot of time researching all of the different ways to connect to SQL Server. In the end, I went with ADO.NET, as a lot of what I will be doing is working with data sets (i.e. running a query against ServerA and writing the results to ServerB).


#   .NAME
#       Invoke-SqlStmt.ps1
#       Executes a SQL statement against either a Teradata or SQL Server source using ADO.NET
#   .INPUT
#       C  - Connection Type; Required; MSSQL or TDATA
#       S  - Server Name; Required; server name for any valid SQL Server or Teradata instance
#       D  - Database Name; Optional; database context for SQL Server connections
#       Q  - Query; Optional; either Q or F is required
#       F  - Filename; Optional; either Q or F is required
#       U  - Username; Optional; if not supplied, Integrated mode is assumed
#       P  - Password; Optional; if not supplied, Integrated mode is assumed
#       T  - Connection Timeout; Optional; defaults to 600s but can be overridden
#       AS - Optional; can specify whether DataSet, DataTable, or DataRow will be returned
#   .OUTPUT
#       System.Data.(DataSet|DataTable|DataRow)
#   .NOTES
#       Modified from Chad Miller's Invoke-SqlCmd2 script
#  --------------------------------------------------------------------------------------------
#   .EXAMPLES
#   Invoke-SqlStmt -C MSSQL -S "(local)" -Q "SELECT @@SERVERNAME;"
#     This example connects to a named instance of the Database Engine on a computer and runs a basic T-SQL query.
#   Invoke-SqlStmt -C MSSQL -S "(local)" -F "C:\myScript.sql" | Out-File -FilePath "C:\myOutput.txt"
#     This example reads a file containing T-SQL statements, runs the file, and writes the output to another file.
#   Invoke-SqlStmt -C MSSQL -S "(local)" -Q "PRINT 'hello world'" -Verbose
#     This example uses the PowerShell -Verbose parameter to return the message output of the PRINT command.
#   Invoke-SqlStmt -C TDATA -S "(local)" -Q "SELECT * FROM DBC.Tables;"
#     This example connects to Teradata and executes a query.
#  --------------------------------------------------------------------------------------------
#   AUTHOR:     Michelle Ufford
#   CREATED:    2012-12-12
#  --------------------------------------------------------------------------------------------
#  ChangeDate/Version   Author                  Description
#  --------------------------------------------------------------------------------------------
#  20121212.0           Michelle Ufford         Initial Creation
#  20121218.1           Michelle Ufford         Tweaks

function Invoke-SqlStmt
{
    param(
    [Parameter(Position=0, Mandatory=$false)] [ValidateSet("MSSQL", "TDATA")] [string]$C="MSSQL",
    [Parameter(Position=1, Mandatory=$true)]  [string]$S,
    [Parameter(Position=2, Mandatory=$false)] [string]$D,
    [Parameter(Position=3, Mandatory=$false)] [string]$Q,
    [Parameter(Position=4, Mandatory=$false)] [ValidateScript({Test-Path $_})] [string]$F,
    [Parameter(Position=5, Mandatory=$false)] [string]$U,
    [Parameter(Position=6, Mandatory=$false)] [string]$P,
    [Parameter(Position=8, Mandatory=$false)] [Int32]$T=600,
    [Parameter(Position=9, Mandatory=$false)] [ValidateSet("DataSet", "DataTable", "DataRow")] [string]$As="DataRow"
    )

    # Check for input files
    if ($F)
    {
        $filePath = $(Resolve-Path $F).Path
        $Q = [System.IO.File]::ReadAllText("$filePath")
    }

    # ----------------------
    # Teradata Connections
    # ----------------------
    if ($C -eq "TDATA")
    {
        $ConnectionString = "Data Source={0}; User Id={1}; Password={2}; Connection Pooling Timeout={3};" -f $S, $U, $P, $T

        Add-Type -AssemblyName System.Data
        $Factory = [System.Data.Common.DbProviderFactories]::GetFactory("Teradata.Client.Provider")
        $Connection = $Factory.CreateConnection()
        $Connection.ConnectionString = $ConnectionString
        $Command = $Connection.CreateCommand()
        $Command.CommandText = $Q
        $DataSet = New-Object System.Data.DataSet
        $DataAdapter = $Factory.CreateDataAdapter()
        $DataAdapter.SelectCommand = $Command
        $DataAdapter.SelectCommand.CommandTimeout = $T
        [void]$DataAdapter.Fill($DataSet)
        $Connection.Close()

        # Return the requested shape of the results
        switch ($As)
        {
            'DataSet'   { Write-Output ($DataSet) }
            'DataTable' { Write-Output ($DataSet.Tables) }
            'DataRow'   { Write-Output ($DataSet.Tables[0]) }
        }
    }
    # ------------------------
    # SQL Server Connections
    # ------------------------
    else
    {
        $Connection = New-Object System.Data.SqlClient.SQLConnection

        if ($U)
        { $ConnectionString = "Server={0};Database={1};User ID={2};Password={3};Trusted_Connection=False;Connect Timeout={4}" -f $S,$D,$U,$P,$T }
        else
        { $ConnectionString = "Server={0};Database={1};Integrated Security=True;Connect Timeout={2}" -f $S,$D,$T }

        $Connection.ConnectionString = $ConnectionString

        # The following event handler captures PRINT and RAISERROR output;
        # it is registered when the caller specifies the -Verbose parameter
        if ($PSBoundParameters.Verbose)
        {
            $Handler = [System.Data.SqlClient.SqlInfoMessageEventHandler] {Write-Verbose "$($_)"}
            $Connection.add_InfoMessage($Handler)
        }

        $Command = New-Object System.Data.SqlClient.SqlCommand
        $Command.Connection = $Connection
        $Command.CommandText = $Q
        $Command.CommandTimeout = $T
        $DataSet = New-Object System.Data.DataSet
        $DataAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
        $DataAdapter.SelectCommand = $Command
        [void]$DataAdapter.Fill($DataSet)
        $Connection.Close()

        switch ($As)
        {
            'DataSet'   { Write-Output ($DataSet) }
            'DataTable' { Write-Output ($DataSet.Tables) }
            'DataRow'   { Write-Output ($DataSet.Tables[0]) }
        }
    }
}

I plan to have a follow-up post up shortly that will walk you through how to use this script step-by-step.

Accelerating ETL Data Retrieval

As I’ve mentioned previously, SQL Server 2012 Integration Services Design Patterns is now available on Amazon. PASS has invited the ENTIRE author team to give a pre-conference training session at Summit 2012. Precons are all-day training events and are held on the days before Summit. Our precon will be held on Monday, November 5th. I’ve attended several precons at Summit, and in my opinion, they provide unparalleled depth and access to the presenters. I expect our precon will be no different.

So what will I be presenting on? Good question. I’m going to talk about Accelerating ETL Data Retrieval. Specifically, I’ll discuss strategies for retrieving data for full loads vs. incremental loads, and I’ll explore some strategies for retrieving data from large tables. To clarify “large,” since everyone has a different definition, I’ve successfully employed these strategies on tables with 10 billion rows.

Now you might be reading this and thinking, “Hey, that doesn’t sound like SSIS!” Well… yes and no. While I will be discussing some strategies within SSIS, most of what I discuss will take place at the database layer. In fact, most of my content could be useful for any ETL tool. I still think it’s a good topic for this precon for three reasons. First, most ETL developers I talk with — that is, people who specialize in and are primarily tasked with ETL — overlook the data retrieval step. I frequently hear, “I’m pulling all of the data, so it doesn’t matter; I can’t get the data any faster than SELECT * FROM.” That’s not always true, and I’ll explain why in the precon. Secondly, having a thorough understanding of data retrieval techniques is important. Data retrieval is one of the most — if not the single most — expensive components of ETL. Lastly, I think there’s a huge knowledge gap in this area. While there is a lot of discussion about query optimization, it rarely has to do with the type of query patterns that ETL requires.

So that’s what I’ll be talking about and why. What do you think? Worthwhile content?

Also, this is just one of the things that will be covered in our SSIS Design Patterns precon. The rest of the author team — Andy Leonard, Matt Masson, Tim Mitchell, and Jessica Moss — is putting together great content that includes package execution, parent/child relationships, expressions and scripting, dynamic configurations, error handling, data flow internals, scalability and parallelism, automation, dynamic package generation, data warehousing patterns, and more!

As an aside, I am honored to announce that I have been re-awarded as Microsoft MVP for SQL Server. My husband took this picture of my trophy for me, which I think turned out really cool. 🙂

Go Daddy Insiders

I don’t hide the fact that I work at Go Daddy. All discussions of advertising methods aside, it’s a great company to work for. Not only am I treated well as an employee, I also get to work in a world-class technical environment. However, the marketing campaigns tend to steal the spotlight. As a result, few people are aware of the technology it takes to be the #1 hosting provider in the world. Some examples of little-known facts about Go Daddy:

  • 10 billion DNS queries answered daily
  • Over 35,000 servers & 100,000 square feet of state-of-the-art global data centers
  • 25 petabytes — yes, petabytes! — of networked data storage

Pretty cool, huh? Go Daddy has launched a new blog called Inside Go Daddy as a way to share all the nitty gritty details of what it takes to support this kind of environment. Here’s a blurb from the site:

This is your inside source for what’s going on with Go Daddy’s tech experts. You’ll get insight and opinions from Go Daddy’s tech leaders on industry topics, company projects & open source initiatives … the leading edge, unconventional, “behind-the-scenes” information you won’t find anywhere else. It’s not PR, it’s not executive talk, it’s the story straight from Go Daddy’s developers, engineers & IT personnel.

Shockingly, I’ve signed up to blog about database scalability. 😉 I’ve just started a new series that explores the tuning and design changes required to support 27k transactions per second during the airing of Go Daddy’s Super Bowl commercials. Those who attended my Summit 2009 session might recognize some of the high-level content, but this series will explore the topics in depth and with never-before-revealed detail. My first article, Scaling the Database: Data Types, is now live.

If you find the content helpful or interesting, please share the article or leave a comment. My employer monitors blog traffic, and we have a bit of a contest going on to see what topics get the most hits. Quite frankly, it’d be cool if the SQL Server topics outperformed the NoSQL topics. 😉

Also, I’ll entertain topic requests, so if there’s something you’re just dying to know about what we do or how we do it, let me know. 🙂

SQL Agent Job Script

I’ve been working on a script for a demo in my upcoming 24 Hours of PASS presentation, which I thought I’d share on my blog. In the presentation, I use this script to demonstrate how to execute one or more queries dynamically against a list of SQL Server instances.

The script itself explores SQL Agent Job metadata to get job statuses — when the job last ran, when it will run again, an aggregate count of the number of successful and failed executions in the queried time period, T-SQL code to disable the job, etc. I’ve only tested this in a handful of DEV and PROD environments, so please tell me if you experience any issues in your environment when running the script. As with most of my scripts, this will only work on SQL Server 2005 and newer.

I was planning to send the SQL Server development team a case of #bacon if they finally fixed the sysjob% date and time columns in SQL Server 2012, but alas, it seems they shall remain pork-free. Credit for the logic that handles the time conversion goes to an awesome community member who posted it on the MSDN forums; I wish I had the URL, but it was a long time ago.
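To show what that borrowed conversion does, here's the same logic in isolation with hard-coded sample values — a sketch, since msdb packs run_date as an INT in yyyymmdd form and run_time as an INT in hhmmss form:

```sql
-- Turn the packed INT pair into a single DATETIME. 86399.9964 is the
-- seconds-per-day constant the original forum logic uses for the
-- fractional-day arithmetic.
DECLARE @run_date INT = 20120917
      , @run_time INT = 140001;

SELECT CONVERT(DATETIME, RTRIM(@run_date))        -- date portion
     + (  (@run_time / 10000 * 3600)              -- hours   -> seconds
        + ((@run_time % 10000) / 100 * 60)        -- minutes -> seconds
        +  (@run_time % 10000) % 100              -- seconds
       ) / 86399.9964 AS execution_start;         -- ≈ 2012-09-17 14:00:01
```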

DECLARE @jobHistory TABLE
(
      job_id                UNIQUEIDENTIFIER
    , success               INT
    , cancel                INT
    , fail                  INT
    , retry                 INT
    , last_execution_id     INT
    , last_duration         CHAR(8)
    , last_execution_start  DATETIME
);

WITH lastExecution
AS
(
    SELECT job_id
    , MAX(instance_id) AS last_instance_id
FROM msdb.dbo.sysjobhistory
WHERE step_id = 0
GROUP BY job_id
)
INSERT INTO @jobHistory
SELECT sjh.job_id
    , SUM(CASE WHEN sjh.run_status = 1 AND step_id = 0 THEN 1 ELSE 0 END) AS success
    , SUM(CASE WHEN sjh.run_status = 3 AND step_id = 0 THEN 1 ELSE 0 END) AS cancel
    , SUM(CASE WHEN sjh.run_status = 0 AND step_id = 0 THEN 1 ELSE 0 END) AS fail
    , SUM(CASE WHEN sjh.run_status = 2 THEN 1 ELSE 0 END) AS retry
    , MAX(CASE WHEN sjh.step_id = 0 THEN instance_id ELSE NULL END) last_execution_id
    , SUBSTRING(CAST(MAX(CASE WHEN le.job_id IS NOT NULL THEN sjh.run_duration ELSE NULL END) + 1000000 AS VARCHAR(7)),2,2) + ':' 
            + SUBSTRING(CAST(MAX(CASE WHEN le.job_id IS NOT NULL THEN sjh.run_duration ELSE NULL END) + 1000000 AS VARCHAR(7)),4,2) + ':' 
            + SUBSTRING(CAST(MAX(CASE WHEN le.job_id IS NOT NULL THEN sjh.run_duration ELSE NULL END) + 1000000 AS VARCHAR(7)),6,2)
            AS last_duration
    , MAX(CASE WHEN le.last_instance_id IS NOT NULL THEN 
        CONVERT(datetime, RTRIM(run_date)) 
        + ((run_time / 10000 *  3600) 
        + ((run_time % 10000) / 100 * 60) 
        + (run_time  % 10000) % 100) / (86399.9964)
      ELSE '1900-01-01' END) AS last_execution_start 
FROM msdb.dbo.sysjobhistory AS sjh
LEFT JOIN lastExecution     AS le
    ON sjh.job_id = le.job_id
   AND sjh.instance_id = le.last_instance_id
GROUP BY sjh.job_id;
/* We need to parse the schedule into something we can understand */
DECLARE @weekDay TABLE
(
      mask          INT
    , maskValue     VARCHAR(32)
);

INSERT INTO @weekDay
SELECT 1, 'Sunday'      UNION ALL
SELECT 2, 'Monday'      UNION ALL
SELECT 4, 'Tuesday'     UNION ALL
SELECT 8, 'Wednesday'   UNION ALL
SELECT 16, 'Thursday'   UNION ALL
SELECT 32, 'Friday'     UNION ALL
SELECT 64, 'Saturday';
/* Now let's get our schedule information */
;WITH myCTE
AS
(
    SELECT AS 'scheduleName'
        , sched.schedule_id
        , jobsched.job_id
        , CASE 
            WHEN sched.freq_type = 1 
                THEN 'Once' 
            WHEN sched.freq_type = 4 
                AND sched.freq_interval = 1 
                    THEN 'Daily'
            WHEN sched.freq_type = 4 
                THEN 'Every ' + CAST(sched.freq_interval AS VARCHAR(5)) + ' days'
            WHEN sched.freq_type = 8 THEN 
                REPLACE( REPLACE( REPLACE(( 
                    SELECT maskValue 
                    FROM @weekDay AS x 
                    WHERE sched.freq_interval & x.mask <> 0 
                    ORDER BY mask FOR XML RAW)
                , '"/><row maskValue="', ', '), '<row maskValue="', ''), '"/>', '') 
                + CASE 
                    WHEN sched.freq_recurrence_factor <> 0 
                        AND sched.freq_recurrence_factor = 1 
                            THEN '; weekly' 
                    WHEN sched.freq_recurrence_factor <> 0 
                        THEN '; every ' 
                + CAST(sched.freq_recurrence_factor AS VARCHAR(10)) + ' weeks' END
            WHEN sched.freq_type = 16 THEN 'On day ' 
                + CAST(sched.freq_interval AS VARCHAR(10)) + ' of every '
                + CAST(sched.freq_recurrence_factor AS VARCHAR(10)) + ' months' 
            WHEN sched.freq_type = 32 THEN 
                CASE 
                    WHEN sched.freq_relative_interval = 1 THEN 'First'
                    WHEN sched.freq_relative_interval = 2 THEN 'Second'
                    WHEN sched.freq_relative_interval = 4 THEN 'Third'
                    WHEN sched.freq_relative_interval = 8 THEN 'Fourth'
                    WHEN sched.freq_relative_interval = 16 THEN 'Last'
                END + 
                CASE 
                    WHEN sched.freq_interval = 1 THEN ' Sunday'
                    WHEN sched.freq_interval = 2 THEN ' Monday'
                    WHEN sched.freq_interval = 3 THEN ' Tuesday'
                    WHEN sched.freq_interval = 4 THEN ' Wednesday'
                    WHEN sched.freq_interval = 5 THEN ' Thursday'
                    WHEN sched.freq_interval = 6 THEN ' Friday'
                    WHEN sched.freq_interval = 7 THEN ' Saturday'
                    WHEN sched.freq_interval = 8 THEN ' Day'
                    WHEN sched.freq_interval = 9 THEN ' Weekday'
                    WHEN sched.freq_interval = 10 THEN ' Weekend'
                END
                + CASE 
                    WHEN sched.freq_recurrence_factor <> 0 
                        AND sched.freq_recurrence_factor = 1 
                            THEN '; monthly'
                    WHEN sched.freq_recurrence_factor <> 0 
                        THEN '; every ' 
                + CAST(sched.freq_recurrence_factor AS VARCHAR(10)) + ' months' 
                END
            WHEN sched.freq_type = 64   THEN 'StartUp'
            WHEN sched.freq_type = 128  THEN 'Idle'
          END AS 'frequency'
        , ISNULL('Every ' + CAST(sched.freq_subday_interval AS VARCHAR(10)) + 
            CASE 
                WHEN sched.freq_subday_type = 2 THEN ' seconds'
                WHEN sched.freq_subday_type = 4 THEN ' minutes'
                WHEN sched.freq_subday_type = 8 THEN ' hours'
            END, 'Once') AS 'subFrequency'
        , REPLICATE('0', 6 - LEN(sched.active_start_time)) 
            + CAST(sched.active_start_time AS VARCHAR(6)) AS 'startTime'
        , REPLICATE('0', 6 - LEN(sched.active_end_time)) 
            + CAST(sched.active_end_time AS VARCHAR(6)) AS 'endTime'
        , REPLICATE('0', 6 - LEN(jobsched.next_run_time)) 
            + CAST(jobsched.next_run_time AS VARCHAR(6)) AS 'nextRunTime'
        , CAST(jobsched.next_run_date AS CHAR(8)) AS 'nextRunDate'
    FROM msdb.dbo.sysschedules      AS sched
    JOIN msdb.dbo.sysjobschedules   AS jobsched
        ON sched.schedule_id = jobsched.schedule_id
    WHERE sched.enabled = 1
)
/* Finally, let's look at our actual jobs and tie it all together */
SELECT CONVERT(NVARCHAR(128), SERVERPROPERTY('Servername'))             AS [serverName]
    , job.job_id                                                        AS [jobID]
    ,                                                          AS [jobName]
    , CASE WHEN job.enabled = 1 THEN 'Enabled' ELSE 'Disabled' END      AS [jobStatus]
    , COALESCE(sched.scheduleName, '(unscheduled)')                     AS [scheduleName]
    , COALESCE(sched.frequency, '')                                     AS [frequency]
    , COALESCE(sched.subFrequency, '')                                  AS [subFrequency]
    , COALESCE(SUBSTRING(sched.startTime, 1, 2) + ':' 
        + SUBSTRING(sched.startTime, 3, 2) + ' - ' 
        + SUBSTRING(sched.endTime, 1, 2) + ':' 
        + SUBSTRING(sched.endTime, 3, 2), '')                           AS [scheduleTime] -- HH:MM
    , COALESCE(SUBSTRING(sched.nextRunDate, 1, 4) + '/' 
        + SUBSTRING(sched.nextRunDate, 5, 2) + '/' 
        + SUBSTRING(sched.nextRunDate, 7, 2) + ' ' 
        + SUBSTRING(sched.nextRunTime, 1, 2) + ':' 
        + SUBSTRING(sched.nextRunTime, 3, 2), '')                       AS [nextRunDate]
      /* Note: the sysjobschedules table refreshes every 20 min, so nextRunDate may be out of date */
    , COALESCE(jh.success, 0)                                           AS [success]
    , COALESCE(jh.cancel, 0)                                            AS [cancel]
    , COALESCE(jh.fail, 0)                                              AS [fail]
    , COALESCE(jh.retry, 0)                                             AS [retry]
    , COALESCE(jh.last_execution_id, 0)                                 AS [lastExecutionID]
    , jh.last_execution_start                                           AS [lastExecutionStart]
    , COALESCE(jh.last_duration, '00:00:01')                            AS [lastDuration]
    , 'EXECUTE msdb.dbo.sp_update_job @job_id = ''' 
        + CAST(job.job_id AS CHAR(36)) + ''', @enabled = 0;'            AS [disableSQLScript]
FROM msdb.dbo.sysjobs               AS job
LEFT JOIN myCTE                     AS sched
    ON job.job_id = sched.job_id
LEFT JOIN @jobHistory               AS jh
    ON job.job_id = jh.job_id
WHERE job.enabled = 1 -- do not display disabled jobs
    --AND jh.last_execution_start >= DATEADD(day, -1, GETDATE()) /* Pull just the last 24 hours */
ORDER BY nextRunDate;

Example Output:

serverName       jobID                                 jobName                      jobStatus scheduleName                      frequency  subFrequency   scheduleTime   nextRunDate       success  cancel  fail  retry  lastExecutionID  lastExecutionStart       lastDuration  disableSQLScript
---------------- ------------------------------------- ---------------------------- --------- --------------------------------- ---------- -------------- -------------- ----------------- -------- ------- ----- ------ ---------------- ------------------------ ------------- ----------------------------------------------------------------------------------------------
SQLFOOL\SQL2012  180BFD8E-AE0C-44F8-992E-27522611992A  DW FactOrders Update         Enabled   Every 4 Hours                     Daily      Every 4 hours  00:00 - 23:59  2012/09/19 20:00  12       0       0     0      84               2012-09-17 14:00:01.000  00:00:04      EXECUTE msdb.dbo.sp_update_job @job_id = '180BFD8E-AE0C-44F8-992E-27522611992A', @enabled = 0;
SQLFOOL\SQL2012  3470C9E5-A2CD-454A-89A1-DEF55FF186D3  SSIS Server Maintenance Job  Enabled   SSISDB Scheduler                  Daily      Once           00:00 - 12:00  2012/09/20 00:00  3        0       0     0      68               2012-09-17 11:50:51.000  00:00:01      EXECUTE msdb.dbo.sp_update_job @job_id = '3470C9E5-A2CD-454A-89A1-DEF55FF186D3', @enabled = 0;
SQLFOOL\SQL2012  F965B24D-60EB-4B95-91B6-C7D66057A883  syspolicy_purge_history      Enabled   syspolicy_purge_history_schedule  Daily      Once           02:00 - 23:59  2012/09/20 02:00  3        0       1     1      70               2012-09-17 11:50:51.000  00:01:24      EXECUTE msdb.dbo.sp_update_job @job_id = 'F965B24D-60EB-4B95-91B6-C7D66057A883', @enabled = 0;
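The disableSQLScript column is meant to be copy-paste ready; running one of the generated statements (job_id taken from the example output above) simply flips the job's enabled flag via msdb's sp_update_job procedure:

```sql
-- Disable the job; pass @enabled = 1 to re-enable it.
EXECUTE msdb.dbo.sp_update_job
      @job_id  = '180BFD8E-AE0C-44F8-992E-27522611992A'
    , @enabled = 0;
```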