Hello, dear readers. It’s been a while. For those who haven’t yet heard the news, I have switched to the dark side roles. I’m still kicking ass at GoDaddy and I’m still technical, but I now lead a team of big data engineers. As you can imagine, this means that I spend the majority of my time with Hadoop and its related ecosystem (Pig, Hive, Spark, etc.). This is very different than the performance-focused SQL Server role I had when I first created this blog. For those of you who make your living primarily or solely with SQL Server, please rest assured… SQL Server is still awesome, is still relevant, and still has a place in most enterprise data platform strategies. In fact, I’m presently working on egress automation of our Hadoop data warehouse to SQL Server for serving BI reports. Unfortunately, my change in technology focus means that my SQL Server blogging has largely come to an end, and with it, new posts on this blog. But while this blog may end, my blogging activities do not. For those interested, I have launched a new blog at Hadoopsie.com that chronicles my adventures in a non-Microsoft world with Hadoop, open-source, … Continue reading
Just a quick blog to demonstrate how to write a recursive date query in Teradata. First, you’ll need to create a “dummy view” in order to perform a UNION ALL: CREATE VIEW DummyTable (DummyRow) AS SELECT 1; Then you’ll define the recursive statement and return the results: /* Recursively define a range of dates; the specifics can easily be modified */ WITH RECURSIVE myDates(beginDate, endDate) AS ( SELECT DATE’2014-01-01′ AS beginDate /* change to your start date */ , DATE’2014-01-07′ AS endDate /* change to the start date + desired interval */ FROM DummyTable UNION ALL SELECT beginDate + INTERVAL ’7′ DAY /* change to the desired interval */ , endDate + INTERVAL ’7′ DAY /* change to the desired interval */ FROM myDates WHERE beginDate < CURRENT_DATE /* change to define the end of the date range */ ) SELECT * FROM myDates ORDER BY 1; Not too bad to write, eh? This can be used to do some pretty cool things, like condition change queries.
In my previous post, SqlDev PowerShell 101: Getting Started with PowerShell, I walked through how to configure and use PowerShell to run basic commands. This post will cover working with data and variables in PowerShell. Specifically, I’m going to demonstrate how to connect to a database (i.e. SQL Server) using Invoke-Sqlcmd, retrieve data, store the data in a variable, and work with that variable. Note: Invoke-Sqlcmd does require the use of the sqlps module. If you don’t have it installed yet, check here for more information. From a new PowerShell window, write the following command: Invoke-Sqlcmd -Query "SELECT DB_NAME();" Tip: You can auto-complete “Invoke-Sqlcmd” after typing “Invoke-S” and hitting TAB on your keyboard. What’s happening here? Well, because no database instance was specified, the cmdlet is connecting to the database on the local computer and running the query specified. “Column1″ is returned just as it would be in SQL Server Management Studio (SSMS), because no column name was specified. You can see this by running: Invoke-Sqlcmd -Query "SELECT DB_NAME() AS MyDatabaseName;" But what if you don’t have a local database instance, or you want to connect to another database? No problem. Just add the -ServerInstance parameter: Invoke-Sqlcmd -Query "SELECT DB_NAME() … Continue reading
My latest blog post on Go Daddy’s Insiders site is now live! In it, I discuss query performance and scalability issues with GUIDs. Read the article and let me know what you think.
This is the first in a new series focused on PowerShell for SQL developers. That means using PowerShell to create and/or modify table schema, stored procedures, and indexes. In other words, automating code development and database tuning processes. This is mostly what I’ve been using PowerShell for so far, and I have to say, I am LOVING it! It’s allowing me to save *days* of coding and validating. Disclaimer: I am NOT a PowerShell expert, just a newly converted enthusiast who wants to save you from the headaches I myself have suffered while trying to learn PoSh. So I’ll mostly focus on the “how to’s” instead of the “why’s” in this series. I’m going to start at the very beginning and assume you have never opened PowerShell before in your life. Basically, I’m going to assume you’re where I was just a few short months ago. So let us start by finding the application. If you’re using Windows 7 or later, the good news is that PowerShell comes pre-installed on the OS. We just need to configure it to allow the execution of scripts. Click on the Start icon –> “Search programs and files” –> type “PowerShell.” The first time … Continue reading
This post probably has a small audience, but there were a handful of conversion issues I ran into when I upgraded to PowerShell 3.0. It was difficult finding posts related to the errors I encountered, so I’m posting my experiences in the hopes that it helps someone else down the road. The first issue I encountered was calling PowerShell from xp_cmdshell on a remote SQL Server. In PowerShell 2.0, I used this syntax: EXEC xp_cmdshell ‘powershell.exe -Command "get-executionpolicy"’ # PoSh 2.0 Results AllSigned # PoSh 3.0 Results ‘powershell.exe’ is not recognized as an internal or external command, operable program or batch file. To fix this issue, I had to fully qualify the PowerShell directory: EXEC xp_cmdshell ‘c:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -Command "get-executionpolicy"’ # PoSh 3.0 Results AllSigned Another issue I ran into was calling the Teradata .NET Data Provider. In PowerShell 2.0, I was calling the Invoke-SqlStmt function I previously posted to connect from a Windows server to a remote Teradata server. PS X:\> .\Invoke-SqlStmt.ps1 -C TDATA -S MyTDServerName -Q "SELECT * FROM DBC.Tables;" -Verbose Everything worked great. However, once I upgraded to PowerShell 3.0, I was getting this error: Exception calling "GetFactory" with "1" argument(s): "Unable to find the requested .Net … Continue reading
I’ve been using PowerShell quite a bit the last few months, and I finally feel I’m to the point where I can start sharing some of what I’ve learned. I started a blog post that walks through using PowerShell for database development and automation, and I quickly realized that a lot of what I want to show is dependent upon the first function I wrote, Invoke-SqlStmt. And by “wrote,” I mean “blatantly borrowed from Chad Miller’s Invoke-SqlCmd2 script” . Thank you, Chad! The main difference is that my function supports connecting to either Teradata or SQL Server, depending on the parameters passed. One thing worth mentioning is that I spent a lot of time researching all of the different ways to connect to SQL Server. In the end, I went with ADO.NET, as a lot of what I will be doing is working with data sets (i.e. running a query against ServerA and writing the results to ServerB). Invoke-SqlStmt ################################################################################################## # # .NAME # Invoke-SqlStmt.ps1 # # .SYNOPSIS # Executes a SQL statement against either a Teradata or SQL Server source using ADO.NET # # .INPUT # C – Connection Type; Required; MSSQL or TDATA # S – Server Name; … Continue reading
Recently, I needed to audit indexes on two different servers. I wanted to compare things like index keys and included columns, partitioning keys, unique constraints, clustered indexes and primary keys, and filter index definitions. Basically, I wanted to make sure that the indexing of two databases on two different servers were completely in sync. To do this, I wrote the following scripts. The first script will audit a single database or even a single table. The second script once more makes use of Aaron Bertrand’s sp_foreachdb procedure to iterate through every database on a server. To do a quick and easy compare, I dumped the results to a single table on each server and used Red Gate’s SQL Data Compare to find the differences. This is another metadata script, so it should be fairly lightweight. That said, all of the usual disclaimers apply: This script will only run on SQL 2008 or newer because of the filtered index component. See Patrick’s solution in the comments below for a mod that works in 2005 too. It worked for me, but YMMV depending on editions, collations, creative settings, etc. I don’t currently have access to any instances older than SQL Server 2012, … Continue reading
As I’ve mentioned previously, SQL Server 2012 Integration Services Design Patterns is now available on Amazon. PASS has invited the ENTIRE author team to give a pre-conference training session at Summit 2012. Precons are all-day training events and are held on the days before Summit. Our precon will be held on Monday, November 5th. I’ve attended several precons at Summit, and in my opinion, they provide unparalleled depth and access to the presenters. I expect our precon will be no different. So what will I be presenting on? Good question. I’m going to talk about Accelerating ETL Data Retrieval. Specifically, I’ll discuss strategies for retrieving data for full loads vs. incremental loads, and I’ll explore some strategies for retrieving data from large tables. To clarify “large,” since everyone has a different definition, I’ve successfully employed these strategies on tables with 10 billion rows. Now you might be reading this and thinking, “Hey, that doesn’t sound like SSIS!” Well… yes and no. While I will be discussing some strategies within SSIS, most of what I discuss will take place at the database layer. In fact, most of my content could be useful for any ETL tool. I still think it’s a … Continue reading
I don’t hide the fact that I work at Go Daddy. All discussions of advertising methods aside, it’s a great company to work for. Not only am I treated well as an employee, I also get to work in a world-class technical environment. However, the marketing campaigns tend to steal the spotlight. As a result, few people are aware of technology that it takes to be the #1 hosting provider in the world. Some examples of little-known facts about Go Daddy: 10 billion DNS queries answered daily Over 35,000 servers & 100,000 square feet of state-of-the-art global data centers 25 petabytes — yes, petabytes! — of networked data storage Pretty cool, huh? Go Daddy has launched a new blog called Inside Go Daddy as a way to share all the nitty gritty details of what it takes to support this kind of environment. Here’s a blurb from the site: This is your inside source for what’s going on with Go Daddy’s tech experts. You’ll get insight and opinions from Go Daddy’s tech leaders on industry topics, company projects & open source initiatives … the leading edge, unconventional, “behind-the-scenes” information you won’t find anywhere else. It’s not PR, it’s not executive … Continue reading