Have you ever been tasked with a project that you know has been done before, probably thousands of times, but you didn’t know where to start because it was new to you? If you’re nodding your head yes, you’re not alone. The fact is, we all have. Almost two years ago, Andy Leonard invited me to be part of a book project. He had a vision of compiling a “recipe book” of common tasks that nearly every ETL professional encounters at some point in his or her career. Whereas most technical books explain how to use the product or specific features, Andy wanted to create a pattern-oriented book that would help jumpstart an SSIS project. With some trepidation, I humbly agreed to join the outstanding group of authors Andy had assembled: Andy Leonard (Blog | Twitter) Matt Masson (Blog| Twitter) Tim Mitchell (Blog | Twitter) Jessica Moss (Blog | Twitter) I was excited about this project for three reasons. First, the incredible list of authors. These are the folks that I have learned SSIS from! To be included amongst their ranks is a deep honor for me. Secondly, I couldn’t keep track of the number of times such a book … Continue reading →
6 months ago, I returned from maternity leave only to be handed the proverbial keys to a new database kingdom: Teradata. Being a long-time devotee of all things Microsoft SQL Server, I had quite a bit of learning to do to get up to speed on this new environment. In an effort to save others from the same headaches that plagued me the first few months, I have created a new guide to aid those new to the Teradata platform. I will primarily focus on what I have found most lacking: examples of how to convert SQL Server T-SQL to Teradata SQL. Rather than create a series of posts with random tidbits, I thought it would make more sense to create a single page for this guide and update it incrementally. You can find the very beginnings of this guide here: The SQL Server DBA’s Guide to Teradata. Good luck, and happy converting.
Here’s just a short post to feed the search engine gerbils. So today, I created a stored procedure that accepted a list of ID’s and filtered them. The most efficient way to handle this was using table-valued parameters (TVPs). My code looks similiar to this: USE AdventureWorks2012; GO /* Create our new table type */ IF NOT EXISTS(SELECT * FROM sys.types WHERE name = ‘CustomerList’) BEGIN CREATE TYPE dbo.CustomerList AS TABLE ( CustomerID INT , ID INT ); END; IF OBJECTPROPERTY(OBJECT_ID(’dbo.CustomerTerritoryFilterGet_sp’) , N’IsProcedure’) IS NOT NULL DROP PROCEDURE dbo.CustomerTerritoryFilterGet_sp; GO /* Create a proc to test with */ CREATE PROCEDURE dbo.CustomerTerritoryFilterGet_sp @Customers CustomerList READONLY , @TerritoryFilter INT AS BEGIN SELECT DISTINCT sc.CustomerID, c.ID FROM Sales.Customer AS sc JOIN @Customers AS c ON sc.CustomerID = c.CustomerID WHERE sc.TerritoryID = @TerritoryFilter; RETURN 0; END GO /* Test our stored procedure */ DECLARE @myList CustomerList; INSERT INTO @myList VALUES (1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 60), (7, 70); EXECUTE dbo.CustomerTerritoryFilterGet_sp @Customers = @myList , @TerritoryFilter = 1; GO GRANT EXECUTE ON dbo.CustomerTerritoryFilterGet_sp To myApp; GO This would typically be sufficient for most stored procedures. However, my app dev was … Continue reading →
I’ve figured out a couple of tidbits today that I wanted to share. First and foremost, I’ve discovered a (new to me) option in SSMS to convert tabs to spaces. I had previously seen the “Untabify Selected Lines” option (under Edit –> Advanced), but this only seemed to remove the tabs at the beginning of the line; it would not remove in-line tabs. I’ve now found another option that will remove the tabs throughout the selected code. Here’s how you can do it: Disclaimer: I’m using SSMS 2008 and have not confirmed this in other versions of SSMS. From your toolbar, right-click in the empty gray space, or click on the drop-down arrow. Navigate to Customize: Click on the Commands tab, then navigate down to Edit –> Convert Tabs to Spaces. Note: if you’re one of those weirdos who like tabs instead of spaces, you can convert in the other (wrong) direction, too. Once you have found the option you want, click on it and drag it to your toolbar. Cool, huh? Another little thing I ran into today was a mismatch between nullable and non-nullable columns in related tables. I started to perform a manual audit before I said, … Continue reading →
I’m currently working on the logic migration of data marts from SQL Server to Teradata. While another team is working on the actual table migration, it’s still helpful to have some data in the tables for developing against. The easiest method I’ve found to do this is to use BCP to export some sample data. So of course, I’ve created a SQL script that will generate the BCP code for me. Because that’s what I like to do on Sunday evenings. — User-defined variables — DECLARE @tableToBCP NVARCHAR(128) = ‘AdventureWorksDW2008R2.dbo.DimCustomer’ , @Top VARCHAR(10) = NULL — Leave NULL for all rows , @Delimiter VARCHAR(4) = ‘|’ , @UseNULL BIT = 1 , @OverrideChar CHAR(1) = ‘~’ , @MaxDop CHAR(1) = ’1′ , @Directory VARCHAR(256) = ‘C:\bcp_output\’; — Script-defined variables — DECLARE @columnList TABLE (columnID INT); DECLARE @bcpStatement NVARCHAR(MAX) = ‘BCP "SELECT ‘ , @currentID INT , @firstID INT; INSERT INTO @columnList SELECT column_id FROM sys.columns WHERE object_id = OBJECT_ID(@tableToBCP) ORDER BY column_id; IF @Top IS NOT NULL SET @bcpStatement = @bcpStatement + ‘TOP (‘ + @Top + ‘) ‘; SELECT @firstID = MIN(columnID) FROM @columnList; WHILE EXISTS(SELECT * FROM @columnList) BEGIN … Continue reading →
My role at work has recently changed. While I will continue to be working with SQL Server, I will also be spending a good chunk of my time working in our newly acquired Teradata environment. My plan is to blog about both as time permits, but I am trying to determine whether I should create a separate blog for the Teradata content. So my question for you, dear reader, is whether you think I should blog on this site about both SQL Server *and* Teradata, or whether I should separate the content? Any input is appreciated.
In my post last week, How To Estimate Data Utilization, I said that it may be my last post for a while. Well… apparently I lied. For those of you who use table partitioning, you know that you need to define a partitioning scheme and function prior to applying partitioning to an index. Personally, I tend to build the function for a couple of years out, and I tend to create them through the end of a calendar year. Now, if I failed to expand a partition range at the end of the year, then come January 1st, all of my data would be written to the same partition. Not the end of the world, no, but it causes all kinds of nasty performance and maintenance issues. Thus, as part of my end-of-year / maternity-leave preparations, I’m in the process of examining all partitioned functions to identify those that need to have their partition ranges expanded. For those interested, here’s the script I used: IF OBJECT_ID(’tempdb..#Results’) IS NOT NULL DROP TABLE #Results; CREATE TABLE #Results ( databaseName NVARCHAR(128) , schemaName NVARCHAR(128) , functionName NVARCHAR(128) , data_space_id INT , maxRangeValue SQL_VARIANT ) /* Grab results for each database and … Continue reading →
Recently, on a conference call presenting data growth rates and database capacity projections, I had a top-line executive ask, “But how much of that data are we actually using today?” The question was met with silence; unless you have rigorous auditing in place — and kudos to you if you do — it’s a difficult question to answer. But it begs the question, is there some way to gleam this information from SQL Server? I think the answer is “yes,” if you make some assumptions and understand what you’re looking at. SQL Server collects stats about every time an index is used and how it is used (i.e. whether a user seeked or scanned the index, etc.). It also provides a DMV to view these stats: sys.dm_db_index_usage_stats. This DMV provides a wealth of great information, but to answer our question of “What data is actually being used?”, we have to refine our criteria. Are we talking in terms of table counts or data size? I’d argue that data size is more important than table counts; one unqueried millow-row table is more wasteful than a hundred ten-row tables. Also, are we looking at indexes or content? From a database perspective, I’m … Continue reading →
East Iowa’s 3rd annual SQL Saturday is right around the corner! This year’s event will be on Saturday, October 1st. It’s being held in the same location as previous years, at the University of Iowa Capitol Centre in Iowa City. We have a great mix of speakers this year, including nationally-recognized speakers such as Jason Strate and Ted Krueger. Topics include Denali, PowerPivot, High Availability, Disaster Recovery, PowerShell, SSIS, Analysis Services, Reporting Services, database tuning, and more. Don’t miss out on this great opportunity to network with SQL-savvy locals while receiving FREE training. If you haven’t registered yet, please do so we can make sure we order enough food. Oh, and make sure to stick around for the after-party, too. The after-party is one of my favorite parts of the whole event. This year’s party will be held at The Mill, which is within walking distance of the conference center. I hope to see you there!
Ever needed to calculate the number of rows inserted every second, for every table in every database on a server? Or, have you ever needed to validate that all processes have stopped writing to tables? These types of questions come up routinely for me. To help with this, I’ve written the following script, which examines metadata values using sys.partitions. This method isn’t as accurate as running SELECT COUNT(*) FROM, but it’s much faster. Keep in mind, since it’s just looking at row counts, it’s not much help on tables that have a lot of update/delete activity. But it does what I need it to do, and I use it pretty regularly, so I thought I’d share in case anyone else can benefit from it too. /* Declare Parameters */ DECLARE @newBaseline BIT = 1 — change to 0 when you don’t want to replace the baseline, i.e. after initial run , @delay CHAR(8) = ’00:00:30′; — change as needed IF @newBaseline = 1 BEGIN IF OBJECT_ID(’tempdb..#baseline’) IS NOT NULL DROP TABLE #baseline; CREATE TABLE #baseline ( database_name SYSNAME , table_name SYSNAME , table_rows BIGINT , captureTime DATETIME NULL ); END IF OBJECT_ID(’tempdb..#current’) IS NOT NULL DROP TABLE … Continue reading →