Yet Another PASS Summit Recap & Thoughts on PDW
The SQL blogosphere has been lit up with PASS Summit recaps.
I debated about whether or not to write my own post, until I remembered that this blog serves as a mini-journal for me too. I have a notoriously poor memory--my husband likes to say that my CPU and memory are good, but I must have an unusual clustering strategy--so maybe this blog post will be a good pointer for me when I start prepping for next year's Summit.
This was definitely the best PASS Summit conference ever. While there will always be opportunities to do things better--improvement is a never-ending process--it was clear that the organizers of this event listened to the feedback they had received the previous year. One of the best changes? Backpacks. These were very useful, as evidenced by their presence everywhere. Nice job, organizers!
My absolute favorite thing about Summit is the chance to meet and reconnect with so many amazing SQL folks. There were entirely too many people to list out, but some highlights include meeting Crys Manson, Jorge Segarra, and Karen Lopez for the first time. I also had a chance encounter with Ola Hallengren in the Sheraton elevator. Apparently we were only staying a few rooms apart this year. We ended up having a couple of really great discussions about index fragmentation, the differences between our scripts, and things we'd like to see changed in future releases of SQL Server.
I had the opportunity to sit on the panel at the WIT luncheon. All of the women on the panel were amazing, and I was honored just to be sitting at the same table as them. I was especially pleased to meet Nora Denzel, a Senior Vice President at Intuit. Intelligent, confident, and witty, she is a great role model for young technical women, myself included. I can only hope that some of her gumption rubbed off on me due to our close proximity.
After the event, I was pleasantly surprised by how many folks--men and women both--came up to me to tell me how much they enjoyed it. Thanks to the WIT VC for organizing another great event!
The lightning talk sessions were a new feature this year, and I think I like it. The format of the lightning session is 7 speakers presenting on a topic for 5 quick minutes. Watching these sessions is kind of like skipping right to the center of a tootsie pop: all content and no fluff. The standout lightning talk presentation for me was Adam Machanic's. It was beautifully rehearsed and choreographed. Nice job, Adam!
Another of the many highlights of the week was meeting the Microsoft execs. In addition to meeting Ted Kummert, Mark Souza, and Donald Farmer--all very nice gentlemen--I had the opportunity to speak at length with Jose Blakely about Parallel Data Warehouse (PDW). PDW, formerly codenamed Madison, was officially launched at Summit. Jose was kind enough to explain the PDW architecture, both where it came from and the vision for where it's going. I'd attempt to regurgitate it here, but I think the probability of me misquoting would be high.
Suffice it to say, this technology has me excited. Why? Quite frankly, I think PDW will do for data warehousing what SQL Server did for databases, and what Analysis Services did for BI: make it affordable. With a compelling cost-per-terabyte, an attractive scale-out approach, and an entry point at under $1 million, we'll see more small-to-midsized companies implementing data warehousing and business intelligence. This is good news for those of us looking for an affordable data warehouse solution and for those of us who make our living with SQL Server. And for those of you who might suggest that few companies need a datawarehouse that can support multi-terabyte data, I'd like to point out that just 3 or 4 years ago, 100 GB was considered a lot of data.
I spent most of my week digging into the PDW architecture. It's not all roses--it's a first release and, as such, is immature compared to the much older and more established data warehouse systems--but again, it has a lot going for it, not least of all it's easy integration within a SQL Server environment and the relatively low cost. We're currently investigating this as a possible data warehouse solution for our business intelligence environment, so expect to see more from me about PDW as I learn more about it.
Getting Started with Variables in SSIS
I recently had to create an SSIS package that used variables to pass data between procs, and I thought it would make a good topic for a blog post. There are many scenarios as to the how and why to use variables in SSIS, but we're going to keep it pretty simple. Let's assume we have some need to retrieve data from Proc A, pass it to Proc B, and store the results in Table C. First, let's set up our environment:
Use AdventureWorks; Go Create Procedure dbo.LastOrderGet_sp As Set NoCount On; Begin Select Max(SalesOrderID) As 'LastSalesOrderID' From AdventureWorks.Sales.SalesOrderHeader With (NoLock) Set NoCount Off; Return 0; End Go Create Procedure dbo.ProcessLastOrder_sp @LastOrderID int As Set NoCount On; Begin Select SalesOrderDetailID , ProductID , OrderQty , LineTotal From AdventureWorks.Sales.SalesOrderDetail With (NoLock) Where SalesOrderID = @LastOrderID; Set NoCount Off; Return 0; End Go Create Table dbo.testStage ( SalesOrderDetailID int , ProductId int , OrderQty smallint , LineTotal numeric ); Go
Now for the fun stuff!
(Please note, I'm assuming some basic understanding of SSIS, so I'm skipping the "how to create a project", etc. stuff and just going to the pertinent parts).
Inside BIDS (Business Intelligence Development Studio), create a new SSIS project and call it what you will. If your Variable window is not already open, open it now by going to View --> Other Windows --> Variables.
Now let's create a variable. To do this, click on the little icon in the upper left-hand corner of the Variables window. Name the variable LastSalesOrderID.
After you create the variable, you should now see it in the Variables window. Make sure the scope of the variable is the name of your project, which is "Blog" in my case (for obvious reasons); this means the variable is defined at the package scope. Once you've confirmed that the variable exists, create an Execute SQL task.
(Variables in SSIS, like in other programming languages, can have different scopes. For instance, a package scope means the variable can be accessed anywhere within the package, but a variable with a Data Flow scope can only be accessed within the specified Data Flow task.)
Double-click on your Execute SQL Task and configure with the following values:
- Set "Result Set" to Single Row.
- Set your Connection to your appropriate data source.
- Set your SQL Statement to: Execute AdventureWorks.dbo.LastOrderGet_sp;
Now click on "Result Set" and click on "Add." You'll want to put the name of the column that's returned by the proc in the "Result Name" column; in our case, that'll be LastSalesOrderID. Click on the Variable Name column and scroll down until you find the appropriate one (User::LastSalesOrderID).
Go ahead and add a Data Flow task to the designer surface. We don't need to use a Data Flow task here -- for example, we could use another Execute SQL task instead -- but this will help demonstrate one way to use variables.
Double-click on the Data Flow task and add an OLE DB Source, then double-click on it to open up the properties. Enter the following text in the "SQL Command text" window:
Execute AdventureWorks.dbo.ProcessLastOrder_sp ?
The question mark (?) tells SSIS to expect a parameter.
Now click on the Parameters button on the left. This is where we map our variable to our parameter. For the "Parameters" value, enter @LastOrderID (the parameter the stored procedure is expecting). In the "Variables" column, click on the drop-down and navigate to the User::LastSalesOrderID variable.
Finally, set up an OLE DB Destination, and configure the OLE DB Source to load into the testStage table.
At this point, you should be able to successfully execute your package. Upon successful execution, the testStage table will return the following results:
Select * From testStage; SalesOrderDetailID ProductId OrderQty LineTotal ------------------ ----------- -------- ------------------ 121315 878 1 21 121316 879 1 159 121317 712 1 8
That's all for now. Hopefully this gives you an idea of how easy and useful it is to work with variables in SSIS.
Max Parallelism on Cube Processing
The default behavior for processing cubes is to let the server determine how much parallelism to use. Let's face it, the server must've not learned to play nice with others as a kid; as such, it doesn't always make the best decisions. But fear not, you can help it along.
When processing a cube, click on "Change Settings..."
Now change the "Maximum parallel tasks" to 1, 2, or whatever is appropriate for your environment:
That's all there is to it. Happy cube processing!
Hello (BI) World!
For those who are not already aware, I've recently switched to the Business Intelligence team as an OLAP developer. I'm pretty excited about the opportunity to learn more about cubes and data mining, and I've decided it's time to start sharing some of what I'm learning. My initial BI-related blog posts will probably be more entry-level topics as I learn my way around BIDS (Business Intelligence Development Studio).
To get started, I'd like to share my current favorite BI resource website: http://www.learnmicrosoftbi.com. LearnMicrosoftBI is an excellent, FREE website run by Craig Utley. Yes, you heard me right... you have to register, but the site is FREE. There are currently 38 videos on SSAS and BI-related topics, ranging from 7 minutes to 58 minutes long. I haven't watched them all, but the ones I have watched have been helpful. If you're trying to learn SSAS, be sure to check this site out.
Here's a couple of other items of note:
- BIDS Helper on CodePlex
- A preview of Excel 2010 features for Analysis Services by Frederik Vandeputte
Categories
- Business Intelligence
- Internals
- Miscellaneous
- PASS
- Performance & Tuning
- Presentations
- SQL 2008
- SQL Tips
- Syndication
- T-SQL Scripts
Subscribe to my blog!
| Like what you see? Subscribe! |
![]() |
Around the Web
Recent Tweets
- RT @Phil_Factor: SQL Server table columns under the hood http://t.co/vp7gQ77B < what a great post
- @tradney haha that's awesome :)
- OH: @AdamMachanic's #sqlpass session was one of the best I've ever seen.
Archives
- November 2011
- October 2011
- September 2011
- July 2011
- June 2011
- May 2011
- April 2011
- January 2011
- December 2010
- November 2010
- October 2010
- September 2010
- July 2010
- June 2010
- May 2010
- April 2010
- February 2010
- January 2010
- November 2009
- October 2009
- September 2009
- August 2009
- July 2009
- June 2009
- May 2009
- April 2009
- March 2009
- February 2009
- January 2009
- December 2008
- November 2008
- October 2008













