Yet Another PASS Summit Recap & Thoughts on PDW

The SQL blogosphere has been lit up with PASS Summit recaps.

I debated about whether or not to write my own post, until I remembered that this blog serves as a mini-journal for me too. I have a notoriously poor memory–my husband likes to say that my CPU and memory are good, but I must have an unusual clustering strategy–so maybe this blog post will be a good pointer for me when I start prepping for next year’s Summit. ;)

This was definitely the best PASS Summit conference ever. While there will always be opportunities to do things better–improvement is a never-ending process–it was clear that the organizers of this event listened to the feedback they had received the previous year. One of the best changes? Backpacks. These were very useful, as evidenced by their presence everywhere. Nice job, organizers!

My absolute favorite thing about Summit is the chance to meet and reconnect with so many amazing SQL folks. There were entirely too many people to list out, but some highlights include meeting Crys Manson, Jorge Segarra, and Karen Lopez for the first time. I also had a chance encounter with Ola Hallengren in the Sheraton elevator. Apparently we were only staying a few rooms apart this year. We ended up having a couple of really great discussions about index fragmentation, the differences between our scripts, and things we’d like to see changed in future releases of SQL Server.

I had the opportunity to sit on the panel at the WIT luncheon. All of the women on the panel were amazing, and I was honored just to be sitting at the same table as them. I was especially pleased to meet Nora Denzel, a Senior Vice President at Intuit. Intelligent, confident, and witty, she is a great role model for young technical women, myself included. I can only hope that some of her gumption rubbed off on me due to our close proximity. :) After the event, I was pleasantly surprised by how many folks–men and women both–came up to me to tell me how much they enjoyed it. Thanks to the WIT VC for organizing another great event!

The lightning talk sessions were a new feature this year, and I think I like them. The format is seven speakers, each presenting on a topic for five quick minutes. Watching these sessions is kind of like skipping right to the center of a Tootsie Pop: all content and no fluff. The standout lightning talk for me was Adam Machanic's; it was beautifully rehearsed and choreographed. Nice job, Adam!

Another of the many highlights of the week was meeting the Microsoft execs. In addition to meeting Ted Kummert, Mark Souza, and Donald Farmer–all very nice gentlemen–I had the opportunity to speak at length with Jose Blakely about Parallel Data Warehouse (PDW). PDW, formerly codenamed Madison, was officially launched at Summit. Jose was kind enough to explain the PDW architecture, both where it came from and the vision for where it’s going. I’d attempt to regurgitate it here, but I think the probability of me misquoting would be high.

Suffice it to say, this technology has me excited. Why? Quite frankly, I think PDW will do for data warehousing what SQL Server did for databases, and what Analysis Services did for BI: make it affordable. With a compelling cost-per-terabyte, an attractive scale-out approach, and an entry point at under $1 million, we’ll see more small-to-midsized companies implementing data warehousing and business intelligence. This is good news for those of us looking for an affordable data warehouse solution and for those of us who make our living with SQL Server. And for those of you who might suggest that few companies need a data warehouse that can support multiple terabytes of data, I’d like to point out that just 3 or 4 years ago, 100 GB was considered a lot of data.

I spent most of my week digging into the PDW architecture. It’s not all roses–it’s a first release and, as such, is immature compared to the much older and more established data warehouse systems–but again, it has a lot going for it, not least of all its easy integration within a SQL Server environment and the relatively low cost. We’re currently investigating this as a possible data warehouse solution for our business intelligence environment, so expect to see more from me about PDW as I learn more about it.

Getting Started with Variables in SSIS

I recently had to create an SSIS package that used variables to pass data between procs, and I thought it would make a good topic for a blog post. There are many scenarios as to the how and why to use variables in SSIS, but we’re going to keep it pretty simple. Let’s assume we have some need to retrieve data from Proc A, pass it to Proc B, and store the results in Table C. First, let’s set up our environment:

Use AdventureWorks;
Go
 
Create Procedure dbo.LastOrderGet_sp
As
Set NoCount On;
Begin
 
    Select Max(SalesOrderID) As 'LastSalesOrderID'
    From AdventureWorks.Sales.SalesOrderHeader With (NoLock)
 
    Set NoCount Off;
    Return 0;
End
Go
 
Create Procedure dbo.ProcessLastOrder_sp
    @LastOrderID int
As
Set NoCount On;
Begin
 
    Select SalesOrderDetailID
        , ProductID
        , OrderQty
        , LineTotal
    From AdventureWorks.Sales.SalesOrderDetail With (NoLock)
    Where SalesOrderID = @LastOrderID;
 
    Set NoCount Off;
    Return 0;
End
Go
 
Create Table dbo.testStage
(
      SalesOrderDetailID    int
    , ProductId             int
    , OrderQty              smallint
    , LineTotal             numeric
);
Go
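Before we move into SSIS, it’s worth a quick sanity check that the procs behave as expected. This is just a smoke test run in SSMS; the actual SalesOrderID value will depend on your copy of AdventureWorks:

```sql
-- Smoke-test the procs before wiring them into SSIS.
Declare @LastOrderID int;

-- Capture the same value the SSIS variable will hold.
Select @LastOrderID = Max(SalesOrderID)
From AdventureWorks.Sales.SalesOrderHeader;

Execute AdventureWorks.dbo.LastOrderGet_sp;                   -- should return that same value
Execute AdventureWorks.dbo.ProcessLastOrder_sp @LastOrderID;  -- should return the detail rows
```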

Now for the fun stuff!

(Please note, I’m assuming some basic understanding of SSIS, so I’ll skip the “how to create a project” steps and jump right to the pertinent parts.)

Inside BIDS (Business Intelligence Development Studio), create a new SSIS project and call it what you will. If your Variable window is not already open, open it now by going to View –> Other Windows –> Variables.

Open Variables Window

Now let’s create a variable. To do this, click on the little icon in the upper left-hand corner of the Variables window. Name the variable LastSalesOrderID.

Create a variable

After you create the variable, you should now see it in the Variables window. Make sure the scope of the variable is the name of your package, which is “Blog” in my case (for obvious reasons); this means the variable is defined at the package level. Once you’ve confirmed that the variable exists, create an Execute SQL task.

(Variables in SSIS, like in other programming languages, can have different scopes. For instance, a package scope means the variable can be accessed anywhere within the package, but a variable with a Data Flow scope can only be accessed within the specified Data Flow task.)

Create Execute SQL Task

Double-click on your Execute SQL Task and configure with the following values:

  • Set “Result Set” to Single Row.
  • Set your Connection to your appropriate data source.
  • Set your SQL Statement to: Execute AdventureWorks.dbo.LastOrderGet_sp;
Set up your Execute SQL Task

Now click on “Result Set” and click on “Add.” You’ll want to put the name of the column that’s returned by the proc in the “Result Name” column; in our case, that’ll be LastSalesOrderID. Click on the Variable Name column and scroll down until you find the appropriate one (User::LastSalesOrderID).

Mapping the results to a variable

Go ahead and add a Data Flow task to the designer surface. We don’t need to use a Data Flow task here — for example, we could use another Execute SQL task instead — but this will help demonstrate one way to use variables.

Add Data Flow Task

Double-click on the Data Flow task and add an OLE DB Source, then double-click on it to open up the properties. Enter the following text in the “SQL Command text” window:
Execute AdventureWorks.dbo.ProcessLastOrder_sp ?
The question mark (?) tells SSIS to expect a parameter.
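Conceptually, at run time SSIS substitutes the mapped variable’s value for the placeholder, so the call it issues is equivalent to something like this (the literal value here is made up purely for illustration):

```sql
-- Roughly what SSIS executes once the parameter is bound;
-- 75123 stands in for whatever User::LastSalesOrderID holds.
Declare @p int = 75123;
Execute AdventureWorks.dbo.ProcessLastOrder_sp @p;
```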

Edit OLE DB Source

Now click on the Parameters button on the left. This is where we map our variable to our parameter. For the “Parameters” value, enter @LastOrderID (the parameter the stored procedure is expecting). In the “Variables” column, click on the drop-down and navigate to the User::LastSalesOrderID variable.

Map Variables to Parameters

Finally, add an OLE DB Destination and configure it to load the output of the OLE DB Source into the testStage table.

Configure OLE DB Destination

At this point, you should be able to successfully execute your package. Upon successful execution, the testStage table will contain the following rows:

Select * From testStage;
 
SalesOrderDetailID ProductId   OrderQty LineTotal
------------------ ----------- -------- ------------------
121315             878         1        21
121316             879         1        159
121317             712         1        8
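For reference, the whole package boils down to this T-SQL pattern: proc A’s result captured in a variable, then passed to proc B, with the results landed in a table. This is just a plain T-SQL sketch of what the package does, not a replacement for it:

```sql
-- T-SQL equivalent of the package's variable-passing pattern.
Declare @LastSalesOrderID int;

-- "Execute SQL Task": capture the single-row result into a variable.
Select @LastSalesOrderID = Max(SalesOrderID)
From AdventureWorks.Sales.SalesOrderHeader;

-- "Data Flow task": stream the proc's output into the staging table.
Insert Into dbo.testStage
Execute AdventureWorks.dbo.ProcessLastOrder_sp @LastSalesOrderID;
```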

That’s all for now. Hopefully this gives you an idea of how easy and useful it is to work with variables in SSIS.

Max Parallelism on Cube Processing

The default behavior for processing cubes is to let the server determine how much parallelism to use. Let’s face it: the server must not have learned to play nice with others as a kid, because it doesn’t always make the best decisions. But fear not, you can help it along.

When processing a cube, click on “Change Settings…”

SSAS Cube Processing

Now change the “Maximum parallel tasks” to 1, 2, or whatever is appropriate for your environment:

SSAS Cube Processing Settings

That’s all there is to it. Happy cube processing!

Hello (BI) World!

For those who are not already aware, I’ve recently switched to the Business Intelligence team as an OLAP developer. I’m pretty excited about the opportunity to learn more about cubes and data mining, and I’ve decided it’s time to start sharing some of what I’m learning. My initial BI-related blog posts will probably be more entry-level topics as I learn my way around BIDS (Business Intelligence Development Studio).

To get started, I’d like to share my current favorite BI resource website: http://www.learnmicrosoftbi.com. LearnMicrosoftBI is an excellent, FREE website run by Craig Utley. Yes, you heard me right… you have to register, but the site is FREE. There are currently 38 videos on SSAS and BI-related topics, ranging from 7 minutes to 58 minutes long. I haven’t watched them all, but the ones I have watched have been helpful. If you’re trying to learn SSAS, be sure to check this site out.

Here are a couple of other items of note: