SQL Agent Job Script

This blog post is brought to you by the awesomeness that is the SQL Server Twitter community. Contributions were made by several awesome Tweeples, including Denny Cherry, Kendra Little, Ken Simmons, and Magnus Ahlkvist, among others.

What started this is something you’ve all probably run into at one time or another. We’re currently having some resource contention on our server when a ton of processes kickoff and overlap in the morning. Now, I have a script that I’ve used in the past for monitoring SQL agent jobs, but this time I wanted to add some historical run-time information. I know the sysjobhistory table contains the information I need, but it has some… let’s say, creative ways of storing the data. Opinions on the reasons why vary:

Regardless the reason, I needed to do some conversions. Denny shared with us the msdb.dbo.agent_datetime function to convert run_date and run_time into an actual datetime value. I have to say, this certainly cleans up the code quite a bit! Then Magnus shared a method to convert run_duration into seconds, which he modified from a script on SQLServerCentral. I was able to use these two tidbits to update my old script. You can now run this script to get back a list of all enabled procs, a script that will disable them, a breakdown of the schedule, and a historical run-time average.

/*  How many days do you want to include in your run-time average?
    Recent values tend to be more useful. */
Declare @daysToAverage smallint = 30;
Declare @avgRunTime Table
      job_id      uniqueidentifier
    , avgRunTime  int
/* We need to parse the schedule into something we can understand */
Declare @weekDay Table (
      mask       int
    , maskValue  varchar(32)
Insert Into @weekDay
Select 1, 'Sunday'  Union All
Select 2, 'Monday'  Union All
Select 4, 'Tuesday'  Union All
Select 8, 'Wednesday'  Union All
Select 16, 'Thursday'  Union All
Select 32, 'Friday'  Union All
Select 64, 'Saturday';
/* First, let's get our run-time average */
Insert Into @avgRunTime
Select job_id
    , Avg((run_duration/10000) * 3600 + (run_duration/100%100)*60 + run_duration%100) As 'avgRunTime' /* convert HHMMSS to seconds */
From msdb.dbo.sysjobhistory
Where step_id = 0 -- only grab our total run-time
    And run_status = 1 -- only grab successful executions
    And msdb.dbo.agent_datetime(run_date, run_time) >= DateAdd(day, -@daysToAverage, GetDate())
Group By job_id;
/* Now let's get our schedule information */
With myCTE
    Select sched.name As 'scheduleName'
        , sched.schedule_id
        , jobsched.job_id
        , Case When sched.freq_type = 1 Then 'Once' 
            When sched.freq_type = 4 
                And sched.freq_interval = 1 
                    Then 'Daily'
            When sched.freq_type = 4 
                Then 'Every ' + Cast(sched.freq_interval As varchar(5)) + ' days'
            When sched.freq_type = 8 Then 
                Replace( Replace( Replace(( 
                    Select maskValue 
                    From @weekDay As x 
                    Where sched.freq_interval & x.mask <> 0 
                    Order By mask For XML Raw)
                , '"/><row maskValue="', ', '), '<row maskValue="', ''), '"/>', '') 
                + Case When sched.freq_recurrence_factor <> 0 
                        And sched.freq_recurrence_factor = 1 
                            Then '; weekly' 
                    When sched.freq_recurrence_factor <> 0 Then '; every ' 
                + Cast(sched.freq_recurrence_factor As varchar(10)) + ' weeks' End
            When sched.freq_type = 16 Then 'On day ' 
                + Cast(sched.freq_interval As varchar(10)) + ' of every '
                + Cast(sched.freq_recurrence_factor As varchar(10)) + ' months' 
            When sched.freq_type = 32 Then 
                Case When sched.freq_relative_interval = 1 Then 'First'
                    When sched.freq_relative_interval = 2 Then 'Second'
                    When sched.freq_relative_interval = 4 Then 'Third'
                    When sched.freq_relative_interval = 8 Then 'Fourth'
                    When sched.freq_relative_interval = 16 Then 'Last'
                End + 
                Case When sched.freq_interval = 1 Then ' Sunday'
                    When sched.freq_interval = 2 Then ' Monday'
                    When sched.freq_interval = 3 Then ' Tuesday'
                    When sched.freq_interval = 4 Then ' Wednesday'
                    When sched.freq_interval = 5 Then ' Thursday'
                    When sched.freq_interval = 6 Then ' Friday'
                    When sched.freq_interval = 7 Then ' Saturday'
                    When sched.freq_interval = 8 Then ' Day'
                    When sched.freq_interval = 9 Then ' Weekday'
                    When sched.freq_interval = 10 Then ' Weekend'
                + Case When sched.freq_recurrence_factor <> 0 
                        And sched.freq_recurrence_factor = 1 Then '; monthly'
                    When sched.freq_recurrence_factor <> 0 Then '; every ' 
                + Cast(sched.freq_recurrence_factor As varchar(10)) + ' months' End
            When sched.freq_type = 64 Then 'StartUp'
            When sched.freq_type = 128 Then 'Idle'
          End As 'frequency'
        , IsNull('Every ' + Cast(sched.freq_subday_interval As varchar(10)) + 
            Case When sched.freq_subday_type = 2 Then ' seconds'
                When sched.freq_subday_type = 4 Then ' minutes'
                When sched.freq_subday_type = 8 Then ' hours'
            End, 'Once') As 'subFrequency'
        , Replicate('0', 6 - Len(sched.active_start_time)) 
            + Cast(sched.active_start_time As varchar(6)) As 'startTime'
        , Replicate('0', 6 - Len(sched.active_end_time)) 
            + Cast(sched.active_end_time As varchar(6)) As 'endTime'
        , Replicate('0', 6 - Len(jobsched.next_run_time)) 
            + Cast(jobsched.next_run_time As varchar(6)) As 'nextRunTime'
        , Cast(jobsched.next_run_date As char(8)) As 'nextRunDate'
    From msdb.dbo.sysschedules As sched
    Join msdb.dbo.sysjobschedules As jobsched
        On sched.schedule_id = jobsched.schedule_id
    Where sched.enabled = 1
/* Finally, let's look at our actual jobs and tie it all together */
Select job.name As 'jobName'
    , sched.scheduleName
    , sched.frequency
    , sched.subFrequency
    , SubString(sched.startTime, 1, 2) + ':' 
        + SubString(sched.startTime, 3, 2) + ' - ' 
        + SubString(sched.endTime, 1, 2) + ':' 
        + SubString(sched.endTime, 3, 2) 
        As 'scheduleTime' -- HH:MM
    , SubString(sched.nextRunDate, 1, 4) + '/' 
        + SubString(sched.nextRunDate, 5, 2) + '/' 
        + SubString(sched.nextRunDate, 7, 2) + ' ' 
        + SubString(sched.nextRunTime, 1, 2) + ':' 
        + SubString(sched.nextRunTime, 3, 2) As 'nextRunDate'
      /* Note: the sysjobschedules table refreshes every 20 min, 
        so nextRunDate may be out of date */
    , 'Execute msdb.dbo.sp_update_job @job_id = ''' 
        + Cast(job.job_id As char(36)) + ''', @enabled = 0;' As 'disableScript'
    , art.avgRunTime As 'avgRunTime_inSec' -- in seconds
    , (art.avgRunTime / 60) As 'avgRunTime_inMin' -- convert to minutes
From msdb.dbo.sysjobs As job
Join myCTE As sched
    On job.job_id = sched.job_id
Left Join @avgRunTime As art
    On job.job_id = art.job_id
Where job.enabled = 1 -- do not display disabled jobs
Order By nextRunDate;

If this doesn’t do exactly what you want, check out SQLJobVis, which Ken recommended. It’s a free tool that helps visualize the job history of SQL jobs.

Disposable Indexes

Today I had to run an ad hoc query on a 8.5 billion row table. The table had a dozen columns of a variety of data types and was clustered on a bigint identity. There were no other indexes on the table. My query involved a join to a smaller table with a date range restriction. Without an adequate index to use, SQL Server was going to be forced to scan this 8.5 billion row table. Now, I don’t have much patience for waiting for long running queries. I want to run the ad hoc, e-mail the results, and forget about it. But short of adding a nonclustered index, which would take a very long time to build and probably require additional space requisitioned from the SAN team, what could I do? Enter disposable indexes. Now, you might be asking yourself, “What the frilly heck does she mean by a disposable index? Is that new in Denali?” No, dear reader. I am actually referring to filtered indexes, which is available in SQL Server 2008 and 2008 R2. I call them “disposable” because I create them to significantly speed up ad hoc queries, then I drop them when I’m done.

Here, allow me to demonstrate using the AdventureWorks2008R2 database. Although the tables are smaller, this query is very similar in structure to what I needed to run today.

Select Count(Distinct sod.SalesOrderID) As 'distinctCount'
From AdventureWorks2008R2.Sales.SalesOrderDetail As sod
Join AdventureWorks2008R2.Production.Product As p
    On sod.ProductID = p.ProductID
Where sod.ModifiedDate Between '2008-01-01' And '2008-07-31'
    And p.MakeFlag = 0;

Now, let’s take a look at the type of indexes we currently have available:

Select name, has_filter, filter_definition
From sys.indexes
Where object_id = object_id('Sales.SalesOrderDetail');
name                                                    has_filter filter_definition
PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID     0          NULL
AK_SalesOrderDetail_rowguid                             0          NULL
IX_SalesOrderDetail_ProductID                           0          NULL
(3 row(s) affected)

We need an index on ModifiedDate and ProductID, which it doesn’t look like we have currently. Without this, we’re going to end up scanning on the clustered index. That means SQL Server will have to evaluate each and every single row in the table to see if the row matches the criteria of our query. Not pretty, and certainly not fast. So instead, let’s create a filtered index on date. But we can greatly speed up the time it takes to create our filtered index by doing a little investigating upfront and finding a range of clustering key values that will cover the query. Doing this allows SQL Server to seek on the clustered index, greatly reducing the amount of reads necessary to create our filtered index. So let’s see this in action. First, let’s find out the current max value of the table:

Select Max(SalesOrderDetailID) As 'maxID'
From AdventureWorks2008R2.Sales.SalesOrderDetail;

Now we get to do a little guessing. Let’s go back and see what date we get if we look at half of the records:

Select SalesOrderDetailID, ModifiedDate
From AdventureWorks2008R2.Sales.SalesOrderDetail
Where SalesOrderDetailID = (121317/2);
SalesOrderDetailID ModifiedDate
------------------ -----------------------
60658              2007-11-01 00:00:00.000

Okay, SalesOrderDetailID 60658 gets us back to 11/1/2007. That’s a little too far. Let’s see how a SalesOrderDetailID value of 75000 does…

Select SalesOrderDetailID, ModifiedDate
From AdventureWorks2008R2.Sales.SalesOrderDetail
Where SalesOrderDetailID = 75000;
SalesOrderDetailID ModifiedDate
------------------ -----------------------
75000              2007-12-27 00:00:00.000

Okay, SalesOrderDetailID 75000 takes us back to 12/27/2007. That’s close enough to 1/1/2008 for my purposes. Of course, depending on the size of the table, in real life it may make sense to try to get closer to the value you’re looking for. But for now, this will do. And because we’re looking for data through the “current date” (7/31/2008 in the AdventureWorks2008R2 database), we already know our outer limit is 121317.

Now let’s take these ranges and create a filtered index that will cover our query:

Create Nonclustered Index IX_SalesOrderDetail_filtered
    On Sales.SalesOrderDetail(ModifiedDate, ProductID)
    Include (SalesOrderID)
    Where SalesOrderDetailID >= 75000
      And SalesOrderDetailID <  121317;

By having this range identified, SQL Server can perform a seek on the clustered index to create the nonclustered index on just the subset of records that you need for your query. Remember that 8.5 billion row table I mentioned earlier? I was able to create a filtered index that covered my query in 10 seconds. Yes, that’s right… 10 SECONDS.

The last thing we need to do is include our filtered index definition in our ad hoc query to ensure that the filtered index is used. It also doesn’t hurt to explicitly tell SQL Server to use the filtered index if you’re absolutely sure it’s the best index for the job.

Select Count(Distinct sod.SalesOrderID) As 'distinctCount'
From AdventureWorks2008R2.Sales.SalesOrderDetail As sod With (Index(IX_SalesOrderDetail_filtered))
Join AdventureWorks2008R2.Production.Product As p
    On sod.ProductID = p.ProductID
Where sod.ModifiedDate Between '2008-01-01' And '2008-07-31'
    And p.MakeFlag = 0
    And sod.SalesOrderDetailID >= 75000
    And sod.SalesOrderDetailID <  121317;

That’s all there is to it. Using this method, I was able to complete my ad hoc request in 40 seconds: 10 seconds to create the filtered index and 30 seconds to actually execute the ad hoc. Of course, it also took a couple of minutes to write the query, look at existing indexes, and search for the correct identity values. All in all, from the time I received the request to the time I send the e-mail was about 5 minutes. All because of disposable filtered indexes. How’s that for some SQL #awesomesauce? :)