BCP Basics

In this blog post, I'm going to walk through the basics of BCP (Bulk Copy Program). BCP is a utility that installs with SQL Server and can assist with large data transfers. Let's see what parameter options are available to use. From the command line on a machine with SQL Server installed, type "bcp" and press Enter. You can find out more information on BCP parameters on Books Online: http://msdn.microsoft.com/en-us/library/ms162802.aspx

For now, we're going to examine just the basics. The simplest syntax of a BCP command is:

    bcp databaseName.Schema.TableName *or* "Query"
        in, out, *or* queryout
        -S ServerName\instanceName
        -U userName -P password *or* -T
        -c *or* -n *or* specify storage information for each column

Let's look at these options in a little more detail:

databaseName.Schema.TableName *or* Query
You can specify either an entire table to copy or a query. The query should be surrounded in quotation marks and must also include the fully qualified table name.

in, out, *or* queryout
in = import, out = full table export, queryout = query to select data for export

-U userName -P password *or* -T
You can either specify a specific account to access SQL Server, or use -T to indicate Trusted Connection (i.e. Windows … Continue reading
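To make the syntax above concrete, here are a few sample invocations; the server, database, and file names are made up for illustration, and all three use -T (trusted connection) and -c (character format):

```
rem Export a full table to a flat file
bcp AdventureWorks.Sales.Currency out currency.dat -S MYSERVER\SQL2005 -T -c

rem Import that flat file back into a table
bcp AdventureWorks.Sales.Currency in currency.dat -S MYSERVER\SQL2005 -T -c

rem Export the results of a query
bcp "SELECT TOP 10 * FROM AdventureWorks.Sales.Currency" queryout currencyTop10.dat -S MYSERVER\SQL2005 -T -c
```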

Replication Monitor

In my last blog post, I provided a script to view replication latency. Ian Kirk took the script and ran with it, adding centralized execution and permanent logging. I've tweaked it a little bit further and deployed to production. So far, so good. Here's the latest and greatest for those interested:

    If Object_ID('dbo.dba_replicationMonitor') Is Null
    Begin
        Create Table dbo.dba_replicationMonitor
        (
              monitor_id            int Identity(1,1)   Not Null
            , monitorDate           smalldatetime       Not Null
            , publicationName       sysname             Not Null
            , publicationDB         sysname             Not Null
            , iteration             int                 Null
            , tracer_id             int                 Null
            , distributor_latency   int                 Null
            , subscriber            varchar(1000)       Null
            , subscriber_db         varchar(1000)       Null
            , subscriber_latency    int                 Null
            , overall_latency       int                 Null
        );
    End;

    If ObjectProperty(Object_ID('dbo.dba_replicationLatencyMonitor_sp'), N'IsProcedure') = 1
    Begin
        Drop Procedure dbo.dba_replicationLatencyMonitor_sp;
        Print 'Procedure dba_replicationLatencyMonitor_sp dropped';
    End;
    Go

    Set Quoted_Identifier On
    Go
    Set ANSI_Nulls On
    Go

    Create Procedure dbo.dba_replicationLatencyMonitor_sp

        /* Declare Parameters */
          @publicationToTest    sysname     = N'yourPublicationName'
        , @publicationDB        sysname     = N'yourPublicationDB'
        , @replicationDelay     varchar(10) = N'00:00:30'
        , @iterations           int         = 5
        , @iterationDelay       varchar(10) = N'00:00:30'
        , @displayResults       bit         = 0
        , @deleteTokens         bit         = 1
    As
    /*********************************************************************************
        Name:       dba_replicationLatencyMonitor_sp
        Author:     Michelle F. Ufford
        Purpose:    Retrieves the amount of replication latency in seconds
        Notes:      Default … Continue reading
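Once the table and procedure are in place, a call might look something like this; the publication names are placeholders, and @displayResults is flipped on here for an ad hoc check rather than scheduled logging:

```sql
Execute dbo.dba_replicationLatencyMonitor_sp
      @publicationToTest = N'myPublication'
    , @publicationDB     = N'myPublicationDB'
    , @iterations        = 3
    , @displayResults    = 1;
```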

Checking Replication Latency with T-SQL

This post may only appeal to a small subset of DBAs. This particular script was born of a need to view replication latency in production. I had to investigate whether the replication latency issues we were experiencing in production were occurring on the publisher (us) or the subscriber (them), and address them accordingly. "Why not just use Replication Monitor?", you might ask. Good question! I would definitely use Replication Monitor if it were available to me; however, I do not have the necessary permissions on the Distributor. Fortunately, SQL Server 2005 provides us with some tools to help: sp_postTracerToken and sp_helpTracerTokenHistory. These allow us to access the same Tracer Tokens feature that is in Replication Monitor.

So that's the backstory, and here's the script I came up with. For those interested, it'd be pretty easy to modify this script to use a permanent table and regularly log replication latency.

    Create Procedure dbo.dba_replicationLatencyGet_sp

        /* Declare Parameters */
          @publicationToTest    sysname     = N'yourPublicationName'
        , @replicationDelay     varchar(10) = N'00:00:30'
        , @iterations           int         = 5
        , @iterationDelay       varchar(10) = N'00:00:30'
        , @deleteTokens         bit         = 1
        , @deleteTempTable      bit         = 1
    As
    /*********************************************************************************
        Name:       dba_replicationLatencyGet_sp
        Author:     Michelle F. Ufford
        Purpose:    Retrieves the amount of replication … Continue reading
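For anyone who just wants to see the underlying mechanics, here's a minimal, hand-rolled sketch of the tracer-token round trip those two procedures provide; run it in the publication database, and note the publication name is a placeholder:

```sql
Declare @tracer_id int;

/* Post a tracer token to the publication; its ID comes back in @tracer_id */
Execute sys.sp_posttracertoken
      @publication     = N'yourPublicationName'
    , @tracer_token_id = @tracer_id Output;

/* Give the token time to travel from publisher to distributor to subscriber */
WaitFor Delay '00:00:30';

/* View the latency recorded for that token */
Execute sys.sp_helptracertokenhistory
      N'yourPublicationName'
    , @tracer_id;
```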

Bulk Inserts with XML

Last week, I blogged about how to perform one-to-many inserts with table-valued parameters, a feature new in SQL Server 2008. For those who do not yet have 2008 or will not have it in the near future, it may still be beneficial to use XML for bulk inserts. Here's a pretty simple example of how to accomplish this:

    /* Create some tables to work with */
    CREATE TABLE dbo.orders
    (
          order_id      INT IDENTITY(1,1)   Not Null
        , orderDate     DATETIME            Not Null
        , customer_id   INT                 Not Null

        CONSTRAINT PK_orders
            PRIMARY KEY CLUSTERED(order_id)
    );

    CREATE TABLE dbo.orderDetails
    (
          orderDetail_id    INT IDENTITY(1,1)   Not Null
        , order_id          INT                 Not Null
        , lineItem          INT                 Not Null
        , product_id        INT                 Not Null

        CONSTRAINT PK_orderDetails
            PRIMARY KEY CLUSTERED(orderDetail_id)

        CONSTRAINT FK_orderDetails_orderID
            FOREIGN KEY(order_id)
            REFERENCES dbo.orders(order_id)
    );
    Go

    /* Create a new procedure using an XML parameter */
    CREATE PROCEDURE dbo.insert_orderXML_sp
          @orderDate        DATETIME
        , @customer_id      INT
        , @orderDetailsXML  XML
    AS
    BEGIN

        SET NOCOUNT ON;

        DECLARE @myOrderID INT;

        INSERT INTO dbo.orders
        (
              orderDate
            , customer_id
        )
        VALUES
        (
              @orderDate
            , @customer_id
        );

        SET @myOrderID = SCOPE_IDENTITY();

        INSERT INTO dbo.orderDetails
        (
              order_id
            , lineItem
            , product_id
        )
        SELECT @myOrderID
            , myXML.value('./@lineItem', 'int')
            , myXML.value('./@product_id', … Continue reading
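The excerpt cuts off mid-procedure, but a call to it would look roughly like this; the attribute names match the proc's .value() calls, and the XML payload and parameter values here are made up for illustration:

```sql
Declare @details xml;

/* One parent order with two child line items, packed into a single XML value */
Set @details = N'<order>
    <orderDetail lineItem="1" product_id="101" />
    <orderDetail lineItem="2" product_id="205" />
</order>';

Execute dbo.insert_orderXML_sp
      @orderDate       = '2008-09-01'
    , @customer_id     = 42
    , @orderDetailsXML = @details;
```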

Updated Index Defrag Script (2005, 2008)

Thanks to everyone who left a comment or sent me an e-mail regarding the Index Defrag Script. I've received some great feedback and requests for features. I've also had some questions regarding how to use it, which I will answer at the end of this post.

Changes include:
- separate version for both Enterprise and Standard editions
    - Standard edition removes partitioning and online options
- output option to see fragmentation levels
- page_count added to the log table

I've also verified that this script works well in SQL 2008.

Enterprise Version:

    IF EXISTS(SELECT OBJECT_ID FROM sys.tables
                WHERE [name] = N'dba_indexDefragLog')
    BEGIN
        DROP TABLE dbo.dba_indexDefragLog;
        PRINT 'dba_indexDefragLog table dropped!';
    END

    CREATE TABLE dbo.dba_indexDefragLog
    (
          indexDefrag_id    INT IDENTITY(1,1)   NOT NULL
        , objectID          INT                 NOT NULL
        , objectName        NVARCHAR(130)       NOT NULL
        , indexID           INT                 NOT NULL
        , indexName         NVARCHAR(130)       NOT NULL
        , partitionNumber   SMALLINT            NOT NULL
        , fragmentation     FLOAT               NOT NULL
        , page_count        INT                 NOT NULL
        , dateTimeStart     DATETIME            NOT NULL
        , durationSeconds   INT                 NOT NULL
        CONSTRAINT PK_indexDefragLog
            PRIMARY KEY CLUSTERED (indexDefrag_id)
    );

    PRINT 'dba_indexDefragLog Table Created';

    IF OBJECTPROPERTY(OBJECT_ID('dbo.dba_indexDefrag_sp'), N'IsProcedure') IS NULL
    BEGIN
        EXECUTE ('Create Procedure dbo.dba_indexDefrag_sp As Print ''Hello World!''');
        RAISERROR('Procedure dba_indexDefrag_sp created.', 10, … Continue reading
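If you just want to eyeball fragmentation before running the defrag proc, you can query the same DMV the script is built on directly; a minimal sketch, where the thresholds (10% fragmentation, 8 pages) are illustrative rather than the script's actual defaults:

```sql
Select Object_Name(ips.object_id) As objectName
    , ips.index_id
    , ips.partition_number
    , ips.avg_fragmentation_in_percent
    , ips.page_count
From sys.dm_db_index_physical_stats(DB_ID(), Null, Null, Null, 'Limited') As ips
Where ips.avg_fragmentation_in_percent >= 10
    And ips.page_count >= 8
Order By ips.avg_fragmentation_in_percent Desc;
```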

Performance Comparison of Singleton, XML, and TVP Inserts

As promised, today I took a look at the performance of bulk inserts using XML and table-valued parameters. I also compared them against singleton inserts to show the value of the bulk-insert approach.

My tests were pretty simple: insert 100 records using each method. Each test was executed 10 times to ensure consistency. The duration was recorded in microseconds. The goal was to compare the performance of the inserts. Because I was executing this entire test within SQL Server, I had to isolate only the actual insert transactions and ignore everything else, such as the loading of the data; that work would normally be performed by the calling application.

So without further ado… screenshots of the Profiler traces: (click to enlarge)

Summary

    Method              Avg CPU    Avg Reads    Avg Writes    Avg Duration (micro)
    Singleton Method    3          202          0             13378
    XML Method          0          222          0             3124
    TVP Method          1          207          0             780

As expected, both the XML and the TVP methods performed significantly better than the singleton-insert method. As hoped, the table-valued parameter method arguably performed the best of all three.

One-to-Many Inserts with Table-Valued Parameters

There's been much talk about table-valued parameters, but I've yet to see an example illustrating one of the greatest potential benefits of this new feature (at least, imho): one-to-many inserts. But first, for those not yet up to speed on this new feature…

What are Table-Valued Parameters?

According to SQL Server 2008 Books Online:

    Table-valued parameters are a new parameter type in SQL Server 2008. Table-valued parameters are declared by using user-defined table types. You can use table-valued parameters to send multiple rows of data to a Transact-SQL statement or a routine, such as a stored procedure or function, without creating a temporary table or many parameters.

That's great, but why do I care?

This means that a single proc call with table-valued parameters can insert numerous records into one or more tables; for example, inserting a parent record into TableA (header) and several related child records into TableB (details). This has great potential when used with .NET, e.g. a web service that sends sales data to SQL Server. In SQL Server 2005, this can be accomplished by using XML and temp tables. My hope is that this new feature will outperform the XML method. In my next post, I'll follow up with a … Continue reading
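Here's roughly what the one-to-many pattern looks like with a table-valued parameter. This is my own sketch, not the post's exact code: the type and proc names are made up, and it assumes header/detail tables like the dbo.orders and dbo.orderDetails tables from my XML post:

```sql
/* A user-defined table type to carry the detail rows */
CREATE TYPE dbo.orderDetailsType AS TABLE
(
      lineItem      INT NOT NULL
    , product_id    INT NOT NULL
);
GO

/* One proc call inserts the parent row and all of its child rows */
CREATE PROCEDURE dbo.insert_orderTVP_sp
      @orderDate    DATETIME
    , @customer_id  INT
    , @orderDetails dbo.orderDetailsType READONLY
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @myOrderID INT;

    INSERT INTO dbo.orders (orderDate, customer_id)
    VALUES (@orderDate, @customer_id);

    SET @myOrderID = SCOPE_IDENTITY();

    INSERT INTO dbo.orderDetails (order_id, lineItem, product_id)
    SELECT @myOrderID, lineItem, product_id
    FROM @orderDetails;
END;
```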

Large Updates on Replicated Tables

Late last night, I executed a 70mm update on a replicated table. This was a somewhat low-priority update, so the primary goal (aside from the data change) was to avoid any issues on the publishing and subscribing servers, including replication latency errors. I have performed many large updates in the past, but this was the first on a replicated table.

To minimize impact, one of our system DBAs suggested the use of a replicated stored procedure. Basically, instead of replicating each of the 70mm updates as a singleton transaction, the replicated stored procedure is called on the subscribing server, which then performs the bulk update locally. This was my first time using it, and it worked beautifully.

Another of my colleagues, Jeff M., suggested the use of a control table. Normally, I would just output the last affected ID and update the script manually. However, this also worked so well that it will be adopted in all future update scripts.

Using the following pseudo-script and a replicated stored procedure, I was able to execute the update on 70mm records in 11.5 hours with *zero* impact on the servers and replication. I could've probably increased the batch size and reduced … Continue reading
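The actual pseudo-script is cut off above, but the general shape of a batched update driven by a control table is something like the following. This is an illustrative sketch only: the table names, column names, and batch size are all made up, and the real script dispatched the change through the replicated stored procedure rather than a plain UPDATE:

```sql
/* Control table remembers the last key processed, so the job can resume */
Create Table dbo.updateControl (lastID int Not Null);
Insert Into dbo.updateControl Select 0;

Declare @batchSize int, @lastID int, @maxID int;
Select @batchSize = 10000;
Select @maxID = Max(myID) From dbo.myBigTable;
Select @lastID = lastID From dbo.updateControl;

While @lastID < @maxID
Begin
    Update dbo.myBigTable
    Set someColumn = 'newValue'
    Where myID > @lastID
        And myID <= @lastID + @batchSize;

    Select @lastID = @lastID + @batchSize;

    /* Persist progress so a restart picks up where we left off */
    Update dbo.updateControl Set lastID = @lastID;

    /* Breathe between batches so replication can keep up */
    WaitFor Delay '00:00:05';
End;
```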

SQL Quiz – DBA Mistakes

Yesterday, Jason Massie called me out to answer the SQL Quiz begun by Chris Shaw. The goal is to share a mistake you've made and what you've learned from it; hopefully, others will learn vicariously through you and avoid the same mistakes.

I didn't have to think too hard about this one. I've made some small mistakes in my time as a DBA, but only one really big one. You may have noticed my fondness for partitioning. My first partitioning project was to transition some very large existing tables to a new partitioning scheme. The plan was to stage the new partitioned schema and the newly partitioned data in the staging database, then perform a database rename. To make things easier on myself, the script I was using in DEV and TEST first dropped the tables and then rebuilt them. (I'm sure by now you know where this is going…)

One morning, while still in the midst of development, I sat down at my desk, opened up the script, and pressed F5. My first clue that I was in the wrong environment was the errors about dropping replicated tables. Ultimately, the failure to drop replicated tables was what prevented this from … Continue reading

Row Concatenation in T-SQL

Have you ever needed to create a comma-delimited list of related records, grouped by a parent record? This is one of those business requirements that just happens to pop up every so often. For example, turning this:

    AuthorID    BookName
    1           "Great Expectations"
    1           "David Copperfield"
    1           "Bleak House"
    2           "Divine Comedy"
    3           "Pride and Prejudice"
    3           "Mansfield Park"

into this:

    AuthorID    ListOfBooks
    1           "Great Expectations", "David Copperfield", "Bleak House"
    2           "Divine Comedy"
    3           "Pride and Prejudice", "Mansfield Park"

There's a handful of ways you can handle this, especially when dealing with small data sets. However, the problem becomes a bit more tricky when dealing with large record sets (i.e. hundreds of thousands or even millions of records). My first attempt at a solution used a CTE (common table expression). It did the job but was very slow. Looking for a better solution, I discovered the XML method. To give you a quick example:

    /* Create a table variable to play with */
    Declare @myTable Table (customerID int, textData varchar(10));

    /* Populate some test records */
    Insert Into @myTable
    Select 1, 'abc' Union All
    Select 1, 'def' Union All
    Select 2, 'uvw' Union All
    Select 2, … Continue reading
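The excerpt cuts off before the punch line, so here's the general shape of the XML concatenation trick on a similar table variable. This is my own sketch of the well-known FOR XML PATH pattern, not necessarily the post's exact code:

```sql
Declare @myTable Table (customerID int, textData varchar(10));

Insert Into @myTable
Select 1, 'abc' Union All
Select 1, 'def' Union All
Select 2, 'uvw' Union All
Select 2, 'xyz';

/* Concatenate each customer's rows into one comma-delimited string;
   Stuff() strips the leading ", " from each concatenated list */
Select customerID
    , Stuff((Select ', ' + textData
             From @myTable As t2
             Where t2.customerID = t1.customerID
             For XML Path('')), 1, 2, '') As textList
From @myTable As t1
Group By customerID;
```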