This handy little script parses a string and returns the results as a table. I know there are a ton of string parsing functions out there, but I thought I’d add to the list. 😉
The basic logic of it (using a CTE) is derived from a forum post I found years ago. The table-valued UDF, delimiter, etc. is all stuff that I added. So thus, while I can not claim complete credit, I still thought it’d be worthwhile to share. Plus this will be used in another script I will be posting soon. 🙂
/* Let's create our parsing function... */ CREATE FUNCTION dbo.dba_parseString_udf ( @stringToParse VARCHAR(8000) , @delimiter CHAR(1) ) RETURNS @parsedString TABLE (stringValue VARCHAR(128)) AS /********************************************************************************* Name: dba_parseString_udf Author: Michelle Ufford, http://sqlfool.com Purpose: This function parses string input using a variable delimiter. Notes: Two common delimiter values are space (' ') and comma (',') Date Initials Description ---------------------------------------------------------------------------- 2011-05-20 MFU Initial Release ********************************************************************************* Usage: SELECT * FROM dba_parseString_udf(<string>, <delimiter>); Test Cases: 1. multiple strings separated by space SELECT * FROM dbo.dba_parseString_udf(' aaa bbb ccc ', ' '); 2. multiple strings separated by comma SELECT * FROM dbo.dba_parseString_udf(',aaa,bbb,,,ccc,', ','); *********************************************************************************/ BEGIN /* Declare variables */ DECLARE @trimmedString VARCHAR(8000); /* We need to trim our string input in case the user entered extra spaces */ SET @trimmedString = LTRIM(RTRIM(@stringToParse)); /* Let's create a recursive CTE to break down our string for us */ WITH parseCTE (StartPos, EndPos) AS ( SELECT 1 AS StartPos , CHARINDEX(@delimiter, @trimmedString + @delimiter) AS EndPos UNION ALL SELECT EndPos + 1 AS StartPos , CharIndex(@delimiter, @trimmedString + @delimiter , EndPos + 1) AS EndPos FROM parseCTE WHERE CHARINDEX(@delimiter, @trimmedString + @delimiter, EndPos + 1) <> 0 ) /* Let's take the results and stick it in a table */ INSERT INTO @parsedString SELECT SUBSTRING(@trimmedString, StartPos, EndPos - StartPos) FROM parseCTE WHERE LEN(LTRIM(RTRIM(SUBSTRING(@trimmedString, StartPos, EndPos - StartPos)))) > 0 OPTION (MaxRecursion 8000); RETURN; END |
Pingback: Log Buffer #221, A Carnival of the Vanities for DBAs | The Pythian Blog
Pingback: Log Buffer #221, A Carnival of the Vanities for DBAs | The Pythian Blog
This is exactly what I have been looking for!!! Thanks for posting it!
Michelle, this is perfect for what I needed to do, and by far the best example I came across. Thank you for sharing it.
Jeff.
Michelle,
As Jeff said: “this is perfect for what I needed to do, and by far the best example I came across. Thank you for sharing it.”
I’m using this in my project with full credit given to the author.
Thanks again!
This has two major problems. It is using a recursive cte for counting which is really a loop behind the scenes. The other big concern is that this is a multi-statement table valued function. The performance of mstvf is often far worse than a scalar function. There several options available that are for more performant than this. http://sqlperformance.com/2012/07/t-sql-queries/split-strings OR this one http://www.sqlservercentral.com/articles/Tally+Table/72993/
Modified this to use PATINDEX to allow usage of input of more than 8000 characters – also allow for multi character delimiter.
Slower than Michelle’s version but I needed a larger input capacity and performance was fast enough for my needs…
ALTER FUNCTION [dbo].[dba_parseString_udf]
(
@stringToParse VARCHAR(max)
, @delimiter varchar(100)
)
RETURNS @parsedString TABLE (stringValue VARCHAR(max))
AS
/*
*********************************************************************************
Usage:
SELECT *
FROM dba_parseString_udf(, );
Test Cases:
1. multiple strings separated by space
SELECT * FROM dbo.dba_parseString_udf(‘ aaa bbb ccc ‘, ‘ ‘);
2. multiple strings separated by comma
SELECT * FROM dbo.dba_parseString_udf(‘,aaa,bbb,,,ccc,’, ‘,’);
********************************************************************************
*/
BEGIN
/* Declare variables */
DECLARE @trimmedString VARCHAR(max);
/* Trim string input in case the user entered extra spaces */
SET @trimmedString = LTRIM(RTRIM(@stringToParse));
/* Create a recursive CTE to break down our string for us */
WITH parseCTE (StartPos , EndPos)
AS
(
SELECT cast (1 as bigint) AS StartPos
, PATINDEX(‘%’ + @delimiter + ‘%’, @trimmedString + @delimiter ) AS EndPos
UNION ALL
SELECT EndPos + 1 AS StartPos
, PATINDEX(
‘%’ + @delimiter + ‘%’
, substring(
@trimmedString + @delimiter
,EndPos + 1
,len(@trimmedString) – (EndPos + 1)
)
) + EndPos AS EndPos
FROM parseCTE
WHERE PATINDEX(
‘%’ + @delimiter + ‘%’
, substring(
@trimmedString + @delimiter
,EndPos + 1
,len(@trimmedString) – (EndPos + 1)
)
) 0
)
/* Take the results and stick it in a table */
INSERT INTO @parsedString
SELECT SUBSTRING(@trimmedString, StartPos, EndPos – StartPos)
FROM parseCTE
WHERE LEN(LTRIM(RTRIM(SUBSTRING(@trimmedString, StartPos, EndPos – StartPos)))) > 0
OPTION (MaxRecursion 15000); –MaxRecursion = 15000 means we can get a resultset with 15,001 rows…
RETURN;
END