String Parsing Function

This handy little script parses a string and returns the results as a table. I know there are a ton of string parsing functions out there, but I thought I’d add to the list. 😉

The basic logic of it (using a CTE) is derived from a forum post I found years ago. The table-valued UDF, delimiter, etc. is all stuff that I added. So thus, while I can not claim complete credit, I still thought it’d be worthwhile to share. Plus this will be used in another script I will be posting soon. 🙂

/* Let's create our parsing function... */
CREATE FUNCTION dbo.dba_parseString_udf
(
          @stringToParse VARCHAR(8000)  
        , @delimiter     CHAR(1)
)
RETURNS @parsedString TABLE (stringValue VARCHAR(128))
AS
/*********************************************************************************
    Name:       dba_parseString_udf
 
    Author:     Michelle Ufford, http://sqlfool.com
 
    Purpose:    This function parses string input using a variable delimiter.
 
    Notes:      Two common delimiter values are space (' ') and comma (',')
 
    Date        Initials    Description
    ----------------------------------------------------------------------------
    2011-05-20  MFU         Initial Release
*********************************************************************************
Usage: 		
    SELECT *
	FROM dba_parseString_udf(<string>, <delimiter>);
 
Test Cases:
 
    1.  multiple strings separated by space
        SELECT * FROM dbo.dba_parseString_udf('  aaa  bbb  ccc ', ' ');
 
    2.  multiple strings separated by comma
        SELECT * FROM dbo.dba_parseString_udf(',aaa,bbb,,,ccc,', ',');
*********************************************************************************/
BEGIN
 
    /* Declare variables */
    DECLARE @trimmedString  VARCHAR(8000);
 
    /* We need to trim our string input in case the user entered extra spaces */
    SET @trimmedString = LTRIM(RTRIM(@stringToParse));
 
    /* Let's create a recursive CTE to break down our string for us */
    WITH parseCTE (StartPos, EndPos)
    AS
    (
        SELECT 1 AS StartPos
            , CHARINDEX(@delimiter, @trimmedString + @delimiter) AS EndPos
        UNION ALL
        SELECT EndPos + 1 AS StartPos
            , CharIndex(@delimiter, @trimmedString + @delimiter , EndPos + 1) AS EndPos
        FROM parseCTE
        WHERE CHARINDEX(@delimiter, @trimmedString + @delimiter, EndPos + 1) <> 0
    )
 
    /* Let's take the results and stick it in a table */  
    INSERT INTO @parsedString
    SELECT SUBSTRING(@trimmedString, StartPos, EndPos - StartPos)
    FROM parseCTE
    WHERE LEN(LTRIM(RTRIM(SUBSTRING(@trimmedString, StartPos, EndPos - StartPos)))) > 0
    OPTION (MaxRecursion 8000);
 
    RETURN;   
END
0saves
If you enjoyed this post, please consider leaving a comment or subscribing to the RSS feed to have future articles delivered to your feed reader.
Tagged , , , , , , . Bookmark the permalink.

7 Responses to String Parsing Function

  1. Pingback: Log Buffer #221, A Carnival of the Vanities for DBAs | The Pythian Blog

  2. Pingback: Log Buffer #221, A Carnival of the Vanities for DBAs | The Pythian Blog

  3. Michael says:

    This is exactly what I have been looking for!!! Thanks for posting it!

  4. Jeff Stevens says:

    Michelle, this is perfect for what I needed to do, and by far the best example I came across. Thank you for sharing it.

    Jeff.

  5. Miles says:

    Michelle,
    As Jeff said: “this is perfect for what I needed to do, and by far the best example I came across. Thank you for sharing it.”
    I’m using this in my project with full credit given to the author.

    Thanks again!

  6. Sean Lange says:

    This has two major problems. It is using a recursive cte for counting which is really a loop behind the scenes. The other big concern is that this is a multi-statement table valued function. The performance of mstvf is often far worse than a scalar function. There several options available that are for more performant than this. http://sqlperformance.com/2012/07/t-sql-queries/split-strings OR this one http://www.sqlservercentral.com/articles/Tally+Table/72993/

  7. nickE says:

    Modified this to use PATINDEX to allow usage of input of more than 8000 characters – also allow for multi character delimiter.

    Slower than Michelle’s version but I needed a larger input capacity and performance was fast enough for my needs…

    ALTER FUNCTION [dbo].[dba_parseString_udf]
    (
    @stringToParse VARCHAR(max)
    , @delimiter varchar(100)
    )
    RETURNS @parsedString TABLE (stringValue VARCHAR(max))
    AS
    /*
    *********************************************************************************
    Usage:
    SELECT *
    FROM dba_parseString_udf(, );

    Test Cases:

    1. multiple strings separated by space
    SELECT * FROM dbo.dba_parseString_udf(‘ aaa bbb ccc ‘, ‘ ‘);

    2. multiple strings separated by comma
    SELECT * FROM dbo.dba_parseString_udf(‘,aaa,bbb,,,ccc,’, ‘,’);
    ********************************************************************************
    */
    BEGIN
    /* Declare variables */
    DECLARE @trimmedString VARCHAR(max);
    /* Trim string input in case the user entered extra spaces */
    SET @trimmedString = LTRIM(RTRIM(@stringToParse));
    /* Create a recursive CTE to break down our string for us */
    WITH parseCTE (StartPos , EndPos)
    AS
    (
    SELECT cast (1 as bigint) AS StartPos
    , PATINDEX(‘%’ + @delimiter + ‘%’, @trimmedString + @delimiter ) AS EndPos
    UNION ALL
    SELECT EndPos + 1 AS StartPos
    , PATINDEX(
    ‘%’ + @delimiter + ‘%’
    , substring(
    @trimmedString + @delimiter
    ,EndPos + 1
    ,len(@trimmedString) – (EndPos + 1)
    )
    ) + EndPos AS EndPos
    FROM parseCTE
    WHERE PATINDEX(
    ‘%’ + @delimiter + ‘%’
    , substring(
    @trimmedString + @delimiter
    ,EndPos + 1
    ,len(@trimmedString) – (EndPos + 1)
    )
    ) 0
    )
    /* Take the results and stick it in a table */
    INSERT INTO @parsedString
    SELECT SUBSTRING(@trimmedString, StartPos, EndPos – StartPos)
    FROM parseCTE
    WHERE LEN(LTRIM(RTRIM(SUBSTRING(@trimmedString, StartPos, EndPos – StartPos)))) > 0
    OPTION (MaxRecursion 15000); –MaxRecursion = 15000 means we can get a resultset with 15,001 rows…
    RETURN;
    END

Leave a Reply

Your email address will not be published. Required fields are marked *