Pages

Wednesday, November 25, 2009

Text Parsing in SQL Server

There are lots scenarios in database procedure we may required to parse the test. Most common scenario is to parse the comma separated text.

For example if we want to get the details of few employees, we will send those employee ids separated by a delimiter as a parameter to the stored procedure.

Till SQL Server 2000 we will be parsing the text character by character and then we will split according to the delimiter.

Following is the general split function which we can use in any version of SQL Server:

Function Code:

set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
go

ALTER FUNCTION [dbo].[FNSplitString]
(
-- Add the parameters for the function here
@string nvarchar(4000),
@delimiter nvarchar(10)
)
RETURNS
@StringValueTable TABLE
(
-- Add the column definitions for the TABLE variable here
Pos int IDENTITY(1,1),
StringValue nvarchar(4000)
)
AS
BEGIN
-- Fill the table variable with the rows for your result set
DECLARE @delimiterPos int
DECLARE @StringValue nvarchar(4000)

SET @string = @string + @delimiter

-- Loop thru all the characters
WHILE(LEN(@string)>0)
BEGIN

SET @delimiterPos = CHARINDEX(@delimiter,@string)

SET @StringValue = SUBSTRING(@string,0,@delimiterPos)

-- Insert the record into table variable
INSERT INTO @StringValueTable VALUES (@StringValue)

SET @string = SUBSTRING(@string,@delimiterPos+1,LEN(@string))

-- End the loop if there no more elements
IF @delimiterPos = 0
BEGIN
Break
END

END

RETURN
END

Test Sample:

SELECT * FROM [dbo].[FNSplitString]('a,bc,def,ghij,klmno,pqrstu,vw,x,yz',',')

With the new feature "CTE" in SQL Server 2005 and above we can implement the same very easily without any explicit looping mentioned in the above code.

Following is the split function using CTE:

Function Code:

set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
go

CREATE FUNCTION [dbo].[FNSplitString_New]
(
-- Add the parameters for the function here
@string nvarchar(4000),
@delimiter nvarchar(1)
)
RETURNS
@StringValueTable TABLE
(
-- Add the column definitions for the TABLE variable here
Pos int IDENTITY(1,1),
StringValue nvarchar(4000)
)
AS
BEGIN

Declare @SourceString nvarchar(MAX)

SET @SourceString = ',' + @string + ',';

-- Query using CTE
WITH CTE (Indx, Position) AS
(
SELECT CHARINDEX(@delimiter, @SourceString) Indx, 1 AS Position
UNION ALL
SELECT CHARINDEX(@delimiter, @SourceString, Indx+1) , Position + 1
FROM CTE WHERE CHARINDEX(@delimiter, @SourceString , Indx+1) <> 0
)
INSERT INTO @StringValueTable(StringValue)
SELECT SUBSTRING(@SourceString, B.Indx+1, A.Indx - B.Indx - 1) Val
FROM CTE A INNER JOIN CTE B ON A.Position = B.Position + 1
OPTION (MAXRECURSION 5000)

-- In the above query recursive option is limited to 5000
-- By default it is 100, which means we cannot loop recursively more than 100 times

RETURN
END

Test Sample:

SELECT * FROM [dbo].[FNSplitString_New]('a,bc,def,ghij,klmno,pqrstu,vw,x,yz',',')

In most of our applications we are using first approach as we are reusing from long time. Now we started taking the power of new features like CTE in our regular design and development.

We can also think about passing the 'n' parameter values to a procedure as an xml. And by using XML query capability we can get the individual values as part of the query itself.

Tuesday, November 17, 2009

SQL Server Table Variables in Transactions

Today in one of the forum I found an intersting question to execute few SQL Statments as not part of the enclosed the transaction.

Let us consider we have a following Scenario:
We have lots of statements participating in the transaction as

Begin Transaction

SQL Inerts Statement 1
Log a message in Log Table
SQL Update Statement 1
Log a message in Log Table
SQL Delete Statement 1

Commit Transaction / Rollback Transaction

In the above block if the transaction is rollbacked then we will be loosing the enteries in Log table too. But if we need the data available in the Log tables without participating in the transaction...

Now the Question is How we can acheive it...?

Immediately after see the question, most of the people thinks about using inner transaction. we have to remember that:

"Committing inner transactions is ignored by the SQL Server Database Engine. The transaction is either committed or rolled back based on the action taken at the end of the outermost transaction. If the outer transaction is committed, the inner nested transactions are also committed. If the outer transaction is rolled back, then all inner transactions are also rolled back, regardless of whether or not the inner transactions were individually committed."

So the Inner/Nested Transaction are ruled out... Now we have to think that is there any way we can get the solution...?

Yes, we can do that by using Local Table Variables. One of the importatnt behaviour of the Local Table Variable will not have any impact of the transaction. Which means irrespective of the Commit/Rollback there will not be any change in the local table variables.

Following the sample demomstrating the same:

-- Create a Sample Table to test
Create Table tblSample(col1 varchar(10))

-- declare log table variable
Declare @Tbl table(Col1 varchar(10))

--
Begin TRansaction

-- Insert Data into table variable
INSERT INTO @Tbl values ('A11')

-- Insert data into actual table
INSERT INTO tblSample
SELECT * FROM
(
Select 'A' Col1
UNION ALL
Select 'B' Col1
) T

-- Insert data into table variable
INSERT INTO @Tbl values ('B11')

rollback Transaction
--commit Transaction

SELECT * from @Tbl
SELECT * from tblSample

Run the above block once with rollback and second time with commit transaction, you will find the same data in the log table @tbl.

Hope this provides some additonal info on local table variables and how we can use in these scenarios.