2
votes

Is there a ultimate solution for fixing the following error:

Msg 9455, Level 16, State 1, Line 8
XML parsing: line 1, character 12, illegal qualified name character

which is raised when converting "illegal" NVARCHAR string to XML.

For example:

DECLARE @Text NVARCHAR(MAX)
SET @Text =  '<tag>' + 'Test <'  + '</tag>' 
SELECT CAST(@Text AS XML)

Can be fixed using CDATA but it is handling few symbols only. In situations like the query below:

DECLARE @Text NVARCHAR(MAX)
SET @Text = '<tag><![CDATA[' + 'Test' + CHAR(28)  + ']]></tag>' -- CHAR(28) --CHAR(8) -- CHAR(29) -- CHAR(63) 
SELECT CAST(@Text AS XML)

it does nothing. Also, it cannot be used in the XML attribute value.

I have try to find a list with all symbols that are breaking XML but I was not able to do. So each time some symbol breaks the XML I am finding it and replacing it, but this is very temporary and hard to maintain solution.

Is there a complete solution for such cases - no matter if it should be done in the application or using CLR function?

1

1 Answers

4
votes

It would depend on the XML version but to be on the safer side, one thing you can do is replace all the C0 control ASCII characters in a loop:

DECLARE @counter INT=0
DECLARE @Text NVARCHAR(MAX)
SET @Text = '<tag><![CDATA[' + 'Test' + CHAR(28) + CHAR(55) + CHAR(29) + '<' + ']]></tag>' -- CHAR(28) --CHAR(8) -- CHAR(29) -- CHAR(63) 
WHILE @counter<32
BEGIN
SET @text= REPLACE(@text,CHAR(@counter),'?')
SET @counter=@counter+1
END
SELECT CAST(@Text AS XML)

more info about XML char set:

http://www.w3.org/TR/xml11/#charsets

http://en.wikipedia.org/wiki/Valid_characters_in_XML

http://en.wikipedia.org/wiki/C0_and_C1_control_codes