2
votes

I have an XML file that contains special characters such as & and whitespace.
How can I handle these special characters in XSL?

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="cpdhtml.xsl"?>
<pmd-cpd>
    <duplication lines="72" tokens="75">
        <file line="632" path="M:\PBA0039 & Code\Common\ssc\src\Main.c"/>
        <file line="1802" path="M:\PBA0039 & Code\Common\ssc\src\link1.c"/>
    </duplication>
</pmd-cpd>

As you can see, the path attribute contains an unescaped &, which causes an error when the XML is transformed.
How can I fix this?

3
That's not well-formed XML; you need to fix the unescaped & at the source before you can process it with XSLT. - Ian Roberts
@IanRoberts This XML is generated by a tool. As in the example, some folder names may contain &. So how can I replace it with '&amp;'? - Sachin Mhetre
The tool needs to be fixed to produce well-formed XML. XSL cannot operate on files that are not well-formed XML. - Oded
OK... Thanks for your information. - Sachin Mhetre
In XSLT 2.0 you can read this as a regular text (not XML) file and replace all occurrences of " & " with " &amp; ". That escapes the "&" characters and produces the textual representation of a well-formed XML document. Then you can process this XML document with your XSLT code. - Dimitre Novatchev

3 Answers

0
votes

You need to escape them, as you do in any XML document.

The escape for & is &amp;.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="cpdhtml.xsl"?>
<pmd-cpd>
    <duplication lines="72" tokens="75">
        <file line="632" path="M:\PBA0039 &amp; Code\Common\ssc\src\Main.c"/>
        <file line="1802" path="M:\PBA0039 &amp; Code\Common\ssc\src\link1.c"/>
    </duplication>
</pmd-cpd>
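
To check that the escaped form round-trips, here is a minimal sketch in Python (my choice for illustration; the helper names `file_paths` and `is_well_formed` are hypothetical, and the sample omits the XML declaration and stylesheet instruction for brevity). It parses the escaped document and shows that the parser hands back the path with a literal &, while the unescaped version is rejected.

```python
import xml.etree.ElementTree as ET

# The escaped document from above, trimmed to the root element.
ESCAPED = r'''<pmd-cpd>
    <duplication lines="72" tokens="75">
        <file line="632" path="M:\PBA0039 &amp; Code\Common\ssc\src\Main.c"/>
        <file line="1802" path="M:\PBA0039 &amp; Code\Common\ssc\src\link1.c"/>
    </duplication>
</pmd-cpd>'''


def file_paths(xml_text):
    """Return the decoded path attribute of every <file> element."""
    root = ET.fromstring(xml_text)
    return [f.get("path") for f in root.iter("file")]


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for path in file_paths(ESCAPED):
        print(path)  # the parser has already turned &amp; back into &
    print(is_well_formed(ESCAPED.replace("&amp;", "&")))
```

An XSLT processor sees the same decoded value, so the stylesheet itself needs no special handling once the input is well-formed.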
0
votes

I guess these XML files were generated by string concatenation; otherwise there is no way you would end up with unescaped XML.

One way to get rid of the special characters is to load the file as a string in a programming language such as C# or VB.NET and use string manipulation operations:

xml = xml.Replace("&", "&amp;"); // xml holds the file contents; Replace returns a new string

Updated per Flynn1179's comment:

If you are worried that the XML already contains some escaped ampersands, add one more line to undo the double escaping:

xml = xml.Replace("&amp;amp;", "&amp;");
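
A plain replace followed by an undo step still mangles other references such as &lt; or &#38;. A rough alternative, sketched here in Python rather than C#, is to escape only the ampersands that do not already start something shaped like an entity or character reference. The function name `escape_bare_ampersands` and the pattern are my own assumptions, not part of any tool mentioned here, and the heuristic will misfire on text that merely looks like a reference (for example a folder named `R&D;x`), so treat it as a stopgap.

```python
import re

# An ampersand that is NOT followed by "name;", "#digits;" or "#xhex;".
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace bare ampersands with &amp;, leaving existing references alone."""
    return BARE_AMP.sub("&amp;", text)


if __name__ == "__main__":
    raw = r'<file line="632" path="M:\PBA0039 & Code\Common\ssc\src\Main.c"/>'
    print(escape_bare_ampersands(raw))
```

Because existing references are skipped, running the function twice gives the same result as running it once, so the second undo step is not needed.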

A better solution would be to modify the code that generates these XML files, for example by building the output with an XML DOM API instead of string concatenation.
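
To illustrate the point about generating XML through an API, here is a small Python sketch using the standard library's ElementTree. The function `build_report` and its parameters are hypothetical; how the real tool produces its output is not known from this thread. The serializer escapes attribute values itself, so a folder name containing & cannot break the document.

```python
import xml.etree.ElementTree as ET


def build_report(entries, lines, tokens):
    """Build a pmd-cpd style document from (line, path) pairs.

    The element and attribute names mirror the sample in the question.
    """
    root = ET.Element("pmd-cpd")
    dup = ET.SubElement(
        root, "duplication", {"lines": str(lines), "tokens": str(tokens)}
    )
    for line, path in entries:
        # Attribute values are passed as plain strings; escaping happens
        # when the tree is serialized.
        ET.SubElement(dup, "file", {"line": str(line), "path": path})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    report = build_report(
        [(632, r"M:\PBA0039 & Code\Common\ssc\src\Main.c")],
        lines=72,
        tokens=75,
    )
    print(report)
```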

0
votes

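You cannot use XSLT to transform an XML file that is not well-formed. To keep special characters in your XML without escaping them, you can put the text in a CDATA section and then use XSLT to read it. Note that CDATA sections are allowed only in element content, not in attribute values, so the path would have to move from the attribute into a child element. See the following post on how to extract CDATA sections with XSLT.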

http://vvratha.blogspot.com/2012/11/extracting-cdata-section-using-xslt.html