
My Code:

using (XmlTextReader inputReader = new XmlTextReader(xml, XmlNodeType.Document, new XmlParserContext(null, null, "en", XmlSpace.Default)))
{
    XsltArgumentList arglist = new XsltArgumentList();
    GetXSLT().Transform(inputReader, arglist, outputStream);
}

The XmlTextReader is created without problems. Inside the XML there is an entity reference for a vertical tab (U+000B).

The line that throws is the call to Transform. The exception says that there is an invalid XML character (the vertical tab, of course).

I've tried using the approach referenced in the following article:
Escape invalid XML characters in C#

My question is: how can I remove or ignore the invalid characters using the .NET framework like the link states?

Note: ideally in a way that doesn't involve hard-coding a list of entity references to replace (I'm already doing this, it is horrible, and I feel bad, as I should).

You can try ignoring it instead of removing. - GSerg
I tried but it still throws the same exception. - Nateous
You are ignoring them while reading, you should also ignore them while writing. - GSerg
You are right, I just got that. - Nateous
I'll mark as Answer if you can post a nice way to use var validXmlChars = text.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray(); to get the characters removed. - Nateous

1 Answer


Try ignoring invalid XML characters both while reading and writing:

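// Turn off character checking on the reader so that character references
// such as the vertical tab do not throw while the input is parsed.
var readerSettings = new XmlReaderSettings() { CheckCharacters = false, ConformanceLevel = ConformanceLevel.Document };

// Note: XmlReader.Create treats a string argument as a URI, so xml is assumed
// here to be a Stream or TextReader (wrap a string in a StringReader).
using (var inputReader = XmlReader.Create(xml, readerSettings, new XmlParserContext(null, null, "en", XmlSpace.Default)))
{
    XsltArgumentList arglist = new XsltArgumentList();
    var xslt = GetXSLT();

    // Start from the stylesheet's own output settings and disable checking there too.
    var writerSettings = xslt.OutputSettings.Clone();
    writerSettings.CheckCharacters = false;

    using (var outputWriter = XmlWriter.Create(outputStream, writerSettings))
    {
        xslt.Transform(inputReader, arglist, outputWriter);
    }
}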
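
If you would rather remove the characters than ignore them, as asked in the comments, here is a minimal sketch built on XmlConvert.IsXmlChar. It assumes the XML is available as a string. The class name XmlSanitizer and the method RemoveInvalidXmlChars are hypothetical names chosen for illustration, not part of the framework. Because the vertical tab arrives as a numeric character reference rather than as a literal character, filtering the text with IsXmlChar alone would not catch it, so the sketch first strips references that resolve to invalid characters. It does not treat CDATA sections specially, so reference-like text inside CDATA would also be stripped.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical helper for illustration; not part of the .NET framework.
public static class XmlSanitizer
{
    // Matches numeric character references such as &#11; or &#xB;.
    private static readonly Regex NumericCharRef =
        new Regex(@"&#(?:x([0-9A-Fa-f]+)|([0-9]+));", RegexOptions.Compiled);

    // True when the code point is allowed by the XML 1.0 Char production.
    private static bool IsValidXmlCodePoint(int codePoint)
    {
        if (codePoint < 0 || codePoint > 0x10FFFF)
            return false;
        if (codePoint <= 0xFFFF)
            return XmlConvert.IsXmlChar((char)codePoint);
        return true; // U+10000..U+10FFFF are all valid XML characters
    }

    public static string RemoveInvalidXmlChars(string xml)
    {
        // Step 1: drop numeric character references that resolve to invalid characters.
        string withoutBadRefs = NumericCharRef.Replace(xml, match =>
        {
            bool isHex = match.Groups[1].Success;
            string digits = isHex ? match.Groups[1].Value : match.Groups[2].Value;
            int codePoint;
            bool parsed = int.TryParse(
                digits,
                isHex ? NumberStyles.AllowHexSpecifier : NumberStyles.None,
                CultureInfo.InvariantCulture,
                out codePoint);
            // Unparseable (for example overflowing) references are removed as well.
            return parsed && IsValidXmlCodePoint(codePoint) ? match.Value : string.Empty;
        });

        // Step 2: drop invalid literal characters, keeping well-formed surrogate pairs.
        var result = new StringBuilder(withoutBadRefs.Length);
        for (int i = 0; i < withoutBadRefs.Length; i++)
        {
            char ch = withoutBadRefs[i];
            if (XmlConvert.IsXmlChar(ch))
            {
                result.Append(ch);
            }
            else if (i + 1 < withoutBadRefs.Length
                     && char.IsSurrogatePair(ch, withoutBadRefs[i + 1]))
            {
                result.Append(ch).Append(withoutBadRefs[i + 1]);
                i++; // skip the low surrogate that was just appended
            }
        }
        return result.ToString();
    }
}
```

You would call XmlSanitizer.RemoveInvalidXmlChars on the XML text before handing it to the reader. With the references gone, character checking can stay at its default on both the reader and the writer.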