0
votes

My data entity contains a Dictionary, which XmlSerializer does not support out of the box, so I decided to use DataContractSerializer. The problem is that I cannot get it to behave the way I need.

I started with the following code:

public static string SerializeObject<T>(T serialisable)
{
    var serializer = new DataContractSerializer(serialisable.GetType());
    using (var writer = new StringWriter())
    using (var stm = new XmlTextWriter(writer))
    {
        serializer.WriteObject(stm, serialisable);
        return writer.ToString();
    }
}

It seemed to work fine until I noticed that if I put "\r\n" in a string, it is not serialized as XML character entities. From my experience with XmlSerializer, I knew that I could set up XmlWriterSettings with NewLineHandling = NewLineHandling.Entitize. So I changed my code to the following:

public static string SerializeObject<T>(T serialisable)
{
    var serializer = new DataContractSerializer(serialisable.GetType());
    using (var writer = new StringWriter())
    {
        using (var stm = XmlWriter.Create(writer,
            new XmlWriterSettings()
            {
                NewLineHandling = NewLineHandling.Entitize
            }))
        {
            serializer.WriteObject(stm, serialisable);
            return writer.ToString();
        }
    }
}

Now the problem is that I get an empty string. No exceptions, nothing, just an empty string. The stm variable holds an XmlWellFormedWriter. Maybe it's not supported by DataContractSerializer?

Then I tried to force the use of XmlTextWriter as follows:

public static string SerializeObject<T>(T serialisable)
{
    var serializer = new DataContractSerializer(serialisable.GetType());
    using (var writer = new StringWriter())
    using (var stm = XmlWriter.Create(new XmlTextWriter(writer),
        new XmlWriterSettings()
        {
            NewLineHandling = NewLineHandling.Entitize
        }))
    {
        serializer.WriteObject(stm, serialisable);
        return writer.ToString();
    }
}

And this gets me back to where I started: I get an XML string back, but again the "\r\n" in the string is not translated to entities.

How do I make DataContractSerializer entitize newlines and return the XML as a string?

2 Answers

2
votes

I know this is a pretty old thread, but I stumbled upon it while looking for an answer and figured I would share what I found out.

The reason the \n characters are not being entitized is that they are in a text node value. The underlying XmlWriter (not the serializer itself) entitizes \n only when it appears in an attribute value; in text nodes, NewLineHandling.Entitize affects only \r.

Here is what I have found happens for each of the NewLineHandling values:

Text Nodes

NewLineHandling.Replace (Default)
\r, \n and \r\n all go to \r\n (the NewLineChars value)
\t remains as \t

NewLineHandling.Entitize
\r\n goes to &#xD;\n
\n remains as \n
\r goes to &#xD;
\t remains as \t

NewLineHandling.None
\r remains as \r
\n remains as \n
\r\n remains as \r\n
\t remains as \t

Attributes

NewLineHandling.Replace (Default)
\r\n goes to &#xD;&#xA;
\n goes to &#xA;
\r goes to &#xD;
\t goes to &#x9;

NewLineHandling.Entitize
\r\n goes to &#xD;&#xA;
\n goes to &#xA;
\r goes to &#xD;
\t goes to &#x9;

NewLineHandling.None
\r remains as \r
\n remains as \n
\r\n remains as \r\n
\t remains as \t
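
To check this behaviour yourself, a small sketch along the following lines may help. It writes the same "x\r\ny" value once as an attribute and once as a text node with a given NewLineHandling. The class name NewLineDemo, the element name e and the attribute name a are made up for illustration; only XmlWriter, XmlWriterSettings and StringWriter are real framework types.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original question or answer.
public static class NewLineDemo
{
    // Writes <e a="x\r\ny">x\r\ny</e> using the given NewLineHandling
    // and returns the raw markup produced by the XmlWriter.
    public static string Write(NewLineHandling handling)
    {
        var settings = new XmlWriterSettings
        {
            NewLineHandling = handling,
            OmitXmlDeclaration = true // keep the output short
        };

        using (var text = new StringWriter())
        {
            using (var xml = XmlWriter.Create(text, settings))
            {
                xml.WriteStartElement("e");
                xml.WriteAttributeString("a", "x\r\ny"); // attribute value
                xml.WriteString("x\r\ny");               // text node
                xml.WriteEndElement();
            } // disposing the XmlWriter flushes it into the StringWriter

            return text.ToString();
        }
    }
}
```

Calling NewLineDemo.Write(NewLineHandling.Entitize) and comparing the attribute with the element content should show the difference: both \r and \n are entitized in the attribute, while only \r is entitized in the text node.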

0
votes

It seems the problem is mostly due to how XmlWriter buffers its output: a writer created with XmlWriter.Create does not flush until it is closed (or flushed explicitly), so the StringWriter is still empty when writer.ToString() is called inside the using block. What's weird is that a writer created with new XmlTextWriter somehow does write its contents through to the StringWriter, which is why my initial method worked just fine.

This time I just had to rearrange one line of code:

    public static string SerializeObject<T>(T serialisable)
    {
        var serializer = new DataContractSerializer(serialisable.GetType());
        using (var writer = new StringWriter())
        {
            using (var stm = XmlWriter.Create(writer,
                new XmlWriterSettings()
                {
                    NewLineHandling = NewLineHandling.Entitize,
                    Encoding = Encoding.UTF8
                }))
            {
                serializer.WriteObject(stm, serialisable);
                // Previously writer.ToString() was called here and returned an empty string.
            }

            return writer.ToString();
        }
    }

Now "\r" characters are correctly encoded as &#xD;, but "\n" characters are not. The declared encoding is also still utf-16, even though I set it to UTF8. I guess that's another issue.
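
On the utf-16 point, a likely explanation is that when XmlWriter.Create is given a TextWriter, the XML declaration takes its encoding from that TextWriter rather than from XmlWriterSettings.Encoding, and StringWriter reports UTF-16. A commonly used workaround is a StringWriter subclass that reports UTF-8 instead. The sketch below assumes that approach; the names Utf8StringWriter and XmlStrings are hypothetical and do not come from the framework or from the code above.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding,
// so an XmlWriter created over it declares encoding="utf-8".
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class XmlStrings
{
    public static string SerializeObject<T>(T serialisable)
    {
        var serializer = new DataContractSerializer(serialisable.GetType());
        var settings = new XmlWriterSettings
        {
            NewLineHandling = NewLineHandling.Entitize
        };

        using (var writer = new Utf8StringWriter())
        {
            using (var xml = XmlWriter.Create(writer, settings))
            {
                serializer.WriteObject(xml, serialisable);
            } // the XmlWriter is flushed here, before ToString() is called

            return writer.ToString();
        }
    }
}
```

This changes only the encoding named in the XML declaration. The returned value is still an ordinary .NET string, so the caller has to encode it as UTF-8 when writing it to a file or stream. It also does not change how "\n" is handled in text nodes.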