1
votes

I am using Visual Studio 2005 with the .NET 2.0 build of protobuf-net r480.

I am following an example to serialize a class with string, enum, int, and byte[] fields, as shown below:

[ProtoContract]
public class Proto_DemoMessage {
    public enum ESSCommandType : int {
        ClientServer = 0,
        Broadcast = 10,
        Assign = 11,
    }

    [ProtoMember(1, IsRequired = false)]
    public string Name;
    [ProtoMember(3, IsRequired = true)]
    public ESSCommandType CommandType;
    [ProtoMember(4, IsRequired = false)]
    public int Code;
    [ProtoMember(5, IsRequired = false)]
    public byte[] BytesData;

    public byte[] Serialize(){
        byte[] b = null;
        using (MemoryStream ms = new MemoryStream()) {
            Serializer.Serialize<Proto_DemoMessage>(ms, this);
            b = new byte[ms.Position];
            byte[] fullB = ms.GetBuffer();
            Array.Copy(fullB, b, b.Length);
        }
        return b;
    }
}

I then assign a value to each field as follows:

Proto_DemoMessage inner_message = new Proto_DemoMessage();
inner_message.Name = "innerName";
inner_message.CommandType = Proto_DemoMessage.ESSCommandType.Broadcast;
inner_message.Code = 11;
inner_message.BytesData = System.Text.Encoding.Unicode.GetBytes("1234567890");

After calling inner_message.Serialize(), I write the resulting byte[] to a file. When I open the file in a hex viewer to verify it, I find that each byte of BytesData is followed by a 00 padding byte. The BytesData portion of the output is:

2A 14 31 00 32 00 33 00 34 00 35 00 36 00 37 00 38 00 39 00 30 00

Did I do something wrong? I appreciate your help.

2
You threw me with your MemoryStream code... you could just use .ToArray() there - it would be far more direct. But Cicada is entirely correct: you are encoding BytesData as UTF-16, which has a zero byte for every ASCII-range character. That is entirely self-inflicted, via the BytesData = line - Marc Gravell
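To illustrate the .ToArray() remark, here is a minimal sketch that leaves protobuf-net out so it runs on its own. The class and method names (StreamCopyDemo, WriteAndCopy) are made up for this example; in the question's code the Write call would be the Serializer.Serialize call.

```csharp
using System;
using System.IO;

public static class StreamCopyDemo
{
    // Writes the payload to a MemoryStream and returns exactly the bytes written.
    // ToArray() copies only the used portion of the stream, whereas GetBuffer()
    // returns the whole internal buffer, including unused capacity.
    public static byte[] WriteAndCopy(byte[] payload)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            ms.Write(payload, 0, payload.Length); // stand-in for Serializer.Serialize(ms, this)
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        byte[] result = WriteAndCopy(new byte[] { 0x2A, 0x14, 0x31 });
        Console.WriteLine(BitConverter.ToString(result)); // prints 2A-14-31
    }
}
```

This replaces the manual new byte[ms.Position] / GetBuffer() / Array.Copy sequence with a single call and should produce the same bytes.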

2 Answers

3
votes

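Everything's OK. Your string is encoded in UTF-16, which is what Encoding.Unicode means in .NET (little-endian). In this encoding every character is at least two bytes wide, so each ASCII digit is written as its character code followed by a 00 byte.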
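A quick way to confirm this without protobuf-net is to dump the bytes that Encoding.Unicode produces for the same string. This is only a sketch; Utf16Demo and HexOf are names invented for the example.

```csharp
using System;
using System.Text;

public static class Utf16Demo
{
    // Formats a byte array as space-separated hex pairs, like a hex viewer.
    public static string HexOf(byte[] data)
    {
        return BitConverter.ToString(data).Replace("-", " ");
    }

    public static void Main()
    {
        // Encoding.Unicode is UTF-16 little-endian: low byte first, then 00 for ASCII.
        byte[] utf16 = Encoding.Unicode.GetBytes("1234567890");
        Console.WriteLine(utf16.Length); // prints 20
        Console.WriteLine(HexOf(utf16));
        // prints 31 00 32 00 33 00 34 00 35 00 36 00 37 00 38 00 39 00 30 00
    }
}
```

Those 20 bytes match the dump in the question after the first two bytes. In the protobuf wire format, 2A is the key for field 5 with the length-delimited wire type ((5 << 3) | 2 = 0x2A), and 14 is the length, 20 in decimal.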

1
votes

While checking whether these zeroes really come from the Unicode (UTF-16) encoding, I also confirmed that UTF-8 is more compact, as one would expect, so using

inner_message.BytesData = System.Text.Encoding.UTF8.GetBytes("1234567890");

might do some good.

Update: or simply use a string property, as per Marc Gravell's suggestion.
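For comparison, a small sketch of the size difference between the two encodings for this particular string (EncodingSizeDemo is a name made up for the example):

```csharp
using System;
using System.Text;

public static class EncodingSizeDemo
{
    public static void Main()
    {
        string text = "1234567890";

        byte[] utf16 = Encoding.Unicode.GetBytes(text); // two bytes per ASCII character
        byte[] utf8 = Encoding.UTF8.GetBytes(text);     // one byte per ASCII character

        Console.WriteLine(utf16.Length);                // prints 20
        Console.WriteLine(utf8.Length);                 // prints 10
        Console.WriteLine(BitConverter.ToString(utf8)); // prints 31-32-33-34-35-36-37-38-39-30

        // The receiver must decode with the same encoding to get the text back.
        Console.WriteLine(Encoding.UTF8.GetString(utf8) == text); // prints True
    }
}
```

If the data is really text, declaring the member as a string avoids the manual step altogether, since the protobuf wire format stores strings as UTF-8.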