Simply, a bug; this is fixed in r640 (now deployed to both NuGet and google-code), along with an additional test based on your code above so that it can't creep back in.
Re performance (comments); the first hint I would look at would be: "prefer groups". Basically, the protobuf specification includes 2 different ways of including sub-objects - "groups" and "length-prefix". Groups was the original implementation, but Google has since moved towards "length-prefix", and tries to advise people not to use "groups". However! Because of how protobuf-net works, "groups" are actually noticeably cheaper to write; this is because unlike the Google implementation, protobuf-net does not know the length of things in advance. This means that to write a length-prefix, it needs to do one of:
- calculate the length as needed (almost as much work as actually serializing the data, and it adds an entire duplicate of the code); write the length, then actually serialize the data
- serialize to a buffer, write the length, write the buffer
- leave a place-holder, serialize, then loop back and write the actual length into the place-holder, adjusting the padding if needed
I've implemented all 3 approaches at different times, but v2 uses the 3rd option. I keep toying with adding a 4th implementation:
- leave a place-holder, serialize, then loop back and write the actual length using an overlong form (so no padding adjustments ever needed)
but... consensus seems to be that the "overlong form" is a bit risky; still, it would work nicely for protobuf-net to protobuf-net.
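To make the trade-off concrete, here is a rough byte-level sketch of two of the strategies above: the "serialize to a buffer" approach and the "overlong placeholder" idea. This is not protobuf-net's actual code; the class and method names are invented for illustration, and it assumes a length that fits in 32 bits.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only (not part of protobuf-net).
public static class LengthPrefixSketch
{
    // Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte.
    public static void WriteVarint(Stream dest, uint value)
    {
        while (value >= 0x80)
        {
            dest.WriteByte((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        dest.WriteByte((byte)value);
    }

    // Buffer approach: serialize to a scratch buffer, write the length, write the buffer.
    public static void WriteViaBuffer(Stream dest, Action<Stream> writeBody)
    {
        using (var scratch = new MemoryStream())
        {
            writeBody(scratch);
            WriteVarint(dest, (uint)scratch.Length);
            scratch.WriteTo(dest);
        }
    }

    // Overlong approach: reserve a fixed 5-byte slot, serialize, then go back and
    // fill the slot with the length padded out to exactly 5 varint bytes.
    public static void WriteViaOverlongPlaceholder(MemoryStream dest, Action<Stream> writeBody)
    {
        long slot = dest.Position;
        dest.Write(new byte[5], 0, 5);          // place-holder
        writeBody(dest);
        long end = dest.Position;
        uint length = (uint)(end - slot - 5);

        dest.Position = slot;
        for (int i = 0; i < 4; i++)
        {
            dest.WriteByte((byte)((length & 0x7F) | 0x80)); // continuation bit forced on
            length >>= 7;
        }
        dest.WriteByte((byte)(length & 0x7F));  // final byte, continuation bit off
        dest.Position = end;
    }
}
```

For a two-byte body `68 69`, the buffer approach produces `02 68 69`, while the overlong approach produces `82 80 80 80 00 68 69`: the same length, but always five bytes wide, so nothing ever has to be shuffled. The cost is four wasted bytes per sub-object, plus the risk that a strict reader rejects the padded varint.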
But as you can see: length-prefix always has some overhead. Now imagine fairly deeply nested objects, where that overhead is paid again at every level, and you can see a few blips. Groups work very differently; the encoding format for a group is:
- write a start marker; serialize; write an end marker
that's it; no length needed; really, really, really cheap to write. On the wire, the main difference between them is:
- groups: cheap to write, but you can't skip them if you encounter them as unexpected data; you have to parse the headers of the payload
- length-prefix: more expensive to write, but cheap to skip if you encounter them as unexpected data - you just read the length and copy/move that many bytes
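The write/skip asymmetry can be sketched at the byte level as follows. Again, this is only an illustration under assumptions: the names are hypothetical, it is not how protobuf-net is implemented, and it assumes a seekable stream and does not check that an end marker carries the same field number as its start marker.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only (not part of protobuf-net).
public static class GroupSketch
{
    // Wire types from the protobuf encoding; a field header is (fieldNumber << 3) | wireType.
    const int Varint = 0, Fixed64 = 1, LengthDelimited = 2, StartGroup = 3, EndGroup = 4, Fixed32 = 5;

    public static void WriteVarint(Stream dest, uint value)
    {
        while (value >= 0x80)
        {
            dest.WriteByte((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        dest.WriteByte((byte)value);
    }

    public static ulong ReadVarint(Stream src)
    {
        ulong result = 0;
        int shift = 0;
        while (true)
        {
            int b = src.ReadByte();
            if (b < 0) throw new EndOfStreamException();
            result |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }

    // Writing a group: start marker, body, end marker. No length is ever needed.
    public static void WriteGroup(Stream dest, int fieldNumber, Action<Stream> writeBody)
    {
        WriteVarint(dest, (uint)((fieldNumber << 3) | StartGroup));
        writeBody(dest);
        WriteVarint(dest, (uint)((fieldNumber << 3) | EndGroup));
    }

    // Skipping an unexpected field whose header has already been read.
    public static void SkipField(Stream src, int wireType)
    {
        switch (wireType)
        {
            case Varint: ReadVarint(src); break;
            case Fixed64: src.Seek(8, SeekOrigin.Current); break;
            case Fixed32: src.Seek(4, SeekOrigin.Current); break;
            case LengthDelimited:
                // Cheap: read the length, then jump over that many bytes.
                long length = (long)ReadVarint(src);
                src.Seek(length, SeekOrigin.Current);
                break;
            case StartGroup:
                // Expensive: every inner header must be parsed until the end marker.
                while (true)
                {
                    int inner = (int)(ReadVarint(src) & 7);
                    if (inner == EndGroup) break;
                    SkipField(src, inner); // nested groups recurse
                }
                break;
            default:
                throw new InvalidDataException("Unexpected wire type " + wireType);
        }
    }
}
```

Writing field 1 as a group whose body is "field 2 = 150" gives `0B 10 96 01 0C`: a start marker, the body, and an end marker. Skipping it means reading the `10` header and its varint before the `0C` end marker is found, whereas a length-prefixed field would be skipped with a single seek.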
But! too much detail!
What does that mean for you? Well, imagine you have:
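    using System.Collections.Generic;
    using ProtoBuf;

    // Person is assumed to be a serializable type defined elsewhere.
    [ProtoContract]
    public class SomeWrapper
    {
        [ProtoMember(1)] // default sub-object encoding: length-prefix
        public List<Person> People { get { return people; } }
        private readonly List<Person> people = new List<Person>();
    }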
You can make the super complex change:
    [ProtoContract]
    public class SomeWrapper
    {
        // The only change: ask for group encoding on this member.
        [ProtoMember(1, DataFormat = DataFormat.Group)]
        public List<Person> People { get { return people; } }
        private readonly List<Person> people = new List<Person>();
    }
and it'll use the cheaper encoding scheme. All your existing data will be fine as long as you are using protobuf-net.