We are using protobuf-net v2.3.2 to serialize and deserialize some complex objects (with lists, dictionaries, etc. inside) in our project. Most of the time everything is fine, but in some rare cases we encounter very strange behavior: an object serialized in one process causes errors on deserialization in another process if the call to the serializer's .FromProto<SomeComplexType>(bytes) method in that second process is not preceded by a call to .ToProto(someComplexObject).
Here is an example: let's say our Process 1 looks like this:
    class Program1
    {
        public static void Main()
        {
            SomeComplexType complexObject = new SomeComplexType();
            // Here goes some code filling complexObject with data
            byte[] serialized = ToProto(complexObject);
            File.WriteAllBytes("serialized.data", serialized);
        }

        public static byte[] ToProto(object value)
        {
            using (var stream = new MemoryStream())
            {
                ProtoBuf.Serializer.Serialize(stream, value);
                return stream.ToArray();
            }
        }

        public static T FromProto<T>(byte[] value)
        {
            using (var stream = new MemoryStream(value))
            {
                return ProtoBuf.Serializer.Deserialize<T>(stream);
            }
        }
    }
Now we try to read that object in Process 2:
    class Program2
    {
        public static void Main()
        {
            byte[] serialized = File.ReadAllBytes("serialized.data");
            SomeComplexType complexObject =
                FromProto<SomeComplexType>(serialized);
        }

        public static byte[] ToProto(object value)
        {
            using (var stream = new MemoryStream())
            {
                ProtoBuf.Serializer.Serialize(stream, value);
                return stream.ToArray();
            }
        }

        public static T FromProto<T>(byte[] value)
        {
            using (var stream = new MemoryStream(value))
            {
                return ProtoBuf.Serializer.Deserialize<T>(stream);
            }
        }
    }
What we see is that in some rare cases Process 1 generates a file that makes Process 2 fail on the call to FromProto (we have observed various errors, ranging from 'missing parameterless constructor' to StackOverflowException).
However, adding a line like ToProto(new SomeComplexType()); somewhere before the call to FromProto makes the errors go away, and the same set of bytes is deserialized without a hitch. None of the other methods we tried (PrepareSerializer, GetSchema) seems to do the trick.
It looks like there are some subtle differences in how ToProto and FromProto parse the object model. Another point is that protobuf-net seems to "remember" some state after a call to ToProto, and that state helps it with subsequent deserializations.
UPDATE: Here are more details. The class structure that we have looks similar to this (very much simplified):
    [ProtoContract(ImplicitFields = ImplicitFields.AllPublic)]
    [ProtoInclude(1, typeof(A))]
    [ProtoInclude(2, typeof(B))]
    public interface IBase
    {
        [ProtoIgnore]
        string Id { get; }
    }

    [ProtoContract(ImplicitFields = ImplicitFields.AllPublic, AsReferenceDefault = true)]
    public class A : IBase
    {
        [ProtoIgnore]
        public string Id { get; }
        public string PropertyA { get; set; }
    }

    [ProtoContract(ImplicitFields = ImplicitFields.AllPublic, AsReferenceDefault = true)]
    public class B : IBase
    {
        [ProtoIgnore]
        public string Id { get; }
        public string PropertyB { get; set; }
    }

    [ProtoContract(ImplicitFields = ImplicitFields.AllPublic, AsReferenceDefault = true)]
    public class C
    {
        public List<IBase> ListOfBase = new List<IBase>();
    }

    [ProtoContract(ImplicitFields = ImplicitFields.AllPublic, AsReferenceDefault = true)]
    public class D
    {
        public C PropertyC { get; set; }
        public Dictionary<string, B> DictionaryOfBs { get; set; }
    }
The root cause of the problem seems to be the somewhat non-deterministic way in which protobuf-net prepares serializers for types. Here is what we observe.
Let's say we have two programs: a producer and a consumer. The producer creates an instance of D, adds some data and serializes that instance using protobuf-net. The consumer picks up that serialized data and deserializes it into an instance of D.
In the producer, protobuf-net sometimes discovers type B before it discovers IBase, so it generates a serializer for B and serializes the values in DictionaryOfBs as plain instances of B.
In the consumer, protobuf-net may happen to discover IBase first, so when it prepares the (de)serializer for B, it treats B as a subclass of IBase. When it then comes to deserializing the values of DictionaryOfBs, it tries to read them as a subclass of IBase, expecting a field number that discriminates between A and B. The data in the stream may be such that the IBase serializer decides that what it sees is an instance of A, tries to convert it to B (using the Merge method) and gets into infinite recursion trying to convert A into B into A into B and so on, eventually resulting in a StackOverflowException.
Adding Serializer.Serialize(stream, new D()) before deserialization changes the order in which the serializers are created, so there is no error in that case, although that seems to be a lucky coincidence. Unfortunately, in our case even that cannot be used as a satisfactory workaround, because it leads to occasional "Internal error; a key mismatch occurred" errors on deserialization.
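To make the ordering issue concrete, here is a minimal, untested sketch of what forcing the same discovery order in both processes might look like. It assumes that RuntimeTypeModel.Default.Add(type, applyDefaultBehaviour) registers a type eagerly and picks up its [ProtoInclude] attributes; ModelSetup and Register are hypothetical names of our own, and we have not confirmed that this avoids either the StackOverflowException or the key mismatch error.

```csharp
using ProtoBuf.Meta;

// Hypothetical helper (not part of protobuf-net). The idea is to call
// ModelSetup.Register() once at startup in BOTH the producer and the
// consumer, before any Serialize/Deserialize call.
public static class ModelSetup
{
    public static void Register()
    {
        RuntimeTypeModel model = RuntimeTypeModel.Default;

        // Assumption: adding IBase first lets the model see A and B as its
        // subtypes (via [ProtoInclude]) before B is discovered on its own.
        model.Add(typeof(IBase), true);
        model.Add(typeof(A), true);
        model.Add(typeof(B), true);
        model.Add(typeof(C), true);
        model.Add(typeof(D), true);
    }
}
```

If the order of registration really is what decides whether B is treated as a subclass of IBase, both processes would then build the same model regardless of which type they happen to touch first.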
FromProto/ToProto methods. Now, it is entirely possible that I've simply forgotten them (I'm not at a PC), but: are you sure they're not your own methods? I can't see them here: github.com/mgravell/protobuf-net/blob/master/src/protobuf-net/… and that class is not marked partial, so I shouldn't need to look in any other files... – Marc Gravell