9
votes

I'd like to use Protocol Buffers in my program to read data from a file. I would also like to be able to edit the data file with any text editor to start with (I'll write a data editor later on and switch to full binary).

Is there a way to parse a human-readable format (the debug string provided by protobuf itself, or some other format)?

4 Answers

6
votes

There is a text-based format too, but support for it is implementation-specific. For example, I don't support it at all in protobuf-net. But yes: such a format is defined, and discussed (for example) here: http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google.protobuf.text_format.html

Personally, I'd rather use binary and write a UI around the model.

5
votes

If you don't mind using command-line tools, the Piqi project includes a piqi convert command for converting between four formats: binary Protocol Buffers, JSON, XML and Piq. The Piq format is specially designed for viewing and editing data in a text editor.

3
votes

The question doesn't specify the programming language, and my answer is only about Java.

In Java, a Message instance's toString method returns a human-readable textual format. The same format can then be parsed into a Message instance by TextFormat.merge:

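import com.google.protobuf.TextFormat;

String messageString = ...
MyMessage.Builder builder = MyMessage.newBuilder();
// Throws TextFormat.ParseException if the string is not valid text format.
TextFormat.merge(messageString, builder);
MyMessage newMessage = builder.build();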

(Variations of the merge method can also read from a stream, to avoid reading the whole message string into memory.)

0
votes

Are you sure you want to use ProtoBuf? You could use JSON at first, and then switch to either BSON or MessagePack as a binary format.

The JSON/BSON combination has the advantage that you can use the same library (Json.NET) for both. I believe BSON output is a bit bigger than ProtoBuf, though.

Or you can use JSON/MessagePack. Technically, MessagePack is a better binary format than BSON or ProtoBuf, IMO. But the tooling support is worse, and you'll need separate libraries for JSON and MessagePack. MessagePack supports everything JSON does and more (in particular, it can use both string and integer keys in dictionaries).

Quick comparison of MsgPack and ProtoBuf:

  • The resulting data size seems to be comparable when similar constructs are used.
  • Encoding/decoding performance largely depends on the implementation, but I expect it to be of similar magnitude.
  • MsgPack is more self-describing. In ProtoBuf you can't even tell from the encoded data whether something is a submessage or a blob.
  • MsgPack supports non-integer keys in a dictionary. One thing this allows is storing properties by name when you don't care about size, and switching to integer keys where the gains are large.
  • MsgPack stores the element count instead of the byte size for arrays/dictionaries. This has the advantage that you don't need to go back in the output and fill in the size all the time, which makes writing a serializer easier and possibly gives faster writes. On the other hand, you can't easily skip over an element because you don't know its size.
  • MsgPack naturally supports a superset of JSON, so you can migrate from JSON easily.
  • Tool support, documentation and popularity are much better with ProtoBuf. In particular, protobuf-net looks nicer than the C# code available for MsgPack.
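
To make the count-versus-size point concrete, here is a minimal sketch in plain Java (no protobuf or MessagePack library involved). The class PrefixDemo and its method names are made up for illustration, and it only handles a short list of small non-negative integers: the MessagePack side uses a fixarray header with fixint/uint8/uint16 elements, and the ProtoBuf side uses a packed repeated field. Treat it as a rough illustration of the two framing styles, not as a real serializer.

```java
import java.io.ByteArrayOutputStream;

/** Illustration only: hand-rolled encoders for a short list of small non-negative ints. */
final class PrefixDemo {

    /** MessagePack style: the header carries the ELEMENT COUNT, so it can be written first. */
    static byte[] msgpackArray(int[] values) {
        if (values.length > 15) {
            throw new IllegalArgumentException("sketch handles fixarray (up to 15 elements) only");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0x90 | values.length);       // fixarray header: 1001xxxx, xxxx = element count
        for (int v : values) {
            if (v < 0 || v > 0xFFFF) {
                throw new IllegalArgumentException("sketch handles 0..65535 only");
            }
            if (v <= 0x7F) {
                out.write(v);                  // positive fixint: the value itself
            } else if (v <= 0xFF) {
                out.write(0xCC);               // uint 8 marker
                out.write(v);
            } else {
                out.write(0xCD);               // uint 16 marker, value is big-endian
                out.write(v >>> 8);
                out.write(v & 0xFF);
            }
        }
        return out.toByteArray();
    }

    /** ProtoBuf style (packed repeated field): the header carries the BYTE SIZE of the payload. */
    static byte[] protobufPacked(int fieldNumber, int[] values) {
        // The payload has to be encoded (or at least measured) before the header can be written.
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        for (int v : values) {
            if (v < 0) {
                throw new IllegalArgumentException("sketch handles non-negative values only");
            }
            writeVarint(payload, v);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (fieldNumber << 3) | 2);  // tag: field number plus wire type 2 (length-delimited)
        writeVarint(out, payload.size());          // payload length in bytes
        out.write(payload.toByteArray(), 0, payload.size());
        return out.toByteArray();
    }

    /** Base-128 varint: 7 bits per byte, least significant group first, high bit = "more follows". */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int[] values = {1, 300, 3};
        System.out.println("msgpack : " + hex(msgpackArray(values)));      // 93 01 cd 01 2c 03
        System.out.println("protobuf: " + hex(protobufPacked(1, values))); // 0a 04 01 ac 02 03
    }
}
```

Note how protobufPacked has to build the payload in a temporary buffer before it can emit the length, while msgpackArray writes its header immediately. The flip side is that a reader can jump past the ProtoBuf field after reading the 04 length byte, whereas skipping the MessagePack array means decoding the header of each of its three elements.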