What are the biggest pros and cons of Apache Thrift vs Google's Protocol Buffers?
15 Answers
They both offer many of the same features; however, there are some differences:
- Thrift supports 'exceptions'
- Protocol Buffers have much better documentation/examples
- Thrift has a builtin
Set
type - Protocol Buffers allow "extensions" - you can extend an external proto to add extra fields, while still allowing external code to operate on the values. There is no way to do this in Thrift
- I find Protocol Buffers much easier to read
Basically, they are fairly equivalent (with Protocol Buffers slightly more efficient from what I have read).
Another important difference are the languages supported by default.
- Protocol Buffers: Java, Android Java, C++, Python, Ruby, C#, Go, Objective-C, Node.js
- Thrift: Java, C++, Python, Ruby, C#, Go, Objective-C, JavaScript, Node.js, Erlang, PHP, Perl, Haskell, Smalltalk, OCaml, Delphi, D, Haxe
Both could be extended to other platforms, but these are the languages bindings available out-of-the-box.
- Protobuf serialized objects are about 30% smaller than Thrift.
- Most actions you may want to do with protobuf objects (create, serialize, deserialize) are much slower than thrift unless you turn on
option optimize_for = SPEED
. - Thrift has richer data structures (Map, Set)
- Protobuf API looks cleaner, though the generated classes are all packed as inner classes which is not so nice.
- Thrift enums are not real Java Enums, i.e. they are just ints. Protobuf has real Java enums.
For a closer look at the differences, check out the source code diffs at this open source project.
As I've said as "Thrift vs Protocol buffers" topic :
Referring to Thrift vs Protobuf vs JSON comparison :
- Thrift supports out of the box AS3, C++, C#, D, Delphi, Go, Graphviz, Haxe, Haskell, Java, Javascript, Node.js, OCaml, Smalltalk, Typescript, Perl, PHP, Python, Ruby, ...
- C++, Python, Java - in-box support in Protobuf
- Protobuf support for other languages (including Lua, Matlab, Ruby, Perl, R, Php, OCaml, Mercury, Erlang, Go, D, Lisp) is available as Third Party Addons (btw. Here is SWI-Prolog support).
- Protobuf has much better documentation and plenty of examples.
- Thrift comes with a good tutorial
- Protobuf objects are smaller
- Protobuf is faster when using "optimize_for = SPEED" configuration
- Thrift has integrated RPC implementation, while for Protobuf RPC solutions are separated, but available (like Zeroc ICE ).
- Protobuf is released under BSD-style license
- Thrift is released under Apache 2 license
Additionally, there are plenty of interesting additional tools available for those solutions, which might decide. Here are examples for Protobuf: Protobuf-wireshark , protobufeditor.
Protocol Buffers seems to have a more compact representation, but that's only an impression I get from reading the Thrift whitepaper. In their own words:
We decided against some extreme storage optimizations (i.e. packing small integers into ASCII or using a 7-bit continuation format) for the sake of simplicity and clarity in the code. These alterations can easily be made if and when we encounter a performance-critical use case that demands them.
Also, it may just be my impression, but Protocol Buffers seems to have some thicker abstractions around struct versioning. Thrift does have some versioning support, but it takes a bit of effort to make it happen.
I was able to get better performance with a text based protocol as compared to protobuff on python. However, no type checking or other fancy utf8 conversion, etc... which protobuff offers.
So, if serialization/deserialization is all you need, then you can probably use something else.
http://dhruvbird.blogspot.com/2010/05/protocol-buffers-vs-http.html
One obvious thing not yet mentioned is that can be both a pro or con (and is same for both) is that they are binary protocols. This allows for more compact representation and possibly more performance (pros), but with reduced readability (or rather, debuggability), a con.
Also, both have bit less tool support than standard formats like xml (and maybe even json).
(EDIT) Here's an Interesting comparison that tackles both size & performance differences, and includes numbers for some other formats (xml, json) as well.
ProtocolBuffers is FASTER.
There is a nice benchmark here:
http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
You might also want to look into Avro, as Avro is even faster.
Microsoft has a package here:
http://www.nuget.org/packages/Microsoft.Hadoop.Avro
By the way, the fastest I've ever seen is Cap'nProto;
A C# implementation can be found at the Github-repository of Marc Gravell.
I think the basic data structure is different
- Protocol Buffer use variable-length integee which refers to variable-length digital encoding, turning a fixed-length number into a variable-length number to save space.
- Thrift proposed different types of serialization formats (called "protocols"). In fact, Thrift has two different JSON encodings, and no less than three different binary encoding methods.
In conclusion,these two libraries are completely different. Thrift likes a one-stop shop, giving you the entire integrated RPC framework and many options (supporting cross-language), while Protocol Buffers is more inclined to "just do one thing and do it well".
There are some excellent points here and I'm going to add another one in case someones' path crosses here.
Thrift gives you an option to choose between thrift-binary and thrift-compact (de)serializer, thrift-binary will have an excellent performance but bigger packet size, while thrift-compact will give you good compression but needs more processing power. This is handy because you can always switch between these two modes as easily as changing a line of code (heck, even make it configurable). So if you are not sure how much your application should be optimized for packet size or in processing power, thrift can be an interesting choice.
PS: See this excellent benchmark project by thekvs
which compares many serializers including thrift-binary, thrift-compact, and protobuf: https://github.com/thekvs/cpp-serializers
PS: There is another serializer named YAS
which gives this option too but it is schema-less see the link above.