3
votes

The docs for PubSub state that the maximum payload after decoding is 10MB. My question is whether it is advantageous to compress the payload at the publisher before publishing in order to increase data throughput.

This could be especially helpful if the payload compresses well, as a JSON-formatted payload typically does.


1 Answer

6
votes

If you are looking for efficiency on PubSub, I would first concentrate on using the best API, and that's the gRPC one. If you are using the client libraries, the chance is high that they use gRPC anyway (see the publish sketch after the bullets). Why gRPC?

  • gRPC is binary, so your payload doesn't need to jump through hoops to be encoded
  • REST needs to base64-encode the payload, which makes it larger and adds an extra encoding step
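For illustration, here is a minimal publish sketch with the Python client library (google-cloud-pubsub), which talks to PubSub over gRPC by default; the project and topic names are placeholders:

```python
# Minimal publish sketch (assumes google-cloud-pubsub is installed and
# credentials are configured). The client library uses gRPC under the hood,
# so the bytes go over the wire as-is, with no base64 step as in the REST API.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "my-topic")  # placeholder names

# data must be a bytes object
future = publisher.publish(topic_path, data=b'{"sensor": "s1", "value": 42}')
print(future.result())  # blocks until the server returns the message ID
```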

Second, I would try to batch messages where possible, lowering the number of calls and eliminating some per-call latency.
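A sketch of client-side batching with the same Python client library; the BatchSettings values below are illustrative, not recommendations:

```python
from google.cloud import pubsub_v1

# Buffer messages client-side and flush them in a single Publish call when one
# of these thresholds is hit (values are illustrative only).
batch_settings = pubsub_v1.types.BatchSettings(
    max_messages=500,        # flush after 500 messages
    max_bytes=1024 * 1024,   # ...or ~1 MB of buffered data
    max_latency=0.05,        # ...or 50 ms, whichever comes first
)
publisher = pubsub_v1.PublisherClient(batch_settings=batch_settings)
topic_path = publisher.topic_path("my-project", "my-topic")  # placeholder names

# These publishes are grouped into far fewer RPCs than 1000.
futures = [publisher.publish(topic_path, data=f"msg-{i}".encode()) for i in range(1000)]
for f in futures:
    f.result()
```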

And last I would look at compression, but that means you need to explicitly decompress the payload at the subscriber, so your application code gets more complex. If all your workloads are on the Google Cloud Platform I wouldn't bother with compression. If your workload is outside of GCP you might consider it, but benchmark it first.
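If you do go that route, the following sketch (standard-library gzip, plus a message attribute so the subscriber knows what it received) shows the kind of extra logic I mean; the "encoding" attribute name is just an example:

```python
import gzip
import json

# A repetitive JSON payload typically compresses well.
payload = json.dumps({"rows": [{"id": i, "status": "ok"} for i in range(1000)]}).encode()
compressed = gzip.compress(payload)
print(f"raw={len(payload)} bytes, gzipped={len(compressed)} bytes")

# Publisher side: tag the message so the subscriber knows how to decode it.
# publisher.publish(topic_path, data=compressed, encoding="gzip")

# Subscriber side (inside the message callback):
# if message.attributes.get("encoding") == "gzip":
#     data = gzip.decompress(message.data)
# else:
#     data = message.data
```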

An alternative to compression, if your schema is stable, is to look at using ProtoBuf.
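A sketch of that idea, assuming you compiled a schema with protoc into a hypothetical telemetry_pb2 module; the message and field names are made up for illustration:

```python
# Hypothetical generated module, e.g. from: protoc --python_out=. telemetry.proto
import telemetry_pb2

# The binary encoding drops the repeated JSON field names, which alone often
# shrinks the payload noticeably without an explicit compression step.
event = telemetry_pb2.Event(sensor_id="s1", value=42.0)  # made-up schema
data = event.SerializeToString()  # compact bytes, ready to publish

# publisher.publish(topic_path, data=data)

# Subscriber side:
# event = telemetry_pb2.Event.FromString(message.data)
```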

To conclude, I would:

  1. Make sure you're using gRPC
  2. Batch where possible
  3. Only compress when needed and after benchmarking (implies extra logic in your application)