How to broadcast data to all Google App Engine instances?

Question

For the sake of simplicity, let's say my app needs to allow thousands of users to see a real-time read-only stream of a chat room. The host can type messages, but no other users can—they just see what's being typed by the hosts, in real time. Imagine users are following a textual play-by-play of a sporting event.

Each user checks for new messages by polling once per second using a simple /get-recent-messages call to the GAE server. (Before you ask, I believe using Google's Channels API would be far too expensive.)

Considering that this app is used by many thousands of users simultaneously, which means dozens to hundreds of GAE instances are running, how can I get these /get-recent-messages calls to return the latest chat room messages with less than 1000 ms latency, while minimizing server load and GAE costs?

Some ideas I had:

1. Store chat messages in datastore entities.
   - Obviously this is way too slow and expensive, especially if using queries/indexes.
2. Store chat messages in memcache keys. I imagine I'd use one key to store the list of keys for the last 50 messages, and then 50 more keys, one for each message.
   - This would cause huge bottlenecks because App Engine's memcache shards by key and thus all 500 instances would be constantly reading from the same memcache keys and thus the same memcache server.
3. Store chat messages in instance memory, backed by memcache. And pull from memcache (like in #2) when instance memory is stale.
   - This would probably result in an expensive race condition when multiple requests see stale instance memory cache and pull from memcache simultaneously.
4. Use a background thread to update instance memory from memcache. This could run once per second per instance using a thread started in the warmup request. It would work like #3 but with only one thread pulling instead of random requests triggering memcache reads.
   - I don't think this is how background threads work on App Engine. I also don't really know if this is an appropriate use of warmup requests.
5. Use Google's Pub/Sub service.
   - I have no idea how this works and it seems like it could be overkill for this use case.
6. Run a once-per-second cron job to pull from memcache. This would be like #4 except without relying on background threads.
   - I'd need this to run on every instance every second. I don't believe the cron/taskqueue API has a way to run a job or task on all active instances.

Thoughts?

If you actually hit the limits described in cloud.google.com/appengine/articles/… you could do something as simple as using two app engine projects and a load balancer. I'd start by going with the standard approach with datastore and memcache and see how far that gets me. — konqi
My first thought as I read the question was Cloud Pub/Sub, but I'm not familiar enough with it to be sure. Without knowing more about your app, I would be inclined to start with datastore + memcache. You don't say how often each chat room would be publishing new messages, so I'm not sure datastore would be 'way too slow' (you also don't say whether you need strong consistency here). You could use memcache as your first port of call and aggregate your writes to datastore using a pull queue for resilience. — tx802
@konqi I'm not concerned about hitting limits. Just concerned about cost. — Keith
@tx802 new messages would happen every few seconds and I need sub-1000ms latency in getting those messages out to clients. — Keith

Bogdan.Nourescu · Accepted Answer · 2015-10-12T11:56:30

You should check this video. I would go for the memcache/datastore version with a short cache lifetime (1-2 seconds), so you can reduce the number of instances you need to serve the traffic. Even if you still need 100-500 instances to serve your traffic, I would still go for the memcache/datastore version. If memcache becomes a bottleneck for you, shard the data across about 10 keys.
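To make that more concrete, here is a minimal sketch of the sharded-key idea combined with a short per-instance cache. It is not App Engine code: `DictCache` is a hypothetical in-process stand-in for a shared cache client with `get`/`set` methods, and the names `publish`, `RecentMessages`, `shard_key`, `NUM_SHARDS` and `LOCAL_TTL_SECONDS` are all made up for illustration. The non-blocking lock is one possible way to avoid the stampede described in idea #3: only one request per instance refreshes, while the others keep serving the slightly stale copy.

```python
import random
import threading
import time

NUM_SHARDS = 10          # assumption: "about 10 keys", as suggested above
LOCAL_TTL_SECONDS = 1.0  # assumption: per-instance cache lifetime of ~1 second


class DictCache:
    """Hypothetical stand-in for a shared cache client exposing get/set."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # counts shared-cache reads, for demonstration only

    def get(self, key):
        self.reads += 1
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def shard_key(shard):
    """Name of the cache key holding one copy of the message list."""
    return "recent-messages-%d" % shard


def publish(cache, messages):
    """Host side: write the same message list to every shard key."""
    for shard in range(NUM_SHARDS):
        cache.set(shard_key(shard), list(messages))


class RecentMessages:
    """Per-instance copy of the messages, refreshed at most once per TTL."""

    def __init__(self, cache, ttl=LOCAL_TTL_SECONDS, clock=time.monotonic):
        self._cache = cache
        self._ttl = ttl
        self._clock = clock
        self._lock = threading.Lock()
        self._messages = []
        self._expires_at = float("-inf")  # forces a refresh on first use

    def get(self):
        if self._clock() < self._expires_at:
            return self._messages  # still fresh: no shared-cache read
        # Non-blocking acquire: one thread refreshes, the rest serve stale data.
        if self._lock.acquire(blocking=False):
            try:
                # Read a random shard so instances spread across the keys.
                key = shard_key(random.randrange(NUM_SHARDS))
                fresh = self._cache.get(key)
                if fresh is not None:
                    self._messages = fresh
                self._expires_at = self._clock() + self._ttl
            finally:
                self._lock.release()
        return self._messages
```

With something like this, each instance would read the shared cache roughly once per TTL no matter how many polling requests it serves, and those reads are spread over the shard keys instead of hitting a single one. The trade-off is that a client can see data that is up to one TTL old on top of its own polling interval.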

Another solution is to use Compute Engine and a web server that your users connect to via sockets. You can get new messages to your Compute Engine instances either over HTTP, storing the value in memory, or by using pull queues.

If you really need to communicate with all the instances, take a look at communicating between modules.

Pub/Sub might be a good option for communicating between the instance that publishes new messages and the instances that read them. From what I read in the docs, you should be able to subscribe your users directly to Pub/Sub too (pull only, though).
